UCONN Stamford Google Cloud Development Platform: Google API Assignment

Google API Assignment

Assignment:

Take a picture that has text in it.

Upload the picture to your Cloud Storage bucket.

Use the Vision API's OCR functionality to extract the text from the image.

Translate the text to Spanish using the Google Translate API.

Create an MP3 file from the translated text using the Text-to-Speech API.

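Install all Python libraries (the script also imports the Cloud Storage client, so install it too)

pip install --upgrade google-cloud-vision

pip install --upgrade google-cloud-storage

pip install --upgrade google-cloud-translate

pip install --upgrade google-cloud-texttospeech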

Enable all APIs needed

gcloud services enable vision.googleapis.com

gcloud services enable translate.googleapis.com

gcloud services enable texttospeech.googleapis.com

Create a service account JSON key

Generating a JSON key for GOOGLE_APPLICATION_CREDENTIALS is a straightforward process, but it requires navigating the Google Cloud Console. This key acts as a "passport" for your local environment or server to talk to Google APIs securely.

Here is the step-by-step guide to getting it done.

1. Navigate to the IAM & Admin Console

First, you need to head to the Google Cloud Console.

Go to the Service Accounts page.
Select your Project from the top dropdown menu if it isn't already selected.

2. Create a Service Account (If you don't have one)

If you already have a service account you want to use, skip to step 3. Otherwise:

Click + CREATE SERVICE ACCOUNT at the top.
Give it a name (e.g., my-app-service-account).
Click Create and Continue.
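Grant access: Assign a "Role" that has the permissions your app needs (e.g., Storage Object Viewer or Pub/Sub Publisher). For this assignment the script writes the translated text and the MP3 back to the bucket, so a read-only role is not enough; a role that can create objects, such as Storage Object Admin, is one option.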
Click Done.

3. Generate the JSON Key

Now, you'll create and download the actual credential file:

In the list of service accounts, click on the Email address of the account you just created (or an existing one).
Click on the Keys tab at the top.
Click the ADD KEY dropdown and select Create new key.
Choose JSON as the key type and click Create.

Your browser will automatically download a .json file. Keep this file safe! It contains private keys that grant access to your cloud resources; never commit it to public GitHub repositories.

4. Set the Environment Variable

Once you have the file (let's say it's named service-account-file.json), you need to point your system to it so your code can find it.

On macOS / Linux (Terminal)

Add this to your .bashrc, .zshrc, or run it in your current session:

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-file.json"
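Before running any API code, it can help to confirm that the variable is set and points at a real service account key. The following is a minimal sketch using only the standard library; check_credentials is a hypothetical helper name, not part of any Google library, and the fields it checks are the ones a downloaded service account key normally contains.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the key file path if GOOGLE_APPLICATION_CREDENTIALS looks usable."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"Key file not found: {path}")

    with open(path, encoding="utf-8") as f:
        key = json.load(f)

    # A service account key downloaded from the console carries these fields.
    for field in ("type", "project_id", "client_email", "private_key"):
        if field not in key:
            raise RuntimeError(f"Key file is missing the '{field}' field")
    if key["type"] != "service_account":
        raise RuntimeError(f"Unexpected key type: {key['type']}")
    return path


# Example use in the same shell session where the variable was exported:
#   print(check_credentials())
```

If the function raises, fix the path in the export line before moving on; the client libraries would otherwise fail later with a less direct error.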

Upload the key file to Cloud Shell, then point the variable at its location:

export GOOGLE_APPLICATION_CREDENTIALS="/home/john_iacovacci1/cloud-project-examples-316c375c6892.json"
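With credentials in place, the upload step of the assignment can also be done from Python instead of the console. This is a small sketch, assuming the google-cloud-storage library is installed and the bucket already exists; upload_image is a hypothetical helper name, and the bucket and file names in the usage comment are the ones used later in this post.

```python
def upload_image(bucket, local_path, blob_name):
    """Upload a local picture to the bucket and return its gs:// URI."""
    blob = bucket.blob(blob_name)
    blob.upload_from_filename(local_path)
    return f"gs://{bucket.name}/{blob_name}"


# Example use (needs google-cloud-storage and valid credentials):
#   from google.cloud import storage
#   bucket = storage.Client().bucket("cloud-storage-exam")
#   print(upload_image(bucket, "lunch.jpeg", "lunch.jpeg"))
```

Passing the bucket object in, rather than creating the client inside the function, keeps the helper easy to try out with a different bucket.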

ocr_to_speech.py

======================================================

from google.cloud import vision
from google.cloud import storage
from google.cloud import translate_v2 as translate
from google.cloud import texttospeech

# One client per API; each picks up GOOGLE_APPLICATION_CREDENTIALS.
vision_client = vision.ImageAnnotatorClient()
storage_client = storage.Client()
translate_client = translate.Client()
tts_client = texttospeech.TextToSpeechClient()


def process_image_to_audio(bucket_name, image_blob_name):
    # Step 1: OCR. Vision reads the image straight from Cloud Storage.
    print(f"Extract text gs://{bucket_name}/{image_blob_name} ---")
    image_uri = f"gs://{bucket_name}/{image_blob_name}"
    image = vision.Image()
    image.source.image_uri = image_uri

    response = vision_client.text_detection(image=image)
    if response.error.message:
        raise Exception(f"Vision Error: {response.error.message}")

    texts = response.text_annotations
    if not texts:
        print("No text.")
        return

    # The first annotation holds the full block of detected text.
    original_text = texts[0].description
    print(f"Extracted :\n{original_text}\n")

    # Step 2: translate the extracted text to Spanish.
    print("Translate to Spanish")
    translation = translate_client.translate(original_text, target_language='es')
    translated_text = translation['translatedText']

    # Step 3: synthesize the Spanish text as MP3 audio.
    print("Generate Audio")
    synthesis_input = texttospeech.SynthesisInput(text=translated_text)
    voice = texttospeech.VoiceSelectionParams(
        language_code="es-ES",
        ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )
    tts_response = tts_client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )

    # Step 4: save the translated text and the audio next to the image.
    bucket = storage_client.bucket(bucket_name)
    # Strip only the final extension, so names with extra dots stay intact.
    base_name = image_blob_name.rsplit('.', 1)[0]

    text_blob = bucket.blob(f"{base_name}_es.txt")
    text_blob.upload_from_string(translated_text)

    audio_blob = bucket.blob(f"{base_name}_es.mp3")
    audio_blob.upload_from_string(tts_response.audio_content, content_type="audio/mpeg")

    print("Completed")
    print(f"Text: gs://{bucket_name}/{base_name}_es.txt")
    print(f"Audio: gs://{bucket_name}/{base_name}_es.mp3")


if __name__ == "__main__":
    BUCKET = "cloud-storage-exam"
    IMAGE_FILE = "lunch.jpeg"
    process_image_to_audio(BUCKET, IMAGE_FILE)

===================================================
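One detail worth checking in the translation step: the v2 translate client treats its input as HTML unless format_="text" is passed, so translatedText can come back with HTML entities such as &#39; or &quot; in place of quotes and apostrophes. Those entities would then be written to the text file and read aloud. A minimal sketch of decoding them with the standard library follows; clean_translation is a hypothetical helper and the sample dictionary is made up for illustration, not real API output.

```python
import html


def clean_translation(translation):
    """Return translatedText with HTML entities decoded to plain characters."""
    return html.unescape(translation["translatedText"])


# Made-up sample shaped like the dictionary returned by translate_client.translate().
sample = {"translatedText": "Elige tus carnes favoritas de nuestro &quot;deli&quot;"}
print(clean_translation(sample))
```

In the script, the same call could wrap the translation result before it is passed to the Text-to-Speech request.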

Results:

john_iacovacci1@cloudshell:~/vision (cloud-project-examples)$ python3 ocr_to_speech.py

Extract text gs://cloud-storage-exam/lunch.jpeg ---

Extracted :

Easy and affordable

lunches you'll love

Hot summer days call for laid-back sandwiches just the

way you like them. Pick your favorite meats from our deli

and pair them with fresh, seasonal produce for your

perfect combo.

Translate to Spanish

Generate Audio

Completed

Text: gs://cloud-storage-exam/lunch_es.txt

Audio: gs://cloud-storage-exam/lunch_es.mp3

john_iacovacci1@cloudshell:~/vision (cloud-project-examples)$
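To listen to the result outside the console, the gs:// paths printed above can be split into a bucket name and an object name and then downloaded. The sketch below assumes a google-cloud-storage client is passed in; split_gcs_uri and download_result are hypothetical helper names.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket_name, blob_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {uri}")
    bucket_name, _, blob_name = uri[len(prefix):].partition("/")
    if not bucket_name or not blob_name:
        raise ValueError(f"URI must name a bucket and an object: {uri}")
    return bucket_name, blob_name


def download_result(client, uri, local_path):
    """Download the object behind a gs:// URI to a local file."""
    bucket_name, blob_name = split_gcs_uri(uri)
    client.bucket(bucket_name).blob(blob_name).download_to_filename(local_path)
    return local_path


# Example use (needs google-cloud-storage and valid credentials):
#   from google.cloud import storage
#   download_result(storage.Client(), "gs://cloud-storage-exam/lunch_es.mp3", "lunch_es.mp3")
```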

Link to audio file

Lunch
