Vision Example Explained
Start Cloud Shell
While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Cloud Shell, a command line environment running in the Cloud.
Activate Cloud Shell
From the Cloud Console, click Activate Cloud Shell.
Before you can begin using the Vision API, run the following command in Cloud Shell to enable the API:
gcloud services enable vision.googleapis.com
You should see something like this:
Operation "operations/..." finished successfully.
Now, you can use the Vision API!
Navigate to your home directory:
cd ~
Create a Python virtual environment to isolate the dependencies:
virtualenv venv-vision
Activate the virtual environment:
source venv-vision/bin/activate
Install IPython and the Vision API client library:
pip install ipython google-cloud-vision
You should see something like this:
...
Installing collected packages: ..., ipython, google-cloud-vision
Successfully installed ... google-cloud-vision-3.4.0 ...
Now, you're ready to use the Vision API client library!
Welcome to Cloud Shell! Type "help" to get started.
Your Cloud Platform project in this session is set to uconn-customers.
Use “gcloud config set project [PROJECT_ID]” to change to a different project.
john_iacovacci1@cloudshell:~ (uconn-customers)$
gcloud services enable vision.googleapis.com
Operation "operations/acat.p2-320375012625-adfd3c6c-3941-4886-b334-f14e80b49c37" finished successfully.
john_iacovacci1@cloudshell:~ (uconn-customers)$
john_iacovacci1@cloudshell:~ (uconn-customers)$ virtualenv venv-vision
created virtual environment CPython3.9.2.final.0-64 in 640ms
creator CPython3Posix(dest=/home/john_iacovacci1/venv-vision, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/john_iacovacci1/.local/share/virtualenv)
added seed packages: pip==23.2.1, setuptools==68.2.0, wheel==0.41.2
activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
john_iacovacci1@cloudshell:~ (uconn-customers)$ pip install ipython google-cloud-vision
john_iacovacci1@cloudshell:~ (uconn-customers)$ ipython
This enters interactive Python (IPython) mode:
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.17.2 -- An enhanced Interactive Python. Type '?' for help.
In [1]:
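Before moving on to the examples, you can optionally run a quick sanity check inside IPython to confirm that the client library imports and that a client can be constructed with your Cloud Shell credentials. This is a minimal sketch, not part of the original codelab:
===========================================
# Optional sanity check: import the library and build a client.
# If authentication or API enablement is missing, this is where it fails.
from google.cloud import vision

client = vision.ImageAnnotatorClient()
print(type(client).__name__)  # expected: ImageAnnotatorClient
===========================================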
4. Perform label detection
One of the Vision API's core features is identifying objects or entities in an image,
known as label annotation. Label detection identifies general objects, locations,
activities, animal species, products, and more. The Vision API takes an input image
and returns the most likely labels that apply to it, each with a confidence score.
In this example, you will perform label detection on an image (courtesy of Alex Knight) of Setagaya, a popular district in Tokyo:
Copy the following code into your IPython session:
==========================================
from typing import Sequence

from google.cloud import vision


def analyze_image_from_uri(
    image_uri: str,
    feature_types: Sequence,
) -> vision.AnnotateImageResponse:
    client = vision.ImageAnnotatorClient()

    image = vision.Image()
    image.source.image_uri = image_uri
    features = [vision.Feature(type_=feature_type) for feature_type in feature_types]
    request = vision.AnnotateImageRequest(image=image, features=features)

    response = client.annotate_image(request=request)

    return response


def print_labels(response: vision.AnnotateImageResponse):
    print("=" * 80)
    for label in response.label_annotations:
        print(
            f"{label.score:4.0%}",
            f"{label.description:5}",
            sep=" | ",
        )
=========================================================
Take a moment to study the code and see how it uses the annotate_image client library method to analyze an image for a set of given features.
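One detail worth keeping in mind: annotate_image reports image-level problems (for example, an unreadable URI) in the response's error field rather than raising an exception, so it can be worth checking that field before reading annotations. A minimal sketch, not part of the codelab:
===========================================
# Hypothetical helper: fail fast if the API reported an error for this image.
def check_response(response: vision.AnnotateImageResponse) -> None:
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
===========================================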
Send a request with the LABEL_DETECTION feature:
=================================
image_uri = "gs://cloud-samples-data/vision/label/setagaya.jpeg"
features = [vision.Feature.Type.LABEL_DETECTION]
response = analyze_image_from_uri(image_uri, features)
print_labels(response)
You should get the following output:
=========================================================
97% | Bicycle
96% | Tire
94% | Wheel
91% | Automotive lighting
89% | Infrastructure
87% | Bicycle wheel
86% | Mode of transport
85% | Building
83% | Electricity
82% | Neighbourhood
=========================================================
In [1]: from typing import Sequence
...:
...: from google.cloud import vision
...:
...:
...: def analyze_image_from_uri(
...: image_uri: str,
...: feature_types: Sequence,
...: ) -> vision.AnnotateImageResponse:
...: client = vision.ImageAnnotatorClient()
...:
...: image = vision.Image()
...: image.source.image_uri = image_uri
...: features = [vision.Feature(type_=feature_type) for feature_type in feature_types]
...: request = vision.AnnotateImageRequest(image=image, features=features)
...:
...: response = client.annotate_image(request=request)
...:
...: return response
...:
...:
...: def print_labels(response: vision.AnnotateImageResponse):
...: print("=" * 80)
...: for label in response.label_annotations:
...: print(
...: f"{label.score:4.0%}",
...: f"{label.description:5}",
...: sep=" | ",
...: )
Press Return until the next prompt appears, then paste in the request for the picture:
In [2]:
In [2]: image_uri = "gs://cloud-samples-data/vision/label/setagaya.jpeg"
...: features = [vision.Feature.Type.LABEL_DETECTION]
...:
...: response = analyze_image_from_uri(image_uri, features)
...: print_labels(response)
=========================================================
97% | Bicycle
96% | Tire
94% | Wheel
91% | Automotive lighting
89% | Infrastructure
87% | Bicycle wheel
86% | Mode of transport
85% | Building
83% | Electricity
82% | Neighbourhood
In [3]:
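If you only care about high-confidence labels, you can filter on the score before printing. The following variation on print_labels is a sketch, not part of the codelab:
===========================================
# Hypothetical variation: keep only labels above a confidence threshold.
def print_confident_labels(response: vision.AnnotateImageResponse, min_score: float = 0.90):
    print("=" * 80)
    for label in response.label_annotations:
        if label.score >= min_score:
            print(f"{label.score:4.0%} | {label.description}")

print_confident_labels(response)  # Bicycle, Tire, Wheel, Automotive lighting
===========================================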
5. Perform text detection
Text detection performs Optical Character Recognition (OCR).
It detects and extracts text within an image with support for a broad range
of languages. It also features automatic language identification.
In this example, you will perform text detection on an image of a traffic sign:
Copy the following code into your IPython session:
def print_text(response: vision.AnnotateImageResponse):
    print("=" * 80)
    for annotation in response.text_annotations:
        vertices = [f"({v.x},{v.y})" for v in annotation.bounding_poly.vertices]
        print(
            f"{repr(annotation.description):42}",
            ",".join(vertices),
            sep=" | ",
        )
===========================================
Send a request with the TEXT_DETECTION feature:
================================================
image_uri = "gs://cloud-samples-data/vision/ocr/sign.jpg"
features = [vision.Feature.Type.TEXT_DETECTION]
response = analyze_image_from_uri(image_uri, features)
print_text(response)
================================================
You should get the following output:
=========================================================
'WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE' | (310,821),(2225,821),(2225,1965),(310,1965)
'WAITING' | (344,821),(2025,879),(2016,1127),(335,1069)
'?' | (2057,881),(2225,887),(2216,1134),(2048,1128)
'PLEASE' | (1208,1230),(1895,1253),(1891,1374),(1204,1351)
'TURN' | (1217,1414),(1718,1434),(1713,1558),(1212,1538)
'OFF' | (1787,1437),(2133,1451),(2128,1575),(1782,1561)
'YOUR' | (1211,1609),(1741,1626),(1737,1747),(1207,1731)
'ENGINE' | (1213,1805),(1923,1819),(1920,1949),(1210,1935)
In [3]: def print_text(response: vision.AnnotateImageResponse):
...: print("=" * 80)
...: for annotation in response.text_annotations:
...: vertices = [f"({v.x},{v.y})" for v in annotation.bounding_poly.vertices]
...: print(
...: f"{repr(annotation.description):42}",
...: ",".join(vertices),
...: sep=" | ",
...: )
...:
In [4]:
In [4]: image_uri = "gs://cloud-samples-data/vision/ocr/sign.jpg"
...: features = [vision.Feature.Type.TEXT_DETECTION]
...:
...: response = analyze_image_from_uri(image_uri, features)
...: print_text(response)
=========================================================
'WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE' | (310,821),(2225,821),(2225,1965),(310,1965)
'WAITING' | (344,821),(2025,879),(2016,1127),(335,1069)
'?' | (2057,881),(2225,887),(2216,1134),(2048,1128)
'PLEASE' | (1207,1231),(1895,1254),(1891,1374),(1203,1351)
'TURN' | (1217,1414),(1718,1434),(1712,1559),(1212,1539)
'OFF' | (1787,1437),(2133,1451),(2128,1576),(1782,1562)
'YOUR' | (1211,1609),(1741,1626),(1737,1747),(1207,1731)
'ENGINE' | (1213,1805),(1922,1819),(1919,1949),(1210,1935)
=========================================================
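Note that the first entry in text_annotations is the entire detected text block, while the remaining entries are the individual words with their own bounding boxes. If all you need is the raw text, a sketch like this (not from the codelab) is enough:
===========================================
# The first annotation holds the full detected text; the rest are per-word entries.
if response.text_annotations:
    full_text = response.text_annotations[0].description
    print(full_text)  # prints the sign text line by line: WAITING? / PLEASE / TURN OFF / YOUR / ENGINE
===========================================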
6. Perform landmark detection
Landmark detection detects popular natural and man-made structures within an image.
In this example, you will perform landmark detection on an image
(courtesy of John Towner) of the Eiffel Tower:
Copy the following code into your IPython session:
def print_landmarks(response: vision.AnnotateImageResponse, min_score: float = 0.5):
    print("=" * 80)
    for landmark in response.landmark_annotations:
        if landmark.score < min_score:
            continue
        vertices = [f"({v.x},{v.y})" for v in landmark.bounding_poly.vertices]
        lat_lng = landmark.locations[0].lat_lng
        print(
            f"{landmark.description:18}",
            ",".join(vertices),
            f"{lat_lng.latitude:.5f}",
            f"{lat_lng.longitude:.5f}",
            sep=" | ",
        )
Send a request with the LANDMARK_DETECTION feature:
image_uri = "gs://cloud-samples-data/vision/landmark/eiffel_tower.jpg"
features = [vision.Feature.Type.LANDMARK_DETECTION]
response = analyze_image_from_uri(image_uri, features)
print_landmarks(response)
You should get the following output:
=========================================================
Trocadéro Gardens | (303,36),(520,36),(520,371),(303,371) | 48.86160 | 2.28928
Eiffel Tower | (458,76),(512,76),(512,263),(458,263) | 48.85846 | 2.29435
=======================================================
def print_landmarks(response: vision.AnnotateImageResponse, min_score: float = 0.5):
    print("=" * 80)
    for landmark in response.landmark_annotations:
        if landmark.score < min_score:
            continue
        vertices = [f"({v.x},{v.y})" for v in landmark.bounding_poly.vertices]
        lat_lng = landmark.locations[0].lat_lng
        print(
            f"{landmark.description:18}",
            ",".join(vertices),
            f"{lat_lng.latitude:.5f}",
            f"{lat_lng.longitude:.5f}",
            sep=" | ",
        )
=======================================================
image_uri = "gs://cloud-samples-data/vision/landmark/eiffel_tower.jpg"
features = [vision.Feature.Type.LANDMARK_DETECTION]
response = analyze_image_from_uri(image_uri, features)
print_landmarks(response)
=========================================================
Champ De Mars | (0,0),(640,0),(640,426),(0,426) | 48.85565 | 2.29863
Pont De Bir-Hakeim | (0,0),(640,0),(640,426),(0,426) | 48.85560 | 2.28759
Eiffel Tower | (0,0),(640,0),(640,426),(0,426) | 48.85837 | 2.29448
In [7]:
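Because each landmark annotation carries latitude/longitude coordinates, you can turn the results into map links. A minimal sketch; the Google Maps URL format used here is an assumption of this example, not part of the codelab:
===========================================
# Hypothetical helper: build a map link for each detected landmark.
def landmark_map_links(response: vision.AnnotateImageResponse) -> list:
    links = []
    for landmark in response.landmark_annotations:
        lat_lng = landmark.locations[0].lat_lng
        links.append(
            f"{landmark.description}: "
            f"https://www.google.com/maps?q={lat_lng.latitude:.5f},{lat_lng.longitude:.5f}"
        )
    return links

for link in landmark_map_links(response):
    print(link)
===========================================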
7. Perform face detection
Face detection detects multiple faces within an image, along with key facial
attributes such as emotional state or whether the person is wearing headwear.
In this example, you will detect faces in the following picture (courtesy of Himanshu Singh Gurjar):
Copy the following code into your IPython session:
===========================================================
def print_faces(response: vision.AnnotateImageResponse):
    print("=" * 80)
    for face_number, face in enumerate(response.face_annotations, 1):
        vertices = ",".join(f"({v.x},{v.y})" for v in face.bounding_poly.vertices)
        print(f"# Face {face_number} @ {vertices}")
        print(f"Joy: {face.joy_likelihood.name}")
        print(f"Exposed: {face.under_exposed_likelihood.name}")
        print(f"Blurred: {face.blurred_likelihood.name}")
        print("-" * 80)
=========================================================
Send a request with the FACE_DETECTION feature:
========================================================
image_uri = "gs://cloud-samples-data/vision/face/faces.jpeg"
features = [vision.Feature.Type.FACE_DETECTION]
response = analyze_image_from_uri(image_uri, features)
print_faces(response)
=========================================================
You should get the following output:
=========================================================
# Face 1 @ (1077,157),(2146,157),(2146,1399),(1077,1399)
Joy: VERY_LIKELY
Exposed: VERY_UNLIKELY
Blurred: VERY_UNLIKELY
--------------------------------------------------------------------------------
# Face 2 @ (144,1273),(793,1273),(793,1844),(144,1844)
Joy: VERY_UNLIKELY
Exposed: VERY_UNLIKELY
Blurred: UNLIKELY
--------------------------------------------------------------------------------
# Face 3 @ (785,167),(1100,167),(1100,534),(785,534)
Joy: VERY_UNLIKELY
Exposed: LIKELY
Blurred: VERY_LIKELY
--------------------------------------------------------------------------------
In [7]: def print_faces(response: vision.AnnotateImageResponse):
...: print("=" * 80)
...: for face_number, face in enumerate(response.face_annotations, 1):
...: vertices = ",".join(f"({v.x},{v.y})" for v in face.bounding_poly.vertices)
...: print(f"# Face {face_number} @ {vertices}")
...: print(f"Joy: {face.joy_likelihood.name}")
...: print(f"Exposed: {face.under_exposed_likelihood.name}")
...: print(f"Blurred: {face.blurred_likelihood.name}")
...: print("-" * 80)
...:
In [8]: image_uri = "gs://cloud-samples-data/vision/face/faces.jpeg"
...: features = [vision.Feature.Type.FACE_DETECTION]
...:
...: response = analyze_image_from_uri(image_uri, features)
...: print_faces(response)
=========================================================
# Face 1 @ (1076,151),(2144,151),(2144,1392),(1076,1392)
Joy: VERY_LIKELY
Exposed: VERY_UNLIKELY
Blurred: VERY_UNLIKELY
--------------------------------------------------------------------------------
# Face 2 @ (784,170),(1097,170),(1097,534),(784,534)
Joy: VERY_UNLIKELY
Exposed: VERY_UNLIKELY
Blurred: VERY_UNLIKELY
--------------------------------------------------------------------------------
# Face 3 @ (664,28),(816,28),(816,205),(664,205)
Joy: VERY_UNLIKELY
Exposed: VERY_UNLIKELY
Blurred: VERY_UNLIKELY
--------------------------------------------------------------------------------
# Face 4 @ (143,1273),(793,1273),(793,1844),(143,1844)
Joy: VERY_UNLIKELY
Exposed: VERY_UNLIKELY
Blurred: VERY_UNLIKELY
--------------------------------------------------------------------------------
In [9]:
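The likelihood fields are enum values (VERY_UNLIKELY up to VERY_LIKELY), so you can compare them against vision.Likelihood members instead of matching on strings. For example, this sketch (not from the codelab) counts the faces that look at least likely joyful:
===========================================
# Likelihood values are ordered: VERY_UNLIKELY < UNLIKELY < POSSIBLE < LIKELY < VERY_LIKELY.
joyful = sum(
    face.joy_likelihood >= vision.Likelihood.LIKELY
    for face in response.face_annotations
)
print(f"{joyful} of {len(response.face_annotations)} faces look joyful")
===========================================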
8. Perform object detection
Object detection (object localization) identifies multiple objects in an image and
returns, for each one, a name, a confidence score, and a normalized bounding box.
In this example, you will perform object detection on the same image of Setagaya
(courtesy of Alex Knight) used earlier for label detection:
Copy the following code into your IPython session:
=======================================================
def print_objects(response: vision.AnnotateImageResponse):
    print("=" * 80)
    for obj in response.localized_object_annotations:
        nvertices = obj.bounding_poly.normalized_vertices
        print(
            f"{obj.score:4.0%}",
            f"{obj.name:15}",
            f"{obj.mid:10}",
            ",".join(f"({v.x:.1f},{v.y:.1f})" for v in nvertices),
            sep=" | ",
        )
===================================================
Send a request with the OBJECT_LOCALIZATION feature:
=====================================================
image_uri = "gs://cloud-samples-data/vision/label/setagaya.jpeg"
features = [vision.Feature.Type.OBJECT_LOCALIZATION]
response = analyze_image_from_uri(image_uri, features)
print_objects(response)
You should get the following output:
=========================================================
93% | Bicycle | /m/0199g | (0.6,0.6),(0.8,0.6),(0.8,0.9),(0.6,0.9)
92% | Bicycle wheel | /m/01bqk0 | (0.6,0.7),(0.7,0.7),(0.7,0.9),(0.6,0.9)
91% | Tire | /m/0h9mv | (0.7,0.7),(0.8,0.7),(0.8,1.0),(0.7,1.0)
75% | Bicycle | /m/0199g | (0.3,0.6),(0.4,0.6),(0.4,0.7),(0.3,0.7)
51% | Tire | /m/0h9mv | (0.3,0.6),(0.4,0.6),(0.4,0.7),(0.3,0.7)
In [9]: def print_objects(response: vision.AnnotateImageResponse):
...: print("=" * 80)
...: for obj in response.localized_object_annotations:
...: nvertices = obj.bounding_poly.normalized_vertices
...: print(
...: f"{obj.score:4.0%}",
...: f"{obj.name:15}",
...: f"{obj.mid:10}",
...: ",".join(f"({v.x:.1f},{v.y:.1f})" for v in nvertices),
...: sep=" | ",
...: )
...:
In [10]: image_uri = "gs://cloud-samples-data/vision/label/setagaya.jpeg"
...: features = [vision.Feature.Type.OBJECT_LOCALIZATION]
...:
...: response = analyze_image_from_uri(image_uri, features)
...: print_objects(response)
=========================================================
93% | Bicycle | /m/0199g | (0.6,0.6),(0.8,0.6),(0.8,0.9),(0.6,0.9)
92% | Bicycle wheel | /m/01bqk0 | (0.6,0.7),(0.7,0.7),(0.7,0.9),(0.6,0.9)
91% | Tire | /m/0h9mv | (0.7,0.7),(0.8,0.7),(0.8,1.0),(0.7,1.0)
75% | Bicycle | /m/0199g | (0.3,0.6),(0.4,0.6),(0.4,0.7),(0.3,0.7)
51% | Tire | /m/0h9mv | (0.3,0.6),(0.4,0.6),(0.4,0.7),(0.3,0.7)
In [11]:
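The object vertices are normalized to the [0, 1] range, so to draw boxes on the original image you need to scale them by the image dimensions. A minimal sketch; the width and height below are placeholders, not the real dimensions of the Setagaya image:
===========================================
# Hypothetical conversion: normalized vertices -> pixel coordinates.
def to_pixel_box(obj, width: int, height: int) -> list:
    return [
        (int(v.x * width), int(v.y * height))
        for v in obj.bounding_poly.normalized_vertices
    ]

# Placeholder dimensions; read the real size from the image file before using this.
for obj in response.localized_object_annotations:
    print(obj.name, to_pixel_box(obj, width=1000, height=750))
===========================================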
9. Multiple features
You've seen how to use some features of the Vision API, but there are many more, and you can request multiple features in a single call.
Here is the kind of request you can make to get all insights at once:
image_uri = "gs://..."
features = [
    vision.Feature.Type.OBJECT_LOCALIZATION,
    vision.Feature.Type.FACE_DETECTION,
    vision.Feature.Type.LANDMARK_DETECTION,
    vision.Feature.Type.LOGO_DETECTION,
    vision.Feature.Type.LABEL_DETECTION,
    vision.Feature.Type.TEXT_DETECTION,
    vision.Feature.Type.DOCUMENT_TEXT_DETECTION,
    vision.Feature.Type.SAFE_SEARCH_DETECTION,
    vision.Feature.Type.IMAGE_PROPERTIES,
    vision.Feature.Type.CROP_HINTS,
    vision.Feature.Type.WEB_DETECTION,
    vision.Feature.Type.PRODUCT_SEARCH,
]

# response = analyze_image_from_uri(image_uri, features)
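With a combined request, every feature populates its own field of the same response, so once the commented-out call above has been run with a real image URI you can reuse the printing helpers defined earlier on a single result. A sketch, kept commented out for the same reason as above:
===========================================
# Each feature fills its own list on the same AnnotateImageResponse:
# print_labels(response)
# print_text(response)
# print_landmarks(response)
# print_faces(response)
# print_objects(response)
===========================================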
And there are more possibilities, like performing detections on a batch of images, synchronously or asynchronously. Check out all the how-to guides.
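As a taste of the batch path mentioned above, the client also exposes batch_annotate_images, which takes a list of requests and returns one response per image. A minimal sketch with placeholder URIs, not part of the codelab:
===========================================
# Hypothetical batch call: annotate several images in one synchronous request.
def batch_label_images(image_uris: Sequence) -> None:
    client = vision.ImageAnnotatorClient()
    requests = []
    for uri in image_uris:
        image = vision.Image()
        image.source.image_uri = uri
        requests.append(
            vision.AnnotateImageRequest(
                image=image,
                features=[vision.Feature(type_=vision.Feature.Type.LABEL_DETECTION)],
            )
        )
    batch_response = client.batch_annotate_images(requests=requests)
    for uri, single_response in zip(image_uris, batch_response.responses):
        print(uri)
        print_labels(single_response)
===========================================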