Google Vision API Quickstart
January 13th 2021
To use the Google Vision API, which is part of the Google Cloud Platform, there are a few steps you need to take to get up and running with the command line SDK.
Create a Google Cloud Platform Account
If you don’t already have an account, you can sign up by going here: https://console.cloud.google.com/home/ where you will be prompted to log in or create an account.
Create a New Project
You will need to create a new project if you do not have a project set up already. The project holds the credentials used for authorization and the list of enabled APIs, including the Vision API. You can name the project whatever you like.
Enable the Vision API
Once you have created a project and are on the project dashboard, use the hamburger menu at the top to select APIs & Services and navigate to the APIs dashboard, or go directly to: https://console.cloud.google.com/apis/dashboard
At the top of the dashboard click ENABLE APIS & SERVICES and you will be directed to the API library. Use the search box to find the Vision API, click on Cloud Vision API, and enable it.
Create an Identity Service Account
To authenticate with the cloud API you will need to create an Identity service account, which will generate a JSON file for you that the Cloud SDK (which we’ll download in a bit) will use to generate an API token. Using the hamburger menu at the top, navigate to Identity → Service Accounts. On the Service Accounts dashboard click CREATE SERVICE ACCOUNT.
- In the Service account name field, enter a name.
- From the Role list, select Project > Owner.
- Click Create. A JSON file that contains your key downloads to your computer. We will save this in our ~/.ssh folder, as we likely have other credentials stored there.
- We will need to set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path where our JSON file was saved. Open up .bashrc or .zshrc in your home directory and paste:
# Export Google Service Application Credentials #
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.ssh/Project-3c4agk421ub7.json"
Make sure you replace Project-3c4agk421ub7.json with the name of the JSON file you downloaded.
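Before moving on, it can help to confirm that the variable really points at your key file. The following is a minimal sketch, not part of the Google tooling: the function name check_credentials is made up for illustration, and it only checks that the variable is set and the file is readable, not that the key itself is valid.

```shell
# Hypothetical helper (not part of the Cloud SDK): confirm that
# GOOGLE_APPLICATION_CREDENTIALS is set and points at a readable file.
check_credentials() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
    echo "Key file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
    return 1
  fi
  echo "Using key file: $GOOGLE_APPLICATION_CREDENTIALS"
}
```

Open a new terminal (or reload your shell configuration) so the export takes effect, then run check_credentials. If it prints the path of your JSON file, the variable is in place.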
Install the Google Cloud SDK
Navigate to https://cloud.google.com/sdk/docs/install and choose the OS and environment you are using to download the SDK installer. You can follow the instructions for your OS, but if, like me, you are using macOS, you will download the tar.gz file to your computer. For organization I created a folder called ~/gcloud where I saved the download.
- Extract the archive to any location on your file system; preferably, your home directory. On macOS, this can be achieved by opening the downloaded `.tar.gz` archive file in the preferred location. If you would like to replace an existing installation, remove the existing google-cloud-sdk directory and extract the archive to the same location.
- Optional. Use the install script to add Cloud SDK tools to your path:
./google-cloud-sdk/install.sh
You can now initialize the SDK by running the init command:
./google-cloud-sdk/bin/gcloud init
You will be prompted to log in to the Google account that you used to sign up for the cloud platform. Once logged in you should be able to execute:
gcloud auth application-default print-access-token
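If that command is not found, the SDK tools are probably not on your path yet. As a small pre-flight sketch, the helper below checks that a command is available before you rely on it. The name require_tool is hypothetical and only used here for illustration.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Missing required tool: $1" >&2
    return 1
  fi
}
```

For this walkthrough you could run require_tool gcloud and require_tool curl. If either reports a missing tool, open a new terminal so the path changes from the install script are picked up, or call the binary by its full path as in the init step above.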
Setting up the Request Data for Vision API
We’re going to use curl to make a POST call to the Vision API, passing some data in the request body, and use the Web Detection feature to get a list of websites that contain the same or a closely matching image. We can create a request.json file:
{
"requests": [
{
"image": {
"content": "base64-encoded-image"
},
"features": [
{
"maxResults": 10,
"type": "WEB_DETECTION"
}
]
}
]
}
In the “content” field we will replace the placeholder with the base64 encoded contents of our image. On macOS you can create a base64 encoded value by executing:
base64 -i input.jpg -o output.txt
Copy the contents of output.txt into the “content” field of request.json.
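Copying a long base64 string by hand is error-prone, so here is a rough sketch that writes the request file in one step. The function name build_request and its two arguments are made up for illustration. It pipes the image through base64 on standard input and strips newlines, which should behave the same on macOS and Linux.

```shell
# Hypothetical helper: write a WEB_DETECTION request body for an image file.
# Usage: build_request input.jpg request.json
build_request() {
  image_path=$1
  output_path=$2
  # Remove line breaks so the encoded image is a single valid JSON string.
  encoded=$(base64 < "$image_path" | tr -d '\n')
  cat > "$output_path" <<EOF
{
  "requests": [
    {
      "image": { "content": "$encoded" },
      "features": [
        { "maxResults": 10, "type": "WEB_DETECTION" }
      ]
    }
  ]
}
EOF
}
```

Running build_request input.jpg request.json produces the same structure as the hand-written file above, with the “content” field already filled in.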
Post to the Vision API
We’re now ready to execute the POST request to the Vision API. On the command line, in the folder with your request.json data, you can execute:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://vision.googleapis.com/v1/images:annotate
With any luck you should get a response such as:
{
"responses": [
{
"webDetection": {
"webEntities": [
{
"entityId": "/m/02p7_j8",
"score": 1.44225,
"description": "Carnival in Rio de Janeiro"
},
{
"entityId": "/m/06gmr",
"score": 1.2913725,
"description": "Rio de Janeiro"
},
{
"entityId": "/m/04cx88",
"score": 0.78465,
"description": "Brazilian Carnival"
},
{
"entityId": "/m/09l9f",
"score": 0.7166,
"description": "Carnival"
},
...
],
"fullMatchingImages": [
{
"url": "https://1000lugaresparair.files.wordpress.com/2017/11/quinten-de-graaf-278848.jpg"
},
...
],
"partialMatchingImages": [
{
"url": "https://www.linnanneito.fi/wp-content/uploads/sambakarnevaali-riossa.jpg"
},
...
],
"pagesWithMatchingImages": [
{
"url": "https://www.intrepidtravel.com/us/brazil/rio-carnival-122873",
"pageTitle": "\u003cb\u003eRio Carnival\u003c/b\u003e | Intrepid Travel US",
"partialMatchingImages": [
{
"url": "https://www.intrepidtravel.com/sites/intrepid/files/styles/large/public/elements/product/hero/GGSR-Brazil-rio-carnival-ladies.jpg"
},
...
]
},
...
],
"visuallySimilarImages": [
{
"url": "https://pbs.twimg.com/media/DVoQOx6WkAIpHKF.jpg"
},
...
],
"bestGuessLabels": [
{
"label": "rio carnival",
"languageCode": "en"
}
]
}
}
]
}
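The URLs are usually the part of the response you want to work with. As a rough sketch, assuming you saved the response to a file (for example by adding curl's -o option with a name such as response.json, which is only a placeholder), the helper below pulls out every "url" value. The name list_urls is hypothetical, and this is plain text matching with grep and sed rather than real JSON parsing, so a JSON-aware tool would be more robust.

```shell
# Hypothetical helper: print each distinct "url" value found in a saved response.
# Usage: list_urls response.json
list_urls() {
  grep -o '"url": *"[^"]*"' "$1" \
    | sed 's/^"url": *"//; s/"$//' \
    | sort -u
}
```

Because the same image can show up under several keys (fullMatchingImages, partialMatchingImages and so on), sort -u is used to drop duplicates.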
Hooray! We’ve set up and configured the gcloud SDK to use the Google Vision API to get information about our images. You can read about all of the possibilities of the Vision API here: https://cloud.google.com/vision/docs/features-list
Enjoy!