Make a Vision API request

The Cloud Vision API is a REST API that uses HTTP POST operations to perform data analysis on images you send in the request. The API uses JSON for both requests and responses.

Summary

Requests are POST requests to https://vision.googleapis.com/v1/images:annotate.
You must authenticate your requests.
The request body is JSON (see JSON request format). The response is also JSON, but its fields vary depending on the type of annotation you request (see JSON response format).
Here's how to send a request with cURL.
There are also client libraries.

Endpoint

The Vision API consists of a single endpoint (https://vision.googleapis.com/v1/images) that supports one HTTP request method (annotate):

POST https://vision.googleapis.com/v1/images:annotate

Authentication

The POST request must authenticate by passing either an API key or an OAuth token. For details, refer to the Authenticate page.
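As a rough illustration of the two options, the sketch below builds (but does not send) the POST request using only the Python standard library. It assumes the API key is passed as a key query parameter and the OAuth token as a Bearer Authorization header; the function name build_annotate_request and its parameters are hypothetical, chosen for this example. The Authenticate page remains the authoritative source.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(body, api_key=None, oauth_token=None):
    """Return an unsent urllib Request for images:annotate (hypothetical helper)."""
    # Exactly one credential type must be supplied.
    if (api_key is None) == (oauth_token is None):
        raise ValueError("pass exactly one of api_key or oauth_token")

    url = ENDPOINT
    headers = {"Content-Type": "application/json; charset=utf-8"}
    if api_key is not None:
        # Assumption: the API key travels as the "key" query parameter.
        url += "?" + urllib.parse.urlencode({"key": api_key})
    else:
        # Assumption: the OAuth token travels as a Bearer token header.
        headers["Authorization"] = "Bearer " + oauth_token

    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

To actually send the request, you could pass the returned object to urllib.request.urlopen and decode the JSON body of the reply.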

JSON request format

The body of your POST request is a JSON object with a single requests list, which holds one or more objects of type AnnotateImageRequest:

{
  "requests":[
    {
      "image":{
        "content":"/9j/7QBEUGhvdG9...image contents...eYxxxzj/Coa6Bax//Z"
      },
      "features":[
        {
          "type":"LABEL_DETECTION",
          "maxResults":1
        }
      ]
    }
  ]
}

Every request:

Must contain a requests list.

Within the requests list:

image specifies the image file. It can be sent as a base64-encoded string, a Cloud Storage file location, or a publicly accessible URL. See Providing the image for details.
features lists the types of annotation to perform on the image. You can specify one or more types, as well as the maxResults to return for each.
imageContext (not shown in the example above) specifies hints that help the service with annotation: bounding boxes, languages, and crop-hint aspect ratios.
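The structure described above can be assembled programmatically. The following is a minimal sketch, not an official helper: make_annotate_body is a hypothetical name, the empty-features check is a local sanity check added for this example, and the languageHints value is only an illustration of an imageContext hint.

```python
import json


def make_annotate_body(image, features, image_context=None):
    """Wrap one image and its features in the top-level "requests" list."""
    if not features:
        # Local sanity check for this sketch, not documented API behavior.
        raise ValueError("at least one feature is required")
    request = {
        "image": image,
        "features": [
            {"type": feature_type, "maxResults": max_results}
            for feature_type, max_results in features
        ],
    }
    # imageContext is optional, so only include it when hints are given.
    if image_context is not None:
        request["imageContext"] = image_context
    return {"requests": [request]}


body = make_annotate_body(
    image={"content": "<base64 string goes here>"},  # placeholder, not real data
    features=[("LABEL_DETECTION", 1)],
    image_context={"languageHints": ["en"]},
)
print(json.dumps(body, indent=2))
```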

Providing the image

You can provide the image in your request in one of three ways:

As a base64-encoded image string. If the image is stored locally, you can convert it to a string and pass it as the value of image.content:

{
  "requests":[
    {
      "image":{
        "content":"/9j/7QBEUGhvdG9zaG9...image contents...fXNWzvDEeYxxxzj/Coa6Bax//Z"
      },
      "features":[
        {
          "type":"FACE_DETECTION",
          "maxResults":10
        }
      ]
    }
  ]
}

See Base64-encoding for instructions on encoding on various platforms.
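For example, in Python the conversion might look like the sketch below. The helper names encode_image_bytes and image_from_file are hypothetical; only the standard library base64 module is used.

```python
import base64


def encode_image_bytes(raw):
    """Base64-encode raw image bytes and return an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def image_from_file(path):
    """Read a local image file and build the "image" object for a request."""
    with open(path, "rb") as f:
        return {"content": encode_image_bytes(f.read())}
```

The returned dictionary can be used directly as the value of image in an AnnotateImageRequest. JPEG files begin with the bytes FF D8 FF, which is why the encoded samples above start with /9j/.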

As a Cloud Storage URI. Pass the full URI as the value of image.source.imageUri:

{
  "requests":[
    {
      "image":{
        "source":{
          "imageUri":
            "gs://bucket_name/path_to_image_object"
        }
      },
      "features":[
        {
          "type":"LABEL_DETECTION",
          "maxResults":1
        }
      ]
    }
  ]
}

The file in Cloud Storage must be accessible to the authentication method you're using. If you're using an API key, the file must be publicly accessible. If you're using a service account, the file must be accessible to the user who created the service account.

As a publicly accessible HTTP or HTTPS URL. Pass the URL as the value of image.source.imageUri:

{
  "requests":[
    {
      "image":{
        "source":{
          "imageUri":
            "https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png"
        }
      },
      "features":[
        {
          "type":"LOGO_DETECTION",
          "maxResults":1
        }
      ]
    }
  ]
}

When fetching images from HTTP/HTTPS URLs, Google cannot guarantee that the request will be completed. Your request may fail if the specified host denies the request (e.g. due to request throttling or DoS prevention), or if Google throttles requests to the site for abuse prevention. As a best practice, don't depend on externally-hosted images for production applications.
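Because Cloud Storage URIs and web URLs both go in image.source.imageUri, a single helper can cover the two URI-based options. This is a sketch under that assumption; image_from_uri is a hypothetical name, and the scheme check is a local convenience that does not verify the image is actually reachable.

```python
from urllib.parse import urlparse


def image_from_uri(uri):
    """Build the "image" object for a gs://, http://, or https:// location."""
    scheme = urlparse(uri).scheme
    if scheme not in ("gs", "http", "https"):
        raise ValueError("expected a gs://, http://, or https:// URI: " + uri)
    return {"source": {"imageUri": uri}}


# Placeholder bucket and object names taken from the example above.
print(image_from_uri("gs://bucket_name/path_to_image_object"))
```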

JSON response format

The annotate request receives a JSON response of type AnnotateImageResponse. Although the requests are similar for each feature type, the responses for each feature type can be quite different. Consult the Vision API Reference for complete information.

The JSON below is a sample label detection response for a photo of a dog:

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "description": "dog",
          "score": 0.97346616
        },
        {
          "mid": "/m/09686",
          "description": "vertebrate",
          "score": 0.85700572
        },
        {
          "mid": "/m/01pm38",
          "description": "clumber spaniel",
          "score": 0.84881884
        },
        {
          "mid": "/m/04rky",
          "description": "mammal",
          "score": 0.847575
        },
        {
          "mid": "/m/02wbgd",
          "description": "english cocker spaniel",
          "score": 0.75829375
        }
      ]
    }
  ]
}
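A response shaped like the one above can be reduced to a list of labels with a few lines of Python. The sketch below assumes only the fields visible in the sample (responses, labelAnnotations, description, score); top_labels and min_score are hypothetical names, and the embedded sample is shortened to two labels from the response above.

```python
import json


def top_labels(response_json, min_score=0.0):
    """Return (description, score) pairs at or above min_score, best first."""
    response = json.loads(response_json)
    labels = []
    for result in response.get("responses", []):
        # Other feature types return different fields, so default to empty.
        for annotation in result.get("labelAnnotations", []):
            if annotation["score"] >= min_score:
                labels.append((annotation["description"], annotation["score"]))
    return sorted(labels, key=lambda pair: pair[1], reverse=True)


sample = """
{"responses": [{"labelAnnotations": [
  {"mid": "/m/09686", "description": "vertebrate", "score": 0.85700572},
  {"mid": "/m/0bt9lr", "description": "dog", "score": 0.97346616}
]}]}
"""
print(top_labels(sample, min_score=0.9))
```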

Client libraries

Google provides client libraries in a number of programming languages to simplify the process of building and sending requests, and receiving and parsing responses.

Refer to the Client libraries page for installation and usage instructions.