Vision

The SambaCloud Vision API enables models to process image inputs alongside text.

Please see the Vision capabilities document for additional information.

Endpoint

Creates a model response for a given input that can include both text and image data.

POST https://api.sambanova.ai/v1/chat/completions

Request parameters

The following table lists the parameters for a vision request, with each parameter's type, its description (including the default value, where one applies), and whether it is required.

Parameter	Type	Description	Required
`model`	String	The ID of the selected model to query.	Yes
`messages`	Array of objects	A list of messages forming the conversation. Each message can include both text and image inputs. See the Image Input Format below for details.	Yes
`max_tokens`	Integer	Maximum number of tokens to generate. The total length of input and generated tokens is limited by the model’s context length. Default is 1000.	No
`temperature`	Float	Controls randomness in responses. Value can be between 0 and 1. Default is 0.	No
`top_p`	Float	Adjusts the number of choices for each predicted token based on cumulative probabilities. Value can be between 0 and 1. Default is 0.9.	No
`top_k`	Integer	Limits the number of choices for the next predicted word or token. Value can be between 1 and 100. Default is 50.	No
`stop`	String or Array	Up to 4 sequences where the API will stop generating further tokens. Default is null.	No
`stream`	Boolean	If true, partial message deltas will be sent. Default is false.	No
`stream_options`	Object	Options for streaming response. Only set this when stream: true. Available option: include_usage (boolean). Default is null.	No
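As a minimal sketch of how the table above could be applied on the client side, the following Python helper assembles a request body and checks values against the documented ranges before anything is sent. The function name `build_vision_payload` and its keyword arguments are hypothetical and not part of the API. Treating the ranges as inclusive is also an assumption.

```python
from typing import Any, Dict, List, Optional, Union


def build_vision_payload(
    model: str,
    messages: List[Dict[str, Any]],
    max_tokens: int = 1000,
    temperature: float = 0.0,
    top_p: float = 0.9,
    top_k: int = 50,
    stop: Optional[Union[str, List[str]]] = None,
    stream: bool = False,
    include_usage: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate values against the documented ranges
    and return a request body. Defaults mirror the parameter table."""
    # Assumption: the documented ranges are inclusive at both ends.
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")
    if include_usage is not None and not stream:
        raise ValueError("stream_options should only be set when stream is true")

    payload: Dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "stream": stream,
    }
    # Optional fields are included only when the caller sets them.
    if stop is not None:
        payload["stop"] = stop
    if include_usage is not None:
        payload["stream_options"] = {"include_usage": include_usage}
    return payload
```

Validating locally is a design choice rather than a requirement. The API presumably performs its own checks, and a client-side check simply surfaces mistakes before a network round trip.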

Messages format for image input

Single image per request - Each request supports only one image input. For multiple images, send separate requests.
Encoding requirements - Ensure the image is base64-encoded and within size limits. Invalid encoding will result in errors. View more information on our API Error page.

Parameter	Type	Description	Required
`type`	String	Indicates the type of content. For images, set this to image_url.	Yes
`image_url.url`	String	The base64-encoded image string. Must follow the format: data:<image_format>;base64,<data>.	Yes
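The data URL format above can be produced with Python's standard library. This is a sketch under stated assumptions: the helper names `image_to_data_url` and `build_user_message` are hypothetical, and the image format is guessed from the file name rather than from the image contents. Size limits are not checked because this section does not state them.

```python
import base64
import mimetypes
from typing import Any, Dict


def image_to_data_url(image_bytes: bytes, filename: str) -> str:
    """Hypothetical helper: return data:<image_format>;base64,<data>."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot determine an image type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_user_message(prompt: str, data_url: str) -> Dict[str, Any]:
    """Hypothetical helper: one user message with a text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

In practice the bytes would come from reading a file in binary mode, for example `open(path, "rb").read()`, and the resulting message would be the single entry in `messages`.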

Example request

{
  "model": "Llama-4-Maverick-17B-128E-Instruct",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is happening in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,<base64_encoded_image>"
          }
        }
      ]
    }
  ],
  "max_tokens": 300,
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 50
}
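A request body like the one above could be sent with Python's standard library as sketched below. The `Authorization: Bearer` header is an assumption, since this section does not describe authentication. The function names `make_request` and `send` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "https://api.sambanova.ai/v1/chat/completions"


def make_request(payload: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Hypothetical helper: wrap a request body in a POST request object."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication (not stated in this section).
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> Dict[str, Any]:
    """Perform the HTTP call and decode the JSON response body.

    urlopen raises urllib.error.HTTPError on non-2xx status codes.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the network call in one place, which makes the payload and headers easy to inspect before anything leaves the machine.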

Response

The API returns a chat completion object containing the model’s response to the provided input.

In this sample, the input image was a nature scene. Your response will reflect the image you provide.

Sample response

{
  "id": "chatcmpl-456",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "Llama-4-Maverick-17B-128E-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This image shows a sunset over a mountain range with a lake in the foreground. The scene is serene and filled with vibrant colors."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 32,
    "total_tokens": 82
  }
}
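Reading the model's reply out of a completion object shaped like the sample above might look like the following sketch. The helper name `extract_reply` is hypothetical, and the code assumes a non-streaming response with the fields shown in the sample.

```python
from typing import Any, Dict, Optional, Tuple


def extract_reply(completion: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: return (content, finish_reason) of the first choice."""
    choices = completion.get("choices") or []
    if not choices:
        raise ValueError("completion contains no choices")
    first = choices[0]
    return first["message"]["content"], first.get("finish_reason")


# Trimmed copy of the sample response above.
sample = {
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "This image shows a sunset."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 50, "completion_tokens": 32, "total_tokens": 82},
}

content, finish_reason = extract_reply(sample)
usage = sample["usage"]
print(content)
print(finish_reason, usage["total_tokens"])
```

In the sample, `total_tokens` equals `prompt_tokens` plus `completion_tokens` (50 + 32 = 82), which is useful when tracking consumption against the model's context length.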
