Audio reasoning

The audio reasoning endpoint analyzes audio input, optionally guided by text instructions.

Endpoint

POST https://api.sambanova.ai/v1/audio/reasoning

Request parameters

The following table lists the parameters for an audio reasoning request, with each parameter's type, description, and default value.

Parameter	Type	Description	Default
`model`	String	The ID of the model to use. Only Qwen2-Audio-7B-Instruct is currently available.	Required
`messages`	Message	A list of messages containing role (user/system/assistant), type (text/audio_content), and audio_content (base64 audio content).	Required
`response_format`	String	The output format, either "json" or "text".	`json`
`temperature`	Float	Sampling temperature between 0 and 1. Higher values (e.g., 0.8) increase randomness, while lower values (e.g., 0.2) make output more focused.	`0`
`max_tokens`	Integer	The maximum number of tokens to generate.	`1000`
`file`	File	Audio file in flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm format. Each single file must not exceed 30 seconds in duration.	Required
`stream`	Boolean	Enables streaming responses.	`false`
`stream_options`	Object	Additional streaming configuration (e.g., {"include_usage": true}).	Optional
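
The table states that each audio file must not exceed 30 seconds. As a minimal sketch of a client-side check before uploading, the following uses Python's standard `wave` module. It handles WAV input only, and the helper names (`wav_duration_seconds`, `within_limit`, `make_silent_wav`) are hypothetical, not part of the API. Other formats such as mp3 or ogg would need a different decoder.

```python
import io
import wave

MAX_SECONDS = 30.0  # per-file duration limit from the parameter table


def wav_duration_seconds(source):
    """Return the duration in seconds of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def within_limit(source, limit=MAX_SECONDS):
    """Return True if the WAV clip is no longer than `limit` seconds."""
    return wav_duration_seconds(source) <= limit


def make_silent_wav(seconds, rate=8000):
    """Build an in-memory mono 16-bit silent WAV, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buf.seek(0)
    return buf


short_ok = within_limit(make_silent_wav(2))      # 2-second clip
too_long_ok = within_limit(make_silent_wav(31))  # 31-second clip
```

In practice, `within_limit("clip.wav")` could be called on a file path before encoding the audio and sending the request.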

Request format

This section provides examples of how to send a request using different methods.

CURL

curl --location 'https://api.sambanova.ai/v1/audio/reasoning' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "messages": [
        {"role": "system", "content": "you are a helpful assistant"},
        {"role": "user", "content":[
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": "data:audio/mp3;base64,<base64_audio>"
                    }
                }
            ]
        },
        {"role": "user", "content": "what is the audio about"}
    ],
    "max_tokens": 1024,
    "model": "Qwen2-Audio-7B-Instruct",
    "temperature": 0.01,
    "stream": true
}'

Python

import requests
import base64

def analyze_audio(audio_file_path, api_key):
    with open(audio_file_path, "rb") as audio_file:
        base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')
    
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    
    data = {
        "messages": [
            {"role": "system", "content": "you are a helpful assistant"},
            {"role": "user", "content": [
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": f"data:audio/mp3;base64,{base64_audio}"
                    }
                }
            ]},
            {"role": "user", "content": "what is the audio about"}
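        ],
        "model": "Qwen2-Audio-7B-Instruct",
        "max_tokens": 1024,
        "temperature": 0.01,
        "stream": False  # response.json() below expects a single, non-streamed JSON body
    }
    
    response = requests.post(
        "https://api.sambanova.ai/v1/audio/reasoning",
        headers=headers,
        json=data,
        timeout=60
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    
    return response.json()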

Response format

The API returns the response in the format selected with `response_format` (`json` by default). A JSON response looks like this:

{
    "choices": [{
        "delta": {
            "content": "The sound is that of ",
            "role": "assistant"
        },
        "finish_reason": null,
        "index": 0,
        "logprobs": null
    }],
    "created": 1732317298,
    "id": "211b9a22-58cf-4b90-94e9-1fed8d0d9d0a",
    "model": "Qwen2-Audio-7B-Instruct"
}

Streaming responses

When streaming is enabled, the API returns a series of data chunks in the following format:

data: {"choices":[{"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}

data: {"choices":[{"delta":{"content":"The sound is that of ","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}
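
To illustrate how a client might consume these chunks, here is a minimal sketch that strips the `data:` prefix from each line, parses the JSON, and concatenates the `delta.content` fragments. The function name `collect_stream_text` is hypothetical, and the `[DONE]` end-of-stream sentinel is an assumption borrowed from OpenAI-style streams rather than something this page confirms. The sample lines are abbreviated versions of the chunks shown above.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content from an iterable of streamed text lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumption: OpenAI-style end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # content may be missing or empty in some chunks
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample = [
    'data: {"choices":[{"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0}]}',
    "",
    'data: {"choices":[{"delta":{"content":"The sound is that of ","role":"assistant"},"finish_reason":null,"index":0}]}',
]
text = collect_stream_text(sample)
```

With the `requests` library, the lines could come from `requests.post(..., json=data, stream=True).iter_lines(decode_unicode=True)`, with `"stream": true` set in the request body.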
