Enables advanced audio analysis with optional text instructions.

Endpoint

POST https://api.sambanova.ai/v1/audio/reasoning

Request parameters

The following table lists the request parameters with their types, descriptions, and default values.

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| model | String | The ID of the model to use. Only Qwen2-Audio-7B-Instruct is currently available. | Required |
| messages | Message | A list of messages containing role (user/system/assistant), type (text/audio_content), and audio_content (base64 audio content). | Required |
| response_format | String | The output format, either "json" or "text". | json |
| temperature | Float | Sampling temperature between 0 and 1. Higher values (e.g., 0.8) increase randomness, while lower values (e.g., 0.2) make output more focused. | 0 |
| max_tokens | Integer | The maximum number of tokens to generate. | 1000 |
| file | File | Audio file in flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm format. Each file must not exceed 30 seconds in duration. | Required |
| stream | Boolean | Enables streaming responses. | false |
| stream_options | Object | Additional streaming configuration (e.g., {"include_usage": true}). | Optional |
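
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_payload is hypothetical, and it only applies the defaults and ranges listed above.

```python
def build_payload(messages, model="Qwen2-Audio-7B-Instruct",
                  response_format="json", temperature=0.0,
                  max_tokens=1000, stream=False):
    """Hypothetical helper: build a request body using the documented defaults."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if response_format not in ("json", "text"):
        raise ValueError('response_format must be "json" or "text"')
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "model": model,
        "messages": messages,
        "response_format": response_format,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }
```

Calling build_payload with only a messages list yields a body that uses the defaults from the table; an out-of-range temperature raises ValueError before any network call is made.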

Request format

This section provides examples of how to send a request using different methods.

CURL

curl --location 'https://api.sambanova.ai/v1/audio/reasoning' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "messages": [
        {"role": "assistant", "content": "you are a helpful assistant"},  
        {"role": "user", "content":[
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": "data:audio/mp3;base64,<base64_audio>"
                    }
                }
            ]
        },
        {"role": "user", "content": "what is the audio about"}
    ],   
    "max_tokens": 1024,
    "model": "Qwen2-Audio-7B-Instruct",
    "temperature": 0.01,
    "stream": true
}'

Python

import requests
import base64

def analyze_audio(audio_file_path, api_key):
    with open(audio_file_path, "rb") as audio_file:
        base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')
    
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    
    data = {
        "messages": [
            {"role": "assistant", "content": "you are a helpful assistant"},
            {"role": "user", "content": [
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": f"data:audio/mp3;base64,{base64_audio}"
                    }
                }
            ]},
            {"role": "user", "content": "what is the audio about"}
        ],
        "model": "Qwen2-Audio-7B-Instruct",
        "max_tokens": 1024,
        "temperature": 0.01,
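        "stream": False  # Optional. With True the body is an event stream and response.json() would fail.
    }
    
    response = requests.post(
        "https://api.sambanova.ai/v1/audio/reasoning",
        headers=headers,
        json=data,
        timeout=60  # avoid hanging indefinitely on a stalled connection
    )
    # Raise an exception on HTTP errors instead of parsing an error body.
    response.raise_for_status()
    
    return response.json()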
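
The 30-second limit from the parameters table can be checked locally before calling analyze_audio. The sketch below is an illustration only: it handles WAV files through Python's standard wave module, and the names MAX_AUDIO_SECONDS, wav_duration_seconds, and is_within_limit are hypothetical. Other formats such as mp3 or ogg would need a third-party decoder.

```python
import wave

MAX_AUDIO_SECONDS = 30.0  # limit stated in the parameters table


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_within_limit(path, limit=MAX_AUDIO_SECONDS):
    """Return True if the WAV file is no longer than the given limit."""
    return wav_duration_seconds(path) <= limit
```

A caller could skip or split any file for which is_within_limit returns False instead of sending it.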

Response format

The API returns a response in the format selected by response_format (json by default). For example:

{
    "choices": [{
        "delta": {
            "content": "The sound is that of ",
            "role": "assistant"
        },
        "finish_reason": null,
        "index": 0,
        "logprobs": null
    }],
    "created": 1732317298,
    "id": "211b9a22-58cf-4b90-94e9-1fed8d0d9d0a",
    "model": "Qwen2-Audio-7B-Instruct"
}
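
The generated text sits under choices in the response body. The sketch below shows one way to pull it out of a parsed response. The function name extract_text is hypothetical, and the fallback to a "message" key is an assumption borrowed from similar chat-style APIs; only the "delta" key appears in the example above.

```python
def extract_text(response_body):
    """Concatenate the content of every choice in a parsed response."""
    parts = []
    for choice in response_body.get("choices", []):
        # "delta" matches the documented example; "message" is an assumed fallback.
        fragment = choice.get("delta") or choice.get("message") or {}
        parts.append(fragment.get("content") or "")
    return "".join(parts)
```

For the example response, extract_text returns the string "The sound is that of ".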

Streaming responses

When streaming is enabled, the API returns a series of data chunks in the following format:

data: {"choices":[{"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}

data: {"choices":[{"delta":{"content":"The sound is that of ","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}
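
Each chunk arrives as a line that starts with "data: " followed by a JSON object, so the full answer can be rebuilt by concatenating the delta content fields. The following is a minimal sketch under stated assumptions: the function name collect_stream_text is hypothetical, and the "[DONE]" end marker is an assumption taken from similar streaming APIs, not something shown in this section.

```python
import json


def collect_stream_text(lines):
    """Join the delta content of all streamed chunks into one string."""
    parts = []
    for raw in lines:
        # HTTP clients often yield bytes; decode them before parsing.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

With the requests library, the lines can come from a call such as requests.post(url, headers=headers, json=data, stream=True) followed by collect_stream_text(response.iter_lines()), where data sets "stream" to True.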