Endpoints
Audio reasoning
Enables advanced audio analysis with optional text instructions.
Endpoint
Request parameters
The following table outlines the parameters required to make a audio request, parameter type, description, and default values.
Parameter | Type | Description | Default |
---|---|---|---|
model | String | The ID of the model to use. Only Qwen2-Audio-7B-Instruct is currently available. | Required |
messages | Message | A list of messages containing role (user/system/assistant), type (text/audio_content), and audio_content (base64 audio content). | Required |
response_format | String | The output format, either “json” or “text”. | json |
temperature | Integer | Sampling temperature between 0 and 1. Higher values (e.g., 0.8) increase randomness, while lower values (e.g., 0.2) make output more focused. | 0 |
max_tokens | Integer | The maximum number of tokens to generate. | 1000 |
file | File | Audio file in flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm format. Each single file must not exceed 30 seconds in duration. | Required |
stream | Boolean | Enables streaming responses. | false |
stream_options | Object | Additional streaming configuration (e.g., {“include_usage”: true}). | Optional |
Request format
This section provides examples of how to send a request using different methods.
CURL
Python
Response format
The API returns a response in the selected format.
Streaming responses
When streaming is enabled, the API returns a series of data chunks in the following format: