Endpoints
Translation
Translates audio content to a specified language.
Endpoint
Request parameters
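The following tables list the parameters of an audio translation request for each model, with each parameter's type, description, and default value. Where a parameter has no default, the Default column states whether it is required or optional.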
For improved accuracy, we strongly recommend specifying the language parameter when using any audio model.
Whisper Large v3
Parameter | Type | Description | Default |
---|---|---|---|
model | String | The ID of the model to use. | Required |
file | File | Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. File size limit is 25MB. | Required |
prompt | String | Prompt provided to influence translation style or vocabulary. Example: “Please translate carefully, including pauses and hesitations.” | Optional |
response_format | String | The output format is either json or text. | json |
language | String | The language of the input audio. Supplying it in ISO 639-1 format (e.g., en) improves accuracy and latency. | Optional |
stream | Boolean | Enables streaming responses. | false |
stream_options | Object | Additional streaming configuration (e.g., {"include_usage": true}). | Optional |
Qwen2-Audio-7B-Instruct
Parameter | Type | Description | Default |
---|---|---|---|
model | String | The ID of the model to use. | Required |
response_format | String | The output format is either json or text. | json |
temperature | Number | Sampling temperature between 0 and 1. Higher values (e.g., 0.8) increase randomness, while lower values (e.g., 0.2) make output more focused. | 0 |
max_tokens | Number | The maximum number of tokens to generate. | 1000 |
file | File | Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. Each file must not exceed 30 seconds in duration. | Required |
language | String | The target language for transcription or translation. | Optional |
stream | Boolean | Enables streaming responses. | false |
stream_options | Object | Additional streaming configuration (e.g., {"include_usage": true}). | Optional |
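
The constraints in the tables above can be checked on the client before a file is uploaded. The following is a minimal sketch, not part of the API: the function name check_translation_params is hypothetical, the file type is inferred from the file extension only, and "25MB" is assumed to mean 25 × 1024 × 1024 bytes. The 30-second duration limit is not checked, since that would require decoding the audio.

```python
from pathlib import Path

# Values taken from the parameter tables above.
ALLOWED_EXTENSIONS = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}
RESPONSE_FORMATS = {"json", "text"}
# Assumption: "25MB" is read as 25 * 1024 * 1024 bytes.
WHISPER_MAX_BYTES = 25 * 1024 * 1024


def check_translation_params(filename, size_bytes, response_format="json",
                             temperature=None, size_limit=None):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    # Assumption: the extension reflects the actual audio format.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {filename}")
    # Pass size_limit=WHISPER_MAX_BYTES for the model with a size limit.
    if size_limit is not None and size_bytes > size_limit:
        problems.append(f"file is {size_bytes} bytes; limit is {size_limit}")
    if response_format not in RESPONSE_FORMATS:
        problems.append("response_format must be json or text")
    # temperature is only listed for the model that accepts it.
    if temperature is not None and not 0 <= temperature <= 1:
        problems.append("temperature must be between 0 and 1")
    return problems
```

For example, check_translation_params("talk.mp3", 1024, size_limit=WHISPER_MAX_BYTES) returns an empty list, while a .txt file or a temperature of 1.5 each adds one entry to the result.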
Request format
This section provides examples of how to send a request using different methods.
CURL
Python
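As a rough illustration, a request can be assembled with the Python standard library alone. This sketch rests on several assumptions that the text here does not confirm: the URL is a placeholder (substitute the one from the Endpoint section), the body is sent as multipart/form-data, and authentication uses a Bearer token. The function name build_translation_request is hypothetical, and the model ID is supplied by the caller.

```python
import urllib.request
import uuid

# Hypothetical placeholder URL; replace it with the real endpoint.
API_URL = "https://api.example.com/v1/audio/translations"


def build_translation_request(api_key, model, filename, audio_bytes, **fields):
    """Build (but do not send) a multipart/form-data POST request.

    Extra keyword arguments (e.g. language="en") become text form fields.
    """
    boundary = uuid.uuid4().hex
    text_fields = {"model": model, **fields}
    parts = []
    for name, value in text_fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The audio goes in the "file" field named in the parameter tables.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to urllib.request.urlopen to send it. Field values are written as plain strings, so a boolean such as stream would need to be given in whatever textual form the API expects.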
Response format
The API returns a translation of the input audio in the selected format.
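Because the body depends on response_format, client code typically branches on it. The sketch below is illustrative only: the function name parse_translation_response is hypothetical, and the JSON schema is not shown here, so the "text" key is an assumption to verify against a real response.

```python
import json


def parse_translation_response(body, response_format="json"):
    """Hypothetical helper: extract the translation from a raw response body."""
    decoded = body.decode("utf-8")
    if response_format == "text":
        # Plain-text responses are returned as-is, minus surrounding whitespace.
        return decoded.strip()
    payload = json.loads(decoded)
    # Assumption: the JSON object carries the translation under "text".
    return payload["text"]
```

If the real payload uses a different key, a KeyError will surface the mismatch immediately rather than returning an empty result.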