Endpoints
Translation
Translates audio content to a specified language.
Endpoint
Request parameters
The following tables list the parameters for an audio translation request, along with each parameter's type, description, and default value.
Whisper Large v3
Parameter | Type | Description | Default |
---|---|---|---|
model | String | The ID of the model to use. | Required |
prompt | Message | Prompt provided to influence transcription style or vocabulary. Example: “Please transcribe carefully, including pauses and hesitations.” | Optional |
temperature | Number | Sampling temperature between 0 and 1. Higher values increase randomness; lower values produce more focused output. | 0 |
file | File | Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. File size limit is 25MB. | Required |
response_format | String | Output format: json or text. | json |
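The constraints in the table above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name `check_whisper_request` is hypothetical, the format check relies on the file extension only, and "25MB" is assumed to mean 25 MiB.

```python
from pathlib import Path

# Formats and size limit taken from the Whisper Large v3 table.
ALLOWED_EXTENSIONS = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}
MAX_FILE_BYTES = 25 * 1024 * 1024  # assumption: "25MB" is read as 25 MiB


def check_whisper_request(filename: str, size_bytes: int,
                          temperature: float = 0.0,
                          response_format: str = "json") -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        # Assumption: the extension reflects the actual audio container.
        problems.append(f"unsupported file extension: {suffix!r}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 25 MB limit")
    if not 0 <= temperature <= 1:
        problems.append("temperature must be between 0 and 1")
    if response_format not in {"json", "text"}:
        problems.append("response_format must be 'json' or 'text'")
    return problems
```

Calling `check_whisper_request("talk.mp3", 1024)` returns an empty list, so the request can proceed; any returned strings describe what to fix first.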
Qwen2-Audio-7B-Instruct
Parameter | Type | Description | Default |
---|---|---|---|
model | String | The ID of the model to use. | Required |
response_format | String | The output format is either json or text. | json |
temperature | Number | Sampling temperature between 0 and 1. Higher values (e.g., 0.8) increase randomness, while lower values (e.g., 0.2) make output more focused. | 0 |
max_tokens | Number | The maximum number of tokens to generate. | 1000 |
file | File | Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. Each file must not exceed 30 seconds in duration. | Required |
language | String | The target language for transcription or translation. | Optional |
stream | Boolean | Enables streaming responses. | false |
stream_options | Object | Additional streaming configuration (e.g., {"include_usage": true}). | Optional |
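As an illustration of how the Qwen2-Audio-7B-Instruct parameters fit together, the sketch below assembles the non-file fields with the defaults from the table. The helper name `build_qwen_audio_fields` is hypothetical, and the encoding is an assumption: booleans are sent as lowercase strings and `stream_options` as a JSON string, which the API may or may not expect.

```python
import json


def build_qwen_audio_fields(model, language=None, temperature=0,
                            max_tokens=1000, response_format="json",
                            stream=False, include_usage=False):
    """Build the text form fields for a request (the audio file is sent separately)."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if response_format not in {"json", "text"}:
        raise ValueError("response_format must be 'json' or 'text'")

    fields = {
        "model": model,  # required; the exact model ID is not given in this section
        "response_format": response_format,
        "temperature": str(temperature),
        "max_tokens": str(max_tokens),
        "stream": "true" if stream else "false",  # assumed string encoding
    }
    if language is not None:
        fields["language"] = language
    if stream and include_usage:
        # Assumption: the object is serialized as JSON in a form field.
        fields["stream_options"] = json.dumps({"include_usage": True})
    return fields
```

Optional parameters are omitted unless set, so the server-side defaults apply to anything the caller does not specify.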
Request format
This section provides examples of how to send a request using different methods.
CURL
Python
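A minimal sketch of a request using only the Python standard library. The endpoint URL, the Bearer authorization header, and the model ID are placeholders and assumptions, since this section does not state them; replace them with the values for your account. The audio is sent as a multipart form part named `file`, matching the parameter tables.

```python
import uuid
import urllib.request

# Hypothetical placeholders: the endpoint URL and auth scheme are not given here.
API_URL = "https://api.example.com/v1/audio/translations"
API_KEY = "YOUR_API_KEY"


def encode_multipart(fields: dict, file_name: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file part named 'file' as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(fields: dict, file_name: str, file_bytes: bytes) -> urllib.request.Request:
    """Create the POST request without sending it."""
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


def translate(path: str, model: str, response_format: str = "json") -> str:
    """Upload the audio file at `path` and return the raw response body."""
    with open(path, "rb") as audio:
        request = build_request(
            {"model": model, "response_format": response_format},
            path,
            audio.read(),
        )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, `translate("audio.mp3", "YOUR_MODEL_ID")` would upload the file and return the response body as a string, provided the placeholders above are replaced with real values.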
Response format
The API returns a translation of the input audio in the selected format.
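Because the body depends on the requested `response_format`, client code typically branches on it. The sketch below is illustrative only: the helper name `parse_translation_response` is hypothetical, and it deliberately returns the parsed JSON object as-is, since the response schema is not documented in this section.

```python
import json


def parse_translation_response(body: bytes, response_format: str = "json"):
    """Decode a response body according to the format that was requested."""
    text = body.decode("utf-8")
    if response_format == "json":
        # The field names inside the object are not specified here,
        # so the whole parsed object is returned to the caller.
        return json.loads(text)
    if response_format == "text":
        return text
    raise ValueError("response_format must be 'json' or 'text'")
```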