Transcribes audio into the input language.

JavaScript

import fs from 'fs';
import SambaNova from 'sambanova';

const client = new SambaNova({
  apiKey: process.env['SAMBANOVA_API_KEY'], // This is the default and can be omitted
});

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('path/to/file'),
  model: 'Whisper-Large-v3',
});

console.log(transcription);

{
  "text": "Es un efecto de sonido de una campana sonando, específicamente una campana de iglesia."
}

POST /audio/transcriptions

Authorizations

Authorization · string · header · required

SambaNova API Key
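
As a minimal sketch of the header described above, the helper below builds the `Authorization` header for a raw HTTP request. The `Bearer` scheme and the helper name `buildAuthHeaders` are assumptions that this page does not state. The SDK example above takes the key through `apiKey` instead.

```javascript
// Hypothetical helper: builds request headers carrying the SambaNova API key.
// Assumption: the key is sent with the Bearer scheme.
function buildAuthHeaders(apiKey) {
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new Error('A SambaNova API key is required');
  }
  return { Authorization: `Bearer ${apiKey}` };
}

// Usage: read the key from the environment, as the SDK example does.
// const headers = buildAuthHeaders(process.env['SAMBANOVA_API_KEY']);
```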

Body

multipart/form-data

Audio to transcribe and parameters

Transcription request object

model · required

The model ID to use. See available models.

file · required

The audio file to transcribe or translate, in one of these formats: FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM. The file size limit is 25 MB.
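
The format and size constraints above can be checked before uploading. The following is a rough sketch only: `checkAudioFile` is a hypothetical helper, the file extension is used as a stand-in for the actual audio format, and the 25 MB limit is assumed to mean 25 * 1024 * 1024 bytes.

```javascript
// Formats listed for the `file` parameter, as lowercase file extensions.
const SUPPORTED_EXTENSIONS = ['flac', 'mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'ogg', 'wav', 'webm'];

// Assumption: "25MB" is read as 25 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 25 * 1024 * 1024;

// Hypothetical pre-upload check; returns { ok, reason }.
function checkAudioFile(fileName, sizeBytes) {
  const dot = fileName.lastIndexOf('.');
  const extension = dot === -1 ? '' : fileName.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    return { ok: false, reason: `unsupported format: "${extension}"` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: 'file exceeds the 25 MB limit' };
  }
  return { ok: true, reason: null };
}
```

A caller could run this against `fs.statSync(path).size` and skip the request when `ok` is false.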

prompt · string | null

Optional text prompt to influence the style or vocabulary of the transcription or translation. Example: “Please transcribe carefully, including pauses and hesitations.”

language · enum<string> | null

Optional language of the input audio. Supplying the input language in ISO-639-1 format (e.g. en) improves accuracy and latency.

Available options:

en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, hi, fi, vi, he, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su, yue
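
To illustrate how the required and optional fields fit into the multipart/form-data body, here is a minimal sketch that assembles the form without sending it. `buildTranscriptionForm` and the placeholder audio bytes are hypothetical. It relies on the global `FormData` and `Blob` available in Node 18+ and in browsers.

```javascript
// Hypothetical helper: assembles the multipart/form-data body.
// Relies on the global FormData and Blob (Node 18+ and browsers).
function buildTranscriptionForm({ file, fileName, model, language, prompt }) {
  const form = new FormData();
  form.append('file', file, fileName);
  form.append('model', model);
  // Optional fields are only sent when provided.
  if (language) form.append('language', language);
  if (prompt) form.append('prompt', prompt);
  return form;
}

const form = buildTranscriptionForm({
  file: new Blob(['placeholder bytes'], { type: 'audio/wav' }), // not real audio
  fileName: 'sample.wav',
  model: 'Whisper-Large-v3',
  language: 'es',
});
console.log(form.get('language')); // prints "es"
```

Leaving `language` unset omits the field entirely, so the service falls back to its own handling of the input language.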

response_format · enum<string> · default: json

Output format: JSON or text.

Available options: json, text
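
The two output formats call for different handling on the client. The sketch below assumes that `json` returns an object with a `text` field, as in the sample response, and that `text` returns the bare transcript as the response body. The second point is an assumption this page does not confirm, and `extractTranscript` is a hypothetical helper.

```javascript
// Hypothetical helper: extracts the transcript from a raw response body.
// Assumption: `text` returns the bare transcript, `json` returns {"text": "..."}.
function extractTranscript(body, responseFormat = 'json') {
  if (responseFormat === 'text') {
    return body.trim();
  }
  if (responseFormat === 'json') {
    const parsed = JSON.parse(body);
    if (typeof parsed.text !== 'string') {
      throw new Error('Response is missing the required "text" field');
    }
    return parsed.text;
  }
  throw new Error(`Unsupported response_format: ${responseFormat}`);
}

console.log(extractTranscript('{"text": "hola"}')); // prints "hola"
console.log(extractTranscript('hola\n', 'text'));   // prints "hola"
```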

stream · boolean · default: false

Enables streaming responses.

stream_options

stream options · object

Optional settings that apply when stream is true.

Response

Successful Response

Transcription Response
Transcription Stream Response

Transcription response JSON object

text · string · required

The transcribed text of the audio file.
