Vapi is a developer platform for building voice AI agents. It handles the underlying infrastructure so developers can focus on creating high-quality voice experiences. Voice agents built with Vapi can engage in natural conversations with users, make and receive phone calls, integrate seamlessly with existing systems and APIs, and support complex workflows such as appointment scheduling, customer support, and other advanced use cases. SambaNova’s high-speed inference enables low-latency voice interactions, which is critical for natural-sounding conversations.Documentation Index
Fetch the complete documentation index at: https://sambanova-systems.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
Prerequisites
Before starting, ensure you have:- A SambaCloud account and API key.
- A Vapi account with dashboard access.
- Python 3.11 or later.
- ngrok installed for exposing your local server.
Installation and setup
Clone the repository
Create a virtual environment
Install dependencies
Install ngrok
On macOS:Set environment variables
Create a.env file or export your API key directly:
Run the local LLM server
Start the Flask server:http://localhost:5000/.
Expose the server using ngrok
In a separate terminal, run:Test the endpoint
Verify your setup with a cURL request:Replace
https://abcd-1234.ngrok-free.dev with your actual ngrok URL.Configure Vapi with your custom LLM
- Log in to the Vapi Dashboard.
- Create a new assistant using a Blank Template.
- Navigate to Model → Provider → Custom LLM.
-
Enter the model name you’ll use (for example,
gpt-oss-120b). -
Paste your ngrok URL into the endpoint URL field:
- Save the configuration.

