

This tutorial walks you through building low-latency conversational AI agents using ElevenLabs and SambaCloud's high-speed LLM inference engine. Low latency is crucial for smooth voice conversations, and SambaNova delivers it with purpose-built hardware that serves open-source models at world-class inference speeds.

Prerequisites

Before starting, ensure you have:

  1. An ElevenLabs account with access to the Agents feature.
  2. A SambaCloud account, which you will use to generate an API key.

Setup

Follow these steps to set up your AI agent.

Access the agent in ElevenLabs

  1. Go to the Agents page on ElevenLabs.
  2. Create a new agent, or select an existing agent to edit.
ElevenLabs Agents page

Configure the LLM settings

  1. Scroll to the LLM section of your agent settings.
  2. Select Custom LLM from the dropdown menu.
ElevenLabs LLM settings with Custom LLM selected

Retrieve SambaNova endpoint and model

  1. Open the SambaCloud Playground.
  2. Select View Code in the top-right to get your model endpoint URL and model name.
SambaCloud Playground View Code panel
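You can sanity-check the endpoint and model outside ElevenLabs before wiring them in. The sketch below (Python, standard library only) assumes SambaNova's OpenAI-compatible base URL `https://api.sambanova.ai/v1` and uses a placeholder model name; substitute the exact values shown in your own View Code panel.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.sambanova.ai/v1"   # OpenAI-compatible endpoint from View Code
MODEL = "Meta-Llama-3.1-8B-Instruct"       # placeholder; use your View Code model name


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }


def send(prompt: str) -> str:
    """POST the payload; requires SAMBANOVA_API_KEY in the environment."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['SAMBANOVA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__" and "SAMBANOVA_API_KEY" in os.environ:
    print(send("Say hello in one short sentence."))
```

If the call returns a reply, the same base URL and model name are what you will paste into the Custom LLM fields in ElevenLabs.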

Generate your SambaNova API key

  1. Go to your SambaCloud account.
  2. Generate an API key from the portal.
SambaCloud API key generation page

Add API key to ElevenLabs

  1. Return to the ElevenLabs agent settings page.
  2. Under Workspace Secrets, add a new secret:
     - Name: SAMBANOVA_API_KEY
     - Value: the API key you generated in the previous step.
ElevenLabs Workspace Secrets with SAMBANOVA_API_KEY added
This enables ElevenLabs to access your SambaNova model.
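Behind the scenes, requests to an OpenAI-compatible custom LLM typically carry the secret as a bearer token. A minimal sketch of the headers involved, assuming the SAMBANOVA_API_KEY secret configured above:

```python
def auth_headers(api_key: str) -> dict:
    """Headers for an OpenAI-compatible custom LLM call.

    The bearer token is the value stored under the SAMBANOVA_API_KEY
    workspace secret; ElevenLabs supplies it on your agent's behalf.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in Workspace Secrets means it never appears in your agent configuration in plain text.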

Set token limit

In the Limit token usage section, set the maximum tokens to 1024. This caps response length, keeping replies short enough for natural conversational flow.
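In OpenAI-style request terms, this setting corresponds to the `max_tokens` field on each completion request. A hedged sketch (the helper name is illustrative, not part of either API):

```python
MAX_TOKENS = 1024  # mirrors the "Limit token usage" setting in the agent


def build_payload(messages: list[dict], model: str) -> dict:
    """OpenAI-style chat payload with a response-length cap.

    Bounding max_tokens bounds the reply length (and so the
    time-to-last-token), which keeps spoken answers conversational.
    """
    return {
        "model": model,
        "messages": messages,
        "max_tokens": MAX_TOKENS,
    }
```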

Save and test

  1. Select Save to apply changes.
  2. Test your setup by selecting Test AI agent followed by Call AI agent.
  3. See the video walkthrough below for details.

Video walkthrough