In this guide, you’ll learn how to set up and use Llama Stack, a standardized framework that simplifies AI application development. It walks you through building the SambaNova distribution server, installing the client, and running your first model inference. Whether you’re prototyping or scaling up, this guide will help you get started quickly, with best practices from the Llama ecosystem integrated into a modular, efficient architecture.

Documentation Index
Fetch the complete documentation index at: https://sambanova-systems.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
Components of Llama Stack
Llama Stack includes two main components:
- Server – A running Llama Stack distribution that hosts various adapters.
- Client – A consumer of the server’s API that interacts with the hosted adapters.
Get your SambaCloud API key
- Create a SambaCloud account.
- Navigate to the API key section.
- Generate a new key (if you don’t already have one).
- Copy and store the key securely.
Build the SambaNova Llama Stack server
- Set up a Python virtual environment
- Install required dependencies
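The two steps above can be sketched as follows. The package name `llama-stack` is an assumption based on the upstream Llama Stack tooling; check the current docs for the exact name and supported Python versions:

```shell
# Create and activate an isolated environment for the server build
python3 -m venv llama-stack-env
. llama-stack-env/bin/activate

# Install the Llama Stack tooling (package name assumed: llama-stack)
pip install --upgrade pip
pip install llama-stack
```

Activating the environment keeps the server's dependencies separate from any other Python projects on the machine.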
Run the SambaNova distribution server
- Export required environment variables
- Run the server with Docker
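Putting the two steps above together, a typical invocation looks like the sketch below. The image name `llamastack/distribution-sambanova`, the port `8321`, and the `SAMBANOVA_API_KEY` variable name are assumptions; substitute the values given in the current SambaNova distribution documentation:

```shell
# Replace with the key from your SambaCloud account (placeholder value)
export SAMBANOVA_API_KEY="your-api-key"
export LLAMA_STACK_PORT=8321  # assumed default port

# Image name is an assumption; check the distribution's docs
docker run -it \
  -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  -e SAMBANOVA_API_KEY=$SAMBANOVA_API_KEY \
  llamastack/distribution-sambanova \
  --port $LLAMA_STACK_PORT
```

Passing the API key as an environment variable keeps it out of the image and out of your shell history once the placeholder is replaced with a value read from a secrets store.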
Install the Llama Stack client
In the same or another environment, run:
Use the client to interact with the server

The following Python code demonstrates basic usage:
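A minimal sketch of client usage, assuming the server from the previous step is listening on `localhost:8321` and exposes at least one chat-completion model. The model id below is a placeholder; list the available models first and substitute one of them:

```python
from llama_stack_client import LlamaStackClient


def main() -> None:
    # Point the client at the local distribution server (URL assumed)
    client = LlamaStackClient(base_url="http://localhost:8321")

    # Discover which models the distribution exposes
    models = client.models.list()
    print([m.identifier for m in models])

    # Run a simple chat completion (model id is a placeholder)
    response = client.inference.chat_completion(
        model_id="Meta-Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(response.completion_message.content)


if __name__ == "__main__":
    main()
```

Because the client talks to whatever adapters the server hosts, the same code continues to work if you later point `base_url` at a different Llama Stack distribution.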
Additional resources

Refer to the Llama Stack docs to:
- Understand core concepts
- Dive into sample apps
- Learn how to extend and customize the framework

