SambaNova Cloud is now available as a SaaS offering on AWS Marketplace. Users can subscribe using their AWS account and connect securely via AWS PrivateLink, enabling fast, private, and scalable access to top open-source models like Llama 4, DeepSeek, and Whisper. Powered by SambaNova’s Reconfigurable Dataflow Unit (RDU), this integration delivers up to 10x faster inference than GPUs—ideal for real-time AI applications.
Key features
Benefits
See the AWS Marketplace integration document for more information.
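Once subscribed, the service is reached through an OpenAI-compatible REST API. As a minimal sketch (the base URL and model name below are assumptions to check against your subscription details; a PrivateLink deployment would use your private endpoint's DNS name instead), a chat completion request body can be assembled like this:

```python
import json

# Assumed public base URL; with AWS PrivateLink, substitute the DNS name of
# your private VPC endpoint.
BASE_URL = "https://api.sambanova.ai/v1"

def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Assemble the JSON body for POST {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

payload = build_chat_request("Llama-4-Maverick-17B-128E-Instruct", "Hello!")
print(json.dumps(payload, indent=2))
```

Sending the payload is then a single authenticated POST (a `Bearer` API-key header), and any OpenAI-compatible SDK can be pointed at the same base URL.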
We’re excited to announce that DeepSeek-V3-0324 now supports function calling! This enhancement enables more dynamic and programmable interactions by allowing you to invoke external functions directly through the model’s outputs. See the Function calling section for more information.
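In practice this follows the OpenAI-style tools flow: you declare a function schema in the request, and when the model decides to call it, the response carries a structured tool call for your code to execute. The `get_weather` tool below is a hypothetical example of ours, not part of the release, and the tool-call response is simulated locally rather than fetched over the network:

```python
import json

# Hypothetical tool the model may call; the name and schema are our own
# illustration, not part of the DeepSeek-V3-0324 release.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    """Run the function named in an OpenAI-style tool call and return its result."""
    fn = {"get_weather": get_weather}[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return fn(**args)

# Shaped like what the chat completions API returns when the model decides to
# invoke a function (simulated here, no network round trip):
example_call = {"function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}}
print(dispatch(example_call))  # Sunny in Tokyo
```

The result string would normally be appended to the conversation as a `tool` message so the model can compose its final answer.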
We’ve added Qwen3-32B, a high-capacity, multilingual LLM, to the SambaNova Cloud platform as a Preview model. Qwen3-32B is part of the Qwen3 series and offers strong performance across a wide range of general-purpose language tasks, including question answering, summarization, reasoning, and coding.
We’re excited to announce the addition of Whisper-Large-V3, OpenAI’s latest large-scale automatic speech recognition (ASR) model. This model offers enhanced transcription accuracy, improved multilingual support, and better robustness against noisy audio environments.
Llama-4-Maverick-17B-128E-Instruct now supports image input functionality. You can provide up to two images as context alongside text prompts. This enhanced model is available for use via both the Playground and API, and is accessible to all users.
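A multimodal request mixes text and image parts inside one user message, following the OpenAI-compatible content-array format. The helper below is a minimal sketch that also enforces the two-image limit described above:

```python
def build_image_message(prompt: str, image_urls: list[str]) -> dict:
    """Build an OpenAI-style multimodal user message.

    Llama-4-Maverick-17B-128E-Instruct accepts at most two images per request.
    """
    if len(image_urls) > 2:
        raise ValueError("at most two images are supported as context")
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return {"role": "user", "content": content}

msg = build_image_message(
    "What differs between these two diagrams?",
    ["https://example.com/a.png", "https://example.com/b.png"],
)
print(len(msg["content"]))  # 3: one text part plus two image parts
```

The resulting message goes into the `messages` list of a normal chat completion request, from either the Playground's API view or your own client.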
As part of our ongoing efforts to streamline and enhance the SambaNova Cloud model offerings, the following models have been deprecated:
These models will no longer receive updates and are scheduled for removal from active endpoints after April 14th, 2025. For more information and guidance on alternatives, please visit our Model Deprecations page.
We are excited to announce that another model of the Llama 4 family, Maverick, has been added to SambaNova Cloud. It is a 400-billion-parameter mixture-of-experts model with 17 billion active parameters and 128 experts, delivering results competitive with Gemma 3, Gemini 2.0 Flash, and Mistral 3.1 across a variety of benchmarks.
The next generation of Llama models has arrived, and Llama 4 Scout is now readily available on SambaNova Cloud. It is a 109B-parameter mixture-of-experts model with 17B active parameters and 16 experts, delivering results competitive with Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a variety of benchmarks.
We’re excited to announce the addition of DeepSeek-V3-0324, the first open-source non-reasoning model that outperforms proprietary non-reasoning models! Along with its major boost in reasoning performance, DeepSeek-V3-0324 also provides stronger front-end development skills and smarter tool-use capabilities.
QwQ-32B is a state-of-the-art reasoning model released by the Alibaba Qwen team. Despite having far fewer parameters than DeepSeek-R1’s 671B (with 37B activated), QwQ-32B delivers comparable performance. Beyond its exceptional capabilities in language understanding and creative reasoning, QwQ-32B now integrates advanced agent-related features, enabling it to think critically, leverage external tools, and adapt its reasoning based on dynamic environmental feedback.
The Llama 3.1 Swallow series comprises Japanese-optimized language models developed through continual pre-training of Meta’s Llama 3.1 architecture. These 8B and 70B parameter versions enhance Japanese linguistic capabilities while preserving English proficiency through training on 200B tokens from diverse sources, including web corpora, technical content, and multilingual Wikipedia articles. The instruction-tuned variants, such as v0.3, use synthetic Japanese data for fine-tuning, achieving state-of-the-art performance on Japanese benchmarks like MT-Bench, with the 8B Instruct v0.3 outperforming its predecessor by 8.4 points and the 70B version by 5.68 points.
Please refer to the Supported models page for more details on the supported configurations and their model cards.
We’re thrilled to announce DeepSeek-R1, a cutting-edge open-source model that rivals OpenAI’s o1 and has taken the world by storm, now available on SambaNova Cloud! Due to high demand, access will be limited during the initial preview phase, but you can experience DeepSeek-R1 at blazing-fast inference speeds on SambaNova Cloud today. This is just the beginning. Please stay tuned for even more exciting improvements coming soon!
For API access and higher rate limits for DeepSeek-R1, please complete this form to join the waitlist.
Please refer to the Supported models page for more details on the supported configurations and their model cards.
We are excited to announce the addition of Tülu 3 405B, an open-source model that performs better than even DeepSeek-V3, to SambaNova Cloud.
DeepSeek-R1-Distill-Llama-70B is now live on SambaNova Cloud. Experience cutting-edge AI that outshines top closed-source models in math, coding, and beyond—power up your workloads with unmatched performance today.
The latest Llama 3.3 70B model from Meta and the new leading open-source reasoning model QwQ from Alibaba’s Qwen team are now available on the SambaNova Cloud.
This release also includes upgrades to the max sequence length for the following models:
With this upgrade, you no longer need to use the dedicated Meta-Llama-3.1-70B-Instruct-8k model name for the 8k sequence length. While we still support the existing method for backward compatibility, we recommend switching to the new method for the best experience.