SambaCloud currently supports the following models for all developer accounts.
Model Lanes
- Fast Lane: These models showcase our performance advantages and inference speeds.
- High Volume Lane: These models run with continuous batching and can sustain high-volume workloads with large batch sizes.
Production models
Production models are intended for use in production environments and meet our high standards for speed and quality.
| Developer | Model lane | Model ID | Context length | View on Hugging Face | Model evaluation report |
|---|---|---|---|---|---|
| MiniMax | Fast Lane | MiniMax-M2.5 | 160k tokens | Model card | |
| DeepSeek | Fast Lane | DeepSeek-V3.1 | 128k tokens | Model card | |
| Meta | Fast Lane | Meta-Llama-3.3-70B-Instruct | 128k tokens | Model card | LatticeFlow AI report |
| OpenAI | Fast Lane | gpt-oss-120b | 128k tokens | Model card | |
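When choosing a production model programmatically, it can help to mirror the table in code and filter by context length. The following is a minimal sketch, not an official client: `ModelInfo`, `PRODUCTION_MODELS`, and `models_fitting` are hypothetical names introduced for illustration, and reading "k" as 1,000 tokens is a conservative assumption rather than something this page states.

```python
from dataclasses import dataclass

# Assumption: "128k" is read conservatively as 128,000 tokens.
TOKENS_PER_K = 1000


@dataclass(frozen=True)
class ModelInfo:
    developer: str
    lane: str
    model_id: str
    context_k: int  # context length in thousands of tokens, as listed in the table


# Production models copied from the table above.
PRODUCTION_MODELS = [
    ModelInfo("MiniMax", "Fast Lane", "MiniMax-M2.5", 160),
    ModelInfo("DeepSeek", "Fast Lane", "DeepSeek-V3.1", 128),
    ModelInfo("Meta", "Fast Lane", "Meta-Llama-3.3-70B-Instruct", 128),
    ModelInfo("OpenAI", "Fast Lane", "gpt-oss-120b", 128),
]


def models_fitting(required_tokens, models=PRODUCTION_MODELS):
    """Return the IDs of models whose listed context length covers required_tokens."""
    return [
        m.model_id
        for m in models
        if m.context_k * TOKENS_PER_K >= required_tokens
    ]
```

For example, `models_fitting(150_000)` returns only the model listed with a 160k context, while a 100,000-token request fits all four production models. The returned string is the Model ID column value, which is what you would pass as the model name in a request.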
Preview models
Preview models are intended for evaluation purposes and developer experimentation only, and should not be used in production environments. These models have limited capacity and may be removed at short notice.
| Developer | Model lane | Model ID | Context length | Max file size¹ | View on Hugging Face | Model evaluation report |
|---|---|---|---|---|---|---|
| DeepSeek | High Volume Lane | DeepSeek-V3.2 | 8k tokens | N/A | Model card | |
| Meta | Fast Lane | Llama-4-Maverick-17B-128E-Instruct | 128k tokens | Up to 5 images, each ≤ 20 MB | Model card | LatticeFlow AI report |

¹ Applies to multimodal (image/audio) inputs where supported.
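The image limits listed for Llama-4-Maverick-17B-128E-Instruct (up to 5 images, each at most 20 MB) can be checked on the client before a request is sent. This is a rough sketch under stated assumptions: `check_images` is a hypothetical helper, and treating 1 MB as 1,000,000 bytes is a conservative assumption, since this page does not say how megabytes are counted.

```python
MAX_IMAGES = 5
# Assumption: 1 MB = 1,000,000 bytes (the stricter reading of "20 MB").
MAX_IMAGE_BYTES = 20 * 1000 * 1000


def check_images(image_sizes_bytes):
    """Return a list of problems; an empty list means the sizes are within the listed limits."""
    problems = []
    if len(image_sizes_bytes) > MAX_IMAGES:
        problems.append(
            f"too many images: {len(image_sizes_bytes)} > {MAX_IMAGES}"
        )
    for index, size in enumerate(image_sizes_bytes):
        if size > MAX_IMAGE_BYTES:
            problems.append(
                f"image {index} is {size} bytes, over the {MAX_IMAGE_BYTES}-byte limit"
            )
    return problems
```

Calling `check_images` with the byte sizes of the files you plan to attach returns an empty list when everything fits. Otherwise it returns one message per violated limit, which makes it easy to report all problems at once instead of failing on the first.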