Developer | Model ID | Context length | View on Hugging Face |
---|---|---|---|
DeepSeek | DeepSeek-R1 | 32k tokens | Model card |
DeepSeek | DeepSeek-V3-0324 | 32k tokens | Model card |
DeepSeek | DeepSeek-R1-Distill-Llama-70B | 128k tokens | Model card |
Meta | Meta-Llama-3.3-70B-Instruct | 128k tokens | Model card |
Meta | Meta-Llama-3.1-8B-Instruct | 16k tokens | Model card |

Developer | Model ID | Context length | Max file size¹ | View on Hugging Face |
---|---|---|---|---|
Meta | Llama-4-Maverick-17B-128E-Instruct | 128k tokens | N/A | Model card |
OpenAI | Whisper-Large-v3 | N/A | 25 MB | Model card |
Qwen | Qwen3-32B | 8k tokens | N/A | Model card |
Tokyotech-llm | Llama-3.3-Swallow-70B-Instruct-v0.4 | 16k tokens | N/A | Model card |
Other | E5-Mistral-7B-Instruct | 4k tokens | N/A | Model card |