Inspect AI is an evaluation framework created by the UK AI Security Institute. It can be used to run a wide range of evaluations that measure coding, reasoning, agentic tasks, knowledge, behavior, and multimodal understanding. With Inspect AI, evaluations and benchmarking become simple, reproducible, and consistent across multiple models and providers.
Prerequisites
Before you begin, ensure you have:
- A SambaCloud account and an active API key, available from SambaCloud API Keys.
- Your SambaNova API key set as an environment variable so Inspect AI can authenticate with SambaCloud (see the commands after this list).
- A Python environment with the inspect-ai package installed.
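A minimal setup sketch is shown below. It assumes the SambaNova provider reads its key from the SAMBANOVA_API_KEY environment variable; confirm the exact variable name in the Inspect AI provider documentation.

```bash
# Install Inspect AI into your Python environment
pip install inspect-ai

# Set your SambaNova API key. SAMBANOVA_API_KEY is the variable name the
# SambaNova provider is assumed to read; replace the placeholder with the
# key from your SambaCloud account.
export SAMBANOVA_API_KEY="<your-api-key>"
```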
Running evaluations
Before you can run your first evaluation, you’ll need to define a task in a Python script. Each task has three main components:
- Dataset – the list of inputs and expected results
- Solver – how the model produces its outputs
- Scorer – how outputs are evaluated against the expected results
Example: Hello world
Save the following code into a hello_world.py file.
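The sketch below follows Inspect AI’s standard hello-world pattern: a one-sample dataset, the built-in generate() solver, and the exact() scorer, mapping directly onto the three components described above.

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate

@task
def hello_world():
    return Task(
        # Dataset: one sample pairing an input prompt with its expected result
        dataset=[
            Sample(
                input="Just reply with Hello World",
                target="Hello World",
            )
        ],
        # Solver: generate a single completion from the model
        solver=generate(),
        # Scorer: exact match between the model output and the target
        scorer=exact(),
    )
```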
Run the evaluation with the Llama-4-Maverick-17B-128E-Instruct model:
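Assuming Inspect AI exposes SambaNova models under a sambanova/ prefix (check the providers list in the Inspect AI documentation for the exact form), the command would be:

```bash
# Evaluate hello_world.py against the SambaNova-hosted model; the
# sambanova/ model prefix is an assumption to verify against the docs.
inspect eval hello_world.py --model sambanova/Llama-4-Maverick-17B-128E-Instruct
```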
Viewing results
- Results are stored in the ./logs directory.
- Launch the Inspect web UI for interactive viewing, as shown after this list.
- You can also use the Inspect Visual Studio Code extension for easier log exploration.
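The web UI is launched with the inspect view command; it reads ./logs by default, and --log-dir points it elsewhere:

```bash
# Open the Inspect log viewer in your browser (reads ./logs by default)
inspect view

# Point the viewer at a different log directory if needed
inspect view --log-dir ./custom-logs
```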
Additional resources
- See the Inspect AI repo for more evaluation examples.
- For more details, see the official Inspect AI documentation.

