Develop AI agents and apps using the lightweight VS Code extension.
Benchmark models with the LM Evaluation Harness for reproducible, large-scale LLM evaluation.
Debug, evaluate, and improve LLM applications with real-time observability and experimentation.