SambaNova Runtime
This document covers administrative procedures and troubleshooting information for SambaNova Runtime, the system service that provides access to the RDU chips. Some of the information is for administrators and requires superuser privileges, but other information can be helpful for developers who are troubleshooting and optimizing their models.
Concepts and components
- Release notes. Learn about new features, bug fixes, and other improvements for the different runtime components.
- SambaNova Runtime architecture. Gives an overview of the components and how you can interact with them.
- Get started with SambaNova Runtime. Learn about runtime components and tools, installation, and logs.
- Configure Runtime components. Learn common administration tasks such as changing log levels, customizing SND (SambaNova daemon), and more.
Troubleshooting
- Troubleshoot Runtime. Learn how to resolve errors and optimize Runtime.
- Find faults and errors with SNFADM. Learn about SNFM and its components, how to use SNFM logs, and how to use the SNFADM tool.
Runtime and Slurm
- Use Slurm with SambaNova hardware introduces available capabilities.
- How SambaNova administrators set up Slurm explains how system administrators set up Slurm and Generic Resources (GRES) for SambaNova environments.
- How SambaNova developers can use Slurm explains how to use Slurm in batch scripts and how to use a Slurm Feeder script.
SNML APIs
- SambaNova Management Layer (SNML) explains how the SNML APIs support many information retrieval tasks and some administration tasks. The doc page includes troubleshooting information and usage examples.
- SNML API reference (all users) is a reference to SNML APIs that support retrieving information about your installation.
- SNML API reference (admin users) is a reference to the SNML admin APIs, which support resetting RDUs, marking system faults as clear, and enabling or disabling performance counter collection.