Model Zoo architecture and workflows

SambaFlow architecture and workflows gives a generic introduction to the SambaNova software stack. This page discusses how the different components of Model Zoo fit together and how you perform generation and fine-tuning with Model Zoo.

Model Zoo architecture

Model Zoo consists of these components, each discussed in more detail below.

Static architecture diagram

Model Zoo Devbox container

The Model Zoo Devbox is a container that encapsulates the SambaFlow compiler and other prerequisite software, such as a Red Hat or Ubuntu Linux base system. Using the container, rather than installing the individual components in your environment, ensures that you won’t run into incompatible software versions.
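
How you launch the container depends on your site; as a minimal sketch, assuming Docker and a placeholder image name (Customer Support provides the actual image and launch instructions):

    # Placeholder image name and flags; use the container and launch
    # instructions that SambaNova Customer Support provides.
    docker run -it \
        -v "$HOME/work:/work" \
        sambanova/modelzoo-devbox:latest \
        /bin/bash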

The models in Model Zoo are optimized for SambaNova hardware. Only SambaNova customers get access to the Devbox because running the models on CPU/GPU won’t result in good performance.

Model Zoo GitHub repository

After you’ve deployed the Devbox container, you clone the Model Zoo GitHub repo at https://github.com/sambanova/modelzoo into the container (see the example after this list). The modelzoo repository includes:

  • Model source code, customized for running on RDU.

  • Libraries to support the models.

  • Example apps for running generation and training/fine-tuning, and YAML files for customizing the attributes of a run (e.g., batch size or compiler optimization mode).
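
For example, from a shell inside the container:

    # Clone the Model Zoo repository and step into it.
    git clone https://github.com/sambanova/modelzoo.git
    cd modelzoo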

You can customize a model in these ways:

  • Customize model configuration. You can make changes to the YAML file (in the examples/nlp/*/config directory) to specify compiler behavior, hardware version, and more. For example, you can specify the number of RDUs or how the compiler handles precision. At a minimum, it usually makes sense to specify the PEF name. Comments in the YAML file explain the changes you can make; see the sketch after this list.

  • Change model source code. The source code for each model is based on a model available in Hugging Face. SambaNova engineers have modified the model to compile and run efficiently on RDU. Code comments in the source code explain the changes.
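
As an illustration, here is a hypothetical excerpt from one of these YAML files. The key names below are placeholders rather than the real schema; the keys vary by model and release, and the comments in each shipped file are authoritative:

    # Hypothetical excerpt; actual key names vary by model and release.
    checkpoint:
      model_name_or_path: /checkpoints/llama-2-7b   # local Hugging Face checkpoint
    samba_compile:
      pef_name: llama7b_gen   # name of the generated PEF (usually worth setting)
      arch: sn30              # placeholder: hardware version
      num_rdus: 1             # placeholder: number of RDUs to use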

Other SambaNova components

Included in the Devbox is the SambaFlow compiler. When you work with a model, you first compile it to generate a PEF file, which represents the model’s dataflow graph on the RDU.
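
For example, a compile step might look like the following sketch. The script path and argument keys are placeholders; the /examples README documents the real invocation for each model:

    # Hypothetical invocation; see the /examples README for the actual
    # app names and arguments in your release.
    python examples/nlp/text_generation/rdu_generate_text.py \
        command=compile \
        checkpoint.model_name_or_path=/checkpoints/llama-2-7b \
        samba_compile.pef_name=llama7b_gen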

External components - generation

When you are ready to run generation with your language model, you need a combination of SambaNova components and external components.

  • PEF file. You compile the model to generate the PEF. A PEF is required for both generation and training/fine-tuning.

  • Hugging Face checkpoint and config.json. You download a checkpoint for your model and the corresponding config.json from Hugging Face. The model must use bf16 or fp32 precision. See Recommended checkpoints for a list of checkpoints that we’ve tested with the Model Zoo models.
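
For example, using the huggingface_hub CLI (the model ID is illustrative; choose one from Recommended checkpoints):

    # Download the checkpoint and its config.json to a local directory.
    # Gated models may require `huggingface-cli login` first.
    huggingface-cli download meta-llama/Llama-2-7b-hf \
        --local-dir /checkpoints/llama-2-7b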

External components - training and fine-tuning

The most common use case is that customers fine-tune with their own data, then perform generation with that custom model. To run fine-tuning, you need:

  • PEF file. You compile the model to generate the PEF. A PEF is required for both generation and training/fine-tuning.

  • Converted dataset. Model Zoo expects training data in HDF5 format. SambaNova offers a set of open source scripts to help with the conversion at https://github.com/sambanova/generative_data_prep; see the example after this list.

  • Hugging Face checkpoint (fine-tuning only). For fine-tuning, you start from a Hugging Face format checkpoint (bf16 or fp32). Training from scratch does not require a checkpoint.
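
The conversion mentioned above might look like the sketch below; the flags are based on the generative_data_prep README, but they can change between releases, so check that repo before running:

    # Convert JSONL training data to HDF5 with generative_data_prep.
    # Flag names follow that repo's README and may differ by release.
    python3 -m generative_data_prep pipeline \
        --input_file_path /data/train.jsonl \
        --output_path /data/hdf5 \
        --pretrained_tokenizer meta-llama/Llama-2-7b-hf \
        --max_seq_length 4096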

Generation workflow

You can run generation using the Model Zoo code and example apps as is, or after making changes to the configuration or the code.

Simple generation workflow

The simple generation workflow is as follows:

Simple generation process

Here are the steps, in sequence. For details, see the Walkthrough in the /examples README.

  1. Deploy the Model Zoo container (Devbox), which SambaNova customers get from Customer Support.

  2. Clone the modelzoo repo, which includes the model code and example apps.

  3. Pull a checkpoint for the model from Hugging Face.

  4. Compile to generate a PEF, which encapsulates the graph, optimized for RDU.

  5. Run generation using the checkpoint and PEF. You can specify prompt data as part of generation.
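
For example, step 5 might look like the following sketch. The script name, argument keys, and paths are placeholders carried over from the compile sketch above; the Walkthrough in the /examples README shows the real commands:

    # Hypothetical invocation; the /examples README walkthrough shows
    # the real app name and arguments for each model.
    python examples/nlp/text_generation/rdu_generate_text.py \
        command=run \
        checkpoint.model_name_or_path=/checkpoints/llama-2-7b \
        samba_run.pef=out/llama7b_gen/llama7b_gen.pef \
        prompt="Once upon a time"    # placeholder prompt argument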

Generation workflow with changes

If you want to optimize compilation or make other changes, the main question is whether a recompile is required. For some changes, you can use the PEF as is for generation. For other changes, you have to recompile for the changes to take effect. See Making changes to base_config.yaml parameters.
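
As a rough illustration of the distinction, with placeholder parameter names (the linked page is authoritative about which settings fall on each side):

    # Illustrative only; see "Making changes to base_config.yaml parameters"
    # for the authoritative list.
    batch_size: 8          # baked into the compiled graph: changing it requires a recompile
    max_seq_length: 4096   # baked into the compiled graph: changing it requires a recompile
    temperature: 0.7       # run-time generation setting: the existing PEF can be reused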

Generation process when user makes changes

Training and fine-tuning workflows

The training and fine-tuning workflows are very similar. The main differences are:

  • When you perform training, you start with a large dataset and end up with your own model checkpoint.

  • When you perform fine-tuning, you start with a checkpoint, which encapsulates a pretrained model, and with your own custom dataset. You end up with a new checkpoint trained on your custom data. For step-by-step instructions using Ultrachat data, see the Walkthrough in the /examples README.

Fine-tuning workflow

Here’s an overview of the fine-tuning workflow:

Simple fine-tuning process

  1. First you prepare your environment:

    1. Deploy the Model Zoo container.

    2. Clone the Model Zoo GitHub repo, which includes the model code and example apps.

  2. Next you pull a checkpoint from Hugging Face. A config.json is included in the download.

  3. Compile to generate a PEF, which represents a graph that’s optimized for RDU.

  4. Convert your custom data to HDF5 format. SambaNova utilities for data prep are available on GitHub.

  5. You can now use the training example app, passing in the checkpoint, configuration, and data.

  6. When fine-tuning is complete, the result is a new checkpoint, optimized for RDU, that has been trained on your custom data.
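
Putting steps 2 through 5 together, a fine-tuning run might look like the sketch below. The app path and argument keys are placeholders; the Ultrachat walkthrough in the /examples README shows the real commands:

    # Hypothetical invocation; see the /examples README walkthrough
    # for the actual training app and its arguments.
    python examples/nlp/training/rdu_train_llm.py \
        command=run \
        checkpoint.model_name_or_path=/checkpoints/llama-2-7b \
        training.dataset_path=/data/hdf5 \
        samba_run.pef=out/llama7b_train/llama7b_train.pef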

Training workflow

The training workflow is fairly similar to the fine-tuning workflow. However, instead of using a pretrained checkpoint, you train the model with a large dataset.

Doing a full training run with a large dataset takes a long time. In most cases, using a checkpoint and a small set of custom data makes more sense.
Simple training process

  1. First you prepare your environment:

    1. Deploy the Model Zoo container.

    2. Clone the modelzoo repo, which includes the model code and example apps.

  2. Next, you compile to generate a PEF, which represents a graph that’s optimized for RDU.

  3. Prepare the data in HDF5 format. SambaNova utilities for data prep are available on GitHub.

  4. You can now use the training example app with the HDF5 data set to generate a checkpoint that’s optimized for RDU.
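
Using the same placeholder names as the fine-tuning sketch above, the only structural difference is that no pretrained checkpoint is passed in:

    # Hypothetical invocation; weights are initialized fresh because
    # no pretrained checkpoint is supplied.
    python examples/nlp/training/rdu_train_llm.py \
        command=run \
        training.dataset_path=/data/pretraining_hdf5 \
        samba_run.pef=out/llama7b_train/llama7b_train.pef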

See also