Examine logreg model code

SambaNova supports several tutorials. In Hello SambaFlow! Compile and run a model, you learn how to compile and train a simple logreg model. This doc page examines the Python code and data behind that model, so you learn what's inside the code you run.

Our logreg model uses the MNIST dataset and consists of a single linear layer followed by a cross entropy loss.

What you’ll learn

This tutorial explores these topics:

  • Typical imports

  • Components of main()

  • Model definition, including input arguments, compilation, and training.

Files

All tutorial code files are in our tutorials GitHub repo at https://github.com/sambanova/tutorials/tree/main. This doc page includes collapsible code snippets for each code component we discuss.

Data

The tutorial uses the classic MNIST dataset, which includes a training set of 60,000 examples and a test set of 10,000 examples.

  • By default, the code downloads the dataset files as part of the training run.

  • In environments that don’t have access to the internet, you can download the dataset explicitly, as sketched after this list. See (Optional) Download model data.
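If you need to stage the data yourself, here’s a minimal sketch (plain torchvision, not part of the tutorial code) that downloads MNIST on a machine with internet access; you can then copy the resulting folder to the offline system and point --data-dir at it:

import torchvision

# Download the MNIST training and test files into ./mnist_data.
# Copy this folder to the offline system before running the tutorial.
torchvision.datasets.MNIST(root='mnist_data', train=True, download=True)
torchvision.datasets.MNIST(root='mnist_data', train=False, download=True)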

Code files

The code for this tutorial is in a single code file, logreg.py.

Imports

Our model imports several Python modules. Here’s the Python code, followed by an explanation of each import.

Imports
import argparse
import sys
from typing import Tuple

import torch
import torch.distributed as dist
import torch.nn as nn
import torchvision

import sambaflow.samba.utils as utils
from sambaflow import samba
from sambaflow.samba.utils.argparser import (parse_app_args,
                                             parse_yaml_to_args)
from sambaflow.samba.utils.dataset.mnist import dataset_transform
from sambaflow.samba.utils.pef_utils import get_pefmeta
  • sambaflow.samba is the set of SambaFlow modules.

  • sambaflow.samba.utils contains utility functions, for example for graph tracing.

  • parse_app_args is our built-in argument parsing support for each supported execution mode (more details below).

  • dataset_transform is a utility function to transform the data.

  • get_pefmeta saves the model’s metadata in the resulting executable file (PEF file).

It all starts with main()

The workflows for SambaNova models are outlined in Workflows. The intermediate tutorial includes both training and inference.

The main() function calls the functions that perform compilation and training, and also does some preparation.

  • utils.set_seed() sets a random seed for reproducibility while we’re in the development phase of this tutorial.

  • parse_app_args() collects the arguments coming from add_common_args() and add_run_args(). When users run the model, they can specify predefined arguments that are handled by the compiler (e.g. o0) and the SambaFlow framework, as well as application-specific arguments. See Define input arguments.

  • samba.randn() and samba.randint() create random input and output tensors for compilation. See the API Reference.

  • samba.from_torch_model_() sets up the model to use the SambaFlow framework. The function, which also converts a PyTorch model to a Samba model, performs some initialization and related tasks. We pass in model, an instance of the LogReg class we create to represent the model. See Define the model.

  • samba.optim.SGD() defines the optimizer we’ll use for training the model. The SambaFlow framework supports AdamW and SGD out of the box. You can also specify a different optimizer. See the API Reference.

  • If the user specified the compile command on the command line, main() calls samba.session.compile(). See Compile the model.

  • If the user specified the run command on the command line, main() performs training, testing, or inference, based on other arguments that are passed in. See Train the model.

Define the model

The model definition specifies the layers in the model and the number of features in each layer.

Here’s the Python code:

LogReg class
class LogReg(nn.Module):
    """ Define the model architecture

    Define the model architecture i.e. the layers in the model and the
    number of features in each layer.

    Args:
        lin_layer (ivar): Linear layer
        criterion (ivar): Cross Entropy loss layer
    """

    def __init__(self, num_features: int, num_classes: int, bias: bool):
        """ Initialization function for this class

        Args:
            num_features (int): Number of input features for the model
            num_classes (int): Number of output classes the model classifies inputs into
            bias (bool): If True, the linear layer learns an additive bias
        """
        super().__init__()
        self.num_features = num_features
        self.num_classes = num_classes

        # Linear layer for predicting target class of inputs
        self.lin_layer = nn.Linear(in_features=num_features, out_features=num_classes, bias=bias)

        # Cross Entropy layer for loss computation
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, inputs: torch.Tensor, targets: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        """ Forward pass of the model for the given inputs.

        The forward pass predicts the class labels for the inputs
        and computes the loss between the correct and predicted class labels.

        Args:
            inputs (torch.Tensor):  Input samples in the dataset
            targets (torch.Tensor): correct labels for the inputs

        Returns:
            Tuple[torch.Tensor, torch.Tensor]:The loss and predicted classes of the inputs
        """

        out = self.lin_layer(inputs)
        loss = self.criterion(out, targets)

        return loss, out

Two functions are defined for the class:

  • __init__(), the initialization function, uses the num_features, num_classes, and bias values that are specified for this model, and defines the linear layer and cross entropy loss layer.

  • forward(), which is used by train(), predicts class labels and computes the loss between the correct and predicted labels.

In main() we’ll then convert the model from a PyTorch model to a SambaFlow model by calling samba.from_torch_model_().
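As a quick sanity check, here’s a minimal sketch that instantiates LogReg and runs one forward pass on the CPU with plain PyTorch and random tensors (illustrative only; it doesn’t touch SambaFlow or an RDU):

import torch

# Instantiate the model with MNIST-sized inputs and run one forward pass on CPU.
model = LogReg(num_features=784, num_classes=10, bias=True)
images = torch.randn(32, 784)             # a batch of 32 flattened 28 x 28 images
labels = torch.randint(0, 10, (32, ))     # random target classes for the batch
loss, logits = model(images, labels)
print(loss.item(), logits.shape)          # scalar loss and logits of shape [32, 10]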

Define input arguments

The add_args function defines parameters for use with this model. These are all arguments that are typically used with an ML model.

add_args function
def add_args(parser: argparse.ArgumentParser) -> None:
    """ Add model-specific arguments.

    By default, the compiler and the SambaFlow framework support a set of arguments to compile() and run().
    The argument parser supports adding application-specific arguments.

    Args:
        parser (argparse.ArgumentParser): SambaNova argument parser.
    """

    parser.add_argument('--lr', type=float, default=0.0015, help="Learning rate for training")
    parser.add_argument('--momentum', type=float, default=0.0, help="Momentum value for training")
    parser.add_argument('--weight-decay', type=float, default=3e-4, help="Weight decay for training")
    parser.add_argument('--num-epochs', '-e', type=int, default=1)
    parser.add_argument('--num-steps', type=int, default=-1)
    parser.add_argument('--num-features', type=int, default=784)
    parser.add_argument('--num-classes', type=int, default=10)
    parser.add_argument('--yaml-config', default=None, type=str, help='YAML file used with launch_app.py')
    parser.add_argument('--data-dir',
                        '--data-folder',
                        type=str,
                        default='mnist_data',
                        help="The folder to download the MNIST dataset to.")
    parser.add_argument('--bias', action='store_true', help='Linear layer will learn an additive bias')

Users of the model can then specify these arguments on the command line to set model parameters (a short usage sketch follows this list).

  • --num-epochs or -e specifies the number of epochs to run the training loop.

  • --num-features specifies the number of input features per sample; for MNIST this is 784 (one per pixel of a flattened 28 × 28 image).

  • --num-classes is the number of different classes in our classification problem. For the MNIST example, the number of different classes is ten for digits from 0 to 9.

  • --data-dir (or its alias --data-folder) specifies the download location for the MNIST data.
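To see how these flags map onto the parsed arguments, here’s a minimal standalone sketch that feeds add_args() to a plain argparse parser (in the tutorial itself, parse_app_args() does this for you and also adds the compiler and framework arguments):

import argparse

# Hypothetical standalone parse; the real model goes through parse_app_args().
parser = argparse.ArgumentParser()
add_args(parser)
args = parser.parse_args(['--num-epochs', '2', '--lr', '0.001', '--bias'])
print(args.num_epochs, args.lr, args.bias, args.data_dir)  # 2 0.001 True mnist_data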

Data preparation

Data preparation is pretty standard, and familiar to those who’ve worked with PyTorch datasets. The prepare_dataloader() function defines and then returns both the train and the test dataset.

prepare_dataloader() function
def prepare_dataloader(args: argparse.Namespace) -> Tuple[torch.utils.data.DataLoader,
                                                           torch.utils.data.DataLoader]:

    # Get the train & test data (images and labels) from the MNIST dataset
    train_dataset = torchvision.datasets.MNIST(root=f'{args.data_dir}',
                                               train=True,
                                               transform=dataset_transform(vars(args)),
                                               download=True)
    test_dataset = torchvision.datasets.MNIST(root=f'{args.data_dir}',
                                              train=False,
                                              transform=dataset_transform(vars(args)))

    # Get the train & test data loaders (input pipeline)
    train_loader = torch.utils.data.DataLoader(
        dataset=train_dataset, batch_size=args.batch_size, shuffle=True)
    test_loader = torch.utils.data.DataLoader(
        dataset=test_dataset, batch_size=args.batch_size, shuffle=False)
    return train_loader, test_loader
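As a quick check of the data pipeline, you could build the loaders with a minimal namespace and inspect one batch. This is a sketch: it assumes dataset_transform() only reads keys that are present in this namespace, and batch_size is normally supplied by the SambaFlow argument parser.

import argparse

# Hypothetical inspection of one training batch.
args = argparse.Namespace(data_dir='mnist_data', batch_size=32)
train_loader, test_loader = prepare_dataloader(args)
images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # one batch of images and their labels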

Compile the model

For model compilation, we use the samba.session.compile function, passing some arguments including the optimizer.

Calling samba.session.compile()
if args.command == "compile":
    # Compile the model to generate a PEF (Plasticine Executable Format) binary
    samba.session.compile(model,
                          inputs,
                          optimizer,
                          name='logreg_torch',
                          app_dir=utils.get_file_dir(__file__),
                          config_dict=vars(args),
                          pef_metadata=get_pefmeta(args, model))

Train the model

The train() function defines the training logic. It is similar to a typical PyTorch training loop.

  • The outer loop iterates over the number of epochs provided by the --num-epochs argument.

  • The inner loop iterates over the training data.

Let’s look at the annotated code first, and then explore some details.

train() function
def train(args: argparse.Namespace, model: nn.Module, output_tensors:
            Tuple[samba.SambaTensor]) -> None:

    # Get data loaders for training and test data
    train_loader, test_loader = prepare_dataloader(args)

    # Total training steps (iterations) per epoch
    total_step = len(train_loader)

    hyperparam_dict = { "lr": args.lr,
                        "momentum": args.momentum,
                        "weight_decay": args.weight_decay}

    # Train and test for specified number of epochs
    for epoch in range(args.num_epochs):
        avg_loss = 0

        # Train the model for all samples in the train data loader
        for i, (images, labels) in enumerate(train_loader):
            global_step = epoch * total_step + i
            if args.num_steps > 0 and global_step >= args.num_steps:
                print('Maximum num of steps reached. ')
                return None

            sn_images = samba.from_torch_tensor(images, name='image', batch_dim=0)
            sn_labels = samba.from_torch_tensor(labels, name='label', batch_dim=0)

            loss, outputs = samba.session.run(input_tensors=[sn_images, sn_labels],
                                              output_tensors=output_tensors,
                                              hyperparam_dict=hyperparam_dict,
                                              data_parallel=args.data_parallel,
                                              reduce_on_rdu=args.reduce_on_rdu)

            # Sync the loss and outputs with host memory
            loss, outputs = samba.to_torch(loss), samba.to_torch(outputs)
            avg_loss += loss.mean()

            # Print the running average loss every 10,000 steps in each epoch
            if (i + 1) % 10000 == 0 and args.local_rank <= 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(epoch + 1,
                    args.num_epochs, i + 1, total_step, avg_loss / (i + 1)))

        # Check the accuracy of the trained model for all samples in the test data loader
        # Sync the model parameters with host memory
        samba.session.to_cpu(model)
        test_acc = 0.0
        with torch.no_grad():
            correct = 0
            total = 0
            total_loss = 0
            for images, labels in test_loader:
                loss, outputs = model(images, labels)
                loss, outputs = samba.to_torch(loss), samba.to_torch(outputs)
                total_loss += loss.mean()
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum()

            test_acc = 100.0 * correct / total

            if args.local_rank <= 0:
                print(f'Test Accuracy: {test_acc:.2f}'
                      f' Loss: {total_loss.item() / len(test_loader):.4f}')


        # if args.acc_test:
           # assert args.num_epochs == 1, "Accuracy test only supported for 1 epoch"
           # assert test_acc > 91.0 and test_acc < 92.0, "Test accuracy not within specified bounds."

Here’s some detail on the code fragments.

  • The function from_torch_tensor creates SambaFlow tensors (SambaTensor) from PyTorch tensors. This function is similar to the torch.from_numpy function in PyTorch, which creates a PyTorch tensor from a NumPy array. (The function samba.to_torch creates a PyTorch tensor from a SambaTensor.)

  • When we run the model on the device, we call the samba.session.run function:

    loss, outputs = samba.session.run(input_tensors=[sn_images, sn_labels],
                                      output_tensors=output_tensors,
                                      hyperparam_dict=hyperparam_dict,
                                      data_parallel=args.data_parallel,
                                      reduce_on_rdu=args.reduce_on_rdu)
  • To collect data about loss and output and print those data, we convert back from SambaTensors to PyTorch tensors in loss, outputs = samba.to_torch(loss), samba.to_torch(outputs).
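Here’s a small sketch of that SambaTensor round trip in isolation (illustrative only; in the tutorial the conversion happens inside the training loop shown above):

import torch
from sambaflow import samba

# Convert a PyTorch batch to a SambaTensor and back to a PyTorch tensor.
images = torch.randn(32, 784)
sn_images = samba.from_torch_tensor(images, name='image', batch_dim=0)
images_back = samba.to_torch(sn_images)
print(type(sn_images), images_back.shape)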

Main function

The main function runs in different modes depending on the command-line input. The two main execution modes are compile and run.

Here’s how compiling and running a SambaFlow model works:

  • You compile the model with the compile command. As part of compilation, our code generates random SambaTensors (ipt and tgt) and passes them to the compiler.

  • After compile has produced a PEF file, you can do a training run, passing in the PEF file name as a parameter.

Hello SambaFlow! Compile and run a model explains how to compile and run this model.

main() function
def main(argv):
    """
    :param argv: Command line arguments (`compile`, `test` or `run`)
    """
    args = parse_app_args(argv=argv,
                          common_parser_fn=add_args,
                          run_parser_fn=add_run_args)

    # when it is not distributed mode, local rank is -1.
    args.local_rank = dist.get_rank() if dist.is_initialized() else -1

    # Create random input and output data for testing
    ipt = samba.randn(args.batch_size,
                      args.num_features,
                      name='image',
                      batch_dim=0,
                      named_dims=('B', 'F')).bfloat16().float()
    tgt = samba.randint(args.num_classes, (args.batch_size, ),
                        name='label',
                        batch_dim=0,
                        named_dims=('B', ))

    ipt.host_memory = False
    tgt.host_memory = False

    # Instantiate the model
    model = LogReg(args.num_features, args.num_classes, args.bias)

    # Sync model parameters with RDU memory
    samba.from_torch_model_(model)

    # Annotate parameters if weight normalization is on
    if args.weight_norm:
        utils.weight_norm_(model.lin_layer)

    inputs = (ipt, tgt)

    # Instantiate an optimizer if the model will be trained
    if args.inference:
        optimizer = None
    else:
        # We use the SGD optimizer to update the weights of the model
        optimizer = samba.optim.SGD(model.parameters(),
                                    lr=args.lr,
                                    momentum=args.momentum,
                                    weight_decay=args.weight_decay)

    if args.command == "compile":
        #  Compile the model to generate a PEF (Plasticine Executable Format) binary
        samba.session.compile(model,
                              inputs,
                              optimizer,
                              name='logreg_torch',
                              app_dir=utils.get_file_dir(__file__),
                              config_dict=vars(args),
                              pef_metadata=get_pefmeta(args, model))

    elif args.command in ["test", "run"]:
        # Trace the compiled graph to initialize the model weights and input/output tensors
        # for execution on the RDU.
        # The PEF required for tracing is the binary generated during compilation
        # Mapping refers to how the model layers are arranged in a pipeline for execution.
        # Valid options: 'spatial' or 'section'
        utils.trace_graph(model,
                          inputs,
                          optimizer,
                          pef=args.pef,
                          mapping=args.mapping)

        if args.command == "test":
            # Test the model's functional correctness. This tests if the result of execution
            # on the RDU is comparable to that on a CPU. CPU run results are used as reference.
            # Note that this test is different from testing model fit during training.
            # Given the same initial weights and inputs, this tests if the graph execution
            # on RDU generates outputs that are comparable to those generated on a CPU.
            outputs = model.output_tensors
            test(args, model, inputs, outputs)

        elif args.command == "run":

            # Train the model on RDU. This is where the model will be trained
            # i.e. weights will be learned to fit the input dataset
            train(args, model, model.output_tensors)


if __name__ == '__main__':
    main(sys.argv[1:])

For discussion of a main() function that’s very similar to the function above, see Tie the pieces together with main().

Learn more!