Batch inference
Batch inference is the process of generating predictions on a batch of observations. Within the platform, you can generate predictions on bulk data by creating a batch inference job.
Starting with release 23.5.1, SambaStudio has deprecated the concept of a Task when creating a batch inference job and now uses ML App, which groups models by the selected ML App.
Create a batch inference job using the GUI
Create a new batch inference job using the GUI by following the steps below.
To navigate to a project, click Projects from the left menu and then select the desired project from the My Projects list. See the Projects document for information on how to create and delete projects.
1. Create a new project or use an existing one.
2. From a project window, select the New Job button. The Create a new job window will appear.
3. Select Batch Inference under Create a new job.
4. Enter a name for the job into the Job Name field.
5. Select the ML App from the ML App drop-down. The selected ML App refines the list of models displayed, by corresponding model type, in the Select model drop-down.
6. From the Select model drop-down, choose My models, SambaNova models, Organization models, or Select from model hub. The available models displayed are defined by the previously selected ML App. If you wish to view models that are not related to the selected ML App, select Clear from the ML App drop-down. Selecting a model while the ML App drop-down is cleared will auto-populate the ML App field with the corresponding ML App for that model.
   - My models displays a list of models that you have previously added to the Model Hub.
   - SambaNova models displays a list of models provided by SambaNova.
   - Organization models displays a list of all models that other members of your organization have previously added to the Model Hub.
   - Select from model hub displays the Model Hub window with a detailed list of all available models that you can use for a selected ML App. The My models, SambaNova, and Organization models checkboxes filter the model list by their respective group. The ML App drop-down filters the model list by the corresponding ML App. Choose the model you wish to use and confirm your choice by clicking Use model.
7. From the Select dataset drop-down, choose My datasets, SambaNova datasets, or Select from datasets.
   - My datasets displays a list of datasets that you have added to the platform and that can be used for a selected ML App.
   - SambaNova datasets displays a list of platform-provided datasets for a selected ML App.
   - Select from datasets displays the Dataset Hub window with a detailed list of datasets that can be used for a selected ML App. The My datasets and SambaNova checkboxes filter the dataset list by their respective group. The ML App drop-down filters the dataset list by the corresponding ML App. Choose the dataset you wish to use and confirm your choice by clicking Use dataset.
8. In the Relative dataset folder/file path field, specify the relative path from storage to the folder that contains the data for batch inference.
9. Set the inference settings to optimize your batch inference job for the input data, or use the default values. Expand the Inference settings pane by clicking the blue double arrows to set and adjust the settings.
10. Click Run job to start the batch inference.
Batch inference jobs are long-running jobs. The time they take to run to completion depends heavily on the task and the number of input samples.

Create a batch inference job using the CLI
The example below demonstrates how to create a batch inference job using the snapi job create command. You will need to specify the following:
- A project to assign the job to. Create a new project or use an existing one.
- A name for your new job.
- batch_predict as the job type. This designates the job to be a batch inference job.
- The model to be used for the batch inference job.
- The dataset you wish to use for the batch inference job. The dataset must be compatible with the model you choose.
$ snapi job create \
--project <project-name> \
--job <your-new-job-name> \
--type batch_predict \
--model-checkpoint <model-name> \
--dataset <dataset-name>
Quit or delete a batch inference job
Follow the steps below to quit or delete a batch inference job.
Option 1: From the job window
1. From a Job window:
   - Click the Quit button to stop the job from running. The confirmation box will open.
   - Click the Delete button to quit and remove the job from the platform. All job-related data and export history will be permanently removed from the platform. The confirmation box will open.
2. In the confirmation box, click the Yes button to confirm that you want to quit or delete the job.

Option 2: From the job list
1. Select Dashboard from the left menu to navigate to the list of jobs in the platform. Alternatively, the job list can be accessed from the job's associated project window.
2. Click the three dots under the Actions column to display the drop-down menu and available actions for the selected job.
   - Click the Delete button to quit and remove the job from the platform. All job-related data and export history will be permanently removed from the platform. The confirmation box will open.
   - Click the Quit button to stop the job from running. The confirmation box will open.
3. In the confirmation box, click the Yes button to confirm that you want to quit or delete the job.

Access results
Results of a completed batch inference job are enclosed in a single file that contains predictions for all input samples.
The file formats and schema for the various tasks are described below.
Automatic speech recognition CSV file
Results for automatic speech recognition are in a predictions.csv file. The CSV file contains seven columns, described in order below.
Column | Function | Description
---|---|---
Column 1 | Input file path | Uniquely identifies the audio file for which the transcription is used.
Column 2 | Diarized speech segment file path | Uniquely identifies the segment of the input file that has been diarized and transcribed. Each row in the file represents a single speech segment.
Column 3 | Speaker identifier | Helps identify utterances per speaker.
Column 4 | Speech segment start time | The start time of the speech segment in the input audio file.
Column 5 | Speech segment duration | The duration of the speech segment in seconds.
Column 6 | Raw transcript | The raw transcription of the speech segment.
Column 7 | Annotated transcript | Transcript with punctuation for the speech segment.
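As an illustration, the Python sketch below reads predictions.csv with the standard csv module and prints one line per speech segment. It is a minimal sketch that assumes the file has no header row and that the seven columns appear in the order listed above; adjust it if your output differs.

import csv

# Assumed column order (see table above): input file path, segment file
# path, speaker ID, segment start time, duration, raw transcript,
# annotated transcript.
with open("predictions.csv", newline="") as f:
    for row in csv.reader(f):
        input_path, segment, speaker, start, duration, raw, annotated = row
        print(f"{speaker} [{start}s + {duration}s]: {annotated}")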
GPT 1.5B document classification text file
Results for GPT 1.5B document classification are in a test_results_arxiv.txt file. Each row of the text file contains the prediction for the associated input row, as described below.
Row | Function | Description
---|---|---
Row 1 | Index | The index of the input row for which predictions are generated.
Row 2 | Probability | An array of prediction probabilities, in order of the class indices.
Row 3 | Label | The predicted label for the input.
GPT 1.5B named entity recognition text file
Results for GPT 1.5B named entity recognition are in a predictions.txt file. There are no columns in the file. An example prediction is shown below.
{"entities": [{"begin_offset": 9, "end_offset": 14, "entity": "B-LOC", "word": "JAPAN"}, {"begin_offset": 30, "end_offset": 35, "entity": "B-LOC", "word": "CHINA"}], "text": "SOCCER - JAPAN GET LUCKY WIN, CHINA IN SURPRISE DEFEAT."}
GPT 1.5B sentiment analysis
Results for GPT 1.5B sentiment analysis are in a tab-separated values formatted file as described below.
- The filename is test_results_finance_sentiment_monolith.txt.
- Two-column format with no column headers.
- The first column is the index of the test samples, 0-indexed.
- The second column is the label (negative|positive).
0 negative
1 positive
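As an illustration, the following Python sketch collects the results into a dictionary mapping sample index to predicted label. It is a minimal sketch that assumes one prediction per line with the two columns described above.

# Assumes one "index<TAB>label" pair per line, as described above.
labels = {}
with open("test_results_finance_sentiment_monolith.txt") as f:
    for line in f:
        index, label = line.split()  # splits on the tab separator
        labels[int(index)] = label   # label is "negative" or "positive"
print(labels)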
Access procedures
After your batch inference job is complete, you can access the results as described below.
Download results to your local machine
From the Batch Inference screen, click the blue Download Results button to download the results to your local machine.
Downloads are limited to files no larger than 2GB; downloading larger results will fail to complete. Use the Access from NFS method if the file size exceeds 2GB.

Upload to AWS S3
From the Batch Inference screen, upload results from batch inference to a folder in an AWS S3 bucket by clicking the blue Upload Results to AWS button. Provide the following information:
- Bucket: The name of your S3 bucket.
- Folder: The explicit folder name of the dataset. This folder should include the dataset files required for your task.
- Access Key ID: The unique ID that is provided by AWS IAM to manage access.
- Secret Access Key: Allows authentication access for the provided Access Key ID.
- Region: The AWS Region that your S3 bucket resides in.
There is no limit to the number of times results can be uploaded to AWS S3 buckets, but only one upload is allowed at a time.
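If you want to confirm from your own machine that the results landed in the bucket, one option is to list the destination folder with the AWS SDK for Python (boto3), using the same credentials and region entered above. The values in angle brackets are placeholders.

import boto3

# Placeholders: use the bucket, folder, credentials, and region entered
# in the Upload Results to AWS dialog.
s3 = boto3.client(
    "s3",
    aws_access_key_id="<access-key-id>",
    aws_secret_access_key="<secret-access-key>",
    region_name="<region>",
)
response = s3.list_objects_v2(Bucket="<bucket>", Prefix="<folder>/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])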

Access from NFS
You can access the results of a batch inference job directly from NFS and copy them to a location of your choice. The details card in the job results provides the path to the results file on the mounted NFS.
It is recommended to use this method to transfer results larger than 2GB to your local machine.
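For example, a minimal Python sketch for copying the results off the mounted NFS could use shutil; both paths below are placeholders, with the source taken from the details card.

import shutil

# Placeholders: the source is the results file path shown on the details
# card; the destination is a local directory of your choice.
shutil.copy("<results-path-from-details-card>", "<local-destination-directory>")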