Image classification tutorial
Image classification is the task of assigning an image one or more labels from a set of predefined classes. In this exercise, we describe a general approach to image classification on the SambaStudio platform.
Data preparation
In an image classification task, the data typically consists of a set of images along with corresponding labels. The images can be in any standard format, such as JPEG (Joint Photographic Experts Group) or PNG (Portable Network Graphics).
Training dataset requirements
An uploaded dataset is expected to have the following components:
- A directory of images.
- A labels.csv file, responsible for mapping each image location to its label and an identifier of the train, test, or validation split.
- A class_to_idx.json file [Optional]. This file is responsible for mapping the class’s verbose name to the class index. It provides a way to retrieve the human-readable interpretation from the index number that corresponds to a specific class label.
The uploaded data should have a directory structure similar to the below example.
.
└── data_root/
├── images/
├── labels.csv
└── class_to_idx.json # Optional
The directory names themselves are not fixed. Other layouts are also valid, as long as the relative paths in labels.csv resolve correctly.
Image formats
JPEG (.jpg extension) and PNG (.png extension) are the allowed formats. All images should be three-channel RGB with uint8 encoding. For example, if the initial images have a fourth alpha channel, it will need to be removed during the dataset processing step.
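A fourth channel can be dropped with Pillow during preprocessing; a minimal sketch (the image here is generated in memory for illustration, but in practice it would come from Image.open):

```python
from PIL import Image

# An example four-channel (RGBA) image; in practice this would be Image.open("photo.png").
rgba = Image.new("RGBA", (32, 32), (255, 0, 0, 128))

# Converting the mode to RGB drops the alpha channel, leaving a three-channel uint8 image.
rgb = rgba.convert("RGB")
print(rgb.mode, len(rgb.getbands()))  # RGB 3
```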
Labels CSV
You are required to provide a .csv file specifying the appropriate image-label pairs, in addition to an indicator of whether each sample should be treated as training, test, or validation data. This information is denoted by the column headers described below:
- The image_path header denotes the relative path to a given image inside of the dataset directory.
- The label header denotes the class ID included in the image, ranging from [0..n-1], where n is the number of classes. In the case of multi-label classification, labels are separated by a space if a sample has multiple labels present.
- The subset header denotes one of train, test, or validation, indicating whether the image is in the training, test, or validation set.
- The metadata header [Optional] denotes information relating to the given input data.
$ column -s, -t caltech256.csv | head -n 4
image_path label subset metadata
./images/138.mattress/138_0117.jpg 0 train
./images/138.mattress/138_0103.jpg 0 3 11 validation
./images/138.mattress/138_0088.jpg 0 train
The column command is used above to pretty-print the .csv for display purposes.
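The column conventions above can be sanity-checked programmatically; a sketch with pandas, using an in-memory stand-in for the labels.csv file:

```python
from io import StringIO

import pandas as pd

# An in-memory stand-in for a labels.csv file; in practice use pd.read_csv("labels.csv").
csv_text = """image_path,label,subset,metadata
./images/138.mattress/138_0117.jpg,0,train,
./images/138.mattress/138_0103.jpg,0 3 11,validation,
./images/138.mattress/138_0088.jpg,0,train,
"""
df = pd.read_csv(StringIO(csv_text))

# Every row must belong to one of the three recognized splits.
assert df["subset"].isin({"train", "test", "validation"}).all()

# Multi-label samples carry space-separated class IDs in the label column.
labels = df["label"].astype(str).str.split()
print(labels.tolist())  # [['0'], ['0', '3', '11'], ['0']]
```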
The class index mapping file
The class_to_idx.json file is responsible for mapping the human-interpretable name to the class index. The expected format of this file is a string-to-index mapping dictionary.
$ python -m json.tool imagenet1000.json | head
{
"tench, Tinca tinca": 0,
"goldfish, Carassius auratus": 1,
"great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias": 2,
"tiger shark, Galeocerdo cuvieri": 3,
"hammerhead, hammerhead shark": 4,
"electric ray, crampfish, numbfish, torpedo": 5,
"stingray": 6,
"cock": 7,
"hen": 8,
The app does not check the existence of this file or its correctness.
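Because the app does not validate this file, it is worth checking the mapping yourself; a minimal sketch of writing and inverting a class_to_idx.json (the class names are illustrative):

```python
import json

# A tiny illustrative name-to-index mapping in the expected string-index format.
class_to_idx = {"mattress": 0, "goldfish": 1}

with open("class_to_idx.json", "w") as f:
    json.dump(class_to_idx, f, indent=4)

# Invert the mapping to recover a human-readable name from a predicted class index.
with open("class_to_idx.json") as f:
    idx_to_class = {idx: name for name, idx in json.load(f).items()}
print(idx_to_class[0])  # mattress
```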
CIFAR 100 example
Download the CIFAR 100 data from https://www.cs.toronto.edu/~kriz/cifar.html and extract the cifar-100-python archive into your working directory.
import asyncio
import aiofiles
from io import BytesIO
from PIL import Image
import pickle
from pathlib import Path
import pandas as pd

# Change to False if only the labels.csv file needs to be processed
SAVE_IMAGE = True
# The async code will open too many files at one time. Let's limit this
num_of_max_files_open = 200

data_dir = Path('./data')
data_dir.mkdir(exist_ok=True)

def unpickle(file):
    with open(file, 'rb') as fo:
        data = pickle.load(fo, encoding='bytes')
    return data

def load_subset(subset):
    data = unpickle(f'./cifar-100-python/{subset}')
    filenames = data[b'filenames']
    labels = data[b'fine_labels']
    images = data[b'data']
    assert len(labels) == len(images)
    assert len(filenames) == len(images)
    return filenames, labels, images

async def save_image(path: str, image: memoryview) -> None:
    async with aiofiles.open(path, "wb") as file:
        await file.write(image)

async def write_image(filename, label, image, subset, row, sem):
    subset_dir = data_dir / subset
    filepath = subset_dir / filename.decode()
    async with sem:
        # Each row of the array stores a 32x32 colour image. The first 1024 entries contain
        # the red channel values, the next 1024 the green, and the final 1024 the blue. The
        # image is stored in row-major order, so that the first 32 entries of the array are
        # the red channel values of the first row of the image.
        if SAVE_IMAGE:
            image = image.reshape(3, 32, 32).transpose(1, 2, 0)
            img = Image.fromarray(image)
            buffer = BytesIO()
            img.save(buffer, format='png')
            await save_image(filepath, buffer.getbuffer())
    if row % 100 == 0:
        print(f"{row:05d}", flush=True)
    if subset == 'test':
        # We use the ``test`` set as the ``validation`` set in this example
        subset = 'validation'
    return [str(filepath.relative_to(data_dir)), str(label), subset, '']

async def process_subset(subset):
    subset_dir = data_dir / subset
    subset_dir.mkdir(exist_ok=True)
    tasks = []
    sem = asyncio.Semaphore(num_of_max_files_open)
    filenames, labels, images = load_subset(subset)
    for row, sample in enumerate(zip(filenames, labels, images)):
        tasks.append(asyncio.ensure_future(write_image(*sample, subset=subset, row=row, sem=sem)))
    results = await asyncio.gather(*tasks)
    df = pd.DataFrame(results, columns=["image_path", "label", "subset", "metadata"])
    return df

async def main():
    print("Processing training images")
    train_df = await process_subset('train')
    print("Processing test images")
    test_df = await process_subset('test')
    df = pd.concat([train_df, test_df])
    df.to_csv(data_dir / 'labels.csv', index=False)

asyncio.run(main())
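The CIFAR 100 archive also ships a meta file containing the verbose fine-label names in index order; a sketch for generating the optional class_to_idx.json from it (the b'fine_label_names' key assumes the standard CIFAR-100 Python archive layout):

```python
import json
import pickle
from pathlib import Path

def build_class_to_idx(meta_path: str, out_path: str) -> dict:
    """Write a class_to_idx.json mapping verbose fine-label names to class indices."""
    with open(meta_path, 'rb') as fo:
        meta = pickle.load(fo, encoding='bytes')
    # b'fine_label_names' holds the verbose class names in index order.
    names = [name.decode() for name in meta[b'fine_label_names']]
    class_to_idx = {name: idx for idx, name in enumerate(names)}
    Path(out_path).write_text(json.dumps(class_to_idx, indent=4))
    return class_to_idx
```

For example, build_class_to_idx('./cifar-100-python/meta', './data/class_to_idx.json') would place the file next to the generated labels.csv.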
Fine-tune
You will need to add your dataset to the platform before you can use it for fine-tuning.
Add the dataset
If you plan to use the sample dataset provided by the platform, you can skip this section and proceed to Create a project using the GUI.
Follow the steps below to add the dataset using the GUI.
- Click Datasets from the left menu to navigate to the Dataset Hub window.
- Click the Add dataset button. The Add a dataset window will open.
Figure 1. Dataset Hub
- In the Dataset name field, input a name for your dataset.
- From the Job type dropdown, select whether the dataset is to be used for Train/Evaluation or Batch inference.
- The Share settings drop-down provides options for which tenant to share your dataset with.
  - Share with <current-tenant> allows the dataset to be shared with the current tenant you are using, identified by its name in the drop-down.
  - Share with all tenants allows the dataset to be shared across all tenants.
  - Dataset will be shared with all users in <current-tenant> identifies that the dataset will be shared with other users in the tenant you are using.
  If the Dataset will be shared with all users in <current-tenant> option is displayed, the Share with <current-tenant> and Share with all tenants options described above will not be available. Share with all tenants is an optional feature of SambaStudio. Please contact your administrator or SambaNova representative for more information.
- From the Applicable ML Apps drop-down, select the ML App(s) that you wish the dataset to be associated with. Multiple ML Apps can be selected.
  Be sure to select appropriate ML Apps that correspond with your dataset, as the platform will not warn you of selected ML Apps that do not correspond with your dataset.
Figure 2. Add a dataset
Import the dataset from AWS S3
Follow the steps below to import your dataset from AWS S3.
- Select AWS from the Source drop-down.
- In the Bucket field, input the name of your S3 bucket.
- Input the relative path to the dataset in the S3 bucket into the Folder field. This folder should include the required dataset files for the task (for example, the labels, training, and validation files).
- In the Access key ID field, input the unique ID provided by AWS IAM to manage access.
- Enter your Secret access key into the field. This allows authentication access for the provided Access key ID.
- Enter the AWS Region that your S3 bucket resides in into the Region field.
An Access key, Secret access key, and user access permissions are required for AWS S3 import.
Figure 3. Import from AWS S3
Create a project using the GUI
Follow the steps below to create a project using the GUI.
- Click Projects from the left menu to view the Projects window.
- Click New project. The Create a new Project window will open.
- Enter a name to be used into the Project name field.
- Add a brief description into the Description field.
- Confirm that you want to create the new project:
  - Click Save and close to create your project and go to the Projects window.
  - Click Save and continue to create your project and go to your new project’s window. From the project window you can create a new job or new endpoint to be associated with your project.
  - Click Cancel to stop the creation of a new project and return to the Projects window.
Training jobs
You can fine-tune existing models by creating a new training job. The platform will perform the required data processing, such as tokenization, behind the scenes. You can select either a platform provided dataset or your own dataset.
Create a training job using the GUI
You no longer need to specify a Task during job creation. Instead, you select a model to use directly during the workflow or use the new ML App field to filter the list of models to use.
Create a new training job using the GUI for fine-tuning by following the steps below.
To navigate to a project, click Projects from the left menu and then select the desired project from the Projects window. See the Projects document for information on how to create and delete projects.
- Create a new project or use an existing one.
- From a project window, click New job. The Create a new job window will appear.
- Select Train model under Create a new job.
- Enter a name for the job into the Job name field.
- Select Image classification from the ML App drop-down.
- From the Select dataset drop-down, choose My datasets, SambaNova datasets, or Select from datasets.
  - My datasets displays a list of datasets that you have added to the platform and can be used for the selected ML App.
  - SambaNova datasets displays a list of platform-provided datasets for the selected ML App.
  Figure 5. Create a training job
  - Select from datasets displays the Dataset Hub window with a detailed list of datasets that can be used for the selected ML App. The My datasets and SambaNova checkboxes filter the dataset list by their respective group. The ML App drop-down filters the dataset list by the corresponding ML App. Choose the dataset you wish to use and confirm your choice by clicking Use dataset.
  Figure 6. Dataset Hub
- Set the hyperparameters to govern your training job or use the default values. Expand the Hyperparameters & settings pane by clicking the blue double arrows to set hyperparameters and adjust settings.
The num_intended_classes setting needs to match the number of classes in your dataset. For the CIFAR 100 example, the num_intended_classes setting would be 100.
Figure 7. Hyperparameters & settings
Evaluate the job using the GUI
Navigate to a Training job’s detail page during the job run (or after its completion) to view job information, generated checkpoints, and metrics. You can evaluate a checkpoint's accuracy, loss, and other metrics to determine if the checkpoint is of sufficient quality to deploy.
Navigate to a Training job’s detail page from the Dashboard or from its associated Project page.
View information and metrics
You can view the following information and metrics about your training job.
- Model: Displays the model name and architecture used for training.
- Dataset: Displays the dataset used, including its size.
- Details & Hyperparameters: Displays the number of RDUs utilized and the batch size. Click More to view a list of the hyperparameters and settings used during training. Click Less to hide the hyperparameters and settings list.
Figure 8. Expanded Details & Hyperparameters
- Progress bar: Displays the state of the training job as well as the percentage completed of the training run.
- Metrics graph: Displays the various metrics generated during the training run. GPT 1.5B models, such as GPT_1.5B_NER_FINETUNED, generate additional metrics. Click Expand to view the additional metrics. Click Collapse to hide the additional metrics.
Figure 9. Expanded additional metrics
- Checkpoints table: Displays generated checkpoints of your training run.
  - You can customize your view of the Checkpoints table by enabling/disabling columns, from the Columns drop-down, to help you focus on comparing metrics that are relevant to you.
  - Download a CSV file of your checkpoints by clicking Export and selecting Download as CSV from the drop-down. The CSV file will be downloaded to the location configured by your browser.
  - From the Actions column drop-down, you can select Save to Model Hub or Delete for a checkpoint.
  Figure 10. Checkpoints table
  - For GPT 1.5B models, you can view the Confusion matrix for a checkpoint to further understand checkpoint performance. From the Actions drop-down, click Checkpoint metrics. The Confusion matrix window will open.
  Figure 11. Confusion matrix
  All labels listed in your labels file must be represented in the validation dataset. This ensures that the confusion matrix does not generate errors associated with missing labels or incorrectly attributed metrics.
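The requirement that every label appear in the validation split can be checked before uploading; a sketch with pandas over an in-memory single-label labels.csv:

```python
from io import StringIO

import pandas as pd

# An in-memory stand-in for a labels.csv file; in practice use pd.read_csv("labels.csv").
df = pd.read_csv(StringIO(
    "image_path,label,subset,metadata\n"
    "a.jpg,0,train,\n"
    "b.jpg,1,train,\n"
    "c.jpg,0,validation,\n"
))

# Labels that never appear in the validation split would cause confusion matrix issues.
all_labels = set(df["label"])
val_labels = set(df.loc[df["subset"] == "validation", "label"])
print(all_labels - val_labels)  # {1}: label 1 is missing from the validation split
```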
Evaluate the job using the CLI
Similar to the GUI, the SambaNova API (snapi) provides feedback on job performance via the CLI. The example below demonstrates the snapi job metrics command. You will need to specify the following:
- The project that contains, or is assigned to, the job whose metrics you wish to view.
- The name of the job whose metrics you wish to view.
If a Confusion Matrix can be generated for the job, the path to the generated matrix will be displayed in the output.
$ snapi job metrics \
--project <project-name> \
--job <job-name>
TRAINING
INDEX TRAIN_LOSS TRAIN_STEPS
0 0.0 0.0
1 2.47 10.0
2 2.17 20.0
3 2.02 30.0
4 2.06 40.0
5 2.01 50.0
6 2.0 60.0
7 1.93 70.0
8 2.0 80.0
9 1.95 90.0
10 2.0 100.0
VALIDATION
INDEX VAL_STEPS VAL_LOSS VAL_STEPS_PER_SECOND
0 0.0 2.04 0.13
1 50.0 2.03 0.13
2 100.0 2.03 0.13
Confusion Matrix generated here -> <path-to-generated-confusion-matrix-jpeg>
Reviewing metrics
Navigate to a Training job’s detail page from the Dashboard or from its associated Project page to review its metrics.
- The details page provides information about your training job including its completion status. Click Expand to view additional information.
Figure 12. Training details
- Click Collapse to hide additional information.
Figure 13. Expanded additional information
- Customize your view of the Checkpoints table by enabling/disabling columns to help you focus on comparing information that is relevant to you.
- Download a CSV file of your checkpoints by clicking Export and selecting Download as CSV from the drop-down.
Figure 14. Checkpoints table
Save a checkpoint to the model hub
Once you’ve identified a checkpoint to use for inference or further fine-tuning, save it to the Model Hub. Do this by clicking the 3-dot menu associated with that checkpoint, selecting Save to Model Hub, and providing a Name and Description to help you identify the checkpoint.
View logs using the GUI
The Logs section allows you to preview and download logs of your training session. Logs can help you track progress, identify errors, and determine the cause of potential errors.
Logs can be visible in the platform earlier than other data, such as metrics, checkpoints, and job progress.
- From the Preview drop-down, select the log file you wish to preview.
  - The Preview window displays the latest 50 lines of the log.
  - To view more than 50 lines of the log, use the Download all feature to download the log file.
- Click Download all to download a compressed file of your logs. The file will be downloaded to the location configured by your browser.
View logs using the CLI
Similar to View logs using the GUI, you can use the SambaNova API (snapi) to preview and download logs of your training session.
View the job log file names
The example below demonstrates the snapi job list-logs command. Use this command to view the log file names of your training job. This is similar to using the Preview drop-down menu in the GUI to view and select your job log file names. You will need to specify the following:
- The project that contains, or is assigned to, the job whose log file names you wish to view.
- The name of the job whose log file names you wish to view.
$ snapi job list-logs \
--project <project-name> \
--job <job-name>
train-0fb0568c-ca8e-4771-b7cf-e6ef156d1347-1-ncc9n-runner.log
train-0fb0568c-ca8e-4771-b7cf-e6ef156d1347-1-ncc9n-model.log
Preview a log file
After you have viewed the log file names for your training job, you can use the snapi job preview-log command to preview the logs of a selected log file. The example below demonstrates the command. You will need to specify the following:
- The project that contains, or is assigned to, the job whose log file you wish to preview.
- The name of the job whose log file you wish to preview.
- The name of the log file you wish to preview. This file name is returned by running the snapi job list-logs command, described above.
$ snapi job preview-log \
--project <project-name> \
--job <job-name> \
--file train-0fb0568c-ca8e-4771-b7cf-e6ef156d1347-1-ncc9n-runner.log
2023-08-10 20:28:46 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Runner starting...
2023-08-10 20:28:46 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Runner successfully started
2023-08-10 20:28:46 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Received new train request
2023-08-10 20:28:46 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Connecting to modelbox at localhost:50061
2023-08-10 20:28:54 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Running training
2023-08-10 20:28:54 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Staging dataset
2023-08-10 20:28:54 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - initializing metrics for modelbox:0
2023-08-10 20:28:54 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - initializing checkpoint path for modelbox:0
2023-08-10 20:31:35 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Preparing training for modelbox:0
2023-08-10 20:31:35 - INFO - 20be599f-9ea7-44ea-9dc5-b97294d97529 - Running training for modelbox
Download the logs
Use the snapi job download-logs command to download a compressed file of your training job’s logs. The example below demonstrates the command. You will need to provide the following:
- The project that contains, or is assigned to, the job whose compressed log file you wish to download.
- The name of the job whose compressed log file you wish to download.
$ snapi job download-logs \
--project <project-name> \
--job <job-name>
Successfully Downloaded: <job-name> logs
The default destination for the compressed file download is the current directory. To specify a different destination directory, use the command's destination option.