SambaStudio Python SDK#
Copyright © 2024 by SambaNova Systems, Inc. Disclosure, reproduction, reverse engineering, or any other use made without the advance written permission of SambaNova Systems, Inc. is unauthorized and strictly prohibited. All rights of ownership and enforcement are reserved.
- class snsdk.SnSdk(host_url: str, access_key: str, tenant_id: str | None = None, command: str | None = None, user_agent: str = 'SambaStudioSDK/23.11.2', disable_ssl_verify: bool = False)#
SambaNova Systems Python SDK for SambaStudio. You can create an instance of the SnSdk class by passing the SambaStudio host URL and the access key.
- Parameters:
host_url (str) – Base URL of the SambaStudio API service. Example: https://example-domain.net.
access_key (str) – SambaStudio API authorization key.
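For example, a minimal sketch of constructing a client (the host URL and access key below are placeholders):
from snsdk import SnSdk

# Placeholder values; substitute your SambaStudio host URL and API authorization key.
sdk = SnSdk(
    host_url="https://example-domain.net",
    access_key="YOUR_ACCESS_KEY",
)
# tenant_id can also be passed to target a specific tenant.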
- add_composite_model(name: str, description: str, dependencies: List[Dict], rdu_required: int) Dict #
Add a composite model.
- Parameters:
name (str) – The name of the composite model.
description (str) – Model description.
dependencies (list[dict]) – The list of dependency model names.
rdu_required (int) – The number of RDUs required to deploy the composite model.
- Returns:
The status code.
- Return type:
dict
- add_dataset(dataset_name: str, apps: Dict[Literal['ids', 'names'], List[str]], dataset_metadata: any, description: str, file_type: str, url: str, language: str, job_type: List[str], source: any) Dict #
[V2 Version] Add a new dataset.
Important
- Dataset paths (dataset_path) used in SambaStudio are relative to the storage root directory <NFS_root>. Paths outside the storage root cannot be used because SambaStudio does not have access to those directories.
- Parameters:
dataset_name (str) – The name of the new dataset.
apps (dict) – The ML Apps to associate with the new dataset, specified by ids or names.
dataset_metadata (str) – The metadata for the dataset.
description (str) – Free-form text description of the dataset.
file_type (str) – Free-form text file types in the dataset.
url (str) – Free-form text including URL source of the dataset.
language (str) – Language of the NLP dataset.
job_type (List[str]) – The job types for the dataset; allowed values are [“evaluation”, “train”], [“batch_predict”], [“train”], or [“evaluation”].
source (str) – The dataset source.
- Returns:
The new dataset’s ID.
- Return type:
dict
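A hedged sketch of registering a dataset; the ML App name, file type, source value, and metadata are illustrative placeholders, and the exact metadata schema depends on the ML App:
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

# All values below are illustrative placeholders.
response = sdk.add_dataset(
    dataset_name="my_text_dataset",
    apps={"names": ["My ML App"]},   # hypothetical ML App name
    dataset_metadata={},             # metadata schema depends on the ML App
    description="Example dataset registered via the SDK.",
    file_type="hdf5",
    url="",
    language="english",
    job_type=["train"],
    source="local",                  # assumed source value
)
print(response)  # contains the new dataset's ID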
- add_endpoint_api_key(project_id: str, endpoint_id: str, api_key: str, description: str) Dict #
Adds an API key to an endpoint.
- Parameters:
project_id (str) – The ID of the project.
endpoint_id (str) – The ID of the endpoint.
api_key (str) – The API key to be added.
description (str) – The API key description.
- Returns:
The API keys for the endpoint.
- Return type:
dict
- add_model(project: str, model_checkpoint: str, model_checkpoint_name: str, job: str, description: str, checkpoint_type: str) Dict #
Add a model from an existing checkpoint.
- Parameters:
project (str) – The project name or project_id that contains the checkpoint.
model_checkpoint (str) – The name of the checkpoint to use to create the new model.
model_checkpoint_name (str) – The name for the new model.
job (str) – The job name or job_id that contains the checkpoint.
description (str) – Model description.
checkpoint_type (str) – The type of checkpoint (pretrained or finetuned).
- Returns:
The status code.
- Return type:
dict
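For instance, promoting a checkpoint from a finished training job to a named model might look like this sketch (all names are placeholders):
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

response = sdk.add_model(
    project="my_project",                        # project containing the checkpoint
    model_checkpoint="checkpoint-1000",          # checkpoint produced by the job
    model_checkpoint_name="my-finetuned-model",  # name for the new model
    job="my_training_job",                       # job that produced the checkpoint
    description="Model created from a fine-tuned checkpoint.",
    checkpoint_type="finetuned",
)
print(response)  # status code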
- add_users(users: List[Dict[str, str]]) Dict #
Adds the specified users.
- Parameters:
users (List[Dict[str, str]]) – The list of users to add.
- Returns:
Response Dict
- Return type:
dict
- admin_job_list() Dict #
Returns the list of running jobs for all users.
Note
Use of this method requires the admin role.
- Returns:
The list of running jobs for all users.
- Return type:
dict
- admin_resource_usage() Dict #
Returns the number of total and available RDUs.
Note
Use of this method requires the admin role.
- Returns:
The number of total and available RDUs.
- Return type:
dict
- app_info(app: str) Dict #
Returns the app details for the ML App.
- Parameters:
app (str) – The ML App name.
- Returns:
The ML App details.
- Return type:
dict
- check_endpoint_updates(endpoint: str) Dict #
Checks for endpoint updates.
- Parameters:
endpoint (str) – The endpoint name or ID.
- Returns:
The endpoint update details if they exist, otherwise an error.
- Return type:
dict
- checkpoint_info(checkpoint_name: str)#
Returns the checkpoint information.
- Parameters:
checkpoint_name (str) – The name of the checkpoint.
- Returns:
Checkpoint information if it exists.
- Return type:
dict
- complete_folder_upload(dataset_id: str, folder_id: str, file_list: List[Dict[str, any]]) Dict #
Gets the folder upload status.
- Parameters:
dataset_id (str) – The datasetId for the newly uploading dataset.
folder_id (str) – The ID of the folder being uploaded.
file_list (List[Dict[str, any]]) – The list of files the folder contains.
- Return type:
dict
- complete_multipart_file_upload(folder_id: str, file_path: str, chunk_size: int, upload_id: str, part_list: List[Dict[str, any]]) Dict #
Returns the file upload status.
- Parameters:
folder_id (str) – The folderId for the newly uploaded dataset folder.
file_path (str) – The path of the new dataset.
chunk_size (int) – The size of the parts into which the dataset is divided.
upload_id (str) – The uploadId for this dataset upload.
part_list (List[Dict[str, any]]) – The list of chunks the file is divided into.
- Return type:
dict
- create_endpoint(project: str, endpoint_name: str, description: str, model_checkpoint: str, instances: int, hyperparams: str, rdu_arch: str) Dict #
Creates a new endpoint.
- Parameters:
project (str) – The project name or ID.
endpoint_name (str) – The name of the endpoint.
description (str) – The description of the endpoint.
model_checkpoint (str) – The model checkpoint to use for the endpoint.
instances (int) – The number of instances to deploy.
hyperparams (str) – The hyperparameters for the endpoint in JSON format.
rdu_arch (str) – The RDU architecture to deploy on.
- Returns:
The endpoint ID.
- Return type:
dict
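A sketch of deploying an endpoint; the checkpoint name, hyperparameter JSON, and RDU architecture string are placeholders whose valid values depend on your installation:
import json

from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

endpoint = sdk.create_endpoint(
    project="my_project",
    endpoint_name="my-endpoint",
    description="Example endpoint created via the SDK.",
    model_checkpoint="my-finetuned-model",
    instances=1,
    hyperparams=json.dumps({}),  # endpoint parameters, if any (JSON string)
    rdu_arch="SN30",             # placeholder; use an architecture available on your cluster
)
print(endpoint)  # contains the endpoint ID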
- create_folder_upload(dataset_id: str, file_list: List[str]) Dict #
Creates the folder upload.
- Parameters:
dataset_id (str) – The datasetId for the newly uploading dataset.
file_list (List[str]) – The list of files the folder contains.
- Returns:
The Datasetpath and folderUploadId.
- Return type:
dict
- create_job(job_type: str, project: str, model_checkpoint: str, job_name: str, description: str, dataset: str, hyperparams: str, load_state: bool, sub_path: str, parallel_instances: int, rdu_arch: str) Dict #
Create a new job.
- Parameters:
job_type (str) – The type of job to create. Options are train or batch_predict.
project (str) – The project name or ID in which to create the job.
model_checkpoint (str) – The model checkpoint to use.
job_name (str) – The name of the job.
description (str) – Job description.
dataset (str) – The dataset to use for the job.
hyperparams (str) – The hyper-parameters for the job in JSON format.
load_state (bool) – Only load weights from the model checkpoint, if True.
sub_path (str) – Folder/file path.
parallel_instances (int) – The number of data parallel instances to run in a training job.
rdu_arch (str) – The RDU architecture to run the job on.
- Returns:
The created job’s details.
- Raises:
ValueError – raised if the job_type is not train or batch_predict.
- Return type:
dict
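A sketch of submitting a training job; the checkpoint, dataset, hyperparameters, and RDU architecture are placeholders:
import json

from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

job = sdk.create_job(
    job_type="train",                            # or "batch_predict"
    project="my_project",
    model_checkpoint="base-model-checkpoint",    # placeholder checkpoint name
    job_name="my_training_job",
    description="Example training job.",
    dataset="my_text_dataset",
    hyperparams=json.dumps({"batch_size": 16}),  # hypothetical hyperparameters
    load_state=False,
    sub_path="",
    parallel_instances=1,
    rdu_arch="SN30",                             # placeholder
)
print(job)  # the created job's details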
- create_multipart_file_upload(folder_id: str, file_path: str, chunk_size: int, hash_algo: str | None = None, hash_digest: str | None = None) Dict #
Creates a multipart upload for a new dataset file and starts the upload.
- Parameters:
folder_id (str) – The folderId for the newly uploaded dataset folder.
file_path (str) – The path of the new dataset.
chunk_size (int) – The size of the parts into which the dataset is divided.
hash_algo (str) – The hashing algorithm used to compute the digest of the original file, e.g. ‘crc32’.
hash_digest (str) – The hash digest of the original file computed with hash_algo.
- Returns:
The new CreateMultipartUploadResult object with Path and UploadId.
- Return type:
dict
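The multipart upload methods (generate_dataset_id, create_folder_upload, create_multipart_file_upload, upload_part_file, complete_multipart_file_upload, and complete_folder_upload) are intended to be used together. The sketch below shows one plausible sequence; the response key names and the exact shapes of part_list, file_list, and the file argument are assumptions, so inspect the real responses from your installation before relying on them:
import os

from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

file_path = "train.hdf5"      # placeholder file to upload
chunk_size = 8 * 1024 * 1024  # illustrative 8 MiB chunks

dataset_id = sdk.generate_dataset_id().get("datasetId")  # assumed key name
folder_id = sdk.create_folder_upload(dataset_id, [file_path]).get("folderUploadId")  # assumed key name
upload_id = sdk.create_multipart_file_upload(folder_id, file_path, chunk_size).get("uploadId")  # assumed key name

# Upload the file chunk by chunk, collecting the per-part responses.
part_list = []
with open(file_path, "rb") as f:
    part_number = 1
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        # The partial file content; the exact expected encoding is an assumption.
        part_list.append(
            sdk.upload_part_file(folder_id, file_path, part_number, upload_id, chunk_size, chunk)
        )
        part_number += 1

sdk.complete_multipart_file_upload(folder_id, file_path, chunk_size, upload_id, part_list)
# The structure of the file_list entries below is assumed.
sdk.complete_folder_upload(dataset_id, folder_id, [{"path": file_path, "size": os.path.getsize(file_path)}])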
- create_project(project_name: str, description: str) Dict #
Creates a new project.
- Parameters:
project_name (str) – The name to be used for the new project.
description (str) – The project description.
- Returns:
The project ID.
- Return type:
dict
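For example (names are placeholders):
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

project = sdk.create_project(
    project_name="my_project",
    description="Example project created via the SDK.",
)
print(project)  # contains the project ID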
- create_tenant(tenant_name: str, display_name: str | None = None) Dict #
Create a new tenant.
- Parameters:
tenant_name (str) – The name of the new tenant.
display_name (str) – The display name of the new tenant [Deprecated].
- Returns:
The tenant ID.
- Return type:
dict
- dataset_info(dataset: str) Dict #
Returns the details for the dataset.
- Parameters:
dataset (str) – The dataset name or ID.
- Returns:
The dataset details.
- Return type:
dict
- delete_checkpoint(checkpoint: str) Dict #
Deletes a checkpoint.
- Parameters:
checkpoint (str) – The checkpoint name.
- Returns:
The status code.
- Return type:
dict
- delete_dataset(dataset: str) Dict #
Deletes the specified dataset.
- Parameters:
dataset (str) – The dataset name or ID.
- Returns:
The status code.
- Return type:
dict
- delete_endpoint(project: str, endpoint: str) Dict #
Removes the endpoint.
- Parameters:
project (str) – The project name or ID associated with the endpoint.
endpoint (str) – The endpoint name or ID.
- Returns:
The status code.
- Return type:
dict
- delete_exported_model(model_id: str, model_activity_id: str) Dict #
Delete an exported model.
- Parameters:
model_id (str) – The ID of the exported model to be deleted.
model_activity_id (str) – Used to monitor the export status of the model. Copying a model checkpoint from the NFS export is currently not supported, so this attribute is not used.
- Returns:
The status code.
- Return type:
dict
- delete_job(project: str, job: str) Dict #
Delete the specified job.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID to delete.
- Returns:
The status code.
- Return type:
dict
- delete_model(model: str) Dict #
Delete a model.
- Parameters:
model (str) – The model name or ID that you wish to delete.
- Returns:
The status code.
- Return type:
dict
- delete_project(project: str) Dict #
Delete the given project.
- Parameters:
project (str) – The project name or ID to delete.
- Returns:
The status code.
- Return type:
dict
- delete_tenant(tenant_name: str) Dict #
Deletes a tenant.
- Parameters:
tenant_name (str) – The name of the tenant to be deleted.
- Returns:
The tenant object.
- Return type:
dict
- delete_user(user_id: str, tenant: str | None = None) Dict #
Deletes the specified user.
- Parameters:
user_id (str) – The ID of the user to delete.
- Returns:
Response Dict
- Return type:
dict
- download_job_artifacts(project: str, job: str, dest_dir: str = './', artifact_type: str = 'results') Dict #
Downloads the requested job artifacts.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID.
dest_dir (str) – Destination directory for the download.
artifact_type (str) – Either results or logs.
- Returns:
The status code.
- Return type:
dict
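A sketch of downloading training logs to a local directory (names and paths are placeholders):
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

result = sdk.download_job_artifacts(
    project="my_project",
    job="my_training_job",
    dest_dir="./artifacts",  # local destination directory (placeholder)
    artifact_type="logs",    # or "results"
)
print(result)  # status code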
- download_logs(project: str, job: str, dest_dir: str = './') Dict #
Downloads the job logs.
- Parameters:
project (str) – The project ID.
job (str) – The job ID.
dest_dir (str) – Destination directory for the download.
- Returns:
The status code.
- Return type:
dict
- download_results(project: str, job: str, dest_dir: str = './') Dict #
Downloads the batch_predict job results.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID.
dest_dir (str) – Destination directory for the download.
- Returns:
The status code.
- Return type:
dict
- edit_endpoint_api_key(project_id: str, api_key: str, status: str, description: str) Dict #
Updates the status or description of an endpoint API key in the project.
- Parameters:
project_id (str) – The ID of the project.
api_key (str) – The API key to be updated.
status (str) – The status of the API key.
description (str) – The API key description.
- Return type:
dict
- endpoint_info(project: str, endpoint: str) Dict #
Gets the endpoint details.
- Parameters:
project (str) – The project name or ID associated with the endpoint.
endpoint (str) – The endpoint name or ID.
- Returns:
The endpoint details if the endpoint exists, otherwise an error.
- Return type:
dict
- endpoint_info_by_id(endpoint: str) Dict #
Gets the endpoint details.
- Parameters:
endpoint (str) – The endpoint name or ID.
- Returns:
The endpoint details if the endpoint exists, otherwise an error.
- Return type:
dict
- export_model(model_id, storage_type) Dict #
Exports a model. SambaNova-owned models cannot be exported.
- Parameters:
model_id (str) – The model ID.
storage_type (str) – Storage type (Local) for the export destination.
- Returns:
The status code.
- Return type:
dict
- exported_model_list() Dict #
Returns the list of the exported models.
- Returns:
The list of the exported models.
- Return type:
dict
- generate_dataset_id() Dict #
Generates a datasetId for a newly uploaded dataset.
- Returns:
The datasetId for the newly uploaded dataset.
- Return type:
dict
- generic_predict(project: str, endpoint: str, key: str, input: list, params: str | None = None, trace_id: str | None = None) Dict #
Predict generic using array of input strings.
- Parameters:
project (str) – Project ID in which the endpoint exists.
endpoint (str) – Endpoint ID.
key (str) – API Key.
input (list) – An array of input strings for the predict API.
params (str) – Input params string.
trace_id (str) – Trace ID for telemetry.
- Returns:
Prediction results.
- Return type:
dict
- generic_predict_stream(project: str, endpoint: str, key: str, input: str, params: str | None = None, trace_id: str | None = None) Generator[Dict, None, None] #
Generic streaming predict using an inline input string.
- Parameters:
project (str) – Project ID in which the endpoint exists.
endpoint (str) – Endpoint ID.
key (str) – API Key.
input (str) – Input string.
params (str) – Input params string.
trace_id (str) – Trace ID for telemetry.
- Returns:
Prediction results.
- Return type:
Generator
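Because this method returns a generator, results are consumed by iterating over it. A sketch, with placeholder IDs and key:
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

for chunk in sdk.generic_predict_stream(
    project="PROJECT_ID",
    endpoint="ENDPOINT_ID",
    key="ENDPOINT_API_KEY",
    input="Summarize the following paragraph: ...",
):
    # Each yielded item is a dict containing a partial prediction result.
    print(chunk)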
- get_feature_list()#
Returns the list of feature flags.
- get_file_upload_status(dataset_id: str, upload_id: str) Dict #
Gets the file upload status.
- Parameters:
dataset_id (str) – The datasetId for the newly uploading dataset.
upload_id (str) – The uploadId for this dataset upload.
- Return type:
dict
- get_metrics(project: str, job: str, random_sample_limit: int = -1) Dict #
Return the metrics for the specified job.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID for which the metrics should be returned.
random_sample_limit (int) – Select metrics at random when the metric dataset exceeds the random_sample_limit.
- Returns:
The training metrics in JSON format.
- Return type:
dict
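For example, fetching at most 1000 randomly sampled metric points for a job (names are placeholders):
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

metrics = sdk.get_metrics(
    project="my_project",
    job="my_training_job",
    random_sample_limit=1000,  # sample when the metric set exceeds this limit
)
print(metrics)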
- import_model(model_id: str, import_path: str, import_model_name: str, storage_type: str, steps: int) Dict #
Imports a model.
- Parameters:
model_id (str) – Model ID.
import_path (str) – Source path to the model.
import_model_name (str) – Name for the imported model.
storage_type (str) – Storage type (Local) for the import source.
steps (int) – Specifies the step for the model checkpoint to start.
- Returns:
The status code.
- Return type:
dict
- job_info(project: str, job: str, verbose: bool = False) Dict #
Return the details for the given job.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID of the job to return.
verbose (bool) – True to include full config in output.
- Returns:
The job details.
- Return type:
dict
- job_log_list(project: str, job: str) Dict #
Returns the list of log file names for the given job.
- Parameters:
project (str) – The project ID.
job (str) – The job ID.
- Returns:
The list of log file names for the given job.
- Return type:
dict
- job_log_preview(project: str, job: str, log_file: str) Dict #
Returns the preview of the given log file.
- Parameters:
project (str) – The project ID.
job (str) – The job ID.
log_file (str) – log file name.
- Return type:
dict
- list_apps() Dict #
Returns the list of supported ML Apps.
- Returns:
The list of ML Apps.
- Return type:
dict
- list_checkpoints(project: str, job: str) Dict #
Returns the list of checkpoints for associated project and job.
- Parameters:
project (str) – The project name or project_id.
job (str) – The job name or job_id.
- Returns:
A list of checkpoints generated by the job.
- Return type:
dict
- list_datasets() Dict #
Returns the list of supported datasets.
- Returns:
The list of supported datasets
- Return type:
dict
- list_endpoints(project: str | None = None) Dict #
Returns all endpoints belonging to the user if the project is not specified. Returns the list of endpoints associated with the project when the project is specified.
- Parameters:
project (str) – The project name or project_id.
- Returns:
The list of the endpoints.
- Return type:
dict
- list_exports(project: str, job: str) Dict #
Returns the list of exported batch predict results.
- Parameters:
project (str) – The project name or ID.
job (str) – The job name or ID.
- Returns:
List of exports.
- Return type:
dict
- list_jobs(project_id: str | None = None) Dict #
Return the list of jobs in the given project.
Return the list of all jobs for the user if the project_id is not set.
- Parameters:
project_id (str) – The project ID.
- Returns:
The job listing.
- Return type:
dict
- list_models(verbose: bool = False) Dict #
Returns the list of available models.
- Parameters:
verbose (bool) – If True, provides detailed information about the models.
- Returns:
Dict of models.
- Return type:
dict
- list_notifications(page: int | None = None, limit: int | None = None, levels: str | None = None, archived: bool | None = None, type: str | None = None, created_start: datetime | None = None, created_end: datetime | None = None, ordering: str | None = None, all: bool = False, project_id: str | None = None, job_id: str | None = None, endpoint_id: str | None = None)#
Return a list of notifications for the current user.
Returns one page at a time with up to limit elements.
- Raises:
ValueError – raised if limit is <= 0, levels is not one of 10, 20, 30, 40, or 50, or type is not ‘user’ or ‘admin’.
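A sketch of paging through user notifications; the level value below is a placeholder drawn from the allowed set (10, 20, 30, 40, 50):
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

notifications = sdk.list_notifications(
    page=1,
    limit=20,
    levels="30",  # placeholder level from the allowed set
    type="user",
)
print(notifications)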
- list_projects() Dict #
Returns the list of projects.
- Returns:
The list of projects.
- Return type:
dict
- list_roles() Dict #
Returns the list of roles.
- Returns:
List of roles.
- Return type:
dict
- list_tenants() Dict #
Return the list of tenants.
- Returns:
List of tenants.
- Return type:
dict
- list_users() Dict #
Returns the list of users in the tenants and organizations.
- Returns:
List of users.
- Return type:
dict
- login(server: str, username: str, password: str) Dict #
Login using username and password.
- Parameters:
server (str) – Base URL of the SambaStudio API service. Example: https://example-domain.net.
username (str) – Username for login.
password (str) – Password for login.
- Returns:
A dict containing the API key.
- Return type:
dict
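A sketch of logging in with a username and password; the credentials are placeholders, and the returned dict contains the API key:
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

creds = sdk.login(
    server="https://example-domain.net",
    username="jane.doe",
    password="********",
)
print(creds)  # contains the API key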
- model_info(model: str, job_type: str) Dict #
Returns the details of the model.
- Parameters:
model (str) – The model name or ID to retrieve.
job_type (str) – The job type (train, batch_predict, or deploy).
- Returns:
The model details.
- Return type:
dict
- nlp_predict(project: str, endpoint: str, key: str, input: List[str] | str, params: str | None = None, trace_id: str | None = None) Dict #
NLP predict using inline input string.
- Parameters:
project (str) – Project ID in which the endpoint exists.
endpoint (str) – Endpoint ID.
key (str) – API Key.
input (str) – Input string or a list of input strings.
params (str) – Input params string.
trace_id (str) – Trace ID for telemetry.
- Returns:
Prediction results.
- Return type:
dict
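A minimal inference sketch with placeholder project and endpoint IDs and API key:
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

response = sdk.nlp_predict(
    project="PROJECT_ID",
    endpoint="ENDPOINT_ID",
    key="ENDPOINT_API_KEY",
    input="What is the capital of France?",
)
print(response)  # prediction results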
- nlp_predict_file(project: str, endpoint: str, key: str, file_path: str) Dict #
NLP predict using file input.
- Parameters:
project (str) – Project ID in which the endpoint exists.
endpoint (str) – Endpoint ID.
key (str) – API Key.
file_path (str) – Input file location.
- Returns:
Prediction results.
- Return type:
dict
- project_info(project: str) Dict #
Returns the project details.
- Parameters:
project (str) – The project name or ID.
- Returns:
The project details.
- Return type:
dict
- resume_job(job_type: str, project: str, job_name: str, hyperparams: str, load_state: bool) Dict #
Resumes an existing job.
- Parameters:
job_type (str) – The type of job to resume; train is the supported option.
project (str) – The project name or ID in which the job exists.
job_name (str) – The name of the job.
hyperparams (str) – The hyper-parameters for the job in JSON format.
load_state (bool) – Only load weights from the model checkpoint, if True.
- Returns:
The created job’s details.
- Raises:
ValueError – raised if the job_type is not train or batch_predict.
- Return type:
dict
- search_dataset(dataset_name: str) Dict #
Searches by dataset name and returns its dataset_id, if it exists.
- Parameters:
dataset_name (str) – The dataset name.
- Returns:
The dataset ID (dataset_id).
- Return type:
dict
- search_job(project: str, job_name: str) Dict #
Searches by project (name or project_id) and job name and returns the job_id, if it exists.
- Parameters:
project (str) – The project name or project_id.
job_name (str) – The job name.
- Returns:
The job ID (job_id).
- Return type:
dict
- search_model(model_name: str) Dict #
Searches by model name and returns its model_id, if it exists.
- Parameters:
model_name (str) – The model name for which to search.
- Returns:
The model’s ID.
- Return type:
dict
- search_project(project_name: str) Dict #
Searches by project name and returns its project_id, if it exists.
- Parameters:
project_name (str) – The project name.
- Returns:
The project ID (project_id).
- Return type:
dict
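The search_* helpers are commonly used to resolve human-readable names to IDs before calling other methods. A sketch with placeholder names:
from snsdk import SnSdk

sdk = SnSdk(host_url="https://example-domain.net", access_key="YOUR_ACCESS_KEY")

print(sdk.search_project("my_project"))                 # project_id
print(sdk.search_job("my_project", "my_training_job"))  # job_id
print(sdk.search_model("my-finetuned-model"))           # model_id
print(sdk.search_dataset("my_text_dataset"))            # dataset_id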
- stop_endpoint(project: str, endpoint: str) Dict #
Stops the endpoint from running without deleting it.
- Parameters:
project (str) – The project name or ID associated with the endpoint.
endpoint (str) – The endpoint name or ID.
- Returns:
The status code.
- Return type:
dict
- stop_job(project: str, job: str) Dict #
Stop the specified job.
- Parameters:
project (str) – The project name or ID in which the job exists.
job (str) – The job name or ID of the job to stop.
- Returns:
The status code.
- Return type:
dict
- tenant_default_tenant() Dict #
Return the default tenant of a user.
- tenant_info(tenant: str) Dict #
Return the details of a tenant.
The returned details include the tenant name, time the tenant was created, and time the tenant was last updated.
- Parameters:
tenant (str) – The tenant name or ID.
- Returns:
The tenant details.
- Return type:
dict
- update_endpoint(project: str, endpoint: str, description: str | None = None, instances: int | None = None, rdu_arch: str | None = None) Dict #
Update the endpoint.
- Parameters:
project (str) – The project name or ID.
endpoint (str) – The endpoint name or ID.
description (str) – The endpoint description.
instances (int) – Number of instances.
rdu_arch (str) – The RDU architecture.
- Returns:
The status code.
- Return type:
dict
- update_job(project: str, job: str, name: str, description: str) Dict #
Updates the job name and/or description.
- Parameters:
project (str) – The name or id of the project.
job (str) – The name or id of the job.
name (str) – New name.
description (str) – New description.
- Returns:
Success/Error response
- Return type:
dict
- update_project(project: str, name: str, description: str) Dict #
Updates the project name and/or description.
- Parameters:
project (str) – The name or id of the project.
name (str) – The new name to be used for the project.
description (str) – The new description to be used for the project.
- Returns:
Success/Error response.
- Return type:
dict
- update_tenant(tenant_name: str, display_name: str, rdu_node_count: int, arch: str) Dict #
Updates the existing tenant's display_name and rdu_node_count.
- Parameters:
tenant_name (str) – The name of the tenant to update.
display_name (str) – The display name of the tenant [Deprecated].
rdu_node_count (int) – Number of RDUs allocated to the tenant.
arch (str) – The RDU architecture.
- Returns:
Updated tenant details.
- Return type:
dict
- upload_part_file(folder_id: str, file_path: str, part_number: int, upload_id: str, chunk_size: int, file: str) Dict #
Uploads one part of a new dataset file.
- Parameters:
folder_id (str) – The folderId for the newly uploaded dataset folder.
file_path (str) – The file path.
part_number (int) – The part number being uploaded.
upload_id (str) – The uploadId for this dataset upload.
chunk_size (int) – The size of the parts into which the dataset is divided.
file (str) – The partial file content.
- Return type:
dict
- upload_results_to_aws(project: str, job: str, bucket: str, folder: str, access_key_id: str, secret_access_key: str, session_token: str, region_name: str) Dict #
Uploads the batch predict results to AWS.
- Parameters:
project (str) – The project name or ID.
job (str) – The job name or ID.
bucket (str) – AWS S3 bucket name.
folder (str) – AWS S3 object name.
access_key_id (str) – AWS access key.
secret_access_key (str) – AWS secret access key.
session_token (str) – AWS temporary token.
region_name (str) – AWS region name.
- Returns:
The export results job ID.
- Return type:
dict
- snsdk.check_uuid(id) bool #
Checks whether the given id is a valid UUID.