SambaStudio release notes
Release 25.1.1 (2025-02-13)
The SambaStudio 25.1.1 release focuses on performance and quality improvements.
New features
- With this release, vision models are supported with the OpenAI compatible API (see the sketch after this list).
  - The OpenAI compatible API is now the recommended API for vision models.
- Added compatibility for new model options released as part of Samba-1 Turbo 25.1.1-MP1.
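The hedged Python sketch below illustrates one way a vision model might be queried through an OpenAI compatible API using the official openai client. The base URL, API key, model name, and image URL are placeholders (assumptions for illustration), not values from this release; use the details shown for your own endpoint.

```python
# Hedged sketch: querying a vision model through an OpenAI compatible API.
# The base_url, api_key, model name, and image URL are placeholders only.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-sambastudio-host>/v1",  # placeholder endpoint URL
    api_key="<your-api-key>",                       # placeholder key
)

response = client.chat.completions.create(
    model="<your-vision-model>",  # placeholder model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```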
Performance and quality improvements
- General GUI layout updates.
- Multiple GUI improvements to the Add a checkpoint from storage using the GUI workflow.
- Model card information updates.
- Performance improvements to the export checkpoints using the CLI workflow.
Known issues
Release 24.11.1 (2025-01-24)
The SambaStudio 24.11.1 version release features are described below.
New features
- Enhanced the SambaNova API (SNAPI) command-line interface (CLI) for exporting user-created models from SambaStudio.
- Added the Debug mode feature to the Playground.
  - Debug mode enables SambaStudio to create verbose logs that can help troubleshoot specific issues or assist during testing phases.
- Enhanced the features for adding your own checkpoints to the SambaStudio Model Hub when using the Import a checkpoint using the CLI or Add a checkpoint from storage using the GUI workflows.
- Added an early access version of Speculative decoding to this release (see the sketch after this list).
  - Speculative decoding is an assisted decoding technique meant to accelerate the inference performance of a larger target model by using the generated tokens of a smaller draft model.
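To illustrate the general idea behind speculative decoding (a conceptual sketch, not SambaStudio's implementation), the hedged Python example below shows the draft-then-verify loop: a small draft model proposes a few tokens, and the larger target model keeps the longest prefix it agrees with. The function names, greedy acceptance rule, and toy models are assumptions for illustration only.

```python
# Hedged sketch of the draft-then-verify loop behind speculative decoding.
# draft_next_tokens() and target_next_token() stand in for real models; the
# greedy acceptance rule is a simplification for illustration only.
from typing import Callable, List

def speculative_decode(
    prompt: List[int],
    draft_next_tokens: Callable[[List[int], int], List[int]],
    target_next_token: Callable[[List[int]], int],
    max_new_tokens: int = 32,
    draft_len: int = 4,
) -> List[int]:
    tokens = list(prompt)
    generated = 0
    while generated < max_new_tokens:
        # 1. The small draft model cheaply proposes several tokens ahead.
        proposal = draft_next_tokens(tokens, draft_len)
        # 2. The target model verifies the proposal position by position.
        #    (Real implementations score the whole proposal in a single target
        #    forward pass, which is where the speedup comes from.)
        for drafted in proposal:
            if generated >= max_new_tokens:
                break
            accepted = target_next_token(tokens)
            tokens.append(accepted)  # the target model's token is always kept
            generated += 1
            if accepted != drafted:
                break                # disagreement: discard the rest of the draft
    return tokens

if __name__ == "__main__":
    # Toy demo: both "models" emit incrementing token ids, so every drafted
    # token is accepted and eight new tokens are produced in two rounds.
    draft = lambda toks, n: [toks[-1] + i + 1 for i in range(n)]
    target = lambda toks: toks[-1] + 1
    print(speculative_decode([0], draft, target, max_new_tokens=8))
```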
Bug fixes
- Fixed an issue where sometimes the default RDU hardware generation was not selected when creating a new endpoint.
- Fixed an issue where selecting SambaNova Systems as the Owner in the Model Hub would sometimes display user-created Composition of Experts (CoE) models.
- Fixed an issue where adding a checkpoint from storage using the GUI would sometimes allow the name of an existing model to be entered as the new model name.
Known issues
Release 24.10.1 (2024-12-03)
The SambaStudio 24.10.1 version release features are described below.
New features
- Added additional information to the System Health box.
  - The number of healthy/unhealthy nodes per total number of nodes is displayed at the bottom of the box. Use the drop-down to view information for All Tenants or a selected tenant.
  - The number of healthy/unhealthy RDUs per total number of RDUs is displayed at the bottom of the box. Use the drop-down to view information for All Tenants or a selected tenant.
- Added an early access version of function calling, which enables dynamic workflows by allowing the model to select and suggest function calls based on user input (see the sketch after this list).
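The hedged Python sketch below shows the general shape of a function-calling interaction. It assumes the widely used OpenAI-style tools schema purely for illustration; this release note does not specify the request format of the early access feature, and the base URL, API key, model name, and tool definition are placeholders.

```python
# Hedged illustration of function calling, assuming an OpenAI-style tools
# schema. The base_url, api_key, model name, and tool are placeholders only.
import json
from openai import OpenAI

client = OpenAI(base_url="https://<your-sambastudio-host>/v1", api_key="<your-api-key>")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="<your-model>",  # placeholder model identifier
    messages=[{"role": "user", "content": "What is the weather in Lisbon?"}],
    tools=tools,
)

# If the model decides a tool is useful, it returns a suggested call instead of text.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```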
Bug fixes
- Fixed an issue where sometimes a Collaborator was unable to delete a job.
- Fixed an issue where a Collaborator was unable to restart an endpoint.
- Fixed an issue where sometimes hyperparameter tooltips did not display.
- Fixed an issue where the GUI controls did not display correctly when adding a model response pane in the Playground.
- Fixed an issue where occasionally an individual model creator was not displayed in the Owner filters of the Model Hub.
- Fixed an issue where sometimes a Project could be deleted even if it contained a Live endpoint.
Known issues
Release 24.9.1 (2024-10-17)
The SambaStudio 24.9.1 version release features are described below.
New features
- OpenSearch integration is available for Train jobs, Batch inference jobs, and Endpoints.
- Implemented a System Health feature that provides information about your SambaStudio environment and its ability to function.
  - Click System Health in the top menu bar to view the information.
- Enhanced user feedback for Train job NaN errors: the Activity notifications panel now indicates when a successful retry has started and training has resumed.
Bug fixes
- Fixed an issue where model cards sometimes did not display correctly.
- Fixed an issue where adding a checkpoint to the Model Hub would sometimes fail.
- Fixed an issue where switching tenants would sometimes display an Access Forbidden. Please contact your administrator error and deny access to the tenant.
Known issues
- Sometimes the number of nodes displayed in System Health will not be in sync with the number of nodes displayed in RDUs Available.
  - As a workaround, please refresh the browser after opening the System Health box.
- If you have created your own Composition of Experts model prior to SambaStudio version 24.9.1 that uses one of the experts listed below, sometimes the corresponding deployed endpoint can stop functioning correctly. For example, updating the endpoint or starting/stopping the endpoint will fail and you will receive an error message.
  - As a workaround, please create a new Composition of Experts model using the same expert(s) in SambaStudio version 24.9.1 or later.
  - Expert list:
    - Meta-Llama-3-8B-Instruct
    - Meta-Llama-3-70B-Instruct
    - Meta-Llama-Guard-2-8B
Release 24.8.2 (2024-10-01)
SambaStudio release 24.8.2 is a patch update for 24.8.1 that includes performance improvements and minor bug fixes.
Release 24.8.1 (2024-09-16)
The SambaStudio 24.8.1 version release features are described below.
New features
- You can now import a checkpoint from your local storage by using the new CLI import feature, which includes two modes.
  - The Interactive mode provides detailed feedback in the CLI on each step of the process and prompts you to input and verify your checkpoint's configuration values.
  - The Non-interactive mode allows you to enter all of the required details for the checkpoint you wish to import, including configuration values, in the CLI.
- We also improved the workflow for adding a checkpoint using the GUI.
  - Step 1: Details includes enhanced information entry for your checkpoint.
    - This step now includes the option of editing the chat template before uploading your checkpoint. The chat template specifies the prompt template for a checkpoint and is written in Jinja format (see the sketch after this list).
  - Step 2: Validation provides enhanced feedback on your checkpoint configurations.
- Implemented a new DDR usage bar when creating your CoE model. The DDR usage bar displays the amount of Double Data Rate (DDR) memory used by the selected experts during the configuration process. SambaStudio tracks the memory used as a percentage and displays it in real time.
- Added detailed logs for endpoints using OpenSearch.
- Added the ability to update an endpoint while keeping the endpoint active for use during the update process.
- SambaStudio 24.8.1 provides the ability to run our latest release of high-performance inference models, Samba-1 Turbo v0.3.
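To show what a chat template in Jinja format does, the hedged Python sketch below renders a generic template over a list of chat messages with the jinja2 package. The template text and role markers are illustrative assumptions, not the template shipped with any particular checkpoint; a real checkpoint's chat template defines its own structure.

```python
# Hedged sketch: rendering a chat-style prompt from a Jinja chat template.
# The template text and role markers are illustrative assumptions only.
from jinja2 import Template

chat_template = Template(
    "{% for m in messages %}"
    "<|{{ m['role'] }}|>\n{{ m['content'] }}\n"
    "{% endfor %}"
    "<|assistant|>\n"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the 24.8.1 release in one sentence."},
]

# The rendered string is the prompt text the model would actually receive.
print(chat_template.render(messages=messages))
```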
Bug fixes
- Fixed an issue where sometimes the sidebar in the Playground would not collapse correctly.
- Fixed an issue in the Playground where new modal windows would sometimes open behind existing modal windows.
- Fixed an issue in the Playground where the Download Results button sometimes did not display correctly.
Known issues
- Occasionally, when making consecutive API calls to different experts within the CoE endpoint, switching to some experts can result in up to a 7 second delay.
Release 24.7.1 (2024-08-21)
The SambaStudio 24.7.1 version release features are described below.
New features
- As we continue to update the SambaStudio GUI, we have implemented a new Playground interface with an improved workflow and new features.
  - The Playground now provides the ability to compare responses to your prompts using different models. You can add up to six model response panes to compare responses in the Playground.
  - The Playground supports both chat and single-turn interactions.
  - Models now include a chat icon next to their name to indicate that they support chat functionality.
  - You can now search and filter the list of model experts to quickly locate and refine your choice(s).
- Updated the RDUs Available information displayed.
  - You can now easily view RDU information based on a selected tenant or all tenants, as well as the hardware generations and nodes allocated for each.
  - The RDUs Available summary will now inform you if an RDU is not functioning correctly by identifying it as Unhealthy.
- Added Tracking IDs for Batch inference and Training jobs.
  - Tracking IDs provide a current ID that can be used to identify and report issues to the SambaNova Support team.
  - A Tracking ID is created for each new job.
  - A new Tracking ID will be created if a stopped job is restarted.
  - Only jobs created with, or after, SambaStudio release 24.7.1 will display a Tracking ID.
Issues
Release 24.6.1 (2024-07-24)
The SambaStudio 24.6.1 version release features are described below.
New features
- We are excited to announce SambaStudio's new graphical user interface (GUI).
  - The new GUI is a light-themed interface that provides many new UX and UI updates, including improved workflows and intuitive navigation.
- Implemented new model versioning.
  - You can choose the model version to use when creating jobs and endpoints.
  - You can update your endpoint's model version directly from an Endpoint window.
- Improved adding a checkpoint to the Model Hub from NFS to include a multi-step process with checks and validation.
- Endpoint monitoring is now available to all user roles and can be accessed from an Endpoints table or the Endpoint window.
Known issues
- Occasionally you will receive an error that is not related to your specific request. This is due to a known issue where all requests within the same batch will either succeed or fail together and result in an error. For example, in a batch of 4 requests with 3 requests that succeed and 1 request that fails because it exceeds the max_seq_length of its model, all 4 requests will fail and receive the same error message.
- When cancelling a request to a Samba-1 Turbo App model, there is a known issue where cancellation does not prevent the model from processing the request. For example, if a user sends 30 concurrent requests and cancels all of them immediately, the model will still process those requests. This can result in slow time to first token (TTFT) for subsequent requests.
- When using an E5 Mistral Embedding model in the Playground, there is a known issue where the endpoint will display an error, but will not crash.
- The e5-mistral models only support the Predict API, not the Stream API.
- The e5-mistral models only support a batch size (BS) up to 8.
- The Deepseek 33B models only support a batch size (BS) up to 4.
Release 24.5.1 (2024-06-14)
The SambaStudio 24.5.1 version release features are described below.
New features
- Implemented new share roles of Collaborator and Viewer. Share roles allow greater control over allowing and assigning access to artifacts (projects and their jobs/endpoints, models, and datasets).
  - Share roles can be assigned for projects, models, and datasets by the owner/creator of the artifact (a User), tenant administrators (TenantAdmin), and organization administrators (OrgAdmin).
- Integrated a new endpoint monitoring dashboard (Grafana) that displays metric information for all endpoints in a selected Tenant. This allows an organization administrator (OrgAdmin) or tenant administrator (TenantAdmin) to monitor the performance of SambaStudio endpoints.
- Added integration with the OpenSearch Dashboard, which provides an interface to visualize and analyze logs.
- Improved the available Playground modes by implementing a Prompt mode. Prompt mode replaces the Completion mode.
- SambaStudio SN40L users can create endpoints that use dynamic batching of requests to the same model, which provides improved throughput and performance (see the sketch after this list).
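Dynamic batching happens on the server side; from a client's point of view the benefit appears when many requests reach the same endpoint concurrently. The hedged Python sketch below simply issues concurrent HTTP requests; the endpoint URL, header name, and payload shape are assumptions for illustration and should be replaced with the values and request format shown for your endpoint.

```python
# Hedged sketch: sending concurrent requests to one endpoint so the server can
# batch them dynamically. The URL, auth header, and payload are placeholders,
# not SambaStudio's documented request schema.
from concurrent.futures import ThreadPoolExecutor
import requests

ENDPOINT_URL = "https://<your-sambastudio-host>/<your-endpoint-path>"  # placeholder
HEADERS = {"key": "<your-endpoint-api-key>"}                           # placeholder header

def send(prompt: str) -> int:
    payload = {"inputs": [prompt]}  # hypothetical payload shape
    resp = requests.post(ENDPOINT_URL, json=payload, headers=HEADERS, timeout=300)
    return resp.status_code

prompts = [f"Question {i}: summarize dynamic batching." for i in range(8)]

# Requests that arrive close together can be grouped into one batch on the server.
with ThreadPoolExecutor(max_workers=8) as pool:
    print(list(pool.map(send, prompts)))
```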
Bug fixes
- Fixed an issue where the number of datasets in the Dataset Hub displayed incorrectly.
- Fixed an issue that allowed a non-acceptable dataset name to be used when adding a new dataset, which would result in an upload failure. Now, when attempting to add a dataset with a non-acceptable dataset name, an alert message will appear informing you to change the name before proceeding.
- Fixed an issue where sometimes, when creating a train job, selecting Clear from the ML App drop-down would not remove a selected ML App.
Known issues
- Requests to SambaStudio have a 300-second connection timeout. Sometimes under high load, high-volume curl requests will wait in the queue and surpass the SambaStudio connection timeout, resulting in a failed request. Please try one of the steps below as a workaround.
  - Reduce the load (requests per second) of the request.
  - Increase the number of instances used.
  - Contact SambaNova Support and open a support case to change the 300-second connection timeout configuration.
- Please ensure that curl requests use valid JSON strings. If the input contains special characters, such as " or \n, they will need to be escaped by prepending them with a backslash (\). For example, " would become \" and \n would become \\n. (See the sketch after this known issues list for a way to produce escaped payloads automatically.)
  - If your "content": input contains the character \n, please prepend the character as shown below.
    Example input with escaped character:
    {
      "conversation_id": "sambaverse-conversation-id",
      "messages": [
        {
          "message_id": 0,
          "role": "user",
          "content": "How can I print the completion field in this json data:\\n{\"result\":{\"status\":{\"complete\":true,\"exitCode\":0,\"elapsedTime\":29.62261724472046,\"message\":\"\",\"progress\":0,\"progressMessage\":\"\",\"reason\":\"\"},\"responses\":[{\"completion\":\"\\n\\nDark matter is a hypothet"
        }
      ]
    }
- Occasionally you will receive an error that is not related to your specific request. This is due to a known issue where all requests within the same batch will either succeed or fail together and result in an error. For example, in a batch of 4 requests with 3 requests that succeed and 1 request that fails because it exceeds the max_seq_length of its model, all 4 requests will fail and receive the same error message.
- When cancelling a request to a Samba-1 Turbo App model, there is a known issue where cancellation does not prevent the model from processing the request. For example, if a user sends 30 concurrent requests and cancels all of them immediately, the model will still process those requests. This can result in slow time to first token (TTFT) for subsequent requests.
- When using an E5 Mistral Embedding model in the Playground, there is a known issue where the endpoint will display an error, but will not crash.
- The e5-mistral models only support the Predict API, not the Stream API.
- The e5-mistral models only support a batch size (BS) up to 8.
- The Deepseek 33B models only support a batch size (BS) up to 4.
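For the escaping issue above, the hedged Python sketch below builds the request body with json.dumps, which escapes quotes and newlines automatically so the resulting string is always valid JSON. The field names mirror the example above; the endpoint URL and header are placeholders (shown commented out) and are not part of this known issue.

```python
# Hedged sketch: let json.dumps handle the escaping of " and \n so the request
# body sent to the endpoint is always valid JSON. URL and header are placeholders.
import json

raw_content = 'How can I print the completion field in this json data:\n{"result": {"status": {"complete": true}}}'

body = {
    "conversation_id": "sambaverse-conversation-id",
    "messages": [
        {"message_id": 0, "role": "user", "content": raw_content},
    ],
}

payload = json.dumps(body)  # quotes and newlines inside raw_content are escaped here
print(payload)

# The escaped payload can then be posted to your endpoint, for example with the
# requests package (placeholder URL and header, shown commented out):
# import requests
# requests.post(
#     "https://<your-sambastudio-host>/<your-endpoint-path>",
#     data=payload,
#     headers={"Content-Type": "application/json", "key": "<your-api-key>"},
#     timeout=300,
# )
```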
Release 24.4.1 (2024-05-10)
The SambaStudio 24.4.1 version release features are described below.
New features
- We are very excited to announce that SambaStudio has added the ability to create your own Composition of Experts (CoE) models.
  - Now you can compose your own CoE model that is accessed via a single endpoint. This allows you to create a CoE model and define the expert models used in your composition.
- Added new Samba-1 routers.
  - The Samba-1-Instruct-Router is a Composition of Experts router. It provides a single model experience to a subset of the Samba-1 experts that comprise the Enterprise Grade AI (EGAI) benchmark.
  - The Samba-1-Chat-Router is a composition of 7B parameter models that provides a lightweight, benchmark-winning single model experience.
- Improved the API features.
  - Added faster performing generic APIs.
  - Existing APIs are still supported and will continue to function.
- Improved the Endpoint window in the GUI to provide the Stream URL path as well as the Predict URL path (see the sketch after this list).
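The hedged Python sketch below only illustrates the difference in how the two URL paths are consumed: the Predict URL returns a single response, while the Stream URL returns output incrementally. The URL paths, header name, and payload shape are placeholders, not SambaStudio's documented request schema; copy the actual values and format from your Endpoint window.

```python
# Hedged sketch contrasting the Predict URL (one response) with the Stream URL
# (incremental chunks). URLs, header, and payload are placeholders only.
import requests

HEADERS = {"key": "<your-endpoint-api-key>"}                  # placeholder header
PREDICT_URL = "https://<host>/api/predict/<endpoint>"         # placeholder path
STREAM_URL = "https://<host>/api/predict/<endpoint>/stream"   # placeholder path

payload = {"inputs": ["Explain Composition of Experts in one sentence."]}  # hypothetical

# Predict URL: the full completion comes back in a single response.
print(requests.post(PREDICT_URL, json=payload, headers=HEADERS, timeout=300).json())

# Stream URL: output arrives as it is generated, line by line.
with requests.post(STREAM_URL, json=payload, headers=HEADERS, stream=True, timeout=300) as resp:
    for line in resp.iter_lines():
        if line:
            print(line.decode("utf-8"))
```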
SambaNova AI Starter Kits
- Updates to the SambaNova AI Starter Kit example guides will now be included in the SambaStudio release notes. SambaNova AI Starter Kits are a collection of open-source examples and guides that facilitate the deployment of AI-driven use cases in the enterprise. You can use a deployed LLM endpoint in SambaStudio with one of the available AI Starter Kits.
- Added a Samba-1 Routing Starter Kit.
  - Launched a Samba-1 CoE router starter kit showing customers how to use CoE models.
- Added Sambaverse and Windows compatibility.
  - Updated Starter Kits for Sambaverse, enabling end-to-end customer self-service.
  - Launched a Windows-compatible starter kit deployment using Docker containers, which also includes a video.
- Added web search and improved response quality in Starter Kits.
  - Launched a web search agent starter kit, allowing customers to use search results as context for LLM prompts.
  - Improved response quality for document analysis Q&A.
- Launched fine-tuning Starter Kits that include information on synthetic Q&A data, low-resource languages, and embeddings models.
  - Added image and audio modalities.
  - Added post call analysis (including a video) and SambaNova embeddings for the document analysis assistant.
- Implemented SambaNova embeddings for EDGAR Q&A and the document analysis assistant.
Known issues
- Samba-1.0 Lite is currently not functioning as expected. As a workaround, please select a different CoE model for your needs.
- An endpoint's Stream URL is not displayed in the SNSDK of the Playground's View Code or in the CLI.
  - Please use the GUI to get an endpoint's Stream URL path.
- The SambaStudio GUI currently allows SN10 and SN30 hardware generations to start the Create your own CoE model process, even though CoE models will only run on SN40L hardware generations.
- When creating a train job, the following models, including sequence size and vocabulary size in parentheses, are currently not supported on SambaNova SN10 or SN30 RDU generations in both GUI and CLI workflows:
  - Llama-2-7b-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-sambalingo-thai-base-hf (Llama2 7B Vocab size 57344)
- When creating a non-CoE endpoint using the GUI or an endpoint using the CLI, the following models, including sequence size and vocabulary size in parentheses, are currently not supported on SambaNova SN10 or SN30 RDU generations:
  - Llama-2-7b-80k-hf (Llama2 7B SS 64k)
  - Llama-2-7b-chat-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-chat-80k-hf (Llama2 7B SS 64k)
- When creating a batch inference job using the GUI or the CLI, the following models, including sequence size and vocabulary size in parentheses, are currently not supported for batch inference on SambaNova SN10 or SN30 RDU generations:
  - Llama-2-7b-80k-hf (Llama2 7B SS 64k)
  - Llama-2-7b-chat-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-chat-80k-hf (Llama2 7B SS 64k)
- Requests to SambaStudio have a 300-second connection timeout. Sometimes under high load, high-volume curl requests will wait in the queue and surpass the SambaStudio connection timeout, resulting in a failed request. Please try one of the steps below as a workaround.
  - Reduce the load (requests per second) of the request.
  - Increase the number of instances used.
  - Contact SambaNova Support and open a support case to change the 300-second connection timeout configuration.
- Please ensure that curl requests use valid JSON strings. If the input contains special characters, such as " or \n, they will need to be escaped by prepending them with a backslash (\). For example, " would become \" and \n would become \\n.
  - If your "content": input contains the character \n, please prepend the character as shown below.
    Example input with escaped character:
    {
      "conversation_id": "sambaverse-conversation-id",
      "messages": [
        {
          "message_id": 0,
          "role": "user",
          "content": "How can I print the completion field in this json data:\\n{\"result\":{\"status\":{\"complete\":true,\"exitCode\":0,\"elapsedTime\":29.62261724472046,\"message\":\"\",\"progress\":0,\"progressMessage\":\"\",\"reason\":\"\"},\"responses\":[{\"completion\":\"\\n\\nDark matter is a hypothet"
        }
      ]
    }
Release 24.2.2 (2024-04-01)
SambaStudio release 24.2.2 is a patch update for 24.2.1 that includes the features described below.
New features
- Improved general performance of the platform.
- Improved endpoint creation when using the snapi CLI.
- Updated model document information.
Known issues
- When creating a train job, the following models, including sequence size and vocabulary size in parentheses, are currently not supported on SambaNova SN10 or SN30 RDU generations in both GUI and CLI workflows:
  - Llama-2-7b-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-sambalingo-thai-base-hf (Llama2 7B Vocab size 57344)
- When creating a non-CoE endpoint using the GUI or an endpoint using the CLI, the following models, including sequence size and vocabulary size in parentheses, are currently not supported on SambaNova SN10 or SN30 RDU generations:
  - Llama-2-7b-80k-hf (Llama2 7B SS 64k)
  - Llama-2-7b-chat-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-chat-80k-hf (Llama2 7B SS 64k)
- When creating a batch inference job using the GUI or the CLI, the following models, including sequence size and vocabulary size in parentheses, are currently not supported for batch inference on SambaNova SN10 or SN30 RDU generations:
  - Llama-2-7b-80k-hf (Llama2 7B SS 64k)
  - Llama-2-7b-chat-16k-hf (Interpolate: True, from SS 4k)
  - Llama-2-7b-chat-80k-hf (Llama2 7B SS 64k)
Release 24.2.1 (2024-03-19)
New features
- With this release, SN40L users can utilize SambaStudio's new Composition of Experts (CoE) architecture. The CoE architecture is a system of multiple experts, where each expert is a fully trained machine learning model.
- SambaNova's first Composition of Experts model, Samba-1, is available for SN40L users. Samba-1 combines the comprehensive power of trillion-parameter models with the precision and efficiency of specialized models accessed via a single endpoint.
- Added new CoE-specific model cards to the Model Hub. CoE model cards allow you to view the expert models used to create the CoE model.
- SN40L users can create CoE endpoints to be deployed for use in the Playground.
- Added the ability to select an expert for CoE endpoints in the Playground. This allows you to choose an expert model for your particular prompt.
- To get optimal responses from some CoE experts, use a task-specific prompt template to format your input.
Release 24.1.1 (2024-02-08)
New features
- SambaStudio now supports accessing multiple SambaNova hardware generations and their associated nodes. This improves several workflows.
- Added the ability to create new ASR pipelines that utilize your trained Hubert ASR models.
  - You can use your trained Hubert ASR models to create a new ASR pipeline with diarization (Diarization ASR Pipeline) or without diarization (ASR Pipeline).
- Improved the GUI and feedback for adding a dataset. Improvements include:
  - A cleaner and more intuitive source selection process.
  - The ability to select a different local dataset before finalizing the adding a dataset from a local machine process.
  - A feedback statement when adding a dataset from a local machine stating to use the CLI process for datasets that are greater than 5GB or contain more than 1000 files.
- Added a learning rate graph to the metrics graph display when evaluating training jobs (see the sketch after this list).
  - The learning rate graph depicts the learning rate hyperparameter during the training run, allowing you to monitor and optimize the balance between the quality of the final model and the required training time.
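As a small illustration of what a learning rate graph typically shows, the hedged Python sketch below computes a common warmup-plus-cosine-decay schedule. The schedule shape and hyperparameter values are generic assumptions, not the schedule used by any particular SambaStudio training job.

```python
# Hedged sketch: a generic warmup + cosine-decay learning rate schedule, the
# kind of curve a learning rate graph typically plots over training steps.
# The values below are illustrative assumptions only.
import math

def learning_rate(step: int, total_steps: int = 1000, warmup_steps: int = 100,
                  peak_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from the peak down to the minimum learning rate.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Print a few points of the curve that the graph would display.
for step in (0, 50, 100, 500, 1000):
    print(step, round(learning_rate(step), 6))
```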
Known issues
- When an organization administrator deletes a tenant, sometimes the GUI will not update that the tenant is deleted. As a workaround, please refresh the browser if the GUI update takes longer than 30 seconds.
- We've noticed an issue where the CLI parses hyperparameter boolean values incorrectly when using snapi job create with the --hyperparams-file option. As a workaround, please use quotes for hyperparameter boolean values, as shown in the example hyperparameters file below.
  Example hyperparameters file:
  max_seq_length: 2048
  precision: "true"
- Cancelling the upload of a dataset can sometimes result in the cancelled dataset still displaying a status of Uploading in the Dataset Hub. The cancelled dataset may take up to 20 minutes for its status to change to Failed, at which point the dataset can be deleted.