Create and use endpoints
With the SambaStudio platform, you can generate predictions on your data by deploying a model to an endpoint. This document describes how to create, view, update, edit, stop, restart, and delete endpoints, and how to manage endpoint API keys.
Create an endpoint
You can create endpoints to be used for prediction using the GUI or the CLI. Follow the instructions described in the corresponding sections to learn how.
Starting with release 24.1.1, SambaStudio allows you to specify which generation version of SambaNova Systems' Reconfigurable Dataflow Units™ (RDUs) to use for endpoints. The RDU generation version can be specified in both GUI and CLI workflows. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
Create a CoE endpoint using the GUI
Starting with SambaStudio release 24.2.1, SN40L users can create Composition of Experts (CoE) endpoints. Follow the steps below to create an endpoint using a SambaNova-provided CoE, such as Samba-1, using the GUI. After adding your CoE endpoint, adjust the endpoint share settings to share your CoE endpoint with other users and tenants.
Due to the large number of checkpoints involved, CoE endpoints can stay in the Setting Up status for 2-3 hours before deploying to Live status.
-
Create a new project or use an existing one.
-
From a project window, click New endpoint. The Create endpoint window will open.
-
In the Create endpoint window, enter a name for the endpoint into the Endpoint name field.
-
From the ML App drop-down, select a CoE designated ML App. We selected Symphony CoE App in our example.
-
From the Select model drop-down, choose SambaNova models and select a downloaded CoE model.
-
Select the version of the model to use in the Select model drop-down.
-
The RDU generation will default to SN40L, as CoE models run only on the SambaNova SN40L.
-
Select the number of instances to use when creating the endpoint. The information statement displays the required number of RDUs in a single node along with the total number and available number of RDUs in your SambaStudio configuration. In our example, our CoE model requires one RDU in a node for each instance. Increasing the number of instances will increase the RDU requirements.
-
Expand Model Parameters to view additional parameters.
-
The model parallel rdus field designates the number of RDUs used to run in parallel when creating the endpoint. The number directly relates to the number of RDUs available in a single node and the selected number of instances. If the required number of RDUs is not available in a single node, you will receive a warning message.
-
Click Add an endpoint to queue the endpoint for deployment. Once the endpoint is created, it can take several minutes for the endpoint to be deployed.
Insufficient RDUs in a single node
If the required number of RDUs is not available in a single node, you will receive a warning message stating that your endpoint will be in Awaiting RDU status until the RDUs become available in a single node. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
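As a rough illustration of the sizing described above, the sketch below multiplies the RDUs needed per instance by the number of instances and compares the result with the RDUs available in a single node. The function name rdu_fit and the assumption that the requirement scales linearly with instances are hypothetical and are used only for illustration. The information statement in the Create endpoint window remains the authority on the actual requirement.

```shell
# Hypothetical helper: estimate whether an endpoint fits in a single node.
# Assumes the RDU requirement scales linearly with the number of instances.
rdu_fit() {
    rdus_per_instance=$1   # RDUs one instance needs (for example, model parallel rdus)
    instances=$2           # number of instances requested
    available=$3           # RDUs currently available in a single node
    required=$((rdus_per_instance * instances))
    if [ "$required" -le "$available" ]; then
        echo "required=$required available=$available fits"
    else
        echo "required=$required available=$available awaiting-rdu"
    fi
}

rdu_fit 1 2 8   # prints: required=2 available=8 fits
rdu_fit 2 8 8   # prints: required=16 available=8 awaiting-rdu
```

The second call mirrors the Awaiting RDU case: the estimate exceeds what the node has free, so the endpoint would wait until RDUs become available.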
Create a non-CoE endpoint using the GUI
Follow the steps below to use the GUI to create an endpoint for prediction that is not part of the CoE architecture. After saving your endpoint, adjust the endpoint share settings to share your endpoint with other users and tenants.
Base models do not support inference and cannot be deployed to endpoints. Use Base models for training, not inference.
-
Create a new project or use an existing one.
-
From a project window, click New endpoint. The Create endpoint window (Figure 3) will open.
-
In the Create endpoint window, enter a name for the endpoint into the Endpoint name field, as shown in Figure 3.
-
Select the ML App from the ML App drop-down, as shown in Figure 3.
-
From the Select model drop-down, choose My models, Shared models, SambaNova models, or Select from Model Hub.
-
My models displays a list of models that you have previously added to the Model Hub.
-
Shared models displays a list of models that have been shared with the selected active tenant.
-
SambaNova models displays a list of models provided by SambaNova.
Figure 3. Create endpoint window
-
Select from Model Hub displays a window with a list of downloaded models that correspond to a selected ML App, as shown in Figure 4, or a list of all the downloaded models if an ML App is not selected. The list can be filtered by selecting options under Field of application, ML APP, Architecture, and Owner. Additionally, you can enter a term or value into the Search field to refine the model list by that input. Choose the model you wish to use and confirm your choice by clicking Use model.
Figure 4. Select from Model Hub
-
Select the version of the model to use in the Select model drop-down.
-
Enter a value for the Number of instances to designate how many instances will be deployed, as shown in Figure 5. The Number of instances setting helps the endpoint scale to accommodate a high volume of requests. You can adjust the value based on the expected request volume and available resources.
-
The RDU generation drop-down allows you to select an available RDU generation version for the endpoint, as shown in Figure 5. If more than one option is available, the SambaStudio platform defaults to the recommended RDU generation version based on your platform’s configuration and the selected model. If other versions are available, you can select one of them from the drop-down instead of the recommended option. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
-
Click Add an endpoint, as shown in Figure 5, to queue the endpoint for deployment. Once the endpoint is created, it can take several minutes for the endpoint to be deployed.
Figure 5. Completed create endpoint window
Your endpoint is deployed and will be available to generate predictions when the endpoint status displays Live.
Insufficient RDUs in a single node
If the required number of RDUs is not available in a single node, you will receive a warning message stating that your endpoint will be in Awaiting RDU status until the RDUs become available in a single node. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
Create an endpoint using the CLI
The example below demonstrates how to create an endpoint using the snapi endpoint create command. You will need to provide the following:
-
A project to assign the endpoint to. Create a new project or use an existing one.
-
A new endpoint name for the --endpoint_name input.
-
The model to be used for the --model-checkpoint input.
-
The RDU generation version of your SambaStudio platform configuration to use for the --arch input.
-
Run the snapi tenant info command to view the available RDU generation version(s) specific to your SambaStudio platform. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
-
Run the snapi model info command to obtain the --arch input compatible for the selected model.
-
The number of instances to be used for the --instances input.
$ snapi endpoint create \
--project <project-name> \
--endpoint_name <endpoint-name> \
--model-checkpoint <model-name> \
--arch SN10 \
--instances <number-of-instances>
Run snapi endpoint create --help to display additional usage and options.
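When the same inputs are reused across several endpoints, it can help to assemble the command in one place and review it before running it. The sketch below is a hypothetical wrapper (endpoint_create_cmd is not part of snapi) that only prints the command using the flags shown above. It assumes that the number of instances is a whole number and that none of the values contain spaces.

```shell
# Hypothetical wrapper: print the snapi endpoint create command for review.
# Only the flags shown in this document are used; nothing is executed.
endpoint_create_cmd() {
    if [ "$#" -ne 5 ]; then
        echo "usage: endpoint_create_cmd PROJECT NAME MODEL ARCH INSTANCES" >&2
        return 1
    fi
    # Reject an empty or non-numeric instance count before printing anything.
    case $5 in
        ''|*[!0-9]*) echo "instances must be a whole number: $5" >&2; return 1 ;;
    esac
    printf 'snapi endpoint create --project %s --endpoint_name %s --model-checkpoint %s --arch %s --instances %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

endpoint_create_cmd my-project my-endpoint GPT_13B_Human_Aligned_Instruction_Tuned_V2 SN10 1
```

Printing instead of executing keeps the sketch safe to run anywhere. Once the output looks right, the printed line can be copied into a terminal.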
Example snapi model info command
The example snapi model info command snippet below demonstrates where to find the compatible --arch input for the GPT_13B_Base_Model when used in creating an endpoint. The required value is located on the last line of the example snippet and is represented as 'deploy': { 'sn10'. Note that this example snippet contains only a portion of the actual snapi model info command response. You will need to specify:
-
The model name or ID for the --model input.
-
Use deploy for the --job-type input. This returns the 'deploy': { 'sn10' value, which would be entered as --arch SN10 into the snapi endpoint create command.
$ snapi model info \
--model GPT_13B_Base_Model \
--job-type deploy
Model Info
============
ID : 61b4ff7d-fbaf-444d-9cba-7ac89187e375
Name : GPT_13B_Base_Model
Architecture : GPT 13B
Field of Application : language
Validation Loss : -
Validation Accuracy : -
App : 57f6a3c8-1f04-488a-bb39-3cfc5b4a5d7a
Dataset : {'info': 'N/A\n', 'url': ''}
SambaNova Provided : True
Version : 1
Description : This is a randomly initialized model, meant to be used to kick off a pre-training job.
Generally speaking, the process of pre-training is expensive both in terms of compute and data. For most use cases, it will be better to fine tune one of the provided checkpoints, rather than starting from scratch.
Created Time : 2023-03-23 00:00:00 +0000 UTC
Status : Available
Steps : 0
Hyperparameters :
{ 'batch_predict': {},
'deploy': { 'sn10': { 'imageVariants': [],
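Reading the --arch value off the response by eye is easy to get wrong in a long listing. The sketch below is a hypothetical filter (deploy_arch is not a snapi command) that reads the response on standard input and prints the first key listed under 'deploy' in uppercase. It assumes the response contains a fragment shaped like the last line of the snippet above; if the format differs on your platform, the filter prints nothing.

```shell
# Hypothetical helper: read `snapi model info --job-type deploy` output on
# stdin and print the first RDU generation listed under 'deploy', uppercased.
# Assumes the output contains a fragment shaped like: 'deploy': { 'sn10': ...
deploy_arch() {
    sed -n "s/.*'deploy': *{ *'\([A-Za-z0-9]*\)'.*/\1/p" |
        head -n 1 |
        tr '[:lower:]' '[:upper:]'
}

# Demonstration on the two hyperparameter lines from the snippet above.
printf "%s\n" "{ 'batch_predict': {}," "'deploy': { 'sn10': { 'imageVariants': []," | deploy_arch   # prints: SN10

# Intended use (requires snapi and a real model name):
# snapi model info --model <model-name> --job-type deploy | deploy_arch
```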
View endpoint information
Once your endpoint has been deployed, you can view detailed information about it using the GUI or CLI.
View an endpoint using the GUI
The Endpoint window displays detailed information about your endpoint, as well as providing access to common functions. Follow the steps below to navigate to any Endpoint window. Additionally, you can view a list of Live endpoints from the Dashboard.
-
Click Projects from the left menu to navigate to the Projects window and view the current projects.
-
Click the Project that contains the endpoint you wish to view. That Project window will open displaying the Endpoints table, which is located beneath the Jobs table. The Endpoints table provides the following information about each endpoint, as shown in Figure 7:
-
Name displays the name of the endpoint.
-
Model displays the associated model of the endpoint.
-
Instances displays the number of instances used by the endpoint.
-
RDU Arch displays the RDU generation version used when the endpoint was created.
-
Date of creation displays the date and time the endpoint was created.
-
Owner displays the owner of the endpoint.
-
Monitoring provides a link to the Grafana monitoring dashboard. Click the icon to open the dashboard.
-
Actions provides additional interactions with the endpoint via a drop-down menu.
-
Click the endpoint you wish to view from the list to open the Endpoint window, as shown in Figure 7.
Figure 7. Endpoints table
The Endpoint window displays the following information, as shown in Figure 8:
-
The endpoint name is displayed at the top of the window and in the breadcrumb path.
-
The status of the endpoint is displayed in the Details panel top bar.
-
ML App displays the ML App that was selected when the endpoint was created.
-
Generation displays the RDU generation version used when the endpoint was created.
-
Model/Checkpoint displays the endpoint’s associated model/checkpoint.
-
Model Version displays the current model version associated with the endpoint.
-
Instances displays the number of instances used by the endpoint.
-
Created on displays the date the endpoint was created.
-
Updated on displays the date the endpoint was last updated.
-
Owner displays the owner of the endpoint.
-
Model parameters displays the parameters used during endpoint creation.
-
Predict URL provides the URL of a predict endpoint to use programmatically to make requests. Use the copy icon to copy the URL to your computer’s clipboard. See API reference for more information.
-
Streaming prediction URL provides the URL of a stream endpoint to use programmatically to make requests. Use the copy icon to copy the URL to your computer’s clipboard. See API reference for more information.
-
The API Keys table displays a list of API keys that can be used to make authenticated requests of the endpoint URL. See Endpoint API keys for information on generating and using API keys. The API Keys table displays the following information:
-
KeyID displays the endpoint’s API key to be used for authenticated requests. Use the copy icon to copy the API key to your computer’s clipboard.
-
Description displays the API key’s unique description so that it can be easily identified.
-
Created on displays the date the API key was created.
-
Owner displays the owner of the API key.
-
Status displays the current status of the API key.
Figure 8. Endpoint window
Endpoint monitoring
SambaStudio provides a monitoring dashboard (Grafana) that displays metric information for all endpoints. From an Endpoint window, click Monitoring to open the Grafana monitoring dashboard in a new browser window.
View endpoint info using the CLI
Use the snapi endpoint info command to view specific information about your endpoint, including:
-
The assigned ID of the endpoint.
-
The ML App used when the endpoint was created.
-
The Model used when the endpoint was created.
-
The number of Instances used when the endpoint was created.
-
The URL path to the endpoint location.
Please use the GUI to get an endpoint’s Stream URL path.
-
The initial API Key (First Key) created for the endpoint. See Endpoint API keys for information on generating and using API keys.
-
The current Status of the endpoint.
-
The associated API Keys of the endpoint that can be used to make authenticated requests of the endpoint URL. See Endpoint API keys for information on generating and using API keys.
-
The Arch RDU generation version used when the endpoint was created.
The example below demonstrates how to use the snapi endpoint info command to view the info of an endpoint. You will need to provide the following:
-
The name or ID of the project the endpoint is assigned to.
-
The endpoint name or ID.
$ snapi endpoint info \
--project <project-name> \
--endpoint <endpoint-name>
Endpoint Info
=============
ID : 4eba9aaa-543f-4f8b-bbc1-30182252f85a
Name : hav2
Description :
Project ID : 87deae92-570e-443f-8ae8-4521fb43ad09
ML App : Generative Tuning 13B
Model : GPT_13B_Human_Aligned_Instruction_Tuned_V2
Instances : 0
URL : <path-to-endpoint>
API Key : 6fbc8f9d-34e2-47e4-983f-19a1845326b4
Status : Stopped
Created at : 2023-09-07T21:47:27.581072+00:00
Last Updated : 2023-11-07T17:02:07.939675+00:00
API Keys : [{'user_id': 'billb', 'time_created': '2023-11-07T16:51:04.896235+00:00', 'description': 'hav2-shared-key', 'status': 'Live', 'id': '1205d9db-250a-4e71-b1a8-33fa3ffa05cc', 'endpoint_id': '4eba9aaa-543f-4f8b-bbc1-30182252f85a', 'time_updated': '2023-11-07T16:51:04.896235+00:00', 'api_key': 'a0e8e81f-697a-401a-b7c5-688788e0bb8a'}, {'user_id': 'billb', 'time_created': '2023-11-07T13:06:50.209626+00:00', 'description': 'First Key', 'status': 'Live', 'id': '12fffa5a-c615-4271-a920-f4bb1b3da303', 'endpoint_id': '4eba9aaa-543f-4f8b-bbc1-30182252f85a', 'time_updated': '2023-11-07T13:06:50.209626+00:00', 'api_key': '6fbc8f9d-34e2-47e4-983f-19a1845326b4'}]
Hyperparams : None
Arch : SN10
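Scripts often need a single value from this listing, such as the Status or the Arch. The sketch below is a hypothetical filter (endpoint_field is not a snapi command) that reads the listing on standard input and prints the value of one named field. It assumes each field is printed as a name, a colon, and a value, as in the example above, and it splits on the first colon only so that timestamps and URLs are returned whole.

```shell
# Hypothetical helper: print the value of one field from `snapi endpoint info`
# output read on stdin. Assumes each field is printed as "Name : value" and
# splits on the first colon only, so URLs and timestamps stay intact.
endpoint_field() {
    awk -v want="$1" '
        {
            pos = index($0, ":")
            if (pos == 0) next
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the field name
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim the value
            if (key == want) { print val; exit }
        }'
}

# Demonstration on a few lines copied from the example output above.
sample='ID : 4eba9aaa-543f-4f8b-bbc1-30182252f85a
Name : hav2
Status : Stopped
Created at : 2023-09-07T21:47:27.581072+00:00
Arch : SN10'
printf '%s\n' "$sample" | endpoint_field Status         # prints: Stopped
printf '%s\n' "$sample" | endpoint_field "Created at"   # prints: 2023-09-07T21:47:27.581072+00:00

# Intended use (requires snapi):
# snapi endpoint info --project <project-name> --endpoint <endpoint-name> | endpoint_field Status
```

The field name is matched exactly, so API Key and API Keys are treated as different fields.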
Model version updates
SambaStudio allows endpoints created with models that have a new version available to be updated directly from the Endpoint window. For example, if your endpoint was created with a version 1 model, you can update the model to version 2 directly from the Endpoint window. Follow the steps below to update the model associated with your endpoint.
-
In the Endpoint window, click Update available. The Update available box will open.
Figure 10. Endpoint Window update available
-
The Update available box lists the updates that can be applied. In our example, the Model Infrastructure and the Samba-1 Turbo Lama 13B model can be updated.
-
You can choose to update only one item by clicking Update next to that item.
-
Click Update all to update all items.
Figure 11. Update available box
Endpoint API keys
An endpoint API key allows you to make authenticated requests of the endpoint URL, such as generating predictions. SambaStudio provides the ability to create multiple API keys for an endpoint. This allows organizations to easily distribute and manage endpoints to different entities and users.
API keys can be added, edited, and revoked using the GUI or the CLI. Follow the instructions described in the corresponding sections to learn how.
The endpoint API key and the platform API key have two distinct implementations.
Add an endpoint API key using the GUI
You can add additional API keys to an endpoint. Please be aware of the following when adding a new API key to an endpoint.
-
Endpoint API keys can be added by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The platform creates an initial API key for each endpoint with the description of First Key.
Follow the steps below to add an API key to an endpoint.
-
From the API keys section of an endpoint window, click Add new.
Figure 12. API keys section
The Add new API key box will appear.
-
Enter a description into the Add description field for the endpoint’s new API key.
-
Click Add to complete the process and add the endpoint’s new API key.
Figure 13. Add new API key
-
The API keys table will display the new API key and the description you created for your endpoint.
Figure 14. API keys table with new API key
Edit an endpoint API key description using the GUI
The existing description of an endpoint API key can be edited. Please be aware of the following when editing an endpoint API key description.
-
The description of an endpoint API key can be edited by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The initial API key that the platform creates for an endpoint, and its description of First Key, cannot be edited.
Follow the steps below to edit an endpoint API key description.
-
From the API keys section of an endpoint window, click the kebab icon (three dots) for the API key you wish to edit and select Edit from the drop-down.
Figure 15. API key drop-down
The Edit key description box will open.
-
Enter a new description into the Key description field.
-
Click Save to complete the operation and update the API key description.
Figure 16. Endpoint edit key description box
Revoke an endpoint API key using the GUI
Endpoint API keys can be revoked to prevent further usage. Please be aware of the following when revoking an endpoint API key.
-
Endpoint API keys can be revoked by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The initial API key (First Key) created for an endpoint by the platform cannot be revoked.
-
Once an endpoint API key is revoked, it cannot be reactivated.
-
The API keys table will display the revoked API key with the status Revoked.
-
All users will lose access to a revoked API key.
Follow the steps below to revoke an endpoint API key.
-
From the API keys section of an endpoint window, click the kebab icon (three dots) for the API key you wish to revoke and select Revoke from the drop-down.
Figure 17. API key drop-down
The confirmation box will open with a statement describing the consequences of revoking the API key.
-
Click Revoke to confirm that you want to revoke the endpoint API key.
Figure 18. Endpoint revoke key confirmation box
List all endpoint API keys using the CLI
Similar to the endpoint API keys table of the GUI, the SambaStudio command-line interface (CLI) can display a list of API keys for an endpoint.
The example below demonstrates how to list all API keys for an endpoint using the snapi endpoint list-apikeys command. You will need to provide the endpoint name or ID.
$ snapi endpoint list-apikeys \
--endpoint <endpoint-name>
------------------------------------------------------------------------
List of API Keys for Endpoint <endpoint-name>
------------------------------------------------------------------------
+--------------------------------------+--------------------------------------+----------------------+----------------------+----------------------+
| API Key | Description | Created On | Owner | Status |
+======================================+======================================+======================+======================+======================+
| a0e8e81f-697a-401a-b7c5-688788e0bb8a | hav2-shared-key | 07 Nov, 2023 | billb | Active |
+--------------------------------------+--------------------------------------+----------------------+----------------------+----------------------+
| 6fbc8f9d-34e2-47e4-983f-19a1845326b4 | First Key | 07 Nov, 2023 | billb | Active |
+--------------------------------------+--------------------------------------+----------------------+----------------------+----------------------+
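To pick out keys by status in a script, the table can be filtered on its Status column. The sketch below is a hypothetical filter (apikeys_with_status is not a snapi command) that reads the table on standard input and prints the key and description, separated by a tab, for each row with the requested status. It assumes the five-column, pipe-delimited layout shown above.

```shell
# Hypothetical helper: read `snapi endpoint list-apikeys` output on stdin and
# print "key<TAB>description" for every row whose Status column matches $1.
# Assumes the five-column, pipe-delimited table layout shown above.
apikeys_with_status() {
    awk -F'|' -v want="$1" '
        NF >= 6 {
            # Columns 2-6 hold the cell text; trim the padding around each.
            for (i = 2; i <= 6; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # Skip the header row, then compare the Status column.
            if ($2 != "API Key" && $6 == want) print $2 "\t" $3
        }'
}

# Demonstration on rows copied from the example table above.
printf '%s\n' \
    '| API Key | Description | Created On | Owner | Status |' \
    '| a0e8e81f-697a-401a-b7c5-688788e0bb8a | hav2-shared-key | 07 Nov, 2023 | billb | Active |' \
    '| 6fbc8f9d-34e2-47e4-983f-19a1845326b4 | First Key | 07 Nov, 2023 | billb | Active |' |
    apikeys_with_status Active

# Intended use (requires snapi):
# snapi endpoint list-apikeys --endpoint <endpoint-name> | apikeys_with_status Active
```

Border lines contain no pipe characters, so they are skipped without special handling.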
Add an endpoint API key using the CLI
Similar to the GUI, the SambaStudio command-line interface (CLI) can be used to add an API key to an endpoint. Please be aware of the following when adding a new API key to an endpoint.
-
Endpoint API keys can be added by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The platform creates an initial API key for each endpoint with the description of First Key.
The example below demonstrates how to add an API key to an endpoint using the snapi endpoint add-apikey command. You will need to provide the following:
-
The endpoint name or ID.
-
A description to identify the new API key is recommended.
$ snapi endpoint add-apikey \
--endpoint <endpoint-name> \
--description <endpoint-description>
Edit an endpoint API key description using the CLI
The existing description of an endpoint API key can be edited using the SambaStudio command-line interface (CLI). Please be aware of the following when editing an endpoint API key description.
-
The description of an endpoint API key can be edited by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The initial API key that the platform creates for an endpoint, and its description of First Key, cannot be edited.
The example below demonstrates how to edit an endpoint’s API key description using the snapi endpoint edit-apikey command. You will need to provide the following:
-
The endpoint name or ID.
-
The API key ID.
-
A new description for the API key.
$ snapi endpoint edit-apikey \
--endpoint <endpoint-name> \
--apikey <API-key-ID> \
--description <new-description>
Revoke an endpoint API key using the CLI
An endpoint’s API key can be revoked using the SambaStudio command-line interface (CLI). Please be aware of the following when revoking an endpoint API key.
-
Endpoint API keys can be revoked by:
-
The owner of the endpoint.
-
An organization administrator (OrgAdmin) across all tenants.
-
A tenant administrator (TenantAdmin) within their assigned tenant.
-
The initial API key (First Key) created for an endpoint by the platform cannot be revoked.
-
Once an endpoint API key is revoked, it cannot be reactivated.
-
The API key info will display the revoked API key with the status Revoked.
-
All users will lose access to a revoked API key.
The example below demonstrates how to revoke an endpoint’s API key using the snapi endpoint revoke-apikey command. You will need to provide the following:
-
The endpoint name or ID.
-
The API key ID.
$ snapi endpoint revoke-apikey \
--endpoint <endpoint-name> \
--apikey <API-key>
View endpoint API key info using the CLI
You can view detailed information about the API key of an endpoint using the SambaStudio command-line interface (CLI). This is useful to check the endpoint’s API key status (after revoking an API key) or confirm any changes you might have made to an endpoint API key (such as the description).
The example below demonstrates how to view an endpoint’s API key info using the snapi endpoint info-apikey command. You will need to provide the following:
-
The endpoint name or ID.
-
The API key ID.
$ snapi endpoint info-apikey \
--endpoint <endpoint-name> \
--apikey <API-key>
Edit an endpoint
You can edit an existing endpoint using the GUI or the CLI. Follow the instructions described in the corresponding sections to learn how.
Edit an endpoint using the GUI
Follow the steps below to edit an existing endpoint using the GUI.
If the required number of RDUs is not available in a single node when editing your endpoint, you will receive a warning message stating that your endpoint will be in Awaiting RDU status until the RDUs become available in a single node. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
-
From an Endpoint window, click the kebab icon (three dots) and select Edit from the drop-down, as shown in Figure 19. Or, from the Endpoints table, click the kebab icon (three dots) for the endpoint you wish to edit and select Edit from the drop-down, as shown in Figure 20. The Edit endpoint window will open.
Figure 19. Edit an endpoint from the endpoint window
Figure 20. Edit an endpoint from the endpoints table
-
As shown in Figure 21, from the Edit endpoint window you can:
-
Enter a new description into the Description field.
-
Choose a different model from the Select Model drop-down.
-
Select an available RDU generation version to use. If more than one option is available, the SambaStudio platform defaults to the recommended RDU generation version based on your platform’s configuration and the selected model. If other versions are available, you can select one of them from the drop-down instead of the recommended option. Contact your administrator for more information on RDU configurations specific to your SambaStudio platform.
-
Update the Number of instances.
-
Click Edit endpoint to complete the process and send the endpoint to the queue.
Figure 21. Update the endpoint
The endpoint’s updated information will be reflected in the endpoints table.
Edit an endpoint using the CLI
The example below demonstrates how to use the snapi endpoint update command to edit and update an endpoint. You will need to provide the following:
-
The project the endpoint is assigned to.
-
The endpoint name or ID.
-
The RDU generation version of the endpoint for the --arch input.
Similar to the GUI, you can:
-
Change the number of instances by using the --instances option.
-
Change the description by using the --description option.
$ snapi endpoint update \
--project <project-name> \
--description <new-description> \
--endpoint <endpoint-name> \
--arch sn10 \
--instances <number-of-instances>
Stop and restart an endpoint
To free up Reconfigurable Dataflow Unit™ (RDU) resources in the platform, you may wish to stop an endpoint from running.
Stop an endpoint using the GUI
Do the following to stop an existing endpoint from running using the GUI.
-
From an Endpoint window, click the kebab icon (three dots) and select Stop from the drop-down, as shown in Figure 22. An alert box will open.
Figure 22. Stop endpoint
-
The alert box (Figure 23) displays a warning statement that the models in the endpoint will stop running.
-
As shown in Figure 23, click Continue to confirm that you want to stop the endpoint from running. Click Cancel to return to the Endpoint window.
Figure 23. This endpoint will stop running box
Restart an endpoint using the GUI
From an Endpoint window, click the kebab icon (three dots) and select Restart from the drop-down, as shown in Figure 24. The platform will restart the endpoint and will notify you that the endpoint is restarting.
The status of a restarted endpoint is displayed in the endpoint’s top Details bar.
Stop an endpoint using the CLI
The example below demonstrates how to use the snapi endpoint stop command to stop an endpoint from running. You will need to provide the following:
-
The project the endpoint is assigned to.
-
The endpoint name or ID.
$ snapi endpoint stop \
--project <project-name> \
--endpoint <endpoint-name>
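After stopping an endpoint, you may want to confirm the status that the platform reports. The sketch below is a hypothetical wrapper (stop_endpoint is not part of snapi) that runs the stop command shown above and then prints the Status value from snapi endpoint info. It assumes the info output contains a Status line in the form shown in the View endpoint info using the CLI section. The status printed right after a stop request may not yet be the final one.

```shell
# Hypothetical wrapper: stop an endpoint, then print its reported status.
# Uses only commands shown in this document; assumes `snapi endpoint info`
# prints a "Status : <value>" line as in the example output.
stop_endpoint() {
    project=$1
    endpoint=$2
    # Send the stop command's own messages to stderr so that stdout
    # carries only the status value.
    snapi endpoint stop --project "$project" --endpoint "$endpoint" >&2 || return 1
    snapi endpoint info --project "$project" --endpoint "$endpoint" |
        awk -F':' '$1 ~ /^[ \t]*Status[ \t]*$/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the field name and colon
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
            exit
        }'
}

# Intended use (requires snapi):
# stop_endpoint <project-name> <endpoint-name>
```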
Delete an endpoint
You can delete an existing endpoint using the GUI or the CLI. Follow the instructions described in the corresponding sections to learn how.
Delete an endpoint using the GUI
Do the following to delete an existing endpoint using the GUI.
-
From an Endpoint window, click the kebab icon (three dots) and select Delete from the drop-down, as shown in Figure 25. An alert box will open.
Figure 25. Delete endpoint
The alert box (Figure 26) displays a message that the endpoint will be deleted and will no longer be available. The alert also lists the endpoint that will be deleted.
-
Click Yes to finalize deleting the endpoint, as shown in Figure 26. Click Cancel to return to the Endpoint window.
Figure 26. Delete alert window
Delete an endpoint using the CLI
The example below demonstrates how to use the snapi endpoint delete command to delete an endpoint. You will need to provide the following:
-
The project the endpoint is assigned to.
-
The endpoint name or ID.
$ snapi endpoint delete \
--project <project-name> \
--endpoint <endpoint-name>
Endpoint new-endpoint-snapi successfully marked delete.