API reference and Swagger

This document contains API reference information and describes how to access and interact with the SambaStudio Swagger framework.

Online inference for generative inference

Once you have deployed an endpoint for a generative model, you can run online inference against it to get completions for prompts.

API reference

Request

HTTP Method	Endpoint
`POST`	URL from the endpoint detail page.

Headers

Param	Description
`key`	The API Key from the endpoint detail page.
Content-Type	`application/json`

Request body (json)

Param	Type	Description
`inputs`	Array (string)	An array of stringified JSON objects. Each object contains: `prompt`: The prompt to provide to the model. `completion`: An empty string, so that the model can generate content.
`params`	JSON	The tuning parameters to use, specified as key-value pairs. Each parameter is a JSON object with a `type` and a `value`. `max_tokens_to_generate`: `type` is always "int"; `value` is the number of tokens to generate. The total number of tokens (prompt plus tokens to generate) must be under 2048. `temperature`: `type` is always "float"; `value` is between 0 and 1. `top_p`: `type` is always "float"; `value` is between 0 and 1. `do_sample`: `type` is always "bool"; `value` is either true or false.

Request template

curl -k -X POST '<your endpoint url>' \
-H 'key: <your endpoint key>' \
--header 'Content-Type: application/json' \
--data-raw '{
      "inputs":[
           "{\"prompt\":\"<your prompt here>\",\"completion\":\"\"}"
       ],
      "params": {
           "max_tokens_to_generate": {
                "type": "int",
                "value": "<number of tokens to generate>"
           },
           "temperature": {
                "type": "float",
                "value": "<float between 0 - 1>"
           },
           "top_p": {
                "type": "float",
                "value": "<float between 0 - 1>"
           },
           "do_sample": {
                "type": "bool",
                "value": "<true/false>"
           }
      }
}'
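
The same request can be assembled in Python. The following is a minimal sketch rather than an official client: the function name `build_generative_request` and the example URL and key are hypothetical, and the parameter names and string-typed values are assumed to match the sample curl request in this document. It only builds the request; nothing is sent.

```python
import json
import urllib.request


def build_generative_request(url, key, prompt, max_tokens=100,
                             temperature=1.0, top_p=1.0, do_sample=True):
    """Build (but do not send) a POST request for a generative endpoint."""
    payload = {
        # Each entry of "inputs" is itself a JSON string, so the prompt
        # object is serialized once here and again with the outer payload.
        "inputs": [json.dumps({"prompt": prompt, "completion": ""})],
        # Values are sent as strings, mirroring the sample curl request.
        "params": {
            "max_tokens_to_generate": {"type": "int", "value": str(max_tokens)},
            "temperature": {"type": "float", "value": str(temperature)},
            "top_p": {"type": "float", "value": str(top_p)},
            "do_sample": {"type": "bool", "value": str(do_sample).lower()},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"key": key, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical URL and key, for illustration only.
req = build_generative_request("https://example.invalid/endpoint", "my-key", "Hello")
body = json.loads(req.data)
print(json.loads(body["inputs"][0])["prompt"])  # prints "Hello"
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the JSON response body.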

Response params (JSON)

Param	Type	Description
`status_code`	Integer	The HTTP status code for the request. `200` indicates that the request succeeded.
`data`	Array (string)	An array of strings that contains the entire prompt along with the generated content.

If the request fails because the server certificate cannot be verified, add the -k flag to your curl command.

Sample

Request

Example curl request to a generative model deployed to an endpoint

curl -k -X POST '<your endpoint url>' \
-H 'key: <your endpoint key>' \
--header 'Content-Type: application/json' \
--data-raw '{
   "inputs":[
      "{\"prompt\":\"Please answer the question\\n\\n Context: Ethanol fuel -- All biomass goes through at least some of these steps: it needs to be grown, collected, dried, fermented, distilled, and burned. All of these steps require resources and an infrastructure. The total amount of energy input into the process compared to the energy released by burning the resulting ethanol fuel is known as the energy balance (or ``energy returned on energy invested'\'''\''). Figures compiled in a 2007 report by National Geographic Magazine point to modest results for corn ethanol produced in the US: one unit of fossil-fuel energy is required to create 1.3 energy units from the resulting ethanol. The energy balance for sugarcane ethanol produced in Brazil is more favorable, with one unit of fossil-fuel energy required to create 8 from the ethanol. Energy balance estimates are not easily produced, thus numerous such reports have been generated that are contradictory. For instance, a separate survey reports that production of ethanol from sugarcane, which requires a tropical climate to grow productively, returns from 8 to 9 units of energy for each unit expended, as compared to corn, which only returns about 1.34 units of fuel energy for each unit of energy expended. A 2006 University of California Berkeley study, after analyzing six separate studies, concluded that producing ethanol from corn uses much less petroleum than producing gasoline. \\n\\n Question: Does ethanol take more energy make that produces?\",\"completion\":\"\"}"
   ],
   "params":{
      "max_tokens_to_generate":{
         "type":"int",
         "value":"100"
      },
      "do_sample":{
         "type":"bool",
         "value":"true"
      },
      "temperature":{
         "type":"float",
         "value":"1"
      },
      "top_p":{
         "type":"float",
         "value":"1"
      }
   }
}'

Response

{
     "status_code":200,
     "data":["Please answer the question\n\n Context: Ethanol fuel -- All biomass goes through at least some of these steps: it needs to be grown, collected, dried, fermented, distilled, and burned. All of these steps require resources and an infrastructure. The total amount of energy input into the process compared to the energy released by burning the resulting ethanol fuel is known as the energy balance (or ``energy returned on energy invested''). Figures compiled in a 2007 report by National Geographic Magazine point to modest results for corn ethanol produced in the US: one unit of fossil-fuel energy is required to create 1.3 energy units from the resulting ethanol. The energy balance for sugarcane ethanol produced in Brazil is more favorable, with one unit of fossil-fuel energy required to create 8 from the ethanol. Energy balance estimates are not easily produced, thus numerous such reports have been generated that are contradictory. For instance, a separate survey reports that production of ethanol from sugarcane, which requires a tropical climate to grow productively, returns from 8 to 9 units of energy for each unit expended, as compared to corn, which only returns about 1.34 units of fuel energy for each unit of energy expended. A 2006 University of California Berkeley study, after analyzing six separate studies, concluded that producing ethanol from corn uses much less petroleum than producing gasoline. \n\n Question: Does ethanol take more energy make that produces? Yes<|endoftext|>"]
}
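
Because `data` returns the prompt followed by the generated content, a client usually has to separate the two. The sketch below shows one way to do that. The helper name `extract_completion` is hypothetical, and stripping a trailing `<|endoftext|>` marker is an assumption based only on the sample response above; other models may end their output differently.

```python
END_OF_TEXT = "<|endoftext|>"  # marker seen in the sample response (assumption)


def extract_completion(response, prompt, index=0):
    """Return only the generated text from a parsed response dictionary."""
    text = response["data"][index]
    # The response echoes the prompt, so drop it when it is a prefix.
    if text.startswith(prompt):
        text = text[len(prompt):]
    # Drop the end-of-text marker if the model emitted one.
    if text.endswith(END_OF_TEXT):
        text = text[:-len(END_OF_TEXT)]
    return text.strip()


sample = {"status_code": 200, "data": ["Question: Is water wet? Yes<|endoftext|>"]}
print(extract_completion(sample, "Question: Is water wet?"))  # prints "Yes"
```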

Online inference for ASR

SambaStudio allows you to deploy an endpoint for automatic speech recognition (ASR) and run online inference against it, enabling live-transcription scenarios.

To run online inference for ASR, send a .flac or .wav file containing the audio in an HTTP POST request. The resulting transcription is returned in the response.

The sample rate of the audio file must be 16 kHz. The file must contain no more than 15 seconds of audio.
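
Both limits can be checked locally before a file is uploaded. The following sketch uses Python's standard `wave` module, so it covers .wav files only; a .flac file would need a third-party decoder. The function name `check_wav` is hypothetical.

```python
import wave

REQUIRED_SAMPLE_RATE_HZ = 16_000
MAX_DURATION_SECONDS = 15.0


def check_wav(path):
    """Return a list of problems; an empty list means both limits are met."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    problems = []
    if rate != REQUIRED_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate is {rate} Hz, expected {REQUIRED_SAMPLE_RATE_HZ} Hz"
        )
    if duration > MAX_DURATION_SECONDS:
        problems.append(
            f"duration is {duration:.1f} s, limit is {MAX_DURATION_SECONDS:.0f} s"
        )
    return problems
```

Calling `check_wav("recording.wav")` on a compliant file returns an empty list; otherwise each entry describes one limit that the file exceeds.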

API reference

Request

HTTP Method	Endpoint
`POST`	URL from the endpoint detail page.

Headers

Param	Description
`key`	The API Key from the endpoint detail page.

form-data

Param	Description
`predict_file`	Path to the `.flac` or `.wav` audio file to be transcribed.

Request template

curl -k -X POST '<your endpoint url>' \
-H 'key: <your endpoint key>' \
--form 'predict_file=@"<your file path>"'

Response params (JSON)

Param	Type	Description
`status_code`	Integer	The HTTP status code for the request. `200` indicates that the request succeeded.
`data`	String	The transcribed text.

If the request fails because the server certificate cannot be verified, add the -k flag to your curl command.

Sample

Request

Example curl request to transcribe a locally stored audio file using online ASR inference

curl -k -X POST "<your endpoint url>" \
-H "key: <your endpoint key>" \
--form 'predict_file=@"/Users/username/Downloads/1462-170138-0001.flac"'

Response

{
  "status_code":200,
  "data":["He has written a delightful part for her and she's quite inexpressible."]
}
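
The curl `--form` option sends the file as multipart/form-data. The sketch below builds an equivalent request with only the Python standard library, assuming the endpoint accepts a single `predict_file` part as in the curl example. The function names are hypothetical, and the `application/octet-stream` part type is an assumption. It only builds the request; nothing is sent.

```python
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, file_bytes):
    """Return (content_type, body) for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + file_bytes + tail


def build_asr_request(url, key, audio_path):
    """Build (but do not send) a POST request that uploads one audio file."""
    with open(audio_path, "rb") as audio_file:
        audio = audio_file.read()
    content_type, body = build_multipart(
        "predict_file", os.path.basename(audio_path), audio
    )
    return urllib.request.Request(
        url,
        data=body,
        headers={"key": key, "Content-Type": content_type},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the upload; the transcription is then read from the `data` field of the JSON response.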

Online inference for other NLP tasks

For non-generative tasks, the Try It feature provides an in-platform prediction generation experience. To use the Try It feature and generate predictions, your endpoint must have reached the Live status. Follow the steps below to use the Try It feature to generate predictions.

See the Create and use endpoints document for information on how to use endpoints in the platform.

1. From an Endpoint window, click the Try Now button. The Try It window opens.

   Figure 1. Try Now button
2. Enter text in the Try It window, then use the following options:
   - Click the Run button to view a response for the endpoint’s task.

     Figure 2. Try It inputted text
   - Click the Curl command, CLI Command, and Python SDK buttons to view how to make the request programmatically with each option.

     Figure 3. Try It Curl command

SambaStudio Swagger framework

SambaStudio implements the OpenAPI Specification (OAS) Swagger framework to describe and use its REST APIs.

Access the SambaStudio Swagger framework

To access SambaStudio’s OpenAPI Swagger framework, append /api/docs to your host server URL.

Example host URL for Swagger

http://example-domain.net/api/docs

Interact with the SambaStudio APIs

For the Predict and Predict File APIs, use the information described in the Online inference for generative inference and Online inference for ASR sections of this document.

You will need the following information when interacting with the SambaStudio Swagger framework.

Project ID

When you are viewing a Project window, the Project ID is displayed in the browser URL path after …/projects/details/. In the example below, cd6c07ca-2fd4-452c-bf3e-f54c3c2ead83 is the Project ID.

Example Project ID path

http://example-domain.net/ui/projects/details/cd6c07ca-2fd4-452c-bf3e-f54c3c2ead83

See Projects for more information.

Job ID

When you are viewing a Job window, the Job ID is displayed in the browser URL path after …/projects/details/<Project-ID>/jobs/. In the example below, cb1ca778-e25e-42b0-bf43-056ab34374b0 is the Job ID.

Example Job ID path

http://example-domain.net/ui/projects/details/cd6c07ca-2fd4-452c-bf3e-f54c3c2ead83/jobs/cb1ca778-e25e-42b0-bf43-056ab34374b0
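
If you copy these URLs often, the IDs can be pulled out of the path programmatically. The following sketch assumes the path layout shown in the two examples above (`…/projects/details/<Project-ID>/jobs/<Job-ID>`); the helper name `ids_from_ui_url` is hypothetical.

```python
from urllib.parse import urlparse


def ids_from_ui_url(url):
    """Extract the Project ID and, if present, the Job ID from a UI URL."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    ids = {}
    # The Project ID is the path segment that follows "details".
    if "details" in parts:
        i = parts.index("details")
        if i + 1 < len(parts):
            ids["project_id"] = parts[i + 1]
    # The Job ID is the path segment that follows "jobs".
    if "jobs" in parts:
        j = parts.index("jobs")
        if j + 1 < len(parts):
            ids["job_id"] = parts[j + 1]
    return ids


url = (
    "http://example-domain.net/ui/projects/details/"
    "cd6c07ca-2fd4-452c-bf3e-f54c3c2ead83/jobs/"
    "cb1ca778-e25e-42b0-bf43-056ab34374b0"
)
print(ids_from_ui_url(url)["job_id"])  # prints "cb1ca778-e25e-42b0-bf43-056ab34374b0"
```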

See the Train jobs document for information on training jobs. See the Batch inference document for information on batch inference jobs.

Endpoint ID

The Endpoint ID is displayed in the URL path of the Endpoint information window. The Endpoint ID is the last sequence of numbers.

Figure 4. Endpoint ID

See Create and use endpoints for more information.

Key

The SambaStudio Swagger framework requires the SambaStudio API authorization key. This Key is generated in the Resources section of the platform. See SambaStudio resources for information on how to generate your API authorization key.

Figure 5. Resources section