API V2
This document contains SambaStudio API version 2 (V2) reference information. It describes input and output formats for the new predict V2 API.
Generic API language models
This section describes how to create a model response for a given chat conversation or text prompt using the Generic V2 API.
API Type | HTTP Method | Endpoint |
---|---|---|
Predict | | |
Stream | | |
Generic API request header
Header | Type | Value |
---|---|---|
| | Array | When dynamic batching is disabled, the batch of requests sent is processed directly instead of individual requests being grouped into batches. We recommend disabling dynamic batching only if you have implemented your own queuing or batching mechanism. Otherwise, keep dynamic batching enabled, which helps optimize performance by grouping smaller requests for more efficient processing. |
Generic API request body
Attributes | Type | Description |
---|---|---|
items | Array | |
params (available params depend on the model) | JSON object | Sets the tuning parameters to use, specified as key-value pairs. |
{
"items": [
{
"id": "item1",
"value": "{\"messages\":[{\"message_id\":0,\"role\":\"user\",\"content\":\"Hi\"},{\"message_id\":1,\"role\":\"assistant\",\"content\":\"Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?\"},{\"message_id\":2,\"role\":\"user\",\"content\":\"\"}]}"
},
{
"id": "item2",
"value": "{\"messages\":[{\"message_id\":0,\"role\":\"user\",\"content\":\"Hi\"},{\"message_id\":1,\"role\":\"assistant\",\"content\":\"Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?\"},{\"message_id\":2,\"role\":\"user\",\"content\":\"\"}]}"
}
],
"params": {
"do_sample": false,
"select_expert": "Meta-Llama-3-70B-Instruct",
"process_prompt": true,
"top_k": 50
}
}
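In the example request, each item's value is itself a JSON-encoded string that holds the conversation. A minimal Python sketch of assembling such a body might look like the following. The helper name build_generic_request and its argument shapes are illustrative assumptions, not part of the SambaStudio API, and the params shown are only those from the example (available params depend on the model).

```python
import json


def build_generic_request(conversations, params):
    """Build a request body shaped like the example above.

    `conversations` is a list of conversations; each conversation is a list
    of (role, content) tuples. This helper is hypothetical and only mirrors
    the structure of the documented example.
    """
    items = []
    for index, conversation in enumerate(conversations, start=1):
        messages = [
            {"message_id": message_id, "role": role, "content": content}
            for message_id, (role, content) in enumerate(conversation)
        ]
        # "value" is a JSON string, not a nested object.
        items.append(
            {"id": f"item{index}", "value": json.dumps({"messages": messages})}
        )
    return {"items": items, "params": dict(params)}


body = build_generic_request(
    [[("user", "Hi")]],
    {"do_sample": False, "select_expert": "Meta-Llama-3-70B-Instruct"},
)
print(json.dumps(body, indent=2))
```

Serializing the messages with json.dumps avoids hand-escaping the inner quotes that appear in the raw example.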
Stream response format
The stream response format for the Generic V2 API is described below.
- .result.items: Contains responses for the prompts passed into the input request.
- Each object in this array has the following format:
  - id: Indicates which item this answer corresponds to.
  - value: Contains the response details.
    - is_last_response: false for every response for a batch item except the last stream response.
    - stream_token: The actual response token for the prompt.
{
"result": {
"items": [
{
"id": "item1",
"value": {
"is_last_response": false,
"stream_token": "severe cold symptoms, seek medical "
}
}
]
}
}
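One way to consume stream responses of this shape is to concatenate the stream_token values per item id. The sketch below assumes each stream response arrives as one complete JSON document in a string; the transport framing is not specified in this document, so that is an assumption. The function name accumulate_stream and the sample chunks are made up for illustration.

```python
import json


def accumulate_stream(chunks):
    """Concatenate stream_token values per item id.

    `chunks` is an iterable of strings, each assumed to hold one complete
    JSON stream response (an assumption about framing, not documented).
    """
    texts = {}
    for raw in chunks:
        if not raw.strip():
            continue  # ignore blank lines, if any
        response = json.loads(raw)
        for item in response["result"]["items"]:
            token = item["value"].get("stream_token", "")
            texts[item["id"]] = texts.get(item["id"], "") + token
    return texts


# Hypothetical chunks modelled on the example response above.
sample_chunks = [
    '{"result": {"items": [{"id": "item1", "value": '
    '{"is_last_response": false, "stream_token": "severe cold symptoms, "}}]}}',
    '{"result": {"items": [{"id": "item1", "value": '
    '{"is_last_response": false, "stream_token": "seek medical "}}]}}',
]
print(accumulate_stream(sample_chunks))
```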
Last stream response
The format for the last stream response is described below.
- .result.items: Contains responses for the prompts passed into the input request.
- If an object is the last response for that particular batch item, it has value.is_last_response set to true.
  - id: Indicates which item this answer corresponds to.
  - completion: Contains the full completion generated for the prompt.
  - stream_token: Empty in the last response.
  - start_time: The start time of the request on the server.
  - end_time: The end time of the request on the server.
  - prompt_tokens_count: The total number of tokens in the user prompt.
  - completion_tokens_count: The total number of tokens generated by the model.
  - batch_size_used: The batch size that was used for prediction on the server.
  - stop_reason: The reason why the output was terminated.
    - Examples: max_len_reached, end_of_text, or stop_sequence_hit.
  - time_to_first_token: The time to get the first token on the server side.
  - throughput_after_first_token: The number of tokens generated per second after the first token.
  - total_tokens_count: The total number of tokens, including the input and output.
  - model_execution_time: The total time needed by the model to execute the request.
{
"result": {
"items": [
{
"id": "item1",
"value": {
"batch_size_used": 1,
"completion": "I was created by a group of researchers at Meta AI. I'm a deep neural network, specifically a transformer, that's been trained on a large corpus of text data. My training data includes a massive amount of text from various sources, including books, articles, and websites, which allows me to understand and generate human-like language.\n\nMy development is the result of a collaboration between several researcher and engineers at Meta AI, who have expertise in natural language processing, machine learning, and software development. They've worked together to design and train me, as well as to develop the infrastructure that allows me to interact with users like you.\n\nI'm constantly learning and improving, so I can become more accurate and informative in my responses. This is possible thanks to the efforts of my developers, who continue to refine my training data and algorithms to make me a better conversational AI.",
"completion_tokens_count": 174,
"end_time": 1719850149.1811948,
"is_last_response": true,
"model_execution_time": 2.647777557373047,
"prompt": "\u003c|start_header_id|\u003euser\u003c|end_header_id|\u003e\n\nwho created you?\u003c|eot_id|\u003e\u003c|start_header_id|\u003eassistant\u003c|end_header_id|\u003e\n\n",
"prompt_tokens_count": 14,
"start_time": 1719850146.5334172,
"stop_reason": "end_of_text",
"stream_token": "",
"throughput_after_first_token": 70.48973284663778,
"time_to_first_token": 0.2644517421722412,
"total_tokens_count": 188
}
}
]
}
}
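The fields of the last stream response can be combined into a short summary. The following sketch uses a subset of the values from the example above. The function name summarize_last_response is hypothetical, and treating start_time and end_time as timestamps in seconds is an inference from the example values, not something this document states.

```python
def summarize_last_response(item):
    """Derive a few figures from the last stream response for one item."""
    value = item["value"]
    if not value.get("is_last_response"):
        raise ValueError("not the last response for this item")
    return {
        "id": item["id"],
        "stop_reason": value["stop_reason"],
        # Assumes both timestamps are expressed in seconds.
        "server_seconds": value["end_time"] - value["start_time"],
        # Checks that total = prompt + completion for this response.
        "tokens_consistent": (
            value["prompt_tokens_count"] + value["completion_tokens_count"]
            == value["total_tokens_count"]
        ),
    }


# Subset of the example response shown above.
last_item = {
    "id": "item1",
    "value": {
        "is_last_response": True,
        "stop_reason": "end_of_text",
        "start_time": 1719850146.5334172,
        "end_time": 1719850149.1811948,
        "prompt_tokens_count": 14,
        "completion_tokens_count": 174,
        "total_tokens_count": 188,
    },
}
summary = summarize_last_response(last_item)
print(summary)
```

For the example values, the difference between end_time and start_time comes out close to the reported model_execution_time, and the prompt and completion token counts add up to total_tokens_count.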
Non-stream response
The non-stream response format for the Generic V2 API is described below.
- items: Contains responses to all the items that were sent with the input request.
- Each item response contains the following fields:
  - id: Indicates which input request item this output corresponds to.
  - value: The actual prediction generated by the model for the given prompt.
{
"items": [
{
"id": "item1",
"value": "I'm just a language model, I don't have feelings or emotions like humans do, so I don't have good or bad days. I'm always \"on\" and ready to assist you with any questions or tasks you may have!\n\nThat being said, I'm functioning properly and ready to help you with anything you need. How can I assist you today?"
},
{
"id": "item2",
"value": "I'm an AI, and I can do a lot of things. Here are some examples:\n\n**Converse**: I can have a conversation with you, answering your questions, providing information, and even engaging in small talk.\n\n**Provide information**: I have been trained on a massive dataset of text and can provide information on a wide range of topics, including but not limited to:\n\n* History\n* Science\n* Technology\n* Health\n* Entertainment\n* Culture\n* Education\n* And many more!\n\n**Generate text**: I can generate text based on a prompt or topic. This can be useful for writing articles, creating content, or even composing emails.\n\n**Translate text**: I can translate text from one language to another. I currently support translations in dozens of languages.\n\n**Summarize content**: I can summarize long pieces of text, such as articles or documents, into shorter, more digestible versions.\n\n**Offer suggestions**: I can offer suggestions for things like gift ideas, travel destinations, books to read, and more.\n\n**Play games**: I can play simple text-based games with you, such as Hangman, 20 Questions, and Word Jumble.\n\n**Generate creative content**: I can generate creative content, such as poetry or short stories.\n\n**Assist with language learning**: I can help with language learning by providing grammar explanations, vocabulary practice, and conversation practice.\n\n**Provide definitions**: I can define words and phrases, explaining their meanings and usage.\n\n**And more!**: I'm constantly learning and improving, so there may be other things I can do that aren't listed here.\n\nWhat would you like to do or talk about?"
},
{
"id": "item3",
"value": "I was created by a group of researcher at Meta AI. I'm a deep neural network, specifically a transformer, that's been trained on a large corpus of text data. My training data includes a massive amount of text from various sources, including books, articles, and websites, which allows me to understand and generate human-like language.\n\nMy development is the result of a collaboration between several researcher and engineers at Meta AI, who have expertise in natural language processing, machine learning, and software development. They've worked together to design and train me, as well as to develop the infrastructure that allows me to interact with users like you.\n\nI'm constantly learning and improving, so I can become more accurate and informative in my responses. This is possible thanks to the efforts of my developers, who continue to refine my training data and algorithms to make me a better conversational AI."
},
{
"id": "item4",
"value": "Here are 10 things not to do when you're feeling cold:\n\n1. **Don't stay in wet clothes**: Wet clothes can make you lose heat quickly, making you feel even colder. Change into dry, warm clothes as soon as possible.\n2. **Avoid drinking cold beverages**: Drinking cold drinks can lower your body temperature further, making you feel colder. Opt for warm beverages like tea, coffee, or hot chocolate instead.\n3. **Don't take a cold shower**: Taking a cold shower can cause your body to lose heat rapidly, making you feel colder. Take a warm or hot shower instead to help raise your body temperature.\n4. **Don't go outside without dressing warmly**: If you need to go outside, make sure to dress warmly in layers, including a hat, gloves, and scarf. This will help prevent heat loss.\n5. **Don't ignore hypothermia symptoms**: If you're experiencing symptoms like shivering, confusion, or dizziness, seek medical attention immediately. Hypothermia can be life-threatening if left untreated.\n6. **Don't rely on caffeine or nicotine**: While caffeine and nicotine may provide a temporary energy boost, they can also cause blood vessels to constrict, making you feel colder.\n7. **Don't skip meals**: Eating regular meals can help keep your body warm by providing energy. Opt for warm, hearty meals like soup or stew.\n8. **Don't use electric blankets or heating pads on high**: While electric blankets and heating pads can provide warmth, using them on high settings can cause burns or fires. Use them on low settings and follow the manufacturer's instructions.\n9. **Don't neglect to maintain your heating system**: Make sure your heating system is in good working condition to prevent breakdowns and keep your home warm.\n10. **Don't underestimate the power of physical activity**: Engage in light physical activity like stretching, jumping jacks, or yoga to get your blood flowing and warm yourself up. However, avoid strenuous exercise that can cause you to sweat and lose heat.\n\nRemember to always prioritize your safety and health when feeling cold. If you're experiencing persistent or severe cold symptoms, seek medical attention if necessary."
}
]
}
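Because each response item carries the id of the request item it answers, a non-stream response can be indexed by id. The sketch below assumes the response body is available as a JSON string; the function name index_predictions and the sample text are placeholders, not real model output.

```python
import json


def index_predictions(response_text):
    """Map each item id to its predicted text in a non-stream response."""
    response = json.loads(response_text)
    return {item["id"]: item["value"] for item in response["items"]}


# Placeholder response with the same shape as the example above.
sample_response = (
    '{"items": ['
    '{"id": "item1", "value": "first answer"}, '
    '{"id": "item2", "value": "second answer"}'
    "]}"
)
predictions = index_predictions(sample_response)
print(predictions["item2"])
```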
Embeddings API
POST https://<your-sambastudio-domain>/api/v2/predict/generic/<project-id>/<endpoint-id>
Attributes | Type | Description |
---|---|---|
items | Array | |
params | JSON object | |
curl --location 'https://<your-sambastudio-domain>/api/v2/predict/generic/<project-id>/<endpoint-id>' \
--header 'Content-Type: application/json' \
--header 'key: API KEY' \
--data '{
"items": [
{
"id": "item0",
"value": "Hello"
},
{
"id": "item1",
"value": "how are you?"
}
],
"params":
{
"select_expert": "e5-mistral-7b-instruct-8192"
}
}'
{
"items": [
{
"id": "item0",
"partial": false,
"value": [
0.016115479171276093,
-0.0016850305255502462,
-0.0032306977082043886,
-0.00342073873616755,
0.010692975483834743,
......
0.0045765722170472145,
-0.009879584424197674,
0.014238224364817142
],
"params": {},
"status": null
}
]
}
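The curl call above can be mirrored in Python with the standard library. This is a sketch under stated assumptions: the helper names build_embeddings_request and embeddings_by_id are hypothetical, the URL and key are the same placeholders used in the curl example, and the request is only prepared, not sent.

```python
import json
import urllib.request


def build_embeddings_request(url, api_key, texts, expert):
    """Prepare (but do not send) a POST request mirroring the curl example."""
    body = {
        "items": [
            {"id": f"item{index}", "value": text}
            for index, text in enumerate(texts)
        ],
        "params": {"select_expert": expert},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "key": api_key},
        method="POST",
    )


def embeddings_by_id(response):
    """Map each item id to its embedding vector in a decoded response."""
    return {item["id"]: item["value"] for item in response["items"]}


request = build_embeddings_request(
    "https://<your-sambastudio-domain>/api/v2/predict/generic/<project-id>/<endpoint-id>",
    "API KEY",
    ["Hello", "how are you?"],
    "e5-mistral-7b-instruct-8192",
)
# Sending is left to the caller, for example with urllib.request.urlopen(request).
```

Once a response has been received and decoded with json.loads, embeddings_by_id returns a dictionary keyed by the item ids from the request.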