Human Aligned (HA) models
This document provides information about SambaStudio’s Human Aligned (HA) models. These checkpoints have been trained on a small amount of data in which prompts are given to humans, who manually write the completions. This data is optimized for use in human-facing applications, such as a chatbot.
Data preparation for training
The Generative data preparation repo describes how to prepare data to be used to train SambaNova’s Human Aligned (HA) models. To access the data preparation package, including its associated documentation, please visit the SambaNova public GitHub using the following link: https://github.com/sambanova/generative_data_prep
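For illustration, the sketch below writes a few prompt/completion pairs to a JSON Lines file, the kind of input the data preparation package consumes. The field names ("prompt", "completion") and file layout are assumptions; confirm the exact format and invocation against the repo’s README.

```python
# Illustrative only: write prompt/completion pairs as JSON Lines for use with
# the generative_data_prep package. Field names are assumptions here; confirm
# them against the repo's documentation.
import json

examples = [
    {"prompt": "Please summarize the previous article:",
     "completion": "The article covers ..."},
    {"prompt": "What are the biggest challenges traditional banks face with users these days?",
     "completion": "Traditional banks struggle with ..."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```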
Prompt guidelines
End prompts with a colon (:), a question mark (?), or another signal that lets the model know it is time to start generating. For example, Please summarize the previous article: (with a colon) is a better prompt than Please summarize the previous article (without a colon). Adding these annotations tends to lead to better generations, as it indicates to the model that you’ve finished your question and are expecting an answer.
Example prompts
The examples below demonstrate prompts for SambaNova’s Human Aligned (HA) models. Each example is identified by a task type.
Open domain Q&A example 1
Prompt:
What does the future outlook of Apple Inc. look like?
Open domain Q&A example 2
Prompt:
What are the biggest challenges traditional banks face with users these days?
Extractive summarization
Prompt:
Please summarize the following paragraph into a markdown table: I have three options to consider. Option one is to walk to work that will take me about 3 hours but will cost me $0. Option two is to take my car, which will take me 20 minutes, but will cost me $5. Option 3 is to take an Uber that will take me 10 minutes but will cost me $20.
Sentence rephrase example 1
Prompt:
Reword the sentence 'I like go to the carnival, because its so much good street food.' to enhance its fluency.
Sentence rephrase example 2
Prompt:
Combine the following two sentences into one coherent sentence.\n1. Soon after the 1998 election Fred proclaimed he will step down as party leader. \n2. Soon after the 1998 election Fred began working as a chef.
Usage
Use the Human Aligned (HA) models for any Playground use cases or where there is direct human interaction with the checkpoint itself.
Playground tuning parameter settings
The Playground tuning parameters provide additional flexibility and options for generative tuning. We recommend the following settings for Human Aligned (HA) models used in the Playground.
- Setting Do sampling to On is recommended when using Human Aligned (HA) models.
- A Temperature of 0.7 or higher is recommended when using Human Aligned (HA) models (see the example settings below).
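A minimal sketch of these recommendations as a settings dictionary is shown below. The key names mirror the parameter names used in the tables that follow and are not an official API payload.

```python
# Recommended Playground settings for Human Aligned (HA) models
# (illustrative dictionary, not an official SambaStudio request payload).
recommended_playground_settings = {
    "do_sample": True,    # turn sampling on
    "temperature": 0.7,   # 0.7 or higher is recommended
}
```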
Hyperparameters and settings
The hyperparameters and settings for the Human Aligned (HA) models when creating a training job are described below.
Parameter | Definition | Allowed values |
---|---|---|
 | Specifies whether a final evaluation is performed. | true, false |
 | Period for evaluating the model, in number of training steps. | Integer > 0 |
 | Strategy for validating the model during training. | no, steps, epoch |
 | The learning rate to use in the optimizer. | 0.0 < float < 1.0 |
 | Period for logging training loss, in number of training steps. | Integer > 0 |
 | Type of learning rate scheduler to use. | polynomial_decay_schedule_with_warmup, cosine_schedule_with_warmup, fixed_lr |
 | Sequence length to pad or truncate the dataset to. Should be set to align your dataset with your chosen model. | Defined by selected model. |
 | The number of iterations to run. | Integer > 0 |
 | Loss scale for the prompt tokens. | 0.0 < float < 1.0 |
 | Determines whether to save the optimizer state when saving a checkpoint. | true, false |
 | Period for saving model checkpoints, in number of training steps. | Integer > 0 |
 | Determines whether or not to skip the checkpoint. | true, false |
 | Subsample fraction for the evaluation dataset. | 0.0 < float < 1.0 |
 | Random seed to use for the evaluation subsample. | Integer > 0 |
 | Determines whether to use token_type_ids to compute the loss. | true, false. Setting to true is recommended if Generative data preparation was used. |
 | Maximum size of the vocabulary. | Defined by selected model. |
 | Number of warmup steps to use in the learning rate scheduler. | Integer > 0 |
 | Weight decay rate to use in the optimizer. | 0.0 < float < 1.0 |
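As a rough sketch, a training-job configuration following the table above might look like the dictionary below. The key names are hypothetical stand-ins (take SambaStudio’s actual parameter names from the job-creation form), and the values are only examples within the allowed ranges.

```python
# Hypothetical training configuration sketch -- the key names below are illustrative
# stand-ins for the parameters described above, not SambaStudio's official names.
example_training_config = {
    "evaluation_strategy": "steps",                # no | steps | epoch
    "eval_steps": 50,                              # evaluate every 50 training steps
    "learning_rate": 1e-5,                         # 0.0 < float < 1.0
    "lr_schedule": "cosine_schedule_with_warmup",  # one of the listed schedulers
    "warmup_steps": 100,                           # Integer > 0
    "weight_decay": 0.1,                           # 0.0 < float < 1.0
    "prompt_loss_weight": 0.1,                     # loss scale for prompt tokens
    "use_token_type_ids": True,                    # recommended with Generative data preparation
    "save_steps": 200,                             # checkpoint every 200 steps
    "logging_steps": 10,                           # log training loss every 10 steps
}
```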
Inference settings
The inference settings for Human Aligned (HA) models when creating a batch inference job are described below.
Parameter | Definition | Allowed values |
---|---|---|
 | Toggles whether to use sampling. If not enabled, greedy decoding is used. When enabled, the platform randomly picks the next word according to its conditional probability distribution, so generation is not deterministic. If you need deterministic results, set this to off; the model is then less likely to generate unexpected or unusual words. Setting it to on gives the model a better chance of generating a high-quality response, but in an industrial pipeline it can lead to more hallucinations and non-determinism. | true, false. Setting to true is recommended. If set to false, temperature, top_k, and top_p are ignored and have no effect. |
 | Sequence length to pad or truncate the dataset to. Should be set to align your dataset with your chosen model. | Defined by selected model. |
 | The maximum number of tokens to generate, ignoring the number of tokens in the prompt. Make sure the total of the prompt tokens plus the requested tokens to generate does not exceed the supported sequence length of the model. You can use this parameter to limit the response to a certain number of tokens. | Integer ≥ 1; must not exceed the model’s supported sequence length. |
 | The repetition penalty, also known as frequency penalty, controls the model’s tendency to repeat predictions. It reduces the probability of words that have already been generated; the penalty depends on how many times a word has previously occurred in the prediction. This parameter can be used to penalize words that were previously generated or belong to the context, and it decreases the model’s likelihood of repeating the same line verbatim. | Between 1 and 2; ~1.2–1.5 is typical. A value of 1 means no penalty. |
 | Stop sequences make the model stop generating text at a desired point, such as the end of a sentence or a list. This optional setting tells the API when to stop generating tokens. The completion will not contain the stop sequence. If nothing is passed, generation stops at the model’s default end-of-text token. | Any comma-separated strings; each stop phrase must be enclosed in double quotes. Example: "Stop phrase 1", "stop phrase 2 with sp3ciAl token$" |
 | The value used to modulate the next-token probabilities. As the value decreases, the model becomes more deterministic and repetitive. | 0 < x ≤ 1. Has no effect when do_sample is set to false. |
 | The number of highest-probability vocabulary tokens to keep for top-k filtering. Top k allows the model to choose randomly among the k most probable tokens; for example, to choose among the top three tokens, set top k to 3. | 1 ≤ x ≤ vocab_size. Has no effect when do_sample is set to false. |
 | Shows the top tokens by log probability at each generation step. | 0 ≤ x ≤ 20 |
 | Top p sampling, sometimes called nucleus sampling, controls diversity as well as the randomness and originality of the model. The top p parameter specifies a sampling threshold at inference time: it shortlists the smallest set of most probable tokens whose cumulative probability reaches the threshold, and only those tokens are kept for generation. | 0 < x ≤ 1. Has no effect when do_sample is set to false. |
 | Maximum size of the vocabulary. | Defined by selected model. |
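To make the interaction of these settings concrete, here is a minimal, self-contained sketch (using NumPy) of how greedy decoding, temperature, top k, and top p typically combine when picking the next token. It is illustrative only and is not SambaStudio’s implementation.

```python
# Illustrative sketch (not SambaStudio's implementation) of how do_sample,
# temperature, top_k, and top_p typically interact when selecting the next token.
import numpy as np

def pick_next_token(logits, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, rng=None):
    """Return a token id chosen from raw logits."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)

    if not do_sample:
        # Greedy decoding: temperature, top_k, and top_p have no effect.
        return int(np.argmax(logits))

    # Lower temperature sharpens the distribution (more deterministic output).
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()

    # top_k: keep only the k most probable tokens.
    if top_k and top_k < probs.size:
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)

    # top_p (nucleus): keep the smallest set of tokens whose cumulative
    # probability reaches the top_p threshold.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep_count = int(np.searchsorted(cumulative, top_p * probs.sum())) + 1
        mask = np.zeros_like(probs)
        mask[order[:keep_count]] = probs[order[:keep_count]]
        probs = mask

    probs /= probs.sum()
    return int(rng.choice(probs.size, p=probs))

# Example: sample over a toy 5-token vocabulary.
print(pick_next_token([2.0, 1.5, 0.3, -1.0, -2.5]))
```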