Make sure to read our few-shot learning guide and our natural language instructions guide to get the most out of generative AI models! You can find more use case ideas on the OpenAI website, and it is worth taking the free one-hour prompt engineering class released by DeepLearning.AI.
This example uses text generation. Text generation can be very powerful, but using it correctly takes a bit of practice.
For most text generation use cases, especially if you use Dolphin, it is crucial to understand the few-shot learning concept: for the generative model to understand what you want, you should give it a couple of examples in your input request.
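As a sketch of what a few-shot request looks like, here is a minimal Python helper that assembles a few-shot prompt. The `Input:`/`Output:` labels and the `###` separator are illustrative conventions, not a required format:

```python
# Build a few-shot prompt for a text generation model such as Dolphin.
# Each example shows the model the input/output pattern we expect;
# "###" separates examples so the model knows where one ends.
def build_few_shot_prompt(examples, new_input):
    """examples: list of (input, output) pairs; new_input: the text to complete."""
    parts = []
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}\n###")
    # The final block is left open so the model completes the "Output:" line.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n".join(parts)

examples = [
    ("I love this product", "positive"),
    ("This is terrible", "negative"),
]
prompt = build_few_shot_prompt(examples, "Works great, highly recommend")
print(prompt)
```

The resulting string is what you would send as the `text` field of your request; the model then continues the pattern and fills in the last `Output:` line.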
If you use a model that understands natural language instructions, like ChatDolphin or Fine-tuned LLaMA 2 70B, you can make your requests with simple human instructions.
If you're still not getting good results even when using few-shot learning, it might be a sign that you need to fine-tune your own generative model. You can easily do it on NLP Cloud.
Feel free to experiment with the text generation parameters such as top_p, temperature, and repetition penalty, and see the documentation below on how to use them if needed.
If you need advice, please contact us!
Fine-tuned LLaMA 2 70B: A fine-tuned version of LLaMA 2 70B that understands human instructions without requiring few-shot learning, but that also works very well with few-shot learning. It supports a 4,096-token context (input + output). Not all parameters are supported for the moment. It natively understands the following languages: Albanian, Arabic, Armenian, Awadhi, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bhojpuri, Bosnian, Brazilian Portuguese, Bulgarian, Cantonese (Yue), Catalan, Chhattisgarhi, Chinese, Croatian, Czech, Danish, Dogri, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haryanvi, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kashmiri, Kazakh, Konkani, Korean, Kyrgyz, Latvian, Lithuanian, Macedonian, Maithili, Malay, Maltese, Mandarin Chinese, Marathi, Marwari, Min Nan, Moldovan, Mongolian, Montenegrin, Nepali, Norwegian, Oriya, Pashto, Persian (Farsi), Polish, Portuguese, Punjabi, Rajasthani, Romanian, Russian, Sanskrit, Santali, Serbian, Sindhi, Sinhala, Slovak, Slovenian, Spanish, Swahili, Swedish, Tajik, Tamil, Tatar, Telugu, Thai, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, and Wu.
Dolphin: An NLP Cloud foundational in-house model. It works best when used with few-shot learning. It supports a 2,048-token context (input + output). It natively understands the following languages: Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, French, German, Hungarian, Italian, Japanese, Polish, Portuguese, Romanian, Russian, Serbian, Slovenian, Spanish, Swedish, and Ukrainian.
ChatDolphin: An NLP Cloud in-house model based on Dolphin, fine-tuned to understand human instructions without requiring few-shot learning. It supports a 2,048-token context (input + output). No text generation parameters are supported by this model for the moment, but the "###" end sequence is automatically implemented under the hood in case you want to use few-shot learning. It works best in English.
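To illustrate how a model is selected, here is a hedged sketch of a request to the NLP Cloud generation endpoint using Python. The URL pattern and payload fields mirror the parameters documented below, but they are assumptions based on this page; check the API reference for the exact contract, and note that the token is a placeholder:

```python
API_TOKEN = "YOUR_API_TOKEN"  # placeholder, not a real token
model = "chatdolphin"         # or another model from the list above

# Endpoint pattern assumed from these docs: one path segment per model.
url = f"https://api.nlpcloud.io/v1/{model}/generation"

payload = {
    "text": "Write a short product description for a solar-powered lamp.",
    "max_length": 100,
}
headers = {
    "Authorization": f"Token {API_TOKEN}",
    "Content-Type": "application/json",
}

# The actual call would be something like (not executed here):
# import requests
# response = requests.post(url, headers=headers, json=payload)
# print(response.json())
print(url)
```

Swapping the `model` variable is all it takes to target Dolphin or Fine-tuned LLaMA 2 70B instead; remember that ChatDolphin ignores most of the text generation parameters described below.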
Whether you want to use the model on a GPU. Machine learning models run much faster on GPUs.
Whether min_length and max_length should exclude the length of the input text. If false, min_length and max_length include the length of the input text. If true, min_length and max_length apply to the generated text only. Defaults to false.
Whether you want to remove the input text from the result. Defaults to false.
The maximum number of tokens that the generated text should contain. If length_no_input is false, the size of the generated text is the difference between max_length and the length of your input text. If length_no_input is true, the size of the generated text is simply max_length. Defaults to 50.
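The interaction between max_length and length_no_input can be summed up with a bit of arithmetic. This is a local illustration only, not an API call:

```python
# Illustration of how length_no_input affects the generated-text budget.
def generated_token_budget(input_tokens, max_length, length_no_input):
    """Return how many tokens are available for the generated text."""
    if length_no_input:
        # max_length applies to the generated text alone.
        return max_length
    # max_length covers input + output, so the output gets what is left.
    return max(0, max_length - input_tokens)

# A 30-token input with max_length=50:
print(generated_token_budget(30, 50, False))  # → 20
print(generated_token_budget(30, 50, True))   # → 50
```

In other words, with the default length_no_input=false, a long input can silently eat most of your max_length budget, so set length_no_input=true when you care about the size of the output itself.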
A specific token that should stop the text generation. For example, it could be `.`, `\n`, `###`, or anything else under 10 characters. If you use few-shot learning, we recommend using `###` as your end sequence.
Whether you want to remove the end sequence from the result. Defaults to false.
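If you post-process results yourself, the effect of remove_end_sequence can be reproduced client-side. Here is a small hedged sketch (the helper name is ours, not part of any client library):

```python
# Mimic what remove_end_sequence=true does: keep only the text
# generated before the end sequence, discarding the rest.
def trim_at_end_sequence(generated_text, end_sequence="###"):
    head, sep, _tail = generated_text.partition(end_sequence)
    # If the end sequence never appears, return the text unchanged.
    return head.rstrip() if sep else generated_text

raw = "Output: positive\n###\nInput: something else"
print(trim_at_end_sequence(raw))  # → "Output: positive"
```

This is mostly useful with few-shot learning, where the model tends to keep generating further `###`-separated examples after answering your actual input.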
List of tokens that are not allowed to be generated. Defaults to null.
The number of independently computed returned sequences. Defaults to 1.
(Optional float.) Prevents the same word from being repeated too many times. 1.0 means no penalty; above 1.0, repetitions are less likely to happen; below 1.0, repetitions are more likely to happen. Should be between -1.0 and 3.0. Defaults to 1.0. Not supported by ChatDolphin.
Number of beams for beam search. 1 means no beam search. Defaults to 1. This is an advanced parameter, so you should change it only if you know what you're doing.
Top K sampling (optional float). The number of highest-probability vocabulary tokens to keep for top k filtering. The lower this value, the less likely the model is to generate off-topic text. Should be between 1 and 1000. Defaults to 50. Not supported by ChatDolphin.
Top P sampling (optional float). If below 1, only the most probable tokens whose probabilities add up to top_p or higher are kept for generation. The higher this value, the less deterministic the result will be. Prefer tuning `top_p` when your application requires accurate results with some variety, and `temperature` when you want more creative results; you should not use both at the same time. Should be between 0 and 1. Defaults to 1.0. Not supported by ChatDolphin.
Temperature sampling (optional float). It modulates the next token probabilities. The higher this value, the less deterministic the result will be. If set to 0, no sampling is applied at all and greedy search is used instead: with `temperature=0` the output will always be the same, while with `temperature=1` each new request can produce very different results. Prefer tuning `top_p` when your application requires accurate results with some variety, and `temperature` when you want more creative results; you should not use both at the same time. Should be between 0 and 1000. Defaults to 0.8. Not supported by ChatDolphin.
AI models don't always work well with non-English languages.
We do our best to add non-English models when possible. See, for example, Fine-tuned LLaMA 2 70B, Dolphin, ChatDolphin, XLM Roberta Large XNLI, Paraphrase Multilingual Mpnet Base V2, or spaCy. Unfortunately, not all models are good at handling non-English languages.
To solve this challenge, we developed a multilingual add-on that automatically translates your input into English, performs the actual NLP operation, and then translates the result back into your original language. It makes your requests a bit slower but often returns very good results.
Even models that natively understand non-English languages sometimes work better with the multilingual add-on.
Simply select your language in the list, and from now on you can write the input text in your own language!
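Conceptually, the add-on wraps the English-only model in a translate-in/translate-out pipeline. The sketch below illustrates that flow with stand-in functions; the real translation and generation steps happen server-side and are not shown here:

```python
# A conceptual sketch of what the multilingual add-on does under the hood:
# translate the input to English, run the model, translate the result back.
# translate() and generate_english() are stand-ins, not real API calls.
def translate(text, source, target):
    # Stand-in: a real implementation would call a translation model.
    return f"[{source}->{target}] {text}"

def generate_english(text):
    # Stand-in for the actual English-only generation step.
    return f"generated({text})"

def multilingual_generate(text, user_lang):
    english_input = translate(text, user_lang, "en")
    english_output = generate_english(english_input)
    return translate(english_output, "en", user_lang)

result = multilingual_generate("Bonjour tout le monde", "fr")
print(result)
```

The two extra translation hops are why requests through the add-on are a bit slower, as noted above.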
This multilingual add-on is a free feature.