All plans can be stopped at any time. You pay only for the time you use the service: the invoiced amount is automatically prorated, so if you downgrade, you get a discount on your next invoice. The only exception is the On-Premise plan, which is not prorated.
Prices do not include taxes by default. If you are a business registered in the EU or an individual, please contact us so we can apply the correct VAT to your subscription.
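As a rough illustration of how a prorated downgrade could work out, here is a minimal sketch. The day-based formula and the function name `proration_credit` are assumptions made for this example; the exact proration method used on invoices is not specified on this page.

```python
def proration_credit(old_price: float, new_price: float,
                     days_remaining: int, days_in_period: int) -> float:
    """Estimate the discount after a downgrade.

    ASSUMPTION: proration is linear and day-based. The real invoicing
    logic may use a different granularity.
    """
    if not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be between 0 and days_in_period")
    unused_fraction = days_remaining / days_in_period
    # Only the price difference for the unused part of the period is credited.
    return round((old_price - new_price) * unused_fraction, 2)


# Downgrading from a $99/month plan to a $29/month plan halfway through
# a 30-day billing period.
credit = proration_credit(99.0, 29.0, days_remaining=15, days_in_period=30)
print(f"Estimated discount on next invoice: ${credit:.2f}")
```

Under these assumptions, the example yields a $35.00 discount. The On-Premise plan is excluded from this logic since it is not prorated.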
Not sure which plan is best for you? Just ask our support team!
Use all the pre-trained models. You are invoiced after the fact, based on usage.
Note: this plan is perfect for testing or very irregular usage. For production use, we encourage you to select one of the pre-paid plans below, which are less costly.
• No fixed cost: pay only if you consume
• Automatically get a $15 FREE credit
• All our pre-trained models are available
• Asynchronous mode included
• Monitor your usage in your dashboard
• Parallel requests: 4 (can be increased)
On CPU: $0.003 per request
On GPU: $0.005 per request
Dolphin/ChatDolphin: + $0.00002 per token
Fine-tuned LLaMA 2 70B: + $0.00007 per token
Stable Diffusion: + $0.05 per generated image
Whisper: + $0.0001 per second (duration of your audio or video file)
Speech T5: + $0.00003 per token
Pre-paid plans are the most cost-effective option if you plan to make a large volume of requests to our pre-trained AI models.
The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).
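One way to decide between a fixed pre-paid plan and pay-as-you-go is to compute the break-even volume. The sketch below assumes the $29/month Starter price from the table on this page and the $0.003 per-request CPU price from the pay-as-you-go list. It ignores per-token surcharges and rate limits, and the function name `break_even_requests` is hypothetical.

```python
import math


def break_even_requests(monthly_price: float, price_per_request: float) -> int:
    """Smallest number of monthly requests at which the fixed plan
    costs no more than pay-as-you-go.

    ASSUMPTION: pay-as-you-go cost is purely requests * price_per_request.
    """
    if price_per_request <= 0:
        raise ValueError("price_per_request must be positive")
    return math.ceil(monthly_price / price_per_request)


# Starter plan ($29/month) versus pay-as-you-go on CPU ($0.003 per request).
threshold = break_even_requests(29.0, 0.003)
print(f"Pre-paid becomes cheaper from about {threshold} requests per month")
```

Under these assumptions, the threshold is 9,667 requests per month. Keep in mind that a pre-paid plan is also capped by its rate limit, so the comparison only makes sense if that limit fits your traffic.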
The rate limit is in "requests per minute" by default, but you can ask us to change this to "per hour" or "per day". This change is free of charge on the Enterprise plan and above.
On a GPU, AI models are around 10X faster on average.
Plan | Free | Starter | Starter GPU | Full | Full GPU | Enterprise | Enterprise GPU | Large Language Models |
---|---|---|---|---|---|---|---|---|
Price | $0 | $29/month ($1/day) | $99/month ($3.5/day) | $59/month ($2/day) | $199/month ($7/day) | $499/month ($16/day) | $699/month ($23/day) | $2,499/month ($80/day) |
Parallel Requests | 2 | 4 | 4 | 8 | 8 | 15 | 15 | 20 |
Fine-tuned LLaMA 2 70B on GPU (requests per minute) | Variable | 3 | 30 | |||||
Whisper on GPU (requests per minute) | Variable | 5 | 50 | |||||
Stable Diffusion on GPU (requests per minute) | Variable | 5 | 50 | |||||
Speech T5 on GPU (requests per minute) | Variable | 5 | 50 | |||||
Dolphin/ChatDolphin on GPU (requests per minute) | Variable | 2 | 6 | 20 | 200 | |||
Bart Large CNN on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large CNN on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large MNLI Yahoo Answers on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large MNLI Yahoo Answers on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large SamSum on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large SamSum on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
XLM Roberta Large XNLI on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
XLM Roberta Large XNLI on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
T5 Base EN Generate Headlines on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
T5 Base EN Generate Headlines on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Roberta Base Squad 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Roberta Base Squad 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base SST 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base SST 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base Emotion on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base Emotion on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Finbert on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Finbert on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
NLLB 200 3.3B on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
NLLB 200 3.3B on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Paraphrase Multilingual Mpnet Base V2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Paraphrase Multilingual Mpnet Base V2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
SpaCy on CPU (requests per minute) | Variable | 10 | 30 | 150 |
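Since the limits above are expressed in requests per minute, a client can avoid hitting them by throttling itself. The following is a minimal sliding-window sketch, not part of any official client library. The class name `MinuteRateLimiter` is hypothetical, and the limit of 10 is just one of the values appearing in the table.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side throttle: at most `max_per_minute` calls in any 60 s window."""

    def __init__(self, max_per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.sent = deque()   # timestamps of calls in the current window

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        while (delay := self.wait_time()) > 0:
            self.sleep(delay)
        self.sent.append(self.clock())


# Usage: call limiter.acquire() right before each API request.
limiter = MinuteRateLimiter(max_per_minute=10)
limiter.acquire()  # returns immediately while fewer than 10 calls were made in the last minute
```

If you have asked for a "per hour" or "per day" limit instead, the same idea applies with a longer window.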
• Train/fine-tune your own GPT-J or Dolphin model
• Automatically deploy your model to a basic dedicated GPU server
• Maximum context size in each request: 1024 tokens
• Parallel requests: 1
• Response time: 3 seconds per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$399 / month ($13 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $379 per additional server or model)
• Train/fine-tune your own GPT-J or Dolphin model
• Automatically deploy your model to a cutting-edge dedicated GPU server
• Maximum context size in each request: 2048 tokens
• Parallel requests: 10
• Response time: 1 second per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$990 / month ($33 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $890 per additional server or model)
• Fine-tune your own semantic search model
• Automatically deploy your model to a GPU server
• Response time: 1 second
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Maximum dataset size: 1 million examples
• Pay-as-you-go on all the pre-trained models
$299 / month ($10 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $279 per additional server or model)
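The three dedicated fine-tuning plans above share the same pricing structure: a base price, a price per additional server or model, and $19 per fine-tuning beyond the 3 included each month. A small sketch of that arithmetic follows. The plan keys (`basic_gpu`, `cutting_edge_gpu`, `semantic_search`) and the `monthly_cost` function are hypothetical names for this example, and it assumes the 3 free fine-tunings are counted once per subscription rather than per server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedicatedPlan:
    base: int                 # USD/month for 1 deployed model on 1 server
    extra_unit: int           # USD/month per additional server or model
    free_finetunes: int = 3   # fine-tunings included each month
    finetune_price: int = 19  # USD per fine-tuning beyond the free ones


# Prices copied from the plans above; keys are hypothetical labels.
PLANS = {
    "basic_gpu": DedicatedPlan(base=399, extra_unit=379),
    "cutting_edge_gpu": DedicatedPlan(base=990, extra_unit=890),
    "semantic_search": DedicatedPlan(base=299, extra_unit=279),
}


def monthly_cost(plan: str, extra_units: int = 0, finetunes: int = 0) -> int:
    """Monthly price in USD, excluding taxes and pay-as-you-go usage."""
    p = PLANS[plan]
    paid_finetunes = max(0, finetunes - p.free_finetunes)
    return p.base + extra_units * p.extra_unit + paid_finetunes * p.finetune_price


# Basic GPU plan with 1 additional server and 5 fine-tunings in the month:
# 399 + 379 + 2 * 19
print(monthly_cost("basic_gpu", extra_units=1, finetunes=5))
```

Usage of the pre-trained models is billed separately on a pay-as-you-go basis, so it is not included in this total.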
• Upload and deploy your own existing models, or deploy any AI model from the open-source community
• Deploy either on CPU or GPU servers
• Easily scale your model up or down
• Leverage a robust and highly-available infrastructure
Cost depends on your model.
Please contact us.
• Choose a specific continent or country
• Many regions available (US, France, Germany, Asia, Middle-East, and more)
+ $249 / month ($8 / day).
Please contact us.
• Choose a specific cloud provider
• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)
+ $249 / month ($8 / day).
Please contact us.
• Deploy models in-house within your own infrastructure
• No data is sent to the NLP Cloud servers (no internet connection required)
• Suited for sensitive data (e.g. medical or financial applications)
• You can fine-tune your own model on NLP Cloud and then deploy it on-premise
• A 1h consultancy session is automatically included
$649 / month (not prorated).
Please contact us.
Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? Our highly skilled AI experts will be happy to help you and provide training!
$200 / hour.
Please contact us.
Do you have an awesome AI project but lack the technical skills to build it? Our technical experts can work on integrating natural language processing into your application!
$200 / hour.
Please contact us.
Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, rate limiting per hour instead of per minute, and much more! Just let us know.
Plans can also be paid in other currencies. Please ask us for more information if needed.