Pricing

All plans can be stopped at any time. You only pay for the time you use the service. Invoices are prorated automatically, so if you downgrade, you get a discount on your next invoice. The only exception is the On-Premise plan, which is not prorated.
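
As a rough illustration of the proration described above, the sketch below computes the discount after a downgrade. The exact formula is not documented here, so the day-based calculation, the 30-day billing period, and the function name `downgrade_credit` are all assumptions made for the example.

```python
def downgrade_credit(old_monthly, new_monthly, days_remaining, days_in_period=30):
    """Discount for the unused part of the billing period after a downgrade.

    Assumption: proration is linear in days. The real invoicing rules may differ.
    """
    if not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be between 0 and days_in_period")
    # Price difference between the two plans, scaled by the unused share of the period.
    return (old_monthly - new_monthly) * days_remaining / days_in_period


# Moving from a $59/month plan to a $29/month plan halfway through the period.
print(downgrade_credit(59, 29, days_remaining=15))
```

Under these assumptions, the downgrade above would yield a $15 discount on the next invoice.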

Prices do not include taxes by default. If you are a business registered in the EU or an individual, please contact us so we can apply the correct VAT to your subscription.

Not sure which plan is best for you? Just ask our support team!

Pay-As-You-Go

Use all the pre-trained models. You are invoiced after the fact, based on usage.

• No fixed cost: pay only if you consume

• Automatically get a $15 FREE credit

• All our pre-trained models are available

• Asynchronous mode included

• Monitor your usage in your dashboard

• Parallel requests: 10 (can be increased)

On CPU: $0.003 per request

On GPU: $0.005 per request

Dolphin/ChatDolphin: + $0.0005 per 1K tokens

Fine-tuned LLaMA 2 70B: + $0.0018 per 1K tokens

Stable Diffusion: + $0.05 per generated image

Whisper: + $0.0001 per second (duration of your audio or video file)

Speech T5: + $0.0006 per 1K tokens
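
To get a feel for what a pay-as-you-go invoice might look like, the rates above can be combined in a small estimator. This is a sketch only: it assumes each "+" surcharge is added on top of the per-request price, and the dictionary keys and the function name `payg_cost` are hypothetical labels, not API identifiers.

```python
from decimal import Decimal

# Rates copied from the price list above.
BASE_PER_REQUEST = {"cpu": Decimal("0.003"), "gpu": Decimal("0.005")}
PER_1K_TOKENS = {
    "dolphin": Decimal("0.0005"),      # Dolphin/ChatDolphin
    "llama-2-70b": Decimal("0.0018"),  # Fine-tuned LLaMA 2 70B
    "speech-t5": Decimal("0.0006"),    # Speech T5
}
PER_IMAGE = Decimal("0.05")            # Stable Diffusion
PER_AUDIO_SECOND = Decimal("0.0001")   # Whisper


def payg_cost(requests, hardware="gpu", model=None, tokens=0, images=0, audio_seconds=0):
    """Estimated usage cost in dollars, before taxes and before the free credit."""
    cost = BASE_PER_REQUEST[hardware] * requests
    if model is not None:
        # Token surcharge is billed per 1K tokens for the listed generative models.
        cost += PER_1K_TOKENS[model] * Decimal(tokens) / 1000
    cost += PER_IMAGE * images
    cost += PER_AUDIO_SECOND * audio_seconds
    return cost


# 1,000 Dolphin requests on GPU using 500,000 tokens in total.
print(payg_cost(1000, "gpu", "dolphin", tokens=500_000))
# 10 Whisper requests on GPU covering 600 seconds of audio in total.
print(payg_cost(10, "gpu", audio_seconds=600))
```

With these assumptions the first scenario comes to $5.25 and the second to $0.11, both of which would be covered by the $15 free credit.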

Pre-Paid Plans For Pre-Trained Models

Pre-paid plans are the most cost-effective solution if you plan to make a large volume of requests to our pre-trained AI models.

The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).

The rate limit is in "requests per minute" by default, but you can ask us to change this to "per hour" or "per day". This change is free of charge on the Enterprise plan and above.

On a GPU, AI models are around 10X faster on average.

Plan                    Price                    Parallel Requests
Free                    $0                       2
Starter                 $29/month ($1/day)       10
Starter GPU             $99/month ($3.5/day)     10
Full                    $59/month ($2/day)       20
Full GPU                $199/month ($7/day)      20
Enterprise              $499/month ($16/day)     40
Enterprise GPU          $699/month ($23/day)     40
Large Language Models   $2,499/month ($80/day)   50

Rate limits on GPU (requests per minute)

Model                                   Free      Starter GPU  Full GPU  Enterprise GPU  Large Language Models
Whisper                                 Variable  1            3         10              50
Stable Diffusion                        Variable  1            3         10              50
Fine-tuned LLaMA 2 70B                  Variable  3            10        70              350
Dolphin/ChatDolphin                     Variable  10           30        200             1000
Speech T5                               Variable  10           30        200             1000
Bart Large CNN                          Variable  10           30        200             1000
Bart Large MNLI Yahoo Answers           Variable  10           30        200             1000
Bart Large SamSum                       Variable  10           30        200             1000
XLM Roberta Large XNLI                  Variable  10           30        200             1000
T5 Base EN Generate Headlines           Variable  10           30        200             1000
Roberta Base Squad 2                    Variable  10           30        200             1000
Distilbert Base SST 2                   Variable  10           30        200             1000
Distilbert Base Emotion                 Variable  10           30        200             1000
Finbert                                 Variable  10           30        200             1000
NLLB 200 3.3B                           Variable  10           30        200             1000
Paraphrase Multilingual Mpnet Base V2   Variable  10           30        200             1000

Rate limits on CPU (requests per minute)

Model                                   Free      Starter  Full  Enterprise
Bart Large CNN                          Variable  10       30    200
Bart Large MNLI Yahoo Answers           Variable  10       30    200
Bart Large SamSum                       Variable  10       30    200
XLM Roberta Large XNLI                  Variable  10       30    200
T5 Base EN Generate Headlines           Variable  10       30    200
Roberta Base Squad 2                    Variable  10       30    200
Distilbert Base SST 2                   Variable  10       30    200
Distilbert Base Emotion                 Variable  10       30    200
Finbert                                 Variable  10       30    200
NLLB 200 3.3B                           Variable  10       30    200
Paraphrase Multilingual Mpnet Base V2   Variable  10       30    200
SpaCy                                   Variable  10       30    200
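
When choosing between pay-as-you-go and a pre-paid plan, it can help to estimate the monthly volume at which the fixed price pays off. The sketch below is one way to do that. It only considers the base per-request price (it ignores per-token, per-image, and per-second surcharges as well as the free credit), it assumes a 30-day month, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def break_even_requests(monthly_price, per_request_price):
    """Smallest monthly request count at which pay-as-you-go costs at least the plan price.

    Prices are passed as strings so the division is exact.
    """
    return math.ceil(Fraction(monthly_price) / Fraction(per_request_price))


def max_requests_per_month(requests_per_minute, days=30):
    """Upper bound on monthly volume if the rate limit were saturated around the clock."""
    return requests_per_minute * 60 * 24 * days


# Starter ($29/month) against the CPU pay-as-you-go price ($0.003 per request).
print(break_even_requests("29", "0.003"))
# Starter GPU ($99/month) against the GPU pay-as-you-go price ($0.005 per request).
print(break_even_requests("99", "0.005"))
# Theoretical ceiling for a model limited to 10 requests per minute.
print(max_requests_per_month(10))
```

Under these assumptions, Starter breaks even at 9,667 requests per month and Starter GPU at 19,800, both well below the 432,000 requests that a limit of 10 requests per minute would theoretically allow.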

Pre-Paid Plans For Custom AI Models

Basic Fine-tuning

• Train/fine-tune your own GPT-J or Dolphin model

• Automatically deploy your model to a basic dedicated GPU server

• Maximum context size in each request: 1024 tokens

• Parallel requests: 1

• Response time: 3 seconds per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$399 / month ($13 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $379 per additional server or model)

Advanced Fine-tuning

• Train/fine-tune your own GPT-J or Dolphin model

• Automatically deploy your model to a cutting-edge dedicated GPU server

• Maximum context size in each request: 2048 tokens

• Parallel requests: 10

• Response time: 1 second per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$990 / month ($33 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $890 per additional server or model)
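
To compare the Basic and Advanced Fine-tuning plans in practice, the response times quoted above can be turned into a rough latency estimate. This sketch assumes that response time scales linearly with the number of generated tokens, which is a simplification, and the plan keys and function name are hypothetical.

```python
# Response times quoted for each plan, in seconds per 50 generated tokens.
SECONDS_PER_50_TOKENS = {"basic": 3.0, "advanced": 1.0}


def estimated_response_seconds(generated_tokens, plan):
    """Rough response time for one request, assuming linear scaling with output length."""
    return generated_tokens / 50 * SECONDS_PER_50_TOKENS[plan]


# Generating 200 tokens on each plan.
for plan in ("basic", "advanced"):
    print(plan, estimated_response_seconds(200, plan))
```

For a 200-token generation, this gives roughly 12 seconds on Basic and 4 seconds on Advanced.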

Semantic Search

• Fine-tune your own semantic search model

• Automatically deploy your model to a GPU server

• Response time: 1 second

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Maximum dataset size: 1 million examples

• Pay-as-you-go on all the pre-trained models


$299 / month ($10 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $279 per additional server or model)
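
The three custom-model plans above share the same price structure: a base price, a charge per additional server or model, and a charge per fine-tuning beyond the three included each month. A minimal sketch of that arithmetic follows. It assumes the three free fine-tunings apply once per subscription rather than per model, and the names `CustomModelPlan`, `PLANS`, and `monthly_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomModelPlan:
    base_price: int                  # dollars per month, 1 model on 1 dedicated server
    extra_unit_price: int            # dollars per additional server or model
    included_finetunings: int = 3    # free fine-tunings per month
    extra_finetuning_price: int = 19


# Figures copied from the plan descriptions above.
PLANS = {
    "basic": CustomModelPlan(399, 379),
    "advanced": CustomModelPlan(990, 890),
    "semantic_search": CustomModelPlan(299, 279),
}


def monthly_cost(plan_name, extra_units=0, finetunings=0):
    """Monthly price in dollars, excluding taxes and pay-as-you-go usage."""
    plan = PLANS[plan_name]
    billable_finetunings = max(0, finetunings - plan.included_finetunings)
    return (
        plan.base_price
        + extra_units * plan.extra_unit_price
        + billable_finetunings * plan.extra_finetuning_price
    )


# Basic plan with one additional server and five fine-tunings in the month.
print(monthly_cost("basic", extra_units=1, finetunings=5))
```

In this example the total is $399 + $379 + 2 x $19 = $816 for the month.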

In-house Models

• Upload and deploy your own existing models, or deploy any AI model from the open-source community

• Deploy either on CPU or GPU servers

• Easily scale your model up or down

• Leverage a robust and highly available infrastructure


Cost depends on your model.
Please contact us.

Sensitive Applications

Specific Region

• Choose a specific continent or country

• Many regions available (US, France, Germany, Asia, Middle East, and more)


+ $249 / month ($8 / day).
Please contact us.

Specific Cloud Provider

• Choose a specific cloud provider

• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)


+ $249 / month ($8 / day).
Please contact us.

Edge AI / On-Premise

• Deploy models in-house within your own infrastructure

• No data is sent to the NLP Cloud servers (no internet connection required)

• Suited for sensitive data (e.g. medical or financial applications)

• You can fine-tune your own model on NLP Cloud and then deploy it on-premise

• A 1-hour consulting session is automatically included


$649 / month (not prorated).
Please contact us.

Consulting

Expert Guidance

Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? We have highly skilled AI experts who will be happy to help you and provide training!


$200 / hour.
Please contact us.

Integration

Do you have an awesome AI project but lack the technical skills to build it? Our technical experts can integrate natural language processing into your application!


$200 / hour.
Please contact us.

Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, rate limiting per hour instead of per minute, and much more! Just let us know.

Plans can also be paid in other currencies. Please ask us for more information if needed.