Pricing

Prices do not include taxes by default. If you are a business registered in the EU or an individual, please contact us so we can apply the correct VAT to your subscription.

Pay-As-You-Go

$0 / month + usage

Use all the pre-trained models. You are invoiced after the fact, based on usage.

No fixed cost: pay only if you consume
Automatically get a $15 FREE credit
All our pre-trained models are available
Asynchronous mode included
Monitor your usage in your dashboard
Parallel requests: 10 (can be increased)

On CPU $0.003 per request

On GPU $0.005 per request

ChatDolphin / Yi-34B / Mixtral-8x7B + $0.0005 per 1K tokens

GPT-OSS 120B / LLaMA 3.1 405B / LLaMA 3.3 70B + $0.0018 per 1K tokens

Whisper + $0.0001 per second (duration of your audio or video file)

Speech T5 + $0.0006 per 1K tokens
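
As a rough illustration of how the pay-as-you-go prices above combine, here is a minimal Python sketch. It assumes that the "+" in front of the per-token and per-second prices means they are added on top of the per-request price, and that "tokens" is whatever token count is billed for the request; neither detail is spelled out above. The function name and the lowercase model keys are hypothetical labels chosen for this example, not API identifiers.

```python
# Prices copied from the pay-as-you-go list above (USD).
PER_REQUEST = {"cpu": 0.003, "gpu": 0.005}

# Hypothetical keys used only in this sketch; values are the listed prices per 1K tokens.
PER_1K_TOKENS = {
    "chatdolphin": 0.0005,
    "yi-34b": 0.0005,
    "mixtral-8x7b": 0.0005,
    "gpt-oss-120b": 0.0018,
    "llama-3.1-405b": 0.0018,
    "llama-3.3-70b": 0.0018,
    "speech-t5": 0.0006,
}
WHISPER_PER_SECOND = 0.0001


def estimate_cost(requests, hardware="gpu", model=None, tokens=0, audio_seconds=0.0):
    """Estimate a pay-as-you-go bill in USD under the assumptions stated above."""
    # Base price: every request is billed at the CPU or GPU rate.
    cost = requests * PER_REQUEST[hardware]
    # Assumed surcharge for token-priced models.
    if model is not None:
        cost += tokens / 1000 * PER_1K_TOKENS[model]
    # Assumed surcharge for Whisper, based on audio or video duration.
    cost += audio_seconds * WHISPER_PER_SECOND
    return round(cost, 4)


# 1,000 GPU requests to LLaMA 3.3 70B using 500,000 tokens in total.
print(estimate_cost(1000, "gpu", "llama-3.3-70b", tokens=500_000))  # 5.9
```

The $15 free credit is not subtracted here; the sketch only estimates gross usage.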

Pre-Paid Plans For Pre-Trained Models

Pre-paid plans are the most cost-effective option if you plan to make a large volume of requests to our pre-trained AI models.

The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).

All pre-paid plans can be stopped at any time. You pay only for the time you use the service, and the invoiced amount is automatically prorated, so if you downgrade you will get a discount on your next invoice. The only exception is the On-Premise plan, which is not prorated.
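
To give a feel for what proration means in practice, the sketch below assumes a simple linear, per-calendar-day proration within a single month. The exact formula used for invoicing is not stated here, so treat both the formula and the function name `prorated_amount` as illustrative assumptions.

```python
import calendar
from datetime import date


def prorated_amount(monthly_price, start, end):
    """Charge for the inclusive date range [start, end] inside one calendar month.

    Assumes linear per-day proration; the real invoicing rule may differ.
    """
    if (start.year, start.month) != (end.year, end.month) or end < start:
        raise ValueError("start and end must be in the same month, with start <= end")
    days_in_month = calendar.monthrange(start.year, start.month)[1]
    days_used = (end - start).days + 1  # both endpoints count as used days
    return round(monthly_price * days_used / days_in_month, 2)


# A $59/month plan used from June 1 to June 15, 2024 (15 of 30 days).
print(prorated_amount(59, date(2024, 6, 1), date(2024, 6, 15)))  # 29.5
```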

The rate limit is in "requests per minute" by default, but you can ask us to change this to "per hour" or "per day". This change is free of charge on the Enterprise plan and above.

On a GPU, AI models run around 10x faster on average.

Not sure which plan is best for you? Ask our support team!

	Free	Starter	Full	Enterprise	Starter GPU	Full GPU	Enterprise GPU	Large Language Models
Price	$0	$29/month ($1/day)	$59/month ($2/day)	$229/month ($7/day)	$99/month ($3.5/day)	$199/month ($7/day)	$699/month ($23/day)	$2,499/month ($80/day)
Parallel Requests	2	10	20	40	10	20	40	50
Whisper on GPU (requests per minute)	Variable				1	3	10	50
GPT-OSS 120B and LLaMA 3 on GPU (requests per minute)	Variable				3	10	70	350
Dolphin/ChatDolphin/Yi-34B/Mixtral-8x7B on GPU (requests per minute)	Variable				10	30	200	1000
Speech T5 on GPU (requests per minute)	Variable				10	30	200	1000
Bart Large CNN on GPU (requests per minute)	Variable				10	30	200	1000
Bart Large CNN on CPU (requests per minute)	Variable	10	30	200
Bart Large MNLI Yahoo Answers on GPU (requests per minute)	Variable				10	30	200	1000
Bart Large MNLI Yahoo Answers on CPU (requests per minute)	Variable	10	30	200
XLM Roberta Large XNLI on GPU (requests per minute)	Variable				10	30	200	1000
XLM Roberta Large XNLI on CPU (requests per minute)	Variable	10	30	200
T5 Base EN Generate Headlines on GPU (requests per minute)	Variable				10	30	200	1000
T5 Base EN Generate Headlines on CPU (requests per minute)	Variable	10	30	200
Distilbert Base SST 2 on GPU (requests per minute)	Variable				10	30	200	1000
Distilbert Base SST 2 on CPU (requests per minute)	Variable	10	30	200
Distilbert Base Emotion on GPU (requests per minute)	Variable				10	30	200	1000
Distilbert Base Emotion on CPU (requests per minute)	Variable	10	30	200
NLLB 200 3.3B on GPU (requests per minute)	Variable				10	30	200	1000
NLLB 200 3.3B on CPU (requests per minute)	Variable	10	30	200
Paraphrase Multilingual Mpnet Base V2 on GPU (requests per minute)	Variable				10	30	200	1000
Paraphrase Multilingual Mpnet Base V2 on CPU (requests per minute)	Variable	10	30	200
SpaCy on CPU (requests per minute)	Variable	10	30	200
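
One way to compare a pre-paid plan with pay-as-you-go is to compute the monthly request count at which the two cost the same. The Python sketch below does this for models billed per request only. It assumes that the Starter, Full, and Enterprise plans should be compared with the CPU pay-as-you-go price and the GPU plans with the GPU price, and it ignores per-token surcharges and the free credit. The function name is hypothetical.

```python
# Pay-as-you-go price per request, in thousandths of a dollar (avoids float rounding).
PAYG_MILLS_PER_REQUEST = {"cpu": 3, "gpu": 5}

# Monthly prices copied from the table above; the hardware pairing is an assumption.
PLANS = {
    "Starter": (29, "cpu"),
    "Full": (59, "cpu"),
    "Enterprise": (229, "cpu"),
    "Starter GPU": (99, "gpu"),
    "Full GPU": (199, "gpu"),
    "Enterprise GPU": (699, "gpu"),
}


def break_even_requests(plan):
    """Smallest monthly request count at which pay-as-you-go costs at least the plan price."""
    price_usd, hardware = PLANS[plan]
    price_mills = price_usd * 1000
    # Ceiling division on integers.
    return -(-price_mills // PAYG_MILLS_PER_REQUEST[hardware])


for name in PLANS:
    print(name, break_even_requests(name))
```

Under these assumptions, Starter GPU breaks even at 19,800 requests per month. Below that volume pay-as-you-go is cheaper, and above it the pre-paid plan is, provided the plan's rate limit covers your traffic.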

Pre-Paid Plans For Custom AI Models

Basic Fine-tuning

$399 / month ($13 / day)

For 1 deployed model on 1 dedicated server

Train/fine-tune your own ChatDolphin/GPT-OSS 120B/LLaMA 3.3 70B/Mixtral-8x7B model
Automatically deploy your model to a basic dedicated GPU server
Maximum context size in each request: 1024 tokens
Parallel requests: 5
Response time: 3 seconds per 50 generated tokens
Price is fixed, no matter the size of your dataset or the number of requests you are making
Pay-as-you-go on all the pre-trained models

(3 fine-tunings per month included for free, then + $19 per fine-tuning)

(+ $379 per additional server or model)

Advanced Fine-tuning

$990 / month ($33 / day)

For 1 deployed model on 1 dedicated server

Train/fine-tune your own ChatDolphin/GPT-OSS 120B/LLaMA 3.3 70B/Mixtral-8x7B model
Automatically deploy your model to a cutting-edge dedicated GPU server
Maximum context size in each request: 16,384 tokens
Parallel requests: 20
Response time: 1 second per 50 generated tokens
Price is fixed, no matter the size of your dataset or the number of requests you are making
Pay-as-you-go on all the pre-trained models

(3 fine-tunings per month included for free, then + $19 per fine-tuning)

(+ $890 per additional server or model)
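
The response times and context limits of the two fine-tuning plans above can be turned into a rough latency estimate. This Python sketch assumes that generation time scales linearly with the number of generated tokens, which is a simplification, and that the context limit applies to prompt plus generated tokens, which is not stated above. The function and dictionary names are hypothetical.

```python
# Figures copied from the Basic and Advanced Fine-tuning plans above.
FINE_TUNING_PLANS = {
    "basic": {"seconds_per_50_tokens": 3, "max_context_tokens": 1024},
    "advanced": {"seconds_per_50_tokens": 1, "max_context_tokens": 16384},
}


def estimate_generation_seconds(plan, prompt_tokens, generated_tokens):
    """Rough response time in seconds, assuming linear scaling with generated tokens."""
    spec = FINE_TUNING_PLANS[plan]
    # Assumption: the context limit covers the prompt and the generated text together.
    if prompt_tokens + generated_tokens > spec["max_context_tokens"]:
        raise ValueError("request exceeds the plan's maximum context size")
    return generated_tokens * spec["seconds_per_50_tokens"] / 50


print(estimate_generation_seconds("basic", 300, 200))     # 12.0
print(estimate_generation_seconds("advanced", 300, 200))  # 4.0
```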

Semantic Search

$299 / month ($10 / day)

For 1 deployed model on 1 dedicated server

Fine-tune your own semantic search model
Automatically deploy your model to a GPU server
Response time: 1 second
Price is fixed, no matter the size of your dataset or the number of requests you are making
Maximum dataset size: 1 million examples
Pay-as-you-go on all the pre-trained models

(3 fine-tunings per month included for free, then + $19 per fine-tuning)

(+ $279 per additional server or model)
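
The monthly cost of a custom-model plan can be estimated from the figures above. The sketch below assumes that the 3 free fine-tunings are counted once per plan (not per server) and that the price per additional server or model is a monthly charge; both points should be confirmed with support. The function name is hypothetical.

```python
# name: (base monthly price, monthly price per additional server or model), in USD.
CUSTOM_PLANS = {
    "Basic Fine-tuning": (399, 379),
    "Advanced Fine-tuning": (990, 890),
    "Semantic Search": (299, 279),
}
FREE_FINE_TUNINGS = 3          # included each month
EXTRA_FINE_TUNING_PRICE = 19   # per fine-tuning beyond the free ones


def monthly_cost(plan, fine_tunings=0, extra_servers=0):
    """Estimated monthly cost in USD under the assumptions stated above."""
    base, per_extra = CUSTOM_PLANS[plan]
    billable_fine_tunings = max(0, fine_tunings - FREE_FINE_TUNINGS)
    return (
        base
        + extra_servers * per_extra
        + billable_fine_tunings * EXTRA_FINE_TUNING_PRICE
    )


# Basic Fine-tuning with 1 additional server and 5 fine-tunings in the month.
print(monthly_cost("Basic Fine-tuning", fine_tunings=5, extra_servers=1))  # 816
```

Pay-as-you-go usage of the pre-trained models is billed separately and is not included in this estimate.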

Sensitive Applications

Specific Region

+ $249 / month ($8 / day)

Choose a specific continent or country
Many regions available (US, France, Germany, Asia, Middle East, and more)

Please contact us.

Specific Cloud Provider

+ $249 / month ($8 / day)

Choose a specific cloud provider
Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)

Please contact us.

Edge AI / On-Premise

$649 / month (not prorated)

Deploy models in-house within your own infrastructure
No data is sent to the NLP Cloud servers (no internet connection required)
Suited for sensitive data (e.g. medical or financial applications)
You can fine-tune your own model on NLP Cloud and then deploy it on-premise
A 1-hour consulting session is automatically included

Please contact us.

Consulting

Expert Guidance

$200 / hour

Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? Our highly skilled AI experts will be happy to help you and provide training!

Please contact us.

Integration

$200 / hour

Do you have an awesome AI project but lack the technical skills to build it? Our technical experts can integrate natural language processing into your application!

Please contact us.

Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, rate limits per hour instead of per minute, and much more! Just let us know.

Plans can also be paid in other currencies. Please ask us for more information if needed.