Text understanding/generation (NLP),
ready for production, at a fair price.
Fine-tune and deploy your own AI models.
No DevOps required.
We open-sourced an "instruct" version of GPT-J that understands human instructions in natural language: feel free to download it and try it!
Fast and accurate AI models suited for production. Highly-available inference API leveraging the most advanced GPUs.
NLP Cloud is HIPAA / GDPR / CCPA compliant, and working on the SOC 2 certification. We cannot see your data, we do not store your data, and we do not use your data to train our own AI models.
Use all NLP Cloud's AI models in 200 languages, thanks to our multilingual models and our multilingual addon.
Do not worry about DevOps or API programming and focus on text processing only. Deliver your AI project in no time.
NLP Cloud selected the best language processing (NLP) models from the community and deployed them for you.
Fine-tune your own models or upload your in-house custom models, and deploy them easily to production.
"We spent a lot of energy fine-tuning our machine learning models, but we clearly underestimated the go-live process. NLP Cloud saved us a lot of time, and prices are really affordable."
Patrick, CTO at MatchMaker
"We use NLP Cloud's ChatDolphin model. It is very impressive and on par with OpenAI ChatGPT. Great thing is that it can be deployed on-premise, which is something we might consider in the future for privacy and compliance reasons."
Marc, Software Engineer
"We did the maths: developing the API by ourself, and then creating and maintaining the production platform for our entity extraction models, would have taken around 2 months of work. Finally we did the same thing in 1 hour for a very fair price by using NLP Cloud."
John, CTO
"We had developed a working API deployed with Docker for our model, but we quickly faced performance and scalability issues. After spending weeks on this we eventually went for this cloud solution and we haven't regretted it so far!"
Maria, CSO at CybelAI
"We eventually gave up on fine-tuning GPT-J... We are now exclusively fine-tuning and deploying GPT-J on NLP Cloud and we are happy like this."
Whalid, Lead Dev at Direct IT
"The NLP Cloud API has been extremely reliable and the support team is very nice and reactive."
Bogdan, Data Scientist at Alternative.io
See this customer reference: LAO using our classification API for automatic support ticket triaging
NLP Cloud is an NVIDIA partner
```
curl https://api.nlpcloud.io/v1/en_core_web_lg/entities \
  -X POST -d '{"text":"John Doe is a Go Developer at Google"}'
```

Response:

```json
[
  { "end": 8, "start": 0, "text": "John Doe", "type": "PERSON" },
  { "end": 25, "start": 13, "text": "Go Developer", "type": "POSITION" },
  { "end": 35, "start": 30, "text": "Google", "type": "ORG" }
]
```
```
curl https://api.nlpcloud.io/v1/bart-large-mnli-yahoo-answers/classification \
  -X POST -d '{
    "text":"John Doe is a Go Developer at Google. He has been working there for 10 years and has been awarded employee of the year.",
    "labels":["job", "nature", "space"],
    "multi_class": true
  }'
```

Response:

```json
{
  "labels": ["job", "space", "nature"],
  "scores": [0.9258800745010376, 0.1938474327325821, 0.010988450609147549]
}
```
```
curl https://api.nlpcloud.io/v1/roberta-base-squad2/question \
  -X POST -d '{
    "context":"French president Emmanuel Macron said the country was at war with an invisible, elusive enemy, and the measures were unprecedented, but circumstances demanded them.",
    "question":"Who is the French president?"
  }'
```

Response:

```json
{ "answer": "Emmanuel Macron", "score": 0.9595934152603149, "start": 17, "end": 32 }
```
```
curl https://api.nlpcloud.io/v1/distilbert-finetuned-sst-2-english/sentiment \
  -X POST -d '{"context":"NLP Cloud proposes an amazing service!"}'
```

Response:

```json
{ "scored_labels": [ { "label": "POSITIVE", "score": 0.9996881484985352 } ] }
```
```
curl https://api.nlpcloud.io/v1/bart-large-cnn/summarization \
  -X POST -d '{"text":"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."}'
```

Response:

```json
{ "summary_text": "The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world." }
```
```
curl https://api.nlpcloud.io/v1/gpt-j/generation \
  -X POST -d '{
    "text":"GPT-J is a powerful NLP model",
    "min_length":10,
    "max_length":30
  }'
```

Response:

```json
{ "generated_text": "GPTJ is a powerful NLP model for text generation. This is the open-source version of GPT-3 by OpenAI. It is the most advanced NLP model created as of today." }
```
```
curl https://api.nlpcloud.io/v1/opus-mt-en-fr/translation \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999."}'
```

Response:

```json
{ "translation_text": "John Doe travaille pour Microsoft à Seattle depuis 1999." }
```
```
curl https://api.nlpcloud.io/v1/python-langdetect/langdetection \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999. Il parle aussi un peu français."}'
```

Response:

```json
{ "languages": [ { "en": 0.7142834369645996 }, { "fr": 0.28571521669868466 } ] }
```
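The same endpoints can be called from any HTTP client. Below is a minimal Python sketch using only the standard library; the `Token` authorization header format and the `YOUR_API_TOKEN` placeholder are assumptions for illustration, not copied from the examples above.

```python
import json
import urllib.request

API_ROOT = "https://api.nlpcloud.io/v1"

def build_request(model, endpoint, payload, token):
    """Build (but do not send) a POST request for an NLP Cloud endpoint."""
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",  # header format is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )

# The entity-extraction call from the examples above.
req = build_request(
    "en_core_web_lg", "entities",
    {"text": "John Doe is a Go Developer at Google"},
    "YOUR_API_TOKEN",  # placeholder, replace with your own token
)
print(req.full_url)  # https://api.nlpcloud.io/v1/en_core_web_lg/entities
# Send it with: urllib.request.urlopen(req)
```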
Use Case | Model Used | Links |
---|---|---|
Automatic Speech Recognition (speech to text): extract text from an audio or video file, with automatic language detection, and automatic punctuation, in 100 languages. | We use OpenAI's Whisper Large model. | See Docs Test Now |
Classification: send a piece of text, and let the AI apply the right categories to your text, in many languages. As an option, you can suggest the potential categories you want to assess. | We use Joe Davison's Bart Large MNLI Yahoo Answers and XLM Roberta Large XNLI, as well as GPT-J, GPT-NeoX, and Dolphin, for classification in 100 languages. For classification without suggested categories, use GPT-J, GPT-NeoX, or Dolphin. | See Docs Test Now |
Chatbot/Conversational AI: discuss fluently with an AI and get relevant answers, in many languages. | We use GPT-J, GPT-NeoX, and Dolphin. They are powerful alternatives to OpenAI GPT-3. | See Docs Test Now |
Code generation: generate source code out of a simple instruction, in any programming language. | We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Dialogue Summarization: summarize a conversation, in many languages. | We use Bart Large CNN SamSum. | See Docs Test Now |
Embeddings: calculate embeddings in more than 50 languages. | We use several Sentence Transformers models like Paraphrase Multilingual Mpnet Base V2. We also use GPT-J and Dolphin. | See Docs |
Grammar and spelling correction: send a block of text and let the AI correct the mistakes for you, in many languages. | We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Headline generation: send a text, and get a very short summary suited for headlines, in many languages. | We use Michau's T5 Base EN Generate Headline. | See Docs Test Now |
Image Generation/Text To Image: generate an image out of a simple text instruction. | We use Stability AI's Stable Diffusion model. It is a powerful alternative to OpenAI DALL-E 2 or MidJourney. | See Docs Test Now |
Intent Classification: understand the intent from a piece of text, in many languages. | We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Keywords and keyphrases extraction: extract the main keywords from a piece of text, in many languages. | We use GPT-J, GPT-NeoX, and Dolphin. They are powerful alternatives to OpenAI GPT-3. | See Docs Test Now |
Language Detection: detect one or several languages from a text. | We use Python's LangDetect library. | See Docs Test Now |
Lemmatization: extract lemmas from a text, in many languages. | All the large spaCy models are available. | See Docs |
Named Entity Recognition (NER): extract structured information from an unstructured text, like names, companies, countries, job titles... in many languages. | You can perform NER with all the large spaCy models. You can also use GPT-J, GPT-NeoX, and Dolphin, which are powerful alternatives to OpenAI GPT-3. | See Docs Test Now |
Noun Chunks: extract noun chunks from a text, in many languages. | All the large spaCy models are available. | See Docs |
Paraphrasing and rewriting: generate a similar content with the same meaning, in many languages. | We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Part-Of-Speech (POS) tagging: assign parts of speech to each word of your text, in many languages. | All the large spaCy models are available. | See Docs |
Product description and ad generation: generate one sentence or several paragraphs containing specific keywords for your product descriptions or ads, in many languages. | We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Question answering: ask questions about anything, in many languages. As an option you can give a context so the AI uses this context to answer your question. | We use Deepset's Roberta Base Squad 2. We also use GPT-J, GPT-NeoX, and ChatDolphin, which are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Semantic Search: search your own data, in more than 50 languages. | Create your own semantic search model, based on Sentence Transformers, out of your own domain knowledge (internal documentation, contracts...) and ask semantic questions on it. | See Docs Test Now |
Semantic Similarity: detect whether 2 pieces of text have the same meaning or not, in more than 50 languages. | We use Paraphrase Multilingual Mpnet Base V2. | See Docs Test Now |
Sentiment and emotion analysis: determine sentiments and emotions from a text (positive, negative, fear, joy...), in many languages. We also have an AI for financial sentiment analysis. | We use DistilBERT Base Uncased Finetuned SST-2, DistilBERT Base Uncased Emotion, and Prosus AI's Finbert. | See Docs Test Now |
Speech Synthesis (Text-To-Speech): convert text to audio. | We use Microsoft Speech T5. | See Docs Test Now |
Summarization: send a text, and get a smaller text keeping essential information only, in many languages. | We use Facebook's Bart Large CNN. We also use GPT-J, GPT-NeoX, and ChatDolphin, which are powerful alternatives to OpenAI GPT-3 and ChatGPT. | See Docs Test Now |
Text generation: achieve all the most advanced AI use cases by either making requests in natural language ("instruct" requests) or using few-shot learning. | We use GPT-J, GPT-NeoX, Dolphin, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. You can also fine-tune your own text generation model for even better results. | See Docs Test Now |
Tokenization: extract tokens from a text, in many languages. | All the large spaCy models are available. | See Docs |
Translation: translate text in 200 languages with automatic input language detection. | We use Facebook's NLLB 200 3.3B for translation in 200 languages. | See Docs Test Now |
Looking for a specific use case or AI model that is not in the list above? Please let us know! Implementation can be very quick on our side.
Upload or train/fine-tune your own AI models from your dashboard, and use them straight away in production without worrying about deployment considerations like RAM usage, high availability, or scalability. You can upload and deploy as many models as you want to production.
All plans can be stopped anytime. You only pay for the time you use the service.
The invoiced amount is automatically prorated. In case of a downgrade, you will get a discount on your next invoice.
Prices do not include taxes by default. If you are a business registered in the EU or an individual, please contact us so we can apply the correct VAT to your subscription.
Not sure which plan is best for you? Just ask our support team!
Pre-paid plans are the most cost-effective solution if you plan to make a large volume of requests to our pre-trained AI models.
The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).
The rate limit is in "requests per minute" by default, but you can ask us to change this to "per hour" or "per day". This change is free of charge on the Enterprise plan and above.
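When a client exceeds its rate limit, the usual pattern is to back off and retry. The sketch below is a generic exponential-backoff helper; the HTTP 429 status code and the `send` callable are illustrative assumptions, not documented NLP Cloud behavior.

```python
import time

def call_with_backoff(send, max_retries=5, base_delay=1.0):
    """Retry `send()` with exponential backoff while it signals a 429 rate limit.

    `send` is any zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("rate limit still hit after retries")

# Simulated endpoint: rate-limited twice, then succeeds.
responses = iter([(429, ""), (429, ""), (200, "ok")])
status, body = call_with_backoff(lambda: next(responses), base_delay=0.01)
print(status, body)  # 200 ok
```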
On a GPU, AI models are around 10X faster on average.
| Free | Starter | Starter GPU | Full | Full GPU | Enterprise | Enterprise GPU | Large Language Models |
---|---|---|---|---|---|---|---|---
Price | $0 | $29/month ($1/day) | $99/month ($3.5/day) | $59/month ($2/day) | $199/month ($7/day) | $499/month ($16/day) | $699/month ($23/day) | $2,499/month ($80/day) |
Parallel Requests | 2 | 4 | 4 | 8 | 8 | 15 | 15 | 20 |
Fine-tuned GPT-NeoX 20B on GPU (requests per minute) | Variable | 3 | 30 | |||||
Whisper on GPU (requests per minute) | Variable | 5 | 50 | |||||
Stable Diffusion on GPU (requests per minute) | Variable | 5 | 50 | |||||
Speech T5 on GPU (requests per minute) | Variable | 5 | 50 | |||||
GPT-NeoX 20B on GPU (requests per minute) | Variable | 1 | 8 | 80 | ||||
Fast GPT-J on GPU (requests per minute) | Variable | 1 | 3 | 10 | 100 | |||
Dolphin/ChatDolphin on GPU (requests per minute) | Variable | 2 | 6 | 20 | 200 | |||
GPT-J on GPU (requests per minute) | Variable | 3 | 10 | 50 | 500 | |||
GPT-J on CPU (requests per minute) | Variable | 3 | 10 | 50 | ||||
Bart Large CNN on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large CNN on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large MNLI Yahoo Answers on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large MNLI Yahoo Answers on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large SamSum on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large SamSum on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
XLM Roberta Large XNLI on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
XLM Roberta Large XNLI on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
T5 Base EN Generate Headlines on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
T5 Base EN Generate Headlines on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Roberta Base Squad 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Roberta Base Squad 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base SST 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base SST 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base Emotion on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base Emotion on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Finbert on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Finbert on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
NLLB 200 3.3B on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
NLLB 200 3.3B on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Paraphrase Multilingual Mpnet Base V2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Paraphrase Multilingual Mpnet Base V2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
SpaCy on CPU (requests per minute) | Variable | 10 | 30 | 150 |
Use all the pre-trained models. You are invoiced after the fact, based on usage.
Note: this plan is perfect for testing or very irregular usage. For production use, we encourage you to select one of the pre-paid plans above, which are less costly.
• No fixed cost: pay only if you consume
• Automatically get a $15 FREE credit
• All our pre-trained models are available
• Asynchronous mode included
• Monitor your usage in your dashboard
• Parallel requests: 4 (can be increased)
On CPU: $0.003 per request
On GPU: $0.005 per request
GPT-J: + $0.00001 per token
Dolphin/ChatDolphin: + $0.00002 per token
Fast GPT-J: + $0.00003 per token
GPT-NeoX 20B: + $0.00004 per token
Fine-tuned GPT-NeoX 20B: + $0.00007 per token
Stable Diffusion: + $0.05 per generated image
Whisper: + $0.0001 per second (duration of your audio or video file)
Speech T5: + $0.00003 per token
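To see how the pay-as-you-go rates above combine, here is a small cost estimator. The per-request and per-token rates are copied from the list above; the request volume and token count in the example are hypothetical.

```python
# Per-request base rates and per-token surcharges, from the price list above (USD).
BASE = {"cpu": 0.003, "gpu": 0.005}
PER_TOKEN = {
    "gpt-j": 0.00001,
    "dolphin": 0.00002,
    "fast-gpt-j": 0.00003,
    "gpt-neox-20b": 0.00004,
    "fine-tuned-gpt-neox-20b": 0.00007,
}

def request_cost(model, tokens, hardware="gpu"):
    """Estimated cost of one text-generation request, in USD."""
    return BASE[hardware] + PER_TOKEN[model] * tokens

# Hypothetical example: 10,000 GPT-J requests of 200 tokens each, on GPU.
monthly = 10_000 * request_cost("gpt-j", 200)
print(f"${monthly:.2f}")  # $70.00
```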
• Train/fine-tune your own GPT-J or Dolphin model
• Automatically deploy your model to a basic dedicated GPU server
• Maximum context size in each request: 1024 tokens
• Parallel requests: 1
• Response time: 3 seconds per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$399 / month ($13 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $379 per additional server or model)
• Train/fine-tune your own GPT-J or Dolphin model
• Automatically deploy your model to a cutting-edge dedicated GPU server
• Maximum context size in each request: 2048 tokens
• Parallel requests: 2
• Response time: 1 second per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$990 / month ($33 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $890 per additional server or model)
• Fine-tune your own semantic search model
• Automatically deploy your model to a GPU server
• Response time: 1 second
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Maximum dataset size: 1 million examples
• Pay-as-you-go on all the pre-trained models
$299 / month ($10 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $279 per additional server or model)
• Upload and deploy your own existing models, or deploy any AI model from the open-source community
• Deploy either on CPU or GPU servers
• Easily scale your model up or down
• Leverage a robust and highly-available infrastructure
Please contact us
• Choose a specific continent or country
• Many regions available (US, France, Germany, Asia, Middle-East, and more)
Please contact us
• Choose a specific cloud provider
• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)
Please contact us
• Deploy models in-house within your own infrastructure
• No business data is sent to the NLP Cloud servers
• Suited for sensitive data (e.g. medical applications, financial applications...)
Please contact us
Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? We have highly skilled AI experts who will be happy to help you and provide training!
Please contact us
Do you have an awesome AI project but you are lacking the technical skills to achieve it? Our technical experts can work on integrating natural language processing into your application!
Please contact us
Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, a rate limiting per hour instead of per minute, and much more! Just let us know.
Plans can also be paid in other currencies. Please ask us for more information if needed.
If you already have an account, send us a message from your dashboard.
Otherwise, send us an email here: [email protected].
NLP Cloud treats the safety of your data and your privacy as a major concern. To keep the platform and your data safe, we continuously invest in our infrastructure and processes. Below is only a portion of the security protocols we use. If you'd like to discuss how NLP Cloud can conform to your compliance requirements, please contact us!
NLP Cloud production data is processed and stored in reliable cloud services and corporate data centers.
Data stored for long-term use is safeguarded with encryption.
Firewalls and hardened system settings protect all NLP Cloud servers and databases. Furthermore, all of our production servers run Linux.
NLP Cloud only stores a hashed version of your password, following the PBKDF2 algorithm with a SHA256 hash.
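PBKDF2 with SHA-256 is available in Python's standard library, so the scheme is easy to illustrate. In this hash-and-verify sketch the salt size and iteration count are illustrative choices, not NLP Cloud's actual parameters.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    """Derive a PBKDF2-HMAC-SHA256 digest; only salt + digest are stored, never the password."""
    salt = salt or os.urandom(16)  # 16-byte random salt (illustrative size)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=100_000):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("wrong", salt, digest))   # False
```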
NLP Cloud maintains extensive security protocols covering multiple areas. These protocols are regularly updated and shared with all collaborators.
Every employee understands the security protocols and regulations and participates in frequent training programs. Only a limited set of system administrators is allowed to access the NLP Cloud servers.
NLP Cloud maintains regular backups of information and regularly assesses its ability to restore the data in the event of a major issue.
NLP Cloud implements strong guidelines to strike a balance between control and speed when changing system configurations.
We use outside security specialists to conduct thorough examinations of the NLP Cloud system.