Advanced Artificial Intelligence API

Text understanding and generation (NLP), ready for production, at a fair price.
Fine-tune and deploy your own AI models.
No DevOps required.

We open-sourced an "instruct" version of GPT-J that understands human instructions in natural language: feel free to download it and try it!

High Performance

Fast and accurate AI models suited for production. Highly-available inference API leveraging the most advanced GPUs.

Pre-trained AI Models

We selected the best natural language processing (NLP) models from the community and deployed them for you.

Custom Models

Fine-tune your own models or upload your in-house custom models, and deploy them easily to production.

Data Privacy

We are HIPAA / GDPR / CCPA compliant. No data is stored on our servers. We offer specific plans for sensitive applications.

No Complexity

Do not worry about DevOps or API programming and focus on text processing only. Deliver your AI project in no time.

Multilingual AI

Use all our AI models in many languages, thanks to our multilingual models and our multilingual add-on.

Deliver Your AI Projects For Good

BBVA
Johnson & Johnson
Zapier
GSK
Generali
Schneider
General Electric
Dell
Zoom
PWC
Lufthansa
Deloitte

"We spent a lot of energy fine-tuning our machine learning models, but we clearly underestimated the go-live process. NLP Cloud saved us a lot of time, and prices are really affordable."

Patrick, CTO at MatchMaker

"We use NLP Cloud's ChatDolphin model. It is very impressive and on par with OpenAI ChatGPT. Great thing is that it can be deployed on-premise, which is something we might consider in the future for privacy and compliance reasons."

Marc, Software Engineer

"We did the maths: developing the API by ourself, and then creating and maintaining the production platform for our entity extraction models, would have taken around 2 months of work. Finally we did the same thing in 1 hour for a very fair price by using NLP Cloud."

John, CTO

"We had developed a working API deployed with Docker for our model, but we quickly faced performance and scalability issues. After spending weeks on this we eventually went for this cloud solution and we haven't regretted it so far!"

Maria, CSO at CybelAI

"We eventually gave up on fine-tuning GPT-J... We are now exclusively fine-tuning and deploying GPT-J on NLP Cloud and we are happy like this."

Whalid, Lead Dev at Direct IT

"The NLP Cloud API has been extremely reliable and the support team is very nice and reactive."

Bogdan, Data Scientist at Alternative.io

See this customer reference: LAO uses our classification API for automatic support ticket triage.

NLP Cloud is an NVIDIA partner

NLP Cloud is a member of the NVIDIA Inception Program

curl https://api.nlpcloud.io/v1/en_core_web_lg/entities \
  -X POST -d '{"text":"John Doe is a Go Developer at Google"}'

[
  { "end": 8, "start": 0, "text": "John Doe", "type": "PERSON" },
  { "end": 25, "start": 13, "text": "Go Developer", "type": "POSITION" },
  { "end": 35, "start": 30, "text": "Google", "type": "ORG" }
]

curl https://api.nlpcloud.io/v1/bart-large-mnli-yahoo-answers/classification \
  -X POST -d '{
    "text":"John Doe is a Go Developer at Google. He has been working there for 10 years and has been awarded employee of the year.",
    "labels":["job", "nature", "space"],
    "multi_class": true
  }'

{
  "labels":["job", "space", "nature"],
  "scores":[0.9258800745010376, 0.1938474327325821, 0.010988450609147549]
}

curl https://api.nlpcloud.io/v1/roberta-base-squad2/question \
  -X POST -d '{
    "context":"French president Emmanuel Macron said the country was at war with an invisible, elusive enemy, and the measures were unprecedented, but circumstances demanded them.",
    "question":"Who is the French president?"
  }'

{
  "answer":"Emmanuel Macron",
  "score":0.9595934152603149,
  "start":17,
  "end":32
}

curl https://api.nlpcloud.io/v1/distilbert-finetuned-sst-2-english/sentiment \
  -X POST -d '{"context":"NLP Cloud proposes an amazing service!"}'

{
  "scored_labels":[
    { "label":"POSITIVE", "score":0.9996881484985352 }
  ]
}

curl https://api.nlpcloud.io/v1/bart-large-cnn/summarization \
  -X POST -d '{"text":"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."}'

{
  "summary_text":"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world."
}

curl https://api.nlpcloud.io/v1/gpt-j/generation \
  -X POST -d '{
    "text":"GPT-J is a powerful NLP model",
    "min_length":10,
    "max_length":30
  }'

{
  "generated_text":"GPTJ is a powerful NLP model for text generation. This is the open-source version of GPT-3 by OpenAI. It is the most advanced NLP model created as of today."
}

curl https://api.nlpcloud.io/v1/opus-mt-en-fr/translation \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999."}'

{
  "translation_text": "John Doe travaille pour Microsoft à Seattle depuis 1999."
}

curl https://api.nlpcloud.io/v1/python-langdetect/langdetection \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999. Il parle aussi un peu français."}'

{
  "languages": [
    { "en": 0.7142834369645996 },
    { "fr": 0.28571521669868466 }
  ]
}
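
The curl calls above all follow the same pattern: a JSON body is POSTed to https://api.nlpcloud.io/v1/<model>/<endpoint>. As a minimal sketch, the same request could be prepared in Python with the standard library alone. The helper name build_request is hypothetical and the Content-Type header is an assumption. The examples above show no authentication, so none is included here; the documentation should be checked for what a real account requires.

```python
import json
import urllib.request

# Base URL taken from the curl examples above.
API_ROOT = "https://api.nlpcloud.io/v1"


def build_request(model: str, endpoint: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for one model endpoint.

    Hypothetical helper for illustration; not part of any official client.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}/{endpoint}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


req = build_request(
    "en_core_web_lg",
    "entities",
    {"text": "John Doe is a Go Developer at Google"},
)
print(req.get_method(), req.full_url)

# Sending the request needs network access and valid credentials:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it keeps the sketch usable offline and makes the URL and body easy to inspect.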

Use Cases and Models Used
Automatic Speech Recognition (speech to text): extract text from an audio or video file, with automatic language detection, and automatic punctuation, in 100 languages. We use OpenAI's Whisper Large model.
Blog Post Generation: create a whole blog article out of a simple title, from 800 to 1500 words, and containing basic HTML tags. We use GPT-J, a powerful alternative to OpenAI GPT-3.
Classification: send a piece of text, and let the AI apply the right categories to your text, in many languages. As an option, you can suggest the potential categories you want to assess. We use Joe Davison's Bart Large MNLI Yahoo Answers, Joe Davison's XLM Roberta Large XNLI, and GPT-J, GPT-NeoX, and Dolphin, for classification in 100 languages. For classification without potential categories, use GPT-J/GPT-NeoX/Dolphin.
Chatbot/Conversational AI: discuss fluently with an AI and get relevant answers, in many languages. We use GPT-J, GPT-NeoX, and Dolphin. They are powerful alternatives to OpenAI GPT-3.
Code generation: generate source code out of a simple instruction, in any programming language. We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Dialogue Summarization: summarize a conversation, in many languages. We use Bart Large CNN SamSum.
Embeddings: calculate embeddings in more than 50 languages. We use several Sentence Transformers models like Paraphrase Multilingual Mpnet Base V2. We also use GPT-J and Dolphin.
Grammar and spelling correction: send a block of text and let the AI correct the mistakes for you, in many languages. We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Headline generation: send a text, and get a very short summary suited for headlines, in many languages. We use Michau's T5 Base EN Generate Headline.
Image Generation/Text To Image: generate an image out of a simple text instruction. We use Stability AI's Stable Diffusion model. It is a powerful alternative to OpenAI DALL-E 2 or MidJourney.
Intent Classification: understand the intent from a piece of text, in many languages. We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Keywords and keyphrases extraction: extract the main keywords from a piece of text, in many languages. We use GPT-J, GPT-NeoX, and Dolphin. They are powerful alternatives to OpenAI GPT-3.
Language Detection: detect one or several languages from a text. We use Python's LangDetect library.
Lemmatization: extract lemmas from a text, in many languages. All the large spaCy models are available.
Named Entity Recognition (NER): extract structured information from an unstructured text, like names, companies, countries, job titles... in many languages. You can perform NER with all the large spaCy models. You can also use GPT-J, GPT-NeoX, and Dolphin, which are powerful alternatives to OpenAI GPT-3.
Noun Chunks: extract noun chunks from a text, in many languages. All the large spaCy models are available.
Paraphrasing and rewriting: generate similar content with the same meaning, in many languages. We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Part-Of-Speech (POS) tagging: assign parts of speech to each word of your text, in many languages. All the large spaCy models are available.
Product description and ad generation: generate one sentence or several paragraphs containing specific keywords for your product descriptions or ads, in many languages. We use GPT-J, GPT-NeoX, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Question answering: ask questions about anything, in many languages. As an option you can give a context so the AI uses this context to answer your question. We use Deepset's Roberta Base Squad 2. We also use GPT-J, GPT-NeoX, and ChatDolphin, which are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Semantic Search: search your own data, in more than 50 languages. Create your own semantic search model, based on Sentence Transformers, out of your own domain knowledge (internal documentation, contracts...) and ask semantic questions on it.
Semantic Similarity: detect whether 2 pieces of text have the same meaning or not, in more than 50 languages. We use Paraphrase Multilingual Mpnet Base V2.
Sentiment and emotion analysis: determine sentiments and emotions from a text (positive, negative, fear, joy...), in many languages. We also have an AI for financial sentiment analysis. We use DistilBERT Base Uncased Finetuned SST-2, DistilBERT Base Uncased Emotion, and Prosus AI's Finbert.
Speech Synthesis (Text-To-Speech): convert text to audio. We use Microsoft Speech T5.
Summarization: send a text, and get a smaller text keeping essential information only, in many languages. We use Facebook's Bart Large CNN. We also use GPT-J, GPT-NeoX, and ChatDolphin, which are powerful alternatives to OpenAI GPT-3 and ChatGPT.
Text generation: achieve all the most advanced AI use cases by either making requests in natural language ("instruct" requests) or using few-shot learning. We use GPT-J, GPT-NeoX, Dolphin, and ChatDolphin. They are powerful alternatives to OpenAI GPT-3 and ChatGPT. You can also fine-tune your own text generation model for even better results.
Tokenization: extract tokens from a text, in many languages. All the large spaCy models are available.
Translation: translate text in 200 languages with automatic input language detection. We use Facebook's NLLB 200 3.3B.

Looking for a specific use case or AI model that is not in the list above? Please let us know!
Implementation can be very quick on our side.

Clean API

NLP Cloud provides you with a simple and robust API.

Integration is easy, and we take care of the high availability of AI models, no matter the load.

Client libraries are available in several languages: Python, Ruby, Go, Node.js, and PHP (more to come).

See the documentation for more details.

Fine-Tune and Deploy your Own Models

Upload or Train/Fine-Tune your own AI models from your dashboard, and use them straight away in production without worrying about deployment considerations like RAM usage, high-availability, scalability... You can upload and deploy as many models as you want to production.

Pricing

All plans can be stopped anytime. You only pay for the time you use the service.

The invoiced amount is automatically prorated. In case of a downgrade, you will get a discount on your next invoice.

Prices do not include taxes by default. If you are a business registered in the EU or an individual, please contact us so we can apply the correct VAT to your subscription.

Not sure which plan is best for you? Just ask our support team!

Pre-Paid Plans For Pre-Trained Models

Pre-paid plans are the most cost-effective solution if you plan to make a large volume of requests on our pre-trained AI models.

The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).

The rate limit is in "requests per minute" by default, but you can ask us to change this to "per hour" or "per day". This change is free of charge on the Enterprise plan and above.
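
To stay under a "requests per minute" limit, a client can throttle itself before sending. The sketch below is a simple sliding-window limiter. The class name MinuteRateLimiter is hypothetical, and how the limit is actually counted on the server side (fixed or sliding window) is an assumption not stated here.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter: at most `limit` calls per 60 seconds.

    Hypothetical helper; the server's own counting method may differ.
    """

    def __init__(self, limit: int, clock=time.monotonic):
        self.limit = limit
        self.clock = clock    # injectable clock, which makes testing easy
        self.sent = deque()   # timestamps of requests in the current window

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that are 60 seconds old or older.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest request in the window must age out first.
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())


limiter = MinuteRateLimiter(limit=3)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # call this right after each request is sent
```

For a "per hour" or "per day" limit, the same idea applies with a longer window.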

On a GPU, AI models are around 10X faster on average.

Plan | Free | Starter | Starter GPU | Full | Full GPU | Enterprise | Enterprise GPU | Large Language Models
Price | $0 | $29/month ($1/day) | $99/month ($3.5/day) | $59/month ($2/day) | $199/month ($7/day) | $499/month ($16/day) | $699/month ($23/day) | $2,499/month ($80/day)
Parallel Requests | 2 | 4 | 4 | 8 | 8 | 15 | 15 | 20
Fine-tuned GPT-NeoX 20B on GPU (requests per minute) Variable 3 30
Whisper on GPU (requests per minute) Variable 5 50
Stable Diffusion on GPU (requests per minute) Variable 5 50
Speech T5 on GPU (requests per minute) Variable 5 50
GPT-NeoX 20B on GPU (requests per minute) Variable 1 8 80
Fast GPT-J on GPU (requests per minute) Variable 1 3 10 100
Dolphin/ChatDolphin on GPU (requests per minute) Variable 2 6 20 200
GPT-J on GPU (requests per minute) Variable 3 10 50 500
GPT-J on CPU (requests per minute) Variable 3 10 50
Bart Large CNN on GPU (requests per minute) Variable 10 30 150 150
Bart Large CNN on CPU (requests per minute) Variable 10 30 150
Bart Large MNLI Yahoo Answers on GPU (requests per minute) Variable 10 30 150 150
Bart Large MNLI Yahoo Answers on CPU (requests per minute) Variable 10 30 150
Bart Large SamSum on GPU (requests per minute) Variable 10 30 150 150
Bart Large SamSum on CPU (requests per minute) Variable 10 30 150
XLM Roberta Large XNLI on GPU (requests per minute) Variable 10 30 150 150
XLM Roberta Large XNLI on CPU (requests per minute) Variable 10 30 150
T5 Base EN Generate Headlines on GPU (requests per minute) Variable 10 30 150 150
T5 Base EN Generate Headlines on CPU (requests per minute) Variable 10 30 150
Roberta Base Squad 2 on GPU (requests per minute) Variable 10 30 150 150
Roberta Base Squad 2 on CPU (requests per minute) Variable 10 30 150
Distilbert Base SST 2 on GPU (requests per minute) Variable 10 30 150 150
Distilbert Base SST 2 on CPU (requests per minute) Variable 10 30 150
Distilbert Base Emotion on GPU (requests per minute) Variable 10 30 150 150
Distilbert Base Emotion on CPU (requests per minute) Variable 10 30 150
Finbert on GPU (requests per minute) Variable 10 30 150 150
Finbert on CPU (requests per minute) Variable 10 30 150
NLLB 200 3.3B on GPU (requests per minute) Variable 10 30 150 150
NLLB 200 3.3B on CPU (requests per minute) Variable 10 30 150
Paraphrase Multilingual Mpnet Base V2 on GPU (requests per minute) Variable 10 30 150 150
Paraphrase Multilingual Mpnet Base V2 on CPU (requests per minute) Variable 10 30 150
SpaCy on CPU (requests per minute) Variable 10 30 150

Pay-As-You-Go

Use all the pre-trained models. You are invoiced after the fact, based on usage.

Note: this plan is perfect for testing or very irregular usage. For production use, we encourage you to select the pre-paid plans above that are less costly.

• No fixed cost: pay only if you consume

• Automatically get a $15 FREE credit

• All our pre-trained models are available

• Asynchronous mode included

• Monitor your usage in your dashboard

• Parallel requests: 4 (can be increased)

On CPU: $0.003 per request

On GPU: $0.005 per request

GPT-J: + $0.00001 per token

Dolphin/ChatDolphin: + $0.00002 per token

Fast GPT-J: + $0.00003 per token

GPT-NeoX 20B: + $0.00004 per token

Fine-tuned GPT-NeoX 20B: + $0.00007 per token

Stable Diffusion: + $0.05 per generated image

Whisper: + $0.0001 per second (duration of your audio or video file)

Speech T5: + $0.00003 per token
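
As a rough illustration of how these prices combine, the sketch below computes the cost of a single request. It assumes that each "+" line is a surcharge added on top of the per-request CPU or GPU price, and it does not settle which tokens are counted. The function name request_cost and the dictionary keys are hypothetical labels, not API identifiers. For example, one GPU request that counts 500 GPT-J tokens comes to $0.005 + 500 × $0.00001 = $0.01.

```python
from decimal import Decimal
from typing import Optional

# Per-request base prices in USD, from the list above.
BASE_PRICE = {"cpu": Decimal("0.003"), "gpu": Decimal("0.005")}

# Per-token surcharges in USD, from the list above.
# The keys are hypothetical labels chosen for this sketch.
TOKEN_PRICE = {
    "gpt-j": Decimal("0.00001"),
    "dolphin": Decimal("0.00002"),
    "fast-gpt-j": Decimal("0.00003"),
    "gpt-neox-20b": Decimal("0.00004"),
    "finetuned-gpt-neox-20b": Decimal("0.00007"),
    "speech-t5": Decimal("0.00003"),
}


def request_cost(hardware: str, model: Optional[str] = None, tokens: int = 0) -> Decimal:
    """Cost of one request: base price plus an optional per-token surcharge.

    Assumes surcharges are additive on top of the base price.
    """
    base = BASE_PRICE[hardware]
    if model is None:
        return base  # models without a per-token surcharge
    return base + TOKEN_PRICE[model] * tokens


print(request_cost("cpu"))
print(request_cost("gpu", "gpt-j", tokens=500))
```

Decimal is used instead of float so that very small per-token amounts add up without rounding drift.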

Pre-Paid Plans For Custom AI Models

Basic Fine-tuning

• Train/fine-tune your own GPT-J or Dolphin model

• Automatically deploy your model to a basic dedicated GPU server

• Maximum context size in each request: 1024 tokens

• Parallel requests: 1

• Response time: 3 seconds per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$399 / month ($13 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $379 per additional server or model)

Advanced Fine-tuning

• Train/fine-tune your own GPT-J or Dolphin model

• Automatically deploy your model to a cutting-edge dedicated GPU server

• Maximum context size in each request: 2048 tokens

• Parallel requests: 2

• Response time: 1 second per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$990 / month ($33 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $890 per additional server or model)

Semantic Search

• Fine-tune your own semantic search model

• Automatically deploy your model to a GPU server

• Response time: 1 second

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Maximum dataset size: 1 million examples

• Pay-as-you-go on all the pre-trained models


$299 / month ($10 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $279 per additional server or model)

In-house Models

• Upload and deploy your own existing models, or deploy any AI model from the open-source community

• Deploy either on CPU or GPU servers

• Easily scale your model up or down

• Leverage a robust and highly-available infrastructure


Please contact us

Sensitive Applications

Specific Region

• Choose a specific continent or country

• Many regions available (US, France, Germany, Asia, Middle-East, and more)


Please contact us

Specific Cloud Provider

• Choose a specific cloud provider

• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)


Please contact us

On-Premise

• Deploy many models in-house within your own infrastructure

• No business data is sent to the NLP Cloud servers

• Suited for sensitive data (e.g. medical applications, financial applications...)


Please contact us

Consulting

Expert Guidance

Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? We have highly skilled AI experts who will be happy to help you and provide training!


Please contact us

Integration

Do you have an awesome AI project but you are lacking the technical skills to achieve it? Our technical experts can work on integrating natural language processing into your application!


Please contact us

Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, rate limiting per hour instead of per minute, and much more! Just let us know.

Plans can also be paid in other currencies. Please ask us for more information if needed.

Support

If you already have an account, send us a message from your dashboard.

Otherwise, send us an email here: [email protected].

Frequently Asked Questions

What is a token?

A token is a unique entity that can either be a small word, part of a word, or punctuation. On average, 1 token is made up of 4 characters, and 100 tokens are roughly equivalent to 75 words. Natural Language Processing models need to turn your text into tokens in order to process it.
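
The two rules of thumb above can be turned into a quick estimator. This is only an approximation: the real count depends on the tokenizer of each model, and both function names are hypothetical.

```python
def estimate_tokens_from_chars(text: str) -> int:
    """Rough estimate using the '1 token is about 4 characters' rule of thumb."""
    return round(len(text) / 4)


def estimate_tokens_from_words(text: str) -> int:
    """Rough estimate using the '100 tokens are about 75 words' rule of thumb."""
    return round(len(text.split()) * 100 / 75)


sample = "John Doe is a Go Developer at Google"
print(estimate_tokens_from_chars(sample))
print(estimate_tokens_from_words(sample))
```

The two estimates rarely agree exactly, so they are best read as a range rather than a precise figure.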

Can I try NLP Cloud for free?

Yes. All the AI models can be tested for free thanks to the Free plan, without a credit card. The pay-as-you-go plan is the best way to easily test all the features without restrictions. A credit card is needed for this plan, but you automatically get an initial $15 credit for your tests.

Can I monitor my pay-as-you-go consumption?

Yes, there is a "Monthly Usage" section in your dashboard that lets you monitor the number of requests you made during the month and the number of tokens you generated. It is updated in real time.

Can I set up a maximum limit for my pay-as-you-go consumption?

No you can't, but this is something we are working on. If you want to make sure your costs are perfectly under control, we encourage you to select a pre-paid plan like the Starter plan, the Full plan, or the Enterprise plan. With these plans, you know exactly how much you are going to spend per month.
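
To compare a fixed monthly price with pay-as-you-go, one can estimate the number of requests at which both cost the same. The sketch below uses the prices listed in the Pricing section and deliberately ignores per-token surcharges, the free $15 credit, and rate limits, so it is only a first approximation. The function name break_even_requests is hypothetical.

```python
from decimal import Decimal, ROUND_CEILING


def break_even_requests(monthly_price: str, price_per_request: str) -> int:
    """Smallest monthly request count at which pay-as-you-go costs at least
    as much as the fixed monthly price (token surcharges ignored)."""
    ratio = Decimal(monthly_price) / Decimal(price_per_request)
    return int(ratio.to_integral_value(rounding=ROUND_CEILING))


# Starter plan ($29/month) versus CPU pay-as-you-go ($0.003 per request).
print(break_even_requests("29", "0.003"))

# Enterprise GPU plan ($699/month) versus GPU pay-as-you-go ($0.005 per request).
print(break_even_requests("699", "0.005"))
```

With token-priced models the break-even point is reached sooner, since each pay-as-you-go request then costs more than the base price.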

What does fine-tuning mean?

Fine-tuning means creating ("training") your own AI with your own data. The idea is that you give the AI model many examples (in a "dataset") so it learns from you and is then excellent at addressing your use case. This is the best way to achieve state-of-the-art results in machine learning. You don't necessarily need to spend too much time on your fine-tuning dataset, as modern AI models can be fine-tuned with few examples. For example, you can reach great results with only 500 examples.

Do I need to use a GPU?

It depends. Most of our AI models work very well without a GPU. But the most advanced models based on text generation like GPT-J and GPT-NeoX need a GPU in order to address bigger inputs and outputs, and to respond promptly. More generally, a GPU is recommended for production use for most of our models as it considerably improves the throughput and the response time.

What is GPT-3?

GPT-3 is, for the moment, the most advanced Natural Language Processing AI model ever created. But it is very expensive, it is not open-source, and OpenAI and Microsoft (the companies behind GPT-3) reject any application that doesn't match their (very strict) terms and conditions. At NLP Cloud we try to offset this monopoly by offering great GPT-3 alternatives like GPT-J, GPT-NeoX, and more!

How do you compare to OpenAI?

NLP Cloud is a small and extremely dynamic tech company that offers all the best AI models at a fair price. Not only is NLP Cloud less expensive than OpenAI, but we are also much less restrictive in terms of usage, and we offer many features and models that OpenAI doesn't offer. For example, you can deploy our models on-premise, we are HIPAA compliant, you can deploy your own models, and much more!

I need a specific use-case or model that is not yet supported, can you support it?

Yes! We are very responsive and flexible. Most of our current models and features exist because our customers asked for them, so please let us know what you need. Implementing a new use-case or model can be very quick (sometimes a matter of days).