Advanced Artificial Intelligence API

Text understanding/generation (NLP),

ready for production, at a fair price.
Fine-tune and deploy your own AI models.
No DevOps required.

High Performance

Fast and accurate AI models suited for production. Highly-available inference API leveraging the most advanced NVIDIA GPUs.

Pre-trained AI Models

We selected the best open-source natural language processing (NLP) models from the community and deployed them for you.

Custom Models

Fine-tune your own models - including GPT-J - or upload your in-house custom models, and deploy them easily to production

Data Privacy

We are HIPAA / GDPR / CCPA compliant. No data is stored on our servers. We offer specific plans for sensitive applications.

No Complexity

Don't worry about DevOps or API programming: focus on text processing only, and deliver your AI project in no time.

Multilingual AI

Use all our AI models in many languages, thanks to our multilingual models and our multilingual addon.

Deliver Your AI Projects For Good

BBVA
Johnson & Johnson
Zapier
GSK
Generali
Schneider
General Electric
Dell
Zoom
PwC
Lufthansa
Deloitte

"We spent a lot of energy fine-tuning our machine learning models, but we clearly underestimated the go-live process. NLP Cloud saved us a lot of time, and prices are really affordable."

Patrick, CTO at MatchMaker

"Simple and efficient. Typically the kind of out-of-the box service that will make Natural Language Processing, and AI in general, even more popular."

Marc, Software Engineer

"We did the maths: developing the API by ourselves, and then creating and maintaining the production platform for our entity extraction models, would have taken around 2 months of work. In the end we did the same thing in 1 hour for a very fair price by using NLP Cloud."

John, CTO

"We had developed a working API deployed with Docker for our model, but we quickly faced performance and scalability issues. After spending weeks on this we eventually went for this cloud solution and we haven't regretted it so far!"

Maria, CSO at CybelAI

"We eventually gave up on fine-tuning GPT-J... We are now exclusively fine-tuning and deploying GPT-J on NLP Cloud and we are happy like this."

Whalid, Lead Dev at Direct IT

"The NLP Cloud API has been extremely reliable and the support team is very nice and reactive."

Bogdan, Data Scientist at Alternative.io

See this customer reference: LAO using our classification API for automatic support ticket triage

NLP Cloud is an NVIDIA partner

NLP Cloud is a member of the NVIDIA Inception Program

curl "https://api.nlpcloud.io/v1/en_core_web_lg/entities" \
  -X POST \
  -d '{"text":"John Doe is a Go Developer at Google"}'

[
  { "start": 0, "end": 8, "text": "John Doe", "type": "PERSON" },
  { "start": 13, "end": 25, "text": "Go Developer", "type": "POSITION" },
  { "start": 30, "end": 35, "text": "Google", "type": "ORG" }
]

curl "https://api.nlpcloud.io/v1/bart-large-mnli-yahoo-answers/classification" \
  -X POST \
  -d '{
    "text":"John Doe is a Go Developer at Google. He has been working there for 10 years and has been awarded employee of the year.",
    "labels":["job", "nature", "space"],
    "multi_class": true
  }'

{
  "labels": ["job", "space", "nature"],
  "scores": [0.9258800745010376, 0.1938474327325821, 0.010988450609147549]
}

curl "https://api.nlpcloud.io/v1/roberta-base-squad2/question" \
  -X POST \
  -d '{
    "context":"French president Emmanuel Macron said the country was at war with an invisible, elusive enemy, and the measures were unprecedented, but circumstances demanded them.",
    "question":"Who is the French president?"
  }'

{
  "answer": "Emmanuel Macron",
  "score": 0.9595934152603149,
  "start": 17,
  "end": 32
}

curl "https://api.nlpcloud.io/v1/distilbert-finetuned-sst-2-english/sentiment" \
  -X POST \
  -d '{"text":"NLP Cloud proposes an amazing service!"}'

{
  "scored_labels": [
    { "label": "POSITIVE", "score": 0.9996881484985352 }
  ]
}

curl "https://api.nlpcloud.io/v1/bart-large-cnn/summarization" \
  -X POST \
  -d '{"text":"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."}'

{
  "summary_text": "The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world."
}

curl "https://api.nlpcloud.io/v1/gpt-j/generation" \
  -X POST \
  -d '{
    "text":"GPT-J is a powerful NLP model",
    "min_length":10,
    "max_length":30
  }'

{
  "generated_text": "GPTJ is a powerful NLP model for text generation. This is the open-source version of GPT-3 by OpenAI. It is the most advanced NLP model created as of today."
}

curl "https://api.nlpcloud.io/v1/opus-mt-en-fr/translation" \
  -X POST \
  -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999."}'

{
  "translation_text": "John Doe travaille pour Microsoft à Seattle depuis 1999."
}

curl "https://api.nlpcloud.io/v1/python-langdetect/langdetection" \
  -X POST \
  -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999. Il parle aussi un peu français."}'

{
  "languages": [
    { "en": 0.7142834369645996 },
    { "fr": 0.28571521669868466 }
  ]
}
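Each of the terminal demos above is a plain HTTP POST, so any HTTP client works. As a minimal sketch (the `Authorization: Token <api_token>` header format is an assumption to verify against the documentation), here is how such a request could be assembled in Python:

```python
import json

API_BASE = "https://api.nlpcloud.io/v1"

def build_request(model, endpoint, token, payload):
    """Assemble the URL, headers, and JSON body for an NLP Cloud call.

    The "Token" authorization scheme is an assumption; check the docs.
    """
    url = f"{API_BASE}/{model}/{endpoint}"
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload)

# The named-entity example from the demo above:
url, headers, body = build_request(
    "en_core_web_lg", "entities", "<your_api_token>",
    {"text": "John Doe is a Go Developer at Google"},
)
# Send it with any HTTP client, e.g. requests.post(url, headers=headers, data=body)
```

Official client libraries (see below) wrap this plumbing for you; the sketch only shows what travels over the wire.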

Use Cases and Models

• Automatic Speech Recognition (speech-to-text): extract text from an audio or video file. We are using OpenAI's Whisper model for speech recognition in 97 languages, automatic language detection, and automatic punctuation, with PyTorch. You can also use your own model. See Docs.

• Blog Post Generation: create a whole blog article out of a simple title, from 800 to 1,500 words, containing basic HTML tags, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. See Docs.

• Classification: send a piece of text and let the AI apply the right categories to it, in many languages. As an option, you can suggest the potential categories you want to assess. We are using Joe Davison's Bart Large MNLI Yahoo Answers, Joe Davison's XLM Roberta Large XNLI, and GPT for classification in 100 languages, with PyTorch, Jax, and Hugging Face Transformers. You can also use your own model. For classification without suggested categories, use GPT-J/GPT-NeoX. See Docs.

• Chatbot/Conversational AI: discuss fluently with an AI and get relevant answers, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Code Generation: generate source code out of a simple instruction, in any programming language. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Dialogue Summarization: summarize a conversation, in many languages. We are using Bart Large CNN SamSum with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Embeddings: calculate embeddings from several pieces of text, in more than 50 languages. We are using Paraphrase Multilingual Mpnet Base V2 with PyTorch and Sentence Transformers, and GPT-J with PyTorch and Transformers. You can also use your own model. See Docs.

• Grammar and Spelling Correction: send a block of text and let the AI correct the mistakes for you, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Headline Generation: send a text and get a very short summary suited for headlines, in many languages. We are using Michau's T5 Base EN Generate Headline with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Image Generation/Text-to-Image: generate an image out of a simple text instruction. We are using Stability AI's Stable Diffusion model with PyTorch and Hugging Face Diffusers. It is a powerful open-source equivalent of OpenAI DALL-E 2. You can also use your own model. See Docs.

• Intent Classification: detect the intent from a sentence, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Keywords and Keyphrases Extraction: extract the main keywords from a piece of text, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Language Detection: detect one or several languages from a text. We are simply using Python's LangDetect library. See Docs.

• Lemmatization: extract lemmas from a text, in many languages. All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model. See Docs.

• Named Entity Recognition (NER): extract structured information from an unstructured text (name, company, country, job title...), in many languages. You can perform NER with all the large spaCy models (15 languages), Ginza for Japanese, GPT-J/GPT-NeoX, or your own custom model. See Docs.

• Noun Chunks: extract noun chunks from a text, in many languages. All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model. See Docs.

• Paraphrasing and Rewriting: generate similar content with the same meaning, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Part-of-Speech (POS) Tagging: assign parts of speech to each word of your text, in many languages. All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model. See Docs.

• Product Description and Ad Generation: generate one sentence or several paragraphs containing specific keywords for your product descriptions or ads, in many languages. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Question Answering: ask questions about anything, in many languages. As an option, you can give a context so the AI uses it to answer your question. We are using Deepset's Roberta Base Squad 2 with PyTorch and Hugging Face Transformers, and GPT-J/GPT-NeoX. If you don't want to use a context, you should use GPT-J. You can also use your own model. See Docs.

• Semantic Search: search your own data, in more than 50 languages. Create your own semantic search model, based on PyTorch and Sentence Transformers. See Docs.

• Semantic Similarity: detect whether two pieces of text have the same meaning, in more than 50 languages. We are using Paraphrase Multilingual Mpnet Base V2 with PyTorch and Sentence Transformers. You can also use your own model. See Docs.

• Sentiment and Emotion Analysis: determine sentiments and emotions in a text (positive, negative, fear, joy...), in many languages. We also have an AI for financial sentiment analysis. We are using DistilBERT Base Uncased Finetuned SST-2, DistilBERT Base Uncased Emotion, and Prosus AI's FinBERT with PyTorch, TensorFlow, and Hugging Face Transformers. You can also use your own model. See Docs.

• Summarization: send a text and get a smaller text keeping only the essential information, in many languages. We are using Facebook's Bart Large CNN and GPT-J/GPT-NeoX, with PyTorch, Jax, and Hugging Face Transformers. You can also use your own model. See Docs.

• Text Generation: start a sentence and let the AI generate the rest for you, in many languages. You can achieve almost any text processing and generation use case thanks to text generation with GPT-J and few-shot learning. You can also fine-tune GPT-J on NLP Cloud. We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.

• Tokenization: extract tokens from a text, in many languages. All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model. See Docs.

• Translation: translate text from one language to another. We are using Facebook's NLLB 200 3.3B for translation in 200 languages (the input language can be automatically detected) with PyTorch and Hugging Face Transformers. You can also use your own model. See Docs.
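Several of the use cases above (embeddings, semantic similarity, semantic search) boil down to comparing embedding vectors. As an illustration independent of any NLP Cloud API specifics, the standard comparison step is a cosine similarity, which you can compute locally on vectors returned by an embeddings endpoint:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score close to 1.0, orthogonal ones 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # close to 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Real sentence embeddings have hundreds of dimensions, but the arithmetic is identical.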

Looking for a specific use case or AI model that is not in the list above? Please let us know!
Implementation can be very quick on our side.

Clean API

NLP Cloud provides you with a simple and robust API.

Integration is easy, and we take care of keeping the AI models highly available, no matter the load.

Client libraries are available in several languages: Python, Ruby, Go, Node.js, and PHP (more to come).

See the documentation for more details.

Fine-Tune and Deploy your Own Models

Upload or Train/Fine-Tune your own AI models - including GPT-J - from your dashboard, and use them straight away in production without worrying about deployment considerations like RAM usage, high-availability, scalability... You can upload and deploy as many models as you want to production.

Pricing

All plans can be stopped anytime. You only pay for the time you used the service.

The invoiced amount is automatically prorated. In case of a downgrade, you will get a discount on your next invoice.

Prices do not include taxes by default. If you are a business registered in EU or an individual, please contact us so we can apply the correct VAT to your subscription.

Pre-Paid Plans For Pre-Trained Models

Pre-paid plans are the most cost-effective solution if you plan to make a large volume of requests to our pre-trained AI models.

The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).

On a GPU, AI models are around 10X faster on average.

Plans: Free | Starter | Starter GPU | Full | Full GPU | Enterprise | Enterprise GPU | Large Language Models

Price: $0 | $29/month ($1/day) | $99/month ($3.5/day) | $59/month ($2/day) | $199/month ($7/day) | $499/month ($16/day) | $699/month ($23/day) | $2,499/month ($80/day)

Parallel requests: 2 | 4 | 4 | 8 | 8 | 15 | 15 | 20

Rate limits per model (values in plan order, starting with Variable):

Fine-tuned GPT-NeoX 20B on GPU: Variable / 3 / 30 requests per minute
Whisper on GPU: Variable / 3 / 30 requests per minute
Stable Diffusion on GPU: Variable / 5 / 50 requests per minute
GPT-NeoX 20B on GPU: Variable / 1 / 8 / 80 requests per minute
Fast GPT-J on GPU: Variable / 1 / 3 / 10 / 100 requests per minute
GPT-J on GPU: Variable / 1 / 10 / 50 / 500 requests per minute
GPT-J on CPU: Variable / 1 / 10 / 50 requests per minute
Bart Large CNN on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Bart Large CNN on CPU: Variable / 10 / 30 / 150 requests per minute
Bart Large MNLI Yahoo Answers on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Bart Large MNLI Yahoo Answers on CPU: Variable / 10 / 30 / 150 requests per minute
Bart Large SamSum on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Bart Large SamSum on CPU: Variable / 10 / 30 / 150 requests per minute
XLM Roberta Large XNLI on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
XLM Roberta Large XNLI on CPU: Variable / 10 / 30 / 150 requests per minute
T5 Base EN Generate Headlines on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
T5 Base EN Generate Headlines on CPU: Variable / 10 / 30 / 150 requests per minute
Roberta Base Squad 2 on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Roberta Base Squad 2 on CPU: Variable / 10 / 30 / 150 requests per minute
Distilbert Base SST 2 on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Distilbert Base SST 2 on CPU: Variable / 10 / 30 / 150 requests per minute
Distilbert Base Emotion on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Distilbert Base Emotion on CPU: Variable / 10 / 30 / 150 requests per minute
Finbert on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Finbert on CPU: Variable / 10 / 30 / 150 requests per minute
NLLB 200 3.3B on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
NLLB 200 3.3B on CPU: Variable / 10 / 30 / 150 requests per minute
Paraphrase Multilingual Mpnet Base V2 on GPU: Variable / 10 / 30 / 150 / 150 requests per minute
Paraphrase Multilingual Mpnet Base V2 on CPU: Variable / 10 / 30 / 150 requests per minute
SpaCy on CPU: Variable / 10 / 30 / 150 requests per minute

Pre-Paid Plans For Custom AI Models

Basic Fine-tuning

• Train/fine-tune your own GPT-J model

• Automatically deploy your model to a basic dedicated GPU server

• Maximum context size in each request: 1024 tokens

• Parallel requests: 1

• Response time: 3 seconds per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$399 / month ($13 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $379 per additional server or model)

Advanced Fine-tuning

• Train/fine-tune your own Fast GPT-J model

• Automatically deploy your model to a cutting-edge dedicated GPU server

• Maximum context size in each request: 2048 tokens

• Parallel requests: 2

• Response time: 1 second per 50 generated tokens

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Pay-as-you-go on all the pre-trained models


$990 / month ($33 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $890 per additional server or model)

Semantic Search

• Fine-tune your own semantic search model

• Automatically deploy your model to a GPU server

• Response time: 1 second

• Price is fixed, no matter the size of your dataset or the number of requests you are making

• Maximum dataset size: 1 million examples

• Pay-as-you-go on all the pre-trained models


$299 / month ($10 / day) for 1 deployed model on 1 dedicated server


(3 fine-tunings per month included for free, then + $19 per fine-tuning)


(+ $279 per additional server or model)

In-house Models

• Upload and deploy your own existing models, or deploy any AI model from the open-source community

• Deploy either on CPU or GPU servers

• Easily scale your model up or down

• Leverage a robust and highly-available infrastructure


Please contact us

Pay-As-You-Go

All the models, based on usage

Note: this plan is perfect for testing or very irregular usage. For production use, we encourage you to select the pre-paid plans above that are less costly.

• No fixed cost: pay only if you consume

• Automatically get a $15 FREE credit

• All our pre-trained models are available

• Multilingual add-on included

• Asynchronous mode included

• Monitor your usage in your dashboard

• Parallel requests: 4 (can be increased)


On CPU: $0.003 per request

On GPU: $0.005 per request

GPT-J: + $0.00001 per token

Fast GPT-J: + $0.00003 per token

GPT-NeoX 20B: + $0.00004 per token

Fine-tuned GPT-NeoX 20B: + $0.00007 per token

Stable Diffusion: + $0.05 per generated image

Whisper: + $0.0006 per second (duration of your audio or video file)
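To see how these pay-as-you-go figures combine, here is a small sketch using the prices listed above (treat the result as an estimate, not a quote):

```python
# Pay-as-you-go figures from the list above.
GPU_REQUEST_COST = 0.005      # $ per request on GPU
GPT_J_TOKEN_COST = 0.00001    # $ per token with GPT-J

def gpt_j_gpu_cost(num_requests, tokens_per_request):
    """Estimated cost of GPT-J requests on GPU: base fee plus per-token fee."""
    return num_requests * (GPU_REQUEST_COST + tokens_per_request * GPT_J_TOKEN_COST)

# 1,000 requests of 200 tokens each:
print(round(gpt_j_gpu_cost(1000, 200), 2))  # 7.0 ($5 base + $2 in tokens)
```

The same pattern applies to the other models; swap in the per-token (or per-image, per-second) rate above.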

Multilingual Add-On, based on usage

Note: this add-on brings multilingual capabilities to another plan. It is enabled by default on the Pay-as-you-go plan, but if you need to enable it on another plan please contact support.

• Use all the models in non-English languages with great accuracy

• No fixed cost: pay only if you consume

• Monitor your usage in your dashboard







$0.05 per 1k characters sent and received

Sensitive Applications

Specific Region

• Choose a specific continent or country

• Many regions available (US, France, Germany, Asia, Middle-East, and more)


Please contact us

Specific Cloud Provider

• Choose a specific cloud provider

• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)


Please contact us

On-Premise

• Deploy many models in-house within your own infrastructure

• No business data is sent to the NLP Cloud servers

• Suited for sensitive data (e.g. medical applications, financial applications...)


Please contact us

Consulting

Expert Guidance

Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? Our highly skilled AI experts will be happy to help you and provide training!


Please contact us

Integration

Do you have an awesome AI project but you are lacking the technical skills to achieve it? Our technical experts can work on integrating natural language processing into your application!


Please contact us

Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, a rate limiting per hour instead of per minute, and much more! Just let us know.

Plans can also be paid in other currencies. Please ask us for more information if needed.

Support

If you already have an account, send us a message from your dashboard.

Otherwise, send us an email here: [email protected].

Frequently Asked Questions

What is a token?

A token is a unique entity that can either be a small word, part of a word, or punctuation. On average, 1 token is made up of 4 characters, and 100 tokens are roughly equivalent to 75 words. Natural Language Processing models need to turn your text into tokens in order to process it.
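Using the rules of thumb above (about 4 characters per token, about 100 tokens per 75 words), a rough token estimate can be sketched. These are heuristics only, not the tokenizer the models actually use:

```python
def estimate_tokens_from_chars(text):
    """Rough token count: about 1 token per 4 characters."""
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(word_count):
    """Rough token count: about 100 tokens per 75 words."""
    return round(word_count * 100 / 75)

sentence = "Natural Language Processing models need tokens."
print(estimate_tokens_from_chars(sentence))  # 12 (47 characters / 4)
```

Such estimates are handy for sizing requests against per-token pricing and context limits before sending anything.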

Can I try NLP Cloud for free?

Yes. All the AI models can be tested for free thanks to the Free plan, without a credit card. The pay-as-you-go plan is the best way to easily test all the features without restrictions. A credit card is needed for this plan, but you automatically get an initial $15 credit for your tests.

Can I monitor my pay-as-you-go consumption?

Yes, there is a "Monthly Usage" section in your dashboard that lets you monitor the number of requests you made during the month, the number of tokens you generated, and the number of characters used by the multilingual addon. It is updated in real time.

Can I set up a maximum limit for my pay-as-you-go consumption?

No, you can't yet, but this is something we are working on. If you want to make sure your costs are perfectly under control, we encourage you to select a pre-paid plan like the Starter plan, the Full plan, or the Enterprise plan. With these plans, you know exactly how much you are going to spend per month.

What does fine-tuning mean?

Fine-tuning means creating ("training") your own AI with your own data. The idea is that you give the AI model many examples (in a "dataset") so it learns from you and becomes excellent at addressing your use case. This is the best way to achieve state-of-the-art results in machine learning. You don't necessarily need to spend much time on your fine-tuning dataset, as modern AI models can be fine-tuned with a few examples. For example, you can reach great results with only 500 examples.

Do I need to use a GPU?

It depends. Most of our AI models work very well without a GPU. But the most advanced models based on text generation like GPT-J and GPT-NeoX need a GPU in order to address bigger inputs and outputs, and to respond promptly. More generally, a GPU is recommended for production use for most of our models as it considerably improves the throughput and the response time.

What is GPT-3?

GPT-3 is the most advanced Natural Language Processing AI model created to date. But it is very expensive, it is not open-source, and OpenAI and Microsoft (the companies behind GPT-3) reject any application that doesn't match their (very strict) terms and conditions. At NLP Cloud we try to offset this monopoly by offering great GPT-3 alternatives like GPT-J, GPT-NeoX, and more!

How do you compare to OpenAI?

NLP Cloud is a small and extremely dynamic tech company that offers the best AI models at a fair price. Not only is NLP Cloud less expensive than OpenAI, but we are also much less restrictive in terms of usage, and we offer many features and models that OpenAI doesn't. For example, you can deploy our models on-premise, we are HIPAA compliant, you can deploy your own models, and much more!

I need a specific use-case or model that is not yet supported, can you support it?

Yes! We are very responsive and flexible. Most of our current models and features exist because our customers asked for them, so please let us know what you need. Implementing a new use case or model can be very quick (sometimes a matter of days).