Text understanding/generation (NLP),
ready for production, at a fair price.
Fine-tune and deploy your own AI models.
No DevOps required.
We open-sourced an "instruct" version of GPT-J that understands human instructions in natural language: feel free to download it and try it!
Fast and accurate AI models suited for production. Highly-available inference API leveraging the most advanced NVIDIA GPUs.
We selected the best open-source natural language processing (NLP) models from the community and deployed them for you.
Fine-tune your own models - including GPT-J - or upload your in-house custom models, and deploy them easily to production
We are HIPAA / GDPR / CCPA compliant. No data is stored on our servers. We offer specific plans for sensitive applications.
Forget about DevOps and API programming, and focus on text processing only. Deliver your AI project in no time.
Use all our AI models in many languages, thanks to our multilingual models and our multilingual add-on.
"We spent a lot of energy fine-tuning our machine learning models, but we clearly underestimated the go-live process. NLP Cloud saved us a lot of time, and prices are really affordable."
Patrick, CTO at MatchMaker
"Simple and efficient. Typically the kind of out-of-the-box service that will make Natural Language Processing, and AI in general, even more popular."
Marc, Software Engineer
"We did the maths: developing the API by ourselves, and then creating and maintaining the production platform for our entity extraction models, would have taken around 2 months of work. In the end we did the same thing in 1 hour, for a very fair price, by using NLP Cloud."
John, CTO
"We had developed a working API deployed with Docker for our model, but we quickly faced performance and scalability issues. After spending weeks on this we eventually went for this cloud solution and we haven't regretted it so far!"
Maria, CSO at CybelAI
"We eventually gave up on fine-tuning GPT-J... We are now exclusively fine-tuning and deploying GPT-J on NLP Cloud and we are happy with it."
Whalid, Lead Dev at Direct IT
"The NLP Cloud API has been extremely reliable, and the support team is very friendly and responsive."
Bogdan, Data Scientist at Alternative.io
See this customer reference: LAO using our classification API for automatic support ticket triage
NLP Cloud is an NVIDIA partner
```sh
curl https://api.nlpcloud.io/v1/en_core_web_lg/entities \
  -X POST -d '{"text":"John Doe is a Go Developer at Google"}'
```

```json
[
  { "end": 8, "start": 0, "text": "John Doe", "type": "PERSON" },
  { "end": 25, "start": 13, "text": "Go Developer", "type": "POSITION" },
  { "end": 35, "start": 30, "text": "Google", "type": "ORG" }
]
```

```sh
curl https://api.nlpcloud.io/v1/bart-large-mnli-yahoo-answers/classification \
  -X POST -d '{"text":"John Doe is a Go Developer at Google. He has been working there for 10 years and has been awarded employee of the year.","labels":["job","nature","space"],"multi_class":true}'
```

```json
{
  "labels": ["job", "space", "nature"],
  "scores": [0.9258800745010376, 0.1938474327325821, 0.010988450609147549]
}
```

```sh
curl https://api.nlpcloud.io/v1/roberta-base-squad2/question \
  -X POST -d '{"context":"French president Emmanuel Macron said the country was at war with an invisible, elusive enemy, and the measures were unprecedented, but circumstances demanded them.","question":"Who is the French president?"}'
```

```json
{ "answer": "Emmanuel Macron", "score": 0.9595934152603149, "start": 17, "end": 32 }
```

```sh
curl https://api.nlpcloud.io/v1/distilbert-finetuned-sst-2-english/sentiment \
  -X POST -d '{"context":"NLP Cloud proposes an amazing service!"}'
```

```json
{ "scored_labels": [ { "label": "POSITIVE", "score": 0.9996881484985352 } ] }
```

```sh
curl https://api.nlpcloud.io/v1/bart-large-cnn/summarization \
  -X POST -d '{"text":"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."}'
```

```json
{ "summary_text": "The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world." }
```

```sh
curl https://api.nlpcloud.io/v1/gpt-j/generation \
  -X POST -d '{"text":"GPT-J is a powerful NLP model","min_length":10,"max_length":30}'
```

```json
{ "generated_text": "GPT-J is a powerful NLP model for text generation. This is the open-source version of GPT-3 by OpenAI. It is the most advanced NLP model created as of today." }
```

```sh
curl https://api.nlpcloud.io/v1/opus-mt-en-fr/translation \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999."}'
```

```json
{ "translation_text": "John Doe travaille pour Microsoft à Seattle depuis 1999." }
```

```sh
curl https://api.nlpcloud.io/v1/python-langdetect/langdetection \
  -X POST -d '{"text":"John Doe has been working for Microsoft in Seattle since 1999. Il parle aussi un peu français."}'
```

```json
{ "languages": [ { "en": 0.7142834369645996 }, { "fr": 0.28571521669868466 } ] }
```
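The same endpoints can be called from any HTTP client. Below is a minimal Python sketch that builds a request for one of the endpoints shown above and parses an `/entities` response. The `Token <key>` Authorization scheme and the helper names are assumptions for illustration; check the official docs for the exact authentication header your plan uses.

```python
import json

# Base URL taken from the curl examples above.
API_BASE = "https://api.nlpcloud.io/v1"


def build_request(model, endpoint, payload, token):
    """Build the URL, headers, and JSON body for an NLP Cloud API call.

    The "Token <key>" Authorization scheme is an assumption here; verify
    it against the official documentation before going to production.
    """
    url = f"{API_BASE}/{model}/{endpoint}"
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload)


def entities_of_type(response_json, entity_type):
    """Return the text of every entity of a given type from an /entities response."""
    return [e["text"] for e in response_json if e["type"] == entity_type]


if __name__ == "__main__":
    # To actually send the request, you could use the requests library:
    #   import requests
    #   url, headers, body = build_request(
    #       "en_core_web_lg", "entities",
    #       {"text": "John Doe is a Go Developer at Google"},
    #       "YOUR_API_TOKEN")
    #   print(requests.post(url, headers=headers, data=body).json())
    pass
```

Splitting request construction from transport keeps the helpers testable offline, and the same `build_request` works for every endpoint listed above.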
Use Case | Model Used | Links |
---|---|---|
Automatic Speech Recognition (Speech to text): extract text from an audio or video file | We are using OpenAI's Whisper model for speech recognition in 97 languages, automatic language detection, and automatic punctuation, with PyTorch. You can also use your own model. | See Docs Test Now |
Blog Post Generation: create a whole blog article out of a simple title, from 800 to 1500 words, and containing basic HTML tags, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Classification: send a piece of text, and let the AI apply the right categories to your text, in many languages. As an option, you can suggest the potential categories you want to assess. | We are using Joe Davison's Bart Large MNLI Yahoo Answers, Joe Davison's XLM Roberta Large XNLI, and GPT for classification in 100 languages with PyTorch, Jax, and Hugging Face transformers. You can also use your own model. For classification without potential categories, use GPT-J/GPT-NeoX. | See Docs Test Now |
Chatbot/Conversational AI: discuss fluently with an AI and get relevant answers, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Code generation: generate source code out of a simple instruction, in any programming language. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Dialogue Summarization: summarize a conversation, in many languages | We are using Bart Large CNN SamSum with PyTorch and Hugging Face transformers. You can also use your own model. | See Docs Test Now |
Embeddings: calculate embeddings from several pieces of text, in more than 50 languages. | We are using Paraphrase Multilingual Mpnet Base V2 with PyTorch and Sentence Transformers, and GPT-J with PyTorch and Transformers. You can also use your own model. | See Docs |
Grammar and spelling correction: send a block of text and let the AI correct the mistakes for you, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Headline generation: send a text, and get a very short summary suited for headlines, in many languages | We are using Michau's T5 Base EN Generate Headline with PyTorch and Hugging Face transformers. You can also use your own model. | See Docs Test Now |
Image Generation/Text To Image: generate an image out of a simple text instruction. | We are using Stability AI's Stable Diffusion model with PyTorch and Hugging Face Diffusers. It is a powerful open-source equivalent of OpenAI DALL-E 2. You can also use your own model. | See Docs Test Now |
Intent Classification: detect the intent from a sentence, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Keywords and keyphrases extraction: extract the main keywords from a piece of text, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Language Detection: detect one or several languages from a text. | We are simply using Python's LangDetect library. | See Docs Test Now |
Lemmatization: extract lemmas from a text, in many languages | All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model | See Docs |
Named Entity Recognition (NER): extract structured information from an unstructured text, like name, company, country, job title... in many languages. | You can perform NER with all the large spaCy models (15 languages), or Ginza for Japanese, or GPT-J/GPT-NeoX, or use your own custom model. | See Docs Test Now |
Noun Chunks: extract noun chunks from a text, in many languages | All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model | See Docs |
Paraphrasing and rewriting: generate a similar content with the same meaning, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Part-Of-Speech (POS) tagging: assign parts of speech to each word of your text, in many languages | All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model | See Docs |
Product description and ad generation: generate one sentence or several paragraphs containing specific keywords for your product descriptions or ads, in many languages. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Question answering: ask questions about anything, in many languages. As an option you can give a context so the AI uses this context to answer your question. | We are using Deepset's Roberta Base Squad 2 with PyTorch and Hugging Face transformers, and GPT-J/GPT-NeoX. If you don't want to use a context, you should use GPT-J. You can also use your own model. | See Docs Test Now |
Semantic Search: search your own data, in more than 50 languages. | Create your own semantic search model, based on PyTorch and Sentence Transformers. | See Docs Test Now |
Semantic Similarity: detect whether 2 pieces of text have the same meaning or not, in more than 50 languages. | We are using Paraphrase Multilingual Mpnet Base V2 with PyTorch and Sentence Transformers. You can also use your own model. | See Docs Test Now |
Sentiment and emotion analysis: determine sentiments and emotions from a text (positive, negative, fear, joy...), in many languages. We also have an AI for financial sentiment analysis. | We are using DistilBERT Base Uncased Finetuned SST-2, DistilBERT Base Uncased Emotion, and Prosus AI's Finbert with PyTorch, Tensorflow, and Hugging Face transformers. You can also use your own model. | See Docs Test Now |
Summarization: send a text, and get a smaller text keeping essential information only, in many languages | We are using Facebook's Bart Large CNN and GPT-J/GPT-NeoX, with PyTorch, Jax, and Hugging Face transformers. You can also use your own model. | See Docs Test Now |
Text generation: start a sentence and let the AI generate the rest for you, in many languages. You can achieve almost any text processing and text generation use case thanks to text generation with GPT-J and few-shot learning. You can also fine-tune GPT-J on NLP Cloud. | We are using GPT-J and GPT-NeoX 20B with PyTorch and Hugging Face transformers. They are powerful open-source equivalents of OpenAI GPT-3. You can also use your own model. | See Docs Test Now |
Tokenization: extract tokens from a text, in many languages | All the large spaCy models are available (15 languages), or Ginza for Japanese, or upload your own custom spaCy model | See Docs |
Translation: translate text from one language to another. | We use Facebook's NLLB 200 3.3B for translation in 200 languages (input language can be automatically detected) with PyTorch and Hugging Face transformers. You can also use your own model. | See Docs Test Now |
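As the text-generation row above notes, most of these use cases can be covered by sending GPT-J a few-shot prompt: a short instruction, a handful of solved examples, and the new input. A minimal sketch of how such a prompt can be assembled (the `Text:`/`Answer:` layout is a common convention, not an official NLP Cloud format):

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt for a text-generation model such as GPT-J.

    `examples` is a list of (input, output) pairs. The model is expected to
    continue the final "Answer:" line with its own completion.
    """
    lines = [instruction, ""]
    for text, answer in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Answer: {answer}")
        lines.append("")  # blank line between examples
    lines.append(f"Text: {query}")
    lines.append("Answer:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each text as positive or negative.",
    [("I love this service!", "positive"),
     ("The app keeps crashing.", "negative")],
    "The support team was wonderful.",
)
```

The resulting string would go in the `text` field of a `/gpt-j/generation` request, exactly like the generation example shown earlier.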
Looking for a specific use case or AI model that is not in the list above? Please let us know! Implementation can be very quick on our side.
Upload or Train/Fine-Tune your own AI models - including GPT-J - from your dashboard, and use them straight away in production without worrying about deployment considerations like RAM usage, high-availability, scalability... You can upload and deploy as many models as you want to production.
All plans can be stopped anytime. You only pay for the time you used the service.
The invoiced amount is automatically prorated. In case of a downgrade, you will get a discount on your next invoice.
Prices do not include taxes by default. If you are a business registered in EU or an individual, please contact us so we can apply the correct VAT to your subscription.
Not sure which plan is best for you? Just ask our support team!
Pre-paid plans are the most cost-effective solution if you plan to make a large volume of requests to our pre-trained AI models.
The cost is fixed and paid up-front at the beginning of the month. There is no variable cost based on usage (as opposed to our pay-as-you-go plan).
On a GPU, AI models are around 10X faster on average.
Plan | Free | Starter | Starter GPU | Full | Full GPU | Enterprise | Enterprise GPU | Large Language Models |
---|---|---|---|---|---|---|---|---|
Price | $0 | $29/month ($1/day) | $99/month ($3.5/day) | $59/month ($2/day) | $199/month ($7/day) | $499/month ($16/day) | $699/month ($23/day) | $2,499/month ($80/day) |
Parallel Requests | 2 | 4 | 4 | 8 | 8 | 15 | 15 | 20 |
Fine-tuned GPT-NeoX 20B on GPU (requests per minute) | Variable | 3 | 30 | |||||
Whisper on GPU (requests per minute) | Variable | 5 | 50 | |||||
Stable Diffusion on GPU (requests per minute) | Variable | 5 | 50 | |||||
GPT-NeoX 20B on GPU (requests per minute) | Variable | 1 | 8 | 80 | ||||
Fast GPT-J on GPU (requests per minute) | Variable | 1 | 3 | 10 | 100 | |||
GPT-J on GPU (requests per minute) | Variable | 3 | 10 | 50 | 500 | |||
GPT-J on CPU (requests per minute) | Variable | 3 | 10 | 50 | ||||
Bart Large CNN on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large CNN on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large MNLI Yahoo Answers on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large MNLI Yahoo Answers on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Bart Large SamSum on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Bart Large SamSum on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
XLM Roberta Large XNLI on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
XLM Roberta Large XNLI on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
T5 Base EN Generate Headlines on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
T5 Base EN Generate Headlines on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Roberta Base Squad 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Roberta Base Squad 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base SST 2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base SST 2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Distilbert Base Emotion on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Distilbert Base Emotion on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Finbert on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Finbert on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
NLLB 200 3.3B on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
NLLB 200 3.3B on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
Paraphrase Multilingual Mpnet Base V2 on GPU (requests per minute) | Variable | 10 | 30 | 150 | 150 | |||
Paraphrase Multilingual Mpnet Base V2 on CPU (requests per minute) | Variable | 10 | 30 | 150 | ||||
SpaCy on CPU (requests per minute) | Variable | 10 | 30 | 150 |
Note: this plan is perfect for testing or very irregular usage. For production use, we encourage you to select one of the pre-paid plans above, which are less costly.
• No fixed cost: pay only if you consume
• Automatically get a $15 FREE credit
• All our pre-trained models are available
• Multilingual add-on included
• Asynchronous mode included
• Monitor your usage in your dashboard
• Parallel requests: 4 (can be increased)
On CPU: $0.003 per request
On GPU: $0.005 per request
GPT-J: + $0.00001 per token
Fast GPT-J: + $0.00003 per token
GPT-NeoX 20B: + $0.00004 per token
Fine-tuned GPT-NeoX 20B: + $0.00007 per token
Stable Diffusion: + $0.05 per generated image
Whisper: + $0.0001 per second (duration of your audio or video file)
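The pay-as-you-go prices above combine a fixed per-request fee with a per-token surcharge for the large generative models. A small sketch of the resulting arithmetic (the dictionary keys are illustrative labels, not official model identifiers):

```python
# Pay-as-you-go prices from the list above (USD).
BASE_PER_REQUEST = {"cpu": 0.003, "gpu": 0.005}
PER_TOKEN = {
    "gpt-j": 0.00001,
    "fast-gpt-j": 0.00003,
    "gpt-neox-20b": 0.00004,
    "fine-tuned-gpt-neox-20b": 0.00007,
}


def request_cost(hardware, model=None, tokens=0):
    """Estimate the pay-as-you-go cost of one request in USD.

    A rough estimate based on the published prices; actual billing
    details (e.g. exactly which tokens are counted) may differ.
    """
    cost = BASE_PER_REQUEST[hardware]
    if model is not None:
        cost += PER_TOKEN[model] * tokens
    return cost
```

For example, a GPT-J request on GPU that involves 1,000 tokens comes to roughly $0.005 + 1,000 × $0.00001 = $0.015.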
Note: this add-on brings multilingual capabilities to another plan. It is enabled by default on the Pay-as-you-go plan, but if you need to enable it on another plan please contact support.
• Use all the models in non-English languages with great accuracy
• No fixed cost: pay only if you consume
• Monitor your usage in your dashboard
$0.05 per 1k characters sent and received
• Train/fine-tune your own GPT-J model
• Automatically deploy your model to a basic dedicated GPU server
• Maximum context size in each request: 1024 tokens
• Parallel requests: 1
• Response time: 3 seconds per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$399 / month ($13 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $379 per additional server or model)
• Train/fine-tune your own Fast GPT-J model
• Automatically deploy your model to a cutting-edge dedicated GPU server
• Maximum context size in each request: 2048 tokens
• Parallel requests: 2
• Response time: 1 second per 50 generated tokens
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Pay-as-you-go on all the pre-trained models
$990 / month ($33 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $890 per additional server or model)
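The two fine-tuning plans above quote generation speeds of 3 seconds per 50 generated tokens (GPT-J) and 1 second per 50 generated tokens (Fast GPT-J). Assuming the time scales linearly with output length, a rough latency estimate can be sketched as follows (the plan labels are illustrative):

```python
# Published generation speeds from the fine-tuning plans above:
# seconds needed per 50 generated tokens.
SECONDS_PER_50_TOKENS = {"gpt-j": 3.0, "fast-gpt-j": 1.0}


def estimated_response_time(plan, generated_tokens):
    """Rough response-time estimate in seconds, assuming generation
    time scales linearly with the number of generated tokens."""
    return SECONDS_PER_50_TOKENS[plan] * generated_tokens / 50
```

So generating 150 tokens would take about 9 seconds on the standard GPT-J plan, but about 3 seconds on the Fast GPT-J plan.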
• Fine-tune your own semantic search model
• Automatically deploy your model to a GPU server
• Response time: 1 second
• Price is fixed, no matter the size of your dataset or the number of requests you are making
• Maximum dataset size: 1 million examples
• Pay-as-you-go on all the pre-trained models
$299 / month ($10 / day) for 1 deployed model on 1 dedicated server
(3 fine-tunings per month included for free, then + $19 per fine-tuning)
(+ $279 per additional server or model)
• Upload and deploy your own existing models, or deploy any AI model from the open-source community
• Deploy either on CPU or GPU servers
• Easily scale your model up or down
• Leverage a robust and highly-available infrastructure
Please contact us
• Choose a specific continent or country
• Many regions available (US, France, Germany, Asia, Middle-East, and more)
Please contact us
• Choose a specific cloud provider
• Many cloud providers available (AWS, GCP, OVH, Scaleway, and more)
Please contact us
• Deploy any model in-house within your own infrastructure
• No business data is sent to the NLP Cloud servers
• Suited for sensitive data (e.g. medical applications, financial applications...)
Please contact us
Not sure how to start your next natural language processing project, how to deal with MLOps, or how to make the most of these new AI models? We have highly skilled AI experts who will be happy to help you and provide training!
Please contact us
Do you have an awesome AI project but you are lacking the technical skills to achieve it? Our technical experts can work on integrating natural language processing into your application!
Please contact us
Many more plans can be created for you: a custom number of requests per minute, a mix of pre-trained and custom models, a specific plan for large language models, a rate limiting per hour instead of per minute, and much more! Just let us know.
Plans can also be paid in other currencies. Please ask us for more information if needed.
If you already have an account, send us a message from your dashboard.
Otherwise, send us an email here: [email protected].