Text Classification API

What is Text Classification?

Text classification is the process of assigning one or more categories to a block of text. Optionally, you can ask the AI to choose among a list of categories that you provide beforehand.

Let's say you have the following block of text:

Perseverance is just getting started, and already has provided some of the most iconic visuals in space exploration history. It reinforces the remarkable level of engineering and precision that is required to build and fly a vehicle to the Red Planet.

Let's also say that you have the following categories: space, science, and food.

Now the question is: which of these categories best apply to this block of text? The answer is space and science, of course.

If you don't suggest any candidate categories, the AI will suggest the best category possible based on the data it was trained on.
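To make the selection step concrete, here is a minimal sketch of how per-category scores could be turned into a final answer. The scores are made-up illustrative values, not real model output, and the `select_labels` helper and its 0.5 threshold are assumptions chosen for this example.

```python
from typing import Dict, List


def select_labels(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score reaches the threshold, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, score in ranked if score >= threshold]


# Hypothetical scores for the Perseverance text (illustrative values only).
example_scores = {"food": 0.01, "space": 0.98, "science": 0.86}

print(select_labels(example_scores))  # ['space', 'science']
```

Because each category is scored independently, several categories can be kept at once, which matches the "space and science" answer above.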

Why Use Text Classification?

Text classification is useful in many situations. Here are a few examples.

Sort Incoming Messages

Are you flooded with incoming messages at work? Labelling these messages in advance can make you more productive. For example, you could know in advance which messages are advertising and which ones are customer requests.

Detect Urgency

Some customer requests must be addressed as a priority. In that case, it is very useful to detect them in advance and address them right away.

Lead Qualification

Let's say you are looking for companies in the automotive field. You could scan websites and only keep those that have the "automotive" label applied.

Economic Intelligence

You might want to monitor new content from various sources and categorize it accordingly. Text classification is the right way to do so.

Text Classification with Transformers and Generative Models

The Transformer, introduced by Google researchers in 2017, is the cornerstone architecture of many advanced AI models. Thanks to the Transformer, the accuracy of AI models has improved dramatically. However, this improvement comes at a cost: neural networks based on the Transformer are extremely computationally intensive.

Hugging Face hosts a central repository gathering many open-source Transformer-based Natural Language Processing models. Two of them, Joe Davison's Bart Large MNLI Yahoo Answers (for English) and Joe Davison's XLM Roberta Large XNLI (for non-English languages), are well suited for text classification in many languages.
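As a rough sketch, these models can be tried locally with the Hugging Face `transformers` zero-shot classification pipeline. The model identifier below is assumed to be the Hub name of the English model, and the `load_classifier` and `classify` helpers are hypothetical names introduced for this example. Loading the model downloads a large checkpoint, so the loader is kept separate from the scoring logic.

```python
from typing import Any, Callable, Dict, List

# Assumed Hugging Face Hub identifier for the English model mentioned above.
MODEL_ID = "joeddav/bart-large-mnli-yahoo-answers"


def load_classifier(model_id: str = MODEL_ID) -> Callable[..., Dict[str, Any]]:
    """Build a zero-shot classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def classify(classifier: Callable[..., Dict[str, Any]],
             text: str, labels: List[str]) -> Dict[str, float]:
    """Score every candidate label independently and return a label -> score map."""
    # multi_label=True lets several labels apply to the same text.
    result = classifier(text, candidate_labels=labels, multi_label=True)
    return dict(zip(result["labels"], result["scores"]))


# Usage (requires `pip install transformers torch` and network access):
#   clf = load_classifier()
#   print(classify(clf, "Perseverance is just getting started...",
#                  ["space", "science", "food"]))
```

For non-English text, the same code should work with the XLM Roberta Large XNLI model by passing its identifier to `load_classifier`.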

For more advanced results, it is also possible to perform text classification with generative models like GPT-J, GPT-NeoX, Dolphin, and ChatDolphin. These models give great results, even when no input labels are provided.
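With a generative model, classification is usually framed as a prompt. The sketch below shows one possible way to build such a prompt, with or without candidate labels, and to parse the completion. The prompt wording and the `build_prompt` and `parse_categories` helpers are assumptions made for illustration; the actual call to the generative model is left out.

```python
from typing import List, Optional


def build_prompt(text: str, labels: Optional[List[str]] = None) -> str:
    """Build a classification prompt; labels are optional."""
    if labels:
        instruction = (
            "Classify the text into one or more of these categories: "
            + ", ".join(labels) + "."
        )
    else:
        # No candidate labels: let the model propose a category on its own.
        instruction = "Suggest the best category for the text."
    return f"{instruction}\n\nText: {text}\nCategories:"


def parse_categories(completion: str) -> List[str]:
    """Read a comma-separated list of categories from the first line of a completion."""
    stripped = completion.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return [part.strip().lower() for part in first_line.split(",") if part.strip()]


prompt = build_prompt("Perseverance is just getting started...",
                      ["space", "science", "food"])
print(prompt)
print(parse_categories(" Space, Science\n"))  # ['space', 'science']
```

Only the first line of the completion is parsed because generative models often keep writing after the answer.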

Text Classification Inference API

Building an inference API for text classification is a necessary step as soon as you want to use text classification in production. But keep in mind that building such an API is not necessarily easy. First, you need to code the API (the easy part). Second, you need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.
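To illustrate the "easy part", here is a minimal sketch of a classification endpoint built with Python's standard library only. The request format (a JSON body with `text` and `labels`), the `handle` and `serve` helpers, and the placeholder `classify` function that returns uniform scores are all assumptions for this example. A real service would call an actual model and would still need the infrastructure work described above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Tuple


def classify(text: str, labels: List[str]) -> Dict[str, float]:
    """Placeholder model: returns uniform scores. Replace with a real classifier."""
    return {label: 1.0 / len(labels) for label in labels}


def handle(body: bytes) -> Tuple[int, dict]:
    """Validate a raw request body and return (HTTP status, JSON-serializable result)."""
    error = {"error": "expected JSON with a 'text' string and a non-empty 'labels' list"}
    try:
        payload = json.loads(body)
        text, labels = payload["text"], payload["labels"]
    except (ValueError, KeyError, TypeError):
        return 400, error
    if (not isinstance(text, str) or not isinstance(labels, list) or not labels
            or not all(isinstance(label, str) for label in labels)):
        return 400, error
    return 200, {"scores": classify(text, labels)}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, result = handle(self.rfile.read(length))
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Run the server until interrupted (single process, no redundancy)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping the validation logic in `handle`, separate from the HTTP handler, makes it easy to test without starting a server. Calling `serve()` starts the endpoint locally.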

Using such an API is attractive because it is completely decoupled from the rest of your stack (microservice architecture), so you can scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but you will likely want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.

NLP Cloud's Text Classification API

NLP Cloud offers a text classification API that lets you perform text classification out of the box, with excellent performance, based on Joe Davison's Bart Large MNLI Yahoo Answers, Joe Davison's XLM Roberta Large XNLI, GPT-J, GPT-NeoX, Dolphin, and ChatDolphin. You can use these pre-trained models, train your own models, or upload your own custom models!

For more details, see our documentation about text classification here. For advanced usage, see the text generation API endpoint here. You can also easily test text classification on our playground.

Testing text classification locally is one thing, but using it reliably in production is another. With NLP Cloud you can do both!