What is Text Classification?
Text classification is the process of assigning one or several labels to a block of text.
Let's say you have the following block of text:
Perseverance is just getting started, and already has provided some of the most iconic visuals in space exploration history. It reinforces the remarkable level of engineering and precision that is required to build and fly a vehicle to the Red Planet.
Let's also say that you have the following labels: space, science, and food.
Now the question is: which of these labels best apply to this block of text? The answer is space and science, of course.
Why Use Text Classification?
Text classification is useful in many situations. Here are a couple of examples.
Sort Incoming Messages
Are you flooded with incoming messages at work? Properly labelling these messages in advance can make you more productive. For example, you could know in advance which messages are advertising and which ones are customer requests.
Some customer requests must be addressed as a priority. In that case, it is valuable to detect them in advance and address them right away.
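The sorting idea above can be sketched in a few lines. This is only an illustration: it assumes each message has already been classified and that the result is a dictionary with parallel "labels" and "scores" lists. The function names, the "unsorted" folder, the 0.5 threshold, and the sample scores are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def top_label(result: Dict[str, list], min_score: float = 0.5) -> str:
    """Return the best-scoring label, or "unsorted" if no score reaches min_score."""
    best_label, best_score = max(
        zip(result["labels"], result["scores"]), key=lambda pair: pair[1]
    )
    return best_label if best_score >= min_score else "unsorted"


def sort_messages(
    classified: List[Tuple[str, Dict[str, list]]], min_score: float = 0.5
) -> Dict[str, List[str]]:
    """Group message texts into folders named after their best label."""
    folders = defaultdict(list)
    for text, result in classified:
        folders[top_label(result, min_score)].append(text)
    return dict(folders)


# Made-up messages with made-up classification results (assumed result shape).
labels = ["advertising", "customer request"]
inbox = [
    ("50% off all plans this week!", {"labels": labels, "scores": [0.93, 0.04]}),
    ("My invoice is wrong, please help.", {"labels": labels, "scores": [0.02, 0.88]}),
    ("Hello", {"labels": labels, "scores": [0.20, 0.30]}),
]
folders = sort_messages(inbox)
```

Messages whose best score stays below the threshold land in a separate folder instead of being forced into a category, which leaves room for manual review.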
Let's say you are looking for companies in the automotive field. You could scan websites and only keep those that have the "automotive" label applied.
You might want to monitor new content from various sources and categorize it accordingly. Text classification is the right way to do so.
import nlpcloud

# The second argument is your API token (left empty here).
client = nlpcloud.Client("bart-large-mnli-yahoo-answers", "", gpu=False, lang="en")

# Arguments: the text to classify, the candidate labels, and whether multiple labels may apply.
client.classification(
    """Both Canada and hosts China enjoyed double gold celebrations at the Zhangjiakou National Biathlon Centre as day five of the 2022 Winter Paralympics saw six gold medals decided in the cross-country sprinting. Brian McKeever held off Jake Adicoff of the US to win the men’s sprint free visually impaired category by 0.8 seconds, but the closeness of the time doesn’t convey the ease with which the Canadian had slowed up and still claimed his 15th career Winter Paralympics gold medal. McKeever began losing his sight age 19 due to Stargardt disease and plans to retire from competition after these Games.""",
    ["space", "sport", "business", "journalism", "politics"],
    True,
)
The gpu option controls whether the model runs on a GPU. Machine learning models run much faster on GPUs.
The multi-class option sets whether multiple labels may apply to your text, in which case the model calculates an independent score for each label. It defaults to true and is ignored if you're using Fast GPT-J.
The labels option is the list of labels you want to use to classify your text. It is optional if you're using Fast GPT-J.
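To make "an independent score for each label" more concrete, here is a minimal sketch of one common way to turn raw per-label scores into final scores. It is not a description of how the hosted models compute their output: the raw scores are made up, and the function names and the multi_class argument are hypothetical. With independent scoring, each label is squashed on its own (sigmoid), so several labels can score high at once. Without it, labels compete for a total of 1 (softmax).

```python
import math
from typing import Dict, List


def softmax(values: List[float]) -> List[float]:
    """Convert raw scores into competing scores that sum to 1."""
    highest = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value: float) -> float:
    """Convert one raw score into a score between 0 and 1, ignoring other labels."""
    return 1.0 / (1.0 + math.exp(-value))


def score_labels(raw_scores: Dict[str, float], multi_class: bool) -> Dict[str, float]:
    """Score labels independently (multi_class=True) or competitively (False)."""
    labels = list(raw_scores)
    values = [raw_scores[label] for label in labels]
    if multi_class:
        scores = [sigmoid(v) for v in values]
    else:
        scores = softmax(values)
    return dict(zip(labels, scores))


# Made-up raw scores for the Perseverance example text.
raw = {"space": 2.0, "science": 1.5, "food": -3.0}
independent = score_labels(raw, multi_class=True)
exclusive = score_labels(raw, multi_class=False)
```

In this sketch both space and science end up above 0.5 when scored independently, whereas the competitive scores must share a total of 1.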
Most NLP models have a critical weakness: they don't work well with non-English languages.
We do our best to add non-English models when possible. See for example XLM Roberta Large XNLI, TF Allociné, German Sentiment Bert... Unfortunately, few such models are available, so it's not possible to cover all the NLP use cases with that method.
In order to solve this challenge, we developed a multilingual AI that automatically translates your input into English, performs the actual NLP operation, and then translates the result back to your original language. It makes your requests a bit slower but returns impressive results.
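The translate, process, translate-back flow described above can be sketched as follows. This is a toy illustration under stated assumptions, not the actual implementation: classify_multilingual and its parameters are hypothetical names, and the translator and classifier are stand-in functions backed by a tiny lookup table so that the example runs on its own.

```python
from typing import Callable, Dict, List

Translate = Callable[[str, str, str], str]          # (text, source, target) -> text
Classify = Callable[[str, List[str]], Dict[str, float]]  # (text, labels) -> scores


def classify_multilingual(
    text: str, labels: List[str], lang: str,
    translate: Translate, classify_english: Classify,
) -> Dict[str, float]:
    """Translate input to English, classify, then translate labels back."""
    english_text = translate(text, lang, "en")
    english_labels = [translate(label, lang, "en") for label in labels]
    english_result = classify_english(english_text, english_labels)
    return {
        translate(label, "en", lang): score
        for label, score in english_result.items()
    }


# Stand-in translator: a tiny French/English lookup table (hypothetical).
_FR_EN = {
    "espace": "space",
    "cuisine": "food",
    "La fusée a atteint l'orbite.": "The rocket reached orbit.",
}
_EN_FR = {english: french for french, english in _FR_EN.items()}


def fake_translate(text: str, source: str, target: str) -> str:
    table = _FR_EN if (source, target) == ("fr", "en") else _EN_FR
    return table.get(text, text)


# Stand-in English classifier: a keyword rule instead of a real model.
def fake_classify(text: str, labels: List[str]) -> Dict[str, float]:
    is_space = "rocket" in text.lower()
    return {label: 0.9 if (label == "space" and is_space) else 0.1 for label in labels}


result = classify_multilingual(
    "La fusée a atteint l'orbite.", ["espace", "cuisine"], "fr",
    fake_translate, fake_classify,
)
```

Passing the translator and classifier as arguments keeps the three stages separate, which also shows where the extra latency comes from: each request involves two translation steps around the English model.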
Simply select your language in the list, and from now on you can write the input text in your own language!
This multilingual add-on is a paid feature. Please contact the support team so they can upgrade your plan.