Keyword and Keyphrase Extraction
What is Keyword/Keyphrase Extraction and Why Use GPT?
Keyword extraction means identifying one or several important words in a piece of text. These words should capture the core ideas of the text.
For example, imagine you have the following content:
Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
The important keywords in this example could be information, resources, search.
If single keywords are too simple, you might want to extract keyphrases instead: combinations of several words. For example, in the content above, important keyphrases could be information retrieval, relevant resources, search query, search engine.
Performing keyword and keyphrase extraction is harder than it sounds. It takes an advanced AI model to understand the core ideas from a piece of text.
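To see why this is harder than it sounds, here is a minimal sketch of a naive baseline that simply counts word frequencies in the example content above. The stopword list and the naive_keywords function are illustrative assumptions made for this example; they are not part of any library and are not how GPT models work.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"is", "the", "of", "to", "for", "a", "on", "can", "be", "an", "that"}

TEXT = (
    "Information Retrieval (IR) is the process of obtaining resources "
    "relevant to the information need. For instance, a search query on a "
    "web search engine can be an information need. The search engine can "
    "return web pages that represent relevant resources."
)


def naive_keywords(text, top_n=2):
    """Return the top_n most frequent non-stopword words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


print(naive_keywords(TEXT))
```

Counting does surface frequent single words such as information and search, but it has no notion of multi-word keyphrases like information retrieval, and it cannot recognize an important idea that is mentioned only once. That is the gap a language model is meant to fill.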
GPT-J and GPT-NeoX are the most advanced open-source NLP models as of this writing, and they are the best GPT-3 alternatives. These models are so big that they can adapt to many situations, and their output can read as if a human had written it. For advanced use cases, it is possible to fine-tune GPT (train it with your own data), which is a great way to perform keyword and keyphrase extraction that is tailored to your use case or industry.
Why Use Keyword/Keyphrase Extraction?
Keyword and keyphrase extraction is a great way to quickly get a good grasp of a piece of text, and potentially categorize the text for later use. Here are a few examples:
Social Media Analysis
A huge number of ideas are posted on social media, and you might want to understand the main themes behind this chaos. Keyword/keyphrase extraction lets you do this quickly.
Customer Feedback Analysis
Asking for customer feedback is good practice, but it takes a lot of time to properly analyze the results. Keyword/keyphrase extraction makes this kind of qualitative analysis much easier.
Competitor Brand Monitoring
Do you want to monitor your competitors' brands? You can do it by retrieving their content and extracting the most important ideas.
Search Engine Positioning
Finding the right keywords for your positioning can be tricky. One strategy is to analyze your competitors' websites and understand which keywords they are positioning on.
import nlpcloud

# Second argument: your API token (left empty here).
client = nlpcloud.Client("finetuned-gpt-neox-20b", "", gpu=True, lang="en")

# Extract keywords and keyphrases from a piece of text.
client.kw_kp_extraction("""One month after the United States began what has become a troubled rollout of a national COVID vaccination campaign, the effort is finally gathering real steam.""")
The gpu parameter controls whether the model runs on a GPU. Machine learning models generally run much faster on GPUs.
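In practice you will usually want to do something with the result. The sketch below wraps the call shown above and counts how many texts mention each keyword or keyphrase, for example across a batch of customer feedback. It assumes the response is a dictionary containing a keywords_and_keyphrases list, which should be checked against the current API documentation. The helper names extract_keyphrases and count_keyphrases are hypothetical.

```python
from collections import Counter


def extract_keyphrases(client, text):
    """Call the API and return the extracted keywords/keyphrases as a list.

    Assumption: the response is a dict with a "keywords_and_keyphrases" key.
    """
    response = client.kw_kp_extraction(text)
    return response.get("keywords_and_keyphrases", [])


def count_keyphrases(client, texts):
    """Count how many texts mention each keyword/keyphrase (case-insensitive)."""
    counts = Counter()
    for text in texts:
        # Use a set so that a phrase is counted at most once per text,
        # and skip empty strings.
        phrases = {
            p.strip().lower()
            for p in extract_keyphrases(client, text)
            if p.strip()
        }
        counts.update(phrases)
    return counts
```

With the client created above and a list of feedback texts, calling count_keyphrases(client, texts).most_common(10) would list the ten most widely mentioned ideas.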
NLP has a critical weakness: most models are trained mainly on English text, so they tend to work less well with other languages.
We do our best to add non-English models whenever possible. See for example XLM Roberta Large XNLI, TF Allociné, German Sentiment Bert... Unfortunately, few such models are available, so it's not possible to cover all the NLP use cases with that method.
In order to solve this challenge, we developed a multilingual AI that automatically translates your input into English, performs the actual NLP operation, and then translates the result back to your original language. It makes your requests a bit slower but returns impressive results.
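The flow described above can be sketched as follows. This is only an illustration of the idea, not the actual implementation: translate and operation are placeholder callables supplied by the caller, and run_multilingual is a hypothetical name.

```python
def run_multilingual(text, lang, translate, operation):
    """Translate to English, run an English-only NLP operation, translate back.

    translate(text, source, target) and operation(text) are caller-supplied
    placeholders; they stand in for a real translation model and NLP model.
    """
    if lang == "en":
        # English input needs no translation round trip.
        return operation(text)
    english_text = translate(text, source=lang, target="en")
    english_result = operation(english_text)
    return translate(english_result, source="en", target=lang)
```

The two extra translation steps are what make multilingual requests somewhat slower than English ones.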
Simply select your language in the list, and from now on you can write the input text in your own language!
This multilingual add-on is a paid feature. Please contact the support team so they can upgrade your plan.