Question Answering API

What is Question Answering?

Question answering means letting an AI model answer a question automatically. Optionally, you can provide a context and have the model use it to answer the question.

For example, imagine you want to ask the following question:

How to bake some bread?

The AI might answer something like this:

1. Grease a clean, dry bread pan with butter.
2. Stir 1 envelope dry active yeast to 1/4 cup warm water in a large bowl. Let the yeast bloom for 10 minutes, or until dissolved, then add 1 teaspoon salt, 1 teaspoon honey, and 1/2 cup unsalted butter.
3. Add 3 1/2 cups strong flour and mix well, then wait to process your dough for 3 minutes.
4. Add an ounce of water and mix well. If the water was not already warm enough to make a dough (i.e., it was not warm to the touch) then add a warm to hot bowl of water to the machine. Stir continuously for 3 minutes.
5. Turn your dough out onto a clean sheet of plastic wrap and fold over.
6. Cover with another piece of plastic wrap and place in a warm corner of your kitchen.

Now maybe you have specific data of your own that you want to give the AI (also known as a "context") so that you can ask a question about it:

All NLP Cloud plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.

You might want to ask the following question:

When can plans be stopped?

And the answer would be:

Anytime

Why Use Question Answering?

Question answering is useful in many real-world situations. Here are a couple of examples.

Contract Questions

Chatbots are used more and more every day, both to answer customer questions and to answer questions from internal collaborators. Imagine that a customer asks a legal question about their contract. You could use a question answering model for that and pass the contract as the context.

Product Questions

Here's another chatbot-related example. Imagine that a collaborator has a technical question about a product. Why not provide them with a natural language interface and make their life easier?

Question Answering with Hugging Face Transformers and GPT-J

Hugging Face Transformers is a widely used open-source library. It runs on either PyTorch or TensorFlow, depending on the model you're using. Transformer models have clearly helped deep learning Natural Language Processing make great progress in terms of accuracy. However, this accuracy improvement comes at a cost: transformer models are extremely demanding in terms of resources.

Hugging Face is also a central repository that gathers the newest open-source transformer-based Natural Language Processing models. One of them, Deepset's Roberta Base Squad 2, is well suited for question answering in many languages with a context. For question answering without a context, a generative model such as GPT-J is a better fit.
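As a rough illustration of question answering with a context, here is a minimal Python sketch built on the Transformers `pipeline` helper. It assumes the `transformers` package and a backend such as PyTorch are installed, and that the model is available on the Hugging Face hub as `deepset/roberta-base-squad2`. The function names `answer_with_context` and `extract_span` are hypothetical and only serve this example.

```python
def answer_with_context(question, context, model_name="deepset/roberta-base-squad2"):
    """Run extractive question answering on a context.

    The import is done lazily so this module loads even when
    transformers is not installed. The first call downloads the model.
    """
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_name)
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    return qa(question=question, context=context)


def extract_span(context, result):
    """Return the slice of the context that the model pointed at.

    An extractive model does not write new text: its answer is the
    substring of the context between the "start" and "end" offsets.
    """
    return context[result["start"]:result["end"]]


# Example usage (requires transformers and network access):
#
#   context = "All NLP Cloud plans can be stopped anytime."
#   result = answer_with_context("When can plans be stopped?", context)
#   print(result["answer"], extract_span(context, result))
```

Because the answer is a span of the context, this kind of model cannot answer questions whose answer is not written in the context, which is where a generative model comes in.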

Question Answering Inference API

Building an inference API for question answering is a necessary step as soon as you want to use question answering in production. But keep in mind that building such an API is not necessarily easy. First, you need to code the API (the easy part). Then you need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.

Such an API is valuable because it is completely decoupled from the rest of your stack (microservice architecture), so you can scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but it's likely that you want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.
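To make the "easy part" concrete, here is a minimal sketch of a decoupled question answering service using only the Python standard library. It is not production code: the `/question` route, the JSON field names, and the `placeholder_answer` function are assumptions made for this example, and the placeholder stands in for a real model call.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def placeholder_answer(question, context=None):
    # Stand-in for a real model: a production service would call
    # a question answering model here instead.
    if context and "anytime" in context:
        return "anytime"
    return "unknown"


def make_handler(answer_fn):
    """Build a handler class that answers POST /question with answer_fn."""

    class QuestionHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/question":
                self._send(404, {"error": "not found"})
                return
            try:
                length = int(self.headers.get("Content-Length", 0))
                payload = json.loads(self.rfile.read(length) or b"{}")
                question = payload["question"]
            except (ValueError, KeyError, TypeError):
                self._send(400, {"error": "expected JSON with a 'question' field"})
                return
            self._send(200, {"answer": answer_fn(question, payload.get("context"))})

        def _send(self, status, body):
            data = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return QuestionHandler


def build_server(host="127.0.0.1", port=8000, answer_fn=placeholder_answer):
    # One thread per request; real deployments add several replicas
    # behind a load balancer to get redundancy.
    return ThreadingHTTPServer((host, port), make_handler(answer_fn))


# To run the service: build_server().serve_forever()
```

Any language that can send an HTTP POST with a JSON body can then call the service, which is the interoperability benefit described above. The hard part (availability, latency, scaling) is not addressed by this sketch.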

NLP Cloud's Question Answering API

NLP Cloud offers a question answering API that lets you perform question answering out of the box, based on Deepset's Roberta Base Squad 2 and GPT-J, with excellent performance. The response time (latency) is very good for the Roberta model, and the accuracy of GPT-J is very impressive. You can use the pre-trained models, train your own model, or upload your own custom models!

For more details, see our question answering documentation.

Testing question answering locally is one thing, but using it reliably in production is another. With NLP Cloud you can do both!
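As an illustration, a call to a hosted question answering API could look like the Python sketch below. The base URL, the `/question` path, the `Token` authorization scheme, the `roberta-base-squad2` model name, and the JSON field names are assumptions made for this example and should be checked against the NLP Cloud documentation. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; verify it in the official documentation.
API_BASE = "https://api.nlpcloud.io/v1"


def build_question_request(token, question, context=None, model="roberta-base-squad2"):
    """Prepare (but do not send) a POST request for a question."""
    payload = {"question": question}
    if context is not None:
        payload["context"] = context
    return urllib.request.Request(
        f"{API_BASE}/{model}/question",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(token, question, context=None, model="roberta-base-squad2"):
    """Send the request and return the decoded JSON response."""
    request = build_question_request(token, question, context, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a valid API token and network access):
#
#   result = ask(
#       "<your token>",
#       "When can plans be stopped?",
#       context="All NLP Cloud plans can be stopped anytime.",
#   )
#   print(result)
```

Keeping the request construction separate from the network call makes it easy to inspect or log exactly what is sent before contacting the service.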