Question Answering API

What is Question Answering?

Question answering is the task of having an AI model automatically answer a question. Optionally, you can provide a context, and the model will use it to answer the question.

For example, imagine you want to ask the following question:

How to bake some bread?

The AI might answer something like this:

1. Grease a clean, dry bread pan with butter.
2. Stir 1 envelope dry active yeast to 1/4 cup warm water in a large bowl. Let the yeast bloom for 10 minutes, or until dissolved, then add 1 teaspoon salt, 1 teaspoon honey, and 1/2 cup unsalted butter.
3. Add 3 1/2 cups strong flour and mix well, then wait to process your dough for 3 minutes.
4. Add an ounce of water and mix well. If the water was not already warm enough to make a dough (i.e., it was not warm to the touch) then add a warm to hot bowl of water to the machine. Stir continuously for 3 minutes.
5. Turn your dough out onto a clean sheet of plastic wrap and fold over.
6. Cover with another piece of plastic wrap and place in a warm corner of your kitchen.

Now maybe you have specific data (also known as "context") that you want to give the AI so that it can answer a question about it:

All NLP Cloud plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.

You might want to ask the following question:

When can plans be stopped?

And the answer would be:

Anytime

You might also want to answer questions about a large corpus of internal domain knowledge. In that case, see our dedicated article on the subject.

Why Use Question Answering?

Question answering has many practical, real-world uses. Here are a couple of examples.

Contract Questions

Chatbots are used more and more every day, both to answer customer questions and to answer questions from internal collaborators. Imagine that a customer asks a legal question about their contract. You could use a question answering model for that and pass the contract as the context.

Product Questions

Here's another chatbot-related example. Imagine that a collaborator has a technical question about a product. Why not provide them with a natural language interface and make their life easier?

Question Answering with Transformers and Generative Models

The Transformer, introduced by Google in 2017, is the cornerstone architecture of many advanced AI models. Thanks to the Transformer, the accuracy of AI models has improved dramatically. However, this improvement comes at a cost: neural networks based on the Transformer are extremely computation-intensive.

Hugging Face is a central repository hosting many open-source Transformer-based Natural Language Processing models. One of them, Roberta Base Squad 2, is well suited to concise question answering.
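To make the extractive approach concrete, here is a minimal sketch using the `question-answering` pipeline from the Hugging Face `transformers` library. The Hub identifier `deepset/roberta-base-squad2` and the helper names `load_qa_pipeline` and `answer_from_context` are assumptions made for illustration; check the model card before relying on them.

```python
def load_qa_pipeline(model_name="deepset/roberta-base-squad2"):
    """Load an extractive question answering pipeline.

    The default model identifier is an assumption; any extractive QA
    model hosted on the Hugging Face Hub should work the same way.
    """
    # Imported lazily: transformers is a third-party package, and the
    # model weights are downloaded the first time the pipeline is built.
    from transformers import pipeline

    return pipeline("question-answering", model=model_name)


def answer_from_context(qa, question, context):
    """Return the extracted answer and its confidence as (text, score)."""
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    result = qa(question=question, context=context)
    return result["answer"], result["score"]


# Example usage (downloads the model on first run):
#
#   qa = load_qa_pipeline()
#   context = "All NLP Cloud plans can be stopped anytime."
#   print(answer_from_context(qa, "When can plans be stopped?", context))
```

An extractive model like this one copies a span out of the context rather than writing new text, which is why it gives concise answers and always needs a context.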

For more advanced results, it is also possible to perform question answering with generative models like LLaMA 2, Dolphin, and ChatDolphin. These models give great results, even when no context is provided.

Question Answering Inference API

Building an inference API for question answering is a necessary step as soon as you want to use question answering in production. But keep in mind that building such an API is not necessarily easy. First, you need to code the API (the easy part). Second, you need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.
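As an illustration of the "easy part", here is a minimal sketch of a question answering endpoint built only with the Python standard library. The `/question` path, the JSON field names, and the `answer_question` stub are hypothetical choices for this example; a real service would call a loaded model there and add authentication, batching, and monitoring.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def answer_question(question, context=None):
    """Hypothetical stub: replace the body with a real model call."""
    return {"answer": f"(no model loaded) received question: {question}"}


def handle_question_request(raw_body):
    """Validate a JSON request body and return (HTTP status, response dict)."""
    try:
        payload = json.loads(raw_body)
        question = payload["question"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'question' field"}
    return 200, answer_question(question, payload.get("context"))


class QuestionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/question":
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length") or 0)
        status, body = handle_question_request(self.rfile.read(length))
        self._send(status, body)

    def _send(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(host="127.0.0.1", port=8000):
    """Create the HTTP server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer((host, port), QuestionHandler)


# To run locally:
#
#   make_server().serve_forever()
```

Keeping the request validation in a plain function, separate from the HTTP handler, makes the logic easy to test without opening a socket. Nothing here addresses the hard part: model loading, GPU scheduling, and redundancy are all outside this sketch.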

Using such an API is attractive because it is completely decoupled from the rest of your stack (microservice architecture), so you can scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but it's likely that you want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.

NLP Cloud's Question Answering API

NLP Cloud offers a question answering API that lets you perform question answering out of the box, based on Deepset's Roberta Base Squad 2, LLaMA 2, Dolphin, and ChatDolphin, with excellent performance. The response time (latency) is very good for the Roberta model, and the accuracy of generative models on this task is very impressive. You can use the pre-trained models, train your own model, or upload your own custom models!

For more details, see our question answering documentation. For advanced usage, see the text generation API endpoint. You can also easily test question answering on our playground.
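As a rough idea of what calling the question answering API over HTTP could look like, here is a sketch using only the Python standard library. The base URL, the `/question` path, the `Token` authorization scheme, the JSON field names, and the model slug `roberta-base-squad2` are all assumptions made for this example; refer to the official documentation for the exact request format.

```python
import json
import urllib.request

# Assumed base URL; check the official documentation.
API_BASE = "https://api.nlpcloud.io/v1"


def build_question_request(token, question, context=None,
                           model="roberta-base-squad2"):
    """Build (but do not send) a POST request for a question answering call."""
    payload = {"question": question}
    if context is not None:
        # The context is optional, for example with generative models.
        payload["context"] = context
    return urllib.request.Request(
        url=f"{API_BASE}/{model}/question",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(token, question, context=None, model="roberta-base-squad2"):
    """Send the request and return the decoded JSON response."""
    request = build_question_request(token, question, context, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a valid API token and network access):
#
#   result = ask(
#       "<your token>",
#       "When can plans be stopped?",
#       "All NLP Cloud plans can be stopped anytime.",
#   )
#   print(result)
```

Because the call is plain HTTP with a JSON body, the same request can be made from JavaScript, Go, Ruby, or any other language with an HTTP client.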

Testing question answering locally is one thing, but using it reliably in production is another. With NLP Cloud, you can do both!