Summarization API

What is Summarization?

Text summarization is the process of condensing a block of text into a shorter version that keeps its main points.

Let's say you have the following block of text:

The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.

This technical description is quite long, and not all of these details may be necessary for a general reader to grasp the main idea. So we now want to use machine learning to summarize this piece of text automatically.

A summarization model would return something like this:

The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world.

Interesting, isn't it? As you can see, the general idea is still there, but many details were stripped. The summary is about a third of the original text's size!
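
The size reduction can be checked with a quick word count. The following is a minimal sketch: `compression_ratio` is a hypothetical helper name, and counting whitespace-separated words is only one possible way to measure the size of a text (characters or model tokens would give slightly different figures).

```python
def compression_ratio(original: str, summary: str) -> float:
    """Return the summary's word count divided by the original's word count."""
    original_words = len(original.split())
    if original_words == 0:
        raise ValueError("original text must contain at least one word")
    return len(summary.split()) / original_words


# The original text and the summary from the example above.
original = (
    "The tower is 324 metres (1,063 ft) tall, about the same height as an "
    "81-storey building, and the tallest structure in Paris. Its base is "
    "square, measuring 125 metres (410 ft) on each side. During its "
    "construction, the Eiffel Tower surpassed the Washington Monument to "
    "become the tallest man-made structure in the world, a title it held for "
    "41 years until the Chrysler Building in New York City was finished in "
    "1930. It was the first structure to reach a height of 300 metres. Due to "
    "the addition of a broadcasting aerial at the top of the tower in 1957, it "
    "is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding "
    "transmitters, the Eiffel Tower is the second tallest free-standing "
    "structure in France after the Millau Viaduct."
)
summary = (
    "The tower is 324 metres (1,063 ft) tall, about the same height as an "
    "81-storey building. Its base is square, measuring 125 metres (410 ft) on "
    "each side. During its construction, the Eiffel Tower surpassed the "
    "Washington Monument to become the tallest man-made structure in the world."
)

ratio = compression_ratio(original, summary)
print(f"The summary keeps {ratio:.0%} of the original word count.")
```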

There are several types of summarization. For example, "headline generation" is about generating a very short sentence, well suited to a blog or news title.

Why Use Summarization?

Text summarization is useful in many situations. Here are a few examples.

News Review

Some jobs require a huge amount of time dedicated to reading the news. This is especially true in marketing and sales departments. Giving analysts summarized content can save them a lot of time and energy.

Content Creation

If your company creates a lot of content on a regular basis, it is very likely that every new article has to be summarized in order to produce a headline and be pushed to social networks. Why not automate this?

Legal Documents Parsing

Reading a lot of legal documents every day is long and exhausting. Sometimes, reading all the details is not vital. In that case, providing people with a summary in addition to the original text can be a great productivity booster.

Reports Generation

Writing reports for your customers, your management, or your colleagues is sometimes compulsory. Summarization can definitely lighten this task.

Summarization with Hugging Face Transformers and GPT-J

Hugging Face Transformers is an amazing open-source library. It runs on either PyTorch or TensorFlow, depending on the model you're using. Transformers have clearly helped deep learning Natural Language Processing make great progress in terms of accuracy. However, this accuracy improvement comes at a cost: transformers are extremely demanding in terms of resources.

Hugging Face is a central repository gathering the newest open-source transformer-based Natural Language Processing models. One of them, Facebook's Bart Large CNN, is well suited to text summarization. Another one, Michau's T5 Base EN Generate Headline, is great for headline generation (a one-sentence summary) in English. For even more advanced results, you can also perform summarization with GPT-J or GPT-NeoX 20B.
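
As an illustration, summarizing a text locally with Bart Large CNN might look like the sketch below. It assumes the `transformers` package and a backend such as PyTorch are installed, and that the model is published on the Hugging Face Hub as `facebook/bart-large-cnn`. The `load_summarizer` and `summarize` helpers and the length limits are illustrative choices, not values taken from this article.

```python
from typing import Callable, Optional

# Assumed Hugging Face Hub identifier for Facebook's Bart Large CNN.
MODEL_NAME = "facebook/bart-large-cnn"


def load_summarizer(model_name: str = MODEL_NAME) -> Callable:
    """Load a summarization pipeline (the model is downloaded on first use)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=model_name)


def summarize(
    text: str,
    summarizer: Optional[Callable] = None,
    max_length: int = 130,  # illustrative upper bound, in model tokens
    min_length: int = 30,   # illustrative lower bound, in model tokens
) -> str:
    """Return the summary of `text` produced by a summarization pipeline."""
    if summarizer is None:
        summarizer = load_summarizer()
    # The pipeline returns a list containing one dict per input text.
    result = summarizer(
        text, max_length=max_length, min_length=min_length, do_sample=False
    )
    return result[0]["summary_text"].strip()
```

Calling `summarize(article_text)` loads the model and returns the summary as a string. Loading the pipeline once with `load_summarizer()` and passing it as `summarizer` avoids reloading the model on every call, which matters given how resource-hungry these models are.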

Summarization Inference API

Building an inference API for summarization is a necessary step as soon as you want to use summarization in production. But keep in mind that building such an API is not necessarily easy. First, you need to code the API (the easy part), but you also need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). Machine learning models consume a lot of resources (memory, disk space, CPU, GPU...), which makes it hard to achieve high availability and low latency at the same time.

Using such an API is very interesting because it is completely decoupled from the rest of your stack (microservice architecture), so you can easily scale it independently and ensure high availability of your models through redundancy. An API is also the way to go in terms of language interoperability. Most machine learning frameworks are developed in Python, but it's likely that you want to access them from other languages like JavaScript, Go, Ruby... In such a situation, an API is a great solution.
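
To make the "easy part" concrete, here is a minimal sketch of a summarization endpoint built with Python's standard library only. Everything in it is an assumption made for illustration: the `/summarization` path, the `text` and `summary_text` JSON fields, and above all the placeholder `summarize` function, which simply keeps the first two sentences where a real service would call a model. It says nothing about the hard part (availability, scaling, latency).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def summarize(text: str) -> str:
    """Placeholder for a real model: keep the first two sentences."""
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    summary = ". ".join(sentences[:2])
    return summary if summary.endswith(".") else summary + "."


def handle_payload(body: bytes) -> Tuple[int, dict]:
    """Validate a JSON request body and return (HTTP status, response dict)."""
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400, {"error": "body must be valid JSON"}
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return 400, {"error": "'text' must be a non-empty string"}
    return 200, {"summary_text": summarize(text)}


class SummarizationHandler(BaseHTTPRequestHandler):
    """Serve POST /summarization (hypothetical path) with a JSON body."""

    def do_POST(self):
        if self.path != "/summarization":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_payload(self.rfile.read(length))
        data = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), SummarizationHandler).serve_forever()
```

Calling `serve()` starts the service. Because clients only exchange JSON over HTTP, they can be written in any language, which is the interoperability benefit described above. Keeping the request validation in `handle_payload`, separate from the HTTP handler, also makes it possible to test the logic without opening a socket.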

NLP Cloud's Summarization API

NLP Cloud offers a text summarization API that lets you perform summarization out of the box with good accuracy, based on Facebook's Bart Large CNN model, GPT-J, GPT-NeoX 20B, and Michau's T5 Base EN Generate Headline model from Hugging Face. Due to the complex computations needed for such a task, the response time (latency) is pretty high though. You can use the pre-trained models, train your own models, or upload your own custom models!
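
A call to such an API could be sketched as follows, using only the Python standard library. The base URL, the `bart-large-cnn` model name in the path, the `Token` authorization scheme, and the `text` and `summary_text` JSON fields are all assumptions here and should be checked against the NLP Cloud documentation. `build_request` and `summarize` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: base URL and path layout; verify against the NLP Cloud documentation.
API_BASE = "https://api.nlpcloud.io/v1"


def build_request(
    text: str, token: str, model: str = "bart-large-cnn"
) -> urllib.request.Request:
    """Build (without sending) a POST request to the summarization endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{model}/summarization",
        data=body,
        headers={
            "Authorization": f"Token {token}",  # assumed authorization scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize(
    text: str, token: str, model: str = "bart-large-cnn", timeout: float = 60.0
) -> str:
    """Send the request and return the summary from the JSON response."""
    request = build_request(text, token, model)
    # A generous timeout, since summarization latency can be high.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["summary_text"]  # assumed response field
```

With a valid API token, `summarize(article_text, token)` would return the summary as a string.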

For more details, see our documentation about text summarization.

Testing text summarization locally is one thing, but using it reliably in production is another. With NLP Cloud you can do both!