How To Summarize Text With Python And Machine Learning

Summarization is a very common task that many developers would like to automate. For example, wouldn't it be nice to automatically create a summary of each blog article you write? Or to automatically summarize documents for your employees? Many good applications exist.

Transformer-based models like Bart Large CNN make it easy to summarize text in Python. These machine learning models are easy to use but hard to scale. Let's see how to use Bart Large CNN and how to optimize its performance.

Transformers And Bart Large CNN

Transformers is a Python library from Hugging Face that has made advanced natural language processing tasks such as text summarization practical.

Before Transformers and neural networks, a few options were available, but none of them were really satisfactory.

In recent years, many good pre-trained natural language processing models based on Transformers have been created for various use cases. Bart Large CNN was released by Facebook and gives excellent results for text summarization.

Here is how to use Bart Large CNN in your Python code.

Summarizing Text in Python

The simplest way to use Bart Large CNN is to download it from the Hugging Face repository, and use the text summarization pipeline from the Transformers library:

from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = """New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York.
A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband.
Only 18 days after that marriage, she got hitched yet again. Then, Barrientos declared "I do" five more times, sometimes only within two weeks of each other.
In 2010, she married once more, this time in the Bronx. In an application for a marriage license, she stated it was her "first and only" marriage.
Barrientos, now 39, is facing two criminal counts of "offering a false instrument for filing in the first degree," referring to her false statements on the
2010 marriage license application, according to court documents.
Prosecutors said the marriages were part of an immigration scam.
On Friday, she pleaded not guilty at State Supreme Court in the Bronx, according to her attorney, Christopher Wright, who declined to comment further.
After leaving court, Barrientos was arrested and charged with theft of service and criminal trespass for allegedly sneaking into the New York subway through an emergency exit, said Detective
Annette Markowski, a police spokeswoman. In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002.
All occurred either in Westchester County, Long Island, New Jersey or the Bronx. She is believed to still be married to four men, and at one time, she was married to eight men at once, prosecutors say.
Prosecutors said the immigration scam involved some of her husbands, who filed for permanent residence status shortly after the marriages.
Any divorces happened only after such filings were approved. It was unclear whether any of the men will be prosecuted.
The case was referred to the Bronx District Attorney\'s Office by Immigration and Customs Enforcement and the Department of Homeland Security\'s
Investigation Division. Seven of the men are from so-called "red-flagged" countries, including Egypt, Turkey, Georgia, Pakistan and Mali.
Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation by the Joint Terrorism Task Force.
If convicted, Barrientos faces up to four years in prison.  Her next court appearance is scheduled for May 18."""

summary = summarizer(article, max_length=130, min_length=30)


Liana Barrientos, 39, is charged with two counts of "offering a false instrument for filing in the first degree" In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men.

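As you can see, this is only 4 lines of Python code, and the quality of the summary is very good! The pipeline returns a list containing one dictionary, and the summary itself is stored under the summary_text key. You might have noticed that the model is big, so it takes a while to download the first time you run the code.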

The min_length and max_length parameters set the minimum and maximum sizes of your summary. They are expressed in tokens, not words. A token can be a whole word, but also a punctuation mark or a subword. As a rule of thumb, 100 tokens are roughly equal to 75 words.
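To see how many tokens a text really contains, one option is to load the tokenizer that ships with the model. The sketch below assumes the AutoTokenizer class from the Transformers library. The helper names load_tokenizer, count_tokens and estimate_tokens are made up for this example, and estimate_tokens only applies the rough "100 tokens for 75 words" rule, so treat its result as an approximation.

```python
def load_tokenizer(model_name: str = "facebook/bart-large-cnn"):
    """Download (on first call) and return the tokenizer matching the model."""
    from transformers import AutoTokenizer  # imported lazily, only needed here
    return AutoTokenizer.from_pretrained(model_name)


def count_tokens(text: str, tokenizer) -> int:
    """Number of tokens the model sees, including the special start/end tokens."""
    return len(tokenizer(text)["input_ids"])


def estimate_tokens(text: str) -> int:
    """Rough estimate from the word count: 100 tokens for every 75 words."""
    return round(len(text.split()) * 100 / 75)


# Example (downloads the tokenizer files the first time):
# tokenizer = load_tokenizer()
# print(count_tokens("Summarization is a very common task.", tokenizer))
```

The exact count from the tokenizer is the one to trust when you are close to a limit; the estimate is only handy when you do not want to load the tokenizer.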

Important note: your input text cannot be longer than 1024 tokens (roughly 800 words), as this is an internal limitation of this model. If you want to summarize longer pieces of text, a good strategy is to summarize several parts of the text independently and then reassemble the results. You can even perform summaries of summaries!
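The chunking strategy described above could be sketched as follows. This is only one possible approach: the function names split_into_chunks and summarize_long are hypothetical, the sentence splitting is a naive regular expression, and the 600-word budget is an assumption chosen to stay safely under the limit of roughly 800 words. The summarize argument is any function that takes a string and returns a shorter string.

```python
import re
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 600) -> List[str]:
    """Group whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept as its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            # The current chunk is full: close it and start a new one.
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long(text: str, summarize: Callable[[str], str],
                   max_words: int = 600) -> str:
    """Summarize each chunk independently, then join the partial summaries.

    If the joined result is still too long (and did get shorter), it is
    summarized again: a summary of summaries.
    """
    chunks = split_into_chunks(text, max_words)
    combined = " ".join(summarize(chunk) for chunk in chunks)
    still_too_long = len(combined.split()) > max_words
    got_shorter = len(combined.split()) < len(text.split())
    if len(chunks) > 1 and still_too_long and got_shorter:
        return summarize_long(combined, summarize, max_words)
    return combined
```

With the summarizer pipeline created earlier, the summarize argument could be lambda chunk: summarizer(chunk, max_length=130, min_length=30)[0]["summary_text"]. Splitting on a word budget is an approximation; counting real tokens with the model's tokenizer would be more precise.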

Performance Considerations

There are two main problems with the Bart Large CNN model, though.

First, like many deep learning models, it requires a significant amount of disk space and RAM (around 1.5GB!). And this is still a small deep learning model compared to huge ones like GPT-3, GPT-J, T5 11B, etc.

Even more importantly, it is quite slow. Under the hood, this model performs text generation, which is inherently slow because the output is produced one token at a time. Summarizing an 800-word piece of text takes around 20 seconds on a good CPU.
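Timings depend heavily on your hardware, so it is worth measuring on your own machine. Here is a minimal sketch of a timing helper. The name timed is made up for this example, and it works with any callable, not only with the summarizer.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Example with the summarizer pipeline created earlier:
# summary, seconds = timed(summarizer, article, max_length=130, min_length=30)
# print(f"Summarized in {seconds:.1f} s")
```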

The solution is to deploy Bart Large CNN on a GPU. For example, on an NVIDIA Tesla T4, you can expect a 10x speedup, and your 800-word piece of text will be summarized in around 2 seconds.
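Running the pipeline on a GPU mostly comes down to passing a device argument. The sketch below assumes PyTorch is the backend and that a CUDA-capable GPU may or may not be present. The function names pick_device and build_summarizer are hypothetical; the device argument of pipeline takes -1 for the CPU or the index of a GPU.

```python
def pick_device() -> int:
    """Return 0 (first CUDA GPU) if PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_summarizer(model_name: str = "facebook/bart-large-cnn"):
    """Create the summarization pipeline on the GPU when one is available."""
    from transformers import pipeline  # imported lazily: loading is expensive
    return pipeline("summarization", model=model_name, device=pick_device())


# Example:
# summarizer = build_summarizer()
# summary = summarizer(article, max_length=130, min_length=30)
```

Falling back to the CPU keeps the same script usable on a development machine without a GPU, at the cost of the slower timings mentioned above.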

GPUs are of course very expensive, so it's up to you to do the math and decide if the investment is worth it!

Leveraging an External API for Production

Text summarization with Bart Large CNN is very easy to use in a simple script, but what if you want to use it in production for a large volume of requests?

As mentioned above, a first solution would be to provision your own hardware with a GPU and work on some production optimizations to make summarization faster.

A second solution would be to delegate this task to a dedicated service like NLP Cloud that will serve the Bart Large CNN model for you through an API. Test our summarization API endpoint here!


In 2022, it is possible to perform advanced text summarization in Python with very little effort, thanks to Transformers and Bart Large CNN.

Text summarization is a very useful task that more and more companies now automate in their applications. As you can see, the complexity comes from the performance side. Techniques exist to speed up text summarization with Bart Large CNN, but this will be a topic for another article!

I hope this article will help you save time for your next project! Feel free to try text summarization on NLP Cloud!

Full-stack engineer at NLP Cloud