An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset

At NLP Cloud we wanted to see if GPT-J could be fine-tuned as an instruct model in order to properly understand human instructions in natural language, without using few-shot learning. Other attempts have given interesting results in the open-source community, like Flan-T5, BloomZ, or Stanford Alpaca, so we wanted to do the same on GPT-J using Alpaca's dataset. Results are very good!


Few-Shot Learning vs. Natural Language Instructions

By default, generative AI models are not good at understanding human requests.

In order to have these text generation models understand what you want, the best solution is to use few-shot learning. We wrote a dedicated guide about few-shot learning; you can find it here. GPT-J is a good example of a very capable model that only works correctly with few-shot learning.

Even though building these examples usually does not take much time, it is still confusing for newcomers who want to use these AI models correctly. It is much easier to ask for things naturally, as you would with a human.

For example, let's say that you want to correct spelling mistakes with GPT-J. Here is an example of the prompt you would have to use:

I love goin to the beach.
Correction: I love going to the beach.
###
Let me hav it!
Correction: Let me have it!
###
It have too many drawbacks.
Correction: It has too many drawbacks.
###
I do not wan to go
Correction:
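
For reference, here is a minimal sketch of how you could send such a few-shot prompt to the base GPT-J model with Hugging Face Transformers. It assumes the public EleutherAI/gpt-j-6B checkpoint and a GPU with enough VRAM; the generation settings are only illustrative.

from transformers import pipeline
import torch

# Load the base GPT-J model in fp16 on the first GPU.
generator = pipeline(model="EleutherAI/gpt-j-6B", torch_dtype=torch.float16, device=0)

# The few-shot prompt: a handful of solved examples, then the new input.
few_shot_prompt = (
    "I love goin to the beach.\n"
    "Correction: I love going to the beach.\n"
    "###\n"
    "Let me hav it!\n"
    "Correction: Let me have it!\n"
    "###\n"
    "It have too many drawbacks.\n"
    "Correction: It has too many drawbacks.\n"
    "###\n"
    "I do not wan to go\n"
    "Correction:"
)

# Keep the completion short so the model does not ramble past the correction.
print(generator(few_shot_prompt, max_new_tokens=20))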

As you can see, this prompt format is not complex, but it is not straightforward either. If you properly fine-tune GPT-J, it can be turned into an "instruct" model, which means that you can simply ask the following:

Correct spelling and grammar from the following text.
I do not wan to go

And it would return the following:

I do not want to go.

Much better, isn't it? So how can we achieve this?

Stanford Alpaca

Stanford Alpaca was released a couple of days ago. It is a fine-tuned version of the Llama model developed by Meta. You can learn more about this project here.

Basically, the Stanford Alpaca team managed to come up with a state-of-the-art instruct model by fine-tuning Llama on a fairly small dataset (52k examples) of instructions and responses. The interesting thing is that they generated this dataset programmatically using a larger language model (GPT-3). You can download the dataset here.
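
If you are curious about the dataset's structure, here is a minimal sketch of how you could load it and turn one record into a fine-tuning example. It assumes the standard alpaca_data.json layout (a JSON list of records with "instruction", "input", and "output" fields); the prompt template is purely illustrative and not necessarily the one we used.

import json

# Load the Alpaca dataset (assumed layout: a JSON list of records with
# "instruction", "input", and "output" fields).
with open("alpaca_data.json") as f:
    records = json.load(f)

def build_training_example(record):
    # Put the optional input right after the instruction, followed by the
    # expected output. This template is illustrative only.
    text = record["instruction"].strip() + "\n"
    if record.get("input"):
        text += record["input"].strip() + "\n"
    return text + record["output"].strip()

print(len(records))  # roughly 52k examples
print(build_training_example(records[0]))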

At NLP Cloud we tried to fine-tune GPT-J using this dataset, and we got surprisingly good results!

Instruct GPT-J

The new Instruct GPT-J model we created is now on the Hugging Face Hub so you can easily use it: click here to see the model.

Here is how you can use the model, using Hugging Face Transformers:

from transformers import pipeline
import torch

# Load the instruct model in fp16 on the first GPU (device=0).
generator = pipeline(model="nlpcloud/instruct-gpt-j-fp16", torch_dtype=torch.float16, device=0)

# The instruction ends with a new line (see the note below).
prompt = "Correct spelling and grammar from the following text.\nI do not wan to go\n"

print(generator(prompt))
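
The pipeline also accepts the usual text generation parameters, so you can control output length and sampling if the defaults do not suit you. The values below are only illustrative, not recommended settings:

# Standard generate() parameters forwarded by the text generation pipeline;
# the values here are illustrative only.
print(generator(prompt, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.8))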

Here are some prompts that you can try:

Write a short story about space.\n

Generate a C++ program that sorts a list of integers in ascending order.\n

Paraphrase the following text.\nAfter a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance.\n

Summarize the following text.\nFor all its whizz-bang caper-gone-wrong energy, and for all its subsequent emotional troughs, this week’s Succession finale might have been the most important in its entire run. Because, unless I am very much wrong, Succession – a show about people trying to forcefully mount a succession – just had its succession. And now everything has to change. The episode ended with Logan Roy defying his children by selling Waystar Royco to idiosyncratic Swedish tech bro Lukas Matsson. It’s an unexpected twist, like if King Lear contained a weird new beat where Lear hands the British crown to Jack Dorsey for a laugh, but it sets up a bold new future for the show. What will happen in season four? Here are some theories. Season three of Succession picked up seconds after season two ended. It was a smart move, showing the immediate swirl of confusion that followed Kendall Roy’s decision to undo his father, and something similar could happen here. This week’s episode ended with three of the Roy siblings heartbroken and angry at their father’s grand betrayal. Perhaps season four could pick up at that precise moment, and show their efforts to reorganise their rebellion against him. This is something that Succession undoubtedly does very well – for the most part, its greatest moments have been those heart-thumping scenes where Kendall scraps for support to unseat his dad – and Jesse Armstrong has more than enough dramatic clout to centre the entire season around the battle to stop the Matsson deal dead in its tracks.\n

Note that, due to the way this model was fine-tuned, you should always end your instructions with a new line.

Hardware Requirements

This model is an fp16 version of our fine-tuned model, which works very well on a GPU with 16GB of VRAM like an NVIDIA Tesla T4.

We did not notice any difference between the fp32 and fp16 versions in terms of quality.
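
If you do not have a GPU, the model can also run on CPU. Without a torch_dtype argument, Transformers loads the weights in fp32, which for a 6-billion-parameter model needs on the order of 24GB of RAM and is much slower. Here is a minimal sketch of that fallback:

from transformers import pipeline

# device=-1 keeps the model on CPU; without torch_dtype the fp16 weights
# are loaded in fp32, which needs roughly 24GB of RAM for 6B parameters.
generator = pipeline(model="nlpcloud/instruct-gpt-j-fp16", device=-1)

print(generator("Write a short story about space.\n"))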

Conclusion

GPT-J was already a very good model, and it is now even better when used as an instruct model.

Anyone can now turn their generative AI model into an instruct model thanks to this technique!

If you have questions or comments about the above, please don't hesitate to reach out!

François
Data Scientist at NLP Cloud