How to Use GPT-3, GPT-J, and GPT-Neo with Few-Shot Learning

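GPT-3, GPT-J, and GPT-Neo are very powerful AI models. This article shows how to use them effectively through few-shot learning. Few-shot learning is like training or fine-tuning an AI model simply by giving it a few examples in the prompt.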

GPT-3

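GPT-3, released by OpenAI, is the most powerful AI model to date for text understanding and text generation.

With 175 billion parameters, it is highly versatile and can understand almost anything.

Chatbots, content creation, entity extraction, classification, summarization, and more: GPT-3 can do many things. However, it takes some practice, and using this model correctly is not easy.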

GPT-J and GPT-Neo

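GPT-Neo and GPT-J are open-source natural language processing (NLP) models created by EleutherAI, a collective of researchers working to open-source AI (see EleutherAI's website).

GPT-J has 6 billion parameters, which makes it the most advanced open-source NLP model as of this writing. It is a direct alternative to OpenAI's proprietary GPT-3 Curie.

These models are very versatile. They can be used for almost any NLP use case: text generation, sentiment analysis, classification, machine translation, and much more (see below). However, using them effectively takes practice. Their response time (latency) can also be longer than that of more standard NLP models.

Both GPT-J and GPT-Neo are available on the NLP Cloud API. The examples below were obtained with the Python client, using NLP Cloud's GPT-J endpoint on GPU. If you want to copy and paste the examples, please do not forget to add your own API token. To install the Python client, first run: pip install nlpcloud.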
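Since every example in this article sends the same kind of request, it can help to wrap the call once. The following is a minimal sketch, not an official helper: the function names, the NLPCLOUD_TOKEN environment variable, and the FakeClient stand-in are all assumptions made for illustration. Only the generation() parameters are taken from the examples in this article.

```python
import os


def load_token(env_var="NLPCLOUD_TOKEN"):
    """Read the API token from the environment (the variable name is an assumption)."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token


def few_shot_generate(client, prompt, max_length=50):
    """Send a few-shot prompt and return only the newly generated text."""
    response = client.generation(
        prompt,
        max_length=max_length,
        length_no_input=True,
        end_sequence="###",
        remove_end_sequence=True,
        remove_input=True,
    )
    return response["generated_text"].strip()


class FakeClient:
    """Offline stand-in with the same generation() shape, for dry runs only."""

    def generation(self, prompt, **params):
        return {"generated_text": " Positive "}


print(few_shot_generate(FakeClient(), "Message: Great!\nSentiment:"))

# With the real client (requires network access and a valid token):
# import nlpcloud
# client = nlpcloud.Client("gpt-j", load_token(), gpu=True)
# print(few_shot_generate(client, "Message: Great!\nSentiment:"))
```

Keeping the token out of the source file also makes it harder to publish it by accident when sharing snippets.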

Few-Shot Learning

Few-shot learning is about helping a machine learning model make predictions from only a couple of examples. Because models like GPT-3, GPT-J, and GPT-Neo are so large, they can easily adapt to many contexts without being retrained.

Giving the model only a few examples can dramatically increase its accuracy.

In NLP, the idea is to pass these examples along with your text input. See the examples below.
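To make the idea concrete, here is a small sketch of how such a prompt can be assembled from a list of solved examples plus the new input. The build_few_shot_prompt function and its parameter names are hypothetical helpers written for this illustration. The "Message"/"Sentiment" labels and the ### delimiter follow the conventions used in this article's examples.

```python
def build_few_shot_prompt(examples, query, input_label="Message",
                          output_label="Sentiment", delimiter="###"):
    """Assemble a few-shot prompt: solved examples first, then the unsolved query."""
    blocks = [
        f"{input_label}: {text}\n{output_label}: {answer}"
        for text, answer in examples
    ]
    # The final block leaves the output label open so the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return f"\n{delimiter}\n".join(blocks)


examples = [
    ("Support has been terrible for 2 weeks...", "Negative"),
    ("I love your API, it is simple and so fast!", "Positive"),
]
prompt = build_few_shot_prompt(
    examples, "The reactivity of your team has been amazing, thanks!"
)
print(prompt)
```

The resulting string can be passed as the first argument of client.generation(). Building it from data rather than by hand makes it easy to swap examples in and out while you experiment.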

Also, if few-shot learning is not enough, you can fine-tune GPT-3 on OpenAI's website or GPT-J on NLP Cloud, so that the model is adapted to your use case.

You can easily try few-shot learning on the NLP Cloud Playground.

Sentiment Analysis with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: Support has been terrible for 2 weeks...
            Sentiment: Negative
            ###
            Message: I love your API, it is simple and so fast!
            Sentiment: Positive
            ###
            Message: GPT-J has been released 2 months ago.
            Sentiment: Neutral
            ###
            Message: The reactivity of your team has been amazing, thanks!
            Sentiment:""",
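    min_length=1,
    max_length=1,
    length_no_input=True,      # min/max length apply to the generated text only
    end_sequence="###",        # stop generating at the delimiter
    remove_end_sequence=True,  # drop the delimiter from the result
    remove_input=True)         # return only the new text, not the prompt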
print(generation["generated_text"])

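Output:

Positive

As you can see, giving three properly formatted examples first leads GPT-J to understand that we want to perform sentiment analysis. And the result is good.

You can help GPT-J understand the different sections by using a custom delimiter such as ###. You could just as well use something else, such as ---, or simply a new line. Then set "end_sequence", an NLP Cloud parameter that tells GPT-J to stop generating content after a new line followed by ###: end_sequence="###".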
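To see why the delimiter matters, the sketch below mimics on the client side what stopping at an end sequence achieves. It only illustrates the effect and is not how NLP Cloud implements end_sequence or remove_end_sequence. The function name and the sample raw text are made up for this example.

```python
def cut_at_end_sequence(raw_text, end_sequence="###", remove_end_sequence=True):
    """Keep only the text up to the first occurrence of end_sequence."""
    index = raw_text.find(end_sequence)
    if index == -1:
        # The delimiter never appeared: return the text unchanged.
        return raw_text.strip()
    end = index if remove_end_sequence else index + len(end_sequence)
    return raw_text[:end].strip()


# Without a stop condition, a model tends to keep inventing new examples.
raw = " Positive\n###\nMessage: Something the model made up\nSentiment: Negative"
print(cut_at_end_sequence(raw))
```

Without a stop condition the model would usually keep writing new "Message:" blocks in the same format, which is why the examples in this article always set an end sequence.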

HTML Code Generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""description: a red button that says stop
    code: <button style=color:white; background-color:red;>Stop</button>
    ###
    description: a blue box that contains yellow circles with red borders
    code: <div style=background-color: blue; padding: 20px;><div style=background-color: yellow; border: 5px solid red; border-radius: 50%; padding: 20px; width: 100px; height: 100px;>
    ###
    description: a Headline saying Welcome to AI
    code:""",
    max_length=500,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

<h1 style=color: white;>Welcome to AI</h1>

Code generation with GPT-J is really impressive. This is partly because GPT-J was trained on huge code bases.

SQL Code Generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Question: Fetch the companies that have less than five people in it.
            Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;
            ###
            Question: Show all companies along with the number of employees in each department
            Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;
            ###
            Question: Show the last record of the Employee table
            Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;
            ###
            Question: Fetch three employees from the Employee table;
            Answer:""",
    max_length=100,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;

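Automatic SQL generation works very well with GPT-J, especially because of the declarative nature of SQL and the fact that SQL is quite a limited language with relatively few possibilities (compared to most programming languages).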
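Generated SQL should still be checked before it touches real data. One possible approach, sketched below, is to run the query against a throwaway in-memory SQLite database first. The Employee schema and sample rows are assumptions invented for this illustration; the article does not define the real table, and SQLite's dialect may differ from your production database.

```python
import sqlite3


def run_generated_sql(query, rows):
    """Run a generated query against a throwaway in-memory Employee table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, just enough for the query to execute.
        conn.execute(
            "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, LAST_NAME TEXT, COMPANY TEXT)"
        )
        conn.executemany(
            "INSERT INTO Employee (ID, LAST_NAME, COMPANY) VALUES (?, ?, ?)", rows
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


sample_rows = [
    (1, "Adams", "Acme"),
    (2, "Baker", "Acme"),
    (3, "Clark", "Globex"),
    (4, "Davis", "Globex"),
]
generated = "SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;"
result = run_generated_sql(generated, sample_rows)
print(result)
```

A query with a syntax error raises sqlite3.OperationalError here, so malformed generations can be rejected or retried before reaching a production database.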

Advanced Entity Extraction (NER) with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. 
        [Name]: Fred
        [Position]: Co-founder and CEO
        [Company]: Platform.sh
        ###
        [Text]: Microsoft (the word being a portmanteau of "microcomputer software") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a "devices and services" strategy.
        [Name]:  Steve Ballmer
        [Position]: CEO
        [Company]: Microsoft
        ###
        [Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.
        [Name]:  Franck Riboud
        [Position]: CEO
        [Company]: Danone
        ###
        [Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.
""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

[Name]: David Melvin
[Position]: Senior Adviser
[Company]: CITIC CLSA

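As you can see, GPT-J is very good at extracting structured data from unstructured text. It is really impressive that GPT-J solves entity extraction without any retraining. Usually, extracting a new type of entity (name, position, country, etc.) requires a whole new process of annotation, training, and deployment. Here, it is completely seamless.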
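Because the model answers in the same "[Key]: value" layout as the examples, the result is easy to turn into a data structure. The parser below is a minimal sketch written for this article's output format. The function name and the regular expression are illustrative choices, not part of any library.

```python
import re

# Matches lines such as "[Name]: David Melvin".
FIELD_PATTERN = re.compile(r"^\s*\[(?P<key>[^\]]+)\]:\s*(?P<value>.*?)\s*$")


def parse_entities(generated_text):
    """Turn '[Key]: value' lines into a dictionary, skipping anything else."""
    entities = {}
    for line in generated_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            entities[match.group("key")] = match.group("value")
    return entities


generated_text = """[Name]: David Melvin
[Position]: Senior Adviser
[Company]: CITIC CLSA"""
print(parse_entities(generated_text))
```

Lines that do not match the pattern are ignored, so an occasional stray sentence from the model does not break the extraction.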

Question Answering with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.
            Question: When was NLP Cloud founded?
            Answer: 2021
            ###
            Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
            Question: What did NLP Cloud develop?
            Answer: API
            ###
            Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.
            Question: When can plans be stopped?
            Answer: Anytime
            ###
            Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
            Question: Which plan is recommended for GPT-J?
            Answer:""",
    min_length=1,
    max_length=20,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

GPU plan

Question answering works very well. It can also be achieved with other dedicated NLP models, but perhaps not with the same level of accuracy.

Tutorial Creation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Here is a tutorial about how to make a cake.
        
            1. Take some flour.
            2. Take some sugar.""",
    max_length=500)
print(generation["generated_text"])

Output:

Here is a tutorial about how to make a cake.
        
                    1. Take some flour.
                    2. Take some sugar.
                    3. Take some butter.
                    4. Take some eggs.
                    5. Take some water.
                    6. Take some baking powder.
                    7. Take some vanilla.
                    8. Mix all together.
                    9. Bake in a pan.
                    10. Enjoy.
                    
Well, that's it. You can make this for your birthday or a party or you can even make it for your kids. They will love this.

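As you can see, it is quite impressive that GPT-J automatically follows the initial formatting. The generated content is also very good. You might even be able to bake a proper cake with this (we have not tried yet, though).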

Grammar and Spelling Correction with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I love goin to the beach.
            Correction: I love going to the beach.
            ###
            Let me hav it!
            Correction: Let me have it!
            ###
            It have too many drawbacks.
            Correction: It has too many drawbacks.
            ###
            I do not wan to go
            Correction:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

I do not want to go.

Spelling and grammar correction works as expected. If you want to know more precisely where the mistakes are located in the sentence, you are better off using a dedicated model.
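If you only need a rough idea of where the corrections happened, a lightweight option is to diff the original sentence against the corrected one. The sketch below uses Python's standard difflib module on whitespace-separated words. It is a simple post-processing step, not a replacement for a dedicated error-detection model, and the function name is hypothetical.

```python
import difflib


def locate_corrections(original, corrected):
    """Return (word_position, original_words, corrected_words) for each change."""
    before, after = original.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=before, b=after)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((i1, " ".join(before[i1:i2]), " ".join(after[j1:j2])))
    return changes


print(locate_corrections("I do not wan to go", "I do not want to go"))
```

Positions are zero-based word indexes in the original sentence. Added punctuation also shows up as a change because the comparison is done on whole words.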

Machine Translation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Hugging Face a révolutionné le NLP.
            Translation: Hugging Face revolutionized NLP.
            ###
            Cela est incroyable!
            Translation: This is unbelievable!
            ###
            Désolé je ne peux pas.
            Translation: Sorry but I cannot.
            ###
            NLP Cloud permet de deployer le NLP en production facilement.
            Translation:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

NLP Cloud makes it easy to deploy NLP to production.

Machine translation usually relies on dedicated models (often one per language). Here, all languages are handled out of the box by GPT-J, which is quite impressive.

Tweet Generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""keyword: markets
            tweet: Take feedback from nature and markets, not from people
            ###
            keyword: children
            tweet: Maybe we die so we can come back as children.
            ###
            keyword: startups
            tweet: Startups should not worry about how to put out fires, they should worry about how to start them.
            ###
            keyword: NLP
            tweet:""",
    max_length=200,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

People want a way to get the benefits of NLP without paying for it.

Here is a fun and easy way to generate short tweets that follow a context.

Chatbots and Conversational AI with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""This is a discussion between a [human] and a [robot]. 
The [robot] is very nice and empathetic.

[human]: Hello nice to meet you.
[robot]: Nice to meet you too.
###
[human]: How is it going today?
[robot]: Not so bad, thank you! How about you?
###
[human]: I am ok, but I am a bit sad...
[robot]: Oh? Why that?
###
[human]: I broke up with my girlfriend...
[robot]: """,
    min_length=1,
    max_length=20,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

Oh? How did that happen?

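As you can see, GPT-J properly understands that you are in a conversational mode. What is very powerful is that if you change your tone in the context, the model's responses will follow the same tone (sarcasm, anger, curiosity, and so on).

We actually wrote a dedicated blog article about how to build a chatbot with GPT-3/GPT-J. Feel free to read it.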
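In a real chatbot the prompt has to be rebuilt at every turn so that the model sees the conversation so far. The following sketch shows one way to do that. The build_chat_prompt helper and the way history is stored are assumptions made for illustration. The introduction text and the [human]/[robot] labels come from the example above.

```python
INTRO = (
    "This is a discussion between a [human] and a [robot].\n"
    "The [robot] is very nice and empathetic.\n"
)


def build_chat_prompt(history, new_message, intro=INTRO, delimiter="###"):
    """Render past (human, robot) turns, then the new human message."""
    turns = [f"[human]: {human}\n[robot]: {robot}" for human, robot in history]
    # The last turn is left open so the model writes the robot's reply.
    turns.append(f"[human]: {new_message}\n[robot]: ")
    return intro + "\n" + f"\n{delimiter}\n".join(turns)


history = [("Hello nice to meet you.", "Nice to meet you too.")]
prompt = build_chat_prompt(history, "How is it going today?")
print(prompt)

# After the model answers, store the turn so the next prompt includes it.
history.append(("How is it going today?", "Not so bad, thank you! How about you?"))
```

Since the model's input size is limited, a long conversation would eventually need older turns to be dropped or summarized. That part is left out of this sketch.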

Intent Classification with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I want to start coding tomorrow because it seems to be so fun!
            Intent: start coding
            ###
            Show me the last pictures you have please.
            Intent: show pictures
            ###
            Search all these files as fast as possible.
            Intent: search files
            ###
            Can you please teach me Chinese next week?
            Intent:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

learn chinese

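As you can see, it is quite impressive that GPT-J can detect the intent from a sentence. It also works very well on more complex sentences. You can even ask it to format the intent differently if needed. For example, you could automatically generate a JavaScript function name like "learnChinese".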
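The model can be shown the target format directly in the few-shot examples. When you would rather keep the prompt unchanged, a deterministic alternative is to convert the returned intent yourself. The helper below is a hypothetical post-processing step, not the technique described above, and the to_camel_case name is made up for this illustration.

```python
import re


def to_camel_case(intent):
    """Convert a free-text intent such as 'learn chinese' to 'learnChinese'."""
    words = re.findall(r"[A-Za-z0-9]+", intent)
    if not words:
        return ""
    first, rest = words[0].lower(), words[1:]
    return first + "".join(word.capitalize() for word in rest)


print(to_camel_case("learn chinese"))
```

Doing the formatting in code guarantees a valid identifier shape even when the model's wording or capitalization varies from one call to the next.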

Paraphrasing with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: Algeria recalled its ambassador to Paris on Saturday and closed its airspace to French military planes a day later after the French president made comments about the northern Africa country. 
        [Paraphrase]: Last Saturday, the Algerian government recalled its ambassador and stopped accepting French military airplanes in its airspace. It happened one day after the French president made comments about Algeria.
        ###
        [Original]: President Macron was quoted as saying the former French colony was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France.
        [Paraphrase]: Emmanuel Macron said that the former colony was lying and angry at France. He also said that the country was ruled by a "political-military system".
        ###
        [Original]: The diplomatic spat came days after France cut the number of visas it issues for citizens of Algeria and other North African countries.
        [Paraphrase]: Diplomatic issues started appearing when France decided to stop granting visas to Algerian people and other North African people.
        ###
        [Original]: After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance.
        [Paraphrase]:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True,
    min_length=0,
    max_length=50)
print(generation["generated_text"])

Output:

French President Emmanuel Macron hopes the diplomatic tension with Algeria will calm down. He wants to make progress in the relations with the former French colony. President Macron says he has a good relationship with his Algerian counterpart and he is glad that they have a cordial relationship. The president was quoted as saying that Algeria was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France. France cut the number of visas it issues for citizens of Algeria and other North African countries.

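GPT-J made modifications to our paragraph while keeping the main sense, which is what paraphrasing is about. To encourage GPT-J to return more original paraphrases, you could pass different examples in the input and play with API parameters such as the repetition penalty.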

Summarization with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: America has changed dramatically during recent years. Not only has the number of graduates in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering declined, but in most of the premier American universities engineering curricula now concentrate on and encourage largely the study of engineering science.  As a result, there are declining offerings in engineering subjects dealing with infrastructure, the environment, and related issues, and greater concentration on high technology subjects, largely supporting increasingly complex scientific developments. While the latter is important, it should not be at the expense of more traditional engineering.
        Rapidly developing economies such as China and India, as well as other industrial countries in Europe and Asia, continue to encourage and advance the teaching of engineering. Both China and India, respectively, graduate six and eight times as many traditional engineers as does the United States. Other industrial countries at minimum maintain their output, while America suffers an increasingly serious decline in the number of engineering graduates and a lack of well-educated engineers. 
        (Source:  Excerpted from Frankel, E.G. (2008, May/June) Change in education: The cost of sacrificing fundamentals. MIT Faculty 
        [Summary]: MIT Professor Emeritus Ernst G. Frankel (2008) has called for a return to a course of study that emphasizes the traditional skills of engineering, noting that the number of American engineering graduates with these skills has fallen sharply when compared to the number coming from other countries. 
        ###
        [Original]: So how do you go about identifying your strengths and weaknesses, and analyzing the opportunities and threats that flow from them? SWOT Analysis is a useful technique that helps you to do this.
        What makes SWOT especially powerful is that, with a little thought, it can help you to uncover opportunities that you would not otherwise have spotted. And by understanding your weaknesses, you can manage and eliminate threats that might otherwise hurt your ability to move forward in your role.
        If you look at yourself using the SWOT framework, you can start to separate yourself from your peers, and further develop the specialized talents and abilities that you need in order to advance your career and to help you achieve your personal goals.
        [Summary]: SWOT Analysis is a technique that helps you identify strengths, weakness, opportunities, and threats. Understanding and managing these factors helps you to develop the abilities you need to achieve your goals and progress in your career.
        ###
        [Original]: Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.
        Jupiter is primarily composed of hydrogen with a quarter of its mass being helium, though helium comprises only about a tenth of the number of molecules. It may also have a rocky core of heavier elements,[21] but like the other giant planets, Jupiter lacks a well-defined solid surface. Because of its rapid rotation, the planet's shape is that of an oblate spheroid (it has a slight but noticeable bulge around the equator).
        [Summary]: Jupiter is the largest planet in the solar system. It is a gas giant, and is the fifth planet from the sun.
        ###
        [Original]: For all its whizz-bang caper-gone-wrong energy, and for all its subsequent emotional troughs, this week’s Succession finale might have been the most important in its entire run. Because, unless I am very much wrong, Succession – a show about people trying to forcefully mount a succession – just had its succession. And now everything has to change.
        The episode ended with Logan Roy defying his children by selling Waystar Royco to idiosyncratic Swedish tech bro Lukas Matsson. It’s an unexpected twist, like if King Lear contained a weird new beat where Lear hands the British crown to Jack Dorsey for a laugh, but it sets up a bold new future for the show. What will happen in season four? Here are some theories.
        Season three of Succession picked up seconds after season two ended. It was a smart move, showing the immediate swirl of confusion that followed Kendall Roy’s decision to undo his father, and something similar could happen here. This week’s episode ended with three of the Roy siblings heartbroken and angry at their father’s grand betrayal. Perhaps season four could pick up at that precise moment, and show their efforts to reorganise their rebellion against him. This is something that Succession undoubtedly does very well – for the most part, its greatest moments have been those heart-thumping scenes where Kendall scraps for support to unseat his dad – and Jesse Armstrong has more than enough dramatic clout to centre the entire season around the battle to stop the Matsson deal dead in its tracks.
        [Summary]:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True,
    min_length=20,
    max_length=200)
print(generation["generated_text"])

Output:

Season 3 of Succession ended with Logan Roy trying to sell his company to Lukas Matsson.

Text summarization is a tricky task. GPT-J is very good at it as long as you give it the right examples. The size of the summary and its tone depend heavily on the examples you create. For example, you would probably not write the same kind of examples for a simple summary aimed at children as for an advanced medical summary aimed at doctors. If GPT-J's input size is too small for your summarization examples, you may want to fine-tune GPT-J for your summarization task.
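Before resorting to fine-tuning, you can check how many examples actually fit in the input. The sketch below keeps examples in order until a size budget is reached. It counts characters as a crude proxy because the model's real limit is expressed in tokens, and the max_chars value is an arbitrary assumption. The function name is also hypothetical.

```python
def select_examples(examples, query, max_chars, delimiter="\n###\n"):
    """Keep examples, in order, while the whole prompt stays within max_chars."""
    tail = f"[Original]: {query}\n[Summary]:"
    selected = []
    used = len(tail)
    for original, summary in examples:
        block = f"[Original]: {original}\n[Summary]: {summary}"
        cost = len(block) + len(delimiter)
        if used + cost > max_chars:
            # The next example would exceed the budget: stop here.
            break
        selected.append(block)
        used += cost
    return delimiter.join(selected + [tail])


examples = [
    ("A long article about engineering education.", "Engineering education is changing."),
    ("A long article about SWOT analysis.", "SWOT helps you assess yourself."),
    ("A long article about Jupiter.", "Jupiter is the largest planet."),
]
prompt = select_examples(examples, "A long article about a TV show finale.", max_chars=250)
print(prompt)
```

The text to summarize is always kept, even if it exceeds the budget on its own. In that case no examples are included and the prompt is no longer few-shot, which is a sign that fine-tuning may be the better route.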

Zero-Shot Text Classification with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: When the spaceship landed on Mars, the whole humanity was excited
        Topic: space
        ###
        Message: I love playing tennis and golf. I'm practicing twice a week.
        Topic: sport
        ###
        Message: Managing a team of sales people is a tough but rewarding job.
        Topic: business
        ###
        Message: I am trying to cook chicken with tomatoes.
        Topic:""",
    min_length=1,
    max_length=5,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

food

Here is an easy and powerful way to categorize text using the so-called "zero-shot learning" technique, without having to declare categories in advance.

Keyword and Keyphrase Extraction with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
        Keywords: information, search, resources
        ###
        David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23. 
        Keywords: searching, missing, desert
        ###
        I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
        Keywords: document, understand, keyphrases
        ###
        Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
        Keywords:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

paragraphs, transformer, input, errors

Keyword extraction is about getting the main ideas from a text. It is an interesting subfield of NLP that GPT-J handles very well. See below for keyphrase extraction (several words at a time).

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
        Keywords: information retrieval, search query, relevant resources
        ###
        David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23. 
        Keywords: searching son, missing after work, desert
        ###
        I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
        Keywords: document, help understand, resulting keyphrases
        ###
        Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
        Keywords:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

large documents, paragraph, mean pooling

Same example as above, but this time we extract several words (called keyphrases) instead of single words.
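The model returns keywords or keyphrases as one comma-separated line, so a small amount of cleanup is usually needed before storing them. The helper below is a minimal sketch. Its name and the choice to lowercase and de-duplicate are assumptions made for illustration.

```python
def parse_keywords(generated_text):
    """Split a comma-separated keyword line into a clean, de-duplicated list."""
    keywords = []
    for part in generated_text.split(","):
        keyword = part.strip().lower()
        # Skip empty fragments and repeated entries, keeping the first occurrence.
        if keyword and keyword not in keywords:
            keywords.append(keyword)
    return keywords


print(parse_keywords("large documents, paragraph, mean pooling"))
```

Multi-word keyphrases stay intact because the text is split on commas only, not on spaces.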

Product Description and Ad Generation

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Generate a product description out of keywords.

        Keywords: shoes, women, $59
        Sentence: Beautiful shoes for women at the price of $59.
        ###
        Keywords: trousers, men, $69
        Sentence: Modern trousers for men, for $69 only.
        ###
        Keywords: gloves, winter, $19
        Sentence: Amazingly hot gloves for cold winters, at $19.
        ###
        Keywords: t-shirt, men, $39
        Sentence:""",
    min_length=5,
    max_length=30,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

Extraordinary t-shirt for men, for $39 only.

It is possible to ask GPT-J to generate a product description or an ad containing specific keywords. Here we only generate a simple sentence, but we could generate a whole paragraph if needed.

Blog Post Generation

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Title]: 3 Tips to Increase the Effectiveness of Online Learning
[Blog article]: <h1>3 Tips to Increase the Effectiveness of Online Learning</h1>
<p>The hurdles associated with online learning correlate with the teacher’s inability to build a personal relationship with their students and to monitor their productivity during class.</p>
<h2>1. Creative and Effective Approach</h2>
<p>Each aspect of online teaching, from curriculum, theory, and practice, to administration and technology, should be formulated in a way that promotes productivity and the effectiveness of online learning.</p>
<h2>2. Utilize Multimedia Tools in Lectures</h2>
<p>In the 21st century, networking is crucial in every sphere of life. In most cases, a simple and functional interface is preferred for eLearning to create ease for the students as well as the teacher.</p>
<h2>3. Respond to Regular Feedback</h2>
<p>Collecting student feedback can help identify which methods increase the effectiveness of online learning, and which ones need improvement. An effective learning environment is a continuous work in progress.</p>
###
[Title]: 4 Tips for Teachers Shifting to Teaching Online 
[Blog article]: <h1>4 Tips for Teachers Shifting to Teaching Online </h1>
<p>An educator with experience in distance learning shares what he’s learned: Keep it simple, and build in as much contact as possible.</p>
<h2>1. Simplicity Is Key</h2>
<p>Every teacher knows what it’s like to explain new instructions to their students. It usually starts with a whole group walk-through, followed by an endless stream of questions from students to clarify next steps.</p>
<h2>2. Establish a Digital Home Base</h2>
<p>In the spirit of simplicity, it’s vital to have a digital home base for your students. This can be a district-provided learning management system like Canvas or Google Classrooms, or it can be a self-created class website. I recommend Google Sites as a simple, easy-to-set-up platform.</p>
<h2>3. Prioritize Longer, Student-Driven Assignments</h2>
<p>Efficiency is key when designing distance learning experiences. Planning is going to take more time and require a high level of attention to detail. You will not be able to correct mistakes on the fly or suddenly pivot when kids are disengaged.</p>
<h2>4. Individual Touchpoints Are Game-Changers</h2>
<p>You can create these touchpoints through any medium you like: emails, video messages, phone calls, messages through your learning management system, comments on shared documents, etc.</p>
###
[Title]: 3 Best Free Image Converter Software Programs
[Blog article]: <h1>3 Best Free Image Converter Software Programs</h1>
<p>Best free image converters for JPG, BMP, PSD, TIF, GIF, RAW, and more</p>
<h2>1. XnConvert</h2>
<p>XnConvert is the Swiss Army knife of image converters. With XnView, you can convert any of around 500 image formats to your choice of around 80 others. If you have a rare image format that you can't open, XnView can probably convert it.</p>
<h2>2. CoolUtils Online Image Converter</h2>
<p>The CoolUtils Online Image Converter is just that—an image converter that exists completely online, no download required. Unlike some online image converters, this one does the image converting for you in real-time—no waiting on an email link.</p>
<h2>3. FileZigZag</h2>
<p>FileZigZag is another online image converter service that will convert most common graphics formats. Just upload the original image, choose the desired output, and then wait for the download link to appear on the page.</p>
###
[Title]: 4 Best Tips to Find a Girlfriend
[Blog article]: """,
    max_length=500,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

<h1>Text Generation API</h1>
<h2>What is Text Generation?</h2>
<p>Text generation is the process of generating text from a block of text. It is not simply the inverse of summarization: a text generator will take a short piece of text as an input and generate a longer one.</p>
<p>For example you have the following text:</p>
Ahead of the opening of the meeting,
<p>In this case the goal of the API is to automatically generate the following:</p>
Ahead of the opening of the meeting, Prime Minister Justin Trudeau spoke to reporters at the Montreal airport where he was travelling to speak at the conference.
<p>So basically we want to leverage the output of the API as the input for the text generator, and we will then be able to generate some sentences based on the input we already have.</p>
<h2>Text generation with GPT-2</h2>
<p>We will leverage GPT-2 in order to have a starting point for the text generator. In this case, we will use the API as the input for the text generator, and we will use the output of the API as the input for GPT-2.
However, we want to fine-tune GPT-2 to generate more complex sentences and to have better results.</p>

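Impressive, isn't it? This generated blog article is short, but you can generate much longer articles. The structure of the generated blog post depends on the structure you used in your few-shot examples. To get a more complex structure and more relevant content, fine-tuning GPT-J is the key.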

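Conclusion

As you can see, few-shot learning is a great technique that helps GPT-3, GPT-J, and GPT-Neo achieve amazing results. The key is to pass the right context before making your request.

Even for simple text generation, it is recommended to pass as much context as possible in order to help the model.

We hope you found this useful. If you have any questions about how to make the most of these models, please do not hesitate to ask.

Julien Salinas
CTO at NLP Cloud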