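GPT-3, GPT-J, and GPT-NeoX are very powerful AI models. Here we show how to use them effectively through few-shot learning. Few-shot learning is a bit like training or fine-tuning an AI model, except that you only give it a few examples in your prompt.
GPT-3, released by OpenAI, is the most powerful AI model released so far for text understanding and text generation.
It has 175 billion parameters, which makes it extremely versatile and able to understand pretty much anything!
You can do all sorts of things with GPT-3: chatbots, content creation, entity extraction, classification, summarization, and more. But it takes some practice, and using this model correctly is not easy.
GPT-NeoX and GPT-J are both open-source natural language processing models, created by a collective of researchers working to open-source AI (see EleutherAI's website).
GPT-J has 6 billion parameters and GPT-NeoX has 20 billion, which makes them the most advanced open-source natural language processing models as of this writing. They are direct alternatives to OpenAI's proprietary GPT-3 Curie.
These models are very versatile. They can be used for almost any natural language processing use case: text generation, sentiment analysis, classification, machine translation, and more (see below). However, using them effectively takes practice. Their response time (latency) can also be longer than that of more standard natural language processing models.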
Both GPT-J and GPT-NeoX are available on the NLP Cloud API. The examples below were obtained with the NLP Cloud API through its Python client. If you copy and paste them, please do not forget to add your own API token. To install the Python client, first run: pip install nlpcloud.
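Few-shot learning is about helping a machine learning model make predictions with only a couple of examples. There is no need to train a new model here: models like GPT-3, GPT-J, and GPT-NeoX are so large that they can easily adapt to many contexts without being retrained.
Giving only a few examples to the model can dramatically increase its accuracy.
In natural language processing, the idea is to pass these examples along with your text input. See the examples below!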
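To make the idea concrete, here is a minimal sketch of how a few-shot prompt can be assembled from a list of solved examples plus the text to complete. The helper name `build_few_shot_prompt` and its parameters are hypothetical and only serve the illustration; they are not part of the NLP Cloud client.

```python
def build_few_shot_prompt(examples, query, input_label="Message",
                          output_label="Sentiment", separator="###"):
    """Assemble a few-shot prompt: solved examples first, then the open query."""
    blocks = [
        f"{input_label}: {text}\n{output_label}: {label}"
        for text, label in examples
    ]
    # The last block leaves the output label empty so the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return f"\n{separator}\n".join(blocks)


examples = [
    ("Support has been terrible for 2 weeks...", "Negative"),
    ("I love your API, it is simple and so fast!", "Positive"),
]
prompt = build_few_shot_prompt(
    examples, "The reactivity of your team has been amazing, thanks!"
)
print(prompt)
```

The resulting string has the same layout as the prompts used throughout this article, so it could be passed as the first argument of `client.generation(...)`.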
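Also note that, if few-shot learning is not enough, you can fine-tune GPT-3 on OpenAI's website and GPT-J on NLP Cloud, so that the models are perfectly tailored to your use case.
You can easily test few-shot learning on the NLP Cloud Playground, in the text generation section. Open text generation on the playground, then simply use one of the examples shown below in this article and see for yourself.
Tweet generation example on the NLP Cloud Playground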
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:""",
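min_length=1,  # generate at least one token
max_length=1,  # and at most one: only the label is wanted
length_no_input=True,  # min_length and max_length do not count the input text
end_sequence="###",  # stop generating at the delimiter
remove_end_sequence=True,  # strip the delimiter from the result
remove_input=True)  # return only the newly generated text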
print(generation["generated_text"])
Output:
Positive
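As you can see, giving 3 properly formatted examples first leads GPT-J to understand that we want to perform sentiment analysis. And its result is good.
You can help GPT-J understand the different sections by using a custom delimiter like this: ###. You can perfectly use something else, like this: ---. Or simply add a new line. We then set "end_sequence", an NLP Cloud parameter that tells GPT-J to stop generating content once it reaches the delimiter that follows the new line: end_sequence="###".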
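The effect of "end_sequence" and "remove_end_sequence" can be pictured with a small local function. This is only an illustration of the observable behavior, assuming the model keeps generating past the delimiter; it is not how NLP Cloud implements these parameters, and `apply_end_sequence` is a hypothetical name.

```python
def apply_end_sequence(generated, end_sequence="###", remove_end_sequence=True):
    """Cut generated text at the first occurrence of end_sequence."""
    index = generated.find(end_sequence)
    if index == -1:
        # The delimiter was never produced: keep the whole text.
        return generated
    if remove_end_sequence:
        return generated[:index]
    return generated[:index + len(end_sequence)]


# Hypothetical raw completion in which the model invents a further example.
raw = " Positive\n###\nMessage: something the model made up\nSentiment: Negative"
label = apply_end_sequence(raw).strip()
print(label)
```

Without a stop condition of this kind, a few-shot prompt tends to make the model continue the pattern and produce extra examples of its own.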
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""description: a red button that says stop
code: <button style=color:white; background-color:red;>Stop</button>
###
description: a blue box that contains yellow circles with red borders
code: <div style=background-color: blue; padding: 20px;><div style=background-color: yellow; border: 5px solid red; border-radius: 50%; padding: 20px; width: 100px; height: 100px;>
###
description: a Headline saying Welcome to AI
code:""",
max_length=500,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
<h1 style=color: white;>Welcome to AI</h1>
Code generation with GPT-J is really impressive. This is partly because GPT-J was trained on huge code bases.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Question: Fetch the companies that have less than five people in it.
Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;
###
Question: Show all companies along with the number of employees in each department
Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;
###
Question: Show the last record of the Employee table
Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;
###
Question: Fetch three employees from the Employee table;
Answer:""",
max_length=100,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;
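Automatic SQL generation works very well with GPT-J, especially because of the declarative nature of SQL, and because SQL is a fairly limited language with relatively few possibilities (compared to most programming languages).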
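Generated SQL is worth checking before it touches real data. One possible safeguard, sketched below, is to run the query against a throwaway in-memory SQLite database first. The `Employee` schema and sample rows are assumptions made up for the illustration, and `try_sql` is a hypothetical helper.

```python
import sqlite3


def try_sql(query, connection):
    """Run a generated query and return (rows, error) instead of raising."""
    try:
        return connection.execute(query).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


# Throwaway database with an assumed schema for the Employee table.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, LAST_NAME TEXT, COMPANY TEXT)"
)
connection.executemany(
    "INSERT INTO Employee (LAST_NAME, COMPANY) VALUES (?, ?)",
    [("Adams", "Acme"), ("Baker", "Acme"), ("Clark", "Globex"), ("Davis", "Globex")],
)

rows, error = try_sql("SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;", connection)
print(rows, error)

# A query that references an unknown table is reported rather than raised.
bad_rows, bad_error = try_sql("SELECT * FROM Missing;", connection)
print(bad_rows, bad_error)
```

A check like this only confirms that the query is valid for the assumed schema; it does not prove that the query answers the question that was asked.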
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now.
[Name]: Fred
[Position]: Co-founder and CEO
[Company]: Platform.sh
###
[Text]: Microsoft (the word being a portmanteau of "microcomputer software") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a "devices and services" strategy.
[Name]: Steve Ballmer
[Position]: CEO
[Company]: Microsoft
###
[Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.
[Name]: Franck Riboud
[Position]: CEO
[Company]: Danone
###
[Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.
""",
top_p=0,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
[Name]: David Melvin
[Position]: Senior Adviser
[Company]: CITIC CLSA
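As you can see, GPT-J is very good at extracting structured data from unstructured text. It is really impressive how it solves entity extraction without any retraining! Usually, extracting new types of entities (such as name, position, or country) requires a whole new process of annotation, training, and deployment. Here, it is completely seamless.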
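Since the model answers in the same "[Field]: value" layout as the examples, the generated text can be turned into a dictionary with a few lines of code. The sketch below assumes the output follows that layout exactly; `parse_fields` is a hypothetical helper, and lines that do not match are simply ignored.

```python
import re

# Matches lines such as "[Name]: David Melvin".
FIELD_PATTERN = re.compile(r"^\[(?P<key>[^\]]+)\]:\s*(?P<value>.*)$")


def parse_fields(generated):
    """Turn '[Key]: value' lines into a dict, skipping anything else."""
    fields = {}
    for line in generated.splitlines():
        match = FIELD_PATTERN.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


output = "[Name]: David Melvin\n[Position]: Senior Adviser\n[Company]: CITIC CLSA"
record = parse_fields(output)
print(record)
```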
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.
Question: When was NLP Cloud founded?
Answer: 2021
###
Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
Question: What did NLP Cloud develop?
Answer: API
###
Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.
Question: When can plans be stopped?
Answer: Anytime
###
Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
Question: Which plan is recommended for GPT-J?
Answer:""",
min_length=1,
max_length=20,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
GPU plan
Question answering works very well. It can also be achieved with other dedicated natural language processing models, but maybe not with the same level of accuracy.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Here is a tutorial about how to make a cake.
1. Take some flour.
2. Take some sugar.""",
max_length=500)
print(generation["generated_text"])
Output:
Here is a tutorial how to make a cake.
1. Take some flour.
2. Take some sugar.
3. Take some butter.
4. Take some eggs.
5. Take some water.
6. Take some baking powder.
7. Take some vanilla.
8. Mix all together.
9. Bake in a pan.
10. Enjoy.
Well, that's it. You can make this for your birthday or a party or you can even make it for your kids. They will love this.
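As you can see, it is quite impressive how GPT-J automatically follows your initial formatting, and the generated content is very good too. You might even be able to make a proper cake out of this (we have not tried yet, though).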
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I love goin to the beach.
Correction: I love going to the beach.
###
Let me hav it!
Correction: Let me have it!
###
It have too many drawbacks.
Correction: It has too many drawbacks.
###
I do not wan to go
Correction:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
I do not want to go.
Spelling and grammar correction works as expected. If you want to be more specific about where the mistakes are located in the sentence, you might want to use a dedicated model.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Hugging Face a révolutionné le NLP.
Translation: Hugging Face revolutionized NLP.
###
Cela est incroyable!
Translation: This is unbelievable!
###
Désolé je ne peux pas.
Translation: Sorry but I cannot.
###
NLP Cloud permet de deployer le NLP en production facilement.
Translation:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
NLP Cloud makes it easy to deploy NLP to production.
Machine translation usually takes dedicated models (often 1 per language). Here all languages are handled out of the box by GPT-J, which is quite impressive.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""keyword: markets
tweet: Take feedback from nature and markets, not from people
###
keyword: children
tweet: Maybe we die so we can come back as children.
###
keyword: startups
tweet: Startups should not worry about how to put out fires, they should worry about how to start them.
###
keyword: NLP
tweet:""",
max_length=200,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
People want a way to get the benefits of NLP without paying for it.
Here is a fun and easy way to generate short tweets following a context.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""This is a discussion between a [human] and a [robot].
The [robot] is very nice and empathetic.
[human]: Hello nice to meet you.
[robot]: Nice to meet you too.
###
[human]: How is it going today?
[robot]: Not so bad, thank you! How about you?
###
[human]: I am ok, but I am a bit sad...
[robot]: Oh? Why that?
###
[human]: I broke up with my girlfriend...
[robot]:""",
min_length=1,
max_length=20,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
Oh? How did that happen?
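As you can see, GPT-J properly understands that you are in conversational mode. And the very powerful thing is that, if you change the tone in your context, the model's responses will follow the same tone (sarcasm, anger, curiosity...).
We actually wrote a dedicated blog article about how to build a chatbot with GPT-3/GPT-J, feel free to read it!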
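In a real chatbot the prompt has to be rebuilt on every turn from the conversation so far. Here is a minimal sketch of that step, reusing the [human]/[robot] layout from the example above. `build_chat_prompt` and the way history is stored are assumptions for the illustration, not something taken from the NLP Cloud client.

```python
CONTEXT = (
    "This is a discussion between a [human] and a [robot].\n"
    "The [robot] is very nice and empathetic."
)


def build_chat_prompt(history, user_message, context=CONTEXT, separator="###"):
    """Rebuild the prompt from past (human, robot) turns plus the new message."""
    turns = [f"[human]: {human}\n[robot]: {robot}" for human, robot in history]
    # The last turn is left open so the model writes the robot's reply.
    turns.append(f"[human]: {user_message}\n[robot]:")
    return context + "\n" + f"\n{separator}\n".join(turns)


history = [("Hello nice to meet you.", "Nice to meet you too.")]
prompt = build_chat_prompt(history, "How is it going today?")
print(prompt)

# After the model answers, the new turn is stored for the next request.
history.append(("How is it going today?", "Not so bad, thank you! How about you?"))
```

Changing the tone then comes down to passing a different `context` string, for example one that describes the robot as sarcastic.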
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I want to start coding tomorrow because it seems to be so fun!
Intent: start coding
###
Show me the last pictures you have please.
Intent: show pictures
###
Search all these files as fast as possible.
Intent: search files
###
Can you please teach me Chinese next week?
Intent:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
learn chinese
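It is quite impressive how GPT-J detects the intent in your sentence. It works very well on more complex sentences too. You can even ask it to format the intent differently if you want. For example, you can automatically generate a JavaScript function name like "learnChinese".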
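To get the model itself to return names like "learnChinese", the "Intent:" lines of the few-shot examples would need to be written in that style. A different and simpler option, sketched here, is to keep the prompt as it is and convert the generated intent locally. `to_camel_case` is a hypothetical helper, not part of any library mentioned in this article.

```python
import re


def to_camel_case(intent):
    """Convert an intent such as 'learn chinese' into 'learnChinese'."""
    words = re.findall(r"[A-Za-z0-9]+", intent)
    if not words:
        return ""
    first, rest = words[0].lower(), words[1:]
    return first + "".join(word.capitalize() for word in rest)


print(to_camel_case("learn chinese"))
```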
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: Algeria recalled its ambassador to Paris on Saturday and closed its airspace to French military planes a day later after the French president made comments about the northern Africa country.
[Paraphrase]: Last Saturday, the Algerian government recalled its ambassador and stopped accepting French military airplanes in its airspace. It happened one day after the French president made comments about Algeria.
###
[Original]: President Macron was quoted as saying the former French colony was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France.
[Paraphrase]: Emmanuel Macron said that the former colony was lying and angry at France. He also said that the country was ruled by a "political-military system".
###
[Original]: The diplomatic spat came days after France cut the number of visas it issues for citizens of Algeria and other North African countries.
[Paraphrase]: Diplomatic issues started appearing when France decided to stop granting visas to Algerian people and other North African people.
###
[Original]: After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance.
[Paraphrase]:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True,
min_length=0,
max_length=50)
print(generation["generated_text"])
Output:
French President Emmanuel Macron hopes the diplomatic tension with Algeria will calm down. He wants to make progress in the relations with the former French colony. President Macron says he has a good relationship with his Algerian counterpart and he is glad that they have a cordial relationship. The president was quoted as saying that Algeria was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France. France cut the number of visas it issues for citizens of Algeria and other North African countries.
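GPT-J modified our paragraph while keeping the main meaning, which is what paraphrasing is all about. You could perfectly encourage GPT-J to return more original paraphrases by passing different examples in the input, and by playing with API parameters like temperature, top_p, and repetition penalty.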
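To give an intuition of what temperature and top_p do, here is a toy sketch of the usual definitions: temperature rescales the scores before the softmax, and top_p (nucleus sampling) keeps only the most likely tokens until their cumulative probability reaches the threshold. This is a generic illustration on made-up scores, not the actual code behind GPT-J or the NLP Cloud API.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtracted for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probabilities, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    ranked = sorted(enumerate(probabilities), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, probability in ranked:
        kept.append((index, probability))
        cumulative += probability
        if cumulative >= top_p:
            break
    total = sum(probability for _, probability in kept)
    filtered = [0.0] * len(probabilities)
    for index, probability in kept:
        filtered[index] = probability / total
    return filtered


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
sharp = apply_temperature(logits, 0.5)  # low temperature: more predictable
flat = apply_temperature(logits, 1.5)   # high temperature: more varied
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.8)
print(sharp, flat, nucleus)
```

A higher temperature or a higher top_p leaves more candidate tokens in play, which is why raising them tends to produce more original paraphrases.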
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: America has changed dramatically during recent years. Not only has the number of graduates in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering declined, but in most of the premier American universities engineering curricula now concentrate on and encourage largely the study of engineering science. As a result, there are declining offerings in engineering subjects dealing with infrastructure, the environment, and related issues, and greater concentration on high technology subjects, largely supporting increasingly complex scientific developments. While the latter is important, it should not be at the expense of more traditional engineering.
Rapidly developing economies such as China and India, as well as other industrial countries in Europe and Asia, continue to encourage and advance the teaching of engineering. Both China and India, respectively, graduate six and eight times as many traditional engineers as does the United States. Other industrial countries at minimum maintain their output, while America suffers an increasingly serious decline in the number of engineering graduates and a lack of well-educated engineers.
(Source: Excerpted from Frankel, E.G. (2008, May/June) Change in education: The cost of sacrificing fundamentals. MIT Faculty
[Summary]: MIT Professor Emeritus Ernst G. Frankel (2008) has called for a return to a course of study that emphasizes the traditional skills of engineering, noting that the number of American engineering graduates with these skills has fallen sharply when compared to the number coming from other countries.
###
[Original]: So how do you go about identifying your strengths and weaknesses, and analyzing the opportunities and threats that flow from them? SWOT Analysis is a useful technique that helps you to do this.
What makes SWOT especially powerful is that, with a little thought, it can help you to uncover opportunities that you would not otherwise have spotted. And by understanding your weaknesses, you can manage and eliminate threats that might otherwise hurt your ability to move forward in your role.
If you look at yourself using the SWOT framework, you can start to separate yourself from your peers, and further develop the specialized talents and abilities that you need in order to advance your career and to help you achieve your personal goals.
[Summary]: SWOT Analysis is a technique that helps you identify strengths, weakness, opportunities, and threats. Understanding and managing these factors helps you to develop the abilities you need to achieve your goals and progress in your career.
###
[Original]: Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.
Jupiter is primarily composed of hydrogen with a quarter of its mass being helium, though helium comprises only about a tenth of the number of molecules. It may also have a rocky core of heavier elements,[21] but like the other giant planets, Jupiter lacks a well-defined solid surface. Because of its rapid rotation, the planet's shape is that of an oblate spheroid (it has a slight but noticeable bulge around the equator).
[Summary]: Jupiter is the largest planet in the solar system. It is a gas giant, and is the fifth planet from the sun.
###
[Original]: For all its whizz-bang caper-gone-wrong energy, and for all its subsequent emotional troughs, this week’s Succession finale might have been the most important in its entire run. Because, unless I am very much wrong, Succession – a show about people trying to forcefully mount a succession – just had its succession. And now everything has to change.
The episode ended with Logan Roy defying his children by selling Waystar Royco to idiosyncratic Swedish tech bro Lukas Matsson. It’s an unexpected twist, like if King Lear contained a weird new beat where Lear hands the British crown to Jack Dorsey for a laugh, but it sets up a bold new future for the show. What will happen in season four? Here are some theories.
Season three of Succession picked up seconds after season two ended. It was a smart move, showing the immediate swirl of confusion that followed Kendall Roy’s decision to undo his father, and something similar could happen here. This week’s episode ended with three of the Roy siblings heartbroken and angry at their father’s grand betrayal. Perhaps season four could pick up at that precise moment, and show their efforts to reorganise their rebellion against him. This is something that Succession undoubtedly does very well – for the most part, its greatest moments have been those heart-thumping scenes where Kendall scraps for support to unseat his dad – and Jesse Armstrong has more than enough dramatic clout to centre the entire season around the battle to stop the Matsson deal dead in its tracks.
[Summary]:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True,
min_length=20,
max_length=200)
print(generation["generated_text"])
Output:
Season 3 of Succession ended with Logan Roy trying to sell his company to Lukas Matsson.
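Text summarization is a tricky task. GPT-J is very good at it as long as you give it the right examples. The size of the summary and its tone will depend a lot on the examples you created. For example, you would not write the same type of examples for simple summaries aimed at kids as for advanced medical summaries aimed at doctors. If the input size of GPT-J is too small for your summarization examples, you might want to fine-tune GPT-J for your summarization task.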
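Before resorting to fine-tuning, one simple option is to drop examples until the prompt fits the input limit. The sketch below uses a very rough estimate of 4 characters per token, which is an assumption; a real tokenizer would be more accurate, and the actual limit should be taken from the API documentation. `estimate_tokens` and `fit_examples` are hypothetical helpers.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate (assumed ratio, not a real tokenizer)."""
    return len(text) // chars_per_token + 1


def fit_examples(examples, query, max_input_tokens, separator="\n###\n"):
    """Drop examples from the front until the prompt fits the token budget."""
    kept = list(examples)
    while True:
        prompt = separator.join(kept + [query])
        if estimate_tokens(prompt) <= max_input_tokens or not kept:
            return prompt, len(kept)
        kept.pop(0)


# Placeholder texts standing in for long articles and their summaries.
examples = [
    "[Original]: " + "a" * 400 + "\n[Summary]: short one",
    "[Original]: " + "b" * 400 + "\n[Summary]: short two",
]
query = "[Original]: " + "c" * 200 + "\n[Summary]:"
prompt, kept_count = fit_examples(examples, query, max_input_tokens=200)
print(kept_count)
```

Fewer examples usually means less control over the size and tone of the summary, so this is a trade-off rather than a free fix.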
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: When the spaceship landed on Mars, the whole humanity was excited
Topic: space
###
Message: I love playing tennis and golf. I'm practicing twice a week.
Topic: sport
###
Message: Managing a team of sales people is a tough but rewarding job.
Topic: business
###
Message: I am trying to cook chicken with tomatoes.
Topic:""",
min_length=1,
max_length=5,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
food
Here is an easy and powerful way to categorize a piece of text thanks to the so-called "zero-shot learning" technique, without having to declare categories in advance.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
Keywords: information, search, resources
###
David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23.
Keywords: searching, missing, desert
###
I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
Keywords: document, understand, keyphrases
###
Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
Keywords:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
paragraphs, transformer, input, errors
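Keyword extraction is about getting the main ideas from a piece of text. This is an interesting natural language processing subfield that GPT-J handles very well. See below for keyphrase extraction (same thing, but with several words).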
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
Keywords: information retrieval, search query, relevant resources
###
David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23.
Keywords: searching son, missing after work, desert
###
I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
Keywords: document, help understand, resulting keyphrases
###
Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
Keywords:""",
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
large documents, paragraph, mean pooling
Same example as above, except that this time we do not want to extract a single word but several words (called a keyphrase).
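In both cases the model returns a comma-separated string, which usually needs to become a clean list before it is stored or indexed. A minimal sketch, assuming the output keeps the comma-separated layout of the examples (`parse_keywords` is a hypothetical helper):

```python
def parse_keywords(generated):
    """Split a comma-separated completion into unique, lowercased keywords."""
    seen, keywords = set(), []
    for item in generated.split(","):
        keyword = item.strip().lower()
        if keyword and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    return keywords


keyphrases = parse_keywords(" large documents, paragraph, mean pooling\n")
print(keyphrases)
```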
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Generate a product description out of keywords.
Keywords: shoes, women, $59
Sentence: Beautiful shoes for women at the price of $59.
###
Keywords: trousers, men, $69
Sentence: Modern trousers for men, for $69 only.
###
Keywords: gloves, winter, $19
Sentence: Amazingly hot gloves for cold winters, at $19.
###
Keywords: t-shirt, men, $39
Sentence:""",
min_length=5,
max_length=30,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
Extraordinary t-shirt for men, for $39 only.
You can ask GPT-J to generate a product description or an ad containing specific keywords. Here we only generate a simple sentence, but you could easily generate a whole paragraph if needed.
import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Title]: 3 Tips to Increase the Effectiveness of Online Learning
[Blog article]: <h1>3 Tips to Increase the Effectiveness of Online Learning</h1>
<p>The hurdles associated with online learning correlate with the teacher’s inability to build a personal relationship with their students and to monitor their productivity during class.</p>
<h2>1. Creative and Effective Approach</h2>
<p>Each aspect of online teaching, from curriculum, theory, and practice, to administration and technology, should be formulated in a way that promotes productivity and the effectiveness of online learning.</p>
<h2>2. Utilize Multimedia Tools in Lectures</h2>
<p>In the 21st century, networking is crucial in every sphere of life. In most cases, a simple and functional interface is preferred for eLearning to create ease for the students as well as the teacher.</p>
<h2>3. Respond to Regular Feedback</h2>
<p>Collecting student feedback can help identify which methods increase the effectiveness of online learning, and which ones need improvement. An effective learning environment is a continuous work in progress.</p>
###
[Title]: 4 Tips for Teachers Shifting to Teaching Online
[Blog article]: <h1>4 Tips for Teachers Shifting to Teaching Online </h1>
<p>An educator with experience in distance learning shares what he’s learned: Keep it simple, and build in as much contact as possible.</p>
<h2>1. Simplicity Is Key</h2>
<p>Every teacher knows what it’s like to explain new instructions to their students. It usually starts with a whole group walk-through, followed by an endless stream of questions from students to clarify next steps.</p>
<h2>2. Establish a Digital Home Base</h2>
<p>In the spirit of simplicity, it’s vital to have a digital home base for your students. This can be a district-provided learning management system like Canvas or Google Classrooms, or it can be a self-created class website. I recommend Google Sites as a simple, easy-to-set-up platform.</p>
<h2>3. Prioritize Longer, Student-Driven Assignments</h2>
<p>Efficiency is key when designing distance learning experiences. Planning is going to take more time and require a high level of attention to detail. You will not be able to correct mistakes on the fly or suddenly pivot when kids are disengaged.</p>
<h2>4. Individual Touchpoints Are Game-Changers</h2>
<p>You can create these touchpoints through any medium you like: emails, video messages, phone calls, messages through your learning management system, comments on shared documents, etc.</p>
###
[Title]: 3 Best Free Image Converter Software Programs
[Blog article]: <h1>3 Best Free Image Converter Software Programs</h1>
<p>Best free image converters for JPG, BMP, PSD, TIF, GIF, RAW, and more</p>
<h2>1. XnConvert</h2>
<p>XnConvert is the Swiss Army knife of image converters. With XnView, you can convert any of around 500 image formats to your choice of around 80 others. If you have a rare image format that you can't open, XnView can probably convert it.</p>
<h2>2. CoolUtils Online Image Converter</h2>
<p>The CoolUtils Online Image Converter is just that—an image converter that exists completely online, no download required. Unlike some online image converters, this one does the image converting for you in real-time—no waiting on an email link.</p>
<h2>3. FileZigZag</h2>
<p>FileZigZag is another online image converter service that will convert most common graphics formats. Just upload the original image, choose the desired output, and then wait for the download link to appear on the page.</p>
###
[Title]: 4 Best Tips to Find a Girlfriend
[Blog article]: """,
max_length=500,
length_no_input=True,
end_sequence="###",
remove_end_sequence=True,
remove_input=True)
print(generation["generated_text"])
Output:
<h1>Text Generation API</h1>
<h2>What is Text Generation?</h2>
<p>Text generation is the process of generating text from a block of text. It is not simply the inverse of summarization: a text generator will take a short piece of text as an input and generate a longer one.</p>
<p>For example you have the following text:</p>
Ahead of the opening of the meeting,
<p>In this case the goal of the API is to automatically generate the following:</p>
Ahead of the opening of the meeting, Prime Minister Justin Trudeau spoke to reporters at the Montreal airport where he was travelling to speak at the conference.
<p>So basically we want to leverage the output of the API as the input for the text generator, and we will then be able to generate some sentences based on the input we already have.</p>
<h2>Text generation with GPT-2</h2>
<p>We will leverage GPT-2 in order to have a starting point for the text generator. In this case, we will use the API as the input for the text generator, and we will use the output of the API as the input for GPT-2.
However, we want to fine-tune GPT-2 to generate more complex sentences and to have better results.</p>
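Isn't it impressive? This generated blog article is small, but you can generate much longer articles. The structure of the generated blog post really depends on the structure you used in your few-shot examples. To get more complex structures and more relevant content, fine-tuning GPT-J is the key.
As you can see, few-shot learning is a great technique that helps GPT-3, GPT-J, and GPT-NeoX achieve amazing things! The key here is to pass the correct context before making your request.
Even for simple text generation, it is recommended to pass as much context as possible in order to help the model.
I hope you found this useful! If you have questions about how to make the most of these models, please do not hesitate to ask.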
François
Full-stack engineer at NLP Cloud