Building Systems with the ChatGPT API 课程链接:https://learn.deeplearning.ai/chatgpt-building-system/
第一节 简介
介绍了两种 LLM 的情况:Base LLM 使用监督学习进行训练,其开发周期相当漫长,而使用 Instruction tuned LLM 开发 prompt-based AI 则可以将开发过程极大程度缩短。
第二节 Language Models, the Chat Format and Tokens
讲到了 LLM 的 tokenizor 机制,导致 AI 看到的英文是 sub-word 级别而不是单个字母,进而导致 AI 没办法完成将一个单词按字母顺序倒序输出等这类字母基本的任务。我认为在中文中也会遇到类似的问题,我使用 tiktokenizor 对中文做过测试,实践表明有些中文被一整个切割,而有些可能被切为好几份。
然后讲到了 Chat Format,将对话分为三个角色 system、user、assistant。其中涉及对于 assistant 的角色风格和行为的设定最好放到 system 中。
第三节 Classification
讲到了使用 GPT 对用户的问题进行分类的实践,例子是以一个客服角色对用户问题进行二级分类并要求 GPT 以JSON 格式返回。
其中强调了 delimiter (分隔符)的作用,将需要被分类的用户问题用 delimiter 包裹效果会更好。
import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']
def get_completion_from_messages(messages,
model="gpt-3.5-turbo",
temperature=0,
max_tokens=500):
response = openai.ChatCompletion.create(
model=model,
messages=messages,
temperature=temperature,
max_tokens=max_tokens,
)
return response.choices[0].message["content"]
delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with \
{
delimiter} characters.
Classify each query into a primary category \
and a secondary category.
Provide your output in json format with the \
keys: primary and secondary.
Primary categories: Billing, Technical Support, \
Account Management, or General Inquiry.
Billing secondary categories:
Unsubscribe or upgrade
Add a payment method
Explanation for charge
Dispute a charge
Technical Support secondary categories:
General troubleshooting
Device compatibility
Software updates
Account Management secondary categories:
Password reset
Update personal information
Close account
Account security
General Inquiry secondary categories:
Product information
Pricing
Feedback
Speak to a human
"""
user_message = f"""\
I want you to delete my profile and all of my user data"""
messages = [
{
'role':'system',
'content': system_message},
{
'role':'user',
'content': f"{
delimiter}{
user_message}{
delimiter}"},
]
response = get_completion_from_messages(messages)
print(response)
user_message = f"""\
Tell me more about your flat screen tvs"""
messages = [
{
'role':'system',
'content': system_message},
{
'role':'user',
'content': f"{
delimiter}{
user_message}{
delimiter}"},
]
response = get_comple