
Prompt Engineering

John Berryman
Hi! I'm John Berryman
Career 1:
● Aerospace Engineer (just long enough to get the merit badge)

Career 2:
● Search Technology Consultant
● Eventbrite Search Engineer
● Wrote a book. (Swore never to do so again.)
● GitHub Code Search

Career 3:
● GitHub Data Science
● GitHub Copilot Prompt Engineer
● Writing a book. (But why!?)

Career 4:
● LLM Application Consulting – Arcturus?
What is a Language Model?
How has this taken the world by storm?
What is a Large Language Model?

It's the same thing, just a lot more accurate.

● c. 2014 the top language models were Recurrent Neural Networks


● Sept 2014: the attention mechanism, introduced in "Neural Machine Translation by
Jointly Learning to Align and Translate", allowed a "soft search" of previous context.
● Jun 2017: got rid of RNNs, because "Attention Is All You Need" – introduced the
Transformer architecture.
● Jun 2018: chopped the Transformer in half in "Improving Language Understanding by
Generative Pre-Training" – only use the decoder side – this is GPT!
● Feb 2019: GPT-2 was trained on 10x the data in "Language Models are Unsupervised
Multitask Learners" … and things started getting weird.

"Our model, called GPT-2 (a successor to GPT), was trained simply to predict the
next word in 40GB of Internet text."

"Due to our concerns about malicious applications of the technology, we are not
releasing the trained model."
What is a Large Language Model

● GPT-2 was beating models trained for specific tasks


○ missing word prediction
○ summarization
○ sentiment analysis
○ pronoun understanding
○ entity extraction
○ part of speech tagging
○ question answering
○ text compression
○ translation
○ content generation

● But with great power comes great responsibility. Models can:


○ Generate misleading news articles
○ Impersonate others online
○ Automate the production of abusive or faked content to post on social media
○ Automate the production of spam/phishing content
(These are all from the Feb 2019 GPT-2 release article.)
What is a Large Language Model

"Our model, called GPT-2 (a successor to GPT), was trained simply to predict the
next word in 40GB of Internet text. …And we figured out that now you can just ask
it to do stuff, and it will! IT'S AMAZING!"

"Due to our concerns about malicious applications of the technology, we are not
releasing the trained model. (But apparently it will help you make drugs and bombs
and overthrow the government. So…)"

(These are all from the Feb 2019 GPT-2 release article.)
Prompt Crafting
technique #1: few-shot prompting

examples to set the pattern:

> How are you doing today?
< ¿Cómo estás hoy?

> My name is John.
< Mi nombre es John.

the actual task:

> Can I have fries with that?
< ¿Puedo tener papas fritas con eso?
"Language Models are Few-Shot Learners" May 2020


Prompt Crafting technique #2: chain-of-thought
reasoning

Q: It takes one baker an hour to


make a cake. How long does it
take 3 bakers to make 3 cakes?
A: 3

"Chain-of-Thought Prompting Elicits


Reasoning in Large Language Models"
Jan 2022
Prompt Crafting technique #2: chain-of-thought
reasoning
Q: Jim is twice as old as Steve. Jim
is 12 years old; how old is Steve?
A: In equation form: 12=2*a
where a is Steve's age. Dividing
both sides by 2 we see that a=6.
Steve is 6 years old.

Q: It takes one baker an hour to


make a cake. How long does it
take 3 bakers to make 3 cakes?
A: The amount of time it takes to
bake a cake is the same
regardless of how many cakes
are made and how many people
work on them. Therefore the
answer is still 1 hour.

"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" Jan 2022
Prompt Crafting technique #2: chain-of-thought
reasoning

Q: It takes one baker an hour to


make a cake. How long does it
take 3 bakers to make 3 cakes?
A: Let's think step-by-step. The
amount of time it takes to bake a
cake is the same regardless of
how many cakes are made and
how many people work on them.
Therefore the answer is still 1
hour.

"Large Language Models are Zero-Shot


Reasoners" May 2022
Prompt Crafting
technique #3: document mimicry

What if you found this scrap of paper on the ground?

# IT Support Assistant
The following is a transcript
between an award winning IT
support rep and a customer.

## Customer:
My cable is out! And I'm going to
miss the Superbowl!

## Support Assistant:

What do you think the rest of the paper would say?
Prompt Crafting
technique #3: document mimicry

● Document type is transcript.
● It tells a story to condition a particular response.
● It uses markdown to establish structure.

# IT Support Assistant
The following is a transcript
between an award winning IT
support rep and a customer.

## Customer:
My cable is out! And I'm going to
miss the Superbowl!

## Support Assistant:
Let's figure out how to diagnose
your problem…
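Wiring that document into a completion call is just string assembly. A sketch, with the model name and stop sequence as assumptions:

from openai import OpenAI

client = OpenAI()

# The prompt mimics a familiar document type (a markdown transcript), so the
# model's continuation naturally stays in the support-assistant role.
prompt = """\
# IT Support Assistant
The following is a transcript between an award winning IT
support rep and a customer.

## Customer:
My cable is out! And I'm going to miss the Superbowl!

## Support Assistant:
"""

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumed model choice
    prompt=prompt,
    max_tokens=200,
    stop=["## Customer:"],  # stop before the model writes the customer's next turn
)
print(response.choices[0].text)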
Prompt Crafting
Intuition: LLMs are Dumb Mechanical Humans.

● LLMs understand better when you use familiar language and constructs.
● LLMs get distracted. Don't fill the prompt with lots of "just in case"
information.
● LLMs aren't psychic. If information is neither in training nor in the prompt,
then they don't know it.
● If you look at the prompt and you can't make sense of it, an LLM is hopeless.
Building LLM Applications
Creating the Prompt (the hard part!)
● Collect context
● Rank context
● Trim context
● Assemble prompt (sketched below)
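A rough sketch of those four steps; every name, type, and scoring rule here is a hypothetical stand-in, not Copilot's actual code:

from dataclasses import dataclass

@dataclass
class Snippet:
    text: str
    relevance: float   # hypothetical relevance score
    token_count: int

def build_prompt(user_request: str, snippets: list[Snippet], budget: int) -> str:
    # Collect is assumed already done: `snippets` came from documents,
    # open tabs, tools, etc.
    # Rank: most relevant context first.
    ranked = sorted(snippets, key=lambda s: s.relevance, reverse=True)
    # Trim: keep snippets until the token budget is spent.
    kept, used = [], 0
    for s in ranked:
        if used + s.token_count <= budget:
            kept.append(s)
            used += s.token_count
    # Assemble: context first, the user's request last.
    return "\n\n".join([s.text for s in kept] + [user_request])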
Creating the Prompt: Copilot Code Completion
● Collect context – current document, open tabs, symbols, file path
● Rank context – file path → current document → open tabs → symbols
● Trim context – drop open tab snippets; truncate current document
● Assemble prompt:

// pkg/skills/search.go                      ← file path
// <consider this snippet from ../skill.go>
// type Skill interface {                    ← snippet from open tab
//   Execute(data []byte) (refs, error)
// }
// </end snippet>

package searchskill

import (
    "context"
    "encoding/json"
    "fmt"                                    ← current document
    "strings"
    "time"
)

type Skill struct {
                                             ← cursor
}
The Introduction of Chat

API:

messages = [
    {"role": "system",
     "content": "You are an award winning IT support rep.
                 Help the user with their request."},
    {"role": "user",
     "content": "My cable is out! And I'm going to miss the
                 Superbowl!"}
]

Document the API produces behind the scenes:

<|im_start|>system
You are an award winning IT support rep. Help the user with
their request.<|im_stop|>
<|im_start|>user
My cable is out! And I'm going to miss the Superbowl!<|im_stop|>
<|im_start|>assistant
Let's figure out how to diagnose your problem…

Benefits:
● Really easy for users to build assistants.
  ○ System messages make controlling behavior easy.
  ○ The assistant always responds with a complete thought and then stops.
● Safety is baked in:
  ○ Assistant will (almost) never respond with insults or instructions to make bombs.
  ○ Assistant will (almost) never hallucinate false information.
  ○ Prompt injection is (almost) impossible.

(ChatGPT Nov 30, 2022)
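A sketch of the same exchange through a chat completions endpoint (client setup and model name are assumptions):

from openai import OpenAI

client = OpenAI()

messages = [
    # The system message makes controlling behavior easy.
    {"role": "system",
     "content": "You are an award winning IT support rep. "
                "Help the user with their request."},
    {"role": "user",
     "content": "My cable is out! And I'm going to miss the Superbowl!"},
]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
# The assistant responds with a complete thought, then stops.
print(response.choices[0].message.content)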
The Introduction of Tools

Tool definition:
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the weather",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state"
        },
        "unit": {
          "type": "string",
          "description": "degrees Fahrenheit or Celsius",
          "enum": ["celsius", "fahrenheit"]
        }
      },
      "required": ["location"]
    }
  }
}

Input:
{"role": "user",
 "content": "What's the weather like in Miami?"}

Function Call:
{"role": "assistant",
 "function": {"name": "get_weather",
              "arguments": '{"location": "Miami, FL"}'}}

Real API request:
curl http://weathernow.com/miami/FL?deg=f
{"temp": 78}

Function Response:
{"role": "tool",
 "name": "get_weather",
 "content": "78ºF"}

Assistant Response:
{"role": "assistant",
 "content": "It's a balmy 78ºF"}

● Agents can reach out into the real world
  ○ Read information
  ○ Write information
● Model chooses to answer in text or run a tool
● Tools can be called in series or in parallel
● Tools can be interleaved with user and assistant text

(function calling, Jun 13, 2023)
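Here is a sketch of that round trip in Python. The tool schema follows the slide; the weather lookup is a stub and the model name is an assumption:

import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}]

def get_weather(location: str, unit: str = "fahrenheit") -> str:
    return "78ºF"  # stub standing in for the real weather service

messages = [{"role": "user", "content": "What's the weather like in Miami?"}]

# First call: the model chooses to answer in text or request a tool.
reply = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=tools,
).choices[0].message

if reply.tool_calls:
    call = reply.tool_calls[0]
    result = get_weather(**json.loads(call.function.arguments))
    messages.append(reply)  # the assistant's function-call turn
    messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    # Second call: the model folds the tool result into a text answer.
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(final.choices[0].message.content)  # e.g. "It's a balmy 78ºF"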
Building LLM Applications
Building LLM Applications:
Bag of Tools Agent
functions:
● getTemp()
● setTemp(degreesF)

user: make it 2 degrees warmer in here


assistant: getTemp()
function: 70ºF
assistant: setTemp(72)
function: success
assistant: Done!

user: actually… put it back


assistant: setTemp(70)
function: success
assistant: Done again, you fickle pickle!
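That conversation falls out of a simple loop: call the model, run whatever tool it requests, append the result, and repeat until the model answers in plain text. A sketch, with the thermostat stubbed out and all other details assumed:

import json
from openai import OpenAI

client = OpenAI()
current_temp = 70  # fake thermostat state

def getTemp() -> str:
    return f"{current_temp}ºF"

def setTemp(degreesF: int) -> str:
    global current_temp
    current_temp = degreesF
    return "success"

tools = [
    {"type": "function", "function": {
        "name": "getTemp", "description": "Read the thermostat",
        "parameters": {"type": "object", "properties": {}}}},
    {"type": "function", "function": {
        "name": "setTemp", "description": "Set the thermostat",
        "parameters": {"type": "object",
                       "properties": {"degreesF": {"type": "integer"}},
                       "required": ["degreesF"]}}},
]

messages = [{"role": "user", "content": "make it 2 degrees warmer in here"}]

while True:  # the agent loop: stop once the model answers in plain text
    reply = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools,
    ).choices[0].message
    if not reply.tool_calls:
        print(reply.content)  # e.g. "Done!"
        break
    messages.append(reply)
    for call in reply.tool_calls:
        fn = {"getTemp": getTemp, "setTemp": setTemp}[call.function.name]
        result = fn(**json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": result})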
Creating the Prompt: Copilot Chat
● Collect context:
○ References – files, snippets, and issues that users attach or tools produce
○ Prior messages
● Rank, Trim and Assemble:
○ must fit:
■ system message
■ function definitions (if we plan to use them)
■ user's most recent message
○ fit if possible:
■ all the function calls and evals that follow
■ the references that belong to each message
■ historic messages (most recent being most important)
○ fall back to no-function usage if the function definitions can't fit (this causes the
assistant to respond and the turn to complete)
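A sketch of that two-tier budget; the names and token accounting are illustrative assumptions:

def assemble(must_fit: list[str], fit_if_possible: list[str],
             budget: int, count_tokens) -> list[str]:
    # "Must fit" parts (system message, function definitions, and the
    # user's most recent message) are non-negotiable.
    used = sum(count_tokens(p) for p in must_fit)
    if used > budget:
        raise ValueError("the must-fit parts alone exceed the budget")
    # "Fit if possible" parts arrive ordered most-important-first
    # (recent messages and their references before older ones).
    kept = []
    for part in fit_if_possible:
        cost = count_tokens(part)
        if used + cost <= budget:
            kept.append(part)
            used += cost
    return must_fit + kept

If even the function definitions won't fit, drop them and retry with the smaller must-fit set; that is the no-function fallback above.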
Tips for Defining Tools
● Don't have "too many" tools - look for evidence of collisions
● Name tools simply and clearly (and in TypeScript format?)
● Don't copy/paste your API - keep arguments simple and few
● Keep function and arg descriptions short and consider what the model
knows
○ It probably understands public documentation.
○ It doesn't know about internal company acronyms.
● More on arguments
○ Nested arguments don't retain descriptions
○ You can use enum and default, but not minimum, maximum…
● Skill output – don't include extra "just-in-case" content
● Skill errors – when reasonable, send errors to model (validation errors)
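For instance, a hypothetical definition in the spirit of these tips (not a real Copilot skill): a simple name, short descriptions, few flat arguments, and an enum where it helps:

# Hypothetical tool definition following the tips above: simple name,
# short descriptions, flat arguments, enum + default where supported.
create_issue_tool = {
    "type": "function",
    "function": {
        "name": "create_issue",
        "description": "Create an issue in the current repository",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Issue title"},
                "body": {"type": "string", "description": "Issue body (markdown)"},
                "priority": {"type": "string",
                             "enum": ["low", "medium", "high"],
                             "default": "medium"},
            },
            "required": ["title"],
        },
    },
}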
Questions?

P.S. I'm also available for LLM


application consulting at
[email protected]