Class 4 Software
Class 4: The Software of Generative AI — Probabilities vs. Understanding
Prompt: organisational chart, pyramid, corporate ladder of artificial intelligence | Tool: Adobe Firefly
Class Agenda
1. Probabilities
2. Understanding
3. Implications
4. Human vs. AI
PROBABILITIES
HOW ARTIFICIAL INTELLIGENCE GENERATES CONTENT
We Learned the Format of Data for Generative AI Models
Generative AI models are typically trained in the prompt-response (e.g. text-to-text) format
Large-language model example:
Prompt (question / instruction): "Write a review for the generative AI class offered by Professor Dawei Wang at the University of Hong Kong."
Response (answer as text): "The gen-AI class by Prof. Dawei Wang is one of the best classes I've taken. It helped me to gain insights into gen-AI."
We Learned that Data Are Fed into Generative AI Models
For the model to learn, we feed the prompt-response training data into the model
Let’s Look at Large-Language Models
Specifically, let’s look at a large-language model we are all familiar with: the GPT models
Step 0: Train The Embedding Model
The first step is to convert, or preprocess, the words into standardised numbers
Embeddings are like unique DNAs for each word. We code texts into numeric forms to facilitate
training. OpenAI offers its embedding models in its API, such as the text-embedding-3-large.
[Figure: each token in the sentence is mapped to 3,072 unique numbers by the embedding model]
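To make the lookup idea concrete, here is a minimal toy sketch of an embedding table in Python. The vocabulary, the 4-dimensional vectors, and the whitespace tokenizer are all hypothetical stand-ins; a real model such as text-embedding-3-large learns its 3,072-dimensional vectors during training rather than drawing them at random.

```python
import random

# Toy embedding table: every token gets a fixed vector of numbers.
# text-embedding-3-large uses 3,072 dimensions; we use 4 to keep output readable.
EMBED_DIM = 4
random.seed(0)  # deterministic vectors for the demo

vocab = ["write", "a", "review", "for", "the", "class"]
embedding_table = {
    token: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for token in vocab
}

def embed(text: str) -> list[list[float]]:
    """Map a sentence to one vector per token (naive whitespace tokenizer)."""
    return [embedding_table[token] for token in text.lower().split()]

vectors = embed("Write a review for the class")
print(len(vectors), len(vectors[0]))  # 6 tokens, 4 numbers each
```

The key property shown here is that the same token always maps to the same vector, which is what makes the numeric representation "standardised".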
Step 1-A: Train The Transformer Model — Words
Transformer models learn associations between words
We randomly mask words in a sentence and ask the transformer model to predict the missing word. This is called masked language modeling. (Encoder models such as BERT are trained this way; GPT-style models instead learn to predict the next word from the preceding words, called causal language modeling.)
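A toy version of this fill-in-the-blank exercise can be sketched in pure Python. The three-sentence corpus and the [MASK] convention are hypothetical; a real model scores every word in its vocabulary by probability rather than matching whole sentences, but the training signal is the same idea.

```python
from collections import Counter

# Tiny hypothetical corpus for the fill-in-the-blank exercise.
corpus = [
    "the cat is sitting on the mat",
    "the dog is sitting on the rug",
    "the cat is sleeping on the mat",
]

def predict_masked(sentence_with_mask: str) -> str:
    """Guess the [MASK] token by checking which words fill the same slot in the corpus."""
    tokens = sentence_with_mask.split()
    i = tokens.index("[MASK]")
    candidates = Counter()
    for line in corpus:
        words = line.split()
        # A corpus sentence "votes" only if every non-masked position matches.
        if len(words) == len(tokens) and all(
            w == t for j, (w, t) in enumerate(zip(words, tokens)) if j != i
        ):
            candidates[words[i]] += 1
    return candidates.most_common(1)[0][0]

print(predict_masked("the cat is sitting on the [MASK]"))  # prints "mat"
```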
Step 1-B: Train The Transformer Model — Higher-Level
Transformer models learn associations between words, phrases, sentences and paragraphs
We stack multiple transformer layers. Each layer refines the representation of the input text, allowing the model to capture increasingly abstract and complex patterns.
[Figure: stacked encoder and decoder layers]
Step 2: Train The Generative Model
After the transformer model is PRE-TRAINED, we fine-tune it using prompt-response pairs
By fine-tuning the model on prompt-response pairs and exposing it to various inquiries and
responses, the model learns to generate responses that are relevant to the given context.
Prompt (question / instruction): "Write a review for the generative AI class offered by Professor Dawei Wang at the University of Hong Kong."
During fine-tuning, the model adjusts its probabilities so that it can guess the response based on the prompt.
Response (answer as text): "The gen-AI class by Prof. Dawei Wang is one of the best classes I've taken. It helped me to gain insights into gen-AI."
Conditional Probabilities
Patterns in training data are ultimately coded as probabilities in the GPT model
Large-language model
Training data ("Apples are red." "Bananas are yellow." "Oranges are orange.") → GPT model
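Using the slide's three training sentences, here is a sketch of how such conditional probabilities can be estimated from counts. This is a bigram model, a drastic simplification of what a GPT model learns, but the principle is the same: patterns in the training data become P(next word | context).

```python
from collections import Counter, defaultdict

# The slide's training data.
training_data = ["apples are red", "bananas are yellow", "oranges are orange"]

# Count how often each word follows each context word.
follow_counts = defaultdict(Counter)
for sentence in training_data:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follow_counts[prev][nxt] += 1

def conditional_prob(next_word: str, given: str) -> float:
    """Estimate P(next_word | given) from the counts."""
    total = sum(follow_counts[given].values())
    return follow_counts[given][next_word] / total if total else 0.0

print(conditional_prob("are", "apples"))  # 1.0: "apples" is always followed by "are"
print(conditional_prob("red", "are"))     # ~0.33: "are" is followed by red/yellow/orange
```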
CONDITIONAL PROBABILITIES
An Important Implication
This is called AUTOREGRESSIVENESS: the model generates one token at a time, and each generated token is fed back in as part of the input for predicting the next one.
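A minimal sketch of that autoregressive loop, reusing a bigram count model built from hypothetical training sentences; each chosen word is appended to the output and becomes the input for the next step.

```python
from collections import Counter, defaultdict

# Bigram "model" learned from toy training data (hypothetical).
training_data = ["apples are red", "bananas are yellow", "oranges are orange"]
model = defaultdict(Counter)
for sentence in training_data:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1

def generate(start: str, max_words: int) -> str:
    """Autoregressive loop: each predicted word is fed back in as the next input."""
    out = [start]
    for _ in range(max_words):
        next_counts = model[out[-1]]
        if not next_counts:
            break  # no known continuation
        out.append(next_counts.most_common(1)[0][0])  # greedy: most probable next word
    return " ".join(out)

print(generate("bananas", 5))
```

Note the greedy choice at each step; real models sample from the distribution instead, which is where temperature (discussed later in the deck) comes in.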
UNDERSTANDING
HOW HUMAN INTELLIGENCE GENERATES CONTENT
Let’s Look at How Humans Generate Text
Let’s use the same example to learn how humans conduct this fill-in-the-gap exercise
Asking Humans to Fill in The Gap
Humans do not fill in the gap by drawing from a database of probabilities; instead, we ask what the relationship is between the cat and the word to be filled in.
Human’s Thought Process
When we attempt to fill in the blank (or generate), we think through the problem first
Thought process: Why is the cat sitting there? The cat is probably lazing around.
Humans Generate Using Understanding
When we attempt to fill in the blank (or generate), we use understanding and reasoning
Human Understanding Is Based on Explanations
Humans understand because we know the reasons and explanations behind relationships
"Cat is sitting on the mat." (cat and mat are linked)
"Cat is sitting on the mat because the mat is soft and comfortable." (the link now carries an explanation)
A Comparison: Machines Generate Using Probabilities
When machines attempt to fill in the blank (or generate), they optimize: they pick the word that maximizes the conditional probability.
But Machines Can Generate Explanations Too?
But machines can also tell us the reasons by drawing on probabilities learned from their training data.
[Diagram: candidate explanations scored by probability, e.g. P( … | … ) = 0.7 vs. P( … | … ) = 0.6]
Where Does Our Reasoning Come from?
It comes from our past experiences, education, family, friends, etc.
Social psychology (our social environment): How do our friends or family affect what we generate?
Cognitive psychology (our logic and reasoning): How does our intelligence affect what we generate?
Personality psychology (our traits and personality): How does our personality affect what we generate?
Evolutionary psychology (our ancestors and evolution): How do our instincts affect what we generate?
Biological psychology (our hormones and emotions): How do our emotions affect what we generate?
IMPLICATIONS
PROBABILITIES VS. UNDERSTANDING
Probability Is Faster to Develop
It is much faster to train a large-language model than to teach a person how to write
How long it takes to train a person vs. machine from the ground up
A few months is enough to crawl the data and train the models
Probability Is More Predictable (Note: but not as predictable as rule-based models)
Generative AI is more predictable; a human can grow up to become a genius or a criminal
The autoregressive nature of generative AI models gives them less variance, making them more predictable. Humans, in contrast, range widely in capability, from geniuses like Herbert Simon to the far less capable.
[Chart: capability distributions, narrow for machines and wide for humans]
Overall… Probability Is Efficient
It is much faster to train a large-language model than to teach a person how to write
Resources needed: machines require computational resources and energy for development and training; humans require money, time, personal effort, and societal educational structures.
Understanding Is More Precise
Humans can easily tell whether a relationship is causal
Correlational relationship: A is related to B.
Causal relationship: A leads to B.
Understanding vs. Probability in Precision
Asking AI to generate an astronaut riding a horse is easy, but what about a horse riding an astronaut?
Prompt: "A horse riding an astronaut" | Prompt: "A horse riding on the back of an astronaut"
Understanding Is Explainable
Human understanding is less of a “black-box” than machine-learning models
[Diagram: a causal relationship (A leads to B) explained through a mediating relationship]
Understanding vs. Probability in Explanation
It is almost impossible to know for sure how machines generated those outputs
Generative AI model: Input → [BLACK BOX] → Output
Overall… Understanding Is Flexible
Statistical generalizability of machines vs. conceptual flexibility of human understanding
Probabilities (machines): good at generalizing from training data to similar situations based on statistical patterns.
Understanding (humans): can apply knowledge to entirely novel situations using understanding and reasoning.
HUMAN VS. AI
INTERACTIVE CREATIVITY TASKS
[Chart: intelligence (IQ) comparison]
Our Data
[Table: type, sample / model, originality collection / model year, size. Models include Claude-3-Haiku (2023), Claude-3.5-Sonnet (2024), GPT-4o (2024).]
Comparing Creativity
Word Cloud
Average humans and machines Top-percentile humans and machines
Creative Geniuses
Demographic Groups
Temperature
Low temperature → the most likely answer; high temperature → the most random answer.
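A sketch of how temperature works: the model's raw next-token scores are divided by the temperature before being normalized into probabilities, so low temperatures are near-greedy and high temperatures flatten the distribution toward uniform. The token names and scores below are hypothetical.

```python
import math
import random

def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Turn raw scores into probabilities, sharpened (low T) or flattened (high T)."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    m = max(scaled.values())  # subtract the max for numerical stability
    exp = {tok: math.exp(s - m) for tok, s in scaled.items()}
    total = sum(exp.values())
    return {tok: e / total for tok, e in exp.items()}

def sample(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Draw one token from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical scores for the blank in "Bananas are ___".
logits = {"yellow": 3.0, "green": 1.5, "blue": 0.2}
print(sample(logits, temperature=0.1, rng=random.Random(0)))  # near-greedy: "yellow"
print(softmax_with_temperature(logits, temperature=100.0))    # near-uniform probabilities
```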
Creativity Comparison
Divergent Thinking:
- Generating multiple diverse answers or solutions.
- Breaking norms and overcoming fixed thinking patterns.
- Quick and flexible flow of ideas.
- Viewing problems from different perspectives.
Convergent Thinking:
- Generating singular effective answers or solutions.
- Finding connections between things or events.
- Deep extension of thought.
- Exploring new perspectives on existing relationships.
Creativity (Our Research): Divergent Thinking vs. Convergent Thinking
[Chart legend: not surpassing humans / already surpassed humans; can augment humans / can replace humans]
Human-Machine Collaboration Case: "Jiao Zi"
Personal Career:
2002: Began working on animated short films.
2008: The short film "Hit a Big Watermelon" won a special award at the Berlin Film Festival.
2013: The short film "The Boss's Woman" won international gold awards and other prizes.
2014: Raised only 50 million RMB to begin production on "Ne Zha."
Human-Machine Collaboration Case: "Ne Zha"
Small production (people + machine) → big success (film rating + box office)
Budget: RMB 40 m. Box office: RMB 5.036 billion.
Source: Baidu