X33fcon-2023-Empowering-Security-GenerativeAI-Fundamentals-Applications

Uploaded by

sibai0x7df
Copyright © All Rights Reserved
Empowering Security with Generative AI
Fundamentals and Applications of GPT models

@Cyb3rWard0g
AI Cybersecurity Enthusiast

The presentation/slides/information I share today represent my own personal views. I am speaking for myself and not on behalf of my employer, Microsoft Corporation.

Created by Midjourney
“Researcher wearing a scientist coat looking at the city
through a window at night cyberpunk style --ar 16:9 --v 5.1”
@Cyb3rWard0g

Principal Security Researcher at the Microsoft Security Research Organization
Founder of the Open Threat Research community! @OTR_Community
I ❤️ dogs and open source!
Empowering 🌎 https://github.com/OTRF

https://msrc-blog.microsoft.com/2022/09/07/curious-innovative-creative-community-driven-meet-cyb3rward0g-roberto-rodriquez/
• Fundamentals
• GPT Models
• Demos
Fundamentals

Created by Midjourney
“Two security researchers one woman one man as professors
explaining mathematics in a big blackboard with a lot of
mathematic formulas at night cyberpunk style --ar 16:9 --v 5.1”
ARTIFICIAL
Algorithms that enable machines to simulate or replicate human-like intelligence

INTELLIGENCE
The ability to process information and make decisions or take actions to achieve a desired outcome

[Diagram: INPUT → OUTPUT]
Artificial Intelligence (1950s)
In 1956, the Dartmouth Summer Research Project on Artificial Intelligence (DSRPAI) conference in New Hampshire marked the birth of the field of AI.

Machine Learning (1990s)
A subset of AI that enables machines to learn from data using statistical models and artificial neural networks to make decisions. Deep Blue (1997)

Deep Learning (2010s)
An ML technique that uses deep neural networks (multiple neuron layers) to learn more complex patterns from data. Transformers (2017), GPT (2018)

Generative AI (2020s)
Foundation Models (General Purpose): models pre-trained on large amounts of data that can be adapted for specific tasks, including generating written, visual, and auditory content. GPT-3 (2020)
Perceptron: Input → Weights (w1, w2) + Bias → Activation Function → Output
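The perceptron on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the sigmoid activation and all numeric values are assumptions chosen for the example.

```python
import math

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values: two inputs, two weights (w1, w2) and a bias.
output = perceptron(inputs=[1.0, 0.0], weights=[0.5, -0.5], bias=0.0)
print(round(output, 4))
```

Any other activation function (step, ReLU, tanh) could be swapped in for the sigmoid without changing the structure.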


Learning Happens

Input Layer → Hidden Layers → Output Layer

Forward Propagation
Backpropagation: How does each weight contribute to the overall loss?

[Plot: Loss vs. Weight w. The loss is the gap between the Actual Value and the Predicted Value. The sign of the gradient, (+) or (-), shows which way to move w until the loss is minimized.]
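The loss curve above can be walked numerically. The sketch below assumes a single weight, a squared-error loss, and a model of the form predicted = w * x; the learning rate and data point are made up for illustration.

```python
def loss(w, x, actual):
    """Squared error between the predicted value (w * x) and the actual value."""
    predicted = w * x
    return (predicted - actual) ** 2

def gradient(w, x, actual):
    """Derivative of the loss with respect to w."""
    return 2 * (w * x - actual) * x

# Hypothetical training example: the loss is minimized at w = 2.0.
x, actual = 2.0, 4.0
w = 0.0
learning_rate = 0.1

for _ in range(50):
    # A (+) gradient lowers w, a (-) gradient raises it.
    w -= learning_rate * gradient(w, x, actual)

print(w, loss(w, x, actual))
```

In a full network, backpropagation computes this same kind of gradient for every weight by applying the chain rule layer by layer.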

“Learned weights / parameters capture the knowledge and
patterns learned by the neural network during training”

[Figure: example weight matrices, one per layer]

[[0.2, 0.4, 0.1, 0.3],
 [0.5, 0.6, 0.9, 0.7]]

[[0.3, 0.8, 0.5, 0.2],
 [0.7, 0.1, 0.4, 0.9],
 [0.6, 0.2, 0.4, 0.3],
 [0.1, 0.6, 0.9, 0.5]]

[[0.9, 0.5, 0.3, 0.7],
 [0.4, 0.6, 0.2, 0.1],
 [0.8, 0.2, 0.7, 0.4],
 [0.2, 0.3, 0.1, 0.6]]

[[0.6],
 [0.2],
 [0.8],
 [0.3]]

“Neural networks often involve large matrices and vector operations, which can benefit from the parallel architecture of GPUs”

Input Layer → Hidden Layers → Output Layer
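To make the matrix view concrete, here is a small sketch of forward propagation as repeated matrix-vector products. The layer sizes, the ReLU activation, and all values are assumptions for illustration; on a GPU each row's multiply-and-sum would run in parallel.

```python
def matvec(matrix, vector):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    """Element-wise activation: negative values become 0."""
    return [max(0.0, v) for v in vector]

# Hypothetical weights: 4 inputs -> 2 hidden neurons -> 1 output.
hidden_weights = [[0.2, 0.4, 0.1, 0.3],
                  [0.5, 0.6, 0.9, 0.7]]
output_weights = [[0.6, 0.2]]

inputs = [1.0, 0.0, 1.0, 0.0]
hidden = relu(matvec(hidden_weights, inputs))
output = matvec(output_weights, hidden)
print(hidden, output)
```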
[Diagram: a recurrent cell H applies a recurrence relation at each step, carrying an internal state (h0, h1, ...) forward from one step to the next.]

Many to One: Sentiment Classification
One to Many: Text Generation
Many to Many: Text Translation

https://www.youtube.com/watch?v=ySEx_Bqxvvo&list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI&index=3
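A recurrence relation of this kind can be sketched with scalars. The tanh update and the weight values below are assumptions for illustration; real recurrent cells use weight matrices and vector states.

```python
import math

def rnn_step(h_prev, x, w_h, w_x):
    """Recurrence relation: the new state depends on the old state and the current input."""
    return math.tanh(w_h * h_prev + w_x * x)

def run_rnn(inputs, w_h=0.5, w_x=1.0, h0=0.0):
    """Apply the same cell at every step, carrying the internal state forward."""
    states = [h0]
    for x in inputs:
        states.append(rnn_step(states[-1], x, w_h, w_x))
    return states

states = run_rnn([1.0, 0.0, -1.0])
print(states)
```

A many-to-one task such as sentiment classification would read only the last state, while a many-to-many task would read a state at every step.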
The dog is brown → Encoder → Fixed Length Vector → Decoder → El perro es marrón

https://arxiv.org/abs/1406.1078
The dog is brown → Tokenizer → Input Tokens: The, dog, is, brown

Every word/token in the model vocabulary has a vector word embedding!

Input Tokens → Neural Network → Token Embeddings

I      0.2  0.3  0.4
brown  0.4  0.5  0.1
food   0.5  0.1  0.4
dog    0.3  0.7  0.6
yes    0.2  0.2  0.7
the    0.1  0.5  0.8
talk   0.4  0.6  0.9
is     0.6  0.2  0.3
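The lookup from tokens to vectors can be sketched with the values from the slide. The whitespace tokenizer and the lowercasing are simplifying assumptions; GPT models use subword tokenizers and learned embeddings with far more dimensions.

```python
# Toy vocabulary with the 3-dimensional embeddings shown on the slide.
EMBEDDINGS = {
    "i": [0.2, 0.3, 0.4],
    "brown": [0.4, 0.5, 0.1],
    "food": [0.5, 0.1, 0.4],
    "dog": [0.3, 0.7, 0.6],
    "yes": [0.2, 0.2, 0.7],
    "the": [0.1, 0.5, 0.8],
    "talk": [0.4, 0.6, 0.9],
    "is": [0.6, 0.2, 0.3],
}

def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

def embed(text):
    """Replace each token with its embedding vector."""
    return [EMBEDDINGS[token] for token in tokenize(text)]

vectors = embed("The dog is brown")
print(vectors)
```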
[Diagram: RNN encoder-decoder]
Encoder: an RNN (H) reads the word embeddings of "The dog is brown" in order, producing hidden states h1, h2, h3, h4. The last state serves as the input sentence embedding (the fixed-length vector).
Decoder: beginning from a Start token and the input sentence embedding, generates "El perro es marrón".

https://arxiv.org/abs/1706.03762
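[Diagram: Recurrent Neural Networks vs. Transformers]
Recurrent Neural Networks: the encoder steps through the word embeddings of "The dog is brown" one at a time, passing hidden states h1, h2, h3 between steps.
Transformers: the embeddings of "The dog is brown" are taken in together.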


Transformers: Position-Aware Encoding

Input: The dog is brown
Embeddings: each token embedding is paired with a position encoding (P0, P1, P2, P3)
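One way to build P0..P3 is the sinusoidal scheme from the "Attention Is All You Need" paper cited in this deck. The slide does not say which scheme it depicts, so treat this as an assumed example; the 4-dimensional embeddings are made up.

```python
import math

def positional_encoding(position, dim):
    """Sinusoidal encoding: sine on even indices, cosine on odd indices."""
    encoding = []
    for i in range(dim):
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding

def add_positions(embeddings):
    """Add the position encoding P_pos to the embedding at each position."""
    dim = len(embeddings[0])
    return [
        [e + p for e, p in zip(vector, positional_encoding(pos, dim))]
        for pos, vector in enumerate(embeddings)
    ]

# Hypothetical: the same token embedding placed at positions 0 and 1.
token = [0.3, 0.7, 0.6, 0.1]
position_aware = add_positions([token, token])
print(position_aware)
```

Because the encoding differs per position, the same word ends up with a different vector depending on where it appears in the input.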


Self-Attention

• “The animal didn’t cross the street because it was too tired”
• Self-attention associates “it” with “animal”.
• Self-attention looks at other positions in the input sequence for clues that can help lead to a better encoding for this word.

http://jalammar.github.io/illustrated-transformer/
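The association between “it” and “animal” can be sketched with scaled dot-product attention. This simplified version uses the raw token vectors as queries, keys, and values; a real transformer first applies learned projection matrices. The 2-dimensional vectors are invented so that “it” points in nearly the same direction as “animal”.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Each position attends to every position, including itself."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, vectors))
                        for j in range(dim)])
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical vectors for three tokens of the example sentence.
tokens = ["animal", "street", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
outputs, weights = self_attention(vectors)
it_weights = dict(zip(tokens, weights[2]))
print(it_weights)
```

With these vectors, the row of weights for “it” places more weight on “animal” than on “street”, which is the behaviour the slide describes.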
Self-Attention

https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
[Figure: many stacked copies of an example weight matrix]

[[0.9, 0.5, 0.3, 0.7],
 [0.4, 0.6, 0.2, 0.1],
 [0.8, 0.2, 0.7, 0.4],
 [0.2, 0.3, 0.1, 0.6]]

Large Number of Parameters

Text Documents → Input Layer → Hidden Layers → Output Layer

“Parameters represent the model's understanding of the statistical relationships between words and phrases in a language”

https://huggingface.co/learn/nlp-course/chapter1/4?fw=pt
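As a deliberately tiny stand-in for "statistical relationships between words", the sketch below counts which word follows which in a made-up corpus and predicts the most frequent follower. A GPT model learns far richer relationships with billions of parameters; here the "parameters" are just the counts.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical training text.
corpus = ["the dog is brown", "the dog is friendly", "the cat is brown"]
counts = train_bigram(corpus)
print(predict_next(counts, "the"), predict_next(counts, "is"))
```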
Once upon a time, there was a unicorn that lived in a magical forest

Prompt: What is the capital of France?
Completion: What is France’s largest city? What is France’s population?

Prompt: What is the capital of France?
Completion: The capital of France is Paris.

https://learn.deeplearning.ai/chatgpt-prompt-eng/lesson/1/introduction
GPT Models

Created by Midjourney
“Two security researchers one woman one man as professors
explaining mathematics in a big blackboard with a lot of
mathematic formulas at night cyberpunk style --ar 16:9 --v 5.1”
Input Layer Hidden Layers Output Layer
https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/llm-models#what-is-a-baseline-comparison-rubric-for-llm-ais
https://platform.openai.com/docs/models/gpt-4
https://arxiv.org/abs/1706.03762
Prompt

[Image of Baby Stevie] as a dog cartoon with a transparent background --v 5.1

https://www.midjourney.com/
Foundation Model: Sit Down, Come, Stay

Foundation Model + Special Training: Rescue Dog, Service Dog, Police Dog

https://www.midjourney.com/
Traditional ML: Label → Train → Predict

Prompt-Based ML: Prompt → Predict

Visual Prompting Livestream With Andrew Ng: https://www.youtube.com/watch?v=FE88OOUBonQ&t=129s


NLP Area | Task | Description
Natural Language Understanding (NLU) | Named Entity Recognition | Identifying and categorizing named entities, such as names, locations, organizations, etc.
Natural Language Understanding (NLU) | Sentiment Analysis | Determining the sentiment or emotion expressed in a piece of text.
Natural Language Understanding (NLU) | Language Parsing | Analyzing the syntactic structure of sentences.
Natural Language Understanding (NLU) | Intent Recognition | Identifying the intention or purpose behind a user's input.
Natural Language Generation (NLG) | Text Summarization | Generating concise summaries of longer documents or articles.
Natural Language Generation (NLG) | Dialogue Systems | Generating conversational responses or dialogues in chatbots or virtual assistants.
Natural Language Generation (NLG) | Text Generation | Creating written content, such as news articles, stories, or product descriptions.
Natural Language Generation (NLG) | Text Translation | Translating text from one language to another.
Demos

Created by Midjourney
“Two security researchers one woman one man as professors
explaining mathematics in a big blackboard with a lot of
mathematic formulas at night cyberpunk style --ar 16:9 --v 5.1”
https://github.com/Cyb3rWard0g/GPT-Security-Adventures
https://thedfirreport.com/2022/03/21/phosphorus-automates-initial-access-using-proxyshell/
https://github.com/pinecone-io/examples/blob/master/generation/langchain/handbook/05-langchain-retrieval-augmentation.ipynb
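[Diagram: retrieval-augmented question answering over ATT&CK Groups]
Indexing: ATT&CK Database → ATT&CK Python Client → ATT&CK Groups → Document → Tokenizing → Embedding → Vector Database
Answering: Query → Embedding → Retriever (Vector Database) → Relevant Documents → Context + Query → LLM → Answer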
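The retrieval step in this pipeline can be sketched without any external service. Everything below is a stand-in: the word-count "embedding" replaces a real embedding model, the list replaces a vector database, and the two documents are invented placeholders, not ATT&CK content.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Combine the retrieved context with the query, ready to send to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical documents standing in for threat group descriptions.
documents = [
    "group alpha sends phishing emails with malicious attachments",
    "group beta exploits public facing web servers",
]
query = "which group uses phishing emails"
prompt = build_prompt(query, documents)
print(prompt)
```

The resulting prompt is what would be passed to the LLM, so the answer is grounded in the retrieved documents instead of only the model's parameters.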
Agents use an LLM to determine which actions to take and in what order. An action can either be using
a tool and observing its output or returning to the user.
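The agent loop described above can be sketched with a stub in place of the LLM. The `fake_llm` function, the `lookup_group` tool, and the action format are all hypothetical; a real agent would ask an actual model which action to take next.

```python
def lookup_group(name):
    """Hypothetical tool: returns a canned string instead of querying a real source."""
    return f"(placeholder description for {name})"

TOOLS = {"lookup_group": lookup_group}

def fake_llm(question, observations):
    """Stand-in for the LLM: chooses the next action from what has been observed."""
    if not observations:
        return {"action": "lookup_group", "input": question}
    return {"action": "final_answer", "input": observations[-1]}

def run_agent(question, decide=fake_llm, max_steps=5):
    """Loop: pick an action, run the tool, observe its output, or return to the user."""
    observations = []
    for _ in range(max_steps):
        step = decide(question, observations)
        if step["action"] == "final_answer":
            return step["input"]
        tool = TOOLS[step["action"]]
        observations.append(tool(step["input"]))
    return None  # Gave up after max_steps without a final answer.

answer = run_agent("example group")
print(answer)
```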
2108.07258.pdf (arxiv.org)

https://www.youtube.com/watch?v=QDX-1M5Nj7s&t=5s

https://www.youtube.com/watch?v=hfIUstzHs9A

https://www.youtube.com/watch?v=FE88OOUBonQ&t=129s
Tensor2Tensor Intro - Colaboratory (google.com)
Thank you
