
AI WORKPLACE FOUNDATIONS

Understanding LLMs
DAY 2, MODULE 2
Agenda

01 What is an LLM?

02 Key Timelines
03 LLM – Under the Hood
04 LLM Capabilities
05 Getting More Out of LLMs
06 Building LLM-Powered Assistants
What is a Large Language Model?

• A large language model is a trained deep-learning model that contextually understands human language and can generate text in a human-like fashion (a minimal generation sketch follows below).

• LLMs are trained on vast amounts of text data to develop a deep understanding of language structures and meanings.
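To make "generate text in a human-like fashion" concrete, here is a minimal sketch using the Hugging Face `transformers` pipeline; the small "gpt2" model is an illustrative assumption, not one this module names:

```python
# A minimal sketch of LLM text generation via the Hugging Face
# `transformers` pipeline. "gpt2" is a small illustrative model,
# not one recommended by this module.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model continues the prompt with contextually plausible text.
result = generator("Large language models are", max_new_tokens=30)
print(result[0]["generated_text"])
```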
What is a Large Language Model?

• The "large" in large language models refers to three things: the huge amount of training data used, the massive scale of the model's architecture, and the costly computational resources required for training.

• They are typically based on transformer architectures, which rely on self-attention mechanisms that allow the model to capture long-range dependencies between words in a sentence.

[Diagram: breakdown of the term "Large Language Model": Large = lots of parameters; Language = designed for NLP tasks; Model = neural network]
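As a rough illustration of the self-attention mechanism just described, the sketch below implements scaled dot-product attention in NumPy, with toy dimensions and random weights standing in for learned parameters:

```python
# A minimal NumPy sketch of scaled dot-product attention, the core
# of the self-attention mechanism. Toy sizes: 4 tokens, 8-dim
# embeddings; the weight matrices are random stand-ins for what a
# trained transformer would learn.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8

x = rng.normal(size=(seq_len, d_model))    # token embeddings
W_q = rng.normal(size=(d_model, d_model))  # learned in practice
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Every token scores every other token, so dependencies can span
# the whole sequence regardless of distance between words.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
output = weights @ V

print(weights.round(2))  # each row sums to 1: attention over all positions
```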
Key Timelines

In the Beginning (’50s – ’80s): Symbolic AI and rule-based systems (ELIZA); statistical methods and the start of connectionism.

Statistical Revolution (’90s – 2000s): Rise of probabilistic models (n-gram); neural networks (feedforward, RNN).

Deep Learning Age (2010s): Word embeddings; sequence modelling (RNN, LSTM); the attention mechanism.

Age of Transformers (2017 & beyond): Rise of the attention mechanism; pretraining with masked modelling; parameter scaling.
LLMs – Under the Hood

LLMs follow a two-step training process:

• Pre-training: The model learns from massive amounts of unlabeled text data. Using self-supervised learning, it learns to predict masked or corrupted words in the input, allowing it to capture rich contextual representations (see the sketch after this list).

• Fine-tuning: The model is further trained on specific tasks using labeled data to specialize its language understanding for various applications. This is known as transfer learning, which allows the model to generalize its capabilities to various downstream NLP tasks.
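The pre-training objective above can be seen in action with a minimal sketch using the Hugging Face fill-mask pipeline; `bert-base-uncased` is an illustrative model choice, not one named in this module:

```python
# A minimal sketch of masked-word prediction, the self-supervised
# objective used during pre-training. Assumes the Hugging Face
# `transformers` package; bert-base-uncased is an illustrative choice.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model ranks likely words for the [MASK] token using the
# contextual representations it learned during pre-training.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```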
LLM Capabilities

• Conversation and dialogue: mimicking different writing styles, adapting to various genres, and producing contextually appropriate responses.
• Language translation, capturing nuances and idiomatic expressions.
• Document summarization and knowledge extraction from a wide range of sources.
• Intelligent text suggestion and completion based on partial input.
• Sentiment analysis, distinguishing positive, negative, and neutral tones (see the sketch after this list).
• Creative content generation, including fictional stories, poetry, and script dialogues.
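As one worked example from this list, a minimal sentiment-analysis sketch using the `transformers` pipeline; the default model it downloads is an assumption of the sketch, and any comparable classifier would do:

```python
# A minimal sketch of the sentiment-analysis capability listed above.
# The pipeline downloads a default English sentiment model.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

for text in ["I love this product!", "The service was disappointing."]:
    result = classifier(text)[0]
    print(f"{text!r} -> {result['label']} ({result['score']:.2f})")
```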
More LLM Capabilities

Coding Copilot
• Assist developers in completing code snippets, suggesting functions, and debugging.
• Generate technical documentation from code, or explain code in simple language.
• Code completion tools: Codex (OpenAI), GitHub Copilot, AlphaCode (DeepMind), TabNine, IntelliCode (Microsoft).

Data Analysis & Interpretation
• Automatic generation of reports from raw data to provide insights and summaries.
• Generative analytics enables running analysis using prompts; conversational analytics uses natural-language queries (NLQ) to query databases and fetch data for non-technical users.
Leading LLMs

Open Models
Llama 3, Llama 3.1, Mixtral 8x22B, Mixtral 8x7B, Mistral Large, Qwen2, Command-R

Closed Models
Claude 3.5, Gemini 1.5, Gemini Ultra, GPT-4, GPT-4 Turbo, GPT-4o
Getting More out of LLMs

The ambiguity of natural language affects how LLMs perform across different tasks. These issues can be addressed in two ways: prompt engineering and finetuning.

Prompt engineering
• This involves designing and refining input queries, known as prompts, to achieve desired responses from LLMs.
• The phrasing, structure, and context of a prompt directly influence the quality and relevance of the model's output.
• Understanding how to tune prompts effectively will help you obtain responses that are more accurate, nuanced, and relevant (a minimal sketch follows below).
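A minimal sketch of prompt engineering, assuming the `openai` Python client and the illustrative model name "gpt-4o" (neither is prescribed by this module): the same task is phrased vaguely and then with an explicit role, context, and output constraints.

```python
# A minimal prompt-engineering sketch: the same request phrased
# vaguely versus with explicit role, context, and constraints.
# Assumes the `openai` package and an OPENAI_API_KEY environment
# variable; "gpt-4o" is an illustrative model choice.
from openai import OpenAI

client = OpenAI()

vague_prompt = "Tell me about our sales."

structured_prompt = (
    "You are a business analyst. Summarize the Q3 sales figures below "
    "in exactly three bullet points, each under 20 words, and flag any "
    "month-over-month decline.\n\n"
    "July: $1.2M, August: $1.5M, September: $1.1M"
)

for prompt in (vague_prompt, structured_prompt):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content, "\n---")
```

The structured version constrains format and scope, which is exactly the phrasing-and-context lever the slide describes.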
Getting More out of LLMs

The ambiguity of natural language affects how LLMs perform across different tasks. These issues can be addressed in two ways: prompt engineering and finetuning.

Finetuning
• Finetuning involves taking a pre-trained LLM, such as GPT-3, and further training it on a domain-specific task.
• Finetuning a model on a more focused dataset enables the model to adapt to the specific requirements of the target task, resulting in improved performance and tailored responses.
• When you finetune an LLM, you teach it how to respond, so you don't necessarily have to do any prompt engineering subsequently (a minimal finetuning sketch follows below).
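A minimal finetuning sketch, assuming the Hugging Face `transformers` and `datasets` packages; the model (`distilbert-base-uncased`) and dataset (IMDB sentiment) are illustrative stand-ins for the domain-specific task described above, and this shows classification finetuning rather than GPT-3-style generation:

```python
# A minimal finetuning sketch: adapt a small pre-trained model to a
# focused, labeled dataset (transfer learning). Model and dataset
# names are illustrative choices, not ones named in this module.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Labeled, task-specific data: the "more focused dataset" the slide refers to.
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                           per_device_train_batch_size=8),
    # A small subset keeps the sketch quick to run.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```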
Building LLM-Powered Assistants

Key must-haves of LLM assistants:

• Contextual understanding: Should be able to comprehend and interpret user input, going beyond syntax to understanding nuanced contextual cues.
• Information retrieval: Should be accurate at retrieving and
presenting information, responding to general knowledge
queries, providing up-to-date weather forecasts, fetching
relevant news articles, and offering personalized
recommendations.
• Task management: Must have an exceptional task
management system tailored to user preferences and
priorities, including seamless organisation of to-do lists,
appointment scheduling & reminder setting.
Building LLM-Powered Assistants

Key must-haves of LLM assistants:


• Personalisation: Should be adaptive, incorporating user
preferences to provide tailored responses and
recommendations.
• Conversational interface: A simplified and intuitive user interface that gives the user a chat experience (see the sketch after this list).
• Voice interaction (optional): Enables users to effortlessly
communicate with it via speech. This integration of speech-
to-text and text-to-speech technologies fosters an intuitive
user experience to enhance convenience and accessibility.
• Security and Privacy: Should be able to safeguard user
data, prioritising trust and protection of sensitive information
at all times.
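Pulling several of these must-haves together, a minimal assistant-loop sketch: the system prompt encodes user preferences (personalisation) and the growing message list carries context between turns (contextual understanding, conversational interface). The `openai` client and the "gpt-4o" model name are illustrative assumptions, not part of this deck.

```python
# A minimal sketch of an LLM-powered assistant loop. Assumes the
# `openai` package and an API key; "gpt-4o" is an illustrative model.
from openai import OpenAI

client = OpenAI()

# System prompt carries user preferences (personalisation).
messages = [
    {"role": "system",
     "content": "You are a helpful assistant. The user prefers short, "
                "bulleted answers and uses metric units."}
]

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})  # preserve context
    print("Assistant:", reply)
```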
AI WORKPLACE FOUNDATIONS

Understanding LLMs
DAY 2, MODULE 2
