
LARGE LANGUAGE MODELS (LLMs)
Trainer: ILYAS NAYLE
ABOUT ME
Data Analyst and Machine Learning Engineer
Contact: [email protected]
OUTLINE
• Overview of LLMs
• Historical Context and Evolution of LLMs
• Architecture of LLMs
• Mathematical Foundations
• Key Concepts and Terminologies
• Applications of LLMs
OVERVIEW OF LLMs

INTRODUCTION
What Are Large Language Models?
• Definition of LLMs: deep learning models trained on vast amounts of text to understand and generate human language
• Importance of LLMs in today's technology landscape
HISTORICAL CONTEXT AND EVOLUTION OF LLMS
 Early Models and Their Limitations
 Simple rule-based systems
 Limited understanding and response
capabilities
 Breakthroughs
 IBM Watson (2011)
 Google’s BERT (2018)
 OpenAI’s GPT-3 (2020)
ARCHITECTURE OF LLMS

UNDERSTANDING THE ARCHITECTURE
Overview of Transformer Architecture
• Tone inflection: incorporating tone inflection into LLMs helps in generating responses that sound more natural and human-like.
• Volume control: volume control can enhance the expressiveness of text-to-speech systems powered by LLMs, allowing for adjustments in the loudness of the generated speech.
Importance of Attention Mechanisms
• Attention mechanisms are a core component of LLMs. They enable the model to weigh the importance of different words in a sequence.
• This improves the model's ability to understand context and nuance.
MATHEMATICAL FOUNDATIONS
LLMs rely on mathematical algorithms such as gradient descent and backpropagation to learn. These algorithms adjust the model's parameters to minimize errors, improving its ability to predict and generate accurate responses over time.
Gradient Descent
• Fundamental optimization algorithm used in training neural networks.
• Minimizes the loss function, improving model accuracy.

Backpropagation
• Essential for learning in neural networks.
• Calculates the gradient of the loss function with respect to each weight via the chain rule.
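As a minimal sketch of both ideas on a one-parameter linear model (illustrative only; the toy data, learning rate, and squared-error loss are assumptions, not from the slides):

```python
import numpy as np

# Toy data for y = 3x plus noise (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0   # model parameters
lr = 0.1          # learning rate (assumed)

for step in range(200):
    y_hat = w * x + b                    # forward pass
    loss = np.mean((y_hat - y) ** 2)     # mean squared error
    # Backpropagation: the chain rule gives the gradient of the loss
    # with respect to each parameter
    grad_w = np.mean(2 * (y_hat - y) * x)
    grad_b = np.mean(2 * (y_hat - y))
    # Gradient descent: step opposite the gradient to shrink the loss
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")
```

Training an LLM applies this same loop at scale, with the chain rule computed automatically across billions of parameters.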
KEY CONCEPTS AND TERMINOLOGIES
Transformer Architecture
The transformer architecture is a deep learning model introduced in the paper "Attention Is All You Need." It replaces recurrent neural networks (RNNs) with self-attention mechanisms, allowing for parallel processing of input data, which significantly speeds up training and inference.
Transformer Architecture
• Self-Attention: a mechanism that allows the model to focus on different parts of the input sequence when making predictions (see the sketch below).
• Encoder-Decoder: a common structure in transformer models. The encoder processes the input sequence and generates a set of features, which the decoder then uses to produce the output sequence.
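To make the Self-Attention bullet concrete, here is a minimal NumPy sketch of scaled dot-product attention; the sequence length, model width, and random inputs are assumptions for illustration:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                             # assumed toy sizes
X = rng.normal(size=(seq_len, d_model))             # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (4, 8): one vector per token
```

Each output row is a context-aware blend of every token's value vector, which is what lets the model weigh the whole sequence in parallel.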
Tokenization: the process of breaking down text into smaller units called tokens. Tokens can be words, subwords, or characters, depending on the tokenization strategy used.
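As a hedged illustration with the Hugging Face Transformers library (introduced later in this deck); the checkpoint is one common choice, and the exact subword split depends on its vocabulary:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint

text = "Tokenization breaks text into subwords."
print(tokenizer.tokenize(text))   # subword tokens, e.g. ['token', '##ization', ...]
print(tokenizer.encode(text))     # the integer IDs the model actually consumes
```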

Pre-training and Fine-tuning
• Pre-training: the model is trained on a large corpus of text data to learn general language patterns. This involves unsupervised learning, where the model learns to predict words in a sentence based on the context.
• Fine-tuning: after pre-training, the model is fine-tuned on a smaller, task-specific dataset. This involves supervised learning, where the model is trained to perform a specific task, such as sentiment analysis or question answering, using labeled data.
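A minimal fine-tuning sketch with the Hugging Face Trainer API, matching the sentiment-analysis example above; the checkpoint, dataset, subset size, and training settings are assumptions chosen for brevity:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "distilbert-base-uncased"   # assumed pre-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Labeled sentiment data (assumed dataset choice)
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

train = dataset["train"].shuffle(seed=42).select(range(1000)).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1),
    train_dataset=train,
)
trainer.train()   # supervised fine-tuning on the labeled examples
```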
APPLICATIONS OF LLMS
1. Content generation
2. Translation and localization
3. Search and recommendation
4. Virtual assistants
5. Sentiment analysis
6. Object detection in images
7. Image segmentation
OUTLINE
• Overview of LangGraph
• LangGraph Components
• ReAct Agent
• Agentic Search
• Persistence and Streaming
LANGGRAPH & LANGCHAIN
LangChain: a tool for constructing sequences of operations.
LangGraph: a framework for building modular, task-oriented applications.
Two capabilities for building an Agent:
1. Human Input
2. Persistence
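A minimal sketch of a LangGraph application in this spirit; the state shape, node name, and stub logic are assumptions, so check the LangGraph documentation for the current API:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    question: str
    answer: str

def answer_node(state: State) -> dict:
    # A real node would call an LLM; this stub is an assumption for illustration
    return {"answer": f"You asked: {state['question']}"}

graph = StateGraph(State)
graph.add_node("answer", answer_node)   # nodes are the modular units of work
graph.set_entry_point("answer")
graph.add_edge("answer", END)           # edges define the sequence of operations

app = graph.compile()
print(app.invoke({"question": "What is LangGraph?"}))
```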
SIMPLE REACT AGENT FROM SCRATCH (Blog Post)
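A hedged from-scratch sketch of the ReAct loop (Thought → Action → Observation); the calculate tool, the prompt, and the llm callable are assumptions, with any chat-completion client usable in its place:

```python
import re

def calculate(expression: str) -> str:
    """Toy tool: evaluate an arithmetic expression (assumed; eval is unsafe in production)."""
    return str(eval(expression))

TOOLS = {"calculate": calculate}

SYSTEM = """Answer by cycling through Thought, Action, Observation.
Available action -> Action: calculate: <expression>
When done, reply with: Answer: <final answer>"""

def react_agent(question: str, llm, max_turns: int = 5) -> str:
    """llm is any callable that maps a prompt string to the model's reply string."""
    transcript = f"{SYSTEM}\nQuestion: {question}\n"
    for _ in range(max_turns):
        reply = llm(transcript)                # call the language model
        transcript += reply + "\n"
        if "Answer:" in reply:                 # the model decided it is done
            return reply.split("Answer:")[-1].strip()
        match = re.search(r"Action: (\w+): (.+)", reply)
        if match:                              # the model requested a tool
            name, arg = match.groups()
            transcript += f"Observation: {TOOLS[name](arg)}\n"
    return "No answer within the turn limit."
```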
LANGGRAPH COMPONENTS
PROMPT HUB

LangChain Tools

AGENTIC SEARCH
A search engine specifically designed for AI agents.
PERSISTENCE AND STREAMING
LangChain Resources: https://www.langchain.com
• Persistence: keeps track of the state of an agent at a particular point in time, so it can be restored later.
• Streaming: emits a list of signals about what is going on at a specific moment.
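A hedged sketch of both capabilities, reusing the graph built earlier; MemorySaver and the thread_id config follow the LangGraph checkpointing docs, but verify against the release you use:

```python
from langgraph.checkpoint.memory import MemorySaver

# Persistence: a checkpointer saves the agent's state after every step,
# keyed by thread_id, so the conversation can be resumed later.
app = graph.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "demo-thread"}}

# Streaming: emit a signal per node execution instead of waiting for the end.
for event in app.stream({"question": "What is persistence?"}, config):
    print(event)
```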
OPEN-SOURCE MODELS WITH HUGGING FACE

OUTLINE
• Overview of Hugging Face
• Hugging Face Website
• Selecting Models
• NLP Examples
OVERVIEW OF HUGGING FACE
What is Hugging Face? Hugging Face is an AI company that has created a suite of open-source tools and models, primarily focusing on Natural Language Processing (NLP).
Hugging Face Ecosystem:
• Transformers Library
• Datasets Library
• Model Hub


SELECTING MODELS
Hugging Face Hub
NLP is a field of linguistics and machine learning, focused on everything related to human language.
Embeddings are a type of data representation in machine learning and natural language processing (NLP) that convert complex data, such as words or images, into continuous vector spaces. These vectors capture the semantic meaning of the data in a way that makes it easier for algorithms to process and analyze.

Sentence Embeddings
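A minimal sentence-embedding sketch with the sentence-transformers library; the model name is one common choice, not prescribed by the slides:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model choice
sentences = ["LLMs generate text.", "Large language models produce language."]
embeddings = model.encode(sentences)              # one vector per sentence
print(embeddings.shape)                           # e.g. (2, 384)

# Semantically similar sentences land close together in the vector space
print(util.cos_sim(embeddings[0], embeddings[1]))
```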
TEXT TO SPEECH
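As a hedged sketch only: recent releases of the Transformers library expose a text-to-speech pipeline; the model choice and output handling below are assumptions to verify against your installed version:

```python
from transformers import pipeline
import scipy.io.wavfile as wavfile

tts = pipeline("text-to-speech", model="suno/bark-small")  # assumed model
out = tts("Hello from a large language model!")

# The pipeline returns raw audio samples plus their sampling rate
wavfile.write("speech.wav", out["sampling_rate"], out["audio"].squeeze())
```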
MORE EXAMPLES:
There are many more tasks where LLMs can be applied.
LLMS WITH SEMANTIC SEARCH

AGENDA
• Overview of Semantic Search
• Keyword Search vs Semantic Search
• Keyword Search
• Embeddings
• Dense Retrieval
• ReRank
SEMANTIC SEARCH
Key Concepts
• Intent Understanding
• Contextual Relevance
• Entity Recognition
• Natural Language Processing (NLP)
• Synonymy and Polysemy

How It Works (see the sketch after this list)
1. Query Processing
2. Indexing
3. Retrieval
4. Ranking

Benefits of Semantic Search
1. Improved Relevance
2. Enhanced User Experience
3. Better Handling of Variants
4. Context-Aware Results
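To make the four "How It Works" steps concrete, here is a minimal dense-retrieval-style sketch; the corpus, query, and model are assumptions for illustration:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model

# 2. Indexing: embed the document corpus once, up front
docs = ["How to reset a password", "Best pasta recipes", "Password recovery steps"]
doc_vecs = model.encode(docs, normalize_embeddings=True)

# 1. Query processing: embed the query into the same vector space
query_vec = model.encode(["I forgot my login credentials"], normalize_embeddings=True)

# 3. Retrieval and 4. Ranking: cosine similarity (dot product of unit vectors)
scores = (doc_vecs @ query_vec.T).ravel()
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {docs[idx]}")      # most relevant first
```

Note that the query shares no keywords with the password documents; any intent match comes entirely from the embedding space.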
KEYWORD SEARCH VS SEMANTIC SEARCH: COMPARISON TABLE

Aspect | Keyword Search | Semantic Search
Query Matching | Exact keyword match | Contextual and intent-based matching
Context Understanding | Limited or none | High, understands the meaning behind the words
Synonym Handling | Poor, misses synonyms | Good, recognizes synonyms and related terms
Polysemy Handling | Poor, struggles with multiple meanings | Good, disambiguates based on context
Relevance | Often returns irrelevant results if keywords are present | High relevance based on context and intent
Ease of Implementation | Easy to implement | Complex to implement
Computational Resources | Low, fast and resource-efficient | High, requires more computational power
User Experience | Requires precise keywords | Allows natural language queries
Example Use Case | Basic search engines, document retrieval systems | Advanced search engines, virtual assistants, e-commerce product search
KEYWORD SEARCH CAPABILITIES
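To illustrate the exact-match limitation from the comparison table, a deliberately naive keyword-search sketch (the scoring is an assumption, far simpler than production schemes such as BM25):

```python
def keyword_search(query: str, docs: list[str]) -> list[str]:
    """Rank documents by how many exact query words they contain."""
    terms = set(query.lower().split())
    scored = [(sum(t in doc.lower().split() for t in terms), doc) for doc in docs]
    return [doc for score, doc in sorted(scored, reverse=True) if score > 0]

docs = ["How to reset a password", "Best pasta recipes", "Password recovery steps"]
print(keyword_search("password reset", docs))
print(keyword_search("login credentials", docs))  # [] -- synonyms are missed
```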
EMBEDDINGS
DENSE RETRIEVAL
PROBLEM?
IN UPCOMING SESSIONS
• ChatGPT Prompt Engineering for Developers
• Prompt Engineering with Llama
• Finetuning Large Language Models
E-mail: [email protected]
THANK YOU FOR LISTENING AND WELCOME AGAIN
