NVIDIA.
A Beginner’s Guide to
Large Language Models
Part 1
Table of Contents
Preface
Glossary
Introduction to LLMs
What Are Large Language Models (LLMs)?
Foundation Language Models vs. Fine-Tuned Language Models
Evolution of Large Language Models
Neural Networks
Transformers
How Enterprises Can Benefit From Using Large Language Models
Challenges of Large Language Models
Ways to Build LLMs
How to Evaluate LLMs
Notable Companies in the LLM Field
Popular Startup-developed LLM Apps
Preface
Language has been integral to human society for thousands of years. A long-prevailing theory, the laryngeal descent theory (LDT), suggests that speech, and thus language, may have emerged about 200,000 to 300,000 years ago, while newer research suggests it could have happened even earlier.
Regardless of when it first appeared, language remains the cornerstone of human communication. It has taken on an even greater role in today's digital age, where an unprecedented portion of the population can communicate via both text and speech across the globe.
This is underscored by the fact that 347.3 billion email messages are sent and received worldwide every day, and that five billion people (over 63% of the entire world population) send and receive text messages.
Language has therefore become a vast trove of information that can help enterprises extract valuable
insights, identify trends, and make informed decisions. As an example, enterprises can analyze texts
like customer reviews to identify their products’ best-selling features and fine-tune their future
product development.
Similarly, language production — as opposed to language analysis — is also becoming an increasingly
important tool for enterprises. Creating blog posts, for example, can help enterprises raise brand
awareness to a previously unheard-of extent, while composing emails can help them attract new
stakeholders or partners at an unmatched speed.
However, both language analysis and production are time-consuming processes that can distract employees and decision-makers from more important tasks. For instance, leaders often need to sift through vast amounts of text in order to make informed decisions, instead of making them based on key information that has already been extracted for them.
Enterprises can minimize these and other problems, such as the risk of human error, by employing
large language models (LLMs) for language-related tasks. LLMs can help enterprises accelerate and
largely automate their efforts related to both language production and analysis, saving valuable time
and resources while improving accuracy and efficiency.
Unlike previous solutions, such as rule-based systems, LLMs are incredibly versatile and can be easily
adapted to a wide range of language-related tasks, like generating content or summarizing legal
documentation.
The goal of this book is to help enterprises understand what makes LLMs so groundbreaking compared to previous solutions and how they can benefit from adopting or developing them. It also
aims to help enterprises get a head start by outlining the most crucial steps to LLM development,
training, and deployment.
To achieve these goals, the book is divided into three parts:
> Part 1 defines LLMs and outlines the technological and methodological advancements over the
years that made them possible. It also tackles more practical topics, such as how enterprises can
develop their own LLMs and the most notable companies in the LLM field. This should help
enterprises understand how adopting LLMs can unlock cutting-edge possibilities and revolutionize
their operations.
> Part 2 discusses five major use cases of LLMs within enterprises, including content generation,
summarization, and chatbot support. Each use case is exemplified with real-life apps and case
studies, so as to show how LLMs can solve real problems and help enterprises achieve specific
objectives.
> Part 3 is a practical guide for enterprises that want to build, train, and deploy their own LLMs. It
provides an overview of the necessary prerequisites and the possible trade-offs of different
development and deployment methods. ML engineers and data scientists can use this as a
reference throughout their LLM development processes.
Hopefully, this will inspire enterprises that have not yet adopted or developed their own LLMs to do so soon in order to gain a competitive advantage and offer new SOTA services or products. The greatest benefits will, as usual, be reserved for early adopters and truly visionary innovators.
Glossary
Deep learning systems: Systems that rely on neural networks with many hidden layers to learn complex patterns.

Generative AI: AI programs that can generate new content, like text, images, and audio, rather than just analyze it.

Large language models (LLMs): Language models that recognize, summarize, translate, predict, and generate text and other content. They are called large because they are trained on large amounts of data and have many parameters, with popular LLMs reaching hundreds of billions of parameters.

Natural language processing (NLP): The ability of a computer program to understand and generate text in natural language.

Long short-term memory neural network (LSTM): A special type of RNN with more complex cell blocks that allow it to retain more past inputs.

Natural language generation (NLG): A part of NLP that refers to the ability of a computer program to generate human-like text.

Natural language understanding (NLU): A part of NLP that refers to the ability of a computer program to understand human-like text.

Neural network (NN): A machine learning algorithm in which the parameters are organized into consecutive layers. The learning process of NNs is inspired by the human brain. Much like humans, NNs "learn" important features via representation learning and require less human involvement than most other approaches to machine learning.

Perception AI: AI programs that can process and analyze data but not generate it, mainly developed before 2020.

Recurrent neural network (RNN): A neural network that processes data sequentially and can memorize past inputs.

Rule-based system: A system that relies on human-crafted rules to process data.

Traditional machine learning: A statistical approach that draws probability distributions of words or other tokens based on a large annotated corpus. It relies less on rules and more on data.

Transformer: A type of neural network architecture designed to process sequential data non-sequentially (see the self-attention sketch after this glossary).

Structured data: Data that is quantitative in nature, such as phone numbers, and can be easily standardized and adjusted to a pre-defined format that ML algorithms can quickly process.

Unstructured data: Data that is qualitative in nature, such as customer reviews, and difficult to standardize. Such data is stored in its native formats, like PDF files, before use.

Fine-tuning: A transfer learning method used to improve model performance on selected downstream tasks or datasets. It is used when the target task is similar to the pre-training task and involves copying the weights of a PLM and tuning them on the desired tasks or data.

Customization: A method of improving model performance by modifying only one or a few selected parameters of a PLM instead of updating the entire model. It involves using parameter-efficient techniques (PEFT).

Parameter-efficient techniques (PEFT): Techniques like prompt learning, LoRA, and adapter tuning, which allow researchers to customize PLMs for downstream tasks or datasets while preserving and leveraging the existing knowledge of the PLM. These techniques are used during model customization and allow for quicker training and often more accurate predictions.

Prompt learning: An umbrella term for two PEFT techniques, prompt tuning and p-tuning, which help customize models by inserting virtual token embeddings among discrete or real token embeddings (see the sketch after this glossary).

Adapter tuning: A PEFT technique that involves adding lightweight feed-forward layers, called adapters, between existing PLM layers and updating only their weights during customization while keeping the original PLM weights frozen (see the sketch after this glossary).

Open-domain question answering: Answering questions from a variety of different domains, like legal, medical, and financial, instead of just one domain.

Extractive question answering: Answering questions by extracting the answers from existing texts or databases.
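The transformer entry above describes an architecture that processes sequential data non-sequentially. The minimal PyTorch sketch below, which assumes nothing beyond the standard torch library, illustrates what that means: every token position is compared with every other position in a single matrix operation, rather than step by step as in an RNN. It is a simplified, single-head illustration without learned projection weights, masking, or multi-head logic.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Minimal self-attention: every position attends to every other position
    in one matrix operation, rather than sequentially as in an RNN."""
    d_k = q.size(-1)
    # (batch, seq, seq) matrix of pairwise similarity scores between positions.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)   # how much each position looks at the others
    return weights @ v                        # weighted mix of value vectors

# Toy usage: a batch of one sequence with 5 tokens and 16-dimensional embeddings.
x = torch.randn(1, 5, 16)
out = scaled_dot_product_attention(x, x, x)   # self-attention: q, k, v come from the same input
print(out.shape)                              # torch.Size([1, 5, 16])
```

In a full transformer, q, k, and v would be produced by learned linear projections of the input, and several such attention heads would run in parallel.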
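To make the prompt learning entry more concrete, here is a minimal, hypothetical sketch of prompt tuning in PyTorch. It assumes a Hugging Face-style pretrained model that exposes get_input_embeddings(), config.hidden_size, and an inputs_embeds argument; the class and attribute names are illustrative rather than taken from NVIDIA NeMo or any specific PEFT library. A small matrix of trainable "virtual token" embeddings is prepended to the embeddings of the real input tokens, and only that matrix is updated during customization.

```python
import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """Illustrative prompt tuning: learn a handful of 'virtual token' embeddings
    that are prepended to the real token embeddings of a frozen PLM."""

    def __init__(self, base_model, num_virtual_tokens=20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False                       # keep all PLM weights frozen

        hidden = base_model.config.hidden_size            # embedding width of the PLM
        # The only trainable parameters: one embedding vector per virtual token.
        self.virtual_tokens = nn.Parameter(torch.randn(num_virtual_tokens, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        # Embed the real (discrete) tokens with the frozen PLM embedding table.
        token_embeds = self.base_model.get_input_embeddings()(input_ids)

        # Prepend the learned virtual-token embeddings to every sequence in the batch.
        batch_size = input_ids.size(0)
        prompt = self.virtual_tokens.unsqueeze(0).expand(batch_size, -1, -1)
        inputs_embeds = torch.cat([prompt, token_embeds], dim=1)

        # Extend the attention mask so the virtual tokens are attended to as well.
        prompt_mask = torch.ones(batch_size, prompt.size(1),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)

        return self.base_model(inputs_embeds=inputs_embeds, attention_mask=attention_mask)
```

During training, only model.virtual_tokens would be handed to the optimizer, for example torch.optim.AdamW([model.virtual_tokens], lr=1e-3), so the pretrained weights are never touched.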
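Similarly, the sketch below illustrates the adapter tuning entry: a small bottleneck feed-forward block with a residual connection is appended to a frozen transformer layer, and only the adapter's weights are trained. The wrapper assumes, for simplicity, that the wrapped layer returns a plain tensor of hidden states; real transformer layers often return tuples, so this is a sketch of the idea rather than drop-in code for any particular model.

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Lightweight bottleneck feed-forward block with a residual connection."""

    def __init__(self, hidden_size, bottleneck_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)   # project down to a small bottleneck
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_size, hidden_size)     # project back up to the PLM width

    def forward(self, hidden_states):
        # The residual connection preserves the frozen layer's output;
        # the adapter only learns a small correction on top of it.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class AdaptedLayer(nn.Module):
    """Wraps one frozen PLM layer and appends a trainable adapter after it."""

    def __init__(self, frozen_layer, hidden_size):
        super().__init__()
        self.layer = frozen_layer
        for p in self.layer.parameters():
            p.requires_grad = False          # original PLM weights stay frozen
        self.adapter = Adapter(hidden_size)  # only these weights are updated

    def forward(self, hidden_states):
        return self.adapter(self.layer(hidden_states))
```

Because each adapter holds only about 2 * hidden_size * bottleneck_size weights plus biases, the trainable parameters remain a tiny fraction of the full PLM.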