A Beginner's Guide to Large Language Models
NVIDIA | A Beginner's Guide to Large Language Models, Part 1

Table of Contents

Preface
Glossary
Introduction to LLMs
What Are Large Language Models (LLMs)?
Foundation Language Models vs. Fine-Tuned Language Models
Evolution of Large Language Models
Neural Networks
Transformers
How Enterprises Can Benefit From Using Large Language Models
Challenges of Large Language Models
Ways to Build LLMs
How to Evaluate LLMs
Notable Companies in the LLM Field
Popular Startup-Developed LLM Apps

Preface

Language has been integral to human society for thousands of years. A long-prevailing theory, the laryngeal descent theory (LDT), suggests that speech, and thus language, may have evolved about 200,000 to 300,000 years ago, while newer research suggests it could have emerged even earlier.

Regardless of when it first appeared, language remains the cornerstone of human communication. It has taken on an even greater role in today's digital age, where an unprecedented portion of the population can communicate via both text and speech across the globe. This is underscored by the fact that 347.3 billion email messages are sent and received worldwide every day, and that five billion people (over 63% of the entire world population) send and receive text messages.

Language has therefore become a vast trove of information that can help enterprises extract valuable insights, identify trends, and make informed decisions. For example, enterprises can analyze texts such as customer reviews to identify their products' best-selling features and fine-tune their future product development. Similarly, language production, as opposed to language analysis, is also becoming an increasingly important tool for enterprises.
Creating blog posts, for example, can help enterprises raise brand awareness to a previously unheard-of extent, while composing emails can help them attract new stakeholders or partners at an unmatched speed.

However, both language analysis and production are time-consuming processes that can distract employees and decision-makers from more important tasks. For instance, leaders often need to sift through vast amounts of text in order to make informed decisions, rather than basing them on already-extracted key information.

Enterprises can minimize these and other problems, such as the risk of human error, by employing large language models (LLMs) for language-related tasks. LLMs can help enterprises accelerate and largely automate their efforts related to both language production and analysis, saving valuable time and resources while improving accuracy and efficiency. Unlike previous solutions, such as rule-based systems, LLMs are incredibly versatile and can be easily adapted to a wide range of language-related tasks, like generating content or summarizing legal documentation.

The goal of this book is to help enterprises understand what makes LLMs so groundbreaking compared to previous solutions and how they can benefit from adopting or developing them. It also aims to help enterprises get a head start by outlining the most crucial steps of LLM development, training, and deployment. To achieve these goals, the book is divided into three parts:

> Part 1 defines LLMs and outlines the technological and methodological advancements over the years that made them possible. It also tackles more practical topics, such as how enterprises can develop their own LLMs and the most notable companies in the LLM field. This should help
enterprises understand how adopting LLMs can unlock cutting-edge possibilities and revolutionize their operations.

> Part 2 discusses five major use cases of LLMs within enterprises, including content generation, summarization, and chatbot support. Each use case is illustrated with real-life apps and case studies to show how LLMs can solve real problems and help enterprises achieve specific objectives.

> Part 3 is a practical guide for enterprises that want to build, train, and deploy their own LLMs. It provides an overview of the necessary prerequisites and the possible trade-offs of different development and deployment methods. ML engineers and data scientists can use it as a reference throughout their LLM development processes.

Hopefully, this will inspire enterprises that have not yet adopted or developed their own LLMs to do so soon in order to gain a competitive advantage and offer new state-of-the-art (SOTA) services or products. As usual, the greatest benefits will be reserved for early adopters and truly visionary innovators.

Glossary

Deep learning systems: Systems that rely on neural networks with many hidden layers to learn complex patterns.

Generative AI: AI programs that can generate new content, like text, images, and audio, rather than just analyze it.

Large language models (LLMs): Language models that recognize, summarize, translate, predict, and generate text and other content. They are called "large" because they are trained on large amounts of data and have many parameters, with popular LLMs reaching hundreds of billions of parameters.

Natural language processing (NLP): The ability of a computer program to understand and generate text in natural language.

Long short-term memory neural network (LSTM): A special type of RNN with more complex cell blocks that allow it to retain more past inputs.

Natural language generation (NLG): A part of NLP that refers to the ability of a computer program to generate human-like text.

Natural language understanding (NLU): A part of NLP that refers to the ability of a computer program to understand human-like text.

Neural network (NN): A machine learning algorithm in which the parameters are organized into consecutive layers. The learning process of NNs is inspired by the human brain. Much like humans, NNs "learn" important features via representation learning and require less human involvement than most other approaches to machine learning.

Perception AI: AI programs that can process and analyze data but not generate it, mainly developed before 2020.

Recurrent neural network (RNN): A neural network that processes data sequentially and can memorize past inputs.

Rule-based system: A system that relies on human-crafted rules to process data.

Traditional machine learning: An approach that uses statistics, drawing probability distributions of words or other tokens based on a large annotated corpus. It relies less on rules and more on data.

Transformer: A type of neural network architecture designed to process sequential data non-sequentially.

Structured data: Data that is quantitative in nature, such as phone numbers, and can be easily standardized and adjusted to a pre-defined format that ML algorithms can quickly process.

Unstructured data: Data that is qualitative in nature, such as customer reviews, and is difficult to standardize. Such data is stored in its native formats, like PDF files, before use.

Fine-tuning: A transfer learning method used to improve model performance on selected downstream tasks or datasets. It is used when the target task is similar to the pre-training task, and it involves copying the weights of a PLM and tuning them on desired tasks or data.

Customization: A method of improving model performance by modifying only one or a few selected parameters of a PLM instead of updating the entire model. It involves using parameter-efficient techniques (PEFT).

Parameter-efficient techniques (PEFT): Techniques like prompt learning, LoRA, and adapter tuning, which allow researchers to customize PLMs for downstream tasks or datasets while preserving and leveraging the existing knowledge of the PLM. These techniques are used during model customization and allow for quicker training and often more accurate predictions.

Prompt learning: An umbrella term for two PEFT techniques, prompt tuning and p-tuning, which help customize models by inserting virtual token embeddings among discrete or real token embeddings.

Adapter tuning: A PEFT technique that involves adding lightweight feed-forward layers, called adapters, between existing PLM layers and updating only their weights during customization while keeping the original PLM weights frozen.

Open-domain question answering: Answering questions from a variety of different domains, like legal, medical, and financial, instead of just one domain.

Extractive question answering: Answering questions by extracting the answers from existing texts or databases.
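To make the adapter-tuning entry above concrete, here is a minimal, framework-free sketch in plain NumPy. All names (W_frozen, W_down, W_up) are hypothetical; the point is only to show the shape of the idea: a small bottleneck layer sits behind a frozen pre-trained transformation, a residual connection wraps around it, and only the adapter's weights would be updated during customization.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" layer: these weights stay frozen during customization.
W_frozen = rng.normal(size=(8, 8))

# Adapter: a lightweight bottleneck (down-project -> ReLU -> up-project).
# Only these weights would be trained; bottleneck size 2 keeps them few.
W_down = rng.normal(size=(8, 2)) * 0.01
W_up = rng.normal(size=(2, 8)) * 0.01

def forward(x):
    h = x @ W_frozen                                # frozen pre-trained path
    adapter_out = np.maximum(h @ W_down, 0) @ W_up  # trainable adapter path
    return h + adapter_out                          # residual connection

x = rng.normal(size=(1, 8))
y = forward(x)

frozen_params = W_frozen.size                # 64 weights, never updated
trainable_params = W_down.size + W_up.size   # 32 weights, the only ones tuned
print(frozen_params, trainable_params)
```

Because the adapter starts near zero and sits inside a residual connection, the customized model initially behaves almost exactly like the original PLM; in a real PLM the frozen block would be an entire transformer layer, so the trainable fraction is far smaller than the 2:1 ratio shown here.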
