Building Data-Driven Applications with LlamaIndex: A practical guide to retrieval-augmented generation (RAG) to enhance LLM applications
Building Data-Driven Applications with LlamaIndex - Andrei Gheorghiu
Building Data-Driven Applications with LlamaIndex
As this ebook edition doesn't have fixed pagination, the page numbers below are hyperlinked for reference only, based on the printed edition of this book.
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Nitin Nainani
Book Project Manager: Aparna Ravikumar Nair
Content Development Editor: Priyanka Soam
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Shankar Kalbhor
DevRel Marketing Coordinator: Vinishka Kalra
First published: May 2024
Production reference: 1150424
Published by
Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK
ISBN 978-1-83508-950-7
www.packtpub.com
For the past six months, the focus required to create this book has sadly kept me away from the people I love. To my family and friends, your understanding and support have been my harbor in the storm of long hours and endless revisions.
Andreea, your love has been the gentle beacon guiding me through this journey. To my daughter Carla and every young reader out there: never stop learning! Life is a journey with so many possible destinations. Make sure you are the one choosing yours. My dear friends at ITAcademy, you guys rock! Thanks for supporting me along the way. Also, finalizing this book would not have been possible without the dedicated efforts and unwavering commitment of the Packt team. I extend my heartfelt gratitude to everyone involved in this project.
– Andrei Gheorghiu
Contributors
About the author
Andrei Gheorghiu is a seasoned IT professional and accomplished trainer at ITAcademy with over two decades of experience as a trainer, consultant, and auditor. With an impressive array of certifications, including ITIL Master, CISA, ISO 27001 Lead Auditor, and CISSP, Andrei has trained thousands of students in IT service management, information security, IT governance, and audit. His consulting experience spans the implementation of ERP and CRM systems, as well as conducting security assessments and audits for various organizations. Andrei’s passion for groundbreaking innovations drives him to share his vast knowledge and offer practical advice on leveraging technology to solve real-world challenges, particularly in the wake of recent advancements in the field. As a forward-thinking educator, his main goal is to help people upskill and reskill in order to increase their productivity and remain relevant in the age of AI.
About the reviewers
Rajesh Chettiar specializes in AI and ML and brings over 13 years of experience in machine learning, Generative AI, automation, and ERP solutions. He is passionate about keeping up with cutting-edge advancements in AI and is committed to improving his skills to foster innovation.
Rajesh resides in Pune with his parents, his wife, Pushpa, and his son, Nishith. In his free time, he likes to play with his son, watch movies with his family, and go on road trips. He also has a fondness for listening to Bollywood music.
Elliot helped write part of the LlamaIndexTS (the TypeScript version of LlamaIndex) codebase. He is actively looking to take on new generative AI projects (as of early 2024) and is available on GitHub and LinkedIn.
I thank the Lord for everything. Thank you, Dad, Mom, and twin sister, for your amazing support. Thank you to my friends who gave me their honest opinions and helped me grow. Thank you, Yi Ding at LlamaIndex, for helping me start this GenAI journey, and Yujian Tang for introducing me to Yi and always being supportive of open source. Finally, thank you to everyone who has reached out to talk about generative AI; I learn new things every day from each of you.
Srikannan Balakrishnan is an experienced AI/ML professional and a technical writer with a passion for translating complex information into simpler forms. He has a background in data science, including AI/ML, which fuels his ability to understand the intricacies of the subject matter and present it in a way that is accessible to both technical and non-technical audiences. He also has experience in Generative AI and has worked with different clients to solve their business problems with the power of data and AI. Beyond his technical expertise, he is a skilled communicator with a keen eye for detail. He is dedicated to crafting user-friendly documentation that empowers readers to grasp new concepts and navigate complex systems with confidence.
Arijit Das is an experienced Data Scientist with over 5 years of commercial experience, providing data-driven solutions to Fortune 500 clients across the US, UK, and EU. With expertise in Finance, Banking, Logistics, and HR management, Arijit excels in the Data Science lifecycle, from data extraction to model deployment and MLOps. Proficient in Supervised and Unsupervised ML techniques, including NLP, Arijit is currently focused on implementing cutting-edge ML practices at Citi globally.
Table of Contents
Preface
Part 1: Introduction to Generative AI and LlamaIndex
1
Understanding Large Language Models
Introducing GenAI and LLMs
What is GenAI?
What is an LLM?
Understanding the role of LLMs in modern technology
Exploring challenges with LLMs
Augmenting LLMs with RAG
Summary
2
LlamaIndex: The Hidden Jewel - An Introduction to the LlamaIndex Ecosystem
Technical requirements
Optimizing language models – the symbiosis of fine-tuning, RAG, and LlamaIndex
Is RAG the only possible solution?
What LlamaIndex does
Discovering the advantages of progressively disclosing complexity
An important aspect to consider
Introducing PITS – our LlamaIndex hands-on project
Here’s how it will work
Preparing our coding environment
Installing Python
Installing Git
Installing LlamaIndex
Signing up for an OpenAI API key
Discovering Streamlit – the perfect tool for rapid building and deployment!
Installing Streamlit
Finishing up
One final check
Familiarizing ourselves with the structure of the LlamaIndex code repository
Summary
Part 2: Starting Your First LlamaIndex Project
3
Kickstarting Your Journey with LlamaIndex
Technical requirements
Uncovering the essential building blocks of LlamaIndex – documents, nodes, and indexes
Documents
Nodes
Manually creating the Node objects
Automatically extracting Nodes from Documents using splitters
Nodes don’t like to be alone – they crave relationships
Why are relationships important?
Indexes
Are we there yet?
How does this actually work under the hood?
A quick recap of the key concepts
Building our first interactive, augmented LLM application
Using the logging features of LlamaIndex to understand the logic and debug our applications
Customizing the LLM used by LlamaIndex
Easy as 1-2-3
The temperature parameter
Understanding how Settings can be used for customization
Starting our PITS project – hands-on exercise
Let’s have a look at the source code
Summary
4
Ingesting Data into Our RAG Workflow
Technical requirements
Ingesting data via LlamaHub
An overview of LlamaHub
Using the LlamaHub data loaders to ingest content
Ingesting data from a web page
Ingesting data from a database
Bulk-ingesting data from sources with multiple file formats
Parsing the documents into nodes
Understanding the simple text splitters
Using more advanced node parsers
Using relational parsers
Confused about node parsers and text splitters?
Understanding chunk_size and chunk_overlap
Including relationships with include_prev_next_rel
Practical ways of using these node creation models
Working with metadata to improve the context
SummaryExtractor
QuestionsAnsweredExtractor
TitleExtractor
EntityExtractor
KeywordExtractor
PydanticProgramExtractor
MarvinMetadataExtractor
Defining your custom extractor
Is having all that metadata always a good thing?
Estimating the potential cost of using metadata extractors
Follow these simple best practices to minimize your costs
Estimate your maximal costs before running the actual extractors
Preserving privacy with metadata extractors, and not only
Scrubbing personal data and other sensitive information
Using the ingestion pipeline to increase efficiency
Handling documents that contain a mix of text and tabular data
Hands-on – ingesting study materials into our PITS
Summary
5
Indexing with LlamaIndex
Technical requirements
Indexing data – a bird’s-eye view
Common features of all Index types
Understanding the VectorStoreIndex
A simple usage example for the VectorStoreIndex
Understanding embeddings
Understanding similarity search
OK, but how does LlamaIndex generate these embeddings?
How do I decide which embedding model I should use?
Persisting and reusing Indexes
Understanding the StorageContext
The difference between vector stores and vector databases
Exploring other index types in LlamaIndex
The SummaryIndex
The DocumentSummaryIndex
The KeywordTableIndex
The TreeIndex
The KnowledgeGraphIndex
Building Indexes on top of other Indexes with ComposableGraph
How to use the ComposableGraph
A more detailed description of this concept
Estimating the potential cost of building and querying Indexes
Indexing our PITS study materials – hands-on
Summary
Part 3: Retrieving and Working with Indexed Data
6
Querying Our Data, Part 1 – Context Retrieval
Technical requirements
Learning about query mechanics – an overview
Understanding the basic retrievers
The VectorStoreIndex retrievers
The DocumentSummaryIndex retrievers
The TreeIndex retrievers
The KnowledgeGraphIndex retrievers
Common characteristics shared by all retrievers
Efficient use of retrieval mechanisms – asynchronous operation
Building more advanced retrieval mechanisms
The naive retrieval method
Implementing metadata filters
Using selectors for more advanced decision logic
Understanding tools
Transforming and rewriting queries
Creating more specific sub-queries
Understanding the concepts of dense and sparse retrieval
Dense retrieval
Sparse retrieval
Implementing sparse retrieval in LlamaIndex
Discovering other advanced retrieval methods
Summary
7
Querying Our Data, Part 2 – Postprocessing and Response Synthesis
Technical requirements
Re-ranking, transforming, and filtering nodes using postprocessors
Exploring how postprocessors filter, transform, and re-rank nodes
SimilarityPostprocessor
KeywordNodePostprocessor
PrevNextNodePostprocessor
LongContextReorder
PIINodePostprocessor and NERPIINodePostprocessor
MetadataReplacementPostprocessor
SentenceEmbeddingOptimizer
Time-based postprocessors
Re-ranking postprocessors
Final thoughts about node postprocessors
Understanding response synthesizers
Implementing output parsing techniques
Extracting structured outputs using output parsers
Extracting structured outputs using Pydantic programs
Building and using query engines
Exploring different methods of building query engines
Advanced uses of the QueryEngine interface
Hands-on – building quizzes in PITS
Summary
8
Building Chatbots and Agents with LlamaIndex
Technical requirements
Understanding chatbots and agents
Discovering ChatEngine
Understanding the different chat modes
Implementing agentic strategies in our apps
Building tools and ToolSpec classes for our agents
Understanding reasoning loops
OpenAIAgent
ReActAgent
How do we interact with agents?
Enhancing our agents with the help of utility tools
Using the LLMCompiler agent for more advanced scenarios
Using the low-level Agent Protocol API
Hands-on – implementing conversation tracking for PITS
Summary
Part 4: Customization, Prompt Engineering, and Final Words
9
Customizing and Deploying Our LlamaIndex Project
Technical requirements
Customizing our RAG components
How LLaMA and LLaMA 2 changed the open source landscape
Running a local LLM using LM Studio
Routing between LLMs using services such as Neutrino or OpenRouter
What about customizing embedding models?
Leveraging the Plug and Play convenience of using Llama Packs
Using the Llama CLI
Using advanced tracing and evaluation techniques
Tracing our RAG workflows using Phoenix
Evaluating our RAG system
Introduction to deployment with Streamlit
Hands-on – a step-by-step deployment guide
Deploying our PITS project on Streamlit Community Cloud
Summary
10
Prompt Engineering Guidelines and Best Practices
Technical requirements
Why prompts are your secret weapon
Understanding how LlamaIndex uses prompts
Customizing default prompts
Using advanced prompting techniques in LlamaIndex
The golden rules of prompt engineering
Accuracy and clarity in expression
Directiveness
Context quality
Context quantity
Required output format
Inference cost
Overall system latency
Choosing the right LLM for the task
Common methods used for creating effective prompts
Summary
11
Conclusion and Additional Resources
Other projects and further learning
The LlamaIndex examples collection
Moving forward – Replit bounties
The power of many – the LlamaIndex community
Key takeaways, final words, and encouragement
On the future of RAG in the larger context of generative AI
A small philosophical nugget for you to consider
Summary
Index
Other Books You May Enjoy
Preface
Beyond the initial hype that the fast advance of Generative AI and Large Language Models (LLMs) has produced, we have been able to observe both the abilities and the shortcomings of this technology. LLMs are versatile and powerful tools driving innovation across various fields, serving as the foundation for natural language generation technology. Despite their potential, though, LLMs have limitations: they lack access to real-time data, struggle to distinguish truth from falsehood, have difficulty maintaining context over long documents, and exhibit unpredictable failures in reasoning and fact retention. Retrieval-Augmented Generation (RAG) attempts to solve many of these shortcomings, and LlamaIndex is perhaps the simplest and most user-friendly way to begin your journey into this new development paradigm.
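Before diving into the framework itself, the RAG pattern can be made concrete with a deliberately minimal, framework-free sketch: retrieve the snippets most relevant to a question, then hand them to the model as context. All names here are illustrative, and the keyword-overlap scoring is a crude stand-in for the embedding-based similarity search a real RAG system would use:

```python
# A toy illustration of the RAG pattern, with no framework and no real LLM:
# score documents against the question, keep the best matches, and build an
# augmented prompt that grounds the model's answer in retrieved context.
import string

def words(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def score(question: str, document: str) -> int:
    """Naive relevance score: number of shared words (a crude stand-in
    for the embedding similarity used in real RAG systems)."""
    return len(words(question) & words(document))

def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents with a nonzero relevance score."""
    ranked = sorted(documents, key=lambda d: score(question, d), reverse=True)
    return [d for d in ranked if score(question, d) > 0][:top_k]

def build_prompt(question: str, context: list[str]) -> str:
    """Augment the user's question with retrieved context for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

docs = [
    "LlamaIndex is a framework for building RAG applications.",
    "Bananas are rich in potassium.",
    "RAG retrieves relevant context to ground LLM answers.",
]
question = "What is LlamaIndex?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

In the chapters that follow, each stage of this sketch – ingestion, indexing, retrieval, and prompt construction – is replaced by purpose-built LlamaIndex components.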
Driven by a flourishing and expanding community, this open source framework provides a huge number of tools for different RAG scenarios. Perhaps that’s also why this book is needed. When I first encountered the LlamaIndex framework, I was impressed by its comprehensive official documentation. However, I soon realized that the sheer number of options can be overwhelming for someone who’s just starting out. Therefore, my goal was to provide a beginner-friendly guide that helps you navigate the framework’s capabilities and use them in your projects. The more you explore the inner mechanics of LlamaIndex, the more you’ll appreciate its effectiveness. By breaking down complex concepts and offering practical examples, this book aims to bridge the gap between the official documentation and your understanding, ensuring that you can confidently build RAG applications while avoiding common pitfalls.
So, join me on a journey through the LlamaIndex ecosystem. From understanding fundamental RAG concepts to mastering advanced techniques, you’ll learn how to ingest, index, and query data from various sources, create optimized indexes tailored to your use cases, and build chatbots and interactive web applications that showcase the true potential of Generative AI. The book contains a lot of practical code examples, several best practices in prompt engineering, and troubleshooting techniques that will help you navigate the challenges of building LLM-based applications augmented with your data.
By the end of this book, you’ll have the skills and expertise to create powerful, interactive, AI-driven applications using LlamaIndex and Python. Moreover, you’ll be able to predict costs, deal with potential privacy issues, and deploy your applications, helping you become a sought-after professional in the rapidly growing field of Generative AI.
Who this book is for
This book has been specifically designed for developers at varying stages of their careers who are eager to understand and exploit the capabilities of Generative AI, particularly through the use of RAG. It aims to serve as a foundational guide for those with a basic understanding of Python development and a general familiarity with Generative AI concepts.
Here are the key audiences who will find this book invaluable:
Entry-level developers: Individuals who have a foundational understanding of Python and are beginning their journey into the world of generative AI will find this book an excellent starting point. It will guide you through the initial steps of using the LlamaIndex framework to create robust and innovative applications. You’ll learn the core components, basic workflows, and best practices to kickstart your RAG application development journey.
Experienced developers: For those who are already familiar with the landscape of generative AI and are looking to deepen their expertise, this book offers insight into advanced topics within the LlamaIndex framework. You’ll discover how to leverage your existing skills to develop and deploy more complex RAG applications, enhancing the capabilities of your projects and pushing the boundaries of what’s possible with AI.
Professionals seeking to harness the full power of LLMs: If you’re looking to improve your productivity by building quick solutions for data-driven problems, this book will teach you the basic concepts and equip you with powerful techniques. If you’re a natural learner and want to experiment with this wonderful technology, this book will provide you with the tools to solve complex problems with greater efficiency and creativity.
What this book covers
Chapter 1, Understanding Large Language Models, serves as an introduction to generative AI and LLMs. It explains what LLMs are, their role in modern technology, and their strengths and weaknesses. The chapter aims to provide you with a foundational understanding of the capabilities of LLMs that LlamaIndex builds upon.
Chapter 2, LlamaIndex: The Hidden Jewel - An Introduction to the LlamaIndex Ecosystem, introduces the LlamaIndex ecosystem and how it can augment LLMs. It explains the general structure of the book – starting with basic concepts and gradually introducing more complex elements of the LlamaIndex framework. The chapter also introduces PITS – the Personalized Intelligent Tutoring System project – which will be used to apply the concepts studied in the book, and covers the preparation of the development environment.
Chapter 3, Kickstarting Your Journey with LlamaIndex, covers the basics of starting your first LlamaIndex project. It explains the essential components of a RAG application in LlamaIndex, such as documents, nodes, indexes, and query engines. The chapter provides a typical workflow model and a simple hands-on example, where readers will begin building the PITS project.
Chapter 4, Ingesting Data into Our RAG Workflow, focuses on importing our proprietary data into LlamaIndex, emphasizing the usage of the LlamaHub connectors. We learn how to break down and organize documents by parsing them into coherent, indexable chunks of information. The chapter also covers ingestion pipelines, important data privacy considerations, metadata extraction, and simple cost estimation methods.
Chapter 5, Indexing with LlamaIndex, explores the topic of data indexing. It provides an overview of how indexing works, comparing different indexing techniques to help readers choose the most suitable one for their use cases. The chapter also explains the concept of layered indexing and covers persistent index storage and retrieval, cost estimation, embeddings, vector stores, similarity search, and storage contexts.
Chapter 6, Querying Our Data, Part 1 – Context Retrieval, explains the mechanics of querying data and various querying strategies and architectures within LlamaIndex, with a deep focus on retrievers. It covers advanced concepts such as asynchronous retrieval, metadata filters, tools, selectors, retriever routers, and query transformations. The chapter also discusses fundamental paradigms such as dense retrieval and sparse retrieval, along with their strengths and weaknesses.
Chapter 7, Querying Our Data, Part 2 – Postprocessing and Response Synthesis, continues the query mechanics topic, explaining the role of node post-processing and response synthesizers in the RAG workflow. It presents the overall query engine construct and its usage, as well as output parsing. The hands-on part of this chapter focuses on using LlamaIndex to generate personalized content in the PITS application.
Chapter 8, Building Chatbots and Agents with LlamaIndex, introduces the essentials of chatbots, agents, and conversation tracking with LlamaIndex, applying this knowledge to the hands-on project. You will learn how LlamaIndex facilitates fluid interaction, retains context, and manages custom retrieval/response strategies, which are essential aspects for building effective conversational interfaces.
Chapter 9, Customizing and Deploying Our LlamaIndex Project, provides a comprehensive guide to personalizing and launching LlamaIndex projects. It covers tailoring different components of the RAG pipeline, a beginner-friendly tutorial on deploying with Streamlit, advanced tracing methods for debugging, and techniques for evaluating and fine-tuning a LlamaIndex application.
Chapter 10, Prompt Engineering Guidelines and Best Practices, explains the essential role of prompt engineering in enhancing the effectiveness of a RAG pipeline, highlighting how prompts are used under the hood of the LlamaIndex framework. It guides readers on the nuances of customizing and optimizing prompts to harness the full power of LlamaIndex and ensure more reliable and tailored AI outputs.
Chapter 11, Conclusion and Additional Resources, serves as a comprehensive conclusion, highlighting other projects and pathways for extended learning and summarizing the core insights from the book. It offers an overview of the main features of the framework, provides a curated list of additional resources for further exploration, and includes an index for quick terminology reference.
To get the most out of this book
You will need to have a basic understanding of Python development. General experience in using Generative AI models is also recommended. All the examples provided in the book have been specifically designed to run in a local Python environment, and because several libraries will be required along the way, it is recommended that you have a minimum of 20 GB of storage space available on your computer.
Because most of the examples presented in the book rely on the OpenAI API, you’ll also need to obtain an OpenAI API key.
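As an aside, the OpenAI client libraries look for the key in the OPENAI_API_KEY environment variable by default, so one common way to make it available to local examples (assuming a Unix-like shell; the value shown is a placeholder, not a real key) is:

```shell
# Placeholder value – substitute your own key obtained from OpenAI
export OPENAI_API_KEY="sk-..."
```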
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
As many of the code examples rely on the OpenAI API, keep in mind that running them will incur costs. Everything has been optimized for minimum cost, but neither the author nor the publisher is responsible for these costs. You should also be aware of the security implications of using a public API such as the one provided by OpenAI. If you choose to use your own proprietary data to experiment with different examples, make sure you consult OpenAI’s privacy policy in advance.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Building-Data-Driven-Applications-with-LlamaIndex. The repository is organized into folders, with one corresponding folder for each chapter, titled ch
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: […] using the download_llama_pack() method and specifying a download location such as […]
A block of code is set as follows:
from llama_index.llms.openai import OpenAI
llm = OpenAI(
    api_base='http://localhost:1234/v1',
    temperature=0.7
)
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
from llama_index.llms.openai import OpenAI
llm = OpenAI(
    api_base='http://localhost:1234/v1',
    temperature=0.7
)
Any command-line input or output is written as follows:
$ pip install llama-index-llms-neutrino
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Select System info from the Administration panel.
Tips or important notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you’ve read Building Data-Driven Applications with LlamaIndex, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:
https://packt.link/free-ebook/9781835089507
Submit your proof of purchase
That’s it! We’ll send your free PDF and other benefits to your email directly.
Part 1:Introduction to Generative AI and LlamaIndex
This first part begins by introducing generative AI and Large Language Models (LLMs), discussing their ability to produce human-like text, their limitations, and how Retrieval-Augmented Generation (RAG) can address these issues by enhancing accuracy, reasoning, and relevance. We then progress to understand how LlamaIndex leverages RAG to bridge the gap between LLMs’ extensive knowledge and proprietary data, elevating the potential of interactive AI applications.
This part has the following chapters:
Chapter 1
, Understanding Large Language Models
Chapter 2
, LlamaIndex: The Hidden Jewel - An Introduction to the LlamaIndex Ecosystem
1
Understanding Large Language Models
If you are reading this book, you have probably explored the realm of large language models (LLMs) and already recognize their potential applications as well as their pitfalls. This book aims to address the challenges LLMs face and provides a practical guide to building data-driven LLM applications with LlamaIndex, taking developers from foundational concepts to advanced techniques for implementing retrieval-augmented generation (RAG) to create high-performance interactive artificial intelligence (AI) systems augmented by external data.
This chapter introduces generative AI (GenAI) and LLMs. It explains how LLMs generate human-like text after training on massive datasets. We’ll also give an overview of LLM capabilities and limitations, such as outdated knowledge, the potential for false information, and a lack of reasoning. You’ll be introduced to RAG as a potential solution, combining retrieval models that use indexed data with generative models to increase factual accuracy, logical reasoning, and context relevance. Overall, you’ll gain a basic understanding of LLMs and learn about RAG as a way to overcome some LLM weaknesses, setting the stage for utilizing LLMs practically.
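To give a first taste of the RAG idea before we explore it properly, here is a deliberately tiny, hypothetical sketch in plain Python: a retrieval step picks the most relevant document for a question, and the retrieved text grounds the prompt that would be sent to the LLM. The documents and the naive word-overlap scoring are illustrative assumptions only; real pipelines, including LlamaIndex, use indexed vector embeddings for retrieval:

```python
# Toy document store; a real pipeline would index these with embeddings
documents = [
    "LlamaIndex connects LLMs to external data sources.",
    "RAG retrieves relevant context before generating an answer.",
    "Transformers use attention to model long-range dependencies.",
]

def retrieve(question, docs, top_k=1):
    """Rank documents by naive word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]

question = "What does RAG do before generating an answer?"
context = retrieve(question, documents)

# Ground the prompt in the retrieved context before calling the LLM
prompt = f"Context: {context[0]}\nQuestion: {question}\nAnswer:"
print(prompt)
```

The key design point survives any change of retrieval technique: the LLM answers from supplied context rather than from its (possibly outdated) internal knowledge alone.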
In this chapter, we will cover the following main topics:
Introducing GenAI and LLMs
Understanding the role of LLMs in modern technology
Exploring challenges with LLMs
Augmenting LLMs with RAG
Introducing GenAI and LLMs
Introductions are sometimes boring, but here, it is important to set the context and familiarize you with GenAI and LLMs before we dive deep into LlamaIndex. I will be as concise as possible and, if you are already familiar with this material, I apologize for the brief digression.
What is GenAI?
GenAI refers to systems that are capable of generating new content such as text, images, audio, or video. Unlike more specialized AI systems that are designed for specific tasks such as image classification or speech recognition, GenAI models can create completely new assets that are often very difficult – if not impossible – to distinguish from human-created content.
These systems use machine learning (ML) techniques such as neural networks (NNs) that are trained on vast amounts of data. By learning patterns and structures within the training data, generative models can model the underlying probability distribution of the data and sample from this distribution to generate new examples. In other words, they act as big prediction machines.
We will now discuss LLMs, which are one of the most popular fields in GenAI.
What is an LLM?
One of the most prominent and rapidly advancing branches of GenAI is natural language generation (NLG) through LLMs (Figure 1.1):
Figure 1.1 – LLMs are a sub-branch of GenAI
LLMs are NNs that are specifically designed and optimized to understand and generate human language. They are large in the sense that they are trained on massive amounts of text containing billions or even trillions of words scraped from the internet and other sources. Larger models show increased performance on benchmarks, better generalization, and new emergent abilities. In contrast with earlier, rule-based generation systems, the main distinguishing feature of an LLM is that it can produce novel, original text that reads naturally.
By learning patterns from many sources, LLMs acquire various language skills found in their training data – from nuanced grammar to topic knowledge and even basic common-sense reasoning. These learned patterns allow LLMs to extend human-written text in contextually relevant ways. As they keep improving, LLMs create new possibilities for automatically generating natural language (NL) content at scale.
During the training process, LLMs gradually learn probabilistic relationships between words and the rules that govern language structure from their huge training dataset. Once trained, they can generate remarkably human-like text by predicting the probability of the next word in a sequence, based on the previous words. In many cases, the text they generate is so natural that it makes you wonder: aren’t we humans just a similar but more sophisticated prediction machine? But that’s a topic for another book.
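To make next-word prediction concrete, here is a deliberately tiny sketch that estimates the probability of the next word from bigram counts in a toy corpus, then samples a continuation. The corpus and the bigram approach are illustrative assumptions only and bear no resemblance to a real LLM's architecture, which learns far richer relationships over billions of parameters:

```python
import random
from collections import Counter, defaultdict

# A toy corpus standing in for the massive training data of a real LLM
corpus = "the cat sat on the mat the cat ate the rat".split()

# Learn P(next word | previous word) from simple bigram counts
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

counts = followers["the"]
total = sum(counts.values())
probs = {word: count / total for word, count in counts.items()}
print(probs)  # {'cat': 0.5, 'mat': 0.25, 'rat': 0.25}

# Generate text by sampling the next word from the learned distribution
random.seed(0)
word, generated = "the", ["the"]
for _ in range(4):
    options = followers[word]
    if not options:
        break  # reached a word with no observed continuation
    word = random.choices(list(options), weights=list(options.values()))[0]
    generated.append(word)
print(" ".join(generated))
```

Even this toy model shows the core mechanic: text generation is repeated sampling from a probability distribution conditioned on what came before.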
One of the key architectural innovations is the transformer (that is the T in GPT), which uses an attention mechanism to learn contextual relationships between words. Attention allows the model to learn long-range dependencies in text, much like a careful listener in a conversation who uses context to grasp the full meaning. This means the model understands not just words that are close together but also how words that are far apart in a sentence or paragraph relate to each other.
Attention allows the model to selectively focus on relevant parts of the input sequence when making predictions, capturing complex patterns and dependencies within the data. This makes it possible for particularly large transformer models (with many parameters, trained on massive datasets) to demonstrate surprising new abilities such as in-context learning, where they can perform tasks with just a few examples in their prompt. To learn more about transformers and the Generative Pre-trained Transformer (GPT), you can refer to Improving Language Understanding with Unsupervised Learning by Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever (https://openai.com/research/language-unsupervised).
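The intuition behind attention can be sketched in a few lines of plain Python. The following is a simplified scaled dot-product attention over toy two-dimensional word vectors, purely for illustration; real transformers add learned query/key/value projections, multiple heads, and masking:

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over toy word vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score each key by its similarity to the query, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to the attention weights
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Three tokens represented by 2-D vectors; the query resembles the first key,
# so the output is pulled toward the first value vector
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
result = attention(queries, keys, values)
print(result)
```

Notice that every value vector contributes to the output, weighted by relevance to the query; this is how the model can relate a word to another word arbitrarily far away in the sequence.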
The best-performing LLMs, such as GPT-4, Claude 2.1, and Llama 2, contain tens of billions to (reportedly) over a trillion parameters and have been trained on internet-scale datasets using advanced deep learning (DL) techniques. The resulting models have an extensive vocabulary, a broad knowledge of language structure such as grammar and syntax, and general knowledge about the world. Thanks to these unique traits, LLMs are able to generate text that is coherent, grammatically correct, and semantically relevant. The outputs they produce may not always be completely logical or factually accurate, but they usually read convincingly, as if written by a human. But it’s not all about size. The quality of the data and the training algorithms, among other factors, can also play a huge role in the resulting performance of a particular model.
Many models feature a user interface that allows for response generation through prompts. Additionally, some offer an