Retrieval Augmented Generation

[Figure: the LLMOps stack for RAG, layers 1 to 11: 1 Data Preparation, 2 Embeddings, 3 Vector Storage, 4 Foundation LLM, 5 SFT Model, 6 Prompt Engg, 7 Evaluation, 8 Application/Orchestration, 9 Deployment & Inference, 10 App Hosting, 11 Monitoring]

CHAPTER 7
Evolving LLMOps Stack for RAG
The production ecosystem for RAG and LLM applications is still evolving. Early
tooling and design patterns have emerged.

Data Layer
The foundation of RAG applications is the data layer. This involves:
Data preparation - sourcing, cleaning, loading and chunking
Creation of embeddings
Storing the embeddings in a vector store
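
The three data layer steps can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `toy_embed` is a hypothetical stand-in for a real embedding model (it only hashes words into a fixed-size vector), `InMemoryVectorStore` is an assumed name for a plain list-backed store, and the chunk size and overlap values are arbitrary.

```python
import hashlib
import math
import re


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into overlapping chunks of `chunk_size` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks


def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryVectorStore:
    """Assumed minimal vector store: keeps (chunk, vector) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, toy_embed(chunk)))

    def search(self, query, k=1):
        q = toy_embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


# Data preparation: source, clean (collapse whitespace), then chunk.
raw = "  RAG systems   retrieve relevant chunks. \n The chunks are added to the prompt. The LLM then generates an answer. "
cleaned = " ".join(raw.split())
chunks = chunk_text(cleaned)

# Embedding and vector storage.
store = InMemoryVectorStore()
for chunk in chunks:
    store.add(chunk)

top = store.search("What is added to the prompt?", k=1)
```

In a real application each piece would typically be replaced by one of the data layer tools, while the flow (clean, chunk, embed, store, search) stays the same.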

Popular Data Layer Vendors (Non Exhaustive), grouped by Data Preparation, Embeddings and Vector Storage

Abhinav Kimothi
Model Layer
2023 can be considered the year of the LLM wars: in the second half of the year,
a new model was released almost every other week. Just as there is no RAG without
data, there is no RAG without an LLM. Four broad categories of LLMs can be part
of a RAG application:

1. A Proprietary Foundation Model - Developed and maintained by providers
(like OpenAI, Anthropic, Google) and generally available via an API.
2. An Open Source Foundation Model - Publicly available (like Falcon,
Llama, Mistral) and has to be hosted and maintained by you.
3. A Supervised Fine-Tuned Proprietary Model - Providers enable fine-tuning of
their proprietary models with your data. The fine-tuned models are still
hosted and maintained by the providers and are available via an API.
4. A Supervised Fine-Tuned Open Source Model - Open source models can
be fine-tuned by you on your data using full fine-tuning or PEFT
(parameter-efficient fine-tuning) methods.

Many vendors provide access to open source models and also make it easy to
fine-tune them.

Proprietary Models: GPT-3.5/GPT-4, Claude
Open Source Models: Llama 2 by Meta, Mistral & Mixtral, Falcon, phi-2 by Microsoft

Popular proprietary and open source LLMs (Non Exhaustive)

Proprietary Models: GPT Series; Claude, Jurassic & Titan
Open Source Models: AWS SageMaker JumpStart

Popular vendors providing access to LLMs (Non Exhaustive)

Note: For open source models, it is important to check the license type. Some
open source models are not available for commercial use.

Prompt Layer
Prompt engineering is more than writing questions in natural language. There are
several prompting techniques, and developers need to create prompts tailored
to their use cases. This process often involves experimentation: the developer
creates a prompt, observes the results and then iterates on the prompt to
improve the effectiveness of the app. This requires tracking and collaboration.
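
The create, observe, iterate loop can be made concrete with a small prompt registry. The sketch below is illustrative only: `PromptVersion` and `PromptRegistry` are hypothetical names, not the API of any prompt engineering platform, and the scores are made-up numbers standing in for whatever evaluation the team uses.

```python
from dataclasses import dataclass, field


@dataclass
class PromptVersion:
    version: int
    template: str
    notes: str = ""
    scores: list = field(default_factory=list)

    def render(self, **variables):
        """Fill the template placeholders with concrete values."""
        return self.template.format(**variables)

    def average_score(self):
        return sum(self.scores) / len(self.scores) if self.scores else None


class PromptRegistry:
    """Keeps every prompt iteration so that results can be compared and shared."""

    def __init__(self):
        self.versions = []

    def register(self, template, notes=""):
        pv = PromptVersion(version=len(self.versions) + 1, template=template, notes=notes)
        self.versions.append(pv)
        return pv

    def best(self):
        scored = [v for v in self.versions if v.scores]
        return max(scored, key=lambda v: v.average_score()) if scored else None


registry = PromptRegistry()
v1 = registry.register("Answer the question: {question}", notes="baseline")
v2 = registry.register(
    "Use only this context:\n{context}\n\nQuestion: {question}",
    notes="adds retrieved context",
)

# Made-up evaluation scores recorded after observing each version's results.
v1.scores.extend([0.4, 0.5])
v2.scores.extend([0.8, 0.7])

winner = registry.best()
rendered = v2.render(context="C", question="Q")
```

Keeping the notes and scores next to each template is what makes the experimentation trackable; a shared store in place of the in-memory list would cover the collaboration side.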

Popular prompt engineering platforms (Non Exhaustive)

Evaluation
It is easy to build a RAG pipeline, but getting it ready for production requires
robust evaluation of the pipeline's performance. Several frameworks and tools
have emerged for checking hallucinations, relevance and accuracy.
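
To give a feel for what such checks measure, here is a deliberately crude sketch based on word overlap. It is an assumption-laden toy, not how Ragas or any other framework computes its metrics (those typically rely on LLM or embedding based judgments); `groundedness` and `context_relevance` are hypothetical function names.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def groundedness(answer, context):
    """Fraction of answer words that also appear in the retrieved context.

    A low value hints that the answer may contain unsupported (hallucinated) content.
    """
    a = tokens(answer)
    return len(a & tokens(context)) / len(a) if a else 0.0


def context_relevance(question, context):
    """Fraction of question words that appear in the retrieved context."""
    q = tokens(question)
    return len(q & tokens(context)) / len(q) if q else 0.0


context = "The Eiffel Tower is in Paris and was completed in 1889."

grounded = groundedness("The Eiffel Tower was completed in 1889.", context)
ungrounded = groundedness("The tower was built in Rome in 1750.", context)
relevance = context_relevance("When was the Eiffel Tower completed?", context)
```

Here the first answer scores higher than the second because every one of its words is supported by the context, which is the intuition behind hallucination checks.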

Popular RAG evaluation frameworks and tools (Non Exhaustive): Ragas

App Orchestration
A RAG application involves the interaction of multiple tools and services.
Running the RAG pipeline requires a solid orchestration framework that invokes
these different processes.
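
At its core, orchestration means calling the retrieval, augmentation and generation steps in order and passing results between them. The sketch below shows that flow under stated assumptions: `run_rag_pipeline`, `keyword_retrieve` and `stub_generate` are hypothetical names, the retriever is a naive keyword match, and the generator is a stub where a real LLM call would go.

```python
def run_rag_pipeline(question, retrieve, generate, k=2):
    """Invoke the pipeline steps in order: retrieve, augment, generate."""
    documents = retrieve(question, k)
    context = "\n".join(f"- {doc}" for doc in documents)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    answer = generate(prompt)
    return {"question": question, "documents": documents, "prompt": prompt, "answer": answer}


# Stand-ins for the real services (assumptions for illustration only).
CORPUS = [
    "A vector store holds embeddings for similarity search.",
    "Chunking splits documents into smaller passages.",
    "Monitoring tracks cost and latency in production.",
]


def keyword_retrieve(question, k):
    """Naive retriever: rank documents by the number of words shared with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(CORPUS, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]


def stub_generate(prompt):
    """Placeholder for an LLM call."""
    return "stub answer"


result = run_rag_pipeline("What does a vector store hold?", keyword_retrieve, stub_generate)
```

Because the retriever and generator are passed in as functions, either can be swapped for a real service without changing the pipeline, which is the main convenience orchestration frameworks provide at larger scale.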

Popular App orchestration frameworks (Non Exhaustive)

Deployment Layer
The RAG application can be deployed on any of the available cloud providers
and platforms. Some important factors to consider during deployment are:
Security and Governance
Logging
Inference costs and latency
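
Logging, cost and latency can be captured together by wrapping each inference call. This is a rough sketch under loud assumptions: `PRICE_PER_1K_TOKENS` is a made-up price, `estimate_tokens` uses a crude four-characters-per-token heuristic instead of a real tokenizer, and `logged_call` is a hypothetical helper name.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rag.inference")

PRICE_PER_1K_TOKENS = 0.002  # hypothetical price, not a real vendor rate


def estimate_tokens(text):
    """Crude heuristic (assumption): roughly four characters per token."""
    return max(1, len(text) // 4)


def logged_call(generate, prompt):
    """Call the model, then log latency, estimated tokens and estimated cost."""
    start = time.perf_counter()
    response = generate(prompt)
    latency_s = time.perf_counter() - start

    total_tokens = estimate_tokens(prompt) + estimate_tokens(response)
    record = {
        "latency_s": latency_s,
        "total_tokens": total_tokens,
        "estimated_cost": total_tokens / 1000 * PRICE_PER_1K_TOKENS,
    }
    logger.info("inference %s", record)
    return response, record


# Usage with a stub in place of a deployed model endpoint.
response, record = logged_call(lambda prompt: "ok", "x" * 40)
```

For security and governance, note that this sketch logs only metrics; whether prompts and responses themselves may be logged depends on the data they contain.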

Popular cloud providers and LLMOps platforms (Non Exhaustive)

Application Layer
The application finally needs to be hosted for the intended users or systems to
interact with it. You can create your own application layer or use the available
platforms.

Popular app hosting platforms (Non Exhaustive)

Monitoring
The deployed application needs to be continuously monitored for accuracy and
relevance as well as cost and latency.

Popular monitoring platforms (Non Exhaustive)

Other Considerations
LLM Cache - To reduce costs by saving responses for popular queries
LLM Guardrails - To add an additional layer of scrutiny on generations
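
Both ideas can be illustrated with a short sketch. It assumes an exact-match cache keyed on the normalised query text and a keyword-based guardrail; `CachedLLM`, `guardrail` and `BLOCKED_TERMS` are hypothetical names, and real systems often use semantic caching and far richer checks.

```python
class CachedLLM:
    """Saves responses so that repeated queries do not trigger a new model call."""

    def __init__(self, generate):
        self.generate = generate
        self.cache = {}
        self.calls = 0  # number of real (paid) model calls

    @staticmethod
    def _key(query):
        # Normalise case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def ask(self, query):
        key = self._key(query)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.generate(query)
        return self.cache[key]


BLOCKED_TERMS = {"password", "ssn"}  # hypothetical policy for illustration


def guardrail(response, fallback="Sorry, I can't share that."):
    """Replace a generation with a fallback if it contains a blocked term."""
    lowered = response.lower()
    return fallback if any(term in lowered for term in BLOCKED_TERMS) else response


llm = CachedLLM(lambda query: "RAG adds retrieved context to the prompt.")
first = llm.ask("What is RAG?")
second = llm.ask("  what is rag? ")  # served from the cache

safe = guardrail(first)
blocked = guardrail("The password is hunter2")
```

The call counter shows the saving: two queries, one model call.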

More from

Retrieval Augmented Generation: A Simple Introduction

01. What is RAG?
02. How does RAG help?
03. What are some popular RAG use cases?
04. RAG Architecture
i) Indexing Pipeline
a) Data Loading
b) Document Splitting
c) Embedding
d) Vector Stores
ii) RAG Pipeline
a) Retrieval
b) Augmentation and Generation
05. Evaluation
06. RAG vs Finetuning
07. Evolving RAG LLMOps Stack
08. Multimodal RAG
09. Progression of RAG Systems
i) Naive RAG
ii) Advanced RAG
iii) Multimodal RAG
