RAG vs Agentic RAG

The document provides a comprehensive guide comparing traditional Retrieval-Augmented Generation (RAG) systems with Agentic RAG systems. RAG enhances large language models by retrieving relevant external information to improve accuracy and context, while Agentic RAG introduces an intelligent agent that dynamically determines the best resources for complex queries. The guide also discusses the architecture, workflow, challenges, and advantages of both systems in handling user queries effectively.

3/27/25, 10:05 AM RAG vs Agentic RAG: A Comprehensive Guide - Analytics Vidhya

RAG vs Agentic RAG: A Comprehensive Guide


Pankaj Singh 5
Last Updated : 10 Mar, 2025

Today, I am discussing RAG vs Agentic RAG. In this guide, I will first compare the two and then proceed to the hands-on part.

First, let’s understand what RAG is. It is not a piece of old cloth but a framework that lets an LLM obtain relevant, up-to-date, and context-specific information by combining retrieval and generation capabilities.

But can we see the limitations of LLMs without RAG? Absolutely! Here, I asked ChatGPT to tell me about Swarm by OpenAI using only its own knowledge, without any external searches, and it could not provide the right output. This is because of its knowledge cutoff date, which is 2023; to answer correctly, the model has to be updated with new information or given access to an external source. Intriguing, right? So, can we augment LLMs with our own custom data to get the right response? Of course: we can do it with long-context LLMs and with RAG. Today, we will be talking about RAG.


Instead of relying solely on the large language model’s (LLM) pre-trained knowledge, which may be outdated or incomplete, RAG dynamically retrieves the most relevant documents or information from an external knowledge base or database.

Let us understand this with an example: if we humans relied on only one source of information from birth while exploring the world, our understanding would remain severely limited. Similarly, a Large Language Model (LLM) on its own has a fixed training dataset that serves as its “internal knowledge.” When this is the model’s only source of information, the result can be outdated information, ungrounded hallucinations, nonsensical content and more. While vast, this dataset is static and often insufficient for real-time, context-specific queries. This is where RAG (Retrieval-Augmented Generation) steps in.


6. Agentic RAG Workflow

7. Understanding Agents in RAG Systems

8. Types of Agents in the RAG Pipeline

9. RAG vs Agentic RAG

10. Hands-On: Build a Simple RAG System

11. LangChain Agentic RAG System Using the IBM Granite-3.0-8B-Instruct model

12. Conclusion

13. Frequently Asked Questions

What Does RAG Do?

Here’s what RAG does:


Source: Author

1. Retrieval (R): This involves searching for relevant data from external sources, databases, or knowledge repositories. The goal is to gather specific, accurate, and relevant information that can support or enhance the AI’s understanding of a particular topic or query.

2. Augmentation (A): In this phase, the retrieved data is added to the prompt context. This means the information is integrated or combined with the input given to the AI, effectively enriching its knowledge base for better reasoning and context-aware responses.

3. Generation (G): Finally, the AI uses the augmented context to generate outputs, such as text, explanations, or insights, based on the combined input and retrieved data. This step represents the output of generative AI tools like GPT models.

Together, the RAG framework helps improve AI-generated content’s relevance, accuracy, and richness by grounding responses in retrieved and contextualised information.


Source: Author

RAG vs Without RAG

Here’s the comparison of RAG and Without RAG:


| Category | Without RAG | With RAG |
| --- | --- | --- |
| Accuracy | Susceptible to generating unverified or “hallucinated” content, not tied to reliable sources. | Responses are grounded with verifiable citations from external sources. |
| Timeliness | Relies on static pre-trained data, which may be outdated or irrelevant to current events. | Enhances static pre-trained data by incorporating real-time, up-to-date information from external sources. |
| Contextual Clarity | Often struggles to interpret ambiguous queries, leading to vague or incomplete answers. | Retrieves context-specific information, improving the clarity and specificity of responses. |
| Customisation | Cannot access or utilise user-specific datasets or private sources, resulting in generic responses. | Integrates public and private datasets, enabling highly tailored and relevant outputs. |
| Search Scope | Limited to the pre-trained knowledge base; cannot extend to new or external information. | Capable of broad, on-demand searches across multiple databases or online sources. |
| Reliability | High potential for errors due to reliance on static and pre-generated knowledge. | Ensures reliability by cross-referencing multiple trusted sources in real time. |
| Use Cases | Suitable for general-purpose tasks but less effective for dynamic or data-intensive applications. | Ideal for tasks requiring live updates, research, or custom data integration. |
| Transparency | No clear reference or citation for the provided information, making validation difficult. | Provides citations or links to sources, ensuring transparency and trustworthiness. |

RAG (Retrieval-Augmented Generation)


Source: Author

Without RAG (Retrieval-Augmented Generation)


Source: Author

Working of RAG

RAG System Architecture: Data Indexing


Source: Dipanjan Sarkar

This part focuses on preparing and managing the knowledge base that retrieval will later draw on.

Step 1: Load

The system ingests different types of data (e.g., text files, PDFs, URLs, and JSON files).

The data can come from diverse sources, ensuring a comprehensive knowledge base.

Step 2: Split

The data is divided into smaller, meaningful chunks.

This step ensures that retrieval works efficiently, allowing the system to fetch precise and relevant parts of documents instead of retrieving entire files.

Step 3: Embed

Each chunk of data is converted into a vector representation using an embedding model.

These embeddings capture the semantic meaning of the text, enabling the system to perform similarity-based searches.

Step 4: Store

The embeddings and corresponding data are stored in a vector database.

The vector database is optimised for quick and accurate similarity searches, which is crucial for retrieval.
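
The four indexing steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `split_text`, `embed`, and `build_index` are hypothetical helper names, the "embedding" is a toy hashed bag-of-words vector standing in for a real embedding model, and a Python list stands in for the vector database.

```python
import hashlib
import math


def split_text(text, chunk_words=8, overlap=2):
    """Step 2: split text into overlapping word windows."""
    words = text.split()
    step = chunk_words - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_words]) for i in range(0, stop, step)]


def embed(text, dim=64):
    """Step 3: toy embedding (hashed bag of words), NOT a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so a dot product equals cosine similarity


def build_index(documents):
    """Steps 1-4: load (already in memory), split, embed, store."""
    index = []  # a list stands in for the vector database
    for source, text in documents.items():
        for chunk in split_text(text):
            index.append({"source": source, "text": chunk, "vector": embed(chunk)})
    return index


# Made-up sample document, used only for illustration.
documents = {
    "notes.txt": "RAG systems load documents, split them into chunks, embed each "
                 "chunk, and store the vectors for later similarity search."
}
index = build_index(documents)
for entry in index:
    print(entry["source"], "|", entry["text"])
```

The overlap between neighbouring chunks is a common design choice: it reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.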

Also read: Vector Embeddings with Cohere and Hugging Face

RAG System Architecture: Search and Generation

This describes the overall process of combining retrieval and generation to produce an
answer:

Source: Dipanjan Sarkar

Step 1: Question Input

The user provides a query or question.

The system begins by analysing this question for context and intent.

Step 2: Retrieve

The system queries an indexed knowledge base (retrieval system) to gather the most relevant documents or pieces of information.

These documents serve as supporting evidence or context for generating an answer.

Step 3: Prompt Creation

The retrieved documents are structured into a prompt for the LLM.

The prompt includes the original question and the retrieved information, guiding the LLM in generating a context-aware response.

Step 4: Large Language Model (LLM)

The LLM processes the prompt, utilizing its generative capabilities to create a coherent and precise response.

The response combines insights from the retrieved documents with the LLM’s pre-trained knowledge.

Step 5: Answer Output

The final answer is presented to the user, blending the retrieved knowledge and the LLM’s generative capabilities.
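
The query-time steps can be sketched as follows. This is a minimal sketch, not a reference implementation: the function names are hypothetical, word-count vectors stand in for real embeddings, the three chunks are made-up sample text, and `fake_llm` is a placeholder where a real model call would go.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Step 2: rank stored chunks by similarity to the question, keep the top k."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, context_chunks):
    """Step 3: combine the retrieved context and the question into one prompt."""
    context = "\n".join(f"* {c}" for c in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def fake_llm(prompt):
    """Step 4 placeholder: a real system would send the prompt to an LLM here."""
    return f"(stub answer generated from a {len(prompt)}-character prompt)"


# Made-up sample chunks, used only for illustration.
chunks = [
    "RAG retrieves documents from an external knowledge base.",
    "Vector databases store embeddings for similarity search.",
    "The weather in Paris is mild in spring.",
]
question = "How does RAG use a knowledge base?"
top_chunks = retrieve(question, chunks)
prompt = build_prompt(question, top_chunks)
print(prompt)
print(fake_llm(prompt))  # Step 5: the answer returned to the user
```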

Here’s how the system architecture looks when combined together:


Source: Dipanjan Sarkar

Also read: Build a RAG Pipeline With the LLama Index

Challenges with RAG

Here are the challenges with RAG:

1. Contextual Understanding: RAG systems must understand the context and intent behind each query, especially when handling ambiguous or multi-part questions, and they often fall short here.

2. Synthesis and Reasoning: Beyond retrieving relevant information, the system must synthesize data from multiple sources and generate coherent, actionable insights.

3. Customization: Adhering to specific internal style guides or user-defined preferences adds another layer of complexity.

4. Accuracy and Relevance: Ensuring that the retrieved and generated content is accurate, relevant, and directly addresses the user’s query.

5. Scalability: Managing a large volume of diverse queries across different domains or topics can strain the system’s ability to provide high-quality responses.

Traditional Retrieval-Augmented Generation (RAG) systems enhance AI by pairing Large Language Models (LLMs) with vector databases to overcome LLM limitations. While effective for basic tasks like Q&A or support bots, they struggle with complex use cases. These systems often fail to contextualize retrieved data, resulting in superficial responses that lack depth and nuance.

These challenges demonstrate why RAG systems require sophisticated mechanisms for retrieval, context understanding, and natural language generation to handle these nuanced use cases effectively. This is where Agentic RAG comes to the rescue.

I hope you now have a clear understanding of traditional RAG. We will now discuss a different version of RAG with agents: Agentic RAG.

What is Agentic RAG?

Agentic RAG refers to a more intelligent and dynamic Retrieval-Augmented Generation system where an “agent” plays a key role in orchestrating processes. The agent intelligently determines which resources or databases are most relevant for a user’s query, making it capable of handling more complex, multi-tasking scenarios. It is an evolution from traditional RAG systems, offering greater adaptability and decision-making by incorporating additional logic or heuristics into the retrieval and response generation pipeline.

Agentic: The system works on its own, making decisions and taking actions depending on the situation.

RAG (Retrieval-Augmented Generation): It mixes information from a knowledge base with the AI’s ability to create responses.

Agentic RAG Workflow

Source: Author

The workflow describes how an Agentic RAG system handles user queries. Here’s a breakdown of each component:

1. User Input and Initial Assessment:

   The system receives a user query.

   The query is assessed to determine whether it fits the criteria for retrieval (that is, whether its topic is covered by one of the vector databases).

2. Vector Database Selection:

   The agent identifies the most relevant vector database for the query.

   Multiple vector databases are available:

      DB1: Contains data for generating code.

      DB2: Contains other general data.

      DB3: Contains data for generating charts.

   If the query does not match any database, the process routes to a failsafe mechanism.

3. Content Retrieval:

   Once a database is selected, the relevant content is retrieved.

   Retrieved content is integrated into the LLM prompt for further processing.

4. Response Type Selection:

   Based on the query and retrieved content, the system determines the appropriate response type:

      Generate Code: If the query involves code-related tasks.

      Generate Charts: If the query requires visualization.

      Generate Text Response: For standard text-based answers.

5. Final Output:

   The system generates the appropriate response (text, code, or chart) and delivers the final output.

   If no relevant data is found, the system defaults to a failsafe response, returning a message like:

   “Sorry, I don’t have the information you’re looking for.”

Crucial Points:

Agent Role: The agent dynamically selects the most relevant database, enhancing flexibility and efficiency in handling diverse queries.

Failsafe Mechanism: Ensures the system gracefully handles unanswerable queries by returning a fallback response.

Task Specialization: Different vector databases are optimized for specific tasks (e.g., code generation, chart creation), improving performance and accuracy for complex scenarios.

This workflow exemplifies a robust approach to Agentic RAG, demonstrating how modular and context-aware processing enables handling a wide range of tasks.
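
The database-selection and failsafe logic can be illustrated with a small sketch. Everything here is an assumption made for illustration: the keyword sets, the sample documents, and the `select_database` and `handle` names are hypothetical, and simple keyword overlap stands in for the LLM-driven decision an actual agent would make.

```python
import re

FAILSAFE = "Sorry, I don't have the information you're looking for."

# Hypothetical stand-ins for the three vector databases in the workflow.
DATABASES = {
    "DB1": {"task": "code", "keywords": {"code", "function", "python", "script"},
            "docs": ["Example: def add(a, b): return a + b"]},
    "DB2": {"task": "text", "keywords": {"policy", "refund", "shipping"},
            "docs": ["Refunds are processed within 5 business days."]},
    "DB3": {"task": "chart", "keywords": {"chart", "plot", "graph"},
            "docs": ["Monthly sales: Jan 10, Feb 14, Mar 9"]},
}


def select_database(query):
    """Agent step: pick the database whose keywords best match the query."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scores = {name: len(words & db["keywords"]) for name, db in DATABASES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None triggers the failsafe


def handle(query):
    db_name = select_database(query)
    if db_name is None:
        return {"type": "failsafe", "output": FAILSAFE}
    db = DATABASES[db_name]
    context = db["docs"][0]  # content retrieval, trivial in this sketch
    prompt = f"Context: {context}\nQuery: {query}"
    # The response type (code, chart, or text) follows from the selected database.
    return {"type": db["task"], "database": db_name, "prompt": prompt}


for q in ["Write a Python function to add numbers",
          "Plot a chart of monthly sales",
          "What is the capital of France?"]:
    print(q, "->", handle(q)["type"])
```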

Also read: How Agentic RAG Systems with CrewAI and LangChain Transform Tech?

Let’s see how a Self-reflective Agentic RAG System works:


Source: LangChain

1. Agent (Node): Initiates the process and decides whether to retrieve documents by evaluating the query (via a function call).

2. Should Retrieve (Conditional Edge): Determines if retrieval is necessary. If yes, the process continues; if no, it ends.

3. Tool (Node): Executes a retrieval tool to fetch relevant documents or information.

4. Check Relevance (Conditional Edge): Assesses if the retrieved documents are relevant. If yes, it moves to the next step; if no, it redirects to the rewrite process.

5. Rewrite (Node): Reformulates the query and restarts the retrieval process if necessary.

6. Generate (Node): If relevant documents are found, the system generates an answer and outputs it.

This iterative approach ensures accuracy and relevance by dynamically retrieving and refining the query as needed.
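
The retrieve, check, rewrite loop can be sketched in plain Python without any graph framework. This is a toy illustration under stated assumptions: the node functions are hypothetical, word overlap stands in for LLM-based relevance grading, the two documents are made-up samples, and `rewrite` uses a hard-coded expansion where a real system would ask an LLM to reformulate the query.

```python
import re

# Made-up sample documents, used only for illustration.
DOCS = [
    "LangGraph lets you define agents as graphs of nodes and edges.",
    "Bananas are rich in potassium.",
]


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def should_retrieve(query):
    """Conditional edge: skip retrieval for small talk (toy rule)."""
    return not tokens(query) <= {"hi", "hello", "thanks"}


def retrieve(query):
    """Tool node: return the document with the largest word overlap."""
    q = tokens(query)
    return max(DOCS, key=lambda d: len(q & tokens(d)))


def is_relevant(query, doc, min_overlap=2):
    """Conditional edge: toy relevance check based on shared words."""
    return len(tokens(query) & tokens(doc)) >= min_overlap


def rewrite(query):
    """Rewrite node: hard-coded expansion standing in for an LLM rewrite."""
    return query.replace("LG", "LangGraph agents graphs")


def run(query, max_rewrites=2):
    if not should_retrieve(query):
        return "No retrieval needed."
    for _ in range(max_rewrites + 1):
        doc = retrieve(query)
        if is_relevant(query, doc):
            return f"Answer based on: {doc}"  # generate node
        query = rewrite(query)  # try again with a reformulated query
    return "No relevant documents found."


print(run("Explain LG nodes"))
print(run("hello"))
```

Capping the number of rewrites is a deliberate choice in this sketch: without a limit, a query that never matches any document would loop forever.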

Also read: A Comprehensive Guide to Building Agentic RAG Systems with LangGraph

Understanding Agents in RAG Systems

Agents are the driving force behind the Retrieval-Augmented Generation (RAG) framework, functioning as specialized units that streamline each stage of the retrieval and generation pipeline. They operate collaboratively to achieve tasks like understanding user queries, retrieving relevant information, generating responses, and managing the overall workflow.

By orchestrating these functions, agents ensure smooth, efficient, and intelligent handling of tasks. This modular and adaptive approach allows the system to tackle complex queries effectively while improving overall performance and system reliability.

Types of Agents in the RAG Pipeline

The RAG system employs several types of agents, each with a specific purpose and

methodology. Here’s a breakdown for clarity:

1. Routing Agents

Source: LlamaIndex

Purpose: Direct user queries to the most appropriate sources.

How They Work: Analyze queries using large language models (LLMs) to determine which parts of the RAG pipeline best handle the request.


The following example uses a hybrid approach that combines Semantic Search and Summarization to answer a specific query: “What did the author do during his time in art school?”. Here’s a breakdown of how the system works:

1. Router (Green Box):

   The entry point where the query is received.

   Decides how to process the query, directing it to the appropriate engines.

2. Semantic Search + Summarization (Pink Area):

   This is the main process to extract and summarize information relevant to the query.

3. Vector Query Engine (Left Path):

   Performs semantic search by comparing the query with document embeddings (vectorized representations of the content).

   Retrieves the top-k relevant documents based on similarity scores.

4. Summary Query Engine (Right Path):

   Instead of ranking by relevance, this engine retrieves all potentially related documents.

   Focuses on summarizing or extracting the exact answer from the retrieved data.

5. Docs (Document Corpus):

   Represents the database or collection of text/documents being queried.

6. Final Output (Bottom):

   After processing by the engines, a summarized response is generated: “During his time in art school, the author took foundation classes in fundamental subjects like drawing, color, and design.”

Advantages:

Enhance query accuracy by targeting relevant data sources.

Improve system efficiency by avoiding unnecessary processing.

With this setup, you can also combine question answering (QA) and summarisation.
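
A router of this kind can be sketched as a function that picks one of two engines. This is a simplified illustration, not the LlamaIndex API: the engine and router functions are hypothetical, a keyword rule stands in for the LLM that would normally make the routing decision, and the second and third documents are placeholders invented for the example.

```python
import re

DOCS = [
    "In art school the author took foundation classes in drawing, color, and design.",
    "Placeholder passage: the author describes a later job.",
    "Placeholder passage: notes written after art school.",
]


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def vector_query_engine(query, k=1):
    """Left path: rank documents by similarity and return the top k."""
    q = tokens(query)
    return sorted(DOCS, key=lambda d: len(q & tokens(d)), reverse=True)[:k]


def summary_query_engine(query):
    """Right path: use every document, then condense (here: first six words)."""
    return [" ".join(d.split()[:6]) + "..." for d in DOCS]


def route(query):
    """Router: a keyword rule standing in for an LLM-based routing decision."""
    wants_summary = bool(tokens(query) & {"summarize", "summary", "overview"})
    return "summary" if wants_summary else "vector"


def answer(query):
    engine = route(query)
    if engine == "summary":
        return engine, summary_query_engine(query)
    return engine, vector_query_engine(query)


print(answer("What did the author do during his time in art school?"))
print(answer("Give me a summary of the essay"))
```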

Also read: Agentic RAG for Analyzing Customer Issues

2. Query Planning Agents

Source: LlamaIndex


Purpose: Handle complex or multi-faceted queries by breaking them into smaller, manageable components.

How They Work:

Divide the main query into sub-queries.

Assign retrieval and generation tasks for each sub-query across the RAG pipelines.

The example here retrieves and compares revenue growth information for Uber and Lyft in 2021 from their financial documents (10-K filings).

Process Overview:

1. Query Decomposition:

   The initial query (Compare revenue growth of Uber and Lyft in 2021) is split into two sub-queries:

      Describe revenue growth of Lyft in 2021.

      Describe revenue growth of Uber in 2021.

2. Data Source:

   The data is extracted from 10-K filings (annual financial reports) of Uber and Lyft.

   These filings are stored in a document database where each report is split into smaller chunks for efficient retrieval.

3. Retrieval (Top-2 Chunks):

   For each sub-query, the system identifies the most relevant chunks (top-2) from the respective 10-K filings. For example:

      Uber 10-K chunk 4 and Uber 10-K chunk 8 for the Uber sub-query.

      Lyft 10-K chunk 4 and Lyft 10-K chunk 8 for the Lyft sub-query.

4. Results Compilation:

   After retrieving the relevant chunks, the system processes the content to generate responses for each sub-query.

   Finally, the results for Lyft and Uber are combined to facilitate a comparison.

Key Insights:

Chunking: Large documents like 10-K filings are divided into smaller sections (chunks) for more efficient and targeted searches.

Relevance Ranking: The system uses a ranking mechanism (e.g., semantic similarity or keyword relevance) to select the top-2 chunks most likely to contain the required information.

Modular Query Handling: By decomposing the query into smaller parts, the system can handle complex, multi-entity questions more effectively.

Outcome: Results from each sub-query are synthesized into a complete, coherent response.

Benefits:

Streamline responses to intricate questions.

Leverage multiple data sources to provide comprehensive answers.
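
The decomposition and per-company retrieval can be sketched as below. This is a hedged illustration only: the chunk texts are short placeholders rather than real 10-K content, the function names are hypothetical, and word overlap stands in for the semantic ranking a real system would use.

```python
import re

# Placeholder chunk descriptions keyed by chunk number; NOT real filing text.
CHUNKS = {
    "Uber": {1: "overview of the business", 4: "revenue growth in 2021",
             8: "revenue by segment in 2021", 9: "legal proceedings"},
    "Lyft": {1: "overview of the business", 4: "revenue growth in 2021",
             8: "revenue by segment in 2021", 9: "legal proceedings"},
}


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def decompose(query):
    """Query decomposition: one sub-query per company named in the query."""
    return {
        company: f"Describe revenue growth of {company} in 2021"
        for company in CHUNKS
        if company.lower() in query.lower()
    }


def top_k_chunks(company, sub_query, k=2):
    """Retrieval: rank that company's chunks by word overlap with the sub-query."""
    q = tokens(sub_query)
    ranked = sorted(CHUNKS[company].items(),
                    key=lambda item: len(q & tokens(item[1])), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


def plan_and_retrieve(query):
    """Results compilation would feed these chunks to an LLM; here we only collect them."""
    return {
        company: {"sub_query": sub_query, "chunks": top_k_chunks(company, sub_query)}
        for company, sub_query in decompose(query).items()
    }


plan = plan_and_retrieve("Compare revenue growth of Uber and Lyft in 2021")
for company, step in plan.items():
    print(company, "|", step["sub_query"], "| chunks", step["chunks"])
```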
3. ReAct Agents (Reasoning and Action Agents)


Source: ReAct: Synergizing Reasoning and Acting in Language Models

Purpose: Adaptively combine reasoning and dynamic action to handle real-time queries and user interactions.

How They Work:

Select and execute tools or processes needed for specific tasks.

Retrieve data, process information, and store outputs incrementally.

Iterate the process, refining results until an accurate response is generated.

Why They Matter:

Handle dynamic queries requiring multiple steps and tool integrations.

Respond effectively to real-time changes in user input or query scope.
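
The select-tool, observe, iterate cycle can be sketched with a scripted policy. This is an assumption-heavy toy: in a real ReAct agent an LLM produces the reasoning and chooses each action, whereas here the hypothetical `policy` function follows a fixed script, and the `search` tool returns placeholder values rather than real data.

```python
def search(term):
    """Hypothetical lookup tool backed by placeholder data."""
    return {"width": "4", "height": "5"}.get(term, "unknown")


def multiply(expression):
    """Hypothetical math tool: multiplies two integers written as 'a*b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


TOOLS = {"search": search, "multiply": multiply}


def policy(question, scratchpad):
    """Stand-in for the LLM: choose the next action from the observations so far."""
    observations = [obs for _, _, obs in scratchpad]
    if len(observations) == 0:
        return "search", "width"
    if len(observations) == 1:
        return "search", "height"
    if len(observations) == 2:
        return "multiply", f"{observations[0]}*{observations[1]}"
    return "finish", observations[-1]


def react(question, max_steps=5):
    scratchpad = []  # outputs are stored incrementally, one step at a time
    for _ in range(max_steps):
        action, argument = policy(question, scratchpad)
        if action == "finish":
            return argument, scratchpad
        observation = TOOLS[action](argument)  # act, then observe
        scratchpad.append((action, argument, observation))
    return None, scratchpad  # gave up after max_steps


final_answer, trace = react("What is the area of the rectangle?")
for step in trace:
    print(step)
print("Answer:", final_answer)
```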

Also read: What is Agentic AI Planning Pattern?

4. Dynamic Planning and Execution Agents


Source: paperswithcode

Purpose: Continuously adapt to evolving data and changing user requirements.

Key Areas of Focus:

Long-Term Planning: Chart out strategies for sustained system performance.

Execution Insights: Monitor and refine real-time actions.

Efficiency: Minimize delays and optimize resource usage.

How They Work:

Separate overarching planning from granular, step-by-step execution.

Use computational graphs to map out comprehensive query solutions.

Incorporate a two-part system:

   Planner: Designs strategies.

   Executor: Implements these strategies effectively.


Components and Workflow

1. User Input:

Example Query: “How much does Microsoft’s market cap need to increase to
exceed Apple’s market cap?”

The system receives a natural language input from the user.

2. LLM Planner:

Task Generation: The query is analyzed, and tasks are created as a Directed
Acyclic Graph (DAG) with dependencies. For example:
Task 1: search(Microsoft Market Cap)

Task 2: search(Apple Market Cap)

Task 3: 1 – 2 (compute the difference after retrieving results for Tasks 1

and 2).

Tasks are arranged based on dependencies:


Tasks 1 and 2 are independent and can be executed in parallel.

Task 3 depends on the results of Tasks 1 and 2 and must wait for their
completion.

3. Task Fetching Unit:

Dependency Resolution:
This unit identifies the tasks ready for execution (those with resolved dependencies).

For example:

Initially, Tasks 1 and 2 are fetched for parallel execution.

Once Tasks 1 and 2 are completed, their results are fed into Task 3.


4. Executor:
Executes tasks using tools or functions as needed.

Tools Available:

search: Used to retrieve information (e.g., market caps for Microsoft and
Apple).

math: Performs calculations (e.g., subtracting one market cap from


another).

Execution Workflow:

Fetches tasks from the Task Fetching Unit.

Utilizes tools and functions to perform the necessary operations.

Results are stored in memory for dependent tasks.

5. Final Answer:
Once all tasks are executed and dependencies are resolved, the results are returned to the user.

For the example query, the final result would quantify how much Microsoft’s
market cap needs to increase to exceed Apple’s.
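
To make the workflow concrete, here is a minimal sketch of an executor that runs the two searches in parallel and then the math step. The search tool is a stand-in that returns canned placeholder numbers (not real market data), and the function names are assumptions made for this example. The gap is computed as Apple minus Microsoft so that a positive result means Microsoft has to grow.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder figures in arbitrary units, NOT real market data.
FAKE_SEARCH_RESULTS = {"Microsoft Market Cap": 100.0, "Apple Market Cap": 120.0}

def search(query):
    """Stand-in for a real search tool; looks up a canned value."""
    return FAKE_SEARCH_RESULTS[query]

def math_difference(a, b):
    """Stand-in for the math tool: how much `a` must grow to reach `b`."""
    return b - a

def run_plan():
    memory = {}  # results are stored here for dependent tasks
    # Tasks 1 and 2 are independent, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future1 = pool.submit(search, "Microsoft Market Cap")
        future2 = pool.submit(search, "Apple Market Cap")
        memory["task1"] = future1.result()
        memory["task2"] = future2.result()
    # Task 3 runs only after both results are available.
    memory["task3"] = math_difference(memory["task1"], memory["task2"])
    return memory

results = run_plan()
print(results["task3"])  # 20.0 with the placeholder values above
```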

Key Features:

Task Decomposition: Breaks down complex queries into manageable components.

Parallel Execution: Executes independent tasks simultaneously to optimize performance.

Dependency Management: Ensures tasks are executed in the correct sequence,
based on their interdependencies.

Tool Integration: Supports multiple tools (e.g., search, math) to handle various task types.

Why Do These Agents Matter?

By employing specialized agents with distinct functions, the RAG pipeline ensures:

Accuracy: Queries are routed and processed efficiently.

Scalability: Complex tasks are divided and executed seamlessly.

Flexibility: Dynamic agents respond effectively to changing scenarios or unexpected inputs.

Efficiency: Redundant processes are avoided, ensuring faster, smarter results.

These agents collectively enable RAG systems to deliver high-quality, contextually
accurate, and timely responses to users, regardless of the complexity of their queries.

RAG vs Agentic RAG

Agentic RAG frameworks are much more versatile than traditional RAG setups. In a
traditional RAG system, the AI relies on a single tool—a vector database—for
retrieving information to shape its responses. While effective for basic data retrieval,
this approach is limited to working with static documents.

In contrast, agentic RAG systems go beyond simple data retrieval. These
advanced frameworks can integrate multiple tools to handle a variety of tasks. For
example, they can perform complex mathematical calculations, write emails, analyze
data, or even make decisions based on contextual needs. This ability to incorporate
different tools makes them far more flexible and capable.

Additionally, agentic RAG systems excel in multistep reasoning. They are context-aware,
meaning they can decide when and how to use specific tools to solve problems
or accomplish tasks. This ensures better accuracy and efficiency in handling more
complex requirements.

Its ability to work collaboratively in multi-agent systems sets agentic RAG apart.
Multiple AI agents can work together, achieving results that are often far better than
those of a single AI agent. This adaptability and scalability make agentic RAG a
powerful choice for dynamic, real-world applications.


The Tabular Comparison of RAG vs Agentic RAG

Feature | RAG | Agentic RAG

Task Complexity | Handles simple query-based tasks but lacks advanced decision-making | Handles complex multi-step tasks with multiple tools and agents as needed for retrieval, reasoning, and more

Decision-Making | Limited, no autonomous decision-making involved | Agents autonomously decide what data to retrieve, how to retrieve, grade, reason, reflect, and generate responses

Multi-Step Reasoning | Limited to single-step queries and responses | Excels at multi-step reasoning, especially after retrieval with grading, hallucination, and response evaluation

Key Role | Combines LLMs with external data retrieval to generate responses | Enhances RAG by using agents for intelligent retrieval, response generation, grading, critiquing, and more

Real-Time Data Retrieval | Not possible in native RAG | Designed for real-time data retrieval and integration

Integration with Retrieval Systems | Tied to static retrieval from pre-defined vector databases | Deeply integrated with diverse retrieval systems, agents control the process

Context-Awareness | Limited by the static vector database, no advanced or real-time context-awareness | High, agents adapt to user query and retrieve context, including real-time data

Also read: Evolution of RAG, Long Context LLMs to Agentic RAG

To see how RAG and Agentic RAG differ in practice, let’s look at how each is implemented.

Hands-On: Build a Simple RAG System

Necessary Libraries and Imports

Copy Code
!pip install langchain==0.3.4
!pip install langchain-openai==0.2.3
!pip install langchain-community==0.3.3
!pip install jq==1.8.0
!pip install pymupdf==1.24.12
!pip install langchain-chroma==0.1.4
from getpass import getpass
OPENAI_KEY = getpass('Enter Open AI API Key: ')
import os
os.environ['OPENAI_API_KEY'] = OPENAI_KEY
from langchain_openai import OpenAIEmbeddings
openai_embed_model = OpenAIEmbeddings(model='text-embedding-3-small')

1. Core Functionalities

JSON Document Handling


Processes JSON documents into structured formats:

Copy Code
import json
from langchain_community.document_loaders import JSONLoader
from langchain_core.documents import Document

# Load JSON documents (one JSON object per line)
loader = JSONLoader(file_path='./rag_docs/wikidata_rag_demo.jsonl',
                    jq_schema='.',
                    text_content=False,
                    json_lines=True)
wiki_docs = loader.load()

# Convert each record into a Document with title, id, and source metadata
wiki_docs_processed = []
for doc in wiki_docs:
    doc = json.loads(doc.page_content)
    metadata = {
        "title": doc['title'],
        "id": doc['id'],
        "source": "Wikipedia"
    }
    data = ' '.join(doc['paragraphs'])
    wiki_docs_processed.append(Document(page_content=data, metadata=metadata))

Output

Document(metadata={'title': 'Chi-square distribution', 'id': '71548',
'source': 'Wikipedia'}, page_content='In probability theory and statistics,
the chi-square distribution (also chi-squared or formula_1\xa0 distribution)
is one of the most widely used theoretical probability distributions. Chi-
square distribution with formula_2 degrees of freedom is written as
formula_3. It is a special case of gamma distribution. Chi-square
distribution is primarily used in statistical significance tests and
confidence intervals. It is useful, because it is relatively easy to show
that certain probability distributions come close to it, under certain
conditions. One of these conditions is that the null hypothesis must be
true. Another one is that the different random variables (or observations)
must be independent of each other.')

PDF Document Handling

Splits PDF content into chunks for vector embedding:


Copy Code
from glob import glob
from langchain_community.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def create_simple_chunks(file_path, chunk_size=3500, chunk_overlap=200):
    # Load the PDF page by page, then split the pages into overlapping chunks
    loader = PyMuPDFLoader(file_path)
    doc_pages = loader.load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return splitter.split_documents(doc_pages)

pdf_files = glob('./rag_docs/*.pdf')
# Process PDF files
paper_docs = []
for fp in pdf_files:
    paper_docs.extend(create_simple_chunks(file_path=fp))

Output

Loading pages: ./rag_docs/cnn_paper.pdf

Chunking pages: ./rag_docs/cnn_paper.pdf

Finished processing: ./rag_docs/cnn_paper.pdf

Loading pages: ./rag_docs/attention_paper.pdf

Chunking pages: ./rag_docs/attention_paper.pdf

Finished processing: ./rag_docs/attention_paper.pdf

Loading pages: ./rag_docs/vision_transformer.pdf

Chunking pages: ./rag_docs/vision_transformer.pdf

Finished processing: ./rag_docs/vision_transformer.pdf

Loading pages: ./rag_docs/resnet_paper.pdf

Chunking pages: ./rag_docs/resnet_paper.pdf

Finished processing: ./rag_docs/resnet_paper.pdf
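
To build some intuition for what chunk_size and chunk_overlap do, here is a simplified sketch that cuts a string into fixed-size character windows that share a few characters with their neighbours. It is not how RecursiveCharacterTextSplitter works internally (that splitter also tries to break on separators such as paragraphs and sentences); the function name and sample text are made up for this example.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is the kind of shared context the overlap setting is meant to preserve across chunk boundaries.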

2. Embedding and Vector Storage


Creates embeddings for documents using OpenAI’s model and stores them in a
Chroma vector database:


Copy Code
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
# Initialize embedding model
openai_embed_model = OpenAIEmbeddings(model='text-embedding-3-small')
# Combine documents
total_docs = wiki_docs_processed + paper_docs
# Create and save vector database
chroma_db = Chroma.from_documents(documents=total_docs,
collection_name='my_db',
embedding=openai_embed_model,
collection_metadata={"hnsw:space": "cosine"},
persist_directory="./my_db")

Load an existing vector database from disk:

Copy Code
chroma_db = Chroma(persist_directory="./my_db",
collection_name='my_db',
embedding_function=openai_embed_model)

3. Semantic Retrieval

Retrieves the top-k most relevant documents based on a query:

Copy Code
similarity_retriever = chroma_db.as_retriever(search_type="similarity", search_kw
# Query for semantic similarity
query = "What is machine learning?"
top_docs = similarity_retriever.invoke(query)
# Display results
from IPython.display import display, Markdown
def display_docs(docs):
    for doc in docs:
        print('Metadata:', doc.metadata)
        print('Content Brief:')
        display(Markdown(doc.page_content[:1000]))
        print()

display_docs(top_docs)
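
Under the hood, similarity retrieval comes down to comparing the query embedding with each document embedding and keeping the best matches. The sketch below shows that idea with cosine similarity (the metric configured for the Chroma collection above) on tiny hand-written vectors. The vectors and document names are invented for illustration, and a real vector store uses an approximate index rather than sorting every document.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k):
    """Return the names of the k documents most similar to the query."""
    scored = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

# Toy 3-dimensional "embeddings"; real embedding vectors have far more dimensions.
toy_docs = {
    "doc_ml": [0.9, 0.1, 0.0],
    "doc_stats": [0.6, 0.4, 0.1],
    "doc_tennis": [0.0, 0.1, 0.9],
}
toy_query = [1.0, 0.0, 0.0]
print(top_k(toy_query, toy_docs, k=2))
# ['doc_ml', 'doc_stats']
```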

4. RAG Pipeline

Combines retrieval with a generative AI model for Q&A:

Prompt Template

Copy Code
from langchain_core.prompts import ChatPromptTemplate
rag_prompt = """You are an assistant who is an expert in question-answering tasks
Answer the following question using only the following pieces of
If the answer is not in the context, do not make up answers, just
Keep the answer detailed and well formatted based on the informat
Question:
{question}
Context:
{context}
Answer:
"""
rag_prompt_template = ChatPromptTemplate.from_template(rag_prompt)

Pipeline Construction

Copy Code
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
# Initialize ChatGPT model
chatgpt = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)
# Format documents into a single string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# Construct the RAG pipeline
qa_rag_chain = (
{
"context": (similarity_retriever | format_docs),
"question": RunnablePassthrough()
}
|
rag_prompt_template
|
chatgpt
)
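
If the pipe syntax looks opaque, it may help to see the same data flow written as plain functions. This is a rough sketch of what the chain does conceptually, not LangChain's implementation; the retriever and model are hypothetical stand-ins that return canned text, and the prompt is a shortened version of the template.

```python
def fake_retriever(question):
    """Hypothetical stand-in for similarity_retriever: returns canned passages."""
    return ["Passage one about the topic.", "Passage two with more detail."]

def format_docs(passages):
    """Join the retrieved passages into one context string."""
    return "\n\n".join(passages)

PROMPT = "Question:\n{question}\nContext:\n{context}\nAnswer:\n"

def fake_llm(prompt_text):
    """Hypothetical stand-in for the chat model: reports the prompt length."""
    return f"(model saw {len(prompt_text)} characters)"

def rag_chain(question):
    # Step 1: fan the input out into the two prompt variables.
    inputs = {
        "context": format_docs(fake_retriever(question)),
        "question": question,  # passed through unchanged
    }
    # Step 2: fill in the prompt. Step 3: call the model.
    prompt_text = PROMPT.format(**inputs)
    return prompt_text, fake_llm(prompt_text)

prompt_text, answer = rag_chain("What is machine learning?")
print(prompt_text)
print(answer)
```

The dictionary at the start of the chain plays the role of step 1: the same question feeds both the retriever branch and the pass-through branch before the prompt is filled in.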

Example Usage

Copy Code
query = "What is the difference between AI, ML, and DL?"
result = qa_rag_chain.invoke(query)
# Display the generated answer
from IPython.display import display, Markdown
display(Markdown(result.content))


Copy Code
query = "What is LangGraph?"
result = qa_rag_chain.invoke(query)
display(Markdown(result.content))

Output

I don't know.

This is because the indexed documents contain no information about LangGraph, and the
prompt tells the model not to make up answers.

Also read: A Comprehensive Guide to Building Multimodal RAG Systems

LangChain Agentic RAG System Using the IBM Granite-3.0-8B-Instruct Model

Here, we will create an Agentic RAG system that uses external information to discuss
the 2024 US Open.

1. Setting Up the Environment

This involves creating the necessary infrastructure:

Log in to watsonx.ai: Use your IBM Cloud credentials.

Create a watsonx.ai Project: Obtain the project ID for the configuration.

Set Up Jupyter Notebook: This can be done in the cloud environment or locally
by uploading pre-built notebooks.

2. Configuring Watson Machine Learning (WML)

To link machine learning capabilities:

Create WML Instance: Select the region and Lite plan for a free option.

Generate API Key: Required for secure integration.

Link WML to watsonx.ai Project: Integrate the project for seamless use.

3. Installing Libraries and Setting Credentials

Install required libraries:

Copy Code
!pip install langchain | tail -n 1
!pip install langchain-ibm | tail -n 1
!pip install langchain-community | tail -n 1
!pip install ibm-watsonx-ai | tail -n 1
!pip install ibm_watson_machine_learning | tail -n 1
!pip install chromadb | tail -n 1
!pip install tiktoken | tail -n 1
!pip install python-dotenv | tail -n 1
!pip install bs4 | tail -n 1

import os
from dotenv import load_dotenv
from langchain_ibm import WatsonxEmbeddings, WatsonxLLM
from langchain.vectorstores import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.prompts import PromptTemplate
from langchain.tools import tool
from langchain.tools.render import render_text_description_and_args
from langchain.agents.output_parsers import JSONAgentOutputParser
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents import AgentExecutor
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnablePassthrough
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

Import essential libraries (LangChain for agent framework, ibm-watsonx-ai, etc.).

Use .env to secure sensitive credentials like APIKEY and PROJECT_ID.
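
The later snippets refer to a credentials dictionary and a project_id variable. One possible way to populate them from environment variables is sketched below. APIKEY and PROJECT_ID follow the names mentioned above, while WATSONX_URL and the read_credentials helper are assumptions made for this example, so adjust them to match your own .env file.

```python
import os

try:
    # python-dotenv is installed in the pip step above; load_dotenv reads
    # key=value pairs from the given file into the process environment.
    from dotenv import load_dotenv
    load_dotenv(".env")
except ImportError:
    pass  # fall back to variables already set in the environment

def read_credentials(env=os.environ):
    """Collect watsonx settings from environment variables.

    WATSONX_URL is a hypothetical variable name chosen for this sketch;
    APIKEY and PROJECT_ID follow the names mentioned in the text.
    """
    required = ("WATSONX_URL", "APIKEY", "PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    credentials = {"url": env["WATSONX_URL"], "apikey": env["APIKEY"]}
    return credentials, env["PROJECT_ID"]
```

With the variables set, calling credentials, project_id = read_credentials() yields the two names used in the code that follows. Failing early with a clear message is a deliberate choice here, since a missing key otherwise only surfaces later as an authentication error.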

4. Initializing a Basic Agent

The Setup:

Model Parameters: Use IBM’s Granite-3.0-8B-Instruct LLM with defined decoding
methods, temperature, token limits, and stop sequences.

Prompt Template: A reusable format to guide agent responses.

Copy Code
llm = WatsonxLLM(
    model_id="ibm/granite-3-8b-instruct",
    url=credentials.get("url"),
    apikey=credentials.get("apikey"),
    project_id=project_id,
    params={
        GenParams.DECODING_METHOD: "greedy",
        GenParams.TEMPERATURE: 0,
        GenParams.MIN_NEW_TOKENS: 5,
        GenParams.MAX_NEW_TOKENS: 250,
        GenParams.STOP_SEQUENCES: ["Human:", "Observation"],
    },
)
template = "Answer the {query} accurately. If you do not know the answer, simply
prompt = PromptTemplate.from_template(template)
agent = prompt | llm
agent.invoke({"query": "What sport is played at the US Open?"})

'\n\nThe sport played at the US Open is tennis.'

Copy Code
agent.invoke({"query": "Where was the 2024 US Open Tennis Championship?"})

Do not make up an answer.\n\nThe 2024 US Open Tennis Championship has not
been officially announced yet, so the location is not confirmed. Therefore,
I do not know the answer to this question.'

5. Building a Knowledge Base

This step enables the agent to retrieve specific contextual information.

1. Data Collection: Use URLs to fetch content via LangChain’s WebBaseLoader.

2. Chunking: Split data into manageable pieces using RecursiveCharacterTextSplitter.

3. Embedding: Convert documents into vector representations using IBM’s Slate model.

4. Vector Store: Store embeddings in Chroma DB.

Copy Code
urls = [
    "https://www.ibm.com/case-studies/us-open",
    "https://www.ibm.com/sports/usopen",
    "https://newsroom.ibm.com/US-Open-AI-Tennis-Fan-Engagement",
    "https://newsroom.ibm.com/2024-08-15-ibm-and-the-usta-serve-up-new-and-enhanc
]
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
docs_list[0]

Copy Code
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

#The embedding model that we are using is an IBM Slate™ model through the watsonx
embeddings = WatsonxEmbeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG.value,
    url=credentials["url"],
    apikey=credentials["apikey"],
    project_id=project_id,
)

#In order to store our embedded documents, we will use Chroma DB, an open source

vectorstore = Chroma.from_documents(
documents=doc_splits,
collection_name="agentic-rag-chroma",
embedding=embeddings,
)

Next, set up a retriever so the agent can query the information stored in the vector store.

Copy Code
retriever = vectorstore.as_retriever()

6. Defining Tools

Create tools, like get_IBM_US_Open_context, for specialized queries.

Tools guide the agent to retrieve specific information from the vector store.

Copy Code
@tool
def get_IBM_US_Open_context(question: str):
    """Get context about IBM's involvement in the 2024 US Open Tennis Championshi
    context = retriever.invoke(question)
    return context
tools = [get_IBM_US_Open_context]
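
A tool's name and docstring matter because they are what the agent reads when deciding whether to call it. The sketch below mimics that idea with a tiny home-made registry. It is not LangChain's @tool implementation; simple_tool, describe_tools, and get_demo_context are hypothetical names used only for this example.

```python
TOOL_REGISTRY = {}

def simple_tool(func):
    """Register a function as a tool, keyed by its name."""
    TOOL_REGISTRY[func.__name__] = func
    return func

@simple_tool
def get_demo_context(question: str) -> str:
    """Get context about a demo topic from a canned knowledge base."""
    return f"(retrieved passages for: {question})"

def describe_tools(registry):
    """Render one 'name: description' line per tool for use in a prompt."""
    return "\n".join(
        f"{name}: {func.__doc__.strip()}" for name, func in registry.items()
    )

print(describe_tools(TOOL_REGISTRY))
# get_demo_context: Get context about a demo topic from a canned knowledge base.
```

A vague docstring gives the model little to go on, so it is worth writing the description as if it were an instruction for when to use the tool.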

7. Advanced Prompt Template


System Prompt: Guides the agent on formatting, tool usage, and decision-making
logic.


Human Prompt: Handles user inputs and intermediary steps.

Combine these into a structured ChatPromptTemplate.

Copy Code
system_prompt = """Respond to the human as helpfully and accurately as possible.
Use a json blob to specify a tool by providing an action key (tool name) and an a
Valid "action" values: "Final Answer" or {tool_names}
Provide only ONE action per $JSON_BLOB, as shown:"
```
{{
"action": $TOOL_NAME,
"action_input": $INPUT
}}
```
Follow this format:
Question: input question to answer
Thought: consider previous and subsequent steps
Action:
```
$JSON_BLOB
```
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
{{
"action": "Final Answer",
"action_input": "Final response to human"
}}
Begin! Reminder to ALWAYS respond with a valid json blob of a single action.
Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observatio
human_prompt = """{input}
{agent_scratchpad}
(reminder to always respond in a JSON blob)"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history", optional=True),
        ("human", human_prompt),
    ]
)

8. Adding Memory and Chains

Memory: Store historical interactions to refine responses using ConversationBufferMemory.

Agent Chain: Combine the prompt, LLM, tools, and memory into an AgentExecutor.
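
Conceptually, a buffer memory just keeps every turn and replays the whole conversation as text on the next call. The class below is a simplified stand-in written for this guide, not the ConversationBufferMemory implementation, but it produces the same kind of "Human: ... / AI: ..." history string that appears in the agent outputs in the next step.

```python
class SimpleBufferMemory:
    """Keeps every turn and renders the conversation as one history string."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save(self, human_text, ai_text):
        self.turns.append((human_text, ai_text))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = SimpleBufferMemory()
print(repr(memory.history()))  # '' before any turn has been saved
memory.save(
    "Where was the 2024 US Open Tennis Championship?",
    "It was held at the USTA Billie Jean King National Tennis Center.",
)
print(memory.history())
```

Because nothing is ever dropped, the history grows with every turn, which is fine for a short demo but something to watch in longer conversations.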

9. Testing and Using the RAG System

Verify behavior for complex queries requiring tools (e.g., retrieving IBM’s US Open
involvement).

Ensure fallback to basic knowledge for straightforward questions (e.g., “What is
the capital of France?”).

Copy Code
agent_executor.invoke({"input": "Where was the 2024 US Open Tennis Championship?"


{'input': 'Where was the 2024 US Open Tennis Championship?',
'history': '',
'output': 'The 2024 US Open Tennis Championship was held at the USTA Billie
Jean King National Tennis Center in Flushing, Queens, New York.'}

Great! The agent used its available RAG tool to return the location of the
2024 US Open, per the user's query. We even get to see the exact document
that the agent is retrieving its information from. Now, let's try a slightly
more complex query. This time, the query will be about IBM's
involvement in the 2024 US Open.

Copy Code
agent_executor.invoke(
    {"input": "How did IBM use watsonx at the 2024 US Open Tennis Championship?"}
)

> Finished chain.

{'input': 'How did IBM use watsonx at the 2024 US Open Tennis Championship?',

'history': 'Human: Where was the 2024 US Open Tennis Championship?\nAI: The
2024 US Open Tennis Championship was held at the USTA Billie Jean King
National Tennis Center in Flushing, Queens, New York.',

'output': 'IBM used watsonx at the 2024 US Open Tennis Championship to
create generative AI-powered features such as Match Reports, AI Commentary,
and SlamTracker. These features enhance the digital experience for fans and
scale the productivity of the USTA editorial team.'}

How Does It Work in Practice?

1. Query Processing: The agent parses the user’s query.

2. Decision Making: Determines whether to use tools or respond directly.

3. Tool Interaction: If necessary, invoke the tool (e.g., get_IBM_US_Open_context).

4. Final Response: Combines retrieved data or knowledge base information to
provide an accurate answer.

This structured system combines IBM’s watsonx.ai, LangChain, and machine learning
to build a versatile, knowledge-augmented AI agent tailored for both general and
domain-specific queries.
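
The four steps above can be sketched as a small loop: ask the model for an action, parse the JSON blob, call the tool if one is requested, and stop when the action is "Final Answer". This is a simplified illustration, not the AgentExecutor implementation. The model is replaced by a scripted stand-in, and run_agent, parse_action, and the lookup tool are names invented for this example.

```python
import json
import re

def parse_action(model_text):
    """Extract the JSON object from the model's reply."""
    match = re.search(r"\{.*\}", model_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON blob found in model output")
    return json.loads(match.group(0))

def run_agent(question, llm, tools, max_steps=5):
    scratchpad = ""  # accumulated Action/Observation text from earlier steps
    for _ in range(max_steps):
        reply = llm(question, scratchpad)
        action = parse_action(reply)
        if action["action"] == "Final Answer":
            return action["action_input"]  # respond directly
        # Otherwise call the requested tool and record what it returned.
        observation = tools[action["action"]](action["action_input"])
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("agent did not finish within max_steps")

def scripted_llm(question, scratchpad):
    """Scripted stand-in for the LLM: asks for a tool first, then answers."""
    if "Observation:" not in scratchpad:
        return 'Action:\n{"action": "lookup", "action_input": "' + question + '"}'
    return 'Action:\n{"action": "Final Answer", "action_input": "Answer based on: canned context"}'

demo_tools = {"lookup": lambda q: "canned context"}
final = run_agent("demo question", scripted_llm, demo_tools)
print(final)
# Answer based on: canned context
```

The max_steps cap is a safety net for this sketch: if the model never emits a final answer, the loop stops instead of calling tools forever.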


Conclusion

RAG (Retrieval-Augmented Generation) enhances LLMs by combining external data
retrieval with generative capabilities, improving accuracy and relevance and reducing
hallucinations. However, it struggles with complex, multi-step queries. Agentic RAG
advances this by integrating intelligent agents that dynamically select tools, refine
queries, and handle specialized tasks like code generation or visualizations. It
supports multi-agent collaboration, ensuring adaptability, scalability, and precise
context-aware responses. While traditional RAG suits basic Q&A and research,
Agentic RAG excels in dynamic, data-intensive applications like real-time analysis and
enterprise systems. Agentic RAG’s modularity and intelligence make it ideal for
tackling complex tasks beyond the scope of traditional RAG systems.

I hope you find this guide helpful in understanding RAG vs Agentic RAG! If you have any
questions regarding the article, comment below.

Pankaj Singh

Hi, I am Pankaj Singh Negi - Senior Content Editor | Passionate about storytelling and
crafting compelling narratives that transform ideas into impactful content. I love
reading about technology revolutionizing our lifestyle.

Advanced AI Agents Best Of Tech Guide RAG

Login to continue reading and enjoy expert-curated content.

Keep Reading for Free

We use cookies essential for this site to function well. Please click to help us improve its
usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy &

Free Courses Cookies Policy.

Show details

https://www.analyticsvidhya.com/blog/2024/11/rag-vs-agentic-rag/ 46/52
3/27/25, 10:05 AM RAG vs Agentic RAG: A Comprehensive Guide - Analytics Vidhya

4.7

Generative AI - A Way of Life


Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and
ethics.

4.5

Getting Started with Large Language Models


Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model
training made simple.

4.6

Building LLM Applications using Prompt Engineering


This free course guides you on building LLM apps, mastering prompt engineering, and developing
chatbots with enterprise data.

4.8

We use cookies essential for this site to function well. Please click to help us improve its
usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy &
Cookies Policy.

Show details

Improving Real World RAG Systems: Key Challenges & Practical Solutions

https://www.analyticsvidhya.com/blog/2024/11/rag-vs-agentic-rag/ 47/52
3/27/25, 10:05 AM RAG vs Agentic RAG: A Comprehensive Guide - Analytics Vidhya

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context,
relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions


Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive
course.

RECOMMENDED ARTICLES

A Comprehensive Guide to Building Multimodal RA...

Top 4 Agentic AI Design Patterns for Architecti...

Guide to Adaptive RAG Systems with LangGraph

A Comprehensive Guide to Building Agentic RAG S...

Building an Agentic RAG Application using LangC...

How to Become a RAG Specialist in 2025?

A Comprehensive Guide to RAG-to-SQL on Google C...

How Agentic RAG Systems with CrewAI and LangCha...


We use cookies essential for this site to function well. Please click to help us improve its
usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy &
Top 7 Agentic RAG System to Build AI Agents
Cookies Policy.

Show details
Evolution of RAG, Long Context LLMs to Agentic RAG

https://www.analyticsvidhya.com/blog/2024/11/rag-vs-agentic-rag/ 48/52
3/27/25, 10:05 AM RAG vs Agentic RAG: A Comprehensive Guide - Analytics Vidhya

Responses From Readers

What are your thoughts?...

Submit reply




© Analytics Vidhya 2025. All rights reserved.
