
Multi-document Agentic RAG using Llama-Index and Mistral

Plaban Nayak · Published in The AI Forum · 15 min read · May 12, 2024


Multi-Document Agentic RAG workflow

Introduction
Large language models (LLMs) have revolutionized the way we extract insights from
vast amounts of text data. In the domain of financial analysis, LLM applications are
being designed to assist analysts in answering complex questions about company
performance, earnings reports, and market trends.


One such application involves the use of a retrieval augmented generation (RAG)
pipeline to facilitate the extraction of information from financial statements and
other sources.

Consider a scenario where a financial analyst wants to understand the key takeaways from a company’s Q2 earnings call, specifically focusing on the technological moats the company is building. This type of question goes beyond simple lookup and requires a more sophisticated approach. This is where the concept of an LLM Agent comes into play.

What is an Agent?
According to Llama-Index, an “agent” is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:

Breaking down a complex question into smaller ones

Choosing an external Tool to use + coming up with parameters for calling the Tool

Planning out a set of tasks

Storing previously completed tasks in a memory module

An LLM Agent is a system that combines various techniques such as planning, tailored focus, memory utilization, and the use of different tools to answer complex questions.

Let’s break down how an LLM Agent can be developed to answer the
aforementioned question:

Planning: The LLM Agent first needs to understand the nature of the question and
create a plan to extract relevant information. This involves identifying key terms like
“Q2 earnings call” and “technological moats” and determining the best sources to
gather this information from.

Tailored Focus: The LLM Agent then focuses its attention on the specific aspects of the
question related to technological moats. This involves filtering out irrelevant
information and honing in on the details that are most pertinent to the analyst’s
inquiry.


Memory: The LLM Agent leverages its memory to recall relevant information from
past earnings calls, company reports, and other sources. This helps provide context and
background information to support its analysis.

Using Different Tools: The LLM Agent utilizes a range of tools and techniques to
extract and analyze information. This may include natural language processing (NLP)
algorithms, sentiment analysis, and topic modeling to gain deeper insights into the
earnings call.

Breaking Down Complex Questions: Finally, the LLM Agent breaks down the complex
question into simpler sub-parts, making it easier to extract relevant information and
provide a coherent answer.

Source: General Components of an Agent

Tool Calling
In standard RAG, LLMs are used mainly to synthesize information.

Tool Calling, on the other hand, adds a layer of query understanding on top of a RAG pipeline, enabling users to ask complex queries and get back more precise results. It allows the LLM to figure out how to use a vectordb instead of just consuming its outputs.

Tool Calling enables the LLM to interact with external environments through a dynamic interface: it not only helps choose the appropriate tool but also infers the necessary arguments for execution. This results in a better understanding of the request and better responses compared to standard RAG.
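A minimal sketch makes this concrete. It uses the same FunctionTool and predict_and_call interfaces that appear later in this post; the toy add and multiply functions are illustrative assumptions, not part of the original notebook:

from llama_index.core.tools import FunctionTool
from llama_index.llms.mistralai import MistralAI

# Any plain Python function can be exposed as a tool;
# FunctionTool infers the schema from the signature and docstring.
def add(x: int, y: int) -> int:
    """Add two integers and return the result."""
    return x + y

def multiply(x: int, y: int) -> int:
    """Multiply two integers and return the result."""
    return x * y

add_tool = FunctionTool.from_defaults(fn=add)
multiply_tool = FunctionTool.from_defaults(fn=multiply)
llm = MistralAI(model="mistral-large-latest")

# The LLM both chooses the tool and infers its arguments from the query.
response = llm.predict_and_call(
    [add_tool, multiply_tool],
    "Use a tool to compute 121 multiplied by 3.",
    verbose=True,
)
print(str(response))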

Agent Reasoning Loop

What if the user asks a complex question consisting of multiple steps, or a vague question that needs clarification? This is where the agent reasoning loop comes into the picture. Instead of calling a tool in a single-shot setting, an agent is able to reason over tools across multiple steps.

Source: Llama-Index
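To see the loop in action, LlamaIndex also exposes stepwise execution. The sketch below assumes the agent (an AgentRunner) built in the Setup the RAG Agent section later in this post; the query is illustrative:

# assumes `agent` is the AgentRunner constructed later in this post
task = agent.create_task("Compare and contrast self rag and crag.")
# execute one reasoning step (e.g., a tool call) and inspect it
step_output = agent.run_step(task.task_id)
# keep running steps until the agent decides it is done
while not step_output.is_last:
    step_output = agent.run_step(task.task_id)
response = agent.finalize_response(task.task_id)
print(str(response))

Running the task step by step lets you inspect intermediate tool calls before the final answer is produced.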

Agent Architecture
In LlamaIndex an agent consists of two components:

AgentRunner

AgentWorkers

The AgentRunner objects interface with the AgentWorkers.

AgentRunners are orchestrators which:

Store state

Store conversational memory

Create tasks

Maintain tasks

Run steps for each task

Present the user-facing, high-level interface

AgentWorkers take care of:

Selecting and using tools

Selecting the LLM to make use of the tools


Source: Llama-Index

Calling agent.query allows you to query the agent in a one-off manner, but it does not preserve state. This is where the memory aspect comes into the picture, maintaining the conversation history. The agent stores the chat history in a conversational memory buffer. By default the memory buffer is a flat, rolling list of items whose size depends on the context window of the LLM. Therefore, when the agent decides to use a tool, it uses not only the current chat but also the previous conversation history to perform the next set of actions.
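As a rough sketch of the difference (agent_worker is the FunctionCallingAgentWorker built later in this post; passing an explicit ChatMemoryBuffer is an assumption on our part, since AgentRunner creates a default buffer when none is supplied):

from llama_index.core.memory import ChatMemoryBuffer

# a rolling buffer: the oldest messages fall out once the token limit is hit
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
agent = AgentRunner(agent_worker, memory=memory)

# query(): one-off, no state carried between calls
print(agent.query("Summarize the paper corrective RAG."))

# chat(): the follow-up can say "it" because the previous turn
# is stored in the conversational memory buffer
print(agent.chat("Summarize the paper corrective RAG."))
print(agent.chat("How does it compare with self RAG?"))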

Here we will build a multi-document agent to handle multiple documents. We have implemented Agentic RAG on 3 documents; the same approach can be extended to more documents as well.

Technology Stack Used
Llama-Index: LlamaIndex is the Data Framework for Context-Augmented LLM Apps.

Mistral API: Developers can interact with Mistral through its API, which is similar to the experience with OpenAI’s API system.

Mistral Large comes with new capabilities and strengths:

It is natively fluent in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context.

Its 32K-token context window allows precise information recall from large documents.

Its precise instruction-following enables developers to design their own moderation policies (Mistral used it to set up the system-level moderation of le Chat).

It is natively capable of function calling.

Code Implementation
The code was implemented using Google Colab.

Install required dependencies

%%writefile requirements.txt
llama-index
llama-index-llms-huggingface
llama-index-embeddings-fastembed
fastembed
Unstructured[md]
chromadb
llama-index-vector-stores-chroma
llama-index-llms-groq


einops
accelerate
sentence-transformers
llama-index-llms-mistralai
llama-index-llms-openai

!pip install -r requirements.txt

####################################################################
Successfully installed Unstructured-0.13.7 accelerate-0.30.1 asgiref-3.8.1 back

Download Documents to be processed

!mkdir data
#
! wget "https://arxiv.org/pdf/1810.04805.pdf" -O ./data/BERT_arxiv.pdf
! wget "https://arxiv.org/pdf/2005.11401" -O ./data/RAG_arxiv.pdf
! wget "https://arxiv.org/pdf/2310.11511" -O ./data/self_rag_arxiv.pdf
! wget "https://arxiv.org/pdf/2401.15884" -O ./data/crag_arxiv.pdf

Import Required Dependencies

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, SummaryIndex
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import MetadataFilters, FilterCondition
from typing import List, Optional

import nest_asyncio
nest_asyncio.apply()

Read the Documents


documents = SimpleDirectoryReader(input_files=['./data/self_rag_arxiv.pdf']).load_data()
print(len(documents))
print(f"Document Metadata: {documents[0].metadata}")

Split the documents into chunks/nodes

splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=100)
nodes = splitter.get_nodes_from_documents(documents)
print(f"Length of nodes : {len(nodes)}")
print(f"get the content for node 0 :{nodes[0].get_content(metadata_mode='all')}")

###########################RESPONSE################################
Length of nodes : 43
get the content for node 0 :page_label: 1
file_name: self_rag_arxiv.pdf
file_path: data/self_rag_arxiv.pdf
file_type: application/pdf
file_size: 1405127
creation_date: 2024-05-11
last_modified_date: 2023-10-19

Preprint.
SELF-RAG: LEARNING TO RETRIEVE, GENERATE, AND
CRITIQUE THROUGH SELF-REFLECTION
Akari Asai†, Zeqiu Wu†, Yizhong Wang†§, Avirup Sil‡, Hannaneh Hajishirzi†§
†University of Washington §Allen Institute for AI ‡IBM Research AI
{akari,zeqiuwu,yizhongw,hannaneh}@cs.washington.edu, [email protected]
ABSTRACT
Despite their remarkable capabilities, large language models (LLMs) often produce
responses containing factual inaccuracies due to their sole reliance on the parametric
knowledge they encapsulate. Retrieval-Augmented Generation (RAG), an ad
hoc approach that augments LMs with retrieval of relevant knowledge, decreases
such issues. However, indiscriminately retrieving and incorporating a fixed number
of retrieved passages, regardless of whether retrieval is necessary, or passages are
relevant, diminishes LM versatility or can lead to unhelpful response generation.
We introduce a new framework called Self-Reflective Retrieval-Augmented Gen-
eration (SELF-RAG) that enhances an LM’s quality and factuality through retrieval
and self-reflection. Our framework trains a single arbitrary LM that adaptively
retrieves passages on-demand, and generates and reflects on retrieved passages
and its own generations using special tokens, called reflection tokens. Generating
reflection tokens makes the LM controllable during the inference phase, enabling it
to tailor its behavior to diverse task requirements. Experiments show that SELF-
RAG (7B and 13B parameters) significantly outperforms state-of-the-art LLMs
and retrieval-augmented models on a diverse set of tasks. Specifically, SELF-RAG
outperforms ChatGPT and retrieval-augmented Llama2-chat on Open-domain QA,
reasoning and fact verification tasks, and it shows significant gains in improving
factuality and citation accuracy for long-form generations relative to these models.

1 INTRODUCTION
State-of-the-art LLMs continue to struggle with factual errors (Mallen et al., 2023)
despite their increased model and data scale (Ouyang et al., 2022). Retrieval-Augmented Generation
(RAG) methods (Figure 1 left; Lewis et al. 2020; Guu et al. 2020) augment the input of LLMs
with relevant retrieved passages, reducing factual errors in knowledge-intensive tasks (Ram et al.,
2023; Asai et al., 2023a). However, these methods may hinder the versatility of LLMs or introduce
unnecessary or off-topic passages that lead to low-quality generations (Shi et al., 2023) since they
retrieve passages indiscriminately regardless of whether the factual grounding is helpful. Moreover,
the output is not guaranteed to be consistent with retrieved relevant passages (Gao et al., 2023) since
the models are not explicitly trained to leverage and follow facts from provided passages. This
work introduces Self-Reflective Retrieval-augmented Generation (SELF-RAG) to improve an
LLM’s generation quality, including its factual accuracy without hurting its versatility, via on-demand
retrieval and self-reflection. We train an arbitrary LM in an end-to-end manner to learn to reflect on
its own generation process given a task input by generating both task output and intermittent special
tokens (i.e., reflection tokens). Reflection tokens are categorized into retrieval and critique tokens to
indicate the need for retrieval and its generation quality respectively (Figure 1 right). In particular,
given an input prompt and preceding generations, SELF-RAG first determines if augmenting the
continued generation with retrieved passages would be helpful. If so, it outputs a retrieval token that
calls a retriever model on demand (Step 1). Subsequently, SELF-RAG concurrently processes multiple
retrieved passages, evaluating their relevance and then generating corresponding task outputs (Step
2). It then generates critique tokens to criticize its own output and choose the best one in terms
of factuality and overall quality. This process differs from conventional RAG (Figure 1 left).
Our code and trained models are available at https://selfrag.github.io/.
arXiv:2310.11511v1 [cs.CL] 17 Oct 2023

Instantiate the vectorstore

import chromadb
db = chromadb.PersistentClient(path="./chroma_db_mistral")
chroma_collection = db.get_or_create_collection("multidocument-agent")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

Instantiate the embedding model

from llama_index.embeddings.fastembed import FastEmbedEmbedding
from llama_index.core import Settings
#
embed_model = FastEmbedEmbedding(model_name="BAAI/bge-small-en-v1.5")
#
Settings.embed_model = embed_model
#
Settings.chunk_size = 1024

Instantiate the LLM

import os
from google.colab import userdata
from llama_index.llms.mistralai import MistralAI
#
os.environ["MISTRAL_API_KEY"] = userdata.get("MISTRAL_API_KEY")
llm = MistralAI(model="mistral-large-latest")

Instantiate the Vector Query tool and summary tool for a specific document
LlamaIndex Data Agents process natural language input to perform actions rather than generating responses. The key to creating effective data agents lies in abstracting tools. But what exactly is meant by a tool in this context? Think of tools as API interfaces designed for agent interactions rather than human interfaces.

Core Concepts:

Tool: Essentially, a Tool includes a generic interface and fundamental metadata such as name, description, and function schema.

Tool Spec: This delves into the API specifics, presenting a comprehensive
service API specification that can be translated into various Tools.

There are several types of Tools available:

FunctionTool: Converts any user-defined function into a Tool, with the ability to
infer the function’s schema.

QueryEngineTool: Wraps around an existing query engine. Since our agent abstractions are derived from BaseQueryEngine, this tool can also accommodate agents.

#instantiate Vectorstore
name = "BERT_arxiv"
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
vector_index.storage_context.vector_store.persist(persist_path="/content/chroma_db_mistral")
#
# Define Vectorstore Autoretrieval tool
def vector_query(query: str, page_numbers: Optional[List[str]] = None) -> str:
    '''
    Perform a vector search over the index.
    query(str): query string to be embedded
    page_numbers(List[str]): list of page numbers to be retrieved;
        leave blank if we want to perform a vector search over all pages
    '''
    page_numbers = page_numbers or []
    metadata_dict = [{"key": "page_label", "value": p} for p in page_numbers]
    #
    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(metadata_dict, condition=FilterCondition.OR),
    )
    #
    response = query_engine.query(query)
    return response
#
# llamaindex FunctionTool wraps any python function we feed it
vector_query_tool = FunctionTool.from_defaults(name=f"vector_tool_{name}",
                                               fn=vector_query)
# Prepare Summary Tool
summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(response_mode="tree_summarize",
                                                     use_async=True)
summary_query_tool = QueryEngineTool.from_defaults(
    name=f"summary_tool_{name}",
    query_engine=summary_query_engine,
    description=("Use ONLY IF you want to get a holistic summary of the documents. "
                 "DO NOT USE if you have specified questions over the documents."),
)

Test the LLM

response = llm.predict_and_call([vector_query_tool],
"Summarize the content in page number 2",
verbose=True)
######################RESPONSE###########################
=== Calling Function ===
Calling function: vector_tool_BERT_arxiv with args: {"query": "summarize conten
=== Function Output ===
The content discusses the use of RAG models for knowledge-intensive generation

Helper function to generate Vectorstore Tool and Summary tool for all the documents

def get_doc_tools(file_path: str, name: str):
    '''
    Get vector query and summary query tools from a document.
    '''
    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    print("length of nodes")
    splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=100)
    nodes = splitter.get_nodes_from_documents(documents)
    print(f"Length of nodes : {len(nodes)}")
    # instantiate Vectorstore
    vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
    vector_index.storage_context.vector_store.persist(persist_path="/content/chroma_db_mistral")
    #
    # Define Vectorstore Autoretrieval tool
    def vector_query(query: str, page_numbers: Optional[List[str]] = None) -> str:
        '''
        Perform a vector search over the index.
        query(str): query string to be embedded
        page_numbers(List[str]): list of page numbers to be retrieved;
            leave blank if we want to perform a vector search over all pages
        '''
        page_numbers = page_numbers or []
        metadata_dict = [{"key": "page_label", "value": p} for p in page_numbers]
        #
        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(metadata_dict, condition=FilterCondition.OR),
        )
        #
        response = query_engine.query(query)
        return response
    #
    # llamaindex FunctionTool wraps any python function we feed it
    vector_query_tool = FunctionTool.from_defaults(name=f"vector_tool_{name}",
                                                   fn=vector_query)
    # Prepare Summary Tool
    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(response_mode="tree_summarize",
                                                         use_async=True)
    summary_query_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=("Use ONLY IF you want to get a holistic summary of the documents. "
                     "DO NOT USE if you have specified questions over the documents."),
    )
    return vector_query_tool, summary_query_tool

Prepare an input list with the specified document names

import os
root_path = "/content/data"
file_name = []
file_path = []
for file in os.listdir(root_path):
    if file.endswith(".pdf"):
        file_name.append(file.split(".")[0])
        file_path.append(os.path.join(root_path, file))
#
print(file_name)
print(file_path)

################################RESPONSE###############################
['self_rag_arxiv', 'crag_arxiv', 'RAG_arxiv', '', 'BERT_arxiv']
['/content/data/BERT_arxiv.pdf',
'/content/data/BERT_arxiv.pdf',
'/content/data/BERT_arxiv.pdf',
'/content/data/BERT_arxiv.pdf',
'/content/data/BERT_arxiv.pdf']

Note: FunctionTool expects a string that matches the pattern '^[a-zA-Z0-9_-]+$' for the tool name.
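Since the tool names above are derived from file names, it can help to normalize them before constructing the tools. A minimal sketch; the sanitize_name helper is hypothetical, not part of the original notebook:

import re

def sanitize_name(raw: str) -> str:
    # replace anything outside the allowed character set with '_'
    # so the result matches '^[a-zA-Z0-9_-]+$'
    return re.sub(r"[^a-zA-Z0-9_-]", "_", raw)

print(sanitize_name("BERT arxiv (v2)"))  # -> BERT_arxiv__v2_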

Generate the vector tool and summary tool for each document

papers_to_tools_dict = {}
for name, filename in zip(file_name, file_path):
    vector_query_tool, summary_query_tool = get_doc_tools(filename, name)
    papers_to_tools_dict[name] = [vector_query_tool, summary_query_tool]

####################RESPONSE###########################
length of nodes
Length of nodes : 28
length of nodes
Length of nodes : 28
length of nodes


Length of nodes : 28
length of nodes
Length of nodes : 28
length of nodes
Length of nodes : 28

Get the tools into a flat list

initial_tools = [t for f in file_name for t in papers_to_tools_dict[f]]
initial_tools

Stuffing too many tool choices into the LLM prompt leads to the following issues:

The tools might not fit into the prompt, especially if the number of documents is large, since we are modeling each document as a separate tool.

Cost and latency will spike owing to the increase in the number of tokens.

The prompt outline can also get confusing, resulting in the LLM not performing as instructed.

A solution here is to perform RAG on the level of tools. In order to do this we will use the ObjectIndex class of Llama-Index.

The ObjectIndex class is one that allows for the indexing of arbitrary Python objects. As such, it is quite flexible and applicable to a wide range of use cases. As examples:

Use an ObjectIndex to index Tool objects to then be used by an agent.

Use an ObjectIndex to index SQLTableSchema objects.

The VectorStoreIndex is a critical component of LlamaIndex, facilitating the storage and retrieval of data. It works by:

Accepting a list of Node objects and building an index from them.

Using different vector stores as the storage backend, enhancing the flexibility
and scalability of applications.


from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex
#
obj_index = ObjectIndex.from_objects(initial_tools, index_cls=VectorStoreIndex)

Set up the ObjectIndex as retriever

obj_retriever = obj_index.as_retriever(similarity_top_k=2)
tools = obj_retriever.retrieve("compare and contrast the papers self rag and corrective rag")
#
print(tools[0].metadata)
print(tools[1].metadata)

###################################RESPONSE###########################
ToolMetadata(description='Use ONLY IF you want to get a holistic summary of the

ToolMetadata(description='vector_tool_self_rag_arxiv(query: str, page_numbers:

Setup the RAG Agent

from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner
#
agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm,
    system_prompt="""You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.""",
    verbose=True,
)
agent = AgentRunner(agent_worker)

Ask Query 1

#
response = agent.query("Compare and contrast self rag and crag.")
print(str(response))

##############################RESPONSE###################################


Added user message to memory: Compare and contrast self rag and crag.
=== LLM Response ===
Sure, I'd be happy to help you understand the differences between Self RAG and

Self RAG (Retrieval-Augmented Generation) is a method where the model generates

On the other hand, CRAG (Contrastive Retrieval-Augmented Generation) is also a

Again, it's crucial to remember that both of these methods should only be used

Ask Query 2

response = agent.query("Summarize the paper corrective RAG.")
print(str(response))
###############################RESPONSE#######################
Added user message to memory: Summarize the paper corrective RAG.
=== Calling Function ===
Calling function: summary_tool_RAG_arxiv with args: {"input": "corrective RAG"}
=== Function Output ===
The corrective RAG approach is a method used to address issues or errors in a s
=== LLM Response ===
The corrective RAG approach categorizes issues into Red, Amber, and Green level
assistant: The corrective RAG approach categorizes issues into Red, Amber, and

Conclusion
Unlike the standard RAG pipeline, which is suitable for simple queries across a few documents, this agentic approach adapts based on initial findings to enhance further data retrieval. Here we have developed an autonomous research agent, enhancing our ability to engage with and analyze our data comprehensively.

References:

Controllable Agents for RAG - LlamaIndex

A Guide to Building a Full-Stack LlamaIndex Web App with Delphic (docs.llamaindex.ai)

LlamaIndex Agents
