RagApplication.ipynb - Colab

The document outlines a process for building a Retrieval-Augmented Generation (RAG) application using Google Colab. It includes steps for installing necessary dependencies, preprocessing and chunking documents, creating embeddings with a sentence transformer, and integrating a language model for retrieval. The application aims to efficiently retrieve and generate responses based on user queries using the processed document data.

18/09/2025, 10:52 RagApplication.ipynb - Colab

keyboard_arrow_down Install dependencies

!pip install sentence-transformers faiss-cpu transformers gradio datasets accelerate


Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (31.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 31.4/31.4 MB 53.0 MB/s eta 0:00:00
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.12.0

keyboard_arrow_down Loading packages and libraries

import pandas as pd
import numpy as np
import json
from sentence_transformers import SentenceTransformer
import faiss
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
import gradio as gr

# Check if a GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

Using device: cpu


keyboard_arrow_down 1. Document Preprocessing & Chunking

df = pd.read_csv('/content/drive/MyDrive/RagProject/RAG_Sample_Dataset.csv')
df.head()

   doc_id  document_text
0  DOC1    Electronics purchased during festival sales ca...
1  DOC2    Clothing items have a 30-day return policy, pr...
2  DOC3    Bulk buyers can request discounts by contactin...
3  DOC4    Refunds for defective products are processed w...
4  DOC5    Gift cards are non-refundable and cannot be ex...

# Assume a 'document_text' column; join all documents into one string
documents = df['document_text'].dropna().tolist()
full_text = " ".join(documents)

# Function to split text into overlapping word-level chunks
def chunk_text(text, chunk_size=500, overlap=50):
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + chunk_size
        chunk = " ".join(words[start:end])
        chunks.append(chunk)
        start = end - overlap
    print(len(chunks))
    return chunks

chunks = chunk_text(full_text, chunk_size=500, overlap=50)

# Save chunks
with open('chunks.json', 'w') as f:
    json.dump(chunks, f)

# Display the first 3 chunks
print("First 3 chunks:")
for i, chunk in enumerate(chunks[:3]):
    print(f"--- Chunk {i+1} ---")
    print(chunk[:500], "...\n")

1
First 3 chunks:
--- Chunk 1 ---
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

len(chunks)
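As a quick sanity check of the sliding-window logic, consecutive chunks should share exactly `overlap` words. The toy text below is hypothetical (not the notebook's dataset), and the function is restated without the diagnostic print so the snippet stands alone:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    # Emit fixed-size word windows; each window starts chunk_size - overlap
    # words after the previous one, so adjacent chunks share `overlap` words
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + chunk_size
        chunks.append(" ".join(words[start:end]))
        start = end - overlap
    return chunks

# Toy example: 10 words, windows of 4 with an overlap of 2
toy = " ".join(f"w{i}" for i in range(10))
toy_chunks = chunk_text(toy, chunk_size=4, overlap=2)
# Window starts advance by 4 - 2 = 2 words: w0..w3, w2..w5, w4..w7, ...
```

With the real parameters (500-word chunks, 50-word overlap) the sample dataset fits in a single chunk, which is why the notebook prints `1`.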

keyboard_arrow_down 2. Embeddings & Vector Database (FAISS)

# Load the sentence-transformers embedding model
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Create embeddings
embeddings = embedder.encode(chunks, show_progress_bar=True)

print("Embedding shape for one chunk:", embeddings[0].shape)


# Create a FAISS index (L2 distance over the embedding dimension)
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

print("FAISS index created with", index.ntotal, "vectors.")

/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens).
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(

Embedding shape for one chunk: (384,)
FAISS index created with 1 vectors.

query = "What is the return policy for clothing items?"
query_vec = embedder.encode([query])

D, I = index.search(query_vec, k=3)

print("Top 3 retrieved chunks:")
for i in I[0]:
    print(chunks[i][:500], "...\n")

Top 3 retrieved chunks:


Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

keyboard_arrow_down 3. Retrieval + LLM Integration (Open Source)

# Load an open-source language model and tokenizer
model_name = "EleutherAI/gpt-neo-1.3B"  # Or use another open model like 'tiiuae/falcon-7b-instruct'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.float16)
model.eval()



`torch_dtype` is deprecated! Use `dtype` instead!
GPTNeoForCausalLM(
  (transformer): GPTNeoModel(
    (wte): Embedding(50257, 2048)
    (wpe): Embedding(2048, 2048)
    (drop): Dropout(p=0.0, inplace=False)
    (h): ModuleList(
      (0-23): 24 x GPTNeoBlock(
        (ln_1): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
        (attn): GPTNeoAttention(
          (attention): GPTNeoSelfAttention(
            (attn_dropout): Dropout(p=0.0, inplace=False)
            (resid_dropout): Dropout(p=0.0, inplace=False)
            (k_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (v_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (out_proj): Linear(in_features=2048, out_features=2048, bias=True)
          )
        )
        (ln_2): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
        (mlp): GPTNeoMLP(
          (c_fc): Linear(in_features=2048, out_features=8192, bias=True)
          (c_proj): Linear(in_features=8192, out_features=2048, bias=True)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.0, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=2048, out_features=50257, bias=False)
)

# Define the text-generation pipeline
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

Device set to use cpu

def generate_answer(query, top_k=3):
    # Embed the query and retrieve the top-k nearest chunks
    query_vec = embedder.encode([query])
    D, I = index.search(query_vec, k=top_k)
    context = "\n\n".join([chunks[i] for i in I[0]])

    # Build the prompt and generate
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    outputs = generator(prompt, max_length=300, do_sample=True, temperature=0.7)

    # Keep only the text after the final 'Answer:' marker
    answer = outputs[0]['generated_text'].split('Answer:')[-1].strip()
    return context, answer

# Example
query = "How many days are allowed for product returns?"
context, answer = generate_answer(query)
print("Retrieved Context:\n")
print(context[:500], "...\n")
print("Generated Answer:\n")
print(answer)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length` (=300) seem to have been set. `max_new_tokens` will take precedence.
Retrieved Context:

Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

Generated Answer:

for cash. Warranty for electronics is valid for 1 year from the date of purchase. Customers can track their orders t

Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day

sample_queries = [
    "What is the return policy for electronics?",
    "How can bulk buyers request discounts?",
    "What is the warranty period for appliances?"]

results = []
for query in sample_queries:
    context, answer = generate_answer(query)
    results.append({'query': query, 'context': context, 'answer': answer})

df_results = pd.DataFrame(results)
df_results

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length` (=300) seem to have been set. `max_new_tokens` will take precedence.
  query                                        context                                            answer
0 What is the return policy for electronics?   Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...
1 How can bulk buyers request discounts?       Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...
2 What is the warranty period for appliances?  Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...

keyboard_arrow_down GUI using Gradio

def answer_with_ui(query):
    context, answer = generate_answer(query)
    return answer

gr.Interface(fn=answer_with_ui, inputs="text", outputs="text", title="Open Source RAG Q&A Assistant").launch()

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True`.

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://8153fd2f35394880b1.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal.

(Screenshot of the launched "Open Source RAG Q&A Assistant" interface: a query textbox, an output textbox, and Clear/Submit/Flag buttons.)
