18/09/2025, 10:52 — RagApplication.ipynb - Colab
Install dependencies
!pip install sentence-transformers faiss-cpu transformers gradio datasets accelerate
Downloading faiss_cpu-1.12.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (31.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 31.4/31.4 MB 53.0 MB/s eta 0:00:00
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.12.0
Load packages and libraries
import pandas as pd
import numpy as np
import json
from sentence_transformers import SentenceTransformer
import faiss
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
import gradio as gr
# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)
Using device: cpu
https://colab.research.google.com/drive/1YsAY2XSU-Ft03yE1u9DoDdRJ28fT6rYG?authuser=1#scrollTo=Ca09YusIMT1V&printMode=true 1/6
1. Document Preprocessing & Chunking
df = pd.read_csv('/content/drive/MyDrive/RagProject/RAG_Sample_Dataset.csv')
df.head()
doc_id document_text
0 DOC1 Electronics purchased during festival sales ca...
1 DOC2 Clothing items have a 30-day return policy, pr...
2 DOC3 Bulk buyers can request discounts by contactin...
3 DOC4 Refunds for defective products are processed w...
4 DOC5 Gift cards are non-refundable and cannot be ex...
# The CSV has a 'document_text' column; join all rows into one string
documents = df['document_text'].dropna().tolist()
full_text = " ".join(documents)
# Function to chunk text into overlapping word-level windows
def chunk_text(text, chunk_size=500, overlap=50):
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + chunk_size
        chunk = " ".join(words[start:end])
        chunks.append(chunk)
        start = end - overlap
    print(len(chunks))
    return chunks
chunks = chunk_text(full_text, chunk_size=500, overlap=50)

# Save chunks
with open('chunks.json', 'w') as f:
    json.dump(chunks, f)

# Display first 3 chunks
print("First 3 chunks:")
for i, chunk in enumerate(chunks[:3]):
    print(f"--- Chunk {i+1} ---")
    print(chunk[:500], "...\n")
1
First 3 chunks:
--- Chunk 1 ---
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
len(chunks)
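To see how the overlap parameter behaves, here is the same chunking logic applied to a toy eight-word string with chunk_size=5 and overlap=2 (illustrative values only, not the notebook's defaults):

```python
# Toy demonstration of the chunking logic above: with chunk_size=5 and
# overlap=2, consecutive chunks share two words at their boundary.
def chunk_text(text, chunk_size=5, overlap=2):
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = start + chunk_size
        chunks.append(" ".join(words[start:end]))
        start = end - overlap
    return chunks

demo = chunk_text("one two three four five six seven eight",
                  chunk_size=5, overlap=2)
print(demo)
# → ['one two three four five', 'four five six seven eight', 'seven eight']
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from both neighbouring chunks.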
2. Embeddings & Vector Database (FAISS)
# Load sentence-transformers model
embedder = SentenceTransformer('all-MiniLM-L6-v2')
# Create embeddings
embeddings = embedder.encode(chunks, show_progress_bar=True)
print("Embedding shape for one chunk:", embeddings[0].shape)
# Create FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)
print("FAISS index created with", index.ntotal, "vectors.")
Embedding shape for one chunk: (384,)
FAISS index created with 1 vectors.
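The index holds a single vector because joining every document into full_text and chunking at 500 words produced exactly one chunk, so every search can only return that chunk. A toy sketch (hypothetical short documents, not the notebook's dataset) of chunking each document separately instead, which yields one retrievable vector per short document:

```python
# Hypothetical alternative: chunk each document on its own, so short
# documents become individual retrievable chunks instead of one big blob.
docs = [
    "Electronics purchased during festival sales can be returned within 10 days.",
    "Clothing items have a 30-day return policy.",
    "Gift cards are non-refundable.",
]

def chunk_doc(text, chunk_size=50, overlap=5):
    words = text.split()
    out = []
    start = 0
    while start < len(words):
        out.append(" ".join(words[start:start + chunk_size]))
        start += chunk_size - overlap
    return out

# Flatten: each short document fits in one chunk of its own
per_doc_chunks = [c for d in docs for c in chunk_doc(d)]
print(len(per_doc_chunks))  # → 3 (one chunk per short document)
```

With one vector per document, top-k retrieval can actually discriminate between policies rather than always returning the same blob.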
query = "What is the return policy for clothing items?"
query_vec = embedder.encode([query])
D, I = index.search(query_vec, k=3)

print("Top 3 retrieved chunks:")
for i in I[0]:
    print(chunks[i][:500], "...\n")
Top 3 retrieved chunks:
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
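A side note on the metric: IndexFlatL2 ranks by Euclidean distance. If the embeddings are unit-normalised, that ranking matches cosine similarity, since ||a - b||^2 = 2 - 2*cos(a, b). A numpy-only sketch with random toy vectors (not the notebook's embeddings) showing the two rankings agree:

```python
import numpy as np

# For unit-normalised vectors, L2 distance and cosine similarity induce
# the same ranking: ||a - b||^2 = 2 - 2*cos(a, b).
rng = np.random.default_rng(0)
docs = rng.normal(size=(4, 8)).astype("float32")
query = rng.normal(size=(8,)).astype("float32")

docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)

cos_rank = np.argsort(-docs_n @ query_n)                      # highest similarity first
l2_rank = np.argsort(((docs_n - query_n) ** 2).sum(axis=1))   # smallest distance first
print(np.array_equal(cos_rank, l2_rank))
```

In FAISS itself the equivalent would be normalising the embeddings and using IndexFlatIP, but the L2 index above works as long as the ranking is all that matters.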
3. Retrieval + LLM Integration (Open Source)
# Load an open-source language model and tokenizer
model_name = "EleutherAI/gpt-neo-1.3B"  # or another open model, e.g. 'tiiuae/falcon-7b-instruct'
tokenizer = AutoTokenizer.from_pretrained(model_name)
# `dtype` replaces the deprecated `torch_dtype` argument
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", dtype=torch.float16)
model.eval()
GPTNeoForCausalLM(
  (transformer): GPTNeoModel(
    (wte): Embedding(50257, 2048)
    (wpe): Embedding(2048, 2048)
    (drop): Dropout(p=0.0, inplace=False)
    (h): ModuleList(
      (0-23): 24 x GPTNeoBlock(
        (ln_1): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
        (attn): GPTNeoAttention(
          (attention): GPTNeoSelfAttention(
            (attn_dropout): Dropout(p=0.0, inplace=False)
            (resid_dropout): Dropout(p=0.0, inplace=False)
            (k_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (v_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
            (out_proj): Linear(in_features=2048, out_features=2048, bias=True)
          )
        )
        (ln_2): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
        (mlp): GPTNeoMLP(
          (c_fc): Linear(in_features=2048, out_features=8192, bias=True)
          (c_proj): Linear(in_features=8192, out_features=2048, bias=True)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.0, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((2048,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=2048, out_features=50257, bias=False)
)
# Define generation pipeline
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)
Device set to use cpu
def generate_answer(query, top_k=3):
    query_vec = embedder.encode([query])
    D, I = index.search(query_vec, k=top_k)
    context = "\n\n".join([chunks[i] for i in I[0]])
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    # max_new_tokens bounds only the generated continuation, avoiding the
    # max_length/max_new_tokens conflict warning
    outputs = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
    answer = outputs[0]['generated_text'].split('Answer:')[-1].strip()
    return context, answer
# Example
query = "How many days are allowed for product returns?"
context, answer = generate_answer(query)
print("Retrieved Context:\n")
print(context[:500], "...\n")
print("Generated Answer:\n")
print(answer)
Retrieved Context:
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
Generated Answer:
for cash. Warranty for electronics is valid for 1 year from the date of purchase. Customers can track their orders t
Electronics purchased during festival sales can be returned within 10 days of delivery. Clothing items have a 30-day
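The pipeline output contains the prompt followed by the continuation, which is why generate_answer splits on the last "Answer:" marker. A toy string (not real model output) showing the extraction step in isolation:

```python
# The pipeline echoes the prompt plus the generated continuation;
# splitting on the final "Answer:" marker isolates the answer text.
generated_text = (
    "Context:\nClothing items have a 30-day return policy.\n\n"
    "Question: What is the return policy?\nAnswer: 30 days."
)
answer = generated_text.split("Answer:")[-1].strip()
print(answer)  # → 30 days.
```

Using the last occurrence guards against the word "Answer:" also appearing inside the retrieved context.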
sample_queries = [
    "What is the return policy for electronics?",
    "How can bulk buyers request discounts?",
    "What is the warranty period for appliances?",
]
results = []
for query in sample_queries:
    context, answer = generate_answer(query)
    results.append({'query': query, 'context': context, 'answer': answer})

df_results = pd.DataFrame(results)
df_results
   query                                        context                                            answer
0  What is the return policy for electronics?   Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...
1  How can bulk buyers request discounts?       Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...
2  What is the warranty period for appliances?  Electronics purchased during festival sales ca...  for cash. Warranty for electronics is valid fo...
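All three rows retrieve the same context and nearly identical answers, which follows directly from the one-chunk index. A quick hand-rolled sanity check (illustrative; the helper and stop-word list are hypothetical, not part of the notebook) that counts query keywords appearing in a retrieved context can flag such off-topic retrievals:

```python
# Count how many non-stop-word query terms appear in the retrieved
# context; a score of 0 suggests the retrieval is off-topic.
def keyword_hits(query, context):
    stop = {"what", "is", "the", "for", "how", "can"}
    terms = [w.lower().strip("?") for w in query.split()
             if w.lower() not in stop]
    return sum(1 for t in terms if t in context.lower())

hits = keyword_hits(
    "What is the warranty period for appliances?",
    "Electronics purchased during festival sales can be returned...",
)
print(hits)  # → 0, i.e. the retrieved context shares no keywords with the query
```

A check like this is crude, but it would immediately surface that the warranty and discount queries are being answered from the wrong (and only) chunk.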
GUI using Gradio
def answer_with_ui(query):
    context, answer = generate_answer(query)
    return answer

gr.Interface(fn=answer_with_ui, inputs="text", outputs="text", title="Open Source RAG Q&A Assistant").launch()
It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically settin
Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://8153fd2f35394880b1.gradio.live
This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the termina
[Gradio UI screenshot: "Open Source RAG Q&A Assistant" — query textbox, output textbox, Clear/Submit/Flag buttons]