Vector search: state-of-the-art retrieval for generative AI apps
Pamela Fox
Principal Cloud Advocate (Python)
Agenda
Retrieval-augmented generation (RAG)
Vectors and vector databases
State of the art retrieval with Azure AI Search
Data and platform integrations
Use cases
Retrieval-augmented generation (RAG)
The limitations of LLMs
No internal knowledge
Incorporating domain knowledge
[Diagram: User → Question → Search → Document → Large Language Model]
Example retrieved document: "lessons · Scuba diving lessons · Surfing lessons · Horseback riding lessons These lessons provide employees with the opportunity to try new things, challenge themselves, and improve their physical skills.…."
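The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind the slide: `search` is a toy word-overlap stand-in for a real retriever, `DOCUMENTS` and the prompt wording are invented for the example, and the call to the language model itself is left out.

```python
# Toy document store; the first entry paraphrases the example on the slide.
DOCUMENTS = [
    "Scuba diving lessons, surfing lessons, and horseback riding lessons "
    "give employees the opportunity to try new things.",
    "The health plan covers annual eye exams.",
]


def words(text: str) -> set[str]:
    """Lowercase the text and split it into words, ignoring commas and periods."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def search(question: str, documents: list[str], top: int = 1) -> list[str]:
    """Hypothetical retriever: rank documents by words shared with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top]


def build_prompt(question: str, sources: list[str]) -> str:
    """Combine the retrieved passages and the question into one prompt for the LLM."""
    source_block = "\n".join(f"- {source}" for source in sources)
    return (
        "Answer the question using only the sources below.\n"
        f"Sources:\n{source_block}\n"
        f"Question: {question}"
    )


question = "Does the company offer surfing lessons"
sources = search(question, DOCUMENTS)
prompt = build_prompt(question, sources)
print(prompt)
```

The prompt, not the model, carries the domain knowledge, which is why the quality of the search step matters so much.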
Robust retrieval for RAG apps
Responses only as good as retrieved data
Keyword search recall challenges
“vocabulary gap”
Gets worse with natural language questions
Vector-based retrieval finds documents by semantic similarity
Robust to variation in how concepts are articulated (word choices, morphology, specificity, etc.)
Example
Question: “Looking for lessons on underwater activities”
Won’t match: “Scuba classes”, “Snorkeling group sessions”
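The vocabulary gap is easy to see with plain term matching. The sketch below uses the question and document titles from the example; the `tokenize` helper is a deliberately naive assumption (lowercase, split on whitespace, no stemming or synonyms), so it shows the problem rather than any particular search engine's behavior.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase and split on whitespace; no stemming or synonyms, on purpose."""
    return set(text.lower().split())


question = "Looking for lessons on underwater activities"
documents = ["Scuba classes", "Snorkeling group sessions"]

# Count the terms each document shares with the question.
overlaps = {doc: len(tokenize(question) & tokenize(doc)) for doc in documents}
print(overlaps)
```

Neither document shares a single term with the question, so a purely lexical match has nothing to score, even though both documents are relevant.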
Vectors and vector databases
Vector embeddings
An embedding encodes an input as a list of floating-point numbers.
[Diagram: “tortoise” → OpenAI ada-002 (create embedding) → [-0.003335318, -0.0176891904,…] → Search (existing vectors) → [[“snake”, [-0.122, ..]], [“frog”, [-0.045, ..]]]]
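Searching existing vectors comes down to comparing the query embedding with stored embeddings using a similarity measure. Cosine similarity is a common choice; the sketch below assumes it and uses made-up 3-dimensional vectors purely for illustration (real ada-002 embeddings have 1536 dimensions).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy vectors, not real model output.
query = [1.0, 0.0, 1.0]
close = [1.0, 0.0, 0.9]   # points in nearly the same direction as the query
far = [0.0, 1.0, 0.0]     # orthogonal to the query

print(cosine_similarity(query, close))
print(cosine_similarity(query, far))
```

Vectors that point in nearly the same direction score close to 1, and unrelated (orthogonal) vectors score 0.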
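from azure.search.documents.models import VectorizedQuery

# search_client is an existing SearchClient; search_vector is the query embedding.

# Vector query
r = search_client.search(
    None,
    top=5,
    vector_queries=[VectorizedQuery(
        vector=search_vector,
        k_nearest_neighbors=5,
        fields="embedding")])

# Same query, requesting an exhaustive nearest-neighbor search
r = search_client.search(
    None,
    top=5,
    vector_queries=[VectorizedQuery(
        vector=search_vector,
        k_nearest_neighbors=5,
        fields="embedding",
        exhaustive=True)])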
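What an exhaustive nearest-neighbor search does can be sketched locally: score every stored vector against the query and keep the k closest. The function name `exhaustive_knn`, the toy vectors, and the choice of Euclidean distance are assumptions for illustration; a real index uses whichever metric it was configured with.

```python
import heapq
import math


def exhaustive_knn(
    query: list[float], vectors: dict[str, list[float]], k: int
) -> list[tuple[str, float]]:
    """Compare the query with every vector and return the k nearest (key, distance) pairs."""
    scored = ((math.dist(query, vector), key) for key, vector in vectors.items())
    return [(key, distance) for distance, key in heapq.nsmallest(k, scored)]


# Invented 2-dimensional vectors, not real embeddings.
vectors = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.8, 0.6],
    "doc-c": [0.0, 1.0],
}
nearest = exhaustive_knn([1.0, 0.1], vectors, k=2)
print(nearest)
```

Because every vector is visited, the cost grows linearly with the number of vectors; approximate indexes trade a little recall for much lower latency.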
Rich vector search query capabilities
Filtered vector search
Scope to date ranges, categories, geographic distances, access control groups, etc.
Rich filter expressions
Pre-/post-filtering
Pre-filter: great for selective filters, no recall disruption
Post-filter: better for low-selectivity filters, but watch for empty results

r = search_client.search(
    None,
    top=5,
    vector_queries=[VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="embedding")],
    vector_filter_mode=VectorFilterMode.PRE_FILTER,
    filter="tag eq 'perks' and created gt 2023-11-15T00:00:00Z")
https://learn.microsoft.com/azure/search/vector-search-filters
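The difference between the two filter modes, and why post-filtering can come back empty, shows up even on four documents. This is a simplified local model, not how the service implements filtering: `DOCS`, `nearest`, `pre_filter`, and `post_filter` are hypothetical names, and the vectors are invented.

```python
import math

# Invented documents: two tagged "perks", two tagged "benefits".
DOCS = [
    {"id": "1", "tag": "perks", "embedding": [0.0, 1.0]},
    {"id": "2", "tag": "benefits", "embedding": [1.0, 0.0]},
    {"id": "3", "tag": "benefits", "embedding": [0.9, 0.1]},
    {"id": "4", "tag": "perks", "embedding": [0.1, 0.9]},
]


def nearest(query: list[float], docs: list[dict], k: int) -> list[dict]:
    """Return the k documents whose embeddings are closest to the query."""
    return sorted(docs, key=lambda doc: math.dist(query, doc["embedding"]))[:k]


def pre_filter(query: list[float], docs: list[dict], k: int, tag: str) -> list[dict]:
    """Apply the filter first, then take the k nearest among the matches."""
    return nearest(query, [doc for doc in docs if doc["tag"] == tag], k)


def post_filter(query: list[float], docs: list[dict], k: int, tag: str) -> list[dict]:
    """Take the k nearest overall, then drop the ones that fail the filter."""
    return [doc for doc in nearest(query, docs, k) if doc["tag"] == tag]


query = [1.0, 0.0]
pre_ids = [doc["id"] for doc in pre_filter(query, DOCS, k=2, tag="perks")]
post_ids = [doc["id"] for doc in post_filter(query, DOCS, k=2, tag="perks")]
print(pre_ids, post_ids)
```

Here the two nearest documents overall are both tagged "benefits", so post-filtering on "perks" leaves nothing, while pre-filtering still returns two results.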
degraded quality
→ Can’t only focus on recall
Incorrect passages in prompt → possibly well-grounded yet wrong answers
→ Helps to establish thresholds for “good enough” grounding data
[Chart: accuracy (y-axis, 50 to 65) against number of documents in input context (x-axis, 5 to 30)]
Source: Lost in the Middle: How Language Models Use Long Contexts, Liu et al. arXiv:2307.03172
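One way to apply a “good enough” threshold is to drop low-scoring passages and cap how many reach the prompt. The sketch below is an assumption-laden illustration: the `score` field, the threshold of 0.5, and the cap of 3 are invented values, and a suitable threshold depends on the scoring scale of the retriever in use.

```python
def keep_good_enough(
    results: list[dict], min_score: float, max_passages: int
) -> list[dict]:
    """Keep passages at or above min_score, best first, capped at max_passages."""
    kept = [result for result in results if result["score"] >= min_score]
    kept.sort(key=lambda result: result["score"], reverse=True)
    return kept[:max_passages]


# Invented scores on an arbitrary 0-1 scale.
results = [
    {"id": "a", "score": 0.42},
    {"id": "b", "score": 0.91},
    {"id": "c", "score": 0.77},
    {"id": "d", "score": 0.55},
    {"id": "e", "score": 0.60},
]
grounding = keep_good_enough(results, min_score=0.5, max_passages=3)
print([result["id"] for result in grounding])
```

If nothing clears the threshold, the list is empty, and the application can say it found no answer rather than ground the model on weak passages.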
Improving relevance
All information retrieval tricks apply!
[Bar chart: accuracy score (y-axis, 0 to 80) by retrieval mode, grouped by customer datasets, BEIR dataset, and MIRACL dataset]
Retrieval comparison using Azure AI Search in various retrieval modes on customer and academic benchmarks
Source: Outperforming vector search with hybrid + reranking
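Hybrid retrieval has to merge a keyword ranking and a vector ranking into one list. Reciprocal Rank Fusion (RRF) is one widely used way to do that, sketched below; it is not taken from the slides, the constant `k=60` is the value commonly used in the literature and is an assumption here, and the document ids are invented.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_ranking = ["a", "b", "c"]
vector_ranking = ["c", "a", "d"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why hybrid results tend to be more robust than either mode alone. A reranker can then reorder the fused list.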
Impact of query types on relevance
All values are NDCG@3.

Query type                    Keyword   Vector   Hybrid   Hybrid + Semantic ranker
Concept seeking queries       39        45.8     46.3     59.6
Fact seeking queries          37.8      49       49.1     63.4
Exact snippet search          51.1      41.5     51       60.8
Web search-like queries       41.8      46.3     50       58.9
Keyword queries               79.2      11.7     61       66.9
Low query/doc term overlap    23        36.1     35.9     49.1
Queries with misspellings     28.8      39.1     40.6     54.6
Long queries                  42.7      41.6     48.1     59.4
Medium queries                38.1      44.7     46.7     59.9
Short queries                 53.1      38.8     53       63.9
Source: Outperforming vector search with hybrid + reranking
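NDCG@3 rewards placing the most relevant documents in the top three positions. The sketch below is one common formulation (linear gain, log2 discount); the source does not say which variant was used, and the figures above appear to be scaled to 0 to 100, so treat this as an illustration. It also assumes `relevances` lists the graded relevance of every judged document in ranked order.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: list[float], k: int) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


print(ndcg_at_k([3, 2, 1], k=3))  # ideal order
print(ndcg_at_k([0, 0, 3], k=3))  # the only relevant document is in third place
```

A perfect ordering scores 1.0, and pushing the only relevant document down to position three halves the score under this formulation.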
Azure AI Search:
Seamless Data and Platform Integrations
Data preparation for RAG applications
Chunking
Split long-form text into short passages
LLM context length limits
Focused subset of the content
Multiple independent passages
Basics
~200–500 tokens/passage
Maintain lexical boundaries
Introduce overlap
Layout
Layout information is valuable, e.g., tables
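The chunking basics above can be sketched as a sliding window with overlap. This is a simplified illustration: whitespace-separated words stand in for model tokens, the defaults of 300 and 30 are assumptions within the range on the slide, and the window respects word boundaries only (a production splitter would also respect sentence and layout boundaries).

```python
def chunk_text(text: str, max_tokens: int = 300, overlap: int = 30) -> list[str]:
    """Split text into passages of up to max_tokens words, each sharing
    `overlap` words with the previous passage."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")
    tokens = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(number) for number in range(10))
passages = chunk_text(sample, max_tokens=4, overlap=1)
print(passages)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one passage, at the cost of indexing some words twice.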
Vectorization
Indexing-time: convert passages to vectors
https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization
Azure AI Studio &
Azure AI SDK
First-class integration
Build indexes from data in Blob Storage, Microsoft Fabric, etc.
Attach to existing Azure AI Search indexes
Use cases
Example uses
Developers have used Azure AI Search to create RAG apps for…
Public government data
Internal HR documents, company meetings, presentations
Customer support requests and call transcripts
Technical documentation and issue trackers
Product manuals
Next steps
Learn more about Azure AI Search
https://aka.ms/AzureAISearch
Dig more into quality evaluation details and why Azure AI Search
will make your application generate better results
https://aka.ms/ragrelevance
12:00-1:15pm
2:15-3:30pm