Advanced retrieval: hybrid + reranking
FULL TRANSCRIPT
This day is very exam heavy because AWS
loves the words hybrid, filters, and
reranking.
Advanced retrieval, hybrid search plus
re-ranking.
Big idea, one sentence. Good retrieval
is recall first, precision later. Use
hybrid search to catch candidates, then
rerank to put the best answers on top.
One: why vector-only retrieval is not
enough. Vector search is great at
meaning but bad at exact terms: names,
IDs, codes, dates, legal phrases, and
must-contain constraints. Example: find
the exact policy about NSW stamp duty
2023. A pure vector search may return
similar tax docs, older years, or other
states. That's where hybrid search comes
in.
Number two: hybrid search, keyword plus
vector. What it means: you combine
keyword search (BM25, text match) with
vector similarity (semantic meaning),
then merge the results. This is most
commonly done in OpenSearch. Why AWS
likes hybrid search: because real
questions usually contain intent ("how
do I") and constraints ("in NSW", "after
2020", "policy 123"). Hybrid search
respects both. Simple mental model:
keyword search asks, does it literally
mention this? Vector search asks, is it
about the same thing? Hybrid search asks
both.
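A minimal sketch of the "merge the results" step, assuming reciprocal rank fusion (RRF) as the merge rule. The transcript doesn't say which rule is used; RRF is one common choice, and the document IDs and the constant k=60 below are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for stability.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers for one query.
keyword_hits = ["nsw-stamp-duty-2023", "nsw-stamp-duty-2019", "vic-stamp-duty-2023"]
vector_hits = ["first-home-buyer-guide", "nsw-stamp-duty-2023", "land-tax-overview"]

merged = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(merged[0])  # nsw-stamp-duty-2023 (the only doc found by both retrievers)
```

The document that both retrievers return ends up first, which is the "asks both" behaviour in miniature.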
Number three: metadata filters,
non-negotiable in exams. Filters are not
search; they are rules. Examples: tenant
ID equals X, region equals NSW, year
equals 2021, document type is policy.
Filters reduce noise, enforce security,
improve precision, and reduce reranker
load. Exam signal: if the question
mentions permissions, tenants, time
ranges, or regions, and the answer
doesn't include metadata filters, it's
incomplete.
Number four: topic segmentation, quiet
but important. What it is: instead of
one giant index, you split content by
topic, or route queries to
topic-specific indexes. Examples: HR
versus legal versus finance, product
docs versus support tickets. Why this
matters: smaller search space, higher
relevance, cheaper retrieval. Exam
signal: "reduce noise", "improve
relevance", or "domain-specific search"
points to topic segmentation.
Number five: rerankers, the precision
layer. This is where many people get
confused. What a re-ranker does: takes
the top N results from retrieval, scores
them more carefully, and reorders them.
It does not search the whole corpus.
Where re-rankers sit is important:
re-rankers sit after retrieval, never
before.
Number six: what re-rankers fix. Exam
gold. Re-rankers fix almost-right but
wrong-order results, subtle intent
mismatches, long-context relevance, and
query nuance. They are especially good
at legal text, policy docs, and
multi-condition questions. They do not
fix missing documents, bad filters, or
the wrong corpus. Garbage in, garbage
still out.
Number
seven: AWS static plus one, advanced
retrieval edition. Static: the retrieval
strategy (hybrid, filters, topic
routing, re-ranker choice, top K
values). Plus one: the user query. The
retrieval system is fixed; each query
flows through it. That's static plus one
again.
Number eight: how
AWS expects you to design this end to
end. Typical enterprise flow. One, user
query arrives. Two, apply metadata
filters: security, tenant, date. Three,
run hybrid search. Four, take the top 20
to 50 results. Five, run the re-ranker.
Six, send the top three to five to the
LLM. This minimizes hallucinations,
token cost, and irrelevant context.
Classic exam traps, watch closely. Each
of these is wrong: "Use a reranker
instead of vector search." "The reranker
searches the whole index." "Hybrid
search doesn't need filters." "Vector
search replaces keyword search." AWS
wants layers, not replacements. One
memory
story, don't forget this: the courtroom.
Keyword search: keyword lookup in law
books. Vector search: understanding
legal meaning. Hybrid search: use both.
Filters: jurisdiction and date rules.
Reranker: senior judge reordering
arguments. LLM: lawyer explaining the
verdict. The judge doesn't read every
book. They only review the best
arguments. Exam compression rules.
Memorize. Recall first: hybrid search.
Precision later: reranker. Rules first:
metadata filters. Smaller scope: topic
segmentation. If an answer jumps
straight from query to LLM, it's wrong.
What AWS is really testing: they want to
see if you understand that retrieval is
a pipeline, not a query. Systems that
retrieve well hallucinate less, even
with average models.
I'll walk through one realistic
enterprise example step by step and show
where hybrid search, filters, topic
segmentation, and reranking each earn
their keep.
Real
example: advanced retrieval, hybrid plus
re-ranking. Scenario: a legal and policy
assistant for an Australian company. The
company has 500,000 documents (HR
policies, legal contracts, tax and
compliance docs) stored as chunks with
embeddings and indexed in OpenSearch.
Users ask questions like, "What is the
NSW stamp duty exemption for first home
buyers after 2023?" This question has
semantic intent (exemptions, first home
buyers), hard constraints (NSW, after
2023), and exact terms (stamp duty,
first home buyer). Vector-only search
will struggle. Keyword-only search will
miss nuance. So we use layers.
Step one: topic segmentation, reduce the
search space. Before searching, the
system routes the query to a topic
index: the legal and tax index, not HR,
not product docs. How: a simple
classifier, or rules, or Amazon
Comprehend topic detection. Why this
matters: instead of 500k docs, maybe
60k. Less noise, faster retrieval,
cheaper reranking. Exam signal: "reduce
noise" or "domain-specific search"
points to topic segmentation.
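Here is a rough sketch of the "rules" option for routing, assuming a hand-written keyword table. The index names, the keyword sets, and the "general" fallback are all hypothetical; a trained classifier or a managed topic-detection service could sit behind the same function.

```python
import re

# Hypothetical topic -> trigger-word table; a real system would tune or learn this.
TOPIC_KEYWORDS = {
    "legal-tax": {"stamp", "duty", "tax", "exemption", "contract", "compliance"},
    "hr": {"leave", "payroll", "onboarding", "salary"},
    "product": {"install", "release", "feature", "bug"},
}


def route_to_index(query, default="general"):
    """Pick the topic index whose keyword set overlaps the query the most."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    best_topic, best_overlap = default, 0
    for topic, keywords in TOPIC_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_topic, best_overlap = topic, overlap
    return best_topic


question = "What is the NSW stamp duty exemption for first home buyers after 2023?"
print(route_to_index(question))  # legal-tax
```

Queries that match no topic fall back to the default, so nothing is silently routed to the wrong department.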
Step two: metadata filters, rules before
relevance. Now we apply filters that are
not optional. These are hard gates, not
ranking signals. Why filters come early:
enforce security, enforce correctness,
and prevent irrelevant documents from
ever being seen.
Exam signal: if permissions, region, or
time matter, filters come first.
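A small sketch of filters as hard gates, assuming each chunk carries tenant, region, year, and document-type metadata. The field names, the sample chunks, and the choice to treat the year bound as inclusive are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant_id: str
    region: str
    year: int
    doc_type: str


def passes_filters(chunk, tenant_id, region, min_year, doc_types):
    """A chunk either passes every rule or is dropped; there is no partial credit."""
    return (
        chunk.tenant_id == tenant_id
        and chunk.region == region
        and chunk.year >= min_year  # inclusive bound, an assumption
        and chunk.doc_type in doc_types
    )


def apply_filters(chunks, **rules):
    return [c for c in chunks if passes_filters(c, **rules)]


chunks = [
    Chunk("a", "acme", "NSW", 2024, "policy"),
    Chunk("b", "acme", "VIC", 2024, "policy"),   # wrong state
    Chunk("c", "acme", "NSW", 2019, "policy"),   # wrong year
    Chunk("d", "other", "NSW", 2024, "policy"),  # wrong tenant
]
allowed = apply_filters(
    chunks, tenant_id="acme", region="NSW", min_year=2023, doc_types={"policy"}
)
print([c.chunk_id for c in allowed])  # ['a']
```

Nothing here assigns a score. A chunk that fails a rule never reaches retrieval or the reranker, which is what separates a filter from a ranking signal.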
Step three: hybrid retrieval, the recall
layer.
Now the system runs hybrid search.
Keyword (BM25): "stamp duty", "first
home buyer", "exemption". Vector
similarity: the semantic meaning of the
question, which captures phrasing like
"concessions", "property transfer tax",
"eligible buyers". OpenSearch merges
both result sets. Output: top 50
candidate chunks. This step maximizes
recall: did we retrieve everything that
could be relevant?
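As a sketch only, this is roughly what the request body for an OpenSearch hybrid query could look like when built in Python. The field names "text" and "embedding" and the three-number vector are placeholders, not anything from the transcript. To my understanding, OpenSearch also expects a search pipeline with a normalization processor to combine the two score sets; that setup is left out here.

```python
import json


def build_hybrid_query(question, query_vector, size=50):
    """Build a hybrid request body: one BM25 sub-query plus one k-NN sub-query."""
    return {
        "size": size,  # top 50 candidates for the reranker
        "query": {
            "hybrid": {
                "queries": [
                    # Keyword side: BM25 match on the chunk text (field name assumed).
                    {"match": {"text": {"query": question}}},
                    # Vector side: nearest neighbours on the embedding (field name assumed).
                    {"knn": {"embedding": {"vector": query_vector, "k": size}}},
                ]
            }
        },
    }


body = build_hybrid_query(
    "NSW stamp duty exemption first home buyer",
    [0.12, -0.03, 0.88],  # placeholder; a real embedding has hundreds of dimensions
)
print(json.dumps(body, indent=2))
```

The size is deliberately generous. This layer is about recall, so it hands a wide candidate set to the next step instead of trying to pick winners itself.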
Step four: reranker, the precision
layer. Now comes the reranker model.
What it sees: the original user query
and the top 50 candidate chunks. What it
does: scores each chunk for fine-grained
relevance and understands nuance like
exemptions versus reductions, current
versus historical rules, eligibility
versus definition. What it does not do:
it does not search the whole index, it
does not replace filters, and it does
not fix missing documents.
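To make the shape of that step concrete, here is a toy sketch. The scorer below is a simple term-overlap stand-in, not a real reranker model; in practice score_fn would call a cross-encoder or a hosted rerank model. The candidate texts are made up.

```python
import re


def toy_relevance(query, text):
    """Stand-in scorer: fraction of query terms that appear in the chunk."""
    q_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    t_terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def rerank(query, candidates, score_fn=toy_relevance, top_k=5):
    """Score only the candidates it is given, then return the best top_k."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Hypothetical candidates, in the order retrieval returned them.
candidates = [
    "Stamp duty reductions for investors in NSW",
    "NSW stamp duty exemption for first home buyers",
    "Definition of a first home buyer",
]
best = rerank("NSW stamp duty exemption first home buyers", candidates, top_k=2)
print(best[0])  # NSW stamp duty exemption for first home buyers
```

Note that rerank never touches an index. Its whole input is the candidate list, so a document that retrieval missed cannot be recovered here.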
Output: top five chunks, reordered by
true relevance. Exam signal: "improved
precision", "better ranking", or
"nuanced relevance" points to a
reranker.
Step five: LLM, grounding only. Now only
the top three to five re-ranked chunks
are sent to the LLM. Why? Lower token
cost, less hallucination, stronger
grounding. The LLM now answers: for NSW
first home buyers after 2023, stamp duty
exemptions apply if ...
What happens if you remove pieces (exam
traps).
No hybrid search: misses exact legal
terms, misses older but still relevant
policies.
No metadata filters: other states leak
in, wrong years appear, security breach
risk.
No reranker: almost-right answers appear
first, long policies outrank precise
ones.
Reranker without good retrieval: perfect
ranking of the wrong documents.
Where each component lives, a clear
mental map: topic segmentation, then
metadata filters, then hybrid retrieval,
then the reranker, then the LLM.
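One way to picture the map is a fixed pipeline function that every query passes through. This is a sketch with stub stages that only record when they were called; the stage names, the stub return values, and the 50 and 5 cut-offs (taken from the example above) are illustrative.

```python
def run_pipeline(query, user, stages, recall_k=50, final_k=5):
    """Run the fixed retrieval pipeline for one query, in order."""
    index = stages["route"](query)                                    # 1. topic segmentation
    filters = stages["filters"](user)                                 # 2. metadata filters
    candidates = stages["retrieve"](index, query, filters, recall_k)  # 3. hybrid retrieval
    best = stages["rerank"](query, candidates)[:final_k]              # 4. reranker
    return stages["generate"](query, best)                            # 5. LLM, grounded on best


calls = []


def traced(name, result):
    """Build a stub stage that logs its name and returns a canned result."""
    def stage(*args):
        calls.append(name)
        return result
    return stage


stub_stages = {
    "route": traced("route", "legal-tax-index"),
    "filters": traced("filters", {"region": "NSW"}),
    "retrieve": traced("retrieve", [f"chunk-{i}" for i in range(50)]),
    "rerank": traced("rerank", [f"chunk-{i}" for i in reversed(range(50))]),
    "generate": traced("generate", "grounded answer"),
}

answer = run_pipeline("NSW stamp duty exemption?", {"tenant_id": "acme"}, stub_stages)
print(calls)  # ['route', 'filters', 'retrieve', 'rerank', 'generate']
```

The stages dictionary is the part you design once. The query and the user are the only things that change per call.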
This ordering matters. AWS exams care
about order. Static plus one, real-world
framing. Static: hybrid search
configuration, filter schema, topic
segmentation rules, reranker choice, top
K. Plus one: the user query. You don't
redesign retrieval per query. You design
it once, then feed queries through it.
One memory story.
Lock it in. The legal research team.
Topic segmentation: assign the right
department. Filters: jurisdiction and
year rules. Hybrid search: junior
lawyers gather cases. Reranker: senior
lawyer orders by relevance. LLM: a
junior explains the conclusion. The
senior lawyer never reads everything.
They only review the best candidates.
Ultrashort exam cheat sheet. Need exact
terms plus meaning: hybrid. Need
security, time, or tenant rules:
filters. Need better ordering: reranker.
Need less noise: topic segmentation.
Reranker always comes after retrieval.