Intent-aware hybrid search for e-commerce using Qdrant, IBM Granite embeddings, and Gemini 3.1 Flash Lite.
This project demonstrates why blindly applying Reciprocal Rank Fusion (RRF) can hurt search relevance, and how dynamically weighting retrieval sources based on query intent produces better results. Inspired by Doug Turnbull's "RRF is Not Enough".
Read the full write-up: Don't Just Fuse, Think First: Intent-Driven Hybrid Search with Qdrant
Standard hybrid search combines dense (semantic) and sparse (BM25) results with equal-weight RRF. This works well when both sources agree, but when they disagree, irrelevant BM25 results can drag down good dense results:
```
Query: "I need something to block out airplane noise on long flights"

Dense Search (top 3):                 BM25 Search (top 3):
1. Sony WH-1000XM5 Headphones         1. Nike Air Max 90 Running Shoes
2. Bose QuietComfort Ultra Earbuds    2. Dyson Airwrap Multi-Styler
3. Apple AirPods Pro                  3. Sony WH-1000XM5 Headphones

Naive RRF (equal weights):
1. Sony WH-1000XM5 Headphones       ✓
2. Nike Air Max 90 Running Shoes    <<< irrelevant
3. Bose QuietComfort Ultra Earbuds  ✓
4. Dyson Airwrap Multi-Styler       <<< irrelevant
5. Apple AirPods Pro                ✓
```
The solution: use an LLM (Gemini 3.1 Flash Lite) to classify each query's intent and adjust the RRF weights dynamically:
```
Query: "I need something to block out airplane noise on long flights"
Intent: semantic --> dense_weight=0.8, bm25_weight=0.2

Intent-Aware Weighted RRF:
1. Sony WH-1000XM5 Headphones      ✓
2. Bose QuietComfort Ultra Earbuds ✓
3. Apple AirPods Pro               ✓
4. Dyson Airwrap Multi-Styler
5. Sonos Era 300 Speaker
```
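The re-ranking above follows directly from the RRF formula, score = Σ wᵢ / (k + rankᵢ) with k = 60 by default. A plain-Python illustration (not the project's actual Qdrant call) that reproduces the two result lists:

```python
# Illustrative weighted RRF; reproduces the example rankings above.
dense = ["Sony WH-1000XM5 Headphones",
         "Bose QuietComfort Ultra Earbuds",
         "Apple AirPods Pro"]
bm25 = ["Nike Air Max 90 Running Shoes",
        "Dyson Airwrap Multi-Styler",
        "Sony WH-1000XM5 Headphones"]

def weighted_rrf(ranked_lists, weights, k=60):
    # Each document accumulates w / (k + rank) from every source that returned it.
    scores = {}
    for docs, w in zip(ranked_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Equal weights: Nike's strong BM25 rank pushes it to position 2.
print(weighted_rrf([dense, bm25], [1.0, 1.0]))
# Intent-aware weights (0.8 dense / 0.2 BM25): the headphones rise to the top.
print(weighted_rrf([dense, bm25], [0.8, 0.2]))
```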
```
User Query
    |
    v
[Gemini 3.1 Flash Lite] --> intent + weights + phrase
    |
    v
+-----------------+    +-----------------------------+
|  Dense Search   |    |         BM25 Search         |
|  (Granite R2)   |    | + Phrase Filter (if needed) |
+-----------------+    +-----------------------------+
        |                          |
        +----> Weighted RRF <------+
                    |
                    v
              Ranked Results
```
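The flow above can be sketched in plain Python. Every helper here is a hypothetical stub standing in for the notebook's real Gemini and Qdrant calls (the names and return shapes are assumptions, not the project's API):

```python
# Control-flow sketch of the diagram above; all helpers are stand-ins.

def classify_intent(query: str) -> dict:
    # Stand-in for the Gemini 3.1 Flash Lite structured-output call,
    # which returns an intent, fusion weights, and an optional phrase.
    return {"intent": "semantic", "dense_weight": 0.8,
            "bm25_weight": 0.2, "phrase": None}

def dense_search(query: str) -> list:
    # Stand-in for a Granite-embedding dense query against Qdrant.
    return ["Sony WH-1000XM5 Headphones", "Bose QuietComfort Ultra Earbuds"]

def bm25_search(query: str, phrase=None) -> list:
    # Stand-in for Qdrant's server-side BM25, optionally phrase-filtered.
    return ["Nike Air Max 90 Running Shoes", "Sony WH-1000XM5 Headphones"]

def weighted_rrf(ranked_lists, weights, k=60):
    # Reciprocal Rank Fusion with per-source weights: w / (k + rank).
    scores = {}
    for docs, w in zip(ranked_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def search(query: str) -> list:
    intent = classify_intent(query)
    dense = dense_search(query)
    bm25 = bm25_search(query, phrase=intent["phrase"])
    return weighted_rrf([dense, bm25],
                        [intent["dense_weight"], intent["bm25_weight"]])
```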
- Qdrant 1.17.0 -- Vector database (Docker)
- IBM Granite `granite-embedding-small-english-r2` -- Dense embeddings (384-dim)
- Qdrant's BM25 -- Sparse/lexical embeddings (via FastEmbed)
- Qdrant Phrase Search -- Full-text index with `phrase_matching` for high-precision filtering
- Qdrant Weighted RRF -- Dynamic fusion weights per query
- Gemini 3.1 Flash Lite -- Structured intent classification
Clone the repository:

```shell
git clone https://github.com/gururaser/qdrant-intent-aware-hybrid-search.git
cd qdrant-intent-aware-hybrid-search
```

Copy the example file and add your Gemini API key:

```shell
cp .env.example .env
```

Edit `.env` and replace `your_api_key_here` with your actual API key. You can get one for free at Google AI Studio.

Start Qdrant with Docker:

```shell
docker run -d --name qdrant-hybrid \
  -p 6333:6333 -p 6334:6334 \
  -v qdrant_data:/qdrant/storage \
  qdrant/qdrant:v1.17.0
```

Install the Python dependencies:

```shell
pip install qdrant-client sentence-transformers google-genai fastembed pandas tabulate
```

Launch the notebook:

```shell
jupyter notebook hybrid_search_ecommerce.ipynb
```

Make sure to load your `.env` file in the notebook. The first cell with `os.environ.get("GOOGLE_API_KEY")` will pick it up if you run:

```python
from dotenv import load_dotenv
load_dotenv()
```

Or set the variable directly in your terminal before launching Jupyter:

```shell
export GOOGLE_API_KEY="your_api_key_here"
jupyter notebook hybrid_search_ecommerce.ipynb
```

```
qdrant-intent-aware-hybrid-search/
├── hybrid_search_ecommerce.ipynb   # Full runnable notebook
├── .env.example                    # Template for environment variables
├── .gitignore
└── README.md
```
The LLM classifies each query into one of four intents:
| Intent | Dense | BM25 | When |
|---|---|---|---|
| `semantic` | 0.8 | 0.2 | User describes a need in natural language |
| `keyword_lookup` | 0.2 | 0.8 | User searches for a specific product/brand/model |
| `hybrid` | 0.5 | 0.5 | Mix of descriptive language and specific terms |
| `phrase_match` | 0.4 | 0.6 | User references a specific feature phrase |
For `phrase_match` intents, the LLM also extracts the key phrase, which is applied as a Qdrant `MatchPhrase` filter on the BM25 prefetch for higher precision.
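The table's weight presets can be captured as a small lookup. A minimal sketch; the balanced fallback for unrecognized labels is my assumption, not something the notebook specifies:

```python
# Map each LLM-classified intent to (dense_weight, bm25_weight),
# following the table above.
INTENT_WEIGHTS = {
    "semantic": (0.8, 0.2),
    "keyword_lookup": (0.2, 0.8),
    "hybrid": (0.5, 0.5),
    "phrase_match": (0.4, 0.6),
}

def weights_for(intent: str) -> tuple:
    # Defensive default: fall back to balanced fusion if the LLM
    # returns an unexpected label (assumption, not from the write-up).
    return INTENT_WEIGHTS.get(intent, (0.5, 0.5))
```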
Weighted RRF -- Assign different importance to each retrieval source:

```python
query=models.RrfQuery(
    rrf=models.Rrf(weights=[bm25_weight, dense_weight])
)
```

Phrase Search -- Match exact multi-word phrases, not just individual tokens:

```python
models.FieldCondition(
    key="text",
    match=models.MatchPhrase(phrase="noise cancelling"),
)
```

Native BM25 -- Server-side sparse embedding with IDF modifier:

```python
sparse_vectors_config={
    "bm25": models.SparseVectorParams(modifier=models.Modifier.IDF)
}
```

- RRF is Not Enough -- Doug Turnbull
- Qdrant Hybrid Queries
- Qdrant Phrase Search
- IBM Granite Embedding Models
- Gemini 3.1 Flash Lite