qdrant-intent-aware-hybrid-search

Intent-aware hybrid search for e-commerce using Qdrant, IBM Granite embeddings, and Gemini 3.1 Flash Lite.

This project demonstrates why blindly applying Reciprocal Rank Fusion (RRF) can hurt search relevance, and how dynamically weighting retrieval sources based on query intent produces better results. It is inspired by Doug Turnbull's "RRF is Not Enough".

Read the full write-up: Don't Just Fuse, Think First: Intent-Driven Hybrid Search with Qdrant


The Problem

Standard hybrid search combines dense (semantic) and sparse (BM25) results with equal-weight RRF. This works well when both sources agree. When they disagree, equal weighting interleaves irrelevant BM25 hits with good dense hits and pushes relevant products down the list:

Query: "I need something to block out airplane noise on long flights"

Dense Search (top 3):                       BM25 Search (top 3):
  1. Sony WH-1000XM5 Headphones               1. Nike Air Max 90 Running Shoes
  2. Bose QuietComfort Ultra Earbuds           2. Dyson Airwrap Multi-Styler
  3. Apple AirPods Pro                         3. Sony WH-1000XM5 Headphones

Naive RRF (equal weights):
  1. Sony WH-1000XM5 Headphones      ✓
  2. Nike Air Max 90 Running Shoes   <<<  irrelevant
  3. Bose QuietComfort Ultra Earbuds  ✓
  4. Dyson Airwrap Multi-Styler      <<<  irrelevant
  5. Apple AirPods Pro                ✓
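
The interleaving above follows directly from the RRF formula. Here is a minimal sketch of equal-weight RRF in plain Python, run on the two top-3 lists from the example. The function name `rrf_fuse` and the constant `k=60` are assumptions for illustration (60 is the value commonly used in the RRF literature), and ties are broken by first-seen order. It is not the code Qdrant runs internally.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Equal-weight RRF: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # sorted() is stable, so documents with equal scores keep first-seen order.
    return sorted(scores, key=scores.get, reverse=True)


dense = [
    "Sony WH-1000XM5 Headphones",
    "Bose QuietComfort Ultra Earbuds",
    "Apple AirPods Pro",
]
bm25 = [
    "Nike Air Max 90 Running Shoes",
    "Dyson Airwrap Multi-Styler",
    "Sony WH-1000XM5 Headphones",
]

fused = rrf_fuse([dense, bm25])
for position, name in enumerate(fused, start=1):
    print(position, name)
```

A rank-1 BM25 hit scores 1/(k+1), which beats a rank-2 dense hit at 1/(k+2). That is why the running shoes land in second place even though only one source returned them.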

The Solution

Use an LLM (Gemini 3.1 Flash Lite) to classify query intent and dynamically adjust RRF weights:

Query: "I need something to block out airplane noise on long flights"
Intent: semantic --> dense_weight=0.8, bm25_weight=0.2

Intent-Aware Weighted RRF:
  1. Sony WH-1000XM5 Headphones      ✓
  2. Bose QuietComfort Ultra Earbuds  ✓
  3. Apple AirPods Pro                ✓
  4. Dyson Airwrap Multi-Styler
  5. Sonos Era 300 Speaker
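
To see why the weights change the ordering, here is a small sketch of weighted RRF in plain Python. It assumes each source contributes `weight / (k + rank)` with `k=60`. Both the formula and the name `weighted_rrf` are illustrative assumptions, and Qdrant's internal scoring may differ in detail. Only the top-3 lists from the example are used, so positions 4 and 5 will not match the output above, which draws on deeper result lists.

```python
def weighted_rrf(rankings, weights, k=60):
    """Weighted RRF sketch: score(doc) = sum over lists of weight / (k + rank)."""
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


dense = [
    "Sony WH-1000XM5 Headphones",
    "Bose QuietComfort Ultra Earbuds",
    "Apple AirPods Pro",
]
bm25 = [
    "Nike Air Max 90 Running Shoes",
    "Dyson Airwrap Multi-Styler",
    "Sony WH-1000XM5 Headphones",
]

# Weights for the "semantic" intent: dense_weight=0.8, bm25_weight=0.2.
fused = weighted_rrf([dense, bm25], weights=[0.8, 0.2])
print(fused[:3])  # the three dense hits now occupy the top three slots
```

With these weights, even the third-ranked dense hit (0.8/63) outscores the top BM25-only hit (0.2/61), so the irrelevant products drop below every dense result.
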

Architecture

User Query
    |
    v
[Gemini 3.1 Flash Lite] --> intent + weights + phrase
    |
    v
+-----------------+     +-----------------------------+
| Dense Search    |     | BM25 Search                 |
| (Granite R2)    |     | + Phrase Filter (if needed)  |
+-----------------+     +-----------------------------+
        |                           |
        +----> Weighted RRF <-------+
                    |
                    v
             Ranked Results

Tech Stack


Quick Start

1. Clone the repo

git clone https://github.com/gururaser/qdrant-intent-aware-hybrid-search.git
cd qdrant-intent-aware-hybrid-search

2. Set up environment variables

Copy the example file and add your Gemini API key:

cp .env.example .env

Edit .env and replace your_api_key_here with your actual API key. You can get one for free at Google AI Studio.

3. Start Qdrant

docker run -d --name qdrant-hybrid \
  -p 6333:6333 -p 6334:6334 \
  -v qdrant_data:/qdrant/storage \
  qdrant/qdrant:v1.17.0

4. Install dependencies

pip install qdrant-client sentence-transformers google-genai fastembed pandas tabulate python-dotenv jupyter

5. Run the notebook

jupyter notebook hybrid_search_ecommerce.ipynb

Make sure the notebook can see your .env file. The first cell reads the key with os.environ.get("GOOGLE_API_KEY"), which only finds it if the following has run earlier in the notebook:

from dotenv import load_dotenv
load_dotenv()

Or set the variable directly in your terminal before launching Jupyter:

export GOOGLE_API_KEY="your_api_key_here"
jupyter notebook hybrid_search_ecommerce.ipynb

Project Structure

qdrant-intent-aware-hybrid-search/
├── hybrid_search_ecommerce.ipynb   # Full runnable notebook
├── .env.example                    # Template for environment variables
├── .gitignore
└── README.md

Intent Categories

The LLM classifies each query into one of four intents:

 Intent          Dense  BM25  When
 semantic          0.8   0.2  User describes a need in natural language
 keyword_lookup    0.2   0.8  User searches for a specific product/brand/model
 hybrid            0.5   0.5  Mix of descriptive language and specific terms
 phrase_match      0.4   0.6  User references a specific feature phrase

For phrase_match intents, the LLM also extracts the key phrase, which is used as a Qdrant MatchPhrase filter on the BM25 prefetch for higher precision.
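
The table can be turned into a small lookup that converts the classifier's reply into weights and an optional phrase. The sketch below is hypothetical: the JSON reply shape (`{"intent": ..., "phrase": ...}`), the `SearchPlan` class, the function name, and the fallback to `hybrid` on invalid output are all assumptions, not taken from the notebook. Only the weight values come from the table.

```python
import json
from dataclasses import dataclass
from typing import Optional

# (dense_weight, bm25_weight) per intent, copied from the table above.
INTENT_WEIGHTS = {
    "semantic": (0.8, 0.2),
    "keyword_lookup": (0.2, 0.8),
    "hybrid": (0.5, 0.5),
    "phrase_match": (0.4, 0.6),
}


@dataclass(frozen=True)
class SearchPlan:
    intent: str
    dense_weight: float
    bm25_weight: float
    phrase: Optional[str] = None


def plan_from_llm_json(raw: str) -> SearchPlan:
    """Parse an assumed reply like {"intent": "...", "phrase": "..."}."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in INTENT_WEIGHTS:
        intent = "hybrid"  # assumed fallback: equal weights when the reply is unusable

    phrase = data.get("phrase")
    if intent != "phrase_match" or not isinstance(phrase, str) or not phrase.strip():
        phrase = None  # a phrase is only meaningful for phrase_match

    dense_weight, bm25_weight = INTENT_WEIGHTS[intent]
    return SearchPlan(intent, dense_weight, bm25_weight, phrase)


print(plan_from_llm_json('{"intent": "phrase_match", "phrase": "noise cancelling"}'))
```

Keeping the weights in code rather than trusting numbers generated by the LLM is one possible design choice. It keeps the ranking behaviour deterministic for a given intent.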


Key Qdrant Features Used

Weighted RRF -- Assign different importance to each retrieval source:

query=models.RrfQuery(
    rrf=models.Rrf(weights=[bm25_weight, dense_weight])
)
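
For context, here is a hedged sketch of how that `RrfQuery` might sit inside a full `query_points` call with two prefetches. Several details are assumptions: the dense vector name `"dense"`, the prefetch limit of 20, the `"Qdrant/bm25"` model id for the sparse query, and the premise that weights are matched positionally to the prefetch list (as the snippet above suggests by listing the BM25 weight first). The helper `ordered_weights` is hypothetical and only exists to keep the weight order tied to the prefetch order.

```python
def ordered_weights(prefetch_order, weight_by_source):
    """Return weights in the same order as the prefetch list."""
    return [weight_by_source[name] for name in prefetch_order]


def hybrid_search(client, collection, query_text, dense_vector,
                  dense_weight, bm25_weight, limit=5):
    # Imported lazily so ordered_weights() works without qdrant-client installed.
    from qdrant_client import models

    prefetch_order = ["bm25", "dense"]
    prefetch = [
        models.Prefetch(
            # Assumed: BM25 sparse vector produced from raw text via this model id.
            query=models.Document(text=query_text, model="Qdrant/bm25"),
            using="bm25",
            limit=20,
        ),
        # dense_vector is the query embedding computed by the caller.
        models.Prefetch(query=dense_vector, using="dense", limit=20),
    ]
    weights = ordered_weights(
        prefetch_order, {"dense": dense_weight, "bm25": bm25_weight}
    )
    return client.query_points(
        collection_name=collection,
        prefetch=prefetch,
        query=models.RrfQuery(rrf=models.Rrf(weights=weights)),
        limit=limit,
        with_payload=True,
    )


print(ordered_weights(["bm25", "dense"], {"dense": 0.8, "bm25": 0.2}))
```

Deriving the weight list from the prefetch order guards against silently swapping the two weights when the prefetches are reordered.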

Phrase Search -- Match exact multi-word phrases, not just individual tokens:

models.FieldCondition(
    key="text",
    match=models.MatchPhrase(phrase="noise cancelling"),
)
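
The difference between phrase matching and plain token matching can be illustrated locally. The sketch below uses a simplified lowercase word tokenizer. The function names and the tokenizer are assumptions for illustration, and Qdrant's actual behaviour depends on how the full-text index on the field is configured. Qdrant's documentation describes phrase matching as needing a full-text index with phrase matching enabled on that field, so check the docs for your version.

```python
import re


def tokens(text):
    """Simplified tokenizer: lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any_token(text, query):
    """Token-level match: true if any query token appears in the text."""
    doc = set(tokens(text))
    return any(token in doc for token in tokens(query))


def matches_phrase(text, phrase):
    """Phrase match: the query tokens must appear contiguously and in order."""
    doc, want = tokens(text), tokens(phrase)
    n = len(want)
    return n > 0 and any(doc[i:i + n] == want for i in range(len(doc) - n + 1))


headphones = "Active Noise Cancelling Headphones"
fan = "Quiet fan: low noise, no cancelling of orders"

print(matches_any_token(fan, "noise cancelling"))      # token overlap only
print(matches_phrase(fan, "noise cancelling"))         # tokens are not adjacent
print(matches_phrase(headphones, "noise cancelling"))  # adjacent and in order
```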

Native BM25 -- Server-side sparse embedding with IDF modifier:

sparse_vectors_config={
    "bm25": models.SparseVectorParams(modifier=models.Modifier.IDF)
}

References

