Skip to content

fix for the issue#42#44

Open
pavanjava wants to merge 1 commit into
mainfrom
qql22
Open

fix for the issue#42#44
pavanjava wants to merge 1 commit into
mainfrom
qql22

Conversation

@pavanjava
Copy link
Copy Markdown
Owner

@pavanjava pavanjava commented May 24, 2026

Summary by CodeRabbit

  • Refactor
    • Optimized collection validation and vector dimension handling to reduce unnecessary metadata lookups, improving performance for search, update, and collection alteration operations.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a3d9e44e-ea54-43be-bdb0-cdf00de5799c

📥 Commits

Reviewing files that changed from the base of the PR and between c4a6301 and ff19012.

📒 Files selected for processing (1)
  • src/qql/executor.py

📝 Walkthrough

Walkthrough

This PR consolidates collection metadata fetching by caching dense vector sizes in CollectionTopology and refactoring multiple execution methods to rely on a single topology resolution call instead of repeated collection existence and size checks.

Changes

Topology Metadata Caching

Layer / File(s) Summary
CollectionTopology contract and dense size caching
src/qql/executor.py
CollectionTopology adds dense_sizes field and dense_size_map() method to expose cached dense vector sizes as {name: size}, with "" representing the unnamed single-vector case.
Topology resolution with dense size extraction
src/qql/executor.py
_resolve_topology() now extracts and caches dense vector sizes alongside dense vector names, supporting both named vectors (iterating entries) and single unnamed vectors (reading vectors.size).
Consolidate collection existence checks across execution flows
src/qql/executor.py
_execute_alter_collection(), _execute_search(), and _execute_update_vector() now check topology.exists after resolving topology, eliminating redundant direct collection_exists() calls.
Dimension validation using cached topology sizes
src/qql/executor.py
_ensure_collection() validates vector dimensions against topology.dense_size_map() for existing collections, removing the separate get_collection() inspection and raising QQLRuntimeError on mismatch.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

A rabbit hops through topology trees,
Caching sizes with newfound ease,
No more duplicate calls to make—
One resolve, then validation's cake! 🐰✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'fix for the issue#42' is vague and generic, providing no meaningful information about what the actual change accomplishes. Replace with a descriptive title that summarizes the main change, such as 'Optimize collection existence checks using topology caching' or 'Cache dense vector sizes in CollectionTopology'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch qql22

Comment @coderabbitai help to get the list of available commands and usage tips.

@srimon12
Copy link
Copy Markdown
Collaborator

The change is in the right direction: caching dense vector sizes in CollectionTopology removes the extra get_collection() from _ensure_collection().

Before merging, I think we should complete the scope of issue #42 in this PR:

  1. Make _resolve_topology() the single metadata fetch path.

    • For existing collections, avoid collection_exists() + get_collection().
    • Prefer one get_collection() call and handle only the collection-not-found case as exists=False.
  2. Split API access from topology parsing.

    • Keep _resolve_topology() responsible for fetching.
    • Move vector/sparse parsing into a small helper like _topology_from_collection_info(info).
  3. Add regression tests for the actual API-call behavior.

    • Existing dense INSERT should fetch collection metadata once.
    • Existing hybrid auto-detect INSERT should fetch collection metadata once.
    • Dimension mismatch should still raise without a second metadata fetch.
    • Nonexistent collection flow should still create or fail as before.
  4. Link the PR to the issue.

    • Add Fixes #42 once the above is complete.

This would turn the current partial optimization into the full fix: one resolved topology object reused for both routing and validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants