
Batch metadata query to avoid N+1 across 5 call sites #7

Open
KRRT7 wants to merge 6 commits into main from perf/lightweight-query

Conversation

Owner

@KRRT7 KRRT7 commented Apr 10, 2026

Summary

  • Five call sites used get_item() per scored ref — one SELECT and full deserialization per match (N+1 pattern)
  • Added get_metadata_multiple to ISemanticRefCollection that fetches only semref_id, range_json, knowledge_type in a single batch query
  • Replaced the N+1 loop with one get_metadata_multiple call at each site
  • Further optimized scope-filtering: binary search in contains_range, inline tuple comparisons in TextRange, skip pydantic validation in get_metadata_multiple

Call sites optimized

  1. lookup_term_filtered — batch metadata, filter by knowledge_type/range
  2. lookup_property_in_property_index — batch metadata, filter by range scope
  3. SemanticRefAccumulator.group_matches_by_type — batch metadata, group by knowledge_type
  4. SemanticRefAccumulator.get_matches_in_scope — batch metadata, filter by range scope
  5. get_scored_semantic_refs_from_ordinals_iter — two-phase: metadata filter then batch fetch
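The batch helper at the core of these changes can be sketched roughly as follows. This is a minimal sketch assuming a SQLite table named `semantic_refs` with the columns named above (`semref_id`, `range_json`, `knowledge_type`, plus the expensive `knowledge_json`); the real `ISemanticRefCollection` API and schema may differ:

```python
import json
import sqlite3


def get_metadata_multiple(
    conn: sqlite3.Connection, ordinals: list[int]
) -> list[tuple[int, dict, str]]:
    """Fetch only (semref_id, range, knowledge_type) for a batch of refs
    in a single SELECT, instead of one get_item() per ref (the N+1 pattern).

    The expensive knowledge_json column is never read or deserialized.
    """
    placeholders = ",".join("?" * len(ordinals))
    rows = conn.execute(
        "SELECT semref_id, range_json, knowledge_type "
        f"FROM semantic_refs WHERE semref_id IN ({placeholders})",
        ordinals,
    ).fetchall()
    return [(sid, json.loads(range_json), kt) for sid, range_json, kt in rows]
```

Each call site then iterates the returned metadata tuples once, filtering or grouping without ever materializing the full semantic-ref objects.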

Additional optimizations

  • Binary search in TextRangeCollection.contains_range: replaced O(n) linear scan with bisect_right keyed on start, reducing scope-filtering from ~24ms to ~9ms
  • Inline tuple comparisons in TextRange: replaced TextLocation allocations in __eq__/__lt__/__contains__ with a shared _effective_end returning tuples
  • Skip pydantic validation in get_metadata_multiple: construct TextLocation/TextRange directly from JSON instead of going through __pydantic_validator__

Bugfix included

parse_azure_endpoint was returning the full URL with query string (?api-version=...), which AsyncAzureOpenAI mangled into a double-path. Now strips the query string before returning. Added 6 unit tests.

Benchmarks (pytest-async-benchmark pedantic, 200 rounds, Standard_D2s_v5)

200 matches against a 200-message indexed SQLite transcript. Only the function under test is timed.

| Function | main (median) | optimized (median) | Speedup |
| --- | --- | --- | --- |
| lookup_term_filtered | 2.652 ms | 1.260 ms | 2.10x |
| group_matches_by_type | 2.453 ms | 992 μs | 2.47x |
| get_scored_semantic_refs_from_ordinals_iter | 2.511 ms | 2.979 ms | 0.84x |
| lookup_property_in_property_index | 24.484 ms | 9.376 ms | 2.61x |
| get_matches_in_scope | 24.062 ms | 9.185 ms | 2.62x |

The scope-filtering benchmarks (#4, #5) improved from 1.08x/1.06x to 2.61x/2.62x after adding binary search in contains_range and inline tuple comparisons. The scored-refs benchmark trades N get_item calls for metadata + get_multiple, roughly breaking even.

How to run

pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio
uv run python -m pytest tests/benchmarks/test_benchmark_query.py -v -s

Test plan

  • All existing offline tests pass (456 passed, 7 skipped due to Azure credentials)
  • Benchmark shows consistent improvement across multiple runs on VM
  • 5 benchmarks cover all optimized call sites

@KRRT7 KRRT7 changed the title Batch metadata query to avoid N+1 in lookup_term_filtered Batch metadata query to avoid N+1 across 5 call sites Apr 10, 2026
@KRRT7 KRRT7 force-pushed the perf/lightweight-query branch from 038ca7c to 738f4df Compare April 10, 2026 09:34
KRRT7 added 4 commits April 10, 2026 04:37
lookup_term_filtered called get_item() per scored ref — one SELECT and
full deserialization per match. The filter only needs knowledge_type
(a plain column) and range (json.loads of range_json), never the
expensive knowledge_json deserialization (64% of per-row cost).

Add get_metadata_multiple to ISemanticRefCollection that fetches only
semref_id, range_json, knowledge_type in a single batch query. Replace
the N+1 loop in lookup_term_filtered with one get_metadata_multiple call.

Benchmark (200 matches, 200 rounds): 4.38ms → 1.32ms (3.3x speedup).
Apply the same get_metadata_multiple pattern from lookup_term_filtered
to four more sites that called get_item() in a loop:

- propindex.lookup_property_in_property_index: filter by .range
- SemanticRefAccumulator.group_matches_by_type: group by .knowledge_type
- SemanticRefAccumulator.get_matches_in_scope: filter by .range
- answers.get_scored_semantic_refs_from_ordinals_iter: two-phase
  metadata filter then batch get_multiple for matching full objects

All sites now use a single batch query instead of N individual SELECTs,
skipping knowledge_json deserialization where only range or
knowledge_type is needed.
parse_azure_endpoint returned the raw URL including ?api-version=...
which AsyncAzureOpenAI then mangled into invalid paths like
...?api-version=2024-06-01/openai/. Strip the query string before
returning — api_version is already returned as a separate value and
passed to the SDK independently.
@KRRT7 KRRT7 force-pushed the perf/lightweight-query branch from 738f4df to 8332c44 Compare April 10, 2026 09:38
KRRT7 added 2 commits April 10, 2026 04:42
…arisons

- Use bisect_right with key=start in TextRangeCollection.contains_range
  to skip O(n) linear scan (O(log n) for non-overlapping point ranges)
- Replace TextLocation allocations in TextRange __eq__/__lt__/__contains__
  with a shared _effective_end returning tuples
- Skip pydantic validation in get_metadata_multiple by constructing
  TextLocation/TextRange directly from JSON