
perf: Defer black import to first use #229

Merged: gvanrossum merged 4 commits into microsoft:main from KRRT7:perf/defer-black on Apr 10, 2026
Conversation

@KRRT7 (Contributor) commented Apr 10, 2026

Stack: 2/4 — depends on #231. Merge #231 first, then this PR.


  • Defer `import black` from module level to first use in `answers.py` and `utils.py`
  • `black` (the code formatter plus transitive deps: `pathspec`, `black.nodes`, etc.) was loaded on every `import typeagent` but is only used in two cold formatting paths

black.format_str() is called in two places:

  • create_context_prompt() in knowpro/answers.py — formats debug context for LLM prompts
  • format_code() in aitools/utils.py — developer pretty-print utility

Neither runs during normal library operation. Moving the import inside each function eliminates ~78ms of transitive module loading from the import chain.
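The change itself is small. A minimal sketch of the deferred-import pattern (the function body is illustrative; the real call sites are in `answers.py` and `utils.py`):

```python
import sys

def format_code(code: str) -> str:
    # After this PR: black is imported on first call, so merely importing
    # the library never loads black or its transitive dependencies.
    import black  # deferred import; only two cold paths reach this line
    return black.format_str(code, mode=black.Mode())

# Defining the function (or importing a module that contains it)
# does not load black:
assert "black" not in sys.modules
```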

Benchmark

Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13

Import Time (hyperfine, warmup 5, min-runs 30)

| Benchmark | Before | After | Speedup |
|:---|---:|---:|---:|
| `import typeagent` | 791 ms ± 11 ms | 713 ms ± 8 ms | **1.11x** |

Offline E2E Test Suite (hyperfine, warmup 2, min-runs 10)

| Benchmark | Before | After | Speedup |
|:---|---:|---:|---:|
| 69 offline tests | 5.72 s ± 90 ms | 5.60 s ± 98 ms | **1.02x** |

Generated by codeflash optimization agent

KRRT7 added 3 commits April 9, 2026 22:45
uv 0.10.x is current; the <0.10.0 constraint caused build warnings.

parse_azure_endpoint returned the raw URL including ?api-version=..., which AsyncAzureOpenAI then mangled into invalid paths like ...?api-version=2024-06-01/openai/. Strip the query string before returning — api_version is already returned as a separate value and passed to the SDK independently.
black is only used in create_context_prompt() and format_code() -- both cold paths. Moving the import inside the functions avoids loading black and its transitive deps (pathspec, black.nodes, etc.) on every import typeagent.
@gvanrossum (Collaborator) left a comment:


Should have done this a long time ago. :-(

@gvanrossum merged commit 63701f0 into microsoft:main on Apr 10, 2026
15 checks passed
@KRRT7 (Contributor, author) commented Apr 10, 2026

Thanks! I wasn't sure you wanted import-time optimizations; if you do, I have a few more ready to go.

@gvanrossum (Collaborator) commented:

This one is dear to my heart because I wish we could remove black from the mandatory dependencies. Happy to review more.

@KRRT7 (Contributor, author) commented Apr 10, 2026

why can't you remove black? out of curiosity

@gvanrossum (Collaborator) commented:

> why can't you remove black? out of curiosity

I could, but it gives the nicest formatted expressions when doing hardcore debugging. What we could do is import it conditionally and just use repr() if black cannot be imported. Then we can remove it from the main deps. Developers have the dev tools installed (pyright, black, isort, pytest, etc.) and will see the black-formatted debug output.
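A sketch of that conditional-import idea (`format_value` is a hypothetical helper name; the fallback degrades to `repr()` when black is absent):

```python
def format_value(value: object) -> str:
    # Conditional import: use black's formatting when the dev tools are
    # installed, otherwise degrade gracefully to a plain repr().
    try:
        import black
    except ImportError:
        return repr(value)
    return black.format_str(repr(value), mode=black.Mode()).rstrip("\n")
```

Either branch produces usable debug output; only developers with black installed see the nicer line-wrapped form.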

@KRRT7 (Contributor, author) commented Apr 10, 2026

Went ahead and did this — #235 replaces black.format_str with pprint.pformat + ast.literal_eval in both call sites (stdlib only, no conditional imports). Moved black from dependencies to the dev dependency-group so make format/make check still work but library consumers don't pull it in.
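A sketch of that stdlib-only replacement (assumes the input is the `repr` of a Python literal; the actual code in #235 may differ):

```python
import ast
from pprint import pformat

def format_debug(text: str) -> str:
    # Parse the repr back into data with ast.literal_eval, then let
    # pprint produce readable, line-wrapped output; non-literal input
    # is returned unchanged.
    try:
        return pformat(ast.literal_eval(text), width=88)
    except (SyntaxError, ValueError):
        return text
```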

bmerkle added a commit that referenced this pull request Apr 22, 2026
**Stack: 3/4** — depends on #229. Merge #231, #229, then this PR.

---

- Add `add_terms_batch` and `add_properties_batch` to `ITermToSemanticRefIndex` and `IPropertyToSemanticRefIndex` interfaces
- SQLite backend uses `executemany` instead of individual `cursor.execute()` calls (~1000+ calls per indexing batch reduced to 2-3)
- Restructure `add_metadata_to_index_from_list` and `add_to_property_index` to collect all data first (pure functions), then batch-insert
- Memory backend implements batch methods as loops for interface compatibility
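The batching pattern, sketched against an illustrative schema (table and column names here are placeholders, not the project's actual DDL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term_index (term TEXT, semref_id INTEGER)")

rows = [("alice", 1), ("bob", 2), ("alice", 3)]

# Before: one cursor.execute() per row, ~1000+ calls per indexing batch.
# After: a single executemany call amortizes the per-call overhead.
conn.executemany("INSERT INTO term_index VALUES (?, ?)", rows)
conn.commit()
```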

## Benchmark

### Azure Standard_D2s_v5 -- 2 vCPU, 8 GiB RAM, Python 3.13

#### Indexing Pipeline (pytest-async-benchmark pedantic, 20 rounds, 3
warmup)

Only the hot path (`add_messages_with_indexing`) is timed -- DB
creation, storage init, and teardown are excluded.

| Benchmark | Before (min) | After (min) | Speedup |
|:---|---:|---:|---:|
| `add_messages_with_indexing` (200 msgs) | 28.8 ms | 25.0 ms | **1.16x** |
| `add_messages_with_indexing` (50 msgs) | 7.8 ms | 6.7 ms | **1.16x** |
| VTT ingest (40 msgs) | 6.9 ms | 6.1 ms | **1.14x** |

Consistent ~14-16% improvement -- `executemany` amortizes per-call
overhead.

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

Save the benchmark file below as
`tests/benchmarks/test_benchmark_indexing.py`, then:

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio

# Run on main
git checkout main
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s

# Run on this branch
git checkout perf/batch-inserts
python -m pytest tests/benchmarks/test_benchmark_indexing.py -v -s
```
</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>
bmerkle added a commit that referenced this pull request Apr 22, 2026
**Stack: 4/4** — depends on #230. Merge #231, #229, #230, then this PR.

---

- Five call sites used `get_item()` per scored ref — one SELECT and full deserialization per match (N+1 pattern)
- Added `get_metadata_multiple` to `ISemanticRefCollection` that fetches only `semref_id, range_json, knowledge_type` in a single batch query
- Replaced the N+1 loop with one `get_metadata_multiple` call at each site
- Further optimized scope-filtering: binary search in `contains_range`, inline tuple comparisons in `TextRange`, skip pydantic validation in `get_metadata_multiple`

### Call sites optimized

1. `lookup_term_filtered` — batch metadata, filter by knowledge_type/range
2. `lookup_property_in_property_index` — batch metadata, filter by range scope
3. `SemanticRefAccumulator.group_matches_by_type` — batch metadata, group by knowledge_type
4. `SemanticRefAccumulator.get_matches_in_scope` — batch metadata, filter by range scope
5. `get_scored_semantic_refs_from_ordinals_iter` — two-phase: metadata filter then batch fetch
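A sketch of the batched lookup these call sites share (illustrative schema; the column names follow the description above, the rest is hypothetical):

```python
import json
import sqlite3

def get_metadata_multiple(conn, semref_ids):
    # One SELECT with an IN (...) list replaces one query per scored
    # ref (the N+1 pattern), and fetches only the metadata columns.
    placeholders = ",".join("?" * len(semref_ids))
    rows = conn.execute(
        "SELECT semref_id, range_json, knowledge_type"
        f" FROM semantic_refs WHERE semref_id IN ({placeholders})",
        list(semref_ids),
    )
    return {sid: (json.loads(rj), kt) for sid, rj, kt in rows}

# Demo data for the sketch:
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE semantic_refs"
    " (semref_id INTEGER, range_json TEXT, knowledge_type TEXT)"
)
conn.executemany(
    "INSERT INTO semantic_refs VALUES (?, ?, ?)",
    [(1, '{"start": 0}', "entity"), (2, '{"start": 5}', "topic")],
)
meta = get_metadata_multiple(conn, [1, 2])
```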

### Additional optimizations

- **Binary search in `TextRangeCollection.contains_range`**: replaced O(n) linear scan with `bisect_right` keyed on `start`, reducing scope-filtering from ~25ms to ~9ms
- **Inline tuple comparisons in `TextRange`**: replaced `TextLocation` allocations in `__eq__`/`__lt__`/`__contains__` with a shared `_effective_end` returning tuples
- **Skip pydantic validation in `get_metadata_multiple`**: construct `TextLocation`/`TextRange` directly from JSON instead of going through `__pydantic_validator__`
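The `contains_range` change can be sketched with plain tuples standing in for `TextRange` (assumes the collection is kept sorted by range start, which `bisect_right` requires):

```python
from bisect import bisect_right

def contains_range(ranges, query):
    # `ranges` is a sorted list of (start, end) tuples. Binary-search for
    # the range with the greatest start <= query start, then check
    # containment: O(log n) instead of the old O(n) linear scan.
    q_start, q_end = query
    i = bisect_right(ranges, (q_start, float("inf"))) - 1
    if i < 0:
        return False
    start, end = ranges[i]
    return start <= q_start and q_end <= end
```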

## Benchmark

### Azure Standard_D2s_v5 — 2 vCPU, 8 GiB RAM, Python 3.13

#### Query (pytest-async-benchmark pedantic, 200 rounds)

200 matches against a 200-message indexed SQLite transcript. Only the
function under test is timed.

| Function | Before (median) | After (median) | Speedup |
|:---|---:|---:|---:|
| `lookup_term_filtered` | 2.650 ms | 1.184 ms | **2.24x** |
| `group_matches_by_type` | 2.428 ms | 978 μs | **2.48x** |
| `get_scored_semantic_refs_from_ordinals_iter` | 2.541 ms | 2.946 ms | 0.86x |
| `lookup_property_in_property_index` | 25.306 ms | 9.365 ms | **2.70x** |
| `get_matches_in_scope` | 25.011 ms | 9.160 ms | **2.73x** |

<details>
<summary><b>Reproduce the benchmark locally</b></summary>

```bash
pip install 'pytest-async-benchmark @ git+https://github.com/KRRT7/pytest-async-benchmark.git@feat/pedantic-mode' pytest-asyncio
python -m pytest tests/benchmarks/test_benchmark_query.py -v -s
```
</details>

---

*Generated by codeflash optimization agent*

---------

Co-authored-by: Bernhard Merkle <bernhard.merkle@gmail.com>