Skip to content

Stub: allow custom imports#5877

Open
Tpt wants to merge 4 commits intoPyO3:mainfrom
Tpt:tpt/custom-imports
Open

Stub: allow custom imports#5877
Tpt wants to merge 4 commits intoPyO3:mainfrom
Tpt:tpt/custom-imports

Conversation

@Tpt
Copy link
Copy Markdown
Contributor

@Tpt Tpt commented Mar 13, 2026

This feature is required to allow user-provided custom type hint to import types

These imports are written in a small DSL on the module like:

#[pymodule(stubs = {
    from datetime import datetime as dt, time;
    from uuid import UUID;
})]

Then parsed, sent as an AST inside the introspection data (following the same AST format as the type hints) and serialized by the introspection crate that merges these imports with the auto generated ones

The #[pymodule] parameter is named stub because we might include some other features in the future like protocols

These imports are written in a small DSL like:
```rust
#[pymodule(stubs = {
    from datetime import datetime as dt, time;
    from uuid import UUID;
})]
```
Then parsed, sent as an AST inside the introspection data (following the same AST format as the type hints) and serialized by the introspection crate that merges these imports with the auto generated ones

The `#[pymodule]` parameter is named `stub` because we might include some other features in the future like protocols
@Tpt Tpt force-pushed the tpt/custom-imports branch from 3ac8fee to 4fbc371 Compare March 13, 2026 14:02
@Tpt Tpt marked this pull request as ready for review March 13, 2026 14:51
Copy link
Copy Markdown
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have a feeling it'll take a couple of rounds of review to figure out my preferred design on this one! Please bear with me 🙏

Comment thread pytests/src/annotations.rs Outdated
Comment on lines +6 to +7
from datetime import datetime as dt, time;
from uuid import UUID;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't decide how I feel about having full Python syntax embedded within the rust source like this. What are the alternatives? e.g. could be text, then it would be possible to use include_str! to maintain the stub as external Python file. That would also avoid the limitations of formatting & Rust syntax.

Possibly also interesting is https://github.com/m-ou-se/inline-python

Copy link
Copy Markdown
Contributor Author

@Tpt Tpt Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two advantages of the current design:

  • having the imports close to the code (including the custom type annotation) makes it easier to get a single source of truth
  • having it parsed by a macro means we validate the syntax and send it to pyo3-introspection as an AST. This makes quite easy for the stub generation code to add to these custom imports the extra ones that are needed to for the type annotations generated by PyO3 and get a clean output.

I fear that skipping parsing step would make imports management quite painful and if we have to pay the parsing price, then it's nice to have such an embedding.
But I don't feel strongly on that, glad to pivot the MR toward something else if it's better.

if we go the embedding way, I am not sure about the ;
Pros:

  • if we allow more and more Python syntax including top level expressions it is required if we want to keep being able to use syn
  • it might make autoformatting possible

Cons:

  • departs from Python and confusing for the users

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having it parsed by a macro means we validate the syntax and send it to pyo3-introspection as an AST. This makes quite easy for the stub generation code to add to these custom imports the extra ones that are needed to for the type annotations generated by PyO3 and get a clean output.

This is compelling and makes a lot of sense. Let's keep it inline in the Rust source.

Regarding the ;, I would be keen to drop the requirement for it now, I think it'll make adding future possibilities like Protocol statements hard. I don't think the autoformatting is a real possibility without a lot of future extensions for rustfmt.

Looking at inline-python's source, I think we might be able to use span locations to properly detect whitespace to skip the ; requirement on 1.88+

That might be a good middle ground? I think we can probably still use syn to parse, we definitely wouldn't be able to use Punctuated, but a custom Parse implementation might still be able to collect the Python statements to a Vec...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Make sense! I'm going to experiment in this direction

Copy link
Copy Markdown
Contributor Author

@Tpt Tpt Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 2494f61 Just parsing until end of input works because each python statement is well delimited (no trailing punctuation). Care about line jump will be required when parsing protocols.

I have added a check to ensure that imports are on a single line in 7a2aace

I have also fixed dotted names support: d97c135

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally opposed to anything resembling "code" inside attributes.

some more cons are:

  • no auto formatting inside attributes
  • if the user gets something wrong the error messages can get horrific
  • additional complexity of having and maintaining a parser for all that stuff

What does an alternative design look like?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mejrs Great question!

I see two alternative designs:

  1. mandate users to write full import paths in their custom type annotations. So, no Iterator[datetime] but collections.abc.Iterator[datetime.datetime]. However to generate the imports we need to split in the dotted-names (like module.submodule.Class.NestedClass) into the module part (module.submodule) the class-name (Class) and the nested class part if present (NestedClass). I see multiple sub-options:
    a. not supporting nested classes at all
    b. an heuristic based on upper vs lower-case (that will likely fail on some painful modules like PIL.Image)
    c. mandating the user to split it themselves with some Pyo3-bespoke DSL.
    I am not sure to like any option.
  2. Let people write this in some module_extra_stubs.pyi (name TBD) file and then concatenate the file with the stubs generated by PyO3. However, we need there some logic to avoid name conflicts if e.g. the extra stubs contain a from this_module import foo for a custom type annotation but the auto-generate stubs create a from other_module import foo for an other auto-generated type annotation. I don't see how to escape that without parsing module_extra_stubs.pyi, going back to "additional complexity of having and maintaining a parser for all that stuff". And we loose the nice property of having these small extra stubs colocated with the code.

What do you think? If you have a better idea, I would love that, this bespoke parser is indeed gonna be a pain.

Copy link
Copy Markdown
Member

@mejrs mejrs Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe like

#[pymodule(stubs(dt = datetime::datetime, UUID = uuid::UUID))]

perhaps less convenient, but it is valid MetaItem syntax (i believe), so that would ease my concern about horrific error messages

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Thanks!

but it is valid MetaItem syntax (i believe)

syn::MetaList value is an arbitrary TokenStream so I am not sure

#[pymodule(stubs(dt = datetime::datetime, UUID = uuid::UUID))]

would return better errors than

#[pymodule(stubs = { import datetime.datetime as dt; import uuid.UUID })]

because the parsing of whatever is inside of #[pymodule()] will rely on a syn-based parser anyway. So, not sure departing from Python syntax brings us much.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would use https://docs.rs/syn/latest/syn/struct.Attribute.html#method.parse_nested_meta I believe.

because the parsing of whatever is inside of #[pymodule()] will rely on a syn-based parser anyway. So, not sure departing from Python syntax brings us much.

Whatever the user writes still has to be somewhat valid, e.g. no unbalanced braces and curly braces. If they make a typo but it looks somewhat like MetaItem syntax then the compiler might still recover and put out a decent error message. That is what I meant with horrific error messages.

Comment on lines +48 to +52
/// A Python statement
enum PyStatement {
ImportFrom(PyImportFrom),
Import(PyImport),
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why limit the stubs to just these statements - is that just to keep initial implementation scope under control?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I guess we will expend this list to support things like protocol

@Tpt Tpt force-pushed the tpt/custom-imports branch from 7a2aace to 6cf6e1d Compare April 15, 2026 08:00
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Apr 15, 2026

Merging this PR will degrade performance by 10.73%

❌ 1 regressed benchmark
✅ 104 untouched benchmarks
⏩ 1 skipped benchmark1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
bench_pyclass_create 3.9 µs 4.4 µs -10.73%

Comparing Tpt:tpt/custom-imports (6cf6e1d) with main (962a535)

Open in CodSpeed

Footnotes

  1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants