[Wasm RyuJit] Call Codegen Fixes for R2R#126778
[Wasm RyuJit] Call Codegen Fixes for R2R#126778adamperlin wants to merge 3 commits intodotnet:mainfrom
Conversation
…into locations in the R2R file work - Notably, a new level of indirection is needed
… in helper call codegen in Wasm
…-fix-call-codegen
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR adjusts WebAssembly RyuJIT call lowering/emission for ReadyToRun (R2R) so call targets and PortableEntryPoints (PEPs) are loaded through the correct number of indirections when the host exposes symbols as indirection cells (indirection cell -> PEP -> target).
Changes:
- Add an extra indirection when materializing the R2R indirection-cell argument on WASM so the value passed reflects the PEP rather than the cell itself.
- Update WASM direct-call lowering for
IAT_PVALUEto dereference throughindirection cell -> PEP -> target. - Update WASM helper-call emission to load the PEP from the indirection cell (for managed helpers) and to load the final call target via an additional dereference.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/coreclr/jit/morph.cpp |
Adds a WASM-specific extra indirection when preparing the R2R indirection-cell argument so the runtime sees the expected PEP pointer. |
src/coreclr/jit/lower.cpp |
Ensures WASM IAT_PVALUE direct call targets are lowered with an additional dereference to reach the actual dispatch target. |
src/coreclr/jit/codegenwasm.cpp |
Fixes helper-call emission to load the PEP from the indirection cell and to load the indirect call target by dereferencing the PEP. |
| // Push the call target onto the wasm evaluation stack by dereferencing the PEP. | ||
| // Push the call target onto the wasm evaluation stack by dereferencing the indirection cell | ||
| // and then the PEP pointed to by the indirection cell. | ||
| assert(helperFunction.accessType == IAT_PVALUE); |
There was a problem hiding this comment.
See if it's possible to use the new ADDR_REL reloc type for this to avoid the i32.const <address>; i32.load pattern.
|
Maybe a picture would be useful here? I imagine it something like this: We have an indirection cell that refers to a PEP which refers to a function index, which we then call indirectly. |
| cellAddr->AsIntCon()->gtTargetHandle = (size_t)call->gtCallMethHnd; | ||
| #endif | ||
| GenTree* indir = Ind(cellAddr); | ||
| #ifdef TARGET_WASM |
There was a problem hiding this comment.
The extra indirections related to portable entrypoints may want to be under if (opts.jitFlags->IsSet(JitFlags::JIT_FLAG_PORTABLE_ENTRY_POINTS))
| #endif | ||
| GenTree* indir = Ind(cellAddr); | ||
| #ifdef TARGET_WASM | ||
| indir = Ind(indir); // WebAssembly "function pointers" are actually PortableEntryPoint pointers, and |
There was a problem hiding this comment.
The portable entrypoint needs to be also passed as the last argument. Where does that happen when we take this path?
| #endif | ||
| GenTree* indir = Ind(cellAddr); | ||
| #ifdef TARGET_WASM | ||
| indir = Ind(indir); // WebAssembly "function pointers" are actually PortableEntryPoint pointers, and |
There was a problem hiding this comment.
IAT_PPVALUE exists. Is there a reason it is not used by crossgen when it seems to be exactly what it wants here? It feels odd to return IAT_PVALUE and then have the JIT dereference it twice.
There was a problem hiding this comment.
By the way, I would expect the optimization that is disabled for wasm above to result in large size savings if it was enabled.
There was a problem hiding this comment.
It feels odd to return IAT_PVALUE and then have the JIT dereference it twice.
I think IAT_PVALUE is correct here, and it is going to make more sense once it follows the portable entry point ABI (see my other comment).
There was a problem hiding this comment.
The portable entry point argument is being added in morph, near the morph change in this PR. I am not totally sure what it changes related to the number of indirections needed from the value returned by getFunctionEntryPoint though.
Do you mean that only one indirection is needed on top of the portable entry point argument? It is the optimization that is disabled above -- avoiding the duplication of the indirection cell.
There was a problem hiding this comment.
IAT_PVALUE?
It is notable that with the current Jit-EE contract for portable entrypoints (where they're considered "direct" function pointers), the IAT_* values are now context-dependent. E. g. the function for SGCT PIs also returns "direct" IAT_VALUE values, but they won't need to be indirected.
FWIW, I think that in the ideal world, we would only have one place where the portable entrypoint flag would matter - with indirect managed calli, and the rest of the Jit would continue to operate on "native function pointer" semantics. But it's not where where we are going here it seems.
There was a problem hiding this comment.
getAddressOfPInvokeTarget function.
PInvoke targets are unmanaged entrypoints. Unmanaged entrypoints are always direct. Managed entrypoints are always PEP. PEP vs. non-PEP Is implied by managed vs. unmanaged calling convention.
PEP ABI
The actual function pointer and the hidden argument must always come from the same pep value. For example, the following code is not valid with PEP ABI:
void* pep1 = <globally visible memory location>;
void* pep2 = <globally visible memory location>;
(*(void**)pep1)(arg0, arg1, ..., argN, pep2)
It will be eventually crucial for multithreading. Before multithreading, it can lead to subtle bugs if there is code with side-effects inserted between the two reads of globally visible memory location and the side-effects happen to update the globally visible memory location. (R2R indirection cells are globally visible memory locations for the purpose of this example.)
There was a problem hiding this comment.
The optimization we do for other targets (which certainly cannot be incorrect since the the indirection cell is a constant in those cases) can certainly be done in IR to make it robust and make it work in WASM. In fact this old change was doing exactly that: #65388
There was a problem hiding this comment.
PInvoke targets are unmanaged entrypoints. Unmanaged entrypoints are always direct. Managed entrypoints are always PEP. PEP vs. non-PEP Is implied by managed vs. unmanaged calling convention.
It is odd in the JIT to have IAT_VALUE mean "perform a call by calling with this index" in one case and IAT_VALUE mean "perform a call by indirecting then calling with this index" in another. I do not see what the calling convention has to do with anything here, we are generating an instruction in the end that takes an operand and it has no thoughts about calling conventions.
There was a problem hiding this comment.
Something else to consider (in favor of non-contextual IAT_VALUE reading) is that we may legitimately want to call managed methods directly (WASM call) even with portable entrypoints, e. g. for known helpers that we'll import or functions in the same module (assuming we'll be able to make that optimization at some point).
There was a problem hiding this comment.
I do not see what the calling convention has to do with anything here
https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/clr-abi.md#portable-entrypoints talks about it as managed calling convention (look for "the managed calling convention is modified as follows:")
If it is too confusing, we can use different terminology and assign different meaning to IAT_... enum with portable entry points at the JIT/EE interface level as long as the actual code remains the same. If we do changes along these lines, we should change the runtime JIT/EE interface used by the interpreter to match.
Something else to consider (in favor of non-contextual IAT_VALUE reading) is that we may legitimately want to call managed methods direct
Yes, we have discussed that as a potential future optimization. In theory, it can be used for managed method -> managed method calls too.
|
|
||
| #if defined(TARGET_WASM) | ||
| // On WASM, the address in the R2R table is actually the address of something that should | ||
| // be treated as a PortableEntryPoint. To actually dispatch, we need to indirect once more. |
There was a problem hiding this comment.
What is the reason we can't avoid this indirection? Requirement to allocate "runtime-shaped" PEPs dynamically (can the R2R cell be a superclass of a PEP layout-wise, or vice-versa)?
This is another cherry-picked commit from #126662; There are a few cases in Wasm codegen where we needed an extra layer of indirection to load a PEP or call target, since the host provides symbols which resolve to indirection cells for calls, and we have
indirection cell -> PEP -> target method.