Add flat LightExpression model with direct FE builder API, fill remaining LE conversion gaps, add LE vs FE creation benchmark, and start deterministic FE/LE property testing#521

Merged: dadhi merged 15 commits into master from copilot/optimize-expression-representation on Apr 21, 2026

Contributor

Copilot AI commented Apr 20, 2026

This change introduces a flat, intrusive-array expression representation for LightExpression-oriented workflows, with round-trip conversion back to System.Linq.Expressions and LightExpression. It also closes the remaining ToLightExpression gaps so the flat model can cover the expression surface area LE already supports today, adds a direct FE factory surface so flat expressions can be built without going through LE first, adds a focused benchmark to compare LightExpression and FlatExpression creation costs on the same moderate real-world shape, and starts deterministic property-style coverage for FE/LE structural round-trips on a bounded generated subset.

  • Flat expression model

    • Added FastExpressionCompiler.FlatExpression.ExprTree and ExprNode
    • Stores expressions as a compact node array plus a separate closure-constant array
    • Preserves identity-sensitive nodes such as parameters and labels across round-trip reconstruction
  • Direct FE builder API

    • Added LightExpression-like factory methods directly on ExprTree
    • Supports direct FE construction for the currently covered node shapes, including:
      • parameters / variables / constants / defaults
      • unary and binary nodes
      • calls, member access, arrays, blocks, lambdas
      • member init / list init bindings
      • labels, gotos, loops, switch, try/catch/finally/fault
      • type tests, dynamic, runtime variables, debug info
    • This makes FE usable as the primary build representation while LE remains the fallback / verification path for now
  • Conversion pipeline

    • Added conversion from:
      • System.Linq.Expressions.Expression -> ExprTree
      • FastExpressionCompiler.LightExpression.Expression -> ExprTree
    • Added conversion back to:
      • ExprTree -> System.Linq.Expressions.Expression
      • ExprTree -> FastExpressionCompiler.LightExpression.Expression
    • Keeps compiler integration out of scope for now; this PR provides representation, direct construction, and conversion
  • LE compatibility coverage

    • Extended ToLightExpression to handle the remaining supported LE node kinds that were still missing:
      • Dynamic
      • RuntimeVariables
      • DebugInfo
  • Behavioral coverage

    • Added focused tests for:
      • LightExpression -> FlatExpression -> LightExpression round-trip
      • closure constant collection
      • parameter / label identity preservation
      • Dynamic, RuntimeVariables, and DebugInfo conversion paths
      • direct FE construction with the new ExprTree factory API
      • direct FE control-flow construction
  • Performance coverage

    • Added LightExprVsFlatExpr_Create_ComplexExpr to the existing benchmarks project
    • Compares relative creation-time performance and allocation between LightExpression and FlatExpression for the same moderate complex object-graph construction scenario
    • Reuses a shared CreateComplexFlatExpression helper so the LE and FE benchmark shapes stay aligned
  • Deterministic property coverage

    • Added a bounded deterministic property-style test in LightExpressionTests
    • Uses CsCheck 4.6.2 on net8.0/net9.0 to drive fixed-seed sampling
    • Generates valid matching FE and LE expression trees for a core recursive subset with explicit depth and breadth limits
    • Converts FE back to LE and compares the result structurally against the directly built LE tree
    • Keeps the generator local and validity-driven by construction instead of depending on a broader unmerged comparer port from master

Example:

using FastExpressionCompiler.FlatExpression;

var fe = default(ExprTree);
var x = fe.Parameter(typeof(int), "x");
var body = fe.Add(x, fe.ConstantInt(1));
fe.RootIndex = fe.Lambda(typeof(Func<int, int>), body, x);

var sysExpr = fe.ToExpression();
var lightExpr = fe.ToLightExpression();
  • Known follow-ups
    • Metadata is still stored as runtime objects (Type, MemberInfo, MethodInfo, binders), so this is not yet a fully serialization-ready wire format
    • FromLightExpression currently flattens via System.Linq.Expressions rather than directly from LE nodes
    • the direct builder currently uses leaf cloning when linking reused child references in the intrusive sibling-chain layout
    • no compiler integration, ToCSharpString, or ToExprString changes in this PR
    • benchmark coverage is currently fixed-shape and stays in the existing BenchmarkDotNet harness
    • current deterministic property coverage is intentionally limited to a valid generated subset and local structural comparison; broader fuzzy/invariant coverage can expand from there
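
The split between a compact node array and a separate closure-constant array described above can be sketched roughly as follows. This is only an illustration: the names (`FlatConstantsSketch`, `Node`, `Inlined`, `Data`) are hypothetical, and the rule for which constants are inlined is an assumption, so the PR's actual `ExprTree` / `ExprNode` layout may well differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not the PR's ExprTree: small constants live inside the node,
// everything else goes into a side array that a compiled closure could capture.
public sealed class FlatConstantsSketch
{
    public struct Node
    {
        public Type Type;
        public bool Inlined; // true: Data holds the value; false: Data indexes ClosureConstants
        public long Data;
    }

    public readonly List<Node> Nodes = new List<Node>();
    public readonly List<object> ClosureConstants = new List<object>();

    // Adds a constant node and returns its index in the node array.
    public int AddConstant(object value)
    {
        var node = new Node { Type = value.GetType() };
        if (value is int i) { node.Inlined = true; node.Data = i; }
        else if (value is bool b) { node.Inlined = true; node.Data = b ? 1 : 0; }
        else { node.Data = ClosureConstants.Count; ClosureConstants.Add(value); }
        Nodes.Add(node);
        return Nodes.Count - 1;
    }

    // Reads the constant back, either from the node itself or from the side array.
    public object ReadConstant(int idx)
    {
        var node = Nodes[idx];
        if (!node.Inlined) return ClosureConstants[(int)node.Data];
        return node.Type == typeof(bool) ? (object)(node.Data != 0) : (int)node.Data;
    }
}
```

With this shape, adding `42`, `"hi"` and `true` produces three nodes but only one closure constant (the string), which is the property that makes constant collection cheap.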

Copilot AI changed the title from "[WIP] Optimize data-oriented representation of Expression for serialization" to "Add flat LightExpression round-trip model and fill remaining LE conversion gaps" on Apr 20, 2026
Copilot AI requested a review from dadhi April 20, 2026 06:19
Owner

dadhi commented Apr 20, 2026

@copilot The conversion from/to LE is there so we can test correctness etc. and have a fallback for now.

The goal is to have FE as the primary build and compiled representation and to supersede LE with it.

So, did you create the factory methods for building FE directly, with the surface API matching LE? If not, please do, because the next step would be integrating with the compiler without too many #if directives.

Contributor Author

Copilot AI commented Apr 20, 2026

@copilot The conversion from/to LE is to be able to test correctness etc. and to have a fallback for now.

The goal is to have FE as a primary build and compiled representation and supersede the L...

Added the direct FE builder surface on ExprTree with LE-like factory methods, plus focused tests for direct construction and control-flow cases. It now supports building FE without going through LE first, which should make the next compiler-integration step much cleaner. Addressed in 150bdd6.

Copilot AI changed the title from "Add flat LightExpression round-trip model and fill remaining LE conversion gaps" to "Add flat LightExpression model with direct FE builder API and fill remaining LE conversion gaps" on Apr 20, 2026
Owner

@dadhi dadhi left a comment

The CI build is failing because of missing XML doc comments. Please add minimal, summary-focused comments explaining what and why. Be technical and to the point; avoid long-winded rambling.

Contributor Author

Copilot AI commented Apr 20, 2026

The CI build is failing because of missing XML doc comments. Please add minimal, summary-focused comments explaining what and why. Be technical and to the point; avoid long-winded rambling.

Added concise XML summaries to the new public flat-expression API and fixed the net472 test project reference needed by the dynamic binder test so the release build can complete. Addressed in cc19f39.

Copilot AI requested a review from dadhi April 20, 2026 07:55
if (expressions == null || expressions.Length == 0)
throw new ArgumentException("Block should contain at least one expression.", nameof(expressions));

var children = new List<int>();
Owner

Avoid using a List for transient storage; use SmallListN or SmallMapN instead.
The goal is to make FE creation as fast and as GC-free as possible. Apply this in other places too.
Look at the LightExpression construction, where I try to minimize GC pressure.
We may use params ReadOnlySpan in .NET 10 to minimize it further.
Avoid copying big structs and using interim struct data wrappers; prefer the direct style of manipulating the ExprNode. Users will know that the goal is extreme performance, so the responsibility is on their side. We may help with asserts, or by making the right thing happen by construction, but perf is the ultimate goal.

Contributor Author

Reworked the transient child storage in FlatExpression to use stack-backed SmallList instead of List<int> across the direct builder, flattening builder, child cloning, and reader paths, and switched node creation to write through AddDefaultAndGetRef to avoid the extra ExprNode copy. Verified with dotnet build src/FastExpressionCompiler.LightExpression/FastExpressionCompiler.LightExpression.csproj -c Release and the net9 test runner. Addressed in 7958d82. No UI changes.
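
As a rough illustration of the stack-backed transient storage idea in this thread, the sketch below collects child indexes in a `stackalloc` buffer instead of a `List<int>`. It does not use the repository's `SmallList` type, whose API is not shown here; `TransientBufferSketch`, `LinkChildren`, `StackLimit` and the `-1` end-of-chain marker are all assumptions made for the example. It needs `Span<T>`, so it targets modern .NET (on net472 the System.Memory package would be required).

```csharp
using System;

// Hypothetical sketch: transient child indexes kept on the stack for small arities.
public static class TransientBufferSketch
{
    // Up to this many child indexes stay on the stack; larger inputs fall back to the heap.
    private const int StackLimit = 8;

    // Stand-in for "add each child, remember its index, then link the siblings":
    // the index of child i is simply firstIndex + i, and nextIdx[] plays the role
    // of the per-node next-sibling field. Returns the first child index, or -1.
    public static int LinkChildren(int firstIndex, int childCount, int[] nextIdx)
    {
        Span<int> buffer = childCount <= StackLimit
            ? stackalloc int[StackLimit]
            : new int[childCount];
        var children = buffer.Slice(0, childCount);

        for (var i = 0; i < children.Length; i++)
            children[i] = firstIndex + i;

        // Chain each child to the next one; the last child gets -1 as "no next sibling".
        for (var i = 0; i < children.Length; i++)
            nextIdx[children[i]] = i + 1 < children.Length ? children[i + 1] : -1;

        return children.Length == 0 ? -1 : children[0];
    }
}
```

For the common small-arity case no heap allocation happens at all, which is the point of the review request; the heap fallback only exists so the sketch stays correct for unusually wide nodes.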

Copilot AI requested a review from dadhi April 20, 2026 08:54
private static System.Linq.Expressions.NewExpression CreateValueTypeNewExpression(Type type) => SysExpr.New(type);
}

private sealed class ParameterData
Owner

We should replace the ...Data class objects by storing the fields directly in the ExprNode.
Storing them in GC-allocated classes defeats the goal of FlatExpression: to be a flat, cache-friendly list of nodes with no pointer chasing.
For instance, for Parameter we can store the Name in Obj and the rest in _data; the same goes for labels.
If an expression type requires more than one object reference (e.g. an optional comparison method), the extra references should be stored in child sub-nodes.
Please adopt this and remove all the Data classes below.

Contributor Author

Replaced the remaining ...Data wrapper payloads by storing scalar payload directly on ExprNode, moving single-reference payloads into Obj, and encoding the extra Dynamic reference as an internal child metadata node. Verified with the full release build and the net9 test runner. Addressed in 7f3a5b6. No UI changes.

Copilot AI requested a review from dadhi April 20, 2026 10:35
}

internal void SetNextIdx(int nextIdx) =>
_data = (_data & ~(IndexMask << NextShift)) | ((ulong)(ushort)nextIdx << NextShift);
Owner

~(IndexMask << NextShift) and ~((IndexMask << CountShift) | IndexMask)
Can we just introduce derived nicely named constants for them and compress those calculations for SetNextIdx and SetChildInfo?

Contributor Author

Introduced named derived masks for the packed next/child bits and simplified SetNextIdx / SetChildInfo to use them directly. Addressed in 88fc048. No UI changes.
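
A minimal sketch of what such named derived masks might look like is shown below. The field order follows the two expressions quoted in the review comment (child index in the low bits, then count, then next-sibling index), but the concrete shift values, the 16-bit field width, and the names `PackedNode`, `KeepAllButNext` and `KeepAllButChildInfo` are assumptions, not the actual `ExprNode` definition.

```csharp
using System;

// Hypothetical packed layout: [ ... | next:16 | count:16 | child:16 ] in one ulong.
public struct PackedNode
{
    private const int CountShift = 16, NextShift = 32;
    private const ulong IndexMask = 0xFFFF;

    // Derived masks: everything except the field being overwritten.
    private const ulong KeepAllButNext = ~(IndexMask << NextShift);
    private const ulong KeepAllButChildInfo = ~((IndexMask << CountShift) | IndexMask);

    private ulong _data;

    public int ChildIdx => (int)(_data & IndexMask);
    public int ChildCount => (int)((_data >> CountShift) & IndexMask);
    public int NextIdx => (int)((_data >> NextShift) & IndexMask);

    // Overwrites only the next-sibling bits.
    public void SetNextIdx(int nextIdx) =>
        _data = (_data & KeepAllButNext) | ((ulong)(ushort)nextIdx << NextShift);

    // Overwrites only the child index and child count bits.
    public void SetChildInfo(int childIdx, int childCount) =>
        _data = (_data & KeepAllButChildInfo)
              | ((ulong)(ushort)childCount << CountShift)
              | (ushort)childIdx;
}
```

Because the masks are compile-time constants, each setter reduces to one AND and one or two ORs, and the intent of each mask is readable at the call site.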

private int AddFactoryExpressionNode(Type type, object obj, ExpressionType nodeType, byte flags, int child0, int child1) =>
AddNode(type, obj, nodeType, ExprNodeKind.Expression, flags, CloneChild(child0), CloneChild(child1));

private int AddFactoryExpressionNode(Type type, object obj, ExpressionType nodeType, byte flags, int child0, int child1, int child2) =>
Owner

Let's support up to 7 arguments to match LightExpression, and shorten the naming to avoid long-winded signatures; it is already clear that we are working with children/arguments/expressions/variables/etc.

Contributor Author

Extended the compact fixed-arity expression helper path up to 7 children and shortened the helper child naming so the small-arity cases stay direct and closer to the LE style. Addressed in 88fc048. No UI changes.

// Any leaf reused in more than one parent would have its intrusive sibling link (`NextIdx`) overwritten.
// Clone the leaf before linking so the direct builder API may safely reuse returned indexes for
// parameters, labels, constants, default values and parameterless `new` expressions.
private static bool ShouldCloneWhenLinking(in ExprNode node) =>
Owner

Can we move the checks that operate only on a single node's state into the node struct itself and simplify? Do the same for other similar situations for consistency.

Contributor Author

Moved the node-local predicates into ExprNode itself (Is, IsExpression, HasFlag, ShouldCloneWhenLinked) and updated the surrounding code to use them consistently. Addressed in 88fc048. No UI changes.
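
To illustrate why a reused leaf has to be cloned before linking in an intrusive sibling-chain layout, here is a small self-contained sketch. It is not the PR's code: `SiblingChainSketch`, its fixed capacity of 32 nodes, the `-1` end marker and the "always clone leaves" rule are assumptions, and the sketch ignores how the real model preserves parameter and label identity across clones.

```csharp
using System;

// Hypothetical sketch: children are linked through a NextIdx field stored in each child.
public sealed class SiblingChainSketch
{
    public struct Node
    {
        public string Label;
        public bool IsLeaf;
        public int ChildIdx; // index of the first child, or -1
        public int NextIdx;  // index of the next sibling, or -1

        // Node-local predicate kept on the node itself, as suggested in the review.
        public bool ShouldCloneWhenLinked => IsLeaf;
    }

    private readonly Node[] _nodes = new Node[32]; // fixed capacity keeps the sketch short
    private int _count;

    public int AddLeaf(string label) => Add(label, isLeaf: true, childIdx: -1);

    public int AddBinary(string label, int left, int right)
    {
        // Without cloning, linking would overwrite the NextIdx of a leaf that
        // another parent (or this same parent, as in x * x) already relies on.
        left = CloneIfNeeded(left);
        right = CloneIfNeeded(right);
        _nodes[left].NextIdx = right;
        _nodes[right].NextIdx = -1;
        return Add(label, isLeaf: false, childIdx: left);
    }

    // Walks the sibling chain of a parent and joins the child labels with commas.
    public string ChildLabels(int parent)
    {
        var result = "";
        for (var i = _nodes[parent].ChildIdx; i != -1; i = _nodes[i].NextIdx)
            result += (result.Length == 0 ? "" : ",") + _nodes[i].Label;
        return result;
    }

    private int CloneIfNeeded(int idx) =>
        _nodes[idx].ShouldCloneWhenLinked ? Add(_nodes[idx].Label, isLeaf: true, childIdx: -1) : idx;

    private int Add(string label, bool isLeaf, int childIdx)
    {
        _nodes[_count] = new Node { Label = label, IsLeaf = isLeaf, ChildIdx = childIdx, NextIdx = -1 };
        return _count++;
    }
}
```

Building `x + 1` and then `x * x` from the same returned `x` index leaves both parents with intact child chains, because each link operates on a fresh copy of the leaf.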

return nodeIndex;
}

private static bool ShouldInlineConstant(object value, Type type)
Owner

Collapse the check into a single condition to avoid LOC creep. Do the same for similar situations/checks.
Apply AggressiveInlining to such methods where it makes sense.

Contributor Author

Compacted the small predicate/check helpers and added AggressiveInlining where it fits the hot tiny methods involved in the packed-node path. Addressed in 88fc048. No UI changes.

Contributor Author

Copilot AI commented Apr 20, 2026

@copilot The conversion from/to LE is to be able to test correctness etc. and to have a fallback for now.

The goal is to have FE as a primary build and compiled representation and supersede the L...

Yes — the direct FE factory surface matching the LE-style builder API is already in the branch on ExprTree. It was introduced in 150bdd6 and is still present in 88fc048. No UI changes.

Copilot AI requested a review from dadhi April 20, 2026 11:29
Owner

@dadhi dadhi left a comment

Cool. As the next step I want a performance test that compares the relative performance and memory allocation of creating a moderate real-world example with LightExpression versus FlatExpression. I suggest using the latest CsCheck library (v4.6.2); later we may use it for expression fuzz and invariant testing.

Contributor Author

Copilot AI commented Apr 20, 2026

Cool, next step I want a performance test to compare relative performance and memory allocation taken by the creation of a moderate real-world example, between LightExpression and FlatExpression. I will su...

Added LightExprVsFlatExpr_Create_ComplexExpr in the existing benchmarks project, using the same moderate complex-object graph shape for both builders via a new CreateComplexFlatExpression helper, so creation time and allocation can be compared directly between LE and FE. I kept this one in the existing BenchmarkDotNet harness; I did not add CsCheck yet because this benchmark is fixed-shape, but it should fit well once we start fuzzy/invariant testing. Addressed in f4bd9e9. No UI changes.

Copilot AI changed the title from "Add flat LightExpression model with direct FE builder API and fill remaining LE conversion gaps" to "Add flat LightExpression model with direct FE builder API, fill remaining LE conversion gaps, and add LE vs FE creation benchmark" on Apr 20, 2026
Copilot AI requested a review from dadhi April 20, 2026 14:11
Owner

@dadhi dadhi left a comment

Let's start deterministic property testing to provide at least some guarantees for our very broad API surface. Here is the idea I want to test first: build/generate an arbitrary expression with specified limits on depth and breadth via CsCheck, and ensure its validity by blacklisting some expressions in some contexts. Let's generate the same expression as both FE and LE, then convert the FE to LE and compare them structurally; the structural comparison for LE and SE was just merged into master. If you cannot find how to do that with CsCheck, let's produce our own valid-expression generator (using the same perf/simplicity principles as the rest of FEC).

Contributor Author

Copilot AI commented Apr 20, 2026

Let's start the deterministic property testing to at least provide some guarantees for our very broad API surface. I have an idea to test first - build/generate arbitrary expression with specified limit...

Added the first deterministic property-style coverage pass in LightExpressionTests: CsCheck v4.6.2 drives fixed-seed sampling on net8/net9, a bounded in-repo generator builds valid matching FE and LE trees for a core recursive subset, FE is converted back to LE, and the result is checked with a local structural comparer for that generated subset. I kept the generator local because it lets us enforce validity by construction and stay within the current branch without pulling the larger master structural-comparer feature set. Addressed in 46072b59. No UI changes.
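
The general shape of such a fixed-seed, bounded round-trip property can be sketched without CsCheck or the FE/LE types. The example below is only an analogy under stated assumptions: it uses `System.Random` with a fixed seed in place of CsCheck, a flat prefix-ordered token list in place of `ExprTree`, and `System.Linq.Expressions` in place of LightExpression; every name in it is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical sketch: build the same bounded random tree in two forms,
// convert the flat form back, and compare the results structurally.
public static class RoundTripPropertySketch
{
    public readonly struct Token
    {
        public readonly ExpressionType Kind; // Constant, Add, or Multiply
        public readonly int Value;           // used only when Kind == Constant
        public Token(ExpressionType kind, int value) { Kind = kind; Value = value; }
    }

    // One stream of random decisions drives both representations, so they match by construction.
    public static Expression Generate(Random rnd, int depth, List<Token> flat)
    {
        if (depth == 0 || rnd.Next(3) == 0)
        {
            var value = rnd.Next(-10, 11);
            flat.Add(new Token(ExpressionType.Constant, value));
            return Expression.Constant(value);
        }
        var kind = rnd.Next(2) == 0 ? ExpressionType.Add : ExpressionType.Multiply;
        flat.Add(new Token(kind, 0));
        var left = Generate(rnd, depth - 1, flat);
        var right = Generate(rnd, depth - 1, flat);
        return Expression.MakeBinary(kind, left, right);
    }

    // Rebuilds a tree from the flat prefix form, advancing pos as tokens are consumed.
    public static Expression Rebuild(List<Token> flat, ref int pos)
    {
        var token = flat[pos++];
        if (token.Kind == ExpressionType.Constant)
            return Expression.Constant(token.Value);
        var left = Rebuild(flat, ref pos);
        var right = Rebuild(flat, ref pos);
        return Expression.MakeBinary(token.Kind, left, right);
    }

    // Structural comparison limited to the generated subset (constants and binary nodes).
    public static bool SameShape(Expression a, Expression b)
    {
        if (a.NodeType != b.NodeType || a.Type != b.Type) return false;
        if (a is ConstantExpression ca) return object.Equals(ca.Value, ((ConstantExpression)b).Value);
        var (ba, bb) = ((BinaryExpression)a, (BinaryExpression)b);
        return SameShape(ba.Left, bb.Left) && SameShape(ba.Right, bb.Right);
    }

    // Runs the property for a fixed seed, so any failure is reproducible.
    public static bool CheckAll(int seed, int samples, int maxDepth)
    {
        var rnd = new Random(seed);
        for (var i = 0; i < samples; i++)
        {
            var flat = new List<Token>();
            var direct = Generate(rnd, maxDepth, flat);
            var pos = 0;
            var rebuilt = Rebuild(flat, ref pos);
            if (pos != flat.Count || !SameShape(direct, rebuilt)) return false;
        }
        return true;
    }
}
```

Keeping the generator valid by construction, as here, avoids the need to filter out invalid trees after the fact; the depth bound keeps every sample small enough to inspect when a comparison fails.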

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • www.nuget.org
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title from "Add flat LightExpression model with direct FE builder API, fill remaining LE conversion gaps, and add LE vs FE creation benchmark" to "Add flat LightExpression model with direct FE builder API, fill remaining LE conversion gaps, add LE vs FE creation benchmark, and start deterministic FE/LE property testing" on Apr 20, 2026
Copilot AI requested a review from dadhi April 20, 2026 19:31
Owner

@dadhi dadhi left a comment

Add FlatExpression.cs to the src nuspecs.
Add the tests to the net472 test runner as well as the .NET test runner.
Ensure that it is included in the packages the same way as LightExpression.
It won't be integrated into FEC and the tests in the same way yet; that integration will happen in a separate PR.

Contributor Author

Copilot AI commented Apr 21, 2026

@dadhi I've opened a new pull request, #528, to work on those changes. Once the pull request is ready, I'll request review from you.

@dadhi dadhi marked this pull request as ready for review April 21, 2026 20:57
@dadhi dadhi merged commit 548779a into master Apr 21, 2026
2 checks passed
@dadhi dadhi deleted the copilot/optimize-expression-representation branch April 21, 2026 20:57
Development

Successfully merging this pull request may close these issues.

Data-oriented representation of Expression partly on stack and optimized for constant collection

2 participants