Conversation
Prior to this PR the order in which contexts were derived was hardcoded: Every AST element forwarded its context to all its child elements (with possibly some derivation happening). This works well for many type systems, specifically ones that mainly deal with expressions but hits limits when dealing with side-effecting statements. Examples of things that do not work with this system: * We cannot constrain the indexing expression of an array assignment or array indexing based on the array parameter since this requires deriving the context of the expression from the array parameter * We cannot forward data from one statement to another (either forwards or backwards), since these are sibling elements. This PR therefore implements a new mechanism that is a strict superset of functionality from the existing `check*` methods by enabling every subelement to specify which subelements it depends on. The generator then performs a topological sort to generate subelements in some order satisfying the dependencies between contexts. As an example use, the `BitwidthTypeSystem` can now generate inbounds expressions by restricting the bitwidth of the indexing expression to match the dimensions of the array parameter. This is an alternative to #837 which implements the same functionality but in a less powerful and more composable fashion. Specifically, multiple type systems can still be combined into one as long as their dependencies do not create a cycle.
|
I got quite lost in this PR haha How do i read this? Does this dependency array specify any order? |
Yes! Every position in the array corresponds to the subelement we are calculating the context for. The very last position in a |
|
Oh I meant how this dependencyarray specifies the order of AST node generation, for instance: |
The order of AST node generation is implicit through what input dependencies the For example, for your first specification, one possible return DependencyArray<ast::ArrayAssignmentStatement>{
// Depends on the index (index 1 in the SubElements tuple).
/*arrayName=*/Dependency<ast::ArrayAssignmentStatement, 1>([](const Context& context, const ast::Expression& index) {
return ...;
}),
// Depends on the value (index 2 in the SubElements tuple).
/*index=*/Dependency<ast::ArrayAssignmentStatement, 2>(...),
// Depends on nothing.
/*value=*/Dependency<ast::ArrayAssignmentStatement>(...),
...
};One can also specify more than one index. E.g. the array name could also depend on both the value and index and the ordering would be unchanged. The framework performs a topological sort according to these dependencies to generate the AST nodes in that order. An abbreviated example for your second specification would e.g. be: return {
// Only depends on input context.
Dependency<ast::ArrayAssignmentStatement, PARENT_CONTEXT>(...),
// Requires array name + index.
Dependency<ast::ArrayAssignmentStatement, 0, 1>(...),
// Requires only the array name.
Dependency<ast::ArrayAssignmentStatement, 0>(...),
...
}; |
|
I am a bit worried now that configuring it is more complicated than hardcoding the order... |
This approach does have many advantages and I don't think is more complicated than hardcoding the order. Specifically:
Its also less powerful in that it doesn't allow nor make the type system care about generating AST nodes. |
|
Ok, I got this now, i am convinced that this is the way to go about it. As far as i understood, we had: DependencyArray:
Dependency:
I like the separation for requirement (1,) and (2, 3,) now, maybe we could try to improve the readability like this: I like that the template parameters specify the signature of the transfer function: Question: Is there a way to communicate this design consideration with the reader more explicitly? Question: Is there a better name for |
Absolutely, we could just add the indices to e.g.
The return type of the transfer function is always a context. The first argument I tried to add in the documentation of
I agree actually. It feels more imperative than declarative. The key-service is actually the transfer function, the generation order is a side-effect. |
Makes sense
BTW, this reminds me of the
I think either |
|
So it is not debating whether it should be called |
|
BTW what is the role of the check* methods now? Feels like there is some overlapping between the getDependencies and the check* now |
|
It is very much overlapping yes. I will remove the |
| copyFromParent<ast::ArrayReadExpression>(), | ||
| Dependency<ast::ArrayReadExpression>([] { | ||
| return DynamaticTypingContext{ | ||
| DynamaticTypingContext::IntegerRequired}; | ||
| }), | ||
| copyFromParent<ast::ArrayReadExpression>()}; |
There was a problem hiding this comment.
Maybe let's set up a convention for the code here:
// arrayName[index] = rhs;
auto arrayNameTransferFn = ...
auto arrayIndexTransferFn = ...
auto rhsTransferFn = ...
return {arrayNameTransferFn, arrayIndexTransferFn, rhsTransferFn};
There was a problem hiding this comment.
I decided against doing this since I dislike how it disconnects usage site with definition.
Specifically, the position within the array is very meaningful since it corresponds to specific AST subelements. The variable approach also allows the error of mismatching the order of declarations of the variables and order of using those variables in the variable.
I instead opted to just annotate the parameters using the /*name=*/ convention
| [&](const OpaqueContext &context) -> ast::Expression { | ||
| return generateExpression(context, depth + 1); | ||
| }, | ||
| [&](const OpaqueContext &context) -> ast::Expression { | ||
| return generateExpression(context, depth + 1); | ||
| }, | ||
| [&](ast::Expression &&lhs, | ||
| ast::Expression &&rhs) -> std::optional<ast::BinaryExpression> { | ||
| // Perform explicit casts to a legal operand type if neither of the | ||
| // expressions are legal for the given operation. | ||
| // This would e.g. cast 'double's that are meant to be applied to '&' to | ||
| // a random type that can be legally used with '&'. | ||
| if (!ast::BinaryExpression::isLegalOperandType(op, lhs.getType()) || | ||
| !ast::BinaryExpression::isLegalOperandType(op, rhs.getType())) { | ||
|
|
||
| std::optional<ast::ScalarType> scalarType = generateScalarType( | ||
| context, /*toExclude=*/[&](const ast::ScalarType &value) { | ||
| return !ast::BinaryExpression::isLegalOperandType(op, value); | ||
| }); | ||
| if (!scalarType) | ||
| return std::nullopt; | ||
|
|
||
| lhs = safeCastAsNeeded(*scalarType, std::move(lhs)); | ||
| rhs = safeCastAsNeeded(*scalarType, std::move(rhs)); | ||
| } | ||
|
|
||
| switch (op) { | ||
| case ast::BinaryExpression::ShiftLeft: | ||
| case ast::BinaryExpression::ShiftRight: { | ||
| ast::ScalarType datatype = lhs.getType(); | ||
| // Restrict the right expression to be in range of the bitwidth. | ||
| rhs = ast::BinaryExpression{ | ||
| std::move(rhs), ast::BinaryExpression::BitAnd, | ||
| ast::Constant{static_cast<uint32_t>(datatype.getBitwidth() - 1)}}; | ||
|
|
||
| // If the left-hand side is a signed integer, make sure the value is | ||
| // at least 0. Performing a left-shift on a negative value in C is | ||
| // undefined behavior. | ||
| if (op == ast::BinaryExpression::ShiftLeft && datatype.isSigned()) | ||
| lhs = generateMinExpression( | ||
| std::move(lhs), ast::Constant{static_cast<uint32_t>(0)}); | ||
| return ast::BinaryExpression{std::move(lhs), op, std::move(rhs)}; | ||
| } | ||
| case ast::BinaryExpression::Plus: | ||
| case ast::BinaryExpression::Minus: | ||
| case ast::BinaryExpression::Mul: { | ||
| ast::ScalarType lhsType = lhs.getType(); | ||
| ast::ScalarType rhsType = rhs.getType(); | ||
| if ((lhsType == ast::PrimitiveType::Int32 && | ||
| lhsType.getBitwidth() > rhsType.getBitwidth()) || | ||
| (rhsType == ast::PrimitiveType::Int32 && | ||
| rhsType.getBitwidth() > lhsType.getBitwidth())) { | ||
| // Promote integers where one operand is an 'int32_t' to 'uint32_t' | ||
| // to avoid undefined behavior on overflow. | ||
| lhs = safeCastAsNeeded(ast::PrimitiveType::UInt32, std::move(lhs)); | ||
| rhs = safeCastAsNeeded(ast::PrimitiveType::UInt32, std::move(rhs)); | ||
| } | ||
| return ast::BinaryExpression{std::move(lhs), op, std::move(rhs)}; | ||
| } | ||
| // case ast::BinaryExpression::Division: | ||
| break; | ||
| case ast::BinaryExpression::BitAnd: | ||
| case ast::BinaryExpression::BitOr: | ||
| case ast::BinaryExpression::BitXor: | ||
| case ast::BinaryExpression::Greater: | ||
| case ast::BinaryExpression::GreaterEqual: | ||
| case ast::BinaryExpression::Less: | ||
| case ast::BinaryExpression::LessEqual: | ||
| case ast::BinaryExpression::Equal: | ||
| case ast::BinaryExpression::NotEqual: | ||
| return ast::BinaryExpression{std::move(lhs), op, std::move(rhs)}; | ||
| } | ||
| llvm_unreachable("all enum cases handled"); | ||
| }); |
There was a problem hiding this comment.
Split
auto contextTransferFn = ...;
auto lhsGenFn = ...;
auto rhsGenFn = ...'
auto expressionGenFn = ...;
return generateWithDependencies<ast::BinaryExpression>(context, contextTransferFn, lhsGenFn, rhsGenFn, expressionGenFn);
There was a problem hiding this comment.
See #874 (comment)
I added argument comments instead.
| std::size_t workListSize = 0; | ||
| std::array<std::size_t, sizeof...(SubElements)> worklist; | ||
|
|
||
| std::array<std::size_t, sizeof...(SubElements)> forwardEdgeCount{}; |
There was a problem hiding this comment.
These are the number of incoming edges for a given node i. The algorithm primaril works on outgoing edges since scheduling a node removes all outgoing edges from that node from the graph, but after doing so we need to check whether the number of incoming edges of a node is now empty/0 to check whether it can now be scheduled.
I've added comments to make this more explicit
|
I've also decided to move the |
|
yeah it does feel like a property of an ast node |
|
Okay Another thing is the OpaqueTransferFn: what is the problem that this class tries to solve? Like a convience placeholder type that works for all possible template values of TransferFn? |
Correct yes! Kind of similar to |


Prior to this PR the order in which contexts were derived was hardcoded: Every AST element forwarded its context to all its child elements (with possibly some derivation happening). This works well for many type systems, specifically ones that mainly deal with expressions but hits limits when dealing with side-effecting statements. Examples of things that do not work with this system:
This PR therefore implements a new mechanism that is a strict superset of functionality from the existing
check*methods by enabling every subelement to specify which subelements it depends on. The generator then performs a topological sort to generate subelements in some order satisfying the dependencies between contexts.As an example use, the
BitwidthTypeSystemcan now generate inbounds expressions by restricting the bitwidth of the indexing expression to match the dimensions of the array parameter.The API is currently only implemented for
BinaryExpressionandArrayReadExpressionbut the goal is the remove the oldcheck*API entirely. This will be done in a follow up PRThis is an alternative to #837 which implements the same functionality but in a less powerful and more composable fashion. Specifically, multiple type systems can still be combined into one as long as their dependencies do not create a cycle.
Note: For formal geeks, this implements Attribute grammars but every AST node currently has the same set of symbols