We have refactored many parts in TreeNode API in Datafusion to serve more understandable and easy to use tool. Now, our focus is on these items:
RewriteRecursion
and VisitRecursion
. VisitRecursion
has 3 variants; Continue
, Skip
, Stop
, and RewriteRecursion
has one more; mutate
. Especially Skip
behavior is used to represent different recursion logic, and the user would have difficulty what to expect.What this document proposes is mainly focused on these problems. The design details will be explained with reference to the tree below:
pub struct Transformed<T> {
pub data: T,
pub transformed: bool,
pub tnr: TreeNodeRecursion,
}
pub enum TreeNodeRecursion {
Continue,
Jump,
Stop,
}
With the new Transformed
struct, we can store the results of the operation results for each node. That will provide us more wise traversals. Previous Skip
variant is somehow confusing, and have different meanings in different context. In the new suggested desing, it is replaced with Jump
. During top-down traversals, it means that the current subtree will be skipped, and if there exists another child subtree, the traversal is continued from there. If there does not, it means the same with the Stop
, halts the traversal and collect all transformation till that node. During bottom-up traversals, Jump
goes to next leaf node.
Skip
is interpreted in many other libraries as “the current node” is skipped, and the execution continues from the next node. However, our results are collected after the rule works on the current node. Therefore; we cannot use Skip
to skip self node. Instead, it could have a meaning skipping the next node. Rather than using Skip
in that way, the rules can have their own handlings to skip that node with Transformed::No
and Recursion::Continue
. To not replicate the behavior, I suggest to remove Skip
. Its meaning becomes also more complicated during post-order (bottom-up) traversals.
If is there anyone thinking that a Skip_Next_Node variant could be useful, please share your thoughts.
Pre-Order (Top-Down) Traversal: J → I → F → E → C → B → D → A → G → H
Example Scenario for a Top-Down Traversal:
After op is applied on J; (op(J)):
Continue | Jump | Stop |
---|---|---|
apply op(I) | terminate with Transformed::Yes if op(J) is Transformed::Yes | terminate with Transformed::Yes if op(J) is Transformed::Yes |
op(I):
Continue | Jump | Stop |
---|---|---|
apply op(F) | terminate with Transformed::Yes if op(J) is Transformed::Yes or op(I) is Transformed::Yes | terminate with Transformed::Yes if op(J) is Transformed::Yes or op(I) is Transformed::Yes |
op(F):
Continue | Jump | Stop |
---|---|---|
apply op(E) | terminate with (Transformed(op(J)) or …) | terminate with (Transformed(op(J)) or …) |
op(E):