Contributor

@vmingchen vmingchen commented Mar 8, 2025

The additional transform_bottom is not needed. transform_up walks the query tree bottom-up (post-order), so by the time we reach a DFRayStageExec, all of its child nodes have already been handled and we are ready to generate the ray stage for that node.
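
To illustrate the ordering argument, here is a minimal sketch of a post-order walk over a toy tree. It does not use the DataFusion API: `Node`, `walk_up`, `collect_stages`, and `sample_plan` are hypothetical names made up for this example, and `is_stage` merely stands in for "this node is a DFRayStageExec".

```rust
// Toy plan node. Hypothetical stand-in for an execution plan tree,
// not the real DataFusion or DFRayStageExec types.
#[derive(Debug, Clone)]
struct Node {
    name: String,
    is_stage: bool, // assumption: marks nodes that play the role of a stage boundary
    children: Vec<Node>,
}

impl Node {
    fn new(name: &str, is_stage: bool, children: Vec<Node>) -> Self {
        Node { name: name.to_string(), is_stage, children }
    }
}

// Post-order walk: every child subtree is visited before the node itself.
fn walk_up<F: FnMut(&Node)>(node: &Node, visit: &mut F) {
    for child in &node.children {
        walk_up(child, visit);
    }
    visit(node);
}

// Collect stage names in the order a single bottom-up pass reaches them.
fn collect_stages(root: &Node) -> Vec<String> {
    let mut stages = Vec::new();
    walk_up(root, &mut |n: &Node| {
        if n.is_stage {
            stages.push(n.name.clone());
        }
    });
    stages
}

// A small example tree: a root stage over a join of two child stages.
fn sample_plan() -> Node {
    Node::new(
        "stage_root",
        true,
        vec![Node::new(
            "join",
            false,
            vec![
                Node::new("stage_left", true, vec![Node::new("scan_a", false, vec![])]),
                Node::new("stage_right", true, vec![Node::new("scan_b", false, vec![])]),
            ],
        )],
    )
}
```

Calling `collect_stages(&sample_plan())` visits both child stages before `stage_root`, which is why one bottom-up pass is enough to see every child stage before its parent.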

I tested with the TPCH queries and the stages are the same before and after this change.

@vmingchen vmingchen changed the title Address a TODO about simplify Ray stages collection Address a TODO in DFRayDataFrame::stages Mar 8, 2025
@vmingchen vmingchen marked this pull request as ready for review March 8, 2025 17:57
@robtandy
Contributor

@andygrove can you allow CI for this PR so that we can make sure it passes validation?

If / when it does, I'm happy to get it merged. Thank you for fixing this @vmingchen !

Member

@andygrove andygrove left a comment

Thanks @vmingchen and @robtandy. I will go ahead and merge.

@andygrove andygrove merged commit dcea736 into apache:main Mar 11, 2025
20 checks passed

3 participants