-
Notifications
You must be signed in to change notification settings - Fork 215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate #588
Merged
Merged
Changes from 4 commits
Commits
Show all changes
14 commits
Select commit
Hold shift + click to select a range
e8bd953
adding main function and tests
9a54ef9
adding tests, removing integration test for now
afc9c13
fixing typos and lints
9f88a5a
fixing typing issue
e473471
- added support in schmema to convert Date32 to correct arrow type
9d2112d
fixing format and lic
f042ddc
reducing number of tests (17 -> 7)
d9f7e3f
fix formats
cbbf3a6
fix naming
e864fd1
refactoring to use TreeNodeVisitor
bb41f70
fixing fmt
8650476
small refactor
21a38f5
adding swapped op and fixing CR comments
c6cfa68
Merge remote-tracking branch 'upstream/main' into feat-impl-df-filters
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could use
visitor pattern
to make code more cleaner to avoid code bomb if supporting more expressions.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I'm not a DataFusion expert but I think this makes sense being implemented as a visitor of
Expr
, probably by implementing a https://docs.rs/datafusion-common/41.0.0/datafusion_common/tree_node/trait.TreeNodeVisitor.htmlThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sdd, @FANNG1 , I appriciate it
I will try to refactor this, though I am wondering whether Visitor will indeed be the most suitable. Let me know what you think, but I think visitor shines when we have to run different logic on different kinds of elements (chained in some way) while we want to keep the logic in one place - i.e., the Visitor. whereas here we have one kind of element - Expr - which is an enum that can be deconstructed in different ways, for example -
so what Im trying to say is that using visitor will simply move the matching complexity to another place - to the visitor.
Does this make sense?
I will continue to try and refactor this but please let me know what you think