Replace parquet metadata thrift version with in memory version. #1004
Comments
@liurenjie1024 I think the current problem with this is that |
I think we should always return the in memory representation rather than the thrift one. Is there any case where returning the thrift one is more useful than the in memory one? |
`FileMetadata` in parquet writer with in memory representation.
Probably not, so should we change the AsyncFileWriter to return the in memory representation? |
Yes, but it seems there is no built-in approach to do that? We may need to ask the arrow community for help. |
Yes, I'll look to submit an issue |
Hi, @jonathanc-n I found this method in
|
Thanks for that! I'll look into it later today. |
In the parquet crate, there are two kinds of data structures for metadata: the in memory version and the version auto-generated from parquet's thrift definition. For example, there are two versions of `FileMetadata`: in memory vs thrift definition. We should use the in memory one as it provides more features, while the thrift version is only used for ser/de in parquet.
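To make the distinction concrete, here is a minimal sketch of the two shapes and a conversion between them. Every type and method name below (`ThriftFileMetaData`, `InMemoryFileMetaData`, and their fields) is a hypothetical stand-in written for illustration; none of it is the parquet crate's actual API, and the real structs carry many more fields (schema, row group details, key/value metadata).

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a thrift-generated struct: plain public
/// fields, intended only for serialization and deserialization.
#[derive(Debug, Clone)]
pub struct ThriftFileMetaData {
    pub version: i32,
    pub num_rows: i64,
    pub created_by: Option<String>,
    /// Row count of each row group, in file order.
    pub row_group_num_rows: Vec<i64>,
}

/// Hypothetical stand-in for the in memory representation: fields are
/// private, checked on construction, and exposed through accessors.
#[derive(Debug, Clone)]
pub struct InMemoryFileMetaData {
    version: i32,
    num_rows: i64,
    created_by: Option<String>,
    row_group_num_rows: Vec<i64>,
}

impl InMemoryFileMetaData {
    pub fn version(&self) -> i32 {
        self.version
    }

    pub fn num_rows(&self) -> i64 {
        self.num_rows
    }

    pub fn num_row_groups(&self) -> usize {
        self.row_group_num_rows.len()
    }

    pub fn created_by(&self) -> Option<&str> {
        self.created_by.as_deref()
    }
}

impl TryFrom<ThriftFileMetaData> for InMemoryFileMetaData {
    type Error = String;

    /// Converts the ser/de struct into the in memory one, rejecting
    /// metadata whose row group counts do not add up to `num_rows`.
    fn try_from(t: ThriftFileMetaData) -> Result<Self, Self::Error> {
        let total: i64 = t.row_group_num_rows.iter().sum();
        if total != t.num_rows {
            return Err(format!(
                "row groups hold {} rows but file metadata reports {}",
                total, t.num_rows
            ));
        }
        Ok(InMemoryFileMetaData {
            version: t.version,
            num_rows: t.num_rows,
            created_by: t.created_by,
            row_group_num_rows: t.row_group_num_rows,
        })
    }
}
```

Under this sketch, a writer that returns the in memory type would do the conversion once at its boundary (`InMemoryFileMetaData::try_from(thrift_value)`), so callers would only ever see the validated form with accessors.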
There are several places in our crate that use the thrift version: