[Improvement]: Mixed-format Table stats for Trino engine need to be calculated individually #2246

Closed
3 tasks done
Tracked by #2176
HuangFru opened this issue Nov 3, 2023 · 0 comments · Fixed by #2314

Comments

@HuangFru
Contributor

HuangFru commented Nov 3, 2023

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

We introduced table statistics in #1344 by applying the native Iceberg implementation to both the base store and the change store and merging the results.

When computing statistics for the change store, the native Iceberg implementation is not aware of the max transaction of the mixed-format change store, so the resulting statistics are wrong.

Two tables with almost the same amount of data:
Mixed-format:
image

Iceberg-format:
image

The mixed-format table's 'row count' is twice as large as the Iceberg table's. This may have some impact on query performance.
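To make the over-count concrete, here is a toy model in Python rather than the actual Trino or Iceberg code. It assumes that change files whose transaction id is at or below the max transaction have already been merged into the base store. That assumption, the `DataFile` class, and its `transaction_id` field are all hypothetical simplifications for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    record_count: int
    transaction_id: int  # hypothetical field, used only in this sketch


def naive_row_count(base_files, change_files):
    """Sum record counts over both stores, ignoring transaction ids."""
    return (sum(f.record_count for f in base_files)
            + sum(f.record_count for f in change_files))


def transaction_aware_row_count(base_files, change_files, max_transaction_id):
    """Skip change files assumed to be already merged into the base store."""
    pending = [f for f in change_files if f.transaction_id > max_transaction_id]
    return (sum(f.record_count for f in base_files)
            + sum(f.record_count for f in pending))


# Assumed scenario: every change file has already been merged into the base store.
base = [DataFile(record_count=1000, transaction_id=3)]
change = [DataFile(record_count=400, transaction_id=1),
          DataFile(record_count=600, transaction_id=3)]

naive = naive_row_count(base, change)                 # counts merged rows twice
aware = transaction_aware_row_count(base, change, 3)  # counts them once
```

In this scenario the naive sum reports 2000 rows while the table holds 1000, which matches the doubled 'row count' seen above.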

How should we improve?

Two options:

  1. Only calculate the base store and ignore the change store's statistics.
  2. Implement the 'TableStatisticsReader' using the mixed-format plan to get the most accurate statistics (this may increase maintenance costs when upgrading versions).
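A rough sketch of how the two options would differ, written as a toy Python model rather than the real 'TableStatisticsReader'. The `FileStats` type, its `transaction_id` field, and the rule that a mixed-format plan only reads change files above the max transaction are assumptions made for illustration. Counting every pending change record as a new row is also a simplification, since change records may be updates or deletes.

```python
from typing import NamedTuple


class FileStats(NamedTuple):
    record_count: int
    transaction_id: int  # hypothetical field for this sketch


def base_only_row_count(base_files, change_files):
    """Option 1: ignore the change store entirely."""
    return sum(f.record_count for f in base_files)


def planned_row_count(base_files, change_files, max_transaction_id):
    """Option 2: count only the change files the plan would still read."""
    planned_change = [f for f in change_files
                      if f.transaction_id > max_transaction_id]
    return sum(f.record_count for f in base_files + planned_change)


# Assumed scenario: one change file is already merged, one is still pending.
base = [FileStats(record_count=1000, transaction_id=5)]
change = [FileStats(record_count=500, transaction_id=4),
          FileStats(record_count=50, transaction_id=6)]

option1 = base_only_row_count(base, change)
option2 = planned_row_count(base, change, max_transaction_id=5)
```

Under these assumptions option 1 slightly undercounts when unmerged changes exist (1000 vs. 1050 here), while option 2 tracks the plan more closely at the cost of depending on mixed-format internals.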

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

zhoujinsong mentioned this issue on Jun 25, 2024
66 tasks
zhoujinsong changed the title from "[Improvement]: Mix-format Table stats for Trino engine need to be calculated individually" to "[Improvement]: Mixed-format Table stats for Trino engine need to be calculated individually" on Jun 26, 2024