Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement]: Needs to sort the external Iceberg table list #1716

Closed
3 tasks done
Tracked by #2176
HuangFru opened this issue Jul 19, 2023 · 7 comments · Fixed by #2362
Closed
3 tasks done
Tracked by #2176

[Improvement]: Needs to sort the external Iceberg table list #1716

HuangFru opened this issue Jul 19, 2023 · 7 comments · Fixed by #2362
Labels

Comments

@HuangFru
Copy link
Contributor

HuangFru commented Jul 19, 2023

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, an external Iceberg catalog's table list is returned by the Iceberg catalog's listTables method which can not guaranteed order.
For example, I have 500 hundred tables named 'mock_table_1' to 'mock_table_500', but the table list in AMS shows like this:

image

It's not very aesthetically pleasing and inconvenient.

How should we improve?

Sorting it by the lexicographical order of the table names.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

@zhoujinsong
Copy link
Contributor

Sorting by table name can indeed help users find the tables they are interested in more quickly. At the same time, we may also perform this sorting operation on the database list.

@zhoujinsong
Copy link
Contributor

Given that this improvement is very small and meaningful, I suggest opening it up to the new contributors. Therefore, I plan to label it as good first issue.

@zhoujinsong zhoujinsong added the good first issue Good for newcomers label Jul 19, 2023
@HuangFru
Copy link
Contributor Author

Cause the table_identifier table has UNIQUE KEY table_name_index (catalog_name,db_name,table_name), and we select by 'where catalog = ? and database =?', so it's ordered when selecting from the database.

@chenyuzhi459
Copy link
Contributor

chenyuzhi459 commented Oct 29, 2023

Sorting by table name can indeed help users find the tables they are interested in more quickly. At the same time, we may also perform this sorting operation on the database list.

I would like to provide this implementation, but considering that amoro has multiple types of ServerCatalog(iceberg/internal/mix-iceberg/paimon), maybe it would be better to wait for the table-list returned by catalog and sort it uniformly in memory, so that all tables returned by the catalog can be sorted.

@HuangFru

@HuangFru
Copy link
Contributor Author

Sorting by table name can indeed help users find the tables they are interested in more quickly. At the same time, we may also perform this sorting operation on the database list.

I would like to provide this implementation, but considering that amoro has multiple types of ServerCatalog(iceberg/internal/mix-iceberg/paimon), maybe it would be better to wait for the table-list returned by catalog and sort it uniformly in memory, so that all tables returned by the catalog can be sorted.

@HuangFru

Sorry for the late reply, there seems to be a problem with my email reminder.
The internal catalog is selected from the database and is ordered. You only need to deal with the external catalog.

@chenyuzhi459
Copy link
Contributor

chenyuzhi459 commented Nov 25, 2023

Sorting by table name can indeed help users find the tables they are interested in more quickly. At the same time, we may also perform this sorting operation on the database list.

I would like to provide this implementation, but considering that amoro has multiple types of ServerCatalog(iceberg/internal/mix-iceberg/paimon), maybe it would be better to wait for the table-list returned by catalog and sort it uniformly in memory, so that all tables returned by the catalog can be sorted.
@HuangFru

Sorry for the late reply, there seems to be a problem with my email reminder. The internal catalog is selected from the database and is ordered. You only need to deal with the external catalog.

However, According to the response code logic of the rest api /catalogs/{catalog}/databases/{db}/tables, the table list returned from TableService will be manually merged with the hive table returned by the MixedHiveCatalogImpl query, so it is not guaranteed that the table list returned is ordered finally. Is it better to sort the table list from the external catalog and sort the results from the interface at the same time?

@HuangFru

@HuangFru
Copy link
Contributor Author

Sorting by table name can indeed help users find the tables they are interested in more quickly. At the same time, we may also perform this sorting operation on the database list.

I would like to provide this implementation, but considering that amoro has multiple types of ServerCatalog(iceberg/internal/mix-iceberg/paimon), maybe it would be better to wait for the table-list returned by catalog and sort it uniformly in memory, so that all tables returned by the catalog can be sorted.
@HuangFru

Sorry for the late reply, there seems to be a problem with my email reminder. The internal catalog is selected from the database and is ordered. You only need to deal with the external catalog.

However, According to the response code logic of the rest api /catalogs/{catalog}/databases/{db}/tables, the table list returned from TableService will be manually merged with the hive table returned by the MixedHiveCatalogImpl query, so it is not guaranteed that the table list returned is ordered finally. Is it better to sort the table list from the external catalog and sort the results from the interface at the same time?

@HuangFru

Then I think we only need to sort the final results of the interface. After all, this only serves the display effect of the table list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants