[Improvement]: Support parallelized planning in one optimizer group #1951

majin1102 · 2023-09-11T09:21:19Z

Search before asking

I have searched in the issues and found no similar issues.

What would you like to be improved?

Users widely report that an optimizer group can have idle resources while multiple tables sit in the pending state. The bottleneck is that an optimizer group supports only single-threaded planning: when planTasks takes a long time for one table, the entire resource pool can hang. Introducing a multi-threaded, asynchronous planning mechanism would improve resource utilization.

How should we improve?

The proposal is to introduce a parameter:

max_planning_threads_per-group (or similar name)

to represent the maximum planning concurrency within a group. Even if more task-scheduling features are introduced in the future, this parameter will remain in use. The currently recommended approach is to construct a thread pool in the group, sized by max_planning_threads_per-group, to perform planning. Each time the optimizer's pollTask method is triggered, it initiates asynchronous planning, and it returns directly once the planning concurrency reaches max_planning_threads_per-group.
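To make the proposed flow concrete, here is a minimal sketch of a per-group thread pool whose concurrency is capped by the parameter. Every class, field, and method name below (GroupPlanningSketch, pendingTables, the planner function, and so on) is an assumption made for illustration and is not taken from the AMS code; tables and tasks are plain strings to keep it short.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the AMS code base.
class GroupPlanningSketch {
    private final Queue<String> taskQueue = new ConcurrentLinkedQueue<>();
    private final Queue<String> pendingTables = new ConcurrentLinkedQueue<>();
    private final Semaphore planningSlots;          // one permit per allowed concurrent planning
    private final ExecutorService planExecutor;     // the per-group planning thread pool
    private final Function<String, List<String>> planner; // table -> planned tasks (stand-in for planTasks)

    GroupPlanningSketch(int maxPlanningThreadsPerGroup, Function<String, List<String>> planner) {
        if (maxPlanningThreadsPerGroup < 1) {
            throw new IllegalArgumentException("planning concurrency must be >= 1");
        }
        this.planningSlots = new Semaphore(maxPlanningThreadsPerGroup);
        this.planExecutor = Executors.newFixedThreadPool(maxPlanningThreadsPerGroup, r -> {
            Thread t = new Thread(r, "group-planner");
            t.setDaemon(true); // do not keep the JVM alive for planning threads
            return t;
        });
        this.planner = planner;
    }

    void addPendingTable(String table) {
        pendingTables.add(table);
    }

    int pendingTableCount() {
        return pendingTables.size();
    }

    /** Returns a ready task, or null after trying to trigger asynchronous planning. */
    String pollTask() {
        String task = taskQueue.poll();
        if (task != null) {
            return task;
        }
        triggerAsyncPlanning();
        return null;
    }

    private void triggerAsyncPlanning() {
        if (!planningSlots.tryAcquire()) {
            return; // concurrency limit reached: return directly
        }
        String table = pendingTables.poll();
        if (table == null) {
            planningSlots.release(); // nothing to plan
            return;
        }
        planExecutor.execute(() -> {
            try {
                taskQueue.addAll(planner.apply(table));
            } finally {
                planningSlots.release(); // free the slot even if planning fails
            }
        });
    }

    void shutdown() {
        planExecutor.shutdown();
    }
}
```

In this sketch the permit is taken on the polling thread before the job is submitted, so a second poll that arrives while planning is still running sees the limit immediately instead of queueing more work behind the pool.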

Are you willing to submit PR?

Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

I agree to follow this project's Code of Conduct

huyuanfeng2018 · 2023-09-11T11:24:16Z

+1, I think it is necessary to add some concurrency to planning.

I have a question about when to trigger plan scheduling:

If pollTask is triggered, does it mean that each task needs to be fully consumed before the next plan can be triggered?

HuangFru · 2023-09-12T01:58:44Z

Then the default value of 'max_planning_threads_per-group' needs to be carefully considered. When there are many threads used for planning or multiple large tables are planned at the same time, it may cause huge pressure on AMS.

IMO, the default value of 'max_planning_threads_per-group' could be set to 1 (or a small value), so that the current single-threaded planning behavior is kept by default. Users should adjust this value manually, according to the resources and usage of their own environment, once they fully understand the parameter.
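As a rough illustration of that default, the sketch below reads the value from a group's properties and falls back to 1 when it is unset. The class name, the use of a plain string map for group properties, and the validation rule are assumptions for illustration only; the key is the working name from this issue, not a released configuration option.

```java
import java.util.Map;

// Hypothetical sketch: the property key is the working name used in this thread.
class PlanningConfigSketch {
    static final String MAX_PLANNING_THREADS_KEY = "max_planning_threads_per-group";
    static final int DEFAULT_MAX_PLANNING_THREADS = 1; // keeps single-threaded planning by default

    /** Returns the configured planning concurrency, or 1 when the key is missing or blank. */
    static int maxPlanningThreads(Map<String, String> groupProperties) {
        String raw = groupProperties.get(MAX_PLANNING_THREADS_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MAX_PLANNING_THREADS;
        }
        int value = Integer.parseInt(raw.trim()); // throws NumberFormatException on bad input
        if (value < 1) {
            throw new IllegalArgumentException(
                MAX_PLANNING_THREADS_KEY + " must be >= 1, got " + value);
        }
        return value;
    }
}
```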

wangtaohz · 2023-09-13T01:57:29Z

> If pollTask is triggered, does it mean that each task needs to be fully consumed before the next plan can be triggered?

Yes, that's what I think. The prerequisite for starting a new thread to plan is that both the taskQueue and retryQueue are empty.

> Then the default value of 'max_planning_threads_per-group' needs to be carefully considered. When there are many threads used for planning or multiple large tables are planned at the same time, it may cause huge pressure on AMS.

In addition to considering the pressure of AMS, multiple threads planning at the same time can also result in multiple tables being planned at the same time, thereby causing some disruptions to the optimizing priority of the tables.

baiyangtx · 2023-09-13T06:34:26Z

Does this mean you want multiple tables within a group to be planned simultaneously, or do you want to use multithreading to plan a single table?

huyuanfeng2018 · 2023-09-13T11:44:09Z

> In addition to considering the pressure of AMS, multiple threads planning at the same time can also result in multiple tables being planned at the same time, thereby causing some disruptions to the optimizing priority of the tables.

This will happen, but I think this is acceptable.

wangtaohz · 2023-09-18T08:41:06Z

> Does this mean you want multiple tables within a group to be planned simultaneously, or do you want to use multithreading to plan a single table?

Planning for a single table (scanning the files of an Iceberg table) is already multi-threaded, so I think this mainly refers to planning multiple tables within a group simultaneously.

majin1102 · 2023-09-20T03:10:45Z

> +1, I think it is necessary to add some concurrency to planning.
>
> I have a question about when to trigger plan scheduling:
>
> If pollTask is triggered, does it mean that each task needs to be fully consumed before the next plan can be triggered?

The current situation is more like this: every task needs to be polled before the next planning. But I don't think that is necessary.

pollTask() always means polling a task from the task queue (or the retry queue), and it can trigger table selection and planning if no task is available. For the pollTask() operation, the only necessary and sufficient condition is that there are tasks in the queue (there could be many task producers in the future, not only optimizingTask), so in my opinion pollTask() doesn't need to know about tables, planning, or whether tasks were fully consumed. It only triggers a task-producing operation when it needs one.
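A small sketch of that view of pollTask(): it only looks at the queues and, when both are empty, fires an opaque trigger that some producer can act on. The class name, the trigger callback, and the choice to serve the retry queue before the task queue are all assumptions made for illustration, not behavior confirmed in this thread.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: pollTask() knows about queues only, never about tables or planning.
class PollingQueueSketch {
    private final Queue<String> retryQueue = new ArrayDeque<>();
    private final Queue<String> taskQueue = new ArrayDeque<>();
    private final Runnable produceTrigger; // e.g. "select a table and plan it", opaque to polling

    PollingQueueSketch(Runnable produceTrigger) {
        this.produceTrigger = produceTrigger;
    }

    synchronized void offerTask(String task) {
        taskQueue.add(task);
    }

    synchronized void offerRetry(String task) {
        retryQueue.add(task);
    }

    /** Returns the next task, or null after asking a producer for more work. */
    synchronized String pollTask() {
        String task = retryQueue.poll(); // assumption: retried tasks are served first
        if (task == null) {
            task = taskQueue.poll();
        }
        if (task == null) {
            produceTrigger.run(); // both queues are empty: request task production
        }
        return task;
    }
}
```

Because the trigger is just a Runnable here, additional task producers could be plugged in later without changing the polling side.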

huyuanfeng2018 · 2023-09-22T09:56:25Z

I think the bigger problem now is that the granularity is inconsistent. When the Optimizer polls, the unit is a task, but planning works with the table as its unit, and one table corresponds to one or more tasks. This differs from a plain producer-consumer system: there is no way to determine, from the number of Optimizer polls, how many times planning needs to run. Therefore, even with multi-threading, this problem is actually rather difficult to deal with under the current framework. My suggestion is to add a strategy under the current framework.

majin1102 · 2023-10-13T07:23:46Z

> I think the bigger problem now is that the granularity is inconsistent. When the Optimizer polls, the unit is a task, but planning works with the table as its unit, and one table corresponds to one or more tasks. This differs from a plain producer-consumer system: there is no way to determine, from the number of Optimizer polls, how many times planning needs to run. Therefore, even with multi-threading, this problem is actually rather difficult to deal with under the current framework. My suggestion is to add a strategy under the current framework.

That's why a parameter like max_planning_threads_per-group is necessary: it limits the concurrency of planning. We could decouple planning and polling into two different scheduling models. Polling is the producer-consumer model, and planning is a scheduling model based on SchedulingPolicy (by default, based on quota).

The only connection is that polling can trigger asynchronous planning. Once planning has been triggered, polling can simply return null, and a later iteration of the polling loop picks up a task after planning has finished.
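For the planning side, a quota-based policy could look roughly like the sketch below. The thread only says that SchedulingPolicy is quota-based by default; the interface shape, the TableQuotaSketch fields, and the rule "pick the table with the lowest occupied/target quota ratio that is not already being planned" are assumptions for illustration.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;

// Hypothetical sketch: field names and the selection rule are assumptions.
class TableQuotaSketch {
    final String name;
    final double targetQuota;   // share of group resources the table is entitled to
    final double occupiedQuota; // share the table has recently consumed
    final boolean planning;     // true while another thread is already planning this table

    TableQuotaSketch(String name, double targetQuota, double occupiedQuota, boolean planning) {
        if (targetQuota <= 0) {
            throw new IllegalArgumentException("targetQuota must be positive");
        }
        this.name = name;
        this.targetQuota = targetQuota;
        this.occupiedQuota = occupiedQuota;
        this.planning = planning;
    }

    double occupation() {
        return occupiedQuota / targetQuota;
    }
}

interface SchedulingPolicySketch {
    Optional<TableQuotaSketch> selectNext(Collection<TableQuotaSketch> pendingTables);
}

class QuotaPolicySketch implements SchedulingPolicySketch {
    @Override
    public Optional<TableQuotaSketch> selectNext(Collection<TableQuotaSketch> pendingTables) {
        return pendingTables.stream()
            .filter(t -> !t.planning) // never hand the same table to two planning threads
            .min(Comparator.comparingDouble(TableQuotaSketch::occupation));
    }
}
```

Each planning thread would call selectNext independently of polling, which is the decoupling described above: polling consumes tasks, while the policy decides which table is planned next.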

majin1102 added the type:improvement label Sep 11, 2023

majin1102 mentioned this issue Nov 10, 2023

[AMORO-1951] Support parallelized planning in one optimizer group #2282

Merged

3 tasks

majin1102 closed this as completed in #2282 Dec 5, 2023

zhoujinsong mentioned this issue Jun 25, 2024

Release-0.7.0 roadmap #2176

Closed

66 tasks
