Update the Supervisor endpoint to not restart the Supervisor if the spec was unmodified #17707
…pec was unmodified
server/src/main/java/org/apache/druid/metadata/MetadataSupervisorManager.java
@aho135 , thanks for the PR. It makes sense to not update/restart the supervisor if not required.
I have left some minor feedback on the approach.
...g-service/src/main/java/org/apache/druid/indexing/overlord/supervisor/SupervisorManager.java
Outdated
...-service/src/main/java/org/apache/druid/indexing/overlord/supervisor/SupervisorResource.java
Outdated
...g-service/src/main/java/org/apache/druid/indexing/overlord/supervisor/SupervisorManager.java
Outdated
server/src/main/java/org/apache/druid/metadata/MetadataSupervisorManager.java
Outdated
Thank you for these changes, @aho135! I think we would benefit from a change where we check whether the spec has changed. If it hasn't, we still restart the supervisor, but we do not go to the metadata store and add an unnecessary entry to the spec history. Otherwise, the flow remains unchanged. I think @kfaraz has suggested this as well. I also wanted to understand whether the problem is only the metadata operations, which add an unneeded entry, or whether the supervisor restart itself is also problematic. If it is just the first case, is a feature flag really needed? If you still believe that the supervisor operation is wasteful and want to introduce a flag, please add the relevant docs in
Thanks for the review, @AmatyaAvadhanula! My original motivation for this change was to avoid unnecessary restarts of the Supervisor where possible. Our use case is that we maintain a repository of schemas and do periodic releases. It is often unclear which schemas were actually modified. We want to be able to submit them all and restart only the Supervisors whose schemas were updated. This lets us avoid the undesirable side effects of task restarts, such as small segments. With this use case in mind, I think that having the feature flag does make sense. I will add an update in the relevant doc.
…rvisorManager, update Supervisor API ref
Hi @kfaraz @AmatyaAvadhanula, I've made the changes you suggested and added some unit tests. I would appreciate it if you could take another look. Thank you!
@kfaraz I've added an additional check for retention rules to avoid metastore updates if the retention rules were unchanged. This brings the behavior in line with the compaction config endpoint: https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/CoordinatorConfigManager.java#L110-L111
Every language defaults boolean values to false and many treat null as false. Maybe a small thing but I feel like the feature flag should default to false when not present. Introducing the new flag as something like
Thanks for the review, @JRobTS. Agreed that defaulting the flag to false makes more sense. I updated this in the subsequent PR. Please let me know what you think!
docs/api-reference/supervisor-api.md
Outdated
- #### Sample request with restartIfUnmodified
- The following example sets the restartIfUnmodified flag to false. With this flag set to false, the Supervisor will only restart if there has been a modification to the SupervisorSpec.
+ #### Sample request with skipRestartIfUnmodified
+ The following example sets the skipRestartIfUnmodified flag to true. With this flag set to false, the Supervisor will only restart if there has been a modification to the SupervisorSpec.
Typo in the 2nd sentence: "With this flag set to false ..."
Thanks, good catch @JRobTS! This was updated in the subsequent commit.
Thanks for the update on this, @aho135 .
log.error(e,
    "Failed to upgrade pending segment[%s] to new pending segment[%s] on Supervisor[%s].",
    upgradedPendingSegment.getUpgradedFromSegmentId(),
    upgradedPendingSegment.getId().getVersion(),
    supervisorId
);
- log.error(e,
-     "Failed to upgrade pending segment[%s] to new pending segment[%s] on Supervisor[%s].",
-     upgradedPendingSegment.getUpgradedFromSegmentId(),
-     upgradedPendingSegment.getId().getVersion(),
-     supervisorId
- );
+ log.error(
+     e,
+     "Failed to upgrade pending segment[%s] to new pending segment[%s] on Supervisor[%s].",
+     upgradedPendingSegment.getUpgradedFromSegmentId(),
+     upgradedPendingSegment.getId().getVersion(),
+     supervisorId
+ );
@Context final HttpServletRequest req,
@QueryParam("skipRestartIfUnmodified") boolean skipRestartIfUnmodified
Style: Make the request the last argument of this method.
@Context final HttpServletRequest req,
@QueryParam("skipRestartIfUnmodified") boolean skipRestartIfUnmodified
Use a boxed `Boolean` instead and handle nulls to make the default behaviour more obvious.
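A minimal sketch of what the boxed-`Boolean` suggestion could look like, kept free of JAX-RS so it runs on its own. The class `SkipRestartFlag` and the method `resolve` are hypothetical names used only for illustration; they are not part of the PR or of Druid.

```java
// Hypothetical helper, not taken from the PR: shows how a boxed Boolean
// lets the code state the default explicitly instead of relying on the
// implicit false of a primitive boolean query parameter.
final class SkipRestartFlag
{
  private SkipRestartFlag()
  {
  }

  /**
   * Returns the effective value of the flag.
   * A missing query parameter arrives as null and is treated as false.
   */
  static boolean resolve(Boolean skipRestartIfUnmodified)
  {
    return skipRestartIfUnmodified != null && skipRestartIfUnmodified;
  }

  public static void main(String[] args)
  {
    System.out.println("absent  -> " + resolve(null));
    System.out.println("true    -> " + resolve(Boolean.TRUE));
    System.out.println("false   -> " + resolve(Boolean.FALSE));
  }
}
```

With the null check in one place, a reader of the resource method can see the default without knowing how the framework binds missing primitives.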
@@ -60,6 +62,7 @@
@RunWith(EasyMockRunner.class)
public class SupervisorManagerTest extends EasyMockSupport
{
  private static final ObjectMapper MAPPER = new DefaultObjectMapper();
add an empty line after this
  manager.start();
  Assert.assertFalse(manager.shouldUpdateSupervisor(spec));
  Assert.assertTrue(manager.shouldUpdateSupervisor(spec2));
}
add an empty line after this.
@@ -2409,6 +2409,7 @@ ddSketch
DDSketch
druid-ddsketch
numBins
skipRestartIfUnmodified
You wouldn't need this spelling entry if you use `skipRestartIfUnmodified` (with backquotes) instead of skipRestartIfUnmodified in the docs.
@@ -2353,6 +2353,63 @@ Content-Length: 1359
</TabItem>
</Tabs>


#### Sample request with skipRestartIfUnmodified
The following example sets the skipRestartIfUnmodified flag to true. With this flag set to true, the Supervisor will only restart if there has been a modification to the SupervisorSpec.
- The following example sets the skipRestartIfUnmodified flag to true. With this flag set to true, the Supervisor will only restart if there has been a modification to the SupervisorSpec.
+ The following example sets the `skipRestartIfUnmodified` flag to true. With this flag set to true, the Supervisor will only restart if there has been a modification to the SupervisorSpec.
 * Checks whether the submitted SupervisorSpec differs from the current spec in SupervisorManager's supervisor list.
 * This is used in SupervisorResource specPost to determine whether the Supervisor needs to be restarted
 * @param spec The spec submitted
 * @return boolean - false if the spec is unchanged, else true
-  * @return boolean - false if the spec is unchanged, else true
+  * @return true only if the spec has been modified, false otherwise
if (currentSupervisor != null &&
    Arrays.equals(specAsBytes, jsonMapper.writeValueAsBytes(currentSupervisor.rhs))
) {
  return false;
}
you may simplify this as follows:
- if (currentSupervisor != null &&
-     Arrays.equals(specAsBytes, jsonMapper.writeValueAsBytes(currentSupervisor.rhs))
- ) {
-   return false;
- }
+ return currentSupervisor == null
+        || !Arrays.equals(specAsBytes, jsonMapper.writeValueAsBytes(currentSupervisor.rhs));
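The byte-level comparison in this suggestion can be illustrated in isolation. The sketch below assumes both specs have already been serialized to bytes by the same deterministic serializer (in the PR that role is played by `jsonMapper.writeValueAsBytes`); the class `SpecComparison` and its method are hypothetical names, not Druid code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the check discussed above: a spec needs an
// update when there is no current spec, or when the serialized forms differ.
final class SpecComparison
{
  private SpecComparison()
  {
  }

  /**
   * @param submittedSpec serialized bytes of the newly submitted spec
   * @param currentSpec   serialized bytes of the running spec, or null if none exists
   * @return true only if the spec has been modified (or is new), false otherwise
   */
  static boolean shouldUpdate(byte[] submittedSpec, byte[] currentSpec)
  {
    return currentSpec == null || !Arrays.equals(submittedSpec, currentSpec);
  }

  public static void main(String[] args)
  {
    byte[] running = "{\"id\":\"wiki\",\"taskCount\":1}".getBytes(StandardCharsets.UTF_8);
    byte[] same = "{\"id\":\"wiki\",\"taskCount\":1}".getBytes(StandardCharsets.UTF_8);
    byte[] changed = "{\"id\":\"wiki\",\"taskCount\":2}".getBytes(StandardCharsets.UTF_8);

    System.out.println("new spec:       " + shouldUpdate(same, null));
    System.out.println("unchanged spec: " + shouldUpdate(same, running));
    System.out.println("changed spec:   " + shouldUpdate(changed, running));
  }
}
```

Comparing bytes only works if equal specs always serialize identically; raw JSON text with reordered fields or different whitespace would be reported as changed.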
 * @param spec The spec submitted
 * @return boolean - false if the spec is unchanged, else true
 */
public boolean shouldUpdateSupervisor(SupervisorSpec spec)
Based on the suggestion from @AmatyaAvadhanula, you can also update the createOrUpdate method to create a new entry in DB only if needed.
/**
 * Creates or updates a supervisor and then starts it.
 * If no change has been made to the supervisor spec, it is only restarted.
 *
 * @return true if the supervisor was updated, false otherwise
 */
public boolean createOrUpdateAndStartSupervisor(SupervisorSpec spec)
{
  Preconditions.checkState(started, "SupervisorManager not started");
  Preconditions.checkNotNull(spec, "spec");
  Preconditions.checkNotNull(spec.getId(), "spec.getId()");
  Preconditions.checkNotNull(spec.getDataSources(), "spec.getDatasources()");
  synchronized (lock) {
    Preconditions.checkState(started, "SupervisorManager not started");
    // Persist a new version of the spec only if it has been updated
    final boolean shouldUpdateSpec = shouldUpdateSupervisor(spec);
    possiblyStopAndRemoveSupervisorInternal(spec.getId(), false);
    createAndStartSupervisorInternal(spec, shouldUpdateSpec);
    return shouldUpdateSpec;
  }
}
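The behaviour this suggestion describes (always restart, but write a history entry only when the spec changed) can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: `SpecStoreSketch`, its string-based specs, and the list used as "history" replace the real `SupervisorManager`, `SupervisorSpec`, and metadata store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of "restart always, persist only on change".
final class SpecStoreSketch
{
  private final Map<String, String> running = new HashMap<>();
  private final List<String> history = new ArrayList<>();
  private int restarts = 0;

  /**
   * Replaces the running spec for the given id and counts a restart.
   * A history entry is appended only if the spec differs from the running one.
   *
   * @return true if the spec was updated, false if it was only restarted
   */
  synchronized boolean createOrUpdateAndStart(String id, String specJson)
  {
    final boolean shouldUpdateSpec = !specJson.equals(running.get(id));
    if (shouldUpdateSpec) {
      history.add(id + ":" + specJson); // stands in for the metadata store write
    }
    running.put(id, specJson); // stands in for stop + start of the supervisor
    restarts++;
    return shouldUpdateSpec;
  }

  synchronized int historySize()
  {
    return history.size();
  }

  synchronized int restartCount()
  {
    return restarts;
  }

  public static void main(String[] args)
  {
    SpecStoreSketch store = new SpecStoreSketch();
    System.out.println(store.createOrUpdateAndStart("wiki", "{\"taskCount\":1}"));
    System.out.println(store.createOrUpdateAndStart("wiki", "{\"taskCount\":1}"));
    System.out.println("history=" + store.historySize() + ", restarts=" + store.restartCount());
  }
}
```

Submitting the same spec twice leaves one history entry but two restarts, which is the flow described in the review comment when no skip flag is involved.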
Description
This PR adds an optional query parameter called restartIfUnmodified to the /druid/indexer/v1/supervisor endpoint. The caller can optionally set restartIfUnmodified=false so that the supervisor is not restarted if the spec is unchanged. Multiple members of the community mentioned that they maintain their own scripts to check whether the spec has changed before submitting to the endpoint: https://apachedruidworkspace.slack.com/archives/C0303FDCZEZ/p1738017586080509
For those that rely on this endpoint for restarting the supervisor, the behavior remains unchanged as restartIfUnmodified defaults to true.
This PR also helps avoid unnecessary updates to the metastore when updating retention rules by first checking if the rules were updated.
Release note
Adds an optional query parameter called restartIfUnmodified to the /druid/indexer/v1/supervisor endpoint. Callers can set restartIfUnmodified=false to not restart the supervisor if the spec is unchanged. Example:
curl -X POST --header "Content-Type: application/json" -d @supervisor.json "localhost:8888/druid/indexer/v1/supervisor?restartIfUnmodified=false"
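For callers that submit specs from Java rather than curl, the same request could be built with the JDK's `java.net.http` API (Java 11+). This is a sketch only: the class `SupervisorRequestSketch` is hypothetical, the host and port are copied from the curl example, and the flag name is passed in as an argument because the review thread renames `restartIfUnmodified` to `skipRestartIfUnmodified`. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; builds the POST request without sending it.
final class SupervisorRequestSketch
{
  private SupervisorRequestSketch()
  {
  }

  /**
   * @param specJson  the supervisor spec as a JSON string
   * @param flagName  query parameter name, e.g. "restartIfUnmodified"
   * @param flagValue value to send for that parameter
   */
  static HttpRequest build(String specJson, String flagName, boolean flagValue)
  {
    // Host and port are taken from the curl example and are an assumption.
    final URI uri = URI.create(
        "http://localhost:8888/druid/indexer/v1/supervisor?" + flagName + "=" + flagValue
    );
    return HttpRequest.newBuilder(uri)
                      .header("Content-Type", "application/json")
                      .POST(HttpRequest.BodyPublishers.ofString(specJson))
                      .build();
  }

  public static void main(String[] args)
  {
    HttpRequest request = build("{}", "restartIfUnmodified", false);
    System.out.println(request.method() + " " + request.uri());
  }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.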
Key changed/added classes in this PR
SupervisorResource
SupervisorManager
SupervisorResourceTest
SQLMetadataRuleManager
SQLMetadataRuleManagerTest