Add Startup Taint Removal Feature #1588
    return nil
}

patchRemoveTaints := []JSONPatch{
@torredil mentioned this, but can we use https://pkg.go.dev/k8s.io/kubernetes/pkg/util/taints#RemoveTaint instead of JSON patch?
I originally thought that was possible, but sadly, not really:
Firstly, RemoveTaint requires a specific effect, so we'd have to repeat it for all three effects (and the code wouldn't be very future-proof: we'd have to update it whenever a new taint effect is added).
Secondly, and more importantly, RemoveTaint only updates the local representation of the node; it doesn't actually make any calls to the k8s API. Thus, you then need to push the entire node object, which (1) is way less efficient than just patching the taints and (2) possibly even introduces a race condition (if the node is modified by something else between us downloading it and attempting to update it).
Are there concerns with this implementation?
node.Spec.Taints = taintsToKeep
err = clientset.CoreV1().Nodes().Update(context.Background(), node, metav1.UpdateOptions{})
if err != nil {
return err
}
return nil
Yes, see my comment above:
Thus, you then need to push the entire node object, which (1) is way less efficient than just patching the taints and (2) possibly even introduces a race condition (if the node is modified by something else between us downloading it and attempting to update it).
"(2) possibly even introduces a race condition (if the node is modified by something else between us downloading it and attempting to update it)."
K8s requires that objects submitted in update requests contain a resourceVersion. The K8s API server verifies this field and will reject the request if there is a conflict, which should prevent race conditions.
"(1) is way less efficient than just patching the taints."
👍
/lgtm
Instead of creating the patch manually, can we DeepCopy the Node object, delete the taints, and use strategicpatch.CreateTwoWayMergePatch to create the patch data?
We could, although I'd rather not unless we have a strong reason to, because it spends unnecessary RAM and CPU creating a second copy of the node and transforming both copies to JSON for the patch.
That said, I'm open to either option: I'll leave this open for other feedback.
Why did we remove this CLI option?
Adding a knob to enable/disable this feature is unnecessary, because it has no impact on users that don't use it (besides a single call to the k8s API on startup, which we already make for node metadata anyway).
I'd still prefer an option for the feature, even if the default behavior is
/retest
/retest
Regarding the CLI option, this is a good mental model to follow, from @wmesard: "Every configuration option represents a failure of the software to do the right thing automatically. Every configuration option needs to be documented and protected by unit tests, thereby increasing the cognitive load of user and developer alike. Sometimes they are necessary, but only as a last resort."
Co-authored-by: Gengtao Xu <[email protected]> Signed-off-by: Connor Catlett <[email protected]>
/retest
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: torredil The full list of commands accepted by this bot can be found here. The pull request process is described here
/retest
Add Startup Taint Removal Feature
Is this a bug fix or adding new feature?
New feature
What is this PR about? / Why do we need it?
Implements a feature to remove a taint on driver startup to alleviate potential race conditions. Supersedes #1581; all credit for the design and initial implementation goes to @gtxu.
This PR differs from the original in a few meaningful ways:
What testing is done?
CI/Manual/New unit tests added