-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add leader transfer logic #63
Conversation
could you please also update the readme? |
@@ -39,6 +39,7 @@ It adds the following extra flags, | |||
| `--defrag-rule` | defragmentation rule (etcd-defrag will run defragmentation if the rule is empty or it is evaluated to true), defaults to empty. See more details below. | | |||
| `--dry-run` | evaluate whether or not endpoints require defragmentation, but don't actually perform it, defaults to `false`. | | |||
| `--exclude-localhost` | whether to exclude localhost endpoints, defaults to `false`. | | |||
| `--move-leader` | whether to move the leader to a randomly picked non-leader ID and make it the new leader, defaults to `false`. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `--move-leader` | whether to move the leader to a randomly picked non-leader ID and make it the new leader, defaults to `false`. | | |
| `--move-leader` | whether to move the leader to another follower before performing defragment on it to minimize potential performance impact, defaults to `false`. | |
@@ -70,6 +71,7 @@ Flags: | |||
--keepalive-time duration keepalive time for client connections (default 2s) | |||
--keepalive-timeout duration keepalive timeout for client connections (default 6s) | |||
--key string identify secure client using this TLS key file | |||
--move-leader whether to move the leader to a randomly picked non-leader ID and make it the new leader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, please update the description
@@ -24,6 +24,7 @@ func newDefragCommand() *cobra.Command { | |||
defragCmd.Flags().StringSliceVar(&globalCfg.endpoints, "endpoints", []string{"127.0.0.1:2379"}, "comma separated etcd endpoints") | |||
defragCmd.Flags().BoolVar(&globalCfg.useClusterEndpoints, "cluster", false, "use all endpoints from the cluster member list") | |||
defragCmd.Flags().BoolVar(&globalCfg.excludeLocalhost, "exclude-localhost", false, "whether to exclude localhost endpoints") | |||
defragCmd.Flags().BoolVar(&globalCfg.moveLeader, "move-leader", false, "whether to move the leader to a randomly picked non-leader ID and make it the new leader") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, please update the description
defer c.Close() | ||
|
||
ctx, cancel := commandCtx(gcfg.commandTimeout) | ||
defer cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment
defer c.Close() | |
ctx, cancel := commandCtx(gcfg.commandTimeout) | |
defer cancel() | |
ctx, cancel := commandCtx(gcfg.commandTimeout) | |
defer func() { | |
c.Close() | |
cancel() | |
}() |
ctx, cancel := commandCtx(gcfg.commandTimeout) | ||
defer cancel() | ||
|
||
fmt.Printf("Requesting leader at %s to transfer leadership to member ID %d...\n", leaderEp, transfereeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest not to print log in this utility function, please print log in main.go only if needed
fmt.Printf("Requesting leader at %s to transfer leadership to member ID %d...\n", leaderEp, transfereeID) |
return fmt.Errorf("failed to move leader: %w", err) | ||
} | ||
|
||
fmt.Println("successfully transferred leadership from", leaderEp, "to member ID ", transfereeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
fmt.Println("successfully transferred leadership from", leaderEp, "to member ID ", transfereeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Thank you.
Let me resolve the comments in a separate PR.
Refer to 5b38140 It also fixed a bug in this PR. When there is only one member in the cluster, we shouldn't move leader at all. |
--move-leader
flag.--continue-on-error
flag.This change mitigates the risk of outages caused by defragmenting the leader node in an etcd cluster.
Resolves #58