forked from RamenDR/ramen
Syncing latest changes from main for ramen #277
Merged
Add ConfigMap under dr-cluster kustomize transformer to update label "app: ramen-dr-cluster"

Signed-off-by: rakeshgm <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
When fetching the same cache item concurrently, for example from the same addon on 2 clusters, or when an addon and the fetch cron job run concurrently, one fetcher can delete the temporary file used by the other fetcher, causing this error:

    drenv.commands.Error: Command failed:
       command: ('addons/rook-cephfs/start', 'dr1')
       exitcode: 1
       error:
          Traceback (most recent call last):
            File "/home/.../go/src/github.com/ramendr/ramen/test/addons/rook-cephfs/start", line 46, in <module>
              deploy(cluster)
            File "/home/.../go/src/github.com/ramendr/ramen/test/addons/rook-cephfs/start", line 17, in deploy
              cache.fetch(".", path)
            File "/home/.../go/src/github.com/ramendr/ramen/test/drenv/cache.py", line 28, in fetch
              os.rename(tmp, dest)
          FileNotFoundError: [Errno 2] No such file or directory: '/home/.../.cache/drenv/addons/rook-cephfs.yaml.tmp' -> '/home/.../.cache/drenv/addons/rook-cephfs.yaml'

Fixed by using a temporary file per process. If we have 2 fetchers, the last one wins, renaming its temporary file to the actual cache file.
Example run with multiple fetchers:

    $ drenv clear
    2024-05-13 00:15:59,145 INFO    [main] Clearing cache
    2024-05-13 00:15:59,146 INFO    [main] Cache cleared in 0.00 seconds

    $ for i in 1 2 3 4; do (drenv fetch envs/regional-dr.yaml &); done
    2024-05-13 00:15:59,318 INFO    [rdr] Fetching
    2024-05-13 00:15:59,320 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,321 INFO    [rdr] Fetching
    2024-05-13 00:15:59,322 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,322 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,323 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,325 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,327 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,333 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,341 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,345 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,356 INFO    [rdr] Fetching
    2024-05-13 00:15:59,365 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,371 INFO    [rdr] Fetching
    2024-05-13 00:15:59,374 INFO    [rdr] Running addons/rook-operator/fetch
    2024-05-13 00:15:59,377 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,378 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,388 INFO    [rdr] Running addons/rook-cluster/fetch
    2024-05-13 00:15:59,391 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,395 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,397 INFO    [rdr] Running addons/rook-cephfs/fetch
    2024-05-13 00:15:59,411 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:15:59,412 INFO    [rdr] Running addons/csi-addons/fetch
    2024-05-13 00:15:59,414 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,418 INFO    [rdr] Running addons/recipe/fetch
    2024-05-13 00:15:59,419 INFO    [rdr] Running addons/rook-toolbox/fetch
    2024-05-13 00:15:59,450 INFO    [rdr] Running addons/ocm-controller/fetch
    2024-05-13 00:16:00,521 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.20 seconds
    2024-05-13 00:16:00,638 INFO    [rdr] addons/csi-addons/fetch completed in 1.26 seconds
    2024-05-13 00:16:00,793 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.47 seconds
    2024-05-13 00:16:00,804 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.41 seconds
    2024-05-13 00:16:00,830 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.51 seconds
    2024-05-13 00:16:00,831 INFO    [rdr] addons/csi-addons/fetch completed in 1.51 seconds
    2024-05-13 00:16:00,922 INFO    [rdr] addons/rook-cluster/fetch completed in 1.54 seconds
    2024-05-13 00:16:00,938 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.52 seconds
    2024-05-13 00:16:00,987 INFO    [rdr] addons/rook-cephfs/fetch completed in 1.66 seconds
    2024-05-13 00:16:01,106 INFO    [rdr] addons/rook-toolbox/fetch completed in 1.69 seconds
    2024-05-13 00:16:01,130 INFO    [rdr] addons/rook-cluster/fetch completed in 1.81 seconds
    2024-05-13 00:16:01,191 INFO    [rdr] addons/csi-addons/fetch completed in 1.86 seconds
    2024-05-13 00:16:01,234 INFO    [rdr] addons/rook-cluster/fetch completed in 1.91 seconds
    2024-05-13 00:16:01,267 INFO    [rdr] addons/rook-cluster/fetch completed in 1.88 seconds
    2024-05-13 00:16:01,314 INFO    [rdr] addons/csi-addons/fetch completed in 1.90 seconds
    2024-05-13 00:16:01,414 INFO    [rdr] addons/rook-cephfs/fetch completed in 2.02 seconds
    2024-05-13 00:16:01,591 INFO    [rdr] addons/recipe/fetch completed in 2.25 seconds
    2024-05-13 00:16:01,597 INFO    [rdr] addons/recipe/fetch completed in 2.27 seconds
    2024-05-13 00:16:01,696 INFO    [rdr] addons/recipe/fetch completed in 2.31 seconds
    2024-05-13 00:16:01,938 INFO    [rdr] addons/recipe/fetch completed in 2.52 seconds
    2024-05-13 00:16:02,094 INFO    [rdr] addons/rook-operator/fetch completed in 2.73 seconds
    2024-05-13 00:16:02,248 INFO    [rdr] addons/rook-operator/fetch completed in 2.87 seconds
    2024-05-13 00:16:02,252 INFO    [rdr] addons/rook-operator/fetch completed in 2.93 seconds
    2024-05-13 00:16:02,321 INFO    [rdr] addons/rook-operator/fetch completed in 3.00 seconds
    2024-05-13 00:16:05,471 INFO    [rdr] addons/ocm-controller/fetch completed in 6.02 seconds
    2024-05-13 00:16:05,472 INFO    [rdr] Fetching finishied in 6.10 seconds
    2024-05-13 00:16:05,918 INFO    [rdr] addons/ocm-controller/fetch completed in 6.51 seconds
    2024-05-13 00:16:05,919 INFO    [rdr] Fetching finishied in 6.56 seconds
    2024-05-13 00:16:06,020 INFO    [rdr] addons/ocm-controller/fetch completed in 6.69 seconds
    2024-05-13 00:16:06,021 INFO    [rdr] Fetching finishied in 6.70 seconds
    2024-05-13 00:16:06,394 INFO    [rdr] addons/ocm-controller/fetch completed in 7.05 seconds
    2024-05-13 00:16:06,394 INFO    [rdr] Fetching finishied in 7.07 seconds

Fixes: RamenDR#1386
Signed-off-by: Nir Soffer <[email protected]>
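The per-process temporary file idea from the fetch fix can be sketched roughly as follows. This is a minimal illustration, not the actual drenv/cache.py code: the `store` helper and the pid-based temporary name are assumptions made for the example.

```python
import os


def store(dest, data):
    """Write data to dest via a temporary file unique to this process."""
    # Hypothetical naming scheme: including the pid means two concurrent
    # fetchers never share (or delete) each other's temporary file.
    tmp = f"{dest}.{os.getpid()}.tmp"
    try:
        with open(tmp, "wb") as f:
            f.write(data)
        # Atomic on POSIX; with concurrent writers the last rename wins.
        os.replace(tmp, dest)
    finally:
        # Clean up only our own temporary file if the rename did not happen.
        if os.path.exists(tmp):
            os.remove(tmp)
```

With this shape, a fetcher that loses the race simply has its result overwritten by the later rename, and no process touches a temporary file it did not create.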
The csi-hostpath-driver and volumesnapshots addons start much slower with minikube 1.33. Replacing them with rook ceph rbd storage, the kubevirt environments start up to 1.93 times faster.

Start times before and after this change:

| env          | local before | local after | lab before | lab after |
|--------------|--------------|-------------|------------|-----------|
| rdr-kubevirt |          600 |         475 |        920 |       603 |
| kubevirt     |          270 |         230 |        603 |       312 |

Signed-off-by: Nir Soffer <[email protected]>
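For reference, the "up to 1.93 times faster" figure can be reproduced from the start times in the table. The snippet below is only a worked check of that arithmetic; the unit is presumably seconds, which the table itself does not state.

```python
# Start times copied from the kubevirt table (before, after).
times = {
    ("rdr-kubevirt", "local"): (600, 475),
    ("rdr-kubevirt", "lab"): (920, 603),
    ("kubevirt", "local"): (270, 230),
    ("kubevirt", "lab"): (603, 312),
}

# Speedup is the ratio of the old start time to the new one.
speedups = {key: before / after for key, (before, after) in times.items()}
best = max(speedups, key=speedups.get)

for (env, where), ratio in speedups.items():
    print(f"{env:12} {where:5} {ratio:.2f}x")
```

The largest ratio is the kubevirt environment in the lab, 603 / 312.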
It is easier to debug issues with a minimal environment. With rook-cephfs and the required volumesnapshots minikube addon, the rook environment is less minimal, but it is still quicker to start compared with the full regional-dr environment.

Example run:

    $ drenv start envs/rook.yaml
    2024-05-12 21:44:00,426 INFO    [rook] Starting environment
    2024-05-12 21:44:00,483 INFO    [dr1] Starting minikube cluster
    2024-05-12 21:44:00,483 INFO    [dr2] Starting minikube cluster
    2024-05-12 21:44:38,650 INFO    [dr1] Cluster started in 38.17 seconds
    2024-05-12 21:44:39,090 INFO    [dr1/0] Running addons/rook-operator/start
    2024-05-12 21:44:39,090 INFO    [dr1/1] Running addons/csi-addons/start
    2024-05-12 21:44:59,732 INFO    [dr2] Cluster started in 59.25 seconds
    2024-05-12 21:45:00,218 INFO    [dr2/0] Running addons/rook-operator/start
    2024-05-12 21:45:00,218 INFO    [dr2/1] Running addons/csi-addons/start
    2024-05-12 21:45:08,913 INFO    [dr1/1] addons/csi-addons/start completed in 29.82 seconds
    2024-05-12 21:45:13,552 INFO    [dr1/0] addons/rook-operator/start completed in 34.46 seconds
    2024-05-12 21:45:13,552 INFO    [dr1/0] Running addons/rook-cluster/start
    2024-05-12 21:45:30,186 INFO    [dr2/1] addons/csi-addons/start completed in 29.97 seconds
    2024-05-12 21:45:41,444 INFO    [dr2/0] addons/rook-operator/start completed in 41.23 seconds
    2024-05-12 21:45:41,444 INFO    [dr2/0] Running addons/rook-cluster/start
    2024-05-12 21:46:21,806 INFO    [dr1/0] addons/rook-cluster/start completed in 68.25 seconds
    2024-05-12 21:46:21,806 INFO    [dr1/0] Running addons/rook-toolbox/start
    2024-05-12 21:46:25,669 INFO    [dr1/0] addons/rook-toolbox/start completed in 3.86 seconds
    2024-05-12 21:46:25,669 INFO    [dr1/0] Running addons/rook-pool/start
    2024-05-12 21:46:40,768 INFO    [dr1/0] addons/rook-pool/start completed in 15.10 seconds
    2024-05-12 21:46:40,768 INFO    [dr1/0] Running addons/rook-cephfs/start
    2024-05-12 21:47:01,116 INFO    [dr2/0] addons/rook-cluster/start completed in 79.67 seconds
    2024-05-12 21:47:01,116 INFO    [dr2/0] Running addons/rook-toolbox/start
    2024-05-12 21:47:01,689 INFO    [dr1/0] addons/rook-cephfs/start completed in 20.92 seconds
    2024-05-12 21:47:01,689 INFO    [dr1/0] Running addons/rook-cephfs/test
    2024-05-12 21:47:04,421 INFO    [dr2/0] addons/rook-toolbox/start completed in 3.31 seconds
    2024-05-12 21:47:04,421 INFO    [dr2/0] Running addons/rook-pool/start
    2024-05-12 21:47:08,994 INFO    [dr1/0] addons/rook-cephfs/test completed in 7.30 seconds
    2024-05-12 21:47:29,597 INFO    [dr2/0] addons/rook-pool/start completed in 25.18 seconds
    2024-05-12 21:47:29,597 INFO    [dr2/0] Running addons/rook-cephfs/start
    2024-05-12 21:47:44,236 INFO    [dr2/0] addons/rook-cephfs/start completed in 14.64 seconds
    2024-05-12 21:47:44,236 INFO    [dr2/0] Running addons/rook-cephfs/test
    2024-05-12 21:47:51,296 INFO    [dr2/0] addons/rook-cephfs/test completed in 7.06 seconds
    2024-05-12 21:47:51,296 INFO    [rook/0] Running addons/rbd-mirror/start
    2024-05-12 21:48:41,169 INFO    [rook/0] addons/rbd-mirror/start completed in 49.87 seconds
    2024-05-12 21:48:41,169 INFO    [rook/0] Running addons/rbd-mirror/test
    2024-05-12 21:48:52,317 INFO    [rook/0] addons/rbd-mirror/test completed in 11.15 seconds
    2024-05-12 21:48:52,317 INFO    [rook] Environment started in 291.89 seconds

Signed-off-by: Nir Soffer <[email protected]>
We added csi-hostpath-driver as a quick temporary solution until we have cephfs storage. Now that we have it, we can replace it and enjoy reduced start time, in particular with minikube 1.33.

To replace csi-hostpath-driver, we have to add cephfs to the volsync development environment. This is slower locally, but faster in the e2e lab. For regional-dr, this is always faster, up to 1.82 times faster in the e2e lab.

The main difference is cluster start time - minikube addons are loaded before minikube start returns.

Before:

    2024-05-12 23:01:42,844 INFO    [dr2] Cluster started in 433.20 seconds
    2024-05-12 23:02:07,215 INFO    [dr1] Cluster started in 457.57 seconds

After:

    2024-05-12 23:18:13,386 INFO    [hub] Cluster started in 71.87 seconds
    2024-05-12 23:18:46,943 INFO    [dr2] Cluster started in 105.43 seconds

Start time before and after this change:

| env          | local before | local after | lab before | lab after |
|--------------|--------------|-------------|------------|-----------|
| regional-dr  |          636 |         426 |        780 |       427 |
| volsync      |          261 |         352 |        520 |       395 |

Signed-off-by: Nir Soffer <[email protected]>
Signed-off-by: Elena Gershkovich <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Looks like a recent change in pylint triggers this incorrect report:

    drenv/commands.py:234:28: E0606: Possibly using variable 'input_view' before assignment (possibly-used-before-assignment)

This cannot happen since we don't register proc.stdin if input is None, so when we reach this block input_view is assigned. However, disabling the check risks missing a real issue in that block.

Let's change the code so pylint can understand it better. This also makes it easier for humans to understand. The cost is negligible: adding 2 temporary variables even when they are never used.

Signed-off-by: Nir Soffer <[email protected]>
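The shape of such a change can be sketched as below. This is not the code from drenv/commands.py: `write_all`, `write`, and `chunk_size` are hypothetical names used only to show variables being assigned on every path so a static checker never sees a possibly-unassigned use.

```python
def write_all(data, write, chunk_size=4):
    """
    Feed data to write() in chunks; write() stands in for writing to a
    child process stdin and returns the number of bytes it consumed.
    """
    # Assigned unconditionally, even when data is None and they are never
    # used, so the checker sees both names defined on every path.
    input_view = memoryview(data or b"")
    input_offset = 0

    while input_offset < len(input_view):
        chunk = input_view[input_offset:input_offset + chunk_size]
        input_offset += write(chunk)

    return input_offset
```

The alternative, assigning `input_view` only inside an `if data is not None:` branch and using it later under a condition the checker cannot correlate, is what typically triggers E0606.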
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Signed-off-by: jacklu <[email protected]>
Minikube v1.33.1 includes the fixes we added recently for v1.33.0, so we don't need to set up or clean up anything. We can remove the code and require users and developers to upgrade to the latest version, but it is nicer to make this transparent and skip the unneeded configuration.

We can remove the special fixes for minikube 1.33.0 later, perhaps when 1.34 is released.

Example run with minikube 1.33.1:

    $ drenv setup -v
    2024-05-18 00:19:54,127 INFO    [main] Setting up minikube for drenv
    2024-05-18 00:19:54,152 DEBUG   [minikube] Using minikube version 1.33.1
    2024-05-18 00:19:54,152 DEBUG   [minikube] Skipping sysctl configuration
    2024-05-18 00:19:54,153 DEBUG   [minikube] Skipping systemd-resolved configuration

Signed-off-by: Nir Soffer <[email protected]>
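A version gate of this kind might look like the sketch below. The helper name `needs_workarounds` is hypothetical, and the sketch assumes plain "X.Y.Z" version strings with an optional leading "v"; how drenv actually obtains and compares the minikube version is not shown in this message.

```python
def needs_workarounds(version):
    """
    Return True only for minikube 1.33.0, the release the special
    configuration was written for (hypothetical helper).
    """
    # Assumes a plain "X.Y.Z" string, optionally prefixed with "v".
    parts = tuple(int(x) for x in version.lstrip("v").split("."))
    return parts == (1, 33, 0)


for v in ("1.33.0", "v1.33.1"):
    action = "Applying" if needs_workarounds(v) else "Skipping"
    print(f"Using minikube version {v}: {action} sysctl configuration")
```

Keeping the check in one small function makes it easy to delete once 1.33.0 no longer needs to be supported.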
Signed-off-by: Nir Soffer <[email protected]>
The note is correct but not helpful at this point. Let's drop unnecessary details like we did for docs/user-quick-start.md. Signed-off-by: Nir Soffer <[email protected]>
Fixes: RamenDR#1260 Signed-off-by: Abhijeet Shakya <[email protected]>
Signed-off-by: Abhijeet Shakya <[email protected]>
Signed-off-by: Abhijeet Shakya <[email protected]>
This change will start using cache for kustomization resources, so starting the addon can directly use the cached resources.

Changes:
- drenv fetch can be used to fetch resources anytime.
- Starting an addon will first try to fetch resources, then apply the fetched resources. If there is no change, fetch won't do anything, so it takes very little time.

Fixes: RamenDR#1337
Signed-off-by: Abhijeet Shakya <[email protected]>
Since we upgraded, the e2e job is failing[1] (due to a bug in the e2e integration, the job does not fail!). Let's try to go back to olm 0.22 since we know it worked before this change[2].

[1] last good build: https://github.com/RamenDR/ramen/actions/runs/9134579476/job/25120395289
[2] first bad build: https://github.com/RamenDR/ramen/actions/runs/9158274985/job/25177239838

Signed-off-by: Nir Soffer <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
This is required because we can have two PVCs with the same name when multinamespace support is enabled. Signed-off-by: Raghavendra Talur <[email protected]>
Also, create the rd and rs in the same namespace as the PVC and not VRG. Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Now that we have a basic test running, we want to fail the workflow if the tests fail. Without this, people assume that code changes passed the tests.

Signed-off-by: Nir Soffer <[email protected]>
kubeObjectsRecoveryStartOrResume() error handling is very confusing - the code tries to avoid duplicating error handling in 2 unrelated code paths (ok=true, ok=false), leading to referencing a nil request when ok is false.

We need to refactor this later; for now just skip cleanup if there is nothing to clean up.

Bug: https://bugzilla.redhat.com/2282284
Signed-off-by: Nir Soffer <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: df-build-team The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
PR containing the latest commits from main branch