
[Feat] Relax requirement of running CI on main branch for every single commit #5150

Open

echoix opened this issue Feb 19, 2025 · 3 comments

Labels: CI (Continuous integration), enhancement (New feature or request)

@echoix (Member) commented Feb 19, 2025

Is your feature request related to a problem? Please describe.

When we get a lot of active contributions, combined with the GitHub Actions CI activity of other OSGeo repos, the queue time for new work can become long, sometimes 6 hours. That's a happy problem to have (too many contributions).

I want to launch a discussion to rethink our self-imposed requirement that every single merge to main runs all CI jobs.

Recently, with the Docker changes, the concurrency groups were set so that only one run can be in progress and one queued. Merging more PRs while a Docker build is already in progress will not cancel the active build (unlike on PRs), but will queue it so that no more than one runs at a time. If another commit was already waiting to be built, the non-started build is cancelled, so only one is ever waiting.
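As a sketch of that pattern (the workflow and group names are illustrative, not copied from the actual Docker workflow), a GitHub Actions concurrency group with `cancel-in-progress: false` keeps the active run alive and lets a newer push replace only the queued, not-yet-started run:

```yaml
# Hypothetical workflow fragment illustrating the concurrency setup.
name: Docker build
on:
  push:
    branches: [main]

concurrency:
  # One group per branch: one run in progress, at most one queued.
  group: docker-${{ github.ref }}
  # Do not cancel the run that is already building; a newly pushed
  # commit replaces the *queued* run instead.
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... docker build steps ...
```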

For Docker builds, that makes sense.

My open question is: where else can we apply this pattern? What else can be "saved" by a check in a later run?

I still need to test whether a cancelled, non-started job can be relaunched if a problem is found later.

@echoix echoix added CI Continuous integration enhancement New feature or request labels Feb 19, 2025
@wenzeslaus (Member) commented:
If the tradeoff is between waiting and being sure about what breaks the build, I'm for the clarity rather than speed.

Hm, perhaps we have a use case: General linting / GitHub Super Linter (push) is failing on main while it passed in the PR. Somehow this slipped through the PR checks and presumably also local pre-commit. (Now fixed again.)

@echoix (Member, Author) commented Feb 20, 2025

> If the tradeoff is between waiting and being sure about what breaks the build, I'm for the clarity rather than speed.
>
> Hm, perhaps we have a use case: General linting / GitHub Super Linter (push) is failing on main while it passed in the PR. Somehow this slipped through the PR checks and presumably also local pre-commit. (Now fixed again.)

That is probably due to a setting like VALIDATE_ALL_CODEBASE, i.e., checking only the files changed in a PR versus checking the whole repo. Checking only changed files is notably something hard to "get right" for everyone's expectations, as those expectations don't really all intersect. MegaLinter takes a different approach, and pre-commit too. One possible problem, and you just witnessed it, is that when you change a config file, the errors won't be in the changed file but in other, untouched files.
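For reference, Super-linter's scope is controlled by the VALIDATE_ALL_CODEBASE environment variable. A minimal sketch (the job name, action version, and trigger logic are assumptions, not the repo's actual workflow) of checking only changed files in PRs but the whole repo on pushes to main:

```yaml
# Hypothetical fragment; adapt names and versions to the real workflow.
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Super-linter needs full history to diff against the default branch.
          fetch-depth: 0
      - uses: super-linter/super-linter@v7
        env:
          DEFAULT_BRANCH: main
          # Lint everything on pushes to main; only changed files in PRs.
          VALIDATE_ALL_CODEBASE: ${{ github.event_name == 'push' }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Note the config-file caveat from above still applies: with changed-files-only linting, an edit to a linter config can break files the run never looks at.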

Super-linter changed its method of determining changed files not too long ago (within the last year or so); I don't remember whether it used git before or after the change. In any case, once merged, Super-linter runs on main, and the config sets main as the default branch to compare against, so I assume that instead of checking nothing, it checks everything.

With local pre-commit, when changing config files like that, I would use pre-commit run -a to run on all files (note: "all files" is not exactly all files; some are still excluded, influenced by the .gitignore). That's why I configured the advanced checks workflow to run pre-commit on all files when that file is changed (which handles version changes in the pre-commit config).
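That workflow logic can be sketched roughly as follows (the branch name, step names, and detection shell are assumptions for illustration, not the actual advanced checks workflow): run pre-commit on all files when the pre-commit config itself changed, otherwise only on the files changed since main:

```yaml
# Hypothetical fragment illustrating the idea.
jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # needed to diff against main
      - uses: actions/setup-python@v5
      - run: pip install pre-commit
      - name: Detect pre-commit config changes
        id: cfg
        run: |
          # 'origin/main' as the comparison base is an assumption.
          if git diff --name-only origin/main...HEAD \
              | grep -q '^\.pre-commit-config\.yaml$'; then
            echo "all=true" >> "$GITHUB_OUTPUT"
          fi
      - name: Run pre-commit on all files (config changed)
        if: steps.cfg.outputs.all == 'true'
        run: pre-commit run --all-files
      - name: Run pre-commit on changed files only
        if: steps.cfg.outputs.all != 'true'
        run: pre-commit run --from-ref origin/main --to-ref HEAD
```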

So it's always a tradeoff, but knowing the limits of what is and isn't checked is helpful. We still want some velocity, and not to waste too much time waiting on simple PRs, while checking enough for complex ones, and we don't want edge cases with side effects on unexpected parts of the project to slip through.

@echoix (Member, Author) commented Feb 20, 2025

Or, since we can't really decide on the queue order, we could also try to limit, where we can, how many jobs of a multi-job workflow run in parallel. But that would slow down jobs on main even when the CI queue is not full.
There are also "environments", which can be used to do something similar to concurrency groups but without cancelling: only one use of an environment is allowed at a time (like a mutex, in other words). I don't like that option as much either.
Both of these would free some runners to keep processing the queue, but might end up helping other projects in the organization that were queued before ours and that don't limit themselves as much.
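Both ideas can be sketched in workflow syntax. The matrix entries, caps, and environment name below are illustrative assumptions, and the mutex-like behavior of the environment job is as described in this comment rather than verified here:

```yaml
# Option 1: cap how many matrix jobs of one workflow run at once.
jobs:
  tests:
    strategy:
      max-parallel: 2   # hypothetical cap; frees runners for the org queue
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4

  # Option 2: reference a shared environment so runs serialize
  # (mutex-like, no cancellation of queued runs).
  serialized-job:
    runs-on: ubuntu-latest
    environment: ci-mutex   # hypothetical environment name
    steps:
      - uses: actions/checkout@v4
```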

Or, to get to the next jobs in the queue faster, we could more aggressively cancel jobs on PRs once some (required) job has already failed, since a new commit will be needed anyway before the PR can be merged.
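A hedged sketch of that idea: a fail-fast matrix already cancels sibling matrix legs when one fails, and cancelling the rest of the workflow run takes an explicit API call (the test command, job names, and required `actions: write` permission are assumptions):

```yaml
# Hypothetical fragment: free runners once a required PR job fails.
permissions:
  actions: write   # needed for 'gh run cancel'

jobs:
  required-tests:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true   # a failing matrix leg cancels its siblings
      matrix:
        python: ['3.11', '3.12']
    steps:
      - uses: actions/checkout@v4
      - run: ./run-tests.sh   # placeholder test command
      - name: Cancel the rest of this workflow run on failure
        if: failure() && github.event_name == 'pull_request'
        run: gh run cancel ${{ github.run_id }} --repo ${{ github.repository }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```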
