[Flaky test] Scheduler when Scheduling workloads on clusterQueues when Hold LocalQueue at startup Should admit workloads according to their priorities #4273

Open
mimowo opened this issue Feb 17, 2025 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test.

Comments

@mimowo
Contributor

mimowo commented Feb 17, 2025

/kind flake
What happened:

Failed test: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/periodic-kueue-test-integration-main/1891370482536550400

What you expected to happen:

no failures

How to reproduce it (as minimally and precisely as possible):

Run CI

Anything else we need to know?:

{Timed out after 10.001s.
The function passed to Eventually failed at /home/prow/go/src/kubernetes-sigs/kueue/test/util/util.go:324 with:
Not enough workloads are pending
Expected
    <int>: 2
to equal
    <int>: 3 failed [FAILED] Timed out after 10.001s.
The function passed to Eventually failed at /home/prow/go/src/kubernetes-sigs/kueue/test/util/util.go:324 with:
Not enough workloads are pending
Expected
    <int>: 2
to equal
    <int>: 3
In [It] at: /home/prow/go/src/kubernetes-sigs/kueue/test/integration/singlecluster/scheduler/scheduler_test.go:565 @ 02/17/25 06:28:13.731
}

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@mimowo mimowo added the kind/bug Categorizes issue or PR as related to a bug. label Feb 17, 2025
@mimowo
Contributor Author

mimowo commented Feb 17, 2025

/kind flake

@k8s-ci-robot k8s-ci-robot added the kind/flake Categorizes issue or PR as related to a flaky test. label Feb 17, 2025
@KPostOffice
Contributor

export GINKGO_ARGS='--focus "Scheduler when Scheduling workloads on clusterQueues when Hold LocalQueue at startup Should admit workloads according to their priorities"'
export INTEGRATION_TARGET='test/integration/singlecluster/scheduler'
count=0

# Re-run the focused test until it fails, printing the number of passing runs so far.
while make test-integration
do
  count=$((count + 1))
  echo "$count"
done

I ran this about 100 times trying to reproduce the flake and have not been able to reproduce it locally.
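A bounded variant of this kind of loop can be sketched as below. It is only an illustration, not part of the Kueue tooling: the function name `run_until_fail` and the `REPRO_LOG` variable are hypothetical, and the default log path is an assumption. It stops at the first failing run and keeps that run's output, so the failure is not lost in the scrollback.

```shell
# run_until_fail MAX CMD [ARGS...]
# Runs CMD up to MAX times and stops at the first failure.
# The output of the most recent run is kept in $REPRO_LOG
# (hypothetical variable; defaults to /tmp/repro-last.log).
run_until_fail() {
  local max="$1"
  shift
  local log="${REPRO_LOG:-/tmp/repro-last.log}"
  local i=1
  while [ "$i" -le "$max" ]; do
    # Overwrite the log each run so it always holds the latest output.
    if ! "$@" >"$log" 2>&1; then
      echo "failed on iteration $i, log saved to $log"
      return 1
    fi
    echo "pass $i/$max"
    i=$((i + 1))
  done
  echo "no failure in $max iterations"
  return 0
}
```

With `GINKGO_ARGS` and `INTEGRATION_TARGET` exported as above, it could be invoked as `run_until_fail 100 make test-integration`.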

@mimowo
Contributor Author

mimowo commented Feb 18, 2025

I ran this about 100 times trying to reproduce the flake and have not been able to reproduce it locally.

Thanks for looking at the issue. Other tactics I use to reproduce issues like this locally:

  • add some stress (./bin/stress --cpu=N)
  • add --race to GINKGO_ARGS
  • recompile with instrumentation. Here are the steps for k8s tests; it should be possible to adapt them for Kueue:
# install stress command
go install golang.org/x/tools/cmd/stress@latest

# recompile with race instrumentation
go test ./pkg/controller/job -race -c

# run (it loops and reports failures)
stress ./job.test -test.run TestJobApiBackoffReset
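If no dedicated stress tool is at hand, the "add some stress" tactic can be approximated in plain shell. The sketch below is an assumption-laden stand-in, not the `./bin/stress` binary mentioned above: the function name `with_cpu_load` is hypothetical, and each "worker" is simply a busy-looping subshell that competes with the test for CPU time.

```shell
# with_cpu_load WORKERS CMD [ARGS...]
# Starts WORKERS busy-looping background subshells, runs CMD,
# then stops the workers and returns CMD's exit status.
with_cpu_load() {
  local workers="$1"
  shift
  local pids=""
  local i=0
  local status=0
  while [ "$i" -lt "$workers" ]; do
    # Each worker spins in a no-op loop until it is killed.
    ( while :; do :; done ) &
    pids="$pids $!"
    i=$((i + 1))
  done
  # Capture the status without tripping `set -e` before cleanup.
  "$@" || status=$?
  # Word splitting of $pids is intended here.
  kill $pids 2>/dev/null || true
  wait $pids 2>/dev/null || true
  return "$status"
}
```

For example, `with_cpu_load 4 make test-integration` would run the integration target while four workers keep the CPU busy. The worker count that exposes a timing-dependent failure is machine-specific, so it likely needs some experimentation.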
