Onboard AWS Scale tests to Boskos #33183
Comments
We have to sit on this one until we can figure out automation to bump any given AWS account to the point where it can run 5k-node tests: limits need to be increased for various services like EC2, which requires explicit bumps worked out with AWS support folks. |
I don't think so? You can add the existing account to a new boskos pool? |
We haven't done this for the GCP 5k project; we just put the one project we have into a dedicated pool containing a single project, so it can still make use of boskos's lifecycling features and be rented to multiple jobs. I recommend also putting any such jobs into a job queue that matches the boskos pool; for too long we have relied only on manual scheduling. (job_queue_name and job_queue_capacities are not very well documented at the moment, but see test-infra/config/prow/config.yaml Line 9 in 52cac28.) |
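As a rough illustration of the job queue mechanism referenced above (the queue name, capacity value, and exact placement inside config.yaml are assumptions for this sketch, not the actual test-infra configuration):

```yaml
# Central Prow config (config/prow/config.yaml), sketch only:
# job_queue_capacities caps how many jobs carrying a given
# job_queue_name may run concurrently.
plank:
  job_queue_capacities:
    aws-scale: 1   # hypothetical queue name; at most one renter of the scale pool at a time
```

Each job that rents from the scale pool would then set `job_queue_name: aws-scale` in its own definition (see the job sketch further down).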
/sig scalability k8s-infra testing |
@dims I think it should be OK to do this, since we are moving the existing scale account under a dedicated boskos resource type. |
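For illustration, a dedicated pool in a boskos-resources.yaml-style config might look roughly like the following (the type name and account entry are placeholders, not the real values):

```yaml
resources:
  # Hypothetical dedicated type holding only the quota-bumped AWS scale account.
  - type: scalability-aws-account
    state: dirty      # static entries are commonly listed as dirty so the janitor sweeps them before first use
    names:
      - "aws-scale-account-placeholder"   # placeholder for the existing 5k-node account
```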
@ameukam cool! That sounds better :) having a separate type and then making sure we use that. All I was worried about was that we can't just pick a random account and run a scale test on it. |
Yeah, we definitely don't want to make the entire main pool scale-test ready. We actually have a few pools on GCP like this, e.g. the GPU projects are also special and we're not setting up that quota for every project. We even have a secondary scale pool with a few projects for smaller scalability jobs. I think we can mimic this; we just need to add a pool definition with the AWS account, make sure the janitor is enabled for that pool, and switch the job to reference the pool. If we roll that out between scheduled runs it should just work without disruptions. |
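A hedged sketch of the job side of that plan, tying together the pool and queue names from the sketches above (image, interval, and all names are illustrative; the exact flag/env wiring the runner uses to acquire the account from Boskos is deployer-specific and omitted here):

```yaml
periodics:
  - name: ci-kubernetes-e2e-kops-aws-scale-amazonvpc-using-cl2
    interval: 24h
    decorate: true
    job_queue_name: aws-scale      # matches the hypothetical job_queue_capacities entry above
    spec:
      containers:
        - image: gcr.io/k8s-staging-test-infra/kubekins-e2e:latest-master   # illustrative image
          command:
            - runner.sh
          # The runner would acquire an account of type scalability-aws-account
          # from Boskos before cluster bring-up and release it dirty afterwards,
          # so the janitor cleans up anything kubetest2 teardown leaves behind.
```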
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/lifecycle frozen |
What would you like to be added:
Why is this needed:
When kubetest2 teardown doesn't fully succeed, it ends up leaking resources, and those resources are not cleaned up until the next run (24 hours later).
Leaving leaked resources around until the next run is not cost-effective.
Example run - https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kops-aws-scale-amazonvpc-using-cl2/1818587581143584768