✨ Add OpenStackServerGroup CRD and Controller #1912

Open
wants to merge 1 commit into
base: main

@dalees dalees commented Feb 28, 2024

What this PR does / why we need it:

Implements a new OpenStackServerGroup CRD in v1beta1 to allow managed Server Groups with standard policies, and adds a ServerGroupRef field to OpenStackMachine that references the new CRD and is used for VM creation.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1256

Special notes for your reviewer:

This implements comment #1256 (comment)

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests
  • Rebased onto v1beta1 commit (removes v1alpha8)

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 28, 2024
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 28, 2024
@k8s-ci-robot
Contributor

Hi @dalees. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 28, 2024

netlify bot commented Feb 28, 2024

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit d8850da
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/66a045aeb8177600088ceb97
😎 Deploy Preview https://deploy-preview-1912--kubernetes-sigs-cluster-api-openstack.netlify.app

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 28, 2024
Contributor

@dulek dulek left a comment

This looks pretty good, some remarks inline.

Comment on lines 31 to 33
// The name of the cloud to use from the clouds secret
// +optional
CloudName string `json:"cloudName"`
Contributor

This seems a bit weird, we should probably have a reference to an OpenStackCluster instead?

Author

@dalees dalees Feb 28, 2024

Thank you for the feedback! Yeah, this allows the resource to be reconciled alone, as it's self-contained.

However, that isn't one of the use cases, and being tied to an existing OpenStackCluster doesn't seem like a limitation, even if the OpenStackServerGroup were only used for workers. It would also remove duplication of these credentials.

I'll make this change, once the CRD approach is agreed.

Contributor

Hm, okay, that's a fair point. The use case to keep all the workers from different clusters in a single ServerGroup makes sense, I see your point.

Author

In v1beta1, CloudName has moved into IdentityRef, but this review thread is still relevant.

For now, I've chosen not to tie the OpenStackServerGroup to an OpenStackCluster reference; instead, it holds its own secret reference. This matches how OpenStackMachines are designed, feels reasonable for this resource, and avoids a circular dependency.

I don't think there's much of a use case for two clusters sharing a single OpenStackServerGroup. That's not the design intention, although it is possible with the current implementation.
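
For readers following this thread, here is a rough sketch of what a self-contained spec with its own identity reference could look like. The type names, field names, and the `Validate` helper are illustrative assumptions, not the types from this PR; the policy strings are the four standard Nova server group policies.

```go
package main

import (
	"errors"
	"fmt"
)

// IdentityRef is a hypothetical stand-in for the identity reference discussed
// above: the name of a credentials secret plus the cloud entry inside it.
type IdentityRef struct {
	Name      string `json:"name"`
	CloudName string `json:"cloudName"`
}

// ServerGroupSpec is an illustrative shape only. It carries its own identity
// reference instead of pointing at an OpenStackCluster.
type ServerGroupSpec struct {
	Policy      string      `json:"policy"`
	IdentityRef IdentityRef `json:"identityRef"`
}

// Validate rejects specs that cannot be reconciled on their own: the identity
// reference must be complete and the policy must be a standard Nova policy.
func (s ServerGroupSpec) Validate() error {
	if s.IdentityRef.Name == "" || s.IdentityRef.CloudName == "" {
		return errors.New("identityRef.name and identityRef.cloudName must be set")
	}
	switch s.Policy {
	case "affinity", "anti-affinity", "soft-affinity", "soft-anti-affinity":
		return nil
	}
	return fmt.Errorf("unsupported server group policy %q", s.Policy)
}
```

Because the spec holds everything needed to reach the cloud, a controller could reconcile it without looking up any cluster object, which is the property the thread is weighing against credential duplication.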

api/v1alpha8/types.go (outdated, resolved)
err = compute.ResolveReferencedMachineResources(scope, &openStackMachine.Spec, &openStackMachine.Status.ReferencedResources)
if err != nil {
return reconcile.Result{}, err
}

// Resolve referenced resources CAPO resources, using the K8s client
err = resolveReferencedClientResources(ctx, r.Client, openStackMachine)
Contributor

It feels like it's still a Machine resource. Couldn't we put that into ResolveReferencedMachineResources directly? Even if we need to change the arguments of the function.

Author

Hmm, I did start by doing this; I changed to this separation as what they're fetching from is distinct (OpenStack resource vs Kubernetes resource) and the client objects used are different. The OpenStack compute package just doesn't feel like the right place to be looking up K8s resources. It also makes test cases clearer to mock each function.

However, I agree the naming isn't clear. I wonder if renaming ResolveReferencedMachineResources to ResolveReferencedOpenStackResources may help to this end.

I'm open to changing this, but wanted to provide my reasoning first.

Contributor

I see that, sure. Let's see what other reviewers will say here, especially @mdbooth as ResolveReferencedMachineResources() is an idea of his.

Author

While rebasing onto v1beta1, I removed resolveReferencedClientResources and moved the code into the new resolveMachineResources.
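
As a sketch of the trade-off discussed in this thread, the two kinds of lookup (OpenStack API vs. Kubernetes client) can live in one resolver while staying separately mockable. Everything below is hypothetical: the `lookups` struct, the image lookup used as the OpenStack-side example, and the function signature are assumptions for illustration and do not mirror the real resolveMachineResources.

```go
package main

import "errors"

var errServerGroupNotReady = errors.New("referenced server group is not ready yet")

// lookups bundles the two sources a machine resolves against. Each field is a
// plain function so a test can stub one side without touching the other.
type lookups struct {
	// imageID would be backed by the OpenStack client.
	imageID func(imageName string) (string, error)
	// serverGroupID would be backed by the Kubernetes client, reading the
	// status of the referenced server group resource.
	serverGroupID func(namespace, name string) (id string, ready bool, err error)
}

type resolvedResources struct {
	ImageID       string
	ServerGroupID string
}

// resolveMachineResources resolves OpenStack-side references first, then the
// optional Kubernetes-side server group reference.
func resolveMachineResources(l lookups, namespace, imageName, serverGroupRef string) (resolvedResources, error) {
	var out resolvedResources

	imageID, err := l.imageID(imageName)
	if err != nil {
		return out, err
	}
	out.ImageID = imageID

	if serverGroupRef == "" {
		return out, nil // no server group requested
	}
	id, ready, err := l.serverGroupID(namespace, serverGroupRef)
	if err != nil {
		return out, err
	}
	if !ready || id == "" {
		return out, errServerGroupNotReady
	}
	out.ServerGroupID = id
	return out, nil
}
```

A test can pass closures for either field, which keeps the mocking benefit mentioned earlier in the thread even with a single entry point.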

controllers/openstackservergroup_controller.go (outdated, resolved)
controllers/openstackservergroup_controller.go (outdated, resolved)

serverGroupName := openStackServerGroup.Name

serverGroup, err := computeService.GetServerGroupByName(serverGroupName, false)
Contributor

Again, we should probably look up by ID first in case we have duplicate names.

Contributor

Agreed. Instead of having GetServerGroupByName raise an error when multiple server groups share the same name, I suggest checking openStackServerGroup.Status.ID; if it is empty, create a new server group.
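
A minimal sketch of the ID-first approach suggested here, using an in-memory stand-in for the compute service. `fakeCompute`, `ensureServerGroup`, and the ID format are all invented for illustration; the real compute service in this repository has a different API.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("server group not found")

type serverGroup struct{ ID, Name string }

// fakeCompute is an in-memory stand-in for the OpenStack compute service.
type fakeCompute struct {
	groups []serverGroup
	nextID int
}

func (f *fakeCompute) getByID(id string) (*serverGroup, error) {
	for i := range f.groups {
		if f.groups[i].ID == id {
			return &f.groups[i], nil
		}
	}
	return nil, errNotFound
}

func (f *fakeCompute) create(name string) *serverGroup {
	f.nextID++
	f.groups = append(f.groups, serverGroup{ID: fmt.Sprintf("sg-%d", f.nextID), Name: name})
	return &f.groups[len(f.groups)-1]
}

// ensureServerGroup returns the ID that should be recorded in status.
// An empty statusID means nothing has been created for this resource yet, so a
// new group is created even if another group already uses the same name.
func ensureServerGroup(c *fakeCompute, statusID, name string) (string, error) {
	if statusID == "" {
		return c.create(name).ID, nil
	}
	g, err := c.getByID(statusID)
	if err != nil {
		return "", err
	}
	return g.ID, nil
}
```

One caveat with this shape: if the group is created but the status update that records its ID fails, the next reconcile would create a second group. A real controller would need to handle that case, for example by adopting an existing group it can prove it owns.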

pkg/cloud/services/compute/referenced_resources.go (outdated, resolved)
controllers/openstackservergroup_controller.go (outdated, resolved)
api/v1alpha8/types.go (outdated, resolved)
@jichenjc
Contributor

jichenjc commented Mar 4, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 4, 2024
@mdbooth
Contributor

mdbooth commented Mar 8, 2024

@pierreprinetti We agreed on this in principle this week. Pinging you because it's similar to something ORC would do.

@chess-knight
Contributor

Hi, at @SovereignCloudStack we are very interested in this feature. What is the progress here @dalees?

@dalees
Author

dalees commented May 23, 2024

Hi, at @SovereignCloudStack we are very interested in this feature. What is the progress here @dalees?

Hello - pleased to hear of the interest! I'm keen to get this in, and I'm scheduled to revisit this in the next few weeks to get it back into a reviewable state.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign vincepri for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dalees dalees force-pushed the crd_openstackservergroup branch 3 times, most recently from 2519eb2 to 2df0503 Compare June 18, 2024 02:39
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 18, 2024
@k8s-ci-robot k8s-ci-robot removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 8, 2024
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 8, 2024
@dalees
Author

dalees commented Jul 9, 2024

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 9, 2024
@dalees
Author

dalees commented Jul 15, 2024

This change is ready for review, when reviewers have the time :) cc @mdbooth

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 19, 2024
Implements new CRD for OpenstackServerGroup in v1beta1 to allow managed
Server Groups with standard policies, and adds ServerGroupRef to OpenstackMachine
that references the new CRD and uses it for VM creation.

Closes: kubernetes-sigs#1256
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 24, 2024
@chess-knight
Contributor

Hi @dalees, thank you for pushing this PR. I would also kindly ask you and others to take a look at the OCCM host-id labelling issue kubernetes/cloud-provider-openstack#2579. What do you think about it? The two features could then be combined nicely: for example, one can create an anti-affinity server group and then check the host-id label of the k8s nodes to ensure that the nodes are distributed across different underlying hypervisors.
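
To make the idea concrete, here is a small sketch of such a check over node labels. The label key below is a placeholder, since the actual key depends on how the linked OCCM issue is resolved, and the `node` type is a simplified stand-in for a Kubernetes Node object.

```go
package main

// hostIDLabel is a placeholder key, not a real OCCM label.
const hostIDLabel = "example.com/host-id"

// node is a simplified stand-in for a Kubernetes Node.
type node struct {
	Name   string
	Labels map[string]string
}

// sharedHosts maps each host ID that carries more than one node to the names
// of those nodes. An empty result means all labelled nodes are on different
// hypervisors. Nodes without the label are ignored.
func sharedHosts(nodes []node) map[string][]string {
	byHost := map[string][]string{}
	for _, n := range nodes {
		id, ok := n.Labels[hostIDLabel]
		if !ok {
			continue
		}
		byHost[id] = append(byHost[id], n.Name)
	}
	for id, names := range byHost {
		if len(names) < 2 {
			delete(byHost, id) // only keep hosts that are actually shared
		}
	}
	return byHost
}
```

With a hard anti-affinity policy the result should always be empty; with a soft policy a non-empty result would show where the scheduler had to co-locate nodes.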

@mnaser
Contributor

mnaser commented Aug 14, 2024

@mdbooth this seems like a really good candidate for an ORC-style approach too?

@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 14, 2024

// Store the resolved UUID, once it's ready and set.
if servergroup.Status.Ready && servergroup.Status.ID != "" {
resolved.ServerGroupID = servergroup.Status.ID
Contributor

Suggested change
- resolved.ServerGroupID = servergroup.Status.ID
+ resolved.ServerGroupID = servergroup.Status.ID
+ openStackServer.Status.Resolved = resolved

Contributor

I think this is missing.
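
A sketch of what the suggested change amounts to: the resolved ID is only useful if it is written back to the object whose status is persisted. The types below are simplified assumptions rather than the real API types, and `storeServerGroupID` is a hypothetical helper.

```go
package main

// Simplified stand-ins for the status types involved.
type serverGroupStatus struct {
	Ready bool
	ID    string
}

type resolvedResources struct {
	ServerGroupID string
}

type serverStatus struct {
	Resolved *resolvedResources
}

// storeServerGroupID copies the server group UUID into resolved and attaches
// resolved to the status, but only once the group is ready and has an ID.
// It reports whether anything was stored.
func storeServerGroupID(status *serverStatus, resolved *resolvedResources, sg serverGroupStatus) bool {
	if !sg.Ready || sg.ID == "" {
		return false
	}
	resolved.ServerGroupID = sg.ID
	status.Resolved = resolved // without this, the resolved value never reaches the status
	return true
}
```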


@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2024
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: Inbox
Development

Successfully merging this pull request may close these issues.

Use a server group to ensure anti-affinity for control plane nodes
9 participants