kubeadm: refactor the dry-run logic #126776
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: neolit123. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
/triage accepted
force-pushed from 174edd6 to 451fa49
force-pushed from 451fa49 to f277228
force-pushed from f277228 to 7dc40c5
/retitle kubeadm: refactor the dry-run logic
this is ready for review
7dc40c5
to
66e222c
Compare
rebased
// - Log the action.
// - If the action is not GET or LIST just use the fake client store to write it, unless
//   a user reactor was added with PrependReactor() or AppendReactor().
// - Attempt to GET or LIST using the real dynamic client.
real dynamic client to get the object. If a real object does not
exist it attempts to get the object from the fake object store.
The above message is copied from the PR's description. It's not true: when the real client is instantiated and the given object doesn't exist in the cluster, it will still mark the action as handled.
// handleGetAction tries to handle all GET actions with the dynamic client.
func (d *DryRun) handleGetAction(action testing.GetAction) (bool, runtime.Object, error) {
if d.dynamicClient == nil {
return false, nil, errors.New("dynamicClient is nil")
}
unstructuredObj, err := d.dynamicClient.
Resource(action.GetResource()).
Namespace(action.GetNamespace()).
Get(context.Background(), action.GetName(), metav1.GetOptions{})
if err != nil {
return true, nil, err
}
newObj, err := d.decodeUnstructuredIntoAPIObject(action, unstructuredObj)
if err != nil {
return true, nil, err
}
return true, newObj, err
}
According to the implementation, it attempts to GET or LIST using the real dynamic client if it is instantiated, otherwise tries to GET or LIST the object from the fake client store. Is it desired?
Sorry, I didn't see the caller.
handled, obj, err := d.handleGetAction(getAction)
if err != nil {
fmt.Fprintln(d.writer, "[dryrun] Real object does not exist. "+
"Attempting to GET from followup reactors or from the fake client tracker")
return false, nil, nil
}
Please ignore the above comments.
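The fallback semantics discussed above (try the real client first, then fall back to the fake tracker) follow client-go's (handled, obj, err) reactor contract. A minimal stdlib-only sketch of that contract, with hypothetical names and plain strings standing in for runtime.Object:

```go
package main

import (
	"errors"
	"fmt"
)

// reactor mirrors the (handled, obj, err) contract of client-go reactors:
// returning handled=false passes the action on to the next reactor in the chain.
type reactor func(name string) (handled bool, obj string, err error)

// runChain tries each reactor in order and stops at the first one that
// reports the action as handled.
func runChain(chain []reactor, name string) (string, error) {
	for _, r := range chain {
		if handled, obj, err := r(name); handled {
			return obj, err
		}
	}
	return "", errors.New("no reactor handled the action")
}

func main() {
	realObjs := map[string]string{"cm-real": "from-real-cluster"}
	fakeObjs := map[string]string{"cm-fake": "from-fake-tracker"}

	chain := []reactor{
		// First, try the "real" client; on a miss report handled=false so
		// the chain can fall through to the fake tracker.
		func(name string) (bool, string, error) {
			if obj, ok := realObjs[name]; ok {
				return true, obj, nil
			}
			return false, "", nil
		},
		// Fallback: the fake object store.
		func(name string) (bool, string, error) {
			obj, ok := fakeObjs[name]
			if !ok {
				return true, "", fmt.Errorf("object %q not found", name)
			}
			return true, obj, nil
		},
	}

	obj, _ := runChain(chain, "cm-real")
	fmt.Println(obj) // from-real-cluster
	obj, _ = runChain(chain, "cm-fake")
	fmt.Println(obj) // from-fake-tracker
}
```

The real implementation decides handled/unhandled the other way around for errors (a failed real GET prints a "[dryrun] Real object does not exist" note and returns unhandled), as the caller snippet above shows.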
if err != nil {
	return nil, errors.Wrapf(err, "unable to get internal Kubernetes Service IP from the given service CIDR (%s)", d.cfg.Networking.ServiceSubnet)
dryRun := apiclient.NewDryRun()
if err := dryRun.WithKubeConfigFile(d.KubeConfigPath()); err != nil {
In dry-run mode, d.KubeConfigPath() always returns the kubeconfig path from a kubeadm tmp dir. I'm not sure whether the created client is able to talk to the real cluster. Do sub-phases under the init phase need a real cluster when a user runs a sub-phase with the --dry-run flag?
I'm not sure whether the created client is able to talk with the real cluster?
yes, it's a generated admin.conf, it just resides in the tmp dir.
Do sub phases under init phase need a real cluster when a user run a sub phase with the --dry-run flag.
that's a good Q because there is a problem with this it seems.
the phases can work with dry-run but the admin.conf in the TMP dir doesn't exist.
here is what happens:
sudo kubeadm init phase upload-config all --dry-run
error execution phase upload-config/kubeadm: failed to load kubeconfig: open /etc/kubernetes/tmp/kubeadm-init-dryrun3932008713/admin.conf: no such file or directory
so the CA and admin.conf don't exist for this scoped phase dry-run.
but sudo kubeadm init --dry-run
as a whole would work, as that tmp dir admin.conf will exist and KubeConfigPath will return it.
even if the user runs:
sudo kubeadm init phase certs ca
sudo kubeadm init phase kubeconfig admin
sudo kubeadm init phase upload-config all --dry-run
error execution phase upload-config/kubeadm: failed to load kubeconfig: open /etc/kubernetes/tmp/kubeadm-init-dryrun3932008713/admin.conf: no such file or directory
it would fail as the non-dry-run commands generated a CA/kubeconfig in the default location /etc/kubernetes/
but the --dry-run is looking for an admin.conf that exists in the tmp dir.
i think to avoid this problem the logic in init.go can check if the tmp dir file exists and if it doesn't it can fall back to the default /etc/kubernetes/admin.conf
i need to test this idea or find a better one..
@carlory so i approached this differently. the problem is that a tmp dir is used for many files not only the admin.conf.
i think the only way to allow running individual phases in dry-run mode is to allow the user to specify the dry-run directory. we already had an env variable for that called KUBEADM_INIT_DRYRUN_DIR. i just added new ones for UPGRADE and JOIN as well. so here is how this can be used.
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo kubeadm init phase certs ca --dry-run
W0925 15:27:34.028290 147394 constants.go:610] Using dry-run directory /etc/kubernetes/tmp/kubeadm-init-dryrun1123083720. To override it, set the environment variable KUBEADM_INIT_DRYRUN_DIR
[certs] Using existing ca certificate authority
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo ls /etc/kubernetes/tmp/kubeadm-init-dryrun1123083720
ca.crt ca.key
lubo@3B69Q73:~/go/src/k8s.io/kubernetes$ sudo KUBEADM_INIT_DRYRUN_DIR=/etc/kubernetes/tmp/kubeadm-init-dryrun1123083720 kubeadm init phase kubeconfig all --dry-run
[kubeconfig] Using kubeconfig folder "/etc/kubernetes/tmp/kubeadm-init-dryrun1123083720"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
so when not setting the env var a random dir is generated, but if the env var is set it can be used for phase dry-run.
i will push an update in a bit.
return retrieveValidatedConfigInfo(nil, cfg, constants.DiscoveryRetryInterval, timeout)
func RetrieveValidatedConfigInfo(dryRunClient clientset.Interface, cfg *kubeadmapi.Discovery, timeout time.Duration) (*clientcmdapi.Config, error) {
	isDryRun := dryRunClient != nil
	isTesting := false
What's the purpose of isTesting? Can we remove it?
it's a leftover from when i was testing different ways to make the changes in token.go. i will remove it.
actually, it has a purpose. without it the function retrieveValidatedConfigInfo is difficult to unit test.
during tests retrieveValidatedConfigInfo gets isDryRun = false and isTesting = true, that results in no loading of additional kubeconfig files (retrieveValidatedConfigInfo loads kubeconfig files). during tests client is a fake client and during real usage client is a dryRun client. getClusterInfo() is called from retrieveValidatedConfigInfo() which complicates it further.
ideas to improve this are welcome, but i think it might be suitable for a new PR that would refactor / break apart retrieveValidatedConfigInfo. perhaps in multiple functions.
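The isDryRun/isTesting split can be illustrated with a stripped-down sketch (function name and return values are hypothetical, not the real retrieveValidatedConfigInfo): the testing flag short-circuits before any kubeconfig files are loaded from disk, so a unit test with a fake client never touches the filesystem.

```go
package main

import (
	"fmt"
	"os"
)

// retrieveConfig sketches the flag split discussed above: during tests
// (isTesting=true) no kubeconfig files are loaded from disk; in dry-run
// mode the dry-run client's config is used; otherwise the file is read.
func retrieveConfig(path string, isDryRun, isTesting bool) (string, error) {
	if isTesting {
		// Unit tests run against a fake client; never hit the filesystem.
		return "in-memory-config", nil
	}
	if isDryRun {
		return "dry-run-config", nil
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	cfg, err := retrieveConfig("/nonexistent/admin.conf", false, true)
	fmt.Println(cfg, err) // in-memory-config <nil>
}
```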
cmd/kubeadm/app/cmd/join.go (outdated)
var err error
client := dryRunClient
if dryRunClient == nil {
Is it possible to move the client-creation logic to InitCfg()?
yes, makes sense
moved this to fetchInitConfigurationFromJoinConfiguration as that func also needs tlsBootstrapCfg.
force-pushed from 66e222c to 7539b95
just rebased.
The current dryrun client implementation is suboptimal and sparse. It has the following problems:
- When an object CREATE or UPDATE reaches the default dryrun client, the operation is a NO-OP, which means subsequent GET calls must fully emulate the object that exists in the store.
- There are multiple implementations of a DryRunGetter interface, such as the one in init_dryrun.go, but there are no implementations for reset, upgrade, join.
- There is a specific DryRunGetter that is backed by a real client in clientbacked_dryrun.go, but this is used for upgrade and does not work in conjunction with a fake client.
This commit does the following changes:
- Removes all existing *dryrun*.go implementations.
- Adds a new DryRun implementation in dryrun.go that implements 3 clients: fake clientset, real clientset, real dynamic client.
- The DryRun object uses the method-chaining pattern.
- Allows the user to opt in to real clients only if needed, by passing a real kubeconfig. By default only a fake client is constructed.
- The default reactor chain for the fake client always logs the object action, then for GET or LIST actions attempts to use the real dynamic client to get the object. If a real object does not exist, it attempts to get the object from the fake object store.
- The user can prepend or append reactors to the chain.
- All known reactors needed for operations during init, join, reset, upgrade are added as methods of the DryRun struct.
- Adds detailed unit tests for the DryRun struct and its methods, including reactors.
Additional changes:
- Use the new DryRun implementation in all command workflows: init, join, reset, upgrade.
- Ensure that --dry-run works even if there is no active cluster by returning faked objects. For join, a faked cluster-info with a fake bootstrap token and CA are used.
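The method-chaining pattern mentioned in the commit message can be sketched as follows (a toy illustration, not kubeadm's actual DryRun API; field and method names are hypothetical): each configuration method mutates the receiver and returns it, so setup reads as one fluent expression.

```go
package main

import "fmt"

// DryRun sketches a method-chaining builder: With*/Reactor methods
// return the receiver so calls can be chained.
type DryRun struct {
	kubeconfig string
	reactors   []string
}

func NewDryRun() *DryRun { return &DryRun{} }

// WithKubeConfig records the kubeconfig path used to opt in to real clients.
func (d *DryRun) WithKubeConfig(path string) *DryRun {
	d.kubeconfig = path
	return d
}

// PrependReactor puts a reactor at the front of the chain.
func (d *DryRun) PrependReactor(name string) *DryRun {
	d.reactors = append([]string{name}, d.reactors...)
	return d
}

// AppendReactor puts a reactor at the end of the chain.
func (d *DryRun) AppendReactor(name string) *DryRun {
	d.reactors = append(d.reactors, name)
	return d
}

func main() {
	d := NewDryRun().
		WithKubeConfig("/etc/kubernetes/admin.conf").
		AppendReactor("log-action").
		PrependReactor("get-node")
	fmt.Println(d.kubeconfig, d.reactors)
}
```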
force-pushed from 7539b95 to 5b0a316
looking for more reviews.
Thanks for your update. It looks good to me. /lgtm Looking for another reviewer for the final approval.
LGTM label has been added. Git tree hash: 1249f6da20fa2fbcf47fa5d311ef7a6acd7bf51f
@@ -105,7 +105,7 @@ func runCleanupNode(c workflow.RunData) error {
		klog.Warningf("[reset] Failed to remove containers: %v\n", err)
	}
} else {
	fmt.Println("[reset] Would remove Kubernetes-managed containers")
	fmt.Println("[dyrrun] Would remove Kubernetes-managed containers")
fmt.Println("[dyrrun] Would remove Kubernetes-managed containers")
fmt.Println("[dryrun] Would remove Kubernetes-managed containers")
typo
@@ -176,7 +176,7 @@ func resetConfigDir(configPathDir string, dirsToClean []string, isDryRun bool) {
		}
	}
} else {
	fmt.Printf("[reset] Would delete files: %v\n", filesToClean)
	fmt.Printf("[dryrun] Would delete files: %v\n", filesToClean)
Should the file path be quoted using %q?
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
The current dryrun client implementation is suboptimal
and sparse. It has the following problems:
- When an object CREATE or UPDATE reaches the default dryrun client,
the operation is a NO-OP, which means subsequent GET calls must
fully emulate the object that exists in the store.
- There are multiple implementations of a DryRunGetter interface,
such as the one in init_dryrun.go, but there are no implementations
for reset, upgrade, join.
- There is a specific DryRunGetter that is backed by a real
client in clientbacked_dryrun.go, but this is used for upgrade
and does not work in conjunction with a fake client.
This commit does the following changes:
- Removes all existing *dryrun*.go implementations.
- Adds a new DryRun implementation in dryrun.go that implements
3 clients: fake clientset, real clientset, real dynamic client.
- The DryRun object uses the method-chaining pattern.
- Allows the user to opt in to real clients only if needed, by passing
a real kubeconfig. By default only a fake client is constructed.
- The default reactor chain for the fake client always logs the
object action, then for GET or LIST actions attempts to use the
real dynamic client to get the object. If a real object does not
exist, it attempts to get the object from the fake object store.
- The user can prepend or append reactors to the chain.
- All known reactors needed for operations during init, join,
reset, upgrade are added as methods of the DryRun struct.
- Adds detailed unit tests for the DryRun struct and its methods,
including reactors.
Additional changes:
- Use the new DryRun implementation in all command workflows:
init, join, reset, upgrade.
- Ensure that --dry-run works even if there is no active cluster
by returning faked objects. For join, a faked cluster-info
with a fake bootstrap token and CA are used.
Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#2653
Fixes kubernetes/kubeadm#1932
Special notes for your reviewer:
NONE
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: