[Cilium] Full Kube proxy replacement not working #1210
I am getting the same error while simply trying to use cilium with this option:

When I install for the first time with default settings, I get the same error.
Receiving the same error (see logs) when trying to create a new cluster. The provisioner has dependencies not only on multiple values but also on other resources, e.g. the load balancer or the volumes:

depends_on = [
  hcloud_load_balancer.cluster,
  null_resource.control_planes,
  random_password.rancher_bootstrap,
  hcloud_volume.longhorn_volume
]

module.kube-hetzner.null_resource.kustomization: Still creating... [6m0s elapsed]
module.kube-hetzner.null_resource.kustomization: Still creating... [6m10s elapsed]
module.kube-hetzner.null_resource.kustomization (remote-exec): error: timed out waiting for the condition on deployments/system-upgrade-controller
╷
│ Error: remote-exec provisioner error
│
│ with module.kube-hetzner.null_resource.kustomization,
│ on .terraform/modules/kube-hetzner/init.tf line 288, in resource "null_resource" "kustomization":
│ 288: provisioner "remote-exec" {
│
│ error executing "/tmp/terraform_1207280219.sh": Process exited with status 1

Although this error occurs, it seems the resources are prepared and the cluster is reachable.

$ k get nodes
NAME STATUS ROLES AGE VERSION
training-shared-cluster-agent-large-bxf Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-cnb Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-ddd Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-iui Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-qtp Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-ric Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-agent-large-rpr Ready <none> 28m v1.28.6+k3s2
training-shared-cluster-control-plane-fsn1-fdf Ready control-plane,etcd,master 28m v1.28.6+k3s2
training-shared-cluster-control-plane-fsn1-gvp Ready control-plane,etcd,master 27m v1.28.6+k3s2
training-shared-cluster-control-plane-fsn1-uli Ready control-plane,etcd,master 27m v1.28.6+k3s2

Other than that, I noticed that on each node there is an instance of cilium, and some of them have multiple restarts:

$ k get pods -n kube-system
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-9jp8p 1/1 Running 0 27m
kube-system cilium-chbwp 0/1 Running 9 (5m9s ago) 27m
kube-system cilium-czp8w 1/1 Running 0 27m
kube-system cilium-jv7cz 1/1 Running 0 27m
kube-system cilium-mcmft 0/1 Running 8 (33s ago) 27m
kube-system cilium-ns945 1/1 Running 0 27m
kube-system cilium-operator-f5dcdcc8d-prm4z 1/1 Running 0 27m
kube-system cilium-operator-f5dcdcc8d-wpf6n 1/1 Running 0 27m
kube-system cilium-qgtql 1/1 Running 0 27m
kube-system cilium-svjx2 0/1 Running 9 (5m32s ago) 27m
kube-system cilium-t9r7x 1/1 Running 0 27m
kube-system cilium-zsmkr 1/1 Running 0 27m
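To narrow down why some of the cilium agents crash-loop and why the kustomization step timed out, something like the following would be the usual next step (a sketch; the pod names are taken from the listing above, and the system-upgrade namespace for the upgrade controller is an assumption):

# Logs of the previous, crashed cilium-agent container on an affected node
kubectl -n kube-system logs cilium-chbwp -c cilium-agent --previous

# Status as reported by the agent itself
kubectl -n kube-system exec cilium-9jp8p -c cilium-agent -- cilium status

# The deployment the remote-exec provisioner timed out waiting for
kubectl -n system-upgrade get deployment system-upgrade-controller
kubectl -n system-upgrade describe deployment system-upgrade-controller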
@M4t7e @Silvest89 Any ideas on this issue?
@byRoadrunner What makes you think that the current implementation with Cilium has kube-proxy? My cluster is kube-proxy-free without the need to use it. @mysticaltech
@Silvest89
@byRoadrunner
@Silvest89 definitely not on my own computer 😉
@byRoadrunner
@Silvest89 Thanks for the clarifications, will have a look. @kube-hetzner/core FYI, if you have any ideas.
I definitely used the latest available version for this testing, which was, and still is, v2.11.8.
@Silvest89 Just to clarify, a standard installation with just changing the CNI to cilium should be kube-proxy-free?
Yes.
Now I'm getting the same error as before/as the others, but with a completely default installation (only cni set to cilium).
EDIT: Ignore this, it was my fault, I forgot to increase the server_type from the defaults (which is needed for cilium).
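For context, the server type is set per nodepool in kube.tf; a minimal sketch, assuming the module's usual nodepool block layout (attribute names may differ between module versions):

agent_nodepools = [
  {
    name        = "agent-large"
    server_type = "cpx31" # larger than the default; Cilium needs more resources than the smallest types
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 3
  }
]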
So I just tested it, and when running the validation check I still get output.
But according to the Cilium documentation (https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/#validate-the-setup), this should just return an empty line.
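For reference, the validation from the linked documentation page boils down to roughly these checks (a sketch; the exact wording of the output depends on the Cilium version):

# The agent should report that it is handling service load-balancing itself
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement

# Services that Cilium serves from its eBPF maps
kubectl -n kube-system exec ds/cilium -- cilium service list

# With a full replacement, no kube-proxy KUBE-SVC iptables rules should exist,
# so this is expected to print nothing
kubectl -n kube-system exec ds/cilium -- iptables-save | grep KUBE-SVC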
@byRoadrunner How do you do the kube-proxy replacement? Via the …? Then look at the …
@mysticaltech kube-proxy replacement is already the default in the Helm values which are deployed (terraform-hcloud-kube-hetzner/locals.tf, line 383 in e911232).
You are right about the part that it says …
In my previous comments I mentioned the hybrid mode and the validation steps provided by Cilium, and if there are still …
Hey @byRoadrunner, you're right about your assumption that the current replacement is a hybrid solution. Kube-proxy is still running in the background and manages a few functionalities. I'm currently working on a new PR to update Cilium to the 1.15 release, and I can include the full kube-proxy replacement as well. I already have a working setup, but I want to test a few more things. I'll probably file the PR tomorrow.
Great to hear from you @M4t7e 🙏
@M4t7e works like a charm, no more KUBE-SVC rules, thanks!
@Silvest89 @M4t7e Is this part of the README no longer true then:
Is this related? #1267 I am looking to upgrade an existing cluster from the current default Flannel to Cilium, and it's a bit confusing what the config should be.
@maggie44 I added a comment to the issue: #1267 (comment). Cilium was not properly configured to take over the full kube-proxy functionality. If you don't craft your own …
If you want to replace the kube-proxy with Cilium by setting …, Cilium needs these values:

k8sServiceHost: "127.0.0.1"
k8sServicePort: "6444"

This is already the default configuration used when you do not specify custom … (terraform-hcloud-kube-hetzner/locals.tf, lines 453 to 455 in da24fd2).
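Putting the pieces together, a custom cilium_values for a full replacement might look roughly like this (a sketch based on the values quoted above; the kubeProxyReplacement line is an assumption and should be checked against the module's defaults in locals.tf):

cilium_values = <<EOT
# Local kube-apiserver endpoint, as quoted above
k8sServiceHost: "127.0.0.1"
k8sServicePort: "6444"
# Assumption: Cilium 1.14+ uses a boolean for the full replacement
kubeProxyReplacement: true
EOT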
If further information is required, I will be happy to provide it.
Discussed in #1199
Originally posted by byRoadrunner January 31, 2024
Hi,
I'm trying to use cilium in complete kube proxy free mode (https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/).
For this I disabled the k3s kube proxy:
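The exact snippet is not shown above; a minimal sketch of what disabling kube-proxy typically looks like in kube.tf, with hypothetical variable names that should be checked against the module version in use:

# kube.tf (excerpt)
cni_plugin         = "cilium"
disable_kube_proxy = true # assumption: tells k3s to start without kube-proxy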
But this ends in the following error:
Probably Cilium is deployed afterwards and networking does not work correctly at that point, so the nodes are unschedulable?
Has anyone done this before? Do you have any advice on how to do this? Or is this a bug which should not happen and should be moved to an issue?
Thanks!