Timeout Error While Attempting to Intercept Rollout Service Using Telepresence #3691

Open
simorla opened this issue Sep 30, 2024 · 2 comments


simorla commented Sep 30, 2024

We have created a rollout service and are trying to intercept it using Telepresence. However, we're encountering a timeout error.

Command:

telepresence --namespace ping-services intercept microservice-generic-smorla-69dd7655dd --service microservice-generic-smorla --port 8080:80 --mount=false --env-json /Users/smorla/microservice-generic/env.json

Error:

W0930 18:17:02.561188 8236 native_arm64.go:52] Could not read /proc/cpuinfo: open /proc/cpuinfo: no such file or directory
W0930 18:17:02.561239 8236 native_arm64.go:177] Could not read /proc/self/auxv: open /proc/self/auxv: no such file or directory
Error: request timed out while waiting for agent microservice-generic-smorla-69dd7655dd.ping-services to arrive

Service Details:

kubectl -n ping-services describe service microservice-generic-smorla

Name:              microservice-generic-smorla
Namespace:         ping-services
Labels:            allow-deletion-by-users=true
                   app.name=microservice-generic
                   telepresence=true
                   user.name=smorla
Type:              ClusterIP
IP:
Port:              80/TCP
TargetPort:        8080/TCP
Endpoints:
Session Affinity:  None

Relevant pods in the Telepresence (ambassador) namespace:
kubectl get pods -n ambassador

traffic-manager-595d7f4558-f89tt 1/1 Running 0 6d4h
traffic-manager-ambassador-agent-df7f4ccfb-shqxj 1/1 Running 0 6d17h

kubectl get all -n ping-services | grep smorla

pod/microservice-generic-smorla-69dd7655dd-5jg56 2/2 Running 24m
service/microservice-generic-smorla ClusterIP 80/TCP 52m
replicaset.apps/microservice-generic-smorla-69dd7655dd 1 1 1 51m

Telepresence Version:

Client: v2.7.2 (api v3)
Root Daemon: v2.7.2 (api v3)
User Daemon: v2.7.2 (api v3)

Telepresence Status:
Root Daemon: Running
Version : v2.7.2 (api 3)
DNS : Remote IP: 127.0.0.1
User Daemon: Running
Version : v2.7.2 (api 3)
Intercepts: 0 total

Timeout Settings:
We tried increasing timeouts.agentArrival to 120s, but that did not resolve the issue:

helm upgrade traffic-manager datawire/telepresence --namespace ambassador --set timeouts.agentArrival=120s

Any guidance or suggestions to resolve this issue would be greatly appreciated.

Logs:
2024-09-30 17:59:20.9023 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.alloy.local
2024-09-30 17:59:20.9025 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.otel.local
2024-09-30 17:59:20.9027 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.ping-cassandra.local
2024-09-30 17:59:20.9028 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.kube-node-lease.local
2024-09-30 17:59:20.9032 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.applicationroles-crud.local
2024-09-30 17:59:20.9034 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.node-problem-detector.local
2024-09-30 17:59:20.9036 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.tag1-v2.local
2024-09-30 17:59:20.9037 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.telemetry.local
2024-09-30 17:59:20.9039 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.olm.local
2024-09-30 17:59:20.9040 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.default.local
2024-09-30 17:59:20.9042 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.branding1.local
2024-09-30 17:59:20.9044 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.monitoring.local
2024-09-30 17:59:20.9046 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.jenkins-infra-test.local
2024-09-30 17:59:20.9051 info daemon/session/dns/SearchPaths : Generated new /etc/resolver/telepresence.mock-smtp-p1-perf.local
2024-09-30 17:59:20.9060 debug daemon/session/dns/SearchPaths : Performing initial recursion check with tel2-recursion-check.kube-system
2024-09-30 17:59:20.9805 debug daemon/session/dns/Server : LookupHost "tel2-recursion-check.kube-system"
2024-09-30 17:59:20.9807 debug daemon/session/dns/Server : SVCB _dns.resolver.arpa. -> NXDOMAIN
2024-09-30 17:59:21.4116 debug daemon/session/dns/Server : DNS resolver is not recursive
2024-09-30 17:59:21.4120 debug daemon/session/dns/Server : A tel2-recursion-check.kube-system. -> SERVFAIL rpc error: code = Unimplemented desc = unknown method LookupHost for service telepresence.manager.Manager
2024-09-30 17:59:21.4190 debug daemon/session/dns/SearchPaths : Recursion check finished

@cindymullins-dw (Collaborator)

Hi @simorla, the latest version is 2.19.6. Could you please upgrade and let us know if the issue persists?

If it does, run 'telepresence gather-logs' on the latest version. Later versions log this type of error in more detail, so the output should be more informative. Also note that the daemon logs are not very relevant here; the Traffic Manager pod logs would be more helpful.
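
As a rough sketch of how both kinds of logs could be collected in one step (an illustration, not an official procedure): the namespace and deployment name come from the output earlier in this thread, while the helper names `run` and `collect_logs`, the `TM_NAMESPACE` and `DRY_RUN` variables, and the 200-line tail are assumptions made for this example. Checking `telepresence gather-logs --help` on the installed version is a good idea, since available flags may differ between releases.

```shell
#!/bin/sh
# Sketch only. TM_NAMESPACE, DRY_RUN, run and collect_logs are hypothetical
# names chosen for this example; they are not part of Telepresence.

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

collect_logs() {
  # Namespace where the traffic-manager deployment runs (from this thread).
  ns="${TM_NAMESPACE:-ambassador}"

  # Traffic Manager pod logs, which are the most relevant for this error.
  run kubectl logs deployment/traffic-manager -n "$ns" --tail=200

  # Bundle daemon, traffic-manager and traffic-agent logs into an archive.
  run telepresence gather-logs
}
```

Running `DRY_RUN=1 collect_logs` previews the two commands without contacting the cluster; `collect_logs` on its own executes them.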


simorla commented Oct 1, 2024

Hi @cindymullins-dw, thank you. After updating the Traffic Manager to version 2.19.6, we can no longer use the telepresence connect command; it fails with the error below. We tried increasing the trafficManagerConnect timeout to 10 minutes, but it still does not work. Could you please assist?

kubectl get pod -n ambassador
NAME READY STATUS RESTARTS AGE
traffic-manager-798786f49c-mr8kb 1/1 Running 0 47m
traffic-manager-ambassador-agent-df7f4ccfb-shqxj 1/1 Running 0 7d10h
kubectl get deployment traffic-manager -n ambassador -o jsonpath='{.spec.template.spec.containers[0].image}'

docker.io/datawire/ambassador-telepresence-manager:2.19.6

kubectl get svc,pod -n ambassador -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/agent-injector ClusterIP 443/TCP 14d app=traffic-manager,telepresence=manager
service/traffic-manager ClusterIP None 8081/TCP,15766/TCP 14d app=traffic-manager,telepresence=manager
service/traffic-manager-ambassador-agent ClusterIP 80/TCP 14d app.kubernetes.io/instance=traffic-manager,app.kubernetes.io/name=ambassador-agent

NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/traffic-manager-798786f49c-mr8kb 1/1 Running 0 19m 100.98.0.27 it
pod/traffic-manager-ambassador-agent-df7f4ccfb-shqxj 1/1 Running 0 7d10h 100.116.208.15 t

kubectl logs traffic-manager-ambassador-agent-df7f4ccfb-shqxj -n ambassador | tail -10

time="2024-10-01T06:35:36Z" level=info msg="Setting cloud connect token from environment" THREAD=/lease-lock-watch
time="2024-10-01T06:35:36Z" level=error msg="Unable to get cloud connect token. This agent will do nothing." THREAD=/lease-lock-watch
time="2024-10-01T06:37:36Z" level=info msg="Setting cloud connect token from environment" THREAD=/lease-lock-watch
time="2024-10-01T06:37:36Z" level=error msg="Unable to get cloud connect token. This agent will do nothing." THREAD=/lease-lock-watch
time="2024-10-01T06:37:36Z" level=info msg="Setting cloud connect token from environment" THREAD=/lease-lock-watch
time="2024-10-01T06:37:36Z" level=error msg="Unable to get cloud connect token. This agent will do nothing." THREAD=/lease-lock-watch
time="2024-10-01T06:39:36Z" level=info msg="Setting cloud connect token from environment" THREAD=/lease-lock-watch
time="2024-10-01T06:39:36Z" level=error msg="Unable to get cloud connect token. This agent will do nothing." THREAD=/lease-lock-watch
time="2024-10-01T06:39:36Z" level=info msg="Setting cloud connect token from environment" THREAD=/lease-lock-watch
time="2024-10-01T06:39:36Z" level=error msg="Unable to get cloud connect token. This agent will do nothing." THREAD=/lease-lock-watch

Error Message:
telepresence connect
Launching Telepresence User Daemon
telepresence connect: error: connector.Connect: the port-forward connection to the traffic manager timed out. The current timeout 5m0s can be configured as "timeouts.trafficManagerConnect" in "/Users/smorla/Library/Application Support/telepresence/config.yml"

cat config.yml
logLevels:
  userDaemon: debug
  rootDaemon: debug
timeouts:
  agentInstall: 5m
  intercept: 15m
  helm: 5m
  apply: 15m
  clusterConnect: 5m
  proxyDial: 5m
  trafficManagerConnect: 5m
  trafficManagerAPI: 5m
  connectivityCheck: 5m

kubectl logs traffic-manager-798786f49c-mr8kb -n ambassador | tail -20
2024-10-01 05:44:44.0549 info Traffic Manager v2.19.6 [uid:1000,gid:0]
2024-10-01 05:44:44.6580 info No license is installed for this traffic-manager
2024-10-01 05:44:44.6583 info starting cloud token watchers
2024-10-01 05:44:44.6692 info configmap traffic-manager-agent-cloud-token found, stopping cloud token watchers
2024-10-01 05:44:44.9265 info Using traffic-agent image "docker.io/datawire/ambassador-telepresence-agent:1.14.5"
2024-10-01 05:44:44.9588 info Extracting service subnet 100.64.0.0/13 from create service error message
2024-10-01 05:44:44.9589 info Using podCIDRStrategy: auto
2024-10-01 05:44:44.9589 info Using AlsoProxy: []
2024-10-01 05:44:44.9589 info Using NeverProxy: []
2024-10-01 05:44:44.9589 info Using AllowConflicting: [10.128.0.0/9]
2024-10-01 05:44:44.9597 info Cluster domain derived from agent-injector reverse lookup "agent-injector.ambassador.svc.cluster.local."
2024-10-01 05:44:44.9598 info Using cluster domain "cluster.local."
2024-10-01 05:44:44.9598 info ExcludeSuffixes: [.com .io .net .org .ru]
2024-10-01 05:44:44.9598 info IncludeSuffixes: []
2024-10-01 05:44:44.9601 info cli-config : Started watcher for ConfigMap traffic-manager
2024-10-01 05:44:44.9601 info prometheus : Prometheus metrics server not started
2024-10-01 05:44:45.0607 info Scanning 19 nodes
2024-10-01 05:44:45.0608 error no node subnet contains traffic-manager IP 100.98.0.27
2024-10-01 05:44:45.0613 info Deriving subnets from IPs of pods
2024-10-01 06:44:45.8029 error consumption-watcher : failed to report consumption for session 60e90f28-a38c-420e-8f6d-df7178bca580: rpc error: code = Unauthenticated desc =
