Report docker error in logs when docker image goes to ImagePullBackOff, instead of reporting it as a timeout connecting to traffic-manager #2305

Open
petermant opened this issue Jan 14, 2022 · 0 comments
Labels
feature New feature or enhancement request


Use case / problem:
When the traffic manager pod fails to start in Kubernetes, connector.log reports only a timeout:

2022-01-13 10:14:42.8241 info    connector/background-manager : No existing Traffic Manager found in namespace ambassador, installing v2.4.9...
2022-01-13 10:14:43.3856 debug   connector/background-manager : creating 1 resource(s) : source="helm"
2022-01-13 10:14:43.4034 debug   connector/background-manager : creating 10 resource(s) : source="helm"
2022-01-13 10:14:43.4697 debug   connector/background-manager : beginning wait for 10 resources with timeout of 2m0s : source="helm"
2022-01-13 10:14:43.5093 debug   connector/background-manager : Deployment is not ready: ambassador/traffic-manager. 0 out of 1 expected pods are ready : source="helm"

…repeated many times…
Then:

2022-01-13 10:16:40.7834 error   connector/background-init : Failed to initialize session with traffic-manager: the port-forward connection to the traffic manager timed out.  The current timeout 2m0s can be configured as "timeouts.trafficManagerConnect" in "/Users/xyz/Library/Application Support/telepresence/config.yml"

Proposed solution
The 'Deployment is not ready' / '0 out of 1 expected pods are ready' line in the output above could be promoted from debug level to error level, and could include an indication of why the pods aren't ready, i.e. the pod status, perhaps along with a troubleshooting hint about connecting to the ambassador namespace and running kubectl describe pod ...

Even better, perhaps the output of describe pod could be included in the logs automatically?
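
To make the idea concrete, here is a minimal sketch in Python (not Telepresence's actual Go code) of how a waiting reason such as ImagePullBackOff could be pulled out of pod status and turned into a log line with a troubleshooting hint. It assumes the input has the shape of `kubectl get pods -o json` output, where each container status may carry `state.waiting.reason`. The function names, the pod name, and the image in the sample are hypothetical, and the sample is hand-written rather than captured from a cluster.

```python
from typing import Any, Dict, List, Tuple


def waiting_reasons(pod: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (container, reason, message) for each container in a waiting state."""
    status = pod.get("status") or {}
    found = []
    # initContainerStatuses and containerStatuses share the same shape.
    for key in ("initContainerStatuses", "containerStatuses"):
        for cs in status.get(key) or []:
            waiting = (cs.get("state") or {}).get("waiting")
            if waiting:
                found.append((
                    cs.get("name", "?"),
                    waiting.get("reason", "Unknown"),
                    waiting.get("message", ""),
                ))
    return found


def describe_not_ready(pod_list: Dict[str, Any], namespace: str) -> List[str]:
    """Build one diagnostic line per waiting container, with a kubectl hint."""
    lines = []
    for pod in pod_list.get("items") or []:
        name = (pod.get("metadata") or {}).get("name", "?")
        for container, reason, message in waiting_reasons(pod):
            detail = f" ({message})" if message else ""
            lines.append(
                f"pod {namespace}/{name}: container {container} is waiting: "
                f"{reason}{detail}; try: kubectl describe pod {name} -n {namespace}"
            )
    return lines


# Hand-written sample; pod name and image are made up for illustration.
SAMPLE = {
    "items": [{
        "metadata": {"name": "traffic-manager-example"},
        "status": {
            "containerStatuses": [{
                "name": "traffic-manager",
                "state": {"waiting": {
                    "reason": "ImagePullBackOff",
                    "message": 'Back-off pulling image "example/image:tag"',
                }},
            }],
        },
    }],
}

if __name__ == "__main__":
    for line in describe_not_ready(SAMPLE, "ambassador"):
        print(line)
```

A line like this, emitted alongside the timeout error, would point directly at the image pull problem instead of leaving only the port-forward timeout to go on.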

In my case, a simple ImagePullBackOff error was preventing the pod from starting. Once I corrected my proxy details the image downloaded just fine, but it took quite a lot of time to realise this was the issue.

Because standard logging is at 'info' level, the debug-level readiness messages are not shown, so in the normal logs this presents only as a timeout, which is tricky to diagnose.
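
The effect of the log level can be illustrated with a small Python sketch using the standard `logging` module. This is an illustration of the proposal only, not how Telepresence's logger is implemented: the `BLOCKING_REASONS` set and the `level_for` helper are assumptions made for the example, and the set is not an exhaustive list of Kubernetes waiting reasons.

```python
import io
import logging
from typing import Optional

# Waiting reasons that usually need user intervention (illustrative, not exhaustive).
BLOCKING_REASONS = {"ImagePullBackOff", "ErrImagePull", "CrashLoopBackOff"}


def level_for(reason: Optional[str]) -> int:
    """Promote the readiness message to ERROR when a blocking reason is known."""
    return logging.ERROR if reason in BLOCKING_REASONS else logging.DEBUG


def demo() -> str:
    """Log two readiness messages through an INFO-level logger and return the output."""
    buf = io.StringIO()
    logger = logging.getLogger("readiness-demo")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(buf)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)  # mirrors the default 'info' level described above

    # No known reason: stays at DEBUG and is dropped by the INFO threshold.
    logger.log(level_for(None), "Deployment is not ready: 0 out of 1 expected pods are ready")
    # Known blocking reason: promoted to ERROR and therefore visible.
    logger.log(level_for("ImagePullBackOff"),
               "Deployment is not ready: container waiting: ImagePullBackOff")
    return buf.getvalue()


if __name__ == "__main__":
    print(demo(), end="")
```

Only the second message survives the 'info' threshold, which is the behaviour requested here: routine "not ready yet" polling stays quiet, while a reason that will not resolve on its own reaches the normal logs.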

Alternatives
Maybe an FAQ section on troubleshooting timeouts when connecting to the cluster, especially if there are other typical problems that users run into besides this one?

Versions

  • Telepresence 2.4.9

cindymullins-dw added the feature (New feature or enhancement request) and a:docs labels on Jan 30, 2023
thallgren added the stale (Issue is stale and will be closed) label and removed the a:docs label on Aug 13, 2024
thallgren removed the stale (Issue is stale and will be closed) label on Aug 25, 2024