You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
What is the problem?
With enabled Tunneldigger but no WAN connection (e.g. Cable is disconnected on WAN) the node is experiencing high CPU Load Average (1.5-2.0) because of Tunneldigger's effort to restart and reconnect. This also causes super high flowrate of logs.
At the moment we have recreated this issue on TP-Link Archer C7 v2, Archer C7 v5 and Xiaomi Mi 4a Gigabit, so this seems to be independent of the device.
Sun Oct 15 14:35:46 2023 daemon.err tunneldigger[10258]: td-client: [gw05.ffnordhessen.net:20008] Reinitializing tunnel context.
Sun Oct 15 14:35:46 2023 daemon.err td-client: [gw05.ffnh.net:20008] Failed to connect to remote endpoint - check WAN connectivity!
Sun Oct 15 14:35:46 2023 daemon.err tunneldigger[10258]: td-client: [gw05.ffnh.net:20008] Failed to connect to remote endpoint - check WAN connectivity!
Sun Oct 15 14:35:46 2023 daemon.err td-client: [gw05.ffnordhessen.net:20008] Failed to connect to remote endpoint - check WAN connectivity!
Sun Oct 15 14:35:46 2023 daemon.info td-client: [gw05.ffnh.net:20008] Reinitializing tunnel context.
Sun Oct 15 14:35:46 2023 daemon.err tunneldigger[10258]: td-client: [gw05.ffnordhessen.net:20008] Failed to connect to remote endpoint - check WAN connectivity!
Sun Oct 15 14:35:46 2023 daemon.err tunneldigger[10258]: td-client: [gw05.ffnh.net:20008] Reinitializing tunnel context.
Sun Oct 15 14:35:46 2023 daemon.err td-client: No suitable brokers found. Retrying in 5 seconds
Sun Oct 15 14:35:46 2023 daemon.err tunneldigger[10258]: td-client: No suitable brokers found. Retrying in 5 seconds
What is the expected behaviour?
Tunneldigger should retry, but it could make sense to reduce the number of retries after X amount of tries (e.g. every 5 minutes after 10 fails).
Gluon Version:
Gluon v2023.1.1
seems like Commit ID 7feebe9
Is tunneldigger retrying more frequently than every 5 seconds as printed in the logs, or does it cause such high load even with when only retrying every 5 seconds? What is the load average when you stop tunneldigger (/etc/init.d/tunneldigger stop?
Thank you for taking your time to create such an useful tool.
It doesn't seem to run more often than in the 5 second interval. It rather seems as if the "trying to connect - erroring out ("Failed to connect to remote endpoint - check WAN connectivity!") - Reinitializing tunnel context - trying to connect next gateway" process is running way too quick and causing the load (take a look at the attached link to see a video of the issue.). It seems as if there should be some kind of rate limit at which the Tunneldigger should connect.
I can confirm running /etc/init.d/tunneldigger stop mitigates the high load average for a short period of time (Tunneldigger gets restartet automatically, probably by the Tunneldigger-Watchdog).
The 1min load average dropped to 0.02 after 3 minutes of running with stopped Tunneldigger.
Bug report
What is the problem?
With enabled Tunneldigger but no WAN connection (e.g. Cable is disconnected on WAN) the node is experiencing high CPU Load Average (1.5-2.0) because of Tunneldigger's effort to restart and reconnect. This also causes super high flowrate of logs.
At the moment we have recreated this issue on TP-Link Archer C7 v2, Archer C7 v5 and Xiaomi Mi 4a Gigabit, so this seems to be independent of the device.
Log output (this log absolutely SPAMS logread...):
Devices with said issues:
https://map.freifunk-nordhessen.de/#!/de/map/60e32727c0fe
https://map.freifunk-nordhessen.de/#!/de/map/8cdef9a91277
What is the expected behaviour?
Tunneldigger should retry, but it could make sense to reduce the number of retries after X amount of tries (e.g. every 5 minutes after 10 fails).
Gluon Version:
Gluon v2023.1.1
seems like Commit ID 7feebe9
Site Configuration:
https://gitlab.com/freifunk-nordhessen/ffnh-site
Custom patches:
no custom patches applied to Tunneldigger and/or dnsmasq-wan.
The text was updated successfully, but these errors were encountered: