Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection dies with 'Protocol violation with EOF' after some time #92

Closed
dwt opened this issue Jan 19, 2017 · 29 comments
Closed

Connection dies with 'Protocol violation with EOF' after some time #92

dwt opened this issue Jan 19, 2017 · 29 comments

Comments

@dwt
Copy link

dwt commented Jan 19, 2017

Not sure what to make of it - the tunnel comes online and works for a short time, then dies with these error messages:

DEBUG:  Error reading from SSL connection (Protocol violation with EOF).
INFO:   Cancelling threads...
INFO:   Setting ppp interface down.
INFO:   Restoring routes...
INFO:   Removing VPN nameservers...
DEBUG:  Waiting for pppd to exit...
INFO:   Terminated pppd.
INFO:   Closed connection to gateway.
DEBUG:  Gateway certificate validation failed.
DEBUG:  Gateway certificate digest found in white list.
INFO:   Logged out.

Running master from 19. of January 2017 on Mac OS X 10.12.2 (16C68).

What further information can I give to help you debug this problem?

@mrbaseman
Copy link
Collaborator

The connection over which the tunnel was running was stable during that time? Any log entries of pppd which could provide a hint?

@dwt
Copy link
Author

dwt commented Jan 27, 2017

The connection was stable during that time, I'm not really sure how to reproduce this, it seems that the tunnel breaks sometimes very few seconds after establishing (the first time I connected) then it worked flawlessly for half an hour (till I hang up). The last time I used it, I couldn't get any connection to work for longer than 30 seconds before it broke.

Today I only got these messages out of pppd (and IMO they look fine?). The first connect failed, second and third where fine and manually closed. (I don't really see that first connect in the log though, not sure why)

Standard	10:57:00.903531 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	10:57:00.903579 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	10:57:00.904576 +0100	pppd	pppd 2.4.2 (Apple version 838) started by root, uid 0
Standard	10:57:00.905492 +0100	pppd	Connect: ppp0 <--> /dev/ttys001
Standard	10:57:01.387849 +0100	pppd	local  IP address 172.16.0.4
Standard	10:57:01.387951 +0100	pppd	remote IP address 1.1.1.1
Standard	10:57:01.388102 +0100	pppd	Committed PPP store
Standard	10:57:01.388942 +0100	pppd	Committed PPP store
Standard	10:59:27.025956 +0100	pppd	Modem hangup
Standard	10:59:27.027486 +0100	pppd	Connection terminated.
Standard	10:59:27.027635 +0100	pppd	LCP close (User request).
Standard	10:59:34.170679 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	10:59:34.170745 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	10:59:34.171804 +0100	pppd	pppd 2.4.2 (Apple version 838) started by root, uid 0
Standard	10:59:34.172616 +0100	pppd	Connect: ppp0 <--> /dev/ttys001
Standard	10:59:34.591208 +0100	pppd	local  IP address 172.16.0.4
Standard	10:59:34.591400 +0100	pppd	remote IP address 1.1.1.1
Standard	10:59:34.591550 +0100	pppd	Committed PPP store
Standard	10:59:34.594628 +0100	pppd	Committed PPP store

@dwt
Copy link
Author

dwt commented Jan 27, 2017

This is how the pppd log looks for a failed connection

Standard	11:24:31.915863 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	11:24:31.915919 +0100	pppd	publish_entry SCDSet() failed: Success!
Standard	11:24:31.917006 +0100	pppd	pppd 2.4.2 (Apple version 838) started by root, uid 0
Standard	11:24:31.917924 +0100	pppd	Connect: ppp0 <--> /dev/ttys001
Standard	11:24:32.334984 +0100	pppd	local  IP address 172.16.0.4
Standard	11:24:32.335170 +0100	pppd	remote IP address 1.1.1.1
Standard	11:24:32.335283 +0100	pppd	Committed PPP store
Standard	11:24:32.336616 +0100	pppd	Committed PPP store
Standard	11:24:34.766591 +0100	pppd	Modem hangup
Standard	11:24:34.767663 +0100	pppd	Connection terminated.
Standard	11:24:34.767706 +0100	pppd	LCP close (User request).

Here's the openforti log

dwt@crest ~/Code/Projekte/Locomore/openfortivpn (git)-[master] % ./start_tunnel.sh
INFO:   Connected to gateway.
INFO:   Authenticated.
INFO:   Remote gateway has allocated a VPN.
INFO:   Got addresses: [172.16.0.4], ns [0.0.0.0, 0.0.0.0]
INFO:   Interface ppp0 is UP.
INFO:   Setting new routes...
add net 10.3.140.0: gateway 172.16.0.4
add net 10.3.7.0: gateway 172.16.0.4
add net 10.3.9.0: gateway 172.16.0.4
INFO:   Adding VPN nameservers...
INFO:   Tunnel is up and running.
INFO:   Cancelling threads...
INFO:   Setting ppp interface down.
INFO:   Restoring routes...
INFO:   Removing VPN nameservers...
INFO:   Terminated pppd.
INFO:   Closed connection to gateway.
INFO:   Logged out.

@mrbaseman
Copy link
Collaborator

@dwt thanks. Unfortunately, I can't see any hint from the logs you have provided, but here a few thoughts:
I have observed that the connection sometimes can't be established when it takes too long (timeout during name resolution, expired mobile token, or alike), but probably the scenario you are observing is something different.
In other cases the connection may be established, but breaks down when the routes pushed by the fortigate create some kind of a loop (i.e. the route for the traffic building up the vpn tunnel is altered and points through the vpn channel once it is established). This might be corrected by manually setting up a proper route beforehand. The scenario should be the same with the official forticlient (provided that it handles this routing scenario the same way).
Another possible cause for an instable connection could be another device (e.g. a mobile device running a forticlient?) that tries to authenticate with the same credentials and invalidates the authentication for your vpn connection. On the fortigate one might observe log entries about such an event.

@DimitriPapadopoulos
Copy link
Collaborator

Have you perhaps tried with a newer version of openfortivpn? I suspect this has been fixed in version 1.3.1 and newer.

@dwt
Copy link
Author

dwt commented Jul 25, 2017

Sorry, I don't work at the place that requires this VPN anymore, so I can't.

But thanks for the reminder! :)

@DimitriPapadopoulos
Copy link
Collaborator

Thank you for the feedback! I'll close this ticket, there's not much we can do about it this context. If needed please reopen it.

@markri
Copy link

markri commented Jul 27, 2017

I have the same issue. Tried version 1.3.1 and 1.4.0 (both of them die after some time) on CentOS 6.4 and CentOS 7.3 (similar log and issue)

openfortivpn debug log CentOS 6.4

DEBUG:  gateway ---> pppd (6 bytes)
gtw:    xx xx xx xx xx x

DEBUG:  Got Address: 10.212.134.201
DEBUG:  if_config: not ready yet...
DEBUG:  pppd ---> gateway (6 bytes)
pppd:   xx xx xx xx xx x

DEBUG:  Got Address: 10.212.134.201
DEBUG:  Interface Name: ppp0
DEBUG:  Interface Addr: 10.212.134.201
INFO:   Interface ppp0 is UP.
INFO:   Setting new routes...
DEBUG:  ip route show to 0.0.0.0/0.0.0.0
DEBUG:  Setting route to vpn server...
DEBUG:  ip route add to xx.xx.xx.xx/255.255.255.255 via 192.168.1.2 dev eth0
DEBUG:  ip route add to xx.xx.xx.xx/255.255.255.224 via 10.212.134.201 dev ppp0
INFO:   Adding VPN nameservers...
INFO:   Tunnel is up and running.
DEBUG:  Error reading from SSL connection (Protocol violation with EOF).
INFO:   Cancelling threads...
INFO:   Setting ppp interface down.
INFO:   Restoring routes...
DEBUG:  ip route del to xx.xx.xx.xx/255.255.255.255 via 192.168.1.2 dev eth0
INFO:   Removing VPN nameservers...
DEBUG:  Waiting for pppd to exit...
INFO:   Terminated pppd.
INFO:   Closed connection to gateway.
DEBUG:  Gateway certificate validation failed.
DEBUG:  Gateway certificate digest found in white list.
INFO:   Logged out.

pppd log CentOS 6.4

Jul 27 15:33:47 server1 pppd[9422]: pppd 2.4.5 started by marco, uid 0
Jul 27 15:33:47 server1 pppd[9422]: Using interface ppp0
Jul 27 15:33:47 server1 pppd[9422]: Connect: ppp0 <--> /dev/pts/0
Jul 27 15:33:48 server1 pppd[9422]: local  IP address 10.212.134.201
Jul 27 15:33:48 server1 pppd[9422]: remote IP address 1.1.1.1
Jul 27 15:38:51 server1 pppd[9422]: Hangup (SIGHUP)
Jul 27 15:38:51 server1 pppd[9422]: Modem hangup
Jul 27 15:38:51 server1 pppd[9422]: Connect time 5.1 minutes.
Jul 27 15:38:51 server1 pppd[9422]: Sent 0 bytes, received 0 bytes.
Jul 27 15:38:51 server1 pppd[9422]: Connection terminated.
Jul 27 15:38:51 server1 pppd[9422]: Exit.

Reviewing the log for multiple times, I found that the connection is broken exactly after 5.1 minute.

On Ubuntu 16.04 hower, I use the GUI https://github.com/theinvisible/openfortigui (which includes the 1.4.0 version right now). When I connect to the same VPN server I get a stable connection. So it seems that this issue is somehow OS / library related.

@DimitriPapadopoulos
Copy link
Collaborator

@markri OK, I'm reopening this issue since you can reproduce this with CentOS and openfortivpn 1.4.0. Please note the initial poster was on Mac OS X 10.12.2. So is this OS / library related or not? This is a total mystery for now: could as well be identical symptoms with different reasons...

Also does #50 help at all? Modifies /etc/ppp/options but not sure whether this is an openfortivpn bug or not.

@markri
Copy link

markri commented Jul 27, 2017

Thnx, just tried the exact same install (without GUI) on Ubuntu. As it seems, the connection on Ubuntu spawns some messages every few seconds now and then (probably a keep-alive). These messages were missing in CentOS.

Seems like you already were a step ahead 👍 about that
Just implemented your mentioned fix in #50

As the default options file is empty but the "lock" option in both CentOS 6 and 7. My connection is now stable for over more than 10 minute now (it broke exactly at 5.1 minutes before). So it seems to be this exact issue. Will let it run over night, but for now we can assume that this (is / can be) fixed

@DimitriPapadopoulos
Copy link
Collaborator

@markri Thanks for the feedback. I can't decide whether this is an openfortivpn bug or not. Someone more knowledgeable might comment here...

@markri
Copy link

markri commented Jul 28, 2017

To give some insight in the applied fix and results from the initiated connection over night

The connection lasted 8 hours and 55 minutes with the lcp-echo-interval 30 option. Finally it ends with the same message (see below). For me this is stable enough and thus workable.

DEBUG:  Error reading from SSL connection (Protocol violation with EOF).
INFO:   Cancelling threads...
INFO:   Setting ppp interface down.
INFO:   Restoring routes...
DEBUG:  ip route del to x.x.x.x/255.255.255.255 via 192.168.1.2 dev eth0
INFO:   Removing VPN nameservers...
DEBUG:  Waiting for pppd to exit...
INFO:   Terminated pppd.
INFO:   Closed connection to gateway.
DEBUG:  Gateway certificate validation failed.
DEBUG:  Gateway certificate digest found in white list.
INFO:   Logged out.

@DimitriPapadopoulos
Copy link
Collaborator

Indeed 8 hours is usually "good enough" but still, I wonder where "EOF" comes from and whether it's an openfortivpn bug or not.

@DimitriPapadopoulos
Copy link
Collaborator

DimitriPapadopoulos commented Jul 28, 2017

By the way, 8 hours is precisely the default timeout on the VPN server side. From the FortiOS™ Handbook SSL VPN for FortiOS 5.0:

Setting the client authentication timeout

The client authentication timeout controls how long an authenticated user will remain connected. When this time expires, the system forces the remote client to authenticate again. As with the idle timeout, a shorter period of time is more secure. The default value is 28800 seconds (8 hours). You can only modify this timeout value in the CLI

I guess this is what happens here. If so there remain two issues:

  1. You need to modify /etc/ppp/options: is that something openfortivpn should enforce instead of relying on the pppd configuration?
  2. Can openfortivpn handle the VPN server timeout more gracefully and print user needs to authenticate again?

@cp2587
Copy link

cp2587 commented Aug 10, 2017

Hello,

I have the same problem. Can openfortivpn handle the re-authentication or should i use a crontab line to ensure openfortivpn is restarted when needed ? (i need to have openfortivpn working as a service)

@DimitriPapadopoulos
Copy link
Collaborator

DimitriPapadopoulos commented Aug 19, 2017

I believe it would be useful to detect the server-side timeout, if at all possible, and print something more useful than Error reading from SSL connection (Protocol violation with EOF). One would need to start a VPN connection in verbose mode and have a look at DEBUG messages when the connection closes. The VPN server might send a message that could be intercepted.

About re-authentication, I don't think it is handled right now, there is no loop in the relevant part of the code:

        if (run_tunnel(&cfg) == 0)
                ret = EXIT_SUCCESS;
        goto exit;

But then many sites use one-time passwords. I'm not certain this should be included within openfortivpn.

@mrbaseman
Copy link
Collaborator

I have started to work on this last week. What about adding an option, e.g. --loop that tells openfortivpn explicitly to try reauthentication over and over and run tunnels in a loop? I have started this branch which probably needs some cleanup and more testing.
So far I couldn't run into the timeout, because I always have some traffic over the VPN channel before it would close, but when I forced the tunnel down manually, the return code was always 0 (the tunnel has been established successfully). I still have to change my settings so that I can reproduce the real situation where an idle tunnel causes the timeout

@cp2587
Copy link

cp2587 commented Nov 26, 2017

Hello @mrbaseman , did you end up merging your work in the master branch ? Otherwise, I would glady test your branch if you need to

@mrbaseman
Copy link
Collaborator

I haven't merged this into master yet. I have just rebased the branch on the current master and if you could test it, this would be good. Positive experience in the field is always a good argument for merging new features into the master branch.

@cp2587
Copy link

cp2587 commented Dec 5, 2017

I have deployed your branch but i have trouble figuring out the value i should set for --loop ? I have set 3600 as i want openfortivpn to re-authenticate itself every hour but i am not sure this is what it does...
Can you elaborate a little bit ?

@mrbaseman
Copy link
Collaborator

it is the time in seconds which openfortivpn waits before reconnecting in a loop. It doesn't hang up a running connection by itself, but when it drops out it tries to reconnect. If the reconnect fails it waits again and tries another time until you hit Ctrl+C. If reconnecting fails due to an interrupted network connection you don't want to try reconnecting every second and fill the pppd log quite quickly. Therefore, you can adjust this interval to a suitable value. Personally I would choose something between 30 and 300 seconds. 0 means "don't try to reconnect" which is the default setting.

@mrbaseman
Copy link
Collaborator

Just for the record: I have noticed that if you have another device doing ssl deep inspection on the path between the openfortivpn and the box with the vpn server, you might as well see the 'Protocol violation with EOF' error message, but that's at the beginning of the communication, quite quickly after a successful authentication.

@cp2587
Copy link

cp2587 commented Dec 13, 2017

Hello @mrbaseman ,

I have tested for a whole week with '--loop=5' and it works perfectly !

Do you have an ETA on the merge in the master branch ?

@mrbaseman
Copy link
Collaborator

Well, I have opened a pull request already a while ago (see above). Maybe your feedback has given the necessary kick for the merge now...

@mrbaseman
Copy link
Collaborator

We have decided to rename the option to --persistent instead of --loop. The behavior is exactly the same. I have merged it into the master branch now

@mrbaseman
Copy link
Collaborator

the --persistent option is available for a while now. I think we can close this issue.

@mrbaseman
Copy link
Collaborator

For the record: I have noticed just in the FortiOS Handbook for 5.6 that there is a new server-side configuration option set route-source-interface enable that prevents from ssl vpn sessions from being moved to another interface on dual wan connections. It's in the trouble shooting section under the headline "Tunnel-mode connection shuts down after a few seconds"

@FinboySlick
Copy link

While --persistent is useful, in cases where idle timeouts on the server side are pretty low, lcp-echo-interval is still required (otherwise link is disrupted durring reconnects). To integrate with things like OpenWRT, having a --pppd-lcp-echo-interval option would be greatly appreciated because it would allow for a pure UCI configuration.

@DimitriPapadopoulos
Copy link
Collaborator

@FinboySlick Thank you for this, quite interesting. Will follow-up in #802.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants