Issue with Ping #231

Open
amcmorris-piksel opened this issue Jun 11, 2020 · 24 comments

Comments

@amcmorris-piksel

Setting up a new installation and having issues with Ping.

I am getting the following message in the console:
CRITICAL - Could not interpret output from ping command

When I run the command from the command line as root it works, but if I try it as the nagios user I get this error:
ping: socket: Operation not permitted

Has anyone else seen this before? I am fairly new to Icinga, so I am still finding my feet with it.

A.

@jjethwa
Owner

jjethwa commented Jun 11, 2020

Hi @amcmorris-piksel

Are you setting up a new ping check or is this the default ping check on the icinga server?

@amcmorris-piksel
Author

@jjethwa This was a new ping check, code below:

object Host "NAME" {
address = "FQDN"
check_command = "hostalive"
}

Nothing complex, so I wonder if I am doing something silly. The command below works fine from the root account, but when I tried it from the nagios account I got the error above.

Plugin Output
/bin/ping -4 -n -U -w 30 -c 5 FQDN
CRITICAL - Could not interpret output from ping command

@jjethwa
Owner

jjethwa commented Jun 12, 2020

Does the default icinga2 server hostalive check work?

The URL is http://<YOUR_SERVER_IP>:/icingaweb2/dashboard#!/icingaweb2/monitoring/host/show?host=icinga2

That uses the hostalive check_command as well

@amcmorris-piksel
Author

Yes, unfortunately I am also getting the error on that one, with the following output: :(

/bin/ping -4 -n -U -w 30 -c 5 127.0.0.1
CRITICAL - Could not interpret output from ping command

Unsure what is going on, any idea of next steps?

@amcmorris-piksel
Author

amcmorris-piksel commented Jun 15, 2020

A bit more info: on the same Docker host I ran a different test:
docker run -p 8080:80 -h icinga2 -t jordan/icinga2:latest

It looks like I get the same output as above, and I also get:

Check execution
Reachable | no

Happy to provide or try anything needed.

@jjethwa
Owner

jjethwa commented Jun 15, 2020

Thanks for the details @amcmorris-piksel
I pulled latest but don't see the same issue unfortunately. It looks like the ping check is configured to use /usr/lib/nagios/plugins/check_ping

The full command is:

'/usr/lib/nagios/plugins/check_ping' '-4' '-H' '127.0.0.1' '-c' '200,15%' '-w' '100,5%'

@amcmorris-piksel
Author

Thanks for that, just tried the below on a fresh image.

root@icinga2:/usr/lib/nagios/plugins# sudo -u nagios /usr/lib/nagios/plugins/check_ping '-4' '-H' '127.0.0.1' '-c' '200,15%' '-w' '100,5%'
/bin/ping -4 -n -U -w 10 -c 5 127.0.0.1
CRITICAL - Could not interpret output from ping command

From some searching around, I think this is an issue with the Docker host:
#52

Just not sure what the equivalent will be to get this working in Ubuntu 16.04

@jjethwa
Owner

jjethwa commented Jun 17, 2020

Ah, I had forgotten about that issue. Try adding the --privileged flag to the docker run command and see if that works

@amcmorris-piksel
Author

Thanks, wish that worked, tried:
docker run --rm --privileged --cap-add=ALL -p 8080:80 -h icinga2 -t jordan/icinga2:latest

But got:

[2020-06-17 14:29:50 +0000] warning/PluginNotificationTask: Notification command for object 'icinga2' (PID: 2297, arguments: '/etc/icinga2/scripts/mail-host-notification.sh' '-4' '127.0.0.1' '-6' '::1' '-b' '' '-c' '' '-d' '2020-06-17 14:29:50 +0000' '-l' 'icinga2' '-n' 'icinga2' '-o' '/bin/ping -4 -n -U -w 30 -c 5 127.0.0.1
CRITICAL - Could not interpret output from ping command' '-r' 'root@localhost' '-s' 'DOWN' '-t' 'PROBLEM' '-v' 'false') terminated with exit code 36, output: /etc/icinga2/scripts/mail-host-notification.sh: 148: [: false: unexpected operator
mail: cannot send message: Process exited with a non-zero status

Just does not like this version of docker it looks like. :(

@jjethwa
Owner

jjethwa commented Jun 17, 2020

So bizarre. Maybe you can try running it on one of the container Linux distros like Flatcar?

@amcmorris-piksel
Author

Going to move the PoC to AWS rather than use our on premises Docker Hosts, thanks for the help.

@adamparker

I had this happen to me as well with CentOS. One of the symptoms was that the ping processes were not being terminated properly and ended up as zombie processes. This would go on until eventually there were no resources available.

I never solved it but hope this information helps.

@jjethwa
Owner

jjethwa commented Mar 6, 2021

Thanks for the tip, @adamparker

Would you be able to test out adding a timeout to your ping config to see if that gets rid of the zombies?

@ghost

ghost commented Apr 1, 2021

We're having the same issue on Ubuntu 20.04 with no internet access. We have the exact same setup in a Vagrant box, which works (even without internet access).

It seems to be a permissions issue (still not sure why it works on some machines and not on others):
root@icinga2:/# usermod nagios --shell /bin/bash
root@icinga2:/# su - nagios
nagios@icinga2:~$ /bin/ping 127.0.0.1
ping: socket: Operation not permitted
nagios@icinga2:~$ logout
root@icinga2:/# /bin/ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.034 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms
^C
--- 127.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 28ms
rtt min/avg/max/mdev = 0.027/0.030/0.034/0.006 ms

looking online for a solution gave me the following:

chmod u+s /bin/ping
but this doesn't seem to work:
root@icinga2:/# chmod u+s /bin/ping
root@icinga2:/# su - nagios
nagios@icinga2:~$ /bin/ping 127.0.0.1
ping: socket: Operation not permitted

Someone suggested changing the language of the system, but it's already set to nothing.

Looking at the permissions of ping on both the server and in the Vagrant box:
vagrant: 543757 -rwsr-sr-x 1 root root 69368 Jan 13 2020 ping
server: 8357416 -rwsr-sr-x 1 root root 69368 Jan 13 2020 ping
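
Since the setuid bit is present on both machines but only takes effect on one, a possible next check is whether something on the server neutralises setuid binaries, for example a filesystem mounted with nosuid or the no_new_privs process flag. This is only a guess at a diagnostic and was not tested in this thread; the helper names are hypothetical, and the sketch assumes /bin/ping lives on the filesystem mounted at /.

```shell
#!/bin/sh
# Hypothetical helpers; a positive result is a hint, not a confirmed cause.

# Succeeds if a comma-separated mount option string contains "nosuid".
has_nosuid() {
    case ",$1," in
        *,nosuid,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the options (fourth field of /proc/mounts) of the last filesystem
# mounted at "/", or nothing if the file is unreadable.
root_mount_opts() {
    awk '$2 == "/" { opts = $4 } END { if (opts != "") print opts }' /proc/mounts 2>/dev/null
}

# Prints the NoNewPrivs flag (1 means setuid bits and file capabilities are
# ignored on exec). The flag is inherited, so the value awk reports for
# itself matches the calling shell.
no_new_privs() {
    awk '/^NoNewPrivs:/ { print $2 }' /proc/self/status 2>/dev/null
}

# Example usage inside the container, as the nagios user:
#   has_nosuid "$(root_mount_opts)" && echo "/ is mounted nosuid"
#   [ "$(no_new_privs)" = "1" ] && echo "no_new_privs is set"
```

Running the two usage lines on both the Vagrant box and the server would show whether the machines differ in either respect.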

I've also looked into the docker versions:
Vagrant:
Client: Docker Engine - Community
Version: 20.10.2
API version: 1.41
Go version: go1.13.15
Git commit: 2291f61
Built: Mon Dec 28 16:17:43 2020
OS/Arch: linux/amd64
Context: default
Experimental: true

Server: Docker Engine - Community
Engine:
Version: 20.10.2
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: 8891c58
Built: Mon Dec 28 16:15:19 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.19.0
GitCommit: de40ad0

and the server:
Client: Docker Engine - Community
Version: 20.10.1
API version: 1.41
Go version: go1.13.15
Git commit: 831ebea
Built: Tue Dec 15 04:34:58 2020
OS/Arch: linux/amd64
Context: default
Experimental: true

Server: Docker Engine - Community
Engine:
Version: 20.10.1
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: f001486
Built: Tue Dec 15 04:32:52 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.19.0
GitCommit: de40ad0

@jjethwa
Owner

jjethwa commented Apr 5, 2021

Hi @Thixx

Thanks for all the details, I have not been able to track this down myself. I believe that it is coming down to how the host is handling the socket request. So far I have not run into the issue when using Flatcar as it's the main distro I use for docker containers.

@adamparker

Hi,

I switched check_ping with check_icmp which has resolved the issue for me.

Check_ping also gave me trouble with Zombie processes which is described here https://community.icinga.com/t/defunct-zombie-ping-processes-when-using-check-ping-on/7012

@jjethwa
Owner

jjethwa commented Apr 6, 2021

That's great news, thanks for the update @adamparker 😃

@ghost

ghost commented Apr 7, 2021

> Hi,
>
> I switched check_ping with check_icmp which has resolved the issue for me.
>
> Check_ping also gave me trouble with Zombie processes which is described here https://community.icinga.com/t/defunct-zombie-ping-processes-when-using-check-ping-on/7012

I wish that would work for me, but most of the commands can't be used because of the same issue... (check_icmp included)
Also @jjethwa I just can't switch to another OS, kind of stuck with Ubuntu for now.
I'm still looking into it.

@jjethwa
Owner

jjethwa commented Apr 7, 2021

Thanks for the update, @Thixx I haven't had time to research more, but I still feel that we need to focus on the host. Could be a tweak to the docker daemon or an OS security setting.

@ghost

ghost commented Apr 7, 2021

> Thanks for the update, @Thixx I haven't had time to research more, but I still feel that we need to focus on the host. Could be a tweak to the docker daemon or an OS security setting.

Yeah, I think you're right!
I've seen related issues on SUSE and CentOS that were solved later on.
I've found out that SELinux isn't the problem and that I can't add capabilities to the container... or at least it looks like it 'forgets' them.

@AlphaDE

AlphaDE commented Jun 6, 2022

This is an older issue, but it is still open.

Just installed Icinga2 in an Ubuntu 20.04 LTS LXC (Proxmox) and ran into the same issue.

I finally found out that check_ping calls /bin/ping, and the nagios user that Icinga2 runs as could not execute the ping command.

nagios@monitor:/usr/lib/nagios/plugins$ /bin/ping 127.0.0.1
/bin/ping: socket: Operation not permitted

In a different thread I found the suggestion to execute

setcap cap_net_raw+p /bin/ping

and after this command, the problem was solved.
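
To confirm that the capability actually landed on the binary, it can be read back with getcap. A minimal sketch, assuming the libcap tools are installed; has_cap_net_raw is a hypothetical helper name, and the exact getcap output format varies between libcap releases, so the match is on the capability name only.

```shell
#!/bin/sh
# Succeeds if a line of getcap output mentions cap_net_raw.
# Depending on the libcap release the line may look like
# "/bin/ping = cap_net_raw+p" or "/bin/ping cap_net_raw=p";
# matching on the capability name covers both forms.
has_cap_net_raw() {
    case "$1" in
        *cap_net_raw*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage:
#   has_cap_net_raw "$(getcap /bin/ping)" || echo "cap_net_raw missing on /bin/ping"
```

An empty result from getcap means no file capabilities are set, in which case the setcap call did not stick, for instance because the filesystem does not store extended attributes.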

@jjethwa
Owner

jjethwa commented Jun 6, 2022

Hi @AlphaDE

Thanks so much for the details! Adding it to the Dockerfile 😄

jjethwa added a commit that referenced this issue Jun 6, 2022
@TheMule71

FYI, I've run into a similar problem. (I don't use your Dockerfile)

Many distros have removed both the s-bit and the capabilities from the ping executable, sometimes relying on other methods to grant users access.

Also, container systems (docker, podman, etc.) play a role, since they remove capabilities from the container as a whole.

Here's what I had to do in my Dockerfile:

RUN setcap 'cap_net_raw+ep' /usr/bin/ping

and run the container with
podman run --network slirp4netns:allow_host_loopback=true --cap-add=cap_net_raw ...
(from standard user, not root)

Hope it helps.
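
One of the "other methods" mentioned above is the net.ipv4.ping_group_range sysctl, which lets processes whose group falls inside a configured range open unprivileged ICMP sockets, so ping needs neither the s-bit nor cap_net_raw. Whether this applies to any of the hosts in this thread is an assumption; the sketch below only shows how the range could be checked, and gid_in_ping_range is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if gid $1 lies within the "LOW HIGH" range given in $2.
# The kernel default "1 0" is an empty range, i.e. the feature is disabled.
gid_in_ping_range() {
    gid=$1
    set -- $2            # split "LOW HIGH" into $1 and $2
    [ "$#" -eq 2 ] && [ "$gid" -ge "$1" ] && [ "$gid" -le "$2" ]
}

# Example usage inside the container, as the nagios user:
#   gid_in_ping_range "$(id -g)" "$(cat /proc/sys/net/ipv4/ping_group_range)" \
#       && echo "unprivileged ICMP sockets are allowed for this group"
```

If the check fails, ping has to fall back to a raw socket, which is where the cap_net_raw settings above come in.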

@jjethwa
Owner

jjethwa commented Aug 2, 2022

Thanks for the tip @TheMule71 😃
