[Solved] Zabbix server can't discover LEDE interfaces

Hi. Anyone else monitoring remote LEDE assets with Zabbix?

I'm having the following issue: the Zabbix OpenWrt templates only work on OpenWrt devices when probing network interfaces.

These Zabbix XML templates fetch a lot of things (date, hostname, reboot, network, CPU, memory, disk...), but the ones that are not working are "Network interface discovery" and "Wifi interface discovery". Phy discovery works fine, as does "Mounted filesystem discovery". This is the Zabbix server interface. The little red exclamation point says "Value should be a JSON object" on LEDE routers but not on OpenWrt ones, and consequently only OpenWrt routers get in/out traffic graphs and other network traffic data.

image

No relevant logs on the Zabbix server:

[root@zabbix01 zabbix]# grep -i WRT000 zabbix_server.log | grep netowrt
[root@zabbix01 zabbix]# grep -i WRT000 zabbix_server.log | grep wifi
[root@zabbix01 zabbix]# grep -i WRT000 zabbix_server.log | grep system.cpu.util
4270:20180109:100432.642 Zabbix agent item "system.cpu.util[,user]" on host "WRT000" failed: another network error, wait for 10 seconds

Packages installed on the router:

root@WRT000:/etc/config# opkg list-installed | grep zabbix
zabbix-agentd - 3.2.6-1
zabbix-extra-mac80211 - 3.2.6-1
zabbix-extra-network - 3.2.6-1
zabbix-extra-wifi - 3.2.6-1

Am I missing some package, or is there a new template available somewhere? I had no luck tinkering with the /etc/zabbix_agentd.conf.d/ files that provide the probe results to the server.

Please try replacing the shipped /etc/zabbix_agentd.conf.d/network script with the following variant:

UserParameter=netowrt.discovery,/root/discover-interfaces.sh

And create a new script /root/discover-interfaces.sh with the following contents:

#!/bin/sh

. /usr/share/libubox/jshn.sh

eval "$(ubus call network.interface dump | jsonfilter -F ': ' -e 'INTERFACES=@.interface[@.l3_device]["interface","l3_device"]')"

json_init
json_add_array "data"

for interface in $INTERFACES; do
	device="${interface#*:}"
	network="${interface%%:*}"

	json_add_object
	json_add_string '{#IF}' "$device"
	json_add_string '{#NET}' "$network"
	json_close_object
done

json_close_array
json_dump

Don't forget to chmod 0755 it in the end to make it executable.
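
For anyone wondering how the loop takes each entry apart: judging from the two expansions, every item in $INTERFACES has the form network:device. Here is a minimal standalone sketch of just that splitting step. The split_pair helper and the sample value are made up for illustration and are not part of the package or of the script above.

```shell
# Hypothetical helper, only to illustrate the two expansions used in the loop.
split_pair() {
	pair="$1"
	device="${pair#*:}"     # drop everything up to the first ':'   -> device name
	network="${pair%%:*}"   # drop everything from the first ':' on -> network name
	printf '%s %s\n' "$network" "$device"
}

split_pair "wan:eth0.10"   # prints: wan eth0.10
```

Both expansions are plain POSIX sh, so this should behave the same in ash on the router.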

Hi Jow.

Still the same issue: “Value should be a JSON object”. Contents of the network file:

root@WRT000:/etc/zabbix_agentd.conf.d# cat network
#see http://wiki.openwrt.org/doc/howto/zabbix for ready to use templates

UserParameter=netowrt.discovery,/root/discover-interfaces.sh

Used chmod +x /root/discover-interfaces.sh to add execute permissions. Double check:

root@WRT000:/etc/zabbix_agentd.conf.d# ls -l /root/discover-interfaces.sh
-rwxr-xr-x    1 root     root           434 Jan  9 15:18 /root/discover-interfaces.sh

And, if I run the script manually I can see that the interface list is being assembled correctly:

root@WRT000:~# ./discover-interfaces.sh
{ "data": [ { "{#IF}": "lo", "{#NET}": "loopback" }, { "{#IF}": "br-vlan110", "{#NET}": "vlan110" }, { "{#IF}": "eth0.111", "{#NET}": "vlan111" }, { "{#IF}": "br-vlan113", "{#NET}": "vlan113" }, { "{#IF}": "br-vlan115", "{#NET}": "vlan115" }, { "{#IF}": "br-vlan116", "{#NET}": "vlan116" }, { "{#IF}": "eth0.119", "{#NET}": "vlan119" }, { "{#IF}": "tun0", "{#NET}": "vpn0" }, { "{#IF}": "eth0.10", "{#NET}": "wan" } ] }

I've also tried to remove the host from the template and add it again without success. Disabling and reenabling the host at the server configuration wasn't helpful either.

Don't know if it helps, but I'm using the latest stable Zabbix server release (3.4.5).

Can you compare that with the output of the command in /etc/zabbix_agentd.conf.d/network on a working OpenWrt node?

Sure. This is the output on an OpenWrt router:

root@WRTXXXX:/etc/zabbix_agentd.conf.d# lua -l uci -e 'x = uci.cursor(nil, "/var/state");list = "{\"data\":[";x:foreach("network", "interface", function(s) list=list.."{\"{#IF}\":\""..s.ifname.."\", \"{#NET}\":\""..s[".name"].."\"}," end); list=string.gsub(list,",$",""); print(list.."]}")'

{"data":[{"{#IF}":"lo", "{#NET}":"loopback"},{"{#IF}":"eth0.10", "{#NET}":"wan"},{"{#IF}":"tun0", "{#NET}":"vpn0"},{"{#IF}":"br-vlan115", "{#NET}":"vlan115"},{"{#IF}":"br-vlan110", "{#NET}":"vlan110"},{"{#IF}":"eth0.111", "{#NET}":"vlan111"},{"{#IF}":"br-vlan113", "{#NET}":"vlan113"},{"{#IF}":"br-vlan116", "{#NET}":"vlan116"},{"{#IF}":"eth0.119", "{#NET}":"vlan119"}]}

I can't spot many differences except for extra spaces at the end of the JSON string...

Hmm, I'm a bit puzzled about what it could be then; maybe Zabbix uses different data models because it does not know about the "LEDE" distribution. My next step would probably be tcpdumping the Zabbix agent traffic to see if there are any differences on the wire or if it is just the Zabbix server failing to parse the data.
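
One way to compare the two outputs without eyeballing them is to pull out just the {#IF} values and sort them, since the entries come back in a different order on each router. A rough sketch, assuming grep -o, sed and sort are available; the list_ifs helper is made up for illustration and the two sample strings are shortened versions of the outputs in this thread.

```shell
# Hypothetical helper: print the {#IF} values of a discovery JSON string,
# one per line, sorted, so that spacing and entry order no longer matter.
list_ifs() {
	printf '%s\n' "$1" \
		| grep -o '"{#IF}": *"[^"]*"' \
		| sed 's/.*: *"\(.*\)"/\1/' \
		| sort
}

# Shortened sample outputs (compact Lua style vs. spaced jshn style).
openwrt='{"data":[{"{#IF}":"lo", "{#NET}":"loopback"},{"{#IF}":"eth0.10", "{#NET}":"wan"}]}'
lede='{ "data": [ { "{#IF}": "eth0.10", "{#NET}": "wan" }, { "{#IF}": "lo", "{#NET}": "loopback" } ] }'

if [ "$(list_ifs "$openwrt")" = "$(list_ifs "$lede")" ]; then
	echo "same interfaces"
else
	echo "interface lists differ"
fi
```

This only compares device names, not the {#NET} values, so it is a sanity check rather than a full JSON diff.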

What those routers have in common:

  • All of them are connected through a local ADSL/DSL provider

  • OpenVPN client. LEDE/OpenWrt assets reach Zabbix through routing (no NAT)

  • VLAN interfaces

  • Mwan3 with Wan being wan1, lan1 being wan2 and usb-rndis being wan3. No load balancing, only failover.

  • Service start order changed to: network(20), zabbix_agentd(60), openvpn(90). This was done to keep openvpn from trying to connect before the network is up

  • A diff between the OpenWrt and LEDE Zabbix network files shows no difference.

  • Pretty much the same clean Zabbix agent configuration. Only the IP and Hostname change:

    root@WRT000:~# cat /etc/zabbix_agentd.conf
    PidFile=/tmp/zabbix_agentd.pid
    LogType=system
    Server=127.0.0.1,zabbix.domain.com,172.x.x.x
    #tun0 ip address
    ListenIP=10.255.x.x
    StartAgents=1
    Hostname=WRT000
    Include=/etc/zabbix_agentd.conf.d/
    
  • Forward allowed between lan zones and vpn zones:

image

PS: If I run the network UserParameter command that ships with zabbix-extra-network on LEDE, I get an "attempt to concatenate field 'ifname' (a nil value)" error:

LEDE:

root@WRT000:~# lua -l uci -e 'x = uci.cursor(nil, "/var/state");list = "{\"data\":[";x:foreach("network", "interface", function(s) list=list.."{\"{#IF}\":\""..s.ifname.."\", \"{#NET}\":\""..s[".name"].."\"}," end); list=string.gsub(list,",$",""); print(list.."]}")'
lua: (command line):1: attempt to concatenate field 'ifname' (a nil value)
stack traceback:
        [C]: in function 'foreach'
        (command line):1: in main chunk
        [C]: ?

OpenWrt:

root@WRTXXX:~# lua -l uci -e 'x = uci.cursor(nil, "/var/state");list = "{\"data\":[";x:foreach("network", "interface", function(s) list=list.."{\"{#IF}\":\""..s.ifname.."\", \"{#NET}\":\""..s[".name"].."\"}," end); list=string.gsub(list,",$",""); print(list.."]}")'

{"data":[{"{#IF}":"lo", "{#NET}":"loopback"},{"{#IF}":"eth0.10", "{#NET}":"wan"},{"{#IF}":"tun0", "{#NET}":"vpn0"},{"{#IF}":"br-vlan115", "{#NET}":"vlan115"},{"{#IF}":"br-vlan110", "{#NET}":"vlan110"},{"{#IF}":"eth0.111", "{#NET}":"vlan111"},{"{#IF}":"br-vlan113", "{#NET}":"vlan113"},{"{#IF}":"br-vlan116", "{#NET}":"vlan116"},{"{#IF}":"eth0.119", "{#NET}":"vlan119"}]}

Yes, the failing Lua UserParameter command was my initial suspicion as to why the agent does not work on LEDE. It relies on deprecated functionality.

I wonder if you restarted the Zabbix agent process after changing the UserParameter to use the discover-interfaces.sh script?

Already did that. And if I use zabbix_get to manually fetch this UserParameter on LEDE, these are the results:

Using the script located at /root/discover-interfaces.sh :

[root@zabbix01 ~]# zabbix_get -s 10.255.x.y -p 10050 -k "netowrt.discovery"
Command failed: Not found
Failed to parse json data: unexpected end of data
{ "data": [ ] }

Using the default lua command:

[root@zabbix01 ~]# zabbix_get -s 10.255.x.y -p 10050 -k "netowrt.discovery"
lua: (command line):1: attempt to concatenate field 'ifname' (a nil value)
stack traceback:
        [C]: in function 'foreach'
        (command line):1: in main chunk
        [C]: ?

And I double-checked for missing commands like jsonfilter, and they do exist:

root@WRT000:~# which jsonfilter
/usr/bin/jsonfilter

The script does seem to be getting executed, because if I move it to another location I get the following error:

[root@zabbix01 ~]# zabbix_get -s 10.255.x.y -p 10050 -k "netowrt.discovery"
sh: /root/discover-interfaces.sh: not found

As what user is the zabbix agent process running?

The zabbix user. It's the default user:

root@WRT000:~# ps w | grep zabbix
16300 zabbix    1316 S    /usr/sbin/zabbix_agentd -c /etc/zabbix_agentd.conf -f
16302 zabbix    1316 S    /usr/sbin/zabbix_agentd: collector [idle 1 sec]
16303 zabbix    1324 S    /usr/sbin/zabbix_agentd: listener #1 [waiting for connection]

And it has a valid user shell:

root@WRT000:~# grep zabbix /etc/passwd
zabbix:x:53:53:zabbix:/var/run/zabbix:/bin/ash

Ah, right, that's why it is unable to fetch the ubus state information: the agent runs the script as the non-root zabbix user. You need to add an ACL for the Zabbix agent.

Create a file /usr/share/acl.d/zabbix-agent.json with the following contents:

{
	"user": "zabbix",
	"access": {
		"network.interface": {
			"methods": [ "dump" ]
		}
	}
}

And reload the ubus service using killall -HUP ubusd. Afterwards retry zabbix_get using the network interface script.

If that works out, I'll submit all that upstream for inclusion in the package.

You rock. Worked like a charm. After creating the zabbix-agent.json file and reloading ubus it started fetching information :smile:

[root@zabbix01 ~]# zabbix_get -s 10.255.x.y -p 10050 -k "netowrt.discovery"
{ "data": [ { "{#IF}": "lo", "{#NET}": "loopback" }, { "{#IF}": "br-vlan110", "{#NET}": "vlan110" }, { "{#IF}": "eth0.111", "{#NET}": "vlan111" }, { "{#IF}": "br-vlan113", "{#NET}": "vlan113" }, { "{#IF}": "br-vlan115", "{#NET}": "vlan115" }, { "{#IF}": "br-vlan116", "{#NET}": "vlan116" }, { "{#IF}": "eth0.119", "{#NET}": "vlan119" }, { "{#IF}": "tun0", "{#NET}": "vpn0" }, { "{#IF}": "eth0.10", "{#NET}": "wan" } ] } 

Thanks for all the help.
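
For anyone hitting the same thing: the empty { "data": [ ] } result from earlier was the tell that the script ran but got nothing back from ubus. A small sketch of a check for that case follows. The has_interfaces helper is hypothetical, and the commented zabbix_get line just reuses the placeholder address from above.

```shell
# Hypothetical helper: succeeds if the discovery JSON contains at least one
# {#IF} entry, fails if the data array came back empty.
has_interfaces() {
	case "$1" in
		*'"{#IF}"'*) return 0 ;;
		*) return 1 ;;
	esac
}

# On the Zabbix server one might feed it the real output, e.g.:
#   has_interfaces "$(zabbix_get -s 10.255.x.y -p 10050 -k netowrt.discovery)"
if has_interfaces '{ "data": [ ] }'; then
	echo "interfaces found"
else
	echo "empty discovery result"
fi
```

It is only a string match, not a JSON parser, so treat it as a quick smoke test.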

Unfortunately, the Zabbix server still seems to be missing something. I even removed the LEDE host from the server and added it again with the same templates. I can fetch the data using zabbix_get, and the web interface no longer shows the JSON warning, but it does not create the network interface items either:

image

And just to let you know: after removing the template together with "Delete All Data", the interface metrics started to work. This was the final step needed to make it work:

image

Yay.

Hi,
I followed the nice tips in this post to get Zabbix to pull information from OpenWrt. In my case I am using the active agent, and Zabbix discovered the interfaces, but there is no data in the graphs. Other system information such as CPU, memory, disk and process data is displayed in the graphs, but the network interfaces show nothing. I changed the agent type to active in the template discovery rules. Do I have to switch to the passive agent to be able to pull network interface data?

Well, I switched jobs a year and a half ago, so I haven't been dealing with Zabbix for the last 2 years. Sorry, I won't be able to help ya...

I have resolved the problem by switching the network interface items of each host to the active agent type. So for each "net.if.in[]" and "net.if.out[]" item, change the type to Zabbix agent (active). I hope this helps anyone who deployed the OpenWrt Zabbix agent in active mode. Thank you!
