I want to get metrics from my router and ap and send them to an influxdb instance.
I installed luci-app-statistics which installed collectd and some plugins. All plugins work as expected on the router and ap.
When I configure the network output plugin with the influxdb host (myserver.lan) and port (25826), the influxdb instance receives no data. After some troubleshooting, I can see the following in the logs:
It seems that collectd has a problem resolving the DNS name for myserver.lan.
I then decided to enter the server's IP address to get past the DNS resolution, but I get the same error:
First of all thanks for your answer.
Sorry for not mentioning it in my first post, but sending the collected data to the influxdb instance never worked, I always got that getaddrinfo error (when using the hostname or the ip address). Before posting, I also rebooted the router and restarted collectd and luci_statistics several times when testing and it never worked.
IPv6 is currently deactivated on my entire network, both on the router and on the hosts (and I wouldn't want to activate it, at least not for the moment), so re-configuring my entire network just to make the network plugin of collectd work is not an option.
On the other hand, I defined an IPv6 domain for my influxdb host (just to see if it would solve the problem). After checking that the host resolves for both IPv4 and IPv6, I restarted collectd, but sadly I still get the same error whether I use the domain name or the IPv4 address.
I am no expert (obviously) and my understanding is quite limited in this area, but it seems a little bit weird to have collectd query getaddrinfo when I input an IPv4 address for the remote influxdb host, and why does it fail in doing so?
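On the question of why getaddrinfo is involved at all for a literal IPv4 address: getaddrinfo is the standard call for turning any host string, whether a name or a numeric address, into a socket address, so a program that accepts both forms will typically pass both through it. Below is a minimal Python sketch of that idea. It is not collectd's code; the function name `resolve_numeric` and the address 192.0.2.10 are made up for illustration.

```python
import socket


def resolve_numeric(host, port):
    """Turn a literal IP address and a port into socket addresses via getaddrinfo."""
    # AI_NUMERICHOST tells the resolver that `host` is already an address,
    # so no DNS lookup is attempted. The call still goes through getaddrinfo.
    infos = socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,   # no address family requested explicitly
        type=socket.SOCK_DGRAM,    # UDP, as used on port 25826 here
        flags=socket.AI_NUMERICHOST,
    )
    # Keep only the address family and the (address, port) tuple of each result.
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, used only as a placeholder.
    for family, sockaddr in resolve_numeric("192.0.2.10", 25826):
        print(family.name, sockaddr)
```

So a getaddrinfo error with an IP address does not by itself mean that a DNS query was sent. It only means the address-conversion step failed.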
Just to make sure it is not my network that is causing the issue, I installed and configured a VM with Debian and collectd and the database in influxdb is correctly populated (I understand what you've linked in your previous post about the getaddrinfo differences between glibc and musl, but in this case I think the implementation in OpenWRT is faulty compared to the one in Debian).
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "syslog" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "battery" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "cpu" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "df" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "disk" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "entropy" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "interface" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "irq" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "load" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "memory" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "processes" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "rrdtool" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "swap" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "users" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: plugin_load: plugin "network" successfully loaded.
Mar 8 16:31:03 debian collectd[11302]: Systemd detected, trying to signal readyness.
Mar 8 16:31:03 debian collectd[11302]: Initialization complete, entering read-loop.
Do you have any other ideas for troubleshooting this besides activating IPv6 on my network? Even if that turned out to be necessary and solved the problem, it should not be the accepted solution, as in some cases IPv6 could be a no-go.
Well, regarding your reproduction example: the ping plugin of collectd actually does support specifying the address family in its config, so "ping" itself could be configured to avoid the error...
But I think that the network plugin has no such support.
All upstream examples are actually based on specifying the IP address directly, instead of a hostname.
Yep, specifying the address family to IPv4 explicitly works.
But it still fails with implicit settings and no IPv6 connectivity.
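To illustrate the difference between the implicit and the explicit address family, here is a small Python sketch of an application-level fallback: try an unspecified family first, and retry with IPv4 only if resolution fails. This is an assumption about how a caller could work around the behaviour, not what collectd or the network plugin does. The name `resolve_with_ipv4_fallback` is hypothetical, and whether the first call fails at all depends on the libc and the flags in use.

```python
import socket


def resolve_with_ipv4_fallback(host, port):
    """Resolve host:port for UDP, retrying with IPv4 only if the first attempt fails."""
    try:
        # Implicit setting: let the resolver pick any address family.
        return socket.getaddrinfo(
            host, port, family=socket.AF_UNSPEC, type=socket.SOCK_DGRAM
        )
    except socket.gaierror:
        # Explicit setting: ask for IPv4 results only.
        return socket.getaddrinfo(
            host, port, family=socket.AF_INET, type=socket.SOCK_DGRAM
        )


if __name__ == "__main__":
    # Loopback is used so the example does not depend on DNS or on a real server.
    for family, _type, _proto, _canon, sockaddr in resolve_with_ipv4_fallback(
        "127.0.0.1", 25826
    ):
        print(family.name, sockaddr)
```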
So, musl/getaddrinfo appears to be the cause of the issue:
Not sure if anything would change in musl upstream, as this problem was already discussed in 2018 and the current behaviour was settled on then. See the discussion thread: https://www.openwall.com/lists/musl/2018/07/11/1
(Not sure if reporting musl bug/feature in OpenWrt bug tracker would have any impact)
There's always a chance to find a better solution if someone reviews the old code.
On the other hand, the issue may never be fixed if it isn't reported.
There're quite a few musl-specific changes and patches:
Thanks everyone for troubleshooting this.
I will most probably try to open a bug report (I have never done so before, but there is a first time for everything; I am not sure I will be able to explain the problem correctly, but I will try).
Centrally collecting metrics will have to wait until the bug is fixed (which, if it happens at all, does not look like it will be soon) or until I finally start using IPv6.
Note that the issue is still relevant even when you do not disable IPv6 specifically.
I tested OpenWrt 19.07.7 x86_64 as a KVM/QEMU guest with mostly default configuration.
So, IPv6 is enabled but IPv6 connectivity is missing and the issue persists.