After updating from a previous version to 18.06-rc1, the router ends up with a high load average within 1-2 days, while CPU usage stays low.
How can I find the reason for the high load (8-20)?
I've found nothing in logread or dmesg, and I can't execute any network-related command like ifconfig, ip, ifup...
Mount a USB stick for non-volatile storage and send your logs to it. Run a cron job that appends date and the output of ps axjf (or, if using busybox ps, then ps w, as I recall) to another file on that stick.
Thank you for your answer!
I've found a USB stick and will wait for the next crash:
logread -f >> /mnt/stick/logread.log &
and in cron:
*/5 * * * * date >> /mnt/stick/ps.log; ps w >> /mnt/stick/ps.log
Looks like what I'd do to try to figure out what's happening!
Another "trick" is to open a terminal over ssh and run something there. Your local terminal program should preserve the output through any crash/freeze of the router as well.
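Since the load is high while CPU use is low, it may also be worth recording which tasks sit in uninterruptible sleep (state D), because Linux counts those toward the load average. A minimal sketch, assuming the usual /proc/<pid>/status layout; the function name list_blocked_tasks, its optional directory argument and the blocked.log file name are made up for illustration:

```shell
#!/bin/sh
# list_blocked_tasks [PROC_ROOT]
# Print "PID NAME" for every task whose State in PROC_ROOT/<pid>/status is D
# (uninterruptible sleep). PROC_ROOT defaults to /proc; the argument exists
# only so the function can be tried against a copy of that directory.
list_blocked_tasks() {
    root="${1:-/proc}"
    for status in "$root"/[0-9]*/status; do
        # The glob stays literal when nothing matches, and tasks can exit mid-scan.
        [ -r "$status" ] || continue
        pid="$(basename "$(dirname "$status")")"
        awk -v pid="$pid" '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            END       { if (state == "D") print pid, name }
        ' "$status" 2>/dev/null
    done
}

# Possible use from the cron job or an ssh session (log file name is hypothetical):
# { date; cat /proc/loadavg; list_blocked_tasks; } >> /mnt/stick/blocked.log
```

If the same task shows up in state D in every sample while the load climbs, that task (or the driver it is waiting on) is a reasonable place to start looking.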
After another disconnect from WLAN the load went up again.
Unfortunately I found nothing in the logs except these messages from hostapd:
did not acknowledge authentication response and
IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
which don't look very critical.
The "ps log" looks normal as well.
Only a reboot helps.
I also did a stress test with many devices connected to WLAN, but I could not force
ifconfig, ip, ifup etc. to freeze.
Any other ideas? (As mentioned before, I can log in via ssh.)
Update: figured out that
option wpa_disable_eapol_key_retries '1'
in the wireless config causes the problem.
How should I proceed now?
From https://w1.fi/cgit/hostap/plain/hostapd/hostapd.conf it looks like that option is related to mitigating several of the known security weaknesses of the 802.11 protocols:
# Workaround for key reinstallation attacks
# This parameter can be used to disable retransmission of EAPOL-Key frames that
# are used to install keys (EAPOL-Key message 3/4 and group message 1/2). This
# is similar to setting wpa_group_update_count=1 and
# wpa_pairwise_update_count=1, but with no impact to message 1/4 and with
# extended timeout on the response to avoid causing issues with stations that
# may use aggressive power saving have very long time in replying to the
# EAPOL-Key messages.
# This option can be used to work around key reinstallation attacks on the
# station (supplicant) side in cases those station devices cannot be updated
# for some reason. By removing the retransmissions the attacker cannot cause
# key reinstallation with a delayed frame transmission. This is related to the
# station side vulnerabilities CVE-2017-13077, CVE-2017-13078, CVE-2017-13079,
# CVE-2017-13080, and CVE-2017-13081.
# This workaround might cause interoperability issues and reduced robustness of
# key negotiation especially in environments with heavy traffic load due to the
# number of attempts to perform the key exchange is reduced significantly. As
# such, this workaround is disabled by default (unless overridden in build
# configuration). To enable this, set the parameter to 1.
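If you decide to go without the workaround (which means unpatched client devices lose that protection against key reinstallation), removing the option could look roughly like this. It is only a sketch, assuming the option lives in /etc/config/wireless and is managed through uci; the function name remove_eapol_workaround is made up for illustration:

```shell
#!/bin/sh
# Delete wpa_disable_eapol_key_retries from every section of the wireless
# config that sets it, then commit and re-apply the wifi configuration.
remove_eapol_workaround() {
    # "uci show wireless" prints lines such as
    #   wireless.default_radio0.wpa_disable_eapol_key_retries='1'
    # (the section name differs per device); keep only the key part.
    uci show wireless |
        sed -n 's/^\(wireless\.[^=]*\.wpa_disable_eapol_key_retries\)=.*/\1/p' |
        while read -r key; do
            uci delete "$key"
        done
    uci commit wireless
    wifi
}

# Run on the router:
# remove_eapol_workaround
```

Editing /etc/config/wireless by hand and running wifi afterwards should have the same effect.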
Thank you for your answer!
For me it's not a problem.
I only posted it in case someone has the same issue and is looking for a solution,
or as information for developers: this option worked on 17.06 but does not work on 18.06-rc1 and up (I also tried the snapshots) on MT7628.