Belkin RT3200/Linksys E8450 WiFi AX discussion

Yes, very good. I will wait for the new snapshot version, thanks again :wink:

Tried snapshot r17217-39f81b0bf6; it didn't work for me. Only lan1, wlan0 and wlan1 are in bridge mode.

Edit: I restarted the interface in LuCI and it worked.

Hi. I am currently on the non-UBI snapshot r17217. I am invested in OpenWrt and willing to give up the stock firmware, so I was wondering whether it is possible to flash the UBI installer, i.e. the "openwrt-mediatek-mt7622-linksys_e8450-ubi-initramfs-recovery-installer.itb" image, via LuCI?

Yes.

See advice in https://github.com/dangowrt/linksys-e8450-openwrt-installer

The resulting file openwrt-mediatek-mt7622-linksys_e8450-ubi-initramfs-recovery-installer.itb is suitable to be flashed by the vendor firmware Web-UI as well as non-UBI OpenWrt running on the device

PS:
Note that after flashing the recovery installer and letting it run and reboot the device with the new bootloader, you will end up with the recovery instance of OpenWrt running, and you still need to flash the normal UBI sysupgrade variant (via sysupgrade).
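
A minimal sketch of that last step, assuming the UBI sysupgrade image has already been copied to the device. The function name and the example path are made up for illustration; `sysupgrade -T` only verifies the image and writes nothing.

```shell
# Sketch only: flash a UBI sysupgrade image from the recovery instance.
# The image path is passed in by the caller; nothing here is device-specific.
flash_ubi_sysupgrade() {
    image="$1"
    [ -f "$image" ] || { echo "image not found: $image" >&2; return 1; }
    # -T runs the image checks without touching flash
    sysupgrade -T "$image" || { echo "image check failed" >&2; return 1; }
    # The real flash; the device reboots afterwards and the SSH session drops
    sysupgrade "$image"
}
# Example call (hypothetical path): flash_ubi_sysupgrade /tmp/sysupgrade.itb
```

Checking the image first costs a few seconds and avoids starting a flash with a truncated download.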

Thanks a lot !

Upgraded to snapshot r17216-8c2509dc5f via attended sysupgrade on my second E8450. Random MAC issue is gone, but last night it appears to have locked up. I have remote syslog set up and everything looks normal until this at the end:

Jul 29 22:13:48 ap-lsys-cc kernel: [76507.624097] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000053
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.632912] Mem abort info:
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.635697]   ESR = 0x96000005
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.638756]   EC = 0x25: DABT (current EL), IL = 32 bits
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.644069]   SET = 0, FnV = 0
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.647116]   EA = 0, S1PTW = 0
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.650272] Data abort info:
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.653149]   ISV = 0, ISS = 0x00000005
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.656977]   CM = 0, WnR = 0
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.659957] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000418a9000
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.666441] [0000000000000053] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Jul 29 22:13:48 ap-lsys-cc kernel: [76507.675158] Internal error: Oops: 96000005 [#1] SMP

@BeauSlim can you also share the log lines before and after the oops? There should be a hint towards where the NULL pointer dereference happened...

Yesterday evening I changed the SSID and password on the 2.4 GHz radio to fix a tasmota device configured for a previous setup. At 22:02 I changed it back to the new SSID:

Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx IEEE 802.11: authenticated
Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx IEEE 802.11: associated (aid 1)
Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: AP-STA-CONNECTED 5c:cf:7f:xx:xx:xx
Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx RADIUS: starting accounting session DB4D77548F26F79D
Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx WPA: pairwise key handshake completed (RSN)
Jul 29 19:58:23 ap-lsys-cc hostapd: wlan0: EAPOL-4WAY-HS-COMPLETED 5c:cf:7f:xx:xx:xx

Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: STA f8:cf:c5:xx:xx:xx IEEE 802.11: authenticated
Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: STA f8:cf:c5:xx:xx:xx IEEE 802.11: associated (aid 2)
Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: AP-STA-CONNECTED f8:cf:c5:xx:xx:xx
Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: STA f8:cf:c5:xx:xx:xx RADIUS: starting accounting session E83E62E81DE81FD6
Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: STA f8:cf:c5:xx:xx:xx WPA: pairwise key handshake completed (RSN)
Jul 29 20:02:09 ap-lsys-cc hostapd: wlan0: EAPOL-4WAY-HS-COMPLETED f8:cf:c5:xx:xx:xx

Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: AP-STA-DISCONNECTED 24:a0:74:xx:xx:xx
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx IEEE 802.11: authenticated
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx IEEE 802.11: associated (aid 1)
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: AP-STA-CONNECTED 24:a0:74:xx:xx:xx
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx RADIUS: starting accounting session 28945744EBB4563A
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx WPA: pairwise key handshake completed (RSN)
Jul 29 20:08:51 ap-lsys-cc hostapd: wlan1: EAPOL-4WAY-HS-COMPLETED 24:a0:74:xx:xx:xx

Jul 29 20:34:32 ap-lsys-cc hostapd: wlan1: AP-STA-DISCONNECTED 24:a0:74:xx:xx:xx
Jul 29 20:34:32 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx IEEE 802.11: disassociated due to inactivity
Jul 29 20:34:33 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

Jul 29 20:40:10 ap-lsys-cc hostapd: wlan0: AP-STA-DISCONNECTED 5c:cf:7f:xx:xx:xx
Jul 29 20:40:10 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx IEEE 802.11: disassociated due to inactivity
Jul 29 20:40:11 ap-lsys-cc hostapd: wlan0: STA 5c:cf:7f:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

Jul 29 21:03:07 ap-lsys-cc hostapd: nl80211: kernel reports: key addition failed
Jul 29 21:03:07 ap-lsys-cc hostapd: wlan1: STA 24:a0:74:xx:xx:xx IEEE 802.11: associated (aid 1)
Jul 29 21:03:07 ap-lsys-cc hostapd: wlan1: AP-STA-CONNECTED 24:a0:74:xx:xx:xx

Jul 29 21:08:01 ap-lsys-cc hostapd: wlan1: AP-STA-DISCONNECTED 18:3e:ef:xx:xx:xx
Jul 29 21:08:01 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx IEEE 802.11: disassociated due to inactivity
Jul 29 21:08:02 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx IEEE 802.11: authenticated
Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx IEEE 802.11: associated (aid 2)
Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: AP-STA-CONNECTED 18:3e:ef:xx:xx:xx
Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx RADIUS: starting accounting session 4139BD7DB31902DE
Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: STA 18:3e:ef:xx:xx:xx WPA: pairwise key handshake completed (RSN)
Jul 29 21:31:53 ap-lsys-cc hostapd: wlan1: EAPOL-4WAY-HS-COMPLETED 18:3e:ef:xx:xx:xx

Jul 29 22:02:06 ap-lsys-cc kernel: [75805.147336] device wlan0 left promiscuous mode
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.151862] br-lan: port 3(wlan0) entered disabled state
Jul 29 22:02:06 ap-lsys-cc hostapd: Remove interface 'wlan0'
Jul 29 22:02:06 ap-lsys-cc hostapd: wlan0: interface state ENABLED->DISABLED
Jul 29 22:02:06 ap-lsys-cc hostapd: wlan0: AP-STA-DISCONNECTED f8:cf:c5:xx:xx:xx
Jul 29 22:02:06 ap-lsys-cc hostapd: wlan0: AP-DISABLED
Jul 29 22:02:06 ap-lsys-cc hostapd: wlan0: CTRL-EVENT-TERMINATING
Jul 29 22:02:06 ap-lsys-cc hostapd: rmdir[ctrl_interface=/var/run/hostapd]: Permission denied
Jul 29 22:02:06 ap-lsys-cc hostapd: nl80211: deinit ifname=wlan0 disabled_11b_rates=0
Jul 29 22:02:06 ap-lsys-cc hostapd: nl80211: Failed to remove interface wlan0 from bridge br-lan: Invalid argument
Jul 29 22:02:06 ap-lsys-cc netifd: Network device 'wlan0' link is down
Jul 29 22:02:06 ap-lsys-cc hostapd: Configuration file: /var/run/hostapd-phy0.conf (phy wlan0) --> new PHY
Jul 29 22:02:06 ap-lsys-cc netifd: Network device 'wlan0' link is up
Jul 29 22:02:06 ap-lsys-cc netifd: Network device 'wlan0' link is down
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.648284] br-lan: port 3(wlan0) entered blocking state
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.653613] br-lan: port 3(wlan0) entered disabled state
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.659227] device wlan0 entered promiscuous mode
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.664065] br-lan: port 3(wlan0) entered blocking state
Jul 29 22:02:06 ap-lsys-cc kernel: [75805.669482] br-lan: port 3(wlan0) entered listening state
Jul 29 22:02:06 ap-lsys-cc hostapd: wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
Jul 29 22:02:07 ap-lsys-cc netifd: Network device 'wlan0' link is up
Jul 29 22:02:07 ap-lsys-cc hostapd: wlan0: interface state COUNTRY_UPDATE->ENABLED
Jul 29 22:02:07 ap-lsys-cc hostapd: wlan0: AP-ENABLED
Jul 29 22:02:07 ap-lsys-cc kernel: [75805.727138] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Jul 29 22:02:09 ap-lsys-cc kernel: [75807.725775] br-lan: port 3(wlan0) entered learning state
Jul 29 22:02:11 ap-lsys-cc kernel: [75809.805778] br-lan: port 3(wlan0) entered forwarding state
Jul 29 22:02:11 ap-lsys-cc kernel: [75809.811270] br-lan: topology change detected, sending tcn bpdu

Jul 29 22:13:48 ap-lsys-cc kernel: [76507.624097] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000053

There is nothing in the logs after the already posted Oops line until I manually restarted the device at 11 AM this morning.

I guess the oops caused the device to reboot into recovery mode (like it should in such a case). Can you extract (hopefully) more complete logs from /sys/fs/pstore?

Edit: in recovery mode you will find the device at 192.168.1.1/24 and should be able to log in via SSH as root without a password.
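
For anyone unsure how to grab those logs, here is a rough sketch that packs everything under /sys/fs/pstore into one tarball which can then be copied off the device. The function name and the default output path are assumptions for illustration, not something the recovery image provides.

```shell
# Sketch only: pack all pstore records into one tarball and print its path.
# Both arguments are optional; the defaults are illustrative.
save_pstore() {
    src="${1:-/sys/fs/pstore}"
    out="${2:-/tmp/pstore-logs.tar.gz}"
    # Bail out if the directory is empty or missing
    [ -n "$(ls -A "$src" 2>/dev/null)" ] || { echo "no records in $src" >&2; return 1; }
    tar -czf "$out" -C "$src" . && echo "$out"
}
# Example call on the device: save_pstore
```

Do this before unplugging the device, since the records live in RAM and are lost on power loss.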

I don't think it's a good idea to automatically boot into recovery after a kernel crash, as it causes confusion.
How do I get from recovery back to the normal image? Is a reboot via LuCI or a power cycle enough?

You just have to clear PSTORE and reboot, then everything will be back to normal. This can either be done using rm /sys/fs/pstore/* or by disconnecting the device from power for a short moment, so DRAM content will be cleared.

I think this is extremely useful, especially in snapshot images as kernel oops will not go unnoticed in that way and users for the first time are able to report meaningful things back to us even if the device's serial console is not connected during the crash. If you don't like this, it's also simple to tell U-Boot not to care about PSTORE at all, simply change the bootcmd variable from inside OpenWrt using fw_printenv/fw_setenv.

For a production release things will have to be a bit different, of course. One option would be to let the recovery image handle and clear PSTORE automatically, ie. upload logs to a URL configured in U-Boot environment or store them on an additional UBI storage volume.

And, of course, we need a clear indication in LuCI that we are currently running recovery. It'd also be great to have a notification about logs being present in PSTORE and be able to view and clear them in LuCI. I can provide a JSON-RPC interface for PSTORE ops if anyone is willing to implement the front-end part (I'm not into web/front-end stuff at all, i.e. graphical stuff makes me feel lost and angry, I hate using web browsers and prefer everything in a simple text-mode console myself; ncurses is as far as it gets with UI in my case, at least I can still use that without having to use the mouse).

kernel oops will not go unnoticed in that way

As you can see, it only resulted in "wlan did not work anymore until I power cycled" :smiley: I would be puzzled too if the device "boots itself" into recovery.

What about checking whether /sys/fs/pstore/* exists and then showing an additional line such as "crash logs found" in the LuCI overview, maybe in red? Just by if (file-exist-condition) fields.push('output' in feeds/luci/modules/luci-mod-status/htdocs/luci-static/resources/view/status/include/*.js
But only if it's possible to keep /sys/fs/pstore/* when not booting to recovery automatically.
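
The check itself is tiny. Here is a shell sketch of the condition only (not the LuCI/JS part), with a made-up function name; a status page or login banner could call something like this and show the line when it succeeds.

```shell
# Sketch only: report whether any pstore records exist.
# Prints "crash logs found: N" and returns 0 if there are records,
# prints nothing and returns 1 otherwise.
pstore_notice() {
    dir="${1:-/sys/fs/pstore}"
    count=0
    for f in "$dir"/*; do
        # The unexpanded glob itself does not exist, so an empty dir counts 0
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -gt 0 ]; then
        echo "crash logs found: $count"
        return 0
    fi
    return 1
}
```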

Btw, if some of my access points boot automatically into recovery and get the default IP 192.168.1.1 (the same as the router), it will knock out internet access for the whole network, and it would take some time to notice that it is not a problem with the router.

Additional:
For my device, the Overview page in LuCI shows "Linksys E8450" or "Linksys E8450 (UBI)", depending on the image. It would be great if recovery showed up as "Linksys E8450 (UBI-Recovery)", so I could see it directly :slight_smile:

I have already received many useful PSTORE dumps in the past. Ok, sometimes there was some initial confusion before users understood what had just happened, but I still believe it's worth it even if only a fraction of users manage to extract and submit logs from PSTORE.

PSTORE generally doesn't get lost unless you manually clear it or unplug power for a few seconds. The problem with not booting into recovery if there is something in PSTORE is the potential of triggering (costly, in terms of device lifetime) infinite reboot loops in case something crashes early during boot.
Also, as a useful side-effect, users can decide to manually boot into recovery using echo c > /proc/sysrq-trigger.

Regarding LuCI suggestions: there is https://github.com/openwrt/luci/pull/5041 in order to provide a generic infrastructure to display notifications of all kinds to the user in LuCI.

Regarding default IP in recovery mode: I was thinking to implement a way to store settings relevant to recovery in U-Boot env, or simply use the existing ipaddr from U-Boot env also for OpenWrt when booting into recovery.

In the meantime, maybe move your router away from 192.168.1.0/24 subnet...

If the device does not automatically boot into recovery but shows a red notification instead, users could still send the logs. The PR is interesting, but the discussion about a global LuCI notification mechanism will still take some time. Just adding a line to the overview can be done quickly (and removed again later, if needed).

To prevent a reboot-loop a counter could be added. Something like just a number in /sys/fs/pstore/crashcount and if some limit is exceeded fall back to recovery.

I don't want to change my router's IP; it has been this one since my first LAN! :slight_smile: So I change the default IP of my images with a files/etc/config file. Only recovery does not use this.

Is fw_setenv bootcmd 'run boot_ubi' the right change to make? Does holding the reset button during power-up then still work to get into recovery?

A reboot counter would not be enough, it'd need to be with timestamps (also tricky without RTC) to be able to recognize "early" reboots and what is "early" anyway (in seconds or ms)? I've only seen broken implementations of that approach for now...

Counting the number of records in pstore already works, but I don't see how it would be more transparent/easy for users to understand if their device hangs in recovery after 5 crashes instead of after the first time it happens. It will just delay the problem and keep users unaware of problems (unless they manually check /sys/fs/pstore or have an eye on uptime).

And yes, to not have U-Boot check PSTORE the change to U-Boot environment you stated works as expected (and you will still be able to manually trigger recovery or tftpboot by holding down RESET button during boot).
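
If you go that route, it may be worth saving the old value first so the default behaviour can be restored later. A small sketch, where the function name and the backup path are assumptions of mine; the fw_setenv line is the one quoted above.

```shell
# Sketch only: save the current bootcmd, then stop U-Boot from checking PSTORE.
# The backup location is illustrative; pick any persistent path.
disable_pstore_check() {
    backup="${1:-/root/bootcmd.bak}"
    # -n prints only the value, without the "bootcmd=" prefix
    fw_printenv -n bootcmd > "$backup" || return 1
    fw_setenv bootcmd 'run boot_ubi'
}
# To undo later: fw_setenv bootcmd "$(cat /root/bootcmd.bak)"
```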

That's simple: have some startup script run rm -f /sys/fs/pstore/crashcount on (every) successful reboot (recovery + installed).

Not going to recovery after the first crash helps if something goes wrong after a long runtime.
The boot-loop case is when, e.g., something is wrong in the kernel and the device is not able to boot at all.

In theory that's a good solution. However, it'd require an additional pstore record (crashcount) to be handled by the kernel -- for now, this is all just vanilla Linux features without patching anything related to pstore, but just using it as-is. If you think this is easy to do, please submit patches to upstream Linux and OpenWrt lists.
Imho it'd be easier to have an init-script which handles pstore in recovery, clears it and reboots (according to settings it finds in U-Boot env). I've just been too lazy to implement that (but it's on the list).
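
To make the idea concrete, here is a rough sketch of what the core of such a handler could look like. This is not an existing OpenWrt script: the function name and the destination directory are invented, and reading settings from U-Boot env is left out entirely.

```shell
# Sketch only: copy every pstore record to a destination directory, then
# remove it from pstore. Returns 0 if at least one record was saved and
# cleared, 1 if there was nothing to do, 2 if the destination is unusable.
handle_pstore() {
    src="${1:-/sys/fs/pstore}"
    dest="${2:-/tmp/pstore-saved}"
    found=1
    mkdir -p "$dest" || return 2
    for f in "$src"/*; do
        if [ -f "$f" ]; then
            # Only delete a record once its copy succeeded
            if cp "$f" "$dest/" && rm -f "$f"; then
                found=0
            fi
        fi
    done
    return "$found"
}
# In recovery this could be chained as: handle_pstore && reboot
```

Note that /tmp does not survive a reboot, so a real implementation would need a persistent destination (for example the additional UBI volume mentioned earlier) or an upload step.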

Sorry, I had moved the device (unplugged it) before seeing your reply and I see nothing under /sys/fs/pstore. But I now know better for next time.

Hello @daniel,
my colleague has the same router as I do, and he lost WiFi on his router running OpenWrt. He uses SQM for gaming and suddenly found himself on moderate NAT instead of open NAT. The difference this time is that the router did not go back to the original blue interface as it had in the past; it disconnected for 10 seconds, reconnected, and the WiFi came back. Could this problem recur? Thank you.

PS: I don't know which software version it is running.

The best would be if you manage to extract the logs from /sys/fs/pstore next time it happens, so we will be able to reproduce the cause (I didn't manage to crash it even once, but that can well be related to the behavior of wifi clients; on MT7620 there were problems which could only be triggered by WiFi action frames emitted by an Xbox.... you get the idea...).
If that is not an option for you and you just want crashes to silently reboot the router and keep things functional, you can also disable the PSTORE feature in the U-Boot environment (see above).