Problem on rt2800soc, sporadic reboot since 23.05.5

There seems to be an issue with the rt2x00 driver or something related.
Going from OpenWrt 23.05.4 to 23.05.5, the speed increased from 1 Mbps to 5 Mbps, but during a speed test the router reboots, sometimes as early as 60 seconds after booting.
The router is based on the MT7620.


Does OpenWrt support only one device using this driver, or should we dig up the crystal ball?

You mean which router?
My own, as always: https://shop.gevaelettronica.it/en/4-battery-poe
When I find the time to figure out how to submit a commit, I will add it to the supported devices.

The fault also occurs on version 23.05.4.
It happens much less often there, perhaps because the speed test runs much slower.

I was able to log the moment of the reboot.


[  282.872886] CPU 0 Unable to handle kernel paging request at virtual address 70368b58, epc == 70368b58, ra == 80004574
[  282.883826] Oops[#1]:
[  282.886152] CPU: 0 PID: 8460 Comm: geva_sleep Tainted: G           O       6.6.57 #0
[  282.894062] $ 0   : 00000000 00000001 00000002 fffff000
[  282.899425] $ 4   : 7fc32000 7fc32000 00000001 81997d12
[  282.904780] $ 8   : 00000010 805f294c 00000003 00000008
[  282.910132] $12   : 0044f000 82fe940c 82fe940c ffffff00
[  282.915487] $16   : 7fc32000 00000001 82e47318 82134a00
[  282.920842] $20   : 82b4bb40 8213465c 82b24d20 00000001
[  282.926197] $24   : 00000000 ffffffff
[  282.931551] $28   : 8207c000 8207dd20 8207dde8 80004574
[  282.936906] Hi    : 554a9555
[  282.939845] Lo    : 7ffaa000
[  282.942781] epc   : 70368b58 0x70368b58
[  282.946703] ra    : 80004574 arch_align_stack+0x50/0x70
[  282.952059] Status: 1100e403 KERNEL EXL IE
[  282.956344] Cause : 50800008 (ExcCode 02)
[  282.960435] BadVA : 70368b58
[  282.963373] PrId  : 00019650 (MIPS 24KEc)
[  282.967463] Modules linked in: rt2800soc(O) rt2800mmio(O) rt2800lib(O) nft_fib_inet nf_flow_table_inet mt76x0u(O) mt76x0e(O) mt76x0_common(O) rt2x00soc(O) rt2x00mmio(O) rt2x00lib(O) pppoe nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt76x02_usb(O) mt76x02_lib(O) mt76_usb(O) mt76(O) mac80211(O) l2tp_ppp cfg80211(O) ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda pptp pppox ppp_mppe ppp_async nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc_ccitt compat(O) ledtrig_usbport pppoatm ppp_generic slhc msdos ip_gre gre l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel ip_tunnel br2684 atm nls_utf8 nls_iso8859_1 nls_cp437 crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic sha1_generic
[  282.968109]  seqiv sha3_generic drbg hmac geniv rng ecb cmac arc4 uas usb_storage leds_gpio ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl sd_mod scsi_mod scsi_common ehci_hcd gpio_button_hotplug(O) vfat fat usbcore nls_base usb_common crc32c_generic
[  283.083916] Process geva_sleep (pid: 8460, threadinfo=73c6ccad, task=1c02656e, tls=77e6adf4)
[  283.092541] Stack : ffffffff 00000000 00002fd1 809c0fc4 82134600 801ac390 819979c0 82b4bb40
[  283.101113]         80765280 00000001 8207dde8 80178e48 00000000 00000000 00000000 00000000
[  283.109679]         00000000 00000000 00000000 00000000 00000001 819979c0 00000000 82b4b1e0
[  283.118245]         00000001 80031490 82134600 819979c0 00000000 ba2b197d ffffffff 82b4bb40
[  283.126811]         82134654 80149210 81d12c80 82fdb800 00000100 8020c9ec 00000034 b318f2cc
[  283.135379]         ...
[  283.137880] Call Trace:
[  283.137885]
[  283.141964] [<801ac390>] setup_arg_pages+0x48/0x2cc
[  283.147025] [<80178e48>] free_unref_page+0x4c/0x124
[  283.152030] [<80031490>] flush_itimer_signals+0x34/0x5c
[  283.157504] [<80149210>] arch_pick_mmap_layout+0x1a4/0x1c4
[  283.163181] [<8020c9ec>] load_elf_phdrs+0x78/0xcc
[  283.168235] [<8020ddb0>] load_elf_binary+0x2e4/0x1514
[  283.173524] [<801a33e0>] __kernel_read+0xe8/0x2ac
[  283.178607] [<801accc8>] bprm_execve+0x1ec/0x578
[  283.183332] [<801ad368>] copy_string_kernel+0x110/0x250
[  283.188744] [<801adec8>] do_execveat_common+0x1b4/0x240
[  283.194154] [<8000ff04>] do_page_fault+0xd4/0x554
[  283.198970] [<801aecb8>] sys_execve+0x34/0x48
[  283.203432] [<8000dfc0>] syscall_common+0x34/0x58
[  283.208263]
[  283.209782] Code: (Bad address in epc)
[  283.209782]
[  283.215114]
[  283.217949] ---[ end trace 0000000000000000 ]---
[  283.222708] Kernel panic - not syncing: Fatal exception
[  283.228042] Rebooting in 3 seconds..

Maybe the cause is my geva_sleep.
I lost the source of that program.
I use it when flashing the LEDs from an sh script.

What is the syntax for writing this in the mt7620.mk file?

CONFIG_BUSYBOX_CUSTOM=y
CONFIG_BUSYBOX_CONFIG_USLEEP=y
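
For what it is worth, these two lines look like build configuration symbols rather than device definitions, so they would normally go into the .config file at the top of the build tree (or be set through make menuconfig) and not into mt7620.mk. A minimal sketch, assuming a standard OpenWrt source tree; the helper name add_busybox_usleep is made up for illustration:

```sh
#!/bin/sh
# Sketch: append the two BusyBox symbols from the post to a build .config.
# add_busybox_usleep is a hypothetical helper, not part of OpenWrt.
add_busybox_usleep() {
  cfg=$1
  for sym in CONFIG_BUSYBOX_CUSTOM CONFIG_BUSYBOX_CONFIG_USLEEP; do
    # Skip symbols that are already enabled so no duplicates are written
    grep -q "^$sym=y\$" "$cfg" 2>/dev/null || echo "$sym=y" >> "$cfg"
  done
}

# Assumed usage from the top of the source tree (not run here):
# add_busybox_usleep .config && make defconfig
```

Running make defconfig afterwards lets the build system expand the rest of the configuration; whether the applet ends up in the image should still be checked in the resulting .config.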

The problem is the same even if I replace geva_sleep with usleep.
The cause seems to be the sh script I use to control the LEDs.
Could it be that the command called cyclically during the speed test is the cause?
swconfig dev switch0 port 0 show

#!/bin/sh

while true
do
  # Read the link line for switch port 0
  STATUS=$(swconfig dev switch0 port 0 show | grep "link:")
  DOWN=$(echo "$STATUS" | grep "link:down")
  FULL=$(echo "$STATUS" | grep "speed:100baseT full-duplex")
  if [ -n "$DOWN" ]
  then
    # Disconnected
    echo '*NoEth*' > /dev/ttyS0
  elif [ -n "$FULL" ]
  then
    # 100baseT full duplex
    echo '*EthFull*' > /dev/ttyS0
  else
    # Anything else is reported as half duplex
    echo '*EthHalf*' > /dev/ttyS0
  fi
  # usleep takes microseconds, so this pauses for 0.5 ms
  usleep 500
done
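
The trace above dies inside sys_execve, and this loop starts several external programs (swconfig, three greps, usleep) on every pass with almost no pause. A rough sketch of the same classification done with a shell builtin, so that only swconfig itself is executed per pass. The function name link_state and the 1-second sleep are assumptions for illustration, and this only lowers how often execve is hit, it does not fix whatever the kernel is doing:

```sh
#!/bin/sh
# Sketch: classify the swconfig link line with `case` instead of grep.
# link_state is a hypothetical helper name.
link_state() {
  case "$1" in
    *link:down*) echo "NoEth" ;;
    *"speed:100baseT full-duplex"*) echo "EthFull" ;;
    *) echo "EthHalf" ;;
  esac
}

# Assumed main loop, left commented out because it needs the router:
# while true; do
#   STATUS=$(swconfig dev switch0 port 0 show)
#   echo "*$(link_state "$STATUS")*" > /dev/ttyS0
#   sleep 1
# done
```

The strings matched are the same ones the original script greps for.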
[  216.931270] CPU 0 Unable to handle kernel paging request at virtual address 70368b58, epc == 70368b58, ra == 80004574
[  216.942252] Oops[#1]:
[  216.944578] CPU: 0 PID: 17384 Comm: grep Tainted: G           O       6.6.57 #0
[  216.952044] $ 0   : 00000000 00000001 00000002 8069760c
[  216.957410] $ 4   : 7fc5af8f 00000001 ffffff83 82c14c00
[  216.962765] $ 8   : 00000000 00000000 00000001 82bea5a5
[  216.968118] $12   : 00000000 80728f80 00000003 82c68168
[  216.973472] $16   : 7fc5af8f 82bea5a0 00000000 81a2f680
[  216.978827] $20   : 80696714 77ea9e38 82c83140 80760000
[  216.984182] $24   : 00000000 82ed9cbc
[  216.989535] $28   : 82ed8000 82ed9dd0 82ed9de8 80004574
[  216.994890] Hi    : 554a9555
[  216.997829] Lo    : 7ffaa000
[  217.000766] epc   : 70368b58 0x70368b58
[  217.004687] ra    : 80004574 arch_align_stack+0x50/0x70
[  217.010043] Status: 1100e403 KERNEL EXL IE
[  217.014327] Cause : 50800008 (ExcCode 02)
[  217.018418] BadVA : 70368b58
[  217.021355] PrId  : 00019650 (MIPS 24KEc)
[  217.025445] Modules linked in: rt2800soc(O) rt2800mmio(O) rt2800lib(O) nft_fib_inet nf_flow_table_inet mt76x0u(O) mt76x0e(O) mt76x0_common(O) rt2x00soc(O) rt2x00mmio(O) rt2x00lib(O) pppoe nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt76x02_usb(O) mt76x02_lib(O) mt76_usb(O) mt76(O) mac80211(O) l2tp_ppp cfg80211(O) ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda pptp pppox ppp_mppe ppp_async nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc_ccitt compat(O) ledtrig_usbport pppoatm ppp_generic slhc msdos ip_gre gre l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel ip_tunnel br2684 atm nls_utf8 nls_iso8859_1 nls_cp437 crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_generic sha1_generic
[  217.026089]  seqiv sha3_generic drbg hmac geniv rng ecb cmac arc4 uas usb_storage leds_gpio ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl sd_mod scsi_mod scsi_common ehci_hcd gpio_button_hotplug(O) vfat fat usbcore nls_base usb_common crc32c_generic
[  217.141897] Process grep (pid: 17384, threadinfo=56b20611, task=60e06a97, tls=77e3fdf4)
[  217.150080] Stack : 77dd7000 77de9bc0 82c8a000 00000001 82c8a400 8020e6f0 00000100 807e2f2c
[  217.158653]         00000006 00000000 00000012 00000000 00000000 00000003 0045fac8 00000002
[  217.167221]         0044eb70 00400000 00460028 00400034 80696714 00403a90 00000000 00000005
[  217.175788]         77de9bc0 00000002 77dd7000 8069760c 00000034 00000000 00000000 00000000
[  217.184353]         82c83500 00000000 00000100 fd481fd4 807e3898 fffffff8 82c8a400 807e2f2c
[  217.192922]         ...
[  217.195423] Call Trace:
[  217.195428]
[  217.199434] [<8020e6f0>] load_elf_binary+0xc24/0x1514
[  217.204609] [<801accc8>] bprm_execve+0x1ec/0x578
[  217.209413] [<801ad368>] copy_string_kernel+0x110/0x250
[  217.214826] [<801adec8>] do_execveat_common+0x1b4/0x240
[  217.220171] [<80010014>] do_page_fault+0x1e4/0x554
[  217.225075] [<801aecb8>] sys_execve+0x34/0x48
[  217.229536] [<8000dfc0>] syscall_common+0x34/0x58
[  217.234430]
[  217.235949] Code: (Bad address in epc)
[  217.235949]
[  217.241278]
[  217.244319] ---[ end trace 0000000000000000 ]---
[  217.250236] Kernel panic - not syncing: Fatal exception
[  217.255599] Rebooting in 3 seconds..
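
Both traces fail at the same address (epc == 70368b58, ra in arch_align_stack) but in different programs (geva_sleep and grep). To compare saved logs quickly, something like the following could be used; it is only a sketch, and the function name oops_summary is invented for this post:

```sh
#!/bin/sh
# Sketch: print the faulting addresses and the process name from an
# oops log read on stdin. oops_summary is a hypothetical helper name.
oops_summary() {
  sed -n \
    -e 's/.*Comm: \([^ ]*\) .*/comm=\1/p' \
    -e 's/.*epc == \([0-9a-f]*\), ra == \([0-9a-f]*\).*/epc=\1 ra=\2/p'
}

# Assumed usage with a saved log file (not run here):
# oops_summary < /tmp/oops.txt
```

If every crash reports the same epc and ra regardless of the process name, that points at the exec path in the kernel rather than at one particular program.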

The problem seems to be the Wi-Fi LED script.

What alternative do I have to the command "iw phy0-ap0 station dump"?

#!/bin/sh

# LED brightness files for the LEDs defined in the device tree
ETH=/sys/devices/platform/leds/leds/blue.eth/brightness
POWER=/sys/devices/platform/leds/leds/orange.power/brightness
WIFI=/sys/devices/platform/leds/leds/red.wifi/brightness

while true
do
	# Non-empty when at least one station is associated
	ST0=$(iw phy0-ap0 station dump | grep "Station")
	ST1=""
	ST2=""
	if [ -n "$ST0" ] || [ -n "$ST1" ] || [ -n "$ST2" ]
	then
		echo 255 > "$WIFI"
	else
		echo 0 > "$WIFI"
	fi
	# usleep takes microseconds, so this pauses for 0.5 ms
	usleep 500
done
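
This is not a replacement for iw itself, but the loop can at least be made much lighter: match the output with a shell builtin instead of grep, write the LED only when the state changes, and poll once per second. A minimal sketch under those assumptions; led_value and the LAST variable are made-up names, and the LED path is the one from the script above:

```sh
#!/bin/sh
# Sketch: map `iw ... station dump` output to an LED brightness value
# using `case` only. led_value is a hypothetical helper name.
WIFI=/sys/devices/platform/leds/leds/red.wifi/brightness

led_value() {
  case "$1" in
    *Station*) echo 255 ;;  # at least one associated station
    *) echo 0 ;;            # no stations listed
  esac
}

# Assumed main loop, left commented out because it needs the router:
# LAST=""
# while true; do
#   NEW=$(led_value "$(iw phy0-ap0 station dump)")
#   if [ "$NEW" != "$LAST" ]; then
#     echo "$NEW" > "$WIFI"
#     LAST=$NEW
#   fi
#   sleep 1
# done
```

With this shape only iw and sleep are executed per pass, instead of iw, grep and usleep every half millisecond.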

I abandoned 23.05.5.
I could not get it to work.
I always get a kernel panic.

The driver definitely has problems.

The driver does not support WPA3 (more accurately, PMF/802.11w).
The CPU in a 10-year-old dustball is probably getting too hot.

This applies to rt2800soc only (not the other rt2xyz drivers); you can file a documentation request to add a caveat to the wiki pages.

I did not understand a single word.

Up to version 23.05.4 it worked correctly.
Since version 23.05.5, the command
iw phy0-ap0 station
repeated in 1-second cycles causes a kernel panic.

Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:

ubus call system board
cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall

On 23.05.4 or 23.05.5?
Right now I only have a device with 23.05.4.

More about the configs: those of .4 and .5 are interchangeable, and the update should not introduce new memory leaks etc.
EDIT: adjusted title

Something has changed, because it no longer works.
There is a new bug.

root@PowerLink_at2:~# ubus call system board
{
        "kernel": "5.15.162",
        "hostname": "PowerLink_at2",
        "system": "MediaTek MT7620A ver:2 eco:6",
        "model": "GEVA Battery POE",
        "board_name": "geva,batterypoe",
        "rootfs_type": "squashfs",
        "release": {
                "distribution": "OpenWrt",
                "version": "23.05.4",
                "revision": "r24012-d8dd03c46f",
                "target": "ramips/mt7620",
                "description": "OpenWrt 23.05.4 r24012-d8dd03c46f"
        }
}

root@PowerLink_at2:~# cat /etc/config/network

config interface 'loopback'
        option device 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fdfa:e240:202f::/48'

config device
        option name 'br-lan'
        option type 'bridge'
        list ports 'eth0.1'

config interface 'lan'
        option device 'br-lan'
        option proto 'static'
        option netmask '255.255.255.0'
        option ip6assign '60'
        option ipaddr '192.168.1.69'
        option gateway '192.168.1.1'
        list dns '192.168.1.1'

config switch
        option name 'switch0'
        option reset '1'
        option enable_vlan '1'

config switch_vlan
        option device 'switch0'
        option vlan '1'
        option vid '1'
        option ports '0 6t'

root@PowerLink_at2:~# cat /etc/config/wireless
config wifi-device 'radio0'
        option type 'mac80211'
        option hwmode '11g'
        option path 'platform/10180000.wmac'
        option channel '6'
        option txpower '10'
        option cell_density '0'
        option htmode 'HT20'
        option legacy_rates '1'

config wifi-iface 'default_radio0'
        option device 'radio0'
        option network 'lan'
        option mode 'ap'
        option encryption 'none'
        option ssid 'www.linktechs.net'
        option disassoc_low_ack '0'
        option short_preamble '0'

config wifi-device 'radio1'
        option type 'mac80211'
        option hwmode '11a'
        option path 'platform/101c0000.ehci/usb1/1-1/1-1:1.0'
        option htmode 'VHT80'
        option noscan '1'
        option channel 'auto'
        option cell_density '0'

config wifi-iface 'default_radio1'
        option device 'radio1'
        option network 'lan'
        option mode 'ap'
        option encryption 'none'
        option ssid 'PowerLink_5Ghz'

root@PowerLink_at2:~# cat /etc/config/dhcp

config dnsmasq
        option domainneeded '1'
        option boguspriv '1'
        option filterwin2k '0'
        option localise_queries '1'
        option rebind_protection '1'
        option rebind_localhost '1'
        option local '/lan/'
        option domain 'lan'
        option expandhosts '1'
        option nonegcache '0'
        option authoritative '1'
        option readethers '1'
        option leasefile '/tmp/dhcp.leases'
        option resolvfile '/tmp/resolv.conf.d/resolv.conf.auto'
        option nonwildcard '1'
        option localservice '1'
        option ednspacket_max '1232'

config dhcp 'lan'
        option interface 'lan'
        option start '100'
        option limit '150'
        option leasetime '12h'
        option dhcpv4 'server'
        option dhcpv6 'server'
        option ra 'server'
        list ra_flags 'managed-config'
        list ra_flags 'other-config'
        option ignore '1'

config odhcpd 'odhcpd'
        option maindhcp '0'
        option leasefile '/tmp/hosts/odhcpd'
        option leasetrigger '/usr/sbin/odhcpd-update'
        option loglevel '4'

root@PowerLink_at2:~# cat /etc/config/firewall

config defaults
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option synflood_protect '1'

config include
        option path '/etc/firewall.user'

root@PowerLink_at2:~#

openwrt\target\linux\ramips\image\mt7620.mk

define Device/geva_batteryPoE
  SOC := mt7620a
  IMAGE_SIZE := 16064k
  DEVICE_VENDOR := GEVA
  DEVICE_MODEL := BatteryPoE
  DEVICE_PACKAGES := kmod-mt76x0e kmod-usb2 kmod-usb-ohci \
	kmod-usb-ledtrig-usbport kmod-usb-storage swconfig iperf3 luci \
	kmod-mt76x0u  kmod-usb-storage-extras kmod-usb-storage-uas \
	chat ppp ppp-mod-pppoa ppp-mod-pppoe ppp-mod-pppol2tp \
	ppp-mod-pptp usb-modeswitch usbutils luci-theme-material \
	block-mount kmod-fs-msdos luci-app-commands speedtestcpp \
	wireless-regdb
  SUPPORTED_DEVICES += BatteryPoE
endef
TARGET_DEVICES += geva_batteryPoE

mt7620a_geva_batteryPoE.dts

/dts-v1/;

#include "mt7620a.dtsi"
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/input/input.h>

/ {
    compatible = "geva,batterypoe", "ralink,mt7620a-soc";
    model = "GEVA Battery POE";

    chosen {
		bootargs = "console=ttyS0,115200";
    };
	
	aliases {
		label-mac-device = &ethernet;
		led-boot = &led_power;
		led-failsafe = &led_power;
		led-running = &led_power;
		led-upgrade = &led_power;
	};	

    leds {
	compatible = "gpio-leds";

		led_power: power {
			label = "orange.power";
			gpios = <&gpio0 9 GPIO_ACTIVE_LOW>;
		};

		eth {
		    label = "blue.eth";
		    gpios = <&gpio2 0 GPIO_ACTIVE_LOW>;
		};

		wifi {
			label = "red.wifi";
			gpios = <&gpio3 0 GPIO_ACTIVE_LOW>;
		};
	};

	keys {
		compatible = "gpio-keys-polled";
		poll-interval = <20>;

        reset {
			label = "reset";
			gpios = <&gpio0 1 GPIO_ACTIVE_LOW>;
			linux,code = <KEY_RESTART>;
        };
    };

};

&gpio0 {
    status = "okay";
};

&gpio2 {
    status = "okay";
};

&gpio3 {
    status = "okay";
};

// GPIO 9 and 14 are shared with UART Full, GPIO 72 with the WLAN LED (see datasheets).
// GPIO 1 and 2 are shared with I2C,
// so switch the pins in use to GPIO mode:
&state_default {
    gpio {
	groups = "i2c", "wled", "uartf", "ephy";
	function = "gpio";
    };
};

// The device uses a 16 MB SPI NOR flash, so enable the SPI controller
// and describe the flash layout
&spi0 {
    status = "okay";

    flash@0 {
	compatible = "jedec,spi-nor";
	reg = <0>;
	spi-max-frequency = <50000000>;

		partitions {
			compatible = "fixed-partitions";
			#address-cells = <1>;
			#size-cells = <1>;

			partition@0 {
			label = "u-boot";
			reg = <0x0 0x30000>;
			read-only;
			};

			partition@30000 {
			label = "u-boot-env";
			reg = <0x30000 0x10000>;
			read-only;
			};

			factory: partition@40000 {
			label = "factory";
			reg = <0x40000 0x10000>;
			read-only;
			};

			partition@50000 {
			compatible = "denx,uimage";
			label = "firmware";
			reg = <0x50000 0xfb0000>;
			};
		};
    };
};

// The board uses USB 2.0, so enable it (and USB 1.1 too)
&ehci {
    status = "okay";
};

&ohci {
    status = "okay";
};

//Enable eth and set MAC/switch defaults
&ethernet {
	nvmem-cells = <&macaddr_factory_28>;
	nvmem-cell-names = "mac-address";
	mediatek,portmap = "wllll";
};

// Enable built-in Wi-Fi
&wmac {
	status = "okay";
// the factory partition is empty and its data is incorrect
//      ralink,mtd-eeprom = <&factory 0>;
};

&factory {
	compatible = "nvmem-cells";
	#address-cells = <1>;
	#size-cells = <1>;

	macaddr_factory_28: macaddr@28 {
		reg = <0x28 0x6>;
	};
};

The dmesg output looks like a memory leak. Nothing in your config suggests anything like heavy dynamic memory use.

Check the changelog (superficially, nothing in it relates to the failure encountered).

maybe it is in mainline kernel then?

Do you want a hardware unit to test?

Please take a look:

Your kernel oops log shows you are using the master branch, not v23. If possible, please attach the oops log from OpenWrt v23.
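
The mismatch is visible in the posted data: the oops header reports kernel 6.6.57, while ubus call system board on 23.05.4 reports 5.15.162. A small sketch for comparing the kernel series of two version strings; kernel_series and same_series are hypothetical helper names:

```sh
#!/bin/sh
# Sketch: reduce a kernel version to its series, e.g. 6.6.57 -> 6.6,
# by stripping the last dot-separated field.
kernel_series() {
  echo "${1%.*}"
}

# Succeeds only when both versions belong to the same series.
same_series() {
  [ "$(kernel_series "$1")" = "$(kernel_series "$2")" ]
}

# Assumed usage on the device (not run here):
# same_series "$(uname -r)" 5.15.162 || echo "not a 23.05 kernel"
```

For the two values in this thread the series are 6.6 and 5.15, so the oops cannot have come from a 23.05.x image.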


You mean I have forgotten this command?

# Select a specific code revision
git branch -a
git tag
git checkout v23.05.5

The newly introduced init script will prevent calling get_random_u32_below() in arch_align_stack().

Will I find it in future versions?
Will I be able to try again from v23.05.6?