Kernel crash [Help!]

Device: Zbtlink ZBT-WG1602-V04 (32M)
SOC: MT7621AT
Software version: OpenWrt 23.05.0 r23497-6637af95aa / LuCI openwrt-23.05 branch git-23.236.53405-fc638c8
SSD filesystem: NTFS
Filesystem Driver: kmod-fs-ntfs3

Log:

[Oct17 00:30] CPU 1 Unable to handle kernel paging request at virtual address 02715bb0, epc == 8016db84, ra == 8016da88
[  +0.010771] Oops[#1]:
[  +0.002348] CPU: 1 PID: 36 Comm: kcompactd0 Not tainted 5.15.134 #0
[  +0.006324] $ 0   : 00000000 00000001 02715ba8 00000002
[  +0.005312] $ 4   : fffffffc 00000000 00000008 0000000c
[  +0.005271] $ 8   : 0000244a 00000000 00000000 00000000
[  +0.005234] $12   : f0000080 000089b0 00000000 00000000
[  +0.005302] $16   : 80e02a94 0001c47d 00000000 81555e30
[  +0.005297] $20   : 00000000 00000050 00000000 00000000
[  +0.005326] $24   : eb3d5f80 00000000

Description:
I have installed an mPCIe to USB 3.0 module (Renesas uPD720202) in the device, which provides two USB 3.0 ports. An SSD is connected to one of these ports and is identified using the "uas" driver. The disk is mounted and used as aria2 storage for downloads, and is also shared using samba4.

When the crash happens:
1- When aria2 is downloading (most severe, most frequent).
2- When copying/writing files to the SSD (rare, but it does happen occasionally).

Some data:

```
root@OpenWrt:~# lsusb -t
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M (this is the renesas device)
    |__ Port 2: Dev 2, If 0, Class=, Driver=uas, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-mtk/1p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-mtk/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=, Driver=cdc_ether, 480M
    |__ Port 1: Dev 2, If 1, Class=, Driver=cdc_ether, 480M
    |__ Port 1: Dev 2, If 2, Class=, Driver=cdc_acm, 480M
    |__ Port 1: Dev 2, If 3, Class=, Driver=cdc_acm, 480M
    |__ Port 1: Dev 2, If 4, Class=, Driver=cdc_acm, 480M
    |__ Port 1: Dev 2, If 5, Class=, Driver=cdc_acm, 480M
    |__ Port 1: Dev 2, If 6, Class=, Driver=cdc_acm, 480M
    |__ Port 1: Dev 2, If 7, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 0, Class=, Driver=cdc_ether, 480M
    |__ Port 2: Dev 3, If 7, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 5, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 3, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 1, Class=, Driver=cdc_ether, 480M
    |__ Port 2: Dev 3, If 6, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 4, Class=, Driver=cdc_acm, 480M
    |__ Port 2: Dev 3, If 2, Class=, Driver=cdc_acm, 480M
root@OpenWrt:~# lspci
00:00.0 PCI bridge: Device 0e8d:0801 (rev 01)
00:01.0 PCI bridge: Device 0e8d:0801 (rev 01)
00:02.0 PCI bridge: Device 0e8d:0801 (rev 01)
01:00.0 Network controller: MEDIATEK Corp. Device 7603
02:00.0 Network controller: MEDIATEK Corp. MT7662E 802.11ac PCI Express Wireless Network Adapter
03:00.0 USB controller: Renesas Technology Corp. uPD720202 USB 3.0 Host Controller (rev 02)
```

Is there more in the logs for the OOPS after the CPU register dump?

Sorry for the late reply. There are no logs after this. As you can see, the logs don't appear to be complete, because the router reboots immediately!

Does the crash happen with other USB devices connected, or only with the SSD?

You may be pulling more power from the slot than the board is designed to supply. USB 3.0 can be power hungry.

Try an externally powered device, a powered hub, or a low-power drive such as a USB stick.

Only the SSD is connected.
Regarding the power: I connected two power supplies in parallel, 12 V 2 A and 12 V 1 A, giving a combined 12 V 3 A. In addition, the mPCIe module takes external power from a buck converter (output is 5.2 V, 3 A max).

I also tested the output of the USB port using an electronic USB load: when I draw 2 A from the USB port, the voltage at the port drops to 4.6 V, and I don't think an SSD would draw 2 A.
So I don't know if it is a power supply problem.
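
To make the arithmetic behind these figures explicit, here is a rough Python sketch. It only uses the numbers quoted above. Two things are assumptions rather than measurements from this thread: that the port sits at the buck converter's 5.2 V with no load, and that the voltage droop is linear in current. The nameplate sum also assumes both supplies really hold 12 V and share current evenly, which has not been verified here.

```python
# Rough power-budget arithmetic from the figures quoted in the post.

def power_watts(volts: float, amps: float) -> float:
    """Electrical power P = V * I."""
    return volts * amps

def source_resistance(v_noload: float, v_loaded: float, amps: float) -> float:
    """Effective series resistance implied by a voltage droop under load."""
    return (v_noload - v_loaded) / amps

# Nameplate ratings of the two paralleled 12 V supplies.
supply_total_w = power_watts(12.0, 2.0) + power_watts(12.0, 1.0)

# Maximum output of the buck converter feeding the mPCIe module.
buck_max_w = power_watts(5.2, 3.0)

# ASSUMPTION: no-load port voltage equals the buck output (5.2 V).
r_eff = source_resistance(5.2, 4.6, 2.0)

def port_voltage(amps: float, v_noload: float = 5.2) -> float:
    """Estimated port voltage at a given draw, assuming linear droop."""
    return v_noload - r_eff * amps

print(f"Nameplate supply power: {supply_total_w:.1f} W")
print(f"Buck converter maximum: {buck_max_w:.1f} W")
print(f"Effective source resistance: {r_eff:.2f} ohm")
print(f"Estimated port voltage at 0.9 A: {port_voltage(0.9):.2f} V")
```

A load test at the USB port only characterises the 5 V side. It says nothing about whether the 12 V input holds up when the whole board is busy, so the input rail is worth measuring under load as well.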

A USB stick appears to run normally, but I need to investigate more thoroughly.

If you’re sure you have sufficient power, then test using a Linux-native file system. NTFS is very much a second-class citizen and can cause problems.

Ext4 or even btrfs are far superior options.

I was able to get the full crash message:

[  +0.399469] CPU 1 Unable to handle kernel paging request at virtual address 11815968, epc == 8016db84, ra == 8016da88
[  +0.011577] Oops[#1]:
[  +0.002324] CPU: 1 PID: 36 Comm: kcompactd0 Not tainted 5.15.134 #0
[  +0.006293] $ 0   : 00000000 00000001 11815960 00000002
[  +0.005299] $ 4   : fffffffc 00000000 00000008 0000000c
[  +0.005290] $ 8   : 00002853 00000000 00000000 00000000
[  +0.005325] $12   : f0000080 000164cc ffffffff 00000000
[  +0.005334] $16   : 80e019b4 0001c405 00000000 81555e30
[  +0.005309] $20   : 00000000 00000006 00000000 00000000
[  +0.005277] $24   : 00000000 00000000
[  +0.005243] $28   : 81554000 81555d18 0001c800 8016da88
[  +0.005257] Hi    : 00000000
[  +0.002914] Lo    : 00000000
[  +0.002926] epc   : 8016db84 0x8016db84
[  +0.003861] ra    : 8016da88 0x8016da88
[  +0.003826] Status: 1100fc03  KERNEL EXL IE
[  +0.004221] Cause : 40800008 (ExcCode 02)
[  +0.004060] BadVA : 11815968
[  +0.002931] PrId  : 0001992f (MIPS 1004Kc)
[  +0.004128] Modules linked in: xt_connlimit pppoe ppp_async nf_conncount xt_state xt_helper xt_conntrack xt_connmark xt_connbytes xt_CT pppox ppp_generic nft_redir nft_nat nft_masq nft_flow_offload nft_fib_inet nft_ct nft_chain_nat nf_nat nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet nf_flow_table nf_conntrack mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 ipt_REJECT cfg80211 cdc_ether xt_time xt_tcpudp xt_tcpmss xt_statistic xt_recent xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY usbnet ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda slhc nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_quota nft_objref nft_numgen nft_log nft_limit nft_hash nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_counter nf_tables nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_filter ipt_ECN ip_tables
[  +0.000487]  crc_ccitt compat cdc_acm ntfs3 ledtrig_usbport xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ipmac ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 sha512_generic seqiv jitterentropy_rng drbg hmac cmac uas usb_storage sd_mod scsi_mod scsi_common ext4 mbcache jbd2 exfat mmc_block mtk_sd mmc_core leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd gpio_button_hotplug usbcore nls_base usb_common mii crc32c_generic
[  +0.150774] Process kcompactd0 (pid: 36, threadinfo=59809c9d, task=fcd2f5c6, tls=00000000)
[  +0.008284] Stack : 00000000 7030f231 38e38e39 80a00000 80b79fe8 80e01900 80a00000 80e019d0
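
As a reading aid for the dump above, here is a minimal Python sketch that decodes the `Cause` value and compares `BadVA` with register `$2` in both dumps posted in this thread. The field positions follow the standard MIPS32 `Cause` layout (ExcCode in bits 6..2, BD in bit 31). The helper names are made up for illustration. This is only an interpretation of the numbers and does not identify the root cause.

```python
# Values copied from the two register dumps in this thread.
CAUSE = 0x40800008
DUMPS = [
    {"v0": 0x02715BA8, "badva": 0x02715BB0},  # first (partial) dump
    {"v0": 0x11815960, "badva": 0x11815968},  # second (full) dump
]

# Subset of MIPS32 ExcCode values that relate to memory access faults.
EXC_NAMES = {
    1: "Mod (TLB modification)",
    2: "TLBL (TLB exception on load or instruction fetch)",
    3: "TLBS (TLB exception on store)",
    4: "AdEL (address error on load or fetch)",
    5: "AdES (address error on store)",
}

KSEG0_BASE = 0x80000000  # start of the unmapped kernel segment on MIPS32

def exc_code(cause: int) -> int:
    """Extract the ExcCode field (bits 6..2) from the Cause register."""
    return (cause >> 2) & 0x1F

def in_branch_delay(cause: int) -> bool:
    """BD bit (bit 31): set if the fault hit a branch delay slot."""
    return bool((cause >> 31) & 1)

code = exc_code(CAUSE)
print(f"ExcCode {code}: {EXC_NAMES.get(code, 'other')}")
print(f"In branch delay slot: {in_branch_delay(CAUSE)}")

for d in DUMPS:
    offset = d["badva"] - d["v0"]
    below_kseg0 = d["badva"] < KSEG0_BASE
    print(f"BadVA {d['badva']:#010x} = $2 + {offset}, below kseg0: {below_kseg0}")
```

In both dumps the faulting address is `$2 + 8`, and it lies below `0x80000000`, where a kernel thread such as kcompactd0 would not normally have a valid mapping. That pattern is consistent with a load through a corrupted pointer, but it cannot tell a software bug apart from memory corruption caused by hardware.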

Update: it was indeed a power supply problem. The 2 A power supply was of very poor quality, and its voltage actually dropped to 11 V under moderate load.

So I connected a 12 V 9 A power supply, and now the system is very stable.
Thank you, guys.
