When is this patch hitting Master? Was a pull request already created?
[WIP][RFC][RFT] ramips: initial 5.4 support
openwrt:master
← dengqf6:ramips-5.4
I am wondering if the Archer C7 v2 has similar interrupt handling issues. There are also a lot of ERRs in its interrupt output.
What do you mean by 'without if condition'? Did you mean modifying the patch by removing the added 'if' line in the routine fe_poll_tx?
BTW, I am compiling master branch with this patch you referred to, and will start testing tomorrow.
Just saw those errors on my Archer C7v3. Guess I need to swap it out...
No, I am talking about this patch, which disables flow control: https://github.com/openwrt/mt76/issues/211#issuecomment-569944489. Remove all the conditionals and leave only:
/* (GE1, Force 1000M/FD, FC OFF, MAX_RX_LENGTH 1536) */
mtk_switch_w32(gsw, 0x2305e30b, GSW_REG_MAC_P0_MCR);
mt7530_mdio_w32(gsw, 0x3600, 0x5e30b);
You don't need to apply this patch; just edit the file at ./target/linux/ramips/files-4.14/drivers/net/ethernet/mediatek/gsw_mt7621.c
I can't get zero interrupt ERRs, with hw offloading on at least.
          CPU0      CPU1      CPU2      CPU3
  8:    225649    225616    225626    225614  MIPS GIC Local 1 timer
  9:     10551         0         0         0  MIPS GIC 63 IPI call
 10:         0      3335         0         0  MIPS GIC 64 IPI call
 11:         0         0     10829         0  MIPS GIC 65 IPI call
 12:         0         0         0      3295  MIPS GIC 66 IPI call
 13:   3403753         0         0         0  MIPS GIC 67 IPI resched
 14:         0     96741         0         0  MIPS GIC 68 IPI resched
 15:         0         0     35819         0  MIPS GIC 69 IPI resched
 16:         0         0         0     27239  MIPS GIC 70 IPI resched
 19:        14         0         0         0  MIPS GIC 33 ttyS0
 20:         0         0         0         0  MIPS GIC 29 xhci-hcd:usb1
 21:   1205402         0         0         0  MIPS GIC 10 1e100000.ethernet
 22:         2         0         0         0  MIPS GIC 30 gsw
 23:         2         0     58652         0  MIPS GIC 11 mt76x2e
 24:    206807         0         0         0  MIPS GIC 31 mt76x2e
 26:         0         0         0         0  GPIO 7 keys
 27:         0         0         0         0  GPIO 18 keys
ERR:         7
That is after running for less than 40 minutes.
Surely those interrupt errors are due to the WiFi interface.
Anyway, have you applied both patches? (220-mt7621-disable-flow-control and OpenWrt-Devel-PATCHv2-2-2-ramips-ethernet-fix-to-interrupt-handling)?
Copy the patches to the root of the build tree and apply them with these commands:
patch -p1 < 220-mt7621-disable-flow-control.patch
patch -p1 < OpenWrt-Devel-PATCHv2-2-2-ramips-ethernet-fix-to-interrupt-handling.patch
If you are building master, you can skip the first one.
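A minimal sketch of the same step with a dry run first, for anyone who prefers to check that the patches apply cleanly before touching the tree. It assumes GNU patch is on the build host; the helper names `patch_command` and `apply_patches` are made up for illustration and are not part of OpenWrt.

```python
import subprocess


def patch_command(patch_file, dry_run=False, strip=1):
    """Build the argument list for GNU patch (equivalent to `patch -p1 < file`)."""
    cmd = ["patch", f"-p{strip}"]
    if dry_run:
        # --dry-run reports what would happen without modifying any file.
        cmd.append("--dry-run")
    cmd += ["-i", patch_file]
    return cmd


def apply_patches(patch_files, cwd="."):
    """Dry-run every patch, then apply them for real if no dry run failed.

    Note: each dry run is checked against the unmodified tree, so this
    assumes the patches do not depend on one another.
    """
    for dry_run in (True, False):
        for patch_file in patch_files:
            # check=True raises CalledProcessError if patch exits non-zero.
            subprocess.run(patch_command(patch_file, dry_run), cwd=cwd, check=True)
```

It would be called from the root of the build tree, for example `apply_patches(["OpenWrt-Devel-PATCHv2-2-2-ramips-ethernet-fix-to-interrupt-handling.patch"])` when building master.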
I have uploaded both patches: https://www.mediafire.com/file/kzcmkazpsntny0b/openwrt_patches.zip/file
I am using the master branch, in which the other patch you were referring to is already included.
BTW, what’s your use case? Is it without WiFi?
Ubiquiti Edgerouter X, no WiFi.
OK, then I will have to test it with my DIR-860L B1 for a longer time to see if the timeout issues are gone with this patch.
12 days without errors, and the router was restarted 1 hour ago. I have updated the ER-X bootloader (it had the factory version) to see if that fixes it. In addition, the factory bootloader had a very serious security issue: during boot, the switch ports could communicate with each other until the system finished booting.
Hello, I am wondering if these patches will make it to master?
It is hard to say in which way it will be adopted.
Right now the developers are busy transitioning to the next major release, which is based on the Linux 5.4 kernel. As I understand it, that is a much bigger undertaking than just applying patches.
The best hope is that this particular patch, as critical as it is, will eventually land in the 19.07.x branch.
Is there a PR open for this fix?
comment for mark
So far I've got positive results with the interrupt handling patch. First of all, no mtk_soc_eth timeout has been spotted. The interrupt ERR count is also much lower than before:
          CPU0      CPU1      CPU2      CPU3
  8:  12476688  12476653  12476661  12476652  MIPS GIC Local 1 timer
  9:     59060         0         0         0  MIPS GIC 63 IPI call
 10:         0     18597         0         0  MIPS GIC 64 IPI call
 11:         0         0     39702         0  MIPS GIC 65 IPI call
 12:         0         0         0     11681  MIPS GIC 66 IPI call
 13:    152046         0         0         0  MIPS GIC 67 IPI resched
 14:         0   1925474         0         0  MIPS GIC 68 IPI resched
 15:         0         0    207644         0  MIPS GIC 69 IPI resched
 16:         0         0         0    422116  MIPS GIC 70 IPI resched
 19:        12         0         0         0  MIPS GIC 33 ttyS0
 20:         0         0         0         0  MIPS GIC 29 xhci-hcd:usb1
 21:  96750942         0         0         0  MIPS GIC 10 1e100000.ethernet
 22:         2         0         0         0  MIPS GIC 30 gsw
 23:         2         0   3003252         0  MIPS GIC 11 mt76x2e
 24:  11949047         0         0         0  MIPS GIC 31 mt76x2e
 26:         0         0         0         0  GPIO 7 keys
 27:         0         0         0         0  GPIO 18 keys
ERR:       567
That is after an uptime of 1 day, 10:42.
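Since the ERR counts in this thread were taken at very different uptimes, it may be easier to compare them as a rate. Below is a rough sketch that parses saved /proc/interrupts text and converts the ERR count into errors per hour. The function names are hypothetical, the uptime has to be supplied by hand (for example from the `uptime` command), and the script is meant to run on a PC against saved output, not on the router.

```python
import re


def parse_interrupts(text):
    """Return (per-IRQ totals summed over all CPUs, ERR count or None)."""
    ncpu = 0
    totals = {}
    err = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = re.match(r"^(\w+):\s*(.*)$", line)
        if not match:
            # Header line such as "CPU0 CPU1 CPU2 CPU3": count the CPUs.
            ncpu = len(line.split())
            continue
        label, rest = match.groups()
        fields = rest.split()
        if label == "ERR":
            err = int(fields[0])
            continue
        # The first ncpu fields are the per-CPU counters for this IRQ.
        counts = [int(f) for f in fields[:ncpu] if f.isdigit()]
        totals[label] = sum(counts)
    return totals, err


def err_per_hour(err_count, uptime_seconds):
    """Average number of ERR interrupts per hour of uptime."""
    return err_count / (uptime_seconds / 3600.0)


# A few lines copied from the output above.
SAMPLE = """\
CPU0 CPU1 CPU2 CPU3
21: 96750942 0 0 0 MIPS GIC 10 1e100000.ethernet
23: 2 0 3003252 0 MIPS GIC 11 mt76x2e
ERR: 567
"""

if __name__ == "__main__":
    totals, err = parse_interrupts(SAMPLE)
    uptime = (24 + 10) * 3600 + 42 * 60  # 1 day, 10:42
    print(f"ERR per hour: {err_per_hour(err, uptime):.1f}")
```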
For me the patch is not working now. In just 24 hours it has already given the first error:
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.027642] ------------[ cut here ]------------
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.036889] WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x1ac/0x324
Tue Mar 17 19:22:53 2020 kern.info kernel: [84611.053381] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.067251] Modules linked in: pppoe ppp_async pppox ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 iptable_nat ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY ts_fsm ts_bm slhc nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_rtsp nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtsp nf_conntrack_rtcache
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.211023] nf_conntrack_proto_gre nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast ts_kmp nf_conntrack_amanda nf_conntrack iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 tun nls_utf8 nls_iso8859_15 nls_cp852 nls_cp850 nls_cp437 nls_base leds_gpio gpio_button_hotplug
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.285045] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.167 #0
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.297176] Stack : 00000000 8fd90f40 80580000 8007265c 805a0000 80546510 00000000 00000000
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.313819] 80512100 8fc0fdc4 8fc3cffc 805808e7 8050cef0 00000001 8fc0fd68 53261646
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.330461] 00000000 00000000 806e0000 00004530 00000000 000000ec 00000008 00000000
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.347103] 00000000 80580000 00045975 00000000 00000000 805a0000 00000000 80540718
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.363743] 80370050 00000140 00000003 8fd90f40 00000000 80299210 0000000c 806e000c
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.380389] ...
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.385263] Call Trace:
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.390157] [<8000c7b0>] show_stack+0x58/0x100
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.399028] [<8044f8c4>] dump_stack+0xa4/0xe0
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.407715] [<8002f5f8>] __warn+0xe0/0x138
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.415873] [<8002f680>] warn_slowpath_fmt+0x30/0x3c
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.425767] [<80370050>] dev_watchdog+0x1ac/0x324
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.435159] [<8008932c>] call_timer_fn.isra.25+0x24/0x84
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.445756] [<800895e8>] run_timer_softirq+0x1bc/0x248
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.456014] [<8046d770>] __do_softirq+0x128/0x2ec
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.465399] [<80033f84>] irq_exit+0xac/0xc8
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.473755] [<8024c1c0>] plat_irq_dispatch+0xfc/0x138
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.483820] [<80007588>] except_vec_vi_end+0xb8/0xc4
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.493711] [<80008f50>] r4k_wait_irqoff+0x1c/0x24
Tue Mar 17 19:22:53 2020 kern.warn kernel: [84611.503392] ---[ end trace 81b0755d3220520a ]---
Tue Mar 17 19:22:53 2020 kern.err kernel: [84611.512630] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Tue Mar 17 19:22:53 2020 kern.info kernel: [84611.524982] mtk_soc_eth 1e100000.ethernet eth0: dma_cfg:80000065
Tue Mar 17 19:22:53 2020 kern.info kernel: [84611.537041] mtk_soc_eth 1e100000.ethernet eth0: tx_ring=0, base=0e990000, max=0, ctx=3103, dtx=3103, fdx=3091, next=3103
Tue Mar 17 19:22:53 2020 kern.info kernel: [84611.558726] mtk_soc_eth 1e100000.ethernet eth0: rx_ring=0, base=0e030000, max=0, calc=3134, drx=3135
Tue Mar 17 19:22:53 2020 kern.info kernel: [84611.580051] mtk_soc_eth 1e100000.ethernet: 0x100 = 0x5b60000c, 0x10c = 0x80818
What remains is to try another power adapter. The router draws a maximum of 5 W and the adapter is rated at 6 W (12 V, 0.5 A). Maybe that is not enough. I will try a 1 A adapter.
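For longer test runs, the watchdog message in the log above can be picked out of saved logread output automatically. This is only a sketch: the regular expression is written against the exact "NETDEV WATCHDOG" line shown above, and `find_tx_timeouts` is a hypothetical helper name.

```python
import re

# Matches lines like:
# ... kernel: [84611.053381] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
WATCHDOG = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+NETDEV WATCHDOG: (\S+) \(([^)]+)\): "
    r"transmit queue (\d+) timed out"
)


def find_tx_timeouts(log_text):
    """Return one dict per transmit-timeout event found in the log text."""
    events = []
    for line in log_text.splitlines():
        match = WATCHDOG.search(line)
        if match:
            events.append({
                "uptime_s": float(match.group(1)),  # kernel timestamp in seconds
                "iface": match.group(2),
                "driver": match.group(3),
                "queue": int(match.group(4)),
            })
    return events


if __name__ == "__main__":
    import sys

    # Usage: logread > log.txt on the router, then run this on log.txt.
    with open(sys.argv[1]) as handle:
        for event in find_tx_timeouts(handle.read()):
            hours = event["uptime_s"] / 3600.0
            print(f"{event['iface']} ({event['driver']}) queue {event['queue']} "
                  f"timed out after {hours:.1f} h of uptime")
```

An empty result only means that this particular message is absent from the log; it does not rule out other causes of a stall.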
You could try disabling the WiFi and see if it still errors.
Here comes the bad news: the router and the whole LAN seemed to be disconnected for a while, just like the symptom of the mtk_soc_eth timeout. When I checked with logread, however, I didn't find any suspicious entries.
It is quite disappointing, since I was sure we were so close to getting a stable OpenWrt firmware for the DIR-860L B1.
There is a lot of discussion going on in this pull request:
openwrt:master
← dengqf6:ramips-5.4
I think we will have to wait and see if this fixes things once and for all.