NBG6817: OpenWrt rebooting constantly

Thanks. No, I haven't so far.

I've just done a very quick test of the kernel 4.19 forward port (excluding the DSA changes) on my NBG6817, and it seems to be working fine. It might need some further refinements for USB3 on ipq8065, at least manual enabling of kmod-usb-dwc3 && kmod-usb-dwc3-qcom, but I'm not using USB on my router.

When compiling, do you get this error?

Applying /home/hingbong/openwrt-r7800/target/linux/generic/hack-4.19/302-powerpc-Enable-kernel-XZ-compression-option-on-BOOK3.patch using plaintext: 
patching file arch/powerpc/Kconfig
Hunk #1 FAILED at 199.
1 out of 1 hunk FAILED -- saving rejects to file arch/powerpc/Kconfig.rej
patching file arch/powerpc/platforms/Kconfig.cputype
Hunk #1 FAILED at 75.
1 out of 1 hunk FAILED -- saving rejects to file arch/powerpc/platforms/Kconfig.cputype.rej
Patch failed!  Please fix /home/hingbong/openwrt-r7800/target/linux/generic/hack-4.19/302-powerpc-Enable-kernel-XZ-compression-option-on-BOOK3.patch!

I just applied https://git.openwrt.org/?p=openwrt/staging/chunkeey.git;a=commitdiff;h=5789287f4437624989166c60452f5bc9ce06fd82 to current master. As it doesn't touch anything outside of target/linux/ipq806x/ (aside from package/kernel/linux/modules/usb.mk and target/linux/generic/config-4.19), the patch in question can't affect powerpc-specific code. And yes, it did compile (and run) for me.

Do you see the PPPoE service restarting itself?

Interesting.

The netgear-r7800 and tplink-c2600 routers have been placed behind different cable modems that do not emit jumbo frames (the UBEE modem did), and for that reason these routers now work reliably.

It's a pity that the work-around (setting the MTU to a large value and thereby causing stmmac to allocate a larger DMA buffer) came too late for me; the production environment of a client of mine needed a quick solution (i.e. a change of modem).

I still have one R7800 in semi-production. To reproduce the error, a host connected to one of its LAN PHYs should emit jumbo frames. I still need to figure out how to force that on the Apple computers that are used; just setting the MTU to a large value on the Ethernet interface of the Mac did not do the trick. Any suggestions on how to do that?

Same problem here. My setup: Huawei B715 as modem (DMZ mode) and an R7800 with OpenWrt 18.06.1 (OpenWrt 18.06.1 r7258-5eb055306f / LuCI openwrt-18.06 branch (git-18.228.31946-f64b152)).

[46158.104334] ipq806x-gmac-dwmac 37200000.ethernet eth0: len 1541 larger than size (1536)
[46158.137186] ipq806x-gmac-dwmac 37200000.ethernet eth0: len 1541 larger than size (1536)

@por, I did not find a 100% repeatable pattern that would cause the NBG6817 to crash with any given device. On my spouse's 8-year-old PC, a simple internet speed test (speedtest.net) does the job. On my PC, I ran the Samba stress test against a NAS (MyBookLive) and that caused a crash after just 5 minutes. On Ubuntu 18.x not much was needed, just booting the PC. The ipq806x-gmac-dwmac error messages appear all the time, but if you are a little (un)lucky, the freeing of overflow data does not cause a panic right away, due to the way the kernel aligns DMA buffers. Normally devices should not send jumbo packets to a device that has jumbo packets disabled, so that adds another bit of luck to the mix.
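
For anyone who wants to see how often these messages appear and by how many bytes the frames overflow the buffer, here is a rough sketch that summarizes them per interface. It only assumes the message format quoted above ("len N larger than size (M)"); the function name summarize_oversize is made up for this example.

```shell
#!/bin/sh
# summarize_oversize is a hypothetical helper, not part of OpenWrt.
# It reads kernel log lines on stdin and counts the stmmac
# "len N larger than size (M)" messages per interface.
summarize_oversize() {
    # Extract "<iface> <len> <size>" from each matching line.
    sed -n 's/.* \(eth[0-9]*\): len \([0-9]*\) larger than size (\([0-9]*\)).*/\1 \2 \3/p' |
    awk '{
        count[$1]++
        over = $2 - $3                      # bytes beyond the buffer size
        if (over > max[$1]) max[$1] = over
    }
    END {
        for (i in count)
            printf "%s: %d oversized frames, worst overflow %d bytes\n", i, count[i], max[i]
    }'
}

# Typical use on the router:
#   dmesg | summarize_oversize
```

This only counts what the driver logs; it says nothing about which of those frames actually triggers a crash.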


I use Linux 4.19 with the DSA changes (https://git.openwrt.org/?p=openwrt/staging/chunkeey.git;a=commit;h=0ebf2d98c1a6debf035055b7d006a66e8024336b) and have set jumbo packets; it doesn't reboot.

Same problem here https://forum.openwrt.org/t/ethernet-eth0-len-1571-larger-than-size-1536/32098
on an NBG6817 with OpenWrt 18.06.2.

Is this bug only in the 4.14.xx kernel (18.06.xx)?
Is the 4.4.xx kernel in 17.01.xx free of this bug?

The 4.4.x driver is very different. I quickly browsed the latest 4.4.176 and it has neither the oversized frame reception bug nor the incorrect dma_free code. That said, a similar bug related to the pre-allocation of SKB buffers was also present in 4.4.x, but fixed in late 2015 here. Unfortunately, the 4.4 version of the driver also had its fair share of issues. Just take a look here. So unless the 17.01.xx version is based on a very recent version of 4.4.x, you have a good chance of experiencing some kind of issue, but as far as I can tell not the ones discussed above for 18.06.xx.

How can I set an MTU different from the standard 1500?

option mtu '1954'

in /etc/config/network does not work.

/sbin/ifconfig eth0 mtu 1954 up

in /etc/rc.local does not work either and leads to this error:

kern.err kernel: [ 31.621766] ipq806x-gmac-dwmac 37400000.ethernet eth1: must be stopped to change its MTU
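
The message suggests that the driver only accepts an MTU change while the interface is stopped. A minimal sketch of that sequence, assuming the `ip` tool is available on the router: the helper name set_mtu and the DRY_RUN variable are made up for this example, and whether the larger MTU then avoids the oversized-frame errors is a separate question.

```shell
#!/bin/sh
# set_mtu is a hypothetical helper: stop the interface, change its MTU,
# then start it again. With DRY_RUN=1 (an assumption of this sketch)
# the commands are only printed, not executed.
set_mtu() {
    dev=$1
    mtu=$2
    for cmd in "ip link set dev $dev down" \
               "ip link set dev $dev mtu $mtu" \
               "ip link set dev $dev up"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$cmd"
        else
            $cmd || return 1    # stop at the first failing step
        fi
    done
}

# Example (as root on the router):
#   set_mtu eth1 1954
```

Taking the interface down interrupts traffic on it, so this is better run from a serial console or a link that does not depend on that interface.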

config device
	option name 'eth0'
	option mtu '1954'

Not working:

kern.err kernel: [ 420.840774] ipq806x-gmac-dwmac 37200000.ethernet eth0: len 1550 larger than size (1536)

  • What didn't work?
  • Did you do it for all interfaces (bridges and/or VLANs)?

Yes, for all. Still got kernel error.

I'm experiencing the same issue ever since installing Fedora 30 on my laptop, both with wireless and wired ethernet. I get the ipq806x-gmac-dwmac warnings and subsequent crash.

This didn't happen with Fedora 29, so a bug must have been introduced with Fedora 30 that messes up the way MTU-values are set. Does anybody have any tips for how I can try to debug this? I would like to report it upstream.

EDIT: I thought I had found a workaround by setting the MTU manually to 1500 for my network in the KDE system settings, but for some reason jumbo packets are still being sent when my laptop reconnects to my network after waking up from suspend.
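
One way to check what the laptop ends up with after a reconnect or resume is to read the MTU the kernel reports for each interface. A rough sketch, assuming a Linux host that exposes /sys/class/net; the function name list_large_mtu and its two optional arguments are made up for this example.

```shell
#!/bin/sh
# list_large_mtu is a hypothetical helper: print every interface whose
# MTU is above a limit (default 1500). The first argument lets you point
# it at another directory instead of /sys/class/net.
list_large_mtu() {
    base=${1:-/sys/class/net}
    limit=${2:-1500}
    for f in "$base"/*/mtu; do
        [ -r "$f" ] || continue
        mtu=$(cat "$f")
        if [ "$mtu" -gt "$limit" ]; then
            dev=${f%/mtu}               # strip the trailing "/mtu"
            echo "${dev##*/} $mtu"      # keep only the interface name
        fi
    done
}

# Example: run once before suspend and once after resume, then compare.
#   list_large_mtu
```

The loopback interface normally has a very large MTU, so expect it in the output. If the wired and wireless interfaces still report 1500 while the router logs oversized frames, the MTU setting itself is probably not the cause.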

I initially didn't notice this, but the ipq806x-gmac-dwmac messages are followed by these warnings:

daemon.warn dnsmasq[2721]: reducing DNS packet size for nameserver 92.220.228.70 to 1280

I have also figured out that disabling NetworkManager and connecting manually to my network does not result in any warnings and crash. By manually I mean with wpa_supplicant:

$ sudo wpa_supplicant -B -iwlp18s0 -cwpa.conf -Dnl80211
$ sudo dhclient wlp18s0

And just with dhclient after plugging in an Ethernet cable:

$ sudo dhclient enp19s0

I assume that the warnings about reducing DNS packet size are somehow related to the previous warnings. Does anybody know how to make any sense of this?

There are some fixes for oversized packets causing memory corruption:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=v4.19.42&qt=grep&q=stmmac