[SOLVED] O2 Box 6432: speedtest crashes router/modem

Sadly that did not help. The Android app no longer crashes the router when the MTU is set to 9000, but the Mac app still does.

I ran Wireshark to see if I could spot any difference between the Mac app and running the test in Firefox. The main difference was that the Mac app opened a UDP port and started sending packets with a payload length of 16 bytes.

23335	10.473388	37.120.180.41	192.168.1.123	TCP	1506	8080 → 64088 [ACK] Seq=3290443 Ack=79 Win=14528 Len=1440 TSval=1579959423 TSecr=830839536 [TCP segment of a reassembled PDU]
23336	10.473391	37.120.180.41	192.168.1.123	TCP	1506	8080 → 64086 [ACK] Seq=5716843 Ack=79 Win=14528 Len=1440 TSval=1579959423 TSecr=830839533 [TCP segment of a reassembled PDU]
23337	10.473393	37.120.180.41	192.168.1.123	TCP	1506	8080 → 64086 [ACK] Seq=5718283 Ack=79 Win=14528 Len=1440 TSval=1579959423 TSecr=830839533 [TCP segment of a reassembled PDU]
23338	10.473395	37.120.180.41	192.168.1.123	TCP	1506	8080 → 64086 [ACK] Seq=5719723 Ack=79 Win=14528 Len=1440 TSval=1579959423 TSecr=830839533 [TCP segment of a reassembled PDU]
23339	10.473489	192.168.1.123	37.120.180.41	TCP	66	64088 → 8080 [ACK] Seq=79 Ack=3291883 Win=129600 Len=0 TSval=830839551 TSecr=1579959423
23340	10.473507	192.168.1.123	37.120.180.41	TCP	66	64086 → 8080 [ACK] Seq=79 Ack=5719723 Win=250560 Len=0 TSval=830839551 TSecr=1579959423
23341	10.473535	192.168.1.123	37.120.180.41	TCP	66	64086 → 8080 [ACK] Seq=79 Ack=5721163 Win=253440 Len=0 TSval=830839551 TSecr=1579959423
23342	10.481461	192.168.1.123	37.120.180.41	UDP	58	50509 → 8080 Len=16
23343	10.531786	192.168.1.123	37.120.180.41	UDP	58	50509 → 8080 Len=16
23344	10.568716	192.168.1.123	37.120.180.41	TCP	66	[TCP Window Update] 64087 → 8080 [ACK] Seq=79 Ack=4921963 Win=203328 Len=0 TSval=830839645 TSecr=1579959421
23345	10.585056	192.168.1.123	37.120.180.41	UDP	58	50509 → 8080 Len=16
[...]
23409	13.900650	192.168.1.123	37.120.180.41	UDP	58	50509 → 8080 Len=16

After frame 23409 the router stops working and restarts.
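For anyone who wants to reproduce this traffic pattern without the Mac app, here is a minimal sketch that sends small UDP datagrams at a fixed interval. The capture only shows `Len=16` and a spacing of roughly 50 ms, so the zero-filled payload, the interval, and the function name `send_probe_datagrams` are assumptions for illustration, not what the Speedtest app actually sends.

```python
import socket
import time


def send_probe_datagrams(host, port, count=10, payload_len=16, interval_s=0.05):
    """Send `count` UDP datagrams of `payload_len` bytes to (host, port).

    Returns the number of datagrams handed to the socket.
    """
    # Zero-filled payload: the real app's payload contents are unknown.
    payload = bytes(payload_len)
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            sock.sendto(payload, (host, port))
            sent += 1
            time.sleep(interval_s)
    return sent


# Example call (hypothetical target, replace with a server you control):
# send_probe_datagrams("<test-server-ip>", 8080, count=60)
```

If the box survives this but still reboots with the app, the UDP stream alone is probably not the trigger and the combination with the parallel TCP downloads matters.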

I'm uncertain what else I could do. I tried changing the TCP congestion algorithm from cubic to reno, but without success. I also tried to get some info from dmesg and the system.log, but there is no output at all when the device decides to restart.

Maybe I just got unlucky and my device has a hardware fault?

Could be. You could also try setting up remote logging, so you might catch some kernel output that you normally do not see as the network goes down. A hardware fault is also possible, but then if your MTU 1472 trick helps, I believe software should at least be involved somewhat.

The 1472 did not help, got the crashes again.... I guess I was just lucky.... Fiddling around seems to please the android app somehow. And in some instances the mac app. But nothing I could really pinpoint.

I currently write the log files to /root/. I also installed the full-fledged dmesg and started it like this:

dmesg --level=emerg,alert,crit,err,warn,notice,info,debug -w > kern.log

But nothing. The last lines in the log file are always:

[  140.964711] enter showtime
[  141.890179] pppoe-wan: renamed from ppp0

How could external logging help here? Since the device loses all network connectivity, it would stop writing output to another computer the same way it stops writing to the file. The only way would be the serial port, I guess, but I don't have the equipment to do this.

BTW, I have xRX200 rev 1.2. What is your revision? @Plonk34, what is yours?

Since the boxes are not that expensive, I ordered another one from eBay. The one I'm using right now was pretty beaten up. Don't know what the previous owner did with the box....

Oh, I do not have this box. I use a bt homehub5a as bridged modem (running openwrt) and an old wndr3700v2 as router (running a relatively recent openwrt master build; it is this secondary router that runs the PPPoE client).

[quote="stonerl, post:25, topic:34258"]
But nothing. The last lines in the log file are always:

[  140.964711] enter showtime
[  141.890179] pppoe-wan: renamed from ppp0
[/quote]

Aha, "enter showtime" and the low timestamp (140 seconds after boot) indicate that nothing interesting hit dmesg, so maybe you need to look at logread and how to direct that onto disk or to a server:
https://openwrt.org/docs/guide-user/base-system/log.essentials

Did external logging with ncat, but nothing. No output when the device restarts.
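For reference, the receiving side of such a setup can be sketched in a few lines of Python instead of ncat. This is only a sketch under assumptions: it expects the router to send its log lines as plain UDP datagrams, and the port 5140, the function names, and the log file path are made up for illustration. It flushes after every line so that whatever arrived before the router died is on disk.

```python
import socket


def open_log_socket(bind_addr="0.0.0.0", port=5140):
    """Create a UDP socket that the router can send its log lines to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def receive_logs(sock, logfile_path, max_messages=None):
    """Append each received datagram to logfile_path, one line per datagram.

    Runs forever when max_messages is None; returns the number of lines written.
    """
    received = 0
    with open(logfile_path, "a", encoding="utf-8") as logfile:
        while max_messages is None or received < max_messages:
            data, (src_ip, _src_port) = sock.recvfrom(4096)
            line = data.decode("utf-8", errors="replace").rstrip()
            logfile.write(f"{src_ip} {line}\n")
            logfile.flush()  # keep everything that arrived before the sender died
            received += 1
    return received


# Example (hypothetical file name), stop with Ctrl+C:
# receive_logs(open_log_socket(), "router.log")
```

As noted earlier in the thread, this cannot capture anything the router never manages to send, so an empty tail in the file is consistent with an abrupt reset.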

But I observed something new. I connected a TP-Link TL-WR841N with OpenWrt 18.06.02 via Ethernet to the o2 Box and used it as a Wi-Fi AP. Surprisingly, there are no reboots while I'm connected to the TP-Link AP's Wi-Fi.

As soon as I connect to the o2 Box's Wi-Fi or Ethernet port, the box reboots when I run the speedtest apps.

So you did get normal log output, but nothing suspicious before restarts?

So was the TP-Link TL-WR841N configured as a dumb AP, or was it doing NAT itself?

Nothing suspicious, just the normal output.

That's the config on the TP-Link:

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'

config interface 'lan'
        option type 'bridge'
        option ifname 'eth1.1'
        option proto 'static'
        option netmask '255.255.255.0'
        option stp '1'
        option igmp_snooping '1'
        option ipaddr '192.168.1.2'
        option gateway '192.168.1.254'
        option broadcast '192.168.1.255'
        option ip6assign '64'
        option mtu '9000'

config switch
        option name 'switch0'
        option reset '1'
        option enable_vlan '1'

config switch_vlan
        option device 'switch0'
        option vlan '1'
        option ports '1 2 3 4 0t'

DHCP and DNS are disabled. 192.168.1.254 is my o2 Box. The important part is the MTU of 9000: if I change it back to 1500 on the TP-Link, the o2 Box crashes.

For Christ's sake... This box has completely erratic behaviour. Everything in the following block-quote were my observations. And they were true until the moment I intended to post it. Tested this several times and now... all for nothing. This f***ing box must be broken.

Another find. If I do anything MTU-related on the o2 Box, it crashes. For example, when I set the MTU for lan to 9000 but leave it at 1500 for the wan interface, it crashes regardless of the TP-Link settings.

When I leave the lan MTU setting at 1500 but enable SQM for the wan interface, it crashes, since SQM fiddles with the MTU. The same happens when I only set wan to 1472 or 1492 but leave the lan untouched.

So it seems the MTU on the o2 Box must not be different for the wan and lan interface. But also the TP-Link AP must not have an MTU of 1500. Every other value I tested (9000, 1972) is fine.

The catch is that I need to connect via the TP-Link, which has an MTU other than 1500. Every direct connection to the o2 Box, regardless of LAN or Wi-Fi, crashes the box when I run the app.
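As a side note on where the MTU values in this thread come from, here is a small sketch of the header arithmetic. 1492 is the usual PPPoE MTU on a 1500-byte Ethernet link, and 1472 is the largest ping or UDP payload that fits into a 1500-byte IPv4 packet. Whether that is why 1472 was picked here is an assumption. The helper names are made up, and the results can be checked against the frame lengths in the Wireshark capture above (58 bytes for the UDP packets, 1506 bytes for the TCP segments).

```python
ETHERNET_HEADER = 14       # bytes, without FCS (as Wireshark reports frame length)
IPV4_HEADER = 20           # bytes, no IP options
UDP_HEADER = 8
ICMP_HEADER = 8
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12  # 10-byte timestamp option plus 2 bytes of padding
PPPOE_OVERHEAD = 8         # 6-byte PPPoE header plus 2-byte PPP protocol ID


def pppoe_mtu(ethernet_mtu=1500):
    """MTU left for IP packets inside a PPPoE session."""
    return ethernet_mtu - PPPOE_OVERHEAD


def max_icmp_payload(mtu=1500):
    """Largest ping payload that fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def udp_frame_length(payload_len):
    """Ethernet frame length of an IPv4/UDP packet with the given payload."""
    return ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER + payload_len


def tcp_frame_length(payload_len, timestamps=True):
    """Ethernet frame length of an IPv4/TCP segment with the given payload."""
    options = TCP_TIMESTAMP_OPTION if timestamps else 0
    return ETHERNET_HEADER + IPV4_HEADER + TCP_HEADER + options + payload_len


print(pppoe_mtu())             # 1492
print(max_icmp_payload())      # 1472
print(udp_frame_length(16))    # 58
print(tcp_frame_length(1440))  # 1506
```

None of this explains the reboots; it only shows that the packets in the capture are ordinary sizes for a 1500-byte path.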

Better replace it to be on the safe side. All these adjustments of MTU should not be necessary.

I've experienced that issue on a different device for over 7 years. I have that erratic rebooting on https://openwrt.org/toh/zinwell/zw4400 (I don't know why this device isn't in the Table of Hardware, too.)

OpenWrt versions 14 and 15 were horrible. LEDE Reboot (v17) was a great improvement; it rarely happens now. I tracked this improvement to the addition of NAPI to the OpenWrt kernels.

I hope this is helpful and relevant to your issue.

Well, I'm waiting for the new box to arrive to see if it behaves differently.

If I remember correctly, I had a Fritz!Box a couple of years ago which had a similar issue: spontaneous reboots, especially under heavy load. Several firmware updates couldn't fix this issue. Finally, I got it replaced by the manufacturer and the problem disappeared.

AFAIK, it had a similar chipset. So fingers crossed...

Got the new device today. Same problems. Reboot when using the Ookla Speedtest apps.

Shall I open a bug report? Is there any way to debug this any further?

It's kind of solved. I tested today's 2019-04-08 master snapshot. The issue is solved there. I guess I have to wait for the next release.
