How can we make the lantiq xrx200 devices faster

I may be even more off base here, but I spent a fair bit of time with the DSL command pipe, and the commands piped to the modem chipset are not filtered: you can push all sorts of commands, even bogus ones, down the pipe. "Setting RA_MODE" means sending a "g997racs" (G997_RateAdaptationConfigSet) down the pipe. If a parameter of "4" is not accepted, it's the modem firmware that rejects it, not the driver that filters it.

If it works for you, great. It also works just fine on my SOHO line (VDSL2, no vectoring, DSL from the FTTB termination in the basement.)

However, my home line is a different story. As much as I wanted to keep everything all-in-one, with an XRX200 my DSL line (both on VDSL+Vectoring and previously on ADSL2+) always started out great with an SNR of around 12dB, then kept gradually degrading over the next 7 to 14 days or so, to the point where the SNR dipped way below 4dB and the line eventually resynchronized.

I then recently switched to a Broadcom based modem (the rather affordable ZyXEL VMG1312) and never looked back. The Broadcom modem manages to recover all dips and keep the line rock stable, my SNR curve now looks like a calm lake with barely any ripples:


The only downside with this particular modem is that the VMG1312 only has 100 mbit ethernet interfaces. In effect this means that I "only" get around 98 mbit out of a line that can, theoretically, go to about 112. But in practice this hardly matters, and I will gladly exchange a few mbits for the stability. And in the process I switched the router to something that speaks 802.11ac, which is a definite improvement over the Lantiq modem routers that (with the exception of the HH5A) only speak 802.11n.

For the longest time I thought that the modem chipset does not matter as long as the line is fine. I was proven wrong, and I now regret that I didn't try Broadcom much, much earlier. In fact, I am very much tempted to switch the SOHO line to a Broadcom modem, too.


Yeah, you've basically described the problem with how things currently are. I wonder if anyone has tried running an xrx200-based device with stock firmware, to see whether the problem is specific to official OpenWrt or happens across the board with this chipset. But I think if this problem were that common, with the Fritz!Box 7490 for instance, there would be more talk about it?

Yeah, Broadcom-based modems don't have the problem. You can actually add Asus-brand MediaTek-based devices (DSL-AC51/52/55/56/68 use MediaTek for DSL) running the latest firmware to that list as well. I have a DSL-AC56U that seamlessly adapts the connection speed to the ISP limit of 103-107 while the attainable rate is over 130. With the DSL-AC56U the SNR starts to climb after the initial connect rather than drop, and the downstream speed actually rises, usually in under 10 minutes.

When researching the Broadcom modems, I actually found talk about stock firmware XRX200-based Fritz!Boxen showing the same behaviour (in German forums, for example here), so I am inclined to say that it is not OpenWrt related. The consensus seems to be that Lantiqs are not doing nearly as well recovering if the line quality is fluctuating (even if it's a generally quite good line like mine.)

That's good to know ... even if I'm completely fine and happy now, it might help other people.

A bit offtopic, but have you ever tried load-balancing with 2 lan ports?

I have not. Aside from not knowing if it's even possible or how to even start with a PPPoE connection (I am using the VMG1312 as a bridge modem), I am pretty certain it is a lot of fiddling and heartache for very little gain. If it really bothered me, I would probably get something Broadcom based with gigabit interfaces (like a Draytek Vigor 130).

But it doesn't bother me, at all. I am paying for a 100/40 line, and I am getting these real-life usable speeds (measured through Wifi no less):

image

(Also with all the recent resynchronisations I seem to have angered the DLM gods residing in the MSAN temple who in a spiteful turn took my "unrestricted" profile and are currently capping my downstream synchronisation at bang-on 100 mbit. So without any customer support acrobatics I would not be able to get anything beyond 100 mbit right now anyway. And I will certainly not rock the boat with the line as I currently have it, I am perfectly happy with it.)

Yeah, about that ... :blush:


Very interesting. I just find it hard to believe Lantiq/Intel aren't able to fix this problem. I know Fritz introduced some kind of 'calibrate' mini firmware, but it still seems strange to me that this problem has persisted for so long with no solution. Considering all the updates, adding long reach and so on, somehow they can't get rate adaptation working properly with Broadcom nodes? I dunno, it just seems strange, but maybe this chipset was never designed to really go above 80/20 or something.

Same here, I am testing the Broadcom ZyXEL as a means to figure out whether I can/should get my ISP to send a technician to check the line. (Since DLM reduces my link to 90/32 out of the contracted 100/40, I am leaning towards opening a "Stoerungsticket" (fault ticket), but since the link has stayed up and functional for weeks now, my "Leidensdruck" (the pressure I feel to act) is pretty slim.)

I would say your speedtests are not reporting what they actually measure as sustained goodput.

Theoretically, the limit for TCP/IPv4 goodput over a 100 Mbps Fast Ethernet adapter with the typical encapsulation in Germany is (the +4 is for the VLAN tag, which with a bridged modem needs to travel over the 1312's Fast Ethernet port as well):

100 * ((1500-8-20-20)/(1500+38+4)) = 94.16 Mbps
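For anyone who wants to replay that arithmetic with other numbers, here is a minimal Python sketch. It assumes the 8 is the PPPoE header, the two 20s are the IPv4 and TCP headers without options, and the 38 is the full Ethernet framing (preamble, header, FCS and inter-frame gap). The function and parameter names are made up for illustration.

```python
def tcp_goodput_mbps(link_mbps, mtu=1500, pppoe=8, ipv4=20, tcp=20,
                     ethernet_framing=38, vlan=4):
    """Upper bound on TCP/IPv4 goodput for full-size packets on one link."""
    payload = mtu - pppoe - ipv4 - tcp        # bytes of user data per packet
    on_wire = mtu + ethernet_framing + vlan   # bytes the link is busy per packet
    return link_mbps * payload / on_wire


print(round(tcp_goodput_mbps(100), 2))  # 94.16
```

Smaller packets make the ratio worse, so this is a best-case figure for bulk transfers only.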

But it seems you are using LibreSpeed, and that test is known to typically try to account for some arbitrary overhead. See https://github.com/librespeed/speedtest/blob/master/doc.md:

overheadCompensationFactor : compensation for HTTP and network overhead. Default value assumes typical MTUs used over the Internet. You might want to change this if you're using this in your internal network with different MTUs, or if you're using IPv6 instead of IPv4.

  • Default: 1.06 probably a decent estimate for all overhead. This was measured empirically by comparing the measured speed and the speed reported by the network adapter.

As far as I can tell, the developer truly believes this compensation factor to be meaningful, and my attempts to explain the (relatively simple) theory behind this fell on deaf ears... (as can be seen from '1514 / 1460': all that does is adjust the speedtest numbers to match what the kernel reports, and only IFF the packet MTU is 1500).

That said, if we undo the damage of this arbitrary compensation we end up with a goodput of:
98.3 / 1.06 = 92.74 Mbps (download)
41.8 / 1.06 = 39.43 Mbps (upload)

which seems actually achievable...
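The same correction as a small Python sketch, in case someone wants to plug in their own results. The 1.06 is the LibreSpeed default quoted above and 98.3 / 41.8 are the reported figures used above; the function name is hypothetical.

```python
OVERHEAD_COMPENSATION = 1.06  # LibreSpeed default, as quoted from its doc.md


def uncompensated_mbps(reported_mbps, factor=OVERHEAD_COMPENSATION):
    """Undo the overhead compensation to get back the measured goodput."""
    return reported_mbps / factor


for label, reported in (("download", 98.3), ("upload", 41.8)):
    print(label, round(uncompensated_mbps(reported), 2))
# download 92.74
# upload 39.43
```

The download figure lands below the roughly 94.16 Mbps Fast Ethernet ceiling computed earlier, which is why it looks plausible.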

Sorry for the ranting...

Mmmh, as far as I can tell the issue really started to flare up in Germany once G.INP retransmissions were enabled for both downloads and uploads (downloads only seemed to have worked more reliably), and the xrx200 was not designed with that in mind. There might have been too little internal memory for bi-directional retransmissions and all the other stuff; at least that is my pet theory... It probably also does not help that most line cards nowadays use Broadcom chipsets, and I bet that Broadcom tests its own CPE chipsets and drivers extensively against its own line cards (and can tweak code on both ends to make things work, while the Lantiq developers can only try to fix things from the CPE side). I have no proof for this though, so this is opinion and not fact...

I was briefly in that situation, too, when I tried some Lantiq firmware that was a bit harsh and produced lots of line errors. I actually got downgraded every night until I ended up with a 70/30 line. After I changed the firmware to something less aggressive, one night I magically got bumped up to "unrestricted" again.

This has yet to happen with my 100/unlimited profile I'm currently getting but, again, with the fast ethernet connection to the modem the 100mbit cap does not actually matter. :slight_smile:

Yeah, you might be right about librespeed.org ... but synthetic speedtests aside, if we're talking real life I get somewhere around 11.5 MB/s in actual downloads, and I remember that I got roughly the same when I used an integrated Lantiq modem. Seriously, that's Good Enough For Me™. I actually appreciate the upload speeds a bit more than getting every last bit of downstream, and those don't scrape the 100 mbit ethernet line barrier anyway.


The funny thing was that with the Lantiq firmware blobs, DLM left the download close to maximum (cycling between 100, 106 and 116, while pushing the upload down to 27 for some time) and attempted relatively often to increase rates again, only to run into more resyncs due to overly optimistic settings. With the Broadcom modem the line is slower, but far more stable...

+1; same situation here, but since I shaped my 100/40 link down to 49/31 (my old router was reaching its limits) I do not care about a theoretical loss of ~16Mbps gross rate; IMHO a slower link with competent AQM will be more usable than a faster link without (all within reason, a 10Gbps link without SQM would work fine for me, since I will not be able to naturally saturate this so bufferbloat would theoretically still exist, but probably never manifest itself)...

11.5 * 8 * 1024^2/(1000^2) = 96.47 Mbps that still seems a tad high, but that said, with a bit of rounding here and there that might still work, or you probably really mean MB and not MiB
11.5 * 8 * 1000^2/(1000^2) = 92 Mbps :wink:
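As a quick sanity check, the two readings of "11.5 MB/s" can be compared with a few lines of Python. This is only a sketch of the unit conversion above; the function name is made up.

```python
def rate_to_mbps(rate_per_second, bytes_per_unit):
    """Convert a download rate (in MB/s or MiB/s) to Mbps (10^6 bits per second)."""
    return rate_per_second * bytes_per_unit * 8 / 1_000_000


print(round(rate_to_mbps(11.5, 1024**2), 2))  # 96.47, if the tool meant MiB/s
print(round(rate_to_mbps(11.5, 1000**2), 2))  # 92.0, if the tool meant MB/s
```

Only the second reading fits under the Fast Ethernet goodput ceiling discussed earlier in the thread.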

+1; the only "challenge" is to realize that (assuming one uses SQM in the first place) once the downlink rate approaches 100 Mbps, it becomes necessary to account for the actual overhead on the Fast Ethernet segment, which is slightly larger than on the DSL part. In my case, with a shaper on pppoe-wan, I will need to switch from overhead 38 (PPPoE, VLAN, parts of the ethernet overhead) to overhead 50 (PPPoE, VLAN tag, all of the ethernet overhead). I might as well switch this over right now...
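To see why the per-packet overhead setting matters near 100 Mbps, here is a rough Python sketch. It assumes the shaper on pppoe-wan sees bare IP packets (at most 1492 bytes) and that the real cost on the Fast Ethernet segment is PPPoE (8) + VLAN tag (4) + full Ethernet framing (38) = 50 bytes per packet. The names are hypothetical, and this is plain arithmetic, not SQM code.

```python
PPPOE = 8
VLAN_TAG = 4
ETHERNET_FRAMING = 38  # preamble+SFD 8, header 14, FCS 4, inter-frame gap 12


def wire_rate_mbps(shaper_mbps, ip_packet_bytes, configured_overhead):
    """Rate actually needed on the Fast Ethernet link when the shaper
    accounts for `configured_overhead` bytes per packet."""
    accounted = ip_packet_bytes + configured_overhead
    actual = ip_packet_bytes + PPPOE + VLAN_TAG + ETHERNET_FRAMING
    return shaper_mbps * actual / accounted


print(round(wire_rate_mbps(100, 1492, 38), 2))  # 100.78, more than the port can carry
print(round(wire_rate_mbps(100, 1492, 50), 2))  # 100.0
print(round(wire_rate_mbps(100, 100, 38), 2))   # 108.7, small packets are worse
```

With the shaper set well below 100 Mbps the mismatch is harmless, which is why it only becomes a concern as the shaped rate approaches the port speed.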

@pc2005 it has been reported above that from v5.4.60 the patches are not being applied correctly. Luckily I was using v5.4.59, and one of my HH5As seems to be working fine with no crashes, but the other one is not.

The one not crashing only provides internet access; the other one, which crashes around 9 out of 10 times, handles an HDD, Samba4 and MiniDLNA. Also, the crash happens when I connect both routers with a LAN cable; otherwise the 'crashing hub' boots fine.

I have the crash log with me here in any case.

have you tried merging:


It seems to fix a problem where the device won't boot if something is sending DHCP requests at the wrong time: it never gets to the blue light and gets stuck on green.


Although that patch seems promising at first glance, considering it allows the device to boot fine, it apparently introduces a bug that makes the device inaccessible. The auto-enable feature is in place because it enables the VLANs; if VLANs are not enabled, you cannot access the device, because it relies heavily on VLANs to use the LAN and WAN ports.


I use VLANs fine even after merging that branch and have no problems accessing the device, but I haven't gotten around to testing with 5.4.60+. There might have been a commit in there that caused a conflict, I dunno.

@pc2005 I just recompiled the firmware from v5.4.60 and the patches work fine and so far no compilation issues.
@wilsonyan Which kernel version are you using? If I disable VLANs altogether from switch config and use eth0 instead of eth0.1 in LAN I can actually access the device just fine and it works normally. So yes something is causing the conflict because I am also not able to manually enable the VLANs through swconfig and it says something along the lines of Wrong parameter or wrong data....

How to apply patches? Because maybe I'm doing something wrong ...
Maybe I have a conflict with other Easybox 904 xDSL patches.

You have to use the correct patches for your kernel (v4 or v5.4). I would recommend using the v5.4 patches, so you need these ones.

You will need to create a new file for each patch using vim, vi or nano (do not use a simple notepad thingy), and put a newline at the end of the file.
If you have more patches that start with 09xx, you can move the numbering of the above patches further down the line, as 0954, 0956 and so on. Keep in mind the file name must end with .patch.

Also you will need to use OpenWrt source and create/copy the patches to openwrt-source/target/linux/lantiq/patches-5.4/ and then compile your firmware with necessary packages.
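If you want to double-check the files before starting a long build, a small Python helper along these lines can catch the two pitfalls mentioned above (wrong file name, missing newline at the end). It is only a sketch: the function names are made up, and the CRLF check rests on the assumption that Windows-style line endings are what a "simple notepad" tends to introduce.

```python
from pathlib import Path


def patch_problems(name, data):
    """Return a list of problems for one patch, given its file name and raw bytes."""
    problems = []
    if not name.endswith(".patch"):
        problems.append("file name does not end with .patch")
    if not data.endswith(b"\n"):
        problems.append("no newline at end of file")
    if b"\r\n" in data:
        # Assumption: CRLF line endings come from a Windows-style editor.
        problems.append("CRLF line endings")
    return problems


def check_directory(directory):
    """Print the problems found for every regular file in a patches directory."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            for problem in patch_problems(path.name, path.read_bytes()):
                print(f"{path.name}: {problem}")
```

Calling check_directory("openwrt-source/target/linux/lantiq/patches-5.4") (adjust the path to your tree) prints nothing when all files pass these checks. It does not tell you whether a patch will actually apply to your kernel version.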

Thank you for explaining.

Applying /media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/target/linux/lantiq/patches-5.4/5904-xrx200-net-smp-frags-support.patch using plaintext: 
patching file drivers/net/ethernet/lantiq_xrx200_legacy.c
Hunk #2 FAILED at 38.
Hunk #5 FAILED at 194.
Hunk #6 succeeded at 233 (offset 25 lines).
Hunk #7 succeeded at 536 (offset 25 lines).
Hunk #8 succeeded at 589 (offset 25 lines).
Hunk #9 succeeded at 604 (offset 25 lines).
Hunk #10 succeeded at 666 (offset 54 lines).
Hunk #11 succeeded at 717 (offset 54 lines).
Hunk #12 succeeded at 842 (offset 78 lines).
Hunk #13 succeeded at 855 (offset 78 lines).
Hunk #14 succeeded at 870 (offset 78 lines).
Hunk #15 FAILED at 953.
Hunk #16 succeeded at 1296 (offset 87 lines).
Hunk #17 succeeded at 1693 (offset 87 lines).
Hunk #18 succeeded at 1758 (offset 87 lines).
Hunk #19 FAILED at 1703.
Hunk #20 succeeded at 1840 (offset 98 lines).
Hunk #21 succeeded at 1928 (offset 106 lines).
Hunk #22 succeeded at 1953 (offset 106 lines).
Hunk #23 succeeded at 1994 (offset 106 lines).
Hunk #24 succeeded at 2023 (offset 106 lines).
Hunk #25 FAILED at 1967.
Hunk #26 succeeded at 2175 (offset 138 lines).
Hunk #27 succeeded at 2184 (offset 138 lines).
patch unexpectedly ends in middle of line
Hunk #28 succeeded at 2295 with fuzz 1 (offset 138 lines).
5 out of 28 hunks FAILED -- saving rejects to file drivers/net/ethernet/lantiq_xrx200_legacy.c.rej
Patch failed!  Please fix /media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/target/linux/lantiq/patches-5.4/5904-xrx200-net-smp-frags-support.patch!
make[2]: *** [Makefile:32: /media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/build_dir/target-mips_24kc_musl/linux-lantiq_xrx200/linux-5.4.61/.prepared_26f29cfe6b12f8531fcf8f6f2d4b2391] Error 1
make[2]: Leaving directory '/media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/target/linux/lantiq'
make[1]: *** [Makefile:13: menuconfig] Error 2
make[1]: Leaving directory '/media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/target/linux'
make: *** [/media/demo/DANE3/kompilacje/source-eb904-new-kernel-v9/openwrt/include/toplevel.mk:175: kernel_menuconfig] Błąd 2

The patches from @pc2005 conflict with the patches for Easybox 904 xDSL. Based on: Quallenauge's sources.


Just a very late follow-up:

It is even doing ever so slightly better than that:
image
(assuming that dslreports is not skewing the results; measured via wifi)


My suspicion is that dslreports, like other speedtests, tries to remove the effect of the TCP start-up phase. So instead of simply returning the sum of all measurement flows' total transmitted volume divided by total time, they probably try to ignore the first X seconds per flow, and that leads to odd quantization effects...
I have seen this quite often, speedtest results slightly above the theoretical limit (librespeed and speedof.me typically are the worst offenders), and I tend to believe this to be caused by some post-processing...
But, to be honest, that level of detail probably is not what casual speedtest users are after, so as far as I am concerned the speedtests are fine for their intended purpose...
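To illustrate the kind of post-processing suspected above, here is a toy Python sketch comparing a plain average with one that skips the ramp-up. The per-second samples are made up purely for illustration, and this is not how dslreports or any other speedtest is known to work. It only shows that trimming the start raises the reported figure, not how a result could end up above the link limit.

```python
def mean_rate_mbps(samples_mbps, skip_seconds=0):
    """Average of equally spaced per-second samples, optionally
    ignoring the first `skip_seconds` samples."""
    kept = samples_mbps[skip_seconds:]
    return sum(kept) / len(kept)


# Made-up per-second throughput of one flow ramping up (Mbps).
samples = [20.0, 60.0, 90.0, 94.0, 94.0, 94.0]

print(round(mean_rate_mbps(samples), 2))                  # 75.33, volume / total time
print(round(mean_rate_mbps(samples, skip_seconds=2), 2))  # 93.0, start-up ignored
```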