Optimized build for the D-Link DIR-860L

Good to hear you are still trying :slight_smile: I have tried to apply the patch more times than I care to admit. But I must say I have never worked with patch files before, so this is all new for me. Please keep us up to date if you are successful and if so, how you managed to do it. Would love to be able to apply these patches to my own builds as well.

Edit: The patch seems just fine. It didn't apply cleanly for me because Gmail messed up the formatting, I think. I created a new patch with quilt (this was a good excuse to learn quilt ;)) and manually applied all the changes that were in the original patch. I am now running a new compile with that patch added. Let's hope it compiles successfully and results in a usable MTU of 2048 bytes. Will keep you posted.
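
For anyone else hitting this: a patch mangled by a mail client usually shows up as hunks whose lines no longer match their `@@` headers, because long lines were wrapped or leading spaces were dropped. Below is a minimal Python sketch for spotting that before starting a build. It is only an illustration: `check_hunks` is a made-up helper, not part of quilt or the LEDE build system, and it may be stricter than `patch` itself in some cases.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,7 +10,8 @@ some_function()"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def check_hunks(patch_text):
    """Return a list of human-readable problems found in a unified diff."""
    problems = []
    lines = patch_text.splitlines()
    i = 0
    while i < len(lines):
        header = HUNK_HEADER.match(lines[i])
        if header is None:
            i += 1
            continue
        header_no = i + 1  # 1-based line number of the "@@" line
        # A count that is missing from the header means 1.
        old_left = int(header.group(1) or 1)
        new_left = int(header.group(2) or 1)
        i += 1
        damaged = False
        while i < len(lines) and (old_left > 0 or new_left > 0):
            line = lines[i]
            if line.startswith(" "):      # context line, counts on both sides
                old_left -= 1
                new_left -= 1
            elif line.startswith("-"):    # line removed from the old file
                old_left -= 1
            elif line.startswith("+"):    # line added to the new file
                new_left -= 1
            elif line.startswith("\\"):   # "\ No newline at end of file"
                pass
            else:
                problems.append(
                    f"line {i + 1}: unexpected text inside hunk "
                    f"starting at line {header_no}"
                )
                damaged = True
                break
            i += 1
        if not damaged and (old_left != 0 or new_left != 0):
            problems.append(
                f"hunk at line {header_no}: line counts do not match the header"
            )
    return problems
```

An empty result does not guarantee the patch applies to your tree; it only means the hunks are internally consistent.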

I got it to work! See the ping test to my DIR-860L B1 in the screenshot below:

The patch was indeed fine; the formatting was probably messed up by Gmail. Because I couldn't fix the .patch file by hand, I created a new patch with quilt and made all the changes to the source code manually. This gave me a proper .patch file with the correct formatting. To apply the patch, simply copy the .patch file to:

source/target/linux/ramips/patches-4.4/

And compile the new image. The patch will be applied automatically. You will find the patch file below:

https://mega.nz/#!OZRwXbYS!9wRcjTZ5hkMuU7464s5wCDeh95Ze559C-QouV0MFOMc

Could you please let me know if this is also successful for you?

Hmm, I applied the same patch as you did. So it was either my naming (I named it 9999-...) or a formatting issue. Anyway, I put your patch in /source/target/linux/ramips/patches-4.4/ and the build compiled just fine. Also, it looks like it is working :smiley:

C:\Users\Bart>ping -l 5000 192.168.0.1
Pinging 192.168.0.1 with 5000 bytes of data:
Reply from 192.168.0.1: bytes=5000 time=18ms TTL=64
Reply from 192.168.0.1: bytes=5000 time=11ms TTL=64
Reply from 192.168.0.1: bytes=5000 time=16ms TTL=64
Reply from 192.168.0.1: bytes=5000 time=16ms TTL=64

You can even use a higher MTU, but then you will get some "Request timed out." replies, and the time to reply increases. I should note that I am testing over 5 GHz WLAN and that all offloads are off!

The patch only allows for an MTU of up to 2048 bytes. AFAIK, this is a hardware limitation. Your packets are probably getting fragmented right now. Set the do-not-fragment flag (-f) and try again :slight_smile:
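
To put numbers on this: `ping -l` on Windows sets the ICMP payload size, and the packet on the wire is 28 bytes larger (a 20-byte IPv4 header without options plus an 8-byte ICMP header). Here is a rough Python sketch of that arithmetic. It assumes the MTU values mean the IP-layer MTU, and the function names are just for illustration.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request/reply header


def max_ping_payload(mtu):
    """Largest `ping -l` size that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ip_packet_size(ping_payload):
    """Size of the IP packet produced by a given `ping -l` size."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def fragment_count(ping_payload, mtu):
    """Number of IP fragments needed when fragmentation is allowed."""
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    icmp_bytes = ping_payload + ICMP_HEADER
    return -(-icmp_bytes // per_fragment)  # ceiling division


print(max_ping_payload(1500))      # 1472
print(max_ping_payload(2048))      # 2020
print(ip_packet_size(1974))        # 2002
print(fragment_count(5000, 1500))  # 4
```

So a 5000-byte ping over a 1500-byte MTU goes out as 4 fragments, which says nothing about jumbo frames, while anything up to 2020 bytes should fit in a single packet at an MTU of 2048.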

If I use the -f flag then I get the following output:

C:\Users\Bart>ping -l 1974 -f 192.168.0.1
Pinging 192.168.0.1 with 1974 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.

So something is wrong :confused:

Edit: double-checked the patches-4.4 folder and the patch is there. Just did a make dirclean and building again just to be sure.

Does your NIC support jumbo frames? Can you post the output of the command "ifconfig" on your router?

Alright, I feel stupid. My wireless card doesn't support jumbo frames but my network card does :sweat_smile:

C:\Users\Bart>ping -l 1974 -f 192.168.0.1
Pinging 192.168.0.1 with 1974 bytes of data:
Reply from 192.168.0.1: bytes=1974 time=1ms TTL=64
Reply from 192.168.0.1: bytes=1974 time<1ms TTL=64
Reply from 192.168.0.1: bytes=1974 time<1ms TTL=64
Reply from 192.168.0.1: bytes=1974 time<1ms TTL=64

So the patch does its magic!
ifconfig shows an MTU of 1500 for all interfaces (except lo, of course).

I tried reading up on these baby jumbo frames but could not find the answer.

My ISP (fiber GPON) provided me with a combined wireless router/modem. I put it in bridge mode, but my IPTV and VoIP phone still run through this box.

On my LEDE router I set up the PPPoE config and it works (MTU 1492). If I apply this patch, would the baby jumbo frames cross the bridge, or do I need extra configuration in the modem?
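
For reference, the 1492 comes from the 8 bytes of PPPoE overhead (a 6-byte PPPoE header plus a 2-byte PPP protocol ID) inside a standard 1500-byte Ethernet payload. A small Python sketch of the arithmetic follows; the helper names are made up for illustration.

```python
PPPOE_OVERHEAD = 8  # bytes: 6-byte PPPoE header + 2-byte PPP protocol ID


def pppoe_mtu(ethernet_mtu):
    """MTU available on the PPPoE link for a given Ethernet MTU."""
    return ethernet_mtu - PPPOE_OVERHEAD


def ethernet_mtu_needed(ppp_mtu):
    """Ethernet MTU every hop must carry for a given PPPoE MTU."""
    return ppp_mtu + PPPOE_OVERHEAD


print(pppoe_mtu(1500))            # 1492
print(ethernet_mtu_needed(1500))  # 1508
```

So getting a full 1500-byte MTU on the PPPoE link would need every Ethernet hop between the router and the ISP's PPPoE server, including the bridged modem, to pass 1508-byte payloads. As far as I know this is the case RFC 4638 describes, and it also needs the ISP side to negotiate it.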

I'm not using VLANs at the moment, even though the original setup uses them for Internet, IPTV and VoIP. Should I change that?

After that there is still a big IF: I'm not sure my ISP supports it.

They probably don't: it only takes one device in the chain between you and the internet that doesn't support jumbo frames. All devices talking to each other have to agree on the jumbo frame size. Some further reading if you want (better explanation + benchmarks).

TL;DR: Only useful when transferring large files or streaming lots of data (e.g. video) over your own (LAN) network.

Still building LEDE (17.01 + master branches), but there is a regression in the open source mt76 driver which leads to stack traces and flaky performance. So no new builds until I am sure it is fixed!

Is this with the master branch only, or does this also affect the new 17.01.1 release?

Unfortunately both. First, I thought the patches I applied were the culprit but even a barebones build of either 17.01 or master branch has it.
Multiple people posted similar stack traces to the LEDE bug tracker + the mt76 issues area on GitHub.

Very weird. The reason I am asking is that on my phone the wifi sometimes hangs and pages stop coming through. But this phone has always had issues with a wide variety of APs, so the phone itself is probably to blame. Another reason for the random wifi issues might be my recent introduction of 802.11r to the network. It isn't really bothering me all that much, since all my other devices (both cabled and on wifi) are working just fine.

In all this time I have seen a crash once, and I'm currently running LEDE 17.01.1 and 17.01.0 before that. So for me it is relatively stable, but completely stable should be the goal of course.

I think the issues are caused by a bug in the code for the switch or SoC, and running SQM merely makes the issue crop up a lot quicker. None of the fixes that have been proposed (running fq_codel instead of cake, disabling offloading with ethtool, etc.) have ever worked for me. While these things make it more stable, it still crashes under heavy load. Without SQM at all, I have only had a crash once, as mentioned before.

I think it would be a good idea to focus on the bug that is causing SQM with cake to crash as fast as it does. Hopefully that will uncover the root cause of the issues and fix them all in one go. The biggest issue with these errors is that it seems impossible for me to get a stack trace for debugging purposes, since the router reboots during the crash, wiping all the logs.

Do you have any ideas on how I could help debugging this issue by providing relevant and useful information for the developers?

I am running into another issue. When I run the following command:

[quote]root@LEDE:~# iw list | grep -i dbm
* 2412 MHz [1] (20.0 dBm)
* 2417 MHz [2] (20.0 dBm)
* 2422 MHz [3] (20.0 dBm)
* 2427 MHz [4] (20.0 dBm)
* 2432 MHz [5] (20.0 dBm)
* 2437 MHz [6] (20.0 dBm)
* 2442 MHz [7] (20.0 dBm)
* 2447 MHz [8] (20.0 dBm)
* 2452 MHz [9] (20.0 dBm)
* 2457 MHz [10] (20.0 dBm)
* 2462 MHz [11] (20.0 dBm)
* 2467 MHz [12] (20.0 dBm)
* 2472 MHz [13] (20.0 dBm)
* 5180 MHz [36] (20.0 dBm)
* 5200 MHz [40] (20.0 dBm)
* 5220 MHz [44] (20.0 dBm)
* 5240 MHz [48] (20.0 dBm)
* 5260 MHz [52] (20.0 dBm) (radar detection)
* 5280 MHz [56] (20.0 dBm) (radar detection)
* 5300 MHz [60] (20.0 dBm) (radar detection)
* 5320 MHz [64] (20.0 dBm) (radar detection)
* 5500 MHz [100] (27.0 dBm) (radar detection)
* 5520 MHz [104] (27.0 dBm) (radar detection)
* 5540 MHz [108] (27.0 dBm) (radar detection)
* 5560 MHz [112] (27.0 dBm) (radar detection)
* 5580 MHz [116] (27.0 dBm) (radar detection)
* 5600 MHz [120] (27.0 dBm) (radar detection)
* 5620 MHz [124] (27.0 dBm) (radar detection)
* 5640 MHz [128] (27.0 dBm) (radar detection)
* 5660 MHz [132] (27.0 dBm) (radar detection)
* 5680 MHz [136] (27.0 dBm) (radar detection)
* 5700 MHz [140] (27.0 dBm) (radar detection)
* 5745 MHz [149] (14.0 dBm)
* 5765 MHz [153] (14.0 dBm)
* 5785 MHz [157] (14.0 dBm)
* 5805 MHz [161] (14.0 dBm)
* 5825 MHz [165] (14.0 dBm)
[/quote]
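
In case it helps to compare outputs between builds or regulatory domains, here is a minimal Python sketch that turns lines like the ones above into structured data. It only assumes the line format shown in this post; `parse_channels` is an illustrative helper, not part of iw.

```python
import re

# Matches lines such as "* 5260 MHz [52] (20.0 dBm) (radar detection)"
CHANNEL_LINE = re.compile(
    r"\*\s*(\d+) MHz \[(\d+)\] \((\d+(?:\.\d+)?) dBm\)(.*)"
)


def parse_channels(text):
    """Extract frequency, channel, power limit and DFS flag per line."""
    channels = []
    for line in text.splitlines():
        match = CHANNEL_LINE.search(line)
        if match is None:
            continue  # skip anything that is not a channel line
        channels.append({
            "freq_mhz": int(match.group(1)),
            "channel": int(match.group(2)),
            "max_dbm": float(match.group(3)),
            "dfs": "radar detection" in match.group(4),
        })
    return channels


sample = """\
* 5180 MHz [36] (20.0 dBm)
* 5500 MHz [100] (27.0 dBm) (radar detection)
* 5745 MHz [149] (14.0 dBm)
"""
for entry in parse_channels(sample):
    print(entry["channel"], entry["max_dbm"], entry["dfs"])
```

Feeding it the full output from `iw list | grep -i dbm` should give one entry per channel, which makes it easy to see which channels carry which limit.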

I can see the txpower limits for my regulatory domain (Netherlands). When running my 5 GHz wifi on channel 36, I am able to select a txpower of up to 20 dBm as expected, but setting 19 or 20 results in a txpower of 18. This is probably correct, since the antennas easily have a gain of 2 dBi.

The issues start when I switch to a DFS channel (100, but I've also tried others). I can now select a txpower of up to 27 in LuCI as expected. However, setting anything higher than 16 results in a txpower of 16. So this is lower than on the non-DFS channels, while it should have been higher according to the output of iw list! Is there anything I can do to utilize higher tx powers?
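
Some back-of-the-envelope numbers for this, as a Python sketch. It assumes the limits printed by iw are EIRP limits and that the driver subtracts the antenna gain from them, which is an assumption and not something confirmed in this thread. The 2 dB gain figure is the estimate from above, and the function names are illustrative.

```python
def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def max_tx_power_dbm(eirp_limit_dbm, antenna_gain_db):
    """Highest transmitter setting that keeps EIRP within the limit."""
    return eirp_limit_dbm - antenna_gain_db


# Non-DFS channel 36: 20 dBm limit, assumed 2 dB antenna gain
print(max_tx_power_dbm(20, 2))   # 18
# DFS channel 100: 27 dBm limit, same assumed gain
print(max_tx_power_dbm(27, 2))   # 25

for level in (16, 18, 20, 27):
    print(level, "dBm =", round(dbm_to_mw(level), 1), "mW")
```

Under that assumption a 27 dBm limit would still leave about 25 dBm, so antenna gain alone would not explain the cap at 16.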

Will look into the txpower issue tomorrow. I think it has something to do with how the regulatory domain is set.

To all, I was building 17.01 builds but those suffered from a bug in memory handling. For stack traces see this and for reading about memory handling see this.
However, I read on the forum that a user was running master branch builds without those errors, so I compiled a new build from the master branch. So far so good: uptime of 11 hours without errors. If it reaches 24 hours of uptime without something cropping up, I will release it.


I am now running the latest master branch. For now, it seems to be stable. Unfortunately, SQM is still crashing for me at higher speeds, even with offloading disabled with ethtool. I've created a ticket about this longstanding issue on Flyspray at the following link: https://bugs.lede-project.org/index.php?do=details&task_id=764

It contains some interesting details that might help debugging this issue. Please upvote it so that the developers can have a closer look at this issue.

Thanks for the log. From what I can gather from it, I think it has to do with SMP scheduling. I am compiling a build with SMP disabled to test my hypothesis.

Also, I can replicate your txpower issue but haven't figured out why we cannot raise txpower yet. Tried changing regulatory domains but no dice.

Very interesting hypothesis. I was thinking in a similar direction. I believe the MT7621 devices are among the few devices that use an SMT architecture. Maybe there is a bug in the SMP scheduling with this specific SMT implementation? Could you send me the build once it's finished, please? I'd love to try it out. :slight_smile:

Edit: Started my own compile as well. Maybe it is finished earlier ^^ Will send it to you once it's done compiling.

Edit: Compile finished, going to flash now. Will send it if it doesn't brick it :wink:

Meh, mine was a dud. Let me know if yours works! Otherwise, I'll muck around some more.

Same for me. Had to restore from recovery mode to get a working router back. I'll look around some more as well. Let me know if you manage to figure it out.

Edit: I also had the option to disable SMT and keep SMP enabled. I think I will try that next. Should give you a 2 core / 2 thread device if it works.

Yes, that was my next idea. I disabled both of them + I added a -mno-mt compiler flag.