Crash on GL-MT3000, wireless remains operational

I think this gives the fan speed:
cat /sys/devices/platform/pwm-fan/supplier:regulator:regulator.2/consumer/hwmon/hwmon2/fan1_input

Set to speed 7, fan speed comes out at 5134
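
A minimal sketch for finding that reading without hard-coding the hwmon2 index, which may differ between boots or firmware builds. It assumes the fan driver also shows up under the standard /sys/class/hwmon directory; read_fan_rpm is a made-up helper name, and its optional argument exists only so it can be pointed at a test directory.

```shell
# Print every readable fan1_input value found under a hwmon class directory.
# Default is the standard hwmon class path; pass another directory as $1
# to try the function against a copy of the tree.
read_fan_rpm() {
    base=${1:-/sys/class/hwmon}
    for f in "$base"/hwmon*/fan1_input; do
        # Skip entries that do not exist (unmatched glob) or are unreadable.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$f" "$(cat "$f")"
    done
}

read_fan_rpm
```

If the driver follows the usual hwmon convention, the value printed is in RPM.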

After a while at speed 7, the temp has dropped to 38968 mC and still dropping very slowly. No load at the moment, just me in an ssh terminal session.
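
The thermal sysfs file reports millidegrees Celsius, so 38968 mC is 38.968 C. A small helper for the conversion, assuming non-negative integer readings; mc_to_c is only an illustrative name.

```shell
# Convert a millidegree Celsius reading to degrees with three decimals.
# Assumes a non-negative integer input without leading zeros.
mc_to_c() {
    mc=$1
    printf '%d.%03d\n' "$((mc / 1000))" "$((mc % 1000))"
}

# On the router this could be:
#   mc_to_c "$(cat /sys/devices/virtual/thermal/thermal_zone0/temp)"
mc_to_c 38968   # prints 38.968
```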

I'll let it run overnight.

It is running mesh11sd and it has a watchdog, so I could get it to log temp to flash memory when the crash happens.... I'll try this after an overnight test at speed 7.
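
A rough sketch of what such logging could look like. It is not wired into mesh11sd or its watchdog (how to hook into that is not covered here); it would have to be called from cron or a loop. log_temp and the /root/temp.log path are hypothetical names. The log is trimmed to the newest few lines so it cannot fill the flash.

```shell
# Append one timestamped temperature sample to a log file, then keep only
# the newest $3 lines (default 100) so the file stays small on flash.
log_temp() {
    temp_file=$1   # e.g. /sys/devices/virtual/thermal/thermal_zone0/temp
    log_file=$2    # hypothetical location, e.g. /root/temp.log
    keep=${3:-100}
    printf '%s %s\n' "$(date +%s)" "$(cat "$temp_file")" >> "$log_file"
    tail -n "$keep" "$log_file" > "$log_file.tmp" && mv "$log_file.tmp" "$log_file"
}

# Example call (one sample):
#   log_temp /sys/devices/virtual/thermal/thermal_zone0/temp /root/temp.log 100
```

Each call rewrites the file, so a long interval between samples is kinder to the flash than a short one.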

What is ambient temp?

I'm at ~28.3C, but the thermometer is on a wall.

In my office with the window open on the Atlantic coast of Scotland at 23:09, it is 16.8C, thermometer on the window ledge by the open window.

@LilRedDog
After overnight, still going ok. Temp 36840 mC

I'll do some intensive load tests now.

@LilRedDog
I had one "crash" immediately after doing an online speed-test on 2.4GHz, with the mesh interface on the same 2.4GHz radio. The speed test completed and the temp hardly changed.

A power cycle was needed as usual (the mesh layer 2 continued to communicate and both the AP ssids remained up, but with no connectivity on ipv4).

This means it is probably not overheating after all :frowning:

I set fan speed = 0 (ie off) and started the test again.
Now it is hovering at about 60C, but still going.
Tests done from my laptop, via:

  1. ethernet
  2. 2.4GHz
  3. 5GHz

Mesh running on 2.4GHz HE40 (ax aka wifi6) at ~600Mb/s so good performance.

The mesh-portal (aka router) is a gl-mt6000, so I am getting a "wifi6" ax link.
With two mt6000s I get ~1.2 Gb/s on the same mesh link channel, so I don't know why the gl-mt3000 is significantly slower (but still very good for 2.4GHz).

There is a fairly new patch proposal that improves DMA for ethernet, but I do not remember exactly where it is. I am afraid temperature has nothing to do with the hangs here.

EDIT: backporting this patch:

That is the conclusion that I came to.

Excellent.
The problem here seems to be ip traffic stops, but layer 2 continues (ie the mesh link remains active with HWMP packets still getting through).
Are we seeing something different? We seem to be seeing the problem even when ethernet is not involved.....

Here are a few pictures: https://www.cnx-software.com/2023/01/08/gl-inet-gl-mt3000-wifi-6-router-review-specs-unboxing-teardown/

@LilRedDog
Fan off and mt3000 inside a plastic bag for 6 hours.
Temp is up to 67696 mC and working just fine....

So overheating is not a problem, at least not yet.

@lukasz92
Do you have a link for the backport of the patch?

@LilRedDog
Back to step_wise ie auto/ultra_quiet for overnight, I don't want my office to go up in flames overnight!

No wait! On setting step_wise the fan turned on as expected, full on by the sound of it. Temp began falling. Fan speed stepped down, temp dropped some more.

Then.....

Fan speed stepped down again and at the same instant (as far as I could see/hear), the crash happened.

I was running:
while true; do cat /sys/devices/virtual/thermal/thermal_zone0/temp; sleep 1; done
in a second terminal window and monitoring the temp, updating once per second. The update stopping coincided with the sound of the fan step down...
Coincidence?
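
To check whether it is a coincidence, one option is to record the temperature and the fan step together on every sample, so the last line written before a hang shows whether a step change had just happened. This is only a sketch: sample_line and the log path are made-up names, and the cooling_device0/cur_state path in the comment is an assumption about where this device exposes the fan step.

```shell
# Print one line with the epoch time, the temperature reading and the
# fan state, each read from a file given as an argument.
sample_line() {
    temp_file=$1    # e.g. /sys/devices/virtual/thermal/thermal_zone0/temp
    state_file=$2   # assumed: /sys/class/thermal/cooling_device0/cur_state
    printf '%s temp=%s fan_state=%s\n' \
        "$(date +%s)" "$(cat "$temp_file")" "$(cat "$state_file")"
}

# Possible use on the router (left commented out because it loops forever):
#   while true; do
#       sample_line /sys/devices/virtual/thermal/thermal_zone0/temp \
#                   /sys/class/thermal/cooling_device0/cur_state >> /root/crash-trace.log
#       sync
#       sleep 1
#   done
```

Writing to flash once per second causes wear, so this is better suited to a short reproduction run than to being left on permanently.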

Sometimes these kinds of issues can be caused by a dying power supply. Could you try with another one? Don’t worry about the 3A requirement on the current one, unless you have external drives, modems or 2.5G eth.

Poor router. :sweat:

I'm going to leave mine alone:

I have found all IPv6 settings and disabled them and it turns out my previous 'lan lock-ups' were a bad cable crimp...
So, maybe it just does not like IPv6?

I dunno but it's worth the hope.

no more plastic bag needed :slight_smile:

openwrt/target/linux/mediatek/patches-5.15/992-handle-dma-buffer-size-soc-specific.patch at bc641e7998f445752a9368146f0e567bcdd1ed00 · lukasz1992/openwrt (github.com)

or even the whole tree lukasz1992/openwrt at v23.05.3-lukasz1992 (github.com) patched by me. I do not have timeout issues (but I do not use 802.11r/s etc.).

Yes indeed, this is always worth considering.
But in this case, not only have I had the problem on a number of the devices, but others are having very similar issues.

It is (often) the other way round with me, with the lockup affecting just ipv4.

As Spock said "Logic clearly dictates that the needs of the many outweigh the needs of the few".

It looks like it might be the problem here too, although I don't ever see the timeout kernel message. The layer 2 comms on the mesh interface would indeed be unaffected if this is the cause of my problem.

So this is only in your OpenWrt fork?
Are you going to submit a PR to OpenWrt?

I suggested checking this because I have 3 of these. 2 on OpenWrt stable, 1 on the stock firmware. No problems whatsoever.