Chasing a memory leak on a TL-WR902ACv3 running OpenWrt git HEAD

G'day all,

I'm running a self-compiled image built from current openwrt.git.
Aside from the default config it has LuCI, OpenVPN, USB support and a few little utilities (like iperf3).
It's being used as a travel router: clients connect over WiFi and the ethernet port is the WAN uplink.

The thing runs for a day or so, then the wireless kicks the clients off and any attempt to re-associate is met with an "invalid password" error on the client (probably because the OOM killer knocks out hostapd; I caught it in the log once while I was ssh'd in from the WAN side).

I've written some scripts to compare slab usage, and nothing sticks out.

I have another machine log in and do a free every 30 minutes, which results in output like this :

brad@srv:~/diag$ cat fred.log | egrep "^Mem:" | less
              total        used        free      shared  buff/cache   available
Mem:          60100       26212       16576          96       17312       14108
Mem:          60100       26928       15888         160       17284       13432
Mem:          60100       27372       15112         160       17616       12824
Mem:          60100       27744       14600         160       17756       12388
Mem:          60100       28132       14340         160       17628       12064
Mem:          60100       28620       13852         160       17628       11576
Mem:          60100       29200       13272         160       17628       10996
Mem:          60100       29592       12880         160       17628       10604
Mem:          60100       30272       12200         160       17628        9924
Mem:          60100       30676       11796         160       17628        9520
Mem:          60100       31456       10944         160       17700        8712
Mem:          60100       31368       11032         160       17700        8800
Mem:          60100       31648       10752         160       17700        8520
Mem:          60100       31832       12596         160       15672        9168
Mem:          60100       32260       12168         160       15672        8740
Mem:          60100       32376       12052         160       15672        8624
Mem:          60100       32568       11860         160       15672        8432
Mem:          60100       33504       10924         160       15672        7496
Mem:          60100       34160       12528         160       13412        7928
Mem:          60100       35316       11372         160       13412        6772
Mem:          60100       35572       14996         160        9532        8140

Usage climbs and free falls until the OOM killer wipes out one of the hostapd tasks (as it's the biggest memory user).
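
For reference, the collection on the other machine is nothing fancy, just a loop along these lines (a rough sketch; the hostname and interval are placeholders and it assumes key-based ssh):

#!/bin/sh
# Poll the router's free output every 30 minutes and append it to the log.
# "root@travelrouter" stands in for the router's WAN-side address.
while true; do
    ssh root@travelrouter free >> fred.log
    sleep 1800
done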

I've tried with :

  • Firewall disabled and a single manual MASQ entry for clients.
  • USB disabled (the USB port is unused, but the drivers are compiled in).
  • 2.4G Radio disabled with all clients on 5G.
  • 5G Radio disabled with all clients on 2.4G.

Disabling the 5G radio results in this :

brad@srv:~/diag$ cat fred.log | egrep "^Mem:"
              total        used        free      shared  buff/cache   available                                             
Mem:          60100       30072       12408          92       17620       12168
Mem:          60100       30076       12404          92       17620       12164
Mem:          60100       30148       12204          92       17748       12032
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30308       12156          92       17636       11928
Mem:          60100       30264       12200          92       17636       11972
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30224       12240          92       17636       12012
Mem:          60100       30216       12248          92       17636       12020
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30216       12248          92       17636       12020
Mem:          60100       30216       12248          92       17636       12020
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30264       12200          92       17636       11972
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30240       12224          92       17636       11996
Mem:          60100       30228       12236          92       17636       12008
Mem:          60100       30232       12104          92       17764       11940
Mem:          60100       30228       12044          92       17828       11864
Mem:          60100       30284       11988          92       17828       11808
Mem:          60100       30264       12072          92       17764       11908
Mem:          60100       30252       12084          92       17764       11920
Mem:          60100       30292       12160          92       17648       11940
Mem:          60100       30344       12108          92       17648       11888
Mem:          60100       30464       11988          92       17648       11768
Mem:          60100       30540       11900          92       17660       11684
Mem:          60100       26940       15436          92       17724       13472
Mem:          60100       26964       15412          92       17724       13448

So outwardly it would look like the driver for the 5G radio is leaking memory. It's identified as "Type: MediaTek MT7610E 802.11nac".

Has anyone seen anything similar?
I've tried kmemleak, but it doesn't identify anything obvious and just brings the system to its knees. Slab tracing does the same.
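
For the record, the kmemleak runs are just the standard debugfs interface on a debug build (it needs CONFIG_DEBUG_KMEMLEAK and debugfs, which my normal config has switched off), roughly:

mount -t debugfs none /sys/kernel/debug   # if not already mounted
echo scan > /sys/kernel/debug/kmemleak    # force an immediate scan
cat /sys/kernel/debug/kmemleak            # dump the suspected leaks
echo clear > /sys/kernel/debug/kmemleak   # drop the current suspects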

root@TL-WR902AC:~# cat /etc/openwrt_*
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r14522-8cfb839907'
DISTRIB_TARGET='ramips/mt76x8'
DISTRIB_ARCH='mipsel_24kc'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r14522-8cfb839907'
DISTRIB_TAINTS='no-all no-ipv6 busybox'
r14522-8cfb839907
root@TL-WR902AC:~# uname -a
Linux TL-WR902AC 5.4.65 #0 Fri Sep 18 19:42:06 2020 mips GNU/Linux

Config as follows :
brad@debian64:~/build/openwrt$ scripts/diffconfig.sh

CONFIG_TARGET_ramips=y
CONFIG_TARGET_ramips_mt76x8=y
CONFIG_TARGET_ramips_mt76x8_DEVICE_tplink_tl-wr902ac-v3=y
CONFIG_BUSYBOX_CUSTOM=y
CONFIG_BUSYBOX_CONFIG_DIFF=y
# CONFIG_BUSYBOX_CONFIG_FEATURE_IPV6 is not set
# CONFIG_BUSYBOX_DEFAULT_FEATURE_IPV6 is not set
# CONFIG_IPV6 is not set
# CONFIG_KERNEL_AIO is not set
# CONFIG_KERNEL_CGROUPS is not set
# CONFIG_KERNEL_CRASHLOG is not set
# CONFIG_KERNEL_DEBUG_FS is not set
# CONFIG_KERNEL_DEBUG_INFO is not set
# CONFIG_KERNEL_DEBUG_KERNEL is not set
# CONFIG_KERNEL_ELF_CORE is not set
# CONFIG_KERNEL_FANOTIFY is not set
# CONFIG_KERNEL_FHANDLE is not set
# CONFIG_KERNEL_IPV6 is not set
# CONFIG_KERNEL_IP_MROUTE is not set
# CONFIG_KERNEL_KALLSYMS is not set
# CONFIG_KERNEL_NAMESPACES is not set
# CONFIG_KERNEL_SECCOMP is not set
CONFIG_KERNEL_SLABINFO=y
CONFIG_KERNEL_SLUB_DEBUG=y
CONFIG_KERNEL_SLUB_DEBUG_ON=y
# CONFIG_KERNEL_SWAP is not set
CONFIG_OPENSSL_ENGINE=y
CONFIG_OPENSSL_PREFER_CHACHA_OVER_GCM=y
CONFIG_OPENSSL_WITH_ASM=y
CONFIG_OPENSSL_WITH_CHACHA_POLY1305=y
CONFIG_OPENSSL_WITH_CMS=y
CONFIG_OPENSSL_WITH_DEPRECATED=y
CONFIG_OPENSSL_WITH_ERROR_MESSAGES=y
CONFIG_OPENSSL_WITH_PSK=y
CONFIG_OPENSSL_WITH_SRP=y
CONFIG_OPENSSL_WITH_TLS13=y
CONFIG_OPENVPN_openssl_ENABLE_DEF_AUTH=y
CONFIG_OPENVPN_openssl_ENABLE_FRAGMENT=y
CONFIG_OPENVPN_openssl_ENABLE_LZ4=y
CONFIG_OPENVPN_openssl_ENABLE_LZO=y
CONFIG_OPENVPN_openssl_ENABLE_MULTIHOME=y
CONFIG_OPENVPN_openssl_ENABLE_PF=y
CONFIG_OPENVPN_openssl_ENABLE_PORT_SHARE=y
CONFIG_OPENVPN_openssl_ENABLE_SERVER=y
CONFIG_OPENVPN_openssl_ENABLE_SMALL=y
# CONFIG_PACKAGE_MAC80211_DEBUGFS is not set
# CONFIG_PACKAGE_MAC80211_MESH is not set
# CONFIG_PACKAGE_ca-bundle is not set
CONFIG_PACKAGE_cgi-io=y
CONFIG_PACKAGE_iperf3=y
CONFIG_PACKAGE_kmod-crypto-crc32c=y
CONFIG_PACKAGE_kmod-crypto-hash=y
CONFIG_PACKAGE_kmod-fs-ext4=y
CONFIG_PACKAGE_kmod-ledtrig-default-on=y
CONFIG_PACKAGE_kmod-ledtrig-heartbeat=y
CONFIG_PACKAGE_kmod-ledtrig-netdev=y
CONFIG_PACKAGE_kmod-ledtrig-timer=y
CONFIG_PACKAGE_kmod-lib-crc16=y
CONFIG_PACKAGE_kmod-mt76-usb=y
CONFIG_PACKAGE_kmod-mt76x02-usb=y
CONFIG_PACKAGE_kmod-mt76x0u=y
# CONFIG_PACKAGE_kmod-nf-ipt6 is not set
CONFIG_PACKAGE_kmod-nls-cp437=y
# CONFIG_PACKAGE_kmod-ppp is not set
CONFIG_PACKAGE_kmod-scsi-core=y
CONFIG_PACKAGE_kmod-tun=y
CONFIG_PACKAGE_kmod-usb-storage=y
CONFIG_PACKAGE_libiwinfo-lua=y
CONFIG_PACKAGE_liblua=y
CONFIG_PACKAGE_liblucihttp=y
CONFIG_PACKAGE_liblucihttp-lua=y
CONFIG_PACKAGE_liblzo=y
CONFIG_PACKAGE_libopenssl=y
CONFIG_PACKAGE_librt=y
CONFIG_PACKAGE_libubus-lua=y
CONFIG_PACKAGE_libusb-1.0=y
# CONFIG_PACKAGE_libustream-wolfssl is not set
CONFIG_PACKAGE_lua=y
CONFIG_PACKAGE_luci-app-firewall=y
CONFIG_PACKAGE_luci-app-opkg=y
CONFIG_PACKAGE_luci-base=y
CONFIG_PACKAGE_luci-compat=y
CONFIG_PACKAGE_luci-lib-base=y
CONFIG_PACKAGE_luci-lib-ip=y
CONFIG_PACKAGE_luci-lib-jsonc=y
CONFIG_PACKAGE_luci-lib-nixio=y
CONFIG_PACKAGE_luci-mod-admin-full=y
CONFIG_PACKAGE_luci-mod-network=y
CONFIG_PACKAGE_luci-mod-status=y
CONFIG_PACKAGE_luci-mod-system=y
CONFIG_PACKAGE_luci-theme-bootstrap=y
CONFIG_PACKAGE_openvpn-openssl=y
# CONFIG_PACKAGE_ppp is not set
CONFIG_PACKAGE_rpcd=y
CONFIG_PACKAGE_rpcd-mod-file=y
CONFIG_PACKAGE_rpcd-mod-iwinfo=y
CONFIG_PACKAGE_rpcd-mod-luci=y
CONFIG_PACKAGE_rpcd-mod-rrdns=y
CONFIG_PACKAGE_uclibcxx=y
CONFIG_PACKAGE_uhttpd=y
CONFIG_PACKAGE_uhttpd-mod-lua=y
CONFIG_PACKAGE_uhttpd-mod-ubus=y
CONFIG_PACKAGE_usbutils=y
# CONFIG_PACKAGE_wpad-basic-wolfssl is not set
CONFIG_PACKAGE_wpad-wolfssl=y
# CONFIG_TARGET_ROOTFS_INITRAMFS is not set
CONFIG_WPA_MSG_MIN_PRIORITY=2
CONFIG_uhttpd_lua=y
CONFIG_PACKAGE_kmod-lib-crc-ccitt=y
CONFIG_PACKAGE_libip6tc=y

Looks like it's pretty conclusive.

Clean boot with radio1 (the 5G radio) disabled. Ran for some ~15 hours on 2.4G only.
Then enable radio1 and associate a station, and we lose 5M of RAM in just over an hour.

Tried a new power supply also.

              total        used        free      shared  buff/cache   available
Mem:          60100       23632       20416          92       16052       17156
Mem:          60100       24188       19328          92       16584       16432
Mem:          60100       24256       19260          92       16584       16364
Mem:          60100       24256       19260          92       16584       16364
Mem:          60100       24284       19152          92       16664       16296
Mem:          60100       24388       19048          92       16664       16192
Mem:          60100       24356       19080          92       16664       16224
Mem:          60100       24360       19076          92       16664       16220
Mem:          60100       24388       19048          92       16664       16192
Mem:          60100       24400       19036          92       16664       16180
Mem:          60100       24364       19072          92       16664       16216
Mem:          60100       24396       19040          92       16664       16184
Mem:          60100       24396       19040          92       16664       16184
Mem:          60100       24388       19048          92       16664       16192
Mem:          60100       24388       19048          92       16664       16192
Mem:          60100       24396       19040          92       16664       16184
Mem:          60100       24440       18996          92       16664       16140
Mem:          60100       24408       19028          92       16664       16172
Mem:          60100       24408       19028          92       16664       16172
Mem:          60100       24440       18996          92       16664       16140
Mem:          60100       24408       19028          92       16664       16172
Mem:          60100       24376       19060          92       16664       16204
Mem:          60100       24408       19028          92       16664       16172
Mem:          60100       24408       19028          92       16664       16172
Mem:          60100       24376       19060          92       16664       16204
Mem:          60100       24440       18996          92       16664       16140
Mem:          60100       24608       18828          92       16664       15972
Mem:          60100       24540       18896          92       16664       16040
----- enable radio1 here ----
Mem:          60100       25796       17596          96       16708       14768
Mem:          60100       27988       15352          96       16760       12552
Mem:          60100       29996       13344          96       16760       10544
Mem:          60100       30452       12888          96       16760       10088
Mem:          60100       30648       12692          96       16760        9892
Mem:          60100       30720       12572          96       16808        9796

I've tried numerous methods of slab debugging, but pretty much everything except cat /proc/slabinfo kills the router. kmemleak works for quite a while but doesn't actually find anything really untoward (just 5 or 6 small leaks over a period where memory use increases by 10M).

I did a stress test to try and exacerbate the problem, but it just doesn't. It loosely appears to be related to having devices connected, but it doesn't seem to get worse under high traffic load.
In the time I ran this test, free memory dropped by 5M. Comparing slab outputs, total slab usage increased by only 250k.

A comparison of the two /proc/slabinfo runs.
Pre / post, each shown as [in-use bytes (objects in use * object size), number of slabs, allocated slab bytes]

tw_sock_TCP [3952, 1, 4096] [11856, 3, 12288] grew
proc_inode_cache [142560, 36, 147456] [15840, 4, 16384] shrank
skbuff_fclone_cache [12096, 3, 12288] [16128, 4, 16384] grew
TCP [31104, 2, 32768] [46656, 3, 49152] grew
nf_conntrack [36288, 9, 36864] [56448, 14, 57344] grew
vm_area_struct [72576, 18, 73728] [76608, 19, 77824] grew
skbuff_head_cache [36864, 9, 36864] [77824, 19, 77824] grew
inode_cache [268000, 67, 274432] [88000, 22, 90112] shrank
radix_tree_node [135520, 35, 143360] [139392, 36, 147456] grew
ovl_inode [322560, 80, 327680] [326592, 81, 331776] grew
kmalloc-2k [276480, 9, 294912] [337920, 11, 360448] grew
kmalloc-512 [384000, 25, 409600] [414720, 27, 442368] grew
kmalloc-1k [522240, 17, 557056] [614400, 20, 655360] grew
dentry [669312, 166, 679936] [633024, 157, 643072] shrank
kmalloc-256 [537600, 70, 573440] [791040, 103, 843776] grew
kmalloc-128 [2119680, 552, 2260992] [2150400, 560, 2293760] grew
remaining {}
Slabs 1976 1960 -16
Mem 14675968 14925824 249856

Slabs : total number of allocated slabs pre/post/difference
Mem : total allocated mem pre/post/difference (in bytes)
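
The comparison itself is just a bit of awk over two saved copies of /proc/slabinfo, something like this (a sketch; it assumes the slabinfo v2.1 layout, where fields 2, 4 and 15 are active objects, object size and number of slabs):

#!/bin/sh
# slabdiff.sh PRE POST - print caches whose in-use bytes changed between snapshots
awk 'FNR <= 2  { next }                                  # skip the two header lines
     NR == FNR { pre[$1] = $2 * $4; preslabs[$1] = $15; next }
     {
         post = $2 * $4
         if (post != pre[$1])
             printf "%-24s [%d, %d] -> [%d, %d]\n", $1, pre[$1], preslabs[$1], post, $15
     }' "$1" "$2"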

Got me stuffed. All I know is that by taking radio1 (the MT7610) down, the memory usage stabilises. It doesn't release memory, it just stops growing. Likewise, turn it back on and the leak grows until the router dies.
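
For anyone wanting to reproduce, flipping the 5G radio on and off from the shell is just the usual uci dance (a sketch, assuming the 5G radio is radio1 as on my unit):

uci set wireless.radio1.disabled='1'   # '0' to re-enable
uci commit wireless
wifi reload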

While I don't have a clue what is causing this memory usage, I would recommend installing the latest stable release to verify whether it is a snapshot-only issue. If it runs stable on 19.07.4 then you could file a bug report if later snapshots don't fix it. There are still devices in the process of being migrated to kernel 5.4, which is causing various issues.

It doesn't occur on 19.07.4. However, on 19.07.4 the ethernet port "randomly" hangs/resets, which makes it pretty much unusable for streaming.

I'm using a self-compiled image of the absolute latest git tree. I'm attempting to diagnose the issue so I can report a bug rather than reporting an "it don't work right".

I was hoping to get some feedback on how to diagnose an issue on a small memory machine. I'll keep looking to see if I can get something more before I chat to the mt76 team.

You could post here:

It is more likely that you'll find others with the same issue there. Besides that, I think this thread fits better in the "Developers" section. Not every developer reads this section.

It would also help if you could get something like a crash log, or the log from right before the crash happens (if the kernel gives any output). As you are doing your own builds to debug this, try to stay as close to the OpenWrt config defaults as possible.
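
One easy way to catch the last messages before it dies is to point the system log at another box so they survive the reboot, e.g. (the IP is a placeholder):

uci set system.@system[0].log_ip='192.168.1.10'
uci set system.@system[0].log_port='514'
uci commit system
/etc/init.d/log restart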

It might not be related, but I also have a memory leak on the last few snapshots built with the Image Builder on a C7 v5.
I have collectd running on my router sending data to a remote InfluxDB server, so I have some graphs confirming the leak.
I currently have to auto-reboot it every 24 hours or it becomes unresponsive. Interestingly, the memory usage doesn't appear to change from the reboot at midnight until about 8 am, when the one or two users become active.
I've disabled the 5G wifi with no difference; the memory leak is still there.
I'm also using a WR842v1 as a repeater and it shows similar symptoms of memory leakage.
[edit] After rereading the OP it probably isn't related.

To be honest I had no idea where to post it. I didn't think it was strictly a development issue and thought I might catch the odd user who went "Oh that sounds familiar".

I'll have a crack. The problem is I only know when it "crashes" because I can't associate with the wireless once the OOM killer knocks off hostapd. I'll try to get in on the ethernet interface after it kills hostapd and before it panics and reboots itself. I did see it in dmesg once, but didn't think to copy/paste it as frankly it's just an OOM situation.
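
The plan is to leave a follow on the log running from the wired side so I actually capture the OOM output this time, something like (the address is a placeholder):

ssh root@travelrouter logread -f | tee -a crash.log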

I also attached my diffconfig to the first post to show it's an essentially unmodified config. Remove IPv6, add openvpn and iperf3.

On to last nights results. A comparison of /proc/meminfo.
Top Line : Wednesday 23 September 20:19:13 AWST 2020
Bottom Line : Thursday 24 September 09:50:34 AWST 2020

This was trimmed from a list of 56 entries. Literally nothing sticks out.

MemTotal:          60100 kB
MemTotal:          60100 kB

MemFree:           20040 kB
MemFree:           13212 kB

MemAvailable:      16780 kB
MemAvailable:       7496 kB

Buffers:            4136 kB
Buffers:            4128 kB


Cached:            11916 kB
Cached:             7416 kB

SwapCached:            0 kB
SwapCached:            0 kB

Active:             9816 kB
Active:             9364 kB

Inactive:           8788 kB
Inactive:           5052 kB

Active(anon):       2624 kB
Active(anon):       2944 kB

Inactive(anon):       24 kB
Inactive(anon):       24 kB

Active(file):       7192 kB
Active(file):       6420 kB

Inactive(file):     8764 kB
Inactive(file):     5028 kB

Unevictable:           0 kB
Unevictable:           0 kB

Mlocked:               0 kB
Mlocked:               0 kB

SwapTotal:             0 kB
SwapTotal:             0 kB

SwapFree:              0 kB
SwapFree:              0 kB

Dirty:                 0 kB
Dirty:                 0 kB

Writeback:             0 kB
Writeback:             0 kB

AnonPages:          2560 kB
AnonPages:          2880 kB

Mapped:             6768 kB
Mapped:             5076 kB

Shmem:                96 kB
Shmem:                96 kB

KReclaimable:       2096 kB
KReclaimable:       1692 kB

Slab:              14816 kB
Slab:              15036 kB

SReclaimable:       2096 kB
SReclaimable:       1692 kB

SUnreclaim:        12720 kB
SUnreclaim:        13344 kB

KernelStack:         384 kB
KernelStack:         360 kB

PageTables:          284 kB
PageTables:          284 kB

NFS_Unstable:          0 kB
NFS_Unstable:          0 kB

Bounce:                0 kB
Bounce:                0 kB

WritebackTmp:          0 kB
WritebackTmp:          0 kB

CommitLimit:       30048 kB
CommitLimit:       30048 kB

Committed_AS:       6972 kB
Committed_AS:       7244 kB

VmallocTotal:    1048372 kB
VmallocTotal:    1048372 kB

VmallocUsed:         548 kB
VmallocUsed:         548 kB

VmallocChunk:          0 kB
VmallocChunk:          0 kB

Percpu:               32 kB
Percpu:               32 kB
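
For reference, the interleaved listing above can be reproduced with a trivial loop over the two snapshots, something like this (the filenames are placeholders):

#!/bin/sh
# pair up two /proc/meminfo snapshots key by key
for key in $(cut -d: -f1 meminfo.pre); do
    grep "^$key:" meminfo.pre
    grep "^$key:" meminfo.post
    echo
done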

Edit : Looks like it might be related to a kernel compile option. I'm doing a binary search of the options that differ from the default snapshot config to see if I can pin it down.


I would guess it is IPv6 related. It is very deeply integrated into the kernel nowadays.
If you aren't doing this to save space on flash, I would recommend just disabling/deleting the WAN6 interface instead. It has the same effect.
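
Deleting it is a one-liner from the shell (assuming the default interface name):

uci delete network.wan6
uci commit network
/etc/init.d/network reload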

It's beginning to look like you are spot on. More testing required....

I was indeed doing it for space saving, and it would still indicate a bug if the driver leaks when an optional kernel feature is switched off. At least if I can conclusively narrow it down, it'll make it easier to instrument the driver and locate the issue.

So that was a bust. I might try an unmodified snapshot next.

A much snipped log of the last 17 hours :

MemAvailable:      13724 kB
MemAvailable:      10044 kB
MemAvailable:       9288 kB
MemAvailable:       7624 kB
MemAvailable:       8852 kB
MemAvailable:       5920 kB
MemAvailable:       1796 kB

Edit:
Tried to install a stock snapshot, but there wasn't enough room to install openvpn-openssl, wpad-openssl and luci.

Built an image with the imagebuilder using the latest snapshot.
make image PROFILE="tplink_tl-wr902ac-v3" PACKAGES="luci openvpn-openssl wpad-openssl -wpad-basic-wolfssl"

Not looking good.

I ran a git-latest build of 19.07 last night just to check, and there's definitely no leak there. While the intermittent lockups with the ethernet are gone (probably due to https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=34a96529041d4e9502c490c66f8af0154187c6d2), I'm still unable to stream without intermittent pauses in the data.

Even a snapshot built using the Image Builder has the same leak, so it's not related to my build or build config; and since the same config on 19.07 doesn't leak, it's not the way I have the system running either.

I guess, since mt76 doesn't leak in 19.07 and does in snapshot, the next thing to do is find the revision where it started leaking and do a bisection.
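
That'll be the usual git bisect dance over the mt76 tree, something like this sketch (the good/bad points are placeholders, and each step means pointing the mt76 package's PKG_SOURCE_VERSION at the commit bisect checks out, rebuilding and reflashing):

git clone https://github.com/openwrt/mt76.git && cd mt76
git bisect start
git bisect bad master                  # current snapshot version leaks
git bisect good <known-good-revision>  # e.g. the version shipped in 19.07.4
# run the rebuilt image overnight, then mark the result:
git bisect good   # or: git bisect bad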

I tried 8 different versions of mt76 going back to late 2018, including the specific version used in 19.07.4 (March 2020), to no avail.

I've now found that it still leaks memory when only using the 2.4G radio, although much, much more slowly.
When using the 5G radio, it'll lock up/reboot in 17-20 hours, chewing through some 20M of RAM. When not using the 5G radio it still dropped 5M in 21 hours.

When using 19.07, available memory doesn't drop at all, but the pauses in traffic make it unusable.

Right now I'm out of ideas, so I'll just leave the 5G radio disabled and reboot it every couple of days until I can make other arrangements.

I'm still working on this intermittently.

I've run it for a while just as a VPN endpoint with the wireless modules unloaded. No leak.

I fired up kmemleak again and found quite a lot of these :

unreferenced object 0x818cb780 (size 176):
  comm "mt76-tx phy1", pid 675, jiffies 339390 (age 932.332s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 10 cb 82 00 00 00 00  ................
    00 00 00 00 00 00 00 00 11 00 01 40 01 00 02 00  ...........@....
  backtrace:
    [<ef9e41f5>] kmem_cache_alloc+0x120/0x290
    [<a4a210d5>] __build_skb+0x34/0xd0
    [<a0a51c47>] __netdev_alloc_skb+0x100/0x1a8
    [<c1f86658>] ieee80211_send_bar+0x48/0x1fc [mac80211]
    [<8655c79e>] mt76_tx_complete_skb+0x270/0x4b4 [mt76]
    [<912654ed>] mt76_tx_complete_skb+0x478/0x4b4 [mt76]

I'm not convinced they aren't a false alarm, but I'm investigating further.
There aren't anywhere near enough of them to account for the amount of memory disappearing, but they do show up consistently in the kmemleak output.

The investigation continues ....


It's looking like it was fixed by this :
08a42ef057b0c1c31d66358f29376b939487c732
mac80211: fix memory leak on filtered powersave frames

After the status rework, ieee80211_tx_status_ext is leaking un-acknowledged
packets for stations in powersave mode.
To fix this, move the code handling those packets from __ieee80211_tx_status
into ieee80211_tx_status_ext

Reported-by: Tobias Waldvogel <tobias.waldvogel@gmail.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>

I built a new image last week and the leak appeared to have gone. So I went back through the changelogs and this was the only thing that jumped out. I reverted that commit on the existing tree and it looks like the leak is back.

I'll run some more tests for another week or so.


That leak in mac80211 was a significant contributor to the memory loss, however there is still a slow leak remaining, on the order of ~500 kB/day.

The search continues.

I'm going to close this out because the leak is still there and I've abandoned the device.

I have a similar issue on devices with the mt76 driver (Netgear R6220 and Xiaomi MIR3G), but on trunk OpenWrt (I test a new build every 2 weeks and wait for a fix).


What is the best way to track down which process is doing it?

Prepare an oil burner with some lemon and eucalyptus oil and three sandalwood incense sticks (not two, not four, but exactly three). Place the oil burner 35cm from the rear of the router in the middle, and place the incense sticks completely upright in a triangle around the router (one in front center, the other two off the rear corners). Light them all up and let them settle for about 3 minutes, then repeatedly chant "memorius leakius revealioso" 6 times slowly over 5 minutes while waving your hands in a left-to-right swooshing motion over the top of the two rear incense sticks.

Once the prep work is done, ssh into the router and run a ps, checking the VSZ column for large values (in this context let's call anything over 4096 large). If there are no large values then it's not a process as such, but more likely a kernel memory leak.
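
In practice that's something like this with the stock BusyBox ps (VSZ is the third column):

ps w | awk 'NR == 1 || $3 + 0 > 4096'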

I noted hostapd was pretty big, but it never grew. The kernel just leaked memory.

If that fails, try taking it out for a candle lit Italian meal, giving it a scented back rub and asking it very nicely.


This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.