TD-W8970 v1 crash with no visible errors in logs

I might as well buy a new TD-W8970 v1... But man, this router taught me so much :cry::cry::cry:

I think something network-related is broken, especially around Wi-Fi; it is probably a hardware fault specific to my router. BUUUUT I have to say that doing these things got me fewer crashes:

  • Wi-Fi disabled if I'm home alone (this helped soooo much)
  • trying these experimental drivers helped a lot
  • separating ethernet and Wi-Fi on two networks (192.168.1.X and 192.168.2.X), disabling the bridge
  • disabling VLAN
  • connecting only my PC (ethernet) and nothing else (this helped the most combined with no Wi-Fi, but of course the device is much less useful in this configuration)

When I do everything described above I can almost reach 6h of uptime, which is a record, but still not enough.
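
For reference, the Ethernet/Wi-Fi split from the list above could look roughly like the following UCI batch script. This is only a sketch under assumptions, not the configuration actually used on this router: the interface name `wifinet`, the choice of the first `wifi-iface` section and the /24 netmask are all illustrative, and the DHCP and firewall entries that a second network also needs are left out.

```shell
#!/bin/sh
# Print a UCI batch script that takes Wi-Fi out of the LAN bridge and
# gives it its own subnet. Nothing is applied by this function itself.
# Assumptions: 'wifinet' is a made-up interface name, the first
# wifi-iface section is the access point to move, and network.lan
# currently has "option type 'bridge'" set.
gen_split_config() {
    lan_ip="$1"    # e.g. 192.168.1.1
    wifi_ip="$2"   # e.g. 192.168.2.1
    cat <<EOF
delete network.lan.type
set network.lan.ipaddr='$lan_ip'
set network.wifinet=interface
set network.wifinet.proto='static'
set network.wifinet.ipaddr='$wifi_ip'
set network.wifinet.netmask='255.255.255.0'
set wireless.@wifi-iface[0].network='wifinet'
commit network
commit wireless
EOF
}
```

On the router the output would be reviewed first and then applied with something like `gen_split_config 192.168.1.1 192.168.2.1 | uci batch`, followed by a network restart.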

I'd consider a separate modem and router. That way you can get the best of both for your budget, and when one fails or becomes underpowered/obsolete, you can swap it out and still have half your investment running. At least in the US, the modem-router-WiFi units are built as cheaply as possible for sale to the ISPs, where there really isn't any consumer choice. If your TD-W8970v1 is working as a modem, then maybe unloading everything but the modem function will help.

I have a TD-W8980 myself and I use some experimental patches, which have been tested thoroughly; they improved the Wi-Fi a lot, as well as overall CPU performance and Ethernet speed. I use it with an external HDD, for torrents and Samba mainly, because Internet is routed from the HH5A. Btw, you can also try these patches; hopefully they improve the performance of your router in general.

At first it was thought that your flash may be corrupted, so for that I would choose the overlay option, with a slight change to the overlay part. At boot the router needs to mount the USB drive, and then, unless it is restarted, everything runs from USB. I just modified the copy command to include both rom and overlay.

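# Assumes the USB partition is already mounted at /mnt
mkdir -p /mnt/upper
# Copy the read-only firmware image first, then the writable overlay
tar -C /rom -cvf - . | tar -C /mnt/upper -xf -
tar -C /overlay -cvf - . | tar -C /mnt -xf -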

You just need to run the commands above in order; then everything should be on the USB drive and your router should work fine.
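
The same tar pipe can be wrapped in a small helper so that a missing source directory is caught before anything is copied. This is a minimal sketch, assuming the USB partition is already mounted; the function name `copy_tree` is made up for illustration.

```shell
#!/bin/sh
# copy_tree SRC DST: copy the contents of SRC into DST with a tar pipe,
# the same technique as the commands above.
copy_tree() {
    src="$1"
    dst="$2"
    # Refuse to run if the source directory does not exist.
    [ -d "$src" ] || { echo "copy_tree: $src is not a directory" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # -C changes directory first, so archive paths are relative to SRC.
    tar -C "$src" -cf - . | tar -C "$dst" -xf -
}

# On the router this would be called as (paths taken from the commands above):
#   copy_tree /rom /mnt/upper
#   copy_tree /overlay /mnt
```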

Have a look here

I just realized that I haven't tried LEDE in a while and never started doing the whole git bisect thing... I rebuilt an image with the same "verbose-as-fuck" I used for master builds and kept the weird network and USB-overlay configuration. The router is still running and I haven't had a crash in 15 hours... And I am beginning to see things in the crash logs!

I see this (every line started with <date> kern.warn kernel: [<timestamp>]):

 ------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:1358 __queue_work+0xd4/0x348()
Modules linked in: ltq_atm_vr9 ath9k ath9k_common iptable_nat ath9k_hw ath pppoe nf_nat_ipv4 nf_conntrack_ipv4 mac80211 iptable_mangle iptable_filter ipt_REJECT ipt_MASQUERADE ip_tables cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG x_tables rtc_ds1307 pppox ppp_async nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_nat nf_log_ipv4 nf_log_common nf_defrag_i
CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.167 #0
Stack : 806868a2 00000000 00000000 804a86e0 8051b0dc 8051ad5b 8046cc58 00000000
        80683ae0 83ae855c 00000200 80510000 00200000 8005da44 80528edc 80528ee0
        00000003 80477790 80475380 83819e24 00000200 8005b914 00200000 8005da44
        00000000 8050c080 00000000 00000000 00000000 00000000 00000000 00000000
        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        ...
Call Trace:
[<80017928>] show_stack+0x54/0x88
[<8002f788>] warn_slowpath_common+0xa4/0xd4
[<8002f83c>] warn_slowpath_null+0x18/0x24
[<80043aac>] __queue_work+0xd4/0x348
[<80068bc8>] call_timer_fn+0x68/0x130
[<800695cc>] run_timer_softirq+0x1dc/0x284
[<800327a4>] __do_softirq+0xf8/0x35c
[<80002970>] except_vec_vi_end+0xb8/0xc4
[<8001372c>] r4k_wait_irqoff+0x18/0x20
[<80057fb8>] cpu_startup_entry+0x118/0x194
[<80547be0>] start_kernel+0x43c/0x458
[11148.210203]
---[ end trace a68fafe65084dbcd ]---
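
To pull traces like this out of a saved log, something along these lines may help. It is only a sketch: it assumes the log was saved to a file and that each line carries the `kern.warn kernel: [<timestamp>]` prefix mentioned above; the function name `extract_traces` is made up.

```shell
#!/bin/sh
# extract_traces FILE: print every kernel warning block found in FILE,
# from the "cut here" marker to the matching "end trace" line.
extract_traces() {
    # Drop everything up to and including "kernel: [<timestamp>]" on each
    # line, then print only the lines between the two markers (inclusive).
    sed -e 's/^.*kernel: \[ *[0-9.]*] *//' "$1" |
        awk '/\[ cut here \]/ { inside = 1 }
             inside           { print }
             /\[ end trace /  { inside = 0 }'
}
```

Called as `extract_traces /path/to/saved.log`, it prints the blocks without the syslog prefix, which makes them easier to paste into a bug report.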

Any guesses, other than that I should post this on the bug tracker? Btw, the router did not crash after it; it kept going for at least 3 hours, until I rebooted it because I changed some settings.

It looks like a bug somewhere in the workqueue code, maybe. If this is a LEDE image, it has probably been fixed in 18.06 already; if it's not LEDE, it would be a good idea to report it.

Yes, it is a LEDE image, but it's the only thing that doesn't seem to crash within 5 hours xD If it was already fixed, they may backport it or point me to the fix; I could try to backport it myself and then tackle the next bug...

It doesn't look like something that must be fixed if the router is still working for you. It could just be some variable overflowing. If the build works, stay on it and don't worry too much about fixing things, because you can't just backport the whole of 18.06; that would be pointless. I'd suggest using it as is and maybe waiting for the 19.x builds.

It only works on LEDE; from 18.06 on it crashes... I'll wait for 19.x and see if something changes

@Vento please can you try

The strange thing is that it seems to happen only on firmware with VMMC support, but please, please try it anyway (try to chmod any larger file that has existed from the beginning)