TD-W8970 v1 crash with no visible errors in logs

Ok makes sense.

Btw I'm compiling an image with /dev/mem (kernel option) and devmem (busybox utility) included, so I'll probably find out what's inside that crashlog.

UPDATE:
devmem is actually useless. dd (which is already present by default in OpenWrt) is much easier to use.

Crashes continue to happen. I dumped memory (the last MB of RAM) to the attached USB drive with

dd skip=63M count=1M if=/dev/mem bs=1 of=/mnt/sda1/crashlog

The offset is 63M because my router has only 64MB of RAM and the crashlog address is 0x3f00000 (= 63MB).
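As a sanity check on that arithmetic, here is a minimal sketch of the same dump done with 1 MiB blocks instead of single bytes, which should be much faster than bs=1. It assumes the crashlog region starts on a 1 MiB boundary. `addr_to_mib` and `dump_crashlog` are hypothetical helper names, and the paths are the ones from the command above.

```shell
# Convert a physical address to whole MiB (assumes the address is 1 MiB aligned).
addr_to_mib() {
    echo $(( $1 / 1048576 ))
}

# Dump one MiB of /dev/mem starting at the given address into a file.
# Needs root and a kernel built with /dev/mem support, so it is not called here.
dump_crashlog() {
    dd if=/dev/mem of="$2" bs=1M skip="$(addr_to_mib "$1")" count=1
}

addr_to_mib 0x3f00000   # prints 63
# dump_crashlog 0x3f00000 /mnt/sda1/crashlog
```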
The content of the crashlog dump is just A1 EE DE AD followed by lots of 00 (basically garbage -.-"). I changed /etc/sysctl.d/10-default.conf to dump core to my USB drive and to wait 10 seconds instead of 3 before rebooting after a panic (maybe it is a timing problem).

kernel.panic = 10
kernel.core_pattern = /mnt/sda1/%e.%t.%p.%s.core

I've also tried kernel.panic_print=31, but it does not work with the default OpenWrt build configuration (I got sysctl: error: 'kernel.panic_print' is an unknown key). Does anyone know how to enable it?

Btw my USB drive has 2 partitions (1GB each): one for data, where I put logs, crashlogs, etc., and the other for swap. I know swap is useless on a router, but at least now there should be no memory problems (they never happened AFAIK, but now I'm 99.999999% sure they will not happen).

The crashlog (the last MB of RAM, hoping not to catch other stuff) actually has some interesting data (I used strings dump > /path/to/usb/drive), but I think I simply got the memory of some process.
I think the last section of RAM is dedicated to the MEI driver (since I can see almost every format string that driver uses for its error messages). Can someone take a copy of the last MB of RAM in a clean, no-issue boot scenario? Of course it must be the same model, with /dev/mem enabled in the build and using an ADSL/VDSL connection.
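For anyone comparing dumps, here is a minimal sketch of how the header bytes and the amount of non-zero data could be checked on a PC. `dump_magic` and `nonzero_bytes` are hypothetical helper names, and it assumes `od`, `tr` and `wc` are available (they may not all be present in a default OpenWrt image).

```shell
# Print the first four bytes of a file as lowercase hex, e.g. a1eedead.
dump_magic() {
    od -An -tx1 -N4 "$1" | tr -d ' \n'
}

# Count the bytes in a file that are not 0x00.
nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c | tr -d ' '
}

# Usage (the path is an example):
# dump_magic /mnt/sda1/crashlog
# nonzero_bytes /mnt/sda1/crashlog
```

A dump whose non-zero byte count is barely above 4 would match the "header followed by zeros" case described earlier.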

Probably nobody is interested in my router anymore xD Btw the crashlog on this router just seems broken. I'm recompiling an image from the master branch with the default configuration, except that luci, /dev/mem, "Collect debug information" and the USB packages needed to save logs are enabled, and all stack protection options are disabled. Basically I'm mimicking a default image plus more debug features.

UPDATE

To whoever wants to try this configuration: download this and save it somewhere (I'll call it master_diff_036), then execute these commands (assuming that master_diff_036 is placed in the same directory):

cd /dir/where/you/place/master_diff_036
# this will download openwrt sources and create an "openwrt" directory
git clone https://github.com/openwrt/openwrt.git
cd openwrt
./scripts/feeds update -a
./scripts/feeds install -a
cp ../master_diff_036 ./.config
make defconfig
# trick, "make download" will go faster
mkdir ./dl
cd ./dl
wget ftp://ftp.denx.de/pub/u-boot/u-boot-2018.03.tar.bz2
cd ..
make download
# let's save a build.log file
# the "-" goes last inside each bracket expression so it is a literal, not a range
make V=s -j1 2>&1 | tee build.log | grep -i '[^_"a-z-]error[^_.a-z-]'

When everything is done, the final image is called openwrt-lantiq-xrx200-tplink_tdw8970-squashfs-sysupgrade.bin and you will find it in the where_you_started_doing_this/openwrt/bin/targets/lantiq/xrx200 folder.

I am not sure if this was discussed before, but have you tried to go back to stock firmware?

Yes. With the stock image everything works and I do not get crashes.

Have you tried 17.01.5 for your router? Do you remember when the crashes started happening? Was it always the case?

All the following considerations are personal opinions. I may sound negative, but I'm actually just a bit upset to have a weird, rare problem where every possible way of even understanding what causes it seems not to work. I actually really like the whole OpenWrt project and I learnt sooooo much! I think that all the crashes are related to some kind of personal situation/configuration.

  • 17.01.04 was the last version that worked for me (crashes were almost nonexistent, maybe 1 a week, but I manually reboot the router every day, so those crashes may have been sporadic, unrelated events)
  • Skipped 17.01.05 because I simply forgot to check for updates
  • 17.01.06 had problems (random crashes, rare but happening at least once each day)
  • images built from the lede-17.01 branch are fast but will crash within 5h~10h (if the net is used, OC, I never tested crash times when idling)
  • images built from the openwrt-18.06 branch are a joke
  • images built from the master branch are faster than lede-17.01 and crash a bit less (still can't reach the 10h mark)... but they still ultimately crash and make the router unusable.

I used to game online with no problems, but now it's impossible: crashes are always a threat. I cannot start a match when it is 99% sure that I'll crash.

That's not considered a crash as this topic suggests. It's possible something broke in 18.06 for your router and that is why it's not working anymore. But it's also possible that it has been fixed in the master branch and will be incorporated in future releases. Now, as you suggest that the master branch works fine and the router crashes under heavy load, that's normal because it only has a limited amount of CPU and RAM. It's highly advisable to use this router for routing purposes only, depending on your internet connection. I am not sure it will be able to handle more than 50 Mbps of download.

Ye I was thinking the same thing. Something broke around 17.01.5/6 and carried over into the next releases.

Ok, I made a mistake when I wrote "high load", which for me is 12 Mbps (ISP contract limit). High load is when I download things, possibly game a bit, and some other devices are connected over WiFi... Everything should get slow, I know, but the router should not crash and reboot... Also because it works perfectly on stock (and on ancient releases of OpenWrt).

I may just revert back to stock 17.01.04 or last 15.05... sad :frowning: I really wanted to have SMP (introduced with 18.06) and offloading (currently in master?).

Yes I understand that but these features are still new so there could be problems associated with them. If you are interested in using a faster device then you should check out the following topics.

  1. How can we make the lantiq xrx200 devices faster
  2. Xrx200 IRQ balancing between VPEs

Please note that this is experimental and you'll need to use master branch without any kernel changes and just apply the patches.

Ty for the info. For now I'll stick to testing this device against official images or master and try to find out why it crashes, I'm stubborn xD

The last image (message num 25) finally crashed. Here you can find the complete dump of the last log before the crash, the log after the crash, the crashlog dumped from RAM (which is useless since it is broken), the core dump, the config folder, and other stuff. Remember I have a USB drive attached to the router with 2 partitions, one EXT4 to catch where things go and one for swap (~925MB of swap, just because I can). I masked passwords, MAC and IP addresses. DOWNLOAD LINK IF ANYONE WANTS TO HELP.

Just to know, how can I download the exact sources (/commit) used to build 17.01.4? :thinking:

EDIT:

Found the commit. I'm dumb

You can try the image builder from http://archive.openwrt.org/releases/17.01.4/targets/ rather than compiling from source. Just to be on the safe side.

Too late. I just went full "myself" and built an extremely minimal image, also with some kernel changes. #ohno

Joking aside. Yes, I also grabbed the official 17.01.04 and I'm getting ready to do tests. I might try a binary search for the commit responsible for this reboot problem:

  • try 17.01.05: if it works, the problem is between 05 and 06; if not, it is between 04 and 05
  • choose a commit in the middle of the selected half
  • repeat until I find the commit which makes everything go bad

It will probably take a lot of time... Worst case: about log2(commits_num_from_04_to_06) builds, am I right?
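On the worst-case question: a bisection halves the candidate range on every test, so it should take roughly ceil(log2(N)) builds for N commits. Here is a minimal sketch of that arithmetic. `bisect_steps` is a hypothetical helper, and the tag names in the commented `git bisect` lines are an assumption about how the releases are tagged in the repository.

```shell
# Number of halvings needed to narrow n candidates down to 1, i.e. ceil(log2(n)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))      # keep the larger half (worst case)
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 1000   # prints 10

# With git doing the bookkeeping (tag names are assumed, not checked here):
# git rev-list --count v17.01.4..v17.01.6   # number of candidate commits
# git bisect start
# git bisect bad v17.01.6
# git bisect good v17.01.4
# ...build, flash and test, then mark each step with "git bisect good" or "git bisect bad"
```

Even a few hundred commits would come down to fewer than ten builds, although each build still has to run long enough to be sure it does not crash.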

Hey a little bit of OT, it might be related to another problem that this router has... but it is just a wild guess. I'll ask this here so I do not create another thread.

I would like to separate eth0 from wlan (without the default bridge), and also have DHCP on eth0 with static IPs based on the switch port, not on MAC addresses. All of this without using VLANs.

Out of the box OpenWrt creates a bridge between eth0 and wlan, putting every client under 192.168.1.X. I'd like to not create the bridge, but instead put every client connecting through WiFi under 192.168.2.X (using the IP range 192.168.2.100 - 192.168.2.200) and select a fixed IP for each (exposed) port of the switch. The 4 ports on the back of the router correspond (from left to right) to the internal switch ports 4, 2, 0 and 5. Is there a way to assign a specific IP to each port? Whatever connects to switch port 4 gets IP 192.168.1.2, port 2 gets 192.168.1.3, port 0 gets 192.168.1.4 and port 5 gets 192.168.1.5.

This is rather strange to me TBH, and I don't think it's possible. Even if it were, you'd need to create VLANs and probably a separate network for each VLAN, but you can't assign the same subnet to different VLANs (AFAIK). Also, if you assign a static IP to a network, that IP is treated as the gateway for whatever device connects to that port, and the device will need a different IP address in the same subnet.

This is possible: you just need to go to the LAN interface's physical settings tab, untick the bridge option and assign the eth0 interface. Then create a new interface and do the same with wlan.
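For reference, the same change could be sketched from the command line with `uci` instead of the web interface. This is only a sketch under assumptions: the interface name `wlan`, the 192.168.2.1 router address and the use of the first `wifi-iface` section are illustrative, and the firewall zone for the new interface is not shown. The DHCP pool follows the 192.168.2.100 - 192.168.2.200 range asked for above.

```shell
# Stop bridging the LAN interface (removes "option type 'bridge'")
uci delete network.lan.type

# New routed interface for WiFi clients (the name "wlan" is an assumption)
uci set network.wlan=interface
uci set network.wlan.proto='static'
uci set network.wlan.ipaddr='192.168.2.1'
uci set network.wlan.netmask='255.255.255.0'

# Attach the first wireless network to it instead of "lan"
uci set "wireless.@wifi-iface[0].network=wlan"

# DHCP pool 192.168.2.100 - 192.168.2.200 (start at .100, 101 addresses)
uci set dhcp.wlan=dhcp
uci set dhcp.wlan.interface='wlan'
uci set dhcp.wlan.start='100'
uci set dhcp.wlan.limit='101'
uci set dhcp.wlan.leasetime='12h'

# Save and apply
uci commit
/etc/init.d/network restart
/etc/init.d/dnsmasq restart
```

Keeping a wired connection open while applying this is a good idea, since a mistake in the wireless part can lock WiFi clients out.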

Basically no, the switches in a typical all-in-one router work on Layer 2, Ethernet addresses, not IP addresses.

You could subnet down to a /30 so each subnet has an IP for the router, one for a device, and a broadcast address, but that's going to be a maintenance nightmare, as well as destroying any on-link services.
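To make the /30 idea concrete: each /30 block holds four addresses (the network address, two usable hosts and the broadcast address), so every port would need its own block. A minimal sketch, with `subnet30` as a hypothetical helper and 192.168.1 as an example prefix:

```shell
# Print the layout of the Nth /30 block (0-based) inside a /24.
# $1 = first three octets, $2 = block index (0..63)
subnet30() {
    net=$(( $2 * 4 ))
    echo "network=$1.$net/30 router=$1.$(( net + 1 )) device=$1.$(( net + 2 )) broadcast=$1.$(( net + 3 ))"
}

# One block per switch port
for i in 0 1 2 3; do
    subnet30 192.168.1 "$i"
done
```

Each port then lives in its own tiny network, which is why devices on different ports could no longer reach each other on-link.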

I'm not sure what you're trying to accomplish with that topology. Perhaps explaining your goals would lead to an approach.

Sorry to keep writing on this forum. Btw, how can I build my image to be as verbose as possible? I want to know whether my image is crashing for a reason or the CPU is simply giving up.