Hello. I am opening this thread mainly for 2 reasons:
- If I open a separate thread for every issue I encounter, the admins will probably ban me... and they would be right to do it. So I created this thread for every single problem I have now. Again, I'm sorry that I created so many threads in the past.
- I want to find out whether my TD-W8970 v1 (1.2) is broken or whether it is my configuration that makes it crash. I think, but I might be completely wrong, that no one is checking whether this router model works with every configuration (which makes sense, it's impossible to check everything), and since I have a quite unusual configuration (PPPoA connection, no VLAN, etc.), maybe that is why I get crashes.
WARNING: I wrote a lot.
More reasons why this thread exists:
- I love this router. It was cheap when I bought it. Gigabit switch, ADSL2+ (even VDSL), CPU with sort of SMP, 500 MHz, three wifi antennas.
- I started custom-building my own OpenWRT with 15.05. Some builds were more stable than others, of course, but once I found a stable one it worked with almost zero problems (at least 24h of uptime with no crash).
- Currently it does not matter whether I choose a pre-compiled image or build my own version from any active branch (lede-17.01, openwrt-18.06, master): the router will have major problems (dead switch, random reboots, freezes, ...) within a few hours of boot. I want to find out why instead of simply complaining.
- I want to build an ultra fast minimal image of OpenWRT
- I want to learn
- I am currently stuck using this garbage. I can hear it sobbing because it cannot handle so much work.
- Check some photos in the "QUESTION THAT YOU MIGHT HAVE" section.
BUILD ENV
My main OS is Windows, so I set up a Debian VM (minimal, no GUI, I ssh into it) with every package I need to compile my own version of OpenWRT.
RETRIEVE SOURCES
cd /home/vento
# First delete old build folder, called "buildop"
rm -rf buildop
# update current local version of openwrt
# previously obtained with
# git clone https://www.github.com/openwrt/openwrt
# mv openwrt openwrt_backup
#
cd openwrt_backup
git pull
cd ..
# create a new "buildop" directory
cp -R openwrt_backup buildop
# update feeds and install only necessary packages
cd buildop
./scripts/feeds update -a
./scripts/feeds install libpam libgnutls libopenldap libidn2 libssh2 liblzma libnetsnmp jansson
BUILD CONFIG
Download the last configuration I used, file name: master_diff_031
(as you can see, it is not a really minimal build configuration; lots of debug stuff is still there, but that's to find out why it does not work).
# apply diff file
cd /home/vento/buildop
cat /home/vento/master_diff_031 > .config
# expand config file
make defconfig
Now let me explain the major changes you see inside that diff file (created with `./scripts/diffconfig.sh > master_diff_031`).
I do not want to use a thing with no RTC as an NTP server
# CONFIG_BUSYBOX_CONFIG_FEATURE_NTPD_SERVER is not set
No IPV6
# CONFIG_BUSYBOX_DEFAULT_FEATURE_IPV6 is not set
# CONFIG_IPV6 is not set
# CONFIG_KERNEL_IPV6 is not set
# CONFIG_PACKAGE_libip6tc is not set
# CONFIG_PACKAGE_kmod-nf-ipt6 is not set
No need of OPKG
CONFIG_CLEAN_IPKG=y
# CONFIG_PACKAGE_openwrt-keyring is not set
# CONFIG_PACKAGE_opkg is not set
# CONFIG_PACKAGE_libuclient is not set
# CONFIG_PACKAGE_uclient-fetch is not set
# CONFIG_PACKAGE_usign is not set
# CONFIG_SIGNED_PACKAGES is not set
# CONFIG_PER_FEED_REPO is not set
No need of security
CONFIG_PKG_CC_STACKPROTECTOR_NONE=y
# CONFIG_PKG_CHECK_FORMAT_SECURITY is not set
# CONFIG_PKG_FORTIFY_SOURCE_1 is not set
CONFIG_PKG_FORTIFY_SOURCE_NONE=y
# CONFIG_PKG_RELRO_FULL is not set
CONFIG_PKG_RELRO_NONE=y
CONFIG_COLLECT_KERNEL_DEBUG=y
CONFIG_KERNEL_CC_STACKPROTECTOR_NONE=y
# CONFIG_KERNEL_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_KERNEL_STACKPROTECTOR is not set
I do not need any of these
# CONFIG_FSTOOLS_UBIFS_EXTROOT is not set
CONFIG_TARGET_PREINIT_SUPPRESS_FAILSAFE_NETMSG=y
# CONFIG_OPENLDAP_DEBUG is not set
# CONFIG_PACKAGE_MAC80211_MESH is not set
I'll manually download the lantiq DSL firmware
# CONFIG_PACKAGE_bspatch is not set
# CONFIG_PACKAGE_dsl-vrx200-firmware-xdsl-a is not set
# CONFIG_PACKAGE_dsl-vrx200-firmware-xdsl-b-patch is not set
# CONFIG_PACKAGE_libbz2 is not set
# CONFIG_PACKAGE_ltq-vdsl-vr9-vectoring-fw-installer is not set
This router does not have that much RAM
# CONFIG_KERNEL_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_KERNEL_CC_OPTIMIZE_FOR_SIZE=y
I'd like to make the router work before testing offload
# CONFIG_PACKAGE_kmod-ipt-offload is not set
# CONFIG_PACKAGE_kmod-nf-flow is not set
Wanted to try uboot-envtools
CONFIG_PACKAGE_uboot-envtools=y
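Since `make defconfig` can silently drop or override options whose dependencies are not met, it may be worth checking that every line of the diff survived the expansion. A minimal sketch, assuming a `grep` that supports `-F`, `-x`, `-v` and `-f`; the function name `missing_config_lines` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print every line of the diffconfig file that does not
# appear verbatim in the expanded .config (those options were dropped or changed).
missing_config_lines() {
    diff_file=$1
    config_file=$2
    # -F: fixed strings, -x: match whole lines, -v: keep non-matching lines,
    # -f: read the patterns (the .config lines) from a file.
    grep -Fxv -f "$config_file" "$diff_file"
}

# Example (paths as used in this post):
#   missing_config_lines /home/vento/master_diff_031 /home/vento/buildop/.config
```

Empty output means every line was applied as written. A reported `is not set` line is not necessarily a problem: it can also mean the symbol is simply not visible in this configuration.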
MAKE IMAGE
Run these before make download to fetch one file manually. The whole process will take much less time (there is a bug).
mkdir -p /home/vento/buildop/dl
cd /home/vento/buildop/dl
wget ftp://ftp.denx.de/pub/u-boot/u-boot-2018.03.tar.bz2
cd /home/vento/buildop
Run `make download` as a separate step, not to enable parallel make, but to get a build log that is easier to read later. Save a debug log of the `make download` step by running it like this: `make download V=s -j1 2>&1 | tee download.log`.
Other things related to `uboot-envtools`. Skip if you want.
Since I'm trying to use `uboot-envtools`, make mtd0 writable by running `nano target/linux/lantiq/files-4.14/arch/mips/boot/dts/TDW89X0.dtsi` and removing the line `read-only;` from the partitions definition:
...
partitions {
    compatible = "fixed-partitions";
    #address-cells = <1>;
    #size-cells = <1>;
    partition@0 {
        reg = <0x0 0x20000>;
        label = "u-boot";
        read-only; /* <------------------ REMOVE THIS LINE */
    };
...
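The same edit can be scripted instead of done by hand in nano. A rough sketch, assuming a `sed` with `-i` and that `read-only;` comes after the `label = "u-boot";` line inside the partition node, as in the excerpt above; `make_uboot_writable` is a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: drop the "read-only;" property from the u-boot
# partition only, leaving every other partition untouched.
make_uboot_writable() {
    dts_file=$1
    # Address range: from the u-boot label line to the closing "};" of that
    # node. Inside that range, delete any line containing "read-only;".
    sed -i '/label = "u-boot";/,/};/{/read-only;/d;}' "$dts_file"
}

# Example (path as used in this post, relative to the build directory):
#   make_uboot_writable target/linux/lantiq/files-4.14/arch/mips/boot/dts/TDW89X0.dtsi
```

Running `git diff` in the build directory afterwards shows whether only the intended line was removed.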
Now we are ready to compile the whole thing: run `make V=s -j1 2>&1 | tee build.log`.
The download log should be saved as `/home/vento/buildop/download.log` and the build log as `/home/vento/buildop/build.log`.
The built image can be found at `/home/vento/buildop/bin/targets/lantiq/xrx200/openwrt-lantiq-xrx200-tplink_tdw8970-squashfs-sysupgrade.bin`.
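With `V=s` the logs get long, so a quick way to jump to the failure helps. A small sketch, assuming GNU make's usual `*** ... Error N` and `*** ... Stop.` messages; the function name is made up:

```shell
#!/bin/sh
# Hypothetical helper: show the first few lines (with line numbers) where
# make reported a failure in a saved log. Default limit: 5 lines.
first_make_errors() {
    log=$1
    limit=${2:-5}
    grep -n -E '\*\*\* .*(Error [0-9]+|Stop\.)' "$log" | head -n "$limit"
}

# Example (log paths as used in this post):
#   first_make_errors /home/vento/buildop/build.log
#   first_make_errors /home/vento/buildop/download.log 1
```

The real cause is usually a few lines above the first reported match, so the line number is mainly a place to start reading.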
FLASHING PROCESS
Since `sysupgrade` failed me sometimes (long story), I went a bit "crazy" with the installation process. To be fair, I have never had a no-boot/dead router since I started doing this.
- I set up my Raspberry Pi like this. Actual photo. I run flashrom like this: `sudo flashrom -V -p linux_spi:dev=/dev/spidev0.0,spispeed=512 -c "W25Q64.V"` (careful, your flash chip might be different), and append `-r`, `-w` and `-v` to save, write and verify (duh!).
- Special step. Done this only one time. Now I have every file I need.
  - Starting from a clean original firmware.
  - Reset settings
  - Update to the last "available" firmware, released 19/06/2015. It was released by TP-Link support and is not available on their site. The last firmware available from the official site was released 13/06/2014.
  - Reset settings
  - Dump with the Raspberry Pi; I saved it as `original.bin`.
  - Shut down the Raspberry Pi. Remove clip. Reconnect clip. Start the Raspberry Pi again. Check that the chip has been dumped correctly: `sudo flashrom -V -p linux_spi:dev=/dev/spidev0.0,spispeed=512 -c "W25Q64.V" -v original.bin`.
  - Make a copy of `original.bin`, called `empty.bin`.
  - Open `empty.bin` with a hex editor (I used HxD).
    - Write `0xFF` from address `0x20000` to address `0x7A0000`.
  - Make a backup of both `original.bin` and `empty.bin`. Send them to myself by email.
- Get the update file (`update.bin`) ready
  - Make a copy of `empty.bin`, called `update.bin`.
  - Open `update.bin` and the sysupgrade file from the build process with a hex editor.
  - Copy every byte of the sysupgrade file
  - Starting from address `0x20000`, overwrite the bytes of `update.bin` with them
- Copy `update.bin` to the Raspberry Pi
- Flash `update.bin` with `sudo flashrom -V -p linux_spi:dev=/dev/spidev0.0,spispeed=512 -c "W25Q64.V" -w update.bin`
- Shut down the Raspberry Pi. Remove clip. Reconnect clip. Start the Raspberry Pi again. Check that the chip has been written correctly: `sudo flashrom -V -p linux_spi:dev=/dev/spidev0.0,spispeed=512 -c "W25Q64.V" -v update.bin`.
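The hex-editor steps above (blanking a region with `0xFF`, then pasting the sysupgrade file at `0x20000`) could also be done from the shell. This is only a sketch, not the procedure described above: it assumes GNU `dd` (for `oflag=seek_bytes`) and GNU `head`, and it treats the end address as exclusive. `blank_region` and `write_at` are made-up names.

```shell
#!/bin/sh
# Hypothetical helper: overwrite bytes [START, END) of IMAGE with 0xFF,
# leaving the rest of the file untouched (conv=notrunc).
blank_region() {
    image=$1
    start=$(( $2 ))
    end=$(( $3 ))
    # Produce (end - start) zero bytes, turn them into 0xFF, write in place.
    head -c "$(( end - start ))" /dev/zero | LC_ALL=C tr '\000' '\377' |
        dd of="$image" bs=65536 seek="$start" oflag=seek_bytes conv=notrunc 2>/dev/null
}

# Hypothetical helper: copy PAYLOAD into IMAGE starting at byte OFFSET,
# again without truncating or resizing the image.
write_at() {
    image=$1
    payload=$2
    offset=$(( $3 ))
    dd if="$payload" of="$image" bs=65536 seek="$offset" oflag=seek_bytes conv=notrunc 2>/dev/null
}

# Example, mirroring the steps above (file names as used in this post,
# sysupgrade.bin stands for the image produced by the build):
#   cp original.bin empty.bin  && blank_region empty.bin 0x20000 0x7A0000
#   cp empty.bin update.bin    && write_at update.bin sysupgrade.bin 0x20000
```

Comparing file sizes with `ls -l original.bin update.bin` before flashing is a cheap sanity check: both helpers are meant to leave the image size unchanged.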
POST FLASHING CONFIGURATIONS
Yes, I know that all of the following changes could be integrated into the build process, since removing/changing files afterwards actually takes more space on the flash chip... I'll do it one day, once the router does not crash. Make sure not to mess up file permissions.
- Remove banners just for fun.
  rm /etc/banner.failsafe
  touch /etc/banner.failsafe
  rm /etc/banner
  touch /etc/banner
- Remove stuff (see next bullet points)
  rm /etc/firewall.user
  rm /lib/netifd/ppp6-up
- Apply custom configs (files in `/etc/config/`)
  - As you see I have a PPPoA connection.
  - I've removed the default VLAN (eth0.1) and disabled VLAN features.
  - Since remote logging fails to capture crashlogs, I've simply disabled it.
- Write `kernel.randomize_va_space=0` inside `/etc/sysctl.conf`. Disable security stuff.
- In `/etc/sysctl.d/10-default.conf`, change `net.ipv4.tcp_syncookies` to `0` and remove the IPv6-related lines, now useless.
- Use this modified `/lib/netifd/proto/ppp.sh`. The function `ppp_generic_setup` has been modified to completely ditch any IPv6 reference (except for the string `noipv6` passed as a parameter to ppp), and to forcefully NOT request DNS IPs from the ISP. Remember not to mess up file permissions: `ppp.sh` must be executable!
- Download the publicly available Lantiq DSL firmware. The file is called `xcpe_581816_580B11.bin`; rename it to `lantiq-vrx200-a.bin` and place it in `/lib/firmware/`.
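The two sysctl edits above can be scripted. A minimal sketch, assuming a `sed` with `-i` (GNU or BusyBox); `set_sysctl` is a made-up helper, not an OpenWrt tool:

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl-style file, replacing an
# existing assignment of that key or appending a new line if there is none.
set_sysctl() {
    key=$1
    value=$2
    file=$3
    # Escape the dots so they match literally in the regular expressions below.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i "s|^${pattern}[[:space:]]*=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (files and keys as named in this post):
#   set_sysctl kernel.randomize_va_space 0 /etc/sysctl.conf
#   set_sysctl net.ipv4.tcp_syncookies 0 /etc/sysctl.d/10-default.conf
```

It does not remove the IPv6 lines; that part is still a manual edit.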
SO? WHAT IS THE PROBLEM?
- The router crashes within 10 hours, 5 or less if it's under heavy load.
- Or it drops the pppoa-wan connection just to bring it back 20 seconds later.
- Yes, after a crash I see `Crashlog allocated RAM at address 0x3f00000`. But no crashlog file is generated under `/sys/kernel/debug`.
- I have tried to set up my Raspberry Pi as a log server; it worked, but no crashlogs are saved.
- I am currently unable to detect why this router crashes.
OTHER QUESTIONS THAT I HAVE
- How can I read crashlogs from RAM? In logread I see `Crashlog allocated RAM at address 0x3f00000`. I FOUND OUT HOW! POST 22 & 23. COMPILE YOUR IMAGE WITH `/dev/mem` ENABLED AND USE THE COMMAND `dd`.
- Are there other ways to troubleshoot this? To know why this router crashes?
- After the feeds update, I install only the minimum needed packages (see the section about retrieving the sources) to not get errors while running `make menuconfig`, `make defconfig`, etc. Is this OK? Should I simply run `./scripts/feeds install -a`?
- I see this with logread at every boot: `cacheinfo: Failed to find cpu0 device node` followed by `cacheinfo: Unable to detect cache hierarchy for CPU 0`. This was also happening some time ago on 18.06 but was later fixed... regression in master? Should I simply forget about it?
- This is more a statement than a question. I used to compile this build adding `-march=34kc -mtune=34kc -mmt -mdsp` to the target optimization (somewhere inside `make menuconfig`). `-march=34kc` should automatically enable `-mmt -mdsp`, but I write them anyway. Every time I bring this up I get tons of "34kc is like 24kc for GCC, it does not make any difference!", but it does! I get less RAM usage, and the MT ASE and DSP ASE were introduced with 34kc; 24kc does not have them. Also, the MT ASE is related to multithreading, useful since 18.06 introduced SMP support on this CPU! Am I missing something big?
- Does `kernel.randomize_va_space=0` actually make a difference?
- Why is `Compile with support for patented functionality` (under `make menuconfig` > `Global build settings`) selected by default in LEDE but not in 18.06 and later?
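For the first question, the `dd` approach mentioned there might look roughly like this. It is a sketch under assumptions: the address printed in the log is taken to be a physical address usable as an offset into `/dev/mem`, both address and length are multiples of 4096, and the length used in the example (`0x4000`) is a guess, not a documented size. `dump_phys_range` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: copy LENGTH bytes starting at physical address ADDR
# from a memory device (default: /dev/mem) into the file OUT.
dump_phys_range() {
    addr=$1
    length=$2
    out=$3
    mem=${4:-/dev/mem}
    block=4096
    # dd counts in blocks, so convert the byte address and length to blocks.
    dd if="$mem" of="$out" bs="$block" \
        skip="$(( addr / block ))" count="$(( length / block ))" 2>/dev/null
}

# Example on the router (address from the log line; the length is a guess):
#   dump_phys_range 0x3f00000 0x4000 /tmp/crashlog.bin
#   strings /tmp/crashlog.bin    # if a strings applet is available
```

The fourth argument exists only so the function can be tried on an ordinary file before pointing it at `/dev/mem`.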
QUESTION THAT YOU MIGHT HAVE
- No GUI/LuCI? Yep. Less RAM usage.
- Is the router overheating? I think not. Max temp was 75 °C, now it is 36 °C!
  - Wait... now it needs more power! Yep, I bought a 30 W power supply, and router+fan should need 27 W max.
  - Dust? Nope. Custom air filter, easily removable thanks to its quality cardboard frame + black duct tape.
  - Isn't all of that overkill? Yes, absolutely.
  - Is it loud? Nah, the PC is louder, even outside noise sometimes.
- Did you test different configurations? I "think" so... it could also be that I'm doing the same thing wrong over and over (probably this).
- Why master and not LEDE or the last 18.06? LEDE feels faster but it crashes more, and it also has fewer of the features that I'd like to use once everything works. 18.06 is completely unstable for me: it can last at most 2 hours without a crash.
- Do you know that OpenWRT does not have 100% uptime? Yes, yes, I know; that's why I used to schedule an automatic reboot every morning at 7 (and/or shut it down completely at night and manually turn it on in the morning).
- `uboot-envtools` is not configured! Yes, I know. I included it only because it was inside the `master_diff_031` file. Leaving it unconfigured does no harm. I'll play with `uboot-envtools` once everything works.
- Hey, lots of other stuff can be removed! This is not a real minimal build. Yes, I know. I want to remove more things I do not use, and I did in past configurations/tests... First I'd like to achieve a stable build and properly identify where the problem is, and then gradually remove more and more things.