Data corruption on USB stick

Linksys WRT1900AC v1, OpenWrt snapshot r19033
I use an overlay on a USB stick, and LuCI statistics writes its data to it. After some time, the filesystem gets corrupted. With ext2 and f2fs the corruption is detected by fsck after a reboot. With btrfs (with checksums enabled), the problem appears after a few minutes or hours of use and is detectable by running "btrfs scrub". Sometimes btrfs won't even mount a fresh filesystem immediately after mkfs.btrfs.
The router is very stable when USB storage is not in use.

I tried:

  • changing the stick: 4 sticks, different manufacturers, different sizes, I even tried to use a CompactFlash card in a card reader
  • changing the USB port: USB2, USB3, with/without a USB hub
  • changing filesystems: ext2, f2fs, btrfs
  • changing the router and firmware: Asus Merlin on Asus RT-N18U, but it doesn't have btrfs and the other filesystems don't have checksums, so the problem only appears after reboot.

I don't know what else to do. The only thing I haven't tried is a SATA HDD. (I don't have one available right now.)
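
One filesystem-independent check that might help narrow this down is a plain write-then-read-back test: write random data to the stick, flush the page cache, read it back, and compare checksums. This is only a minimal sketch, assuming `dd`, `mktemp` and `sha256sum` are available (BusyBox normally provides them); the function name `verify_roundtrip` and the mount point `/mnt/usb` are made up for illustration.

```shell
# Write random data to DIR, re-read it, and compare checksums.
# Usage: verify_roundtrip DIR [SIZE_MIB]; prints OK or MISMATCH.
verify_roundtrip() {
    dir="$1"
    size_mib="${2:-16}"
    src="$(mktemp)"              # reference copy (usually on tmpfs in /tmp)
    dst="$dir/roundtrip.bin"     # copy on the storage under test

    dd if=/dev/urandom of="$src" bs=1M count="$size_mib" 2>/dev/null
    want="$(sha256sum < "$src")"

    cp "$src" "$dst"
    sync
    # Drop the page cache so the read really hits the device (needs root).
    # If this is not permitted, the read may be served from RAM instead.
    { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    got="$(sha256sum < "$dst")"

    rm -f "$src" "$dst"
    if [ "$want" = "$got" ]; then
        echo OK
    else
        echo MISMATCH
        return 1
    fi
}
```

Running something like `verify_roundtrip /mnt/usb 64` in a loop would show whether data is already damaged on a simple write and read, independent of which filesystem is on the stick.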

root@GRAPHRT:~# btrfs scrub start /dev/sda
scrub started on /dev/sda, fsid 12f0822d-0c24-4df8-ae90-ba2a376c7e91 (pid=16433)
root@GRAPHRT:~# ERROR: there are uncorrectable errors

root@GRAPHRT:~# btrfs scrub status /dev/sda
UUID:             12f0822d-0c24-4df8-ae90-ba2a376c7e91
Scrub started:    Tue Mar  8 10:28:54 2022
Status:           finished
Duration:         0:00:07
Total to scrub:   33.14MiB
Rate:             4.73MiB/s
Error summary:    csum=65
  Corrected:      11
  Uncorrectable:  54
  Unverified:     0
root@GRAPHRT:~# dmesg | tail
[94111.595852] BTRFS warning (device sda): checksum error at logical 278302720 on dev /dev/sda, physical 336564224, root 5, inode 455, offset 36864, length 4096, links 1 (path: upper/var/db/collectd/GRAPHRT/interface-pppoe-RDS/if_errors.rrd)
[94111.623848] btrfs_dev_stat_print_on_error: 54 callbacks suppressed
[94111.623863] BTRFS error (device sda): bdev /dev/sda errs: wr 0, rd 0, flush 0, corrupt 428, gen 0
[94111.640134] scrub_handle_errored_block: 43 callbacks suppressed
[94111.640222] BTRFS error (device sda): unable to fixup (regular) error at logical 278302720 on dev /dev/sda
[94112.958343] BTRFS info (device sda): scrub: finished on devid 1 with status: 0
[94135.812687] BTRFS warning (device sda): csum failed root 5 ino 425 off 202014720 csum 0x3f500621 expected csum 0x16342cd3 mirror 1
[94135.824567] BTRFS error (device sda): bdev /dev/sda errs: wr 0, rd 0, flush 0, corrupt 429, gen 0
[94135.835457] BTRFS warning (device sda): csum failed root 5 ino 428 off 202035200 csum 0x0efad072 expected csum 0x33666ef0 mirror 1
[94135.847466] BTRFS error (device sda): bdev /dev/sda errs: wr 0, rd 0, flush 0, corrupt 430, gen 0
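
To see whether the corruption counter keeps climbing between scrubs, the `corrupt N` value can be pulled out of the kernel log. A rough sketch, assuming the log lines keep the format shown above; `last_corrupt_count` is a made-up helper name. `btrfs device stats /dev/sda` should report the same counters directly.

```shell
# Print the most recent "corrupt N" counter found in BTRFS error lines
# read from stdin. Prints nothing if no such line is present.
last_corrupt_count() {
    sed -n 's/.*BTRFS error.*errs:.* corrupt \([0-9][0-9]*\),.*/\1/p' | tail -n 1
}
```

Usage would be `dmesg | last_corrupt_count`.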

Do you use overlay on USB? What filesystem? Any corruption detected after reboot?

I also use collectd, and centralize the stats from two devices on the main router. I write to an f2fs overlay constantly and have never had any corruption issues.

On the other hand, you seem to have tested all the alternatives that could be tested...

Btrfs is a copy-on-write "server grade" file system that expects very high levels of storage resilience, such as found in a typical server RAID 5 array.
Putting it on a USB stick can be done, but it is going to result in a very short lifetime and certain corruption if the drive is not unmounted (allowing cached writes to complete) before rebooting or powering down.
Pulling the stick or the router power will certainly result in a failure to remount on next boot.

F2fs, in contrast, is designed specifically for flash storage (USB sticks and the like) with limited write endurance, as is JFFS2, and results in very low levels of data loss and corruption.

Btrfs may be server grade (is it?) but it still shouldn't fail to mount the drive immediately after mkfs.
F2fs also gets corrupted, but it keeps going without detecting the errors until umount+fsck or reboot.

Running the overlay on USB will also dramatically amplify any problems with the hardware.

Glitches due to dust, dirt and wear on contacts, along with those caused by mechanical movement, can all cause corruption.

Glitches can also occur if the router power supply is marginal and cannot supply enough power for short intervals with the added USB load of a memory stick. Sometimes power supplies are specified as "just enough" to keep costs down, and a power supply will often degrade with age.
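
If contact or power glitches are the cause, the kernel log will often (not always) show the USB device being reset or dropping off the bus. A small sketch for counting such lines; the helper name `count_usb_glitches` is made up, and the pattern assumes the usual kernel wording ("USB disconnect", "reset ... USB device"), which may differ between kernel versions.

```shell
# Count kernel log lines on stdin that report a USB reset or disconnect.
# Always prints a number (0 if nothing matched).
count_usb_glitches() {
    grep -c -i -E 'usb [0-9.-]+: (reset .*USB device|USB disconnect)' || true
}
```

Usage would be `dmesg | count_usb_glitches`. A count that grows while the stick is in use would point at the port, cable or power supply rather than at the filesystem.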

You are very likely suffering from the knock-on effects of all this.

The real solution is to buy a new router, of course, but I do see why you would want to try this. It works, but now you are seeing the negatives.
