Does OpenWrt support ZFS filesystem?

I do not see ZFS mentioned anywhere. Is it supported? If not, why not? The only alternative is Btrfs, and that filesystem has bugs that have gone unfixed for ages. OpenWrt's kernel is also too old for Btrfs, as new fixes and features are only available in kernel 5.x.

No, and I wouldn't expect that to change in the future either.

--

  • zfs is not mainline, so supporting it would require a significant maintenance burden.
  • things like dkms are not viable on OpenWrt, as its runtime environment isn't suited to compiling (no runtime toolchain, no devel packages, no kernel headers).
  • the abstraction layers for ZoL are huge, too large for most target platforms generally used for running OpenWrt.
  • RAM requirements for zfs are way beyond what most target platforms can provide.
  • the licensing dilemma (CDDL vs GPLv2) would be quite difficult - at least prebuilt kernel modules wouldn't be possible (which, combined with no dkms/runtime compiling, would reduce its practicality considerably).
  • no one cared enough about zfs to tackle the problems above or signed up to packaging it for OpenWrt.

This isn't my experience. I've been using btrfs for multiple NAS/fileservers for at least 4 or 5 years and it has worked flawlessly. I use it for both online storage and offline/offsite backups with snapshots.

What doesn't work well is documented on the btrfs wiki, mainly RAID5/6. I have used btrfs on a single device, and in RAID 1 and RAID 10, successfully.
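For anyone wanting to try those profiles, the standard btrfs-progs commands look roughly like this (device names and mountpoint are hypothetical; adjust for your system):

```shell
# Create a two-device btrfs RAID1 (both data and metadata mirrored):
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/storage

# Or convert an existing single-device filesystem to RAID1 later,
# after adding a second disk:
btrfs device add /dev/sdc /mnt/storage
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/storage
```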


I can confirm BTRFS had a lot of bugs; I've been using it for years now. There were two severe bugs during development (one was a faulty kernel release). I lost a whole raid. :smiley: It took me hours to get things back from other backup drives. I never had those issues with ZFS, which I've been using since Solaris 10 was released. But ZoL (ZFS on Linux) also had various bugs during development, and for a long time it was feature-incomplete compared to the original ZFS (mainly compression, the dedup feature, and encryption). But that has been solved since Oracle decided to release the code to the public (as soon as BTRFS was reliable ^^), and of course not under GPLvx. There is still one nice-to-have feature that ZoL is missing (afaik): "self-healing" (detecting corruption at runtime and automatically repairing the issue; you need a "mirror" for this feature).

But today BTRFS is stable and reliable. The only thing I have to mention is that you should use space_cache=v2 on your BTRFS devices for reliability. V1 is still the default.

What speaks for BTRFS is its RAM usage compared to ZFS. When it comes to performance, well, it depends (as so often) on the usage scenario. Random-write performance is "huge" on ZFS compared to BTRFS. BTRFS has gained massive performance over years of development; the gap is not so huge anymore, and probably non-existent if you run both on Linux. ZFS is 128-bit based and might be "more future-proof" compared to 64-bit BTRFS.

Can you expand on this? I have several btrfs drives I've just created with default settings and would be happy to understand the difference and maybe make changes.

2^64 bytes = 1.8x10^19 ≈ 18 million terabytes per file; if you address 4k blocks, it's about 75 billion terabytes per volume.

I'm not worried that we'll reach that much storage before someone figures out btrfs2 or whatever.
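The rough arithmetic behind those 64-bit limits can be checked quickly (a sketch; TB here means a decimal terabyte, 10^12 bytes):

```python
# Rough arithmetic behind the 64-bit addressing limits discussed above.
TB = 10**12  # decimal terabyte

max_file_bytes = 2**64            # 64-bit byte offsets within a file
print(max_file_bytes / TB)        # ~18.4 million TB per file

block = 4096                      # 4 KiB blocks
max_volume_bytes = 2**64 * block  # 64-bit block addresses
print(max_volume_bytes / TB)      # ~75.6 billion TB per volume
```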

Well, in the past there were several space_cache-related issues, especially on SSDs in combination with discard enabled. The space cache keeps track of the used and unused space on a drive for faster reads. This cache is loaded into RAM if enabled. This has side effects (in general) if the system uses TLP features to save power (e.g. on notebooks). TLP's harddrive-related settings are not recommended in combination with BTRFS.

In the past I had several issues with huge drives that filled up very quickly just from copying files onto them. The drives became slow, stopped responding, or showed no free space after a while. V2 addresses these issues.

Me too. I just mentioned it in case someone here is running a datacenter. :smiley:

Good to know. Is v2 considered stable? Can you easily convert a filesystem? And why is it not the default?

I was about to copy and paste from the wiki anyway:

(nospace_cache since: 3.2, space_cache=v1 and space_cache=v2 since 4.5, default: space_cache=v1)

Options to control the free space cache. The free space cache greatly improves performance when reading block group free space into memory. However, managing the space cache consumes some resources, including a small amount of disk space.

There are two implementations of the free space cache. The original one, referred to as v1, is the safe default. The v1 space cache can be disabled at mount time with nospace_cache without clearing.

On very large filesystems (many terabytes) and certain workloads, the performance of the v1 space cache may degrade drastically. The v2 implementation, which adds a new B-tree called the free space tree, addresses this issue. Once enabled, the v2 space cache will always be used and cannot be disabled unless it is cleared. Use clear_cache,space_cache=v1 or clear_cache,nospace_cache to do so. If v2 is enabled, kernels without v2 support will only be able to mount the filesystem in read-only mode. The btrfs(8) command currently only has read-only support for v2. A read-write command may be run on a v2 filesystem by clearing the cache, running the command, and then remounting with space_cache=v2.

If a version is not explicitly specified, the default implementation will be chosen, which is v1.

This sounds like you can't take a snapshot without clearing the cache and remounting etc?

This is outdated (last edited January 2020). I cannot recall which version of btrfs-progs you would need. I just checked my Arch Linux; it's running v5.9.3 and snapshotting is working. :wink:

EDIT: But you have to clear the cache anyway before switching to v2.

EDIT2: After reading and rereading, I'm pretty sure this is outdated or just not written clearly. I think this applies only to the space_cache management itself. For me, snapshotting has been working since I switched. But I did that a few months ago, so not directly after it was introduced.

OpenZFS is still way better than BTRFS when it comes to RAID, except possibly mirroring. :wink:

Probably, yes. But as I don't use BTRFS in a "large"-scale raid, I cannot compare. I just have three drives in a small server running a BTRFS raid. ZFS runs on the "big" server only.

Space cache v2 is not only fully stable, it's the first thing a BTRFS dev is going to ask you to try out if you have any performance issues on very large BTRFS filesystems (several terabytes).

Snapshots, dedup, compression, send/receive, and RAID all work fine with it.
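As an illustration of the snapshot plus send/receive workflow on such a filesystem (the paths and snapshot name below are hypothetical):

```shell
# Assumes /mnt/data is a btrfs mount and /mnt/backup is another
# btrfs filesystem to replicate to.

# Create a read-only snapshot (send requires read-only snapshots):
btrfs subvolume snapshot -r /mnt/data /mnt/data/.snap-today

# Replicate the snapshot to the other filesystem:
btrfs send /mnt/data/.snap-today | btrfs receive /mnt/backup
```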


And how do you enable it? Just mount with space_cache=v2?

Unmount, then mount it with clear_cache,space_cache=v2. You should see a few lines in your log about the space cache being cleared and a flag for v2 being set.

After that, you don't even need space_cache=v2 in your mount options, it's implied by the incompat flag. If you want to go back, repeat the same process but specify space_cache=v1.
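Put together, the switch described above looks roughly like this (device and mountpoint are hypothetical; adjust for your system):

```shell
# One-time migration to the v2 space cache (free space tree):
umount /mnt/storage
mount -o clear_cache,space_cache=v2 /dev/sdb /mnt/storage

# The kernel log should mention the old cache being cleared and the
# free space tree being enabled:
dmesg | grep -i btrfs | tail

# The persistent flag can be seen in the superblock afterwards:
btrfs inspect-internal dump-super /dev/sdb | grep -i free_space
```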

from man 5 btrfs:

  space_cache, space_cache=version, nospace_cache
       (nospace_cache since: 3.2, space_cache=v1 and space_cache=v2 since 4.5, default: space_cache=v1)

       Options to control the free space cache. The free space cache greatly improves performance when reading
       block group free space into memory. However, managing the space cache consumes some resources, including a
       small amount of disk space.

       There are two implementations of the free space cache. The original one, referred to as v1, is the safe
       default. The v1 space cache can be disabled at mount time with nospace_cache without clearing.

       On very large filesystems (many terabytes) and certain workloads, the performance of the v1 space cache
       may degrade drastically. The v2 implementation, which adds a new B-tree called the free space tree,
       addresses this issue. Once enabled, the v2 space cache will always be used and cannot be disabled unless
       it is cleared. Use clear_cache,space_cache=v1 or clear_cache,nospace_cache to do so. If v2 is enabled,
       kernels without v2 support will only be able to mount the filesystem in read-only mode. The btrfs(8)
       command currently only has read-only support for v2. A read-write command may be run on a v2 filesystem by
       clearing the cache, running the command, and then remounting with space_cache=v2.

Perfect. I was able to enable it without unmounting via:

mount -o remount,clear_cache,space_cache=v2 ...

Do you know more about this? Is there a limit on how many files/folders it can handle? I'm just asking out of interest.

ZFS's "RAM usage" is just a cache whose size you can increase or reduce by changing the default settings, and it gives ZFS ridiculously better random read/write performance than btrfs. You can think of it as an integrated bcache setup.
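For reference, that cache is the ARC, and its ceiling is tunable on ZFS on Linux via the zfs_arc_max module parameter (the 4 GiB figure below is just an example):

```shell
# Cap the ARC at 4 GiB at runtime (takes effect immediately):
echo $((4 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max

# Make the cap persistent across reboots:
echo "options zfs zfs_arc_max=4294967296" >> /etc/modprobe.d/zfs.conf

# Observe current ARC usage (the "size" counter, in bytes):
awk '/^size/ {print $3}' /proc/spl/kstat/zfs/arcstats
```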

On the same hardware (Threadripper, 20 SAS drives in a RAID10, created with either btrfs or ZFS), I can run something like 4 different VMs with their virtual disks on a ZFS pool without much lag, while with btrfs even a single VM chokes.

That said, btrfs for a NAS or even a normal PC is 100% fine and stable as long as you don't use RAID5/6, and I also agree on space cache v2.

Although I've never used it on OpenWrt, so I can't vouch for the older kernel version used in the stable branch.

Yep, you can even increase this with an extra SSD cache drive. :smiley:
As I said already it depends on usage scenario.
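On ZFS that extra SSD cache is an L2ARC device, added with zpool add (pool and device names below are hypothetical):

```shell
# Add an SSD as an L2ARC read cache for the pool "tank":
zpool add tank cache /dev/nvme0n1

# Or add it as a SLOG (separate intent log) to speed up
# synchronous writes instead:
zpool add tank log /dev/nvme1n1

# The device then shows up under a "cache" / "logs" section:
zpool status tank
```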

I never touched my setup in terms of reformatting with BTRFS to compare, so this is very interesting to me. Besides, mine is not as huge as yours, and older.

But as I said, it all depends on usage. Most people just want a fileserver with (maybe) a bit of iSCSI. Most don't run a DB or a VMware ESXi server.

While searching for a comparison of BTRFS vs. ZFS, I found only this:

https://www.diva-portal.org/smash/get/diva2:822493/FULLTEXT01.pdf

This document is widely circulated on the web and has (IMO) certain issues in terms of usage scenarios. But it concludes that BTRFS is better. Sadly, there is nothing else available at this level.

But I still agree that ZFS is the better FS compared to BTRFS. I'm still impressed by the easy (raid) management ZFS introduced to the computing world, especially the feature where you can replace your drives with bigger ones step by step. And of course snapshotting. :slight_smile:

For this kind of usage, if you are running VMs off a disk image, you need to set that disk image to nodatacow.
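Since the no-copy-on-write attribute only takes effect on newly created files, the usual approach is to set it on the directory before creating the image (paths below are hypothetical):

```shell
# Mark the directory so new files inherit the C (nodatacow) attribute:
mkdir -p /mnt/storage/vm-images
chattr +C /mnt/storage/vm-images

# Images created afterwards inherit it:
truncate -s 50G /mnt/storage/vm-images/guest.img
lsattr /mnt/storage/vm-images/guest.img   # shows the 'C' flag
```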