Should OpenWrt work with non-ASCII filenames on FAT volumes by default?

There is an option in the kernel config, namely

CONFIG_FAT_DEFAULT_UTF8=y

This option makes the kernel mount vfat with utf8=1 by default. You can still pass utf8=0 at mount time if you don't want that.
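To check what a given mount actually ended up with, something like the sketch below could be used on the options field of a vfat line from /proc/mounts. The helper name vfat_utf8_enabled is made up for illustration. The accepted spellings (bare utf8, utf8=1, utf8=0 and so on) are my assumption about how the option is written, so double-check against your kernel.

```python
def vfat_utf8_enabled(options: str) -> bool:
    """Return True if a comma-separated mount option string turns utf8 on.

    Hypothetical helper: assumes the option appears either as a bare
    'utf8' flag or as 'utf8=<value>', and that the last occurrence wins.
    """
    enabled = False
    for opt in options.split(","):
        key, _, value = opt.partition("=")
        if key == "utf8":
            # A bare 'utf8' has an empty value and counts as enabled.
            enabled = value not in ("0", "no", "false")
    return enabled


print(vfat_utf8_enabled("rw,relatime,iocharset=utf8,utf8"))
print(vfat_utf8_enabled("rw,relatime,iocharset=iso8859-1"))
```

Note that iocharset=utf8 on its own is deliberately not treated as utf8=1 here, since the two options behave differently.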

A FAT filesystem with long file names, also called vfat, stores filenames internally as UTF-16, while SSH and web clients expect UTF-8 most of the time, since that is the default in all modern UNIX-like OSes. If the utf8 option is set when the volume is mounted, filenames are translated from UTF-16 to UTF-8. This translation is lossless, so filenames are displayed correctly when you run ls in a shell or browse them via a file share client.

However, if the utf8 option is disabled and iocharset is set to anything other than utf8, such as iso8859-1, non-ASCII filenames will be mangled, because the UTF-16 stored on the volume is translated into a non-Unicode encoding.
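The difference between the two cases can be imitated in user space. This is only a rough sketch using Python codecs, not what the kernel code does: the sample filename is made up, and replacing unrepresentable characters with "?" is an assumption about how the mangling looks.

```python
# Made-up filename mixing Cyrillic, one Latin-1 character and ASCII.
name = "Привет_café.txt"

# Roughly how vfat keeps long file names on disk.
on_disk = name.encode("utf-16-le")

# utf8=1 case: UTF-16 -> UTF-8 keeps every character.
as_utf8 = on_disk.decode("utf-16-le").encode("utf-8")
roundtrip = as_utf8.decode("utf-8")

# iocharset=iso8859-1 case: characters outside Latin-1 cannot be
# represented; errors="replace" substitutes "?" for each of them.
as_latin1 = on_disk.decode("utf-16-le").encode("iso8859-1", errors="replace")
mangled = as_latin1.decode("iso8859-1")

print(roundtrip == name)  # True
print(mangled)            # ??????_café.txt
```

The "é" survives only because it happens to exist in ISO 8859-1. The Cyrillic part is lost, and the original name cannot be recovered from the mangled one.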

This option is not set by default in the mainline kernel for legacy reasons, but those do not apply to OpenWrt. So maybe it makes sense to add this option to the kernel configs for all platforms that have devices with USB support?

It also makes sense to set the default iocharset to utf8. That won't change how vfat is handled when utf8=1, but it should allow OpenWrt to get rid of the kmod-nls-iso8859-1 dependency.

Also, in order to manipulate files with non-ASCII filenames from the shell, CONFIG_BUSYBOX_CONFIG_UNICODE_SUPPORT would be needed. Without it, the names are displayed, but working with them is tricky because typed non-ASCII characters are handled incorrectly.

P.S.

  1. If iocharset=utf8 but utf8=0, Unicode in filenames will work, but it requires kmod-nls-cp437 (unless that default is changed), and file names will be case sensitive.
  2. If mount -t msdos is used instead of -t vfat, file names are limited to the 8.3 format and codepage has to be set correctly for the country the filesystem comes from, while iocharset still has to be utf8.
  3. This option changes the vmlinuz size by 0.2 KiB or so, which means the total device firmware size won't change at all most of the time.