RAID not working on 18.06.0

Hello,

I just upgraded OpenWrt on my Zyxel NSA325v2 to 18.06.0, r7188-b0b5c64c22, but I am unable to get RAID working. (It worked with the snapshot I used before.) The problem is that mdadm crashes when trying to assemble the array:

# mdadm -v --assemble /dev/md1 --uuid XXX
mdadm: looking for devices for /dev/md1
...
mdadm: /dev/sdb1 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sda1 is identified as a member of /dev/md1, slot 0.
Segmentation fault

Does anyone have an idea what could be wrong?

Thanks

Maybe someone can spot something in the strace output:

...
writev(2, [{iov_base="mdadm: /dev/sdb1 is identified a"..., iov_len=64}, {iov_base=NULL, iov_len=0}], 2mdadm: /dev/sdb1 is identified as a member of /dev/md1, slot 1.
) = 64
open("/dev/sda1", O_RDWR|O_EXCL|O_DIRECT|O_LARGEFILE) = 5
ioctl(5, BLKSSZGET, [512])              = 0
fstat64(5, {st_mode=S_IFBLK|0600, st_rdev=makedev(8, 1), ...}) = 0
ioctl(5, BLKGETSIZE64, [2199023255552]) = 0
_llseek(5, 4096, [4096], SEEK_SET)      = 0
read(5, "\374N+\251\1\0\0\0\1\0\0\0\0\0\0\0\200\376{a<qqX\367\314@B\221v@\202"..., 4096) = 4096
_llseek(5, 0, [8192], SEEK_CUR)         = 0
_llseek(5, 8192, [8192], SEEK_SET)      = 0
read(5, "bitm\4\0\0\0\200\376{a<qqX\367\314@B\221v@\202\375q\0\0\0\0\0\0"..., 512) = 512
_llseek(5, 0, [8704], SEEK_CUR)         = 0
fstat64(5, {st_mode=S_IFBLK|0600, st_rdev=makedev(8, 1), ...}) = 0
close(5)                                = 0
writev(2, [{iov_base="mdadm: /dev/sda1 is identified a"..., iov_len=64}, {iov_base=NULL, iov_len=0}], 2mdadm: /dev/sda1 is identified as a member of /dev/md1, slot 0.
) = 64
open("/dev/sda1", O_RDONLY|O_EXCL|O_DIRECT|O_LARGEFILE) = 5
ioctl(5, BLKSSZGET, [512])              = 0
fstat64(5, {st_mode=S_IFBLK|0600, st_rdev=makedev(8, 1), ...}) = 0
ioctl(5, BLKGETSIZE64, [2199023255552]) = 0
_llseek(5, 4096, [4096], SEEK_SET)      = 0
read(5, "\374N+\251\1\0\0\0\1\0\0\0\0\0\0\0\200\376{a<qqX\367\314@B\221v@\202"..., 4096) = 4096
_llseek(5, 0, [8192], SEEK_CUR)         = 0
_llseek(5, 8192, [8192], SEEK_SET)      = 0
read(5, "bitm\4\0\0\0\200\376{a<qqX\367\314@B\221v@\202\375q\0\0\0\0\0\0"..., 512) = 512
_llseek(5, 0, [8704], SEEK_CUR)         = 0
close(5)                                = 0
ioctl(4, RAID_VERSION, 0xbebf11ac)      = 0
fstat64(4, {st_mode=S_IFBLK|0600, st_rdev=makedev(9, 1), ...}) = 0
readlink("/sys/dev/block/9:1", "../../devices/virtual/block/md1", 199) = 31
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x1521ea4} ---
+++ killed by SIGSEGV +++
Segmentation fault
# ls -la /sys/dev/block/9:1
ls: /sys/dev/block/9:1: No such file or directory
# ls -la /sys/dev/block/*
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/254:0 -> ../../devices/virtual/block/ubiblock0_1
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/31:0 -> ../../devices/platform/mbus@f1000000/f4000000.nand/mtd/mtd0/mtdblock0
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/31:1 -> ../../devices/platform/mbus@f1000000/f4000000.nand/mtd/mtd1/mtdblock1
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/31:2 -> ../../devices/platform/mbus@f1000000/f4000000.nand/mtd/mtd2/mtdblock2
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:0 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:1 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:16 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata2/host1/target1:0:0/1:0:0:0/block/sdb
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:17 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:18 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb2
lrwxrwxrwx    1 root     root             0 Jan  1  1970 /sys/dev/block/8:2 -> ../../devices/platform/ocp@f1000000/f1080000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda2

Strangely, running the mdadm command from GDB gets it working...

# gdb --args mdadm -v --assemble /dev/md1 --uuid XXX
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-openwrt-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from mdadm...(no debugging symbols found)...done.
(gdb) run
Starting program: /sbin/mdadm -v --assemble /dev/md1 --uuid XXX
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
mdadm: looking for devices for /dev/md1
...
mdadm: /dev/sdb1 is identified as a member of /dev/md1, slot 1.
mdadm: /dev/sda1 is identified as a member of /dev/md1, slot 0.
mdadm: added /dev/sdb1 to /dev/md1 as 1
mdadm: added /dev/sda1 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 2 drives.
[Inferior 1 (process 2649) exited normally]
(gdb)
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] 
md1 : active raid1 sda1[0] sdb1[2]
      2147352576 blocks super 1.2 [2/2] [UU]
      bitmap: 0/16 pages [0KB], 65536KB chunk

unused devices: <none>

Self-built or from downloads.openwrt.org?

Hello,

Downloaded from here:

http://downloads.openwrt.org/releases/18.06.0/targets/kirkwood/generic/openwrt-18.06.0-kirkwood-zyxel_nsa325-squashfs-sysupgrade.bin

I downloaded it to the device and upgraded from the command line with sysupgrade -F, because sysupgrade reported that this device is not supported.

Same situation with OpenWrt 18.06.1, r7258-5eb055306f

Is the v2 actually the same as the supported version? If the device is not actually supported, there could be differences that matter here.

Also, are you sure you have all the necessary kernel modules installed? And did you reboot at least once after flashing before trying this? The first boot can be a little squirrely.
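One quick way to check the kernel module question is to look at the Personalities line of /proc/mdstat, which lists the RAID levels the running kernel has drivers loaded for. A minimal sketch, assuming a POSIX shell with grep; the function name has_personality is made up for illustration:

```shell
#!/bin/sh
# Succeed if the given RAID personality (e.g. raid1) is listed in mdstat.
# $1: personality name
# $2: mdstat file to read (optional, defaults to /proc/mdstat)
has_personality() {
    # Keep only the "Personalities : [raid0] [raid1] ..." line, then look
    # for the exact bracketed name so that raid1 does not match [raid10].
    grep '^Personalities' "${2:-/proc/mdstat}" 2>/dev/null | grep -q "\[$1\]"
}
```

For example, `has_personality raid1 || echo "raid1 driver not loaded"` prints the message when the RAID 1 driver is missing. On OpenWrt that driver is normally packaged as kmod-md-raid1, but please verify the package name against the opkg package list for your release.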

Hello,

AFAIK v1 and v2 are the same except for some cosmetic changes. According to the wiki, both versions should be supported.

https://openwrt.org/toh/hwdata/zyxel/zyxel_nsa325

I have rebooted many times. I currently work around the problem by manually assembling the RAID array from within GDB. This is what puzzles me most: why does the command fail in the normal environment but succeed in GDB?

I would like to open an issue describing the problem, but I do not know how to do that. Could anyone point me in the right direction?

The issue is still present on OpenWrt 18.06.4, r7808-ef686b7292

But I found out that when I re-enable address space randomization in GDB (GDB disables it by default) using (gdb) set disable-randomization off, the segmentation fault also happens when the command is executed from GDB.

So I finally found a workaround that makes it work automatically: disabling ASLR (Address Space Layout Randomization) by adding kernel.randomize_va_space=0 to /etc/sysctl.conf.
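To confirm that the setting actually took effect, the current value can be read back from /proc/sys/kernel/randomize_va_space. A minimal sketch, assuming a POSIX shell; the helper names aslr_mode and aslr_current are made up for illustration, and the labels follow the kernel's documented meaning of the values (0 = off, 1 = stack, mmap and shared libraries randomized, 2 = heap randomized as well):

```shell
#!/bin/sh
# Translate the numeric value of kernel.randomize_va_space into a label.
aslr_mode() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "conservative" ;;
        2) echo "full" ;;
        *) echo "unknown" ;;
    esac
}

# Report the current setting.
# $1: file to read (optional, defaults to the standard /proc location)
aslr_current() {
    aslr_mode "$(cat "${1:-/proc/sys/kernel/randomize_va_space}")"
}
```

Running `sysctl -w kernel.randomize_va_space=0` as root should apply the change immediately without a reboot, after which `aslr_current` is expected to print "disabled".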

I submitted an issue here: https://bugs.openwrt.org/index.php?do=details&task_id=2507

Good. It would be helpful to attach the stack trace leading to the crash to that bug report, and to try to reproduce it with the latest master/snapshot images as well. If it works fine with the latest master/snapshot images, you can probably find the fix in the mdadm Git repository and try to backport it to the very old mdadm version 4.0 in 18.06. Or wait for the 19.07 release, which has mdadm version 4.1.
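Since the crash only shows up with randomization enabled, one possible way to capture that stack trace is to run GDB in batch mode with its default ASLR suppression turned off. A rough sketch, assuming gdb is installed on the device; the wrapper name crash_backtrace is made up for illustration:

```shell
#!/bin/sh
# Run a command under GDB with ASLR left enabled, then print a backtrace
# once the program stops (for example on SIGSEGV).
# Usage: crash_backtrace PROGRAM [ARGS...]
crash_backtrace() {
    gdb -batch \
        -ex 'set disable-randomization off' \
        -ex 'run' \
        -ex 'bt' \
        --args "$@"
}
```

It would be called as `crash_backtrace mdadm -v --assemble /dev/md1 --uuid XXX`, with XXX standing for the array UUID as elsewhere in this thread. Because the packaged mdadm binary has no debugging symbols, the backtrace will probably show mostly raw addresses, so it may take a few runs, or a build with symbols, to get something useful.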
