Sudden NFS failure on 18.06.2 after package upgrade

Hello,

Using OpenWrt 18.06.2 r7676-cddd7b4c77 on a Zyxel NBG6716. This morning I tried to upgrade the system's packages using opkg update and opkg upgrade. I received some error message about NFS, but did not read it. Sorry.

Now when I try to mount from any nfs client, I get a timeout on the client. In OpenWrt's system log, I see the following upon mounting an export:

Thu May  2 15:31:51 2019 kern.info kernel: [ 1363.325165] do_page_fault(): sending SIGSEGV to rpc.mountd for invalid write access to 00000000
Thu May  2 15:31:51 2019 kern.info kernel: [ 1363.334264] epc = 776f22f4 in libc.so[776ca000+92000]
Thu May  2 15:31:51 2019 kern.info kernel: [ 1363.339462] ra  = 0040731f in rpc.mountd[400000+12000]
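
A write to address 00000000 points at a NULL pointer being dereferenced. The other two addresses can be turned into offsets inside libc.so and rpc.mountd, which is what a disassembler would need to locate the faulting code. A minimal sketch, assuming the `name[base+size]` notation in the log means the mapping starts at `base`; `fault_offset` is a hypothetical helper name, not something shipped with OpenWrt:

```shell
# Hypothetical helper: offset of a faulting address inside its mapping.
# Usage: fault_offset <address> <mapping base>, both written with a 0x prefix.
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}
```

With the values from the log, `fault_offset 0x776f22f4 0x776ca000` gives 0x282f4 (inside libc.so) and `fault_offset 0x0040731f 0x400000` gives 0x731f (inside rpc.mountd).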

I have performed a factory reset of the device, even went back to the stock firmware, then cleanly installed OpenWrt again. Portmap is running, nfsd is running... I have no clue where to investigate.

Is the nfs-kernel-server package being updated? What can go wrong during an upgrade of that package?

Kind regards,
FWieP

You should not upgrade packages as this is known to be problematic.

Several posts are talking about that.

OpenWrt's opkg system does not have the notion of ABI versioning (yet). What likely happened is that one of the libraries your NFS package(s) are linked against changed its ABI.

If you want to upgrade your packages, the "only" route is to build a new image using the image builder (from a self-consistent set of packages), use a snapshot build and get all the packages that day, or build your own image from source using the build system.
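
For the image builder route, the call looks roughly like the following. This is a sketch under assumptions: it is meant to be run from the unpacked Image Builder directory for the matching target, `PROFILE_NAME` is a placeholder (`make info` lists the real profile names), and the `run` wrapper with `DRY_RUN` is a hypothetical convenience so the command can be printed before it is executed.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Build an image with extra packages baked in.
# $1 = profile name (placeholder, see `make info`), rest = package names.
build_image() {
    profile=$1; shift
    run make image PROFILE="$profile" PACKAGES="$*"
}
```

For example, `DRY_RUN=1 build_image PROFILE_NAME nfs-kernel-server` only prints the `make image` line. Without `DRY_RUN` the image is built with all packages taken from one consistent set.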

Thank you for the info about the correct way of building with up-to-date packages.

What I still don't understand: I installed OpenWrt clean, then installed nfs-kernel-server and portmap using opkg. I think there shouldn't be any version conflicts when installed like this. Or am I wrong?

As of this moment, these packages are installed when following the instructions in the Wiki:

kmod-fs-nfs 4.9.152-1
kmod-fs-nfs-common 4.9.152-1
kmod-fs-nfs-common-rpcsec 4.9.152-1
kmod-fs-nfsd 4.9.152-1
nfs-kernel-server 2.3.3-2

After the clean install and the installation of the packages, I uploaded the old config backup with (among other settings) my exports. Can there be any harm in that?

Thanks again,
FWieP

Hello again,

I performed a complete opkg remove of all nfs-related packages, then rebooted. Then I reinstalled nfs-kernel-server using LuCI. This is what shows up in the console window:

Installing nfs-kernel-server (2.3.3-2) to root...
Downloading http://downloads.openwrt.org/releases/18.06.2/packages/mips_24kc/packages/nfs-kernel-server_2.3.3-2_mips_24kc.ipk
Installing kmod-fs-nfs-common (4.9.152-1) to root...
Downloading http://downloads.openwrt.org/releases/18.06.2/targets/ar71xx/nand/packages/kmod-fs-nfs-common_4.9.152-1_mips_24kc.ipk
Installing kmod-fs-nfs-common-rpcsec (4.9.152-1) to root...
Downloading http://downloads.openwrt.org/releases/18.06.2/targets/ar71xx/nand/packages/kmod-fs-nfs-common-rpcsec_4.9.152-1_mips_24kc.ipk
Installing kmod-fs-nfsd (4.9.152-1) to root...
Downloading http://downloads.openwrt.org/releases/18.06.2/targets/ar71xx/nand/packages/kmod-fs-nfsd_4.9.152-1_mips_24kc.ipk
Installing kmod-fs-nfs (4.9.152-1) to root...
Downloading http://downloads.openwrt.org/releases/18.06.2/targets/ar71xx/nand/packages/kmod-fs-nfs_4.9.152-1_mips_24kc.ipk
Configuring kmod-fs-nfs-common.
Configuring kmod-fs-nfs-common-rpcsec.
Configuring kmod-fs-nfsd.
Configuring kmod-fs-nfs.
Configuring nfs-kernel-server.

Collected errors:
 * resolve_conffiles: Existing conffile /etc/exports is different from the conffile in the new package. The new conffile will be placed at /etc/exports-opkg.

The error on the last line is understandable, nothing to worry about. The first package seems to be coming from mips_24kc, not ar71xx/nand. Is this correct?

My dmesg shows the following after the install:

[696.242861] NFSD: the nfsdcld client tracking upcall will be removed in 3.10. Please transition to using nfsdcltrack.
[696.253747] NFSD: starting 90-second grace period (net 80488620)
[816.530323] NFSD: Unable to end grace period: -145

And this is the error when trying to mount from a client:

[ 1549.967638] NFSD: Unable to create client record on stable storage: -145
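
The -145 in these messages is a negated kernel errno. The numbers are architecture-specific. Assuming the MIPS numbering that applies to this ar71xx device, 145 is ETIMEDOUT, which suggests the server's upcall to its client-tracking helper went unanswered. A small lookup sketch; `mips_errno` is a hypothetical helper and covers only a handful of values:

```shell
# Hypothetical helper: name a few errno values using the MIPS numbering.
# Accepts the value with or without the leading minus sign.
mips_errno() {
    case ${1#-} in
        2)   echo ENOENT ;;
        13)  echo EACCES ;;
        145) echo ETIMEDOUT ;;
        146) echo ECONNREFUSED ;;
        *)   echo "unknown ($1)" ;;
    esac
}
```

Here `mips_errno -145` prints ETIMEDOUT.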

Can anyone help me resolve this issue?

Thanks again,
FWieP

Packages that don't involve the kernel typically come from the architecture-specific archives (here mips_24kc), as opposed to the kernel-specific ones (here ar71xx/nand), so that is expected.
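
The split is visible in the download URLs of the install log. A minimal sketch that classifies a URL by its path, assuming the two URL shapes seen in that log are representative; `feed_of` is a hypothetical helper name:

```shell
# Classify an opkg download URL by the feed it comes from.
# The patterns follow the two URL shapes in the install log.
feed_of() {
    case $1 in
        */targets/*/packages/*) echo "target (kernel-specific)" ;;
        */packages/*/*)         echo "architecture" ;;
        *)                      echo "unknown" ;;
    esac
}
```

Fed the nfs-kernel-server URL it prints "architecture". Fed any of the kmod-fs-* URLs it prints "target (kernel-specific)".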

As far as the problem with the seemingly running NFSD goes, that is a package I haven't run under OpenWrt.

I think the problem is that we recently backported rpcbind to the 18.06 branch, so nfs-kernel-server now needs rpcbind instead of portmap for NFSv2/v3 connections. I would advise using NFSv4 if possible, since you don't need rpcbind/portmap for it.

Thank you for the swift response.

How do I force the NFS-server to use version 4? I know of "vers=4" and "-t nfs4" in /etc/fstab on the client-side, but how do I configure the server?

Thanks again,
FWieP

OK, this seems easier said than done :P

Check the NFSv4-only server sections:
https://mockmoon-cybernetics.ch/computer/linux/nfs.html
https://wiki.debian.org/NFSServerSetup
https://bbs.archlinux.org/viewtopic.php?id=193629

The problem is I have no clue which path is correct for the OpenWrt nfs-server .conf file, so I don't know where to put NFSD_OPTS="-N 2 -N 3". On other distros you can check the built-in path via the man pages, but we don't include those.
The Arch link notes that /run/sysconfig/nfs-utils should reflect the options once you find the correct file/location.
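
Wherever the options end up, their effect can be checked on the server itself. On Linux the nfsd filesystem exposes a `versions` file that lists each protocol version with a `+` (enabled) or `-` (disabled) prefix. A sketch, assuming that filesystem is mounted at /proc/fs/nfsd as on most distributions (not verified on OpenWrt here); `enabled_nfs_versions` is a hypothetical helper name:

```shell
# Print the NFS versions the kernel server currently has enabled.
# $1 = path to the versions file (the default assumes the usual mount point).
enabled_nfs_versions() {
    tr ' ' '\n' < "${1:-/proc/fs/nfsd/versions}" | sed -n 's/^+//p'
}
```

If rpc.nfsd really was started with `-N 2 -N 3`, only the 4.x entries should be printed.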

Maybe someone else can help with this elusive nfs server .conf file location on OpenWrt.

There is a bug in 18.06.2: some of the packages necessary to run an NFS server are missing for some platforms. If you need to run an NFS server, revert to 18.06.1 or use a trunk build until the next release.

Thank you for this answer!

I will wait patiently for the next stable release.
Meanwhile, I'll fall back to using Samba.

Thanks!
FWieP

Hi
I also had this problem after a clean system upgrade (image 'flash') of my massive extroot setup on a TL WR1043NDv2.1, which previously ran LEDE 17.0.1. After installing nfs-kernel-server I couldn't connect to the NFS server; it always timed out.

I fixed the problem by removing the installed nfs-kernel-server (with --autoremove as well) and manually installing the nfs-kernel-server version from the 17.xx.x releases. This installs all the dependencies as well (from 18.06.2).
It worked right away with no problems.

You can also try compiling your own nfs package from snapshots against 18.06.2 and see if this works. I created my package-builder for scenarios like this.

We? The maintainer of nfs-kernel-server did that! I even told him many times that it isn't a good idea. Also, nfs-kernel-server is in no way tested on OpenWrt 18.06.2. This isn't good at all.

If you want to use "we", then I or somebody else would appreciate it if you sent pull requests instead of committing to the packages repository directly. Cross-reference for pull requests in the OpenWrt packages repository, which also continues in the issue.

I don't agree that it seems easier said than done, even though I agree that NFSv4 would be really useful to have in the stable release of OpenWrt. I was looking into it. More details here.

This depends on other pull requests, which were merged to master.

Wow, why so hostile? I used "we" in a loose sense, meaning the maintainers, since I could not remember the full name of the person who actually did this.

No idea what you are talking about, I only use PRs. I don't have write access, so I can't commit directly.

PS: Just to make sure there was no update problem, does your /etc/services have rpcbind in it?
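
One way to check this, as a sketch: it assumes the usual services file layout of `name port/proto aliases...` with `#` comments, and `has_service` is a hypothetical helper name.

```shell
# Succeeds if the services file has an entry whose name or alias is $1.
# $2 = services file (defaults to /etc/services).
has_service() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments
        { for (i = 1; i <= NF; i++) if (i != 2 && $i == name) found = 1 }
        END { exit !found }
    ' "${2:-/etc/services}"
}
```

On the router, `has_service rpcbind && echo present` prints "present" only if rpcbind is listed as a service name or alias.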
