Calling on all 4G/QMI Experts! wan IP refresh issue

Recently I have had the joy of setting up OpenWrt on my shiny new 4G modem, namely a Zyxel NR7101, to replace my old Huawei B818-263 operating in bridge mode.

An old problem that I had with my B818-263 was that every 48 hours my ISP would require a wan IP refresh of sorts, and the B818-263 would drop the connection to NO-CARRIER, and back when I used an Asus router running Merlin this would result in loss of internet connectivity and so I wrote the following script to detect the NO-CARRIER event and force a wan IP refresh:

#!/bin/sh

renew_wan_lease=0

ip monitor link dev eth0 | while read event; do
	
	# logger "maintain-wan-lease detected eth0 event: "$event

	case $event in

	*'NO-CARRIER'* )
		if [ $renew_wan_lease -eq 0 ]; then
			logger "maintain-wan-lease detected eth0 state change to: 'NO-CARRIER', so forcing udhcpc to release wan lease."
			killall -SIGUSR2 udhcpc
			renew_wan_lease=1		
		fi
	;;

	*'LOWER_UP'* )
		if [ $renew_wan_lease -eq 1 ]; then
			logger "maintain-wan-lease detected eth0 state change from: 'NO-CARRIER' to: 'LOWER_UP', so forcing udhcpc to renew wan lease."
			killall -SIGUSR1 udhcpc
			renew_wan_lease=0
		fi
	;;
	esac

done

Ever since replacing my Asus router with an OpenWrt router this issue went away and I did not need my script for whatever reason and I presumed this related to OpenWrt properly handling the NO-CARRIER event.

Interestingly and annoyingly since getting my Zyxel NR7101 I see that my 48hr loss of internet connectivity issue has come back.

I connect using the 'luci-proto-qmi' utility, which I see @jow and others have had a hand in developing over the years:

And after 48 hours of having been connected I got no connectivity, with log here:

Wed Oct 26 03:16:49 2022 daemon.warn dnsmasq-dhcp[1]: not giving name OpenWrt to the DHCP lease of 192.168.0.2 because the name exists in /tmp/hosts/dhcp.cfg01411c with address 192.168.0.1
Wed Oct 26 03:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 03:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 04:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 04:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 05:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 05:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 06:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 06:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 07:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 07:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 08:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 08:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 09:16:49 2022 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.2 XX:XX:XX:XX:XX
Wed Oct 26 09:16:49 2022 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.2 XX:XX:XX:XX:XX OpenWrt
Wed Oct 26 09:16:49 2022 daemon.warn dnsmasq-dhcp[1]: not giving name OpenWrt to the DHCP lease of 192.168.0.2 because the name exists in /tmp/hosts/dhcp.cfg01411c with address 192.168.0.1
Wed Oct 26 09:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 09:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 10:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 10:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 11:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 11:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 12:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 12:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 13:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 13:16:51 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 14:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
Wed Oct 26 14:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 15:16:49 2022 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.2 XX:XX:XX:XX:XX
Wed Oct 26 15:16:49 2022 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.2 XX:XX:XX:XX:XX OpenWrt
Wed Oct 26 15:16:49 2022 daemon.warn dnsmasq-dhcp[1]: not giving name OpenWrt to the DHCP lease of 192.168.0.2 because the name exists in /tmp/hosts/dhcp.cfg01411c with address 192.168.0.1
Wed Oct 26 15:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888
*** LOST CONNECTIVITY HERE ***
Wed Oct 26 15:19:07 2022 authpriv.info dropbear[17773]: Child connection from 192.168.1.136:50803
Wed Oct 26 15:19:10 2022 authpriv.notice dropbear[17773]: Password auth succeeded for 'root' from 192.168.1.136:50803
Wed Oct 26 15:19:25 2022 user.info : luci: accepted login on / for root from 192.168.1.136
*** MANUAL WAN RESET HERE ***
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan (20169): Stopping network wan
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan_4 (12250): udhcpc: received SIGTERM
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan_4 (12250): udhcpc: unicasting a release of 11.222.33.444 to 55.66.77.888
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending release
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan_4 (12250): udhcpc: entering released state
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan_4 (12250): Command failed: ubus call network.interface notify_proto { "action": 0, "link-up": false, "keep": false, "interface": "wan_4" } (Permission denied)
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Interface 'wan_4' is now down
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Network alias '' link is down
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Interface 'wan_4' has link connectivity loss
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Interface 'wan_4' is disabled
Wed Oct 26 15:26:59 2022 daemon.warn dnsmasq[1]: no servers found in /tmp/resolv.conf.d/resolv.conf.auto, will retry
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan (20169): Command failed: ubus call network.interface notify_proto { "action": 0, "link-up": false, "keep": false, "interface": "wan" } (Permission denied)
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Interface 'wan' is now down
Wed Oct 26 15:26:59 2022 daemon.notice netifd: Interface 'wan' is setting up now
Wed Oct 26 15:26:59 2022 daemon.notice netifd: wan (20209): Waiting for SIM initialization
Wed Oct 26 15:27:00 2022 daemon.notice netifd: wan (20209): Failed to parse message data
Wed Oct 26 15:27:00 2022 daemon.notice netifd: wan (20209): PIN verification is disabled
Wed Oct 26 15:27:00 2022 daemon.notice netifd: wan (20209): Device does not support 802.3 mode. Informing driver of raw-ip only for wwan0 ..
Wed Oct 26 15:27:00 2022 daemon.notice netifd: wan (20209): Waiting for network registration
Wed Oct 26 15:27:01 2022 daemon.notice netifd: wan (20209): Starting network wan
Wed Oct 26 15:27:02 2022 daemon.notice netifd: wan (20209): Setting up wwan0
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Interface 'wan' is now up
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Network device 'wwan0' link is up
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Network alias 'wwan0' link is up
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Interface 'wan_4' is enabled
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Interface 'wan_4' has link connectivity
Wed Oct 26 15:27:02 2022 daemon.notice netifd: Interface 'wan_4' is setting up now
Wed Oct 26 15:27:02 2022 daemon.notice netifd: wan_4 (20369): udhcpc: started, v1.35.0
Wed Oct 26 15:27:02 2022 user.notice firewall: Reloading firewall due to ifup of wan (wwan0)
Wed Oct 26 15:27:02 2022 daemon.notice netifd: wan_4 (20369): udhcpc: broadcasting discover
Wed Oct 26 15:27:03 2022 daemon.notice netifd: wan_4 (20369): udhcpc: broadcasting select for 11.222.33.444, server 55.66.77.888
Wed Oct 26 15:27:03 2022 daemon.notice netifd: wan_4 (20369): udhcpc: lease of 11.222.33.444 obtained from 55.66.77.888, lease time 7200
Wed Oct 26 15:27:03 2022 daemon.notice netifd: Interface 'wan_4' is now up
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: reading /tmp/resolv.conf.d/resolv.conf.auto
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using nameserver 10.206.64.1#53
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using nameserver 10.206.64.1#53
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for test
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for onion
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for localhost
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for local
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for invalid
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for bind
Wed Oct 26 15:27:03 2022 daemon.info dnsmasq[1]: using only locally-known addresses for lan
Wed Oct 26 15:27:04 2022 user.notice firewall: Reloading firewall due to ifup of wan_4 (wwan0)
root@OpenWrt:~#

I got this:

root@OpenWrt:~# uqmi -d /dev/cdc-wdm0 --get-data-status
"disconnected"
root@OpenWrt:~# uqmi -d /dev/cdc-wdm0 --get-current-settings
"Out of call"

Manual wan reset fixed this. But obviously this is not practical for ordinary use.

How should I properly fix this? @patrakov or @trendy maybe this is one for either of you?

It does not look like a WAN IP refresh issue - you bring up the whole PDP context anew.

My recommendation would be: try ModemManager instead of uqmi. It listens for network-initiated disconnections, and can call hotplug scripts based on that. Not sure if it reconnects automatically. I don't use ModemManager myself because of this bug: https://gitlab.freedesktop.org/mobile-broadband/ModemManager/-/issues/87

1 Like

When I had my Asus device this exact 48hr refresh issue was fixed using:

From LOWER_UP -> NO-CARRIER

killall -SIGUSR2 udhcpc

And then from NO-CARRIER -> LOWER_UP

killall -SIGUSR1 udhcpc

The transition LOWER_UP->NO-CARRIER->LOWER_UP happened within about 0.5 seconds.

Doesn't that point to a wan IP issue?

Well yes I wondered about this and notice there is a luci-proto-modemmanager. There isn't much documentation or explanation about this. What's the difference between modemmanager and uqmi?

Is uqmi or perhaps rather luci-proto-qmi not doing something that it should be doing?

Doesn't that point to a wan IP issue?

This was on a different router with a different modem. Next time this happens, please try poking udhcpc, just like you did before. It would be a really valuable data point - something that would let us distinguish between the PDP issue and the DHCP issue.

Regarding uqmi not doing something - well, if it is not a daemon, then it does not listen to the unsolicited reports from the modem, which is the whole point of ModemManager.

1 Like

OK, got it. I suppose I can't somehow trigger this without waiting 48 hours can I?

Also, I noticed this line right before the stall:

Wed Oct 26 15:16:50 2022 daemon.notice netifd: wan_4 (12250): udhcpc: sending renew to server 55.66.77.888

Doesn't that mean that it tried to renew but presumably should have released?

In general shouldn't OpenWrt automatically call release upon NO-CARRIER or does this not make sense in this context?

Like would you expect to see 'ip monitor link dev wwan0' show the NO-CARRIER event on this wan loss of connectivity, and if so what should be responsible for capturing that and issuing the release?

Hi
I have the same issue with one of my service providers, but they disconnect my LTE session every 6th hour.
I have created an uqmi daemon that checks the --get-current-settings every 30s. If the response is "Out of call" it will re-connect the session and update the interface with the new IP-address, without restarting the interface.
The daemon is part of my customized uqmi package.

Interesting. How would this compare to just using 'ModemManager'? Would that handle this situation too? Is the developer of that available on this forum to comment?

Should we not edit 'luci-proto-qmi' or 'uqmi' itself in OpenWrt to handle this situation?

Or perhaps that's really the point of the ModemManager as @patrakov writes above.

I don´t know how ModemManager handles this scenario, I have never used it.
My device are to "small" for ModemManger.

1 Like

What is the intent of the "--set-autoconnect" feature of uqmi?

To my understanding, --set-autoconnect will tell the modem to re-connect of the session is disconnected. But, it will not update the interface with the new IP-address.

@patrakov and @bmork so even with ModemManager installed I see loss of internet connectivity at the 48 hours mark. ModemManager reports in the log:

Mon Oct 31 17:37:46 2022 daemon.info [2719]: <info>  [modem0/bearer1] verbose call end reason (6,36): [3gpp] regular-deactivation
Mon Oct 31 17:37:46 2022 daemon.info [2719]: <info>  [modem0] state changed (connected -> registered)
Mon Oct 31 17:37:46 2022 daemon.info [2719]: <info>  [modem0/bearer1] connection #1 finished: duration 172767s, tx: 1441693405 bytes, rx: 16244930702 bytes
Mon Oct 31 17:38:11 2022 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.0.2 xx:xx:xx:xx:xx:xx
Mon Oct 31 17:38:11 2022 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.0.2 xx:xx:xx:xx:xx:xx OpenWrt
Mon Oct 31 17:38:11 2022 daemon.warn dnsmasq-dhcp[1]: not giving name OpenWrt to the DHCP lease of 192.168.0.2 because the name exists in /tmp/hosts/dhcp.cfg01411c with address 192.168.0.1

192.168.0.2 is my downstream router.

So ModemManager sees the (ISP forced) disconnect but does not bother to reconnect?

Manually restarting wan 'wwan0' in LuCi reconnected the modem.

That seems kina silly - how come default behaviour of ModemManager doesn't involve reconnect?

Is it just that I need to issue 'uqmi -d /dev/cdc-wdm0 --set-autoconnect enabled' with ModemManager disabled and then re-enable ModemManager or something like that? Otherwise how do I tell modem to always reconnect?

Help appreciated here.

Nice. There you have the disconnect reason in clear text.

Yes, silly. Don't know why. Better ask the MM experts

1 Like

Please see these bug links:

In short, please put a script into /etc/ModemManager/connection.d, let it figure out if the disconnect was user-initiated or network-initiated (by asking netifd?), and reconnect network-initiated disconnections.

That's assuming that uqmi -d /dev/cdc-wdm0 --set-autoconnect enabled doesn't help.

1 Like

Just a few cents: I also have to struggle to keep a qmi connection permanently available, for data exchange between openwrt+low level sensors up to AWS-IoT. However, optionally I also have a wan or wifi-wan connection available, and the cell connection is backup only. Thus, I am using mwan3 for switchover, if possible. Because mwan3 pings all the available interfaces, I have a small, always active script ("daemon"?) to periodically check mwan3s staus of the cell connection. In case, mwan3 says "offline", the script does "ifdown wwan", initializes the modem with a hard reset, and then "ifup wwan". If this does not help for some retries, then a reboot is done. Just testing this stuff, but looks promising.
Not so easy to test, though.

1 Like

Do you think the uqmi command '--set-autoconnect enabled' will work in this situation? If so, what's the way to execute it (is there an mmcli counterpart or equivalent) without disrupting ModemManager? And will that survive reboot?

And thanks @patrakov and @reinerotto for your helpful responses.

I don't know. The advice was given based on a (possibly unfounded) assumption that you can temporarily stop ModemManager, and that the setting would survive the reboot. I.e., something worth a shot, but not guaranteed to work.

P.S. I don't have a QMI-based modem.

1 Like

@patrakov do you know the status of the fix you linked in terms of implementation in OpenWrt? It seems @aleksander0m and @jow chatted about this on #openwrtl-devel and a fix has been developed to help pass informaiton to netifd:

Talked to jow in #openwrt-devel about this, and I think I have a clean > way to solve this issue. Any process can notify netifd that the > underlying network is really disconnected, basically like this: > . /lib/netifd/netifd-proto.sh > proto_init_update $ifname 0; proto_send_update > > ifname there is the name of the netifd interface. > > What we could do is setup event scripts in ModemManager to be executed > upon some specific events; e.g. on bearer connection and on bearer > disconnection. ModemManager would call those scripts as appropriate, > e.g. providing additional info as input arguments to the script. So, > when a bearer gets disconnected, MM could call every script in > /etc/ModemManager/bearer-disconnect.d/ passing as arguments the modem > dbus path and the bearer dbus path. > > The openwrt package could provide a custom script in > bearer-disconnect.d that would: > ** find the ifname applicable to the modem object* > ** run the proto update sequence as above* > > What do you think of having something like that? > > We already support custom scripts for when FCC unlock is needed, > having event scripts for bearer connection/disconnection events would > probably be fine also. They could also be used e.g. to setup/cleanup > iptables rules on the specific connected interface and such, something > that is currently not really possible because the user may not know > which is the exact net interface of the modem being connected (even > less when using multiplexing as net interfaces are created on the > fly).

https://lists.freedesktop.org/archives/modemmanager-devel/2022-January/009075.html

but the issue of having netifd reconnect was left hanging here:

Hey, On Tue, Jan 11, 2022 at 11:47 AM Enrico Mioso <mrkiko.rs at gmail.com> wrote: > > With this architecture it will be netifd itself that retries? > Yes. Or at least it could be netifd doing it, if you don't have any other monitoring application. > Is already there a way that provides an indication for netifd to wait some time before retrying? Trying to bring a bearer back up immediately might not be so much productive. > I agree, immediate reconnection attempts are not always desirable. Also, there should be a way to configure the amount of retries so that they're not infinite, or if keep on retrying after some time (e.g. retry 5 times with 20s between retries and if they all fail, wait 10 minutes before restarting the new retry cycle). NetworkManager has all those kind of settings, but truth be told, no idea about netifd.

https://lists.freedesktop.org/archives/modemmanager-devel/2022-January/009078.html

So @jow I'm keen to know what is the status of implementation of:

  • ModemManager communicating to netifd that connection has been lost

  • netifd reacting to loss of connection and reconnecting

I see this commit:

Could that please be pulled into 22.03? And what about having netifd react to loss of connection and reconnecting? Will that already happen?


Background is that in my case on 22.03.2 using ModemManager through luci-proto-modemmanager, owing to my ISP disconnecting me very 48 hours, I see:

Mon Oct 31 17:37:46 2022 daemon.info [2719]: <info>  [modem0/bearer1] verbose call end reason (6,36): [3gpp] regular-deactivation

I lose internet connectivity, netifd doesn't seem to know connection is down and there is no automatic reconnection. Surely this affects quite a lot of users who have to come up with their own DIY reconnection scripts, whereas it seems the proper fix is pretty much all there if not already here.

The modemmanager: report network initiated disconnections to netifd patch handles that. MM now has a dispatcher script that gets called on every connection/disconnection event; and for the disconnection ones, a report to netifd is done notifying that the underlying connection is down.

Unless I'm missing something, the reconnection would be done if the wwan configuration in /etc/config/network is set to automatically connect.

2 Likes

Ah, so it seems like both the issues:

have been solved already with this - fantastic!

Can we get this into 22.03?

Also, regarding:

is there an option to specify 'automatic connection' in /etc/config/network? I have at the moment:

config interface 'wan'
        option proto 'modemmanager'
        option device '/sys/devices/platform/1e1c0000.xhci/usb2/2-1'
        option auth 'pap'
        option iptype 'ipv4v6'
        option apn 'xx
        option username 'yy'
        option password 'zz'

would that count?

@patrakov by any chance would you have the skills to build package for 22.03 and submit pull request with @aleksander0m's patch in it?

It seems like we really need this in 22.03...

Or to build package for 22.03.2? I would try out master on this device but I really don't want to risk needing to open it up, etc.