PPPoE disconnects every few hours

The patch only adds two lines to the pppoe kernel module, so you can build just that module and load it into the running OpenWrt system:

  • stop ppp (it uses the kernel module when connected): ifdown wan
  • rmmod pppoe
  • insmod /root/new_module/pppoe.ko
  • ifup wan
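
The steps above could be wrapped in a small script. This is only a sketch: the run helper and the DRY_RUN and NEW_MODULE variables are names made up for illustration, and the module path is the one from the list. With DRY_RUN=1 (the default here) it only prints what it would do, so you can check the order before running it for real.

```shell
#!/bin/sh
# Sketch of the module swap described above.
# NEW_MODULE and DRY_RUN are illustrative names, not part of OpenWrt.
NEW_MODULE="${NEW_MODULE:-/root/new_module/pppoe.ko}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

swap_pppoe_module() {
    run ifdown wan || return 1             # stop ppp so the module is no longer in use
    run rmmod pppoe || return 1            # unload the stock module
    run insmod "$NEW_MODULE" || return 1   # load the patched build by full path
    run ifup wan                           # bring the connection back up
}

swap_pppoe_module
```

Set DRY_RUN=0 on the router once the printed commands look right. Running it over an SSH session that goes through the WAN link will cut that session at the ifdown step.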

To be sure, you can use the verbose patch ("patch with debug output"). It prints a message when the module is loaded (or first used, I'm not sure which), so you can see that the patched module is in use. Sometimes insmod loads the original module from the /lib/modules/ folder instead; I'm not sure why.

Also note that you must use the stripped module file, not the one in the build folder! The one in the build folder might crash your router.
To find the folder with the "correct" version, compare the md5sum of unchanged modules in the build tree with the installed ones.
(Sorry if you are an OpenWrt build expert; this confused me on my first tries.)
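
The md5sum comparison could look roughly like this. It is a sketch, not the official way: find_matching_modules is a made-up helper, and it assumes you have copied one unchanged module from the router's /lib/modules/ folder to the build machine. It lists every file with the same name under the build tree whose checksum matches, which points at the folder holding the stripped modules.

```shell
#!/bin/sh
# find_matching_modules REF_FILE BUILD_DIR
# REF_FILE:  copy of an unchanged module as installed on the router
# BUILD_DIR: root of the build tree to search (path is up to you)
find_matching_modules() {
    ref_file="$1"
    build_dir="$2"
    name=$(basename "$ref_file")
    ref_sum=$(md5sum "$ref_file" | cut -d' ' -f1)

    # Check every same-named file in the build tree against the reference.
    find "$build_dir" -type f -name "$name" | while read -r candidate; do
        sum=$(md5sum "$candidate" | cut -d' ' -f1)
        if [ "$sum" = "$ref_sum" ]; then
            echo "$candidate"
        fi
    done
}
```

Take the patched pppoe.ko from the same folder as the match it prints.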

PS: the debug version also writes a log message when a PADT is received, noting the destination address, so you can see if "wrong" packets arrive on your connection. (If they don't, the reason for your problem might be something else.)

The fix is in the queue for the Linux kernel: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git

See:
5.4-stable patches https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=8c2ea82ce4afa19c35557dcc1875663843c9a24d
5.6-stable patches https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=193c9be96f7a387420631158675bc557892cb4c3
4.19-stable patches https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=bdaf671f9a65fedbc724704f4a3083fa1169de64

And it was released 3 days ago (on 2020-05-20) in Linux kernel versions 5.6.14, 5.4.42 and 4.19.124.
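
To check a kernel version against those releases, something like the following might do. It is a sketch: version_ge and has_padt_fix are made-up helper names, and it only knows the three stable series listed in this post. A vendor or self-patched kernel can carry the fix under an older version number, so a "no" here is not proof.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A >= B (first three numeric fields).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# has_padt_fix VERSION
# Returns 0 if VERSION is at or past the fixed release of its series,
# 1 if it is older, 2 if the series is not one of the three listed above.
has_padt_fix() {
    ver="${1%%-*}"   # drop any "-suffix" after the numeric part
    case "$ver" in
        5.6.*)  version_ge "$ver" 5.6.14 ;;
        5.4.*)  version_ge "$ver" 5.4.42 ;;
        4.19.*) version_ge "$ver" 4.19.124 ;;
        *)      return 2 ;;
    esac
}
```

On the router you could call it as: has_padt_fix "$(uname -r)" && echo "fix should be included"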

So far, the couple of lost PPPoE sessions I've had since installing the patch in my firmware don't appear to be this issue :frowning: but rather no response to the maximum number of configured pings (it was set to 3 pings at 5s intervals; I have changed this to 7 pings at 3s intervals to see what happens...).
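
For reference, the two ping settings translate into pppd's lcp-echo-failure and lcp-echo-interval options, and the time before the link is given up is roughly their product. The helper below is a sketch with a made-up name (keepalive_summary). I am assuming the usual OpenWrt way of setting this is a keepalive option on the wan interface in /etc/config/network, written as "failures interval"; check your own config before relying on that.

```shell
#!/bin/sh
# keepalive_summary FAILURES INTERVAL
# Prints the matching pppd options and the approximate time without
# echo replies before pppd treats the peer as dead.
keepalive_summary() {
    failures="$1"
    interval="$2"
    echo "lcp-echo-failure $failures lcp-echo-interval $interval"
    echo "link declared dead after about $((failures * interval))s of silence"
}

keepalive_summary 3 5   # previous setting
keepalive_summary 7 3   # new setting

# Assumed OpenWrt equivalent of the new setting (not taken from this thread):
#   uci set network.wan.keepalive='7 3'
#   uci commit network
#   ifup wan
```

So the new setting probes more often and also tolerates a slightly longer outage before dropping the session.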

But I'm glad to see your patch make it into the Linux kernel anyway! Thanks and well done!

Not for 4.14: is the issue not present in that kernel?

As far as I can tell, the issue is present in every kernel since PPPoE support was added until @xerces8's patch was accepted. I would suggest whoever's maintaining the 4.14 LTS kernel has elected not to incorporate it at this time. @xerces8's patch can easily be applied to the 4.14 kernel code (which is what I did) if you need the fix.

Someone opened a bug about this issue for OpenWRT: https://bugs.openwrt.org/index.php?do=details&task_id=3149

As 19.07.3 was released recently, I would need to repatch/rebuild if I update.

Is there anything I can do to help this patch land in OpenWrt sooner?

Attaching a backport of the accepted kernel patch to the bug report is probably the only thing you can do that might see action.

I asked the kernel maintainer about including the patch in the 4.14 kernel.

See this email thread: https://lkml.org/lkml/2020/6/4/860

Maybe someone from here can answer the question in that message and test the patch on 4.14 (I did, but didn't do a clean job).

I can confirm that it applies and builds cleanly in 4.14.

It looks like it's on the way:

Hm, indeed it looks ok for 4.14 (and 4.9, 4.4) so I've queued it up,
thank you.

--
Thanks,
Sasha (Levin, the second maintainer of the 4.14 kernel)

Now just wait for the next openwrt release...

Good to see you were able to encourage some more interest in the backport!

[PATCH 4.14 06/46] pppoe: only process PADT targeted at local interfaces

The patch landed in kernel 4.14 (probably 4.14.184).

(Also in 4.4 and 4.9.)

Just wanted to thank you @xerces8 for your overall work on this! Good job!

Is anyone having this issue with the latest snapshots?

It's unlikely you're experiencing this exact issue in the latest snapshots because it was patched.

I am experiencing a different issue with PPPoE disconnects here. Another user seems to be experiencing something similar here. If you enable debug logging in /etc/ppp/options you can get more info about your disconnects.
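
Enabling the debug logging could be done along these lines. This is a sketch: enable_ppp_debug is a made-up helper that appends pppd's debug option to the options file unless an uncommented debug line is already there. The file path defaults to the one mentioned above.

```shell
#!/bin/sh
# enable_ppp_debug [FILE]
# Appends "debug" to the pppd options file if no such line exists yet.
# A commented-out "#debug" line does not count and is left alone.
enable_ppp_debug() {
    options_file="${1:-/etc/ppp/options}"
    if grep -qx 'debug' "$options_file" 2>/dev/null; then
        echo "debug already enabled in $options_file"
    else
        echo 'debug' >> "$options_file"
        echo "added debug to $options_file"
    fi
}

# Afterwards, restart the connection and watch the log, for example:
#   ifdown wan && ifup wan
#   logread | grep pppd
```

Remove the line again once you have what you need, since the debug output is fairly chatty.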

I've been struggling with PPPoE disconnections too.
I'm running the latest version of OpenWrt on an Archer C60 V1 router. I have FTTH service with a provided TP-Link GPON.

It was disconnecting every few hours. The system log shows "Sent PADT" and so on...

I've searched and read everything about it in the forum.

My router and PPPoE are up now for 2d17h... I'm still observing, but the problem seems to be with SQM QoS.

I disabled all ipv6 services in the router and lan. PPPoE kept disconnecting every now and then.

Then I disabled all the SQM QoS services and connection is up for more than 2 days.

I'll wait 5 days, then I'll put ipv6 back and check how it goes.

Question, on which interface did you instantiate SQM, ethN(.N) or pppoe-wan?

Hello, thank you for your feedback.

SQM was set on pppoe-wan, and I also tried it on ETH1-WAN; in both cases the connection went down and the log showed the same "Sent PADT".
I'm pretty sure it is the Link Layer Adaptation.

I paid more attention: the default configuration has Link Layer set to none. Mine was set to ATM.

I have SQM enabled again. The PPPoE connection didn't disconnect when applying the changes. Let's keep observing.
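
For anyone who wants to check their own Link Layer setting from the command line, here is a rough sketch. sqm_linklayers is a made-up helper that prints the linklayer value of every queue in an SQM config file. I am assuming the config lives in /etc/config/sqm and uses an option called linklayer, as sqm-scripts normally does.

```shell
#!/bin/sh
# sqm_linklayers [FILE]
# Prints the value of each "option linklayer" line, quotes removed.
sqm_linklayers() {
    config_file="${1:-/etc/config/sqm}"
    awk '$1 == "option" && $2 == "linklayer" { print $3 }' "$config_file" \
        | tr -d "'\""
}

# Assumed way to switch the first queue back to "none" (not from this thread):
#   uci set sqm.@queue[0].linklayer='none'
#   uci commit sqm
#   /etc/init.d/sqm restart
```

If it prints atm on a fibre line, that matches the situation described above.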

Mmmh, with SQM on pppoe-wan it will not even see the LCP messages, making it hard (but apparently not impossible) to affect the PPPoE state machine. Are you 100% sure that the link never shows these disconnects without SQM or are they "just" considerably rarer without SQM?