I have been experiencing intermittent disconnects (usually after anywhere from 2 to 150 minutes of uptime) on a PPPoE connection over FTTH.
I enabled debug logging in the pppd options, and it appears that the connection is terminated because the ISP's server sends me an LCP TermReq packet.
Here is an example:
Thu Oct 1 21:37:37 2020 daemon.debug pppd[24186]: rcvd [LCP TermReq id=0x5b] 07 38 00 00 00 00 00 00 01 98 00 00 01 da 2e 9e 3d 7f e2 4f 22 40
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: LCP terminated by peer
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: Connect time 32.8 minutes.
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: Sent 6776715 bytes, received 66813993 bytes.
Thu Oct 1 21:37:37 2020 daemon.notice netifd: Network device 'pppoe-wan' link is down
Thu Oct 1 21:37:37 2020 daemon.debug dnsmasq[2575]: stopped listening on pppoe-wan(#66): 84.x.x.x port 53
Thu Oct 1 21:37:37 2020 daemon.debug pppd[24186]: Script /lib/netifd/ppp-down started (pid 24402)
Thu Oct 1 21:37:37 2020 daemon.debug pppd[24186]: sent [LCP TermAck id=0x5b]
Thu Oct 1 21:37:37 2020 daemon.notice netifd: Interface 'wan' has lost the connection
Thu Oct 1 21:37:37 2020 daemon.warn dnsmasq[2575]: no servers found in /tmp/resolv.conf.d/resolv.conf.auto, will retry
Thu Oct 1 21:37:37 2020 daemon.debug pppd[24186]: Script /lib/netifd/ppp-down finished (pid 24402), status = 0x1
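To see how often this happens and how long each session lasts, the log can be tallied with a small script. The following is only a rough sketch: the regular expressions are written against the exact pppd messages shown above and may need adjusting for other pppd versions, and the name `summarize_sessions` is made up for this example.

```python
import re

# Patterns match the pppd log lines shown above; other versions may word them differently.
TERMREQ_RE = re.compile(r"pppd\[\d+\]: rcvd \[LCP TermReq id=(0x[0-9a-fA-F]+)")
CONNECT_RE = re.compile(r"pppd\[\d+\]: Connect time ([\d.]+) minutes")


def summarize_sessions(lines):
    """Return one (minutes, peer_terminated) tuple per finished PPP session."""
    sessions = []
    peer_terminated = False
    for line in lines:
        # Remember that the peer asked to terminate during this session.
        if TERMREQ_RE.search(line):
            peer_terminated = True
        match = CONNECT_RE.search(line)
        if match:
            # "Connect time" is logged once when the session ends.
            sessions.append((float(match.group(1)), peer_terminated))
            peer_terminated = False
    return sessions


SAMPLE_LOG = """\
Thu Oct 1 21:37:37 2020 daemon.debug pppd[24186]: rcvd [LCP TermReq id=0x5b] 07 38 00 00 00 00 00 00 01 98 00 00 01 da 2e 9e 3d 7f e2 4f 22 40
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: LCP terminated by peer
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: Connect time 32.8 minutes.
Thu Oct 1 21:37:37 2020 daemon.info pppd[24186]: Sent 6776715 bytes, received 66813993 bytes.
"""

print(summarize_sessions(SAMPLE_LOG.splitlines()))
```

Feeding it a longer log (for example the saved output of `logread`) gives one entry per dropped session, which makes it easier to check whether the drops follow any pattern.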
I captured the PPPoE traffic (and later all traffic) on the WAN port while waiting for the issue to reproduce. These are the non-IP packets at the end of the capture:
$ /usr/sbin/tcpdump -e -r ./pppoe_dump_1.pcap | tail
reading from file ./pppoe_dump_1.pcap, link-type EN10MB (Ethernet)
12:11:21.516842 00:02:3b:10:2e:b1 (oui Unknown) > 74:da:da:xx:xx:xx (oui Unknown), ethertype PPPoE S (0x8864), length 60: PPPoE [ses 0x1a] LCP (0xc021), length 28: LCP, Echo-Reply (0x0a), id 93, length 28
12:11:22.215794 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:26.256096 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:26.519692 74:da:da:xx:xx:xx (oui Unknown) > 00:02:3b:10:2e:b1 (oui Unknown), ethertype PPPoE S (0x8864), length 30: PPPoE [ses 0x1a] LCP (0xc021), length 10: LCP, Echo-Request (0x09), id 94, length 10
12:11:27.264920 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:28.274663 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:29.112870 00:02:3b:10:2e:b1 (oui Unknown) > 74:da:da:xx:xx:xx (oui Unknown), ethertype PPPoE S (0x8864), length 60: PPPoE [ses 0x1a] LCP (0xc021), length 28: LCP, Term-Request (0x05), id 16, length 28
12:11:29.121857 74:da:da:xx:xx:xx (oui Unknown) > 00:02:3b:10:2e:b1 (oui Unknown), ethertype PPPoE S (0x8864), length 26: PPPoE [ses 0x1a] LCP (0xc021), length 6: LCP, Term-Ack (0x06), id 16, length 6
12:11:29.285003 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:31.305106 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
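One thing that can be checked in a dump like this is whether every LCP Echo-Request got an Echo-Reply before the peer sent its Term-Request. Below is a minimal sketch that does this on the text output of `tcpdump -e`. It assumes the line format shown above, and the function names `parse_lcp` and `unanswered_echo_requests` are invented for this example.

```python
import re

# Matches the `tcpdump -e` lines above that carry an LCP message.
LCP_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d+) (\S+) \(oui [^)]*\) > (\S+) "
    r".*LCP, ([\w-]+) \(0x[0-9a-f]+\), id (\d+)"
)


def parse_lcp(lines):
    """Extract (timestamp, src_mac, dst_mac, lcp_type, id) from LCP lines only."""
    packets = []
    for line in lines:
        match = LCP_RE.match(line)
        if match:
            ts, src, dst, kind, ident = match.groups()
            packets.append((ts, src, dst, kind, int(ident)))
    return packets


def unanswered_echo_requests(packets):
    """Return (timestamp, sender_mac, id) for Echo-Requests with no later Echo-Reply."""
    pending = {}
    for ts, src, dst, kind, ident in packets:
        if kind == "Echo-Request":
            pending[(src, ident)] = ts
        elif kind == "Echo-Reply":
            # A reply is addressed to the original sender and reuses its id.
            pending.pop((dst, ident), None)
    return [(ts, src, ident) for (src, ident), ts in pending.items()]


SAMPLE_DUMP = """\
12:11:21.516842 00:02:3b:10:2e:b1 (oui Unknown) > 74:da:da:xx:xx:xx (oui Unknown), ethertype PPPoE S (0x8864), length 60: PPPoE [ses 0x1a] LCP (0xc021), length 28: LCP, Echo-Reply (0x0a), id 93, length 28
12:11:22.215794 00:06:19:30:1c:ee (oui Unknown) > cf:00:00:00:00:00 (oui Unknown), ethertype Loopback (0x9000), length 64: Loopback, skipCount 0, Reply, receipt number 1, data (44 octets)
12:11:26.519692 74:da:da:xx:xx:xx (oui Unknown) > 00:02:3b:10:2e:b1 (oui Unknown), ethertype PPPoE S (0x8864), length 30: PPPoE [ses 0x1a] LCP (0xc021), length 10: LCP, Echo-Request (0x09), id 94, length 10
12:11:29.112870 00:02:3b:10:2e:b1 (oui Unknown) > 74:da:da:xx:xx:xx (oui Unknown), ethertype PPPoE S (0x8864), length 60: PPPoE [ses 0x1a] LCP (0xc021), length 28: LCP, Term-Request (0x05), id 16, length 28
12:11:29.121857 74:da:da:xx:xx:xx (oui Unknown) > 00:02:3b:10:2e:b1 (oui Unknown), ethertype PPPoE S (0x8864), length 26: PPPoE [ses 0x1a] LCP (0xc021), length 6: LCP, Term-Ack (0x06), id 16, length 6
"""

packets = parse_lcp(SAMPLE_DUMP.splitlines())
print(unanswered_echo_requests(packets))
```

On the excerpt above this reports the Echo-Request with id 94 as unanswered. The Term-Request arrives less than three seconds after it, though, so that alone does not say much about the cause.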
My simple wan config is below:
config interface 'wan'
option ifname 'wan'
option proto 'pppoe'
option username 'xxx@xx'
option password 'xxxx'
I have not yet contacted my ISP because when I plug a Raspberry Pi 4 running Ubuntu 20.04.1 directly into the connection and use it as a PPPoE client and router, the connection is stable. It stands to reason that something specific to my router is causing this.
I am running a self-built image of the latest SNAPSHOT on my router. The router is not yet officially supported, but I have an open PR to add support here: https://github.com/openwrt/openwrt/pull/3468 None of the code changes are related to pppd. The PR only makes minor changes to the partition layout, boot headers, firmware image signatures that the factory firmware accepts, etc.
I have tried enabling and disabling keepalive_adaptive in the network configuration, as well as adjusting the ping intervals, and neither helped. I also tried building ppp with and without multilink, which didn't help either.
I am wondering what else I can try before contacting my ISP, because I suspect this issue is addressable on my side. I also doubt that they'll route my case to an engineer who knows what an LCP TermReq is.
Thanks!