Interesting. Could you please run ifconfig and check whether the interface has either PROMISC or ALLMULTI set? I suspect that having either flag set on any interface that uses the same ethernet device (i.e. eth1 or eth0) would be enough to avoid the problem. If neither flag is set and you still do not see the problem, then the cause must be more complex than I thought.
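If it is easier than reading ifconfig output, the same two flags can be checked with a small script. This is only a sketch: it assumes a Linux system that exposes `/sys/class/net/<iface>/flags`, and the helper names `decode_flags` and `read_iface_flags` are made up for illustration. The bit values are the ones defined in `linux/if.h`. Note that the sysfs value is the kernel's own flags word, so it may not match exactly what ifconfig prints.

```python
from pathlib import Path

# Flag bits from the Linux UAPI header linux/if.h.
IFF_PROMISC = 0x100
IFF_ALLMULTI = 0x200


def decode_flags(raw: str) -> dict:
    """Decode the hex string found in /sys/class/net/<iface>/flags."""
    value = int(raw.strip(), 16)
    return {
        "promisc": bool(value & IFF_PROMISC),
        "allmulti": bool(value & IFF_ALLMULTI),
    }


def read_iface_flags(iface: str) -> dict:
    """Read and decode the flags of one interface (Linux only)."""
    return decode_flags(Path("/sys/class/net", iface, "flags").read_text())


if __name__ == "__main__":
    # eth0 and eth1 are the interface names used in this thread.
    for name in ("eth0", "eth1"):
        try:
            print(name, read_iface_flags(name))
        except OSError as err:
            print(name, "not readable:", err)
```

If either value comes back true for eth0 or eth1, that would fit the theory above.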
In my setup, I have no "wan" -- this is an internal router and I map the various wired ports into various vlans on eth0 and eth1. However, it is the case that all my testing has been done using the wired connection to the port labelled "Internet". I will test to see if I see the same problem on other physical ports.
Thanks for the feedback. Is there an interface on that device that is part of a bridge? As part of my debugging, I have discovered that the device gets the promiscuous mode bit set if it is used in a bridge, even though the PROMISC flag doesn't get set on the interface.
I did some more testing and the problem affects both eth0 and eth1 and all physical ports. I have also tried various hacks in the driver to test things out (see the bug report for more info) and have convinced myself it is a real hardware or firmware problem in the multicast filter in the device. It is easy for the driver to work around it just by treating the device as if it had no multicast filter and receiving all multicast frames.
Does anyone know if the R7800 has firmware for the lan (dwmac) device? If so, I would be interested to test different versions if there are any.
If not, I will work around it in the driver. What is the best way in the LEDE world to have the driver behave differently on a particular router? Conditional code in the driver? A specific patch that gets selected at build time? Is there a good way for the driver to work out at runtime which device it is running on, so that the R7800 does not need its own image? Or should I use a boot parameter?
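On the runtime question, one possibility is to match on the board's device tree `compatible` property. The sketch below does this from userspace, to show what the check would look at. It assumes the system exposes `/proc/device-tree/compatible` and that the R7800 device tree uses the string `netgear,r7800`. Both are assumptions to verify on a real unit, and the helper names are hypothetical. Inside the kernel, the `of_machine_is_compatible()` helper performs the same kind of match, though whether that is the preferred approach here is for the maintainers to say.

```python
from pathlib import Path

# Assumption: the compatible string used for the R7800 in its device tree.
R7800_COMPATIBLE = "netgear,r7800"


def parse_compatible(raw: bytes) -> list:
    """Split the NUL-separated device tree 'compatible' property."""
    return [s.decode("ascii", "replace") for s in raw.split(b"\x00") if s]


def is_board(raw: bytes, compatible: str) -> bool:
    """Return True if the given compatible string is listed for this board."""
    return compatible in parse_compatible(raw)


if __name__ == "__main__":
    try:
        raw = Path("/proc/device-tree/compatible").read_bytes()
    except OSError:
        raw = b""  # not a device tree system, or file not readable
    print("R7800:", is_board(raw, R7800_COMPATIBLE))
```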
Good point that the default config means the bug only shows up on the "wan" interface. But many people using LEDE (like me) don't use the default config. And even people using the default are seeing the problem when trying to use IPv6 on the wan port (see "IPv6 works only with wan in promiscuous mode"). We certainly need a fix.
Thanks for the suggestion about the switch. I had not checked that, but judging by both the counters and the ar8327 driver code that sets the registers, the switch seems to be forwarding the multicast packets on to the dwmac device.
Just wondering. Does the stock firmware have these issues as well?
I saw you mention that the issue might be hardware related to the switch. Wouldn't the stock firmware show the same issue? And if not, could we maybe extract the switch drivers from the GPL source code and use those as a workaround?
I pushed an ipq branch to my staging tree. This is based on dissent's tree with some cleanups and a few missing patches added. Could folks please help test this tree?
Sorry, but no beef.
I pulled the commit "ipq: more v4.9 fixes" on top of the LEDE master and compiled a 4.9 build for my R7800. But the router gets into a reboot loop, so no improvement.
Looks like we need to get proper boot logs from the R7800 to solve the kernel 4.9 problem. I will attach a serial cable to my router.
Has anybody opened the R7800 and used the serial header? Based on a first inspection, I think that the case screws are hidden under the rubber feet. I tried to find any reference for the R7800 serial connection, but did not find anything really good. Based on the FCC photos (and looking through the side vents), it looks like there is a proper serial header with pins, but the header is unnamed and the pins are not identified.
I will likely attach a short patch cable to the internal header and use it to create a new permanent serial header outside the router, so that I will not need to open the router later.