Multicast/switch/snooping/vlan problem. Bug?

I have an R7800 running @hnyman's stable build (slightly customized). I have set up a VLAN in which multicast is received. The multicast traffic is coming in from an upstream switch, and is tagged with the VLAN id (which is 11).

To avoid flooding the switch with the multicast, I enabled IGMP snooping on the switch. For some reason, this causes queries from the upstream server to be dropped on the router.

With IGMP snooping on the switch off, the queries are correctly received from the upstream server:

root@R7800:/# tcpdump -i eth1.11 igmp
15:22:09.172022 IP 192.168.10.1 > all-systems.mcast.net: igmp query v2

If I enable IGMP snooping on the switch, the queries are dropped and never seen. Isn't IGMP snooping only supposed to filter the actual stream that comes after some device joins the multicast?

I just don't understand this. It appears that the switch drops all ingress IGMP packets when IGMP snooping is enabled for the port. It doesn't drop all multicast traffic, though. If a downstream device requests a multicast stream, the stream will start (and be received by the switch both with and without IGMP snooping enabled), but the queries from the upstream server as to whether someone is still listening are dropped by the switch if IGMP snooping is enabled. This then causes the server to stop streaming, of course.

What is it I am missing here?

Surely someone here has some knowledge of this?

It appears that if IGMP snooping is enabled on this switch, it just drops all IGMP queries. However, IGMP reports and leaves are still being broadcast.

If I run tcpdump on eth1 with IGMP snooping turned off, I see all IGMP message types from various devices on my LAN. If I then turn IGMP snooping on, all queries are being dropped. No exceptions.

I've also used Ostinato to craft IGMP messages and injected them into the switch to test. I've tested various combinations, and it is specifically messages with the query type (value 17, i.e. 0x11) that are dropped. This just can't be right.
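For reference, the message type is the first byte of the IGMP header. The following is a minimal Python sketch (standard library only) of how an IGMPv2 general query with type 17 (0x11) is laid out and how it differs from reports and leaves. The helper names `inet_checksum`, `build_igmpv2` and `classify` are hypothetical and exist only for this illustration; the type values are those defined for IGMP in RFC 2236 and RFC 3376.

```python
import struct

# IGMP message type values (RFC 2236 for v1/v2, RFC 3376 for the v3 report)
IGMP_TYPES = {
    0x11: "membership query",      # decimal 17, the type that goes missing
    0x12: "v1 membership report",
    0x16: "v2 membership report",
    0x17: "leave group",
    0x22: "v3 membership report",
}

def inet_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's complement of the 16-bit sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_igmpv2(msg_type: int, max_resp: int, group: bytes) -> bytes:
    """Build the 8-byte IGMPv2 message: type, max response time, checksum, group."""
    unsummed = struct.pack("!BBH4s", msg_type, max_resp, 0, group)
    return struct.pack("!BBH4s", msg_type, max_resp, inet_checksum(unsummed), group)

def classify(msg: bytes) -> str:
    """Name the IGMP message by its first byte."""
    return IGMP_TYPES.get(msg[0], "unknown")

# A general query uses group 0.0.0.0; max response time is in 1/10 s units.
general_query = build_igmpv2(0x11, 100, bytes(4))
# A v2 report for 224.0.0.252, like the one in the test stream.
report = build_igmpv2(0x16, 0, bytes([224, 0, 0, 252]))

print(classify(general_query), general_query.hex())
print(classify(report), report.hex())
```

Only the first byte distinguishes the query from the reports, which fits the observation that the drop depends on the type field alone.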

So, could someone please shed some light on this? Is it possible to dump the internal registers of the switch to see the settings?

With IGMP snooping, multicast forwarding is disabled for bridges. One pure bridge solution is to disable multicast_snooping.

Add the following in /etc/rc.local

echo "0" > /sys/devices/virtual/net/br-lan/bridge/multicast_snooping
Replace br-lan with your actual bridge interface, sometimes also called br0.
This will forward all multicast packets to all ports on your bridge, making igmpproxy or udpxy unnecessary. In large networks, this may not be desirable.

So with IGMP snooping off, multicast traffic gets forwarded.
With snooping on, multicast traffic gets dropped.

Isn't that the exact behaviour you see?
Can you configure more options for igmp snooping on the switch? Or only on/off?

No, the issue I'm seeing is that only multicast queries are being dropped. Other multicast packets are being forwarded as they should.

The switch itself seems to have a lot more options than what is exposed by swconfig, and it is possible that the current driver configures the switch wrong, I'm just not sure.

Since I am monitoring with tcpdump on the router itself, I believe that I am looking as close to the physical interface as I can, so that's why I think that the packets are dropped by the switch itself. However, I am not entirely sure.

Testing this is simple enough. Inject IGMP packets from a computer to a switch port, and run tcpdump on the router itself to inspect the packets on the switch (as I mentioned in the OP).

Use swconfig to turn on/off IGMP snooping in the switch, either globally or just for the port where you inject the packets, like so:

root@R7800:~# swconfig dev switch0 port 4 set igmp_snooping 1

I have my computer (from where I inject packets) connected to port 4 on the switch, and with snooping enabled, no IGMP queries are shown by tcpdump. If I then turn snooping off again:

root@R7800:~# swconfig dev switch0 port 4 set igmp_snooping 0

...then my injected IGMP queries are shown by tcpdump.

Multicast snooping for bridges is disabled by default, but that setting makes no difference. Enabling it (and also enabling multicast_querier) makes no difference. Again, I think this is because the switch hardware itself is dropping those packets. I've examined the datasheet for the switch chip (QCA8337N), but I can't find any setting that I think can cause this, so I am really at a loss. The only thing I am sure of is that with IGMP snooping enabled, multicasting doesn't really work due to the queries being dropped. And with IGMP snooping off, the multicast traffic is flooded to all the ports, which I don't want.

Would be nice if @nbd or @blogic could chime in here, I can't believe I'm the only one seeing this issue.

I thought by switch you meant a standalone switch.
But you are talking about the built-in switch of your OpenWrt/LEDE device?

Port 4 belongs to which interface?
Did you try to enable/disable igmp snooping on that interface through UCI?
Edit: Okay, this only works for bridges.

What do you want to achieve exactly?

Yes, I meant the internal switch in the router.
What I want is to have IGMP snooping on the physical switch in the router to avoid flooding multicast streams to all wired devices. The specific issue has been clearly stated in this thread, but I can repeat: With IGMP snooping enabled on the switch (globally or on specific ports), the port(s) where this is enabled will silently drop IGMP queries, which is what the server transmits every few minutes to ensure someone still wants the stream it is broadcasting. Since the switch drops the queries, no devices can respond to them, and the server stops the stream. Which is no good.

The snooping itself works in the sense that flooding is correctly avoided, but that doesn't help since the server still stops the stream when no devices respond to the queries it sends...

If someone else with a router that has a QCA8337N switch chip could test this, it would be great. I've attached sample packets that can be injected with Ostinato to test. The same packets are also in this pcap file, in case you have some other tool to inject the packets.

To test with Ostinato, just choose the network interface and then use the file menu and choose "Open Streams" and point it to the .ostm-file. There are three packets in this file, one general query and two reports.
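For anyone without Ostinato, a comparable general query can also be generated with a short script. The sketch below (Python, standard library only) writes a classic pcap file containing one untagged IGMPv2 general query frame, which could then be sent with a replay tool such as tcpreplay. It is not the attached sample file: the source MAC `02:00:00:00:00:01` and the output file name are assumptions made for this illustration, and the source IP 192.168.20.1 is simply the address seen in the captures in this thread.

```python
import os
import struct
import tempfile

def inet_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's complement of the 16-bit sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def igmp_general_query_frame(src_mac: bytes, src_ip: bytes) -> bytes:
    """Return an untagged Ethernet frame carrying an IGMPv2 general query."""
    # IGMP: type 0x11 (query), max response time 10 s, group 0.0.0.0
    igmp = struct.pack("!BBH4s", 0x11, 100, 0, bytes(4))
    igmp = igmp[:2] + struct.pack("!H", inet_checksum(igmp)) + igmp[4:]

    # IPv4 header with the Router Alert option, so IHL is 6 words (24 bytes)
    router_alert = bytes([0x94, 0x04, 0x00, 0x00])
    dst_ip = bytes([224, 0, 0, 1])          # all-systems.mcast.net
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x46, 0, 24 + len(igmp),  # version/IHL, TOS, total length
                     0, 0,                     # identification, flags/fragment
                     1, 2, 0,                  # TTL 1, protocol 2 (IGMP), checksum
                     src_ip, dst_ip) + router_alert
    ip = ip[:10] + struct.pack("!H", inet_checksum(ip)) + ip[12:]

    # Ethernet: multicast MAC for 224.0.0.1, EtherType IPv4
    eth = bytes.fromhex("01005e000001") + src_mac + struct.pack("!H", 0x0800)
    # Pad to the 60-byte Ethernet minimum (without FCS)
    return (eth + ip + igmp).ljust(60, b"\x00")

def write_pcap(path: str, frames) -> None:
    """Write frames to a classic little-endian pcap file, link type Ethernet."""
    with open(path, "wb") as f:
        f.write(struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1))
        for frame in frames:
            # Per-packet header: ts_sec, ts_usec, captured length, original length
            f.write(struct.pack("<IIII", 0, 0, len(frame), len(frame)))
            f.write(frame)

if __name__ == "__main__":
    # Hypothetical locally administered source MAC; adjust to taste.
    frame = igmp_general_query_frame(bytes.fromhex("020000000001"),
                                     bytes([192, 168, 20, 1]))
    out = os.path.join(tempfile.gettempdir(), "igmp_general_query.pcap")
    write_pcap(out, [frame])
    print("wrote", out, len(frame), "byte frame")
```

The frame is sent untagged on purpose, matching the test setup where packets are injected into an untagged LAN port.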

Now, on the router, first ensure IGMP snooping is off:

root@R7800:~# swconfig dev switch0 set igmp_snooping 0

Then run tcpdump (must be installed, of course) and inject (play) the stream in Ostinato, and observe the tcpdump output:

root@R7800:~# tcpdump -i eth1 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
00:59:31.624280 IP 192.168.20.1 > all-systems.mcast.net: igmp query v2
00:59:32.624203 IP 192.168.20.1 > all-routers.mcast.net: igmp v2 report all-routers.mcast.net
00:59:32.624189 IP 192.168.20.101 > 224.0.0.252: igmp v2 report 224.0.0.252

All three packets are received, so now enable IGMP snooping on the switch:

root@R7800:~# swconfig dev switch0 set igmp_snooping 1

Redo the packet injection and check tcpdump:

root@R7800:~# tcpdump -i eth1 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
01:28:20.107150 IP 192.168.20.1 > all-routers.mcast.net: igmp v2 report all-routers.mcast.net
01:28:20.107141 IP 192.168.20.101 > 224.0.0.252: igmp v2 report 224.0.0.252

Now the query has been dropped, which I believe is a bug. A general query should always be forwarded, I think.
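To make the comparison between the two captures repeatable, the tcpdump output can be tallied by message kind. This is a rough Python sketch that assumes the one-line summary format shown above (tcpdump without -v); the function name `summarize` is hypothetical, and the matching on the word "leave" is an assumption, since no leave message appears in these captures.

```python
import re
from collections import Counter

# Matches summaries such as:
#   "00:59:31.624280 IP 192.168.20.1 > all-systems.mcast.net: igmp query v2"
LINE_RE = re.compile(r"IP (\S+) > (\S+): igmp (.*)$")

def summarize(capture: str) -> Counter:
    """Count IGMP queries, reports and leaves in tcpdump text output."""
    counts = Counter()
    for line in capture.splitlines():
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # banner lines and non-IGMP lines are skipped
        detail = match.group(3)
        if detail.startswith("query"):
            counts["query"] += 1
        elif "report" in detail:
            counts["report"] += 1
        elif "leave" in detail:  # assumed wording, not seen in this thread
            counts["leave"] += 1
        else:
            counts["other"] += 1
    return counts

SNOOPING_OFF = """\
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
00:59:31.624280 IP 192.168.20.1 > all-systems.mcast.net: igmp query v2
00:59:32.624203 IP 192.168.20.1 > all-routers.mcast.net: igmp v2 report all-routers.mcast.net
00:59:32.624189 IP 192.168.20.101 > 224.0.0.252: igmp v2 report 224.0.0.252
"""

SNOOPING_ON = """\
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
01:28:20.107150 IP 192.168.20.1 > all-routers.mcast.net: igmp v2 report all-routers.mcast.net
01:28:20.107141 IP 192.168.20.101 > 224.0.0.252: igmp v2 report 224.0.0.252
"""

off = summarize(SNOOPING_OFF)
on = summarize(SNOOPING_ON)
# Counter subtraction keeps only the kinds that went missing with snooping on.
missing = off - on
print("missing with snooping enabled:", dict(missing))
```

Run against the two captures above, only the query shows up as missing; both reports are present in either case.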

@mroek: I'm not using IGMP snooping anymore on my R7800, as I faced too many issues. Most of them were related to multicast with multiple domain separation (VLAN) and IGMPv3. I didn't notice the general query issue, so I did a quick test this morning on WAN (eth0.100 is the WAN multicast) with my ISP multicast provider:

root@LEDE:~# tcpdump -i eth0.100 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.100, link-type EN10MB (Ethernet), capture size 262144 bytes
09:56:53.967395 IP 0.0.0.0 > all-systems.mcast.net: igmp query v2

root@LEDE:~# swconfig dev switch0 port 0 set igmp_snooping 1
root@LEDE:~# swconfig dev switch0 port 5 set igmp_snooping 1
root@LEDE:~# tcpdump -i eth0.100 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.100, link-type EN10MB (Ethernet), capture size 262144 bytes
09:58:58.969184 IP 0.0.0.0 > all-systems.mcast.net: igmp query v2

I'm still seeing general queries...

@Nague: Hmm, that's really odd. Are you running master or stable on your router? You're getting the queries on the physical WAN port (eth0), while I'm only using the LAN (eth1). I guess that shouldn't really matter, though.

@mroek: I'm using stable. You are right, it shouldn't matter. The difference here is that I'm using VLAN. But still... it should work.

My understanding is that if at least one port on the switch has igmp_snooping enabled, all ports with igmp_snooping disabled drop IGMP traffic. And I don't understand what the global igmp_snooping setting does...

I'm also using VLAN (eth1.11) for the real multicast server (from which the queries are originating), but to simplify testing I changed to just inject the packets from my computer to the switch.

Could you save the query packet from your test to a pcap file, so that I could try to inject that exact packet to my router as a test? Just don't want to leave any stone unturned...

Regarding the setting, if I enable snooping on just the one port, queries are still received on the other ports. While it isn't meaningful to have it enabled on only one port, it still shows the issue.

https://drive.google.com/open?id=0B5UlRlPEA0N1RzFIZ2gtejlyWFE

Thanks. Your query packet is also dropped on my router if snooping is enabled, as expected.
Could you possibly test on your LAN, by sending queries to a port on eth1 of the router?

Here's another test I just did. I installed tcpreplay on the router, and with IGMP snooping turned off globally on the switch, I did this:

root@R7800:/tmp# tcpreplay -i eth1.1 igmp_q.pcap
Actual: 1 packets (60 bytes) sent in 0.000725 seconds.
Rated: 82700.0 Bps, 0.661 Mbps, 1379.31 pps
Flows: 1 flows, 1379.31 fps, 1 flow packets, 0 non-flow
Statistics for network device: eth1.1
        Successful packets:        1
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0

At the same time I ran a tcpdump on the router, which shows the packet:

root@R7800:/etc/config# tcpdump -i eth1.1 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1.1, link-type EN10MB (Ethernet), capture size 262144 bytes
14:27:42.384599 IP 192.168.10.1 > all-systems.mcast.net: igmp query v2

And I also ran a Wireshark capture on my computer, which is attached to the physical switch port 4. The query packet was received correctly on the computer, as expected.

Then I turned on IGMP snooping for the CPU port on the LAN side of the switch:

root@R7800:/tmp# swconfig dev switch0 port 6 set igmp_snooping 1

...and replayed the packet again. This time the packet is NOT received by my computer, but the tcpdump (on eth1.1) on the router itself still shows the packet.

Could someone explain that?

Here's a new piece of the puzzle that I need help interpreting:

Up until now I have simply ignored eth0, as in my case I don't want to involve the WAN side at all (the multicast source is in a VLAN which comes in at one of the LAN ports). But I just discovered that the queries aren't dropped when IGMP snooping is enabled; instead, they are forwarded to eth0.

So if I enable global IGMP snooping on the switch:

root@R7800:/tmp# swconfig dev switch0 set igmp_snooping 1

...and then inject general queries from a computer in the LAN (switch port 4), then the queries end up at eth0:

root@R7800:/tmp# tcpdump -i eth0 igmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:37:00.204693 IP 192.168.20.1 > all-systems.mcast.net: igmp query v2

However, if I run tcpdump on eth0.2, I don't see the queries. There has to be some vital piece of information I'm not getting here, so if anyone can shed some more light I'd be very happy.

@mroek: This is surprising! Switch port 0, facing eth0, isn't a member of VLAN 1, right? Could you provide your UCI switch config?

That's correct. Switch ports 1-4 are untagged members of VLAN 1, and port 6 (eth1) is a tagged member.
Switch port 5 is an untagged member of VLAN 2, and port 0 (eth0) is a tagged member.

The packets I inject at port 4 are all untagged. If I have IGMP snooping off, then I can see all of them (one query and two reports) with tcpdump at eth1 (or eth1.1). Turning IGMP snooping on makes the query end up at eth0 (but not eth0.2), while the reports still end up at eth1 (or eth1.1).

I think there can be no doubt that this is a bug.

On many switches, all ports are untagged members of VLAN 1 by default. That could be the catch here. Could you test using a third VLAN, untagged on port 0 (eth0)?

Not sure what you mean? I tried making eth0 untagged in VLAN 2, which caused the router to remove VLAN 2 altogether (it auto-migrated WAN to eth0). That made no change to the IGMP behaviour, as far as I could tell. Queries still end up at eth0 with snooping enabled.

I also tried adding a new dummy VLAN which is untagged on cpu0 and tagged on one of the LAN-ports, also no change. Is that what you meant?