Fiber + multicast > wifi problems

Scenario:
A large building with a 300 Mbps fiber Internet connection and 25 access points distributed throughout the building. Before the fiber was installed there was a conventional ADSL line and everything worked perfectly, but as soon as the fiber went in, the access points became saturated and the Wi-Fi network went down, usually within about 5 hours. This is a known problem. A Mikrotik router was set up, all multicast address ranges were blocked and IGMP was enabled, and the network then lasted a week without going down, but in the end it kept failing.
Eventually another technician changed the Mikrotik configuration, and everything has worked perfectly since then.
What I want to know is whether that configuration can be reproduced on a router with LEDE firmware.

/queue type
add kind=pcq name=DOWNLOAD pcq-classifier=dst-address \
    pcq-dst-address6-mask=64 pcq-src-address6-mask=64 pcq-total-limit=\
    5000KiB
add kind=pcq name=UPLOAD pcq-classifier=src-address \
    pcq-dst-address6-mask=64 pcq-src-address6-mask=64 pcq-total-limit=\
    5000KiB

/queue tree
add max-limit=170M name="QOS Download" parent=bridge1 queue=DOWNLOAD
add limit-at=3M max-limit=170M name="PRI 1" packet-mark=pm_prio1 parent=\
    "QOS Download" priority=1 queue=default
add limit-at=100M max-limit=170M name="PRI 3" packet-mark=pm_prio3 \
    parent="QOS Download" priority=3 queue=default
add limit-at=5M max-limit=170M name="PRI 2" packet-mark=pm_prio2 parent=\
    "QOS Download" priority=2 queue=default
add name="PRI 4" packet-mark=pm_prio4 parent="QOS Download" priority=4 \
    queue=default
add limit-at=30M max-limit=170M name="PRI 5" packet-mark=pm_prio5 \
    parent="QOS Download" priority=5 queue=default
add limit-at=30M max-limit=170M name="PRI 6" packet-mark=pm_prio6 \
    parent="QOS Download" priority=6 queue=default
add limit-at=5M max-limit=170M name="PRI 7" packet-mark=pm_prio7 parent=\
    "QOS Download" priority=7 queue=default
add limit-at=512k max-limit=170M name="PRI 8" packet-mark=pm_prio8 \
    parent="QOS Download" queue=default
add max-limit=10M name="QOS UPLOAD" parent=pppoe-out1-fibra queue=UPLOAD
add limit-at=20M max-limit=30M name="PRI 3 UPLOAD" packet-mark=pm_prio3 \
    parent="QOS UPLOAD" priority=3 queue=default

/ip firewall filter
add action=drop chain=input comment="BLOCK EXTERNAL DNS" dst-port=53 \
    in-interface=pppoe-out1-fibra protocol=udp
add action=accept chain=input comment="ACCEPT ESTABLISHED CONNECTIONS" \
    connection-state=established
add action=accept chain=input comment="ACCEPT RELATED CONNECTIONS" \
    connection-state=related
add action=drop chain=input comment="DROP INVALID PACKETS" \
    connection-state=invalid

/ip firewall mangle
add action=log chain=notes comment="\C1rbol QoS 18/10/2016"
add action=jump chain=forward connection-mark=no-mark jump-target=\
    conmark
add action=jump chain=forward connection-mark=!no-mark jump-target=\
    pktmark
add action=mark-connection chain=conmark comment=\
    "Priority 1: Administration Services" new-connection-mark=\
    cm_prio-1 passthrough=yes protocol=icmp
add action=mark-connection chain=conmark new-connection-mark=cm_prio-1 \
    protocol=ospf
add action=mark-connection chain=conmark new-connection-mark=cm_prio-1 \
    protocol=ipip
add action=mark-connection chain=conmark new-connection-mark=cm_prio-1 \
    protocol=gre
add action=mark-connection chain=conmark new-connection-mark=cm_prio-1 \
    protocol=ipsec-esp
add action=mark-connection chain=conmark new-connection-mark=cm_prio-1 \
    protocol=ipsec-ah
add action=mark-connection chain=conmark dst-port=53,123,161,162,500 \
    new-connection-mark=cm_prio-1 protocol=udp
add action=mark-packet chain=pktmark connection-mark=cm_prio-1 \
    new-packet-mark=pm_prio1 passthrough=no
add action=mark-connection chain=conmark comment=\
    "Priority 2: Administration Protocols" dst-port=\
    1812-1813,4500,5004,5060-5061,16000-17000 new-connection-mark=\
    cm_prio-2 passthrough=yes protocol=udp
add action=mark-connection chain=conmark dst-port="22,21,1701,1723,1935,1\
    955,2210-2211,2222,3306,3389,3478,4500,5900-5950" \
    new-connection-mark=cm_prio-2 passthrough=yes protocol=tcp
add action=mark-connection chain=conmark dst-port=\
    5938,7004-7005,7064-7065,8291 new-connection-mark=cm_prio-2 \
    protocol=tcp
add action=mark-packet chain=pktmark connection-mark=cm_prio-2 \
    new-packet-mark=pm_prio2 passthrough=no
add action=mark-connection chain=conmark comment=\
    "Priority 3: Browsing 0-512KB" dst-port=80,443,3128,8080 \
    new-connection-mark=cm_prio-3 passthrough=yes protocol=tcp
add action=mark-packet chain=pktmark connection-bytes=0-512000 \
    connection-mark=cm_prio-3 new-packet-mark=pm_prio3 passthrough=no \
    protocol=tcp
add action=mark-packet chain=pktmark comment=\
    "Priority 6: Downloads between 512KB and 4MB" connection-bytes=\
    512001-4096000 connection-mark=cm_prio-3 new-packet-mark=pm_prio6 \
    passthrough=no protocol=tcp
add action=mark-packet chain=pktmark comment=\
    "Priority 7: Downloads bigger than 4MB" connection-bytes=4096001-0 \
    connection-mark=cm_prio-5 new-packet-mark=pm_prio7 passthrough=no \
    protocol=tcp
add action=mark-connection chain=conmark dst-port=\
    2082-2083,2086-2087,2095-2096,2222 new-connection-mark=cm_prio-4 \
    protocol=tcp
add action=mark-packet chain=pktmark connection-mark=cm_prio-4 \
    new-packet-mark=pm_prio4 passthrough=no
add action=mark-connection chain=conmark comment=\
    "Priority 8: P2P TORRENT" new-connection-mark=cm_prio-8 p2p=all-p2p
add action=mark-packet chain=pktmark connection-mark=cm_prio-8 \
    new-packet-mark=pm_prio8 passthrough=no
add action=mark-connection chain=conmark comment="Priority 5: THE REST" \
    connection-mark=no-mark new-connection-mark=cm_prio-5
add action=mark-packet chain=pktmark connection-mark=cm_prio-5 \
    new-packet-mark=pm_prio5 passthrough=no
add action=return chain=conmark
add action=log chain=notes comment="END OF QOS"

There is no reference to IGMP or multicast in this configuration, but it works perfectly. Maybe someone knows why.
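To make the posted mangle rules easier to follow, here is a rough Python sketch of what the pktmark chain appears to do: it walks the rules in the order they were posted and returns the first packet mark that matches. It only models the pktmark rules (not the conmark chain), the names PKTMARK_RULES and packet_mark are invented for illustration, and it assumes that an upper bound of 0 in connection-bytes means "no upper limit".

```python
# Sketch of the posted pktmark chain. Names are illustrative, not RouterOS API.
INF = float("inf")  # assumption: "4096001-0" means 4096001 bytes and up

# (connection mark, protocol or None for any, min bytes, max bytes, packet mark)
# listed in the same order as the rules in the post.
PKTMARK_RULES = [
    ("cm_prio-1", None, 0, INF, "pm_prio1"),
    ("cm_prio-2", None, 0, INF, "pm_prio2"),
    ("cm_prio-3", "tcp", 0, 512_000, "pm_prio3"),
    ("cm_prio-3", "tcp", 512_001, 4_096_000, "pm_prio6"),
    ("cm_prio-5", "tcp", 4_096_001, INF, "pm_prio7"),
    ("cm_prio-4", None, 0, INF, "pm_prio4"),
    ("cm_prio-8", None, 0, INF, "pm_prio8"),
    ("cm_prio-5", None, 0, INF, "pm_prio5"),
]


def packet_mark(connection_mark, connection_bytes, protocol):
    """Return the first matching packet mark, or None if no rule matches."""
    for mark, proto, low, high, packet in PKTMARK_RULES:
        if connection_mark != mark:
            continue
        if proto is not None and proto != protocol:
            continue
        if low <= connection_bytes <= high:
            return packet  # passthrough=no: stop at the first match
    return None


if __name__ == "__main__":
    for args in [
        ("cm_prio-3", 100_000, "tcp"),
        ("cm_prio-3", 1_000_000, "tcp"),
        ("cm_prio-3", 5_000_000, "tcp"),
        ("cm_prio-5", 5_000_000, "tcp"),
    ]:
        print(args, "->", packet_mark(*args))
```

Read this way, a web connection (cm_prio-3) that has transferred more than 4,096,000 bytes matches no pktmark rule at all, because the "Priority 7" rule is written against cm_prio-5 rather than cm_prio-3. Whether that was intended is not clear from the post.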

Another person brings this solution, but I have not tried it:

Can this solution also be implemented with LEDE? I am more interested in the first one, though.

Thanks in advance

From the configuration file you have posted, it appears that what your access points cannot handle is mainly P2P torrent traffic. Torrents open a lot of simultaneous connections, which improperly configured embedded Linux routers cannot handle, hence the reboots (if you had ssh access to the machine you could just reconfigure it and avoid the problem).

The settings are mostly QoS which OpenWRT has a package for, here is the web interface one:
https://openwrt.org/packages/pkgdata/luci-app-qos
...but if you really want to you can attain more advanced solutions with clever use of iptables included in base OpenWRT.

Though to be honest, the fact that you are even solving this in this manner is worrying. You should take care of the problem at the source: configure your torrent clients to use sane settings or, if this is a business environment, outlaw torrents.

LP,
Jure

@dustwolf I do not know what luci-app-qos does internally, and I need to have control. I also do not know whether this package or luci-app-sqm is better. We are talking about a public school where educational videos from the Internet are regularly shown. There may be some uncontrolled torrent client out there, but then the network would have gone down with the conventional ADSL too, and it did not. I think torrents can be a problem, but not the main one. Multicast is the real problem.
I need to know whether the configuration I have provided can be transferred to LEDE in some way, for example as a script. And if that is too much work, how to do with LEDE what they say in


TIA

Do you actually use the multicast traffic? If not, it's trivial to completely block it in the firewall on lede. If you use it then you would need to limit the bandwidth using custom qos. It's not clear to me what all that mikrotik stuff does, nor which of the many configs implements the critical fix. And no, there's no way to just use that config on lede as is. That's all mikrotik specific.

I promise you it's possible to implement a solution on lede, but we need to actually know what the cause is in order to design it.

Edit... Your proposed solution is to block all multicast traffic, I think that is the default in lede, no traffic from WAN forwards to LAN except traffic related to ongoing connections initiated from LAN. In order to get multicast onto LAN I've seen people installing igmpproxy package. You could flash lede onto a testing device and drop it in place of your mikrotik for testing.

This issue occurred on the Linux Kernel until NAPI was added. This was therefore fixed in OpenWRT around Attitude Adjustment.

https://wiki.linuxfoundation.org/networking/napi

See:

I don't think you'll have the issue you described on OpenWRT version 17.01.4 or greater.

@lleachii

I believe there may also be a second issue: multicast traffic on WiFi. If the multicast traffic is bridged over to the LAN (which by default I don't think it will be in LEDE/OpenWRT) then I think WiFi APs send multicast traffic at the lowest data rate, 1Mbps or 6Mbps for example. This means if you have anything close to 1Mbps of multicast traffic, your WiFi falls over as the multicast traffic takes all the available airtime.
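The airtime argument can be put in rough numbers. The following is a minimal sketch, not a measurement: it ignores preambles, contention and retransmissions (all of which make the real figure worse), and the function name and the 1 Mbps multicast load are assumptions chosen only to illustrate the point.

```python
def airtime_fraction(multicast_mbps, basic_rate_mbps):
    """Share of one channel's airtime spent sending multicast frames.

    Simplification: payload bits divided by the rate they are sent at,
    with no per-frame overhead.
    """
    return multicast_mbps / basic_rate_mbps


if __name__ == "__main__":
    load = 1.0  # hypothetical multicast load in Mbps
    for rate in (1, 6, 12, 24):
        share = airtime_fraction(load, rate)
        print(f"basic rate {rate:>2} Mbps: {share:.0%} of airtime")
```

With a 1 Mbps basic rate the whole channel is consumed by that single stream, while at 12 Mbps the same stream takes roughly a twelfth of the airtime. Every AP that forwards the stream pays this cost on its own channel.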

I've never seen this issue myself because I haven't set up any kind of IPTV or the like, but I remember some discussing it on threads involving igmpproxy

Unless you specifically go out of your way to bridge multicast traffic onto your LAN though, I don't think LEDE will suffer from this multicast problem, which is why I suggested to just drop in a LEDE device in place of the mikrotik device and see how it goes for a few minutes (wifi should fall over basically right away if multicast is your problem). LOTS of people are using LEDE devices on 300mbps connections with success. I would recommend for a deployment like this, long term, to use an x86 solution in order to get robust handling of that kind of bandwidth, but you can do fine with an archer c7 or something like that for testing

or

for examples of how to get an x86 router solution working.

Another issue that is sometimes seen is limitations in the ISP provided equipment, particularly the number of total connections, we debugged that issue in this thread:

The fiber connections for public institutions do not include TV services, only Internet access. So there should be no multicast traffic, but the fact is that the Wi-Fi access points become saturated and stop providing service.

I do not know either. As I said, that configuration was done by another technician. I would like to know if someone in the forum knows what this configuration does to make the network work perfectly.

At the moment I already have a test router with LEDE in a large high school, with two modifications: I enabled "Drop invalid packets", and I changed the IGMP rule from "accept" to "drop". No other modifications, no packages installed, nothing. It has been running for less than a week, but it seems to work. Before that, the Wi-Fi network went down within a day or two.

Until I made those two modifications, the Wi-Fi network kept going down. Now I need to wait at least a month to see the result.

I think this clearly defines the problem. And I need to stop it, knowing what I'm doing, of course. The fact is that it only affects the access points; it does not affect the wired network. So... how can I block all multicast traffic to WiFi with LEDE from the LuCI GUI?

TIA

It depends on where the multicast traffic is coming from. If it's coming from WAN, then it's blocked already. But if it's coming from some schoolkid starting a LAN game on his laptop with his friends, then you've got a different problem :wink:

If you do in fact have a multicast issue, then multicast originating in your LAN is probably the culprit, as by default you won't get WAN multicast bridged onto your LAN. This also explains why it only happens after a certain amount of time, like a month or so, that's the interval between when kids try to play games with their friends :wink:

If you do have a multicast problem originating in your LAN then you probably need to block it in the APs or switches.

Do you have any use for multicast at all? Remember that things like mDNS discovery of printers and so forth are often multicast based.

Most of what I see that script doing is adjusting quality of service for bandwidth usage so that you can't saturate the connection. Perhaps you blocked multicast and it solved the first problem, but now every month or so your network falls over for a different reason, such as the torrents idea?
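For reference, the guaranteed rates (limit-at) in the posted queue tree can simply be added up and compared with the parent's max-limit. This is a rough sketch under stated assumptions: that the RouterOS suffixes mean k = 1,000 and M = 1,000,000 bits per second, that a queue without limit-at guarantees nothing, and that children's guarantees are meant to fit inside the parent's ceiling. The helper names are made up for illustration.

```python
def parse_rate(text):
    """Convert a rate such as '170M' or '512k' to bits per second."""
    units = {"k": 1_000, "M": 1_000_000}  # assumption about RouterOS suffixes
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def guaranteed_total(limit_at_by_queue):
    """Sum the limit-at values of all child queues, in bits per second."""
    return sum(parse_rate(value) for value in limit_at_by_queue.values())


# Values copied from the posted /queue tree; "PRI 4" has no limit-at.
DOWNLOAD_PARENT_MAX = parse_rate("170M")
DOWNLOAD_LIMIT_AT = {
    "PRI 1": "3M", "PRI 2": "5M", "PRI 3": "100M", "PRI 4": "0",
    "PRI 5": "30M", "PRI 6": "30M", "PRI 7": "5M", "PRI 8": "512k",
}
UPLOAD_PARENT_MAX = parse_rate("10M")
UPLOAD_LIMIT_AT = {"PRI 3 UPLOAD": "20M"}

if __name__ == "__main__":
    for name, limits, ceiling in [
        ("download", DOWNLOAD_LIMIT_AT, DOWNLOAD_PARENT_MAX),
        ("upload", UPLOAD_LIMIT_AT, UPLOAD_PARENT_MAX),
    ]:
        total = guaranteed_total(limits)
        print(name, total, ceiling, "fits" if total <= ceiling else "oversubscribed")
```

As posted, the download children guarantee 173,512,000 bit/s under a 170M parent, and the upload child guarantees 20M under a 10M parent, so the limit-at values cannot all be honoured at once. Whether that has anything to do with the fix is not something this thread establishes.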

EDIT: I suggest setting up a PC ready to go, connected to a managed switch with a port that is set up to mirror traffic to the PC for debugging. Next time your wifi falls over, start Wireshark on this PC and capture 10 seconds of traffic. You'll find out a lot about what the traffic is and will be able to figure out how best to stop it. Even a Raspberry Pi would most likely work here, but an older x86 PC from the salvage pile would be fine too.

Another option is to try raising the minimum transmission rate on the APs (depending on their firmware this may or may not be an option you have). Since you have 25 access points around your building, I'm guessing the clients are pretty close to an AP at all times.

If you raise the minimum rate to say 12 or 24 Mbps then the multicast traffic will use this data rate, and you'll have plenty of breathing room, even if a megabit or two of multicast LAN games are going on. You'll also generally get better performance in your wifi network.

The downside is you'll limit the range over which people can connect to the wifi. As long as there are 25 access points distributed in an appropriate manner, with appropriate choice of channels, this is probably not a big problem.

Here's an article about this issue I just googled up:

I think this is the problem, but it only happens with the fiber; it did not happen with the ADSL.

Keep in mind that these are the same access points and the same switches as before. Anyway, I have updated the firmware on all of them, because many manufacturers have since implemented IGMP in most of them. So there is nothing more I can do, short of spending a lot of money to replace access points and switches.
And at this time that is not necessary, since the network is working with the configuration the Mikrotik currently has. It's just that I want to achieve the same with LEDE.

I don't think so. People come here to study. :rofl:

I've already done many of the things you propose but, as I say, I'm not worried because everything works for now. It's just that I prefer LEDE to Mikrotik and I want to find the "trick".

Thanks

Nobody?
Ok. At least... how can I block all multicast traffic to WiFi on LEDE?
TIA

https://www.iana.org/assignments/multicast-addresses/multicast-addresses.xhtml

https://www.iana.org/assignments/ipv6-multicast-addresses/ipv6-multicast-addresses.xhtml

You shouldn't block the multicast/broadcast associated with DHCP (assuming you're using it) or the many link-local IPv6 addresses that are required for IPv6 nodes https://tools.ietf.org/html/rfc6434

You can make your own decisions about broadcast services on a case-by-case (typically port-by-port) basis, if you think there is much there.

Looking at the traffic with Wireshark or a similar tool is highly recommended to understand what, specifically, is causing the problem.
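When reading a capture, it helps to sort destination addresses into the groups mentioned above before deciding what to block. This is a small sketch using Python's standard ipaddress module; the category labels and the classify function are made up for illustration, and the two networks are the IPv4 local network control block (224.0.0.0/24) and the IPv6 link-local multicast scope (ff02::/16).

```python
import ipaddress

IPV4_LOCAL_CONTROL = ipaddress.ip_network("224.0.0.0/24")
IPV6_LINK_LOCAL_MULTICAST = ipaddress.ip_network("ff02::/16")


def classify(address):
    """Label a destination address by multicast category (labels are illustrative)."""
    ip = ipaddress.ip_address(address)
    if not ip.is_multicast:  # 224.0.0.0/4 for IPv4, ff00::/8 for IPv6
        return "not multicast"
    if ip.version == 4:
        if ip in IPV4_LOCAL_CONTROL:
            return "ipv4 local control"
        return "ipv4 multicast"
    if ip in IPV6_LINK_LOCAL_MULTICAST:
        return "ipv6 link-local multicast"
    return "ipv6 multicast"


if __name__ == "__main__":
    for addr in ("224.0.0.251", "239.255.255.250", "ff02::1",
                 "255.255.255.255", "192.168.1.10"):
        print(addr, "->", classify(addr))
```

Note that a rule covering only 224.0.0.0 to 239.255.255.255 says nothing about IPv6 multicast (ff00::/8) or the IPv4 broadcast address, which are separate cases.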

Hi @jeff
The first time, I blocked 224.0.0.0 to 239.255.255.255, and the wifi network stayed up for about a week, but finally went down. So the traffic control from the other technician does the trick, but I don't know why.
Is it possible that multicast over IPv6 is the cause?
Anyway, do you know how to write a firewall rule that stops all multicast traffic? I need to test this first.

Thanks in Advance

Can you describe your network topology a bit?

Lede should not let multicast move from WAN side (isp side) to LAN side by default. The default config already does this. If your network fell down with Lede router, it probably isn't because of multicast wan traffic.

If you have an issue with multicast or broadcast on your LAN then the source is from the LAN and probably won't even hit your router so please provide info on your network topology to help there.

If traffic control is really the fix then probably unicast torrent traffic or the like is the real issue.

If you can make the LAN fall over, and capture packets, you will get much farther.


The network of the first high school still has the Mikrotik router, and everything works. The second high school has a LEDE router under test at the moment, and yes, it went down until I made the two modifications explained before. I'm still waiting to see whether it goes down again, so I must wait a bit longer before calling it a positive result.
On the other hand, I think, like you, that the multicast traffic comes from the LAN, not from the WAN. I think the ISP provides IPTV or some other kind of multicast over the fiber, just in case, so that the customer can have this kind of service in the future even if it has not been contracted. In a situation like this any kid with VLC can do it.

First Highschool:
ISP router ---> Mikrotik router ---> Main switch ---> Other switches ---> All APs
Second Highschool:
ISP router ---> Lede router ---> Main switch ---> Other switches ---> All APs

It is not easy to propose this in the first highschool, since everything is working, but I could do it in the second if my modifications do not work in the near future.

TIA

Looking back I see that you put an explicit firewall rule to block ipv4 multicast range, but you didn't explain that explicitly, was that a forwarding rule? An input rule? Also, what was the second thing you did on the LEDE box (not the mikrotik)

Does your internal LAN network use public IPs? or NAT?

LEDE already has WAN forwarding set to reject. So no packet coming from WAN that isn't destined for a device on your LAN that explicitly initiated an outbound connection should pass. I can't see how multicast traffic coming across your router from WAN to LAN on the network with LEDE box can possibly be the cause of falling over.

Please explain what explicitly your firewall rule was, where did you put it? etc Perhaps provide a screenshot from Luci or something.

Now, as for your topology. If a student connects to an AP and starts up say VLC to stream video to his friend, or starts up some twitch game that spams realtime game data as multicast... and several friends get on together, so that say 1Mbps of multicast data is going from one laptop to another... then yes things could fall over from multicast, because the multicast packets will hit all the APs and take up all the bandwidth...

BUT, given your network topology, there is no way to prevent this with any settings in either the LEDE or Mikrotik routers, because this traffic will never hit those routers; it will all be switched between APs by either the "other switches" or the "main switch".

So, if the Mikrotik QoS settings are the real fix then the problem isn't multicast traffic on the LAN

In other words, I think in reality perhaps at one point you had a multicast problem on the Mikrotik based network, but you also have some other kinds of problems. The QoS settings in the Mikrotik are probably helping with these other problems that you aren't even really aware of.

Having the LAN fall over and then wiresharking the results will get you the information you need to address whatever your full problem is. without that, I think it's just speculation at this point. LEDE shouldn't be passing multicast from WAN to LAN so if the LEDE network is falling over, then you have additional issues, and need the packet capture to determine what they are.

[Screenshot: LuCI firewall settings, "Drop invalid packets" enabled]

NAT. There is only one dynamic public IP on WAN side in ISP router.

Don't you think it's frustrating not to know why things happen? So... is there any way to move that configuration to a LEDE router? That was my original request.

Thanks

There is no way to simply move that config to a LEDE router, like some kind of automated translation. I thought we'd answered that so sorry if it was not explicit.

Can the LEDE router do the same stuff? Yes, but it would require writing a custom configuration in qos-script that does the same thing, and we can't automate that, so it really requires someone who knows how the mikrotik control language works, and can manually translate it. That's not me unfortunately.

Also, I find it surprising that your firewall rule discarding igmp input to the router from WAN is doing anything at all. igmp basically says "hey I'd like to subscribe to this multicast stream" and if it's coming in on WAN and going to your router... your router is not going to just start manufacturing multicast packets all of a sudden. So unless there's a piece of software running on your router that is generating multicast packets and sending them out the WAN... this rule is not what is keeping things working. And besides it would only affect the WAN side, not the LAN side. So if the problem is that you've got multicast flooding on your LAN, it just doesn't seem like it can possibly be fixed by this rule.

So again I think you need to set up a machine that is all set to wireshark things for when your network falls over. Then you will know what to do, and we can help you better.

Another suggestion: it's very possible for someone to screw up your network by plugging cables into switches in a way that produces a loop. If you send a broadcast packet (like even a DHCP discover) and there's a loop in your network, then that packet will loop through the switches amplifying itself exponentially until all the packets on your network are broadcast packets.

Smart switches have "storm control" and "spanning tree protocol" which keep those things from happening, and turning those on in your switches might eliminate the problem if it's caused by a network loop.


Thanks for everything, @dlakelan. I'm lost: in the second high school it's working for now, and I've only changed those two options. Is it possible that the "Drop invalid packets" option does the trick? That option is in the Mikrotik configuration too. Does this option prevent the network from going down in some way?
Once more... thanks for your interest.