OpenWrt Forum Archive

Topic: Bridging two VLANs kills directed IP packets

The content of this topic has been archived on 5 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

I set up two VLANs on ports 3 and 4 of a TL-WR1043ND (Backfire RC-4) as eth0.3 and eth0.4.
I bridge the two in an attempt to form one unified LAN.

The reason I do this is that I need the OpenWrt router to act as a switch that allows packet sniffing between a PC and a server: eth0.3 and eth0.4 should be inserted into what used to be a direct Ethernet connection and behave totally transparently, while tcpdump dumps packets to an attached USB storage device.

I use two separate VLANs instead of just one because a learning switch would not pass all data to port 5 of the router; it would switch most frames directly once it has learned the attached devices' MAC addresses.
Creating two separate VLANs should force all packets up through the software switch.

What I see when comparing Wireshark logs from an attached PC with the tcpdump from the OpenWrt bridge is that most packets flow as they should.
Broadcast ARP requests come through to the PC from the LAN, and so do SMB packets, VRRP packets, STP packets, etc.

However, some packets (namely directed ARP requests, DHCP responses, etc.) arrive at OpenWrt's bridge but are not passed on to the attached PC.
I can see them in the tcpdump-file, but they're not arriving at the PC.

Apparently, the router filters them out.
I disabled the firewall ("/etc/init.d ... stop"), but that doesn't improve the situation.


Can anyone confirm that this is a bug or point me to a mistake I may have made?

The /etc/config/network file is this:

config 'interface' 'loopback'
    option 'ifname' 'lo'
    option 'proto' 'static'
    option 'ipaddr' '127.0.0.1'
    option 'netmask' '255.0.0.0'

config 'interface' 'lan'
    option 'ifname' 'eth0.1'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.1.1'
    option 'netmask' '255.255.255.0'

config 'interface' 'wan'
    option 'ifname' 'eth0.2'
    option 'proto' 'dhcp'

config 'switch'
    option 'name' 'rtl8366rb'
    option 'reset' '1'
    option 'enable_vlan' '1'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '1'
    option 'ports' '3 4 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '2'
    option 'ports' '0 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '3'
    option 'ports' '1 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '4'
    option 'ports' '2 5t'

config 'interface' 'sniffer'
    option 'type' 'bridge'
    option 'ifname' 'eth0.3 eth0.4'
    option 'defaultroute' '0'
    option 'proto' 'none'
    option 'peerdns' '0'

The simplest way to find the deviation is to compare the PC's Wireshark log with the router's tcpdump.

The PC, for example, sends a DHCP Request, followed by a DHCP Discover. Those packets, by nature, are broadcast packets from 0.0.0.0 to 255.255.255.255.
The router does receive them; they are contained in the tcpdump.
The router then, 40-41 milliseconds later, receives DHCP offers from four servers. Those packets, by nature, are directed at the newly assigned IP. They are certainly all addressed to the same IP.
The PC does not receive any of them.
The PC's Wireshark log only shows the Request/Discovers, but not the responses that the Router still saw.
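For anyone who wants to repeat this comparison, here is a minimal sketch of how the two captures could be diffed on a PC. It assumes both captures have been converted to one-line text summaries first (for example with `tcpdump -n -t -r FILE`, which reads a capture file and prints packets without name resolution or timestamps). The function name `missing_at_pc` and the file names are hypothetical, and the comparison is deliberately crude: a packet that occurs twice on the router but once on the PC would not be flagged.

```shell
#!/bin/sh
# missing_at_pc ROUTER_TXT PC_TXT
# Print every packet summary line that appears in the router's list
# but has no identical line in the PC's list.
#   -F  treat the PC's lines as fixed strings, not regular expressions
#   -x  only whole-line matches count
#   -v  invert: keep the router lines that did NOT match
#   -f  read the lines to match against from PC_TXT
missing_at_pc() {
    grep -F -x -v -f "$2" "$1"
}

# Possible usage (file names are placeholders):
#   tcpdump -n -t -r router.pcap > router.txt
#   tcpdump -n -t -r pc.pcap     > pc.txt
#   missing_at_pc router.txt pc.txt
```

With the captures described here, the remaining lines would be expected to be the directed packets (DHCP offers, ARP replies) that the router saw but the PC did not.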

That means the bridge ("br-sniffer") and the corresponding interfaces eth0.3 and eth0.4 have indeed received the Request/Discover and have indeed forwarded the broadcasts as they should. They do receive answers, but choose not to send those to the PC.

Why is that?

I have no clue what I may be doing wrong, or what makes the packets so special, other than them being directed at an IP rather than being broadcasts.

Any idea?
Thanks for any help.

Bump!

Can anyone perhaps confirm either that what I am trying to do is nonsense or that there is a bug?

I even tried to set up a forwarding rule:

iptables -A FORWARD -i br-sniffer -o eth0.3 -j ACCEPT

But even that doesn't change anything.

Any help?

Update:

I found out that the configuration described above is totally correct:

It does work flawlessly on a Broadcom-switch-based device (Asus WL-500GP V1).

So, the result:

The drivers for the rtl8366rb seem to be buggy, and the ticket opened for this is here:

https://dev.openwrt.org/ticket/8701#comment:2

Bump... I'd be very interested to know if anyone has had any luck on this front yet?

I do confirm that the bug is still there.

It's really a shame, because this is the only consumer-class GbE switch widely available.
The fact that the current drivers do not allow it to be used as a bridge limits its use quite a lot.

In the meantime, I had to downgrade to a 100 Mbit/s router, where a lot of different switches are available, and all those I tried did handle the bridge correctly.

But the speed degradation in a SoHo environment is frustrating (essentially I can measure only a quarter of the speed I achieved with the GbE switch).

So a repair of this driver would truly be appreciated.

Hello.
I wish to set up transparent bandwidth monitoring on a wr1043nd.
I suspect I will have to set up 2 VLANs and bridge them to do this.
I'm running Backfire (10.03.1-rc4, r24045).
Is this bridge issue resolved?

This is actually a hardware issue, not a software issue; Realtek as well as Atheros switches are affected by it, Broadcom switches work.
This can be worked around by disabling learning for the switch, but this will kill the actual switching performance (since it then works like a hub - see this ticket).


For some technical background on what the actual problem is:

The ARL Table where the switch stores learnt MACs and their ports uses only the MAC address for indexing on Realtek and Atheros switches. This has the effect that the switch tries to forward a frame to a port that isn't part of the current VLAN (since it learnt that the destination is at that port), notes that the destination isn't part of the current VLAN, and drops the frame.

Broadcom switches do the indexing based on VID and MAC, so that the same MAC can be at different ports for different VIDs at the same time. Therefore the switch forwards the frames correctly within the VLANs and bridging works.
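To make the difference concrete, here is a toy model of the two indexing schemes in plain shell. It is only a sketch of the behaviour described above, not driver code: the function names, the PC's MAC address and the frame sequence are made up for illustration, and the VLAN membership is taken from the config posted in this thread (VLAN 3 = ports 1 and 5, VLAN 4 = ports 2 and 5). It also ignores everything a real switch does beyond learn, look up, and check VLAN membership.

```shell
#!/bin/sh
# Toy address table (ARL): one "key port" entry per line, newest first.
ARL_MODE=mac        # "mac" = keyed by MAC only, "vid_mac" = keyed by VID+MAC
ARL_TABLE=""

# VLAN membership as in the posted /etc/config/network.
vlan_ports() {
    case "$1" in
        3) echo "1 5" ;;
        4) echo "2 5" ;;
    esac
}

# arl_key VID MAC: build the lookup key for the current indexing mode.
arl_key() {
    if [ "$ARL_MODE" = "vid_mac" ]; then echo "$1/$2"; else echo "$2"; fi
}

# learn VID SRC_MAC INGRESS_PORT: remember where a source MAC was last seen.
learn() {
    key=$(arl_key "$1" "$2")
    ARL_TABLE="$key $3
$ARL_TABLE"
}

# forward VID DST_MAC: print the egress port, "flood" if unknown,
# or "drop" if the learnt port is not a member of this VLAN.
forward() {
    key=$(arl_key "$1" "$2")
    port=$(printf '%s\n' "$ARL_TABLE" | awk -v k="$key" '$1 == k { print $2; exit }')
    if [ -z "$port" ]; then echo flood; return; fi
    case " $(vlan_ports "$1") " in
        *" $port "*) echo "$port" ;;
        *) echo drop ;;
    esac
}

# demo MODE: replay a short, hypothetical frame sequence.
demo() {
    ARL_MODE=$1
    ARL_TABLE=""
    pc=00:11:22:33:44:55   # hypothetical MAC of the PC
    learn 3 "$pc" 1        # PC's broadcast enters port 1 on VLAN 3
    learn 4 "$pc" 5        # software bridge re-sends it from CPU port 5 on VLAN 4
    learn 3 "$pc" 1        # PC sends another frame on VLAN 3
    forward 4 "$pc"        # a unicast reply for the PC arrives on VLAN 4
}
```

Running `demo mac` prints `drop`, because the only entry for the PC's MAC points at port 1, which is not in VLAN 4. Running `demo vid_mac` prints `5`, because the VLAN 4 entry still points at the CPU port and the frame reaches the software bridge.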

Can I turn off learning on the fly?

will the following do it?
swconfig dev rtl8366rb set enable_learning 0

Is it possible to disable learning for just 1 vlan?
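One cautious way to find out before changing anything is to ask swconfig which attributes the driver actually exposes. The sketch below assumes that `swconfig dev DEV help` lists the attribute names, and the helper name `switch_has_attr` is hypothetical. Whether `enable_learning` exists on a given build, or can be set for a single VLAN, is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# switch_has_attr DEV ATTR
# Succeed if the help output of the switch device mentions ATTR as a whole word.
switch_has_attr() {
    swconfig dev "$1" help 2>/dev/null | grep -q -w -- "$2"
}

# Possible usage on the router (not run here):
#   if switch_has_attr rtl8366rb enable_learning; then
#       swconfig dev rtl8366rb set enable_learning 0
#   else
#       echo "enable_learning is not offered by this driver"
#   fi
```

Comparing the global section of the help output with the per-VLAN section should also show whether the attribute is offered per VLAN at all.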

(Last edited by gbung on 11 Aug 2011, 17:52)

Can someone confirm this is what I'm supposed to do?
I've set up 2 VLANs on 2 ports and bridged them together. ARP requests get through, but ARP replies still don't make it back.

KanjiMonster wrote:

This is actually a hardware issue, not a software issue; Realtek as well as Atheros switches are affected by it, Broadcom switches work.

[...]

The ARL Table where the switch stores learnt MACs and their ports uses only the MAC address for indexing on Realtek and Atheros switches. This has the effect that the switch tries to forward a frame to a port that isn't part of the current VLAN (since it learnt that the destination is at that port), notes that the destination isn't part of the current VLAN, and drops the frame.

Broadcom switches do the indexing based on VID and MAC, so that the same MAC can be at different ports for different VIDs at the same time. Therefore the switch forwards the frames correctly within the VLANs and bridging works.

Thanks for the excellent explanation.

This, in essence, means that hardware VLAN bridging is not supported by the Realtek switch.
Disabling learning is one way of bypassing the problem; using a physical wire (assuming the router has enough ports) is another simple workaround.

Question
Are you aware of any way of bridging the VLANs together on a higher SW level? Routed through the CPU?

The discussion might have continued from here.