How Should WAN and LAN be Bridged for Transparent SQM?

I'm trying to bridge my router to the upstream router, so that both routers share the same subnet and see all the devices on the network. The downstream router will also serve as a wireless access point.

According to this thread, it is possible to bridge LAN and WAN on an OpenWrt/LEDE box and perform SQM transparently on LEDE without changing the architecture of the network. That is desirable in my scenario, because the heavy lifting, such as traffic accounting and user management, would be done on the upstream device.

However, I don't understand how I should configure such a LEDE bridge. As @silentcreek pointed out:

I basically followed these instructions: https://wiki.openwrt.org/doc/recipes/dumbap
However, there is one step in these instructions that I didn’t follow: For devices that don’t have a real dedicated WAN port but only a switch port that is configured as WAN in the switch/VLAN configuration, it recommends to put all the switch ports in one VLAN. You shouldn’t do that because then you don’t have a separate WAN interface anymore on which you can perform SQM. So, simply bridge the two interfaces and you should be good.

That part is easy to understand: if you remove the WAN VLAN and move the WAN port into the LAN VLAN, all traffic flows within a single network on the same bridge, and since Cake SQM needs a separate WAN-side interface to attach to, we should not do that.

But I still don't understand how you are supposed to bridge the distinct WAN and LAN VLANs together while still preserving a WAN interface for SQM/Cake to work on. In a default LEDE installation there are two interfaces: br-lan, which bridges the wireless network and the LAN VLAN together, and wan, which sits on the underlying eth0.2. I tried bridging wireless, eth0.1 and eth0.2 together for the LAN interface and setting the IP address of the bridge interface to 192.168.2.2.

The current configuration is:

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fd29:8bc8:191f::/48'

config interface 'lan'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.2.2'
        option netmask '255.255.255.0'
        option ifname 'eth0.1 eth0.2'

config switch
        option name 'switch0'
        option reset '1'
        option enable_vlan '1'

config switch_vlan
        option device 'switch0'
        option vlan '1'
        option vid '1'
        option ports '0t 3 4 5'  # Main LAN, VLAN 1

config switch_vlan
        option device 'switch0'
        option vlan '3'
        option vid '3'
        # A dedicated Ethernet port for management
        option ports '0t 2'

config interface 'dmz'
        # A dedicated Ethernet port for troubleshooting and management
        option proto 'static'
        option ifname 'eth0.3'
        option ipaddr '192.168.3.1'
        option netmask '255.255.255.0'

config switch_vlan
        option device 'switch0'
        option vlan '4'  # it has been deleted and recreated, thus "4".
        option ports '0t 1'  # WAN, VLAN 2
        option vid '2'

config interface 'wan'
        option ifname 'eth0.2'
        option proto 'none'
        option auto '1'
        option force_link '1'

Accessing LuCI at 192.168.2.2 over Ethernet works, but the upstream router at 192.168.2.1 has become unreachable.

$ ping 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
From 192.168.2.101 icmp_seq=1 Destination Host Unreachable
From 192.168.2.101 icmp_seq=2 Destination Host Unreachable

What's wrong? Even stranger, it works perfectly over Wi-Fi but not over Ethernet. I captured the traffic on the WAN port, and it seems my computer keeps sending ARP requests for the MAC address of 192.168.2.1 but never receives a reply. Over a Wi-Fi connection, all traffic flows to the upstream router without a problem.

So, simply bridge the two interfaces and you should be good.

How? @silentcreek, could you elaborate on the configuration and post your /etc/config/network? Thanks.

If you tie all the Ethernet ports together in the hardware switch, software never sees the port to port traffic. This is often what you want, since even very heavy use will not burden the CPU at all.

But it is not what you want here.

If instead you tie the port going up to the main router to the other four ports through a kernel bridge, software such as SQM can see and control the traffic.

In other words, keep the eth0.1 / eth0.2 paradigm, and have the kernel link them.

I understand the point you explained here. The problem now is that the bridge is not working as intended. I have updated the post with the current configuration file and a more detailed description of the problem.

The problem is that the bridge does not work over an Ethernet port in VLAN 1: the upstream router at 192.168.2.1 is not reachable. However, the same configuration works for wireless connections; on Wi-Fi everything is reachable. What’s wrong?

There should not be a wan network at all in this unit. It's not going to route anything. It will take everything wired or wifi into br-lan and bridge it out to the main router. There will be only one network, lan. You can add a separate network e.g. "admin" and connect it to a different VLAN if you want. That network would only be to log into the router; it will not bridge or route out.

You need to tell this router that the main router 192.168.2.1 is its gateway to the Internet and DNS server, with option gateway and option dns respectively in the lan group. This is necessary for internal connections from OpenWrt to the Internet such as setting the clock with NTP, running a VPN server on the router, or downloading packages. It is not necessary for the users' connections to work.
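
A minimal sketch of what that could look like in /etc/config/network, assuming the addresses from the question (this unit at 192.168.2.2, the main router at 192.168.2.1) and the same `ifname`-style syntax used in the configs in this thread. The interface names are taken from the question and may differ on your device:

```
config interface 'lan'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.2.2'
        option netmask '255.255.255.0'
        # Main router: default gateway and DNS for the OpenWrt box itself
        option gateway '192.168.2.1'
        option dns '192.168.2.1'
        # Assumed: eth0.1 = LAN ports, eth0.2 = port facing the main router
        option ifname 'eth0.1 eth0.2'
```

Wireless interfaces join the same bridge through `option network 'lan'` in /etc/config/wireless, so they do not need to be listed here.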

Also make sure you have turned off both the IPv4 and IPv6 DHCP servers. The main router will issue all DHCP.
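
As a rough sketch, the DHCP servers could be turned off in /etc/config/dhcp along these lines. This assumes the default section name 'lan'; check your own file before copying anything:

```
config dhcp 'lan'
        option interface 'lan'
        # Do not serve IPv4 DHCP on this interface
        option ignore '1'
        # Do not serve DHCPv6 or send router advertisements
        option dhcpv6 'disabled'
        option ra 'disabled'
```

The same can usually be done from LuCI in the DHCP settings of the lan interface.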

All of the above is the standard configuration of what is known as a "dumb AP" or "bridged AP", apart from the additional detail that you're making sure your link to the main router is bridged by software, not by hardware.

After that is all working, you would apply SQM to eth0.2 so that the traffic to and from the Internet is controlled.
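
For illustration only, an /etc/config/sqm entry for that might look roughly like the following. The section name and the rates are placeholders I made up, and it assumes the sqm-scripts package with cake support is installed and that eth0.2 is the port facing the main router:

```
config queue 'bridge_uplink'
        option enabled '1'
        # Interface facing the main router (assumed to be eth0.2)
        option interface 'eth0.2'
        # Placeholder rates in kbit/s; set them somewhat below your measured speeds
        option download '50000'
        option upload '10000'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option linklayer 'none'
```

Because eth0.2 faces upstream, traffic leaving it is upload and traffic arriving on it is download, which is the direction SQM expects for a WAN-side interface.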

@biergaizi

I don't have this setup in operation anymore (the main reason being that my internet connection basically pushed the CPU on my TP-Link Archer C7 v2 to its limits when doing SQM and other things), but I looked into my backups to retrieve an older config from back when I was still using it this way.

So, the /etc/config/network looked like this:

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'

config interface 'lan'
	option type 'bridge'
	option proto 'static'
	option ipaddr '192.168.200.10'
	option netmask '255.255.255.255'
	option gateway '192.168.200.5'
	option dns '192.168.200.5'
	option _orig_ifname 'eth1 radio0.network1 radio1.network1'
	option _orig_bridge 'true'
	option ifname 'eth0 eth1'
	option igmp_snooping '0'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '2 3 4 5 0'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 6'

config interface 'lan6'
	option proto 'dhcpv6'
	option ifname '@lan'
	option reqaddress 'try'
	option reqprefix 'no'
	option peerdns '0'

I think you can ignore the options _orig_ifname and _orig_bridge. That is just "a LuCI thing". The important part is including eth0 and eth1 in the option ifname, i.e. bridging them, and, of course, the switch configuration. Also note that the Archer C7 v2's switch configuration translates to this: switch port 1 is the physical "WAN" port on the device; port 6 is a CPU port, which is the logical interface eth0; port 0 is also a CPU port, which is the logical interface eth1; all other ports are the physical "LAN" ports.

The corresponding /etc/config/sqm was this:

config queue 'eth0'
	option qdisc 'fq_codel'
	option script 'simple.qos'
	option qdisc_advanced '0'
	option ingress_ecn 'ECN'
	option egress_ecn 'ECN'
	option qdisc_really_really_advanced '0'
	option itarget 'auto'
	option etarget 'auto'
	option linklayer 'none'
	option interface 'eth0'
	option download '142500'
	option upload '9500'
	option enabled '1'
	option verbosity '2'

And the /etc/config/wireless was this (shortened and redacted):

config wifi-device 'radio0'
    option type 'mac80211'
    option hwmode '11a'
    option path 'pci0000:01/0000:01:00.0'
    option htmode 'VHT80'

config wifi-iface
	option device 'radio0'
	option mode 'ap'
	option ssid '<SOMENETWORK>'
	option encryption 'psk2+ccmp'
	option network 'lan'
	option key '<SOMEKEY>'

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'platform/qca955x_wmac'
	option htmode 'HT20'

config wifi-iface
	option device 'radio1'
	option mode 'ap'
	option ssid '<SOMENETWORK>'
	option encryption 'psk2+ccmp'
	option network 'lan'
	option key '<SOMEKEY>'

So, the wifi networks were also on the same "lan" bridge. I hope this is somewhat helpful.