Sorry, but I don't really see why the gateway config would cause VLAN leakage... so I tested what you suggested:
config interface 'lan'
	option proto 'static'
	option device 'sw0.1'
	option ipaddr '10.0.0.2'
	option netmask '255.255.255.0'
	option defaultroute '0'

config interface 'guest'
	option proto 'static'
	option device 'sw0.2'
	option ipaddr '20.0.0.2'
	option netmask '255.255.255.0'
	option defaultroute '0'
root@r2:~# ip route
10.0.0.0/24 dev sw0.1 scope link src 10.0.0.2
20.0.0.0/24 dev sw0.2 scope link src 20.0.0.2
192.168.1.0/24 dev eth4 scope link src 192.168.1.134
root@r2:~# ping -c2 -I 20.0.0.2 10.0.0.1
PING 10.0.0.1 (10.0.0.1) from 20.0.0.2: 56 data bytes
64 bytes from 10.0.0.1: seq=0 ttl=64 time=0.276 ms
64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.203 ms
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.203/0.239/0.276 ms
... and exactly the same thing happens.
As I said, I don't see how the gateway is related to the problem, because I think (I could be wrong):
The default gateway (probably better called the default router) is a last-resort direction: if the host doesn't know where to send a packet, it sends it to the default gateway. That makes it an L3 concept (it is the IP address of a host), whereas VLANs live at L2.
Basically I have two broadcast domains extended across two physical devices, which are connected by a direct cable.
With the simple routing table (after removing the gateway IP as you suggested) the traffic flow is slightly different:
root@r2:~# tcpdump -e -i eth3 arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:13:27.451693 xx:xx:xx:xx:6a:80 (oui Unknown) > xx:xx:xx:xx:8d:e7 (oui Unknown), ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 28
12:13:27.451942 xx:xx:xx:xx:8d:e7 (oui Unknown) > xx:xx:xx:xx:6a:80 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Reply 10.0.0.1 is-at xx:xx:xx:xx:8d:e7 (oui Unknown), length 46
12:13:27.531511 xx:xx:xx:xx:8d:e7 (oui Unknown) > xx:xx:xx:xx:6a:80 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 2, p 0, ethertype ARP, Request who-has 20.0.0.2 tell 20.0.0.1, length 46
12:13:27.531567 xx:xx:xx:xx:6a:80 (oui Unknown) > xx:xx:xx:xx:8d:e7 (oui Unknown), ethertype 802.1Q (0x8100), length 46: vlan 2, p 0, ethertype ARP, Reply 20.0.0.2 is-at xx:xx:xx:xx:6a:80 (oui Unknown), length 28
It looks like the problem is due more to the default config, i.e. reverse path filtering being disabled by default. When I switch it to 'strict':
root@r2:~# ping -c2 -I 20.0.0.2 10.0.0.1
PING 10.0.0.1 (10.0.0.1) from 20.0.0.2: 56 data bytes
^C
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
root@r2:~# tcpdump -i eth3 -e -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:29:30.896774 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 20.0.0.2 > 10.0.0.1: ICMP echo request, id 8266, seq 0, length 64
12:29:31.899308 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 20.0.0.2 > 10.0.0.1: ICMP echo request, id 8266, seq 1, length 64
12:29:35.900433 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 28
12:29:35.900652 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Reply 10.0.0.1 is-at xx:xx:xx:xx:8d:e7, length 46
root@r1:~# tcpdump -e -i eth3 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:29:30.896918 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 20.0.0.2 > 10.0.0.1: ICMP echo request, id 8266, seq 0, length 64
12:29:31.899397 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 20.0.0.2 > 10.0.0.1: ICMP echo request, id 8266, seq 1, length 64
12:29:35.900536 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 46
12:29:35.900578 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Reply 10.0.0.1 is-at xx:xx:xx:xx:8d:e7, length 28
As expected, a ping sourced from the VLAN2 address (20.0.0.2) no longer gets a reply from 10.0.0.1 in VLAN1: the requests still show up on r1 tagged as VLAN 1, but they are not answered.
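For anyone who wants to reproduce this, here is a minimal sketch of how the reverse path filter setting can be inspected. The helper names (rp_mode_name, effective_rp_filter, report_rp_filter) are made up for illustration. The 0/1/2 meanings and the rule that the higher of the "all" and per-interface values wins come from the kernel's ip-sysctl documentation. It assumes the usual Linux /proc/sys layout and a POSIX shell; since the filter is an ingress check, it presumably matters on whichever router receives the packets (r1 in the test above).

```shell
#!/bin/sh
# Map the numeric rp_filter value to its meaning.
rp_mode_name() {
    case "$1" in
        0) echo "off" ;;
        1) echo "strict" ;;
        2) echo "loose" ;;
        *) echo "unknown" ;;
    esac
}

# The kernel uses the higher of conf/all and conf/<iface> for source validation.
# $1 = value of conf/all, $2 = value of conf/<iface>
effective_rp_filter() {
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Print the effective mode for every interface that exposes the setting.
report_rp_filter() {
    base=/proc/sys/net/ipv4/conf
    [ -r "$base/all/rp_filter" ] || return 0
    all_val=$(cat "$base/all/rp_filter")
    for f in "$base"/*/rp_filter; do
        dev=${f%/rp_filter}
        dev=${dev##*/}
        val=$(effective_rp_filter "$all_val" "$(cat "$f")")
        echo "$dev: $(rp_mode_name "$val")"
    done
}

report_rp_filter
```

Strict mode can then be switched on at runtime with `sysctl -w net.ipv4.conf.all.rp_filter=1` (needs root, and the change is lost on reboot unless it is also made persistent, e.g. in /etc/sysctl.conf).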
And of course it works as expected within VLAN1:
root@r2:~# ping -c2 -I 10.0.0.2 10.0.0.1
PING 10.0.0.1 (10.0.0.1) from 10.0.0.2: 56 data bytes
64 bytes from 10.0.0.1: seq=0 ttl=64 time=0.320 ms
64 bytes from 10.0.0.1: seq=1 ttl=64 time=0.412 ms
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.320/0.366/0.412 ms
root@r2:~# tcpdump -i eth3 -e -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:33:35.557237 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.2 > 10.0.0.1: ICMP echo request, id 8766, seq 0, length 64
12:33:35.557484 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.1 > 10.0.0.2: ICMP echo reply, id 8766, seq 0, length 64
12:33:36.558264 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.2 > 10.0.0.1: ICMP echo request, id 8766, seq 1, length 64
12:33:36.558576 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.1 > 10.0.0.2: ICMP echo reply, id 8766, seq 1, length 64
12:33:40.634367 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 28
12:33:40.634574 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Reply 10.0.0.1 is-at xx:xx:xx:xx:8d:e7, length 46
12:33:40.717806 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 46
12:33:40.717833 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Reply 10.0.0.2 is-at xx:xx:xx:xx:6a:80, length 28
root@r1:~# tcpdump -e -i eth3 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:33:35.557983 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.2 > 10.0.0.1: ICMP echo request, id 8766, seq 0, length 64
12:33:35.558052 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.1 > 10.0.0.2: ICMP echo reply, id 8766, seq 0, length 64
12:33:36.559071 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.2 > 10.0.0.1: ICMP echo request, id 8766, seq 1, length 64
12:33:36.559131 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.0.0.1 > 10.0.0.2: ICMP echo reply, id 8766, seq 1, length 64
12:33:40.635096 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.1 tell 10.0.0.2, length 46
12:33:40.635134 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Reply 10.0.0.1 is-at xx:xx:xx:xx:8d:e7, length 28
12:33:40.718316 xx:xx:xx:xx:8d:e7 > xx:xx:xx:xx:6a:80, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28
12:33:40.718632 xx:xx:xx:xx:6a:80 > xx:xx:xx:xx:8d:e7, ethertype 802.1Q (0x8100), length 64: vlan 1, p 0, ethertype ARP, Reply 10.0.0.2 is-at xx:xx:xx:xx:6a:80, length 46
So is it possible that the default config is not correct?
Or let me ask it differently:
If I need two (or more) OpenWrt devices, for example because of wifi reception problems, and want to segregate trusted and untrusted end devices while keeping management centralized (i.e. only one device provides WAN, DNS, DHCP etc., and the other OpenWrt devices work as dumb APs/switches), are VLANs a solution, or is it something else? If creating two VLANs for the trusted and untrusted networks and assigning them to different iptables zones is not enough, what else is required to get two fully isolated networks? And if VLANs are not the right approach, then what would be?