MediaTek and VLAN 802.1ad on ethernet

Dear all,
I have a MediaTek-based router (YouHua WR1200JS) and I'm having problems with VLAN 802.1ad on wired connections.
Basically, I set up an 802.1ad VLAN interface on the router on top of eth0.1 (eth0-1_17) and another on top of eth0.2 (eth0-2_17), did the same on my laptop's enp0s25, and tried to ping across these interfaces. The ping did not get through (802.1q, on the other hand, works surprisingly well).
For the original report, see the LibreMesh development mailing list here.

Here is what I tested and how to reproduce it.

I flashed an OpenWrt snapshot, stopped and disabled the firewall, and installed tcpdump-full on both a MediaTek-based YouHua WR1200JS and an Atheros-based TP-Link WDR3600 (to check that everything works in that case).

I created VLAN 802.1ad interfaces on top of the "yellow ports" eth0.1 and on the "blue port" eth0.2:

openwrt# ip link add link eth0.1 name eth0-1_17 type vlan proto 802.1ad id 17; ip link set eth0-1_17 up; ip address add 10.2.1.1/24 dev eth0-1_17
openwrt# ip link add link eth0.2 name eth0-2_17 type vlan proto 802.1ad id 17; ip link set eth0-2_17 up; ip address add 10.3.1.1/24 dev eth0-2_17

And on the laptop:

laptop# ip link add link enp0s25 name enp0s25.17 type vlan proto 802.1ad id 17; ip link set enp0s25.17 up; ip address add 10.2.1.2/24 dev enp0s25.17; ip address add 10.3.1.2/24 dev enp0s25.17

On the MediaTek router, the ping works on neither eth0-1_17 nor eth0-2_17, in either direction.

Below are the first two ping packets sent from the router to the laptop (ping 10.2.1.2), captured with tcpdump on the router and with Wireshark on the laptop. The captures on the various interfaces were taken at the same time, so they show the same two packets:

  • On the router, on eth0-1_17:
0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 45 00   T.uz..._%.~...E.
0010   00 54 46 ef 40 00 40 01 dd b3 0a 02 01 01 0a 02   .TF.@.@.........
0020   01 02 08 00 3a bd f9 06 00 00 e9 04 db 36 00 00   ....:........6..
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00                                             ..

Second packet:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 45 00   T.uz..._%.~...E.
0010   00 54 47 41 40 00 40 01 dd 61 0a 02 01 01 0a 02   .TGA@.@..a......
0020   01 02 08 00 92 65 f9 06 00 01 82 5b ea 36 00 00   .....e.....[.6..
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00                                             ..

Both the first and the second ping packets are ok.

  • On the router, on eth0.1:
0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 45 00 00 54 46 ef 40 00 40 01 dd b3 0a 02   ..E..TF.@.@.....
0020   01 01 0a 02 01 02 08 00 3a bd f9 06 00 00 e9 04   ........:.......
0030   db 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00   .6..............
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00                                 ......

Second packet:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 45 00 00 54 47 41 40 00 40 01 dd 61 0a 02   ..E..TGA@.@..a..
0020   01 01 0a 02 01 02 08 00 92 65 f9 06 00 01 82 5b   .........e.....[
0030   ea 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00   .6..............
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00                                 ......

Both the first and the second ping are correctly tagged with VLAN 802.1ad.

  • On the router, on eth0:
0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 81 00 00 01   T.uz..._%.~.....
0010   88 a8 00 11 08 00 45 00 00 54 46 ef 40 00 40 01   ......E..TF.@.@.
0020   dd b3 0a 02 01 01 0a 02 01 02 08 00 3a bd f9 06   ............:...
0030   00 00 e9 04 db 36 00 00 00 00 00 00 00 00 00 00   .....6..........
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00 00 00 00 00                     ..........

Second packet:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 81 00 00 01   T.uz..._%.~.....
0010   88 a8 00 11 08 00 45 00 00 54 47 41 40 00 40 01   ......E..TGA@.@.
0020   dd 61 0a 02 01 01 0a 02 01 02 08 00 92 65 f9 06   .a...........e..
0030   00 01 82 5b ea 36 00 00 00 00 00 00 00 00 00 00   ...[.6..........
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00 00 00 00 00                     ..........

Both the first and the second ping are correctly tagged both with VLAN 802.1q and VLAN 802.1ad.

  • On the laptop, on enp0s25:
0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 08 00 3a bd f9 06 00 00 e9 04 dd b3 00 00   ....:...........
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00                                 ......

Right after the 802.1ad tag (offsets 0x0C to 0x0F) and the EtherType (0x10 to 0x11), which are fine, the IP header is broken.

Second packet:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 08 00 92 65 f9 06 00 01 82 5b dd 61 00 00   .....e.....[.a..
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 63 65 04 5f 74 63 70 05 6c 6f 63 61 6c 00   ..ce._tcp.local.
0060   00 10 80 01 00 00                                 ......

From the second ping packet on, in addition to the broken IP header, the end of the packet contains some leaked content, maybe from a cache in the switch?
In this case it is readable; more often it has no string representation.

  • On the laptop, on enp0s25.17:
0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   3a bd f9 06 00 00 e9 04 dd b3 00 00 00 00 00 00   :...............
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00                                             ..

Here the VLAN 802.1ad header has been removed, but the IP header is still broken.

Second packet:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   92 65 f9 06 00 01 82 5b dd 61 00 00 00 00 00 00   .e.....[.a......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 63 65   ..............ce
0050   04 5f 74 63 70 05 6c 6f 63 61 6c 00 00 10 80 01   ._tcp.local.....
0060   00 00                                             ..

And in the second packet we can still see the weird content leaked from somewhere (usually not a string, but just weird hex content).
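The tag stack at each capture point can be read mechanically from the bytes that follow the two MAC addresses. Below is a minimal POSIX shell sketch of that reading. The function name `tag_stack` is only illustrative, it recognizes just the TPIDs 0x8100 (802.1q) and 0x88a8 (802.1ad), and the inputs are the first bytes of the dumps above.

```shell
#!/bin/sh
# tag_stack: print the VLAN tags and the final EtherType of a frame.
# Input: the first bytes of the frame as space-separated hex, as in the dumps.
tag_stack() {
    set -- $1                     # split the hex string into one byte per argument
    shift 12                      # skip destination and source MAC (6 + 6 bytes)
    out=""
    while [ "$#" -ge 4 ]; do
        case "$1$2" in
            8100) proto="802.1q" ;;
            88a8) proto="802.1ad" ;;
            *) break ;;           # not a known TPID: this is the payload EtherType
        esac
        vid=$(( 0x$3$4 & 0x0fff ))    # the low 12 bits of the TCI are the VLAN ID
        out="$out$proto:$vid "
        shift 4
    done
    printf '%s\n' "${out}ethertype:0x$1$2"
}

tag_stack "54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 45 00"                    # router, eth0-1_17
tag_stack "54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11 08 00 45 00"        # router, eth0.1
tag_stack "54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 81 00 00 01 88 a8 00 11 08 00"  # router, eth0
tag_stack "54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11 08 00 08 00"        # laptop, enp0s25
```

Read this way, the 802.1ad tag with ID 17 and the IPv4 EtherType are the same on the router's eth0.1 and on the laptop's enp0s25, so the damage starts only after the EtherType.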


When pinging the router from my laptop (ping 10.2.1.1), I receive broken ping replies.

On the router, the reply being sent looks good when captured on any interface. Here I report only the packet as captured on eth0.1:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 45 00 00 54 9e ec 00 00 40 01 c5 b6 0a 02   ..E..T....@.....
0020   01 01 0a 02 01 02 00 00 56 42 2e 87 00 1f 14 12   ........VB......
0030   4c 5d 00 00 00 00 51 d5 0a 00 00 00 00 00 10 11   L]....Q.........
0040   12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21   .............. !
0050   22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31   "#$%&'()*+,-./01
0060   32 33 34 35 36 37                                 234567

And this is how the same packet reaches my laptop on enp0s25, broken:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 00 00 56 42 2e 87 00 1f 14 12 c5 b6 00 00   ....VB..........
0020   00 00 51 d5 0a 00 00 00 00 00 10 11 12 13 14 15   ..Q.............
0030   16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25   .......... !"#$%
0040   26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35   &'()*+,-./012345
0050   36 37 9e b9 26 c3 af 0a a2 86 d6 da 13 18 e5 13   67..&...........
0060   09 66 fc b6 54 68                                 .f..Th
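As a rough check of what changed in transit, the two dumps of the echo reply above can be compared field by field. The sketch below is only one reading of those bytes: the helper name `bytes` is illustrative, and the offsets assume 12 bytes of MAC addresses, a 4-byte 802.1ad tag, a 2-byte EtherType and a 20-byte IPv4 header without options (first header byte 0x45).

```shell
#!/bin/sh
# bytes HEX OFFSET COUNT: print COUNT bytes of HEX starting at 0-based byte OFFSET.
bytes() {
    printf '%s\n' "$1" | tr -d ' ' | cut -c "$(( $2 * 2 + 1 ))-$(( ($2 + $3) * 2 ))"
}

# First bytes of the echo reply as sent (router, eth0.1) and as received
# (laptop, enp0s25), copied from the two dumps above.
sent="54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11 08 00 45 00 00 54 9e ec 00 00 40 01 c5 b6 0a 02 01 01 0a 02 01 02 00 00 56 42 2e 87 00 1f 14 12 4c 5d 00 00 00 00 51 d5 0a 00"
recv="54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11 08 00 00 00 56 42 2e 87 00 1f 14 12 c5 b6 00 00 00 00 51 d5 0a 00"

echo "sent: IP header checksum   (offset 28): $(bytes "$sent" 28 2)"
echo "sent: ICMP bytes 0-9       (offset 38): $(bytes "$sent" 38 10)"
echo "recv: bytes after EtherType (offset 18): $(bytes "$recv" 18 10)"
echo "sent: ICMP bytes 10-11     (offset 48): $(bytes "$sent" 48 2)"
echo "recv: same position        (offset 28): $(bytes "$recv" 28 2)"
echo "sent: ICMP bytes 12-19     (offset 50): $(bytes "$sent" 50 8)"
echo "recv: same position        (offset 30): $(bytes "$recv" 30 8)"
```

In these two dumps the 20-byte IPv4 header is absent on the laptop side, the ICMP message follows the EtherType directly, and the two bytes at offset 28 of the received frame carry the value of the original IP header checksum. The frame length is unchanged, so the last 20 bytes of the received frame are not part of what the router sent.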

If I connect to the "blue port", the ping does not get through in either direction.

When connected to the blue WAN port and pinging my laptop from the router (ping 10.3.1.2), the packets reach my laptop with the same broken header as on the yellow LAN ports, but the weird content at the end of the packet is not always present.

As captured on my laptop on enp0s25:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ad 88 a8 00 11   T.uz..._%.~.....
0010   08 00 08 00 99 fa 3c 07 00 3e 1b a2 e7 07 00 00   ......<..>......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0060   00 00 00 00 00 00                                 ......

When pinging the router on the WAN port from my laptop (ping 10.3.1.1), I receive broken ping replies on the laptop.

In this case too, the packets captured on the router look good, both the incoming request and the outgoing reply.

As captured on enp0s25 on my laptop:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ad 88 a8 00 11   T.uz..._%.~.....
0010   08 00 00 00 2d 6f 29 61 00 54 b8 0e d0 ab 00 00   ....-o)a.T......
0020   00 00 dc 9c 09 00 00 00 00 00 10 11 12 13 14 15   ................
0030   16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25   .......... !"#$%
0040   26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35   &'()*+,-./012345
0050   36 37 00 00 00 00 00 00 00 00 00 00 00 00 00 00   67..............
0060   00 00 00 00 00 00                                 ......

If I do the same with the TP-Link WDR3600 router, the ping just works.

Any idea of what could be going on?
Can anyone reproduce on other MediaTek routers?
Thanks,
Ilario

To help understand where the leaked content comes from (it seems to come from the router, maybe from a cache in the switch or directly from RAM?), here are a few more packets where the content contains a string:

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 88 a8 00 11   T.uz..._%.~.....
0010   08 00 08 00 28 c7 75 07 00 16 c7 be dd a3 00 00   ....(.u.........
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0050   00 00 72 69 6f 40 69 67 6c 61 70 74 6f 70 09 5f   ..rio@iglaptop._
0060   70 72 65 73 65 6e                                 presen

where ilario@iglaptop is my username and hostname.
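To look at the leaked tail on its own, the trailing bytes can be decoded separately from the rest of the frame. This is a small POSIX shell sketch; the function name `hex_to_ascii` is illustrative, and the input is the last 20 bytes (offsets 0x52 to 0x65) of the dump above.

```shell
#!/bin/sh
# hex_to_ascii: render space-separated hex bytes as text, "." for non-printables.
hex_to_ascii() {
    out=""
    for b in $1; do
        n=$(( 0x$b ))
        if [ "$n" -ge 32 ] && [ "$n" -le 126 ]; then
            # %b expands the \0ooo octal escape into the character itself
            out="$out$(printf '%b' "\\0$(printf '%03o' "$n")")"
        else
            out="$out."
        fi
    done
    printf '%s\n' "$out"
}

# Trailing bytes of the frame, copied from the dump above.
leak="72 69 6f 40 69 67 6c 61 70 74 6f 70 09 5f 70 72 65 73 65 6e"

set -- $leak
echo "leaked bytes: $#"
hex_to_ascii "$leak"
```

The tail is 20 bytes long, which is also the length of an IPv4 header without options.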

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   37 88 a5 06 00 0f 5f 4e 73 6a 00 00 00 00 00 00   7....._Nsj......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 34 2d   ..............4-
0050   73 68 61 32 35 36 2c 64 69 66 66 69 65 2d 68 65   sha256,diffie-he
0060   6c 6c                                             ll

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   bb 1d a5 06 00 10 cc b7 73 60 00 00 00 00 00 00   ........s`......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 63 6f   ..............co
0050   6d 2c 75 6d 61 63 2d 36 34 40 6f 70 65 6e 73 73   m,umac-64@openss
0060   68 2e                                             h.

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   71 b3 a5 06 00 11 06 21 73 34 00 00 00 00 00 00   q......!s4......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 68 2e   ..............h.
0050   63 6f 6d 2c 68 6d 61 63 2d 73 68 61 32 2d 32 35   com,hmac-sha2-25
0060   36 2c                                             6,

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ac 08 00 08 00   T.uz..._%.~.....
0010   54 12 a5 06 00 7a d1 52 5f 6b 00 00 00 00 00 00   T....z.R_k......
0020   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 cb d9   ................
0050   39 bc 60 96 3f d3 64 6e 73 2d 73 64 04 5f 75 64   9.`.?.dns-sd._ud
0060   70 05                                             p.

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ad 08 00 00 00   T.uz..._%.~.....
0010   f4 23 2c 17 00 02 8e 10 7c f9 00 00 00 00 43 82   .#,.....|.....C.
0020   03 00 00 00 00 00 10 11 12 13 14 15 16 17 18 19   ................
0030   1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29   ...... !"#$%&'()
0040   2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 66 f6   *+,-./01234567f.
0050   7c 51 ac c8 55 d9 2c 2d 2e 2f 30 31 32 33 34 35   |Q..U.,-./012345
0060   36 37                                             67

here part of the packet repeats.

0000   54 ee 75 7a c2 1f d4 5f 25 eb 7e ad 08 00 00 00   T.uz..._%.~.....
0010   96 11 2c 17 00 07 93 10 7b f4 00 00 00 00 99 8f   ..,.....{.......
0020   06 00 00 00 00 00 10 11 12 13 14 15 16 17 18 19   ................
0030   1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29   ...... !"#$%&'()
0040   2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 65 73   *+,-./01234567es
0050   65 6e 63 65 04 5f 74 63 70 05 6c 6f 63 61 6c 00   ence._tcp.local.
0060   00 0c                                             ..

0030   aa aa 64 68 63 70 20 31 2e 32 38 2e 34 0c 07 4f   ..dhcp 1.28.4..O
0040   70 65 6e 57 72 74                                 penWrt

I don't have the full content of this last one, but it contains 1.28.4, which is the BusyBox version.

I tested on OpenWrt 19.07 and it still happens. ARP packets (tested with iputils-arping) and batman-adv packets (both multicast and the pings generated with the batctl ping command) get damaged, but the respective tools (arping and batctl ping) do not report them as bad replies.

Have you tried snapshot with DSA driver?

Yes, I compiled master about one week ago and connected two YouHua WR1200JS by cable: the ping does not get through. I still have to try using tcpdump between a YouHua WR1200JS and different hardware (my laptop's ethernet is behaving weirdly, so I'll try with a TP-Link WDR3600). As soon as I find the time (ETA before the end of July) I'll do that and report here. Thanks for the interest!

Today I flashed a YouHua WR1200JS with a snapshot (which has DSA) and a TP-Link WDR3600 (ath79, which does not have DSA). Sorry for the awful testing setup, but my laptop's ethernet port is dead and I still have to buy a USB one.

As I still do not understand how to use the new lan1, lan2... interfaces, I connected the YouHua's WAN port to one of the WDR3600's LAN ports. Then I tried to disable some of the iptables filtering on the WAN port by editing the /etc/config/firewall file.

The most meaningful test was the following:

I created an 802.1ad tagged interface on the WDR3600 LAN:

ip link add link eth0.1 name eth0-1_17 type vlan proto 802.1ad id 17; ip link set eth0-1_17 up; ip address add 10.2.1.1/24 dev eth0-1_17

then I created an 802.1ad tagged interface on the YouHua WAN:

ip link add link wan name wan_17 type vlan proto 802.1ad id 17; ip link set wan_17 up; ip address add 10.2.1.2/24 dev wan_17

at this point I ran the following commands on the YouHua (while running tcpdump):

ping 10.2.1.1
arping 10.2.1.1
ping6 ff02::1%wan_17

no command showed any successful reply (clearly, ping6 showed the answer from the YouHua router itself, but none from the WDR3600).

Using tcpdump on the YouHua side, I can see the packets being emitted. They look good on wan_17 and on wan, but when sniffing on eth0 I only see malformed frames, even when the communication is working. Why? Using tcpdump on the WDR3600 side, I never see any tagged packet arriving, not even the 802.1q tagged ones. Maybe this is just due to the WDR3600 switch chip; I'll test this again once I have the USB ethernet adapter.

DSA uses MediaTek's custom tag on the CPU port, which tcpdump cannot parse.

I finally have a USB3 10/100/1000 ethernet adapter and I did the following.
I assigned IPs to my laptop without tagging, with 802.1q tagging and with 802.1ad (Q-in-Q) tagging (two for each, as on the routers I will set one for a LAN port and one for a WAN port) with

# ip address add 10.0.1.2/24 dev enp0s20u1; ip address add 10.1.1.2/24 dev enp0s20u1
# ip link add link enp0s20u1 name enp0s20u1.17 type vlan proto 802.1ad id 17; ip link set enp0s20u1.17 up; ip address add 10.2.1.2/24 dev enp0s20u1.17; ip address add 10.3.1.2/24 dev enp0s20u1.17
# ip link add link enp0s20u1 name enp0s20u1.17q type vlan id 17; ip link set enp0s20u1.17q up; ip address add 10.4.1.2/24 dev enp0s20u1.17q; ip address add 10.5.1.2/24 dev enp0s20u1.17q

then I made a baseline with TP-Link WDR3600 flashed with ATH79 SNAPSHOT, r14394-252197f014 (two weeks old) doing:

OpenWrt:~# ip address add 10.0.1.1/24 dev br-lan 
OpenWrt:~# ip address add 10.1.1.1/24 dev eth0.2
OpenWrt:~# ip link add link eth0.1 name eth0-1_17 type vlan proto 802.1ad id 17; ip link set eth0-1_17 up; ip address add 10.2.1.1/24 dev eth0-1_17
OpenWrt:~# ip link add link eth0.2 name eth0-2_17 type vlan proto 802.1ad id 17; ip link set eth0-2_17 up; ip address add 10.3.1.1/24 dev eth0-2_17
OpenWrt:~# ip link add link eth0.1 name eth0-1_17q type vlan id 17; ip link set eth0-1_17q up; ip address add 10.4.1.1/24 dev eth0-1_17q
OpenWrt:~# ip link add link eth0.2 name eth0-2_17q type vlan id 17; ip link set eth0-2_17q up; ip address add 10.5.1.1/24 dev eth0-2_17q

And I ran ping from the router towards the laptop, first connected to a TP-Link WDR3600 LAN port:

OpenWrt:~# ping 10.0.1.2 # plain works
OpenWrt:~# ping 10.2.1.2 # Q-in-Q works
OpenWrt:~# ping 10.4.1.2 # Q does not work, this is known and should be due to the hardware switch, that's why in LibreMesh we're using Q-in-Q

Then connect to TP-Link WDR3600 WAN port (after disabling some deny filters in /etc/config/firewall):

OpenWrt:~# ping 10.1.1.2 # plain works
OpenWrt:~# ping 10.3.1.2 # Q-in-Q works
OpenWrt:~# ping 10.5.1.2 # Q does not work, but that's expected, see above

Then I connected to the YouHua WR1200JS and did the same setup (this time using wan in place of eth0.2 and lan1 in place of eth0.1, while br-lan stays the same):

OpenWrt:~# ip address add 10.0.1.1/24 dev br-lan 
OpenWrt:~# ip address add 10.1.1.1/24 dev wan
OpenWrt:~# ip link add link lan1 name lan1_17 type vlan proto 802.1ad id 17; ip link set lan1_17 up; ip address add 10.2.1.1/24 dev lan1_17
OpenWrt:~# ip link add link wan name wan_17 type vlan proto 802.1ad id 17; ip link set wan_17 up; ip address add 10.3.1.1/24 dev wan_17
OpenWrt:~# ip link add link lan1 name lan1_17q type vlan id 17;
ip: RTNETLINK answers: Resource busy
OpenWrt:~# ip link add link wan name wan_17q type vlan id 17; ip link set wan_17q up; ip address add 10.5.1.1/24 dev wan_17q

Is it expected that I cannot create an 802.1q VLAN interface on top of lan1 while an 802.1ad one with the same ID already exists there?

And then while connected to a YouHua WR1200JS LAN port I ping my laptop from the router:

OpenWrt:~# ping 10.0.1.2 # plain works
OpenWrt:~# ping 10.2.1.2 # Q-in-Q does not work

While pinging 10.2.1.2 from the router, Wireshark on my laptop showed only ARP requests and replies, over and over.

The weird thing is that the ARP requests were coming in tagged 802.1q ID 17 instead of 802.1ad ID 17! This is a completely different problem from the one reported at the beginning of the thread! Do you agree?

Just to confirm this glitch, I added a new IP on the router's LAN interface tagged with Q-in-Q and an IP on my laptop on the interface tagged with Q:

OpenWrt:~# ip address add 10.7.1.1/24 dev lan1_17
# ip address add 10.7.1.2/24 dev enp0s20u1.17q

and pinging the laptop from the router:

OpenWrt:~# ping 10.7.1.2 # does not work

But if I run tcpdump on the lan1 interface while running this ping I can see the request going out tagged Q-in-Q (which then reaches my laptop tagged Q) and the reply comes back as tagged Q (as my laptop replies using the same tagging it sees). You can find the pcap here: https://uz.sns.it/~ilario/20200920-lan1.pcap

Then I removed the lan1_17 interface and, weirdly, I was finally allowed to create lan1_17q:

OpenWrt:~# ip link delete lan1_17
OpenWrt:~# ip link add link lan1 name lan1_17q type vlan id 17; ip link set lan1_17q up; ip address add 10.4.1.1/24 dev lan1_17q
OpenWrt:~# ping -c 10 10.4.1.2 # Q works

and connecting to YouHua WR1200JS WAN port (after disabling some deny filters in /etc/config/firewall):

OpenWrt:~# ping 10.1.1.2 # works
OpenWrt:~# ping 10.3.1.2 # Q-in-Q does not work
OpenWrt:~# ping 10.5.1.2 # Q works

While pinging 10.3.1.2 from the router, Wireshark on my laptop showed only ARP requests and replies, over and over. And in the same way as on lan1, the ARP requests were coming in tagged 802.1q ID 17 instead of 802.1ad ID 17.

Again, I added a new IP on the router's WAN tagged with Q-in-Q and an IP on my laptop on the interface tagged with Q:

OpenWrt:~# ip address add 10.6.1.1/24 dev wan_17
# ip address add 10.6.1.2/24 dev enp0s20u1.17q

and pinging the laptop from the router:

OpenWrt:~# ping 10.6.1.2 # works the first time I issue it! Then it stops working

If I run tcpdump on the wan interface while running this ping I can see the request going out tagged Q-in-Q (which then reaches my laptop tagged Q) and the reply comes back as tagged Q. You can find the pcap here: https://uz.sns.it/~ilario/20200920-wan.pcap

Please tell me if you need further tests!

PS: I confirmed the most important results using a USB2 10/100 ethernet adapter.

Probably a bug:


skb_vlan_tagged will return true if the frame is either 802.1ad or 802.1q tagged, but the code assumes it is always 802.1q.

It's amazing that you traced the bug all the way to the kernel! Thanks!
I am certainly not able to write the patch, but let me know if I can help with testing it or if I should report the bug somewhere else.

A patch is available!
http://lkml.iu.edu/hypermail/linux/kernel/2103.0/00696.html

Does it fix the bug?