While looking into L3 routing on rtl838x, I stumbled over a mention of an undocumented hardware limitation on the HPE community forum:
The ARP table on the HP 1920 24G (JG924A) can only reliably handle 60 entries. Go over this and you will start to experience the loss in performance reported here.
In plain English: if you have more than 60 devices across all of your VLANs, you will have this problem. This is a pretty significant limitation for what is meant to be a layer 3 switch. I also note that this is not mentioned in any of the device's specifications.
We are currently talking to our supplier about returning our 1920 24G as it is not fit for purpose. It was sold as a Layer 3 switch. The specifications issued by HP make no reference to only supporting a maximum of 60 ARP entries (the ARP table can actually hold many more). If you contact HP support you eventually get this information but they will make you jump through hoops before they admit it.
See: http://h20565.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=7399514&docId=mmr_kc-0130409&docLocale=en_US
My comments:
- On 1920-24G I managed to reproduce this slow routing speed condition
- On 1920-24G there is no syslog event generated when ARP table grows beyond 64 entries.
- On 1920-24G the ARP table can't grow beyond 256 entries, so no CPU based routing for host #257! This condition does generate a syslog message: "Oct 31 08:25:14 2015 HP1920G %%10TPMB/4/ARP TABLE FULL(t): 1.3.6.1.4.1.25506.2.38.1.2.4.1: ARP table is full and number of items is 256."
- HP doesn't specify IPv6 max neighbor table size, so we can assume same limits apply.
Note: On IPv6 the number of hosts does not equal the neighbor table size! A host uses multiple addresses (its link-local address plus privacy extension addresses).
On 1920-24G I did manage to get this IPv6 event at 256 entries:
"Oct 31 09:11:02 2015 HP1920G %%10TPMB/4/ND TABLE FULL(t): 1.3.6.1.4.1.25506.2.38.1.5.4.1: Neighbor table is full and number of items is 256."
- A switch reboot seems useless to me. To make neighbors time out earlier, use commands like:
arp timer aging 4 (minutes, default=20)
ipv6 neighbor stale-aging 1 (hours! default=4)
After the ARP table size is back within safe limits (60), hardware speed returns.
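To compare a neighbour table against the 60-entry figure quoted above from a Linux shell, something like the following could be used. This is only a sketch: the function names and `ARP_SAFE_LIMIT` are made up for illustration, the input is assumed to be in the format printed by `ip -4 neigh show`, and whether the kernel neighbour table matches what the hardware limit actually counts is not established here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.
# Expected input: lines as printed by `ip -4 neigh show`, e.g.
#   192.168.1.10 dev eth0 lladdr 5c:8a:38:86:6a:e3 REACHABLE
ARP_SAFE_LIMIT=60   # the "safe" figure from the quoted HPE thread

count_resolved_neigh() {
    # Count only entries that carry a link-layer address; unresolved
    # entries (e.g. FAILED) have no "lladdr" field and are skipped.
    awk '/ lladdr / { n++ } END { print n + 0 }'
}

check_arp_limit() {
    # Reads neighbour entries on stdin and reports against the limit.
    count=$(count_resolved_neigh)
    if [ "$count" -gt "$ARP_SAFE_LIMIT" ]; then
        echo "over limit: $count > $ARP_SAFE_LIMIT"
    else
        echo "within limit: $count <= $ARP_SAFE_LIMIT"
    fi
}

# Usage on the device (not executed here):
#   ip -4 neigh show | check_arp_limit
```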
So, back to the 1920-24 I'm testing with:
# brctl showmacs switch | wc -l
177
Every local port has 2 or 7 entries for its local MAC address. 115 of the 177 entries are local, for 28 ports.
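The local/learned split can be counted rather than eyeballed. A minimal sketch, assuming the usual `brctl showmacs` column layout (port no, mac addr, is local?, ageing timer); `summarise_showmacs` is a made-up name. It only counts lines whose second field looks like a MAC address, so a header line, if the tool prints one, is not counted the way `wc -l` would count it.

```shell
#!/bin/sh
# Hypothetical helper: summarise `brctl showmacs <bridge>` output from stdin.
# Assumed columns: port no, mac addr, is local?, ageing timer.
summarise_showmacs() {
    awk '
        # Only lines whose 2nd field looks like a MAC address are entries.
        $2 ~ /^[0-9a-fA-F][0-9a-fA-F](:[0-9a-fA-F][0-9a-fA-F])+$/ {
            total++
            if ($3 == "yes") loc++
        }
        END {
            printf "total=%d local=%d learned=%d\n", total, loc, total - loc
        }
    '
}

# Usage on the device (not executed here):
#   brctl showmacs switch | summarise_showmacs
```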
# cat /sys/kernel/debug/rtl838x/l2_table | grep mac | wc -l
148
Looks like there is one entry per VLAN:
mac 5c:8a:38:86:6a:e3 vid 0 rvid 0
mac 5c:8a:38:86:6a:e3 vid 1 rvid 1
mac 5c:8a:38:86:6a:e3 vid 10 rvid 10
mac 5c:8a:38:86:6a:e3 vid 20 rvid 20
mac 5c:8a:38:86:6a:e3 vid 22 rvid 22
mac 5c:8a:38:86:6a:e3 vid 30 rvid 30
mac 5c:8a:38:86:6a:e3 vid 40 rvid 40
mac 5c:8a:38:86:6a:e4 vid 0 rvid 0
mac 5c:8a:38:86:6a:e4 vid 1 rvid 1
mac 5c:8a:38:86:6a:e4 vid 10 rvid 10
mac 5c:8a:38:86:6a:e4 vid 20 rvid 20
mac 5c:8a:38:86:6a:e4 vid 22 rvid 22
mac 5c:8a:38:86:6a:e4 vid 30 rvid 30
mac 5c:8a:38:86:6a:e4 vid 40 rvid 40
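To check the "one entry per VLAN" guess across the whole table, the entries can be grouped by MAC. This is a sketch that assumes each line contains `mac <addr>` and `vid <n>` tokens as in the excerpt above (the real debugfs lines may carry more fields); `vids_per_mac` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: for each MAC seen on stdin, print the number of
# distinct VIDs and the VID list. Looks for "mac <addr>" and "vid <n>"
# token pairs anywhere in a line; "rvid" is ignored.
vids_per_mac() {
    awk '
        {
            mac = ""; vid = ""
            for (i = 1; i < NF; i++) {
                if ($i == "mac") mac = $(i + 1)
                if ($i == "vid") vid = $(i + 1)
            }
            if (mac != "" && vid != "" && !((mac, vid) in seen)) {
                seen[mac, vid] = 1
                count[mac]++
                vids[mac] = (vids[mac] == "" ? vid : vids[mac] "," vid)
            }
        }
        END { for (m in count) print m, count[m], vids[m] }
    ' | sort
}

# Usage on the device (not executed here):
#   cat /sys/kernel/debug/rtl838x/l2_table | vids_per_mac
```

For the fourteen lines shown above this would print two rows, each with 7 VIDs (0, 1, 10, 20, 22, 30, 40).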
iperf between the same two nodes using L2 and L3:
[ 1] 0.0000-10.0717 sec 30.2 MBytes 25.2 Mbits/sec
[ 2] 0.0000-10.0506 sec 878 MBytes 733 Mbits/sec
[ 3] 0.0000-10.1007 sec 30.7 MBytes 25.5 Mbits/sec
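To put a number on the gap, the bandwidth column can be pulled out of the iperf summary lines. A small sketch, assuming the rates are all reported in Mbits/sec as above; `throughput_ratio` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read iperf summary lines from stdin, take the
# number in front of each "Mbits/sec" token, and print the fastest and
# slowest rate plus their ratio.
throughput_ratio() {
    awk '
        {
            for (i = 2; i <= NF; i++)
                if ($i == "Mbits/sec") {
                    v = $(i - 1) + 0
                    if (n == 0 || v > max) max = v
                    if (n == 0 || v < min) min = v
                    n++
                }
        }
        END {
            if (n > 0 && min > 0)
                printf "fastest=%s slowest=%s ratio=%.1f\n", max, min, max / min
        }
    '
}
```

Fed the three lines above, this gives `fastest=733 slowest=25.2 ratio=29.1`, so the slow runs are roughly 29 times slower than the fast one.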
So I brought it down below 60 entries, but that did not change anything. So it's still routing on the CPU... Still, leaving this here for later.