Yeah, well, the reason I ask is that the datasheet for the X710 adapters I'm looking at suggests they have an internal switch for exactly this function. However, elsewhere in the document it also states that the internal switch is not a 'learning switch' and has to be configured by the host system. It includes a (very low-level) command reference for programming the switch, and I wonder if this is something the kernel's hardware offload feature is doing in the background.
The switch is not a learning switch and is managed by the host. The programming interface exposed to
the operating system is that of a managed switch. Each switch element can be configured either via a
software device driver running on a PF or via the EMP.
10 Gbit/s Ethernet cards and faster are quite different from 'normal' 1 Gbit/s ones, because they have to offload a lot of work to the hardware to keep up at all. Answering this would really depend on the details and the exact hardware in question, so sorry, I have to pass on this question (it wasn't clear from your original post that you were referring to 40 Gbit/s cards). My gut feeling is still not very positive, though.
Bridging two 10 Gbit/s ports with a standard Linux L2 bridge, I get a throughput of 9.2 Gbit/s (which is about the maximum given Ethernet overhead anyway), and it consumes approximately two whole cores of my 8-core Atom processor (Intel C2758).
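
For reference, here is a rough back-of-the-envelope sketch of where the "about max" figure comes from. It assumes things not stated above: a standard 1500-byte MTU, no VLAN tag, IPv4, and TCP with the 12-byte timestamp option. The function name and defaults are just for illustration.

```python
# Per-frame bytes on the wire in addition to the L2 payload
# (standard Ethernet, no VLAN tag).
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
ETH_HEADER = 14       # destination MAC, source MAC, EtherType
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # minimum idle time between frames, in byte times


def tcp_goodput_gbps(line_rate_gbps, mtu=1500, ip_header=20,
                     tcp_header=20, tcp_options=12):
    """Estimate the best-case TCP payload rate for a given line rate.

    Assumes every frame is full-sized and ignores ACK traffic,
    retransmissions, and any host-side bottleneck.
    """
    payload = mtu - ip_header - tcp_header - tcp_options
    wire_bytes = mtu + PREAMBLE_SFD + ETH_HEADER + FCS + INTERFRAME_GAP
    return line_rate_gbps * payload / wire_bytes


if __name__ == "__main__":
    print(f"{tcp_goodput_gbps(10.0):.2f} Gbit/s")
```

Under those assumptions the ceiling comes out at roughly 9.4 Gbit/s, so a measured 9.2 Gbit/s is within a few percent of what the link can carry.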