I stopped using my bevy (gaggle? flight?) of APUs for routing and firewall, as they didn't have enough CPU to do much more than 500 Mbps NAT and routing†. I now use them as jail servers for DHCP, DNS, mail, NextCloud, web, ... Somewhere in the deep, distant past, I did some speed tests of ath79, ipq40xx, and amd64, including the APU2.
† "More" meaning additional functionality, such as SQM or packet inspection with something like Snort.
I've got a GL.iNet MV1000 "Brume" that I just dusted off for guest-network duty. It draws a bit over 3 W at idle. I don't recall how well it handles throughput, nor have I kept up with the DSA and NAT-offloading changes over the last few years.
The Hardkernel H2 was a great option with "Intel(R) Celeron(R) J4115 CPU @ 1.80GHz" and the ability to take an Intel 4-port server card over PCIe (dual, bonded interfaces). Unfortunately, supply-chain issues have resulted in its discontinuance.
amd64 with (hopefully) genuine 4-port Intel "pulls" is my current choice, though I don't know which compact, low-power option I'd buy these days.
Edit: I haven't read it or thought much about it in the last three years: Comparative Throughput Testing Including NAT, SQM, WireGuard, and OpenVPN