Use a managed switch with bonded NICs and VLANs; you can go a long way with this unless you plan to do multi-WAN with 4 or 5 gigabit WAN connections.
Currently shows "powersave". I'll give it a try when I'm measuring performance.
Pretty sure powersave keeps the clock at its lowest all the time, so for a router it's best to set it to performance. Alternatively, you could switch to powersave every night and back again in the morning using cron.
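The cron idea could look like the sketch below. This is an assumption on my part: it presumes the standard cpufreq sysfs layout and OpenWrt's /etc/crontabs/root location.

```shell
# /etc/crontabs/root (sketch)
# 07:00 - switch every core to the performance governor for the day
0 7 * * * for c in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > $c; done
# 23:00 - drop back to powersave overnight
0 23 * * * for c in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo powersave > $c; done
```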
Unless scaling is very broken it should be fine, you might run into latency issues in theory but I doubt you'll see that on x86.
ondemand scales up but I think powersave stays at low clock the whole time
Correct, ondemand should be fine in most cases.
-
rm /etc/hotplug.d/net/20-smp-tune
DO NOT leave this script in place, as it will negate any changes -
if your WAN is, say, eth0, pin the IRQ affinity of eth1-3 to the corresponding cores CPU1-3. You only need to move the eth* interrupts, as opposed to every rx/tx queue.
-
echo the smp_affinity masks to each interface's rps_cpus & xps_cpus
-
echo performance to each CPU's scaling_governor
The workload will now be spread across multiple cores, and CPU usage should stay below ~20%, as opposed to spikes of 85-90%, as viewed in htop.
That's a huge performance gain claimed.
Will certainly come back to this when I want to extract more performance from this.
I recently purchased this unit (Nov '19) and have applied the tweaks listed above, so I can definitely back this claim. The smp hotplug script is the first thing to delete, because it will override any smp settings. Maybe it helps on some systems, but I found it hurts on this unit. I'll post some before-and-after screenshots sometime this week, as I have late-night work scheduled.
Please do post specific commands. Some of the steps you posted I didn't understand.
I was planning to research and use them, but it would be helpful if you could post them in a form I can copy-paste.
Post the output of cat /proc/interrupts
Try these commands. irqbalance doesn't do a good job at all, and banirqs doesn't work; you'd need the newer code anyway. Try these and let me know, and edit them for your setup if need be. The numbers 2, 4, 8 are the affinity masks.
for n in $(awk '/eth1/ {print $1}' /proc/interrupts | tr -d :); do echo 2 > /proc/irq/$n/smp_affinity; done
for n in $(awk '/eth2/ {print $1}' /proc/interrupts | tr -d :); do echo 4 > /proc/irq/$n/smp_affinity; done
for n in $(awk '/eth3/ {print $1}' /proc/interrupts | tr -d :); do echo 8 > /proc/irq/$n/smp_affinity; done
# for f in /sys/class/net/*/queues/*/byte_queue_limits/;do echo 6056 > $f/limit_max;done
for c in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor;do echo performance > $c;done
for i in $(ls /sys/class/net | awk '/eth/');do ethtool -K $i tso off gso off gro off;done
find /sys/devices -name xps_cpus | awk '/eth1/' | while read q; do echo 2 > $q; done
find /sys/devices -name xps_cpus | awk '/eth2/' | while read q; do echo 4 > $q; done
find /sys/devices -name xps_cpus | awk '/eth3/' | while read q; do echo 8 > $q; done
find /sys/devices -name rps_cpus | awk '/eth1/' | while read q; do echo 2 > $q; done
find /sys/devices -name rps_cpus | awk '/eth2/' | while read q; do echo 4 > $q; done
find /sys/devices -name rps_cpus | awk '/eth3/' | while read q; do echo 8 > $q; done
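If your ports map to different cores than mine, the masks follow a one-bit-per-CPU pattern. A quick sketch (the `cpu_mask` helper name is my own):

```shell
#!/bin/sh
# cpu_mask: print the hex smp_affinity mask that pins work to a single
# CPU. One bit per CPU: CPU0 -> 1, CPU1 -> 2, CPU2 -> 4, CPU3 -> 8.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

cpu_mask 1   # prints 2
cpu_mask 3   # prints 8
```

The same values go into smp_affinity, rps_cpus, and xps_cpus, since all three take the same bitmask format.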
Be sure to measure with htop before and after the changes.
A Zombie post lives!!
Not sure whether the newer poster (or the original poster) will be back, but I'm interested in seeing if better tuning is worth it for my above-mentioned CI327, which I still use.
My situation is different, in that I have only 2 instead of 4 ethernet ports in the CI327, so I'm not clear on how many things need to be different for a case like mine.
Also, I see varying things looking at cat /proc/interrupts vs using htop. With the former, I see a ton of interrupts for eth0 and eth1, both on CPU0 only. With htop, during a speedtest generating a lot of traffic, I see softirqs on the bar graphs, but fairly evenly distributed across all 4 cores. Hard IRQ vs soft IRQ? Is there a way to set up htop for clearer or more detailed info? One thing I seem to be missing for a more quantitative number is a display like in top, where there's a total % of sirq, for instance. It's hard to watch 4 changing bar graphs and guesstimate totals to see if changes make a difference.
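One way to get a quantitative softirq number without eyeballing htop is to diff /proc/softirqs over a short window. A sketch (assumes a Linux /proc/softirqs with a NET_RX row):

```shell
#!/bin/sh
# Per-CPU receive-softirq rate: sample the NET_RX row of /proc/softirqs
# twice, one second apart, and print the per-CPU deltas.
net_rx() { awk '/NET_RX/ { $1 = ""; print $0 }' /proc/softirqs; }

before=$(net_rx)
sleep 1
after=$(net_rx)

# Column i of the second sample minus column i of the first.
echo "$after" | awk -v prev="$before" '
    { n = split(prev, p)
      for (i = 1; i <= NF && i <= n; i++)
          printf "CPU%d: %d softirqs/s\n", i - 1, $i - p[i] }'
```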
Here's my cat /proc/interrupts
root@OpenWrt:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 26 0 0 0 IO-APIC 2-edge timer
1: 4 0 0 0 IO-APIC 1-edge i8042
4: 16 0 0 0 IO-APIC 4-edge ttyS0
8: 1 0 0 0 IO-APIC 8-fasteoi rtc0
9: 0 0 0 0 IO-APIC 9-fasteoi acpi
12: 5 0 0 0 IO-APIC 12-edge i8042
39: 52 0 0 0 IO-APIC 39-fasteoi mmc0
42: 52 0 0 0 IO-APIC 42-fasteoi mmc1
120: 397 0 0 0 PCI-MSI 32768-edge i915
121: 20 0 0 0 PCI-MSI 294912-edge ahci[0000:00:12.0]
122: 7463 0 0 0 PCI-MSI 344064-edge xhci_hcd
123: 791437515 0 0 0 PCI-MSI 1048576-edge eth0
124: 786251854 0 0 0 PCI-MSI 1572864-edge eth1
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 527368966 891906441 927056053 966413305 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 Performance monitoring interrupts
IWI: 0 0 0 0 IRQ work interrupts
RTR: 0 0 0 0 APIC ICR read retries
RES: 18407113 20374747 13625146 14481811 Rescheduling interrupts
CAL: 32621 280873617 279814627 343650721 Function call interrupts
TLB: 24525 10166 10962 10119 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
THR: 0 32766 21014 18308 Threshold APIC interrupts
DFR: 0 0 0 0 Deferred Error APIC interrupts
MCE: 0 0 0 0 Machine check exceptions
MCP: 17350 54939 45607 42957 Machine check polls
HYP: 0 0 0 0 Hypervisor callback interrupts
ERR: 0
MIS: 0
PIN: 0 0 0 0 Posted-interrupt notification event
NPI: 0 0 0 0 Nested posted-interrupt event
PIW: 0 0 0 0 Posted-interrupt wakeup event
Look in /etc/hotplug.d/, rm the smp tune script, and reboot.
Then cat /proc/interrupts and check whether the interrupts are (mostly) evenly distributed. If they aren't, it's time to manually move things around, e.g. eth1 > CPU1, etc.
# Spread the IO-APIC interrupts evenly across CPU1-3
echo 2 > /proc/irq/4/smp_affinity
echo 2 > /proc/irq/8/smp_affinity
echo 4 > /proc/irq/9/smp_affinity
echo 4 > /proc/irq/12/smp_affinity
echo 8 > /proc/irq/39/smp_affinity
echo 8 > /proc/irq/42/smp_affinity
# Do the same with the MSI interrupts
echo 4 > /proc/irq/120/smp_affinity
echo 4 > /proc/irq/121/smp_affinity
echo 8 > /proc/irq/123/smp_affinity
# eth1 (IRQ 124) corresponds to CPU1, so it makes sense to pin it there.
echo 2 > /proc/irq/124/smp_affinity
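After moving things around, a quick way to confirm each IRQ landed where intended (a sketch; it just pairs the device column of /proc/interrupts with the matching /proc/irq entry):

```shell
#!/bin/sh
# Print each IRQ's current affinity mask next to its device name, so
# you can confirm the echoes above stuck.
awk 'NR > 1 && $1 ~ /:$/ { sub(":", "", $1); print $1, $NF }' /proc/interrupts |
while read -r irq dev; do
    [ -r "/proc/irq/$irq/smp_affinity" ] || continue
    printf '%-4s %-12s mask=%s\n' "$irq" "$dev" "$(cat /proc/irq/$irq/smp_affinity)"
done
```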
Sorry I haven't had a chance to try some of this out.
Going upthread a bit: I've checked the scaling governor, and it's on powersave. But I do see the CPU speeds changing up and down in htop. Hmm, do I need to change that, then? I haven't pulled out the smp script yet. Wondering if there's an issue with performance or SQM processing if the CPU speed is bouncing from 795 to 2300 MHz with varying load. Also wondering how much warmer it will run if I pin it at 2.3 GHz...
As I mentioned earlier, it looks like the load is distributed evenly across the 4 cores during my run-a-speedtest-and-watch-htop test, although the indicated CPU speeds are bouncing up and down. Is this indicating some kind of load balancing, even though it looks different in /proc/interrupts?
That's the Intel pstate powersave. There is a setting where you can write a value to a file to force lower-latency behavior.
But you need to hold the file open. I wrote a shell script that does it; I'll have to figure out where it is.
In powersave mode, as @dlakelan pointed out, the CPU enters a deep sleep state and then has to wake up to service the pending interrupts, which causes latency. You could echo performance, or see if ondemand is an option as well.
performance/ondemand are the older governors; with the intel_pstate driver there's just performance and powersave (powersave behaves much like the older ondemand).
The thing to do is write a shell script that opens /dev/cpu_dma_latency, writes the number 1000 as a 32-bit binary integer, and holds the file open between, say, 6am and 11pm. This keeps the pstate driver from dropping into low-power modes that take a long time to wake from during normal "daytime" hours.
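A minimal sketch of that idea, assuming busybox sh on OpenWrt; the `write_le32` helper name is my own. The kernel honors a /dev/cpu_dma_latency request only while the file descriptor stays open, and it expects the value as a 32-bit little-endian binary integer.

```shell
#!/bin/sh
# write_le32: emit a number as 4 little-endian bytes, using portable
# octal escapes so it works in busybox printf.
write_le32() {
    printf "$(printf '\\%03o\\%03o\\%03o\\%03o' \
        $(($1 & 255)) $(($1 >> 8 & 255)) \
        $(($1 >> 16 & 255)) $(($1 >> 24 & 255)))"
}

# To apply (as root): open the device on fd 3, write 1000 (microseconds),
# and keep the script alive -- busybox has no 'sleep infinity':
#   exec 3> /dev/cpu_dma_latency
#   write_le32 1000 >&3
#   while :; do sleep 3600; done
```

Scheduling the daytime window could then be done from cron, killing the script at night so the fd closes and the kernel returns to its default latency behavior.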