Sunxi kernel 4.9 high CPU usage

I'm using a Banana Pi device with OpenWrt.
It had been running DESIGNATED DRIVER (Bleeding Edge, r48830) since 2016. Everything was mostly fine, except that the stmmac driver crashed on jumbo frames.
Yesterday I decided to upgrade to the latest LEDE (now OpenWrt again) build.
Now, on the same workload, kernel CPU usage is 30 to 40% higher and maximum throughput is lower.

OLD kernel 4.4.3

 5.07%  [ip_tables]     [k] ipt_do_table               
 1.65%  [kernel]        [k] csum_partial               
 1.62%  [kernel]        [k] stmmac_poll                
 1.58%  [kernel]        [k] _raw_spin_unlock_irqrestore
 1.36%  [kernel]        [k] __netif_receive_skb_core   
 1.30%  [kernel]        [k] stmmac_xmit                
 1.27%  [sch_htb]       [k] htb_dequeue                
 1.26%  [kernel]        [k] __dev_queue_xmit           
 1.25%  [kernel]        [k] ndesc_get_rx_status        
 1.19%  [nf_conntrack]  [k] __nf_conntrack_find_get    
 1.08%  [kernel]        [k] net_rx_action              
 1.05%  [kernel]        [k] __do_softirq               
 0.88%  [kernel]        [k] dev_gro_receive            
 0.88%  [kernel]        [k] ip_forward                 
 0.86%  [kernel]        [k] _raw_spin_unlock_irq       
 0.84%  [kernel]        [k] kmem_cache_alloc           
 0.81%  [kernel]        [k] netif_skb_features         
 0.81%  [xt_conntrack]  [k] conntrack_mt               
 0.80%  [kernel]        [k] ndesc_get_tx_status        
 0.77%  [kernel]        [k] csum_partial_copy_nocheck  

NEW kernel 4.9.75

 6.08%  [kernel]        [k] __slab_alloc.constprop.7
 3.64%  [kernel]        [k] _raw_spin_unlock_irqrestore
 2.57%  [kernel]        [k] __usb_hcd_giveback_urb
 2.04%  [ip_tables]     [k] ipt_do_table
 1.27%  [kernel]        [k] __do_softirq
 0.93%  [kernel]        [k] stmmac_poll
 0.81%  [kernel]        [k] net_rx_action
 0.68%  [kernel]        [k] stmmac_xmit
 0.64%  [kernel]        [k] _raw_spin_unlock_irq
 0.62%  [kernel]        [k] __dev_queue_xmit
 0.58%  [kernel]        [k] csum_partial
 0.53%         [.] 0x00029b18
 0.50%  perf            [.] 0x000bd69c
 0.48%  [kernel]        [k] __netif_receive_skb_core
 0.47%  [kernel]        [k] nf_iterate
 0.47%  [kernel]        [k] __skb_flow_dissect
 0.46%  [nf_conntrack]  [k] __nf_conntrack_find_get
 0.44%  [kernel]        [k] csum_partial_copy_nocheck
 0.43%  [kernel]        [k] dev_gro_receive
 0.42%  [kernel]        [k] __memzero

These functions take a lot of CPU time:
6.08% [kernel] [k] __slab_alloc.constprop.7
3.64% [kernel] [k] _raw_spin_unlock_irqrestore

and performance is slow. I had to revert to the old build.
So I'm asking: is this caused by the updated stmmac driver, or maybe by changes in the kernel?
Any ideas or workarounds?

This device acts as a VLAN router. It routes 100 Mbit/s from eth0.1 to eth0.2 in full-duplex mode, so the physical eth0 sends and receives up to 200 Mbit/s simultaneously.
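
For anyone who wants to compare two perf listings like the ones above without eyeballing them, here is a minimal Python sketch. The function names (parse_perf, biggest_increases) and the short sample strings are made up for illustration. It assumes each line has the form "percent, module, [k] or [.], symbol", and it skips lines that do not fit. Keep in mind that the percentages are shares of samples within each profile, so the differences are only a rough hint about where the extra time goes.

```python
import re

# Matches lines such as " 6.08%  [kernel]  [k] __slab_alloc.constprop.7"
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")


def parse_perf(text):
    """Return {symbol: percent} for every line that matches LINE_RE."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            result[m.group(4)] = float(m.group(1))
    return result


def biggest_increases(old, new, top=3):
    """Symbols whose share grew the most; symbols absent from `old` count as 0."""
    deltas = {sym: pct - old.get(sym, 0.0) for sym, pct in new.items()}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Shortened samples taken from the two listings in this post.
OLD_SAMPLE = """
 5.07%  [ip_tables]     [k] ipt_do_table
 1.62%  [kernel]        [k] stmmac_poll
 1.58%  [kernel]        [k] _raw_spin_unlock_irqrestore
"""
NEW_SAMPLE = """
 6.08%  [kernel]        [k] __slab_alloc.constprop.7
 3.64%  [kernel]        [k] _raw_spin_unlock_irqrestore
 2.57%  [kernel]        [k] __usb_hcd_giveback_urb
 2.04%  [ip_tables]     [k] ipt_do_table
"""

if __name__ == "__main__":
    old, new = parse_perf(OLD_SAMPLE), parse_perf(NEW_SAMPLE)
    for symbol, delta in biggest_increases(old, new):
        print(f"{delta:+.2f}%  {symbol}")
```

With the full listings pasted in, the symbols at the top of the result are the first candidates to look at.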

I found the reason.

These options are not harmless:

Enable /proc slab debug info
Enable /proc page monitoring

They add significant overhead (see commit 7e99a6ba690f27b36e99144178c71f0687b07ad9).

Actually, they were not set in the default configuration; enabling these additional features was my own initiative.
I did not test which option caused most of the slowdown. I just compared against my old 4.4.3 configuration, found the difference, then disabled both, and that helped.
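
The comparison itself is easy to script. Below is a minimal Python sketch that diffs two kernel .config files. The helper names (parse_config, diff_configs) and the two tiny sample configs are invented for illustration and are not my real configurations. The kernel tree also ships a scripts/diffconfig tool for the same job.

```python
def parse_config(text):
    """Map option name -> value; '# CONFIG_FOO is not set' is stored as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old, new):
    """Return {option: (old_value, new_value)} for options that differ.

    An option missing from one side is reported as None on that side.
    """
    changed = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed


# Made-up samples, only to show the shape of the input.
OLD_CFG = """
CONFIG_SLUB=y
# CONFIG_SLUB_DEBUG is not set
"""
NEW_CFG = """
CONFIG_SLUB=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
"""

if __name__ == "__main__":
    changes = diff_configs(parse_config(OLD_CFG), parse_config(NEW_CFG))
    for name, (before, after) in changes.items():
        print(f"{name}: {before} -> {after}")
```

In practice the two strings would be read from the old and new .config files of the build tree.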

I guess SLAB_INFO does the most damage, because it is executed on every kernel allocation. It keeps up-to-date information available in /proc and uses locks for multicore synchronization.

I checked the Ubuntu kernel config.

These options are present there:

The problem seems to come from CONFIG_SLUB_DEBUG_ON:
it enables all debugging if no special kernel command line parameter is given.
Currently, if "Enable /proc slab debug info" is checked, then both CONFIG_SLUB_DEBUG and CONFIG_SLUB_DEBUG_ON are enabled.

So it would be logical to enable CONFIG_SLUB_DEBUG_ON only if a separate checkbox is marked in menuconfig,
something like "Enable slub debugging (slow !)".
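
To make the interaction between the two options and the kernel command line easier to follow, here is a simplified Python model of it. This is only a sketch of my understanding, not kernel code: the function name slub_debug_enabled is made up, and it assumes that slub_debug on the command line turns debugging on, that slub_debug=- turns it off, and that the last such parameter wins. It ignores per-flag and per-cache settings.

```python
def slub_debug_enabled(config_text, cmdline=""):
    """Rough model: is any SLUB debugging active at boot?

    config_text: contents of a kernel .config
    cmdline:     kernel command line (e.g. the contents of /proc/cmdline)
    """
    # Collect options that are set to "y".
    enabled_options = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled_options.add(line[:-2])

    # Without CONFIG_SLUB_DEBUG the debug code is not compiled in at all.
    if "CONFIG_SLUB_DEBUG" not in enabled_options:
        return False

    # CONFIG_SLUB_DEBUG_ON only changes the default state.
    active = "CONFIG_SLUB_DEBUG_ON" in enabled_options

    # A slub_debug parameter on the command line overrides the default.
    for arg in cmdline.split():
        if arg == "slub_debug=-":
            active = False
        elif arg == "slub_debug" or arg.startswith("slub_debug="):
            active = True
    return active


if __name__ == "__main__":
    both = "CONFIG_SLUB_DEBUG=y\nCONFIG_SLUB_DEBUG_ON=y\n"
    print(slub_debug_enabled(both))                   # default with both options set
    print(slub_debug_enabled(both, "slub_debug=-"))   # switched off at boot
```

If this model is right, booting with slub_debug=- would be a workaround for an already built image, while leaving CONFIG_SLUB_DEBUG_ON unset keeps the debug code available but off by default.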