OpenWrt Forum Archive

Topic: Organizing Atheros ath9k bug reports for kernel.org

The content of this topic has been archived on 4 May 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

The relatively young ath9k driver is maturing, but some basic performance and packet loss issues seem to have persisted longer than they should.  We need to distill these problems into easily reproducible test cases that the developers can focus on.  Currently there are NO BUG REPORTS filed on kernel.org related to MIPS routers and AP mode. I believe filing them will improve things.

Kernel.org bug reports for ath9k

see: http://wireless.kernel.org/en/users/Dri … ugsonath9k


Some reports to review

1. Intel N clients cannot connect to an ath9k AP at speeds greater than 54 Mbps. see: https://lists.ath9k.org/pipermail/ath9k … 01292.html  and follow-up: https://lists.ath9k.org/pipermail/ath9k … 01296.html

2.  Lark reports AP81 issues with TX and RX interaction problems and packet loss.  see: https://lists.ath9k.org/pipermail/ath9k … 01294.html

3. Gabor Juhos reported locking issues during large file transfers on AP81.  https://lists.ath9k.org/pipermail/ath9k … 01154.html  Is this confirmed fixed?  Anyone reproduce it?

4.  Client-mode issues on AP81 routers; we need to do a fresh round of testing.  Are ping times stable or erratic?

5.  Reports of the wireless AP mode dropping connections after some period of time.  See here: http://forum.openwrt.org/viewtopic.php?pid=84917

Atheros employees haven't confirmed that they have even tried this driver on MIPS-based routers!  We need to get them more involved with this platform so that timing issues and performance get attention.    See the response from this active Atheros employee: https://lists.ath9k.org/pipermail/ath9k … 01247.html

What other bugs do you have?  What can you confirm on latest OpenWrt Trunk?

(Last edited by RoundSparrow on 31 Mar 2009, 20:42)

I have a reproducible one. smile

Hardware: TP-LINK TL WR941ND v2.2
Software: openwrt trunk r14853 (latest 2h ago)

Setup:
(Very similar to Lark's.)

A(MacOS/Intel/Atheros laptop) .......(wireless)...... B (WR941N as pure AP) ------------(wired)------------ C(Linux box)
 192.168.111.109                                     192.168.111.107                                           192.168.111.35

Config:

root@OpenWrt:/# cat /etc/config/network 
config interface loopback
        option ifname   lo
        option proto    static
        option ipaddr   127.0.0.1
        option netmask  255.0.0.0

config interface eth
        option ifname   eth0

config interface wan
        option ifname   "wan lan1 lan2 lan3 lan4"
        option type     bridge
        option proto    dhcp
root@OpenWrt:/# cat /etc/config/wireless 
config wifi-device  wlan0
        option type     mac80211
        option channel  6
        option hwmode   11n

        # REMOVE THIS LINE TO ENABLE WIFI:
        #option disabled 1

config wifi-iface
        option device   wlan0
        option network  wan
        option mode     ap
        option ssid     OpenWrt
        option encryption none

Step-by-step:

1. Ping the laptop (A) from the linux box (C) using this command executed on the linux box (C):

$ sudo ping -i 0.005 -s 1400 <A>

Expected result: No packet loss.

Actual result: Packets are lost in a consistent, repetitive pattern. About 500 packets go through perfectly (2.5 seconds), then approximately 60 are lost (0.3 seconds), then 500 go through again, and so on. You can use the command below to analyze the output (it only prints the first packet after each loss, and the summary):

$ sudo ping -i 0.005 -s 1400 192.168.111.109 | ruby -e 'n=-1; STDIN.each_line { |l| n += 1; next unless n > 0; seq = l.scan(/icmp_seq=([0-9]*) /).first.first.to_i; puts "#{l.chomp}: lost #{seq - n} packets (seq #{n} - #{seq-1})" if seq != n; n = seq }'
1408 bytes from 192.168.111.109: icmp_seq=358 ttl=64 time=2.22 ms: lost 60 packets (seq 298 - 357)
1408 bytes from 192.168.111.109: icmp_seq=863 ttl=64 time=2.07 ms: lost 59 packets (seq 804 - 862)
1408 bytes from 192.168.111.109: icmp_seq=1174 ttl=64 time=3.04 ms: lost 61 packets (seq 1113 - 1173)
1408 bytes from 192.168.111.109: icmp_seq=1678 ttl=64 time=2.50 ms: lost 62 packets (seq 1616 - 1677)
1408 bytes from 192.168.111.109: icmp_seq=2383 ttl=64 time=2.84 ms: lost 59 packets (seq 2324 - 2382)
1408 bytes from 192.168.111.109: icmp_seq=2891 ttl=64 time=2.53 ms: lost 61 packets (seq 2830 - 2890)
1408 bytes from 192.168.111.109: icmp_seq=3203 ttl=64 time=2.99 ms: lost 62 packets (seq 3141 - 3202)
1408 bytes from 192.168.111.109: icmp_seq=3695 ttl=64 time=2.66 ms: lost 46 packets (seq 3649 - 3694)
1408 bytes from 192.168.111.109: icmp_seq=4385 ttl=64 time=3.74 ms: lost 47 packets (seq 4338 - 4384)
1408 bytes from 192.168.111.109: icmp_seq=4884 ttl=64 time=2.41 ms: lost 52 packets (seq 4832 - 4883)
1408 bytes from 192.168.111.109: icmp_seq=5559 ttl=64 time=2.15 ms: lost 52 packets (seq 5507 - 5558)
1408 bytes from 192.168.111.109: icmp_seq=6066 ttl=64 time=2.44 ms: lost 59 packets (seq 6007 - 6065)
1408 bytes from 192.168.111.109: icmp_seq=6768 ttl=64 time=2.31 ms: lost 62 packets (seq 6706 - 6767)
1408 bytes from 192.168.111.109: icmp_seq=7277 ttl=64 time=4.83 ms: lost 61 packets (seq 7216 - 7276)
1408 bytes from 192.168.111.109: icmp_seq=7780 ttl=64 time=4.09 ms: lost 61 packets (seq 7719 - 7779)
1408 bytes from 192.168.111.109: icmp_seq=8289 ttl=64 time=5.05 ms: lost 61 packets (seq 8228 - 8288)
1408 bytes from 192.168.111.109: icmp_seq=8600 ttl=64 time=3.74 ms: lost 61 packets (seq 8539 - 8599)
1408 bytes from 192.168.111.109: icmp_seq=9310 ttl=64 time=2.89 ms: lost 62 packets (seq 9248 - 9309)
1408 bytes from 192.168.111.109: icmp_seq=9622 ttl=64 time=4.21 ms: lost 62 packets (seq 9560 - 9621)
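
The one-liner above can also be written as a small function, which is easier to reuse and does not fail on lines without a sequence number (the header and the summary). This is only a sketch, assuming Linux iputils-style reply lines that contain "icmp_seq=N"; the name find_gaps is made up for illustration. Unlike the one-liner, it does not count packets lost before the first reply, and it ignores sequence-number wraparound.

```ruby
# Find gaps in ping sequence numbers.
# Assumption: reply lines look like "... icmp_seq=123 ttl=64 time=2.2 ms".
def find_gaps(lines)
  gaps = []
  expected = nil                       # next sequence number we expect to see
  lines.each do |line|
    m = line.match(/icmp_seq=(\d+)/)
    next unless m                      # skip header and summary lines
    seq = m[1].to_i
    if expected && seq > expected
      gaps << { first: expected, last: seq - 1, count: seq - expected }
    end
    expected = seq + 1
  end
  gaps
end
```

To use it on live output, pipe ping into a script that calls find_gaps(STDIN.each_line) and prints each returned gap.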

It seems remarkably consistent. I'm thinking about building a clock that uses this as time source. tongue

The funniest thing is that this hardware makes a sound as it sends the packets. smile If I put my ear to the router I can hear the packets being lost because there is silence. When the packets start flowing again the router makes a very low hissing sound. I think it is coming from the capacitors... smile

If I change the delay (from 0.005 seconds) in the range 0.001 - 0.010 seconds the timing changes a bit. Packet loss seems to start at about a 0.008 second delay. This is only about 2 Mbit/s, though, so it should be perfectly OK, no?
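
As a rough check of that figure, the offered load can be estimated from the payload size and the ping interval. The sketch below assumes an 8-byte ICMP header (which matches the "1408 bytes from" lines) and a 20-byte IPv4 header without options, ignores Ethernet and 802.11 framing, and assumes ping actually keeps to the requested interval. The function name offered_load_mbit is made up for illustration.

```ruby
ICMP_HEADER = 8   # bytes
IPV4_HEADER = 20  # bytes, assuming no IP options

# Offered load in Mbit/s for one direction (request only or reply only).
def offered_load_mbit(payload_bytes, interval_s)
  packet_bits = (payload_bytes + ICMP_HEADER + IPV4_HEADER) * 8
  packet_bits / interval_s / 1_000_000.0
end

[0.008, 0.0075, 0.005, 0.003].each do |interval|
  one_way = offered_load_mbit(1400, interval)
  puts format("interval %.4f s: %.2f Mbit/s each way, %.2f Mbit/s for request plus reply",
              interval, one_way, 2 * one_way)
end
```

At a 0.008 second interval this works out to about 1.4 Mbit/s in each direction. Since both the request and the reply cross the wireless link, the total is a little under 3 Mbit/s, still far below what the link should handle.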

Here is the result for 0.008 seconds delay:

$ sudo ping -i 0.008 -s 1400 192.168.111.109 | ruby -e 'n=-1; STDIN.each_line { |l| n += 1; next unless n > 0; seq = l.scan(/icmp_seq=([0-9]*) /).first.first.to_i; puts "#{l.chomp}: lost #{seq - n} packets (seq #{n} - #{seq-1})" if seq != n; n = seq }'
1408 bytes from 192.168.111.109: icmp_seq=4 ttl=64 time=4.53 ms: lost 3 packets (seq 1 - 3)
1408 bytes from 192.168.111.109: icmp_seq=1369 ttl=64 time=2.37 ms: lost 1 packets (seq 1368 - 1368)
1408 bytes from 192.168.111.109: icmp_seq=2741 ttl=64 time=2.51 ms: lost 1 packets (seq 2740 - 2740)
1408 bytes from 192.168.111.109: icmp_seq=5736 ttl=64 time=2.57 ms: lost 1 packets (seq 5735 - 5735)

Here is the result for 0.0075 seconds delay:

$ sudo ping -i 0.0075 -s 1400 192.168.111.109 | ruby -e 'n=-1; STDIN.each_line { |l| n += 1; next unless n > 0; seq = l.scan(/icmp_seq=([0-9]*) /).first.first.to_i; puts "#{l.chomp}: lost #{seq - n} packets (seq #{n} - #{seq-1})" if seq != n; n = seq }'
1408 bytes from 192.168.111.109: icmp_seq=291 ttl=64 time=2.33 ms: lost 20 packets (seq 271 - 290)
1408 bytes from 192.168.111.109: icmp_seq=703 ttl=64 time=2.32 ms: lost 21 packets (seq 682 - 702)
1408 bytes from 192.168.111.109: icmp_seq=974 ttl=64 time=3.12 ms: lost 21 packets (seq 953 - 973)
1408 bytes from 192.168.111.109: icmp_seq=1528 ttl=64 time=3.19 ms: lost 20 packets (seq 1508 - 1527)
1408 bytes from 192.168.111.109: icmp_seq=2085 ttl=64 time=3.87 ms: lost 21 packets (seq 2064 - 2084)
1408 bytes from 192.168.111.109: icmp_seq=2499 ttl=64 time=2.46 ms: lost 21 packets (seq 2478 - 2498)

Here is the result for 0.003 seconds delay:

$ sudo ping -i 0.003 -s 1400 192.168.111.109 | ruby -e 'n=-1; STDIN.each_line { |l| n += 1; next unless n > 0; seq = l.scan(/icmp_seq=([0-9]*) /).first.first.to_i; puts "#{l.chomp}: lost #{seq - n} packets (seq #{n} - #{seq-1})" if seq != n; n = seq }'
1408 bytes from 192.168.111.109: icmp_seq=384 ttl=64 time=3.23 ms: lost 101 packets (seq 283 - 383)
1408 bytes from 192.168.111.109: icmp_seq=648 ttl=64 time=3.23 ms: lost 14 packets (seq 634 - 647)
1408 bytes from 192.168.111.109: icmp_seq=911 ttl=64 time=2.39 ms: lost 13 packets (seq 898 - 910)
1408 bytes from 192.168.111.109: icmp_seq=1255 ttl=64 time=3.41 ms: lost 94 packets (seq 1161 - 1254)
1408 bytes from 192.168.111.109: icmp_seq=1602 ttl=64 time=2.82 ms: lost 97 packets (seq 1505 - 1601)
1408 bytes from 192.168.111.109: icmp_seq=1865 ttl=64 time=2.47 ms: lost 13 packets (seq 1852 - 1864)
1408 bytes from 192.168.111.109: icmp_seq=2211 ttl=64 time=10.3 ms: lost 96 packets (seq 2115 - 2210)
1408 bytes from 192.168.111.109: icmp_seq=2475 ttl=64 time=2.65 ms: lost 14 packets (seq 2461 - 2474)
1408 bytes from 192.168.111.109: icmp_seq=2739 ttl=64 time=2.39 ms: lost 14 packets (seq 2725 - 2738)
1408 bytes from 192.168.111.109: icmp_seq=3002 ttl=64 time=2.42 ms: lost 13 packets (seq 2989 - 3001)
1408 bytes from 192.168.111.109: icmp_seq=3350 ttl=64 time=2.79 ms: lost 98 packets (seq 3252 - 3349)
1408 bytes from 192.168.111.109: icmp_seq=3695 ttl=64 time=2.49 ms: lost 95 packets (seq 3600 - 3694)
1408 bytes from 192.168.111.109: icmp_seq=3959 ttl=64 time=2.36 ms: lost 14 packets (seq 3945 - 3958)
1408 bytes from 192.168.111.109: icmp_seq=4221 ttl=64 time=3.48 ms: lost 12 packets (seq 4209 - 4220)
1408 bytes from 192.168.111.109: icmp_seq=4486 ttl=64 time=2.44 ms: lost 15 packets (seq 4471 - 4485)
1408 bytes from 192.168.111.109: icmp_seq=4833 ttl=64 time=5.01 ms: lost 97 packets (seq 4736 - 4832)
1408 bytes from 192.168.111.109: icmp_seq=5180 ttl=64 time=2.32 ms: lost 97 packets (seq 5083 - 5179)
1408 bytes from 192.168.111.109: icmp_seq=5529 ttl=64 time=2.50 ms: lost 99 packets (seq 5430 - 5528)
1408 bytes from 192.168.111.109: icmp_seq=5793 ttl=64 time=2.41 ms: lost 14 packets (seq 5779 - 5792)

Here is the result for a zero delay:

$ sudo ping -i 0 -s 1400 192.168.111.109 | ruby -e 'n=-1; STDIN.each_line { |l| n += 1; next unless n > 0; seq = l.scan(/icmp_seq=([0-9]*) /).first.first.to_i; puts "#{l.chomp}: lost #{seq - n} packets (seq #{n} - #{seq-1})" if seq != n; n = seq }'
1408 bytes from 192.168.111.109: icmp_seq=585 ttl=64 time=2.49 ms: lost 112 packets (seq 473 - 584)
1408 bytes from 192.168.111.109: icmp_seq=945 ttl=64 time=2.64 ms: lost 110 packets (seq 835 - 944)
1408 bytes from 192.168.111.109: icmp_seq=1306 ttl=64 time=2.21 ms: lost 111 packets (seq 1195 - 1305)
1408 bytes from 192.168.111.109: icmp_seq=1666 ttl=64 time=2.48 ms: lost 110 packets (seq 1556 - 1665)
1408 bytes from 192.168.111.109: icmp_seq=1941 ttl=64 time=2.84 ms: lost 25 packets (seq 1916 - 1940)
1408 bytes from 192.168.111.109: icmp_seq=2302 ttl=64 time=2.42 ms: lost 111 packets (seq 2191 - 2301)
1408 bytes from 192.168.111.109: icmp_seq=2662 ttl=64 time=2.52 ms: lost 110 packets (seq 2552 - 2661)
1408 bytes from 192.168.111.109: icmp_seq=3023 ttl=64 time=3.27 ms: lost 111 packets (seq 2912 - 3022)
1408 bytes from 192.168.111.109: icmp_seq=3297 ttl=64 time=2.51 ms: lost 24 packets (seq 3273 - 3296)
1408 bytes from 192.168.111.109: icmp_seq=3654 ttl=64 time=2.42 ms: lost 107 packets (seq 3547 - 3653)
1408 bytes from 192.168.111.109: icmp_seq=3927 ttl=64 time=2.44 ms: lost 23 packets (seq 3904 - 3926)
1408 bytes from 192.168.111.109: icmp_seq=4286 ttl=64 time=2.36 ms: lost 109 packets (seq 4177 - 4285)
1408 bytes from 192.168.111.109: icmp_seq=4649 ttl=64 time=2.23 ms: lost 113 packets (seq 4536 - 4648)
1408 bytes from 192.168.111.109: icmp_seq=4925 ttl=64 time=4.33 ms: lost 26 packets (seq 4899 - 4924)
1408 bytes from 192.168.111.109: icmp_seq=5286 ttl=64 time=3.25 ms: lost 111 packets (seq 5175 - 5285)
1408 bytes from 192.168.111.109: icmp_seq=5646 ttl=64 time=2.39 ms: lost 110 packets (seq 5536 - 5645)
1408 bytes from 192.168.111.109: icmp_seq=6006 ttl=64 time=2.51 ms: lost 110 packets (seq 5896 - 6005)
1408 bytes from 192.168.111.109: icmp_seq=6365 ttl=64 time=2.43 ms: lost 109 packets (seq 6256 - 6364)
1408 bytes from 192.168.111.109: icmp_seq=6725 ttl=64 time=4.31 ms: lost 110 packets (seq 6615 - 6724)
1408 bytes from 192.168.111.109: icmp_seq=7087 ttl=64 time=2.51 ms: lost 112 packets (seq 6975 - 7086)
1408 bytes from 192.168.111.109: icmp_seq=7448 ttl=64 time=2.27 ms: lost 111 packets (seq 7337 - 7447)

From listening to the sound from the capacitors I have a feeling that the router only sends packets about half the time (when delay is zero). It's "hshshsh - silence - hshshshsh - silence - shshshsh", and the silence is about as long as the hissing sound, or even longer. The reason more packets aren't lost is probably some queue? But there are no printouts about packets being dropped or TX queue overflow. In fact there is no abnormal output in the log at all. Top shows the router as 95% idle and 3% softirq.
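
To put a number on the pattern, the share of packets lost per cycle can be computed from the burst boundaries. This is a sketch using the first four large bursts of the zero-delay run; the name loss_cycles is made up for illustration, and the result is a share of packets, not of time.

```ruby
# Each burst is [first_lost_seq, last_lost_seq], copied from the zero-delay run.
BURSTS = [[473, 584], [835, 944], [1195, 1305], [1556, 1665]].freeze

# For each pair of consecutive bursts, return the cycle length
# (burst start to next burst start, in packets) and the share lost.
def loss_cycles(bursts)
  bursts.each_cons(2).map do |(first, last), (next_first, _)|
    lost = last - first + 1
    period = next_first - first
    { lost: lost, period: period, fraction: lost.to_f / period }
  end
end

loss_cycles(BURSTS).each do |c|
  puts format("lost %d of %d packets (%.0f%%)", c[:lost], c[:period], c[:fraction] * 100)
end
```

For these bursts roughly 30% of the packets in each cycle are lost. That is less than the "about half" suggested by ear, which would fit the guess that a queue absorbs part of each stall, but it does not prove it.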

I'm thinking we should make some effort to try and reproduce these problems on desktop x86 (PCI or PCI Express) hardware with the ath9k driver.  I'm wondering if it is AHB bus specific or MIPS architecture issue...

Anyone that can try this on a desktop?

Maybe we should at least put this in the openwrt bugtrack database? Would be a shame to let such a detailed bug report with sound and all go to waste. wink I could even make a recording if you think that would help. smile

RoundSparrow wrote:

1. Intel N clients cannot connect to an ath9k AP at speeds greater than 54 Mbps. see: https://lists.ath9k.org/pipermail/ath9k … 01292.html  and follow-up: https://lists.ath9k.org/pipermail/ath9k … 01296.html

2.  Lark reports AP81 issues with TX and RX interaction problems and packet loss.  see: https://lists.ath9k.org/pipermail/ath9k … 01294.html

As you can see in the follow-up message, the 54 Mbps limit problem is easily solved by adding the "wme_enabled=1" setting.

But the packet loss issues are very similar to those reported by Lark / _bbb_, and he has an Intel client card too! Here are my ping reports:

https://lists.ath9k.org/pipermail/ath9k … 01298.html
https://lists.ath9k.org/pipermail/ath9k … 01302.html

So I suppose reports (1) and (2) should be united into one.

There are some other users reporting problems with a similar setup (ath9k AP <-> win32 Intel 4965/5100/5300 client).
And I've seen no success reports with this setup.

Yura80 wrote:

But the packet loss issues are very similar to those reported by Lark / _bbb_, and he has an Intel client card too! Here are my ping reports:

Just to be clear: my setup is Atheros-Atheros.

A thought struck me earlier today: I remember seeing a log print from the (modified) original TrendNET TEW-632BRP firmware that said something about a workaround for TX queue lockup. Was it some sort of watchdog thread that checked for lockups? I think I have seen a reference in some README to this as a known hardware bug. Maybe that is what we are seeing in different shapes and forms? Does anybody remember the details?

_bbb_ wrote:

Anyone that can try this on a desktop?

There is a thread on hostapd mailing list with someone complaining of low throughput (20Mbps maxing out, similar to what I see on OpenWrt and these routers).

Threads with ath9k in subject: http://lists.shmoo.com/pipermail/hostap … hread.html

Added #5 to the list.  I would really like some help getting some of these posted as bugs on kernel.org ...

BTW, some big changes are coming in ath9k, with code sharing between ath5k/ath9k/ar9170.  See: http://marc.info/?l=linux-wireless& … 40&w=2

(Last edited by RoundSparrow on 31 Mar 2009, 20:48)

For #5, I don't think it drops connections.  It never dropped a connection while I was connected, but when no wireless devices are connected the wifi kind of drops out after a few minutes, and you can't connect to it until I reset.  It never dropped while someone was connected.

In the forum thread you linked to for #5,
MENTALDOMINANCE claimed to always have it online, but he has a printer connected to it.
And for _bbb_ it happens when his MacBook goes to sleep (it disconnects, and nothing else is connected to the wifi?).
Just a thought.

btw i have the dlink 615-c1

(Last edited by jmlb on 12 Nov 2009, 19:28)

The discussion might have continued from here.