OpenWrt Forum Archive

Topic: ar9331's usb stability issue - [SOLVED]

The content of this topic has been archived between 23 Mar 2017 and 6 May 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

You need to remove only the ".config" file in the OpenWrt root directory.

You also need to fetch all the packages from the separate feeds and install them before running "make menuconfig".

From the root directory, do:

./scripts/feeds update -a
./scripts/feeds install -a

You should then get some 3000 packages ;)

Ok, thank you.  I didn't realize that ".config" was an actual file which would not be shown by "ls". 

Running now.  I wonder what time tomorrow it will finish.

On a quad core i7, the Full Monty takes about 2h15.

You can then check in the "logs" directory to see what happened, especially the "logs/package/error.txt" file, which lists the packages that failed to build.

This procedure is what is performed periodically by the buildbot: http://buildbot.openwrt.org:8010/.

Well, some progress, but I appear stuck again.  The first attempt failed saying I needed libssl-dev, so:

sudo apt-get update
sudo apt-get install libssl-dev

The next pass ran for an hour and 20 minutes, building many packages, but then failed in a way that doesn't tell me much.  The "logs/package/error.txt" file doesn't exist.

Here are the last 20 or so lines of the run:

rm -f /home/lb/openwrt/aa_r39407/staging_dir/host/bin/python2
(cd /home/lb/openwrt/aa_r39407/staging_dir/host/bin; ln -s python2.7 python2)
rm -f /home/lb/openwrt/aa_r39407/staging_dir/host/bin/python2-config
(cd /home/lb/openwrt/aa_r39407/staging_dir/host/bin; ln -s python2.7-config python2-config)
rm -f /home/lb/openwrt/aa_r39407/staging_dir/host/bin/python-config
(cd /home/lb/openwrt/aa_r39407/staging_dir/host/bin; ln -s python2-config python-config)
test -d /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig || /usr/bin/install -c -d -m 755 /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig
rm -f /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig/python2.pc
(cd /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig; ln -s python-2.7.pc python2.pc)
rm -f /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig/python.pc
(cd /home/lb/openwrt/aa_r39407/staging_dir/host/lib/pkgconfig; ln -s python2.pc python.pc)
/usr/bin/install -c -m 644 ./Misc/python.man \
                /home/lb/openwrt/aa_r39407/staging_dir/host/share/man/man1/python2.7.1
make[4]: Leaving directory `/home/lb/openwrt/aa_r39407/build_dir/host/Python-2.7.3'
install -m0755 /home/lb/openwrt/aa_r39407/build_dir/host/Python-2.7.3/Parser/pgen /home/lb/openwrt/aa_r39407/staging_dir/host/bin/
touch /home/lb/openwrt/aa_r39407/build_dir/host/Python-2.7.3/.built
make[3]: Leaving directory `/home/lb/openwrt/aa_r39407/feeds/packages/lang/python'
make[2]: Leaving directory `/home/lb/openwrt/aa_r39407'
make[1]: *** [/home/lb/openwrt/aa_r39407/staging_dir/target-mips_r2_uClibc-0.9.33.2/stamp/.package_compile] Error 2
make[1]: Leaving directory `/home/lb/openwrt/aa_r39407'
make: *** [world] Error 2
lb@lb-atom1:~/openwrt/aa_r39407$

(Last edited by lizby on 1 Mar 2014, 21:26)

Well, I've built openwrt many dozens of times over the past 6 years and flashed several dozen different devices, but my notes are old and you can always learn something new--thanks for the references.

In going over those links you referred to, I saw something I hadn't used before: "make prereq". I tried that, got a little further but not much, and stopped again with Error 2.

make[3]: Leaving directory `/home/lb/openwrt/aa_r39407/feeds/packages/net/axel'
make[2]: Leaving directory `/home/lb/openwrt/aa_r39407'
make[1]: *** [/home/lb/openwrt/aa_r39407/staging_dir/target-mips_r2_uClibc-0.9.33.2/stamp/.package_compile] Error 2
make[1]: Leaving directory `/home/lb/openwrt/aa_r39407'
make: *** [world] Error 2

Again cryptic.  I don't recall in the past having gotten errors for which the meaning was so obscure.  I'll keep looking into prerequisites, but basically I have what I wanted--a way to build an AA image with the serial patch included, and a way to build, if I want additional individual packages.
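For what it's worth, the final "Error 2" lines only say that some package step failed; the real message is usually further up in the output, and rerunning with verbose output (for example "make V=99", or "make V=s" on newer trees) tends to show it. As a rough aid, here is a small Python sketch that guesses which directory was being built when make gave up. It is only a heuristic: the name suspect_directory is made up for illustration, and it assumes that the deepest "Leaving directory" line right before the first "***" line points at the culprit, which may be wrong for parallel builds.

```python
import re

# Matches e.g.: make[3]: Leaving directory `/path/to/feeds/packages/net/axel'
LEAVE_RE = re.compile(r"^make\[(\d+)\]: Leaving directory [`'](.+)'$")
# Matches e.g.: make[1]: *** [/path/stamp/.package_compile] Error 2
ERROR_RE = re.compile(r"^make(?:\[\d+\])?: \*\*\* .* Error \d+$")


def suspect_directory(log_text):
    """Guess the directory that failed (heuristic, hypothetical helper).

    Looks at the run of "Leaving directory" lines directly before the
    first "***" error line and returns the one with the deepest make
    level. Returns None if no such line is found.
    """
    trailing = []  # (make level, directory) for the current run of lines
    for raw in log_text.splitlines():
        line = raw.strip()
        leave = LEAVE_RE.match(line)
        if leave:
            trailing.append((int(leave.group(1)), leave.group(2)))
        elif ERROR_RE.match(line):
            if not trailing:
                return None
            return max(trailing, key=lambda item: item[0])[1]
        else:
            trailing = []  # any other line breaks the run
    return None
```

Fed the five log lines quoted in this post, it returns the feeds/packages/net/axel directory, which is at least a hint about which package to rebuild on its own with verbose output.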

(Last edited by lizby on 2 Mar 2014, 01:49)

Hi

I'm still fighting to use a plain Low Speed keypad with my WR703N, with a hub in between. I'm running BB r39770.

If I use a High Speed hub, it works perfectly, but if I use a Full Speed hub, it doesn't (the keypad is not recognized as HID, and no message is displayed in dmesg).

High Speed hub (7 ports):

[  128.180000] usb 1-1: new high-speed USB device number 3 using ehci-platform
[  128.340000] hub 1-1:1.0: USB hub found
[  128.340000] hub 1-1:1.0: 4 ports detected
[  128.620000] usb 1-1.1: new high-speed USB device number 4 using ehci-platform
[  128.740000] hub 1-1.1:1.0: USB hub found
[  128.740000] hub 1-1.1:1.0: 4 ports detected
[  148.450000] usb 1-1.1.1: new low-speed USB device number 5 using ehci-platform
[  148.600000] input: HID 1710:8812 as /devices/platform/ehci-platform/usb1/1-1/1-1.1/1-1.1.1/1-1.1.1:1.0/input/input0
[  148.610000] hid-generic 0003:1710:8812.0001: input,hidraw0: USB HID v1.10 Keyboard [HID 1710:8812] on usb-ehci-platform-1.1.1/input0

Full Speed hub (4 ports):

[  184.550000] usb 1-1: new full-speed USB device number 6 using ehci-platform
[  184.720000] hub 1-1:1.0: USB hub found
[  184.720000] hub 1-1:1.0: 4 ports detected
[  196.840000] usb 1-1.3: new low-speed USB device number 7 using ehci-platform
[  212.920000] hub 1-1:1.0: cannot reset port 3 (err = -145)
[  213.940000] hub 1-1:1.0: cannot reset port 3 (err = -145)
[  214.960000] hub 1-1:1.0: cannot reset port 3 (err = -145)
[  215.980000] hub 1-1:1.0: cannot reset port 3 (err = -145)

I tried another 4-port FS hub (a different model) with the same result :-(

The build revision suggests that your work / patch described in the thread has been integrated, so I'm wondering what could be wrong.

Any ideas? Thanks!

[Follow-Up]

When using a LS device (keyboards, mice), a true High Speed hub is necessary.
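To spot this failure quickly in a longer log, a small Python sketch like the following can count the "cannot reset port" lines per hub port. The name count_reset_failures is hypothetical, and the regular expression only matches the message format shown in the dmesg output in this thread. As far as I know, error -145 is ETIMEDOUT on MIPS Linux, i.e. the port reset simply timed out.

```python
import re
from collections import Counter

# Matches e.g.: hub 1-1:1.0: cannot reset port 3 (err = -145)
RESET_RE = re.compile(r"hub (\S+): cannot reset port (\d+) \(err = (-?\d+)\)")


def count_reset_failures(dmesg_text):
    """Count port reset failures, keyed by (hub, port, errno)."""
    failures = Counter()
    for line in dmesg_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            key = (match.group(1), int(match.group(2)), int(match.group(3)))
            failures[key] += 1
    return failures
```

On the Full Speed hub log quoted here it reports four failures for port 3 of hub 1-1:1.0, all with error -145.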

Thanks to Squonk for his expertise.

Hello!

Has anybody tried the USB port on the TP-Link MR3220 for USB audio after applying 566-ath9k-ar933x-usb-hang-workaround.patch?

I tried it today and I get a lot of disturbances (sounds like static) when playing audio.
No special messages in dmesg, and checking the stream0 values shows everything is O.K.

I would just like to know if there is something wrong in my build (MPD problems, kernel, alsa....)

Thanks,
Davor.

To continue - tried the daily snapshot on Alix2d2 board - no crackling.

Any comment?


Thank you,
Davor.

davor128 wrote:

To continue - tried the daily snapshot on Alix2d2 board - no crackling.

Any comment?


Thank you,
Davor.

Can you try with WiFi turned off?

Same situation.

I have an alix2d2 with the latest trunk and DAC near the Wifi antenna - no problems.

So it's another issue, since the problem we found was related to RF spurious emissions causing the USB PLL to unlock and hang the USB.

It may cause some glitches in the audio, but if this problem also happens with WiFi turned off, it is something different.

I should add that this situation is not "the end of the world"...
I saw that there is a patch for the USB instability and decided to try.

You may recall my previous tests with this router.

I also found that the Raspberry Pi had (or still has) problems using its USB port for USB audio.

Since the USB port seems to work in situations where basic data transfer is needed, I would say the problem is solved.

On the other hand, it is interesting to see what sort of "solutions" manufacturers use when developing a product.
Putting something on the market even if you know there is a flaw in the product...
And then enthusiasts search for the solution.

I will try some more settings and tests, can't help my curiosity. Hope my two cats won't get hurt...:)

I have a Carambola2-based project, with a LUFA USB CDC-ACM ATmega16U4 device attached, running recently-pulled Barrier Breaker bits (as in today). I believe I have the ath9k USB stability patch.

USB stability is great with wifi enabled in STA mode, associated with an AP.

However, if I am not associated with an AP, USB activity quickly leads to a hang. A reset of the USB device brings the USB back, however it has new Linux /dev links.

I'm theorizing that the unassociated wifi interface is scanning repeatedly - as evidenced by 'iw event' - and this is tripping over some variant of the same issue that's supposed to be fixed here.

Any suggestions on how to proceed?

Thanks -
Dana  K6JQ

The AR9331 WiFi is using an on-chip synthesizer to generate the local oscillator (LO) frequencies for the receiver and transmitter mixers.

This synthesizer is a fractional-N synthesizer taking the 40 MHz crystal input as the reference and using an on-chip voltage controlled oscillator (VCO) to provide the desired LO signal based on a phase/frequency locked loop (PLL).

Unfortunately, it looks like the AR9331 USB in full-speed mode is influenced in some way when there is a required change in the LO frequency, such as when powering up or performing a channel reselection.

My guess is that the built-in FS USB controller is deriving its clock from this PLL in some way, and as the synthesizer takes approximately 0.2 ms to settle, the USB controller just hangs.
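To make the fractional-N idea concrete, here is a small Python sketch of the arithmetic. It assumes the 40 MHz reference mentioned in this post and the standard 2.4 GHz channel centers (2407 + 5 x channel, in MHz, for channels 1 to 13). The function names are made up, and the real AR9331 synthesizer may well run its VCO at a different multiple with its own register encoding, so treat this as an illustration only.

```python
REF_MHZ = 40.0  # crystal reference assumed from the post


def channel_center_mhz(channel):
    """Center frequency of a 2.4 GHz WiFi channel (1..13) in MHz."""
    if not 1 <= channel <= 13:
        raise ValueError("only channels 1..13 are handled in this sketch")
    return 2407 + 5 * channel


def divider_ratio(lo_mhz, ref_mhz=REF_MHZ):
    """Split LO/reference into its integer and fractional parts."""
    ratio = lo_mhz / ref_mhz
    integer = int(ratio)
    return integer, ratio - integer


for ch in (1, 6, 11):
    n, frac = divider_ratio(channel_center_mhz(ch))
    print("channel %2d: %d MHz = 40 MHz x (%d + %.3f)"
          % (ch, channel_center_mhz(ch), n, frac))
# channel  1: 2412 MHz = 40 MHz x (60 + 0.300)
# channel  6: 2437 MHz = 40 MHz x (60 + 0.925)
# channel 11: 2462 MHz = 40 MHz x (61 + 0.550)
```

Every channel change therefore means a new divider setting and a short period during which the loop has to settle again.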

For an unknown reason, this phenomenon does not happen when the High-Speed USB controller is used.

If you look at the patch, its role is mainly to disable the USB PLL lock detect when either WiFi power up or channel reselection is performed:
http://patchwork.openwrt.org/patch/4608/

From what you say, we may well have missed one case when the device is not associated... The game is to find where this happens!

It is not as critical as the original behavior, since it only happens when WiFi is turned on and not in associated STA or AP mode, but I understand this may be serious in some particular cases.

If you have time to investigate this case, please go ahead and try to determine where the PLL frequency might be changed during scanning, as I won't have much time to allocate to this problem in the near future.

(Last edited by Squonk on 26 Apr 2014, 10:06)

Thank you for the insight - this confirmed and clarified my understanding of the USB PLL issue (a truly marvelous bit of SoC randomness there).

While I have not tested it yet, I tend to believe an unassociated AP won't encounter this problem - based on my possibly incomplete understanding that the AP won't channel scan and the assumption that unassociated STA mode is repeatedly channel-scanning.

Unfortunately I am limited to Full and Low Speed with the ATmega16U4; I haven't tried it yet, but I suspect that limiting to Low Speed would not work around this issue (it likely uses the same hardware path inside the AR9331).

An application-level work-around for me may be simply disabling the unassociated wifi during USB I/O; sounds nice in theory but rather ugly in practice.
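As a sketch of what that application-level workaround could look like, here is a small Python context manager that takes WiFi down for the duration of a block of USB I/O and brings it back up afterwards. It assumes the stock OpenWrt "wifi" script with its "down" and "up" subcommands and that Python is available on the target; wifi_paused is a made-up name, and the command runner is a parameter only so the logic can be tried without touching a real radio.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def wifi_paused(run=subprocess.check_call):
    """Bring WiFi down while the body runs, then back up (hypothetical helper)."""
    run(["wifi", "down"])
    try:
        yield
    finally:
        # Restore WiFi even if the USB I/O in the body raised an exception.
        run(["wifi", "up"])


# Example use on the device (do_usb_transfer is a placeholder for your own code):
#
#     with wifi_paused():
#         do_usb_transfer()
```

This obviously costs connectivity during every transfer, which is the "ugly in practice" part.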

I'll take a look at the Wifi driver - admittedly not very familiar with it - and see if I can understand where the unassociated scan case is evading the existing patch. Any advice, help, encouragement is highly appreciated.

Thanks again!

danak6jq wrote:

While I have not tested it yet, I tend to believe an unassociated AP won't encounter this problem - based on my possibly incomplete understanding that the AP won't channel scan and the assumption that unassociated STA mode is repeatedly channel-scanning.

Yes, the problem is probably coming from the channel scanning.

danak6jq wrote:

Unfortunately I am limited to Full and Low Speed with the ATmega16U4; I haven't tried it yet, but I suspect that limiting to Low Speed would not work around this issue (it likely uses the same hardware path inside the AR9331).

AFAICT, Low-Speed USB does not work with the AR9331 unless you put a hub in between. This is rather strange, since basically the only change required between FS and LS is to divide the clock by 8 (or oversample x8, which is the same). This is unlike HS versus FS/LS, which use 2 distinct controllers, with the HS one handing the corresponding packets over to the FS/LS one.

danak6jq wrote:

I'll take a look at the Wifi driver - admittedly not very familiar with it - and see if I can understand where the unassociated scan case is evading the existing patch. Any advice, help, encouragement is highly appreciated.

Good luck, maybe "iw event" can provide some details? The problem is the PLL startup and channel reselection, so look at when this happens to find the culprit.

Good luck!

The issue: un-associated STA Wifi causes hang of Full-Speed USB I/O.

Experimentation has shown:

- Simply holding the USB CDC-ACM device open without traffic is adequate to reproduce this issue (may take a minute)
- 'iw event' indicates, as expected, un-associated STA is repeatedly channel scanning
- Disabling fast channel-change in ath9k/hw.c does not make a difference
- Issue is 100% reproducible
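For anyone who wants to reproduce the first point, here is a minimal Python sketch that just holds a device node open without any traffic and watches whether the node disappears. It assumes that Python is available on the target, that the CDC-ACM device shows up as something like /dev/ttyACM0 (an assumption, adjust to your setup), and that a hang followed by a reset makes the node vanish or re-enumerate under a new name. A hang that leaves the node in place will not be detected by this check. hold_open is a made-up name.

```python
import os
import time


def hold_open(path, seconds, poll=1.0):
    """Keep `path` open for `seconds` without sending any traffic.

    Returns True if the device node still existed at every poll,
    False as soon as it is found missing.
    """
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    try:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            if not os.path.exists(path):
                return False
            time.sleep(poll)
        return True
    finally:
        os.close(fd)


# Example on the device, with 'iw event' running in another shell:
#
#     hold_open("/dev/ttyACM0", seconds=120)
```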

Hello everybody. Thanks for all the information in this topic (especially to Squonk); it is very useful. I'm fighting with the same issue as danak6jq. I'm using a Carambola 2 with an external USB device (LPC11U3X) and trying to stabilize the communication between them. As soon as the wifi disconnects from or connects to any network, the communication between the USB device and the Carambola 2 is dropped. I patched the firmware with the 566 patch from the 8devices forum, but this did not solve the issue at all. I also tested the other patch, 561 from the OpenWrt resources, but it didn't help either. Communication is still unstable and can still be broken by wifi activity. I'm not sure whether I built this firmware correctly. You write that the problem is solved by this patch, but I'm not sure I applied it properly when I built the firmware. Is this really SOLVED and I just failed to follow your instructions, or is it possible that this won't work even with the patches? The build can only be prepared with one of these two patches (561 from OpenWrt or 566 from 8devices); with both applied, compilation fails with errors. Am I doing this right? Is there any way to check whether the patch is included in the running system that I have loaded onto the Carambola 2? Can you help me with this issue? I'm sorry for my English; I hope you understand what I mean. I must have stable communication between the USB device and the Carambola 2 regardless of wifi activity. Thank you for your answer.

(Last edited by rudebwoy on 6 May 2014, 15:56)

rudebwoy wrote:

Hello everybody. Thanks for all the information in this topic (especially to Squonk); it is very useful. I'm fighting with the same issue as danak6jq. I'm using a Carambola 2 with an external USB device (LPC11U3X) and trying to stabilize the communication between them. As soon as the wifi disconnects from or connects to any network, the communication between the USB device and the Carambola 2 is dropped. I patched the firmware with the 566 patch from the 8devices forum, but this did not solve the issue at all. I also tested the other patch, 561 from the OpenWrt resources, but it didn't help either. Communication is still unstable and can still be broken by wifi activity. I'm not sure whether I built this firmware correctly. You write that the problem is solved by this patch, but I'm not sure I applied it properly when I built the firmware. Is this really SOLVED and I just failed to follow your instructions, or is it possible that this won't work even with the patches? The build can only be prepared with one of these two patches (561 from OpenWrt or 566 from 8devices); with both applied, compilation fails with errors. Am I doing this right? Is there any way to check whether the patch is included in the running system that I have loaded onto the Carambola 2? Can you help me with this issue? I'm sorry for my English; I hope you understand what I mean. I must have stable communication between the USB device and the Carambola 2 regardless of wifi activity. Thank you for your answer.

I personally wondered if patch 566 was really applied, but I tinkered with the patch and confirmed that it is indeed present on my firmware build. This left me believing that the patch solves a worse problem - that the AR9331 running as an AP kills the USB. If you use the AR9331 as a client/STA, and it is unassociated, it will still kill the USB.

Ultimately, I gave up on a patch solution for this, and tested a passive USB 2.0 hub between the AR9331 and my USB peripheral, and, as expected, USB operation has been stable. The side-effect for me is that adding a USB 2.0 hub means my project will have an external USB port now, which is a feature I was considering anyway. So I'm happy enough.

I do have doubts that any software patch will resolve this issue anyway.

Thank you, danak6jq. I got the same result as you described. Anyway, this thread will stay open for OpenWrt developers in case somebody wants to help and patch this issue.
Thank you once more.

I will try to explain the root cause for this bug, it may help to understand the situation...

Basically, a piezoelectric crystal provides a very stable (~30 ppm, i.e. parts per million, not per cent!) filter for an oscillator in the MHz range. On the AR9331, it is either a 25 MHz or a 40 MHz crystal.
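To put the ppm figure into numbers, here is a tiny Python sketch. It assumes the ~30 ppm tolerance quoted here and uses 2412 MHz (WiFi channel 1) as an example RF frequency; ppm_offset_hz is a made-up helper name. Since a locked PLL output is a fixed multiple of its reference, the relative error of the crystal carries over to the RF frequency.

```python
def ppm_offset_hz(freq_hz, ppm):
    """Absolute frequency offset in Hz for a relative tolerance in ppm."""
    return freq_hz * ppm / 1e6


print(ppm_offset_hz(25e6, 30))    # 750.0 (Hz at the 25 MHz crystal)
print(ppm_offset_hz(40e6, 30))    # 1200.0 (Hz at the 40 MHz crystal)
print(ppm_offset_hz(2412e6, 30))  # 72360.0 (Hz at 2412 MHz)
```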

However, the WiFi RF requires much higher frequencies, in the 2.4 GHz range, that are still very stable. Moreover, it needs not a single fixed frequency but a variable one within this band, in order to cover the different available channels.

This is achieved through the use of an electronic circuit called a PLL (Phase-Locked Loop), which uses the lower-frequency crystal-based oscillator to "drive" the higher-frequency oscillator: when the higher-frequency oscillator is "late" compared to the stable lower-frequency one, it is boosted a little; if "early", it is slowed down a little. The higher-frequency oscillator is thus "phase-locked" to the lower one.
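The "boost when late, slow down when early" behavior can be illustrated with a toy simulation in Python. This is only a sketch in normalized units: the divider of 60, the free-running frequency, and the loop gains kp and ki are values picked for the illustration, not AR9331 parameters, and simulate_pll is a made-up name.

```python
def simulate_pll(ref_freq=1.0, divider=60, free_run=58.0,
                 kp=120.0, ki=60.0, dt=0.01, steps=2000):
    """Toy PLL in normalized units; returns the VCO frequency after `steps`.

    The VCO output is divided by `divider` and compared with the reference.
    A proportional-integral loop filter steers the VCO until the divided
    output tracks the reference, i.e. until the VCO runs at
    divider * ref_freq.
    """
    phase_error = 0.0  # reference phase minus divided VCO phase, in cycles
    integral = 0.0     # accumulated phase error (integral term)
    vco_freq = free_run
    for _ in range(steps):
        # Positive error: the VCO is "late", so boost it. Negative: slow it.
        vco_freq = free_run + kp * phase_error + ki * integral
        phase_error += (ref_freq - vco_freq / divider) * dt
        integral += phase_error * dt
    return vco_freq


print(round(simulate_pll(), 3))  # settles close to divider * ref_freq = 60
```

The point of the model is only that the loop needs some time to pull the oscillator onto its new target whenever the target changes, which is the settling period during which a dependent clock can be disturbed.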

Unfortunately, in the AR9331 it looks like changing the RF PLL frequency affects the USB clock, which is also derived from the single 25 MHz crystal-based oscillator by (I hope!) a separate PLL that generates the 12 MHz clock required by the Full-Speed USB OHCI controller. For an unknown reason, it does not affect the 480 MHz clock required by the separate High-Speed USB EHCI controller, which usually hands FS traffic over to the OHCI one...

Thus, the AR9331 has a problem when starting the RF oscillator or changing the WiFi channel, and what the patch basically does is to "hold" the USB PLL while doing these actions.

But it looks like there is still a case that is not handled correctly when a client is not associated to an AP and regularly sends WiFi "Probe Request" packets on the different available channels to get a list of available APs, causing the same problem on the USB.

IMHO, it is not a major problem though, since when not associated, you can't do much anyway, but it is nevertheless the same bug, we just have to find where to put the missing workaround macro!

So I think you are safe as long as you are either an AP or an associated client, please report if not.

Squonk wrote:

But it looks like there is still a case that is not handled correctly when a client is not associated to an AP and regularly sends WiFi "Probe Request" packets on the different available channels to get a list of available APs, causing the same problem on the USB.

IMHO, it is not a major problem though, since when not associated, you can't do much anyway, but it is nevertheless the same bug, we just have to find where to put the missing workaround macro!

So I think you are safe as long as you are either an AP or an associated client, please report if not.

I beg to differ regarding the seriousness of this issue in embedded applications of the AR9331 where the Wifi interface is configured as a client / STA. First of all, in an embedded application, Wifi connectivity is not required for the device to function as a standalone system - think of the AR9331 as an application CPU that happens to have both an Ethernet and Wifi interface available as needed (and not a Wifi device that happens to be able to run applications).

Further, the stability of the AR9331 with client-mode Wifi is directly impacted by that of an associated AP. If the AR9331 client Wifi is associated with an AP, and the AP is rebooted or power is interrupted, even a fairly short interruption of AP association will cause the AR9331 to enter Wifi scan mode and the USB is likely to be broken.

Once the USB is broken, programmatic recovery is difficult, if possible at all.
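One way an application could at least notice the dangerous window is to poll the association state and hold off USB I/O while the STA is unassociated. The Python sketch below parses the output of "iw dev <interface> link", which prints "Not connected." when unassociated and a "Connected to ..." line otherwise. The interface name wlan0 and both function names are assumptions for illustration, and this only detects the state; it does not prevent the hang.

```python
import subprocess


def is_associated(link_output):
    """Return True if `iw dev <if> link` output reports an association."""
    for line in link_output.splitlines():
        if line.strip().startswith("Connected to "):
            return True
    return False


def read_link_state(interface="wlan0"):
    """Run `iw dev <interface> link` and report whether the STA is associated.

    The default interface name is an assumption; adjust it to your setup.
    """
    out = subprocess.check_output(["iw", "dev", interface, "link"])
    return is_associated(out.decode("utf-8", "replace"))
```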

(Last edited by danak6jq on 8 May 2014, 15:24)

danak6jq wrote:

I beg to differ regarding the seriousness of this issue in embedded applications of the AR9331 where the Wifi interface is configured as a client / STA. First of all, in an embedded application, Wifi connectivity is not required for the device to function as a standalone system - think of the AR9331 as an application CPU that happens to have both an Ethernet and Wifi interface available as needed (and not a Wifi device that happens to be able to run applications).

In this case, you can just turn off WiFi.

danak6jq wrote:

Further, the stability of the AR9331 with client-mode Wifi is directly impacted by that of an associated AP. If the AR9331 client Wifi is associated with an AP, and the AP is rebooted or power is interrupted, even a fairly short interruption of AP association will cause the AR9331 to enter Wifi scan mode and the USB is likely to be broken.

Once the USB is broken, programmatic recovery is difficult, if possible at all.

Yes, good point on this one!

So, we need to find the spots where we need to add the missing workaround macros.