USB xhci-hcd crashing: ipq806x, ipq40xx

Today I am troubleshooting why my ipq806x device USB port is crashing.

The following is a cleaned-up excerpt of the local log messages:

kern.warn kernel: xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command.
kern.warn kernel: xhci-hcd xhci-hcd.0.auto: USBSTS: 0x00000000
kern.warn kernel: xhci-hcd xhci-hcd.0.auto: Host halt failed, -110
kern.err kernel: xhci-hcd xhci-hcd.0.auto: xHCI host controller not responding, assume dead
kern.err kernel: xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
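
For anyone who wants to spot this failure automatically, here is a minimal Python sketch that scans syslog-style lines for the controller-death messages above. The function name `find_xhci_death` and the tiny errno table are illustrative assumptions, not part of any existing tool. The -110 in the "Host halt failed" line corresponds to ETIMEDOUT in the generic Linux errno numbering.

```python
import re

# Generic Linux errno values; only the one seen in the log above is listed.
ERRNO_NAMES = {110: "ETIMEDOUT"}

# Messages copied from the log excerpt that indicate the controller has died.
DEATH_MARKERS = (
    "xHCI host controller not responding, assume dead",
    "HC died; cleaning up",
)

HALT_FAILED = re.compile(r"Host halt failed, -(\d+)")


def find_xhci_death(lines):
    """Return (died, halt_error) for an iterable of log lines.

    died       -- True if any controller-death marker is present
    halt_error -- symbolic errno from a 'Host halt failed' line, or None
    """
    died = False
    halt_error = None
    for line in lines:
        if "xhci-hcd" not in line:
            continue  # ignore unrelated kernel messages
        if any(marker in line for marker in DEATH_MARKERS):
            died = True
        match = HALT_FAILED.search(line)
        if match:
            code = int(match.group(1))
            halt_error = ERRNO_NAMES.get(code, f"errno {code}")
    return died, halt_error
```

The input could be any iterable of log lines, for example the output of `logread` saved to a file and read on a machine that has Python available.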

This is from a tew827dru. I am the maintainer/dev for this device. It has an ipq8064 SoC and two USB3 ports.

The USB device triggering these crashes is an Asus USB-AC51 (mt7610u) in station/client mode. It connects okay, but when I put some network load on it, the whole USB port dies. It's highly reproducible.

I have not tried many other USB devices, but a USB flash drive and another WiFi adapter (AR9271) are working okay.

This is not a power problem. I've load tested the USB ports on my device and they put out plenty of voltage on my test gear.

Googling around turns up multiple other people with similar problems. All of them involve Qualcomm SoCs.

Linksys EA8500 ipq8064
Nighthawk R7800 ipq8065
The ZTE MF287 uses an IPQ4018
A Unielec U4019-01 with ipq4019

Maybe someone will find this some day. I don't really have the interest or time to follow up on it further right now.

No, the MF287 series is based on IPQ4018. Sorry, nothing more to add, just this small correction.

Right. My bad. That's a radio. I think I copy pasted it from a git log somewhere. All the same anyway.

There are two things on top of my mind, even if you've already discarded one of them:

  • power draw, especially peak power draw
    802.11ac uses considerably more power than 802.11n; 802.11ax would be even more power hungry. Peak power requirements might not always last long; a (fraction of a) second might be enough to cause trouble
  • potentially CPU starvation
    we're seeing issues with users trying to pass through their USB WLAN cards to a VM (which involves partial host emulation). Many wireless cards are rather timing-sensitive, so this almost always causes issues (for host and VM), as the emulation simply can't keep up. The ipq8064 might not quite provide the performance necessary for USB here either (so keep an eye on htop/top during the transfers).

Just to be clear, we are talking about the USB controller crashing, not the endpoint.

I tested multiple things to make sure this wasn't a power problem. I used a powered hub and the mt7610u adapter still crashed with the exact same behavior. I tested the ports with a USB power load meter and it holds up the voltage level as expected under 2A load.

One of the links above also mentions testing with a powered hub.

I also tested a couple of NVME-to-USB adapters with the most power-hungry drives I have, along with an old spindle hard drive which has given me problems in the past when it didn't get enough power from the port, and all of those worked fine. I was not able to crash the port with any of those storage devices or get any other odd behavior.

As for CPU, this was tested on an unloaded test device with nothing else going on.

Unfortunately I don't have any other USB WiFi adapters to test right now.

Just as an additional data point, I've been testing a mt7921au USB WLAN card on an ASRock g10 (ipq8064, r26613-169a695280, kernel 6.6.33) overnight (6 GHz, ch69, WPA3-SAE in STA mode). mt76-usb-rx phy and sirq cap the throughput to ~25 MByte/s, but it's been running stable without any error messages or warnings observed for a couple of hours now, having transferred quite a few GBytes of data.

EDIT: iperf3 can push that value to ~610 Mbits/sec, with both cores being totally maxed out (compared to only a single core being maxed out for my more normal throughput tests via https transfers).
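
Since the two figures above use different units, a quick conversion makes them easier to compare. This is just arithmetic with decimal units (1 byte = 8 bits); the helper names are made up for illustration.

```python
def mbyte_to_mbit(mbyte_per_s):
    """Convert MByte/s to Mbit/s (decimal units, 8 bits per byte)."""
    return mbyte_per_s * 8


def mbit_to_mbyte(mbit_per_s):
    """Convert Mbit/s to MByte/s (decimal units, 8 bits per byte)."""
    return mbit_per_s / 8


print(mbyte_to_mbit(25))   # https transfer figure: 200 Mbit/s
print(mbit_to_mbyte(610))  # iperf3 figure: 76.25 MByte/s
```

So the iperf3 run moves roughly three times as much data per second as the single-core https transfers.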

Please show lsusb -t to identify the USB storage device.

Good info! Thanks for testing.

I have a mt7921au adapter on order from Aliexpress, but it might be a few weeks before it shows up. I ordered it just a week ago.

I will also look into building an image for the current dev branch. Maybe I will luck out and this is something already fixed.

Today I got my generic AX1800 mt7921u 802.11ax WiFi adapter and... it works perfectly. No crashes at all, even during some heavy load testing.

Specifically the adapter is from Aliexpress "VictoryEagle Store", $7, very cheap. You can find them by searching for "mt7921". There's also the Fenvi FU-AX1800 and Comfast has a model for a little more, but they all look pretty much the same in the $7-14 range.

My power meter says it's using 50mA more than the older mt7610u adapter, but it's not super accurate and I don't really trust this cheap meter for anything but gross measurements. I'm also using a short USB extension because this adapter won't fit into the USB port due to its size.

One thing that is worth noting is that this new adapter is USB3, so that factor is in play too. All of my other adapters are USB2.

There's one easy way to test that hypothesis: a USB2 extension cable (no idea how well the mt7921au will like that).

I think this is the only USB2 extension cord I own anymore and it's 16 foot long.

Tested. It's working great on USB2.
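
To confirm that an adapter really negotiated USB2 rather than USB3 through a given cable, the kernel exposes the link speed in sysfs. Below is a minimal Python sketch that reads the `speed` attribute (480 for USB2 high-speed, 5000 for USB3 SuperSpeed) for each device under `/sys/bus/usb/devices`. The function name `usb_speeds` is hypothetical, and the sketch assumes Python is available on the box or that the sysfs tree is inspected some other way.

```python
from pathlib import Path


def usb_speeds(sysfs_root="/sys/bus/usb/devices"):
    """Map each USB device name to (product string, speed in Mbit/s as text).

    Entries without a 'speed' attribute (USB interfaces rather than
    devices) are skipped. Returns an empty dict if the root is missing.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    result = {}
    for dev in sorted(root.iterdir()):
        speed_file = dev / "speed"
        if not speed_file.is_file():
            continue
        product_file = dev / "product"
        # Not every device reports a product string.
        product = product_file.read_text().strip() if product_file.is_file() else "?"
        result[dev.name] = (product, speed_file.read_text().strip())
    return result
```

Calling `usb_speeds()` on the router and looking for the adapter's entry would show "480" when it is running over the USB2 extension and "5000" when it is plugged in as USB3.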

Maybe the mt7610u driver is triggering some specific behavior.

I regret I don't have the time to dig into this deeper.