After some further testing by @mm933, it turned out that the issue is actually reproducible on vendor firmware with the magic disabled in the device tree. The dc_ep_clk_on failed error was just no longer present in the dmesg output in the support data, because there were too many other kernel messages.
I also found out why my adaptation of the magic hack for OpenWrt didn't work. The upstream DesignWare PCIe driver code includes auto-detection of the number of ATU regions. This involves overwriting some ATU registers, and happens after the host_init function of the Qualcomm PCIe driver is called. As a result, the programmed ATU entry wasn't actually active.
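To illustrate the ordering problem, here is a toy model in plain C. It is not the actual driver code: the register array, the region count, the probe pattern, the address and all function names are made up for illustration. It only shows how an entry programmed before a write-and-read-back style detection gets clobbered, and that programming it again afterwards is one possible way around that (which doesn't necessarily match what the patch does).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGIONS   4            /* hypothetical number of ATU regions */
#define PROBE_PATTERN 0x11110000u  /* arbitrary test value used by this model */
#define ENTRY_TARGET  0x40000000u  /* hypothetical address programmed in host_init */

/* Toy stand-in for the hardware: one "target" register per ATU region. */
uint32_t atu_target[NUM_REGIONS];

/* Models the platform driver programming a single ATU entry. */
void program_entry(int idx, uint32_t target)
{
    assert(idx >= 0 && idx < NUM_REGIONS);
    atu_target[idx] = target;
}

/*
 * Models auto-detection of the region count: write a pattern to each
 * region and read it back. The write destroys whatever was there before.
 */
int detect_regions(void)
{
    int count = 0;

    for (int i = 0; i < NUM_REGIONS; i++) {
        atu_target[i] = PROBE_PATTERN;
        if (atu_target[i] != PROBE_PATTERN)
            break;
        count++;
    }
    return count;
}

/*
 * Runs the sequence "program entry, then detect" and reports whether the
 * entry still holds the programmed value. With reprogram_after_detect set,
 * the entry is written a second time once detection has finished.
 */
int entry_survives(int reprogram_after_detect)
{
    memset(atu_target, 0, sizeof(atu_target));

    program_entry(0, ENTRY_TARGET);   /* what host_init does in this model */
    detect_regions();                 /* runs later and overwrites region 0 */

    if (reprogram_after_detect)
        program_entry(0, ENTRY_TARGET);

    return atu_target[0] == ENTRY_TARGET;
}
```

In this model, entry_survives(0) returns 0 because detection has overwritten the entry, while entry_survives(1) returns 1.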
I fixed that, and it seems to work now. I also did some additional cleanup and sent the patch to the mailing list.
GitHub: https://github.com/janh/openwrt/commit/1625aa9ce27ed848c851bf739eb29936749bf6f2
Mailing list: https://lists.openwrt.org/pipermail/openwrt-devel/2023-January/040401.html
Patchwork: https://patchwork.ozlabs.org/project/openwrt/patch/20230130224020.473703-1-jan@3e8.eu/
@varda, @jaghatei, or anyone else who encountered the dc_ep_clk_on failed error: This patch should really fix the issue. If you have tested it successfully, please consider sending your Tested-by tag to the mailing list.