Possible cause of R7800 latency issues

Amazing work, very well done! :slight_smile: Would you mind creating a pull request for this fix?

1 Like

@kyva1929 or someone else should do that. I don't know enough about the internals to make pull requests, I just followed the guide kyva1929 wrote. The patch needs to be cleaned up and done properly.

Could you share this build for others to test please?

1 Like

If anyone here wants to try a preemptive (non voluntary) kernel, here is the patch from @dissent1 :

UPDATE: theoreemtive kernel still needs to be enabled, but this patch fixes a boot issue.

diff --git a/target/linux/ipq806x/patches-4.9/0038-clk-qcom-Add-support-for-High-Frequency-PLLs-HFPLLs.patch b/target/linux/ipq806x/patches-4.9/0038-clk-qcom-Add-support-for-High-Frequency-PLLs-HFPLLs.patch
index 7092614..a206757 100644
--- a/target/linux/ipq806x/patches-4.9/0038-clk-qcom-Add-support-for-High-Frequency-PLLs-HFPLLs.patch
+++ b/target/linux/ipq806x/patches-4.9/0038-clk-qcom-Add-support-for-High-Frequency-PLLs-HFPLLs.patch
@@ -117,7 +117,7 @@ Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
 +       * H/W requires a 5us delay between disabling the bypass and
 +       * de-asserting the reset. Delay 10us just to be safe.
 +       */
-+      usleep_range(10, 100);
++      udelay(10);
 +
 +      /* De-assert active-low PLL reset. */
 +      regmap_update_bits(regmap, hd->mode_reg, PLL_RESET_N, PLL_RESET_N);
@@ -128,7 +128,7 @@ Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
 +                      regmap_read(regmap, hd->status_reg, &val);
 +              } while (!(val & BIT(hd->lock_bit)));
 +      } else {
-+              usleep_range(60, 100);
++              udelay(60);
 +      }
 +
 +      /* Enable PLL output. */
1 Like

Hmm, I am not seeing any impact; might have to recompile the firmware. However, I did notice the AR8XXX_CAP_MIB_COUNTERS constant that is used to indicate the mib counters capability. If it is not present and num_mibs, mib_desc, and mib_func are set to NULL then it should disable all mib counters.

Would it not be enough to have ar8xx_has_mib_counters return false all the time? Otherwise the callers are still obtaining mutex’s and do extra work. Lock contention is never a good thing

The ar8216-files list @nbd 's name, maybe we should ask for his advice on what is the correct approach?
@nbd, the relevant discussion starts at post 134, please have a look.

1 Like

I set ar8xxx_has_mib_counters to return false:

diff --git a/target/linux/generic/files/drivers/net/phy/ar8216.h b/target/linux/generic/files/drivers/net/phy/ar8216.h
index ba0e0ddccd..322a9a6940 100644
--- a/target/linux/generic/files/drivers/net/phy/ar8216.h
+++ b/target/linux/generic/files/drivers/net/phy/ar8216.h
@@ -582,7 +582,8 @@ static inline bool ar8xxx_has_gige(struct ar8xxx_priv *priv)
 
 static inline bool ar8xxx_has_mib_counters(struct ar8xxx_priv *priv)
 {
-       return priv->chip->caps & AR8XXX_CAP_MIB_COUNTERS;
+       /* return priv->chip->caps & AR8XXX_CAP_MIB_COUNTERS; */
+       return false;
 }
 
 static inline bool chip_is_ar8216(struct ar8xxx_priv *priv)


Here is the resulting ping graph from pinging the first ISP hop:

isphop

As can be seen there are no spikes.

1 Like

Nice results! A still cleaner hack would be to change this line ar8327.c:1509 by removing mib counters capability.

Keep in mind that the target has a variable ping on it's own, that's why the graph appears a bit jittery.

From what I understand the whole driver is a bit of a mess and is in need of a rewrite.

The rewrite is called qca8k and based on the dsa framework.

1 Like

Then why not switch to DSA?

is there any reason why you guys hack around to get a solution yourselves instead of applying the mentioned OpenWRT patch directly?

If you are referring to the commit mentioned in this thread earlier, then it is already in. Just not helping as much.

so what did you guys do then to fix it? additional hacks of other parts of the switch code?

yes, it's already on master, right?

@Ansuel blogic worked on this, see his staging-tree qca8k
There is a patch in the Netgear R7800 topic, enabling DSA, which worked for a while, but at my last attempt not anymore, too much manual hacking due to the changes in master, couldn't find out exactly what was wrong.

1 Like

Core team wants a transition script written. Currently it requires you to wipe /etc/config/network when flashing.

Well the only thing to remove is the switch section and modify the lab bridge ifname section

Is there a document that describes what needs to be done?