Weird Internet disappearance on Archer C7v2

I did that and it did not help. I even compiled my own build from trunk with nothing added but the fast classifier. It still happens.

I think you've still got a MAC conflict, as the table will show that MAC address both on the WAN bridge's interface and on link for the LAN bridge. Even though they are not on the same link, they are likely within the same resolution table.

I'm pretty sure that if you're going to "steal" your PC's MAC, you'll need to replace it with a locally administered MAC.
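For anyone unsure what "locally administered" means here: a MAC is locally administered when the second-least-significant bit of its first octet (0x02) is set, and it is unicast when the lowest bit (0x01) is clear. A minimal Python sketch to check or generate one; the function names are made up for illustration and this is not anything OpenWrt ships:

```python
import random


def is_locally_administered(mac: str) -> bool:
    """Return True if the U/L bit (0x02 of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def random_local_mac() -> str:
    """Generate a random unicast, locally administered MAC address."""
    octets = [random.randint(0, 255) for _ in range(6)]
    # Set the locally-administered bit and clear the multicast bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    print(random_local_mac())
```

Any address this prints should be safe to assign to the PC without colliding with a vendor-assigned MAC.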

I believe if that were the case, it'd happen right away, not only when under load.

Still happening, even with ath79 build from master. Happens only under high load and dmesg doesn't show anything. Blocks

Did you give your PC a locally administered MAC?

If not, you've still got an L2 routing problem, as the router has the MAC as permanent on one of its interfaces, yet it's coming in on another.
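To make the conflict concrete, here is a rough Python sketch that flags any MAC seen on more than one interface. The `(mac, interface)` pairs, the interface names and the addresses are hypothetical placeholders, not taken from the poster's router; on a real system you would fill them in from the bridge and neighbour tables yourself.

```python
from collections import defaultdict


def find_mac_conflicts(entries):
    """Return {mac: [interfaces]} for MACs that appear on more than one interface.

    entries: iterable of (mac, interface) pairs.
    """
    seen = defaultdict(set)
    for mac, iface in entries:
        seen[mac.lower()].add(iface)  # compare MACs case-insensitively
    return {mac: sorted(ifaces) for mac, ifaces in seen.items() if len(ifaces) > 1}


# Hypothetical sample data for illustration only.
sample = [
    ("AA:BB:CC:00:00:01", "eth0"),    # cloned MAC configured on the WAN side
    ("aa:bb:cc:00:00:01", "br-lan"),  # same MAC learned from the PC on the LAN
    ("aa:bb:cc:00:00:02", "br-lan"),  # an unrelated LAN client
]

if __name__ == "__main__":
    print(find_mac_conflicts(sample))
```

If this returns anything for your own tables, that MAC is the one to change.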

If that is the problem, how can you explain it only happening under high load?

Alternatively, you can prove everyone wrong by trying their suggestion.

I'd have to agree with @fantom-x:

If you don't understand why this is a MAC conflict, I don't think explaining resolution tables will make anything clearer to you. I'll simply reiterate:

How about I ask you a question:

A router has the OEM MAC address yy:yy:yy:yy:yy:yy. I clone the MAC of one of my clients: xx:xx:xx:xx:xx:xx. It keeps the same MAC address per your statement:

QUESTION IS: PLEASE EXPLAIN HOW THE LAN DEVICES KNOW WHICH MACHINE xx:xx:xx:xx:xx:xx is the ROUTER (INTERNET GATEWAY), OR THE LAN CLIENT (NO INTERNET GATEWAY)???

This can also be reworded to your specific situation:

Why would router xx:xx:xx:xx:xx:xx send a packet to ITSELF (THE CLIENT) at xx:xx:xx:xx:xx:xx???

I have changed PC's MAC and will let you know if this solves anything. Still, it seems no one can explain why this would only cause issue under high load.

:rofl:

OK...here goes. IF CORRECTING THE MAC CONFLICT FIXES THE PROBLEM, THIS WOULD BE WHY:

As @jeff already said:

As I just said:

NOW FOR ARP TABLES!

Address Resolution Protocol is actually a Layer 2 protocol that links Layer 3 (IP addresses) to itself (the MAC addresses). It's important to note here that ARP is carried out by ALL devices that need to know the MAC address in order to send traffic to an IPv4 device connected directly by Ethernet.

SO WHAT HAPPENS UNDER HIGH LOAD???

If you managed to follow me this far...

LAN CLIENTS: "Where IS 192.168.1.1?"
ROUTER: " 192.168.1.1 is at xx:xx:xx:xx:xx:xx."
CLONED LAN CLIENT: "Where IS 192.168.1.1?"
ROUTER: "WAIT, I'M xx:xx:xx:xx:xx:xx!!!"

Despite your belief, only one device can be xx:xx:xx:xx:xx:xx - as far as the router is concerned.

I theorize that your CPU experiences a condition under high traffic where it hangs or finally resolves xx:xx:xx:xx:xx:xx as itself. HEAVY LOAD IS NEEDED TO CAUSE THE ISSUE YOU'RE EXPERIENCING. It's called a programmatic deadlock...and if it weren't my day off, I would probably remember which type.

And I just remembered!

I tried to find an RFC on the topic and recall from a search that this is the ONLY clear thing that implies that no Layer 2 device should ever see the same MAC anywhere twice:

RFC2321:

  • Networks where vendors inadvertently ship units with
    duplicate MAC addresses to the same end-user or where all users
    have a tool for changing MAC addresses.

Your router is sharing usage of the MAC (if you disagree, you'll need to look at the code on the manufacturer's hardware). As more devices need Internet, it has less usage of the MAC to make the requests on WAN.

@lleachii thank you for your explanation. It does make sense this way.

And it did not help. Even though I've changed MAC of my PC, the exact same thing still happened just now.

Well, that sucks...and that's why I didn't want to waste precious time explaining a possible deadlock, when a quick setting change could suffice to see if it's solved...as @fantom-x had suggested.

Next: What version of OpenWrt are you running?

(If you want to know why again) I had a slightly similar issue years ago; but it was something that would likely only occur in an older version of OpenWrt. A friend would come over and connect to WiFi or Wired, the machine would cause the router to crash, requiring a hard reset. They were running torrent. I still have those routers awaiting full fixes in 18.06.0 (likely the last firmware to be mass-produced for the router). This was always intended to be my Step 2 in assisting your troubleshooting...

If you have an older kernel, NAPI was not added, and there would be no log of your error. This is why I'm asking.

  • If you're running a newer version of OpenWrt, you can check the log to see if you have something similar to:

Also...what's your ISP provisioned speed?

@lleachii I built it from master about 3 days ago; it's using kernel 4.14.51. The problem was also present in LEDE 17.01.4, in some "optimized" build for the Archer C7, and in DD-WRT, starting in July last year I believe. It's not present in stock firmware though. There are no messages about anything being exhausted in the logs. My ISP provisioned speed is 1 Gbit/s.

Wow! That's your problem (or blessing) there buddy!

You may need more rugged hardware for a full Gigabit connection. A 720 MHz single-core processor theoretically can't handle a full 1000 Mbps.

As a rule-of-thumb, for your machine to route traffic, you need 1 MHz of CPU per 1 Mbps of WAN speed. This need for CPU resources just increases with: iptables, traffic shaping, etc.
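Worked through with the numbers from this thread (720 MHz CPU, 1000 Mbps line), the rule of thumb looks like the sketch below. The 1 MHz per Mbps ratio is only the heuristic stated above, not a measured figure, and the function names are made up for illustration.

```python
def max_routed_mbps(cpu_mhz: float, mhz_per_mbps: float = 1.0) -> float:
    """Estimate the WAN speed a CPU can route under the rule of thumb."""
    return cpu_mhz / mhz_per_mbps


def cpu_shortfall_mhz(cpu_mhz: float, wan_mbps: float, mhz_per_mbps: float = 1.0) -> float:
    """Return how many MHz short the CPU is for the given WAN speed (0 if enough)."""
    return max(0.0, wan_mbps * mhz_per_mbps - cpu_mhz)


if __name__ == "__main__":
    print(max_routed_mbps(720))          # rough ceiling in Mbps for a 720 MHz CPU
    print(cpu_shortfall_mhz(720, 1000))  # MHz missing for a 1000 Mbps line
    print(cpu_shortfall_mhz(720, 100))   # a 100 Mbps line fits within the estimate
```

Raising `mhz_per_mbps` above 1.0 is one way to approximate the extra cost of iptables rules or traffic shaping, though the right value would have to come from testing.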

It was happening even when I had a 100 Mbps connection.

Are you running any applications on your router aside from the default install?

No, it's a fairly minimal install.

I have noticed that when this happens, the LEDs for other PCs connected via wire turn off while the LED for my PC keeps blinking. The router becomes inaccessible. As soon as I unplug and re-plug my PC, all LEDs start functioning again and the router becomes accessible once more.
Something is crashing perhaps?