IPQ40xx Switch Config "Strangeness" (swconfig)

Stay away from VLAN 1 and VLAN 2 in your configs.

Be aware that there may be ARP problems with bridging LAN and WAN ports (Edit: What I am guessing happens is that the ARP request is answered by the bridge master, which is "wrong" either for requests coming in from the "LAN" ports or from the "WAN" ports, depending on how the bridge was configured. I haven't pursued this further to confirm.)

Indeed, things worked with VLAN IDs other than 1 and 2.

That solves that, though there is one strange thing I guess (or maybe that is what this whole topic is about and I just missed the point completely)

For the WAN port to work I had to explicitly assign it and the CPU port to VLAN 2 in the configuration file:

config switch_vlan
	option device 'switch0'
	option vlan '5'
	option ports '5 0t'
	option vid '2'

It still doesn't show in the LuCI GUI, and as a result changes to the configuration from LuCI tend to remove port 5 and thus disable it again, but other than that things seem to be working OK.

I also completely removed the configuration for VID 1 since it had no ports associated with it, this does not seem to have caused any issues.

Add to above, "Don't use LuCI to configure the switch" :wink:

The reason I avoid using VLAN tag 1 or 2 (either explicitly or with vid) is that the driver reserves them for untagged packets. As I understand the driver, GMAC0 always gets packets coming in from the "LAN" ports and they are "hard-coded" to be tagged with VLAN 1 if they are untagged. GMAC1, similarly, always gets packets coming in from the "WAN" port, tagged VLAN 2 if they are untagged. The reverse mapping is also true, as far as I know; GMAC0-originated packets can only exit a "LAN" port and GMAC1-originated packets can only exit the "WAN" port.

By avoiding any use of VLAN 1 and VLAN 2, the behavior is a lot more predictable, at least for me.
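A quick way to check an existing config against this advice is to scan /etc/config/network for switch_vlan sections that mention VLAN 1 or 2. The Python below is only a sketch: the function name and the sample text are made up for illustration, the parser only understands the simple `option name 'value'` form, and treating both `vlan` and `vid` as off-limits is one reading of the post above.

```python
# Sketch only: flag switch_vlan sections that use VLAN 1 or 2.
# Assumption: both "option vlan" and "option vid" count as "using" the ID.
RESERVED_IDS = {"1", "2"}

# Hypothetical sample in /etc/config/network syntax.
SAMPLE = """\
config switch_vlan
    option device 'switch0'
    option vlan '5'
    option ports '5 0t'
    option vid '2'

config switch_vlan
    option device 'switch0'
    option vlan '100'
    option ports '0t 3t'
"""


def reserved_vlan_sections(config_text):
    """Return the options of every switch_vlan section touching VLAN 1 or 2."""
    sections = []
    current = None  # options of the switch_vlan section being read, if any
    for raw in config_text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "config":
            current = None
            if len(words) > 1 and words[1].strip("'\"") == "switch_vlan":
                current = {}
                sections.append(current)
        elif words[0] == "option" and current is not None and len(words) >= 3:
            # Re-join the value so that "'5 0t'" survives the split.
            current[words[1]] = " ".join(words[2:]).strip("'\"")
    return [
        s for s in sections
        if s.get("vlan") in RESERVED_IDS or s.get("vid") in RESERVED_IDS
    ]


for section in reserved_vlan_sections(SAMPLE):
    print("uses a reserved VLAN:", section)
```

On a router you would pass in the contents of /etc/config/network instead of SAMPLE.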

It's strange, though, because if I associate all untagged ports with specific VLANs then they should get that VID and not 1/2.

I'll see now if configuring other parts of the settings with LuCI that touch that file (interfaces, wireless networks) will mess these settings back up again or not....

I am trying to set up a device that has IPQ4019/QCA8072 as a dumb AP with VLAN segregation per SSID.

So far I have not been able to accomplish it either using DSA or swconfig.

Some basic questions:

  • By default there are 2 VLANs set in swconfig, and the advice above is not to change them, but are they tagging/untagging traffic?
  • The device I'm working with has 2 physical Ethernet ports, although swconfig lists 6 ports. Is port 0 linked to the CPU? Are ports 4 and 5 linked to the physical Ethernet ports (looking at the swconfig stats, it would appear so)? Are the other ports (1 to 3) linked to the wireless chips this device has (QCA9886 and IPQ4019)?
  • Using swconfig, and even though no switch0 is recognized in LuCI, do I have to double-tag frames (once in swconfig and once in a VLAN interface such as eth0.XXX)?
  • On DSA I can create an interface of type VLAN linked to a physical interface. How can I use it in a bridge (for instance, if I want the same wireless network on all radios and have it tagged)?

Note on DSA: it works without VLANs, but when I use it in a bridge the device reboots due to a kernel null pointer.

Thanks, and apologies for such basic questions.

I hope this isn't a necropost but for me this is the $64K question when it comes to determining the suitability of this device for, say, a gigabit WAN. I'm presently running @NoTengoBattery's ea6350v3 build (and very nice it is too), which includes what I understand is some of @chunkeey's work making the switch usable.

Given that this device has a PSGMII channel (theoretically 5 Gbps) between the IPQ4018 and the QCA8075 switch controller/PHY, depending on how the board is laid out I'd expect at least to be able to run 1 Gbps full duplex between WAN and LAN simultaneously, for a total of 2 Gbps across the entire device.

But in practice the device behaves as if there were a single 1 Gbps PHY shared between all 5 ports; i.e. you might as well be running a 1-port router to a switch and doing everything with VLANs, because (for example) uploading at 500 Mbps means you won't be able to download more than 500 Mbps at the same time.

For example: running iperf in duplex mode between two devices attached to ports 1 and 5, configured on different subnets so they have to be routed across the SoC (very little in the way of firewall rules)

root@spec10:~# iperf -c 172.18.1.24 -d -t 20
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.18.1.24, TCP port 5001
TCP window size:  374 KByte (default)
------------------------------------------------------------
[  5] local 172.24.11.90 port 44338 connected with 172.18.1.24 port 5001
[  4] local 172.24.11.90 port 5001 connected with 172.18.1.24 port 56352
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-20.0 sec  1.29 GBytes   555 Mbits/sec
[  4]  0.0-20.0 sec  1.09 GBytes   469 Mbits/sec

That's about 1.0 Gbps in total (555 + 469 = 1024 Mbits/sec), and it's at the high end of the results of most of my testing. While I'm doing this, the CPU is never less than 70% idle, so I don't think we're CPU bound. For unidirectional testing I get a good 940 Mbps, every time. So I'm wondering if some internal channel or pseudo-PHY is getting lost by eliminating the (supposedly fake) second GMAC?
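For anyone repeating the test, the aggregate is simply the sum of the two per-direction rates in the iperf report. A minimal Python sketch of that arithmetic, using the result lines from the run above (the function name is made up, and it assumes every rate is reported in Mbits/sec):

```python
import re

# Result lines copied from the iperf run above.
IPERF_OUTPUT = """\
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-20.0 sec  1.29 GBytes   555 Mbits/sec
[  4]  0.0-20.0 sec  1.09 GBytes   469 Mbits/sec
"""


def total_mbits(report):
    """Sum every 'N Mbits/sec' figure found in an iperf report."""
    rates = re.findall(r"([\d.]+)\s+Mbits/sec", report)
    return sum(float(rate) for rate in rates)


print(total_mbits(IPERF_OUTPUT), "Mbits/sec combined")  # 555 + 469 = 1024
```

A combined figure that stays near 1000 Mbits/sec regardless of direction is what you would expect from a single shared gigabit path.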

Next I may try reverting to factory firmware and see if I get the same results. Who knows, it's conceivable that Linksys started with the assumption that nobody'd need more than a gigabit across the entire device, and simply wired in the switch that way, but it'd be good to know if there's more internal bandwidth on this device that we're not able to access yet.

(Edit: I don't know if it's been mentioned but the Mikrotik hAP ac^2 uses the same IPQ4018/QCA8075 pairing and they report being able to route 2 Gbps across all ports. In fact their results are remarkably consistent with having the equivalent of 2 gigabit nics in use, given that the ceiling is almost exactly the same for all tests with 1500 byte packets regardless of the router processing workload. https://mikrotik.com/product/hap_ac2#fndtn-testresults)

(Edit2: judging by this very simplified block diagram which appears to be from Mikrotik, the IPQ4018 internally has what's labeled as a 2 gbps connection between the switch/phy controller and the CPU. It's not clear to me whether this represents full or half duplex. If full, then it really is like having a single 1 gig NIC attached to a switch and it'd explain the apparent performance. But that's not consistent with Mikrotik's published results.

https://i.mt.lv/cdn/rb_files/RBD52G-5HacD2HnD-TC-180326082323.png

Which seems like a waste of the potential of the components used. Sorry if I'm going over old ground here, just wondering if what we've got now is all we're going to get in terms of raw routing bandwidth.)

@jeff According to the IPQ40xx software guide, "eth1" is actually simulated with a VLAN; there is only one CPU port (port 0):

Port 0 (CPU port) acts as an interface between the EDMA and ESS modules. This port is added to
both the LAN and WAN groups. Hence it is required to make it a VLAN trunk port.
The EDMA driver configures the EDMA queuing interface. It also creates two virtual network devices:
eth0 (WAN) and eth1 (LAN). All packets received on the eth0 interface are forwarded to the ESS WAN
group and all packets received on eth1 interfaces are forwarded to the ESS LAN group.
The EDMA driver uses the default VLAN tags to access LAN and WAN groups on ESS: 1 (LAN), 2
(WAN). These default tags can be modified and/or disabled through the Linux sysctl interface. In this
case, the default tag support is disabled in the EDMA driver and users must configure tags through
Linux vconfig or an equivalent interface.

The EDMA driver assigns default VLAN tags to the eth0/eth1 interfaces. By default, eth0 (WAN)
is assigned VLAN tag 2, while eth1 (LAN) is assigned VLAN tag 1. These tags are inserted and
removed from packets by EDMA hardware. Linux OS is not aware of these tags.

So I propose the VLAN patch, which makes Linux aware of these tags, so there is less confusion about the VLAN settings of this target.
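To make the quoted behaviour concrete, here is a toy Python model of the default tagging as the guide describes it: a tag is inserted on the way from eth0/eth1 to the switch and removed again on the way back, so Linux never sees it. All names are hypothetical, this is not driver code, and the quote does not say what happens to frames carrying any other tag, so the sketch just returns None for those.

```python
# Default tags per the quoted guide: eth0 (WAN) -> 2, eth1 (LAN) -> 1.
DEFAULT_TAGS = {"eth0": 2, "eth1": 1}


def to_switch(iface, payload):
    """Model of the EDMA inserting the default tag on a frame sent by Linux."""
    return {"vlan": DEFAULT_TAGS[iface], "payload": payload}


def to_cpu(frame):
    """Model of the EDMA removing the tag and picking the matching netdev.

    Returns (iface, payload), or None for tags the quote does not cover.
    """
    for iface, tag in DEFAULT_TAGS.items():
        if frame["vlan"] == tag:
            return iface, frame["payload"]
    return None


frame = to_switch("eth1", b"hello")   # Linux sends an untagged frame on eth1
print(frame["vlan"])                  # the switch sees it in the LAN group
print(to_cpu(frame))                  # Linux gets it back without the tag
```

Seen this way, a switch_vlan entry that reuses VLAN 1 or 2 ends up sharing a group with traffic Linux believes is untagged.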

IPQ40xx does HWNAT inside the switch, so it is not a problem.

Does the driver have to use VLAN 1 and VLAN 2 or could those be changed to use something less common?

This can be disabled with a patch

To reproduce your test with an IPQ4029 based GL.iNet GL-B1300 router, I need to tag those LAN ports below, create interfaces for them, enable DHCP and wire up with two devices?

Stupid question, but would it be possible to also put the "WAN" port into VLAN 1, so that everything is in the LAN VLAN and I don't have to software-bridge eth0 and eth1? Ideally without patches to the standard release.

I'm using the Fritzbox 4040.

Did the @NoTengoBattery patches land? Or do we still need to use your special builds if we want to use vlans?

(wiki page still says this may be needed)

For those who read this thread and are a little confused (I was) about how to configure VLANs on the EA6350, here is a summary based on several of the posts above (thanks to the people who worked this out):

It is best to set up VLANs on a fresh installation of OpenWrt on the router. If you have already tried to configure VLANs, your configuration may already be corrupted.

Connect to the router with SSH, open /etc/config/network and paste the following lines for every VLAN ID you want to create:

config switch_vlan
	option device 'switch0'
	option vlan '100'
	option ports '0t 3t'

In this example a VLAN with ID 100 is created and available on LAN 3 as tagged. 0 stands for the CPU port eth0 which is necessary to link switch and CPU inside the router. For every created VLAN you create an interface under Network -> Interfaces -> Add new interface: for example

  • General Settings:
    Protocol: static address
    IPv4 address 192.168.100.1
    IPv4 netmask 255.255.255.0

  • Physical Settings: Interface eth0.100

  • Firewall Settings: select lan (standard firewall zone of OpenWrt)
    DHCP Server: choose Setup DHCP Server and configure it as you like, or keep the default values

Some important notes:

  • Don’t forget the static routes in your DSL-Router
  • Never create VLANs with ID 1 or 2 or change something in their configuration
  • Never configure the switch in LuCI (Network -> Interfaces -> Switch). This can corrupt /etc/config/network. You can look at your configuration there, but don’t change it there (only in /etc/config/network)

This is how I created a setup with a Fritzbox as the DSL-Router, the Linksys EA6350 connected with LAN 1 to the Fritzbox and a VLAN Switch on LAN 3.
I didn’t try to use the WAN port.
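If you need several VLANs, the switch_vlan block from the steps above can be generated instead of typed by hand. The Python below is a rough sketch: the function name is made up, the defaults ('switch0', ports '0t 3t') are simply the values from this post, and the check for IDs 1 and 2 just encodes the warning in the notes.

```python
# VLAN IDs this thread says to leave alone.
RESERVED_IDS = {1, 2}


def switch_vlan_stanza(vlan_id, ports="0t 3t", device="switch0"):
    """Return a switch_vlan section for /etc/config/network as text."""
    if vlan_id in RESERVED_IDS:
        raise ValueError("VLAN %d is reserved on this target" % vlan_id)
    lines = [
        "config switch_vlan",
        f"\toption device '{device}'",
        f"\toption vlan '{vlan_id}'",
        f"\toption ports '{ports}'",
    ]
    return "\n".join(lines) + "\n"


# Example: tagged VLANs 100 and 200 on LAN 3 plus the CPU port.
for vlan_id in (100, 200):
    print(switch_vlan_stanza(vlan_id))
```

The output still has to be pasted into /etc/config/network by hand over SSH, and each VLAN still needs its own interface as described above.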

I use patches when compiling my own firmware to address the VLAN issues on my Linksys EA6350v3, which runs on an IPQ40xx chip. I have uploaded the patches to my GitHub page, along with the directories where the two files go. You can find them here > https://github.com/TheSurgeNetwork/Pre-compiled-OpenWrt-Firmware/releases/tag/1

I can completely remove all of my VLANs, including VLAN 1 and 2, and I can also use the WAN port as a VLAN trunk without any issues. This is only possible with the patches above, though.

Any idea why it is implemented the way it is in OpenWrt?

According to the patches themselves, from Christian who is an active Linux developer, he is waiting for upstream to implement the driver better.

Spoiler alert: they won't. What I don't understand is why he would hold back his own patches, which make the device work as it should, in the meantime.

I think he is doing it that way because the patches break currently running devices in a way that only a factory reset can recover. Also, the patches do not include the changes to the configuration script (the 02 network thing), which probably means that when someone tested them, they just didn't work.

If you look carefully at the 02 network file, which I fixed myself, you'll see I deleted all of the other devices and kept only the EA6350v3, since I made the change myself and cannot test it on devices I don't own.

Another thing is, probably, that the IPQ4018 requires the changes but the IPQ4019 doesn't. This means that the patches will fix the xx18 family and break the xx19 family.


P.S.: I've found a special file which may greatly improve the wireless performance of the device under OpenWrt. I recommend you to join the discussion and the testing in my custom build thread.

What is your thread?

I'm running OpenWrt on another IPQ4018 device (AVM Fritz!Box 4040).
Would it be sufficient to modify /etc/board.d/02_network and apply the patch to
arch/arm/boot/dts/qcom-ipq4019.dtsi and
drivers/net/ethernet/qualcomm/essedma/edma.c ?

Thanks,
Thomas

I think it will, but getting it wrong (especially the 02_network part) will leave the device unreachable, and you will need a way to boot a working firmware if that happens.

The EA6350v3 is dual firmware and can be (relatively) easily recovered from a bad boot; I don't know if your device can. The serial console should still be available (Wi-Fi and Ethernet will not work), and you should be able to sideload a working firmware that way, flash it from the console and try again.


@fantom-x my thread is here:
Optimized build for Linksys EA6350v3 (civic)
