Dumb AP Unable to obtain DHCP address from Router

https://openwrt.org/docs/guide-user/base-system/dhcp#dhcp_option_example_to_set_an_alternative_default_gateway

Thanks, no luck so far. There has got to be a better way. I may look at the tcpdump output again.

I think I've narrowed this issue down to the man-in-the-middle (a MikroTik switch running SWOS). I'm still battling the issue and have been locked out half a dozen times in the process of troubleshooting. Debricking is for the birds :sweat_smile:

This may be the issue if it helps others. Simple network architecture:
20 PCs / 20 desk phones, all set up as follows:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PC <--> Phone <--> [single port] MT Switch
PC <--> Phone <--> [single port] MT Switch <--[single port]--> OWRT <--> WAN Gateway
PC <--> Phone <--> [single port] MT Switch /
PC <--> Phone <--> [single port] MT Switch/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
The above schema works really well with the MT Switch VLAN configured with all ports in
"strict" VLAN mode and all ports accepting both tagged and untagged packets on ingress.
Untagged packets are assigned to the Default VLAN ID set on the switch port; tagged packets
are left alone and forwarded.

Now I'm introducing several wireless SSIDs via dumb APs that bridge to the router, which hands out the DHCP addresses and does the forwarding. Initially I thought each port might just be overwhelmed by the number of VLANs flowing into a single port, as follows:

Ingress:
PC <-> Phone <-> Single eth drop <-> single port on <--> (4 Port Gb NON-VLAN Switch) <->.....
6 x SSIDs / 6 x VLANs <-> Dumb AP <-> single port on <-----^

continued from right side:
-> single eth port/cable <--> (VLANed MT SWOS Switch in ?) <-single port-> OWRT

Above is 6 total VLANs x users authed on WiFi, plus a PC and a phone, all ingressing a single port on the MT SWOS Switch, egressing a single port to OWRT, and back.

Then I read further into this about making "strict" VLAN ports: https://wiki.mikrotik.com/wiki/SwOS/CSS326

Now, why would one want ingress packets that match the port's default VLAN ID to have their tag stripped on egress? If that is the case, then in my scenario packets are arriving at the OWRT untagged, if I'm understanding this correctly. OWRT then does not know which VLAN's DHCP range to hand an IP out from, correct? To make matters worse, the egress header handling feature that other SWOS versions seem to support does not exist in mine, which is the latest version, 2.9.

What to do? I've tried changing the VLAN port mode from "strict" to "enabled". However, with that change only one of the three SSID VLANs under test has received an IP and is able to route to the internet. The icing on the cake is that the IP 10.1.10.173 (a VLAN_1 IP) is being received and used by a VLAN_8 user???

Please consider using the following tcpdump options:

-e     Print the link-level header on each dump line.
-v     When parsing and printing, produce (slightly more) verbose output.

-v can be given multiple times to increase verbosity. If there is a VLAN tag, tcpdump now prints it like this:

ethertype 802.1Q (0x8100), length 346: vlan 6, p 0, ethertype IPv4
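
If you save the capture output to a file, a few lines of Python can tally which VLAN IDs actually show up. This is only a rough sketch: the regular expression is based solely on the sample line above, and the helper names `vlan_of` and `count_vlans` are made up for illustration.

```python
import re

# Matches the VLAN portion that tcpdump -e prints for 802.1Q frames,
# e.g. "vlan 6, p 0". The pattern is derived only from the sample line above.
VLAN_RE = re.compile(r"\bvlan (\d+), p (\d+)")


def vlan_of(line):
    """Return the VLAN ID found in one tcpdump -e line, or None if no tag is shown."""
    match = VLAN_RE.search(line)
    return int(match.group(1)) if match else None


def count_vlans(lines):
    """Tally how many captured lines belong to each VLAN (key None = untagged)."""
    counts = {}
    for line in lines:
        vid = vlan_of(line)
        counts[vid] = counts.get(vid, 0) + 1
    return counts


sample = "ethertype 802.1Q (0x8100), length 346: vlan 6, p 0, ethertype IPv4"
print(vlan_of(sample))  # 6
```

If most lines captured on the router's switch-facing interface come back under the `None` key, the frames are arriving untagged.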

I read up a bit on DHCP Option 132. There is also a document named "VLAN Feature on Yealink IP Phones.pdf".
My "typo" remark about DHCP Option: 132,VID=3 was based on my lack of knowledge about this option. Please ignore what I said there; I now think this option is fine.

The DHCP server instructs the phone to use VLAN 3 tagging, not the other way round. Setting the default VLAN 4 on the switch port where a phone or a wired PC is connected also seems OK to me.

However, you should not use untagged VLAN 4 on the switch port where the AP is connected. It will not work because you set up the dumb AP to use untagged VLAN 1 on its uplink port. The VLAN configurations on both ends of a link must match. You should either:

  • reconfigure the switch to use VLAN 1 as default VLAN (untagged) on the port where the AP is connected, or
  • make VLAN 1 tagged on both the switch port and the AP, and set an unused VLAN ID different from 1 as the default VLAN (also recommended above by @jeff).
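
To make the mismatch concrete, here is a toy model of the two ends of the AP link. It is a sketch under assumptions, not a description of how SWOS or OpenWrt are implemented: `PortConfig`, `send`, `receive` and `deliver` are hypothetical names, VLAN 8 as a tagged SSID VLAN and VLAN 99 as the "unused" default are picked only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PortConfig:
    """VLAN settings of one end of a link (hypothetical model, not a real API)."""
    untagged_vid: int                              # VLAN sent/received without a tag
    tagged_vids: set = field(default_factory=set)  # VLANs carried with a tag


def send(port, vlan):
    """Return the tag put on the wire for `vlan` (None = untagged), or "drop"."""
    if vlan == port.untagged_vid:
        return None
    if vlan in port.tagged_vids:
        return vlan
    return "drop"


def receive(port, tag):
    """Return the VLAN the receiving end assigns to the frame, or "drop"."""
    if tag is None:
        return port.untagged_vid
    if tag in port.tagged_vids:
        return tag
    return "drop"


def deliver(sender, receiver, vlan):
    """Follow one frame of `vlan` across the link and report where it lands."""
    tag = send(sender, vlan)
    return "drop" if tag == "drop" else receive(receiver, tag)


ap = PortConfig(untagged_vid=1, tagged_vids={8})

# Current setup: the AP sends VLAN 1 untagged, the switch port defaults to VLAN 4.
print(deliver(ap, PortConfig(untagged_vid=4, tagged_vids={8}), 1))   # 4

# First option: default VLAN 1 on the switch port.
print(deliver(ap, PortConfig(untagged_vid=1, tagged_vids={8}), 1))   # 1

# Second option: VLAN 1 tagged on both ends, unused default VLAN (99 here).
both = PortConfig(untagged_vid=99, tagged_vids={1, 8})
print(deliver(both, both, 1))                                        # 1
```

In the first case the AP's VLAN 1 traffic silently lands in VLAN 4, which in this model is how a client can end up with an address from the wrong subnet.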

I'd be surprised if the port were being overwhelmed by the number of VLANs. Even very old switches can handle more than 10 VLANs, and today's switches support many more.

If 'x' implies multiplication, that is not what is happening here. The number of VLAN IDs observed by the switch does not get multiplied by the number of users.

When a device connected to the switch sends an untagged packet, it expects the reply to also be untagged. The switch tries to comply with this assumption. Between infrastructure devices like a switch and an AP, it is preferred to use tagged VLANs exclusively.
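
The round trip can be sketched with two tiny functions. This is a simplified model of the behaviour described above, not switch firmware: `ingress` and `egress` are made-up names, VLAN 4 is the default VLAN on the PC/phone port from this thread, and 4094 stands in for an unused default VLAN on the uplink.

```python
def ingress(tag, pvid):
    """Untagged frames (tag None) join the port's default VLAN; tagged keep theirs."""
    return pvid if tag is None else tag


def egress(vlan, pvid):
    """Frames in the port's default VLAN leave untagged; all others keep their tag."""
    return None if vlan == pvid else vlan


PC_PORT_PVID = 4      # default VLAN on the PC/phone port, as in this thread
UPLINK_PVID = 4094    # hypothetical unused default VLAN on the uplink to the router

# Request: the PC sends untagged, the switch files it under VLAN 4,
# and it leaves towards the router carrying tag 4.
vlan = ingress(None, PC_PORT_PVID)
on_uplink = egress(vlan, UPLINK_PVID)

# Reply: the router answers with tag 4, and the tag is stripped on the PC port.
reply_vlan = ingress(on_uplink, UPLINK_PVID)
to_pc = egress(reply_vlan, PC_PORT_PVID)
print(vlan, on_uplink, to_pc)  # 4 4 None

# If the uplink's default VLAN were also 4, the router would see no tag at all.
print(egress(vlan, 4))  # None
```

The last line is the situation to avoid on a link between infrastructure devices: the router receives an untagged frame and has to guess which VLAN it belongs to.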

You can verify this with tcpdump or Wireshark.
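
For reference, this is roughly what those tools look at when they decide whether a frame is tagged. The sketch below assumes a plain Ethernet II frame with at most one 802.1Q tag; `parse_vlan` is an illustrative name and the MAC addresses are made up.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def parse_vlan(frame: bytes):
    """Return (vid, pcp, inner_ethertype) for a tagged frame, or None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    # The 16-bit TCI holds priority (top 3 bits) and VLAN ID (low 12 bits).
    tci, inner = struct.unpack_from("!HH", frame, 14)
    return tci & 0x0FFF, tci >> 13, inner


dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("020000000001")  # made-up locally administered MAC
tagged = dst + src + struct.pack("!HHH", TPID_8021Q, 6, 0x0800)
untagged = dst + src + struct.pack("!H", 0x0800)

print(parse_vlan(tagged))    # (6, 0, 2048)
print(parse_vlan(untagged))  # None
```

The first result corresponds to the `vlan 6, p 0, ethertype IPv4` form that tcpdump prints with `-e`.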