Yuck, it's been a while since I was able to read USB transactions fluently.
Doesn't Wireshark dissect those packets even further?
In any case, the top level list of requests is in itself very interesting. Assuming we aren't missing some later transaction here, it seems that Windows uses the same initial RNDIS config we see. There is no mode switching. If there were, then we'd either see additional device and config descriptor requests (for a morphing device) or additional set configuration requests (for multi config devices).
And looking at packet #6 we see what I believe is the same RNDIS config you see in Linux. You can simply compare the packet dump with
hexdump -C /sys/bus/usb/devices/x-y/descriptors
ignoring the initial parts (packet header in pcap and device descriptor in "descriptors").
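If you'd rather not compare by eye, the same check can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names are made up for illustration, the sysfs file is assumed to start with the device descriptor followed by the config descriptor(s), and the captured packet is assumed to end with the complete config descriptor (i.e. it is the full-length response, not the short initial one).

```python
USB_DT_DEVICE = 0x01  # bDescriptorType of a device descriptor (ch9.h)


def strip_device_descriptor(raw):
    """Drop the leading device descriptor from a sysfs "descriptors" blob."""
    length, dtype = raw[0], raw[1]
    if dtype != USB_DT_DEVICE:
        raise ValueError("blob does not start with a device descriptor")
    return raw[length:]


def first_config(raw):
    """Return the first config descriptor set, using its wTotalLength field."""
    cfg = strip_device_descriptor(raw)
    total = int.from_bytes(cfg[2:4], "little")
    return cfg[:total]


def matches_capture(packet, sysfs_raw):
    """True if the captured packet ends with the config seen by Linux."""
    return packet.endswith(first_config(sysfs_raw))


def read_sysfs(path):
    """Read e.g. /sys/bus/usb/devices/x-y/descriptors (x-y is a placeholder)."""
    with open(path, "rb") as f:
        return f.read()
```

The packet bytes would have to be exported from the capture first; how you do that is up to you.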
Useful tools for manually dissecting these dumps are https://www.usbmadesimple.co.uk/ums_4.htm and the constants and structs in include/uapi/linux/usb/ch9.h and include/uapi/linux/usb/cdc.h. Note that this device seems to use Communication (CDC) class specific descriptors even if it formally is a "Wireless Controller".
Manual dissection of packet #6 shows an e0/01/03 device with a single two-interface function having a control interface with an interrupt ep and a data interface with two bulk endpoints. I.e. pretty standard CDC, similar to e.g. an ethernet dongle:
09 02 4b 00 02 01 00 e0 01 config (len 0x004b, 2 ints, cfg #1, ..)
08 0b 00 02 e0 01 03 07 interface association (2 ints starting at #0, cl/sc/pr = e0/01/03, string descr 7)
09 04 00 00 01 e0 01 03 05 interface (#0, alt #0, 1 ep, cl/sc/pr = e0/01/03, string descr 5)
05 24 00 10 01 class int (cdc header, ver 1.10)
05 24 01 00 01 class int (cdc call management, no cap, data int: #1)
04 24 02 00 class int (cdc acm, no cap)
05 24 06 00 01 class int (cdc union, master #0, slave #1)
07 05 82 03 08 00 04 ep
09 04 01 00 02 0a 00 00 06 interface
07 05 81 02 00 02 00 ep
07 05 01 02 00 02 00 ep
and we see in the next packet that Windows does "set configuration 1", so nothing unexpected or different from Linux going on so far.
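The manual dissection of packet #6 above can also be reproduced with a short script. This is a minimal sketch, not a replacement for a real dissector: the bytes are copied from the dump above, the field offsets follow the standard descriptor layouts in ch9.h, and the function names are made up for illustration.

```python
import struct

# Config descriptor set from packet #6, one descriptor per line as in the post.
PACKET6 = bytes.fromhex(" ".join([
    "09 02 4b 00 02 01 00 e0 01",
    "08 0b 00 02 e0 01 03 07",
    "09 04 00 00 01 e0 01 03 05",
    "05 24 00 10 01",
    "05 24 01 00 01",
    "04 24 02 00",
    "05 24 06 00 01",
    "07 05 82 03 08 00 04",
    "09 04 01 00 02 0a 00 00 06",
    "07 05 81 02 00 02 00",
    "07 05 01 02 00 02 00",
]))

EP_TYPES = ("control", "isochronous", "bulk", "interrupt")


def split_descriptors(blob):
    """Yield (bDescriptorType, raw bytes) for each descriptor in the blob."""
    pos = 0
    while pos < len(blob):
        length = blob[pos]
        if length < 2 or pos + length > len(blob):
            raise ValueError(f"bad descriptor length {length} at offset {pos}")
        yield blob[pos + 1], blob[pos:pos + length]
        pos += length


def describe(dtype, d):
    """Return a one-line summary of a single descriptor."""
    if dtype == 0x02:  # configuration
        total, n_if, cfg = struct.unpack_from("<HBB", d, 2)
        return f"config: len {total:#06x}, {n_if} ints, cfg #{cfg}"
    if dtype == 0x0B:  # interface association
        return (f"interface association: {d[3]} ints starting at #{d[2]}, "
                f"cl/sc/pr = {d[4]:02x}/{d[5]:02x}/{d[6]:02x}, string descr {d[7]}")
    if dtype == 0x04:  # interface
        return (f"interface #{d[2]}, alt #{d[3]}, {d[4]} ep, "
                f"cl/sc/pr = {d[5]:02x}/{d[6]:02x}/{d[7]:02x}, string descr {d[8]}")
    if dtype == 0x05:  # endpoint
        direction = "in" if d[2] & 0x80 else "out"
        maxpkt = struct.unpack_from("<H", d, 4)[0] & 0x7FF
        return (f"endpoint {d[2] & 0x0F} {direction}, {EP_TYPES[d[3] & 3]}, "
                f"max packet {maxpkt}, interval {d[6]}")
    if dtype == 0x24:  # class-specific interface (CDC functional descriptor)
        return f"class int, subtype {d[2]:#04x}"
    return f"unknown descriptor type {dtype:#04x}"


for dtype, raw in split_descriptors(PACKET6):
    print(raw.hex(" "), "->", describe(dtype, raw))
```

Feeding it the config part of the sysfs "descriptors" file from the Linux side should give the same summary if the two configs really are identical.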
I don't know what those unknown packets are, but they look like simply incomplete copies of packet 13, so I assume they're just timeouts while the device is reconfiguring and enabling itself.
No surprises in the two string descriptor requests either. They are really one request for descriptor #7, referenced by the IA above, done in two steps: the initial one is just to get the required buffer size. The returned string is "RNDIS".
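To illustrate why a string descriptor is typically fetched twice, here is a small sketch. The descriptor bytes are built from the string "RNDIS" rather than copied from the capture, the helper names are made up, and the 2-byte size of the first read is my assumption (the exact length requested by Windows isn't shown here).

```python
USB_DT_STRING = 0x03  # bDescriptorType of a string descriptor (ch9.h)


def encode_string_descriptor(text):
    """Build a string descriptor: bLength, bDescriptorType, UTF-16LE text."""
    body = text.encode("utf-16-le")
    return bytes([len(body) + 2, USB_DT_STRING]) + body


def decode_string_descriptor(raw):
    """Decode a complete string descriptor, or complain if it is truncated."""
    length, dtype = raw[0], raw[1]
    if dtype != USB_DT_STRING:
        raise ValueError("not a string descriptor")
    if length > len(raw):
        raise ValueError(f"truncated: got {len(raw)} bytes, need {length}")
    return raw[2:length].decode("utf-16-le")


full = encode_string_descriptor("RNDIS")

# Step 1: a short read returns just the header; bLength is the size to ask for.
header = full[:2]
needed = header[0]

# Step 2: request again with a buffer of that size and decode the result.
print(needed, decode_string_descriptor(full[:needed]))
```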
So all this is approximately what Linux would do. I guess the real difference is what the RNDIS driver does in those control requests. I believe there are surprisingly many of those in that Windows capture. Linux is probably not as fluent in talking RNDIS.
EDIT: Or maybe this is even simpler? Maybe it's not Windows doing something magic but Linux doing something unexpected - causing the firmware to reset? It would be useful to see a capture of a device probing cycle on Linux too. In particular any control requests sent by the rndis_host driver right before the device resets.
You can use tcpdump to capture USB packets on OpenWrt if you build libpcap with usbmon support and install and load the usbmon module. There's a config setting under libpcap for this: CONFIG_PCAP_HAS_USB. Unfortunately I don't think it is enabled in any official builds. You can also do captures on a desktop Linux, using any of tcpdump, tshark or wireshark as long as you load the usbmon module. Select the appropriate "usbmonX" interface matching the bus you plug the modem into. It's easiest to dedicate a bus to the device being tested to avoid "noise" from other devices in the capture.