DSA and Appletalk multicast issues

I hope so. Sorry I can't contribute more hands on, I'm tied up with some intensive dev/testing elsewhere and also do not have access to any genuine old Apple products which would help. At least I can give my encouragement!

So yeah, trace is on. This looks like finding the right needle in a haystack full of them.

Got lost reading and found dropwatch. Found OpenWrt doesn't have a package. Whatever. Here's the Makefile. Builds nicely in a SDK.

dropwatch Makefile
include $(TOPDIR)/rules.mk

PKG_NAME:=dropwatch
PKG_VERSION:=1.5.4
PKG_RELEASE:=1

PKG_SOURCE_URL:=https://github.com/nhorman/$(PKG_NAME)/archive/refs/tags
PKG_SOURCE_VERSION:=1.5.4

PKG_BUILD_DIR:=$(BUILD_DIR)/$(PKG_NAME)-$(PKG_VERSION)
PKG_SOURCE:=$(PKG_NAME)-v$(PKG_SOURCE_VERSION).tar.gz
PKG_SOURCE_URL_FILE:=v$(PKG_SOURCE_VERSION).tar.gz
PKG_SOURCE_SUBDIR:=$(PKG_NAME)-v$(PKG_VERSION)
PKG_HASH:=8c43d0c15d0cb9ce179fa1fb0610611723689a6f551b23c70a7ddc1cf068e8d2

PKG_MAINTAINER:=TBD
PKG_LICENSE:=TBD
PKG_LICENSE_FILES:=LICENSE
PKG_CPE_ID:=cpe:/a:tbd:tbd

include $(INCLUDE_DIR)/package.mk

define Package/dropwatch
  SECTION:=net
  CATEGORY:=Network
  TITLE:=Dropwatch
  DEPENDS:=+libnl +libreadline +libpcap
endef

define Package/dropwatch/description
 Dropwatch
endef

define Build/Configure
	cd $(PKG_BUILD_DIR) && ./autogen.sh && ./configure --without-bfd --libdir=$(STAGING_DIR)/usr/
endef

define Package/dropwatch/install
	$(INSTALL_DIR) $(1)/usr/bin
	$(INSTALL_BIN) $(PKG_BUILD_DIR)/src/dropwatch $(1)/usr/bin/dropwatch
	$(INSTALL_BIN) $(PKG_BUILD_DIR)/src/dwdump $(1)/usr/bin/dwdump
endef

$(eval $(call BuildPackage,dropwatch))

Maybe I should PR that into OpenWrt.

Ran it... gets angry. Wants KERNEL_NET_DROP_MONITOR=y. Darn... not the kernel_menuconfig crap again.
Let's try just cram it into .config with a CONFIG_ in front as I saw some comment in build system that those get passed along... oh yay! it worked. Now I know better. Never kernel_menuconfig again.

Run dropwatch -lkas, tcpdump -i br-lan -e | grep Appletalk on side terminal waiting for old faithful 23.05 to send a beacon and...
1 drops at snap_rcv+90 (0xc09bffa4) [software]

Oh hello. In a sea of IPv4/v6 frames... we lost a SNAP just as I saw it appear on the tcpdump terminal? What are the odds?

So that's where the frame goes. Let's try again with the initial broadcast...
8 drops at snap_rcv+90 (0xc09bffa4) [software]
3 drops at snap_rcv+90 (0xc09bffa4) [software]

11 SNAPs... the 10 AARPs and the 1 discovery. We have a winner.

And dropwatch can give a bit more of detail:

drop at: snap_rcv+0x90/0xf0 (0xc09bffa4)
origin: software
input port ifindex: 9
timestamp: Sun Dec  1 20:54:29 2024 705349729 nsec
protocol: 0x4
length: 51
original length: 51
drop reason: NOT_SPECIFIED

Timestamp matches to +20msec. No doubt this is our man. The drop reason is not particularly useful, but I owe dropwatch one.

So now that we know snap_rcv+90 is going Thanos on Appletalk... any suggestions? I'll read on it, hopefully it's something that can be fixed.

Followup... so we have ftrace... let's trace for snaps:

root@TestWrt-63:/sys/kernel/debug/tracing# cat available_filter_functions | grep "snap_" > set_ftrace_filter 
root@TestWrt-63:/sys/kernel/debug/tracing# echo "function_graph" > current_tracer 
root@TestWrt-63:/sys/kernel/debug/tracing# cat trace_pipe 
 1)               |  snap_rcv() {
 1)   2.720 us    |    find_snap_client();
 1) + 16.160 us   |  }
 1)               |  snap_rcv() {
 1)   3.680 us    |    find_snap_client();
 1) + 26.720 us   |  }
 1)               |  snap_rcv() {
 1)   3.840 us    |    find_snap_client();
 1) + 26.400 us   |  }
 1)               |  snap_rcv() {
 1)   3.360 us    |    find_snap_client();
 1) + 24.000 us   |  }

Don't know exactly what that means but seems that it looks for a client to send the stuff to and finds none. That's something to work on.

More followup... rmmod appletalk && modprobe appletalk

 0)               |  unregister_snap_client() {
 0) * 33070.40 us |  }
 0)               |  unregister_snap_client() {
 0) * 40023.04 us |  }
 ------------------------------------------
 0)   rmmod-2461   =>  modprob-2462 
 ------------------------------------------

 0)               |  register_snap_client() {
 0)   4.160 us    |    find_snap_client();
 0) + 34.720 us   |  }
 0)               |  register_snap_client() {
 0)   1.920 us    |    find_snap_client();
 0)   7.040 us    |  }

Alright.... let's see what the darn thing does when it "registers" itself. Time to look at ancient C code.
Question still stands... how come it can leave a forwarding address at a normal post office, but not at a fancy DSA one?

1 Like

:clap::clap::clap: Brilliant ! :slight_smile:

Fantastic work :muscle:

Abridged snap_rcv().

snap_rcv(skb, netdev, pkttype, orig_netdev) {
	proto = find_snap_client(skb_transport_header(skb));
	if (proto) {
		/* Pass the frame on. */
		rc = proto->rcvfunc(skb, dev, &snap_packet_type, orig_dev);
	}
	if (!proto) goto drop;
}

(Note arguments to find_snap_client() comes from skb_transport_header().)

Unabridged find_snap_client().

26   /*
27    *	Find a snap client by matching the 5 bytes.
28    */
29   static struct datalink_proto *find_snap_client(const unsigned char *desc)
30   {
31   	struct datalink_proto *proto = NULL, *p;
32   
33   	list_for_each_entry_rcu(p, &snap_list, node, lockdep_is_held(&snap_lock)) {
34   		if (!memcmp(p->type, desc, 5)) {
35   			proto = p;
36   			break;
37   		}
38   	}
39   	return proto;
40   }

Both from net/802/psnap.c

(May need to look at how snap_list is updated.)

Here's skb_transport_header().

3010   static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
3011   {
3012   	DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb));
3013   	return skb->head + skb->transport_header;
3014   }

From include/linux/skbuff.h

So the 5 octets pointed to there must be correct.

(My uneducated guess: head is just the buffer head and transport_header counts how many octets into the buffer the "transport layer", whatever that means for this type of frame, starts.)

Documentation:
https://www.kernel.org/doc/html/latest/networking/skbuff.html

AppleTalk has a transport layer, page 6:
https://www.cisco.com/en/US/docs/internetworking/troubleshooting/guide/tr1909.pdf

Perhaps transport layer points to somewhere in the DDP header ?

0x0000 hop count + length
0x0000 checksum
0x0000 destination net
0x0000 source net
0x00 destination node
0x00 source node
0x00 destination socket
0x00 source socket
0x00 next header type

Would it make sense to match a socket based on destination net + source net + destination node fx.?

Only 11 hits you say... Might it be nice with a stack trace?

As before but add something more or less like:

echo function > current_tracer
echo find_snap_client > set_ftrace_filter
echo 1 > options/func_stack_trace

Thanks!

Looks like:

          <idle>-0       [001] ..s..  8744.047176: find_snap_client <-snap_rcv
          <idle>-0       [001] ..s..  8744.047218: <stack trace>
 => snap_rcv
 => llc_rcv
 => __netif_receive_skb_one_core
 => netif_receive_skb
 => br_handle_frame_finish
 => br_handle_frame
 => __netif_receive_skb_core.constprop.0
 => __netif_receive_skb_list_core
 => netif_receive_skb_list_internal
 => napi_complete_done
 => gro_cell_poll
 => __napi_poll.constprop.0
 => net_rx_action
 => handle_softirqs
 => irq_exit
 => call_with_stack
 => __irq_svc
 => default_idle_call
 => do_idle
 => cpu_startup_entry
 => secondary_start_kernel
 => 0x42301294

Do those tea leaves tell you anything?

modprobe looks like:

        modprobe-2612    [000] b....  9009.979171: find_snap_client <-register_snap_client
        modprobe-2612    [000] b....  9009.979209: <stack trace>
 => register_snap_client
 => init_module
 => do_one_initcall
 => do_init_module
 => sys_init_module
 => ret_fast_syscall
        modprobe-2612    [000] b....  9009.979243: find_snap_client <-register_snap_client
        modprobe-2612    [000] b....  9009.979252: <stack trace>
 => register_snap_client
 => aarp_proto_init
 => init_module
 => do_one_initcall
 => do_init_module
 => sys_init_module
 => ret_fast_syscall

Seems to register for main data traffic and for AARP.

As usual I don't know what I'm looking at, and I can't see whether DSA should have an impact on this or not (or why prior the DSA commit this thing works... might be unrelated to DSA but more driven by something else that happened to be brought along). That's the first thing I would like to understand... why it broke. The 5-byte proto is the key.

This is registration:

Summary
/*
 *      Register SNAP clients. We don't yet use this for IP.
 */
struct datalink_proto *register_snap_client(const unsigned char *desc,
                                            int (*rcvfunc)(struct sk_buff *,
                                                           struct net_device *,
                                                           struct packet_type *,
                                                           struct net_device *))
{
        struct datalink_proto *proto = NULL;

        spin_lock_bh(&snap_lock);

        if (find_snap_client(desc))
                goto out;

        proto = kmalloc(sizeof(*proto), GFP_ATOMIC);
        if (proto) {
                memcpy(proto->type, desc, 5);
                proto->rcvfunc          = rcvfunc;
                proto->header_length    = 5 + 3; /* snap + 802.2 */
                proto->request          = snap_request;
                list_add_rcu(&proto->node, &snap_list);
        }
out:
        spin_unlock_bh(&snap_lock);

        return proto;
}

That is what adds to snap_list. It is being called. So it's either:

  • registration is not adding an entry or adding messed up proto bytes
  • frame comes in with messed up proto bytes

And in best of theories we are talking about these last 5 bytes (of header of 8):
dsap SNAP, ssap SNAP, ctrl, oui Appletalk, pid Appletalk
0xaa, 0xaa, 0x03, 0x080007, 0x809b

Why does this work prior to the DSA-conversion commit??

I think we are past that point no? It already decided it's 802.3 and that it needs to match by proto. Goes AWOL after that.
Maybe the stack trace tells you something to the contrary.

Sure although not sure what that would be... From the stack trace it looks like GRO is maybe active, if you want to play with disabling that see ethtool -k. Hmm what else, well the bridge code is still active too but as mentioned you can make the interface standalone and get rid of br_handle_frame{,_finish} easily enough..

Haven't read through the rest yet, but will

Where's that pcap when you need it haha :wink:

We could potentially printk() the 5 bytes as seen by find_snap() vs. register_snap_client() but that requires a kernel rebuild.

Yes hmm SNAP is probably a better candidate than DDP since we are rummaging about in find_snap_client() :upside_down_face:

Helps that SNAP OUI + PID is 5 consecutive octets.

(also for anyone following along, if OUI happened to be all zeroes in the relevant frame, PID would represent ethertype.)

Yes, or before... In any case I think you are right, and I *think* the 5 bytes are supposed to be a match on oui + pid.

If I run atalkd locally, it hears itself:

 0) ! 114.240 us  |  snap_request();
 0)               |  snap_request() {
 0)               |    snap_rcv() {
 0)   2.240 us    |      find_snap_client();
 0)   5.760 us    |      atalk_rcv [appletalk]();
 0) + 16.800 us   |    }
 0) + 30.880 us   |  }

It still ignores packets from outside, but hears it's own.

So this is my wild guess...

  • frames look OK with tcpdump (and look fine on capture - yes, captured one! finally! man, did I put that off...)
  • frames are OK when locally produced because it's on "this side" of DSA and proto bytes are a match
  • frame is getting messed up somehow (by DSA or something else) so the proto bytes are not a match. Somewhere on the chain the code assumes Ethernet II and overwrites critical bytes with length or shifts things up and down.

With the stack traces earlier above and some creative brute force (unless you have better ideas) I can test that out. If I'm right this is a 802.3 kernel bug.
Any suggestions on how to dump the buffer at netif_receive_skb?

I'm checking the kernel repo for relevant changes. snap_rcv and llc_rcv do not seem to have had any relevant changes. I'll keep going down the list. I assume if anything would be relevant to my pipe dream it might be br_handle_frame and _finish (if that's the DSA magic).
Oh... that's what I need... the stack trace for an internal (working) package. Tomorrow...

Capture:

Summary
[
  {
    "_index": "packets-2024-12-01",
    "_type": "doc",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.encap_type": "1",
          "frame.time": "Dec  1, 2024 23:46:51.999748000 EST",
          "frame.time_utc": "Dec  2, 2024 04:46:51.999748000 UTC",
          "frame.time_epoch": "1733114811.999748000",
          "frame.offset_shift": "0.000000000",
          "frame.time_delta": "0.000000000",
          "frame.time_delta_displayed": "0.000000000",
          "frame.time_relative": "0.000000000",
          "frame.number": "1",
          "frame.len": "60",
          "frame.cap_len": "60",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "eth:llc:ddp:rtmp",
          "frame.coloring_rule.name": "Broadcast",
          "frame.coloring_rule.string": "eth[0] & 1"
        },
        "eth": {
          "eth.dst": "09:00:07:ff:ff:ff",
          "eth.dst_tree": {
            "eth.dst_resolved": "AppleTalk-broadcast-address",
            "eth.dst.oui": "589831",
            "eth.dst.oui_resolved": "Apple, Inc.",
            "eth.addr": "09:00:07:ff:ff:ff",
            "eth.addr_resolved": "AppleTalk-broadcast-address",
            "eth.addr.oui": "589831",
            "eth.addr.oui_resolved": "Apple, Inc.",
            "eth.dst.lg": "0",
            "eth.lg": "0",
            "eth.dst.ig": "1",
            "eth.ig": "1"
          },
          "eth.src": "c4:71:54:05:39:c9",
          "eth.src_tree": {
            "eth.src_resolved": "TpLinkTechno_05:39:c9",
            "eth.src.oui": "12874068",
            "eth.src.oui_resolved": "Tp-Link Technologies Co.,Ltd.",
            "eth.addr": "c4:71:54:05:39:c9",
            "eth.addr_resolved": "TpLinkTechno_05:39:c9",
            "eth.addr.oui": "12874068",
            "eth.addr.oui_resolved": "Tp-Link Technologies Co.,Ltd.",
            "eth.src.lg": "0",
            "eth.lg": "0",
            "eth.src.ig": "0",
            "eth.ig": "0"
          },
          "eth.len": "37",
          "eth.padding": "00:00:00:00:00:00:00:00:00"
        },
        "llc": {
          "llc.dsap": "0xaa",
          "llc.dsap_tree": {
            "llc.dsap.sap": "85",
            "llc.dsap.ig": "0"
          },
          "llc.ssap": "0xaa",
          "llc.ssap_tree": {
            "llc.ssap.sap": "85",
            "llc.ssap.cr": "0"
          },
          "llc.control": "0x0003",
          "llc.control_tree": {
            "llc.control.u_modifier_cmd": "0x00",
            "llc.control.ftype": "0x03"
          },
          "llc.oui": "524295",
          "llc.apple_atalk_pid": "0x809b"
        },
        "ddp": {
          "ddp.src": "c4:71:54:05:39:c9",
          "ddp.dst": "09:00:07:ff:ff:ff",
          "ddp.hopcount": "0",
          "ddp.len": "29",
          "ddp.checksum": "51435",
          "ddp.dst.net": "0",
          "ddp.src.net": "6400",
          "ddp.dst.node": "255",
          "ddp.src.node": "148",
          "ddp.dst_socket": "1",
          "ddp.src_socket": "1",
          "ddp.type": "1"
        },
        "rtmp": {
          "rtmp.net": "6400",
          "nbp.nodeid.length": "8",
          "nbp.nodeid": "148",
          "Tuple 1:  Range Start: 6400  Dist: 0  Range End: 6400  Version: 0x82": {
            "rtmp.tuple.range_start": "6400",
            "rtmp.tuple.dist": "0",
            "rtmp.tuple.range_end": "6400",
            "rtmp.version": "0x82"
          },
          "Tuple 2:  Range Start: 6400  Dist: 0  Range End: 6400": {
            "rtmp.tuple.range_start": "6400",
            "rtmp.tuple.dist": "0",
            "rtmp.tuple.range_end": "6400"
          }
        }
      }
    }
  }
]

Base64 encoded:

Summary
1MOyoQIABAAAAAAAAAAAAAAABAABAAAAuztNZ0RBDwA8AAAAPAAAAAkAB////8RxVAU5yQAlqqoD
CAAHgJsAHcjrAAAZAP+UAQEBGQAIlBkAgBkAghkAgBkAggAAAAAAAAAAAA==

Nice! :slight_smile:

Is this one of the dropped frames or one of the accepted ones?

Next time you happen to do a capture, if possible do one from the sender as well as the receiver at the same time? For comparison purposes...

I don't see anything problematic in this one (by itself).

If they all show up with nothing wrong, we might need to do that printk.

Just to be precise, the appletalk kernel module hears it. (atalk_rcv() is in net/appletalk/ddp.c.)

Still haven't gotten my head around how those two interface.

Could do: this, scroll to "Raw buffer as a hex string"

DSA is earlier, this is just bridging code. Reason I looked at it and wanted it excluded is that bridges deal a whole bunch with multicast, so seemed like a nice thing to exclude as a culprit.

That's an ignored one. That one was generated by the other box running 23.05.
I'll coordinate captures at some point.
I'll capture a successful one in 6-7 weeks with my usual packet capture turnaround.

That's what I was thinking. More builds... yay.

Yes. The trace is interesting. On externally generated packets it shows find_snap_client() and ends. On locally generated ones it shows atalk_rcv [appletalk](); right after. That's the breakage point.
I'll hack a printk in snap_rcv to begin with.

Hah. :slight_smile:

Sounds good!

The more the merrier, printk all the stuff. Fewer builds needed that way...

Also here's a couple "culprit elimination commands":

# ethtool --offload eth0 gro off
# brctl delif br-lan lan3

(replace lan3 with port ubuntu machine connects to of course)

No effect. Played with lan(x) port membership at the bridge and didn't see any change in behaviour. I can't really just do a port for Ubuntu (or 23.05) unless I go work by the network cabinet and give up working over WiFi. Might come to that eventually, but not yet.


Well... I added printk here and there... and something looks very messed up.

Regarding why packets (or frames?) are not routed to the appletalk kmod... it's because the 5-byte signature seems to be shifted by two bytes. Added prints to find_snap_client (called by snap_rcv) so it would display the 5-byte "desc" (obtained with skb_transport_header(skb)) from packet vs what is registered by the kmod (p->type(s)).
The kmod seems to register two clients:

  • AARP: 00:00:00:80:f3
  • Appletalk: 08:00:07:80:9b

Log says (when an Appletalk packet, so 08:00:07:80:9b is coming in:

Mon Dec  2 23:46:10 2024 kern.warn kernel: [ 2430.913607] p->type 0:0:0:80:f3
Mon Dec  2 23:46:10 2024 kern.warn kernel: [ 2430.915861] desc    aa:3:8:0:7
Mon Dec  2 23:46:10 2024 kern.warn kernel: [ 2430.921851] p->type 8:0:7:80:9b
Mon Dec  2 23:46:10 2024 kern.warn kernel: [ 2430.924194] desc    aa:3:8:0:7

Registration entries seem OK. desc is shifted by two. AA:AA:03 are the SSAP:DSAP:CTRL bytes before the OUI:PROTO five bytes. So somehow the packet is underrun by two bytes.

That's not all the bad stuff. Added code to print the packet at llc_rcv.
At beginning of it:

[ 2430.893550] 000000	09 00 07 FF FF FF C4 71 54 05 39 C9 00 25 AA AA
[ 2430.895476] 000010	03 08 00 07 80 9B 00 1D C8 EB 00 00 19 00 FF 94
[ 2430.901454] 000020	01 01 01 19 00 08 94 19 00 80 19 00 82 19

and at end:

[ 2430.945375] 000000	09 00 07 FF FF FF C4 71 54 05 39 C9 00 25 AA AA
[ 2430.948243] 000010	03 08 00 07 80 9B 00 1D C8 EB 00 00 19 00 FF 94
[ 2430.954235] 000020	01 01

That is super weird. llc_rcv is chopping 12 bytes off the end of the packet. I can't imagine any situation where that would be good. Neither does WireShark:
For starting package:

Frame 1: 46 bytes on wire (368 bits), 46 bytes captured (368 bits) on interface Fake IF, text2pcap, id 0
IEEE 802.3 Ethernet 
Logical-Link Control
Datagram Delivery Protocol
Routing Table Maintenance Protocol
[Malformed Packet: RTMP]

For ending package:

Frame 1: 34 bytes on wire (272 bits), 34 bytes captured (272 bits) on interface Fake IF, text2pcap, id 0
IEEE 802.3 Ethernet 
Logical-Link Control
Datagram Delivery Protocol
[Malformed Packet: DDP]

I stumbled into this to print the packets , which may be messed up.
https://olegkutkov.me/2019/10/17/printing-sk_buff-data/
I'll review. It may be truncating the packet printouts so truncated packets might not be a real issue. I take pride into only using someone else's quality code and the explanation on that page looks legit enough.

I'll add more printing further up the chain, and look into the definition of skb for whatever else I may be missing. Regardless...

The 2-byte shift for OUI:Proto is severely messed up. Means skb_transport_header(skb) is messed up. Not a stretch to think then all other pointers including length are messed up.

Keep in mind:

  • this is mainline kernel code. it's under net/llc and net/802. Appletalk code has not been reached by any of this.
  • this works with a close enough kernel (6.6.58 vs 6.6.63) on a non-DSA snapshot.

So let's start with the wild guesses... does DSA have a 2-byte tag at the beginning of the frame that could be causing the underrrun?

Hah. Just thinking aloud here...

C is essentially just pointer magic, as far as I understand?

Hit the compiler with some code, plus throw some incantation of command-line arguments you find in a back alley at it, and it will produce machine instructions that it thinks has decent odds of manipulating pointers the way you want.

In other words, not sure "code quality" is a measurable quantity for C code, beyond "Did you pick the right pointer?" and "Did you give that pointer a descriptive name so I can pick it later?"...

... :upside_down_face: ...

And since nearly everyone is bad at naming things, probably often ends up with all the pointers named "data" or "buf", and then you have an encyclopedia of sorts next to the code, where you can look up what "data" on line 1137 is supposed to really mean.

So need to at least watch out for bad pointers in == bad data out.

And have a decent encyclopedia, I'm looking at Dave Miller's notes from back in the vger.kernel.org days...

Cool.

Interesting!

But maybe too soon to conclude "chopped off the end"? See point about pointers.

Looking at the packet printer, it starts at skb_mac_header(skb) and then goes on for skb->len octets.

Let's say we used pkt_hex_dump() at a place where the stack has already finished parsing MAC addresses? Those are 12 bytes for instance. Might the code have simply done something to reset the skb->mac_header offset thus throwing the packet printer off?

For example, llc_rcv() clones the skb in line 208:

207		if (rcv) {
208			struct sk_buff *cskb = skb_clone(skb, GFP_ATOMIC);
209			if (cskb)
210				rcv(cskb, dev, pt, orig_dev);
211		}
212		sap_handler(sap, skb);

I don't know for sure (hence the long-winded tale, apologies), but one could suspect that skb_clone() might effectively reset some of the pointers and offsets in the cloned sk_buff, for example.

In any case, perhaps the most stable / unchanging of the skb pointers is the tail pointer?

We could try modifying the packet printer to always start at (skb->tail - skb->len) instead of skb_mac_header(skb) and see if that magically fixes the issue.

The output might still be chopped off somewhere, but at least we'd be looking at the full buffer based on start/length/end and not relying on some offset to some field that the skb may still contain or not.

Not sure - but that's my 2 cents anyway :slight_smile:

Sounds good!

Ouch!

Okay so thinking aloud again, at this point we would have expected LLC parsing to be done, which consisted of reading the special unicorn value 0xAAAA that means essentially skip looking at the LLC SAP addresses, plus another LLC octet containing a CTRL byte with some bitflags, which is probably the minimum you can get away with if you want to tell the LLC layer to stay calm and do nothing.

Then comes the OUI:PID which as you mention SHOULD be either:

OUI=00:00:00 ("please read PID as was it ethertype") + PID=0x80f3 (Appletalk-ARP)
-or-
OUI=08:00:07 (Apple) PID=80:9b (Appletalk DDP)

Okay.

Yep!

It does:
https://elixir.bootlin.com/linux/v6.6.63/source/include/linux/dsa/tag_qca.h#L11

But lots of things are 2 bytes, it's a common choice for when 1 byte is not enough. :grin:

fx. LLC SAP header (DSAP + SSAP)
fx. ethertype || 802.3-length

I find the most curious thing to be that llc_rcv() manages to detect the 'special unicorn' value of 0xAAAA but then does not manage to set the pointers and offsets correctly for subsequent functions in the stack to not read it.

But here we also have to take into account that there are a couple functions in between llc_rcv() and find_snap_client(). I can see in line 210 above for example that llc_rcv() calls rcv() which then passes the skb on.

Could be in there somewhere.

From stack trace

Just looked at snap_rcv(), it is pretty simple. You were absolutely correct, llc_rcv() is a good place to concentrate the effort.

(OUI=0, ethertype=aarp versus OUI=apple PID=appletalk might yield different strack traces though, don't know.)

Let's just printk() all the things in llc_rcv() ?

Click to reveal patch for `/net/llc/llc_input.c`
--- llc_input.c.orig    2024-12-03 20:00:00.000000000 +0000
+++ llc_input.c 2024-12-03 20:00:00.000000000 +0100
@@ -19,7 +19,7 @@
 #include <net/llc_pdu.h>
 #include <net/llc_sap.h>
 
-#if 0
+#if 1
 #define dprintk(args...) printk(KERN_DEBUG args)
 #else
 #define dprintk(args...)
@@ -183,6 +183,7 @@
        if (unlikely(!llc_fixup_skb(skb)))
                goto drop;
        pdu = llc_pdu_sn_hdr(skb);
+       dprintk("%s: len=%u DSAP=%02x SSAP=%02x CTRL1=%02x\n", __func__, skb->len, pdu->dsap, pdu->ssap, pdu->ctrl_1);
        if (unlikely(!pdu->dsap)) /* NULL DSAP, refer to station */
               goto handle_station;
        sap = llc_sap_find(pdu->dsap);
@@ -200,15 +201,25 @@
        sap_handler = dest ? READ_ONCE(llc_type_handlers[dest - 1]) : NULL;
        if (unlikely(!sap_handler)) {
                if (rcv)
+               {
+                       dprintk("%s: len=%u, invoking rcv uncloned\n", __func__, skb->len);
                        rcv(skb, dev, pt, orig_dev);
+               }
                else
                        kfree_skb(skb);
+               dprintk("%s: PDU category / LLC_DEST_xxx invalid handler=%d\n", __func__, dest);
        } else {
                if (rcv) {
                        struct sk_buff *cskb = skb_clone(skb, GFP_ATOMIC);
+                       if (cskb) {
+                               dprintk("%s: len=%u, invoking rcv cloned\n", __func__, cskb->len);
+                       } else {
+                               dprintk("%s: skb_clone failed\n", __func);
+                       }
                        if (cskb)
                                rcv(cskb, dev, pt, orig_dev);
                }
+               dprintk("%s: len=%u, PDU category / LLC_DEST_xxx handler=%d, invoking\n", __func__, skb->len, dest);
                sap_handler(sap, skb);
        }
        llc_sap_put(sap);
@@ -220,7 +231,11 @@
 handle_station:
        sta_handler = READ_ONCE(llc_station_handler);
        if (!sta_handler)
+       {
+               dprintk("%s: sta_handler not found\n", __func__);
                goto drop;
+       }
+       dprintk("%s: len=%u, invoking sta_handler\n", skb->len, __func__);
        sta_handler(skb);
        goto out;
 }

Thank you for all that. Will need to re-read it a few times.

Added additional printk stuff as you suggested... and not sure anymore what I'm looking at. Might need to let it rest for a bit. I've been dipping too much into sleep time for this and it is beginning to show.

Asked pkt_hex_dump to print from skb->data for skb-len.

llc_rcv seems well behaved. Let's look at two scenarios.
First one, let's be positive, let's look at a frame produced by a local atalkd, so something that works. On entry, data is:

000000	AA AA 03 08 00 07 80 9B 00 14 FC A9 00 00 00 C8
000010	FF 1C 06 06 06 05 00 00 00 00 00 00

Then llc_fixup_skb is called to take care of the LLC header and it looks like:

000000	08 00 07 80 9B 00 14 FC A9 00 00 00 C8 FF 1C 06
000010	06 06 05 00 00 00 00 00 00

It goes to say llc_rcv: len=25 DSAP=aa SSAP=aa CTRL1=03.
So all fine and dandy. Then it goes to snap_rcv and it looks good:

Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.755793] p->type 0:0:0:80:f3
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.758080] desc    8:0:7:80:9b
...
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.764066] p->type 8:0:7:80:9b
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.766570] desc    8:0:7:80:9b

We have a match, off it goes on that yellow brick road, and all in life is good.

Alright. As for the wicked one of the west... llc_rcv is still well behaved, I hope. On entry,

000000	AA AA 03 08 00 07 80 9B 00 26 2A 29 00 00 19 00
000010	FF 94 06 06 06 06 A0 19 00 19 00 00 06 09 00 07
000020	00 00 3D 0A 4F 70 65 6E 57 72 74 2D 36 34

and after llc_fixup_skb:

000000	08 00 07 80 9B 00 26 2A 29 00 00 19 00 FF 94 06
000010	06 06 06 A0 19 00 19 00 00 06 09 00 07 00 00 3D
000020	0A 4F 70 65 6E 57 72 74 2D 36 34

with the happy llc_rcv: len=43 DSAP=aa SSAP=aa CTRL1=03.
Off goes to snap_rcv and it's a no-go.

Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.851135] p->type 0:0:0:80:f3
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.853477] desc    aa:3:8:0:7
...
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.859467] p->type 8:0:7:80:9b
Tue Dec  3 18:57:59 2024 kern.warn kernel: [  994.861811] desc    aa:3:8:0:7

So skb->data and skb-len are OK, skb_transport_header(skb) is messed up.

And that's really the crux of it. As you point out, a buffer is a buffer is a buffer (by any other name, had Shakespeare been into C) and so is the data in it. But pointers to it are what matters, and somehow, if received via DSA, skb_transport_header(skb) is two bytes short.

So...

I'll catch up on sleep to become human again, then figure out what the heck skb_transport_header(skb) does (it's an inline I think so can't be that complicated) and then try figure out what the DSA path does to eff that up.

For reference, this is the trace for a package that is self-created and then processed:

          atalkd-2198    [000] ..s1.  1941.843723: atrtr_get_dev <-atalk_rcv
          atalkd-2198    [000] ..s1.  1941.843735: <stack trace>
 => atalk_rcv
 => snap_rcv
 => llc_rcv
 => __netif_receive_skb_one_core
 => process_backlog
 => __napi_poll.constprop.0
 => net_rx_action
 => handle_softirqs
 => call_with_stack
 => do_softirq
 => __local_bh_enable_ip
 => __dev_queue_xmit
 => snap_request
 => atalk_sendmsg
 => __sock_sendmsg
 => __sys_sendto
 => ret_fast_syscall

None of this:

 => br_handle_frame_finish
 => br_handle_frame
 => __netif_receive_skb_core.constprop.0
 => __netif_receive_skb_list_core
 => netif_receive_skb_list_internal
 => napi_complete_done
 => gro_cell_poll

instead just this

 => handle_softirqs

So one of those lucky 7 are messing it up for me. It ain't gro because turning it off does nothing, so then there were six.

PS:

Yes, because LLC (I think) runs off skb->data but snap_rcv calls skb_transport_header(skb) which is two bytes off.

PS2:
So that sucker is

static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
{
	return skb->head + skb->transport_header;
}

ergo skb->transport_header is getting offset by 2 on the DSA path.

Keep walking... (hopefully with a scotch). Somewhere up the line that must be getting messed up.

Some of the sk_buff fields are offsets rather than pointers, probably to save a few bytes in the skb struct (there's a lot of skb's being created and flushed). There are some small helper functions (that will likely inline) which just grabs a pointer and an offset and make a new pointer out of it, might be one of those.

Relevant.

Lol :rofl:

That sounds like a fantastic judgment call, we can always pick it up later. Have a good one! :slight_smile:

This paper might give a few hints:

See Figure 9 on page 5, showing DSA switch tags.

In the next round of masochistic debugging...

Asked kernel to dump the full skb at different locations.
llc_fixup_skb does something weird (even though it looks harmless). That is what "chopping off" 18 bytes at end of data packet for an AppleTalk packet and 10 at end of an AARP packet.
For that one I know where to keep looking. I hope this is due to a messed up offset somewhere.
It advances the transport_header offset by three as expected but it doesn't use it. llc_input runs everything off network_header and it works, so network_header is OK.
Which means transport_header is messed up somewhere (2 bytes short) before in the chain as it traverses DSA land. That will require more serious digging.

1 Like

Ooh, exciting!

This (beyond being cryptic as hell) is returning much shorter data size.

		pdulen = eth_hdr(skb)->h_proto;
		data_size = ntohs(pdulen) - llc_len;

llc_len is 3.
For a packet generated locally, goes from this:

Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.406396] pkt_hex_dump: skb:head
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.407306] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.410593] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 1C
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.413687] pkt_hex_dump: skb:data
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.419340] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.422580] 000000	AA AA 03 08 00 07 80 9B 00 14 FC A9 00 00 00 C8
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.425613] 000010	FF 1C 06 06 06 05 00 00 00 00 00 00
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.431689] pkt_hex_dump: skb:tail
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.436866] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.440280] 000000	00 00 00 00 00 00 C8 1C 3C 00 20 A0 F9 84 00 00

to this:

Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.590604] pkt_hex_dump: skb:head
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.599916] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.603215] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 1C AA AA
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.606312] 000010	03
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.612243] pkt_hex_dump: skb:data
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.614673] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.618122] 000000	08 00 07 80 9B 00 14 FC A9 00 00 00 C8 FF 1C 06
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.621101] 000010	06 06 05 00 00 00 00 00 00
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.627149] pkt_hex_dump: skb:tail
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.631601] Packet hex dump:
Fri Dec  6 13:44:59 2024 kern.warn kernel: [  566.634984] 000000	00 00 00 00 00 00 C8 1C 3C 00 20 A0 F9 84 00 00

All nice and clean, the three LLC bytes move from data to header. data_size calculated from that thing above returns current size. No harm done.

For a packet coming in from outside, goes from this:

[  339.672515] pkt_hex_dump: skb:head
[  339.673420] Packet hex dump:
[  339.676711] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  339.679760] 000010	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  339.685751] 000020	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  339.691992] 000030	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  339.698250] 000040	00 00 09 00 09 00 07 FF FF FF C4 71 54 05 39 C9 
[  339.704499] 000050	00 25 
[  339.710745] pkt_hex_dump: skb:data
[  339.713523] Packet hex dump:
[  339.716820] 000000	AA AA 03 08 00 07 80 9B 00 1D C8 EB 00 00 19 00 
[  339.719864] 000010	FF 94 01 01 01 19 00 08 94 19 00 80 19 00 82 19 
[  339.725854] 000020	00 80 19 00 82 00 00 00 00 00 00 00 00 00 
[  339.732103] pkt_hex_dump: skb:tail
[  339.738002] Packet hex dump:
[  339.741300] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

to this:

[  340.032027] pkt_hex_dump: skb:head
[  340.041389] Packet hex dump:
[  340.044679] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  340.047728] 000010	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  340.053718] 000020	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  340.059972] 000030	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[  340.066218] 000040	00 00 09 00 09 00 07 FF FF FF C4 71 54 05 39 C9 
[  340.072469] 000050	00 25 AA AA 03 
[  340.078715] pkt_hex_dump: skb:data
[  340.082186] Packet hex dump:
[  340.085565] 000000	08 00 07 80 9B 00 1D C8 EB 00 00 19 00 FF 94 01 
[  340.088613] 000010	01 01 19 00 08 94 19 00 80 19 00 82 19 00 80 19 
[  340.094607] 000020	00 82 
[  340.100850] pkt_hex_dump: skb:tail
[  340.103628] Packet hex dump:
[  340.106924] 000000	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

Sure, the 3 bytes move out to the head... but good 9 bytes are lost at the end.
data_size is calculated at 34 when it came in at 43.
Lots of empty space in head too (I've seen that before... some social media stuff...).

I guess I should add tracing to a working non-DSA image and flash that to another device to compare. Will do if I hit a block.

Hmmm.... this all may be a red herring. (Yay! I'm wasting my time again!). It's picking up length from the header. Weird. So chopped off packet is OK. I'm not going to get anything more out of it.
Maybe AppleTalk is sending screwy packets. Maybe something eff'd up the size field. I'll park it for now.

Alright... let's look at the ugly one then... when did transport_header go wrong?