Discussion:
Sending 802.1Q packets using AF_PACKET socket on filtered bridge forwards with wrong MAC addresses
(too old to reply)
Brandon Carpenter
2017-11-28 17:21:17 UTC
Permalink
While attempting to send ARP DAD packets on a bridge device, I noticed
that hosts attached to the bridge ports were not responding. After
digging in with tcpdump/wireshark, I noticed that the MAC addresses of
the packets (the first 12 bytes of the Ethernet frame) were wrong and
were an exact duplicate of the following 12 bytes (bytes 13-24), but
only on slave bridge ports. The bridge device itself had the correct
# ip link add br0 type bridge vlan_filtering 1
--
Brandon Carpenter | Software Engineer
Cypherpath, Inc.
400 Columbia Point Drive Ste 101 | Richland, Washington USA
Office: (650) 713-3060
--
CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is
for the sole use of the intended recipient(s) and may contain proprietary,
confidential or privileged information or otherwise be protected by law.
Any unauthorized review, use, disclosure or distribution is prohibited. If
you are not the intended recipient, please notify the sender and destroy
all copies and the original message.
Brandon Carpenter
2017-11-28 19:11:20 UTC
Permalink
While attempting to send ARP DAD packets on a bridge device, I noticed
that hosts attached to the bridge ports were not responding. After
digging in with tcpdump/wireshark, I noticed that the MAC addresses of
the packets (the first 12 bytes of the Ethernet frame) were wrong and
were an exact duplicate of bytes 14-25 (zero-based), but only on slave
bridge ports. The bridge device itself has the correct addresses. Here
is an example setup:

# uname -a
Linux servo 4.13.12-1-ARCH #1 SMP PREEMPT Wed Nov 8 11:54:06 CET
2017 x86_64 GNU/Linux
# ip -V
ip utility, iproute2-ss171113
# bridge -V
bridge utility, 0.0

# ip link add br0 type bridge vlan_filtering 1
# ip link add veth0 type veth peer name veth1
# sysctl -wq net.ipv6.conf.{br0,veth{0,1}}.disable_ipv6=1 #
Eliminate IPv6 noise
# ip link set veth0 master br0
# bridge vlan add dev br0 vid 1 self
# bridge vlan add dev veth0 vid 1
# ip link set veth0 up
# ip link set veth1 up
# ip link set br0 up


The following Python script sends a DAD probe on the given interface:

import socket
import sys

ifname = sys.argv[1]

frame = (b"\xff\xff\xff\xff\xff\xff\xaa\xbb\xcc\xdd\xee\xff\x81\x00"
# Ethernet header
b"\x00\x01\x08\x06" # 802.1Q header with VID 1 and ARP ethertype
# DAD Request for 10.11.12.13
b"\x00\x01\x08\x00\x06\x04\x00\x01"
b"\xaa\xbb\xcc\xdd\xee\xff\x00\x00\x00\x00"
b"\xff\xff\xff\xff\xff\xff\x0a\x0b\x0c\x0d")

with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, 0) as sock:
sock.sendto(frame, (ifname, 0x8100)) # ETH_P_8021Q


Save the script above to arping.py and send a probe on the bridge
device while snooping the bridge ports:

# python3 arping.py br0


In one terminal:

# tcpdump -envli br0 -xx
tcpdump: listening on br0, link-type EN10MB (Ethernet), capture
size 262144 bytes
10:20:18.591728 aa:bb:cc:dd:ee:ff > ff:ff:ff:ff:ff:ff, ethertype
802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Ethernet (len
6), IPv4 (len 4), Request who-has 10.11.12.13 (ff:ff:ff:ff:ff:ff) tell
0.0.0.0, length 28
0x0000: ffff ffff ffff aabb ccdd eeff 8100 0001
0x0010: 0806 0001 0800 0604 0001 aabb ccdd eeff
0x0020: 0000 0000 ffff ffff ffff 0a0b 0c0d

and another:

# tcpdump -envli veth0 -xx
tcpdump: listening on veth0, link-type EN10MB (Ethernet), capture
size 262144 bytes
10:20:18.591736 08:00:06:04:00:01 > 00:01:08:06:00:01, ethertype
802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Ethernet (len
6), IPv4 (len 4), Request who-has 10.11.12.13 (ff:ff:ff:ff:ff:ff) tell
0.0.0.0, length 28
0x0000: 0001 0806 0001 0800 0604 0001 8100 0001
0x0010: 0806 0001 0800 0604 0001 aabb ccdd eeff
0x0020: 0000 0000 ffff ffff ffff 0a0b 0c0d


Doing `python3 arping.py veth1` has the expected results, with all MAC
addresses set correctly. The problem also disappears after turning
VLAN filtering off or sending without the VLAN tag. It is just this
one combination of sending VLAN-tagged packets on a AF_PACKET socket
via the filtered bridge master device that causes the incorrect MAC
addresses on slave bridge ports. I also tried with a unicast Ethernet
address and SOCK_DGRAM socket and saw the same result.

I looked through the bridge code, but didn't see anything obvious. I
don't do much kernel programming and haven't touched it in years, so
I'm not sure what to look for. I first suspected it had to do with
pushing/pulling the head of sk_buff clones, but I didn't see anything
that concerned me. And so I am bringing it to the attention of the
experts. I am willing to poke around and test things, if you can just
point me in the right direction.

Thanks in advance and sorry for any previous messages I accidentally
sent. My email app was being stupid. And I may have pressed the wrong
key combination.

Brandon
--
CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is
for the sole use of the intended recipient(s) and may contain proprietary,
confidential or privileged information or otherwise be protected by law.
Any unauthorized review, use, disclosure or distribution is prohibited. If
you are not the intended recipient, please notify the sender and destroy
all copies and the original message.
Brandon Carpenter
2017-11-28 23:37:40 UTC
Permalink
I tracked down the function responsible for changing the MAC header.
It is skb_vlan_untag() called from __allowed_ingress() in
net/bridge/br_vlan.c. Before skb_vlan_untag() is called, the MAC
headers are perfect and afterward they are altered as previously
described. I'll continue digging into that function, but if any of you
have insight into what skb_vlan_untag() does and why it might alter
the MAC header the way it does in this special case, I would
appreciate the help.

Thanks,

Brandon
--
CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is
for the sole use of the intended recipient(s) and may contain proprietary,
confidential or privileged information or otherwise be protected by law.
Any unauthorized review, use, disclosure or distribution is prohibited. If
you are not the intended recipient, please notify the sender and destroy
all copies and the original message.
Brandon Carpenter
2017-11-29 22:01:11 UTC
Permalink
I narrowed the search to a memmove() called from
skb_reorder_vlan_header() in net/core/skbuff.c.
memmove(skb->data - ETH_HLEN, skb->data - skb->mac_len - VLAN_HLEN,
2 * ETH_ALEN);
Calling skb_reset_mac_len() after skb_reset_mac_header() before
calling br_allowed_ingress() in net/bridge/br_device.c fixes the
problem.

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index af5b8c87f590..e10131e2f68f 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -58,6 +58,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct
net_device *dev)
BR_INPUT_SKB_CB(skb)->brdev = dev;

skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
eth = eth_hdr(skb);
skb_pull(skb, ETH_HLEN);


I'll put together an official patch and submit it. Should I use
another email account? Are my emails being ignored because of that
stupid disclaimer my employer attaches to my messages (outside my
control)?

Brandon
--
CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is
for the sole use of the intended recipient(s) and may contain proprietary,
confidential or privileged information or otherwise be protected by law.
Any unauthorized review, use, disclosure or distribution is prohibited. If
you are not the intended recipient, please notify the sender and destroy
all copies and the original message.
Toshiaki Makita
2017-12-06 04:44:03 UTC
Permalink
Hi,
(CC: Vlad)
Post by Brandon Carpenter
I narrowed the search to a memmove() called from
skb_reorder_vlan_header() in net/core/skbuff.c.
memmove(skb->data - ETH_HLEN, skb->data - skb->mac_len - VLAN_HLEN,
2 * ETH_ALEN);
Calling skb_reset_mac_len() after skb_reset_mac_header() before
calling br_allowed_ingress() in net/bridge/br_device.c fixes the
problem.
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index af5b8c87f590..e10131e2f68f 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -58,6 +58,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct
net_device *dev)
BR_INPUT_SKB_CB(skb)->brdev = dev;
skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
eth = eth_hdr(skb);
skb_pull(skb, ETH_HLEN);
Thanks for debugging this problem.
It seems this has been broken since a6e18ff11170 ("vlan: Fix untag
operations of stacked vlans with REORDER_HEADER off").

Unfortunately this does not always work correctly, since in tx path
drivers assume network header to be set to L3 protocol header offset.
Packet socket (packet_snd()) determines network header by
dev_hard_header which is ETH_HLEN in bridge devices, so this works for
packet socket, but with vlan devices on top of bridge device with
tx-vlan hwaccel disabled we get ETH_HLEN + VLAN_HLEN or longer by mac_len.

Since mac_len can be arbitrarily long if we stack vlan devices on bridge
devices, and since we want to untag the outermost tag, using mac_len to
untag in tx path is probably no longer correct.

I'll think deeper about how to fix it.
Post by Brandon Carpenter
I'll put together an official patch and submit it. Should I use
another email account? Are my emails being ignored because of that
stupid disclaimer my employer attaches to my messages (outside my
control)?
Brandon
--
Toshiaki Makita
Continue reading on narkive:
Loading...