Discussion:
[Bridge] [PATCH v2 net-next 0/6] Allow bridge to function in non-promisc mode
Vlad Yasevich
2013-04-19 20:52:44 UTC
Permalink
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.

The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.

Changes since v1:
- Dynamic promisc mode selection. Almost complete re-write.

Vlad Yasevich (6):
bridge: Allow an ability to designate an uplink port
bridge: make flags sysfs interface a little bit more extensible
bridge: Implement IFF_UNICAST_FLT.
bridge: Allow user to program hw addresses to uplink devices.
bridge: Automatically set promisc on uplink ports.
bridge: Store bridge mac to uplinks

include/uapi/linux/if_link.h | 1 +
include/uapi/linux/neighbour.h | 6 +-
net/bridge/br_device.c | 69 ++++++++++++++++++++++++--
net/bridge/br_fdb.c | 108 +++++++++++++++++++++++++++++++---------
net/bridge/br_if.c | 50 +++++++++++++++---
net/bridge/br_netlink.c | 5 ++
net/bridge/br_private.h | 12 ++++-
net/bridge/br_stp_if.c | 2 +-
net/bridge/br_sysfs_if.c | 60 ++++++++++++++++++----
net/core/dev.c | 1 +
net/core/rtnetlink.c | 4 +-
11 files changed, 265 insertions(+), 53 deletions(-)
--
1.7.7.6
Vlad Yasevich
2013-04-19 20:52:45 UTC
Permalink
Set a flag on the port that designates it as an uplink.

Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
include/uapi/linux/if_link.h | 1 +
net/bridge/br_netlink.c | 2 ++
net/bridge/br_private.h | 1 +
net/bridge/br_sysfs_if.c | 2 ++
4 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index e316354..0c5c76e 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -221,6 +221,7 @@ enum {
IFLA_BRPORT_GUARD, /* bpdu guard */
IFLA_BRPORT_PROTECT, /* root port protection */
IFLA_BRPORT_FAST_LEAVE, /* multicast fast leave */
+ IFLA_BRPORT_UPLINK, /* uplink port */
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 8e3abf5..1767d15 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -281,6 +281,7 @@ static const struct nla_policy ifla_brport_policy[IFLA_BRPORT_MAX + 1] = {
[IFLA_BRPORT_MODE] = { .type = NLA_U8 },
[IFLA_BRPORT_GUARD] = { .type = NLA_U8 },
[IFLA_BRPORT_PROTECT] = { .type = NLA_U8 },
+ [IFLA_BRPORT_UPLINK] = { .type = NLA_U8 },
};

/* Change the state of the port and notify spanning tree */
@@ -328,6 +329,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
br_set_port_flag(p, tb, IFLA_BRPORT_GUARD, BR_BPDU_GUARD);
br_set_port_flag(p, tb, IFLA_BRPORT_FAST_LEAVE, BR_MULTICAST_FAST_LEAVE);
br_set_port_flag(p, tb, IFLA_BRPORT_PROTECT, BR_ROOT_BLOCK);
+ br_set_port_flag(p, tb, IFLA_BRPORT_UPLINK, BR_UPLINK);

if (tb[IFLA_BRPORT_COST]) {
err = br_stp_set_path_cost(p, nla_get_u32(tb[IFLA_BRPORT_COST]));
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 3cbf5be..c90ec57 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -156,6 +156,7 @@ struct net_bridge_port
#define BR_BPDU_GUARD 0x00000002
#define BR_ROOT_BLOCK 0x00000004
#define BR_MULTICAST_FAST_LEAVE 0x00000008
+#define BR_UPLINK 0x00000010

#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
u32 multicast_startup_queries_sent;
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index a1ef1b6..1f28cd4 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -158,6 +158,7 @@ static BRPORT_ATTR(flush, S_IWUSR, NULL, store_flush);
BRPORT_ATTR_FLAG(hairpin_mode, BR_HAIRPIN_MODE);
BRPORT_ATTR_FLAG(bpdu_guard, BR_BPDU_GUARD);
BRPORT_ATTR_FLAG(root_block, BR_ROOT_BLOCK);
+BRPORT_ATTR_FLAG(uplink, BR_UPLINK);

#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
static ssize_t show_multicast_router(struct net_bridge_port *p, char *buf)
@@ -195,6 +196,7 @@ static const struct brport_attribute *brport_attrs[] = {
&brport_attr_hairpin_mode,
&brport_attr_bpdu_guard,
&brport_attr_root_block,
+ &brport_attr_uplink,
#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
&brport_attr_multicast_router,
&brport_attr_multicast_fast_leave,
--
1.7.7.6
Vlad Yasevich
2013-04-19 20:52:46 UTC
Permalink
Some changes of the flags may need to trigger additional actions.
Make the flag change function generic and have the per-attribute
function call that.

Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
net/bridge/br_sysfs_if.c | 39 ++++++++++++++++++++++++++++-----------
1 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 1f28cd4..606362c 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -25,6 +25,10 @@ struct brport_attribute {
ssize_t (*show)(struct net_bridge_port *, char *);
int (*store)(struct net_bridge_port *, unsigned long);
};
+static ssize_t show_flag(struct net_bridge_port *p, char *buf,
+ unsigned long mask);
+static int store_flag(struct net_bridge_port *p, unsigned long v,
+ unsigned long mask);

#define BRPORT_ATTR(_name,_mode,_show,_store) \
const struct brport_attribute brport_attr_##_name = { \
@@ -37,24 +41,37 @@ const struct brport_attribute brport_attr_##_name = { \
#define BRPORT_ATTR_FLAG(_name, _mask) \
static ssize_t show_##_name(struct net_bridge_port *p, char *buf) \
{ \
- return sprintf(buf, "%d\n", !!(p->flags & _mask)); \
+ return show_flag(p, buf, _mask); \
} \
static int store_##_name(struct net_bridge_port *p, unsigned long v) \
{ \
- unsigned long flags = p->flags; \
- if (v) \
- flags |= _mask; \
- else \
- flags &= ~_mask; \
- if (flags != p->flags) { \
- p->flags = flags; \
- br_ifinfo_notify(RTM_NEWLINK, p); \
- } \
- return 0; \
+ return store_flag(p, v, _mask); \
} \
static BRPORT_ATTR(_name, S_IRUGO | S_IWUSR, \
show_##_name, store_##_name)

+static ssize_t show_flag(struct net_bridge_port *p, char *buf,
+ unsigned long mask)
+{
+ return sprintf(buf, "%d\n", !!(p->flags & mask));
+}
+
+static int store_flag(struct net_bridge_port *p, unsigned long v,
+ unsigned long mask)
+{
+ unsigned long flags = p->flags;
+
+ if (v)
+ flags |= mask;
+ else
+ flags &= ~mask;
+
+ if (flags != p->flags) {
+ p->flags = flags;
+ br_ifinfo_notify(RTM_NEWLINK, p);
+ }
+ return 0;
+}

static ssize_t show_path_cost(struct net_bridge_port *p, char *buf)
{
--
1.7.7.6
Stephen Hemminger
2013-04-19 20:54:41 UTC
Permalink
On Fri, 19 Apr 2013 16:52:46 -0400
Post by Vlad Yasevich
+static ssize_t show_flag(struct net_bridge_port *p, char *buf,
+ unsigned long mask)
+{
+ return sprintf(buf, "%d\n", !!(p->flags & mask));
+}
Flags are bit, show in hex please.
Vlad Yasevich
2013-04-19 21:35:26 UTC
Permalink
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:46 -0400
Post by Vlad Yasevich
+static ssize_t show_flag(struct net_bridge_port *p, char *buf,
+ unsigned long mask)
+{
+ return sprintf(buf, "%d\n", !!(p->flags & mask));
+}
Flags are bit, show in hex please.
I did not change the output format. In fact, on second look
I can pull this change out as it isn't needed.

-vlad
Stephen Hemminger
2013-04-19 20:55:33 UTC
Permalink
On Fri, 19 Apr 2013 16:52:46 -0400
Post by Vlad Yasevich
+
+static int store_flag(struct net_bridge_port *p, unsigned long v,
+ unsigned long mask)
+{
+ unsigned long flags = p->flags;
+
+ if (v)
+ flags |= mask;
+ else
+ flags &= ~mask;
+
+ if (flags != p->flags) {
+ p->flags = flags;
+ br_ifinfo_notify(RTM_NEWLINK, p);
+ }
+ return 0;
+}
I don't want to allow arbitrary flag stores (and shows).
It exposes kernel bits to user space and creates an unintended ABI.
Vlad Yasevich
2013-04-19 21:33:34 UTC
Permalink
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:46 -0400
Post by Vlad Yasevich
+
+static int store_flag(struct net_bridge_port *p, unsigned long v,
+ unsigned long mask)
+{
+ unsigned long flags = p->flags;
+
+ if (v)
+ flags |= mask;
+ else
+ flags &= ~mask;
+
+ if (flags != p->flags) {
+ p->flags = flags;
+ br_ifinfo_notify(RTM_NEWLINK, p);
+ }
+ return 0;
+}
I don't want to allow arbitrary flag stores (and shows).
It exposes kernel bits to user space and creates an unintended ABI.
Stephen

This is really no different then what you currently have. It simply
removed code duplication and allows for slight sensibility.

Right now, the above function is essentially defined for every flag
attributed that you define with BRPORT_ATTR_FLAG(). You essentially
have store_<flag>_name() function for every flag that replicates the
above code.

This patch removes this duplication by having each store_<flag>_name()
function call into this new static store_flag() function with the
mask that is coded and expanded at compile time.

So there is no new ABI, there is no arbitrary flag stores, but there is
smaller code and there is ability to extend the flag store behavior to
possibly do something else it if is needed.

-vlad
Vlad Yasevich
2013-04-19 20:52:47 UTC
Permalink
Implement IFF_UNICAST_FLT on the bridge. Unicast addresses added
to the bridge device are synched to the uplink devices. When the uplinks
change, current bridge device addresses are added/removed depending on
the change of the BR_UPLINK flag.

Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
net/bridge/br_device.c | 17 ++++++++++++++---
net/bridge/br_if.c | 24 ++++++++++++++++++++++++
net/bridge/br_netlink.c | 3 +++
net/bridge/br_private.h | 1 +
net/bridge/br_sysfs_if.c | 21 ++++++++++++++++++++-
5 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 9673128..5c3c904 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -104,8 +104,19 @@ static int br_dev_open(struct net_device *dev)
return 0;
}

-static void br_dev_set_multicast_list(struct net_device *dev)
+static void br_dev_set_rx_mode(struct net_device *dev)
{
+ struct net_bridge *br = netdev_priv(dev);
+ struct net_bridge_port *port;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(port, &br->port_list, list) {
+ if (port->flags & BR_UPLINK) {
+ dev_uc_sync_multiple(port->dev, dev);
+ dev_mc_sync_multiple(port->dev, dev);
+ }
+ }
+ rcu_read_unlock();
}

static int br_dev_stop(struct net_device *dev)
@@ -301,7 +312,7 @@ static const struct net_device_ops br_netdev_ops = {
.ndo_start_xmit = br_dev_xmit,
.ndo_get_stats64 = br_get_stats64,
.ndo_set_mac_address = br_set_mac_address,
- .ndo_set_rx_mode = br_dev_set_multicast_list,
+ .ndo_set_rx_mode = br_dev_set_rx_mode,
.ndo_change_mtu = br_change_mtu,
.ndo_do_ioctl = br_dev_ioctl,
#ifdef CONFIG_NET_POLL_CONTROLLER
@@ -344,7 +355,7 @@ void br_dev_setup(struct net_device *dev)
SET_ETHTOOL_OPS(dev, &br_ethtool_ops);
SET_NETDEV_DEVTYPE(dev, &br_type);
dev->tx_queue_len = 0;
- dev->priv_flags = IFF_EBRIDGE;
+ dev->priv_flags = IFF_EBRIDGE | IFF_UNICAST_FLT;

dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HIGHDMA |
NETIF_F_GSO_MASK | NETIF_F_HW_CSUM | NETIF_F_LLTX |
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index f17fcb3..7c74cd5 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -142,6 +142,10 @@ static void del_nbp(struct net_bridge_port *p)

nbp_vlan_flush(p);
br_fdb_delete_by_port(br, p, 1);
+ if (p->flags & BR_UPLINK) {
+ dev_uc_unsync(dev, br->dev);
+ dev_mc_unsync(dev, br->dev);
+ }

list_del_rcu(&p->list);

@@ -448,6 +452,26 @@ int br_del_if(struct net_bridge *br, struct net_device *dev)
return 0;
}

+void br_manage_uplinks(struct net_bridge_port *p, unsigned long mask)
+{
+ struct net_bridge *br = p->br;
+
+ if (!(mask & BR_UPLINK))
+ return;
+
+ if (p->flags & BR_UPLINK) {
+ /* Newly marked uplink port. Sync all of device addresses
+ * to it.
+ */
+ dev_uc_sync(p->dev, br->dev);
+ dev_mc_sync(p->dev, br->dev);
+ } else {
+ /* Uplink was unmakred. Remove device addresses */
+ dev_uc_unsync(p->dev, br->dev);
+ dev_mc_unsync(p->dev, br->dev);
+ }
+}
+
void __net_exit br_net_exit(struct net *net)
{
struct net_device *dev;
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 1767d15..5286903 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -323,6 +323,7 @@ static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[],
/* Process bridge protocol info on port */
static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
{
+ unsigned long flags = p->flags;
int err;

br_set_port_flag(p, tb, IFLA_BRPORT_MODE, BR_HAIRPIN_MODE);
@@ -348,6 +349,8 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
if (err)
return err;
}
+
+ br_manage_uplinks(p, flags ^ p->flags);
return 0;
}

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index c90ec57..c43374a 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -427,6 +427,7 @@ extern int br_del_if(struct net_bridge *br,
extern int br_min_mtu(const struct net_bridge *br);
extern netdev_features_t br_features_recompute(struct net_bridge *br,
netdev_features_t features);
+extern void br_manage_uplinks(struct net_bridge_port *p, unsigned long mask);

/* br_input.c */
extern int br_handle_frame_finish(struct sk_buff *skb);
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 606362c..575ad9a 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -175,7 +175,26 @@ static BRPORT_ATTR(flush, S_IWUSR, NULL, store_flush);
BRPORT_ATTR_FLAG(hairpin_mode, BR_HAIRPIN_MODE);
BRPORT_ATTR_FLAG(bpdu_guard, BR_BPDU_GUARD);
BRPORT_ATTR_FLAG(root_block, BR_ROOT_BLOCK);
-BRPORT_ATTR_FLAG(uplink, BR_UPLINK);
+
+static ssize_t show_uplink(struct net_bridge_port *p, char *buf)
+{
+ return show_flag(p, buf, BR_UPLINK);
+}
+
+static int set_uplink(struct net_bridge_port *p, unsigned long v)
+{
+ unsigned long oflags = p->flags;
+ int err;
+
+ err = store_flag(p, v, BR_UPLINK);
+
+ if (oflags != p->flags)
+ br_manage_uplinks(p, oflags ^ p->flags);
+
+ return err;
+}
+
+BRPORT_ATTR(uplink, S_IRUSR | S_IWUSR, show_uplink, set_uplink);

#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
static ssize_t show_multicast_router(struct net_bridge_port *p, char *buf)
--
1.7.7.6
Vlad Yasevich
2013-04-19 20:52:48 UTC
Permalink
Add support to bridge fdb handling to program hw addresses to the
bridge master device which will sync them to uplink devices.
The use sets a new flag in the ndmsg structure to say that
a given address should be added to or removed from uplinks.

Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
include/uapi/linux/neighbour.h | 6 ++--
net/bridge/br_fdb.c | 50 ++++++++++++++++++++++++++++------------
net/core/rtnetlink.c | 4 +-
3 files changed, 40 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index f175212..fd1587d 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -34,11 +34,11 @@ enum {
*/

#define NTF_USE 0x01
-#define NTF_PROXY 0x08 /* == ATF_PUBL */
-#define NTF_ROUTER 0x80
-
#define NTF_SELF 0x02
#define NTF_MASTER 0x04
+#define NTF_PROXY 0x08 /* == ATF_PUBL */
+#define NTF_UPLINK 0x10
+#define NTF_ROUTER 0x80

/*
* Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index c581f12..5585e00 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -671,7 +671,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
const unsigned char *addr, u16 nlh_flags)
{
struct net_bridge_port *p;
- int err = 0;
+ int err = -EINVAL;
struct net_port_vlans *pv;
unsigned short vid = VLAN_N_VID;

@@ -680,6 +680,16 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
return -EINVAL;
}

+ p = br_port_get_rtnl(dev);
+ if (p == NULL) {
+ pr_info("bridge: RTM_NEWNEIGH %s not a bridge port\n",
+ dev->name);
+ return -EINVAL;
+ }
+
+ if (is_multicast_ether_addr(addr))
+ goto uplink;
+
if (tb[NDA_VLAN]) {
if (nla_len(tb[NDA_VLAN]) != sizeof(unsigned short)) {
pr_info("bridge: RTM_NEWNEIGH with invalid vlan\n");
@@ -695,13 +705,6 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
}
}

- p = br_port_get_rtnl(dev);
- if (p == NULL) {
- pr_info("bridge: RTM_NEWNEIGH %s not a bridge port\n",
- dev->name);
- return -EINVAL;
- }
-
pv = nbp_get_vlan_info(p);
if (vid != VLAN_N_VID) {
if (!pv || !test_bit(vid, pv->vlan_bitmap)) {
@@ -729,6 +732,13 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
}
}

+uplink:
+ /* Check to see if the user requested this address be added to
+ * uplink
+ */
+ if (ndm->ndm_flags & NTF_UPLINK)
+ err = ndo_dflt_fdb_add(ndm, tb, p->br->dev, addr, nlh_flags);
+
out:
return err;
}
@@ -765,10 +775,20 @@ int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
const unsigned char *addr)
{
struct net_bridge_port *p;
- int err;
+ int err = -EINVAL;
struct net_port_vlans *pv;
unsigned short vid = VLAN_N_VID;

+ p = br_port_get_rtnl(dev);
+ if (p == NULL) {
+ pr_info("bridge: RTM_DELNEIGH %s not a bridge port\n",
+ dev->name);
+ return -EINVAL;
+ }
+
+ if (is_multicast_ether_addr(addr))
+ goto uplink;
+
if (tb[NDA_VLAN]) {
if (nla_len(tb[NDA_VLAN]) != sizeof(unsigned short)) {
pr_info("bridge: RTM_NEWNEIGH with invalid vlan\n");
@@ -783,12 +803,6 @@ int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
return -EINVAL;
}
}
- p = br_port_get_rtnl(dev);
- if (p == NULL) {
- pr_info("bridge: RTM_DELNEIGH %s not a bridge port\n",
- dev->name);
- return -EINVAL;
- }

pv = nbp_get_vlan_info(p);
if (vid != VLAN_N_VID) {
@@ -814,6 +828,12 @@ int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
err &= __br_fdb_delete(p, addr, vid);
}
}
+uplink:
+ /* Check to see if the user requested this address be removed
+ * from uplink
+ */
+ if (ndm->ndm_flags & NTF_UPLINK)
+ err = ndo_dflt_fdb_del(ndm, tb, p->br->dev, addr);
out:
return err;
}
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 589d0ab..1d3c223 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2042,7 +2042,7 @@ int ndo_dflt_fdb_add(struct ndmsg *ndm,
/* If aging addresses are supported device will need to
* implement its own handler for this.
*/
- if (ndm->ndm_state && !(ndm->ndm_state & NUD_PERMANENT)) {
+ if (ndm->ndm_state && !(ndm->ndm_state & (NUD_PERMANENT | NUD_NOARP))) {
pr_info("%s: FDB only supports static addresses\n", dev->name);
return err;
}
@@ -2142,7 +2142,7 @@ int ndo_dflt_fdb_del(struct ndmsg *ndm,
/* If aging addresses are supported device will need to
* implement its own handler for this.
*/
- if (ndm->ndm_state & NUD_PERMANENT) {
+ if (ndm->ndm_state & (NUD_PERMANENT | NUD_NOARP)) {
pr_info("%s: FDB only supports static addresses\n", dev->name);
return -EINVAL;
}
--
1.7.7.6
Vlad Yasevich
2013-04-19 20:52:49 UTC
Permalink
Since we now support adding HW mac addresses to uplink ports, we can make
uplinks non-promisc in certain situations. To aid in this determination
we introduce a concept of dynamic port. "Dynamic port" is a port in its
default default state without any statically configured FDB entries that
are meant to be added to the uplink. Now the promisc selection can be
done as follows:
Case A: 0 uplinks and 0 dynamic ports
- In this case, there are no uplinks specified, and the user has
manually specified the neighbors for all ports. Every port in
this case can be non-promisc since we know how to reach all the
neighbors. A sample use case is a routed bridge.
Case B: 1 dynamic port (uplink)
- Uplink is always considered a dynamic port. If it is the only
one, and all other ports have static FDBs, then the uplink can
be non-promisc. A sample use case is a VM hypervisior with a
single uplink and a bunch of VMs each of which provides the
addresses it will use.
Case C: any other combination
- If we have have more then 1 dynamic port, whether it be another
uplink or simply a non statically programmed port, then all ports
needs to be promisc so we can reach any neighbors that may be
behind the dynamic port.


Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
net/bridge/br_device.c | 35 ++++++++++++++++++++++++-
net/bridge/br_fdb.c | 64 ++++++++++++++++++++++++++++++++++++++--------
net/bridge/br_if.c | 48 +++++++++++++++++++++--------------
net/bridge/br_private.h | 8 +++++-
net/core/dev.c | 1 +
5 files changed, 123 insertions(+), 33 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 5c3c904..470fb1b 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -108,12 +108,43 @@ static void br_dev_set_rx_mode(struct net_device *dev)
{
struct net_bridge *br = netdev_priv(dev);
struct net_bridge_port *port;
-
+ u32 dp = br->dynamic_ports;
+ u32 up = br->uplink_ports;
+
+ /* Prereq: Uplink ports are always dynamic.
+ *
+ * Case A: 0 dynamic ports
+ * - all non-promisc with synced addrs.
+ * Case B: 1 dynamic port (uplink)
+ * - uplink is non-promisc, other ports are promisc
+ * Case C: any other combination
+ * - all promisc, no need to sync.
+ */
rcu_read_lock();
list_for_each_entry_rcu(port, &br->port_list, list) {
- if (port->flags & BR_UPLINK) {
+ unsigned long promisc = BR_PROMISC;
+
+ if (up == 0 && dp == 0) {
+ /* Case A */
+ dev_uc_sync_multiple(port->dev, dev);
+ dev_uc_sync_multiple(port->dev, dev);
+ promisc = 0;
+ } else if (dp == 1 && (port->flags & BR_UPLINK)) {
+ /* Case B */
dev_uc_sync_multiple(port->dev, dev);
dev_mc_sync_multiple(port->dev, dev);
+ promisc = 0;
+ }
+
+ /* Set proper promisc mode if it needs to change */
+ if ((port->flags & BR_PROMISC) != promisc) {
+ if (promisc) {
+ port->flags |= BR_PROMISC;
+ dev_set_promiscuity(dev, 1);
+ } else {
+ port->flags &= ~BR_PROMISC;
+ dev_set_promiscuity(dev, -1);
+ }
}
}
rcu_read_unlock();
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 5585e00..5a18c2e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -89,6 +89,36 @@ static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f)
call_rcu(&f->rcu, fdb_rcu_free);
}

+static void fdb_static_count_inc(struct net_bridge_port *p)
+{
+ p->static_cnt++;
+
+ /* Uplink always counts as dynamic port */
+ if (p->flags & BR_UPLINK)
+ return;
+
+ /* If this is the first static fdb entry, decrement the number of
+ * number of dynamic ports.
+ */
+ if (p->static_cnt == 1)
+ p->br->dynamic_ports--;
+}
+
+static void fdb_static_count_dec(struct net_bridge_port *p)
+{
+ p->static_cnt--;
+
+ /* Uplink always counts as dynamic port */
+ if (p->flags & BR_UPLINK)
+ return;
+
+ /* If this was the last static fdb entry for this port, add the
+ * port back to dynamic count.
+ */
+ if (!p->static_cnt)
+ p->br->dynamic_ports++;
+}
+
void br_fdb_changeaddr(struct net_bridge_port *p, const unsigned char *newaddr)
{
struct net_bridge *br = p->br;
@@ -218,7 +248,7 @@ void br_fdb_flush(struct net_bridge *br)
* if do_all is set also flush static entries
*/
void br_fdb_delete_by_port(struct net_bridge *br,
- const struct net_bridge_port *p,
+ struct net_bridge_port *p,
int do_all)
{
int i;
@@ -235,6 +265,10 @@ void br_fdb_delete_by_port(struct net_bridge *br,

if (f->is_static && !do_all)
continue;
+
+ if (f->is_uplink)
+ fdb_static_count_dec(p);
+
/*
* if multiple ports all have the same device address
* then when one port is deleted, assign
@@ -386,7 +420,8 @@ static struct net_bridge_fdb_entry *fdb_find_rcu(struct hlist_head *head,
static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
struct net_bridge_port *source,
const unsigned char *addr,
- __u16 vid)
+ __u16 vid,
+ bool uplink)
{
struct net_bridge_fdb_entry *fdb;

@@ -397,6 +432,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
fdb->vlan_id = vid;
fdb->is_local = 0;
fdb->is_static = 0;
+ fdb->is_uplink = uplink;
fdb->updated = fdb->used = jiffies;
hlist_add_head_rcu(&fdb->hlist, head);
}
@@ -425,7 +461,7 @@ static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
fdb_delete(br, fdb);
}

- fdb = fdb_create(head, source, addr, vid);
+ fdb = fdb_create(head, source, addr, vid, false);
if (!fdb)
return -ENOMEM;

@@ -477,7 +513,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
} else {
spin_lock(&br->hash_lock);
if (likely(!fdb_find(head, addr, vid))) {
- fdb = fdb_create(head, source, addr, vid);
+ fdb = fdb_create(head, source, addr, vid, false);
if (fdb)
fdb_notify(br, fdb, RTM_NEWNEIGH);
}
@@ -610,7 +646,7 @@ out:

/* Update (create or replace) forwarding database entry */
static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
- __u16 state, __u16 flags, __u16 vid)
+ __u16 state, __u16 flags, __u16 vid, bool uplink)
{
struct net_bridge *br = source->br;
struct hlist_head *head = &br->hash[br_mac_hash(addr, vid)];
@@ -621,7 +657,7 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
if (!(flags & NLM_F_CREATE))
return -ENOENT;

- fdb = fdb_create(head, source, addr, vid);
+ fdb = fdb_create(head, source, addr, vid, uplink);
if (!fdb)
return -ENOMEM;
fdb_notify(br, fdb, RTM_NEWNEIGH);
@@ -658,7 +694,8 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge_port *p,
} else {
spin_lock_bh(&p->br->hash_lock);
err = fdb_add_entry(p, addr, ndm->ndm_state,
- nlh_flags, vid);
+ nlh_flags, vid,
+ ndm->ndm_flags & BR_UPLINK);
spin_unlock_bh(&p->br->hash_lock);
}

@@ -731,14 +768,16 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
goto out;
}
}
-
uplink:
/* Check to see if the user requested this address be added to
* uplink
*/
- if (ndm->ndm_flags & NTF_UPLINK)
+ if (ndm->ndm_flags & NTF_UPLINK) {
+ fdb_static_count_inc(p);
err = ndo_dflt_fdb_add(ndm, tb, p->br->dev, addr, nlh_flags);
-
+ if (err)
+ fdb_static_count_dec(p);
+ }
out:
return err;
}
@@ -832,8 +871,11 @@ uplink:
/* Check to see if the user requested this address be removed
* from uplink
*/
- if (ndm->ndm_flags & NTF_UPLINK)
+ if (ndm->ndm_flags & NTF_UPLINK) {
err = ndo_dflt_fdb_del(ndm, tb, p->br->dev, addr);
+ if (!err)
+ fdb_static_count_dec(p);
+ }
out:
return err;
}
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 7c74cd5..c62da60 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -132,7 +132,8 @@ static void del_nbp(struct net_bridge_port *p)

sysfs_remove_link(br->ifobj, p->dev->name);

- dev_set_promiscuity(dev, -1);
+ if (p->flags & BR_PROMISC)
+ dev_set_promiscuity(dev, -1);

spin_lock_bh(&br->lock);
br_stp_disable_port(p);
@@ -142,13 +143,19 @@ static void del_nbp(struct net_bridge_port *p)

nbp_vlan_flush(p);
br_fdb_delete_by_port(br, p, 1);
- if (p->flags & BR_UPLINK) {
- dev_uc_unsync(dev, br->dev);
- dev_mc_unsync(dev, br->dev);
- }
+
+ /* Remove any bridge hw addresses from the port device */
+ dev_uc_unsync(dev, br->dev);
+ dev_mc_unsync(dev, br->dev);
+ dev_uc_del(p->dev, br->dev->dev_addr);

list_del_rcu(&p->list);

+ /* Now that the port has been removed, call dev_set_rx_mode()
+ * so that the port removal may be recorded.
+ */
+ dev_set_rx_mode(br->dev);
+
dev->priv_flags &= ~IFF_BRIDGE_PORT;

netdev_rx_handler_unregister(dev);
@@ -353,14 +360,10 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)

call_netdevice_notifiers(NETDEV_JOIN, dev);

- err = dev_set_promiscuity(dev, 1);
- if (err)
- goto put_back;
-
err = kobject_init_and_add(&p->kobj, &brport_ktype, &(dev->dev.kobj),
SYSFS_BRIDGE_PORT_ATTR);
if (err)
- goto err1;
+ goto put_back;

err = br_sysfs_addif(p);
if (err)
@@ -382,6 +385,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
dev_disable_lro(dev);

list_add_rcu(&p->list, &br->port_list);
+ br->dynamic_ports++;

netdev_update_features(br->dev);

@@ -405,6 +409,8 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)

kobject_uevent(&p->kobj, KOBJ_ADD);

+ dev_set_rx_mode(br->dev);
+
return 0;

err5:
@@ -416,8 +422,6 @@ err3:
err2:
kobject_put(&p->kobj);
p = NULL; /* kobject_put frees */
-err1:
- dev_set_promiscuity(dev, -1);
put_back:
dev_put(dev);
kfree(p);
@@ -460,16 +464,22 @@ void br_manage_uplinks(struct net_bridge_port *p, unsigned long mask)
return;

if (p->flags & BR_UPLINK) {
- /* Newly marked uplink port. Sync all of device addresses
- * to it.
+ /* Uplinks are always considered dynamic ports, so if
+ * any static fdbs are configured, we need to add this
+ * port back to dynamic count.
*/
- dev_uc_sync(p->dev, br->dev);
- dev_mc_sync(p->dev, br->dev);
+ br->uplink_ports++;
+ if (p->static_cnt)
+ br->dynamic_ports++;
} else {
- /* Uplink was unmakred. Remove device addresses */
- dev_uc_unsync(p->dev, br->dev);
- dev_mc_unsync(p->dev, br->dev);
+ /* Uplink flag was turned off. This port can once again
+ * be removed from the dynamic count.
+ */
+ br->uplink_ports--;
+ if (p->static_cnt)
+ br->dynamic_ports--;
}
+ dev_set_rx_mode(br->dev);
}

void __net_exit br_net_exit(struct net *net)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index c43374a..40ac501 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -90,6 +90,7 @@ struct net_bridge_fdb_entry
mac_addr addr;
unsigned char is_local;
unsigned char is_static;
+ bool is_uplink;
__u16 vlan_id;
};

@@ -157,6 +158,9 @@ struct net_bridge_port
#define BR_ROOT_BLOCK 0x00000004
#define BR_MULTICAST_FAST_LEAVE 0x00000008
#define BR_UPLINK 0x00000010
+#define BR_PROMISC 0x80000000
+
+ u32 static_cnt;

#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
u32 multicast_startup_queries_sent;
@@ -282,6 +286,8 @@ struct net_bridge
u8 vlan_enabled;
struct net_port_vlans __rcu *vlan_info;
#endif
+ u32 dynamic_ports;
+ u32 uplink_ports;
};

struct br_input_skb_cb {
@@ -375,7 +381,7 @@ extern void br_fdb_changeaddr(struct net_bridge_port *p,
extern void br_fdb_change_mac_address(struct net_bridge *br, const u8 *newaddr);
extern void br_fdb_cleanup(unsigned long arg);
extern void br_fdb_delete_by_port(struct net_bridge *br,
- const struct net_bridge_port *p, int do_all);
+ struct net_bridge_port *p, int do_all);
extern struct net_bridge_fdb_entry *__br_fdb_get(struct net_bridge *br,
const unsigned char *addr,
__u16 vid);
diff --git a/net/core/dev.c b/net/core/dev.c
index fad4c38..471081b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4587,6 +4587,7 @@ void dev_set_rx_mode(struct net_device *dev)
__dev_set_rx_mode(dev);
netif_addr_unlock_bh(dev);
}
+EXPORT_SYMBOL(dev_set_rx_mode);

/**
* dev_get_flags - get flags reported to userspace
--
1.7.7.6
Vlad Yasevich
2013-04-19 20:52:50 UTC
Permalink
Bridge mac address can change for many reasons. Since now some ports
may be non-promisc, we need to keep track of the bridge mac and program
it onto ports that may need that information. We do that by writing
to any port that turns non-promisc and delete it from any port that
turns back to promisc.

Signed-off-by: Vlad Yasevich <vyasevic at redhat.com>
---
net/bridge/br_device.c | 21 ++++++++++++++++++++-
net/bridge/br_private.h | 2 ++
net/bridge/br_stp_if.c | 2 +-
3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 470fb1b..2521fb8 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -141,9 +141,23 @@ static void br_dev_set_rx_mode(struct net_device *dev)
if (promisc) {
port->flags |= BR_PROMISC;
dev_set_promiscuity(dev, 1);
+
+ /* Remove the bridge mac from port since it
+ * is now promisc.
+ */
+ if (memcmp(dev->dev_addr, port->dev->dev_addr,
+ dev->addr_len))
+ dev_uc_del(port->dev, dev->dev_addr);
} else {
port->flags &= ~BR_PROMISC;
dev_set_promiscuity(dev, -1);
+
+ /* Add bridge mac address to the port since
+ * it is non-promisc.
+ */
+ if (memcmp(dev->dev_addr, port->dev->dev_addr,
+ dev->addr_len))
+ dev_uc_add(port->dev, dev->dev_addr);
}
}
}
@@ -207,6 +221,12 @@ static int br_change_mtu(struct net_device *dev, int new_mtu)
return 0;
}

+void br_change_mac_address(struct net_device *dev, const unsigned char *addr)
+{
+ memcpy(dev->dev_addr, addr, ETH_ALEN);
+ dev_set_rx_mode(dev);
+}
+
/* Allow setting mac address to any valid ethernet address. */
static int br_set_mac_address(struct net_device *dev, void *p)
{
@@ -218,7 +238,6 @@ static int br_set_mac_address(struct net_device *dev, void *p)

spin_lock_bh(&br->lock);
if (!ether_addr_equal(dev->dev_addr, addr->sa_data)) {
- memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
br_fdb_change_mac_address(br, addr->sa_data);
br_stp_change_bridge_id(br, addr->sa_data);
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 40ac501..ace976f 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -334,6 +334,8 @@ extern void br_dev_setup(struct net_device *dev);
extern void br_dev_delete(struct net_device *dev, struct list_head *list);
extern netdev_tx_t br_dev_xmit(struct sk_buff *skb,
struct net_device *dev);
+extern void br_change_mac_address(struct net_device *dev,
+ const unsigned char *addr);
#ifdef CONFIG_NET_POLL_CONTROLLER
static inline struct netpoll_info *br_netpoll_info(struct net_bridge *br)
{
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 0bdb4eb..5a5b894 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -186,9 +186,9 @@ void br_stp_change_bridge_id(struct net_bridge *br, const unsigned char *addr)

wasroot = br_is_root_bridge(br);

+ br_change_mac_address(br->dev, addr);
memcpy(oldaddr, br->bridge_id.addr, ETH_ALEN);
memcpy(br->bridge_id.addr, addr, ETH_ALEN);
- memcpy(br->dev->dev_addr, addr, ETH_ALEN);

list_for_each_entry(p, &br->port_list, list) {
if (ether_addr_equal(p->designated_bridge.addr, oldaddr))
--
1.7.7.6
Stephen Hemminger
2013-04-19 20:58:00 UTC
Permalink
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
It seems that this bridge with uplink port is just a flavor of macvlan.
The only argument you made for not using macvlan is that user scripts
are expecting bridge API for setup. Which sounds a lot like the original
OVS fake-bridge which was dropped when merged upstream.
Vlad Yasevich
2013-04-19 21:48:02 UTC
Permalink
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
It seems that this bridge with uplink port is just a flavor of macvlan.
The only argument you made for not using macvlan is that user scripts
are expecting bridge API for setup. Which sounds a lot like the original
OVS fake-bridge which was dropped when merged upstream.
No, macvlans have limitations that are not trivial to solve. It isn't a
user script issue. I am not familiar with OVS fake-bridge, but from
what little I've found about it seems tied to handling for specific
vlans. I don't see how these two things are similar.

A bridge with just an Uplink defined and no other config, is still
just a bridge and doesn't do anything special at all.
Once a user adds a static FDB for say a VM that's connected to the
bridge, that's when the new code tries to do something. It will
add the mac of the VM to the bridge, synch it to the uplink and see
if it can turn off promisc on the uplink. If it can, great! We win
in that we now have to look at a lot less traffic. If not, then there
is no gain and no loss.

I can see how you could think that it is macvlan-like, but it's still
a bridge.

-vlad
Stephen Hemminger
2013-04-25 15:56:56 UTC
Permalink
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
Instead of a uplink port, maybe this idea would work better in combination
with another patch I have been working on.

In many bridged environments, ports have only one possible MAC address
on the other side. My patch provides a flag to mark those ports as bound
with only one peer MAC address. This allows those ports to be skipped on
flooding, and for security only packets with that source address would
be allowed.

After that change, your promicious code could just use that flag:
i.e:
uplink ports = total ports - bound ports
if (uplink ports == 1)
enter non-promicious mode
Michael S. Tsirkin
2013-04-25 16:45:27 UTC
Permalink
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
Instead of a uplink port, maybe this idea would work better in combination
with another patch I have been working on.
In many bridged environments, ports have only one possible MAC address
on the other side. My patch provides a flag to mark those ports as bound
with only one peer MAC address. This allows those ports to be skipped on
flooding, and for security only packets with that source address would
be allowed.
uplink ports = total ports - bound ports
if (uplink ports == 1)
enter non-promicious mode
Almost except sometimes (with some guests)
X mac addresses are needed, not just one.

How about a generalization:
- a flag to disable learning per port (only use static entries)
- a flag to disable flood per port
Both sees to exist on openbsd, they are useful by themselves.

Now Vlad's patch can work if both learning and flood are
disabled for all ports except maybe one.
--
MST
Stephen Hemminger
2013-04-25 17:35:57 UTC
Permalink
On Thu, 25 Apr 2013 19:45:27 +0300
Post by Michael S. Tsirkin
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
Instead of a uplink port, maybe this idea would work better in combination
with another patch I have been working on.
In many bridged environments, ports have only one possible MAC address
on the other side. My patch provides a flag to mark those ports as bound
with only one peer MAC address. This allows those ports to be skipped on
flooding, and for security only packets with that source address would
be allowed.
uplink ports = total ports - bound ports
if (uplink ports == 1)
enter non-promicious mode
Almost except sometimes (with some guests)
X mac addresses are needed, not just one.
- a flag to disable learning per port (only use static entries)
- a flag to disable flood per port
Both sees to exist on openbsd, they are useful by themselves.
Now Vlad's patch can work if both learning and flood are
disabled for all ports except maybe one.
Ok, then maybe allow multiple bound MAC address's per guest.
And more flags seems like a good and easy solution to per-port
control. If you have tested code, please submit it.
Michael S. Tsirkin
2013-04-25 21:13:58 UTC
Permalink
Post by Stephen Hemminger
On Thu, 25 Apr 2013 19:45:27 +0300
Post by Michael S. Tsirkin
Post by Stephen Hemminger
On Fri, 19 Apr 2013 16:52:44 -0400
Post by Vlad Yasevich
This series is an almost complete rework of the prior attempt
to make the bridge function in non-promisc mode. In this series
the "promiscuity" of an interface is dynamically determined and
the interface may transition from/to promiscuous mode based on
bridge configuration.
The series keeps an idea of an "uplink" port. That is still user
designated.
The series also adds a concept of "dynamic" bridge port. This is
the default state of the port and means that the user has not
specified any static FDBs for that port.
Once a user has added a static FDB entry to port and also specified
an "uplink" flag for that FDB, the mac address from that FDB is
added to the bridge hw address list and synched down to uplinks.
"Uplinks" are always considered dynamic ports even if a static entry
has been added for them.
Promiscuity is determined by the number of dynamic ports. If there
are no dynamic ports (i.e all ports have static FDBs set), then we
know all the neighbors and can switch promisc off on all of the ports.
If we have only 1 dynamic port and its an uplink, we can synch all
static hw addresses to this port and mark it non-promisc.
If we have more then 1 dynamic port, then all ports have to be
promiscuouse.
This is the algorith that Michael Tsirkin proposed earlier.
Instead of a uplink port, maybe this idea would work better in combination
with another patch I have been working on.
In many bridged environments, ports have only one possible MAC address
on the other side. My patch provides a flag to mark those ports as bound
with only one peer MAC address. This allows those ports to be skipped on
flooding, and for security only packets with that source address would
be allowed.
uplink ports = total ports - bound ports
if (uplink ports == 1)
enter non-promicious mode
Almost except sometimes (with some guests)
X mac addresses are needed, not just one.
- a flag to disable learning per port (only use static entries)
- a flag to disable flood per port
Both sees to exist on openbsd, they are useful by themselves.
Now Vlad's patch can work if both learning and flood are
disabled for all ports except maybe one.
Ok, then maybe allow multiple bound MAC address's per guest.
That's exactly what static fdb entries do, right?
Post by Stephen Hemminger
And more flags seems like a good and easy solution to per-port
control. If you have tested code, please submit it.
Need to put out some other fires, will do afterwards
if Vlad doesn't beat me to it.
--
MST
Stephen Hemminger
2013-05-02 17:23:31 UTC
Permalink
Doing research on another problem, I noticed that this would
break user mode spanning tree (RSTP) code.

The daemon assumes that bridge is promicious mode and therefore
will receive all link-level multicast packets.
Vlad Yasevich
2013-05-02 17:41:05 UTC
Permalink
Post by Stephen Hemminger
Doing research on another problem, I noticed that this would
break user mode spanning tree (RSTP) code.
The daemon assumes that bridge is promicious mode and therefore
will receive all link-level multicast packets.
Just took another look at RSTP code and I see that's its using
a packet socket with filter to catch STP frames.

Would setting IFF_ALL_MULTI solve the problem? STP packets
are all multicast.

Thanks
-vlad
Stephen Hemminger
2013-05-02 18:14:14 UTC
Permalink
On Thu, 02 May 2013 13:41:05 -0400
Post by Vlad Yasevich
Post by Stephen Hemminger
Doing research on another problem, I noticed that this would
break user mode spanning tree (RSTP) code.
The daemon assumes that bridge is promicious mode and therefore
will receive all link-level multicast packets.
Just took another look at RSTP code and I see that's its using
a packet socket with filter to catch STP frames.
Would setting IFF_ALL_MULTI solve the problem? STP packets
are all multicast.
Thanks
-vlad
See opening of the packet.c in RSTP daemon.

/*
* Open up a raw packet socket to catch all 802.2 packets.
* and install a packet filter to only see STP (SAP 42)
*
* Since any bridged devices are already in promiscious mode
* no need to add multicast address.
*/
int packet_sock_init(void)
{
int s;
struct sock_fprog prog = {
.len = sizeof(stp_filter) / sizeof(stp_filter[0]),
.filter = stp_filter,
};

s = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_802_2));
if (s < 0) {
ERROR("socket failed: %m");
return -1;
}

if (setsockopt(s, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog)) < 0)
ERROR("setsockopt packet filter failed: %m");
Michael S. Tsirkin
2013-05-02 18:37:58 UTC
Permalink
Post by Stephen Hemminger
Doing research on another problem, I noticed that this would
break user mode spanning tree (RSTP) code.
The daemon assumes that bridge is promicious mode and therefore
will receive all link-level multicast packets.
Since you have to play with the new learning/flooding
options to get the non promisc behaviour,
can't we just say "don't do it then"
short term, and fix applcations long term?
Stephen Hemminger
2013-05-02 21:16:26 UTC
Permalink
On Thu, 2 May 2013 21:37:58 +0300
Post by Michael S. Tsirkin
Post by Stephen Hemminger
Doing research on another problem, I noticed that this would
break user mode spanning tree (RSTP) code.
The daemon assumes that bridge is promicious mode and therefore
will receive all link-level multicast packets.
Since you have to play with the new learning/flooding
options to get the non promisc behaviour,
can't we just say "don't do it then"
short term, and fix applcations long term?
It will work as long as non-promiscious mode turns on all-multicast
in driver.

Loading...