Discussion:
[PATCH net-next 0/5] Switchdev: flooding offload option
(too old to reply)
Igor Mitsyanko
2018-03-10 03:03:03 UTC
Permalink
Main goal of the patchset is to allow for flooding offload to switchdev
in scenarios when HW switchdev ports and SW ports are operating in a
single bridge.

In case a data frame that needs to be flooded is ingressed on a SW port,
it needs to be flooded to a single HW port of any given
switchdev-capable hardware only. Switchdev hardware will than take care
about flooding to the rest of the ports that it manages.


Igor Mitsyanko (5):
bridge: initialize port flags with switchdev defaults
bridge: propagate BR_ flags updates through sysfs to switchdev
bridge: allow switchdev port to handle flooding by itself
bridge: provide sysfs and netlink interface to set BR_FLOOD_OFFLOAD
bridge: verify "HW only" flags can't be set without HW support

include/linux/if_bridge.h | 1 +
include/uapi/linux/if_link.h | 1 +
net/bridge/br_forward.c | 2 ++
net/bridge/br_if.c | 17 ++++++++++++++++-
net/bridge/br_netlink.c | 8 +++++++-
net/bridge/br_private.h | 13 ++++++++++++-
net/bridge/br_switchdev.c | 14 +++++++++++++-
net/bridge/br_sysfs_if.c | 17 +++++++++++++----
8 files changed, 65 insertions(+), 8 deletions(-)
--
2.9.5
Igor Mitsyanko
2018-03-10 03:03:05 UTC
Permalink
sysfs interface to configure bridge flags only updates SW copy but does
not notify hardware through switchdev interface. Make sure it is.

Signed-off-by: Igor Mitsyanko <***@quantenna.com>
---
net/bridge/br_sysfs_if.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 126a8ea..9bdd177 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -51,6 +51,7 @@ static int store_flag(struct net_bridge_port *p, unsigned long v,
unsigned long mask)
{
unsigned long flags;
+ int err;

flags = p->flags;

@@ -59,10 +60,16 @@ static int store_flag(struct net_bridge_port *p, unsigned long v,
else
flags &= ~mask;

- if (flags != p->flags) {
- p->flags = flags;
- br_port_flags_change(p, mask);
- }
+ if (flags == p->flags)
+ return 0;
+
+ err = br_switchdev_set_port_flag(p, flags, mask);
+ if (err)
+ return err;
+
+ p->flags = flags;
+ br_port_flags_change(p, mask);
+
return 0;
}
--
2.9.5
Andrew Lunn
2018-03-10 16:38:38 UTC
Permalink
Post by Igor Mitsyanko
sysfs interface to configure bridge flags only updates SW copy but does
not notify hardware through switchdev interface. Make sure it is.
---
net/bridge/br_sysfs_if.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 126a8ea..9bdd177 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -51,6 +51,7 @@ static int store_flag(struct net_bridge_port *p, unsigned long v,
unsigned long mask)
{
unsigned long flags;
+ int err;
flags = p->flags;
@@ -59,10 +60,16 @@ static int store_flag(struct net_bridge_port *p, unsigned long v,
else
flags &= ~mask;
- if (flags != p->flags) {
- p->flags = flags;
- br_port_flags_change(p, mask);
- }
+ if (flags == p->flags)
+ return 0;
+
+ err = br_switchdev_set_port_flag(p, flags, mask);
+ if (err)
+ return err;
You might want to consider the br_warn() in
br_switchdev_set_port_flag(). Do we want to spam the kernel log? Or
should store_flag() do some validation before calling
br_switchdev_set_port_flag()?

Andrew
Igor Mitsyanko
2018-03-12 20:07:40 UTC
Permalink
Post by Andrew Lunn
+ return 0;
+
+ err = br_switchdev_set_port_flag(p, flags, mask);
+ if (err)
+ return err;
You might want to consider the br_warn() in
br_switchdev_set_port_flag(). Do we want to spam the kernel log? Or
should store_flag() do some validation before calling
br_switchdev_set_port_flag()?
Andrew
Is there any convention for that in Linux? While I would agree that
simply returning a error code is sufficient in this case, another user
of br_switchdev_set_port_flag() is a netlink interface, aren't they
supposed to be an equivalent? That is, if netlink prints into kernel
log, sysfs should do that too?
Igor Mitsyanko
2018-03-10 03:03:08 UTC
Permalink
Setting bridge flag BR_FLOOD_OFFLOAD only makes sense if underlying
port hardware advertises support for it. Make sure kernel checks that
condition before allowing to update the flag value.

Signed-off-by: Igor Mitsyanko <***@quantenna.com>
---
net/bridge/br_private.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index a6d2f2b..6d2f8b1 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -50,6 +50,9 @@ enum {
/* Path to usermode spanning tree program */
#define BR_STP_PROG "/sbin/bridge-stp"

+/* Flags that can only be set if HW supports it */
+#define BR_PORT_FLAGS_HW_ONLY BR_FLOOD_OFFLOAD
+
typedef struct bridge_id bridge_id;
typedef struct mac_addr mac_addr;
typedef __u16 port_id;
@@ -1150,7 +1153,7 @@ static inline int br_switchdev_set_port_flag(struct net_bridge_port *p,
unsigned long flags,
unsigned long mask)
{
- return 0;
+ return mask & BR_PORT_FLAGS_HW_ONLY ? -EOPNOTSUPP : 0;
}

static inline void
--
2.9.5
Igor Mitsyanko
2018-03-10 03:03:04 UTC
Permalink
Default bridge port flags for switchdev devices can be different from
what is used in bridging core. Get default value from switchdev itself
on port initialization.

Signed-off-by: Igor Mitsyanko <***@quantenna.com>
---
net/bridge/br_if.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 9ba4ed6..d658b55 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -342,6 +342,21 @@ static int find_portno(struct net_bridge *br)
return (index >= BR_MAX_PORTS) ? -EXFULL : index;
}

+static unsigned long nbp_flags_get_default(struct net_device *dev)
+{
+ struct switchdev_attr attr = {
+ .orig_dev = dev,
+ .id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS,
+ };
+ int ret;
+
+ ret = switchdev_port_attr_get(dev, &attr);
+ if (ret)
+ return BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+ else
+ return attr.u.brport_flags;
+}
+
/* called with RTNL but without bridge lock */
static struct net_bridge_port *new_nbp(struct net_bridge *br,
struct net_device *dev)
@@ -363,7 +378,7 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
p->path_cost = port_cost(dev);
p->priority = 0x8000 >> BR_PORT_BITS;
p->port_no = index;
- p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+ p->flags = nbp_flags_get_default(dev);
br_init_port(p);
br_set_state(p, BR_STATE_DISABLED);
br_stp_port_timer_init(p);
--
2.9.5
Stephen Hemminger
2018-03-10 16:32:34 UTC
Permalink
On Fri, 9 Mar 2018 19:03:04 -0800
Post by Igor Mitsyanko
Default bridge port flags for switchdev devices can be different from
what is used in bridging core. Get default value from switchdev itself
on port initialization.
---
Yes hardware devices may come it with different default values.
But the user experience on Linux needs to be consistent.

Instead, it makes more sense to me for each device driver using switchdev
to program to the values that it sees in the bridge.

Please revise this.
Igor Mitsyanko
2018-03-12 19:03:10 UTC
Permalink
Post by Stephen Hemminger
Yes hardware devices may come it with different default values.
But the user experience on Linux needs to be consistent.
Instead, it makes more sense to me for each device driver using switchdev
to program to the values that it sees in the bridge.
Please revise this.
Right, it does change default user view, setting flags instead of
querying makes more sense then.

However there is a problem that switchdev may not support all flags that
bridge code sets by default. For example, looking at
spectrum_switchdev.c, it only advertises support for BR_LEARNING |
BR_FLOOD | BR_MCAST_FLOOD.
Currently, I assume that even though some flags are shown as set by
default on a new bridge port, some of them are not actually working in
switchdev?
Andrew Lunn
2018-03-10 16:30:50 UTC
Permalink
Post by Igor Mitsyanko
Default bridge port flags for switchdev devices can be different from
what is used in bridging core. Get default value from switchdev itself
on port initialization.
---
net/bridge/br_if.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 9ba4ed6..d658b55 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -342,6 +342,21 @@ static int find_portno(struct net_bridge *br)
return (index >= BR_MAX_PORTS) ? -EXFULL : index;
}
+static unsigned long nbp_flags_get_default(struct net_device *dev)
+{
+ struct switchdev_attr attr = {
+ .orig_dev = dev,
+ .id = SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS,
+ };
+ int ret;
+
+ ret = switchdev_port_attr_get(dev, &attr);
+ if (ret)
+ return BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
Hi Igor

Please check if ret == -EOPNOTSUPP and only then use the defaults. A
real error should be propagated, causing new_nbp to fail. You might
also consider what to do when ENODATA is returned.

Andrew
Igor Mitsyanko
2018-03-12 18:44:14 UTC
Permalink
Post by Andrew Lunn
+
+ ret = switchdev_port_attr_get(dev, &attr);
+ if (ret)
+ return BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
Hi Igor
Please check if ret == -EOPNOTSUPP and only then use the defaults. A
real error should be propagated, causing new_nbp to fail. You might
also consider what to do when ENODATA is returned.
Andrew
Hi Andrew,
ok, will change it so an error is propagated.
There is one more comment from Stephen suggesting that flags must be set
in switchdev, rather then queried: this approach should take care about
ENODATA I assume.
Igor Mitsyanko
2018-03-10 03:03:07 UTC
Permalink
Allow to modify BR_FLOOD_OFFLOAD flag value through both sysfs and
netlink interface.

Signed-off-by: Igor Mitsyanko <***@quantenna.com>
---
include/uapi/linux/if_link.h | 1 +
net/bridge/br_netlink.c | 8 +++++++-
net/bridge/br_sysfs_if.c | 2 ++
3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 11d0c0e..5d3eb7f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -333,6 +333,7 @@ enum {
IFLA_BRPORT_BCAST_FLOOD,
IFLA_BRPORT_GROUP_FWD_MASK,
IFLA_BRPORT_NEIGH_SUPPRESS,
+ IFLA_BRPORT_FLOOD_OFFLOAD,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 015f465c..26f7c51 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -213,7 +213,9 @@ static int br_port_fill_attrs(struct sk_buff *skb,
BR_VLAN_TUNNEL)) ||
nla_put_u16(skb, IFLA_BRPORT_GROUP_FWD_MASK, p->group_fwd_mask) ||
nla_put_u8(skb, IFLA_BRPORT_NEIGH_SUPPRESS,
- !!(p->flags & BR_NEIGH_SUPPRESS)))
+ !!(p->flags & BR_NEIGH_SUPPRESS)) ||
+ nla_put_u8(skb, IFLA_BRPORT_FLOOD_OFFLOAD,
+ !!(p->flags & BR_FLOOD_OFFLOAD)))
return -EMSGSIZE;

timerval = br_timer_value(&p->message_age_timer);
@@ -763,6 +765,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
if (err)
return err;

+ err = br_set_port_flag(p, tb, IFLA_BRPORT_FLOOD_OFFLOAD, BR_FLOOD_OFFLOAD);
+ if (err)
+ return err;
+
if (br_vlan_tunnel_old && !(p->flags & BR_VLAN_TUNNEL))
nbp_vlan_tunnel_info_flush(p);

diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 9bdd177..3be600d 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -199,6 +199,7 @@ BRPORT_ATTR_FLAG(proxyarp_wifi, BR_PROXYARP_WIFI);
BRPORT_ATTR_FLAG(multicast_flood, BR_MCAST_FLOOD);
BRPORT_ATTR_FLAG(broadcast_flood, BR_BCAST_FLOOD);
BRPORT_ATTR_FLAG(neigh_suppress, BR_NEIGH_SUPPRESS);
+BRPORT_ATTR_FLAG(flood_offload, BR_FLOOD_OFFLOAD);

#ifdef CONFIG_BRIDGE_IGMP_SNOOPING
static ssize_t show_multicast_router(struct net_bridge_port *p, char *buf)
@@ -250,6 +251,7 @@ static const struct brport_attribute *brport_attrs[] = {
&brport_attr_broadcast_flood,
&brport_attr_group_fwd_mask,
&brport_attr_neigh_suppress,
+ &brport_attr_flood_offload,
NULL
};
--
2.9.5
Igor Mitsyanko
2018-03-10 03:03:06 UTC
Permalink
Introduce BR_FLOOD_OFFLOAD bridge port flag that can be used by
switchdev-capable hardware to advertize that it wants to handle all
flooding by itself.
In that case there is no need for a driver to set skb::offload_fwd_mark
on each offloaded packet as it is implied by BR_FLOOD_OFFLOAD bridge
port flag.

BR_FLOOD_OFFLOAD port flags helps in two scenarios:
1. Mixed bridge configuration with SW ports and switchdev-capable ports.
In case a data frame that needs to be flooded is ingressed on a SW port,
it needs to be flooded to a single HW port of any given
switchdev-capable hardware only. Switchdev hardware will than take care
about flooding to the rest of the ports that it manages.
2. Switch driver does not have to identify frames that were flooded by
hardware and explicitly mark them with skb::offload_fwd_mark. Assuming
it knows that hardware will always handle flooding by itself, it can
simply advertise BR_FLOOD_OFFLOAD port flag.

Note: current implementation can only handle a single switchdev-capable
device in a single port. Frame will be flooded as usual to all ports of
any additional switchdev present in a given bridge.

Signed-off-by: Igor Mitsyanko <***@quantenna.com>
---
include/linux/if_bridge.h | 1 +
net/bridge/br_forward.c | 2 ++
net/bridge/br_private.h | 8 ++++++++
net/bridge/br_switchdev.c | 14 +++++++++++++-
4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 02639eb..5d0e277 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -50,6 +50,7 @@ struct br_ip_list {
#define BR_VLAN_TUNNEL BIT(13)
#define BR_BCAST_FLOOD BIT(14)
#define BR_NEIGH_SUPPRESS BIT(15)
+#define BR_FLOOD_OFFLOAD BIT(16)

#define BR_DEFAULT_AGEING_TIME (300 * HZ)

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index b4eed11..ac761a9 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -163,6 +163,8 @@ static struct net_bridge_port *maybe_deliver(
if (!should_deliver(p, skb))
return prev;

+ nbp_switchdev_offload_fwd_track(p, skb);
+
if (!prev)
goto out;

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 8e13a64..a6d2f2b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -1111,6 +1111,8 @@ void nbp_switchdev_frame_mark(const struct net_bridge_port *p,
struct sk_buff *skb);
bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
const struct sk_buff *skb);
+void nbp_switchdev_offload_fwd_track(const struct net_bridge_port *p,
+ struct sk_buff *skb);
int br_switchdev_set_port_flag(struct net_bridge_port *p,
unsigned long flags,
unsigned long mask);
@@ -1138,6 +1140,12 @@ static inline bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
return true;
}

+static inline void
+nbp_switchdev_offload_fwd_track(const struct net_bridge_port *p,
+ struct sk_buff *skb)
+{
+}
+
static inline int br_switchdev_set_port_flag(struct net_bridge_port *p,
unsigned long flags,
unsigned long mask)
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index ee775f4..aee3c01 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -46,6 +46,9 @@ int nbp_switchdev_mark_set(struct net_bridge_port *p)
void nbp_switchdev_frame_mark(const struct net_bridge_port *p,
struct sk_buff *skb)
{
+ if (p->flags & BR_FLOOD_OFFLOAD)
+ skb->offload_fwd_mark = 1;
+
if (skb->offload_fwd_mark && !WARN_ON_ONCE(!p->offload_fwd_mark))
BR_INPUT_SKB_CB(skb)->offload_fwd_mark = p->offload_fwd_mark;
}
@@ -57,8 +60,17 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
BR_INPUT_SKB_CB(skb)->offload_fwd_mark != p->offload_fwd_mark;
}

+void nbp_switchdev_offload_fwd_track(const struct net_bridge_port *p,
+ struct sk_buff *skb)
+{
+ if (skb->offload_fwd_mark || !(p->flags & BR_FLOOD_OFFLOAD))
+ return;
+
+ nbp_switchdev_frame_mark(p, skb);
+}
+
/* Flags that can be offloaded to hardware */
-#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | \
+#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | BR_FLOOD_OFFLOAD | \
BR_MCAST_FLOOD | BR_BCAST_FLOOD)

int br_switchdev_set_port_flag(struct net_bridge_port *p,
--
2.9.5
Andrew Lunn
2018-03-10 16:55:40 UTC
Permalink
Post by Igor Mitsyanko
Introduce BR_FLOOD_OFFLOAD bridge port flag that can be used by
switchdev-capable hardware to advertize that it wants to handle all
flooding by itself.
In that case there is no need for a driver to set skb::offload_fwd_mark
on each offloaded packet as it is implied by BR_FLOOD_OFFLOAD bridge
port flag.
Is this sufficiently granular? There are a few different use cases for
flooding:

There is no fdb entry in the software switch for the destination MAC
address, so flood the packet out all ports of the bridge. The hardware
switch might have an entry in its fdb to the destination switch, so it
could unicast out the correct hardware port. If not, it should flood
the packet.

A point to remember here, the software switch and the hardware switch
can have different forwarding data bases.

A broadcast packet. Send it out all ports.

A multicast packet. If the hardware switch is capable of IGMP
snooping, it could have FDB entries indicating which ports it should
send the frame out of, and which is should not. Otherwise it needs to
flood.

Is one flag sufficient for all of these, and any other use cases i
might of missed?

As far as DSA switches go, i don't know of any of them which could
implement anything like this, so BR_FLOOD_OFFLOAD will never be
set. But maybe some of the TOR switches supported by switchdev can do
some of these, and not others....

Andrew
Igor Mitsyanko
2018-03-12 23:00:14 UTC
Permalink
Post by Andrew Lunn
Is this sufficiently granular? There are a few different use cases for
There is no fdb entry in the software switch for the destination MAC
address, so flood the packet out all ports of the bridge. The hardware
switch might have an entry in its fdb to the destination switch, so it
could unicast out the correct hardware port. If not, it should flood
the packet.
A point to remember here, the software switch and the hardware switch
can have different forwarding data bases.
A broadcast packet. Send it out all ports.
A multicast packet. If the hardware switch is capable of IGMP
snooping, it could have FDB entries indicating which ports it should
send the frame out of, and which is should not. Otherwise it needs to
flood.
Is one flag sufficient for all of these, and any other use cases i
might of missed?
As far as DSA switches go, i don't know of any of them which could
implement anything like this, so BR_FLOOD_OFFLOAD will never be
set. But maybe some of the TOR switches supported by switchdev can do
some of these, and not others....
Andrew
The flag was introduced to enable hardware switch capabilities of
drivers/net/wireless/quantenna/qtnfmac wifi driver. It does not have any
switchdev functionality in upstream tree at this moment, and this
patchset was intended as a preparatory change.

qtnfmac driver provides several physical radios (5 GHz and 2.4 GHz),
each can have up to 8 virtual network interfaces. These interfaces can
be bridged together in various configurations, and I'm trying to figure
out what is the most efficient way to handle it from bridging perspective.
My assumption was that software FDB and hardware FDB should always be in
sync with each other. I guess it is a safe assumption if handled
correctly? Hardware should send a notification for each new FDB it has
learned, and switchdev driver should process FDB notifications from
software bridge.

qtnfmac hardware has its own memory and maintains FWT table, so for the
best efficiency forwarding between virtual interfaces should be handled
locally. Qtnfmac can handle all the mentioned flooding by itself:
- unknown unicasts
- broadcast and unknown multicast
- known multicasts (does have IGMP snooping)
- can do multicast-to-unicast translation if required.

The most important usecase IMO is a muticast transmission, specific
example being:
- 2.4GHz x 8 and 5GHz x 8 virtual wifi interfaces, bridged with backbone
ethernet interface in Linux
- multicast video streaming from a server behind ethernet
- multicast clients connected to some wifi interfaces

Best way to process this should be to handle multicasting locally in
wifi firmware:
- SW bridge in Linux will send a multicast frame to a single virtual
WiFi interface.
- WiFi firmware will forward/flood frames to all intended recipients
locally.

BR_FLOOD_OFFLOAD flag is intended to address this case in particular,
perhaps there are better ways to do that?
In a broader sense it is a way for hardware to tell that it will handle
all flooding by itself, so there is no granularity in this. I'm not sure
granularity is needed though, as there may not be much sense to do only
some types of flooding and not to do others?
Andrew Lunn
2018-03-13 01:11:40 UTC
Permalink
Post by Igor Mitsyanko
The flag was introduced to enable hardware switch capabilities of
drivers/net/wireless/quantenna/qtnfmac wifi driver. It does not have any
switchdev functionality in upstream tree at this moment, and this patchset
was intended as a preparatory change.
O.K. But i suggest you add basic switchdev support first. Then think
about adding new functionality. That way you can learn more about
switchdev, and we can learn more about your hardware.
Post by Igor Mitsyanko
qtnfmac driver provides several physical radios (5 GHz and 2.4 GHz), each
can have up to 8 virtual network interfaces. These interfaces can be bridged
together in various configurations, and I'm trying to figure out what is the
most efficient way to handle it from bridging perspective.
I think the first thing to do is get this part correctly represented
by switchdev. I don't think any of us maintainers have thought about
how wireless and switchdev can be combined. The wifi model seems to be
one phy device, with multiple MACs running on top of it, with each MAC
being a single SSID. So is it one SSID per virtual interface? Or are
your virtual network interfaces actually virtual phys in the wireless
model, and you can have multiple MACs on top of each virtual phy?
Post by Igor Mitsyanko
My assumption was that software FDB and hardware FDB should always
be in sync with each other. I guess it is a safe assumption if
handled correctly? Hardware should send a notification for each new
FDB it has learned, and switchdev driver should process FDB
notifications from software bridge.
No, you cannot make this assumption. Take the example of DSA
switches. They are generally connected over an MDIO bus, or an SPI
bus. The bandwidth is small. How long do you think it takes the
hardware to learn 8K MAC addresses with 5x 1Gbps ports receiving 64
byte packets? DSA drivers have no way of keeping up with the
hardware. And there is no need to. Everything works fine with the SW
and the HW bridge having different dynamic FDB entries.

I don't even think your hardware will have the hardware and software
in sync. How fast can your hardware learn new addresses? 'Line' rate?
Or do you prevent the hardware learning a new address until the
software bridge has confirmed it has learnt the previous new address?
Post by Igor Mitsyanko
qtnfmac hardware has its own memory and maintains FWT table, so for the best
efficiency forwarding between virtual interfaces should be handled locally.
- unknown unicasts
- broadcast and unknown multicast
- known multicasts (does have IGMP snooping)
- can do multicast-to-unicast translation if required.
The most important usecase IMO is a muticast transmission, specific example
- 2.4GHz x 8 and 5GHz x 8 virtual wifi interfaces, bridged with backbone
ethernet interface in Linux
- multicast video streaming from a server behind ethernet
- multicast clients connected to some wifi interfaces
I agree this makes sense. But we need to ensure the solution is
generic, not something which just works for your hardware/firmware. I
know somebody who would love to be able to do something like this with
DSA drivers. They would probably sacrifice IGMP snooping and just
flood everywhere, if that is all the hardware could do. But so far,
i've not been able to figure out a way to do this.

Andrew
Roopa Prabhu
2018-03-13 14:41:42 UTC
Permalink
Post by Andrew Lunn
Post by Igor Mitsyanko
The flag was introduced to enable hardware switch capabilities of
drivers/net/wireless/quantenna/qtnfmac wifi driver. It does not have any
switchdev functionality in upstream tree at this moment, and this patchset
was intended as a preparatory change.
O.K. But i suggest you add basic switchdev support first. Then think
about adding new functionality. That way you can learn more about
switchdev, and we can learn more about your hardware.
Post by Igor Mitsyanko
qtnfmac driver provides several physical radios (5 GHz and 2.4 GHz), each
can have up to 8 virtual network interfaces. These interfaces can be bridged
together in various configurations, and I'm trying to figure out what is the
most efficient way to handle it from bridging perspective.
I think the first thing to do is get this part correctly represented
by switchdev. I don't think any of us maintainers have thought about
how wireless and switchdev can be combined. The wifi model seems to be
one phy device, with multiple MACs running on top of it, with each MAC
being a single SSID. So is it one SSID per virtual interface? Or are
your virtual network interfaces actually virtual phys in the wireless
model, and you can have multiple MACs on top of each virtual phy?
Post by Igor Mitsyanko
My assumption was that software FDB and hardware FDB should always
be in sync with each other. I guess it is a safe assumption if
handled correctly? Hardware should send a notification for each new
FDB it has learned, and switchdev driver should process FDB
notifications from software bridge.
No, you cannot make this assumption. Take the example of DSA
switches. They are generally connected over an MDIO bus, or an SPI
bus. The bandwidth is small. How long do you think it takes the
hardware to learn 8K MAC addresses with 5x 1Gbps ports receiving 64
byte packets? DSA drivers have no way of keeping up with the
hardware. And there is no need to. Everything works fine with the SW
and the HW bridge having different dynamic FDB entries.
I don't even think your hardware will have the hardware and software
in sync. How fast can your hardware learn new addresses? 'Line' rate?
Or do you prevent the hardware learning a new address until the
software bridge has confirmed it has learnt the previous new address?
Post by Igor Mitsyanko
qtnfmac hardware has its own memory and maintains FWT table, so for the best
efficiency forwarding between virtual interfaces should be handled locally.
- unknown unicasts
- broadcast and unknown multicast
- known multicasts (does have IGMP snooping)
- can do multicast-to-unicast translation if required.
The most important usecase IMO is a muticast transmission, specific example
- 2.4GHz x 8 and 5GHz x 8 virtual wifi interfaces, bridged with backbone
ethernet interface in Linux
- multicast video streaming from a server behind ethernet
- multicast clients connected to some wifi interfaces
I agree this makes sense. But we need to ensure the solution is
generic, not something which just works for your hardware/firmware. I
know somebody who would love to be able to do something like this with
DSA drivers. They would probably sacrifice IGMP snooping and just
flood everywhere, if that is all the hardware could do. But so far,
i've not been able to figure out a way to do this.
I concur with Andrews thoughts here: We already have enough switchdev
learning and flooding control.
More fine tuning can be handled at the driver layer. This solution
tries to bypass some of that and adds a new
infrastructure to control flooding in hw. And I am also afraid that
the use of this flag will propagate to
more places in the bridge driver. If none of the existing mechanisms
work, then yes, we can probably revise this
series into something generic for other switchdev users to use as well.
Andrew Lunn
2018-03-10 22:08:50 UTC
Permalink
Post by Igor Mitsyanko
Main goal of the patchset is to allow for flooding offload to switchdev
in scenarios when HW switchdev ports and SW ports are operating in a
single bridge.
In case a data frame that needs to be flooded is ingressed on a SW port,
it needs to be flooded to a single HW port of any given
switchdev-capable hardware only. Switchdev hardware will than take care
about flooding to the rest of the ports that it manages.
Hi Igor

You don't appear to be adding any user of this. Please also make one
of the switchdev drivers actually use this new functionality.

Andrew
Igor Mitsyanko
2018-03-12 23:08:27 UTC
Permalink
Post by Andrew Lunn
Hi Igor
You don't appear to be adding any user of this. Please also make one
of the switchdev drivers actually use this new functionality.
Andrew
Hi Andrew, a first user is supposed to be
drivers/net/wireless/quantenna/qtnfmac wifi driver. I will merge the
patchset with another one that modifies qtnfmac driver (so that it has
at least one user) and resend.

However RFC portion is still under discussion, would be good to
understand if this approach is at all acceptable.
Loading...