Age | Commit message (Collapse) | Author |
|
When the Felix driver would probe the ports and verify functionality, it
would fail if it hit single port mode that wasn't supported by the driver.
The initial case for the VSC7512 driver will have physical ports that
exist, but aren't supported by the driver implementation. Add the
OCELOT_PORT_MODE_NONE macro to handle this scenario, and allow the Felix
driver to continue with all the ports that are currently functional.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The architecture around the VSC7512 differs from existing felix drivers. In
order to add support for all the chip's features (pinctrl, MDIO, gpio) the
device had to be laid out as a multi-function device (MFD).
One difference between an MFD and a standard platform device is that the
regmaps are allocated to the parent device before the child devices are
probed. As such, there is no need for felix to initialize new regmaps in
these configurations, they can simply be requested from the parent device.
Add support for MFD configurations by performing this request from the
parent device.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The define FELIX_MAC_QUIRKS was used directly in the felix.c shared driver.
Other devices (VSC7512 for example) don't require the same quirks, so they
need to be configured on a per-device basis.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The build system currently complains:
scripts/Makefile.build:252: drivers/net/dsa/ocelot/Makefile:
felix.o is added to multiple modules: mscc_felix mscc_seville
Since felix.c holds the DSA glue layer, create a mscc_felix_dsa_lib.ko.
This is similar to how mscc_ocelot_switch_lib.ko holds a library for
configuring the hardware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230125145716.271355-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers
(ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and
pMAC statistics are already in the Ocelot switch lib even if just Felix
supports them, I'm adding support for the whole MAC Merge layer in the
common Ocelot library too.
There is an interrupt (shared with the PTP interrupt) which signals
changes to the MM verification state. This is done because the
preemptible traffic classes should be committed to hardware only once
the verification procedure has declared the link partner of being
capable of receiving preemptible frames.
We implement ethtool getters and setters for the MAC Merge layer state.
The "TX enabled" and "verify status" are taken from the IRQ handler,
using a mutex to ensure serialized access.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Felix VSC9959 switch supports frame preemption and has a MAC Merge
layer. In addition to the structured stats that exist for the eMAC,
export the counters associated with its pMAC (pause, RMON, MAC, PHY,
control) plus the high-level MAC Merge layer stats. The unstructured
ethtool counters, as well as the rtnl_link_stats64 were left to report
only the eMAC counters.
Because statistics processing is quite self-contained in ocelot_stats.c
now, I've opted for introducing an ocelot->mm_supported bool, based on
which the common switch lib does everything, rather than pushing the
TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ever since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat
definitions between all drivers") the stats_layout entry in ocelot and
felix drivers have become redundant. Remove the unnecessary code.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Drop the custom implementation of phylink_validate() in favor of the
generic one, which requires config->mac_capabilities to be set.
This was used up until now because of the possibility of being paired
with Aquantia PHYs with support for rate matching. The phylink framework
gained generic support for these, and knows to advertise all 10/100/1000
lower speed link modes when our SERDES protocol is 2500base-x
(fixed speed).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were
integrated in NXP SoCs, which makes them a bit unusual compared to the
usual Microchip branded Ocelot switches.
To be precise, looking at
Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can
see 21 memory regions for the "switch" node, and these correspond to the
"targets" of the switch IP, which are spread throughout the guts of that
SoC's memory space.
In NXP integrations, those targets still exist, but they were condensed
within a single memory region, with no other peripheral in between them,
so it made more sense for the driver to ioremap the entire memory space
of the switch, and then find the targets within that memory space via
some offsets hardcoded in the driver.
The effect of this design decision is that now, the felix driver expects
hardware instantiations to provide their own resource definitions, which
is kind of odd when considering a typical device (those are retrieved
from 'reg' properties in the device tree, using platform_get_resource()
or similar).
Allow other hardware instantiations that share the felix driver to not
provide a hardcoded array of resources in the future. Instead, make the
common denominator based on which regmaps are created be just the
resource "names". Each instantiation comes with its own array of names
that are mandatory for it, and with an optional array of resources.
So we split the resources in 2 arrays, one is what's requested and the
other is what's provided. There is one pool of provided resources, in
felix->info->resources (of length felix->info->num_resources). There are
2 different ways of requesting a resource. One is by enum ocelot_target
(this handles the global regmaps), and one is by int port (this handles
the per-port ones).
For the existing vsc9959 and vsc9953, it would be a bit stupid to
request something that's not provided, given that the 2 arrays are both
defined in the same place.
The advantage is that we can now modify felix_request_regmap_by_name()
to make felix->info->resources[] optional, and if absent, the
implementation can call dev_get_regmap() and this is something that is
compatible with MFD.
Co-developed-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use less verbose resource definitions in vsc9959 and vsc9953. This also
sets IORESOURCE_MEM in the constant array of resources, so we don't have
to do this from felix_init_structs() - in fact, in the future, we may
even support IORESOURCE_REG resources.
Note that this macro takes start and length as argument, and we had
start and end before. So transform end into length.
While at it, sort the resources according to their offset.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It turns out that the idea of having a customizable implementation of a
regmap creation from a resource is not exactly useful. The idea was for
the new MFD-based VSC7512 driver to use something that creates a SPI
regmap from a resource. But there are problems in actually getting those
resources (it involves getting them from MFD).
To avoid all that, we'll be getting resources by name, so this custom
init_regmap() method won't be needed. Remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Changing the DSA master means different things depending on the tagging
protocol in use.
For NPI mode ("ocelot" and "seville"), there is a single port which can
be configured as NPI, but DSA only permits changing the CPU port
affinity of user ports one by one. So changing a user port to a
different NPI port globally changes what the NPI port is, and breaks the
user ports still using the old one.
To address this while still permitting the change of the NPI port,
require that the user ports which are still affine to the old NPI port
are down, and cannot be brought up until they are all affine to the same
NPI port.
The tag_8021q mode ("ocelot-8021q") is more flexible, in that each user
port can be freely assigned to one CPU port or to the other. This works
by filtering host addresses towards both tag_8021q CPU ports, and then
restricting the forwarding from a certain user port only to one of the
two tag_8021q CPU ports.
Additionally, the 2 tag_8021q CPU ports can be placed in a LAG. This
works by enabling forwarding via PGID_SRC from a certain user port
towards the logical port ID containing both tag_8021q CPU ports, but
then restricting forwarding per packet, via the LAG hash codes in
PGID_AGGR, to either one or the other.
When we change the DSA master to a LAG device, DSA guarantees us that
the LAG has at least one lower interface as a physical DSA master.
But DSA masters can come and go as lowers of that LAG, and
ds->ops->port_change_master() will not get called, because the DSA
master is still the same (the LAG). So we need to hook into the
ds->ops->port_lag_{join,leave} calls on the CPU ports and update the
logical port ID of the LAG that user ports are assigned to.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Drivers could refuse to offload a LAG configuration for a variety of
reasons, mainly having to do with its TX type. Additionally, since DSA
masters may now also be LAG interfaces, and this will translate into a
call to port_lag_join on the CPU ports, there may be extra restrictions
there. Propagate the netlink extack to this DSA method in order for
drivers to give a meaningful error message back to the user.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
present in DSA
DSA is integrated with the new standardized ethtool -S --groups option,
but the felix driver only exports unstructured statistics.
Reuse the array of 64-bit statistics collected by ocelot_check_stats_work(),
but just export select values from it.
Since ocelot_check_stats_work() runs periodically to avoid 32-bit
overflow, and the ethtool calling context is sleepable, we update the
64-bit stats one more time, to provide up-to-date values. The locking
scheme with a mutex followed by a spinlock is a bit hard to digest, so
we create and use a ocelot_port_stats_run() helper with a callback that
populates the ethool stats group the caller is interested in.
The exported stats are:
ethtool -S swp0 --groups eth-phy
ethtool -S swp0 --groups eth-mac
ethtool -S swp0 --groups eth-ctrl
ethtool -S swp0 --groups rmon
ethtool --include-statistics --show-pause swp0
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the logic from the ocelot switchdev driver's ocelot_get_stats64()
method to the common switch lib and reuse it for the DSA driver.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a partial revert of commit c295f9831f1d ("net: mscc: ocelot:
switch from {,un}set to {,un}assign for tag_8021q CPU ports"), because
as it turns out, this isn't how tag_8021q CPU ports under a LAG are
supposed to work.
Under that scenario, all user ports are "assigned" to the single
tag_8021q CPU port represented by the logical port corresponding to the
bonding interface. So one CPU port in a LAG would have is_dsa_8021q_cpu
set to true (the one whose physical port ID is equal to the logical port
ID), and the other one to false.
In turn, this makes 2 undesirable things happen:
(1) PGID_CPU contains only the first physical CPU port, rather than both
(2) only the first CPU port will be added to the private VLANs used by
ocelot for VLAN-unaware bridging
To make the driver behave in the same way for both bonded CPU ports, we
need to bring back the old concept of setting up a port as a tag_8021q
CPU port, and this is what deals with VLAN membership and PGID_CPU
updating. But we also need the CPU port "assignment" (the user to CPU
port affinity), and this is what updates the PGID_SRC forwarding rules.
All DSA CPU ports are statically configured for tag_8021q mode when the
tagging protocol is changed to ocelot-8021q. User ports are "assigned"
to one CPU port or the other dynamically (this will be handled by a
future change).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The way in which dsa_tree_change_tag_proto() works is that when
dsa_tree_notify() fails, it doesn't know whether the operation failed
mid way in a multi-switch tree, or it failed for a single-switch tree.
So even though drivers need to fail cleanly in
ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
again, to restore the old tag protocol for potential switches in the
tree where the change did succeeed (before failing for others).
This means for the felix driver that if we report an error in
felix_change_tag_protocol(), we'll get another call where proto_ops ==
old_proto_ops. If we proceed to act upon that, we may do unexpected
things. For example, we will call dsa_tag_8021q_register() twice in a
row, without any dsa_tag_8021q_unregister() in between. Then we will
actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
which (if it manages to run at all, after walking through corrupted data
structures) will leave the ports inoperational anyway.
The bug can be readily reproduced if we force an error while in
tag_8021q mode; this crashes the kernel.
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
Call trace:
vcap_entry_get+0x24/0x124
ocelot_vcap_filter_del+0x198/0x270
felix_tag_8021q_vlan_del+0xd4/0x21c
dsa_switch_tag_8021q_vlan_del+0x168/0x2cc
dsa_switch_event+0x68/0x1170
dsa_tree_notify+0x14/0x34
dsa_port_tag_8021q_vlan_del+0x84/0x110
dsa_tag_8021q_unregister+0x15c/0x1c0
felix_tag_8021q_teardown+0x16c/0x180
felix_change_tag_protocol+0x1bc/0x230
dsa_switch_event+0x14c/0x1170
dsa_tree_change_tag_proto+0x118/0x1c0
Fixes: 7a29d220f4c0 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220808125127.3344094-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
port
Currently, sending a packet into a time gate too small for it (or always
closed) causes the queue system to hold the frame forever. Even worse,
this frame isn't subject to aging either, because for that to happen, it
needs to be scheduled for transmission in the first place. But the frame
will consume buffer memory and frame references while it is forever held
in the queue system.
Before commit a4ae997adcbd ("net: mscc: ocelot: initialize watermarks to
sane defaults"), this behavior was somewhat subtle, as the switch had a
more intricately tuned default watermark configuration out of reset,
which did not allow any single port and tc to consume the entire switch
buffer space. Nonetheless, the held frames are still there, and they
reduce the total backplane capacity of the switch.
However, after the aforementioned commit, the behavior can be very
clearly seen, since we deliberately allow each {port, tc} to consume the
entire shared buffer of the switch minus the reservations (and we
disable all reservations by default). That is to say, we allow a
permanently closed tc-taprio gate to hang the entire switch.
A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU
per-port-tc registers serve 2 purposes: one is for guard band calculation
(when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to
enable oversized frame dropping (when non-zero).
Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame
dropping is disabled. The goal of the change is to enable it seamlessly.
For that, we need to hook into the MTU change, tc-taprio change, and
port link speed change procedures, since we depend on these variables.
Frames are not dropped on egress due to a queue system oversize
condition, instead that egress port is simply excluded from the mask of
valid destination ports for the packet. If there are no destination
ports at all, the ingress counter that increments is the generic
"drop_tail" in ethtool -S.
The issue exists in various forms since the tc-taprio offload was introduced.
Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update the VCAP filters to support multiple tag_8021q CPU ports.
TX works using a filter for VLAN ID on the ingress of the CPU port, with
a redirect and a VLAN pop action. This can be updated trivially by
amending the ingress port mask of this rule to match on all tag_8021q
CPU ports.
RX works using a filter for ingress port on the egress of the CPU port,
with a VLAN push action. Here we need to replicate these filters for
each tag_8021q CPU port, and let them all have the same action.
This means that the OCELOT_VCAP_ES0_TAG_8021Q_RXVLAN() cookie needs to
encode a unique value for every {user port, CPU port} pair it's given.
Do this by encoding the CPU port in the upper 16 bits of the cookie, and
the user port in the lower 16 bits.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a desire for the felix driver to gain support for multiple
tag_8021q CPU ports, but the current model prevents it.
This is because ocelot_apply_bridge_fwd_mask() only takes into
consideration whether a port is a tag_8021q CPU port, but not whose CPU
port it is.
We need a model where we can have a direct affinity between an ocelot
port and a tag_8021q CPU port. This serves as the basis for multiple CPU
ports.
Declare a "dsa_8021q_cpu" backpointer in struct ocelot_port which
encodes that affinity. Repurpose the "ocelot_set_dsa_8021q_cpu" API to
"ocelot_assign_dsa_8021q_cpu" to express the change of paradigm.
Note that this change makes the first practical use of the new
ocelot_port->index field in ocelot_port_unassign_dsa_8021q_cpu(), where
we need to remove the old tag_8021q CPU port from the reserved VLAN range.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Absorb the final details of calling ocelot_port_{,un}set_dsa_8021q_cpu(),
i.e. the need to lock &ocelot->fwd_domain_lock, into the callee, to
simplify the caller and permit easier code reuse later.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tag_8021q CPU
Add more logic to ocelot_port_{,un}set_dsa_8021q_cpu() from the ocelot
switch lib by encapsulating the ocelot_apply_bridge_fwd_mask() call that
felix used to have.
This is necessary because the CPU port change procedure will also need
to do this, and it's good to reduce code duplication by having an entry
point in the ocelot switch lib that does all that is needed.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PGID_CPU must be updated every time a port is configured or unconfigured
as a tag_8021q CPU port. The ocelot switch lib already has a hook for
that operation, so move the updating of PGID_CPU to those hooks.
These bits are pretty specific to DSA, so normally I would keep them out
of the common switch lib, but when tag_8021q is in use, this has
implications upon the forwarding mask determined by
ocelot_apply_bridge_fwd_mask() and called extensively by the switch lib.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PGID_BC is configured statically by ocelot_init() to flood towards the
CPU port module, and dynamically by ocelot_port_set_bcast_flood()
towards all user ports.
When the tagging protocol changes, the intention is to turn off flooding
towards the old pipe towards the host, and to turn it on towards the new
pipe.
Due to a recent change which removed the adjustment of PGID_BC from
felix_set_host_flood(), 3 things happen.
- when we change from NPI to tag_8021q mode: in this mode, the CPU port
module is accessed via registers, and used to read PTP packets with
timestamps. We fail to disable broadcast flooding towards the CPU port
module, and to enable broadcast flooding towards the physical port
that serves as a DSA tag_8021q CPU port.
- from tag_8021q to NPI mode: in this mode, the CPU port module is
redirected to a physical port. We fail to disable broadcast flooding
towards the physical tag_8021q CPU port, and to enable it towards the
CPU port module at ocelot->num_phys_ports.
- when the ports are put in promiscuous mode, we also fail to update
PGID_BC towards the host pipe of the current protocol.
First issue means that felix_check_xtr_pkt() has to do extra work,
because it will not see only PTP packets, but also broadcasts. It needs
to dequeue these packets just to drop them.
Third issue is inconsequential, since PGID_BC is allocated from the
nonreserved multicast PGID space, and these PGIDs are conveniently
initialized to 0x7f (i.e. flood towards all ports except the CPU port
module). Broadcasts reach the NPI port via ocelot_init(), and reach the
tag_8021q CPU port via the hardware defaults.
Second issue is also inconsequential, because we fail both at disabling
and at enabling broadcast flooding on a port, so the defaults mentioned
above are preserved, and they are fine except for the performance impact.
Fixes: 7a29d220f4c0 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the ocelot switch lib is unaware of the index of a struct
ocelot_port, since that is kept in the encapsulating structures of outer
drivers (struct dsa_port :: index, struct ocelot_port_private :: chip_port).
With the upcoming increase in complexity associated with assigning DSA
tag_8021q CPU ports to certain user ports, it becomes necessary for the
switch lib to be able to retrieve the index of a certain ocelot_port.
Therefore, introduce a new u8 to ocelot_port (same size as the chip_port
used by the ocelot switchdev driver) and rework the existing code to
populate and use it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The error handling for the current tagging protocol change procedure is
a bit brittle (we dismantle the previous tagging protocol entirely
before setting up the new one). By identifying which parts of a tagging
protocol are unique to itself and which parts are shared with the other,
we can implement a protocol change procedure where error handling is a
bit more robust, because we start setting up the new protocol first, and
tear down the old one only after the setup of the specific and shared
parts succeeded.
The protocol change is a bit too open-coded too, in the area of
migrating host flood settings and MDBs. By identifying what differs
between tagging protocols (the forwarding masks for host flooding) we
can implement a more straightforward migration procedure which is
handled in the shared portion of the protocol change, rather than
individually by each protocol.
Therefore, a more structured approach calls for the introduction of a
structure of function pointers per tagging protocol. This covers setup,
teardown and the host forwarding mask. In the future it will also cover
how to prepare for a new DSA master.
The initial tagging protocol setup (at driver probe time) and the final
teardown (at driver removal time) are also adapted to call into the
structured methods of the specific protocol in current use. This is
especially relevant for teardown, where we previously called
felix_del_tag_protocol() only for the first CPU port. But by not
specifying which CPU port this is for, we gain more flexibility to
support multiple CPU ports in the future.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ocelot switches support a single active CPU port at a time (at least as
a trapping destination, i.e. for control traffic). This is true
regardless of whether we are using the native copy-to-CPU-port-module
functionality, or a redirect action towards the software-defined
tag_8021q CPU port.
Currently we assume that the trapping destination in tag_8021q mode is
the first CPU port, yet in the future we may want to migrate the user
ports to the second CPU port.
For that to work, we need to make sure that the tag_8021q trapping
destination is a CPU port that is active, i.e. is used by at least some
user port on which the trap was added. Otherwise, we may end up
redirecting the traffic to a CPU port which isn't even up.
Note that due to the current design where we simply choose the CPU port
of the first port from the trap's ingress port mask, it may be that a
CPU port absorbes control traffic from user ports which aren't affine to
it as per user space's request. This isn't ideal, but is the lesser of
two evils. Following the user-configured affinity for traps would mean
that we can no longer reuse a single TCAM entry for multiple traps,
which is what we actually do for e.g. PTP. Either we duplicate and
deduplicate TCAM entries on the fly when user-to-CPU-port mappings
change (which is unnecessarily complicated), or we redirect trapped
traffic to all tag_8021q CPU ports if multiple such ports are in use.
The latter would have actually been nice, if it actually worked, but it
doesn't, since a OCELOT_MASK_MODE_REDIRECT action towards multiple ports
would not take PGID_SRC into consideration, and it would just duplicate
the packet towards each (CPU) port, leading to duplicates in software.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DSA has not supported (and probably will not support in the future
either) independent tagging protocols per CPU port.
Different switch drivers have different requirements, some may need to
replicate some settings for each CPU port, some may need to apply some
settings on a single CPU port, while some may have to configure some
global settings and then some per-CPU-port settings.
In any case, the current model where DSA calls ->change_tag_protocol for
each CPU port turns out to be impractical for drivers where there are
global things to be done. For example, felix calls dsa_tag_8021q_register(),
which makes no sense per CPU port, so it suppresses the second call.
Let drivers deal with replication towards all CPU ports, and remove the
CPU port argument from the function prototype.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
At the time - commit 7569459a52c9 ("net: dsa: manage flooding on the CPU
ports") - not introducing a dedicated switch callback for host flooding
made sense, because for the only user, the felix driver, there was
nothing different to do for the CPU port than set the flood flags on the
CPU port just like on any other bridge port.
There are 2 reasons why this approach is not good enough, however.
(1) Other drivers, like sja1105, support configuring flooding as a
function of {ingress port, egress port}, whereas the DSA
->port_bridge_flags() function only operates on an egress port.
So with that driver we'd have useless host flooding from user ports
which don't need it.
(2) Even with the felix driver, support for multiple CPU ports makes it
difficult to piggyback on ->port_bridge_flags(). The way in which
the felix driver is going to support host-filtered addresses with
multiple CPU ports is that it will direct these addresses towards
both CPU ports (in a sort of multicast fashion), then restrict the
forwarding to only one of the two using the forwarding masks.
Consequently, flooding will also be enabled towards both CPU ports.
However, ->port_bridge_flags() gets passed the index of a single CPU
port, and that leaves the flood settings out of sync between the 2
CPU ports.
This is to say, it's better to have a specific driver method for host
flooding, which takes the user port as argument. This solves problem (1)
by allowing the driver to do different things for different user ports,
and problem (2) by abstracting the operation and letting the driver do
whatever, rather than explicitly making the DSA core point to the CPU
port it thinks needs to be touched.
This new method also creates a problem, which is that cross-chip setups
are not handled. However I don't have hardware right now where I can
test what is the proper thing to do, and there isn't hardware compatible
with multi-switch trees that supports host flooding. So it remains a
problem to be tackled in the future.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For symmetry with host FDBs and MDBs where the indirection is now
handled outside the ocelot switch lib, do the same for bridge port
flags (unicast/multicast/broadcast flooding).
The only caller of the ocelot switch lib which uses the NPI port is the
Felix DSA driver.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For symmetry with host FDBs where the indirection is now handled outside
the ocelot switch lib, do the same for host MDB entries. The only caller
of the ocelot switch lib which uses the NPI port is the Felix DSA driver.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I remembered why we had the host FDB migration procedure in place.
It is true that host FDB entry migration can be done by changing the
value of PGID_CPU, but the problem is that only host FDB entries learned
while operating in NPI mode go to PGID_CPU. When the CPU port operates
in tag_8021q mode, the FDB entries are learned towards the unicast PGID
equal to the physical port number of this CPU port, bypassing the
PGID_CPU indirection.
So host FDB entries learned in tag_8021q mode are not migrated any
longer towards the NPI port.
Fix this by extracting the NPI port -> PGID_CPU redirection from the
ocelot switch lib, moving it to the Felix DSA driver, and applying it
for any CPU port regardless of its kind (NPI or tag_8021q).
Fixes: a51c1c3f3218 ("net: dsa: felix: stop migrating FDBs back and forth on tag proto change")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No conflicts.
Build issue in drivers/net/ethernet/sfc/ptp.c
54fccfdd7c66 ("sfc: efx_default_channel_type APIs can be static")
49e6123c65da ("net: sfc: fix memory leak due to ptp channel")
https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Switches using the Lynx PCS driver support 1000base-X optical SFP
modules. Accept this interface type on a port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220510164320.10313-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The felix driver is the only user of dsa_port_walk_mdbs(), and there
isn't even a good reason for it, considering that the host MDB entries
are already saved by the ocelot switch lib in the ocelot->multicast list.
Rewrite the multicast entry migration procedure around the
ocelot->multicast list so we can delete dsa_port_walk_mdbs().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I just realized we don't need to migrate the host-filtered FDB entries
when the tagging protocol changes from "ocelot" to "ocelot-8021q".
Host-filtered addresses are learned towards the PGID_CPU "multicast"
port group, reserved by software, which contains BIT(ocelot->num_phys_ports).
That is the "special" port entry in the analyzer block for the CPU port
module.
In "ocelot" mode, the CPU port module's packets are redirected to the
NPI port.
In "ocelot-8021q" mode, felix_8021q_cpu_port_init() does something funny
anyway, and changes PGID_CPU to stop pointing at the CPU port module and
start pointing at the physical port where the DSA master is attached.
The fact that we can alter the destination of packets learned towards
PGID_CPU without altering the MAC table entries themselves means that it
is pointless to walk through the FDB entries, forget that they were
learned towards PGID_CPU, and re-learn them towards the "unicast" PGID
associated with the physical port connected to the DSA master. We can
let the PGID_CPU value change simply alter the destination of the
host-filtered unicast packets in one fell swoop.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ocelot_fdb_add() redirects FDB entries installed on the NPI port towards
the special reserved PGID_CPU used for host-filtered addresses. PGID_CPU
contains BIT(ocelot->num_phys_ports) in the destination port mask, which
is code name for the CPU port module.
Whereas felix_migrate_fdbs_to_*_port() uses the ocelot->num_phys_ports
PGID directly, and it appears that this works too. Even if this PGID is
set to zero, apparently its number is special and packets still reach
the CPU port module.
Nonetheless, in the end, these addresses end up in the same place
regardless of whether they go through an extra indirection layer or not.
Use PGID_CPU across to have more uniformity.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the blamed commit, VCAP filters can appear on more than one list.
If their action is "trap", they are chained on ocelot->traps via
filter->trap_list. This is in addition to their normal placement on the
VCAP block->rules list head.
Therefore, when we free a VCAP filter, we must remove it from all lists
it is a member of, including ocelot->traps.
There are at least 2 bugs which are direct consequences of this design
decision.
First is the incorrect usage of list_empty(), meant to denote whether
"filter" is chained into ocelot->traps via filter->trap_list.
This does not do the correct thing, because list_empty() checks whether
"head->next == head", but in our case, head->next == head->prev == NULL.
So we dereference NULL pointers and die when we call list_del().
Second is the fact that not all places that should remove the filter
from ocelot->traps do so. One example is ocelot_vcap_block_remove_filter(),
which is where we have the main kfree(filter). By keeping freed filters
in ocelot->traps we end up in a use-after-free in
felix_update_trapping_destinations().
Attempting to fix all the buggy patterns is a whack-a-mole game which
makes the driver unmaintainable. Actually this is what the previous
patch version attempted to do:
https://patchwork.kernel.org/project/netdevbpf/patch/20220503115728.834457-3-vladimir.oltean@nxp.com/
but it introduced another set of bugs, because there are other places in
which create VCAP filters, not just ocelot_vcap_filter_create():
- ocelot_trap_add()
- felix_tag_8021q_vlan_add_rx()
- felix_tag_8021q_vlan_add_tx()
Relying on the convention that all those code paths must call
INIT_LIST_HEAD(&filter->trap_list) is not going to scale.
So let's do what should have been done in the first place and keep a
bool in struct ocelot_vcap_filter which denotes whether we are looking
at a trapping rule or not. Iterating now happens over the main VCAP IS2
block->rules. The advantage is that we no longer risk having stale
references to a freed filter, since it is only present in that list.
Fixes: e42bd4ed09aa ("net: mscc: ocelot: keep traps in a list")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a desire to share the oclot_stats_layout struct outside of the
current vsc7514 driver. In order to do so, the length of the array needs to
be known at compile time, and defined in the struct ocelot and struct
felix_info.
Since the array is defined in a .c file and would be declared in the header
file via:
extern struct ocelot_stat_layout[];
the size of the array will not be known at compile time to outside modules.
To fix this, remove the need for defining the number of stats at compile
time and allow this number to be determined at initialization.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the device tree has 2 CPU ports defined, a single one is active
(has any dp->cpu_dp pointers point to it). Yet the second one is still a
CPU port, and DSA still calls ->change_tag_protocol on it.
On the NXP LS1028A, the CPU ports are ports 4 and 5. Port 4 is the
active CPU port and port 5 is inactive.
After the following commands:
# Initial setting
cat /sys/class/net/eno2/dsa/tagging
ocelot
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo ocelot > /sys/class/net/eno2/dsa/tagging
traffic is now broken, because the driver has moved the NPI port from
port 4 to port 5, unbeknown to DSA.
The problem can be avoided by detecting that the second CPU port is
unused, and not doing anything for it. Further rework will be needed
when proper support for multiple CPU ports is added.
Treat this as a bug and prepare current kernels to work in single-CPU
mode with multiple-CPU DT blobs.
Fixes: adb3dccf090b ("net: dsa: felix: convert to the new .change_tag_protocol DSA API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220412172209.2531865-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Gain support for port mirroring using tc-matchall by forwarding the
calls to the ocelot switch library.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Follow the established programming model for this driver and provide
shims in the felix DSA driver which call the implementations from the
ocelot switch lib. The ocelot switchdev driver wasn't integrated with
dcbnl due to lack of hardware availability.
The switch doesn't have any fancy QoS classification enabled by default.
The provided getters will create a default-prio app table entry of 0,
and no dscp entry. However, the getters have been made to actually
retrieve the hardware configuration rather than static values, to be
future proof in case DSA will need this information from more call paths.
For default-prio, there is a single field per port, in ANA_PORT_QOS_CFG,
called QOS_DEFAULT_VAL.
DSCP classification is enabled per-port, again via ANA_PORT_QOS_CFG
(field QOS_DSCP_ENA), and individual DSCP values are configured as
trusted or not through register ANA_DSCP_CFG (replicated 64 times).
An untrusted DSCP value falls back to other QoS classification methods.
If trusted, the selected ANA_DSCP_CFG register also holds the QoS class
in the QOS_DSCP_VAL field.
The hardware also supports DSCP remapping (DSCP value X is translated to
DSCP value Y before the QoS class is determined based on the app table
entry for Y) and DSCP packet rewriting. The dcbnl framework, for being
so flexible in other useless areas, doesn't appear to support this.
So this functionality has been left out.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Felix driver declares FDB isolation but puts all standalone ports in
VID 0. This is mostly problem-free as discussed with Alvin here:
https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870
however there is one catch. DSA still thinks that FDB entries are
installed on the CPU port as many times as there are user ports, and
this is problematic when multiple user ports share the same MAC address.
Consider the default case where all user ports inherit their MAC address
from the DSA master, and then the user runs:
ip link set swp0 address 00:01:02:03:04:05
The above will make dsa_slave_set_mac_address() call
dsa_port_standalone_host_fdb_add() for 00:01:02:03:04:05 in port 0's
standalone database, and dsa_port_standalone_host_fdb_del() for the old
address of swp0, again in swp0's standalone database.
Both the ->port_fdb_add() and ->port_fdb_del() will be propagated down
to the felix driver, which will end up deleting the old MAC address from
the CPU port. But this is still in use by other user ports, so we end up
breaking unicast termination for them.
There isn't a problem in the fact that DSA keeps track of host
standalone addresses in the individual database of each user port: some
drivers like sja1105 need this. There also isn't a problem in the fact
that some drivers choose the same VID/FID for all standalone ports.
It is just that the deletion of these host addresses must be delayed
until they are known to not be in use any longer, and only the driver
has this knowledge. Since DSA keeps these addresses in &cpu_dp->fdbs and
&cpu_db->mdbs, it is just a matter of walking over those lists and see
whether the same MAC address is present on the CPU port in the port db
of another user port.
I have considered reusing the generic dsa_port_walk_fdbs() and
dsa_port_walk_mdbs() schemes for this, but locking makes it difficult.
In the ->port_fdb_add() method and co, &dp->addr_lists_lock is held, but
dsa_port_walk_fdbs() also acquires that lock. Also, even assuming that
we introduce an unlocked variant of the address iterator, we'd still
need some relatively complex data structures, and a void *ctx in the
dsa_fdb_walk_cb_t which we don't currently pass, such that drivers are
able to figure out, after iterating, whether the same MAC address is or
isn't present in the port db of another port.
All the above, plus the fact that I expect other drivers to follow the
same model as felix where all standalone ports use the same FID, made me
conclude that a generic method provided by DSA is necessary:
dsa_fdb_present_in_other_db() and the mdb equivalent. Felix calls this
from the ->port_fdb_del() handler for the CPU port, when the database
was classified to either a port db, or a LAG db.
For symmetry, we also call this from ->port_fdb_add(), because if the
address was installed once, then installing it a second time serves no
purpose: it's already in hardware in VID 0 and it affects all standalone
ports.
This change moves dsa_db_equal() from switch.c to dsa.c, since it now
has one more caller.
Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The two blamed commits were written/tested individually but not
together.
When put together, commit 90897569beb1 ("net: dsa: felix: start off with
flooding disabled on the CPU port"), which deletes a reinitialization of
PGID_UC/PGID_MC/PGID_BC, is no longer sufficient to ensure that these
port masks don't contain the CPU port module.
This is because commit b903a6bd2e19 ("net: dsa: felix: migrate flood
settings from NPI to tag_8021q CPU port") overwrites the hardware
default settings towards the CPU port module with the settings that used
to be present on the NPI port treated as a regular port. There, flooding
is enabled, so flooding would get enabled on the CPU port module too.
Adding conditional logic somewhere within felix_setup_tag_npi() to
configure either the default no-flood policy or the flood policy
inherited from the tag_8021q CPU port from a previous call to
dsa_port_manage_cpu_flood() is getting complicated. So just let the
migration logic do its thing during initial setup (which will
temporarily turn on flooding), then turn flooding off for the NPI port
after felix_set_tag_protocol() finishes. Here we are in felix_setup(),
so the DSA slave interfaces are not yet created, and this doesn't affect
traffic in any way.
Fixes: 90897569beb1 ("net: dsa: felix: start off with flooding disabled on the CPU port")
Fixes: b903a6bd2e19 ("net: dsa: felix: migrate flood settings from NPI to tag_8021q CPU port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We no longer need the workaround in the felix driver to avoid calling
dsa_port_walk_fdbs() when &dp->fdbs is an uninitialized list, because
that list is now initialized from all call paths of felix_set_tag_protocol().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Due to an apparently incorrect conflict resolution on my part in commit
54c319846086 ("net: mscc: ocelot: enforce FDB isolation when
VLAN-unaware"), "ocelot->ports[port]->is_dsa_8021q_cpu = false" was
supposed to be replaced by "ocelot_port_unset_dsa_8021q_cpu(ocelot, port)"
which does the same thing, and more. But now we have both, so the direct
assignment is redundant. Remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Packet extraction failures over register-based MMIO are silent, and
difficult to pinpoint. Add an error message to remedy this.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Automated tools complain that felix_check_xtr_pkt() has logic to drain
the CPU queue on the reception of a PTP packet over Ethernet, yet it
returns an uninitialized error code in the case where the CPU queue was
empty.
This is not likely to happen (/possible if hardware works correctly),
but it isn't a fatal condition either. The PTP packet will be dequeued
from the CPU queue when the next PTP packet arrives. So initialize "err"
to 0 for the case where nothing was dequeued during this iteration.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The DSA ->port_rxtstamp() function is never called for PTP_CLASS_NONE:
dsa_skb_defer_rx_timestamp:
if (type == PTP_CLASS_NONE)
return false;
if (likely(ds->ops->port_rxtstamp))
return ds->ops->port_rxtstamp(ds, p->dp->index, skb, type);
So practically, the argument is unused, so remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This assignment is redundant, since ocelot->npi has already been set to
-1 by felix_npi_port_deinit(). Call path:
felix_change_tag_protocol
-> felix_del_tag_protocol(DSA_TAG_PROTO_OCELOT)
-> felix_teardown_tag_npi
-> felix_npi_port_deinit
-> felix_set_tag_protocol(DSA_TAG_PROTO_OCELOT_8021Q)
-> felix_setup_tag_8021q
-> felix_8021q_cpu_port_init
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|