summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
AgeCommit message (Collapse)Author
2016-08-17mlxsw: spectrum: Create PVID vPort before registering netdeviceIdo Schimmel
After registering a netdevice it's possible for user space applications to configure an IP address on it. From the driver's perspective, this means a router interface (RIF) should be created for the PVID vPort. Therefore, we must create the PVID vPort before registering the netdevice. Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24mlxsw: spectrum: Add support in matchall mirror TC offloadingYotam Gigi
This patch offloads port mirroring directives to hw using the matchall TC with action mirror. It includes both the implementation of the ndo_setup_tc function for the spectrum driver and the spectrum hardware offload configuration code. The hardware offload code is basically two new functions which are capable of adding and removing a new mirror ports pair. It is done using the MPAT, MPAR and SBIB registers: - A new Switch-Port Analyzer (SPAN) entry is added using MPAT to the 'to' port. - The 'to' port is bound to the SPAN entry using MPAR register. - In case of egress SPAN, the 'to' port gets a new internal shared buffer using SBIB register. In addition, a new database was added to the mlxsw_sp struct to store all the SPAN entries and their bound ports list. The number of supported SPAN entries is determined by resource query. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: Add the unresolved next-hops probesYotam Gigi
Now, the driver sends arp probes for all unresolved neighbours that are currently a nexthop for some route on the system. The job is set periodically every 5 seconds. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: spectrum_router: Add the nexthop neigh activity updateYotam Gigi
For nexthop neighbours we need to make kernel to think there is a traffic flowing to them preventing it from going to stale state. Otherwise kernel would stale it and eventually the neigh would be removed from HW and nexthop as well. That would reduce ECMP group in HW. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: spectrum_router: Implement next-hop routingJiri Pirko
Implement next-hop routing offload including ECMP. To make it possible, introduce next-hop group entity. This entity keeps track of resolved neighbours and updates HW adjacency table accordingly. Note that HW next-hops are stored in this adjacency table, in form of MAC. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: Introduce simplistic KVD linear area managerJiri Pirko
This is a very simple manager for KVD linear area. Currently, the allocator will either allocate a single entry from pre-defined sub-area, or in case more than one entry is needed, it will allocate 32-entry chunk in other pre-defined sub-area. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: spectrum: Define sizes of KVD areasJiri Pirko
Override the defaults and define the area sizes ourselves. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: spectrum_router: Periodically update the kernel's neigh tableYotam Gigi
As previously explained, the driver should periodically poll the device for neighbours activity according to the configured DELAY_PROBE_TIME. This will prevent active neighbours from staying in STALE state for long periods of time. During init configure the polling interval according to the DELAY_PROBE_TIME used in the default table. In addition, register a netevent notification block, so that the interval is updated whenever DELAY_PROBE_TIME changes. Using the computed interval schedule a delayed work, which will update the kernel via neigh_event_send() on any active neighbour since the last delayed work. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05mlxsw: spectrum_router: Add private neigh tableJiri Pirko
We need to hold some private data for every neigh entry. It would be possible to do it using neigh_priv_len/ndo_neigh_construct/ ndo_neigh_destroy however only for the port device itself. That would not work for stacked devices like bridge/team/bond. So introduce a private neigh table. Hook onto ndos neigh_construct/destroy and add/remove table entry according to that. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Enable L3 interfaces on top of bridge devicesIdo Schimmel
As with the previously introduced L3 interfaces, listen to 'inetaddr' notifications sent for bridges devices configured on top of the port netdevs and create / destroy router interfaces (RIFs) accordingly. This also includes VLAN devices configured on top of the VLAN-aware bridge. The RIFs will be destroyed either when the last IP address is removed or when the underlying FID is is destroyed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Configure FIDs based on bridge eventsIdo Schimmel
Before introducing support for L3 interfaces on top of the VLAN-aware bridge we need to add some missing infrastructure. Such an interface can either be the bridge device itself or a VLAN device on top of it. In the first case the router interface (RIF) is associated with FID 1, which is created whenever the first port netdev joins the bridge. We currently assume the default PVID is 1 and that it's already created, as it seems reasonable. This can be extended in the future. However, in the second case it's entirely possible we've yet to create a matching FID. This can happen if the VLAN device was configured before making any bridge port member in the VLAN. Prevent such ordering problems by using the VLAN device's CHANGEUPPER event to configure the FID. Make the VLAN device hold a reference to the FID and prevent it from being destroyed even if none of the port netdevs is using it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Unsplit the vFID rangeIdo Schimmel
Previous commit deprecated the vFIDs used to get traffic to the CPU ('port_vfids'). Thus, we now use the vFIDs as god intended and the artificial split is no longer needed. Rename functions and variables to reflect that. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Introduce support for router interfacesIdo Schimmel
Up until now we only supported bridged interfaces. Packets ingressing through the switch ports were either classified to FIDs (in the case of the VLAN-aware bridge) or vFIDs (in the case of VLAN-unaware bridges). The packets were then forwarded according to the FDB. Routing was done entirely in slowpath, by splitting the vFID range in two and using the lower 0.5K vFIDs as dummy bridges that simply flooded all incoming traffic to the CPU. Instead, allow packets to be routed in the device by creating router interfaces (RIFs) that will direct them to the router block. Specifically, the RIFs introduced here are Sub-port RIFs used for VLAN devices and port netdevs. Packets ingressing from the {Port / LAG ID, VID} with which the RIF was programmed with will be assigned to a special kind of FIDs called rFIDs and from there directed to the router. Create a RIF whenever the first IPv4 address was programmed on a VLAN / LAG / port netdev. Destroy it upon removal of the last IPv4 address. Receive these notifications by registering for the 'inetaddr' notification chain. A non-zero (10) priority is used for the notification block, so that RIFs will be created before routes are offloaded via FIB code. Note that another trigger for RIF destruction are CHANGEUPPER notifications causing the underlying FID's reference count to go down to zero. This can happen, for example, when a VLAN netdev with an IP address is put under bridge. While this configuration doesn't make sense it does cause the device and the kernel to get out of sync when the netdev is unbridged. We intend to address this in the future, hopefully in current cycle. Finally, Remove the lower 0.5K vFIDs, as they are deprecated by the RIFs, which will trap packets according to their DIP. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Edit RIF properties based on netdev eventsIdo Schimmel
We are just about to introduce router interfaces (RIFs), but before that we need to be able update the device with the correct RIF attributes whenever they change for the netdev the RIF is backing. Two such attributes are MTU and MAC. The MAC is used both to set the source MAC of packets egressing from the RIF and also to program an FDB rule that will direct packets to the router block. Use the existing netdevice notification block and respond to CHANGEADDR and CHANGEMTU accordingly. Store both attributes in the RIF struct in case we need to revert to old attributes following a failed update. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum: Add couple of lower device helper functionsJiri Pirko
Add functions that iterate over lower devices and find port device. As a dependency add netdev_for_each_all_lower_dev and netdev_for_each_all_lower_dev_rcu macro with netdev_all_lower_get_next and netdev_all_lower_get_next_rcu shelpers. Also, add functions to return mlxsw struct according to lower device found and mlxsw_port struct with a reference to lower device. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum_router: Implement fib4 add/del switchdev obj opsJiri Pirko
Implement ipv4 FIB entries addition and removal. Initially, we support local and broadcast routes using "ip2me" trap action. Also, unicast routes without nexthop are supported using "local" action. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum_router: Add virtual router managementJiri Pirko
Virtual router is a construct used inside HW. In this implementation we map kernel tables to virtual routers one to one. Introduce management logic to create virtual routers when needed and destroy in case they are no longer in use. According to that, call into LPM tree management. Each virtual router is always bound to one LPM tree. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum_router: Implement LPM trees managementJiri Pirko
Introduce basic LPM tree management allowing to share the trees in between tables if the used prefixes in the tables are the same. Build the tree structure according to the used prefixes. Although it is not optimal for many use cases, this initial implementation does only simple linear left-tree. More advanced structures will be introduced later on, possibly including mechanisms to change trees on the fly. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-04mlxsw: spectrum_router: Implement private fibJiri Pirko
Shadow FIB is needed in order to hold additional information for FIB entries and keep track of used prefixes. That is needed for the LPM tree construction to be introduced later on in this set. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02mlxsw: spectrum: Add router interface structIdo Schimmel
When enabling the router in the device we will represent L3 netdevs using router interfaces (RIFs). These will be specified whenever programming routes or neighbours on the netdev. Introduce the basic RIF infrastructure which allows one to lookup a RIF by its netdev. Later patches in the series will extend this, but the basic routines are needed now in order to direct traffic to CPU. Pointers to the RIF structs are stored in an array indexed by the RIF's number. This will allow us to efficiently update the kernel's neighbour table when regularly dumping the device's table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02mlxsw: spectrum_router: Add basic ipv4 router initializationIdo Schimmel
Create a skeleton router file and do basic HW initialization of router. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02mlxsw: spectrum: Remove VLANs configuration via SELF flagIdo Schimmel
When port isn't bridged it is still possible to invoke switchdev ops and configure the device's VLAN filters. However, this will require us to use different Router InterFaces (RIFs) for the same netdev, instead of one per-netdev as with any other configuration. Taking the above into account and the fact that this functionality is questionable with regards to the device's normal use-case, remove it and instead return an error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Free resources upon vPort destructionIdo Schimmel
There are situations in which a vPort is destroyed while still holding references to device's resources such as FIDs and FDB records. This can happen, for example, when a VLAN device is deleted while still being bridged. Instead of trying to make sure vPort destruction is invoked when it no longer uses device's resources, just free them upon destruction. This simplifies the code, as we no longer need to take different situations into account when events are received - cleanup is taken care of in one place. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Refactor FDB flushing logicIdo Schimmel
FDB entries are learned using {Port / LAG ID, FID} and therefore should be flushed whenever a port (vPort) leaves its FID (vFID). However, when the bridge port is a LAG device (or a VLAN device on top), then FDB flushing is conditional. Ports removed from such LAG configurations must not trigger flushing, as other ports might still be members in the LAG and therefore the bridge port is still active. The decision whether to flush or not was previously computed in the netdevice notification block, but in order to flush the entries when a port leaves its FID this decision should be computed there. Strip the notification block from this logic and instead move it to one FDB flushing function that is invoked from both the FID / vFID leave functions. When port isn't member in LAG, FDB flushing should always occur. Otherwise, it should occur only when the last port (vPort) member in the LAG leaves the FID (vFID). This will allow us - in the next patch - to simplify the cleanup code paths that are hit whenever the topology above the port netdevs changes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Don't count on FID being presentIdo Schimmel
Not all vPorts will have FIDs assigned to them, so make sure functions first test for FID presence. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Add FID get / set functionsIdo Schimmel
As previously explained, not all vPorts will be assigned FIDs, so instead of returning the FID index of a vPort, return a pointer to its FID struct. This will allow us to know whether it's legal to access the vPort's FID parameters such as index and device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Check if port is vPort using its VIDIdo Schimmel
When L3 interfaces will be introduced a vPort won't necessarily have a FID assigned to it. This can happen if it's not member in a bridge (in which case it's assigned a vFID) or doesn't have an IP address (in which case it's assigned an rFID). Therefore, instead check the VID parameter to test whether a port is a vPort or not. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridgeIdo Schimmel
In a very similar way to the vFIDs, make the first 4K FIDs - used in the VLAN-aware bridge - use the new FID struct. Upon first use of the FID by any of the ports do the following: 1) Create the FID 2) Setup a matching flooding entry 3) Create a mapping for the FID Unlike vFIDs, upon creation of a FID we always create a global VID-to-FID mapping, so that ports without upper vPorts can use it instead of creating an explicit {Port, VID} to FID mapping. When a port leaves a FID the reverse is performed. Whenever the FID's reference count reaches zero the FID is deleted along with the global mapping. The per-FID struct will later allow us to configure L3 interfaces on top of the VLAN-aware bridge. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Make vFID struct genericIdo Schimmel
Up until now we had a dedicated struct only for vFIDs, but before introducing support for L3 interfaces we need to make it generic and use it for all three types of FIDs: 1) FIDs - 0..4K-1, used for the VLAN-aware bridge 2) vFIDs - 4K..15K-1, used for VLAN-unaware bridges 3) rFIDs - 15K..16K-1, used to direct traffic to / from the router in the device. Will be introduced later in the series. The three types of L3 interfaces - Router InterFaces, RIFs - that will be introduced correspond to the three types of FIDs and are configured using them. Therefore, we'll need to store the links between them as well as a reference count on the underlying FID, so that the corresponding RIF will be destroyed when it reaches zero. Note that the lower 0.5K vFIDs are currently used for for non-bridged netdevs, so that traffic could be flooded to the CPU port. However, when rFIDs will be introduced we'll no longer need these and they too will be used for VLAN-unaware bridges. Make the vFID struct generic by renaming it and some of its fields. FIDs will be converted to use it later in the series. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Use FID instead of vFID to setup floodingIdo Schimmel
Use a FID index instead of vFID and ease the transition towards a generic FID struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Remove redundant function argumentIdo Schimmel
In all call sites 'only_uc' is set to false, so strip it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21mlxsw: spectrum: Use DECLARE_BITMAP() macroIdo Schimmel
There is a macro to do this kind of declarations, so use it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name()Ido Schimmel
When rtnl_fill_ifinfo() is called for a certain netdevice it queries its various parameters such as switch id and physical port name. The function might get called in an atomic context, which means the underlying driver must not sleep during the query operation. Don't query the device and sleep during ndo_get_phys_port_name(), but instead store the needed parameters in port creation time. Fixes: 2bf9a58675c5 ("mlxsw: spectrum: Add support for physical port names") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14mlxsw: spectrum_buffers: Implement occupancy monitoringJiri Pirko
Implement occupancy API introduced in devlink and mlxsw core. This is done by accessing SBPM register for Port-Pool and SBSR for Port-TC current and max occupancy values. Max clear is implemented using the same registers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14mlxsw: spectrum_buffers: Implement shared buffer configurationJiri Pirko
Implement previously introduced mlxsw core shared buffer API. For Spectrum, that is done utilizing registers SBPR, SBCM and SBPM. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14mlxsw: spectrum_buffers: Cache shared buffer configurationJiri Pirko
In order to achieve faster dumping of current setting and also in order to provide possibility to get pool mode without a need to query hardware, do cache the configuration in driver. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08mlxsw: Move devlink port registration into common core codeJiri Pirko
Remove devlink port reg/unreg from spectrum and switchx2 code and rather do the common work in core. That also ensures code separation where devlink is only used in core.c. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Add IEEE 802.1Qbb PFC supportIdo Schimmel
Implement the appropriate DCB ops and allow a user to configure certain traffic classes as lossless. The operation configures PFC for both the egress (respecting PFC frames) and ingress (sending PFC frames) parts of the port. At egress, when a PFC frame is received for a PFC enabled priority, then all the priorities mapped to the same TC are stopped. At ingress, the priority group (PG) buffers to which the enabled PFC priorities are mapped are configured to be lossless. PFC frames will be transmitted when the Xoff threshold is crossed. The user-supplied delay parameter is used to determine the PG's size according to the following formula: PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU In the worst case scenario the delay will be made up of packets that are all of size CELL_SIZE + 1, which means each packet will require almost twice its true size when buffered in the switch. We therefore multiply this value by the "cell factor", which is close to 2. Another MTU is added in case the transmitting host already started transmitting a maximum length frame when the PFC packet was received. As with PAUSE enabled ports, when the port's MTU is changed both the PGs' size and threshold are adjusted accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Add support for PAUSE framesIdo Schimmel
When a packet ingress the switch it's placed in its assigned priority group (PG) buffer in the port's headroom buffer while it goes through the switch's pipeline. After going through the pipeline - which determines its egress port(s) and traffic class - it's moved to the switch's shared buffer awaiting transmission. However, some packets are not eligible to enter the shared buffer due to exceeded quotas or insufficient space. Marking their associated PGs as lossless will cause the packets to accumulate in the PG buffer. Another reason for packets accumulation are complicated pipelines (e.g. involving a lot of ACLs). To prevent packets from being dropped a user can enable PAUSE frames on the port. This will mark all the active PGs as lossless and set their size according to the maximum delay, as it's not configured by user. +----------------+ + | | | | | | | | | | | | | | | | | | Delay | | | | | | | | | | | | | | | Xon/Xoff threshold +----------------+ + | | | | | | 2 * MTU | | | +----------------+ + The delay (612 [Cells]) was calculated according to worst-case scenario involving maximum MTU and 100m cables. After marking the PGs as lossless the device is configured to respect incoming PAUSE frames (Rx PAUSE) and generate PAUSE frames (Tx PAUSE) according to user's settings. Whenever the port's headroom configuration changes we take into account the PAUSE configuration, so that we correctly set the PG's type (lossy / lossless), size and threshold. This can happen when: a) The port's MTU changes, as it directly affects the PG's size. b) A PG is created following user configuration, by binding a priority to it. Note that the relevant SUPPORTED flags were already mistakenly set by the driver before this commit. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Allow setting maximum rate for a TCIdo Schimmel
Allow a user to set maximum rate for a particular TC using DCB ops. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Add IEEE 802.1Qaz ETS supportIdo Schimmel
Implement the appropriate DCB ops and allow a user to configure: * Priority to traffic class (TC) mapping with a total of 8 supported TCs * Transmission selection algorithm (TSA) for each TC and the corresponding weights in case of weighted round robin (WRR) As previously explained, we treat the priority group (PG) buffer in the port's headroom as the ingress counterpart of the egress TC. Therefore, when a certain priority to TC mapping is configured, we also configure the port's headroom buffer. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Introduce support for Data Center Bridging (DCB)Ido Schimmel
Introduce basic infrastructure for DCB and add the missing ops in following patches. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06mlxsw: spectrum: Add bytes to cells helperIdo Schimmel
Buffers in the switch store packets in units called buffer cells. Add a helper to convert from bytes to cells, so that the actual number of cells required (result is round up) is returned. Also, drop the SB (shared buffer) acronym from the BYTES_PER_CELL macro, as this unit is also used in the ports' buffers and not only the switch's shared buffer. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-05mlxsw: spectrum: Reduce number of supported 802.1D bridgesIdo Schimmel
Resources allocated for these bridges at init time cannot be later used for other purposes. While current number is supported by the device, it's mostly theoretical with regards to any real use case, which leads to poor utilization of device's resources. Solve that by reducing the number. The long term plan is to make this value (along with others) user configurable via devlink and write it to NVRAM, so that it can be used during the next init. Until then we must hardcode such values. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11mlxsw: spectrum: Check requested ageing time is validIdo Schimmel
Commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") added a check for minimum and maximum ageing time, but this breaks existing behaviour where one can set ageing time to 0 for a non-learning bridge. Push this check down to the driver and allow the check in the bridge layer to be removed. Currently ageing time 0 is refused by the driver, but we can later add support for this functionality. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01mlxsw: spectrum: Introduce port splittingIdo Schimmel
Allow a user to split or unsplit a port using the newly introduced devlink ops. Once split, the original netdev is destroyed and 2 or 4 others are created, according to user configuration. The new ports are like any other port, with the sole difference of supporting a lower maximum speed. When unsplit, the reverse process takes place. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01mlxsw: spectrum: Store local port to module mapping during initIdo Schimmel
The port netdevs are each associated with a different local port number in the device. These local ports are grouped into groups of 4 (e.g. (1-4), (5-8)) called clusters. The cluster constitutes the one of two possible modules they can be mapped to. This mapping is board-specific and done by the device's firmware during init. When splitting a port by 4, the device requires us to first unmap all the ports in the cluster and then map each to a single lane in the module associated with the port netdev used as the handle for the operation. This means that two port netdevs will disappear, as only 100Gb/s (4 lanes) ports can be split and we are guaranteed to have two of these ((1, 3), (5, 7) etc.) in a cluster. When unsplit occurs we need to reinstantiate the two original 100Gb/s ports and map each to its origianl module. Therefore, during driver init store the initial local port to module mapping, so it can be used later during unsplitting. Note that a by 2 split doesn't require us to store the mapping, as we only need to reinstantiate one port whose module is known. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01mlxsw: Implement devlink interfaceJiri Pirko
Implement newly introduced devlink interface. Add devlink port instances for every port and set the port types accordingly. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18mlxsw: spectrum: Allow for PVID deletionIdo Schimmel
When PVID is toggled off on a port member in a VLAN filtering bridge or the PVID VLAN is deleted, make the port drop untagged packets. Reverse the operation when PVID is toggled back on. Set the PVID back to the default (1), when leaving the bridge so that untagged traffic will be directed to the CPU. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28switchdev: Require RTNL mutex to be held when sending FDB notificationsIdo Schimmel
When switchdev drivers process FDB notifications from the underlying device they resolve the netdev to which the entry points to and notify the bridge using the switchdev notifier. However, since the RTNL mutex is not held there is nothing preventing the netdev from disappearing in the middle, which will cause br_switchdev_event() to dereference a non-existing netdev. Make switchdev drivers hold the lock at the beginning of the notification processing session and release it once it ends, after notifying the bridge. Also, remove switchdev_mutex and fdb_lock, as they are no longer needed when RTNL mutex is held. Fixes: 03bf0c281234 ("switchdev: introduce switchdev notifier") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>