diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2016-05-17 16:26:30 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-05-17 16:26:30 -0700 |
commit | a7fd20d1c476af4563e66865213474a2f9f473a4 (patch) | |
tree | fb1399e2f82842450245fb058a8fb23c52865f43 /Documentation | |
parent | b80fed9595513384424cd141923c9161c4b5021b (diff) | |
parent | 917fa5353da05e8a0045b8acacba8d50400d5b12 (diff) | |
download | lwn-a7fd20d1c476af4563e66865213474a2f9f473a4.tar.gz lwn-a7fd20d1c476af4563e66865213474a2f9f473a4.zip |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Highlights:
1) Support SPI based w5100 devices, from Akinobu Mita.
2) Partial Segmentation Offload, from Alexander Duyck.
3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.
4) Allow cls_flower stats offload, from Amir Vadai.
5) Implement bpf blinding, from Daniel Borkmann.
6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
actually using FASYNC these atomics are superfluous. From Eric
Dumazet.
7) Run TCP more preemptibly, also from Eric Dumazet.
8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
driver, from Gal Pressman.
9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.
10) Improve BPF usage documentation, from Jesper Dangaard Brouer.
11) Support tunneling offloads in qed, from Manish Chopra.
12) aRFS offloading in mlx5e, from Maor Gottlieb.
13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
Leitner.
14) Add MSG_EOR support to TCP, this allows controlling packet
coalescing on application record boundaries for more accurate
socket timestamp sampling. From Martin KaFai Lau.
15) Fix alignment of 64-bit netlink attributes across the board, from
Nicolas Dichtel.
16) Per-vlan stats in bridging, from Nikolay Aleksandrov.
17) Several conversions of drivers to ethtool ksettings, from Philippe
Reynes.
18) Checksum neutral ILA in ipv6, from Tom Herbert.
19) Factorize all of the various marvell dsa drivers into one, from
Vivien Didelot
20) Add VF support to qed driver, from Yuval Mintz"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
Revert "phy dp83867: Make rgmii parameters optional"
r8169: default to 64-bit DMA on recent PCIe chips
phy dp83867: Make rgmii parameters optional
phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
bpf: arm64: remove callee-save registers use for tmp registers
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
switchdev: pass pointer to fib_info instead of copy
net_sched: close another race condition in tcf_mirred_release()
tipc: fix nametable publication field in nl compat
drivers: net: Don't print unpopulated net_device name
qed: add support for dcbx.
ravb: Add missing free_irq() calls to ravb_close()
qed: Remove a stray tab
net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fec-mpc52xx: use phydev from struct net_device
bpf, doc: fix typo on bpf_asm descriptions
stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
net: ethernet: fs-enet: use phydev from struct net_device
...
Diffstat (limited to 'Documentation')
29 files changed, 734 insertions, 105 deletions
diff --git a/Documentation/DocBook/80211.tmpl b/Documentation/DocBook/80211.tmpl index f9b9ad7894f5..5f7c55999c77 100644 --- a/Documentation/DocBook/80211.tmpl +++ b/Documentation/DocBook/80211.tmpl @@ -75,7 +75,6 @@ <chapter> <title>Device registration</title> !Pinclude/net/cfg80211.h Device registration -!Finclude/net/cfg80211.h ieee80211_band !Finclude/net/cfg80211.h ieee80211_channel_flags !Finclude/net/cfg80211.h ieee80211_channel !Finclude/net/cfg80211.h ieee80211_rate_flags @@ -136,6 +135,7 @@ !Finclude/net/cfg80211.h cfg80211_tx_mlme_mgmt !Finclude/net/cfg80211.h cfg80211_ibss_joined !Finclude/net/cfg80211.h cfg80211_connect_result +!Finclude/net/cfg80211.h cfg80211_connect_bss !Finclude/net/cfg80211.h cfg80211_roamed !Finclude/net/cfg80211.h cfg80211_disconnected !Finclude/net/cfg80211.h cfg80211_ready_on_channel diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index 7785fb5eb93f..b5ca536e56a8 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c @@ -505,6 +505,8 @@ int main(int argc, char *argv[]) if (!loop) goto done; break; + case TASKSTATS_TYPE_NULL: + break; default: fprintf(stderr, "Unknown nested" " nla_type %d\n", @@ -512,7 +514,8 @@ int main(int argc, char *argv[]) break; } len2 += NLA_ALIGN(na->nla_len); - na = (struct nlattr *) ((char *) na + len2); + na = (struct nlattr *)((char *)na + + NLA_ALIGN(na->nla_len)); } break; diff --git a/Documentation/devicetree/bindings/btmrvl.txt b/Documentation/devicetree/bindings/btmrvl.txt deleted file mode 100644 index 58f964bb0a52..000000000000 --- a/Documentation/devicetree/bindings/btmrvl.txt +++ /dev/null @@ -1,29 +0,0 @@ -btmrvl ------- - -Required properties: - - - compatible : must be "btmrvl,cfgdata" - -Optional properties: - - - btmrvl,cal-data : Calibration data downloaded to the device during - initialization. This is an array of 28 values(u8). - - - btmrvl,gpio-gap : gpio and gap (in msecs) combination to be - configured. - -Example: - -GPIO pin 13 is configured as a wakeup source and GAP is set to 100 msecs -in below example. - -btmrvl { - compatible = "btmrvl,cfgdata"; - - btmrvl,cal-data = /bits/ 8 < - 0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02 - 0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00 - 0x00 0x00 0xf0 0x00>; - btmrvl,gpio-gap = <0x0d64>; -}; diff --git a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt index 078060a97f95..05f705e32a4a 100644 --- a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt +++ b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt @@ -18,6 +18,8 @@ Required properties for all the ethernet interfaces: - First is the Rx interrupt. This irq is mandatory. - Second is the Tx completion interrupt. This is supported only on SGMII based 1GbE and 10GbE interfaces. +- channel: Ethernet to CPU, start channel (prefetch buffer) number + - Must map to the first irq and irqs must be sequential - port-id: Port number (0 or 1) - clocks: Reference to the clock entry. - local-mac-address: MAC address assigned to this device diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt index 5fdbbcdf8c4b..9f4807f90c31 100644 --- a/Documentation/devicetree/bindings/net/dsa/dsa.txt +++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt @@ -31,8 +31,6 @@ A switch child node has the following optional property: switch. Must be set if the switch can not detect the presence and/or size of a connected EEPROM, otherwise optional. -- reset-gpios : phandle and specifier to a gpio line connected to - reset pin of the switch chip. A switch may have multiple "port" children nodes diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt new file mode 100644 index 000000000000..7629189398aa --- /dev/null +++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt @@ -0,0 +1,35 @@ +Marvell DSA Switch Device Tree Bindings +--------------------------------------- + +WARNING: This binding is currently unstable. Do not program it into a +FLASH never to be changed again. Once this binding is stable, this +warning will be removed. + +If you need a stable binding, use the old dsa.txt binding. + +Marvell Switches are MDIO devices. The following properties should be +placed as a child node of an mdio device. + +The properties described here are those specific to Marvell devices. +Additional required and optional properties can be found in dsa.txt. + +Required properties: +- compatible : Should be one of "marvell,mv88e6085", +- reg : Address on the MII bus for the switch. + +Optional properties: + +- reset-gpios : Should be a gpio specifier for a reset line + +Example: + + mdio { + #address-cells = <1>; + #size-cells = <0>; + + switch0: switch@0 { + compatible = "marvell,mv88e6085"; + reg = <0>; + reset-gpios = <&gpio5 1 GPIO_ACTIVE_LOW>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt index ecacfa44b1eb..d4b7f2e49984 100644 --- a/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-dsaf.txt @@ -7,19 +7,45 @@ Required properties: - mode: dsa fabric mode string. only support one of dsaf modes like these: "2port-64vf", "6port-16rss", - "6port-16vf". + "6port-16vf", + "single-port". - interrupt-parent: the interrupt parent of this device. - interrupts: should contain the DSA Fabric and rcb interrupt. - reg: specifies base physical address(es) and size of the device registers. - The first region is external interface control register base and size. - The second region is SerDes base register and size. + The first region is external interface control register base and size(optional, + only used when subctrl-syscon does not exist). It is recommended using + subctrl-syscon rather than this address. + The second region is SerDes base register and size(optional, only used when + serdes-syscon in port node does not exist). It is recommended using + serdes-syscon rather than this address. The third region is the PPE register base and size. - The fourth region is dsa fabric base register and size. - The fifth region is cpld base register and size, it is not required if do not use cpld. -- phy-handle: phy handle of physicl port, 0 if not any phy device. see ethernet.txt [1]. + The fourth region is dsa fabric base register and size. It is not required for + single-port mode. +- reg-names: may be ppe-base and(or) dsaf-base. It is used to find the + corresponding reg's index. + +- phy-handle: phy handle of physical port, 0 if not any phy device. It is optional + attribute. If port node exists, phy-handle in each port node will be used. + see ethernet.txt [1]. +- subctrl-syscon: is syscon handle for external interface control register. +- reset-field-offset: is offset of reset field. Its value depends on the hardware + user manual. - buf-size: rx buffer size, should be 16-1024. - desc-num: number of description in TX and RX queue, should be 512, 1024, 2048 or 4096. +- port: subnodes of dsaf. A dsaf node may contain several port nodes(Depending + on mode of dsaf). Port node contain some attributes listed below: +- reg: is physical port index in one dsaf. +- phy-handle: phy handle of physical port. It is not required if there isn't + phy device. see ethernet.txt [1]. +- serdes-syscon: is syscon handle for SerDes register. +- cpld-syscon: is syscon handle + register offset pair for cpld register. It is + not required if there isn't cpld device. +- port-rst-offset: is offset of reset field for each port in dsaf. Its value + depends on the hardware user manual. +- port-mode-offset: is offset of port mode field for each port in dsaf. Its + value depends on the hardware user manual. + [1] Documentation/devicetree/bindings/net/phy.txt Example: @@ -28,11 +54,11 @@ dsaf0: dsa@c7000000 { compatible = "hisilicon,hns-dsaf-v1"; mode = "6port-16rss"; interrupt-parent = <&mbigen_dsa>; - reg = <0x0 0xC0000000 0x0 0x420000 - 0x0 0xC2000000 0x0 0x300000 - 0x0 0xc5000000 0x0 0x890000 + reg = <0x0 0xc5000000 0x0 0x890000 0x0 0xc7000000 0x0 0x60000>; - phy-handle = <0 0 0 0 &soc0_phy4 &soc0_phy5 0 0>; + reg-names = "ppe-base", "dsaf-base"; + subctrl-syscon = <&subctrl>; + reset-field-offset = 0; interrupts = <131 4>,<132 4>, <133 4>,<134 4>, <135 4>,<136 4>, <137 4>,<138 4>, <139 4>,<140 4>, <141 4>,<142 4>, @@ -43,4 +69,15 @@ dsaf0: dsa@c7000000 { buf-size = <4096>; desc-num = <1024>; dma-coherent; + + port@0 { + reg = 0; + phy-handle = <&phy0>; + serdes-syscon = <&serdes>; + }; + + port@1 { + reg = 1; + serdes-syscon = <&serdes>; + }; }; diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt index e6a9d1c30878..b9ff4ba6454e 100644 --- a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt @@ -36,6 +36,34 @@ Required properties: | | | | | | external port + This attribute is remained for compatible purpose. It is not recommended to + use it in new code. + +- port-idx-in-ae: is the index of port provided by AE. + In NIC mode of DSAF, all 6 PHYs of service DSAF are taken as ethernet ports + to the CPU. The port-idx-in-ae can be 0 to 5. Here is the diagram: + +-----+---------------+ + | CPU | + +-+-+-+---+-+-+-+-+-+-+ + | | | | | | | | + debug debug service + port port port + (0) (0) (0-5) + + In Switch mode of DSAF, all 6 PHYs of service DSAF are taken as physical + ports connected to a LAN Switch while the CPU side assume itself have one + single NIC connected to this switch. In this case, the port-idx-in-ae + will be 0 only. + +-----+-----+------+------+ + | CPU | + +-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | service| port(0) + debug debug +------------+ + port port | switch | + (0) (0) +-+-+-+-+-+-++ + | | | | | | + external port + - local-mac-address: mac addr of the ethernet interface Example: @@ -43,6 +71,6 @@ Example: ethernet@0{ compatible = "hisilicon,hns-nic-v1"; ae-handle = <&dsaf0>; - port-id = <0>; + port-idx-in-ae = <0>; local-mac-address = [a2 14 e4 4b 56 76]; }; diff --git a/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt b/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt new file mode 100644 index 000000000000..14aa6cf58201 --- /dev/null +++ b/Documentation/devicetree/bindings/net/marvell-bt-sd8xxx.txt @@ -0,0 +1,56 @@ +Marvell 8897/8997 (sd8897/sd8997) bluetooth SDIO devices +------ + +Required properties: + + - compatible : should be one of the following: + * "marvell,sd8897-bt" + * "marvell,sd8997-bt" + +Optional properties: + + - marvell,cal-data: Calibration data downloaded to the device during + initialization. This is an array of 28 values(u8). + + - marvell,wakeup-pin: It represents wakeup pin number of the bluetooth chip. + firmware will use the pin to wakeup host system. + - marvell,wakeup-gap-ms: wakeup gap represents wakeup latency of the host + platform. The value will be configured to firmware. This + is needed to work chip's sleep feature as expected. + - interrupt-parent: phandle of the parent interrupt controller + - interrupts : interrupt pin number to the cpu. Driver will request an irq based + on this interrupt number. During system suspend, the irq will be + enabled so that the bluetooth chip can wakeup host platform under + certain condition. During system resume, the irq will be disabled + to make sure unnecessary interrupt is not received. + +Example: + +IRQ pin 119 is used as system wakeup source interrupt. +wakeup pin 13 and gap 100ms are configured so that firmware can wakeup host +using this device side pin and wakeup latency. +calibration data is also available in below example. + +&mmc3 { + status = "okay"; + vmmc-supply = <&wlan_en_reg>; + bus-width = <4>; + cap-power-off-card; + keep-power-in-suspend; + + #address-cells = <1>; + #size-cells = <0>; + btmrvl: bluetooth@2 { + compatible = "marvell,sd8897-bt"; + reg = <2>; + interrupt-parent = <&pio>; + interrupts = <119 IRQ_TYPE_LEVEL_LOW>; + + marvell,cal-data = /bits/ 8 < + 0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02 + 0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00 + 0x00 0x00 0xf0 0x00>; + marvell,wakeup-pin = <0x0d>; + marvell,wakeup-gap-ms = <0x64>; + }; +}; diff --git a/Documentation/devicetree/bindings/net/microchip,enc28j60.txt b/Documentation/devicetree/bindings/net/microchip,enc28j60.txt new file mode 100644 index 000000000000..1dc3bc75539d --- /dev/null +++ b/Documentation/devicetree/bindings/net/microchip,enc28j60.txt @@ -0,0 +1,59 @@ +* Microchip ENC28J60 + +This is a standalone 10 MBit ethernet controller with SPI interface. + +For each device connected to a SPI bus, define a child node within +the SPI master node. + +Required properties: +- compatible: Should be "microchip,enc28j60" +- reg: Specify the SPI chip select the ENC28J60 is wired to +- interrupt-parent: Specify the phandle of the source interrupt, see interrupt + binding documentation for details. Usually this is the GPIO bank + the interrupt line is wired to. +- interrupts: Specify the interrupt index within the interrupt controller (referred + to above in interrupt-parent) and interrupt type. The ENC28J60 natively + generates falling edge interrupts, however, additional board logic + might invert the signal. +- pinctrl-names: List of assigned state names, see pinctrl binding documentation. +- pinctrl-0: List of phandles to configure the GPIO pin used as interrupt line, + see also generic and your platform specific pinctrl binding + documentation. + +Optional properties: +- spi-max-frequency: Maximum frequency of the SPI bus when accessing the ENC28J60. + According to the ENC28J80 datasheet, the chip allows a maximum of 20 MHz, however, + board designs may need to limit this value. +- local-mac-address: See ethernet.txt in the same directory. + + +Example (for NXP i.MX28 with pin control stuff for GPIO irq): + + ssp2: ssp@80014000 { + compatible = "fsl,imx28-spi"; + pinctrl-names = "default"; + pinctrl-0 = <&spi2_pins_b &spi2_sck_cfg>; + status = "okay"; + + enc28j60: ethernet@0 { + compatible = "microchip,enc28j60"; + pinctrl-names = "default"; + pinctrl-0 = <&enc28j60_pins>; + reg = <0>; + interrupt-parent = <&gpio3>; + interrupts = <3 IRQ_TYPE_EDGE_FALLING>; + spi-max-frequency = <12000000>; + }; + }; + + pinctrl@80018000 { + enc28j60_pins: enc28j60_pins@0 { + reg = <0>; + fsl,pinmux-ids = < + MX28_PAD_AUART0_RTS__GPIO_3_3 /* Interrupt */ + >; + fsl,drive-strength = <MXS_DRIVE_4mA>; + fsl,voltage = <MXS_VOLTAGE_HIGH>; + fsl,pull-up = <MXS_PULL_DISABLE>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/nfc/pn533-i2c.txt b/Documentation/devicetree/bindings/net/nfc/pn533-i2c.txt new file mode 100644 index 000000000000..1aea822d4530 --- /dev/null +++ b/Documentation/devicetree/bindings/net/nfc/pn533-i2c.txt @@ -0,0 +1,31 @@ +* NXP Semiconductors PN532 NFC Controller + +Required properties: +- compatible: Should be "nxp,pn532-i2c" or "nxp,pn533-i2c". +- clock-frequency: I²C work frequency. +- reg: address on the bus +- interrupt-parent: phandle for the interrupt gpio controller +- interrupts: GPIO interrupt to which the chip is connected + +Optional SoC Specific Properties: +- pinctrl-names: Contains only one value - "default". +- pintctrl-0: Specifies the pin control groups used for this controller. + +Example (for ARM-based BeagleBone with PN532 on I2C2): + +&i2c2 { + + status = "okay"; + + pn532: pn532@24 { + + compatible = "nxp,pn532-i2c"; + + reg = <0x24>; + clock-frequency = <400000>; + + interrupt-parent = <&gpio1>; + interrupts = <17 IRQ_TYPE_EDGE_FALLING>; + + }; +}; diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt index bc1c3c8bf8fa..c00a9a894547 100644 --- a/Documentation/devicetree/bindings/net/phy.txt +++ b/Documentation/devicetree/bindings/net/phy.txt @@ -35,6 +35,8 @@ Optional Properties: - broken-turn-around: If set, indicates the PHY device does not correctly release the turn around line low at the end of a MDIO transaction. +- reset-gpios: Reference to a GPIO used to reset the phy. + Example: ethernet-phy@0 { @@ -42,4 +44,5 @@ ethernet-phy@0 { interrupt-parent = <40000>; interrupts = <35 1>; reg = <0>; + reset-gpios = <&gpio1 17 GPIO_ACTIVE_LOW>; }; diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt index 6605d19601c2..4d302db657c0 100644 --- a/Documentation/devicetree/bindings/net/stmmac.txt +++ b/Documentation/devicetree/bindings/net/stmmac.txt @@ -59,6 +59,8 @@ Optional properties: - snps,fb: fixed-burst - snps,mb: mixed-burst - snps,rb: rebuild INCRx Burst + - snps,tso: this enables the TSO feature otherwise it will be managed by + MAC HW capability register. - mdio: with compatible = "snps,dwmac-mdio", create and register mdio bus. Examples: diff --git a/Documentation/devicetree/bindings/net/wireless/marvell-sd8xxx.txt b/Documentation/devicetree/bindings/net/wireless/marvell-sd8xxx.txt new file mode 100644 index 000000000000..c421aba0a5bc --- /dev/null +++ b/Documentation/devicetree/bindings/net/wireless/marvell-sd8xxx.txt @@ -0,0 +1,63 @@ +Marvell 8897/8997 (sd8897/sd8997) SDIO devices +------ + +This node provides properties for controlling the marvell sdio wireless device. +The node is expected to be specified as a child node to the SDIO controller that +connects the device to the system. + +Required properties: + + - compatible : should be one of the following: + * "marvell,sd8897" + * "marvell,sd8997" + +Optional properties: + + - marvell,caldata* : A series of properties with marvell,caldata prefix, + represent calibration data downloaded to the device during + initialization. This is an array of unsigned 8-bit values. + the properties should follow below property name and + corresponding array length: + "marvell,caldata-txpwrlimit-2g" (length = 566). + "marvell,caldata-txpwrlimit-5g-sub0" (length = 502). + "marvell,caldata-txpwrlimit-5g-sub1" (length = 688). + "marvell,caldata-txpwrlimit-5g-sub2" (length = 750). + "marvell,caldata-txpwrlimit-5g-sub3" (length = 502). + - marvell,wakeup-pin : a wakeup pin number of wifi chip which will be configured + to firmware. Firmware will wakeup the host using this pin + during suspend/resume. + - interrupt-parent: phandle of the parent interrupt controller + - interrupts : interrupt pin number to the cpu. driver will request an irq based on + this interrupt number. during system suspend, the irq will be enabled + so that the wifi chip can wakeup host platform under certain condition. + during system resume, the irq will be disabled to make sure + unnecessary interrupt is not received. + +Example: + +Tx power limit calibration data is configured in below example. +The calibration data is an array of unsigned values, the length +can vary between hw versions. +IRQ pin 38 is used as system wakeup source interrupt. wakeup pin 3 is configured +so that firmware can wakeup host using this device side pin. + +&mmc3 { + status = "okay"; + vmmc-supply = <&wlan_en_reg>; + bus-width = <4>; + cap-power-off-card; + keep-power-in-suspend; + + #address-cells = <1>; + #size-cells = <0>; + mwifiex: wifi@1 { + compatible = "marvell,sd8897"; + reg = <1>; + interrupt-parent = <&pio>; + interrupts = <38 IRQ_TYPE_LEVEL_LOW>; + + marvell,caldata_00_txpwrlimit_2g_cfg_set = /bits/ 8 < + 0x01 0x00 0x06 0x00 0x08 0x02 0x89 0x01>; + marvell,wakeup-pin = <3>; + }; +}; diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt index 96aae6b4f736..74d7f0af209c 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt @@ -5,12 +5,18 @@ Required properties: * "qcom,ath10k" * "qcom,ipq4019-wifi" -PCI based devices uses compatible string "qcom,ath10k" and takes only -calibration data via "qcom,ath10k-calibration-data". Rest of the properties -are not applicable for PCI based devices. +PCI based devices uses compatible string "qcom,ath10k" and takes calibration +data along with board specific data via "qcom,ath10k-calibration-data". +Rest of the properties are not applicable for PCI based devices. AHB based devices (i.e. ipq4019) uses compatible string "qcom,ipq4019-wifi" -and also uses most of the properties defined in this doc. +and also uses most of the properties defined in this doc (except +"qcom,ath10k-calibration-data"). It uses "qcom,ath10k-pre-calibration-data" +to carry pre calibration data. + +In general, entry "qcom,ath10k-pre-calibration-data" and +"qcom,ath10k-calibration-data" conflict with each other and only one +can be provided per device. Optional properties: - reg: Address and length of the register set for the device. @@ -35,8 +41,11 @@ Optional properties: - qcom,msi_addr: MSI interrupt address. - qcom,msi_base: Base value to add before writing MSI data into MSI address register. -- qcom,ath10k-calibration-data : calibration data as an array, the - length can vary between hw versions +- qcom,ath10k-calibration-data : calibration data + board specific data + as an array, the length can vary between + hw versions. +- qcom,ath10k-pre-calibration-data : pre calibration data as an array, + the length can vary between hw versions. Example (to supply the calibration data alone): @@ -105,5 +114,5 @@ wifi0: wifi@a000000 { "legacy"; qcom,msi_addr = <0x0b006040>; qcom,msi_base = <0x40>; - qcom,ath10k-calibration-data = [ 01 02 03 ... ]; + qcom,ath10k-pre-calibration-data = [ 01 02 03 ... ]; }; diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 334b49ef02d1..57f52cdce32e 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -1880,8 +1880,8 @@ or more peers on the local network. The ARP monitor relies on the device driver itself to verify that traffic is flowing. In particular, the driver must keep up to -date the last receive time, dev->last_rx, and transmit start time, -dev->trans_start. If these are not updated by the driver, then the +date the last receive time, dev->last_rx. Drivers that use NETIF_F_LLTX +flag must also update netdev_queue->trans_start. If they do not, then the ARP monitor will immediately fail any slaves using that driver, and those slaves will stay down. If networking monitoring (tcpdump, etc) shows the ARP requests and replies on the network, then it may be that diff --git a/Documentation/networking/dsa/bcm_sf2.txt b/Documentation/networking/dsa/bcm_sf2.txt index d999d0c1c5b8..eba3a2431e91 100644 --- a/Documentation/networking/dsa/bcm_sf2.txt +++ b/Documentation/networking/dsa/bcm_sf2.txt @@ -38,7 +38,7 @@ Implementation details ====================== The driver is located in drivers/net/dsa/bcm_sf2.c and is implemented as a DSA -driver; see Documentation/networking/dsa/dsa.txt for details on the subsytem +driver; see Documentation/networking/dsa/dsa.txt for details on the subsystem and what it provides. The SF2 switch is configured to enable a Broadcom specific 4-bytes switch tag diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt index 3b196c304b73..631b0f7ae16f 100644 --- a/Documentation/networking/dsa/dsa.txt +++ b/Documentation/networking/dsa/dsa.txt @@ -334,7 +334,7 @@ more specifically with its VLAN filtering portion when configuring VLANs on top of per-port slave network devices. Since DSA primarily deals with MDIO-connected switches, although not exclusively, SWITCHDEV's prepare/abort/commit phases are often simplified into a prepare phase which -checks whether the operation is supporte by the DSA switch driver, and a commit +checks whether the operation is supported by the DSA switch driver, and a commit phase which applies the changes. As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN @@ -533,7 +533,7 @@ Bridge layer out at the switch hardware for the switch to (re) learn MAC addresses behind this port. -- port_stp_update: bridge layer function invoked when a given switch port STP +- port_stp_state_set: bridge layer function invoked when a given switch port STP state is computed by the bridge layer and should be propagated to switch hardware to forward/block/learn traffic. The switch driver is responsible for computing a STP state change based on current and asked parameters and perform @@ -542,6 +542,12 @@ Bridge layer Bridge VLAN filtering --------------------- +- port_vlan_prepare: bridge layer function invoked when the bridge prepares the + configuration of a VLAN on the given port. If the operation is not supported + by the hardware, this function should return -EOPNOTSUPP to inform the bridge + code to fallback to a software implementation. No hardware setup must be done + in this function. See port_vlan_add for this and details. + - port_vlan_add: bridge layer function invoked when a VLAN is configured (tagged or untagged) for the given switch port @@ -552,6 +558,12 @@ Bridge VLAN filtering function that the driver has to call for each VLAN the given port is a member of. A switchdev object is used to carry the VID and bridge flags. +- port_fdb_prepare: bridge layer function invoked when the bridge prepares the + installation of a Forwarding Database entry. If the operation is not + supported, this function should return -EOPNOTSUPP to inform the bridge code + to fallback to a software implementation. No hardware setup must be done in + this function. See port_fdb_add for this and details. + - port_fdb_add: bridge layer function invoked when the bridge wants to install a Forwarding Database entry, the switch hardware should be programmed with the specified address in the specified VLAN Id in the forwarding database @@ -565,6 +577,10 @@ of DSA, would be the its port-based VLAN, used by the associated bridge device. the specified MAC address from the specified VLAN ID if it was mapped into this port forwarding database +- port_fdb_dump: bridge layer function invoked with a switchdev callback + function that the driver has to call for each MAC address known to be behind + the given port. A switchdev object is used to carry the VID and FDB info. + TODO ==== diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt index 96da119a47e7..b9a4edf21ade 100644 --- a/Documentation/networking/filter.txt +++ b/Documentation/networking/filter.txt @@ -216,14 +216,14 @@ opcodes as defined in linux/filter.h stand for: jmp 6 Jump to label ja 6 Jump to label - jeq 7, 8 Jump on k == A - jneq 8 Jump on k != A - jne 8 Jump on k != A - jlt 8 Jump on k < A - jle 8 Jump on k <= A - jgt 7, 8 Jump on k > A - jge 7, 8 Jump on k >= A - jset 7, 8 Jump on k & A + jeq 7, 8 Jump on A == k + jneq 8 Jump on A != k + jne 8 Jump on A != k + jlt 8 Jump on A < k + jle 8 Jump on A <= k + jgt 7, 8 Jump on A > k + jge 7, 8 Jump on A >= k + jset 7, 8 Jump on A & k add 0, 4 A + <x> sub 0, 4 A - <x> @@ -1095,6 +1095,87 @@ all use cases. See details of eBPF verifier in kernel/bpf/verifier.c +Direct packet access +-------------------- +In cls_bpf and act_bpf programs the verifier allows direct access to the packet +data via skb->data and skb->data_end pointers. +Ex: +1: r4 = *(u32 *)(r1 +80) /* load skb->data_end */ +2: r3 = *(u32 *)(r1 +76) /* load skb->data */ +3: r5 = r3 +4: r5 += 14 +5: if r5 > r4 goto pc+16 +R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp +6: r0 = *(u16 *)(r3 +12) /* access 12 and 13 bytes of the packet */ + +this 2byte load from the packet is safe to do, since the program author +did check 'if (skb->data + 14 > skb->data_end) goto err' at insn #5 which +means that in the fall-through case the register R3 (which points to skb->data) +has at least 14 directly accessible bytes. The verifier marks it +as R3=pkt(id=0,off=0,r=14). +id=0 means that no additional variables were added to the register. +off=0 means that no additional constants were added. +r=14 is the range of safe access which means that bytes [R3, R3 + 14) are ok. +Note that R5 is marked as R5=pkt(id=0,off=14,r=14). It also points +to the packet data, but constant 14 was added to the register, so +it now points to 'skb->data + 14' and accessible range is [R5, R5 + 14 - 14) +which is zero bytes. + +More complex packet access may look like: + R0=imm1 R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp + 6: r0 = *(u8 *)(r3 +7) /* load 7th byte from the packet */ + 7: r4 = *(u8 *)(r3 +12) + 8: r4 *= 14 + 9: r3 = *(u32 *)(r1 +76) /* load skb->data */ +10: r3 += r4 +11: r2 = r1 +12: r2 <<= 48 +13: r2 >>= 48 +14: r3 += r2 +15: r2 = r3 +16: r2 += 8 +17: r1 = *(u32 *)(r1 +80) /* load skb->data_end */ +18: if r2 > r1 goto pc+2 + R0=inv56 R1=pkt_end R2=pkt(id=2,off=8,r=8) R3=pkt(id=2,off=0,r=8) R4=inv52 R5=pkt(id=0,off=14,r=14) R10=fp +19: r1 = *(u8 *)(r3 +4) +The state of the register R3 is R3=pkt(id=2,off=0,r=8) +id=2 means that two 'r3 += rX' instructions were seen, so r3 points to some +offset within a packet and since the program author did +'if (r3 + 8 > r1) goto err' at insn #18, the safe range is [R3, R3 + 8). +The verifier only allows 'add' operation on packet registers. Any other +operation will set the register state to 'unknown_value' and it won't be +available for direct packet access. +Operation 'r3 += rX' may overflow and become less than original skb->data, +therefore the verifier has to prevent that. So it tracks the number of +upper zero bits in all 'uknown_value' registers, so when it sees +'r3 += rX' instruction and rX is more than 16-bit value, it will error as: +"cannot add integer value with N upper zero bits to ptr_to_packet" +Ex. after insn 'r4 = *(u8 *)(r3 +12)' (insn #7 above) the state of r4 is +R4=inv56 which means that upper 56 bits on the register are guaranteed +to be zero. After insn 'r4 *= 14' the state becomes R4=inv52, since +multiplying 8-bit value by constant 14 will keep upper 52 bits as zero. +Similarly 'r2 >>= 48' will make R2=inv48, since the shift is not sign +extending. This logic is implemented in evaluate_reg_alu() function. + +The end result is that bpf program author can access packet directly +using normal C code as: + void *data = (void *)(long)skb->data; + void *data_end = (void *)(long)skb->data_end; + struct eth_hdr *eth = data; + struct iphdr *iph = data + sizeof(*eth); + struct udphdr *udp = data + sizeof(*eth) + sizeof(*iph); + + if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end) + return 0; + if (eth->h_proto != htons(ETH_P_IP)) + return 0; + if (iph->protocol != IPPROTO_UDP || iph->ihl != 5) + return 0; + if (udp->dest == 53 || udp->source == 9) + ...; +which makes such programs easier to write comparing to LD_ABS insn +and significantly faster. + eBPF maps --------- 'maps' is a generic storage of different types for sharing data between kernel @@ -1293,5 +1374,5 @@ to give potential BPF hackers or security auditors a better overview of the underlying architecture. Jay Schulist <jschlst@samba.org> -Daniel Borkmann <dborkman@redhat.com> -Alexei Starovoitov <ast@plumgrid.com> +Daniel Borkmann <daniel@iogearbox.net> +Alexei Starovoitov <ast@kernel.org> diff --git a/Documentation/networking/gen_stats.txt b/Documentation/networking/gen_stats.txt index 70e6275b757a..ff630a87b511 100644 --- a/Documentation/networking/gen_stats.txt +++ b/Documentation/networking/gen_stats.txt @@ -33,7 +33,8 @@ my_dumping_routine(struct sk_buff *skb, ...) { struct gnet_dump dump; - if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump) < 0) + if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump, + TCA_PAD) < 0) goto rtattr_failure; if (gnet_stats_copy_basic(&dump, &mystruct->bstats) < 0 || @@ -56,7 +57,8 @@ existing TLV types. my_dumping_routine(struct sk_buff *skb, ...) { if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS, - TCA_XSTATS, &mystruct->lock, &dump) < 0) + TCA_XSTATS, &mystruct->lock, &dump, + TCA_PAD) < 0) goto rtattr_failure; ... } diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index b183e2b606c8..6c7f365b1515 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -63,6 +63,16 @@ fwmark_reflect - BOOLEAN fwmark of the packet they are replying to. Default: 0 +fib_multipath_use_neigh - BOOLEAN + Use status of existing neighbor entry when determining nexthop for + multipath routes. If disabled, neighbor information is not used and + packets could be directed to a failed nexthop. Only valid for kernels + built with CONFIG_IP_ROUTE_MULTIPATH enabled. + Default: 0 (disabled) + Possible values: + 0 - disabled + 1 - enabled + route/max_size - INTEGER Maximum number of routes allowed in the kernel. Increase this when using large numbers of interfaces and/or routes. diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt index ec8f934c2eb2..d58d78df9ca2 100644 --- a/Documentation/networking/mac80211-injection.txt +++ b/Documentation/networking/mac80211-injection.txt @@ -37,14 +37,27 @@ radiotap headers and used to control injection: HT rate for the transmission (only for devices without own rate control). Also some flags are parsed - IEEE80211_TX_RC_SHORT_GI: use short guard interval - IEEE80211_TX_RC_40_MHZ_WIDTH: send in HT40 mode + IEEE80211_RADIOTAP_MCS_SGI: use short guard interval + IEEE80211_RADIOTAP_MCS_BW_40: send in HT40 mode * IEEE80211_RADIOTAP_DATA_RETRIES number of retries when either IEEE80211_RADIOTAP_RATE or IEEE80211_RADIOTAP_MCS was used + * IEEE80211_RADIOTAP_VHT + + VHT mcs and number of streams used in the transmission (only for devices + without own rate control). Also other fields are parsed + + flags field + IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval + + bandwidth field + 1: send using 40MHz channel width + 4: send using 80MHz channel width + 11: send using 160MHz channel width + The injection code can also skip all other currently defined radiotap fields facilitating replay of captured radiotap headers directly. diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt index f310edec8a77..7413eb05223b 100644 --- a/Documentation/networking/netdev-features.txt +++ b/Documentation/networking/netdev-features.txt @@ -131,13 +131,11 @@ stack. Driver should not change behaviour based on them. * LLTX driver (deprecated for hardware drivers) -NETIF_F_LLTX should be set in drivers that implement their own locking in -transmit path or don't need locking at all (e.g. software tunnels). -In ndo_start_xmit, it is recommended to use a try_lock and return -NETDEV_TX_LOCKED when the spin lock fails. The locking should also properly -protect against other callbacks (the rules you need to find out). +NETIF_F_LLTX is meant to be used by drivers that don't need locking at all, +e.g. software tunnels. -Don't use it for new drivers. +This is also used in a few legacy drivers that implement their +own locking, don't use it for new (hardware) drivers. * netns-local device diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index 0b1cf6b2a592..7fec2061a334 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -69,10 +69,9 @@ ndo_start_xmit: When the driver sets NETIF_F_LLTX in dev->features this will be called without holding netif_tx_lock. In this case the driver - has to lock by itself when needed. It is recommended to use a try lock - for this and return NETDEV_TX_LOCKED when the spin lock fails. - The locking there should also properly protect against - set_rx_mode. Note that the use of NETIF_F_LLTX is deprecated. + has to lock by itself when needed. + The locking there should also properly protect against + set_rx_mode. WARNING: use of NETIF_F_LLTX is deprecated. Don't use it for new drivers. Context: Process with BHs disabled or BH (timer), @@ -83,8 +82,6 @@ ndo_start_xmit: o NETDEV_TX_BUSY Cannot transmit packet, try later Usually a bug, means queue start/stop flow control is broken in the driver. Note: the driver must NOT put the skb in its DMA ring. - o NETDEV_TX_LOCKED Locking failed, please retry quickly. - Only valid when NETIF_F_LLTX is set. ndo_tx_timeout: Synchronization: netif_tx_lock spinlock; all TX queues frozen. diff --git a/Documentation/networking/segmentation-offloads.txt b/Documentation/networking/segmentation-offloads.txt new file mode 100644 index 000000000000..f200467ade38 --- /dev/null +++ b/Documentation/networking/segmentation-offloads.txt @@ -0,0 +1,130 @@ +Segmentation Offloads in the Linux Networking Stack + +Introduction +============ + +This document describes a set of techniques in the Linux networking stack +to take advantage of segmentation offload capabilities of various NICs. + +The following technologies are described: + * TCP Segmentation Offload - TSO + * UDP Fragmentation Offload - UFO + * IPIP, SIT, GRE, and UDP Tunnel Offloads + * Generic Segmentation Offload - GSO + * Generic Receive Offload - GRO + * Partial Generic Segmentation Offload - GSO_PARTIAL + +TCP Segmentation Offload +======================== + +TCP segmentation allows a device to segment a single frame into multiple +frames with a data payload size specified in skb_shinfo()->gso_size. +When TCP segmentation requested the bit for either SKB_GSO_TCP or +SKB_GSO_TCP6 should be set in skb_shinfo()->gso_type and +skb_shinfo()->gso_size should be set to a non-zero value. + +TCP segmentation is dependent on support for the use of partial checksum +offload. For this reason TSO is normally disabled if the Tx checksum +offload for a given device is disabled. + +In order to support TCP segmentation offload it is necessary to populate +the network and transport header offsets of the skbuff so that the device +drivers will be able determine the offsets of the IP or IPv6 header and the +TCP header. In addition as CHECKSUM_PARTIAL is required csum_start should +also point to the TCP header of the packet. + +For IPv4 segmentation we support one of two types in terms of the IP ID. +The default behavior is to increment the IP ID with every segment. If the +GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP +ID and all segments will use the same IP ID. If a device has +NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO +and we will either increment the IP ID for all frames, or leave it at a +static value based on driver preference. + +UDP Fragmentation Offload +========================= + +UDP fragmentation offload allows a device to fragment an oversized UDP +datagram into multiple IPv4 fragments. Many of the requirements for UDP +fragmentation offload are the same as TSO. However the IPv4 ID for +fragments should not increment as a single IPv4 datagram is fragmented. + +IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads +======================================================== + +In addition to the offloads described above it is possible for a frame to +contain additional headers such as an outer tunnel. In order to account +for such instances an additional set of segmentation offload types were +introduced including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and +SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify +cases where there are more than just 1 set of headers. For example in the +case of IPIP and SIT we should have the network and transport headers moved +from the standard list of headers to "inner" header offsets. + +Currently only two levels of headers are supported. The convention is to +refer to the tunnel headers as the outer headers, while the encapsulated +data is normally referred to as the inner headers. Below is the list of +calls to access the given headers: + +IPIP/SIT Tunnel: + Outer Inner +MAC skb_mac_header +Network skb_network_header skb_inner_network_header +Transport skb_transport_header + +UDP/GRE Tunnel: + Outer Inner +MAC skb_mac_header skb_inner_mac_header +Network skb_network_header skb_inner_network_header +Transport skb_transport_header skb_inner_transport_header + +In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and +SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types reflect the +fact that the outer header also requests to have a non-zero checksum +included in the outer header. + +Finally there is SKB_GSO_REMCSUM which indicates that a given tunnel header +has requested a remote checksum offload. In this case the inner headers +will be left with a partial checksum and only the outer header checksum +will be computed. + +Generic Segmentation Offload +============================ + +Generic segmentation offload is a pure software offload that is meant to +deal with cases where device drivers cannot perform the offloads described +above. What occurs in GSO is that a given skbuff will have its data broken +out over multiple skbuffs that have been resized to match the MSS provided +via skb_shinfo()->gso_size. + +Before enabling any hardware segmentation offload a corresponding software +offload is required in GSO. Otherwise it becomes possible for a frame to +be re-routed between devices and end up being unable to be transmitted. + +Generic Receive Offload +======================= + +Generic receive offload is the complement to GSO. Ideally any frame +assembled by GRO should be segmented to create an identical sequence of +frames using GSO, and any sequence of frames segmented by GSO should be +able to be reassembled back to the original by GRO. The only exception to +this is IPv4 ID in the case that the DF bit is set for a given IP header. +If the value of the IPv4 ID is not sequentially incrementing it will be +altered so that it is when a frame assembled via GRO is segmented via GSO. + +Partial Generic Segmentation Offload +==================================== + +Partial generic segmentation offload is a hybrid between TSO and GSO. What +it effectively does is take advantage of certain traits of TCP and tunnels +so that instead of having to rewrite the packet headers for each segment +only the inner-most transport header and possibly the outer-most network +header need to be updated. This allows devices that do not support tunnel +offloads or tunnel offloads with checksum to still make use of segmentation. + +With the partial offload what occurs is that all headers excluding the +inner transport header are updated such that they will contain the correct +values for if the header was simply duplicated. The one exception to this +is the outer IPv4 ID field. It is up to the device drivers to guarantee +that the IPv4 ID field is incremented in the case that a given header does +not have the DF bit set. diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt index d64a14714236..671fe3dd56d3 100644 --- a/Documentation/networking/stmmac.txt +++ b/Documentation/networking/stmmac.txt @@ -1,6 +1,6 @@ STMicroelectronics 10/100/1000 Synopsys Ethernet driver -Copyright (C) 2007-2014 STMicroelectronics Ltd +Copyright (C) 2007-2015 STMicroelectronics Ltd Author: Giuseppe Cavallaro <peppe.cavallaro@st.com> This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers @@ -138,6 +138,8 @@ struct plat_stmmacenet_data { int (*init)(struct platform_device *pdev, void *priv); void (*exit)(struct platform_device *pdev, void *priv); void *bsp_priv; + int has_gmac4; + bool tso_en; }; Where: @@ -181,6 +183,8 @@ Where: registers. init/exit callbacks should not use or modify platform data. o bsp_priv: another private pointer. + o has_gmac4: uses GMAC4 core. + o tso_en: Enables TSO (TCP Segmentation Offload) feature. For MDIO bus The we have: @@ -278,6 +282,13 @@ Please see the following document: o stmmac_ethtool.c: to implement the ethtool support; o stmmac.h: private driver structure; o common.h: common definitions and VFTs; + o mmc_core.c/mmc.h: Management MAC Counters; + o stmmac_hwtstamp.c: HW timestamp support for PTP; + o stmmac_ptp.c: PTP 1588 clock; + o dwmac-<XXX>.c: these are for the platform glue-logic file; e.g. dwmac-sti.c + for STMicroelectronics SoCs. + +- GMAC 3.x o descs.h: descriptor structure definitions; o dwmac1000_core.c: dwmac GiGa core functions; o dwmac1000_dma.c: dma functions for the GMAC chip; @@ -289,11 +300,32 @@ Please see the following document: o enh_desc.c: functions for handling enhanced descriptors; o norm_desc.c: functions for handling normal descriptors; o chain_mode.c/ring_mode.c:: functions to manage RING/CHAINED modes; - o mmc_core.c/mmc.h: Management MAC Counters; - o stmmac_hwtstamp.c: HW timestamp support for PTP; - o stmmac_ptp.c: PTP 1588 clock; - o dwmac-<XXX>.c: these are for the platform glue-logic file; e.g. dwmac-sti.c - for STMicroelectronics SoCs. + +- GMAC4.x generation + o dwmac4_core.c: dwmac GMAC4.x core functions; + o dwmac4_desc.c: functions for handling GMAC4.x descriptors; + o dwmac4_descs.h: descriptor definitions; + o dwmac4_dma.c: dma functions for the GMAC4.x chip; + o dwmac4_dma.h: dma definitions for the GMAC4.x chip; + o dwmac4.h: core definitions for the GMAC4.x chip; + o dwmac4_lib.c: generic GMAC4.x functions; + +4.12) TSO support (GMAC4.x) + +TSO (Tcp Segmentation Offload) feature is supported by GMAC 4.x chip family. +When a packet is sent through TCP protocol, the TCP stack ensures that +the SKB provided to the low level driver (stmmac in our case) matches with +the maximum frame len (IP header + TCP header + payload <= 1500 bytes (for +MTU set to 1500)). It means that if an application using TCP want to send a +packet which will have a length (after adding headers) > 1514 the packet +will be split in several TCP packets: The data payload is split and headers +(TCP/IP ..) are added. It is done by software. + +When TSO is enabled, the TCP stack doesn't care about the maximum frame +length and provide SKB packet to stmmac as it is. The GMAC IP will have to +perform the segmentation by it self to match with maximum frame length. + +This feature can be enabled in device tree through "snps,tso" entry. 5) Debug Information diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt index 2f659129694b..31c39115834d 100644 --- a/Documentation/networking/switchdev.txt +++ b/Documentation/networking/switchdev.txt @@ -89,6 +89,18 @@ Typically, the management port is not participating in offloaded data plane and is loaded with a different driver, such as a NIC driver, on the management port device. +Switch ID +^^^^^^^^^ + +The switchdev driver must implement the switchdev op switchdev_port_attr_get +for SWITCHDEV_ATTR_ID_PORT_PARENT_ID for each port netdev, returning the same +physical ID for each port of a switch. The ID must be unique between switches +on the same system. The ID does not need to be unique between switches on +different systems. + +The switch ID is used to locate ports on a switch and to know if aggregated +ports belong to the same switch. + Port Netdev Naming ^^^^^^^^^^^^^^^^^^ @@ -104,25 +116,13 @@ external configuration. For example, if a physical 40G port is split logically into 4 10G ports, resulting in 4 port netdevs, the device can give a unique name for each port using port PHYS name. The udev rule would be: -SUBSYSTEM=="net", ACTION=="add", DRIVER="<driver>", ATTR{phys_port_name}!="", \ - NAME="$attr{phys_port_name}" +SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \ + ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}" Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0 would be sub-port 0 on port 1 on switch 1. -Switch ID -^^^^^^^^^ - -The switchdev driver must implement the switchdev op switchdev_port_attr_get -for SWITCHDEV_ATTR_ID_PORT_PARENT_ID for each port netdev, returning the same -physical ID for each port of a switch. The ID must be unique between switches -on the same system. The ID does not need to be unique between switches on -different systems. - -The switch ID is used to locate ports on a switch and to know if aggregated -ports belong to the same switch. - Port Features ^^^^^^^^^^^^^ diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt index a977339fbe0a..671cccf0dcd2 100644 --- a/Documentation/networking/timestamping.txt +++ b/Documentation/networking/timestamping.txt @@ -44,11 +44,17 @@ timeval of SO_TIMESTAMP (ms). Supports multiple types of timestamp requests. As a result, this socket option takes a bitmap of flags, not a boolean. In - err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val, &val); + err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val, + sizeof(val)); val is an integer with any of the following bits set. Setting other bit returns EINVAL and does not change the current state. +The socket option configures timestamp generation for individual +sk_buffs (1.3.1), timestamp reporting to the socket's error +queue (1.3.2) and options (1.3.3). Timestamp generation can also +be enabled for individual sendmsg calls using cmsg (1.3.4). + 1.3.1 Timestamp Generation @@ -71,13 +77,16 @@ SOF_TIMESTAMPING_RX_SOFTWARE: kernel receive stack. SOF_TIMESTAMPING_TX_HARDWARE: - Request tx timestamps generated by the network adapter. + Request tx timestamps generated by the network adapter. This flag + can be enabled via both socket options and control messages. SOF_TIMESTAMPING_TX_SOFTWARE: Request tx timestamps when data leaves the kernel. These timestamps are generated in the device driver as close as possible, but always prior to, passing the packet to the network interface. Hence, they require driver support and may not be available for all devices. + This flag can be enabled via both socket options and control messages. + SOF_TIMESTAMPING_TX_SCHED: Request tx timestamps prior to entering the packet scheduler. Kernel @@ -90,7 +99,8 @@ SOF_TIMESTAMPING_TX_SCHED: machines with virtual devices where a transmitted packet travels through multiple devices and, hence, multiple packet schedulers, a timestamp is generated at each layer. This allows for fine - grained measurement of queuing delay. + grained measurement of queuing delay. This flag can be enabled + via both socket options and control messages. SOF_TIMESTAMPING_TX_ACK: Request tx timestamps when all data in the send buffer has been @@ -99,6 +109,7 @@ SOF_TIMESTAMPING_TX_ACK: over-report measurement, because the timestamp is generated when all data up to and including the buffer at send() was acknowledged: the cumulative acknowledgment. The mechanism ignores SACK and FACK. + This flag can be enabled via both socket options and control messages. 1.3.2 Timestamp Reporting @@ -183,6 +194,37 @@ having access to the contents of the original packet, so cannot be combined with SOF_TIMESTAMPING_OPT_TSONLY. +1.3.4. Enabling timestamps via control messages + +In addition to socket options, timestamp generation can be requested +per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1). +Using this feature, applications can sample timestamps per sendmsg() +without paying the overhead of enabling and disabling timestamps via +setsockopt: + + struct msghdr *msg; + ... + cmsg = CMSG_FIRSTHDR(msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SO_TIMESTAMPING; + cmsg->cmsg_len = CMSG_LEN(sizeof(__u32)); + *((__u32 *) CMSG_DATA(cmsg)) = SOF_TIMESTAMPING_TX_SCHED | + SOF_TIMESTAMPING_TX_SOFTWARE | + SOF_TIMESTAMPING_TX_ACK; + err = sendmsg(fd, msg, 0); + +The SOF_TIMESTAMPING_TX_* flags set via cmsg will override +the SOF_TIMESTAMPING_TX_* flags set via setsockopt. + +Moreover, applications must still enable timestamp reporting via +setsockopt to receive timestamps: + + __u32 val = SOF_TIMESTAMPING_SOFTWARE | + SOF_TIMESTAMPING_OPT_ID /* or any other flag */; + err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val, + sizeof(val)); + + 1.4 Bytestream Timestamps The SO_TIMESTAMPING interface supports timestamping of bytes in a diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt index 809ab6efcc74..f0480f7ea740 100644 --- a/Documentation/sysctl/net.txt +++ b/Documentation/sysctl/net.txt @@ -43,6 +43,17 @@ Values : 1 - enable the JIT 2 - enable the JIT and ask the compiler to emit traces on kernel log. +bpf_jit_harden +-------------- + +This enables hardening for the Berkeley Packet Filter Just in Time compiler. +Supported are eBPF JIT backends. Enabling hardening trades off performance, +but can mitigate JIT spraying. +Values : + 0 - disable JIT hardening (default value) + 1 - enable JIT hardening for unprivileged users only + 2 - enable JIT hardening for all users + dev_weight -------------- |