diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2015-04-15 09:00:47 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2015-04-15 09:00:47 -0700 |
commit | 6c373ca89399c5a3f7ef210ad8f63dc3437da345 (patch) | |
tree | 74d1ec65087df1da1021b43ac51acc1ee8601809 /Documentation | |
parent | bb0fd7ab0986105765d11baa82e619c618a235aa (diff) | |
parent | 9f9151412dd7aae0e3f51a89ae4a1f8755fdb4d0 (diff) | |
download | lwn-6c373ca89399c5a3f7ef210ad8f63dc3437da345.tar.gz lwn-6c373ca89399c5a3f7ef210ad8f63dc3437da345.zip |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Add BQL support to via-rhine, from Tino Reichardt.
2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
can support hw switch offloading. From Floria Fainelli.
3) Allow 'ip address' commands to initiate multicast group join/leave,
from Madhu Challa.
4) Many ipv4 FIB lookup optimizations from Alexander Duyck.
5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
Borkmann.
6) Remove the ugly compat support in ARP for ugly layers like ax25,
rose, etc. And use this to clean up the neigh layer, then use it to
implement MPLS support. All from Eric Biederman.
7) Support L3 forwarding offloading in switches, from Scott Feldman.
8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
up route lookups even further. From Alexander Duyck.
9) Many improvements and bug fixes to the rhashtable implementation,
from Herbert Xu and Thomas Graf. In particular, in the case where
an rhashtable user bulk adds a large number of items into an empty
table, we expand the table much more sanely.
10) Don't make the tcp_metrics hash table per-namespace, from Eric
Biederman.
11) Extend EBPF to access SKB fields, from Alexei Starovoitov.
12) Split out new connection request sockets so that they can be
established in the main hash table. Much less false sharing since
hash lookups go direct to the request sockets instead of having to
go first to the listener then to the request socks hashed
underneath. From Eric Dumazet.
13) Add async I/O support for crytpo AF_ALG sockets, from Tadeusz Struk.
14) Support stable privacy address generation for RFC7217 in IPV6. From
Hannes Frederic Sowa.
15) Hash network namespace into IP frag IDs, also from Hannes Frederic
Sowa.
16) Convert PTP get/set methods to use 64-bit time, from Richard
Cochran.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
fm10k: Bump driver version to 0.15.2
fm10k: corrected VF multicast update
fm10k: mbx_update_max_size does not drop all oversized messages
fm10k: reset head instead of calling update_max_size
fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
fm10k: update xcast mode before synchronizing multicast addresses
fm10k: start service timer on probe
fm10k: fix function header comment
fm10k: comment next_vf_mbx flow
fm10k: don't handle mailbox events in iov_event path and always process mailbox
fm10k: use separate workqueue for fm10k driver
fm10k: Set PF queues to unlimited bandwidth during virtualization
fm10k: expose tx_timeout_count as an ethtool stat
fm10k: only increment tx_timeout_count in Tx hang path
fm10k: remove extraneous "Reset interface" message
fm10k: separate PF only stats so that VF does not display them
fm10k: use hw->mac.max_queues for stats
fm10k: only show actual queues, not the maximum in hardware
fm10k: allow creation of VLAN on default vid
fm10k: fix unused warnings
...
Diffstat (limited to 'Documentation')
23 files changed, 283 insertions, 73 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net index beb8ec4dabbc..5ecfd72ba684 100644 --- a/Documentation/ABI/testing/sysfs-class-net +++ b/Documentation/ABI/testing/sysfs-class-net @@ -188,6 +188,14 @@ Description: Indicates the interface unique physical port identifier within the NIC, as a string. +What: /sys/class/net/<iface>/phys_port_name +Date: March 2015 +KernelVersion: 4.0 +Contact: netdev@vger.kernel.org +Description: + Indicates the interface physical port name within the NIC, + as a string. + What: /sys/class/net/<iface>/speed Date: October 2009 KernelVersion: 2.6.33 diff --git a/Documentation/ABI/testing/sysfs-class-net-queues b/Documentation/ABI/testing/sysfs-class-net-queues index 5e9aeb91d355..0c0df91b1516 100644 --- a/Documentation/ABI/testing/sysfs-class-net-queues +++ b/Documentation/ABI/testing/sysfs-class-net-queues @@ -24,6 +24,14 @@ Description: Indicates the number of transmit timeout events seen by this network interface transmit queue. +What: /sys/class/<iface>/queues/tx-<queue>/tx_maxrate +Date: March 2015 +KernelVersion: 4.1 +Contact: netdev@vger.kernel.org +Description: + A Mbps max-rate set for the queue, a value of zero means disabled, + default is disabled. + What: /sys/class/<iface>/queues/tx-<queue>/xps_cpus Date: November 2010 KernelVersion: 2.6.38 diff --git a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt index 6151999c5dca..f55aa280d34f 100644 --- a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt +++ b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt @@ -14,7 +14,11 @@ Required properties for all the ethernet interfaces: - "enet_csr": Ethernet control and status register address space - "ring_csr": Descriptor ring control and status register address space - "ring_cmd": Descriptor ring command register address space -- interrupts: Ethernet main interrupt +- interrupts: Two interrupt specifiers can be specified. + - First is the Rx interrupt. This irq is mandatory. + - Second is the Tx completion interrupt. + This is supported only on SGMII based 1GbE and 10GbE interfaces. +- port-id: Port number (0 or 1) - clocks: Reference to the clock entry. - local-mac-address: MAC address assigned to this device - phy-connection-type: Interface type between ethernet device and PHY device @@ -49,6 +53,7 @@ Example: <0x0 0X10000000 0x0 0X200>; reg-names = "enet_csr", "ring_csr", "ring_cmd"; interrupts = <0x0 0x3c 0x4>; + port-id = <0>; clocks = <&menetclk 0>; local-mac-address = [00 01 73 00 00 01]; phy-connection-type = "rgmii"; diff --git a/Documentation/devicetree/bindings/net/ieee802154/at86rf230.txt b/Documentation/devicetree/bindings/net/ieee802154/at86rf230.txt index d3bbdded4cbe..168f1be50912 100644 --- a/Documentation/devicetree/bindings/net/ieee802154/at86rf230.txt +++ b/Documentation/devicetree/bindings/net/ieee802154/at86rf230.txt @@ -6,11 +6,14 @@ Required properties: - spi-max-frequency: maximal bus speed, should be set to 7500000 depends sync or async operation mode - reg: the chipselect index - - interrupts: the interrupt generated by the device + - interrupts: the interrupt generated by the device. Non high-level + can occur deadlocks while handling isr. Optional properties: - reset-gpio: GPIO spec for the rstn pin - sleep-gpio: GPIO spec for the slp_tr pin + - xtal-trim: u8 value for fine tuning the internal capacitance + arrays of xtal pins: 0 = +0 pF, 0xf = +4.5 pF Example: @@ -18,6 +21,7 @@ Example: compatible = "atmel,at86rf231"; spi-max-frequency = <7500000>; reg = <0>; - interrupts = <19 1>; + interrupts = <19 4>; interrupt-parent = <&gpio3>; + xtal-trim = /bits/ 8 <0x06>; }; diff --git a/Documentation/devicetree/bindings/net/ieee802154/cc2520.txt b/Documentation/devicetree/bindings/net/ieee802154/cc2520.txt index 0071883c08d8..fb6d49f184ed 100644 --- a/Documentation/devicetree/bindings/net/ieee802154/cc2520.txt +++ b/Documentation/devicetree/bindings/net/ieee802154/cc2520.txt @@ -13,11 +13,15 @@ Required properties: - cca-gpio: GPIO spec for the CCA pin - vreg-gpio: GPIO spec for the VREG pin - reset-gpio: GPIO spec for the RESET pin +Optional properties: + - amplified: include if the CC2520 is connected to a CC2591 amplifier + Example: cc2520@0 { compatible = "ti,cc2520"; reg = <0>; spi-max-frequency = <4000000>; + amplified; pinctrl-names = "default"; pinctrl-0 = <&cc2520_cape_pins>; fifo-gpio = <&gpio1 18 0>; diff --git a/Documentation/devicetree/bindings/net/keystone-netcp.txt b/Documentation/devicetree/bindings/net/keystone-netcp.txt index f9c07710478d..d0e6fa38f335 100644 --- a/Documentation/devicetree/bindings/net/keystone-netcp.txt +++ b/Documentation/devicetree/bindings/net/keystone-netcp.txt @@ -49,6 +49,7 @@ Required properties: - compatible: Should be "ti,netcp-1.0" - clocks: phandle to the reference clocks for the subsystem. - dma-id: Navigator packet dma instance id. +- ranges: address range of NetCP (includes, Ethernet SS, PA and SA) Optional properties: - reg: register location and the size for the following register @@ -64,10 +65,30 @@ NetCP device properties: Device specification for NetCP sub-modules. 1Gb/10Gb (gbe/xgbe) ethernet switch sub-module specifications. Required properties: - label: Must be "netcp-gbe" for 1Gb & "netcp-xgbe" for 10Gb. +- compatible: Must be one of below:- + "ti,netcp-gbe" for 1GbE on NetCP 1.4 + "ti,netcp-gbe-5" for 1GbE N NetCP 1.5 (N=5) + "ti,netcp-gbe-9" for 1GbE N NetCP 1.5 (N=9) + "ti,netcp-gbe-2" for 1GbE N NetCP 1.5 (N=2) + "ti,netcp-xgbe" for 10 GbE + - reg: register location and the size for the following register regions in the specified order. - - subsystem registers - - serdes registers + - switch subsystem registers + - sgmii port3/4 module registers (only for NetCP 1.4) + - switch module registers + - serdes registers (only for 10G) + + NetCP 1.4 ethss, here is the order + index #0 - switch subsystem registers + index #1 - sgmii port3/4 module registers + index #2 - switch module registers + + NetCP 1.5 ethss 9 port, 5 port and 2 port + index #0 - switch subsystem registers + index #1 - switch module registers + index #2 - serdes registers + - tx-channel: the navigator packet dma channel name for tx. - tx-queue: the navigator queue number associated with the tx dma channel. - interfaces: specification for each of the switch port to be registered as a @@ -120,14 +141,13 @@ Optional properties: Example binding: -netcp: netcp@2090000 { +netcp: netcp@2000000 { reg = <0x2620110 0x8>; reg-names = "efuse"; compatible = "ti,netcp-1.0"; #address-cells = <1>; #size-cells = <1>; - ranges; - + ranges = <0 0x2000000 0xfffff>; clocks = <&papllclk>, <&clkcpgmac>, <&chipclk12>; dma-coherent; /* big-endian; */ @@ -137,9 +157,9 @@ netcp: netcp@2090000 { #address-cells = <1>; #size-cells = <1>; ranges; - gbe@0x2090000 { + gbe@90000 { label = "netcp-gbe"; - reg = <0x2090000 0xf00>; + reg = <0x90000 0x300>, <0x90400 0x400>, <0x90800 0x700>; /* enable-ale; */ tx-queue = <648>; tx-channel = <8>; diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt index aaa696414f57..ba19d671e808 100644 --- a/Documentation/devicetree/bindings/net/macb.txt +++ b/Documentation/devicetree/bindings/net/macb.txt @@ -2,10 +2,13 @@ Required properties: - compatible: Should be "cdns,[<chip>-]{macb|gem}" - Use "cdns,at91sam9260-macb" Atmel at91sam9260 and at91sam9263 SoCs. + Use "cdns,at91sam9260-macb" for Atmel at91sam9 SoCs or the 10/100Mbit IP + available on sama5d3 SoCs. Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: "cdns,macb". Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based on the Cadence GEM, or the generic form: "cdns,gem". + Use "cdns,sama5d3-gem" for the Gigabit IP available on Atmel sama5d3 SoCs. + Use "cdns,sama5d4-gem" for the Gigabit IP available on Atmel sama5d4 SoCs. - reg: Address and length of the register set for the device - interrupts: Should contain macb interrupt - phy-mode: See ethernet.txt file in the same directory. diff --git a/Documentation/devicetree/bindings/net/nfc/nxp-nci.txt b/Documentation/devicetree/bindings/net/nfc/nxp-nci.txt new file mode 100644 index 000000000000..5b6cd9b3f628 --- /dev/null +++ b/Documentation/devicetree/bindings/net/nfc/nxp-nci.txt @@ -0,0 +1,35 @@ +* NXP Semiconductors NXP NCI NFC Controllers + +Required properties: +- compatible: Should be "nxp,nxp-nci-i2c". +- clock-frequency: I²C work frequency. +- reg: address on the bus +- interrupt-parent: phandle for the interrupt gpio controller +- interrupts: GPIO interrupt to which the chip is connected +- enable-gpios: Output GPIO pin used for enabling/disabling the chip +- firmware-gpios: Output GPIO pin used to enter firmware download mode + +Optional SoC Specific Properties: +- pinctrl-names: Contains only one value - "default". +- pintctrl-0: Specifies the pin control groups used for this controller. + +Example (for ARM-based BeagleBone with NPC100 NFC controller on I2C2): + +&i2c2 { + + status = "okay"; + + npc100: npc100@29 { + + compatible = "nxp,nxp-nci-i2c"; + + reg = <0x29>; + clock-frequency = <100000>; + + interrupt-parent = <&gpio1>; + interrupts = <29 GPIO_ACTIVE_HIGH>; + + enable-gpios = <&gpio0 30 GPIO_ACTIVE_HIGH>; + firmware-gpios = <&gpio0 31 GPIO_ACTIVE_HIGH>; + }; +}; diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt index 8ca65cec52ae..29aca8591b16 100644 --- a/Documentation/devicetree/bindings/net/stmmac.txt +++ b/Documentation/devicetree/bindings/net/stmmac.txt @@ -35,10 +35,11 @@ Optional properties: - reset-names: Should contain the reset signal name "stmmaceth", if a reset phandle is given - max-frame-size: See ethernet.txt file in the same directory -- clocks: If present, the first clock should be the GMAC main clock, - further clocks may be specified in derived bindings. +- clocks: If present, the first clock should be the GMAC main clock and + the second clock should be peripheral's register interface clock. Further + clocks may be specified in derived bindings. - clock-names: One name for each entry in the clocks property, the - first one should be "stmmaceth". + first one should be "stmmaceth" and the second one should be "pclk". - clk_ptp_ref: this is the PTP reference clock; in case of the PTP is available this clock is used for programming the Timestamp Addend Register. If not passed then the system clock will be used and this is fine on some diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt index 0a2859a8ee7e..5abad1e921ca 100644 --- a/Documentation/networking/can.txt +++ b/Documentation/networking/can.txt @@ -22,7 +22,8 @@ This file contains 4.1.3 RAW socket option CAN_RAW_LOOPBACK 4.1.4 RAW socket option CAN_RAW_RECV_OWN_MSGS 4.1.5 RAW socket option CAN_RAW_FD_FRAMES - 4.1.6 RAW socket returned message flags + 4.1.6 RAW socket option CAN_RAW_JOIN_FILTERS + 4.1.7 RAW socket returned message flags 4.2 Broadcast Manager protocol sockets (SOCK_DGRAM) 4.2.1 Broadcast Manager operations 4.2.2 Broadcast Manager message flags @@ -601,7 +602,22 @@ solution for a couple of reasons: CAN FD frames by checking if the device maximum transfer unit is CANFD_MTU. The CAN device MTU can be retrieved e.g. with a SIOCGIFMTU ioctl() syscall. - 4.1.6 RAW socket returned message flags + 4.1.6 RAW socket option CAN_RAW_JOIN_FILTERS + + The CAN_RAW socket can set multiple CAN identifier specific filters that + lead to multiple filters in the af_can.c filter processing. These filters + are indenpendent from each other which leads to logical OR'ed filters when + applied (see 4.1.1). + + This socket option joines the given CAN filters in the way that only CAN + frames are passed to user space that matched *all* given CAN filters. The + semantic for the applied filters is therefore changed to a logical AND. + + This is useful especially when the filterset is a combination of filters + where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or + CAN ID ranges from the incoming traffic. + + 4.1.7 RAW socket returned message flags When using recvmsg() call, the msg->msg_flags may contain following flags: diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt index 9930ecfbb465..135581f015e1 100644 --- a/Documentation/networking/filter.txt +++ b/Documentation/networking/filter.txt @@ -280,7 +280,8 @@ Possible BPF extensions are shown in the following table: rxhash skb->hash cpu raw_smp_processor_id() vlan_tci skb_vlan_tag_get(skb) - vlan_pr skb_vlan_tag_present(skb) + vlan_avail skb_vlan_tag_present(skb) + vlan_tpid skb->vlan_proto rand prandom_u32() These extensions can also be prefixed with '#'. diff --git a/Documentation/networking/igb.txt b/Documentation/networking/igb.txt index 43d3549366a0..15534fdd09a8 100644 --- a/Documentation/networking/igb.txt +++ b/Documentation/networking/igb.txt @@ -42,10 +42,10 @@ Additional Configurations Jumbo Frames ------------ Jumbo Frames support is enabled by changing the MTU to a value larger than - the default of 1500. Use the ifconfig command to increase the MTU size. + the default of 1500. Use the ip command to increase the MTU size. For example: - ifconfig eth<x> mtu 9000 up + ip link set dev eth<x> mtu 9000 This setting is not saved across reboots. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 1b8c964b0d17..071fb18dc57c 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -388,6 +388,16 @@ tcp_mtu_probing - INTEGER 1 - Disabled by default, enabled when an ICMP black hole detected 2 - Always enabled, use initial MSS of tcp_base_mss. +tcp_probe_interval - INTEGER + Controls how often to start TCP Packetization-Layer Path MTU + Discovery reprobe. The default is reprobing every 10 minutes as + per RFC4821. + +tcp_probe_threshold - INTEGER + Controls when TCP Packetization-Layer Path MTU Discovery probing + will stop in respect to the width of search range in bytes. Default + is 8 bytes. + tcp_no_metrics_save - BOOLEAN By default, TCP saves various connection metrics in the route cache when the connection closes, so that connections established in the @@ -1116,11 +1126,23 @@ arp_accept - BOOLEAN gratuitous arp frame, the arp table will be updated regardless if this setting is on or off. +mcast_solicit - INTEGER + The maximum number of multicast probes in INCOMPLETE state, + when the associated hardware address is unknown. Defaults + to 3. + +ucast_solicit - INTEGER + The maximum number of unicast probes in PROBE state, when + the hardware address is being reconfirmed. Defaults to 3. app_solicit - INTEGER The maximum number of probes to send to the user space ARP daemon via netlink before dropping back to multicast probes (see - mcast_solicit). Defaults to 0. + mcast_resolicit). Defaults to 0. + +mcast_resolicit - INTEGER + The maximum number of multicast probes after unicast and + app probes in PROBE state. Defaults to 0. disable_policy - BOOLEAN Disable IPSEC policy (SPD) for this interface @@ -1198,6 +1220,17 @@ anycast_src_echo_reply - BOOLEAN FALSE: disabled Default: FALSE +idgen_delay - INTEGER + Controls the delay in seconds after which time to retry + privacy stable address generation if a DAD conflict is + detected. + Default: 1 (as specified in RFC7217) + +idgen_retries - INTEGER + Controls the number of retries to generate a stable privacy + address if a DAD conflict is detected. + Default: 3 (as specified in RFC7217) + mld_qrv - INTEGER Controls the MLD query robustness variable (see RFC3810 9.1). Default: 2 (as specified by RFC3810 9.1) @@ -1518,6 +1551,20 @@ use_optimistic - BOOLEAN 0: disabled (default) 1: enabled +stable_secret - IPv6 address + This IPv6 address will be used as a secret to generate IPv6 + addresses for link-local addresses and autoconfigured + ones. All addresses generated after setting this secret will + be stable privacy ones by default. This can be changed via the + addrgenmode ip-link. conf/default/stable_secret is used as the + secret for the namespace, the interface specific ones can + overwrite that. Writes to conf/all/stable_secret are refused. + + It is recommended to generate this secret during installation + of a system and keep it stable after that. + + By default the stable secret is unset. + icmp/*: ratelimit - INTEGER Limit the maximal rates for sending ICMPv6 packets. diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt index 7a3c04729591..3ba709531adb 100644 --- a/Documentation/networking/ipvs-sysctl.txt +++ b/Documentation/networking/ipvs-sysctl.txt @@ -22,6 +22,27 @@ backup_only - BOOLEAN If set, disable the director function while the server is in backup mode to avoid packet loops for DR/TUN methods. +conn_reuse_mode - INTEGER + 1 - default + + Controls how ipvs will deal with connections that are detected + port reuse. It is a bitmap, with the values being: + + 0: disable any special handling on port reuse. The new + connection will be delivered to the same real server that was + servicing the previous connection. This will effectively + disable expire_nodest_conn. + + bit 1: enable rescheduling of new connections when it is safe. + That is, whenever expire_nodest_conn and for TCP sockets, when + the connection is in TIME_WAIT state (which is only possible if + you use NAT mode). + + bit 2: it is bit 1 plus, for TCP connections, when connections + are in FIN_WAIT state, as this is the last state seen by load + balancer in Direct Routing mode. This bit helps on adding new + real servers to a very busy cluster. + conntrack - BOOLEAN 0 - disabled (default) not 0 - enabled diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt index 1e0c045e89f7..9b4a10a1cf50 100644 --- a/Documentation/networking/ixgb.txt +++ b/Documentation/networking/ixgb.txt @@ -39,7 +39,7 @@ Channel Bonding documentation can be found in the Linux kernel source: The driver information previously displayed in the /proc filesystem is not supported in this release. Alternatively, you can use ethtool (version 1.6 -or later), lspci, and ifconfig to obtain the same information. +or later), lspci, and iproute2 to obtain the same information. Instructions on updating ethtool can be found in the section "Additional Configurations" later in this document. @@ -90,7 +90,7 @@ select m for "Intel(R) PRO/10GbE support" located at: 3. Assign an IP address to the interface by entering the following, where x is the interface number: - ifconfig ethx <IP_address> + ip addr add ethx <IP_address> 4. Verify that the interface works. Enter the following, where <IP_address> is the IP address for another machine on the same subnet as the interface @@ -177,7 +177,7 @@ NOTE: These changes are only suggestions, and serve as a starting point for tuning your network performance. The changes are made in three major ways, listed in order of greatest effect: -- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen +- Use ip link to modify the mtu (maximum transmission unit) and the txqueuelen parameter. - Use sysctl to modify /proc parameters (essentially kernel tuning) - Use setpci to modify the MMRBC field in PCI-X configuration space to increase @@ -202,7 +202,7 @@ setpci -d 8086:1a48 e6.b=2e # to change as well. # set the txqueuelen # your ixgb adapter should be loaded as eth1 for this to work, change if needed -ifconfig eth1 mtu 9000 txqueuelen 1000 up +ip li set dev eth1 mtu 9000 txqueuelen 1000 up # call the sysctl utility to modify /proc/sys entries sysctl -p ./sysctl_ixgb.conf - END ixgb_perf.sh @@ -297,10 +297,10 @@ Additional Configurations ------------ The driver supports Jumbo Frames for all adapters. Jumbo Frames support is enabled by changing the MTU to a value larger than the default of 1500. - The maximum value for the MTU is 16114. Use the ifconfig command to + The maximum value for the MTU is 16114. Use the ip command to increase the MTU size. For example: - ifconfig ethx mtu 9000 up + ip li set dev ethx mtu 9000 The maximum MTU setting for Jumbo Frames is 16114. This value coincides with the maximum Jumbo Frames size of 16128. diff --git a/Documentation/networking/ixgbe.txt b/Documentation/networking/ixgbe.txt index 0ace6e776ac8..6f0cb57b59c6 100644 --- a/Documentation/networking/ixgbe.txt +++ b/Documentation/networking/ixgbe.txt @@ -70,10 +70,10 @@ Avago 1000BASE-T SFP ABCU-5710RZ 82599-based adapters support all passive and active limiting direct attach cables that comply with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. -Laser turns off for SFP+ when ifconfig down +Laser turns off for SFP+ when device is down ------------------------------------------- -"ifconfig down" turns off the laser for 82599-based SFP+ fiber adapters. -"ifconfig up" turns on the laser. +"ip link set down" turns off the laser for 82599-based SFP+ fiber adapters. +"ip link set up" turns on the laser. 82598-BASED ADAPTERS @@ -213,13 +213,13 @@ Additional Configurations ------------ The driver supports Jumbo Frames for all adapters. Jumbo Frames support is enabled by changing the MTU to a value larger than the default of 1500. - The maximum value for the MTU is 16110. Use the ifconfig command to + The maximum value for the MTU is 16110. Use the ip command to increase the MTU size. For example: - ifconfig ethx mtu 9000 up + ip link set dev ethx mtu 9000 - The maximum MTU setting for Jumbo Frames is 16110. This value coincides - with the maximum Jumbo Frames size of 16128. + The maximum MTU setting for Jumbo Frames is 9710. This value coincides + with the maximum Jumbo Frames size of 9728. Generic Receive Offload, aka GRO -------------------------------- diff --git a/Documentation/networking/mpls-sysctl.txt b/Documentation/networking/mpls-sysctl.txt new file mode 100644 index 000000000000..639ddf0ece9b --- /dev/null +++ b/Documentation/networking/mpls-sysctl.txt @@ -0,0 +1,20 @@ +/proc/sys/net/mpls/* Variables: + +platform_labels - INTEGER + Number of entries in the platform label table. It is not + possible to configure forwarding for label values equal to or + greater than the number of platform labels. + + A dense utliziation of the entries in the platform label table + is possible and expected aas the platform labels are locally + allocated. + + If the number of platform label table entries is set to 0 no + label will be recognized by the kernel and mpls forwarding + will be disabled. + + Reducing this value will remove all label routing entries that + no longer fit in the table. + + Possible values: 0 - 1048575 + Default: 0 diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index a6d7cb91069e..daa015af16a0 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -440,9 +440,10 @@ and the following flags apply: +++ Capture process: from include/linux/if_packet.h - #define TP_STATUS_COPY 2 - #define TP_STATUS_LOSING 4 - #define TP_STATUS_CSUMNOTREADY 8 + #define TP_STATUS_COPY (1 << 1) + #define TP_STATUS_LOSING (1 << 2) + #define TP_STATUS_CSUMNOTREADY (1 << 3) + #define TP_STATUS_CSUM_VALID (1 << 7) TP_STATUS_COPY : This flag indicates that the frame (and associated meta information) has been truncated because it's @@ -466,6 +467,12 @@ TP_STATUS_CSUMNOTREADY: currently it's used for outgoing IP packets which reading the packet we should not try to check the checksum. +TP_STATUS_CSUM_VALID : This flag indicates that at least the transport + header checksum of the packet has been already + validated on the kernel side. If the flag is not set + then we are free to check the checksum by ourselves + provided that TP_STATUS_CSUMNOTREADY is also not set. + for convenience there are also the following defines: #define TP_STATUS_KERNEL 0 diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt index 6915c6b27869..0344f1d45b37 100644 --- a/Documentation/networking/pktgen.txt +++ b/Documentation/networking/pktgen.txt @@ -3,13 +3,11 @@ HOWTO for the linux packet generator ------------------------------------ -Date: 041221 - -Enable CONFIG_NET_PKTGEN to compile and build pktgen.o either in kernel -or as module. Module is preferred. insmod pktgen if needed. Once running -pktgen creates a thread on each CPU where each thread has affinity to its CPU. -Monitoring and controlling is done via /proc. Easiest to select a suitable -a sample script and configure. +Enable CONFIG_NET_PKTGEN to compile and build pktgen either in-kernel +or as a module. A module is preferred; modprobe pktgen if needed. Once +running, pktgen creates a thread for each CPU with affinity to that CPU. +Monitoring and controlling is done via /proc. It is easiest to select a +suitable sample script and configure that. On a dual CPU: @@ -27,7 +25,7 @@ For monitoring and control pktgen creates: Tuning NIC for max performance ============================== -The default NIC setting are (likely) not tuned for pktgen's artificial +The default NIC settings are (likely) not tuned for pktgen's artificial overload type of benchmarking, as this could hurt the normal use-case. Specifically increasing the TX ring buffer in the NIC: @@ -35,20 +33,20 @@ Specifically increasing the TX ring buffer in the NIC: A larger TX ring can improve pktgen's performance, while it can hurt in the general case, 1) because the TX ring buffer might get larger -than the CPUs L1/L2 cache, 2) because it allow more queueing in the +than the CPU's L1/L2 cache, 2) because it allows more queueing in the NIC HW layer (which is bad for bufferbloat). -One should be careful to conclude, that packets/descriptors in the HW +One should hesitate to conclude that packets/descriptors in the HW TX ring cause delay. Drivers usually delay cleaning up the -ring-buffers (for various performance reasons), thus packets stalling -the TX ring, might just be waiting for cleanup. +ring-buffers for various performance reasons, and packets stalling +the TX ring might just be waiting for cleanup. -This cleanup issues is specifically the case, for the driver ixgbe -(Intel 82599 chip). This driver (ixgbe) combine TX+RX ring cleanups, +This cleanup issue is specifically the case for the driver ixgbe +(Intel 82599 chip). This driver (ixgbe) combines TX+RX ring cleanups, and the cleanup interval is affected by the ethtool --coalesce setting of parameter "rx-usecs". -For ixgbe use e.g "30" resulting in approx 33K interrupts/sec (1/30*10^6): +For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6): # ethtool -C ethX rx-usecs 30 @@ -60,15 +58,16 @@ Running: Stopped: eth1 Result: OK: max_before_softirq=10000 -Most important the devices assigned to thread. Note! A device can only belong -to one thread. +Most important are the devices assigned to the thread. Note that a +device can only belong to one thread. Viewing devices =============== -Parm section holds configured info. Current hold running stats. -Result is printed after run or after interruption. Example: +The Params section holds configured information. The Current section +holds running statistics. The Result is printed after a run or after +interruption. Example: /proc/net/pktgen/eth1 @@ -93,7 +92,8 @@ Result: OK: 13101142(c12220741+d880401) usec, 10000000 (60byte,0frags) Configuring threads and devices ================================ -This is done via the /proc interface easiest done via pgset in the scripts +This is done via the /proc interface, and most easily done via pgset +as defined in the sample scripts. Examples: @@ -192,10 +192,11 @@ Examples: pgset "rate 300M" set rate to 300 Mb/s pgset "ratep 1000000" set rate to 1Mpps -Example scripts -=============== +Sample scripts +============== -A collection of small tutorial scripts for pktgen is in examples dir. +A collection of small tutorial scripts for pktgen is in the +samples/pktgen directory: pktgen.conf-1-1 # 1 CPU 1 dev pktgen.conf-1-2 # 1 CPU 2 dev @@ -206,25 +207,26 @@ pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6 pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows. -Run in shell: ./pktgen.conf-X-Y It does all the setup including sending. +Run in shell: ./pktgen.conf-X-Y +This does all the setup including sending. Interrupt affinity =================== -Note when adding devices to a specific CPU there good idea to also assign -/proc/irq/XX/smp_affinity so the TX-interrupts gets bound to the same CPU. -as this reduces cache bouncing when freeing skb's. +Note that when adding devices to a specific CPU it is a good idea to +also assign /proc/irq/XX/smp_affinity so that the TX interrupts are bound +to the same CPU. This reduces cache bouncing when freeing skbs. Enable IPsec ============ -Default IPsec transformation with ESP encapsulation plus Transport mode -could be enabled by simply setting: +Default IPsec transformation with ESP encapsulation plus transport mode +can be enabled by simply setting: pgset "flag IPSEC" pgset "flows 1" To avoid breaking existing testbed scripts for using AH type and tunnel mode, -user could use "pgset spi SPI_VALUE" to specify which formal of transformation +you can use "pgset spi SPI_VALUE" to specify which transformation mode to employ. diff --git a/Documentation/networking/rds.txt b/Documentation/networking/rds.txt index c67077cbeb80..e1a3d59bbe0f 100644 --- a/Documentation/networking/rds.txt +++ b/Documentation/networking/rds.txt @@ -62,11 +62,10 @@ Socket Interface ================ AF_RDS, PF_RDS, SOL_RDS - These constants haven't been assigned yet, because RDS isn't in - mainline yet. Currently, the kernel module assigns some constant - and publishes it to user space through two sysctl files - /proc/sys/net/rds/pf_rds - /proc/sys/net/rds/sol_rds + AF_RDS and PF_RDS are the domain type to be used with socket(2) + to create RDS sockets. SOL_RDS is the socket-level to be used + with setsockopt(2) and getsockopt(2) for RDS specific socket + options. fd = socket(PF_RDS, SOCK_SEQPACKET, 0); This creates a new, unbound RDS socket. diff --git a/Documentation/networking/s2io.txt b/Documentation/networking/s2io.txt index d2a9f43b5546..0362a42f7cf4 100644 --- a/Documentation/networking/s2io.txt +++ b/Documentation/networking/s2io.txt @@ -38,7 +38,7 @@ The corresponding adapter's LED will blink multiple times. 3. Features supported: a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes, -modifiable using ifconfig command. +modifiable using ip command. b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit and receive, TSO. diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt index 99ca40e8e810..cbfac0949635 100644 --- a/Documentation/networking/scaling.txt +++ b/Documentation/networking/scaling.txt @@ -421,6 +421,15 @@ best CPUs to share a given queue are probably those that share the cache with the CPU that processes transmit completions for that queue (transmit interrupts). +Per TX Queue rate limitation: +============================= + +These are rate-limitation mechanisms implemented by HW, where currently +a max-rate attribute is supported, by setting a Mbps value to + +/sys/class/net/<dev>/queues/tx-<n>/tx_maxrate + +A value of zero means disabled, and this is the default. Further Information =================== diff --git a/Documentation/networking/vxge.txt b/Documentation/networking/vxge.txt index bb76c667a476..abfec245f97c 100644 --- a/Documentation/networking/vxge.txt +++ b/Documentation/networking/vxge.txt @@ -39,7 +39,7 @@ iii) PCI-SIG's I/O Virtualization iv) Jumbo frames X3100 Series supports MTU up to 9600 bytes, modifiable using - ifconfig command. + ip command. v) Offloads supported: (Enabled by default) Checksum offload (TCP/UDP/IP) on transmit and receive paths |