lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2014-03-12	gianfar: Fix multi-queue support checks @probe()	Claudiu Manoil
	priv is not instantiated at gfar_of_init() time, when parsing the DT for info on supported HW queues. Before the netdev can be allocated, the number of supported queues must be known. Because the number of supported queues depends on device type, move the compatibility checks before netdev allocation. Local vars are used to hold the operation mode info before netdev allocation. This fixes the null accesses for priv->.., in gfar_of_init. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12	r8152: support dumping the hw counters	hayeswang
	Add dumping the tally counter by ethtool. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12	ieee802154: at86rf230: add support for rf233 chip	Thomas Stilwell
	The rf233 and rf231 are sufficiently similar that we can treat rf233 like rf231. rf233 is missing some features that rf231 has, but we don't currently make use of them so there's nothing to handle differently yet. Should we add support in the future for rf231 *_NOCLK or SLEEP states, or PAD_IO drive strength, exceptions will need to be made for rf233. Signed-off-by: Thomas Stilwell <stilwellt@openlabs.co> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11	bonding: force cast of IP address in options	stephen hemminger
	The option code is taking IP address and putting it into a generic container. Force cast to silence sparse warnings. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	r8152: add skb_cow_head	hayeswang
	Call skb_cow_head() before editing the tx packet header. The header would be reallocated if it is shared. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	net: eth: cpsw: Use net_device_stats from struct net_device	Tobias Klauser
	Instead of using an own copy of struct net_device_stats in struct cpsw_priv, use stats from struct net_device. Also remove the thus unnecessary .ndo_get_stats function, as it just returns dev->stats, which is the default. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	hyperv: Change the receive buffer size for legacy hosts	Haiyang Zhang
	Due to a bug in the Hyper-V host verion 2008R2, we need to use a slightly smaller receive buffer size, otherwise the buffer will not be accepted by the legacy hosts. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Enable large send offload	KY Srinivasan
	Enable segmentation offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Enable send side checksum offload	KY Srinivasan
	Enable send side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Enable receive side IP checksum offload	KY Srinivasan
	Enable receive side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Enable offloads on the host	KY Srinivasan
	Prior to enabling guest side offloads, enable the offloads on the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Cleanup the send path	KY Srinivasan
	In preparation for enabling offloads, cleanup the send path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	Drivers: net: hyperv: Enable scatter gather I/O	KY Srinivasan
	Cleanup the code and enable scatter gather I/O. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	sky2: allow mac to come from dt	Tim Harvey
	The driver reads the mac address from the device registers which would need to have been programmed by the bootloader. This patch adds the ability to pull the mac from devicetree via the pci device dt node. Signed-off-by: Tim Harvey <tharvey@gateworks.com> Cc: netdev@vger.kernel.org Cc: devicetree@vger.kernel.org Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Changes since v2: - eliminated use of stack tmpaddr per feedback Changes since v1: - simplified based on feedback - fixed formatting Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	IB/mlx5_core: remove unreachable function call in module init	Kleber Sacilotto de Souza
	The call to mlx5_health_cleanup() in the module init function can never be reached. Removing it. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	sfc: Use ether_addr_copy and eth_broadcast_addr	Edward Cree
	Faster than memcpy/memset on some architectures. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	gianfar: Use Single-Queue polling for "fsl,etsec2"	Claudiu Manoil
	For the "fsl,etsec2" compatible models the driver currently supports 8 Tx and Rx DMA rings (aka HW queues). However, there are only 2 pairs of Rx/Tx interrupt lines, as these controllers are integrated in low power SoCs with 2 CPUs at most. As a result, there are at most 2 NAPI instances that have to service multiple Tx and Rx queues for these devices. This complicates the NAPI polling routine having to iterate over the mutiple Rx/Tx queues hooked to the same interrupt lines. And there's also an overhead at HW level, as the controller needs to service all the 8 Tx rings in a round robin manner. The combined overhead shows up for multi parallel Tx flows transmitted by the kernel stack, when the driver usually starts returning NETDEV_TX_BUSY leading to NETDEV WATCHDOG Tx timeout triggering if the Tx path is congested for too long. As an alternative, this patch makes the driver support only one Tx/Rx DMA ring per NAPI instance (per interrupt group or pair of Tx/Rx interrupt lines) by default. The simplified single queue polling routine (gfar_poll_sq) will be the default napi poll routine for the etsec2 devices too. Some adjustments needed to be made to link the Tx/Rx HW queues with each NAPI instance (2 in this case). The gfar_poll_sq() is already successfully used by older SQ_SG_MODE (single interrupt group) controllers. This patch fixes Tx timeout triggering under heavy Tx traffic load (i.e. iperf -c -P 8) for the "fsl,etsec2" (currently the only MQ_MG_MODE devices). There's also a significant memory footprint reduction by supporting 2 Rx/Tx DMA rings (at most), instead of 8, for these devices. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10	gianfar: Separate out the Tx interrupt handling (Tx NAPI)	Claudiu Manoil
	There are some concurrency issues on devices w/ 2 CPUs related to the handling of Rx and Tx interrupts. eTSEC has separate interrupt lines for Rx and Tx but a single imask register to mask these interrupts and a single NAPI instance to handle both Rx and Tx work. As a result, the Rx and Tx ISRs are identical, both are invoking gfar_schedule_cleanup(), however both handlers can be entered at the same time when the Rx and Tx interrupts are taken by different CPUs. In this case spurrious interrupts (SPU) show up (in /proc/interrupts) indicating a concurrency issue. Also, Tx overruns followed by Tx timeout have been observed under heavy Tx traffic load. To address these issues, the schedule cleanup ISR part has been changed to handle the Rx and Tx interrupts independently. The patch adds a separate NAPI poll routine for Tx cleanup to be triggerred independently by the Tx confirmation interrupts only. Existing poll functions are modified to handle only the Rx path processing. The Tx poll routine does not need a budget, since Tx processing doesn't consume NAPI budget, and hence it is registered with minimum NAPI weight. NAPI scheduling does not require locking since there are different NAPI instances between the Rx and Tx confirmation paths now. So, the patch fixes the occurence of spurrious Rx/Tx interrupts. Tx overruns also occur less frequently now. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-08	igb: fix warning if !CONFIG_IGB_HWMON	Jeff Kirsher
	Fix warning about code defined but never used if IGB_HWMON not defined. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-08	igb: fix array size calculation	Todd Fujinaka
	Use ARRAY_SIZE for array size calculation. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-08	ixgbevf: fix skb->pkt_type checks	Florian Fainelli
	skb->pkt_type is not a bitmask, but contains only value at a time from the range defined in include/uapi/linux/if_packet.h. Checking it like if it was a bitmask of values would also cause PACKET_OTHERHOST, PACKET_LOOPBACK and PACKET_FASTROUTE to be matched by this check since their lower 2 bits are also set, although that does not fix a real bug, it is still potentially confusing. This bogus check was introduced in commit 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf"). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Fix SHRA register access for 82579	David Ertman
	Previous commit c3a0dce35af0 fixed an overrun for the RAR on i218 devices. This commit also attempted to homogenize the RAR/SHRA access for all parts accessed by the e1000e driver. This change introduced an error for assigning MAC addresses to guest OS's for 82579 devices. Only RAR[0] is accessible to the driver for 82579 parts, and additional addresses must be placed into the SHRA[L\|H] registers. The rar_entry_count was changed in the previous commit to an inaccurate value that accounted for all RAR and SHRA registers, not just the ones usable by the driver. This patch fixes the count to the correct value and adjusts the e1000_rar_set_pch2lan() function to user the correct index. Cc: John Greene <jogreene@redhat.com> Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Fix ethtool offline tests for 82579 parts	David Ertman
	Changes to the rar_entry_count value require a change to the indexing used to access the SHRA[H\|L] registers when testing them with 'ethtool -t <iface> offline' Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Fix not generating an error on invalid load parameter	David Ertman
	Valid values for InterruptThrottleRate are 10-100000, or one of 0, 1, 3, 4. '2' is not valid. This is a legacy from the branching from the e1000 driver code that e1000e was based from. Prior to this patch, if the e1000e driver was loaded with a forced invalid InterruptThrottleRate of '2', then no throttle rate would be set and no error message generated. Now, a message will be generated that an invalid value was used and the value for InterruptThrottleRate will be set to the default value. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Feature Enable PHY Ultra Low Power Mode (ULP)	David Ertman
	ULP is a power saving feature that reduces the power consumption of the PHY when a cable is not connected. ULP is gated on the following conditions: 1) The hardware must support ULP. Currently this is only I218 devices from Intel 2) ULP is initiated by the driver, so, no driver results in no ULP. 3) ULP's implementation utilizes Runtime Power Management to toggle its execution. ULP is enabled/disabled based on the state of Runtime PM. 4) ULP is not active when wake-on-unicast, multicast or broadcast is active as these features are mutually-exclusive. Since the PHY is in an unavailable state while ULP is active, any access of the PHY registers will fail. This is resolved by utilizing kernel calls that cause the device to exit Runtime PM (e.g. pm_runtime_get_sync) and then, after PHY access is complete, allow the device to resume Runtime PM (e.g. pm_runtime_put_sync). Under certain conditions, toggling the LANPHYPC is necessary to disable ULP mode. Break out existing code to toggle LANPHYPC to a new function to avoid code duplication. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e Refactor of Runtime Power Management	David Ertman
	Fix issues with: RuntimePM causing the device to repeatedly flip between suspend and resume with the interface administratively downed. Having RuntimePM enabled interfering with the functionality of Energy Efficient Ethernet. Added checks to disallow functions that should not be executed if the device is currently runtime suspended Make runtime_idle callback to use same deterministic behavior as the igb driver. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Acked-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Refactor PM flows	David Ertman
	Refactor the system power management flows to prevent the suspend path from being executed twice when hibernating since both the freeze and poweroff callbacks were set to e1000_suspend() via SET_SYSTEM_SLEEP_PM_OPS. There are HW workarounds that are performed during this flow and calling them twice was causing erroneous behavior. Re-arrange the code to take advantage of common code paths and explicitly set the individual dev_pm_ops callbacks for suspend, resume, freeze, thaw, poweroff and restore. Add a boolean parameter (reset) to the e1000e_down function to allow for cases when the HW should not be reset when downed during a PM event. Now that all suspend/shutdown paths result in a call to __e1000_shutdown() that checks Wake on Lan status, removing redundant check for WoL in e1000_power_down_phy(). Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Acked-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Add missing branding strings in ich8lan.c	David Ertman
	Branding strings from recently released and soon to be released hardware configurations that are supported by e1000e. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Acked-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Cleanup - Update GPL header and Copyright	David Ertman
	This patch is to update the GPL header by removing the portion that refers to the Free Software Foundation address. Change the copyright date for 2014. Reformat the header comments to conform to kernel networking coding norms Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Fix 82579 sets LPI too early.	David Ertman
	Enabling EEE LPI sooner than one second after link up on 82579 causes link issues with some switches. Remove EEE enablement for 82579 parts from the link initialization flow to avoid initializing too early. EEE initialization for 82579 will be done in e1000e_update_phy_task. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Acked-by: Bruce W Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Resolve issues with Management Engine (ME) briefly blocking PHY resets	David Ertman
	On a ME enabled system with the cable out, the driver init flow would generate an erroneous message indicating that resets were being blocked by an active ME session. Cause was ME clearing the semaphore bit to block further PHY resets for up to 50 msec during power-on/cycle. After this interval, ME would re-set the bit and allow PHY resets. To resolve this, change the flow of e1000e_phy_hw_reset_generic() to utilize a delay and retry method. Poll the FWSM register to minimize any extra time added to the flow. If the delay times out at 100ms (checked in 10msec increments), then return the value E1000_BLK_PHY_RESET, as this is the accurate state of the PHY. Attempting to alter just the call to e1000e_phy_hw_reset_generic() in e1000_init_phy_workarounds_pchlan() just caused the problem to move further down the flow. Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Acked-by: Bruce W. Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: Cleanup unecessary references	David Ertman
	Cleaning up some pointer references that are no longer necessary Signed-off-by: Dave Ertman <davidx.m.ertman@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	e1000e: PTP lock in e1000e_phc_adjustfreq	Todd Fujinaka
	Add lock in e1000e_phc_adjfreq to prevent concurrent changes to TIMINCA and SYSTIMH/L. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-03-07	Merge tag 'linux-can-next-for-3.15-20140307' of ↵	David S. Miller
	git://gitorious.org/linux-can/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2014-02-12 this is a pull request of twelve patches for net-next/master. Alexander Shiyan contributes two patches for the mcp251x, one making the driver more quiet and the other one improves the compile time coverage by removing the #ifdef CONFIG_PM_SLEEP. Then two patches for the flexcan driver by me, one removing the #ifdef CONFIG_PM_SLEEP, too, the other one making use of platform_get_device_id(). Another patch by me which converts the janz-ican3 driver to use netdev_<level>(). The remaining 7 patches are by Oliver Hartkopp, they add CAN FD support to the netlink configuration interface. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: support IPv6	hayeswang
	Support hw IPv6 checksum for TCP and UDP packets. Note that the hw has the limitation of the range of the transport offset. Besides, the TCP Pseudo Header of the IPv6 TSO of the hw bases on the Microsoft document which excludes the packet length. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: support TSO	hayeswang
	Support scatter gather and TSO. Adjust the tx checksum function and set the max gso size to fix the size of the tx aggregation buffer. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: support rx checksum	hayeswang
	Support hw rx checksum for TCP and UDP packets. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: calculate the dropped packets for rx	hayeswang
	Continue dealing with the remain rx packets, even though the allocation of the skb fail. This could calculate the correct dropped packets. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: up the priority of the transmission	hayeswang
	move the tx_bottom() from delayed_work to tasklet. It makes the rx and tx balanced. If the device is in runtime suspend when getting the tx packet, wakeup the device before trasmitting. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: check tx agg list before spin lock	hayeswang
	Check tx agg list before spin lock to avoid doing spin lock every times. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	r8152: replace spin_lock_irqsave and spin_unlock_irqrestore	hayeswang
	Use spin_lock and spin_unlock in interrupt context. The ndo_start_xmit would not be called in interrupt context, so replace the relative spin_lock_irqsave and spin_unlock_irqrestore with spin_lock_bh and spin_unlock_bh. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to i40e and i40evf. Most notable are: Joseph completes the implementation of the ethtool ntuple rule management interface by adding the get, update and delete interface reset. Akeem provides a fix to prevent a possible overflow due to multiplication of number and size by using kzalloc, so use kcalloc. Jesse provides an implementation for skb_set_hash() and adds the L4 type return when we know it is an L4 hash. He also adds a counter to statistics for Tx timeouts to help users. Lastly he provides a change to stay away from the cache line where the done bit may be getting written back for the transmit ring since the hardware may be writing the whole cache line for a partial update. Shannon cleans up code comments. Anjali removes a firmware workaround for newer firmware since the number of MSIx vectors are being reported correctly. v2: - dropped patch 01 of the series based on feedback from the author Joe Perches and Shannon Nelson. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Aggregate TX unmap operations	Zoltan Kiss
	Unmapping causes TLB flushing, therefore we should make it in the largest possible batches. However we shouldn't starve the guest for too long. So if the guest has space for at least two big packets and we don't have at least a quarter ring to unmap, delay it for at most 1 milisec. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Timeout packets in RX path	Zoltan Kiss
	A malicious or buggy guest can leave its queue filled indefinitely, in which case qdisc start to queue packets for that VIF. If those packets came from an another guest, it can block its slots and prevent shutdown. To avoid that, we make sure the queue is drained in every 10 seconds. The QDisc queue in worst case takes 3 round to flush usually. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Handle guests with too many frags	Zoltan Kiss
	Xen network protocol had implicit dependency on MAX_SKB_FRAGS. Netback has to handle guests sending up to XEN_NETBK_LEGACY_SLOTS_MAX slots. To achieve that: - create a new skb - map the leftover slots to its frags (no linear buffer here!) - chain it to the previous through skb_shinfo(skb)->frag_list - map them - copy and coalesce the frags into a brand new one and send it to the stack - unmap the 2 old skb's pages It's also introduces new stat counters, which help determine how often the guest sends a packet with more than MAX_SKB_FRAGS frags. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Add stat counters for zerocopy	Zoltan Kiss
	These counters help determine how often the buffers had to be copied. Also they help find out if packets are leaked, as if "sent != success + fail", there are probably packets never freed up properly. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Remove old TX grant copy definitons and fix indentations	Zoltan Kiss
	These became obsolete with grant mapping. I've left intentionally the indentations in this way, to improve readability of previous patches. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Introduce TX grant mapping	Zoltan Kiss
	This patch introduces grant mapping on netback TX path. It replaces grant copy operations, ditching grant copy coalescing along the way. Another solution for copy coalescing is introduced in "xen-netback: Handle guests with too many frags", older guests and Windows can broke before that patch applies. There is a callback (xenvif_zerocopy_callback) from core stack to release the slots back to the guests when kfree_skb or skb_orphan_frags called. It feeds a separate dealloc thread, as scheduling NAPI instance from there is inefficient, therefore we can't do dealloc from the instance. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Handle foreign mapped pages on the guest RX path	Zoltan Kiss
	RX path need to know if the SKB fragments are stored on pages from another domain. Logically this patch should be after introducing the grant mapping itself, as it makes sense only after that. But to keep bisectability, I moved it here. It shouldn't change any functionality here. xenvif_zerocopy_callback and ubuf_to_vif are just stubs here, they will be introduced properly later on. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-07	xen-netback: Minor refactoring of netback code	Zoltan Kiss
	This patch contains a few bits of refactoring before introducing the grant mapping changes: - introducing xenvif_tx_pending_slots_available(), as this is used several times, and will be used more often - rename the thread to vifX.Y-guest-rx, to signify it does RX work from the guest point of view Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>