summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
AgeCommit message (Collapse)Author
2013-08-13RDMA/cxgb4: Fix QP flush logicSteve Wise
This patch makes following fixes in QP flush logic: - correctly flushes unsignaled WRs followed by a signaled WR - supports for flushing a CQ bound to multiple QPs - resets cidx_flush if a active queue starts getting HW CQEs again - marks WQ in error when we leave RTS. This was only being done for user queues, but we need it for kernel queues too so that post_send/post_recv will start returning the appropriate error synchronously - eats unsignaled read resp CQEs. HW always inserts CQEs so we must silently discard them if the read work request was unsignaled. - handles QP flushes with pending SW CQEs. The flush and out of order completion logic has a bug where if out of order completions are flushed but not yet polled by the consumer and the qp is then flushed then we end up inserting duplicate completions. - c4iw_flush_sq() should only flush wrs that have not already been flushed. Since we already track where in the SQ we've flushed via sq.cidx_flush, just start at that point and flush any remaining. This bug only caused a problem in the presence of unsignaled work requests. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fixed sparse warning due to htonl/ntohl confusion. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Add support for active and passive open connection with IPv6 addressVipul Pandya
Add new cpl messages, cpl_act_open_req6 and cpl_t5_act_open_req6, for initiating active open connections. Use LLD api cxgb4_create_server and cxgb4_create_server6 for initiating passive open connections. Similarly use cxgb4_remove_server to remove the passive open connections in place of listen_stop. Add support for iWARP over VLAN device and enable IPv6 support on VLAN device. Make use of import_ep in c4iw_reconnect. Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fix build when IPv6 is disabled and make sure iw_cxgb4 is not built-in when ipv6 is a module. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-03-14RDMA/cxgb4: Bump tcam_full stat and WR reply timeoutVipul Pandya
Always bump the tcam_full stat. Also, bump wr reply timeout to 30 seconds. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14RDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.Vipul Pandya
It enables direct DMA by HW to memory region PBL arrays and fast register PBL arrays from host memory, vs the T4 way of passing these arrays in the WR itself. The result is lower latency for memory registration, and larger PBL array support for fast register operations. This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14RDMA/cxgb4: Add module_params to enable DB FC & Coalescing on T5Vipul Pandya
Both DB Flow-Control and DB Coalescing are disabled by default on T5 Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14RDMA/cxgb4: Add Support for Chelsio T5 adapterVipul Pandya
Adds support for Chelsio T5 adapter. Enables T5's Write Combining feature. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-27IB/cxgb4: convert to idr_alloc()Tejun Heo
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-26Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'misc', 'mlx4', 'qib' and ↵Roland Dreier
'srp' into for-next
2013-02-21IB/core: Add "type 2" memory windows supportShani Michaeli
This patch enhances the IB core support for Memory Windows (MWs). MWs allow an application to have better/flexible control over remote access to memory. Two types of MWs are supported, with the second type having two flavors: Type 1 - associated with PD only Type 2A - associated with QPN only Type 2B - associated with PD and QPN Applications can allocate a MW once, and then repeatedly bind the MW to different ranges in MRs that are associated to the same PD. Type 1 windows are bound through a verb, while type 2 windows are bound by posting a work request. The 32-bit memory key is composed of a 24-bit index and an 8-bit key. The key is changed with each bind, thus allowing more control over the peer's use of the memory key. The changes introduced are the following: * add memory window type enum and a corresponding parameter to ib_alloc_mw. * type 2 memory window bind work request support. * create a struct that contains the common part of the bind verb struct ibv_mw_bind and the bind work request into a single struct. * add the ib_inc_rkey helper function to advance the tag part of an rkey. Consumer interface details: * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support for these features. Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B memory windows. It can set neither to indicate it doesn't support type 2 windows at all. * modify existing provides and consumers code to the new param of ib_alloc_mw and the ib_mw_bind_info structure Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14RDMA/cxgb4: Fix endpoint timeout race conditionVipul Pandya
The endpoint timeout logic had a race that could cause an endpoint object to be freed while it was still on the timedout list. This can happen if the timer is stopped after it had fired, but before the timedout thread processed the endpoint timeout. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14RDMA/cxgb4: Keep QP referenced until TID releasedVipul Pandya
The driver is currently releasing the last ref on the QP too early. This can cause bus errors due to HW still fetching WRs from the HW queue. The fix is to keep a qp ref until we release the HW TID. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19RDMA/cxgb4: Fix bug for active and passive LE hash collision pathVipul Pandya
Retries active opens for INUSE errors. Logs any active ofld_connect_wr error replies. Sends ofld_connect_wr on same ctrlq. It needs to go on the same control txq as regular CPL active/passive messages. Retries on active open replies with EADDRINUSE. Uses active open fw wr only if active filter region is set. Adds stat for ofld_connect_wr failures. This patch also adds debugfs file to show endpoints. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19RDMA/cxgb4: Fix LE hash collision bug for active open connectionVipul Pandya
It enables establishing active open connection using fw_ofld_connection work request when cpl_act_open_rpl says TCAM full error which may be because of LE hash collision. Current support is only for IPv4 active open connections. Sets ntuple bits in active open requests. For T4 firmware greater than 1.4.10.0 ntuple bits are required to be set. Adds nocong and enable_ecn module parameter options. Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Move all FW return values to t4fw_api.h. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: Add query_qp supportVipul Pandya
This allows querying the QP state before flushing. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: Remove kfifo usageVipul Pandya
Using kfifos for ID management was limiting the number of QPs and preventing NP384 MPI jobs. So replace it with a simple bitmap allocator. Remove IDs from the IDR tables before deallocating them. This bug was causing the BUG_ON() in insert_handle() to fire because the ID was getting reused before being removed from the IDR table. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queuesVipul Pandya
Add module option db_fc_threshold which is the count of active QPs that trigger automatic db flow control mode. Automatically transition to/from flow control mode when the active qp count crosses db_fc_theshold. Add more db debugfs stats On DB DROP event from the LLD, recover all the iwarp queues. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: Disable interrupts in c4iw_ev_dispatch()Vipul Pandya
Use GFP_ATOMIC in _insert_handle() if ints are disabled. Don't panic if we get an abort with no endpoint found. Just log a warning. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: Add DB Overflow AvoidanceVipul Pandya
Get FULL/EMPTY/DROP events from LLD. On FULL event, disable normal user mode DB rings. Add modify_qp semantics to allow user processes to call into the kernel to ring doobells without overflowing. Add DB Full/Empty/Drop stats. Mark queues when created indicating the doorbell state. If we're in the middle of db overflow avoidance, then newly created queues should start out in this mode. Bump the C4IW_UVERBS_ABI_VERSION to 2 so the user mode library can know if the driver supports the kernel mode db ringing. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18RDMA/cxgb4: Add debugfs RDMA memory statsVipul Pandya
Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-01Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', ↵Roland Dreier
'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next
2011-10-31RDMA/cxgb4: Serialize calls to CQ's comp_handlerKumar Sanghvi
Commit 01e7da6ba53c ("RDMA/cxgb4: Make sure flush CQ entries are collected on connection close") introduced a potential problem where a CQ's comp_handler can get called simultaneously from different places in the iw_cxgb4 driver. This does not comply with Documentation/infiniband/core_locking.txt, which states that at a given point of time, there should be only one callback per CQ should be active. This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>. Based on discussion between Parav Pandit and Steve Wise, this patch fixes the above problem by serializing the calls to a CQ's comp_handler using a spin_lock. Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06RDMA/cxgb4: Add support for MPAv2 Enhanced RDMA NegotiationKumar Sanghvi
This patch adds support for Enhanced RDMA Connection Establishment (draft-ietf-storm-mpa-peer-connect-06), aka MPAv2. Details of draft can be obtained from: <http://www.ietf.org/id/draft-ietf-storm-mpa-peer-connect-06.txt> The patch updates the following functions for initiator perspective: - send_mpa_request - process_mpa_reply - post_terminate for TERM error codes - destroy_qp for TERM related change - adds layer/etype/ecode to c4iw_qp_attrs for sending with TERM - peer_abort for retrying connection attempt with MPA_v1 message - added c4iw_reconnect function The patch updates the following functions for responder perspective: - process_mpa_request - send_mpa_reply - c4iw_accept_cr - passes ird/ord to upper layers Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-24RDMA/cxgb4: Use completion objects for event blockingSteve Wise
There exists a race condition when using wait_queue_head_t objects that are declared on the stack. This was being done in a few places where we are sending work requests to the FW and awaiting replies, but we don't have an endpoint structure with an embedded c4iw_wr_wait struct. So the code was allocating it locally on the stack. Bad design. The race is: 1) thread on cpuX declares the wait_queue_head_t on the stack, then posts a firmware WR with that wait object ptr as the cookie to be returned in the WR reply. This thread will proceed to block in wait_event_timeout() but before it does: 2) An interrupt runs on cpuY with the WR reply. fw6_msg() handles this and calls c4iw_wake_up(). c4iw_wake_up() sets the condition variable in the c4iw_wr_wait object to TRUE and will call wake_up(), but before it calls wake_up(): 3) The thread on cpuX calls c4iw_wait_for_reply(), which calls wait_event_timeout(). The wait_event_timeout() macro checks the condition variable and returns immediately since it is TRUE. So this thread never blocks/sleeps. The function then returns effectively deallocating the c4iw_wr_wait object that was on the stack. 4) So at this point cpuY has a pointer to the c4iw_wr_wait object that is no longer valid. Further its pointing to a stack frame that might now be in use by some other context/thread. So cpuY continues execution and calls wake_up() on a ptr to a wait object that as been effectively deallocated. This race, when it hits, can cause a crash in wake_up(), which I've seen under heavy stress. It can also corrupt the referenced stack which can cause any number of failures. The fix: Use struct completion, which supports on-stack declarations. Completions use a spinlock around setting the condition to true and the wake up so that steps 2 and 4 above are atomic and step 3 can never happen in-between. Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2011-05-09RDMA/cxgb4: EEH errors can hang the driverSteve Wise
A few more EEH fixes: c4iw_wait_for_reply(): detect fatal EEH condition on timeout and return an error. The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH event followed by a ib_register_device() when the device was reinitialized. However, the RDMA core doesn't allow multiple iterations of register/deregister by the provider. See drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where the kobject ref is held until the device is deallocated in ib_deallocate_device(). Calling deregister adds this kobj reference, and then a subsequent register call will generate a WARN_ON() from the kobject subsystem because the kobject is being initialized but is already initialized with the ref held. So the provider must deregister and dealloc when resetting for an EEH event, then alloc/register to re-initialize. To do this, we cannot use the device ptr as our ULD handle since it will change with each reallocation. This commit adds a ULD context struct which is used as the ULD handle, and then contains the device pointer and other state needed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09RDMA/cxgb4: Reset wait condition atomicallySteve Wise
The driver was never really waiting for RDMA_WR/FINI completions because the condition variable used to determine if the completion happened was never reset, and this condition variable is reused for both connection setup and teardown. This causes various driver crashes under heavy loads due to releasing resources too early. The fix is to use atomic bits to correctly reset the condition immediately after the completion is detected. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09RDMA/cxgb4: Don't change QP state outside EP lockSteve Wise
Concurrent ingress CLOSE and ULP ABORT operations causes a crash due to a race condition where the close path releases the EP lock and then tries to move the QP state to CLOSED. This must be done inside the EP lock to avoid the race. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14RDMA/cxgb4: Remove db_drop_taskSteve Wise
Unloading iw_cxgb4 can crash due to the unload code trying to use db_drop_task, which is uninitialized. So remove this dead code. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-13Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-10RDMA/cxgb3,cxgb4: Remove dead codeStephen Hemminger
This removes unused code found by running 'make namespacecheck'; compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-11-15infiniband: Only include mutex.h once in drivers/infiniband/hw/cxgb4/iw_cxgb4.hJesper Juhl
Only include the header linux/mutex.h once inside drivers/infiniband/hw/cxgb4/iw_cxgb4.h Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-09-28RDMA/cxgb4: Use a mutex for QP and EP state transitionsSteve Wise
Move the connection setup/teardown paths to the workq thread removing spin lock/irq disable requirements for these paths. This allows calls down to the LLD for EP and QP state transition actions to be atomic with respect to processing CPL messages coming up from the HW. Namely, calls to rdma_init() and rdma_fini() can now be called with the mutex held avoiding many race conditions with the abort path. The QP spinlock is still used but only to manipulate the qp state. This allows the fastpaths, poll, post_send, and pos_recv, to run in the irq context. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28RDMA/cxgb4: Support on-chip SQsSteve Wise
T4 support on-chip SQs to reduce latency. This patch adds support for this in iw_cxgb4: - Manage ocqp memory like other adapter mem resources. - Allocate user mode SQs from ocqp mem if available. - Map ocqp mem to user process using write combining. - Map PCIE_MA_SYNC reg to user process. Bump uverbs ABI. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28RDMA/cxgb4: Centralize the wait logicSteve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02RDMA/cxgb4: Use correct control txqSteve Wise
There is only one control txq per tx channel. So use the port number as the queue index when sending. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06RDMA/cxgb4: Use the DMA state API instead of the pci equivalentsFUJITA Tomonori
This replace the PCI DMA state API (include/linux/pci-dma.h) with the DMA equivalents since the PCI DMA state API will be obsolete. No functional change. For further information about the background: http://marc.info/?l=linux-netdev&m=127037540020276&w=2 Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24RDMA/cxgb4: Register RDMA provider based on LLD state_change eventsSteve Wise
The LLD now supports proper UP state change events, so move the RDMA provider registration to UP path. This fixes a crash when loading iw_cxgb4 _after_ the NFS/RDMA transport is up and running. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-05MAINTAINERS: Add cxgb4 and iw_cxgb4 entriesRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21RDMA/cxgb4: Add driver for Chelsio T4 RNICSteve Wise
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>