aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mlx4/qp.c
Commit message (Collapse)AuthorAgeFilesLines
* more driver stuff from 3.2.72Wolfgang Wiedmeyer2015-10-231-46/+100
|
* net: fix NULL dereferences in check_peer_redir()Eric Dumazet2012-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d3aaeb38c40e5a6c08dd31a1b64da65c4352be36, along with dependent backports of commits: 69cce1d1404968f78b177a0314f5822d5afdbbfb 9de79c127cccecb11ae6a21ab1499e87aa222880 218fa90f072e4aeff9003d57e390857f4f35513e 580da35a31f91a594f3090b7a2c39b85cb051a12 f7e57044eeb1841847c24aa06766c8290c202583 e049f28883126c689cf95859480d9ee4ab23b7fa ] Gergely Kalman reported crashes in check_peer_redir(). It appears commit f39925dbde778 (ipv4: Cache learned redirect information in inetpeer.) added a race, leading to possible NULL ptr dereference. Since we can now change dst neighbour, we should make sure a reader can safely use a neighbour. Add RCU protection to dst neighbour, and make sure check_peer_redir() can be called safely by different cpus in parallel. As neighbours are already freed after one RCU grace period, this patch should not add typical RCU penalty (cache cold effects) Many thanks to Gergely for providing a pretty report pointing to the bug. Reported-by: Gergely Kalman <synapse@hippy.csoma.elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* IB/mlx4: Fix memory ordering of VLAN insertion control bitsEli Cohen2010-12-011-5/+5
| | | | | | | | | | | We must fully update the control segment before marking it as valid, so that hardware doesn't start executing it before we're ready. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Move VLAN control bit setting to before wmb(). - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add VLAN support for IBoEEli Cohen2010-10-251-15/+64
| | | | | | | | | | | | | | | | | | This patch allows IBoE traffic to be encapsulated in 802.1Q tagged VLAN frames. The VLAN tag is encoded in the GID and derived from it by a simple computation. The netdev notifier callback is modified to catch VLAN device addition/removal and the port's GID table is updated to reflect the change, so that for each netdevice there is an entry in the GID table. When the port's GID table is exhausted, GID entries will not be added. Only children of the main interfaces can add to the GID table; if a VLAN interface is added on another VLAN interface (e.g. "vconfig add eth2.6 8"), then that interfaces will not add an entry to the GID table. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/core: Add VLAN support for IBoEEli Cohen2010-10-251-1/+1
| | | | | | | | | | | | | | | | | Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add support for IBoEEli Cohen2010-10-251-25/+105
| | | | | | | | | | | | | | | | | | | | | Add support for IBoE to mlx4_ib. The bulk of the code is handling the new address vector fields; mlx4 needs the MAC address of a remote node to include it in a WQE (for datagrams) or in the QP context (for connected QPs). Address resolution is done by assuming all unicast GIDs are either link-local IPv6 addresses. Multicast group attach/detach needs to update the NIC's multicast filters; but since attaching a QP to a multicast group can be done before the QP is bound to a port, for IBoE we need to keep track of all multicast groups that a QP is attached too before it transitions from INIT to RTR (since it does not have a port in the INIT state). Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Many things cleaned up and otherwise monkeyed with; hope I didn't introduce too many bugs. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/pack: IBoE UD packet packing supportEli Cohen2010-10-141-1/+1
| | | | | | | | | | Add support for packing IBoE packet headers. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Clean up and fix ib_ud_header_init() a bit. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add support for masked atomic operationsVladimir Sokolovsky2010-04-211-11/+39
| | | | | | | | Add support for masked atomic operations (masked compare and swap, masked fetch and add). Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Merge branch 'misc' into for-nextRoland Dreier2010-03-011-1/+1
|\ | | | | | | | | Conflicts: drivers/infiniband/core/uverbs_main.c
| * IB/core: Fix and clean up ib_ud_header_init()Eli Cohen2010-02-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | ib_ud_header_init() first clears header and then fills up the various fields. Later on, it tests header->immediate_present, which it has already cleared, so the condition is always false. Fix this by adding an immediate_present parameter and setting header->immediate_present as is done with grh_present. Also remove unused calculation of header_len. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/mlx4: Simplify retrieval of ib_deviceEli Cohen2010-02-121-1/+1
|/ | | | | | | | struct ib_qp already holds a pointer to the ib device. No need to dive to the hw device object to retrieve it. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix queue overflow check in post_recvOr Gerlitz2010-01-061-1/+1
| | | | | | | | | In mlx4_ib_post_recv(), we should check the queue for overflow using recv_cq instead of send_cq (current code looks like a copy-and-paste mistake). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-161-13/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits) RDMA/cxgb3: Fix error paths in post_send and post_recv RDMA/nes: Fix stale ARP issue RDMA/nes: FIN during MPA startup causes timeout RDMA/nes: Free kmap() resources RDMA/nes: Check for zero STag RDMA/nes: Fix Xansation test crash on cm_node ref_count RDMA/nes: Abnormal listener exit causes loopback node crash RDMA/nes: Fix crash in nes_accept() RDMA/nes: Resource not freed for REJECTed connections RDMA/nes: MPA request/response error checking RDMA/nes: Fix query of ORD values RDMA/nes: Fix MAX_CM_BUFFER define RDMA/nes: Pass correct size to ioremap_nocache() RDMA/nes: Update copyright and branding string RDMA/nes: Add max_cqe check to nes_create_cq() RDMA/nes: Clean up struct nes_qp RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset RDMA/nes: Correct fast memory registration implementation IB/ehca: Fix error paths in post_send and post_recv ...
| * IB/mlx4: Remove limitation on LSO header sizeEli Cohen2009-11-121-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code has a limitation: an LSO header is not allowed to cross a 64 byte boundary. This patch removes this limitation by setting the WQE RR for large headers thus allowing LSO headers of any size. The extra buffer reserved for MLX4_IB_QP_LSO QPs has been doubled, from 64 to 128 bytes, assuming this is reasonable upper limit for header length. Also, this patch will cause IB_DEVICE_UD_TSO to be set only for HCA FW versions that set MLX4_DEV_CAP_FLAG_BLH; e.g. FW version 2.6.000 and higher. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Remove unneeded codeEli Cohen2009-11-121-1/+0
| | | | | | | | | | | | | | There is no such flag DE - the field is reserved and should be zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | tree-wide: fix assorted typos all over the placeAndré Goddard Rosa2009-12-041-1/+1
|/ | | | | | | | | | That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* IB/mlx4: Annotate CQ lockingRoland Dreier2009-09-051-4/+8
| | | | | | | | | mlx4_ib_lock_cqs()/mlx4_ib_unlock_cqs() are helper functions that lock/unlock both CQs attached to a QP in the proper order to avoid AB-BA deadlocks. Annotate this so sparse can understand what's going on (and warn us if we misuse these functions). Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add strong ordering to local inval and fast reg work requestsJack Morgenstein2009-06-051-0/+4
| | | | | | | | | | | | | | | | | The ConnectX Programmer's Reference Manual states that the "SO" bit must be set when posting Fast Register and Local Invalidate send work requests. When this bit is set, the work request will be executed only after all previous work requests on the send queue have been executed. (If the bit is not set, Fast Register and Local Invalidate WQEs may begin execution too early, which violates the defined semantics for these operations) This fixes the issue with NFS/RDMA reported in <http://lists.openfabrics.org/pipermail/general/2009-April/059253.html> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Don't overwrite fast registration page list when posting work requestJack Morgenstein2009-05-071-1/+1
| | | | | | | | | | | | | | | | | | The low-level mlx4 driver modified the page-list addresses for fast register work requests post send to big-endian, and set a "present" bit. This caused problems later when the consumer attempted to unmap the pages using the page-list (using the list addresses which were assumed to be still in CPU-endian order). Fix the mlx4 driver to allocate two buffers and use a private buffer for the hardware-format bus addresses. This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>, an NFS/RDMA server crash. The cause of the crash was found by Vu Pham of Mellanox. The fix is along the lines suggested by Steve Wise in comment #21 in bug 1571. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Remove __constant_{endian} usesHarvey Harrison2009-01-171-11/+11
| | | | | | | | | | | The base versions handle constant folding just fine, use them directly. The replacements are OK in the include/ files as they are not exported to userspace so we don't need the __ prefixed versions. This patch does not affect code generation at all. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix memory ordering problem when posting LSO sendsRoland Dreier2009-01-161-9/+19
| | | | | | | | | | | | | | | | | | The current work request posting code writes the LSO segment before writing any data segments. This leaves a window where the LSO segment overwrites the stamping in one cacheline that the HCA prefetches before the rest of the cacheline is filled with the correct data segments. When the HCA processes this work request, a local protection error may result. Fix this by saving the LSO header size field off and writing it only after all data segments are written. This fix is a cleaned-up version of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>. Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* mlx4_core: Add QP range reservation supportYevgeny Petrilin2008-10-101-2/+19
| | | | | | | | | | | | To allow allocating an aligned range of consecutive QP numbers, add an interface to reserve an aligned range of QP numbers and have the QP allocation function always take a QP number. This will be used for RSS support in the mlx4_en Ethernet driver and also potentially by IPoIB RSS support. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Set RLKEY bit for kernel QPsVladimir Sokolovsky2008-10-081-0/+3
| | | | | | | | Set RLKEY bit in the HW context for kernel QPs so that kernel QPs can use the reserved L_Key for memory reference. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix up fast register page list formatVladimir Sokolovsky2008-09-151-0/+6
| | | | | | | | | | | Byte swap the addresses in the page list for fast register work requests to big endian to match what the HCA expectx. Also, the addresses must have the "present" bit set so that the HCA knows it can access them. Otherwise the HCA will fault the first time it accesses the memory region. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Allow 4K messages for UD QPsAlex Naslednikov2008-08-071-1/+1
| | | | | | | | | | Current code limits the max message size to 2K for UD QPs, while MTU might be as big as 4K. This patch sets the maximum message size to 4K, which is needed for UD to work correctly on fabrics with a 4K MTU. Signed-off-by: Alex Naslednikov <xalex@mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver filesJack Morgenstein2008-07-251-0/+1
| | | | | | | | Update existing Mellanox copyright lines to 2008, and add such lines to files where they are missing. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add support for memory management extensions and local DMA L_KeyRoland Dreier2008-07-231-5/+67
| | | | | | | | | | | | | Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_segRoland Dreier2008-07-221-1/+1
| | | | | | | Make the struct name consistent with other WQE segment struct types defined in <linux/mlx4/qp.h>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Use kzalloc() for new QPs so flags are initialized to 0Eli Cohen2008-07-141-13/+2
| | | | | | | | | | | | | | | Current code uses kmalloc() and then just does a bitwise OR operation on qp->flags in create_qp_common(), which means that qp->flags may potentially have some unintended bits set. This patch uses kzalloc() and avoids further explicit clearing of structure members, which also shrinks the code: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-65 (-65) function old new delta create_qp_common 2024 1959 -65 Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add support for blocking multicast loopback packetsRon Livne2008-07-141-3/+18
| | | | | | | | | Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Remove extra code for RESET->ERR QP state transitionRoland Dreier2008-07-141-26/+0
| | | | | | | | | | Commit 65adfa91 ("IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions") added some extra code to handle a QP state transition from RESET to ERROR. However, the latest 1.2.1 version of the IB spec has clarified that this transition is actually not allowed, so we can remove this extra code again. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Configure QPs' max message size based on real device capabilityEli Cohen2008-07-141-1/+2
| | | | | | | | | | | | ConnectX returns the max message size it supports through the QUERY_DEV_CAP firmware command. When modifying a QP to RTR, the max message size for the QP must be specified. This value must not exceed the value declared through QUERY_DEV_CAP. The current code ignores the max allowed size and unconditionally sets the value to 2^31. This patch sets all QPs to the max value allowed as returned from firmware. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Optimize QP stampingEli Cohen2008-07-141-2/+6
| | | | | | | | | | | | | | | The idea is that for QPs with fixed size work requests (eg selective signaling QPs), before stamping the WQE, we read the value of the DS field, which gives the effective size of the descriptor as used in the previous post. Then we stamp only that area, since the rest of the descriptor is already stamped. When initializing the send queue buffer, make sure the DS field is initialized to the max descriptor size so that the subsequent stamping will be done on the entire descriptor area. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix creation of kernel QP with max number of send s/g entriesRoland Dreier2008-05-201-5/+8
| | | | | | | | | | | | | | | | | | When creating a kernel QP where the consumer asked for a send queue with lots of scatter/gater entries, set_kernel_sq_size() incorrectly returned an error if the send queue stride is larger than the hardware's maximum send work request descriptor size. This is not a problem; the only issue is to make sure that the actual descriptors used do not overflow the maximum descriptor size, so check this instead. Clamp the returned max_send_sge value to be no bigger than what query_device returns for the max_sge to avoid confusing hapless users, even if the hardware is capable of handling a few more s/g entries. This bug caused NFS/RDMA mounts to fail when the server adapter used the mlx4 driver. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix uninitialized-var warning in mlx4_ib_post_send()Andrew Morton2008-05-161-1/+1
| | | | | | | | | | drivers/infiniband/hw/mlx4/qp.c: In function 'mlx4_ib_post_send': drivers/infiniband/hw/mlx4/qp.c:1460: warning: 'seglen' may be used uninitialized in this function This is the dopey gcc-doesn't-know-that-foo(&var)-writes-to-var problem. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: expand ib_umem_get() prototypeArthur Kepner2008-04-291-1/+1
| | | | | | | | | | | | | | | | | | | Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: Arthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mlx4_core: Move kernel doorbell management into coreYevgeny Petrilin2008-04-231-3/+3
| | | | | | | | | In addition to mlx4_ib, there will be ethernet and FC consumers of mlx4_core, so move the code for managing kernel doorbells into the core module to avoid having to duplicate this multiple times. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Update QP state if query QP succeedsDotan Barak2008-04-161-5/+12
| | | | | | | | | If the QP was moved to another state (such as SQE) by the hardware, then after this change the user won't have to set the IBV_QP_CUR_STATE mask in order to execute modify QP in order to recover from this state. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/core: Add support for "send with invalidate" work requestsRoland Dreier2008-04-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Micro-optimize mlx4_ib_post_send()Roland Dreier2008-04-161-8/+8
| | | | | | | | | | | | | | Rather than have build_mlx_header() return a negative value on failure and the length of the segments it builds on success, add a pointer parameter to return the length and return 0 on success. This matches the calling convention used for build_lso_seg() and generates slightly smaller code -- eg, on 64-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-22 (-22) function old new delta mlx4_ib_post_send 2023 2001 -22 Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add IPoIB LSO supportEli Cohen2008-04-161-9/+63
| | | | | | | Add TSO support to the mlx4_ib driver. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/core: Add creation flags to struct ib_qp_init_attrEli Cohen2008-04-161-0/+3
| | | | | | | | | | | | | | Add a create_flags member to struct ib_qp_init_attr that will allow a kernel verbs consumer to create a pass special flags when creating a QP. Add a flag value for telling low-level drivers that a QP will be used for IPoIB UD LSO. The create_flags member will also be useful for XRC and ehca low-latency QP support. Since no create_flags handling is implemented yet, add code to all low-level drivers to return -EINVAL if create_flags is non-zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add IPoIB checksum offload supportEli Cohen2008-04-161-0/+3
| | | | | | | | | | | | ConnectX devices support checksum generation and verification of TCP and UDP packets for UD IPoIB messages. This patch checks if the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it does. It implements support for handling the IB_SEND_IP_CSUM send flag and setting the csum_ok field in receive work completions. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Ali Ayub <ali@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Endianness annotationsRoland Dreier2008-04-161-2/+2
| | | | | | Trivial fixes to stamp_send_wqe(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Use multiple WQ blocks to post smaller send WQEsJack Morgenstein2008-02-081-34/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | ConnectX HCA supports shrinking WQEs, so that a single work request can be made of multiple units of wqe_shift. This way, WRs can differ in size, and do not have to be a power of 2 in size, saving memory and speeding up send WR posting. Unfortunately, if we do this then the wqe_index field in CQEs can't be used to look up the WR ID anymore, so our implementation does this only if selective signaling is off. Further, on 32-bit platforms, we can't use vmap() to make the QP buffer virtually contigious. Thus we have to use constant-sized WRs to make sure a WR is always fully within a single page-sized chunk. Finally, we use WRs with the NOP opcode to avoid wrapping around the queue buffer in the middle of posting a WR, and we set the NoErrorCompletion bit to avoid getting completions with error for NOP WRs. However, NEC is only supported starting with firmware 2.2.232, so we use constant-sized WRs for older firmware. And, since MLX QPs only support SEND, we use constant-sized WRs in this case. When stamping during NOP posting, do stamping following setting of the NOP WQE valid bit. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Consolidate code to get an entry from a struct mlx4_bufRoland Dreier2008-02-061-5/+1
| | | | | | | | | | We use struct mlx4_buf for kernel QP, CQ and SRQ buffers, and the code to look up an entry is duplicated in get_cqe_from_buf() and the QP and SRQ versions of get_wqe(). Factor this out into mlx4_buf_offset(). This will also make it easier to switch over to using vmap() for buffers. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Lock SQ lock in mlx4_ib_post_send()Roland Dreier2007-10-301-2/+2
| | | | | | | | | | | Because of a typo, mlx4_ib_post_send() takes the same lock rq.lock as mlx4_ib_post_recv(). Correct the code so the intended sq.lock is taken when posting a send. Noticed by Yossi Leybovitch and pointed out by Jack Morgenstein from Mellanox. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Sanity check userspace send queue sizesJack Morgenstein2007-10-181-2/+14
| | | | | | | | | | | | | Add sanity checks to send queue sizes passed in from userspace. The minimum sq stride value below is taken from the MT25408 PRM (section 11.10, Table 306, log_sq_stride definition). Without this check, userspace can submit arbitrarily large/small values for the number of WQEs and the stride, which can crash the kernel. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Use __set_data_seg() in mlx4_ib_post_recv()Roland Dreier2007-10-091-5/+9
| | | | | | | | | | | | Use a __set_data_seg() helper in mlx4_ib_post_recv() too; in addition to making the code easier to read, this also allows gcc to generate better code -- on x86_64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8) function old new delta mlx4_ib_post_recv 359 351 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>