Eli Cohen [Fri, 11 Mar 2016 20:58:42 +0000 (22:58 +0200)]
IB/mlx5: Implement callbacks for manipulating VFs
Implement the IB defined callbacks used to manipulate the policy for the
link state, set GUIDs or get statistics information. This functionality
is added into a new file that will be used to add any SRIOV related
functionality to the mlx5 IB layer.
The following callbacks have been added:
mlx5_ib_get_vf_config
mlx5_ib_set_vf_link_state
mlx5_ib_get_vf_stats
mlx5_ib_set_vf_guid
In addition, publish whether this device is based on a virtual function.
In mlx5 supported devices, virtual functions are implemented as vHCAs.
vHCAs have their own QP number space so it is possible that two vHCAs
will use a QP with the same number at the same time.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:41 +0000 (22:58 +0200)]
net/mlx5_core: Implement modify HCA vport command
Implement the modify HCA vport commands used to modify the parameters of
virtual HCA's ports.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:40 +0000 (22:58 +0200)]
net/mlx5_core: Add VF param when querying vport counter
Add a vf parameter to mlx5_core_query_vport_counter so we can call it to
query counters of virtual functions. Also update current users of the
API.
PFs may call mlx5_core_query_vport_counter with other_vport set to
indicate that they are querying a virtual function. The virtual
function to be queried is given by the vf parameter. Virtual function
numbering is zero based so the first VF is 0 and so on. When a PF
queries its own function, the other_vport parameter is cleared.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:39 +0000 (22:58 +0200)]
IB/ipoib: Add ndo operations for configuring VFs
Add ndo operations to the network driver that enables configuring the
following operations:
ipoib_set_vf_link_state - configure the VF link policy
ipoib_get_vf_config - get link state configuration
ipoib_set_vf_guid - set a VF port or node GUID
ipoib_get_vf_stats - get statistics of a VF
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:38 +0000 (22:58 +0200)]
IB/core: Add interfaces to control VF attributes
Following the practice exercised for network devices which allow the PF
net device to configure attributes of its virtual functions, we
introduce the following functions to be used by IPoIB which is the
network driver implementation for IB devices.
ib_set_vf_link_state - set the policy for a VF link. More below.
ib_get_vf_config - read configuration information of a VF
ib_get_vf_stats - read VF statistics
ib_set_vf_guid - set the node or port GUID of a VF
Also add an indication in the device cap flags that indicates that this
IB devices is based on a virtual function.
A VF shares the physical port with the PF and other VFs. When setting
the link state we have three options:
1. Auto - in this mode, the virtual port follows the state of the
physical port and becomes active only if the physical port's state is
active. In all other cases it remains in a Down state.
2. Down - sets the state of the virtual port to Down
3. Up - causes the virtual port to transition into Initialize state if
it was not already in this state. A virtualization aware subnet manager
can then bring the state of the port into the Active state.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:37 +0000 (22:58 +0200)]
IB/core: Support accessing SA in virtualized environment
Per the ongoing standardisation process, when virtual HCAs are present
in a network, traffic is routed based on a destination GID. In order to
access the SA we use the well known SA GID.
We also add a GRH required boolean field to the port attributes which is
used to report to the verbs consumer whether this port is connected to a
virtual network. We use this field to realize whether we need to create
an address vector with GRH to access the subnet administrator. We clear
the port attributes struct before calling the hardware driver to make
sure the default remains that GRH is not required.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:36 +0000 (22:58 +0200)]
IB/core: Add subnet prefix to port info
The subnet prefix is a part of the port_info MAD returned and should be
available at the ib_port_attr struct. We define it here and provide a
default implementation in case the hardware driver does not provide one.
The subnet prefix is required when creating the address vector to access
the SA in networks where GRH must be used.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:35 +0000 (22:58 +0200)]
IB/mlx5: Fix decision on using MAD_IFC
Fix the condition that dictates when MAD_IFC should be used. According
to firmware specifications, MAD_IFC commands must be used only if the
ib_virt capability is off.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:34 +0000 (22:58 +0200)]
net/core: Add support for configuring VF GUIDs
Add two new NLAs to support configuration of Infiniband node or port
GUIDs. New applications can choose to use this interface to configure
GUIDs with iproute2 with commands such as:
ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78
A new ndo, ndo_sef_vf_guid is introduced to notify the net device of the
request to change the GUID.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:25 +0000 (10:25 +0200)]
IB/{core, ulp} Support above 32 possible device capability flags
The old bitwise device_cap_flags variable was limited to u32 which
has all bits already defined. In order to overcome it, we converted
device_cap_flags variable to be u64 type.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:24 +0000 (10:25 +0200)]
IB/core: Replace setting the zero values in ib_uverbs_ex_query_device
The setting to zero during variable initialization eliminates
the need to explicitly set to zero variables and structures.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Tue, 23 Feb 2016 08:25:23 +0000 (10:25 +0200)]
net/mlx5_core: Introduce offload arithmetic hardware capabilities
Define the necessary hardware structures for the offload
arithmetic capabilities and read/cache them on driver load.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:22 +0000 (10:25 +0200)]
net/mlx5_core: Refactor device capability function
Device capability function was called similar in all places.
It was called twice for every queried parameter, while the
difference between calls was in HCA capability mode only.
The change proposed unify these calls into one function.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:21 +0000 (10:25 +0200)]
net/mlx5_core: Fix caching ATOMIC endian mode capability
Add caching of maximum device capability of ATOMIC endian mode.
Fixes:
f91e6d8941bf ('net/mlx5_core: Add setting ATOMIC endian mode')
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Fri, 18 Mar 2016 05:41:59 +0000 (08:41 +0300)]
ib_srpt: fix a WARN_ON() message
The first argument of WARN_ON() is a condition, so it means the warning
message here will just be the name without the ->qp_num information.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Wed, 16 Mar 2016 17:57:43 +0000 (13:57 -0400)]
Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6
Faisal Latif [Fri, 26 Feb 2016 15:18:05 +0000 (09:18 -0600)]
iwpm: crash fix for large connections test
During large connection test, there is a crash at wake_up() in the callback as waitq is
not yet initialized. Callback can happen before iwpm_wait_complete_req() is called to
initialize waitq.
To resolve, using signaling semaphore instead of waitq.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Reviewed-by: Tatyana E Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Fri, 26 Feb 2016 15:18:04 +0000 (09:18 -0600)]
iw_cxgb3: support for iWARP port mapping
Now with the new iWARP port mapping service in the iwcm, it is
trivial to add cxgb3 support.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Fri, 26 Feb 2016 15:18:03 +0000 (09:18 -0600)]
iw_cxgb4: remove port mapper related code
Now that most of the port mapper code been moved to iwcm, we can remove
it from iw_cxgb4.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:02 +0000 (09:18 -0600)]
iw_nes: remove port mapper related code
Now that most of the port mapper code been moved to iwcm, we can
remove it from port mapper service user drivers.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:00 +0000 (09:18 -0600)]
iwcm: common code for port mapper
moved port mapper related code from drivers into common code
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Wed, 16 Mar 2016 17:38:28 +0000 (13:38 -0400)]
Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6
Doug Ledford [Mon, 14 Mar 2016 21:42:57 +0000 (17:42 -0400)]
Merge branches 'ib_core', 'ib_ipoib', 'srpt', 'drain-cq-v4' and 'net/9p' into k.o/for-4.6
Christoph Hellwig [Thu, 3 Mar 2016 08:36:06 +0000 (09:36 +0100)]
net/9p: convert to new CQ API
Trivial conversion to the new RDMA CQ API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Maor Gottlieb [Mon, 7 Mar 2016 16:51:47 +0000 (18:51 +0200)]
IB/mlx5: Add support for don't trap rules
Each bypass flow steering priority will be split into two priorities:
1. Priority for don't trap rules.
2. Priority for normal rules.
When user creates a flow using IB_FLOW_ATTR_FLAGS_DONT_TRAP flag, the
driver creates two flow rules, one used for receiving the traffic and
the other one for forwarding the packet to continue matching in lower
or equal priorities.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Maor Gottlieb [Mon, 7 Mar 2016 16:51:46 +0000 (18:51 +0200)]
net/mlx5_core: Introduce forward to next priority action
Add support to create flow rule that forward packets
to the first flow table in the next priority (next priority
could be the first priority in the next namespace or the
next priority in the same namespace).
This feature could be used for DONT_TRAP rules or rules
that only want to mark the packet with flow tag.
In order to do it optimally, each flow table has list
of all rules that point to this flow table,
when a flow table is destroyed/created, we update the list
head correspondingly.
This kind of rule is created when destination is NULL and
action is MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Maor Gottlieb [Mon, 7 Mar 2016 16:51:45 +0000 (18:51 +0200)]
net/mlx5_core: Create anchor of last flow table
Create an empty flow table in the end of NIC rx namesapce.
Adding this flow table simplify the implementation of "forward
to next prio" rules.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Mon, 29 Feb 2016 17:07:34 +0000 (19:07 +0200)]
iser: Accept arbitrary sg lists mapping if the device supports it
If the device support arbitrary sg list mapping (device cap
IB_DEVICE_SG_GAPS_REG set) we allocate the memory regions with
IB_MR_TYPE_SG_GAPS and allow the block layer to pass us
gaps by skip setting the queue virt_boundary.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Mon, 29 Feb 2016 17:07:33 +0000 (19:07 +0200)]
mlx5: Add arbitrary sg list support
Allocate proper context for arbitrary scatterlist registration
If ib_alloc_mr is called with IB_MR_MAP_ARB_SG, the driver
allocate a private klm list instead of a private page list.
Set the UMR wqe correctly when posting the fast registration.
Also, expose device cap IB_DEVICE_MAP_ARB_SG according to the
device id (until we have a FW bit that correctly exposes it).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Mon, 29 Feb 2016 17:07:32 +0000 (19:07 +0200)]
IB/core: Add arbitrary sg_list support
Devices that are capable in registering SG lists
with gaps can now expose it in the core to ULPs
using a new device capability IB_DEVICE_SG_GAPS_REG
(in a new field device_cap_flags_ex in the device attributes
as we ran out of bits), and a new mr_type IB_MR_TYPE_SG_GAPS_REG
which allocates a memory region which is capable of handling
SG lists with gaps.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Thu, 3 Mar 2016 11:37:51 +0000 (13:37 +0200)]
IB/mlx5: Expose correct max_fast_reg_page_list_len
While documentation indicates that the number of translation
entries per memory key is unlimited, in practice, we can
only fit a finite amount of translation entries in a single
registration wqe (which is log_max_klm_list_size).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Thu, 3 Mar 2016 16:23:37 +0000 (11:23 -0500)]
IB/mlx5: Make coding style more consistent
These three related functions can't agree whether to put the
umrwr on the stack dirty and then memset it, or to initialize
it on the stack. Make them all agree.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Christoph Hellwig [Thu, 3 Mar 2016 08:38:22 +0000 (09:38 +0100)]
IB/mlx5: Convert UMR CQ to new CQ API
Simplifies the code, and makes it more fair vs other users by using a
softirq for polling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Markus Elfring [Sat, 26 Dec 2015 17:40:43 +0000 (18:40 +0100)]
IB/ocrdma: Skip using unneeded intermediate variable
Return the value from a call of the ocrdma_mbx_modify_qp() function
without using an extra assignment for the local variable "status".
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Markus Elfring [Sat, 26 Dec 2015 17:28:35 +0000 (18:28 +0100)]
IB/ocrdma: Skip using unneeded intermediate variable
Return zero at the end without using the local variable "status".
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Markus Elfring [Sat, 26 Dec 2015 17:18:18 +0000 (18:18 +0100)]
IB/ocrdma: Delete unnecessary variable initialisations in 11 functions
The variable "status" will be set to an appropriate value a bit later.
Thus let us omit the explicit initialisation at the beginning.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Hal Rosenstock [Tue, 5 Jan 2016 18:52:55 +0000 (13:52 -0500)]
IB/core: Documentation fix in the MAD header file
In ib_mad.h, ib_mad_snoop_handler uses send_buf rather than send_wr
Signed-off-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Parav Pandit [Tue, 1 Mar 2016 19:20:29 +0000 (00:50 +0530)]
IB/core: trivial prink cleanup.
1. Replaced printk with appropriate pr_warn, pr_err, pr_info.
2. Removed unnecessary prints around memory allocation failure
which are not required, as reported by the checkpatch script.
Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Amitoj Kaur Chawla [Fri, 12 Feb 2016 07:46:10 +0000 (13:16 +0530)]
IB/core: Replace memset with eth_zero_addr
Use eth_zero_addr to assign the zero address to the given address
array instead of memset when second argument is address of zero.
The Coccinelle semantic patch used to make this change is as follows:
// <smpl>
@eth_zero_addr@
expression e;
@@
-memset(e,0x00,ETH_ALEN);
+eth_zero_addr(e);
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Sun, 14 Feb 2016 15:07:49 +0000 (17:07 +0200)]
IB/core: Modify conditional on ucontext existence
Since we allow to call legacy verbs using their extended counterpart,
the check on ucontext has to move up to a common area in case this verb
is ever extended.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Sun, 14 Feb 2016 15:07:48 +0000 (17:07 +0200)]
IB/core: IB/core: Allow legacy verbs through extended interfaces
When an extended verb is an extension to a legacy verb, the original
functionality is preserved. Hence we do not require each hardware driver
to set the extended capability. This will allow the use of the extended
verb in its simple form with drivers that do not support the extended
capability.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Sun, 14 Feb 2016 15:07:47 +0000 (17:07 +0200)]
IB/core: Avoid duplicate code
Move the check on the validity of the command to a common area.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Thu, 3 Mar 2016 15:18:41 +0000 (10:18 -0500)]
Merge branch 'k.o/for-4.5-rc' into HEAD
Majd Dibbiny [Sun, 14 Feb 2016 16:35:52 +0000 (18:35 +0200)]
IB/{core, mlx5}: Fix input len in vendor part of create_qp/srq
Currently, the inlen field of the vendor's part of the command
doesn't match the command buffer. This happens because the inlen
accommodates ib_uverbs_cmd_hdr which is deducted from the in buffer.
This is problematic since the vendor function could be called either
from the legacy verb (where the input length mismatches the actual
length) or by the extended verb (where the length matches). The vendor
has no idea which function calls it and therefore has no way to know
how the length variable should be treated.
Fixing this by aligning the inlen to the correct length.
All vendor drivers either assumed that inlen >= sizeof(vendor_uhw_cmd)
or just failed wrongly (mlx5) and fixed in this patch.
Fixes:
cfb5e088e26a ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Majd Dibbiny [Sun, 14 Feb 2016 16:35:51 +0000 (18:35 +0200)]
IB/mlx5: Avoid using user-index for SRQs
Normal SRQs, unlike XRC SRQs, don't have user-index, therefore
avoid verifying it and using it.
Fixes:
cfb5e088e26a ('IB/mlx5: Add CQE version 1 support to user QPs and SRQs')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Hans Westgaard Ry [Wed, 2 Mar 2016 12:44:28 +0000 (13:44 +0100)]
IB/ipoib: Add handling for sending of skb with many frags
IPoIB converts skb-fragments to sge adding 1 extra sge when SG is enabled.
Current codepath assumes that the max number of sge a device support
is at least MAX_SKB_FRAGS+1, there is no interaction with upper layers
to limit number of fragments in an skb if a device suports fewer
sges. The assumptions also lead to requesting a fixed number of sge
when IPoIB creates queue-pairs with SG enabled.
A fallback/slowpath is implemented using skb_linearize to
handle cases where the conversion would result in more sges than supported.
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: HÃ¥kon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Matan Barak [Mon, 29 Feb 2016 16:05:30 +0000 (18:05 +0200)]
IB/mlx5: Add memory windows allocation support
This patch adds user-space support for memory windows allocation and
deallocation. It also exposes the supported types via
query_device_caps verb.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Matan Barak [Mon, 29 Feb 2016 16:05:29 +0000 (18:05 +0200)]
IB/core: Add vendor's specific data to alloc mw
Passing udata to the vendor's driver in order to pass data from the
user-space driver to the kernel-space driver. This data will be
used in downstream patches.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Matan Barak [Mon, 29 Feb 2016 16:05:28 +0000 (18:05 +0200)]
net/mlx5: Refactor mlx5_core_mr to mkey
Mlx5's mkey mechanism is also used for memory windows.
The current code base uses MR (memory region) naming, which is
inaccurate. Changing MR to mkey in order to represent its different
usages more accurately.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Noa Osherovich [Mon, 29 Feb 2016 14:46:51 +0000 (16:46 +0200)]
IB/mlx5: Added support for re-registration of MRs
This patch adds support for re-registration of memory regions in MLX5.
The functionality is basically the same as deregister followed by
register, but attempts to reuse the existing resources as much as
possible.
Original memory keys are kept if possible, saving the need to
communicate new ones to remote peers.
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Noa Osherovich [Mon, 29 Feb 2016 14:46:50 +0000 (16:46 +0200)]
IB/mlx5: Refactoring register MR code
In order to add re-registration of memory region, some logic was
extracted to separate functions:
- ODP related logic.
- Some of the UMR WQE preparation code.
- DMA mapping.
- Umem creation.
- Creating MKey using FW interface.
- MR fields assignments after successful creation.
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:12 +0000 (15:45 +0200)]
IB/cma: Print warning on different inner and header P_Keys
Commit
4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA
CM") added checks for incoming RDMA CM requests that they can be matched to
a netdev based on the P_Key in the BTH of the request. This behavior was
reverted in commit
ab3964ad2acf ("IB/cma: Use inner P_Key to determine
netdev"), since the mlx5 and ipath drivers didn't send the correct value
in the BTH P_Key.
Since the ipath driver was removed, and the mlx5 driver can now send GSI
packets on different P_Keys, we could revert the patch to let the rdma_cm
module look on the BTH P_Key when deciding to what netdev a packet belongs.
However, that still breaks compatibility with the older drivers.
Change the behavior to print a warning when receiving a request that has a
different BTH P_Key and inner payload P_Key. In the future, after users
have seen the warnings and upgraded their setups, remove the warning and
block these requests.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:11 +0000 (15:45 +0200)]
IB/mlx5: Eliminate GSI RX QP's send buffers
Now that the transmission of GSI MADs is done with the special transmission
QPs, eliminate the send buffers in the GSI receive QP.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:10 +0000 (15:45 +0200)]
IB/mlx5: Pick the right GSI transmission QP for sending
Pick the QP to use according to the wr.ud.pkey_index field in the work
request. If the QP doesn't exist, it means the P_Key is zero and the packet
would have been dropped, so just generate a completion and move on.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:09 +0000 (15:45 +0200)]
IB/mlx5: Reorder GSI completions
The emulated GSI QP's send completions are generated by multiple hardware
QPs, so their completions could arrive out of order with respect to the
order their work request were submitted.
Reorder the completions by keeping a list of the posted work request and
their completions. A newly received completion from the hardware updates
the list and marks its work request as completed. However, the completions
are only reported to the client according to the list order.
In order to support that, create a new private CQ to handle the hardware
completions.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:08 +0000 (15:45 +0200)]
IB/mlx5: Generate completions in software
The GSI QP emulation requires also emulating completions for transmitted
MADs. The CQ on which these completions are generated can also be used by
the hardware, and the MAD layer is free to use any CQ of the device for the
GSI QP.
Add a method for generating software completions to each mlx5 CQ. Software
completions are polled first, and generate calls to the completion handler
callback if necessary.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:07 +0000 (15:45 +0200)]
IB/mlx5: Create GSI transmission QPs when P_Key table is changed
Whenever the P_Key table is changed, we create the required GSI
transmission QPs on-demand.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:06 +0000 (15:45 +0200)]
IB/mlx5: Create multiple transmission GSI QPs
In order to send GSI MADs on different P_Keys, mlx5 needs different QPs to
be created, each with a different P_Key set when the QP is modified to the
INIT state.
Create QPs for each non-zero P_Key in the P_Key table.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:05 +0000 (15:45 +0200)]
IB/mlx5: Add GSI QP wrapper
mlx5 creates special GSI QPs that has limited ability to control the P_Key
of transmitted packets. The sent P_Key is taken from the QP object,
similarly to what happens with regular UD QPs.
Create a software wrapper around GSI QPs that with the following patches
will be able to emulate the functionality of a GSI QP including control of
the P_Key per work request.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:04 +0000 (15:45 +0200)]
IB/mlx5: Modify QP debugging prints
Add debugging prints to the modify QP verb to help understand the cause a
returned error.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Haggai Eran [Mon, 29 Feb 2016 13:45:03 +0000 (15:45 +0200)]
IB/mlx5: Add support for setting source QP number
In order to create multiple GSI QPs, we need to set the source QP number to
one on all these QPs. Add the necessary definitions and infrastructure to
do that.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Erez Shitrit [Sun, 21 Feb 2016 14:27:18 +0000 (16:27 +0200)]
IB/mlx5: Add support for CSUM in RX flow
The driver checks the csum from the HW when completion arrived and marks
it in the wc->wc_flags field for the ulp drivers.
These is for packets from type IB_WC_RECV only.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Erez Shitrit [Sun, 21 Feb 2016 14:27:17 +0000 (16:27 +0200)]
IB/mlx5: Implement UD QP offloads for IPoIB in the TX flow
In order to support LSO and CSUM in the TX flow the driver does the
following:
* LSO bit for the enum mlx5_ib_qp_flags was added, indicates QP that
supports LSO offloads.
* Enables the special offload when the QP is created, and enable the
special work request id (IB_WR_LSO) when comes.
* Calculates the size of the WQE according to the new WQE format that
support these offloads.
* Handles the new WQE format when arrived, sets the relevant
fields, and copies the needed data.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Erez Shitrit [Sun, 21 Feb 2016 14:27:16 +0000 (16:27 +0200)]
IB/mlx5: Define interface bits for IPoIB offloads
The HW can supply several offloads for UD QP, added offloads for
checksumming for both TX and RX and LSO for TX.
Two new bits were added in order to expose and enable these offloads:
1. HCA capability bit: declares the support for IPoIB basic offloads.
2. QPC bit which will be used in the QP creation flow, which set these
abilities in the QP.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Meny Yossefi [Thu, 18 Feb 2016 16:15:01 +0000 (18:15 +0200)]
IB/mlx5: Modify MAD reading counters method to use counter registers
Modify mlx5_ib_process_mad to use PPCNT and query_vport commands
instead of MAD_IFC, as MAD_IFC is deprecated on new firmware
versions (and doesn't support RoCE anyway).
Traffic counters exist in both 32-bit and 64-bit forms.
Declaring support of extended coutners results in traffic counters
to be read in their 64-bit form only via the query_vport command.
Error counters exist only in 32-bit form and read via PPCNT command.
This commit also adds counters support in RoCE.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Meny Yossefi [Thu, 18 Feb 2016 16:15:00 +0000 (18:15 +0200)]
net/mlx5_core: Add helper function to read IB error counters
Added helper function to read IB standard error counters
via the PPCNT register.
The PPCNT register read command provides the 32-bit error counters
of both IB/RoCE link layer and transport layer.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Meny Yossefi [Thu, 18 Feb 2016 16:14:59 +0000 (18:14 +0200)]
net/mlx5_core: Add helper function to read virtual port counters
Added helper function to read 64bit virtual port Infiniband traffic
counters.
Signed-off-by: Meny Yossefi <menyy@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Sun, 21 Feb 2016 16:12:26 +0000 (18:12 +0200)]
IB/core: Fix missed clean call in registration path
In case of failure returned from query function in
IB device registration, we need to clean IB cache which
was missed.
This change fixes it.
Fixes:
3e153a93a1c1 ('IB/core: Save the device attributes on the device
structure')
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:10:09 +0000 (11:10 -0800)]
IB/srpt: Fix wait list processing
Since the wait list is not protected against concurrent access
it must be processed from the context of the completion handler.
Replace the wait list processing code in the IB CM RTU callback
handler by code that triggers a completion handler. This patch
fixes the following rare crash:
WARNING: CPU: 2 PID: 78656 at lib/list_debug.c:53 __list_del_entry+0x67/0xd0()
list_del corruption,
ffff88041ae404b8->next is LIST_POISON1 (
dead000000000100)
Call Trace:
[<
ffffffff81251c6b>] dump_stack+0x4f/0x74
[<
ffffffff810574ab>] warn_slowpath_common+0x8b/0xd0
[<
ffffffff81057591>] warn_slowpath_fmt+0x41/0x70
[<
ffffffff8126f007>] __list_del_entry+0x67/0xd0
[<
ffffffff8126f081>] list_del+0x11/0x40
[<
ffffffffa0265242>] srpt_cm_handler+0x172/0x1a4 [ib_srpt]
[<
ffffffffa0370370>] cm_process_work+0x20/0xf0 [ib_cm]
[<
ffffffffa0370dae>] cm_establish_handler+0xbe/0x110 [ib_cm]
[<
ffffffffa03733e7>] cm_work_handler+0x67/0xd0 [ib_cm]
[<
ffffffff8107184d>] process_one_work+0x1bd/0x460
[<
ffffffff81073148>] worker_thread+0x118/0x420
[<
ffffffff81078444>] kthread+0xe4/0x100
[<
ffffffff8151caff>] ret_from_fork+0x3f/0x70
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:09:50 +0000 (11:09 -0800)]
IB/srpt: Introduce srpt_process_wait_list()
This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:09:28 +0000 (11:09 -0800)]
IB/srpt: Log out all initiators if a port is disabled
If an initiator observes LUN deletion during shutdown of the
target stack then that will trigger an I/O error even when using
multipathd. Users need a way to avoid that shutting down the
target stack causes I/O errors, e.g. by providing a way to force
initiator logout. Hence close all sessions if a target port is
disabled.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:09:10 +0000 (11:09 -0800)]
IB/srpt: Fix srpt_write_pending()
The only allowed return values for the write_pending() callback
function are 0, -EAGAIN and -ENOMEM. Since attempting to perform
RDMA over a disconnecting channel will result in an IB error
completion anyway, remove the code that checks the channel state
from srpt_write_pending().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:08:53 +0000 (11:08 -0800)]
IB/srpt: Detect session shutdown reliably
The Last WQE Reached event is only generated after one or more work
requests have been queued on the QP associated with a session. Since
session shutdown can start before any work requests have been queued,
use a zero-length RDMA write to wait until a QP has been drained.
Additionally, rework the code for closing and disconnecting a session.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:08:34 +0000 (11:08 -0800)]
IB/srpt: Use a mutex to protect the channel list
In a later patch a function that can block will be called while
iterating over the rch_list. Hence protect that list with a
mutex instead of a spinlock. And since it is not allowed to sleep
while the task state != TASK_RUNNING, convert the list test in
srpt_ch_list_empty() into a lockless test.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:08:12 +0000 (11:08 -0800)]
IB/srpt: Log private data associated with REJ
To make it possible to determine why an initiator sent a REJ,
log the private data associated with the received REJ packet.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:07:49 +0000 (11:07 -0800)]
IB/srpt: Eliminate srpt_find_channel()
In the CM REQ message handler, store the channel pointer in
cm_id->context such that the function srpt_find_channel() is no
longer needed. Additionally, make the CM event messages more
informative.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:07:29 +0000 (11:07 -0800)]
IB/srpt: Inline trivial CM callback functions
Inline those CM callback functions that are only two lines long.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:07:11 +0000 (11:07 -0800)]
IB/srpt: Fix how aborted commands are processed
srpt_abort_cmd() must not be called in state SRPT_STATE_DATA_IN. Issue
a warning if this occurs.
srpt_abort_cmd() must not invoke target_put_sess_cmd() for commands
in state SRPT_STATE_DONE because the srpt_abort_cmd() callers already
do this when necessary. Hence remove this call.
If an RDMA read fails the corresponding SCSI command must fail. Hence
add a transport_generic_request_failure() call.
Remove an incorrect srpt_abort_cmd() call from srpt_rdma_write_done().
Avoid that srpt_send_done() calls srpt_abort_cmd() for finished SCSI
commands.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:06:55 +0000 (11:06 -0800)]
IB/srpt: Fix srpt_handle_cmd() error paths
The target core function that should be called if target_submit_cmd()
fails is target_put_sess_cmd(). Additionally, change the return type
of srpt_handle_cmd() from int into void.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:06:14 +0000 (11:06 -0800)]
IB/srpt: Fix srpt_close_session()
Avoid that srpt_close_session() waits if it doesn't have to wait.
Additionally, increase the time during which srpt_close_session()
waits until closing a session has finished. This makes it easier
to detect session shutdown bugs.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:05:58 +0000 (11:05 -0800)]
IB/srpt: Simplify srpt_shutdown_session()
The target core guarantees that shutdown_session() is only invoked
once per session. This means that the ib_srpt target driver doesn't
have to track whether or not shutdown_session() has been called.
Additionally, ensure that target_sess_cmd_list_set_waiting() is
called before target_wait_for_sess_cmds() by moving it into
srpt_release_channel_work().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:05:38 +0000 (11:05 -0800)]
IB/srpt: Simplify channel state management
The only allowed channel state changes are those that change
the channel state into a state with a higher numerical value.
This allows to merge the functions srpt_set_ch_state() and
srpt_test_and_set_ch_state() into a single function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:05:19 +0000 (11:05 -0800)]
IB/srpt: Use scsilun_to_int()
Just like other target drivers, use scsilun_to_int() to unpack SCSI
LUN numbers. This patch only changes the behavior of ib_srpt for LUN
numbers >= 16384.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:05:01 +0000 (11:05 -0800)]
IB/srpt: Introduce target_reverse_dma_direction()
Use the function target_reverse_dma_direction() instead of
reimplementing it.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:04:43 +0000 (11:04 -0800)]
IB/srpt: Inline srpt_get_ch_state()
The callers of srpt_get_ch_state() can access ch->state safely without
using locking. Hence inline this function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:04:20 +0000 (11:04 -0800)]
IB/srpt: Inline srpt_sdev_name()
srpt_sdev_name() is too trivial to keep it as a separate function.
Hence inline this function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:04:02 +0000 (11:04 -0800)]
IB/srpt: Remove struct srpt_node_acl
Since struct srpt_node_acl is identical to struct se_node_acl,
remove the definition of the former structure. This patch does
not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:03:31 +0000 (11:03 -0800)]
IB/srpt: Add parentheses around sizeof argument
Although sizeof is an operator and hence in many cases parentheses can
be left out, the recommended kernel coding style is to surround the
sizeof argument with parentheses. This patch does not change any
functionality. It has been generated by running the following shell
command:
sed -i 's/sizeof \([^ );,]*\)/sizeof(\1)/g' drivers/infiniband/ulp/srpt/*.[ch]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bart Van Assche [Thu, 11 Feb 2016 19:03:09 +0000 (11:03 -0800)]
IB/srpt: Simplify srpt_handle_tsk_mgmt()
Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.
This patch fixes the following kernel crash:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000001
IP: [<
ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
[<
ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
[<
ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
[<
ffffffff8109726f>] kthread+0xcf/0xe0
[<
ffffffff81613cfc>] ret_from_fork+0x7c/0xb0
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes:
3e4f574857ee ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Thu, 28 Jan 2016 13:59:59 +0000 (08:59 -0500)]
RDMA/ocrdma: Support user AH creation for RoCE-v2
This patch adds support to create RoCE-v2 compatible AH. It uses ahid
field to tell network-header-type to user space library. The library
has to decode network-header-type from ahid field.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Thu, 28 Jan 2016 13:59:58 +0000 (08:59 -0500)]
RDMA/ocrdma: Support RoCE-v2 in the RC path
This patch implements following changes to support RoCE-v2
in the RC path:
* Get the GID-type for a given sgid.
* Based on the GID-type get IPv4/IPv6 L3-address
and give those to underlying device.
* Resolve and provide network header type to device.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devesh Sharma [Thu, 28 Jan 2016 13:59:57 +0000 (08:59 -0500)]
RDMA/ocrdma: Support RoCE-v2 in the UD path
This patch adds following changes to support RoCE-v2
in the UD path.
* During AH creation GID-type is resolved for a given gid-index.
* Based on GID-type protocol header is built.
* Work completion reports network header type and set
IB_WC_WITH_NETWORK_HDR_TYPE flag in wc->wc_flags to indicate
that the network header type is valid.
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Somnath Kotur [Thu, 28 Jan 2016 13:59:56 +0000 (08:59 -0500)]
RDMA/ocrdma: Export udp encapsulation capability
Add support to read device configuration and initialize port-immutables
to report UDP-Encap flag during port query.
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Insu Yun [Wed, 17 Feb 2016 18:06:33 +0000 (13:06 -0500)]
nes: handling failed allocation when creating workqueue
Since create_singlethread_workqueue uses kzalloc internally,
it can fail when the system is under memory pressure, so need
to handle it.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Reviewed-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ben Hutchings [Mon, 15 Feb 2016 21:25:44 +0000 (21:25 +0000)]
RDMA/nes: Replace LRO with GRO
GRO is simpler to use than the old inet_lro library, and is compatible
with forwarding and bridging configurations.
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Marina Varshaver [Thu, 18 Feb 2016 16:31:06 +0000 (18:31 +0200)]
IB/mlx4: Add support for the don't trap rule
Add support for receiving multicast/unicast traffic with
the don't trap rule.
Sniffing these packets requires a flow steering rule of type NORMAL
at priority 0 with flag IB_FLOW_ATTR_FLAGS_DONT_TRAP set.
Choosing between multicast or unicast is done via ethernet L2 dest_mac
mask and value:
- If mask is all zeros - unicast and multicast are set.
- If mask non zero - only mask with multicast bit 1 and rest 0 is
supported, the mac value will choose if it is
multicast or unicast rule.
If the mask multicast bit is on and some other bits are on too, it means
a request for specific multicast or unicast, this is not supported,
either receive all multicast or all unicast.
Only when limitations are met registered QP will receive requested type
but other QPs can receive same traffic if registered for it.
Otherwise, if limitations are not met, an error will be returned.
Limitations:
- Rule must be with priority 0.
- A0 mode is not supported.
- Sniffer QP cannot appear in any other flow steering rule.
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Marina Varshaver [Thu, 18 Feb 2016 16:31:05 +0000 (18:31 +0200)]
IB/core: Add don't trap flag to flow creation
Don't trap flag (i.e. IB_FLOW_ATTR_FLAGS_DONT_TRAP) indicates that QP
will receive traffic, but will not steal it.
When a packet matches a flow steering rule that was created with
the don't trap flag, the QPs assigned to this rule will get this
packet, but matching will continue to other equal/lower priority
rules. This will let other QPs assigned to those rules to get the
packet too.
If both don't trap rule and other rules have the same priority
and match the same packet, the behavior is undefined.
The don't trap flag can't be set with default rule types
(i.e. IB_FLOW_ATTR_ALL_DEFAULT, IB_FLOW_ATTR_MC_DEFAULT) as default rules
don't have rules after them and don't trap has no meaning here.
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Abhilash Jindal [Sun, 31 Jan 2016 18:53:31 +0000 (13:53 -0500)]
IB/mlx4: Use boottime
Wall time obtained from ktime_get_real_ns is susceptible to sudden jumps due to
user setting the time or due to NTP. Boot time is constantly increasing time
better suited for comparing two timestamps.
Signed-off-by: Abhilash Jindal <klock.android@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Wed, 17 Feb 2016 16:17:12 +0000 (08:17 -0800)]
IB/iser: Use ib_drain_sq()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Wed, 17 Feb 2016 16:15:42 +0000 (08:15 -0800)]
IB/srp: Use ib_drain_rq()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>