Doug Ledford [Mon, 21 Mar 2016 21:32:23 +0000 (17:32 -0400)]
Merge branches 'i40iw', 'sriov' and 'hfi1' into k.o/for-4.6
Eli Cohen [Fri, 11 Mar 2016 20:58:43 +0000 (22:58 +0200)]
IB/ipoib: Allow mcast packets from other VFs
With SRIOV enabled, two VFs on the same HCA have the same port LID
and may have the same QP number. To enable receiving multicasts from
such VFs, further qualify the check: ignore the receive only if, in
addition, the packet source gid equals the receiving VF's source gid.
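The qualified check described above can be sketched in plain C as follows. This is a minimal user-space illustration, not the driver code: the structures `rx_info` and `local_port`, their fields, and the function `is_own_multicast` are hypothetical names chosen for the example, and the fallback for packets without a GRH is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a received multicast packet. */
struct rx_info {
	uint16_t slid;      /* source LID */
	uint32_t src_qp;    /* source QP number */
	bool     has_grh;   /* true if a GRH, and thus a source GID, is present */
	uint8_t  sgid[16];  /* source GID, valid only if has_grh is set */
};

/* Hypothetical, simplified view of the receiving interface. */
struct local_port {
	uint16_t lid;
	uint32_t qp_num;
	uint8_t  gid[16];
};

/* Return true if the packet should be ignored as our own multicast echo. */
static bool is_own_multicast(const struct rx_info *rx,
			     const struct local_port *lp)
{
	assert(rx && lp);

	/* Different LID or QP number: certainly not sent by us. */
	if (rx->slid != lp->lid || rx->src_qp != lp->qp_num)
		return false;

	/*
	 * Same LID and QP number. Two VFs on one HCA can collide here,
	 * so additionally require the source GID to match our own.
	 * Assumption: without a GRH there is no GID to compare, so the
	 * packet is still treated as our own.
	 */
	if (rx->has_grh && memcmp(rx->sgid, lp->gid, sizeof(lp->gid)) != 0)
		return false;

	return true;
}
```

A packet from another VF with a matching LID and QP number but a different source GID is therefore accepted rather than dropped.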
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:42 +0000 (22:58 +0200)]
IB/mlx5: Implement callbacks for manipulating VFs
Implement the IB defined callbacks used to manipulate the policy for the
link state, set GUIDs or get statistics information. This functionality
is added into a new file that will be used to add any SRIOV related
functionality to the mlx5 IB layer.
The following callbacks have been added:
mlx5_ib_get_vf_config
mlx5_ib_set_vf_link_state
mlx5_ib_get_vf_stats
mlx5_ib_set_vf_guid
In addition, publish whether this device is based on a virtual function.
In mlx5 supported devices, virtual functions are implemented as vHCAs.
vHCAs have their own QP number space so it is possible that two vHCAs
will use a QP with the same number at the same time.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:41 +0000 (22:58 +0200)]
net/mlx5_core: Implement modify HCA vport command
Implement the modify HCA vport commands used to modify the parameters of
virtual HCA's ports.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:40 +0000 (22:58 +0200)]
net/mlx5_core: Add VF param when querying vport counter
Add a vf parameter to mlx5_core_query_vport_counter so we can call it to
query counters of virtual functions. Also update current users of the
API.
PFs may call mlx5_core_query_vport_counter with other_vport set to
indicate that they are querying a virtual function. The virtual
function to be queried is given by the vf parameter. Virtual function
numbering is zero based so the first VF is 0 and so on. When a PF
queries its own function, the other_vport parameter is cleared.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:39 +0000 (22:58 +0200)]
IB/ipoib: Add ndo operations for configuring VFs
Add ndo operations to the network driver that enable configuring VFs.
The following operations are added:
ipoib_set_vf_link_state - configure the VF link policy
ipoib_get_vf_config - get link state configuration
ipoib_set_vf_guid - set a VF port or node GUID
ipoib_get_vf_stats - get statistics of a VF
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:38 +0000 (22:58 +0200)]
IB/core: Add interfaces to control VF attributes
Following the practice exercised for network devices which allow the PF
net device to configure attributes of its virtual functions, we
introduce the following functions to be used by IPoIB which is the
network driver implementation for IB devices.
ib_set_vf_link_state - set the policy for a VF link. More below.
ib_get_vf_config - read configuration information of a VF
ib_get_vf_stats - read VF statistics
ib_set_vf_guid - set the node or port GUID of a VF
Also add a device cap flag that indicates that this IB device is
based on a virtual function.
A VF shares the physical port with the PF and other VFs. When setting
the link state we have three options:
1. Auto - in this mode, the virtual port follows the state of the
physical port and becomes active only if the physical port's state is
active. In all other cases it remains in a Down state.
2. Down - sets the state of the virtual port to Down
3. Up - causes the virtual port to transition into Initialize state if
it was not already in this state. A virtualization aware subnet manager
can then bring the state of the port into the Active state.
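The three policies above can be sketched as a small state function. This is an illustration only: the enum names and the function `vport_state_after_policy` are hypothetical, and the later transition to Active performed by the subnet manager in the Up case is outside the sketch.

```c
#include <assert.h>

/* Hypothetical names for the three link policies described above. */
enum vf_link_policy { VF_LINK_AUTO, VF_LINK_DOWN, VF_LINK_UP };

/* Simplified set of IB port states. */
enum port_state { PORT_DOWN, PORT_INIT, PORT_ARMED, PORT_ACTIVE };

/*
 * Return the state the virtual port takes right after the policy is
 * applied, given the current state of the physical port.
 */
static enum port_state vport_state_after_policy(enum vf_link_policy policy,
						enum port_state phys)
{
	switch (policy) {
	case VF_LINK_AUTO:
		/* Follow the physical port: active only if it is active. */
		return phys == PORT_ACTIVE ? PORT_ACTIVE : PORT_DOWN;
	case VF_LINK_DOWN:
		return PORT_DOWN;
	case VF_LINK_UP:
		/* A virtualization aware SM may later move this to Active. */
		return PORT_INIT;
	}
	assert(!"unknown link policy");
	return PORT_DOWN;
}
```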
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:37 +0000 (22:58 +0200)]
IB/core: Support accessing SA in virtualized environment
Per the ongoing standardisation process, when virtual HCAs are present
in a network, traffic is routed based on a destination GID. In order to
access the SA we use the well known SA GID.
We also add a GRH required boolean field to the port attributes which is
used to report to the verbs consumer whether this port is connected to a
virtual network. We use this field to realize whether we need to create
an address vector with GRH to access the subnet administrator. We clear
the port attributes struct before calling the hardware driver to make
sure the default remains that GRH is not required.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:36 +0000 (22:58 +0200)]
IB/core: Add subnet prefix to port info
The subnet prefix is a part of the port_info MAD returned and should be
available at the ib_port_attr struct. We define it here and provide a
default implementation in case the hardware driver does not provide one.
The subnet prefix is required when creating the address vector to access
the SA in networks where GRH must be used.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:35 +0000 (22:58 +0200)]
IB/mlx5: Fix decision on using MAD_IFC
Fix the condition that dictates when MAD_IFC should be used. According
to firmware specifications, MAD_IFC commands must be used only if the
ib_virt capability is off.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Eli Cohen [Fri, 11 Mar 2016 20:58:34 +0000 (22:58 +0200)]
net/core: Add support for configuring VF GUIDs
Add two new NLAs to support configuration of Infiniband node or port
GUIDs. New applications can choose to use this interface to configure
GUIDs with iproute2 with commands such as:
ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78
A new ndo, ndo_set_vf_guid, is introduced to notify the net device of the
request to change the GUID.
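As an illustration of the GUID notation used in the commands above, the following user-space sketch converts the colon-separated form into a 64-bit value. The function `parse_guid` is a hypothetical helper written for this example; it is not the parser used by iproute2 or the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "xx:xx:xx:xx:xx:xx:xx:xx" into a 64-bit GUID, most significant
 * byte first. Returns 0 on success and -1 on malformed input.
 */
static int parse_guid(const char *s, uint64_t *guid)
{
	unsigned int b[8];
	char trailing;
	int i;

	assert(s && guid);

	/* Exactly eight hex bytes must match; a ninth item means junk follows. */
	if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3],
		   &b[4], &b[5], &b[6], &b[7], &trailing) != 8)
		return -1;

	*guid = 0;
	for (i = 0; i < 8; i++)
		*guid = (*guid << 8) | b[i];
	return 0;
}
```

With this helper, the node GUID from the first command above parses to 0x0002c90300216e70.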
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:25 +0000 (10:25 +0200)]
IB/{core, ulp} Support above 32 possible device capability flags
The old bitwise device_cap_flags variable was limited to u32 which
has all bits already defined. In order to overcome it, we converted
device_cap_flags variable to be u64 type.
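A short sketch of why the wider type matters: a flag at bit 32 or above is silently lost when stored in a 32-bit variable. The flag `EXAMPLE_CAP_BIT32` and the helper `has_cap` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability flag: bit 32 is the first that needs a u64. */
#define EXAMPLE_CAP_BIT32 (1ULL << 32)

/* Test whether all bits of `flag` are set in a 64-bit capability mask. */
static bool has_cap(uint64_t device_cap_flags, uint64_t flag)
{
	assert(flag != 0);
	return (device_cap_flags & flag) == flag;
}
```

Truncating the mask to `uint32_t` before the test drops bit 32, so the capability would appear to be missing.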
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:24 +0000 (10:25 +0200)]
IB/core: Replace setting the zero values in ib_uverbs_ex_query_device
The setting to zero during variable initialization eliminates
the need to explicitly set to zero variables and structures.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Sagi Grimberg [Tue, 23 Feb 2016 08:25:23 +0000 (10:25 +0200)]
net/mlx5_core: Introduce offload arithmetic hardware capabilities
Define the necessary hardware structures for the offload
arithmetic capabilities and read/cache them on driver load.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:22 +0000 (10:25 +0200)]
net/mlx5_core: Refactor device capability function
The device capability function was called in the same way everywhere:
twice for every queried parameter, with the calls differing only in
the HCA capability mode.
Unify these calls into one function.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Tue, 23 Feb 2016 08:25:21 +0000 (10:25 +0200)]
net/mlx5_core: Fix caching ATOMIC endian mode capability
Add caching of maximum device capability of ATOMIC endian mode.
Fixes: f91e6d8941bf ('net/mlx5_core: Add setting ATOMIC endian mode')
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dan Carpenter [Fri, 18 Mar 2016 05:41:59 +0000 (08:41 +0300)]
ib_srpt: fix a WARN_ON() message
The first argument of WARN_ON() is a condition, so it means the warning
message here will just be the name without the ->qp_num information.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Tatyana Nikolova [Fri, 18 Mar 2016 15:38:33 +0000 (10:38 -0500)]
i40iw: Replace the obsolete crypto hash interface with shash
This patch replaces the obsolete crypto hash interface with shash
and resolves a build failure after merge of the rdma tree
which is caused by the removal of the crypto hash interface.
It also removes CRYPTO_ALG_ASYNC from the crypto_alloc_shash() call,
because shash is by definition sync only.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:44 +0000 (11:15 -0800)]
IB/hfi1: Add SDMA cache eviction algorithm
This commit adds a cache eviction algorithm for the SDMA
user buffer cache.
Besides the interval RB tree used for node lookup, the cache
nodes are also arranged in a doubly-linked list. When a node is
used, it is put at the beginning of the list. Less frequently
used nodes naturally move to the tail of the list.
When the cache limit is reached, the eviction code starts
traversing the linked list in reverse, freeing buffers until
enough space has been freed to fit the new user buffer. This
guarantees that only the least used cache nodes will be removed
from the cache.
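The list handling described above can be sketched in user-space C as follows. This is a simplified model, not the driver code: the interval RB tree is omitted, sizes are counted in pages, and all structure and function names (`cache_node`, `lru_cache`, `lru_insert`, `lru_touch`, `lru_evict`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct cache_node {
	size_t npages;                  /* pages held by this buffer */
	struct cache_node *prev, *next;
};

struct lru_cache {
	struct cache_node *head;        /* most recently used */
	struct cache_node *tail;        /* least recently used */
	size_t used;                    /* total pages currently cached */
};

static void lru_unlink(struct lru_cache *c, struct cache_node *n)
{
	if (n->prev)
		n->prev->next = n->next;
	else
		c->head = n->next;
	if (n->next)
		n->next->prev = n->prev;
	else
		c->tail = n->prev;
	n->prev = n->next = NULL;
}

static void lru_push_front(struct lru_cache *c, struct cache_node *n)
{
	n->prev = NULL;
	n->next = c->head;
	if (c->head)
		c->head->prev = n;
	else
		c->tail = n;
	c->head = n;
}

/* Add a new node as the most recently used entry. */
static void lru_insert(struct lru_cache *c, struct cache_node *n)
{
	lru_push_front(c, n);
	c->used += n->npages;
}

/* Mark a node as used: move it to the beginning of the list. */
static void lru_touch(struct lru_cache *c, struct cache_node *n)
{
	lru_unlink(c, n);
	lru_push_front(c, n);
}

/*
 * Walk the list from the tail, evicting nodes until at least `needed`
 * pages have been freed or the cache is empty. Returns pages freed.
 */
static size_t lru_evict(struct lru_cache *c, size_t needed)
{
	size_t freed = 0;

	while (freed < needed && c->tail) {
		struct cache_node *victim = c->tail;

		lru_unlink(c, victim);
		assert(c->used >= victim->npages);
		c->used -= victim->npages;
		freed += victim->npages;
		/* A real cache would unpin and free the buffer here. */
	}
	return freed;
}
```

Because eviction always starts at the tail, a node that was touched recently survives until every less recently used node has been removed.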
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:39 +0000 (11:15 -0800)]
IB/hfi1: Switch to using the pin query function
Use the new function to query whether the expected receive
user buffer can be pinned successfully. This requires that
a new variable be added to the hfi1_filedata structure used
to hold the number of pages pinned by the expected receive
code.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:33 +0000 (11:15 -0800)]
IB/hfi1: Specify mm when releasing pages
This change adds a pointer to the process mm_struct when
calling hfi1_release_user_pages().
Previously, the function used the mm_struct of the current
process to adjust the number of pinned pages. However, in some
cases, namely when unpinning pages due to an MMU notifier call,
we do not want to drop into that code block as it will cause a
deadlock (the MMU notifiers take the process' mmap_sem prior to
calling the callbacks).
By allowing the caller to specify the pointer to the mm_struct,
the caller has finer control over that part of hfi1_release_user_pages().
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:28 +0000 (11:15 -0800)]
IB/hfi1: Add pin query function
System administrators can use the locked memory
ulimit setting to set the maximum amount of memory
a user can lock/pin. However, this setting alone is not
enough to guarantee good operation of the hfi1 driver
due to the fact that the setting does not have fine
enough granularity to account for the limit being used
by multiple user processes and caches.
Therefore, a better limiting algorithm is needed. This
is where the new hfi1_can_pin_pages() function and the
cache_size module parameter come in.
The function works by looking at the ulimit and cache_size
value to compute a cache size. The algorithm examines the
ulimit value and, if it is not "unlimited", computes a
per-cache limit based on the number of configured user
contexts.
After that, the lower of the two - cache_size and computed
per-cache limit - is used.
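The selection of the effective limit can be sketched as below. This is an assumption-laden illustration: the exact formula of the per-cache limit is not given above, so the sketch simply divides the ulimit evenly among the configured user contexts. The names `effective_cache_limit` and `ULIMIT_UNLIMITED` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an "unlimited" locked memory ulimit. */
#define ULIMIT_UNLIMITED UINT64_MAX

/*
 * Return the cache limit in bytes. Assumption: the per-cache limit is
 * the ulimit divided evenly by the number of user contexts.
 */
static uint64_t effective_cache_limit(uint64_t ulimit_bytes,
				      uint64_t cache_size_bytes,
				      unsigned int num_user_contexts)
{
	uint64_t per_cache;

	assert(num_user_contexts > 0);

	/* With an unlimited ulimit only cache_size constrains the cache. */
	if (ulimit_bytes == ULIMIT_UNLIMITED)
		return cache_size_bytes;

	per_cache = ulimit_bytes / num_user_contexts;

	/* Use the lower of cache_size and the computed per-cache limit. */
	return per_cache < cache_size_bytes ? per_cache : cache_size_bytes;
}
```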
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:22 +0000 (11:15 -0800)]
IB/hfi1: Implement SDMA-side buffer caching
Add support for caching of user buffers used for SDMA
transfers. This change improves performance by
avoiding repeatedly pinning the pages of buffers, which
are being re-used by the application.
While the cost of the pinning operation has been made
heavier by adding the extra code to search the cache tree,
re-allocate pages arrays, and future cache evictions,
that cost will be amortized against the savings when the
same buffer is re-used. It is also worth noting that in
most cases, the cost of pinning should be much lower due
to the buffer already being in the cache.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:16 +0000 (11:15 -0800)]
IB/hfi1: Adjust last address values for intervals
Last address values for intervals in the interval RB tree
nodes should be non-inclusive in order to avoid confusing
ranges.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:10 +0000 (11:15 -0800)]
IB/hfi1: Add filter callback
This commit adds a filter callback, which can be used to filter
out interval RB nodes matching a certain interval down to a
single one.
This is needed for the upcoming SDMA-side caching where buffers
will need to be filtered by their virtual address.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:15:04 +0000 (11:15 -0800)]
IB/hfi1: Remove compare callback
Interval RB trees provide their own searching function,
which also takes care of determining the path through
the tree that should be taken.
This makes the compare callback unnecessary.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:59 +0000 (11:14 -0800)]
IB/hfi1: Add MMU tracing
Add a new tracepoint type for the MMU functions and calls
to that tracepoint to allow tracing of MMU functionality.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:53 +0000 (11:14 -0800)]
IB/hfi1: Use interval RB trees
The interval RB trees can handle RB nodes which
hold ranged information. This is exactly the usage
for the buffer cache implemented in the expected
receive code path.
Convert the MMU/RB functions to use the interval RB
tree API. This will help with future users of the
caching API, as well.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:48 +0000 (11:14 -0800)]
IB/hfi1: Notify remove MMU/RB callback of calling context
Tell the remove MMU/RB callback if it's being called as
part of a memory invalidation or not. This can be important
in preventing a deadlock if the remove callback attempts to
take the mmap_sem semaphore because the kernel's MMU
invalidation functions have already taken it.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:42 +0000 (11:14 -0800)]
IB/hfi1: Remove the use of add/remove RB function pointers
The usage of function pointers for RB node insertion
and removal in the expected receive code path was
meant to be a small performance optimization. However,
maintaining it, especially with the new MMU API, would
become more troublesome as the API is extended.
Since the performance optimization is minor, remove the
function pointers and replace with direct calls.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:36 +0000 (11:14 -0800)]
IB/hfi1: Allow remove MMU callbacks to free nodes
In order to allow the remove MMU callbacks to free the
RB nodes, it is necessary to prevent any references to
the nodes after the remove callback has been called.
Therefore, remove the node from the tree prior to calling
the callback. In other words, the MMU/RB API now guarantees
that all RB node operations it performs will be done prior
to calling the remove callback and that the RB node will
not be touched afterwards.
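The ordering guarantee can be illustrated with a small user-space sketch. A singly linked list stands in for the RB tree, and the names `node`, `remove_node`, and `free_node_cb` are hypothetical; the point is only that the node is unlinked before the callback runs and is never dereferenced afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A singly linked list stands in for the RB tree in this sketch. */
struct node {
	struct node *next;
	unsigned long addr;
};

typedef void (*remove_cb)(struct node *n);

/*
 * Unlink `target` from the container first, then hand it to the
 * callback. The callback may free the node, so it must not be
 * touched after the call.
 */
static void remove_node(struct node **head, struct node *target, remove_cb cb)
{
	struct node **pp;

	assert(head && target && cb);

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == target) {
			*pp = target->next;   /* all container work happens here */
			target->next = NULL;
			cb(target);           /* node may be freed by the callback */
			return;               /* no access to target past this point */
		}
	}
}

/* Example callback that frees the node. */
static void free_node_cb(struct node *n)
{
	free(n);
}
```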
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:31 +0000 (11:14 -0800)]
IB/hfi1: Prevent NULL pointer dereference
Prevent a potential NULL pointer dereference (found
by code inspection) when unregistering an MMU handler.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:25 +0000 (11:14 -0800)]
IB/hfi1: Allow MMU function execution in IRQ context
Future users of the MMU/RB functions might be searching or
manipulating the MMU RB trees in interrupt context. Therefore,
the MMU/RB functions need to be able to run in interrupt
context. This requires that we use the IRQ-aware API for
spin locks.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:20 +0000 (11:14 -0800)]
IB/hfi1: Re-factor MMU notification code
The MMU notification code added to the
expected receive side has been re-factored and
split into its own file. This was done in
order to make the code more general and, therefore,
usable by other parts of the driver.
The caching behavior remains the same. However,
the handling of the RB tree (insertion, deletions,
and searching) as well as the MMU invalidation
processing is now handled by functions in the
mmu_rb.[ch] files.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Alex Estrin [Mon, 7 Mar 2016 19:35:51 +0000 (11:35 -0800)]
IB/rdmavt: Post receive for QP in ERR state
According to the IB spec, posting a WR to the receive queue must
complete with an error if the QP is in the Error state.
Please refer to C10-42, C10-97.2.1.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:46 +0000 (11:35 -0800)]
IB/hfi1: Enable adaptive pio by default
Set the piothreshold to the agreed upon default of 256B.
Reviewed-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:41 +0000 (11:35 -0800)]
IB/hfi1: Fix adaptive pio packet corruption
The adaptive pio heuristic missed a case that causes a corrupted
packet on the wire.
The case is if SDMA egress had been chosen for a pio-able packet and
then encountered a ring space wait, the packet is queued. The sge
cursor had been incremented as part of the packet build out for SDMA.
After the send engine restart, the heuristic might now choose pio based
on the sdma count being zero and start the mmio copy using the already
incremented sge cursor.
Fix this by forcing SDMA egress when the SDMA descriptor has already
been built.
Additionally, the code to wait for a QP's pio count to zero when
switching to SDMA was missing. Add it.
There is also an issue with UD QPs, in that the different SLs can pick
a different egress send context. For now, just ensure the UD/GSI
always go through SDMA.
Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:35 +0000 (11:35 -0800)]
IB/hfi1: Fix panic in adaptive pio
The following panic occurs while running ib_send_bw -a with
adaptive pio turned on:
[ 8551.143596] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 8551.152986] IP: [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.160926] PGD 80db21067 PUD 80bb45067 PMD 0
[ 8551.166431] Oops: 0000 [#1] SMP
[ 8551.276725] task: ffff880816bf15c0 ti: ffff880812ac0000 task.ti: ffff880812ac0000
[ 8551.285705] RIP: 0010:[<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.296462] RSP: 0018:ffff880812ac3b58 EFLAGS: 00010282
[ 8551.303029] RAX: 000000000000002d RBX: 0000000000000000 RCX: 0000000000000800
[ 8551.311633] RDX: ffff880812ac3c08 RSI: 0000000000000000 RDI: ffff8800b6665e40
[ 8551.320228] RBP: ffff880812ac3ba0 R08: 0000000000001000 R09: ffffffffa09039a0
[ 8551.328820] R10: ffff880817a0c000 R11: 0000000000000000 R12: ffff8800b6665e40
[ 8551.337406] R13: ffff880817a0c000 R14: ffff8800b6665800 R15: ffff8800b6665e40
[ 8551.355640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8551.362674] CR2: 0000000000000000 CR3: 000000080abe8000 CR4: 00000000001406e0
[ 8551.371262] Stack:
[ 8551.374119] ffff880812ac3bf0 ffff88080cf54010 ffff880800000800 ffff880812ac3c08
[ 8551.383036] ffff8800b6665800 ffff8800b6665e40 0000000000000202 ffffffffa08e7b80
[ 8551.391941] 00000001007de431 ffff880812ac3bc8 ffffffffa0904645 ffff8800b6665800
[ 8551.400859] Call Trace:
[ 8551.404214] [<ffffffffa08e7b80>] ? hfi1_del_timers_sync+0x30/0x30 [hfi1]
[ 8551.412417] [<ffffffffa0904645>] hfi1_verbs_send+0x215/0x330 [hfi1]
[ 8551.420154] [<ffffffffa08ec126>] hfi1_do_send+0x166/0x350 [hfi1]
[ 8551.427618] [<ffffffffa055a533>] rvt_post_send+0x533/0x6a0 [rdmavt]
[ 8551.435367] [<ffffffffa050760f>] ib_uverbs_post_send+0x30f/0x530 [ib_uverbs]
[ 8551.443999] [<ffffffffa0501367>] ib_uverbs_write+0x117/0x380 [ib_uverbs]
[ 8551.452269] [<ffffffff815810ab>] ? sock_recvmsg+0x3b/0x50
[ 8551.459071] [<ffffffff81581152>] ? sock_read_iter+0x92/0xe0
[ 8551.466068] [<ffffffff81212857>] __vfs_write+0x37/0x100
[ 8551.472692] [<ffffffff81213532>] ? rw_verify_area+0x52/0xd0
[ 8551.479682] [<ffffffff81213782>] vfs_write+0xa2/0x1a0
[ 8551.486089] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
[ 8551.493891] [<ffffffff812146c5>] SyS_write+0x55/0xc0
[ 8551.500220] [<ffffffff816ae0ee>] entry_SYSCALL_64_fastpath+0x12/0x71
[ 8551.531284] RIP [<ffffffffa0902a94>] pio_wait.isra.21+0x34/0x190 [hfi1]
[ 8551.539508] RSP <ffff880812ac3b58>
[ 8551.544110] CR2: 0000000000000000
The priv s_sendcontext pointer was not set up properly. Fix with this
patch by using the s_sendcontext and eliminating its send engine use.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:30 +0000 (11:35 -0800)]
IB/hfi1: Fix PIO wakeup timing hole
There is a timing hole if there had been greater than
PIO_WAIT_BATCH_SIZE waiters. This code will dispatch the first
batch but leave the others in the queue. If the restarted waiters
don't in turn wait on a buffer, there is a hang.
Fix by forcing a return when the QP queue is non-empty.
Reviewed-by: Vennila Megavannan <vennila.megavannan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:24 +0000 (11:35 -0800)]
IB/hfi1: Fix ordering of trace for accuracy
The positioning of the sdma ibhdr trace was
causing an extra trace message when the tx send
returned -EBUSY.
Move the trace to just before the return
and handle negative return values to avoid
any trace.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:19 +0000 (11:35 -0800)]
IB/hfi1: Add unique trace point for pio and sdma send
This allows for separately enabling pio and sdma
tracepoints to cut the volume of trace information.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:14 +0000 (11:35 -0800)]
IB/hfi1: Fix issues with qp_stats print
The changes aid in correlating QPs between the trace output
and the qp_stats information.
They include adding a space after QP and clarifying that the second
QP is actually the remote QP.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Mon, 7 Mar 2016 19:35:08 +0000 (11:35 -0800)]
IB/hfi1: Report pid in qp_stats to aid debug
Tracking user/QP ownership is needed to debug issues with
user ULPs like OpenMPI.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Easwar Hariharan [Mon, 7 Mar 2016 19:35:03 +0000 (11:35 -0800)]
IB/hfi1: Improve LED beaconing
The current LED beaconing code is unclear and uses the timer handler to
turn off the timer. This patch simplifies the code by removing the
special semantics of timeon = timeoff = 0 being interpreted as a request
to turn off the beaconing.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Sat, 5 Mar 2016 16:50:49 +0000 (08:50 -0800)]
IB/hfi1: Don't call cond_resched in atomic mode when sending packets
This patch fixed the problem where the driver might reschedule in atomic
mode when sending packets. This is due to the fact that the call to
cond_resched() in hfi1_do_send() might occur in atomic mode and a check is
required to avoid the warning message:
"kernel: BUG: scheduling while atomic: swapper/2/0/0x10000100."
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:43 +0000 (08:50 -0800)]
IB/hfi1: Add adaptive cacheless verbs copy
The kernel memcpy is faster than a cacheless copy. However,
if too much of the L3 cache is overwritten by one-time copies
then overall bandwidth suffers. Implement an adaptive scheme
where full page copies are tracked and if the number of unique
entries is larger than a threshold, verbs will use a cacheless
copy. Tracked entries are gradually cleaned, allowing memcpy to
resume once the larger copies have stopped.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Sat, 5 Mar 2016 16:50:38 +0000 (08:50 -0800)]
IB/hfi1: Handle host handshake timeout
Host handshake timeout can occur during the verify capability
state. This is an LNI-related failure and should be
handled in the same way as other LNI failures.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:33 +0000 (08:50 -0800)]
IB/hfi1: Add ASIC flag view/clear
Different OSes using parts of the same hardware may leave
cross-device flags set. Export a debugfs file to view and
clear these flags if needed.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:27 +0000 (08:50 -0800)]
IB/hfi1: Hold i2c resource across debugfs open/close
External i2c firmware updates are done in multiple steps and
cannot have other things done in between. For debugfs files,
acquire the resource on open and release it on close.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:22 +0000 (08:50 -0800)]
IB/hfi1: Reduce hardware mutex timeout
The hardware mutex is now held only long enough to set
or clear flags. Reduce the timeout to something more
reasonable.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:17 +0000 (08:50 -0800)]
IB/hfi1: Remove unused HFI1_DO_INIT_ASIC flag
The HFI1_DO_INIT_ASIC flag is no longer used. Remove
the flag and the code that sets it.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:11 +0000 (08:50 -0800)]
IB/hfi1: Change thermal init to use resource reservation
Use the resource reservation system to flag that the ASIC
thermal has been initialized.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:06 +0000 (08:50 -0800)]
IB/hfi1: Change QSFP functions to use resource reservation
Remove the mutex guarding each operation in favor of the ASIC
resource acquire/release. Push the resource acquire/release
above each operation call to allow exclusive access across
multiple operations.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:01 +0000 (08:50 -0800)]
IB/hfi1: Change SBus handling to use resource reservation
The SBus resource includes SBUS, PCIE, and THERM registers.
Change SBus handling to use the new ASIC resource reservation system.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:55 +0000 (08:49 -0800)]
IB/hfi1: Change EPROM handling to use resource reservation
Change EPROM handling to use the new ASIC resource reservation system.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:50 +0000 (08:49 -0800)]
IB/hfi1: Add ASIC resource reservation functions
The ASIC block is a shared hardware resource between two devices
on the chip. Add functions to acquire and release these resources
in a way that is safe for both multiple users on the same OS
and multiple users on different OSes, while holding the hardware
mutex as little as possible.
Reservations are noted in a scratch register in the shared region.
There are two types of reservations: per-HFI dynamic and permanent.
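
A rough model of the two reservation types, for illustration only. The bit layout of the scratch register, the resource names, and all function names are assumptions made for this sketch; the hardware mutex is modeled as a plain flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define DYN_BITS_PER_HFI 8    /* hypothetical layout: 8 dynamic bits per HFI */
#define PERMANENT_SHIFT  16   /* hypothetical: permanent bits sit above both HFIs */
#define RES_SBUS         0x1u /* hypothetical dynamic resources */
#define RES_EPROM        0x2u
#define RES_THERM_INIT   0x1u /* hypothetical permanent flag */

struct asic_sketch {
	uint64_t scratch;   /* stands in for the shared scratch register */
	bool hw_mutex_held; /* stands in for the hardware mutex */
};

static bool hw_mutex_try(struct asic_sketch *a)
{
	if (a->hw_mutex_held)
		return false;
	a->hw_mutex_held = true;
	return true;
}

static void hw_mutex_release(struct asic_sketch *a)
{
	a->hw_mutex_held = false;
}

/* Dynamic reservation for HFI 0 or 1. Returns 0 on success, -1 if busy. */
static int acquire_resource(struct asic_sketch *a, unsigned int hfi, uint32_t res)
{
	uint64_t mine = (uint64_t)res << (hfi * DYN_BITS_PER_HFI);
	uint64_t theirs = (uint64_t)res << ((hfi ^ 1u) * DYN_BITS_PER_HFI);
	int ret = -1;

	if (!hw_mutex_try(a))
		return -1;
	if (!(a->scratch & theirs)) {
		a->scratch |= mine;
		ret = 0;
	}
	hw_mutex_release(a); /* held only while the flags are updated */
	return ret;
}

/* Drop a dynamic reservation. Returns 0 on success, -1 if the mutex is busy. */
static int release_resource(struct asic_sketch *a, unsigned int hfi, uint32_t res)
{
	if (!hw_mutex_try(a))
		return -1;
	a->scratch &= ~((uint64_t)res << (hfi * DYN_BITS_PER_HFI));
	hw_mutex_release(a);
	return 0;
}

/* Permanent reservation: returns true only for the first caller to set it. */
static bool claim_permanent(struct asic_sketch *a, uint32_t res)
{
	uint64_t bit = (uint64_t)res << PERMANENT_SHIFT;
	bool first;

	if (!hw_mutex_try(a))
		return false;
	first = !(a->scratch & bit);
	a->scratch |= bit;
	hw_mutex_release(a);
	return first;
}
```

The mutex is held only around the read-modify-write of the scratch word, so a caller can keep a reservation across several operations without holding the mutex for that long.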
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:45 +0000 (08:49 -0800)]
IB/hfi1: Add shared ASIC structure
Create a shared structure to exist between devices that share the
same ASIC.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:39 +0000 (08:49 -0800)]
IB/hfi1: Remove ASIC block clear
The ASIC block is shared between two HFIs. Individual devices
should not initialize registers there. Retain the power-on values.
Individual users set registers as needed with one exception.
Clear sbus fast mode on "slow" calls.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sat, 5 Mar 2016 16:49:34 +0000 (08:49 -0800)]
IB/hfi1: Replace kmalloc and memcpy with a kmemdup
This change was recommended by the Coccinelle tool when I ran the command:
-bash-4.2$ make coccicheck MODE=patch M=drivers/infiniband/hw/hfi1/
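
For readers unfamiliar with the pattern, here is a user-space sketch of the transformation. memdup_sketch() is a hypothetical stand-in for the kernel's kmemdup(src, len, gfp), and struct cfg_sketch is invented for the example; the actual hunks changed by this patch are not shown here.

```c
#include <stdlib.h>
#include <string.h>

struct cfg_sketch {
	int mtu;
	int vls;
};

/* User-space stand-in for the kernel's kmemdup(src, len, gfp). */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Before: allocate, check, then copy by hand. */
static struct cfg_sketch *clone_before(const struct cfg_sketch *src)
{
	struct cfg_sketch *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	memcpy(copy, src, sizeof(*copy));
	return copy;
}

/* After: one call does both steps. */
static struct cfg_sketch *clone_after(const struct cfg_sketch *src)
{
	return memdup_sketch(src, sizeof(*src));
}
```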
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sat, 5 Mar 2016 16:49:29 +0000 (08:49 -0800)]
IB/hfi1: Move constant to the right in bitwise operations
Implement changes recommended by the Coccinelle tool to move constants to
the right in bitwise operations:
-bash-4.2$ make coccicheck MODE=report M=drivers/infiniband/hw/hfi1/
drivers/infiniband/hw/hfi1/pio.c:765:4-16: Move constant to right.
drivers/infiniband/hw/hfi1/rc.c:2503:19-29: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:9813:11-22: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:14468:29-40: Move constant to right.
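
The kind of change the report asks for looks roughly like this. The flag name and functions are hypothetical and are not taken from the files listed above; behavior is identical either way, only the operand order changes.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FLAG_READY 0x4u /* hypothetical flag bit */

/* Before: constant on the left of the bitwise AND. */
static bool is_ready_before(uint32_t status)
{
	return (SKETCH_FLAG_READY & status) != 0;
}

/* After: variable first, constant on the right. */
static bool is_ready_after(uint32_t status)
{
	return (status & SKETCH_FLAG_READY) != 0;
}
```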
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sat, 5 Mar 2016 16:49:24 +0000 (08:49 -0800)]
IB/hfi1: Add the break statement that was removed in an earlier patch
The break statement was unintentionally removed in commit
41ca419abc0ca7ee65d765408cdc1a7fed2897a3
("staging/rdma/hfi1: Remove hfi1 MR and hfi1 specific qp type").
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Wed, 16 Mar 2016 17:57:43 +0000 (13:57 -0400)]
Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6
Faisal Latif [Wed, 20 Jan 2016 19:40:16 +0000 (13:40 -0600)]
i40iw: changes for build of i40iw module
Update MAINTAINERS, Kconfig, and Makefile to build the i40iw module.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:15 +0000 (13:40 -0600)]
i40iw: Kconfig and Makefile for iwarp module
Kconfig and Makefile needed to build iwarp module.
Changes since v2:
moved from Kbuild to Makefile
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:14 +0000 (13:40 -0600)]
i40iw: virtual channel handling files
i40iw_vf.[ch] and i40iw_virtchnl.[ch] are used for virtual
channel support for the iWARP VF module.
Changes since v2:
code cleanup
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:13 +0000 (13:40 -0600)]
i40iw: user kernel shared files
i40iw_user.h and i40iw_uk.c are used by both the user library and
kernel requests.
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:12 +0000 (13:40 -0600)]
i40iw: add X722 register file
X722 hardware register defines for the iWARP component.
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:11 +0000 (13:40 -0600)]
i40iw: add hardware related header files
header files for hardware accesses
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Wed, 20 Jan 2016 19:40:10 +0000 (13:40 -0600)]
i40iw: add file to handle cqp calls
i40iw_ctrl.c provides for hardware wqe support and cqp.
Changes since v2:
cleanup coccinelle error reported by Julia Lawall
Changes since v1:
reported by Christoph Hellwig's review
-remove unnecessary casts
Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:01 +0000 (09:18 -0600)]
i40iw: use shared code for port mapper
Remove/change port mapper code that has been moved to iwcm.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:05 +0000 (09:18 -0600)]
iwpm: crash fix for large connections test
During a large connection test, there is a crash at wake_up() in the callback
because waitq is not yet initialized. The callback can happen before
iwpm_wait_complete_req() is called to initialize waitq.
To resolve this, use a signaling semaphore instead of waitq.
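
The ordering problem can be shown with a single-threaded model. This is only a sketch of the idea, with hypothetical names and a plain counter standing in for the kernel semaphore: because the counter is valid from the moment the request is created, a completion that arrives before the waiter is ready is remembered rather than lost.

```c
#include <stdbool.h>

struct request_sketch {
	int sem_count; /* signaling semaphore, valid from creation */
};

/* Creation initializes the semaphore, so there is no uninitialized window. */
static void request_init(struct request_sketch *r)
{
	r->sem_count = 0;
}

/* Completion callback: may run before anyone is waiting. */
static void request_complete(struct request_sketch *r)
{
	r->sem_count++;
}

/*
 * Waiter: consumes one signal if available. A real waiter would block
 * instead of returning false.
 */
static bool request_try_wait(struct request_sketch *r)
{
	if (r->sem_count <= 0)
		return false;
	r->sem_count--;
	return true;
}
```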
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Reviewed-by: Tatyana E Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Fri, 26 Feb 2016 15:18:04 +0000 (09:18 -0600)]
iw_cxgb3: support for iWARP port mapping
Now with the new iWARP port mapping service in the iwcm, it is
trivial to add cxgb3 support.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Steve Wise [Fri, 26 Feb 2016 15:18:03 +0000 (09:18 -0600)]
iw_cxgb4: remove port mapper related code
Now that most of the port mapper code has been moved to iwcm, we can remove
it from iw_cxgb4.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:02 +0000 (09:18 -0600)]
iw_nes: remove port mapper related code
Now that most of the port mapper code has been moved to iwcm, we can
remove it from port mapper service user drivers.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Faisal Latif [Fri, 26 Feb 2016 15:18:00 +0000 (09:18 -0600)]
iwcm: common code for port mapper
Move port mapper related code from drivers into common code.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Wed, 16 Mar 2016 17:38:28 +0000 (13:38 -0400)]
Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6
Doug Ledford [Mon, 14 Mar 2016 21:42:57 +0000 (17:42 -0400)]
Merge branches 'ib_core', 'ib_ipoib', 'srpt', 'drain-cq-v4' and 'net/9p' into k.o/for-4.6
Christoph Hellwig [Thu, 3 Mar 2016 08:36:06 +0000 (09:36 +0100)]
net/9p: convert to new CQ API
Trivial conversion to the new RDMA CQ API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Fri, 26 Feb 2016 21:33:33 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix memory leaks
Fix 3 memory leaks reported by the LeakCheck tool in the KEDR framework.
The following resources were allocated memory during their respective
initializations but not freed during cleanup:
1. SDMA map elements
2. PIO map elements
3. HW send context to SW index map
This patch fixes the memory leaks by freeing the allocated memory in the
cleanup path.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Easwar Hariharan [Fri, 26 Feb 2016 21:33:28 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix reporting of LED status in Get(LedInfo) and Get(PortInfo)
The LedInfo SMA attribute is redefined to control the LED beaconing
state machine instead of the LED directly. In accordance, we now
return the state of LED beaconing, represented by whether the beaconing
timer is active, instead of the state of the LED itself for SMA queries
Get(LedInfo) and Get(PortInfo). While we are at it, we fix the beaconing
timer control code so that the state of the timer is accurately updated.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Fri, 26 Feb 2016 21:33:23 +0000 (13:33 -0800)]
staging/rdma/hfi1: Check interrupt registers mapping
This patch tests the interrupt registers when the driver has no access to
its upstream component. In this case, it is highly likely that it is
running in a virtual machine (e.g., a Qemu-kvm guest). If the interrupt
registers are not mapped properly by the virtual machine monitor, an
error message will be printed and the probing will be terminated. This
will help the user identify the issue. On the other hand, if the driver
is running in a host or has access to its upstream component in some
other VM, it will do nothing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Fri, 26 Feb 2016 21:33:18 +0000 (13:33 -0800)]
staging/rdma/hfi1: Avoid using upstream component if it is not accessible
When the hfi1 device is assigned to a VM (e.g., KVM), the hfi1 driver has
no access to the upstream component and therefore cannot use it to perform
some operations, such as secondary bus reset. As a result, the hfi1 driver
cannot perform the PCIe Gen3 transition. Instead, those operations should
be done in the host environment, preferably during the Option ROM
initialization. Similarly, the hfi1 driver cannot support ASPM or tune
the PCIe capability under this circumstance.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jianxin Xiong [Fri, 26 Feb 2016 21:33:13 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix header size calculation for RC/UC QPs with GRH enabled
There is a header size counter in both the QP structure and the txreq
structure. The counter in the txreq structure is not updated properly
for RC and UC queue pairs with GRH enabled, which causes SDMA
sends to fail. This patch fixes the RC and UC path.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Fri, 26 Feb 2016 21:33:08 +0000 (13:33 -0800)]
IB/rdmavt: Check lkey_table_size value before use
The lkey_table_size driver specific parameter value is used before its
value is sanity checked and restricted to RVT_MAX_LKEY_TABLE_BITS.
This causes a vmalloc allocation failure for large values. Fix this
by moving the value check before the first usage of the value.
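
A minimal sketch of the "check before first use" ordering. The cap value and all names are hypothetical stand-ins (the real limit is RVT_MAX_LKEY_TABLE_BITS); no memory is allocated here, the function only computes the size that would be requested.

```c
#include <stddef.h>

#define SKETCH_MAX_LKEY_TABLE_BITS 23u /* hypothetical cap */

/*
 * Clamp the requested bit count first, then derive the table size from
 * the clamped value. Returns the table size in bytes.
 */
static size_t lkey_table_bytes(unsigned int *bits)
{
	if (*bits > SKETCH_MAX_LKEY_TABLE_BITS)
		*bits = SKETCH_MAX_LKEY_TABLE_BITS;
	return ((size_t)1 << *bits) * sizeof(void *);
}
```

Computing the size before the clamp is what would let an oversized parameter reach the allocator.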
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:13:01 +0000 (11:13 -0800)]
staging/rdma/hfi1: Fix counter read for cp
A cp or cat of /sys/kernel/debug/hfi1/hfi1_0/port1counters
produces the following message:
hfi1 0000:81:00.0: hfi1_0: index not supported
hfi1 0000:81:00.0: hfi1_0: read_cntrs does not support indexing
Fix by removing the file position logic and the associated messages
and make the file positioning the responsibility of the caller.
The port counter read function argument is changed to the per port
data structure since the counters are relative to the port and not
the device.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:12:51 +0000 (11:12 -0800)]
staging/rdma/hfi1: Guard i2c access against cp
An attempt to cp or cat /sys/kernel/debug/hfi1/hfi1_0/i2c1
produces this message:
hfi1 0000:81:00.0: hfi1_0: IB0:1 I2C failed even retrying
Fix the issue by explicitly rejecting a simple cat/cp with an
-EINVAL error return.
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Thu, 18 Feb 2016 19:12:42 +0000 (11:12 -0800)]
IB/rdamvt: fix cross build with rdmavt
The new check routine causes a larger than supported frame size
on s390.
Changing the check routine to noinline fixes the issue.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:12:34 +0000 (11:12 -0800)]
staging/rdma/hfi1: Disclose more information when i2c fails
Improve logging messages when there are i2c failures.
Clean up i2c read error handling.
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:12:25 +0000 (11:12 -0800)]
staging/rdma/hfi1: Fix debugfs access race
Debugfs access races with the driver being ready. Make sure the
driver is ready before debugfs files appear and debugfs files are
gone before the driver starts tearing down.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Easwar Hariharan [Thu, 18 Feb 2016 19:12:16 +0000 (11:12 -0800)]
staging/rdma/hfi1: Cleanup comments and logs in PHY code
This is a set of minor fixes including comment and log message cleanups
and improvements to the PHY layer code.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:12:08 +0000 (11:12 -0800)]
staging/rdma/hfi1: Fix xmit discard error weight
Count only the errors that apply to xmit discards. Update
the comment to better explain the limitations of the count.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Thu, 18 Feb 2016 19:11:59 +0000 (11:11 -0800)]
staging/rdma/hfi1: fix 0-day syntax error
Setting CONFIG_HFI1_DEBUG_SDMA_ORDER causes a syntax error:
sdma.c: In function ‘complete_tx’:
sdma.c:370: error: ‘txp’ undeclared (first use in this function)
sdma.c:370: error: (Each undeclared identifier is reported only once
sdma.c:370: error: for each function it appears in.)
Adjust code under ifdef to reference the tx properly.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:22:17 +0000 (20:22 -0800)]
staging/rdma/hfi1: Fix header
Fix the header by moving the copyright notice out of the license text
and to the top of the header. Also, update the copyright date.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:22:09 +0000 (20:22 -0800)]
staging/rdma/hfi1: Remove else after break
Remove else after break to fix checkpatch warning:
WARNING: else is not generally useful after a break or return
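
An illustrative before/after of the pattern this warning targets. The functions are hypothetical and not taken from the driver; both return the same result.

```c
#include <stddef.h>

/* Before: the else arm is redundant because the if arm always leaves the loop. */
static int sum_until_negative_before(const int *v, size_t n)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (v[i] < 0) {
			break;
		} else {
			sum += v[i];
		}
	}
	return sum;
}

/* After: same behavior with one less nesting level. */
static int sum_until_negative_after(const int *v, size_t n)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (v[i] < 0)
			break;
		sum += v[i];
	}
	return sum;
}
```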
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:22:00 +0000 (20:22 -0800)]
staging/rdma/hfi1: Add braces on all arms of statement
Add braces on all arms of statements to fix checkpatch check:
CHECK: braces {} should be used on all arms of this statement
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:52 +0000 (20:21 -0800)]
staging/rdma/hfi1: Fix code alignment
Fix code alignment to fix checkpatch check:
CHECK: Alignment should match open parenthesis
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:43 +0000 (20:21 -0800)]
staging/rdma/hfi1: Fix block comments
Fix block comments with proper formatting to fix checkpatch warnings:
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:34 +0000 (20:21 -0800)]
staging/rdma/hfi1: Add comment for spinlock_t definition
Add comments describing the spinlock for spinlock_t definitions to
fix checkpatch check:
CHECK: spinlock_t definition without comment
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:26 +0000 (20:21 -0800)]
staging/rdma/hfi1: Remove void function return statement
Remove return statement at the end of a void function to fix
checkpatch warning:
WARNING: void function return statements are not generally useful
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:16 +0000 (20:21 -0800)]
staging/rdma/hfi1: Use pointer instead of struct name
Use sizeof(*p) instead of sizeof(struct foo) to fix checkpatch check:
CHECK: Prefer alloc(sizeof(*p)...) over alloc(sizeof(struct foo)...)
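
A small user-space illustration of the preferred form, using calloc() in place of the kernel allocators. struct port_sketch and both functions are hypothetical. The advantage of sizeof(*p) is that the size follows the pointer's type automatically if that type is later changed.

```c
#include <stdlib.h>

struct port_sketch {
	int id;
	long counters[4];
};

/* Before: the type name is repeated and can drift from the pointer's type. */
static struct port_sketch *alloc_before(void)
{
	struct port_sketch *p;

	p = calloc(1, sizeof(struct port_sketch));
	return p;
}

/* After: the size is derived from the pointer itself. */
static struct port_sketch *alloc_after(void)
{
	struct port_sketch *p;

	p = calloc(1, sizeof(*p));
	return p;
}
```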
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>