Arnd Bergmann [Mon, 14 Mar 2016 14:29:43 +0000 (15:29 +0100)]
aacraid: add missing curly braces
gcc-6 warns about obviously wrong indentation for newly added code in
aac_slave_configure():
drivers/scsi/aacraid/linit.c: In function 'aac_slave_configure':
drivers/scsi/aacraid/linit.c:458:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
sdev->tagged_supported = 1;
^~~~
drivers/scsi/aacraid/linit.c:455:4: note: ...this 'else' clause, but it is not
gcc is correct, and evidently this was meant to be within the curly
braces that should have been there to start with. This patch adds them,
which avoids the warning and makes it clear what was intended here.
Nothing changes in behavior because in the 'if' block, the
sdev->tagged_supported flag is known to be set already.
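The pattern can be illustrated with a minimal user-space sketch. The struct and function names below are hypothetical stand-ins, not the driver's code; the sketch only shows why the braces make the intent explicit.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant scsi_device fields. */
struct fake_sdev {
    bool tagged_supported;
    int queue_depth;
};

/*
 * With braces, both statements clearly belong to the 'else' branch.
 * Without them, only the first statement after 'else' is conditional and
 * the second always runs, which is what -Wmisleading-indentation reports.
 */
void configure_depth(struct fake_sdev *sdev, int depth)
{
    if (sdev->tagged_supported) {
        sdev->queue_depth = depth;
    } else {
        sdev->queue_depth = 1;
        sdev->tagged_supported = true;
    }
}
```

In both branches the flag ends up set, which matches the statement that behavior does not change.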
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6bf3b630d0a7 ("aacraid: SCSI blk tag support")
Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 18 Mar 2016 13:55:38 +0000 (14:55 +0100)]
scsi_common: do not clobber fixed sense information
For fixed sense the information field is 32 bits, so we need to truncate
the information field to avoid clobbering the sense code.
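As a rough illustration, assuming the standard fixed-format sense layout (VALID bit in byte 0, INFORMATION in bytes 3..6), the truncation could look like the sketch below. The function name is hypothetical and this is not the kernel's implementation.

```c
#include <stdint.h>

/*
 * Sketch: store 'info' into the INFORMATION field of fixed-format sense
 * data. The field occupies bytes 3..6, so only the low 32 bits are
 * written. Writing all 64 bits starting at offset 3 would spill into the
 * bytes that follow the field.
 */
void set_fixed_sense_information(uint8_t *buf, uint64_t info)
{
    uint32_t v = (uint32_t)info;   /* truncate to the 32-bit field */

    buf[0] |= 0x80;                /* VALID bit */
    buf[3] = (uint8_t)(v >> 24);   /* big-endian byte order */
    buf[4] = (uint8_t)(v >> 16);
    buf[5] = (uint8_t)(v >> 8);
    buf[6] = (uint8_t)v;
}
```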
Fixes: a1524f226a02 ("libata-eh: Set 'information' field for autosense")
Cc: <stable@vger.kernel.org> #v4.1+
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Thu, 17 Mar 2016 12:29:52 +0000 (13:29 +0100)]
scsi: ufs: select CONFIG_NLS
A recent change to ufshcd introduced a call to utf16s_to_utf8s, a
function that is provided by the NLS module, so we get a link error when
that is not present:
drivers/scsi/built-in.o: In function `ufshcd_read_string_desc':
:(.text+0x124d0): undefined reference to `utf16s_to_utf8s'
This adds a Kconfig 'select' statement to avoid the build error.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: b573d484e4ff ("scsi: ufs: add support to read device and string descriptors")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Wed, 16 Mar 2016 16:39:17 +0000 (17:39 +0100)]
scsi: fc: use get/put_unaligned64 for wwn access
A bug in the gcc-6.0 prerelease version caused at least one
driver (lpfc) to have excessive stack usage when dealing with
wwn data, on the ARM architecture.
lpfc_scsi.c: In function 'lpfc_find_next_oas_lun':
lpfc_scsi.c:117:1: warning: the frame size of 1152 bytes is larger than 1024 bytes [-Wframe-larger-than=]
I have reported this as a gcc regression in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
However, using a better implementation of wwn_to_u64() not only
helps with the particular gcc problem but also leads to better
object code for any version or architecture.
The kernel already provides get_unaligned_be64() and
put_unaligned_be64() helper functions that provide an
optimized implementation with the desired semantics.
The lpfc_find_next_oas_lun() function in the example that
grew from 1146 bytes to 5144 bytes when moving from gcc-5.3
to gcc-6.0 is now 804 bytes, as the optimized
get_unaligned_be64() load can be done in three instructions.
The stack usage is now down to 28 bytes from 128 bytes with
gcc-5.3 before.
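For readers unfamiliar with the helpers, the semantics are a big-endian 64-bit load and store at an arbitrary (possibly unaligned) address. The portable user-space sketch below mimics those semantics only; load_be64() and store_be64() are hypothetical names and not the kernel's optimized implementation.

```c
#include <stdint.h>

/* Read 8 bytes at p as a big-endian 64-bit value; p may be unaligned. */
uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Write v to the 8 bytes at p in big-endian order; p may be unaligned. */
void store_be64(uint64_t v, uint8_t *p)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)v;
        v >>= 8;
    }
}
```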
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Maurizio Lombardi [Wed, 16 Mar 2016 13:44:08 +0000 (14:44 +0100)]
fnic: move printk()s outside of the critical code section.
This patch moves a printk() outside of the code section where interrupts
are disabled. In some cases a flood of error messages may cause a kernel
panic. It also removes one of the printk()s because the same error
message was printed twice.
[709686.317197] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 12
[709686.317200] CPU: 12 PID: 1963 Comm: systemd-journal Tainted: GF O-------------- 3.10.0-229.el7.x86_64 #1
[709686.317201] Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.6.030620151309 03/06/2015
[709686.317206] ffffffff8182b2e8 00000000392722ba ffff88046fcc5c48 ffffffff81603f36
[709686.317209] ffff88046fcc5cc8 ffffffff815fd7da 0000000000000010 ffff88046fcc5cd8
[709686.317211] ffff88046fcc5c78 00000000392722ba ffff88046fcc5c88 000000000000000c
[709686.317212] Call Trace:
[709686.317221] <NMI> [<ffffffff81603f36>] dump_stack+0x19/0x1b
[709686.317223] [<ffffffff815fd7da>] panic+0xd8/0x1e7
[709686.317227] [<ffffffff8110a760>] ? watchdog_enable_all_cpus.part.2+0x40/0x40
[709686.317229] [<ffffffff8110a822>] watchdog_overflow_callback+0xc2/0xd0
[709686.317233] [<ffffffff8114c901>] __perf_event_overflow+0xa1/0x250
[709686.317235] [<ffffffff8114d404>] perf_event_overflow+0x14/0x20
[709686.317239] [<ffffffff810301fd>] intel_pmu_handle_irq+0x1fd/0x410
[709686.317242] [<ffffffff811908d1>] ? unmap_kernel_range_noflush+0x11/0x20
[709686.317246] [<ffffffff81373574>] ? ghes_copy_tofrom_phys+0x124/0x210
[709686.317249] [<ffffffff8160cfcb>] perf_event_nmi_handler+0x2b/0x50
[709686.317251] [<ffffffff8160c719>] nmi_handle.isra.0+0x69/0xb0
[709686.317252] [<ffffffff8160c830>] do_nmi+0xd0/0x340
[709686.317256] [<ffffffff8160bb71>] end_repeat_nmi+0x1e/0x2e
[709686.317260] [<ffffffff812e24fd>] ? memcpy+0xd/0x110
[709686.317263] [<ffffffff812e24fd>] ? memcpy+0xd/0x110
[709686.317265] [<ffffffff812e24fd>] ? memcpy+0xd/0x110
[709686.317269] <<EOE>> [<ffffffff8132c297>] ? vgacon_scroll+0x2d7/0x330
[709686.317273] [<ffffffff813a086c>] scrup+0xfc/0x110
[709686.317275] [<ffffffff813a0920>] lf+0xa0/0xb0
[709686.317278] [<ffffffff813a1b32>] vt_console_print+0x2d2/0x420
[709686.317283] [<ffffffff8106f4a1>] call_console_drivers.constprop.15+0x91/0xf0
[709686.317287] [<ffffffff8107069f>] console_unlock+0x3bf/0x400
[709686.317291] [<ffffffff81070996>] vprintk_emit+0x2b6/0x530
[709686.317294] [<ffffffff815fd961>] printk_emit+0x44/0x5b
[709686.317297] [<ffffffff81070d98>] devkmsg_writev+0x158/0x1d0
[709686.317303] [<ffffffff811c5ef9>] do_sync_readv_writev+0x79/0xd0
[709686.317307] [<ffffffff811c73ee>] do_readv_writev+0xce/0x260
[709686.317310] [<ffffffff811c8d18>] ? __sb_start_write+0x58/0x110
[709686.317314] [<ffffffff811c7615>] vfs_writev+0x35/0x60
[709686.317318] [<ffffffff811c776c>] SyS_writev+0x5c/0xd0
[709686.317322] [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Tue, 15 Mar 2016 21:40:31 +0000 (22:40 +0100)]
qla2xxx: avoid maybe_uninitialized warning
The qlt_check_reserve_free_req() function produces an incorrect warning
when CONFIG_PROFILE_ANNOTATED_BRANCHES is set:
drivers/scsi/qla2xxx/qla_target.c: In function 'qlt_check_reserve_free_req':
drivers/scsi/qla2xxx/qla_target.c:1887:3: error: 'cnt_in' may be used uninitialized in this function [-Werror=maybe-uninitialized]
ql_dbg(ql_dbg_io, vha, 0x305a,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"qla_target(%d): There is no room in the request ring: vha->req->ring_index=%d, vha->req->cnt=%d, req_cnt=%d Req-out=%d Req-in=%d Req-Length=%d\n",
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vha->vp_idx, vha->req->ring_index,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vha->req->cnt, req_cnt, cnt, cnt_in, vha->req->length);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/qla2xxx/qla_target.c:1887:3: error: 'cnt' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The problem is that gcc fails to track the state of the condition across
an annotated branch.
This slightly rearranges the code to move the second if() block
into the first one, to avoid the warning while retaining the
behavior of the code.
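A simplified sketch of the rearranged shape follows. The ring structure and field names are hypothetical and only loosely modelled on the description; the point is that the re-check sits inside the block where 'cnt' is assigned, so the compiler never sees a path that reads it uninitialized.

```c
#include <stdint.h>

/* Hypothetical, simplified request ring. */
struct fake_ring {
    uint32_t ring_index;   /* producer index */
    uint32_t out_index;    /* consumer index, normally read from hardware */
    uint32_t length;       /* total number of entries */
    uint32_t cnt;          /* cached count of free entries */
};

/* Returns 0 if req_cnt entries could be reserved, -1 otherwise. */
int check_reserve_free_req(struct fake_ring *r, uint32_t req_cnt)
{
    if (r->cnt < req_cnt + 2) {
        uint32_t cnt = r->out_index;

        if (r->ring_index < cnt)
            r->cnt = cnt - r->ring_index;
        else
            r->cnt = r->length - (r->ring_index - cnt);

        /* Nested re-check: 'cnt' is always initialized on this path. */
        if (r->cnt < req_cnt + 2)
            return -1;
    }

    r->cnt -= req_cnt;
    return 0;
}
```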
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-By: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Mon, 14 Mar 2016 14:29:45 +0000 (15:29 +0100)]
megaraid_sas: add missing curly braces in ioctl handler
gcc-6 found a dubious indentation in the megasas_mgmt_fw_ioctl
function:
drivers/scsi/megaraid/megaraid_sas_base.c: In function 'megasas_mgmt_fw_ioctl':
drivers/scsi/megaraid/megaraid_sas_base.c:6658:4: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
kbuff_arr[i] = NULL;
^~~~~~~~~
drivers/scsi/megaraid/megaraid_sas_base.c:6653:3: note: ...this 'if' clause, but it is not
if (kbuff_arr[i])
^~
The code is actually correct, as there is no downside in clearing a NULL
pointer again.
This clarifies the code and avoids the warning by adding extra curly
braces.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 90dc9d98f01b ("megaraid_sas : MFI MPT linked list corruption fix")
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Mon, 14 Mar 2016 14:29:44 +0000 (15:29 +0100)]
lpfc: fix misleading indentation
gcc-6 complains about the indentation of the lpfc_destroy_vport_work_array()
call in lpfc_online(), which clearly doesn't look right:
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_online':
drivers/scsi/lpfc/lpfc_init.c:2880:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
lpfc_destroy_vport_work_array(phba, vports);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/lpfc/lpfc_init.c:2863:2: note: ...this 'if' clause, but it is not
if (vports != NULL)
^~
Looking at the patch that introduced this code, it's clear that the
behavior is correct and the indentation is wrong.
This fixes the indentation and adds curly braces around the previous
if() block for clarity, as that is most likely what caused the code
to be misindented in the first place.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 549e55cd2a1b ("[SCSI] lpfc 8.2.2 : Fix locking around HBA's port_list")
Reviewed-by: Sebastian Herbszt <herbszt@gmx.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Mon, 14 Mar 2016 09:43:08 +0000 (10:43 +0100)]
scsi_transport_sas: add 'scsi_target_id' sysfs attribute
There is no way to detect the scsi_target_id for any given SAS remote
port, so add a new sysfs attribute 'scsi_target_id'.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Fri, 11 Mar 2016 11:19:03 +0000 (14:19 +0300)]
scsi_dh_alua: uninitialized variable in alua_check_vpd()
The pg_updated variable is supposed to be set to false at the start but
it is uninitialized.
Fixes: cb0a168cb6b8 ('scsi_dh_alua: update 'access_state' field')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:21 +0000 (17:37 +0200)]
scsi: ufs-qcom: add printouts of testbus debug registers
This change adds printouts of testbus and debug registers.
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:20 +0000 (17:37 +0200)]
scsi: ufs-qcom: enable/disable the device ref clock
This change enables the device ref clock before changing to HS mode
and disables it when entering PWM mode.
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:19 +0000 (17:37 +0200)]
scsi: ufs-qcom: set PA_Local_TX_LCC_Enable before link startup
Some UFS devices (and maybe the host) have issues if LCC is
enabled. So we set PA_Local_TX_LCC_Enable to 0
before link startup, which makes sure that both host
and device TX LCC are disabled once link startup is
completed.
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:18 +0000 (17:37 +0200)]
scsi: ufs: add device quirk delay before putting UFS rails in LPM
We put the UFS device in sleep state & UFS link in hibern8 state during
runtime suspend. After this we put all the UFS rails in low power
modes immediately, but it seems some devices may still draw more than
sleep current from UFS rails (especially from the VCCQ rail) for at least
500us.
To avoid this situation, this change adds a 2ms delay before putting
these UFS rails in LPM.
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:17 +0000 (17:37 +0200)]
scsi: ufs: fix leakage during link off state
Currently when we try to put the link in off/disabled state during
suspend, it seems the link is not being kept in low power mode.
This patch fixes the issue by putting the link in hibern8 first
(so the device also puts the link in low power mode) and then stopping
the host controller.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:16 +0000 (17:37 +0200)]
scsi: ufs: tune UniPro parameters to optimize hibern8 exit time
Optimal values of local UniPro parameters like PA_Hibern8Time &
PA_TActivate can help reduce the hibern8 exit latency. If both host and
device support UniPro ver 1.6 or later, these parameters will be
automatically tuned during link startup itself. But if either host or
device doesn't support UniPro ver 1.6 or later, we have to manually
tune them. To keep the manual tuning logic simple, we will only do
manual tuning if the local UniPro version doesn't support ver 1.6 or later.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:15 +0000 (17:37 +0200)]
scsi: ufs: handle non spec compliant bkops behaviour by device
We are seeing that some devices are raising the urgent bkops exception
events even when BKOPS status doesn't indicate performance impacted or
critical. Handle these devices by determining their urgent bkops status
at runtime.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:14 +0000 (17:37 +0200)]
scsi: ufs: add retry for query descriptors
Query commands have a 100ms timeout and may time out if they are
issued in parallel with ongoing read/write SCSI commands. This change
adds a retry (max: 10) in case the command times out.
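A bounded retry of this kind might be sketched as follows. The callback type, the helper names, and the reading of "max: 10" as ten total attempts are assumptions made for illustration; the driver's actual loop is not shown in this log.

```c
/* Assumption: the limit counts total attempts, not retries after the first. */
#define QUERY_MAX_ATTEMPTS 10

/* Hypothetical query callback: returns 0 on success, nonzero on failure. */
typedef int (*query_fn)(void *ctx);

/* Call fn until it succeeds or the attempt budget is exhausted. */
int query_with_retry(query_fn fn, void *ctx)
{
    int err = -1;

    for (int attempt = 0; attempt < QUERY_MAX_ATTEMPTS; attempt++) {
        err = fn(ctx);
        if (err == 0)
            break;
    }
    return err;
}

/* Example callback: fails until the counter pointed to by ctx reaches zero. */
int fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

With fail_n_times() as the callback, a query that fails three times succeeds on the fourth attempt, while one that fails ten times in a row returns the last error.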
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:13 +0000 (17:37 +0200)]
scsi: ufs: add error recovery after DL NAC error
Some vendors' UFS devices send back-to-back NACs for the DL data frames,
causing the host controller to raise the DFES error status. Sometimes
such UFS devices send back-to-back NACs without waiting for the new
retransmitted DL frame from the host, and in such cases the host UniPro
may go into a bad state without raising the DFES error interrupt. If
this happens, all the pending commands would time out only after their
respective SW command timeouts (which are generally too large).
This change works around such device behaviour like this:
- As soon as SW sees the DL NAC error, it would schedule the error
handler
- Error handler would sleep for 50ms to see if there are any fatal errors
raised by UFS controller.
- If there are fatal errors then SW does normal error recovery.
- If there are no fatal errors then SW sends the NOP command to
device to check if link is alive.
- If NOP command times out, SW does normal error recovery
- If NOP command succeeds, skip the error handling.
If DL NAC error is seen multiple times with some vendor's UFS devices
then enable this quirk to initiate quick error recovery and also
silence related error logs to reduce spamming of kernel logs.
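The decision steps listed above can be condensed into a small sketch. The types and names are hypothetical; in a driver the two booleans would come from sampling controller state after the 50ms wait and from issuing a NOP command.

```c
#include <stdbool.h>

enum nac_action { NAC_FULL_RECOVERY, NAC_SKIP_RECOVERY };

/* Hypothetical snapshot of what the error handler observed. */
struct nac_probe {
    bool fatal_error_seen;   /* fatal error raised during the 50ms wait */
    bool nop_succeeded;      /* NOP command completed without timing out */
};

enum nac_action handle_dl_nac(const struct nac_probe *p)
{
    /* Fatal errors always lead to normal error recovery. */
    if (p->fatal_error_seen)
        return NAC_FULL_RECOVERY;

    /* No fatal error: the NOP result tells whether the link is alive. */
    if (!p->nop_succeeded)
        return NAC_FULL_RECOVERY;

    return NAC_SKIP_RECOVERY;
}
```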
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:12 +0000 (17:37 +0200)]
scsi: ufs: make error handling bit faster
UFS driver's error handler forcefully tries to clear all the pending
requests. For each pending request in the queue, it waits 1 sec for it
to get cleared. If we have multiple requests in the queue then it's
possible that we might end up waiting that many seconds before
resetting the host. But note that resetting the host would anyway clear
all the pending requests from the hardware. Hence this change skips
the forceful clear of the pending requests if we are anyway going to
reset the host (for fatal errors).
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:11 +0000 (17:37 +0200)]
scsi: ufs: disable vccq if it's not needed by UFS device
Some UFS devices don't require VCCQ rail for device operations hence
this change adds support to recognize such devices and remove vote for
the unused VCCQ rail.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:10 +0000 (17:37 +0200)]
scsi: ufs: separate device and host quirks
Currently we use the host quirks mechanism in order to
handle both device and host controller quirks.
In order to support various UFS devices we should separate
handling the device quirks from the host controller's.
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Raviv Shvili <rshvili@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:09 +0000 (17:37 +0200)]
scsi: ufs: add support to read device and string descriptors
This change adds support to read device descriptor and string descriptor
from a UFS device
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Raviv Shvili <rshvili@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:08 +0000 (17:37 +0200)]
scsi: ufs: verify hba controller hce reg value
Sometimes, due to hardware issues, it takes some time for the
host controller register to update. In order to verify that the register
has updated, it is polled until its value is set.
In addition, the functions ufshcd_hba_stop() and
ufshcd_wait_for_register() were updated with an additional input
parameter indicating whether the delay between reads is
done by sleeping or by spinning the CPU.
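The polling idea with a sleep-or-spin switch can be sketched in user space as below. The poll_ops hooks, the parameter list, and the fake register are all assumptions for illustration; this is not the signature of ufshcd_wait_for_register().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks standing in for register access and delay primitives. */
struct poll_ops {
    uint32_t (*read_reg)(void *ctx);
    void (*sleep_us)(unsigned int usecs);   /* may schedule */
    void (*spin_us)(unsigned int usecs);    /* busy-waits */
};

/* Returns 0 once (reg & mask) == (val & mask), or -1 after max_polls reads. */
int wait_for_register(const struct poll_ops *ops, void *ctx,
                      uint32_t mask, uint32_t val,
                      unsigned int interval_us, unsigned int max_polls,
                      bool can_sleep)
{
    val &= mask;   /* ignore bits outside the mask */

    for (unsigned int i = 0; i < max_polls; i++) {
        if ((ops->read_reg(ctx) & mask) == val)
            return 0;
        if (can_sleep)
            ops->sleep_us(interval_us);
        else
            ops->spin_us(interval_us);
    }
    return -1;
}

/* Fake register: reads 0 for the first *ctx reads, then reads 1. */
uint32_t fake_read(void *ctx)
{
    int *reads_left = ctx;

    if (*reads_left > 0) {
        (*reads_left)--;
        return 0;
    }
    return 1;
}

/* Fake delays that only count how often each kind was used. */
int sleep_calls, spin_calls;
void fake_sleep(unsigned int usecs) { (void)usecs; sleep_calls++; }
void fake_spin(unsigned int usecs)  { (void)usecs; spin_calls++; }
```

A caller in atomic context would pass can_sleep as false so that only the spinning delay is used.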
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Raviv Shvili <rshvili@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:07 +0000 (17:37 +0200)]
scsi: ufs: implement scsi host timeout handler
A race condition exists between request requeueing and scsi layer
error handling:
When UFS driver queuecommand returns a busy status for a request,
it will be requeued and its tag will be freed and set to -1.
At the same time it is possible that the request will timeout and
scsi layer will start error handling for it. The scsi layer reuses
the request and its tag to send error related commands to the device,
however its tag is no longer valid.
As this request was never really sent to the device, there is no
point to start error handling with the device.
Implement the scsi error handling timeout callback and bypass SCSI
error handling for requests that were not actually sent to the device.
For such requests simply reset the block layer timer. Otherwise, let
SCSI layer perform the usual error handling.
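A rough sketch of that decision, assuming the driver can tell whether a command occupies one of its hardware slots. The enum, struct, and function names are hypothetical and only loosely modelled on block-layer timeout results.

```c
#include <stddef.h>

/* Hypothetical results of a timeout callback. */
enum eh_result { EH_NOT_HANDLED, EH_RESET_TIMER };

#define NUM_SLOTS 32

struct fake_hba {
    /* Command occupying each hardware slot, or NULL if the slot is free. */
    const void *slot_cmd[NUM_SLOTS];
};

enum eh_result timed_out(const struct fake_hba *hba, const void *cmd)
{
    if (cmd == NULL)
        return EH_RESET_TIMER;

    for (size_t i = 0; i < NUM_SLOTS; i++) {
        if (hba->slot_cmd[i] == cmd)
            return EH_NOT_HANDLED;   /* was sent: usual error handling */
    }

    /* Never sent to the device: just re-arm the timer. */
    return EH_RESET_TIMER;
}
```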
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:06 +0000 (17:37 +0200)]
scsi: ufs: avoid spurious UFS host controller interrupts
When control reaches the Linux UFS driver during UFS boot mode, the UFS
host controller interrupt status/enable registers may have leftover
settings.
In order to avoid any spurious interrupts due to these left overs,
it's important to clear these interrupt status/enable registers before
enabling UFS interrupt handling.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yaniv Gardi [Thu, 10 Mar 2016 15:37:05 +0000 (17:37 +0200)]
scsi: ufs-qcom: add number of lanes per direction
Different platforms may have a different number of lanes
for the UFS link.
Add a parameter to the device tree specifying how many lanes
should be configured for the UFS link.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Douglas Gilbert [Thu, 3 Mar 2016 05:31:29 +0000 (00:31 -0500)]
sg: fix dxferp in from_to case
One of the strange things that the original sg driver did was let the
user provide both a data-out buffer (it followed the sg_header+cdb)
_and_ specify a reply length greater than zero. What happened was that
the user data-out buffer was copied into some kernel buffers and then
the mid level was told a read type operation would take place with the
data from the device overwriting the same kernel buffers. The user would
then read those kernel buffers back into the user space.
From what I can tell, the above action was broken by commit fad7f01e61bf
("sg: set dxferp to NULL for READ with the older SG interface") in 2008
and syzkaller found that out recently.
Make sure that a user space pointer is passed through when data follows
the sg_header structure and command. Fix the abnormal case when a
non-zero reply_len is also given.
Fixes: fad7f01e61bf737fe8a3740d803f000db57ecac6
Cc: <stable@vger.kernel.org> #v2.6.28+
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Vitaly Kuznetsov [Mon, 7 Mar 2016 10:59:44 +0000 (11:59 +0100)]
scsi: storvsc: fix SRB_STATUS_ABORTED handling
Commit 3209f9d780d1 ("scsi: storvsc: Fix a bug in the handling of SRB
status flags") filtered SRB_STATUS_AUTOSENSE_VALID out, effectively making
the (SRB_STATUS_ABORTED | SRB_STATUS_AUTOSENSE_VALID) case dead code. The
logic from this branch (e.g. storvsc_device_scan() call) is still required,
so fix the check.
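Why the combined case can never match is easiest to see in a small sketch. The flag values and the masking macro below are assumptions for illustration, and the two helper functions are hypothetical; they are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
#define SRB_STATUS_ABORTED          0x02
#define SRB_STATUS_QUEUE_FROZEN     0x40
#define SRB_STATUS_AUTOSENSE_VALID  0x80

/* Status with the two flag bits filtered out. */
#define SRB_STATUS(s) \
    ((s) & ~(SRB_STATUS_AUTOSENSE_VALID | SRB_STATUS_QUEUE_FROZEN))

/* Dead check: the mask already removed AUTOSENSE_VALID, so this never matches. */
bool aborted_with_sense_dead(uint8_t srb_status)
{
    return SRB_STATUS(srb_status) ==
           (SRB_STATUS_ABORTED | SRB_STATUS_AUTOSENSE_VALID);
}

/* Working check: compare the masked status, test the flag on the raw value. */
bool aborted_with_sense(uint8_t srb_status)
{
    return SRB_STATUS(srb_status) == SRB_STATUS_ABORTED &&
           (srb_status & SRB_STATUS_AUTOSENSE_VALID) != 0;
}
```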
Cc: <stable@vger.kernel.org> #v4.4+
Fixes: 3209f9d780d1 ("scsi: storvsc: Fix a bug in the handling of SRB status flags")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Maurizio Lombardi [Fri, 4 Mar 2016 09:41:49 +0000 (10:41 +0100)]
be2iscsi: set the boot_kset pointer to NULL in case of failure
In beiscsi_setup_boot_info(), the boot_kset pointer should be set to
NULL in case of failure otherwise an invalid pointer dereference may
occur later.
Cc: <stable@vger.kernel.org>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Martin K. Petersen [Sat, 5 Mar 2016 22:52:02 +0000 (17:52 -0500)]
sd: Fix discard granularity when LBPRZ=1
Commit 397737223c59 ("sd: Make discard granularity match logical block
size when LBPRZ=1") accidentally set the granularity to one byte instead
of one logical block on devices that provide deterministic zeroes after
UNMAP.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: 397737223c59e89dca7305feb6528caef8fbef84
Cc: <stable@vger.kernel.org> #v4.4+
Hannes Reinecke [Thu, 10 Mar 2016 10:25:26 +0000 (11:25 +0100)]
scsi_sysfs: Fix typo in is_bin_visible()
The test for the existence of vpd_pg83 is inverted.
Fixes: 7e47976bcff ("scsi_sysfs: add 'is_bin_visible' callback")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sumit Saxena [Thu, 10 Mar 2016 10:14:37 +0000 (02:14 -0800)]
megaraid_sas: Don't issue kill adapter for MFI controllers in case of PD list DCMD failure
A few MFI adapters do not support MR_DCMD_PD_LIST_QUERY, so when an MFI
adapter fails this DCMD, the failure should not be treated as fatal and
the driver should not issue kill adapter. Instead, the driver sets the
per-controller instance variable pd_list_not_supported, which the
slave_alloc and slave_configure functions can use to allow firmware scan.
Killing the adapter because of a DCMD failure when this DCMD is not
supported causes the driver's probe to fail. This issue was introduced by
commit 6d40afbc7d13 ("megaraid_sas: MFI IO timeout handling").
Killing the adapter in case of this DCMD failure should be limited to Fusion
adapters only. The per-controller instance variable allow_fw_scan is
removed, as pd_list_not_supported better reflects the purpose.
Fixes: 6d40afbc7d13359b30a5cd783e3db6ebefa5f40a
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Lars-Peter Clausen [Fri, 4 Mar 2016 10:15:07 +0000 (11:15 +0100)]
mpt3sas: Remove unnecessary synchronize_irq() before free_irq()
Calling synchronize_irq() right before free_irq() is quite useless. On
one hand the IRQ can easily fire again before free_irq() is entered, on
the other hand free_irq() itself calls synchronize_irq() internally (in
a race condition free way), before any state associated with the IRQ is
freed.
Patch was generated using the following semantic patch:
// <smpl>
@@
expression irq;
@@
-synchronize_irq(irq);
free_irq(irq, ...);
// </smpl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Manoj N. Kumar [Fri, 4 Mar 2016 21:55:20 +0000 (15:55 -0600)]
cxlflash: Increase cmd_per_lun for better throughput
With the current value of cmd_per_lun at 16, the throughput
over a single adapter is limited to around 150kIOPS.
Increase the value of cmd_per_lun to 256 to improve
throughput. With this change a single adapter is able to
attain close to the maximum throughput (380kIOPS).
Also change the number of RRQ entries that can be queued.
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Manoj N. Kumar [Fri, 4 Mar 2016 21:55:19 +0000 (15:55 -0600)]
cxlflash: Fix to avoid unnecessary scan with internal LUNs
When switching to the internal LUN defined on the
IBM CXL flash adapter, there is an unnecessary
scan occurring on the second port. This scan leads
to the following extra lines in the log:
Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: cxlflash_queuecommand: (scp=c0000000fc1f0f00) 11/1/0/0 cdb=(A0000000-00000000-10000000-00000000)
Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0
By definition, both of the internal LUNs are on the first port/channel.
When the lun_mode is switched to internal LUN the
same value for host->max_channel is retained. This
causes an unnecessary scan over the second port/channel.
This fix alters the host->max_channel to 0 (1 port), if internal
LUNs are configured and switches it back to 1 (2 ports) while
going back to external LUNs.
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 4 Mar 2016 21:55:18 +0000 (15:55 -0600)]
cxlflash: Reorder user context initialization
In order to support cxlflash in the PowerVM environment, underlying
hypervisor APIs have imposed a kernel API ordering change.
For the superpipe access to LUN, user applications need a context.
The cxlflash module creates this context by making a sequence of
cxl calls. In the current code, a context is initialized via
cxl_dev_context_init() followed by cxl_process_element(), a function
that obtains the process element id. Finally, cxl_start_work()
is called to attach the process element.
In the PowerVM environment, a process element id cannot be obtained
from the hypervisor until the process element is attached. The
cxlflash module is unable to create contexts without a valid
process element id.
To fix this problem, cxl_start_work() is called before obtaining
the process element id.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Fri, 4 Mar 2016 21:55:17 +0000 (15:55 -0600)]
cxlflash: Simplify attach path error cleanup
The cxlflash_disk_attach() routine currently uses a cascading error
gate strategy for its error cleanup path. While this strategy is
commonly used to handle cleanup scenarios, it is too restrictive when
function callouts need to be restructured. Problems range from
inserting error path bugs in previously 'good' code to the cleanup
path imposing design changes to how the normal path is structured.
A less restrictive approach is needed to support ordering changes
that come about when operating in different environments.
To overcome this restriction, the error cleanup path is modified to
have a single entrypoint and use conditional logic to clean up where
necessary. Entities that require multiple cleanup steps must be
carefully vetted to ensure their APIs support state. In cases where
they do not (none as of this commit) additional local variables can
be used to maintain state on their behalf.
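As an illustration only, the single-entrypoint approach can be sketched
in user-space C. Every name below (struct attach_state, demo_attach(),
demo_cleanup(), the fail_at parameter) is hypothetical and not taken from
cxlflash; the point is that each cleanup step tests whether its resource
was acquired, so the normal path can be reordered without rewiring labels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical resources acquired by an attach-like routine. */
struct attach_state {
	void *ctx;         /* NULL until allocated */
	int fd;            /* -1 until opened */
	bool work_started; /* local state kept on behalf of a stateless API */
};

/*
 * Single cleanup entrypoint: each step is undone only if it was done.
 * Returns the number of cleanup steps actually performed.
 */
int demo_cleanup(struct attach_state *s)
{
	int undone = 0;

	assert(s != NULL);
	if (s->work_started) {
		s->work_started = false;
		undone++;
	}
	if (s->fd >= 0) {
		s->fd = -1;
		undone++;
	}
	if (s->ctx) {
		free(s->ctx);
		s->ctx = NULL;
		undone++;
	}
	return undone;
}

/*
 * fail_at selects which step (1..3) fails; 0 means every step succeeds.
 * On failure returns -1 and stores the number of cleanup steps in *undone.
 */
int demo_attach(int fail_at, int *undone)
{
	struct attach_state s = { .ctx = NULL, .fd = -1, .work_started = false };

	*undone = 0;

	if (fail_at == 1)
		goto err;
	s.ctx = malloc(16);
	if (!s.ctx)
		goto err;

	if (fail_at == 2)
		goto err;
	s.fd = 3; /* stand-in for a real descriptor */

	if (fail_at == 3)
		goto err;
	s.work_started = true;

	/* Success: the demo releases everything so it can be run repeatedly. */
	demo_cleanup(&s);
	return 0;
err:
	*undone = demo_cleanup(&s);
	return -1;
}
```

Calling demo_attach(2, &n) fails after the allocation, so only one
cleanup step runs; swapping the order of the steps in the normal path
requires no change to the error path.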
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Fri, 4 Mar 2016 21:55:16 +0000 (15:55 -0600)]
cxlflash: Split out context initialization
Presently, context information structures are allocated and
initialized in the same routine, create_context(). This imposes
an ordering restriction such that all pieces of information needed
to initialize a context must be known before the context is even
allocated.
This design point is not flexible when the order of context
creation needs to be modified. Specifically, this can lead to
problems when members of the context information structure are
a part of an ordering dependency (i.e. - the 'work' structure
embedded within the context).
To remedy, the allocation is left as-is, inside of the existing
create_context() routine and the initialization is transitioned
to a new void routine, init_context(). At the same time, in
anticipation of these routines not being called in sequence, a
state boolean is added to the context information structure to
track when the context has been initialized. The context teardown
routine, destroy_context(), is modified to support being called
with a non-initialized context.
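The split described above can be sketched roughly as follows. This is a
user-space model under stated assumptions, not the driver code: struct
demo_ctx and the demo_* functions are hypothetical stand-ins for the
context information structure and for create_context(), init_context()
and destroy_context().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the context information structure. */
struct demo_ctx {
	int id;
	bool initialized; /* set only by demo_init_context() */
};

/* Allocation only: nothing about the final context needs to be known yet. */
struct demo_ctx *demo_create_context(void)
{
	return calloc(1, sizeof(struct demo_ctx));
}

/* Initialization is a separate void step that may run much later. */
void demo_init_context(struct demo_ctx *ctx, int id)
{
	assert(ctx != NULL);
	ctx->id = id;
	ctx->initialized = true;
}

/*
 * Teardown tolerates a context that was allocated but never initialized.
 * Returns true if initialized state had to be torn down.
 */
bool demo_destroy_context(struct demo_ctx *ctx)
{
	bool was_initialized;

	if (!ctx)
		return false;
	was_initialized = ctx->initialized;
	/* A real driver would undo initialization-time work here. */
	free(ctx);
	return was_initialized;
}
```

Because the boolean records whether initialization happened, an error
path can call the teardown routine at any point after allocation.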
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 4 Mar 2016 21:55:15 +0000 (15:55 -0600)]
cxlflash: Unmap problem state area before detaching master context
When operating in the PowerVM environment, the cxlflash module can
receive an error from the hypervisor indicating that there are
existing mappings in the page table for the process MMIO space.
This issue exists because term_afu() currently invokes term_mc()
before stop_afu(), allowing for the master context to be detached
first and the problem state area to be unmapped second.
To resolve this issue, stop_afu() should be called before term_mc().
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Manoj N. Kumar [Fri, 4 Mar 2016 21:55:14 +0000 (15:55 -0600)]
cxlflash: Simplify PCI registration
The calls to pci_request_regions(), pci_resource_start(),
pci_set_dma_mask(), pci_set_master() and pci_save_state() are all
unnecessary for the IBM CXL flash adapter since data buffers
are not required to be mapped to the device's memory.
The use of services such as pci_set_dma_mask() are problematic on
hypervisor managed systems as the IBM CXL flash adapter is operating
under a virtual PCI Host Bridge (virtual PHB) which does not support
these services.
cxlflash 0001:00:00.0: init_pci: Failed to set PCI DMA mask rc=-5
The resolution is to simplify init_pci(), to a point where it does the
bare minimum (pci_enable_device). Similarly, remove the call to
pci_release_regions() from cxlflash_remove().
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Lars-Peter Clausen [Fri, 4 Mar 2016 10:15:06 +0000 (11:15 +0100)]
be2iscsi: Remove unnecessary synchronize_irq() before free_irq()
Calling synchronize_irq() right before free_irq() is quite useless. On one
hand the IRQ can easily fire again before free_irq() is entered, on the
other hand free_irq() itself calls synchronize_irq() internally (in a race
condition free way), before any state associated with the IRQ is freed.
Patch was generated using the following semantic patch:
// <smpl>
@@
expression irq;
@@
-synchronize_irq(irq);
free_irq(irq, ...);
// </smpl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:12 +0000 (07:54 +0100)]
scsi_sysfs: call 'device_add' after attaching device handler
'device_add' will be evaluating the 'is_visible' callback when creating
the sysfs attributes. As by this time the device handler has not been
attached the 'access_state' attribute will never be visible.
This patch moves the code around so that the device handler is present
by the time 'is_visible' is evaluated to correctly display the
'access_state' attribute.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:11 +0000 (07:54 +0100)]
scsi_dh_emc: update 'access_state' field
Update the 'access_state' field of the SCSI device whenever the path
state changes.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:10 +0000 (07:54 +0100)]
scsi_dh_rdac: update 'access_state' field
Track attached SCSI devices and update the 'access_state' whenever the
path state of the device changes.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:09 +0000 (07:54 +0100)]
scsi_dh_alua: update 'access_state' field
Track attached SCSI devices and update the 'access_state' field whenever
an ALUA state change has been detected.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:08 +0000 (07:54 +0100)]
scsi_dh_alua: use common definitions for ALUA state
scsi_proto.h now contains definitions for the ALUA state, so we don't
have to carry them in the device handler.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:54:07 +0000 (07:54 +0100)]
scsi: Add 'access_state' and 'preferred_path' attribute
Add an 'access_state' field to struct scsi_device and display it in
sysfs as the 'access_state' and 'preferred_path' attributes.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 3 Mar 2016 06:41:24 +0000 (07:41 +0100)]
scsi_sysfs: add 'is_bin_visible' callback
Add 'is_bin_visible' callback to blank out unsupported vpd pages.
Reviewed-by: Shane Seymour <shane.seymour@hpe.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Wed, 2 Mar 2016 15:59:00 +0000 (16:59 +0100)]
scsi: mvumi: use __maybe_unused to hide pm functions
The mvumi scsi driver hides the references to its suspend/resume functions in
an #ifdef but does not hide the implementation the same way:
drivers/scsi/mvumi.c:2632:12: error: 'mvumi_suspend' defined but not used [-Werror=unused-function]
drivers/scsi/mvumi.c:2651:12: error: 'mvumi_resume' defined but not used [-Werror=unused-function]
This adds __maybe_unused annotations so the compiler knows it can
silently drop them instead of warning, while avoiding the addition of
another #ifdef.
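A minimal sketch of the pattern, assuming GCC or Clang. The kernel's
__maybe_unused expands to the compiler's 'unused' attribute; the macro
name demo_maybe_unused, the DEMO_CONFIG_PM switch and the demo_* symbols
below are hypothetical and only mimic the shape of the mvumi change.

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of the kernel's __maybe_unused annotation. */
#define demo_maybe_unused __attribute__((__unused__))

/* Hypothetical switch standing in for CONFIG_PM; left undefined here. */
/* #define DEMO_CONFIG_PM 1 */

/* Always compiled, so it is always type-checked ... */
static int demo_maybe_unused demo_suspend(int state)
{
	return state + 1;
}

/* ... but only referenced when the feature is enabled. */
#ifdef DEMO_CONFIG_PM
int (*demo_suspend_hook)(int) = demo_suspend;
#else
int (*demo_suspend_hook)(int) = NULL;
#endif
```

With the switch undefined, nothing refers to demo_suspend(), yet the
build produces no "defined but not used" diagnostic and the compiler is
free to discard the function.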
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
K. Y. Srinivasan [Sat, 27 Feb 2016 01:48:58 +0000 (17:48 -0800)]
scsi: storvsc: Fix a build issue reported by kbuild test robot
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:
03c21cb775a313f1ff19be59c5d02df3e3526471
commit:
dac582417bc449b1f7f572d3f1dd9d23eec15cc9 storvsc: Properly support Fibre Channel devices
date: 3 weeks ago
config: x86_64-randconfig-s3-01281016 (attached as .config)
reproduce:
git checkout
dac582417bc449b1f7f572d3f1dd9d23eec15cc9
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
drivers/built-in.o: In function `storvsc_remove':
>> storvsc_drv.c:(.text+0x213af7): undefined reference to `fc_remove_host'
drivers/built-in.o: In function `storvsc_drv_init':
>> storvsc_drv.c:(.init.text+0xcbcc): undefined reference to `fc_attach_transport'
>> storvsc_drv.c:(.init.text+0xcc06): undefined reference to `fc_release_transport'
drivers/built-in.o: In function `storvsc_drv_exit':
>> storvsc_drv.c:(.exit.text+0x123c): undefined reference to `fc_release_transport'
With this commit, the storvsc driver depends on FC attributes. Make this
dependency explicit.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Thu, 25 Feb 2016 22:58:25 +0000 (22:58 +0000)]
snic: correctly check for array overrun on overly long version number
The snic version number is expected to be 4 decimals in the form like a
netmask string with each number stored in an element in array v.
However, there is an off-by-one check on the number of elements in v
allowing one to pass a 5 decimal version number causing v[4] to be
referenced, causing a buffer overrun. Fix the off-by-one error by
comparing to i > 3 rather than 4.
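The bounds check can be illustrated with a small stand-alone parser.
This is a sketch under assumptions, not the snic routine: the name
demo_parse_version() and its return convention are hypothetical, and
only the index check mirrors the fix described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define DEMO_NFIELDS 4

/*
 * Parse "a.b.c.d" into v[0..3]. Returns 0 on success and -1 on malformed
 * input, including a fifth field that would otherwise index v[4].
 */
int demo_parse_version(const char *s, int v[DEMO_NFIELDS])
{
	int i = 0;
	char c;

	assert(s != NULL && v != NULL);
	for (int k = 0; k < DEMO_NFIELDS; k++)
		v[k] = 0;

	while ((c = *s++)) {
		if (c == '.') {
			i++;
			continue;
		}
		/* Valid indices are 0..3; testing "i > 4" would let v[4] through. */
		if (i > DEMO_NFIELDS - 1 || !isdigit((unsigned char)c))
			return -1;
		v[i] = v[i] * 10 + (c - '0');
	}
	return i == DEMO_NFIELDS - 1 ? 0 : -1;
}
```

A string such as "1.2.3.4.5" is rejected as soon as the first digit of
the fifth field is seen, before any write past the end of the array.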
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Shane Seymour <shane.seymour@hpe.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Amitoj Kaur Chawla [Wed, 17 Feb 2016 13:32:54 +0000 (19:02 +0530)]
qlogicpti: Return correct error code
The return value of of_ioremap on failure should be -ENODEV and not
-1.
Found using Coccinelle. A simplified version of the semantic patch
used is:
//<smpl>
@@
expression *e;
@@
e = of_ioremap(...);
if (e == NULL) {
...
return
- -1
+ -ENODEV
;
}
//</smpl>
The single call site only checks that the return value is less than 0,
hence no change is required at the call site.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Reviewed-by: Shane Seymour <shane.seymour@hpe.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:09 +0000 (10:07 +1100)]
ncr5380: Call scsi_eh_prep_cmnd() and scsi_eh_restore_cmnd() as and when appropriate
This bug causes the wrong command to have its sense pointer overwritten,
which sometimes leads to a NULL pointer deref. Fix this by checking which
command is being requeued before restoring the scsi_eh_save data.
It turns out that some targets will disconnect a REQUEST SENSE command.
The autosense algorithm doesn't anticipate this. Hence multiple commands
can end up undergoing autosense simultaneously, and they will all try to
use the same scsi_eh_save struct, which won't work. Defer autosense when
the scsi_eh_save storage is in use by another command.
Fixes:
f27db8eb98a1 ("ncr5380: Fix autosense bugs")
Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:08 +0000 (10:07 +1100)]
ncr5380: Fix NCR5380_select() EH checks and result handling
Add missing checks for EH abort during arbitration and selection.
Rework the handling of NCR5380_select() result to improve clarity.
Fixes:
707d62b37fbb ("ncr5380: Fix EH during arbitration and selection")
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:07 +0000 (10:07 +1100)]
ncr5380: Forget aborted commands
The list structures and related logic used in the NCR5380 driver mean that
a command cannot be queued twice (i.e. can't appear on more than one queue
and can't appear on the same queue more than once).
The abort handler must forget the command so that the mid-layer can re-use
it. E.g. the ML may send it back to the LLD via scsi_eh_get_sense().
Fix this and also fix two error paths, so that commands get forgotten iff
completed.
Fixes:
8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler")
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:06 +0000 (10:07 +1100)]
ncr5380: Dont re-enter NCR5380_select()
Calling NCR5380_select() from the abort handler causes various problems.
Firstly, it means potentially re-entering NCR5380_select(). Secondly, it
means that the lock is released, which permits the EH handlers to be
re-entered. The combination results in crashes. Don't do it.
Fixes:
8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler")
Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:05 +0000 (10:07 +1100)]
ncr5380: Dont release lock for PIO transfer
The calls to NCR5380_transfer_pio() for DATA IN and DATA OUT phases will
modify cmd->SCp.this_residual, cmd->SCp.ptr and cmd->SCp.buffer. That
works as long as EH does not intervene, which became possible in
atari_NCR5380.c when I changed the locking to bring it closer to
NCR5380.c.
If error recovery aborts the command, the scsi_cmnd in question and its
buffer will be returned to the mid-layer. So the transfer has to cease,
but it can't be stopped by the initiator because the target controls the
bus phase.
The problem does not arise if the lock is not released. That was fine for
atari_scsi, because it implements DMA. For the other drivers, we have to
release the lock and re-enable interrupts for long PIO data transfers.
The solution is to split the transfer into small chunks. In between chunks
the main loop releases the lock and re-enables interrupts. Thus interrupts
can be serviced and eh_bus_reset_handler can intervene if need be.
This fixes an oops in NCR5380_transfer_pio() that can happen when the EH
abort handler is invoked during DATA IN or DATA OUT phase.
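The chunking idea can be modelled in user space as follows. This is only
a sketch: DEMO_CHUNK, struct demo_xfer and the simulated "unlock" window
are hypothetical, and the abort is injected by a counter rather than by a
real error handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CHUNK 4 /* hypothetical chunk size in bytes */

struct demo_xfer {
	const unsigned char *src;
	unsigned char *dst;
	size_t residual; /* bytes still to move */
	bool aborted;    /* set by the simulated error handler */
	int lock_drops;  /* how many times the lock was released */
	int abort_after; /* abort after this many drops; negative = never */
};

/* Simulated window in which interrupts and error handling may run. */
void demo_unlock_relock(struct demo_xfer *x)
{
	x->lock_drops++;
	if (x->abort_after >= 0 && x->lock_drops >= x->abort_after)
		x->aborted = true;
}

/* Move data in small chunks; returns the number of bytes transferred. */
size_t demo_transfer(struct demo_xfer *x)
{
	size_t done = 0;

	assert(x != NULL);
	while (x->residual && !x->aborted) {
		size_t n = x->residual < DEMO_CHUNK ? x->residual : DEMO_CHUNK;

		/* Lock held: the buffer cannot be taken away mid-chunk. */
		memcpy(x->dst, x->src, n);
		x->src += n;
		x->dst += n;
		x->residual -= n;
		done += n;

		/* Between chunks: let interrupts and error handling run. */
		demo_unlock_relock(x);
	}
	return done;
}
```

The transfer state is only touched while the lock is held, and an abort
is noticed at the next chunk boundary instead of in the middle of a copy.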
Fixes:
11d2f63b9cf5 ("ncr5380: Change instance->host_lock to hostdata->lock")
Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Finn Thain [Mon, 22 Feb 2016 23:07:04 +0000 (10:07 +1100)]
ncr5380: Correctly clear command pointers and lists after bus reset
Commands subject to exception handling are to be returned to the scsi
mid-layer. Make sure that the various command pointers and command lists
in the low-level driver are correctly cleansed of affected commands.
This fixes some bugs that I accidentally introduced in v4.5-rc1 including
the removal of INIT_LIST_HEAD for the 'autosense' and 'disconnected'
command lists, and the possible NULL pointer dereference in
NCR5380_bus_reset() that was reported by Dan Carpenter.
hostdata->sensing may also point to an affected command so this pointer
also has to be cleared. The abort handler calls complete_cmd() to take
care of this; let's have the bus reset handler do the same.
The issue queue may also contain an affected command. If so, remove it.
This also follows the abort handler logic.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes:
62717f537e1b ("ncr5380: Implement new eh_bus_reset_handler")
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Cc: <stable@vger.kernel.org> # 4.5
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Nicholas Krause [Sat, 27 Feb 2016 17:43:25 +0000 (12:43 -0500)]
be2iscsi:Add missing error check in beiscsi_eeh_resume
The call to hwi_init_controller() in beiscsi_eeh_resume() can fail, but
its return value was never checked. Add the missing error check and an
error path so that this nonrecoverable failure is handled gracefully in
beiscsi_eeh_resume().
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Usha Ketineni [Mon, 29 Feb 2016 11:36:52 +0000 (03:36 -0800)]
fcoe: fix reset of fip selection time.
Do not reset fip selection time for every advertisement
in fcoe_ctlr_recv_adv() but set it only once for the first
validated FCF. Otherwise FCF selection won't happen when the
advertisements consistently arrive with sub FCOE_CTLR_START_DELAY
periodicity.
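A toy model of the timing problem, with every name hypothetical
(struct demo_ctlr, DEMO_START_DELAY, the arm_once flag) and plain tick
counters instead of jiffies, so wraparound is deliberately ignored:

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_START_DELAY 10UL /* hypothetical delay, in ticks */

struct demo_ctlr {
	unsigned long sel_time; /* 0 means "no selection scheduled" */
};

/* Called for every validated advertisement. */
void demo_recv_adv(struct demo_ctlr *c, unsigned long now, bool arm_once)
{
	/* arm_once == false models restarting the timer on every call. */
	if (!arm_once || !c->sel_time)
		c->sel_time = now + DEMO_START_DELAY;
}

bool demo_selection_due(const struct demo_ctlr *c, unsigned long now)
{
	return c->sel_time && now >= c->sel_time;
}

/* Feed one advertisement every `period` ticks; report whether selection fires. */
bool demo_run(unsigned long period, unsigned long ticks, bool arm_once)
{
	struct demo_ctlr c = { 0 };

	assert(period > 0);
	for (unsigned long now = 1; now <= ticks; now++) {
		if (demo_selection_due(&c, now))
			return true;
		if (now % period == 0)
			demo_recv_adv(&c, now, arm_once);
	}
	return false;
}
```

With advertisements every 5 ticks and a delay of 10, the deadline keeps
moving away when it is reset on every advertisement, whereas arming it
once lets selection fire 10 ticks after the first advertisement.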
Tested-by: Narendra K <narendra_k@dell.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:15 +0000 (17:42 +0800)]
hisi_sas: update driver version to 1.3
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:14 +0000 (17:42 +0800)]
hisi_sas: add hisi_sas_slave_configure()
In high-datarate aging tests, it is found that the
SCSI framework can periodically issue lu resets as
some commands timeout.
Response TASK SET FULL and SAS_QUEUE_FULL may be
returned many times for the same command, causing the
timeouts.
The SAS_QUEUE_FULL errors come from
TRANS_TX_CREDIT_TIMEOUT_ERR, TRANS_TX_CLOSE_NORMAL_ERR,
and TRANS_TX_ERR_FRAME_TXED errors. They do not mean
that the queue is full in the host, but rather it is
equivalent to meaning the queue is full for the sdev.
To overcome this, the queue depth for the sdev is
reduced to 64 (from 256, set in sas_slave_configure()).
Normally error code SAS_QUEUE_FULL will result in the
sdev queue depth falling, but it falls too slowly during
high-datarate tests and commands time out before it
has fallen to an adequate level from the original value.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:13 +0000 (17:42 +0800)]
hisi_sas: use slot abort in v2 hw
When TRANS_TX_ERR_FRAME_TXED error occurs in
a slot, the command should be re-attempted.
This error is equivalent to meaning that the queue
is full in the sdev (and not the host).
A superfluous debug statement is also removed in the
slot complete handler.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:12 +0000 (17:42 +0800)]
hisi_sas: use slot abort in v1 hw
When a TRANS_TX_CREDIT_TIMEOUT_ERR or
TRANS_TX_CLOSE_NORMAL_ERR error occurs in
a slot, the command should be re-attempted.
Either error is equivalent to meaning that the queue
is full in the sdev (and not the host).
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:11 +0000 (17:42 +0800)]
hisi_sas: add hisi_sas_slot_abort()
Add a function to abort a slot (task) in the target
device and then cleanup and complete the task.
The function is called from work queue context as
it cannot be called from the context where it is
triggered (interrupt).
Flag hisi_sas_slot.abort is added as the flag used
in the slot error handler to indicate whether the
slot needs to be aborted in the sdev prior to
cleanup and finish.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Thu, 25 Feb 2016 09:42:10 +0000 (17:42 +0800)]
hisi_sas: change tmf func complete check
In hisi_sas_exec_internal_tmf_task(), the check for
SAM_STAT_GOOD is replaced with
TMF_RESP_FUNC_COMPLETE, which is a genuine tmf
response code.
SAM_STAT_GOOD and TMF_RESP_FUNC_COMPLETE have the
same value, so this is why it worked before.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alison Schofield [Thu, 18 Feb 2016 05:29:34 +0000 (21:29 -0800)]
gdth: replace struct timeval with ktime_get_real_seconds()
struct timeval will overflow on 32-bit systems in y2038 and is being
removed from the kernel. Replace the use of struct timeval and
do_gettimeofday() with ktime_get_real_seconds() which provides a 64-bit
seconds value and is y2038 safe.
gdth driver requires changes in two areas:
1) gdth_store_event() loads two u32 timestamp fields for ioctl GDTIOCTL_EVENT
These timestamp fields are part of struct gdth_evt_str used for passing
event data to userspace. At the first instance of an event we do
(first_stamp=last_stamp="current time"). If that same event repeats,
we do (last_stamp="current time") AND increment same_count to indicate
how many times the event has repeated since first_stamp.
This patch replaces the use of timeval and do_gettimeofday() with
ktime_get_real_seconds() cast to u32 to extend the timestamp fields
to y2106.
Beyond y2106, the userspace tools (ie. RAID controller monitors) can
work around the time rollover and this driver would still not need to
change.
Alternative: The alternative approach is to introduce a new ioctl in gdth
with the u32 time fields defined as u64. This would require userspace
changes now, but not in y2106.
2) gdth_show_info() calculates elapsed time using u32 first_stamp
It is adding events with timestamps to a seq_file. Timestamps are
calculated as the "current time" minus the first_stamp.
This patch replaces the use of timeval and do_gettimeofday() with
ktime_get_real_seconds() cast to u32 to calculate the timestamp.
This elapsed time calculation is safe even when the time wraps (beyond
y2106) due to how unsigned subtraction works. A comment has been added
to the code to indicate this safety.
Alternative: This piece itself doesn't warrant an alternative, but
if we do introduce a new structure & ioctl with u64 timestamps, this
would change accordingly.
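The wraparound argument in item 2 can be checked with a few lines of C.
The helper name demo_elapsed() is hypothetical; it only models a 64-bit
seconds value being truncated to 32 bits and compared against a stored
u32 timestamp.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Elapsed seconds between "now" (64-bit) and a stored 32-bit timestamp.
 * Unsigned subtraction is modulo 2^32, so the result stays correct across
 * one wrap as long as the true interval is shorter than 2^32 seconds.
 */
uint32_t demo_elapsed(uint64_t now_seconds, uint32_t first_stamp)
{
	return (uint32_t)now_seconds - first_stamp;
}
```

For example, a first_stamp of 0xFFFFFFF0 (16 seconds before the 32-bit
value wraps) and a current time of 2^32 + 16 truncate to 0x10, and
0x10 - 0xFFFFFFF0 evaluates to 32 modulo 2^32, which is the true
interval.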
Signed-off-by: Alison Schofield <amsfield22@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sudip Mukherjee [Wed, 24 Feb 2016 11:21:28 +0000 (16:51 +0530)]
osd: remove deadcode
The variable is_ver1 is always true and so OSD_CAP_LEN can never be
used.
Reported by Coverity.
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Boaz harrosh <ooo@elecrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sudip Mukherjee [Wed, 24 Feb 2016 10:57:11 +0000 (16:27 +0530)]
imm: check parport_claim
parport_claim() can fail and we should be checking if we were able to
claim the port.
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Charles [Mon, 22 Feb 2016 12:07:09 +0000 (20:07 +0800)]
stex: Add S3/S4 support
Add S3/S4 support by adding .suspend and .resume functions to the
pci_driver. In the .suspend handler, the driver sends the S3/S4 signal
to the device.
Signed-off-by: Charles Chiou <charles.chiou@tw.promise.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Charles [Mon, 22 Feb 2016 12:04:25 +0000 (20:04 +0800)]
stex: Add hotplug support
1. Add hotplug support. Pegasus supports surprise removal. To this end, I
use the return_abnormal_state function to return DID_NO_CONNECT for all
commands that are sent to the driver.
2. Remove stex_hba_stop in stex_remove because we cannot send command to
device after hotplug.
3. Add new device status: MU_STATE_STOP, MU_STATE_NOCONNECT,
MU_STATE_STOP. MU_STATE_STOP is currently not referenced.
MU_STATE_NOCONNECT indicates that the device has been unplugged from
the host.
4. Use return_abnormal_function() to substitute part of code in
stex_do_reset.
Signed-off-by: Charles Chiou <charles.chiou@tw.promise.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Charles [Mon, 22 Feb 2016 12:02:02 +0000 (20:02 +0800)]
stex: Support to Pegasus series.
Pegasus is a high performance hardware RAID solution designed to unleash
the raw power of Thunderbolt technology.
1. Add code to distinguish SuperTrack and Pegasus series by sub device ID.
It should support backward compatibility.
2. Change the driver version.
Signed-off-by: Charles Chiou <charles.chiou@tw.promise.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:21:44 +0000 (15:21 -0600)]
hpsa: update MAINTAINERS with new e-mail
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:46 +0000 (15:16 -0600)]
hpsa: update copyright information
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:40 +0000 (15:16 -0600)]
hpsa: remove function definition for sanitize_inquiry_string
This patch depends on patch
- commit
ac10a3e4ed64
("Export function scsi_scan.c:sanitize_inquiry_string")
Suggested-by: Hannes Reinecke <hare@suse.de>
Suggested-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:34 +0000 (15:16 -0600)]
hpsa: check for a null phys_disk pointer in ioaccel2 path
An oops can occur when submitting ioaccel2 commands when the phys_disk
pointer is NULL in hpsa_scsi_ioaccel_raid_map. Happens when there are
configuration changes during I/O operations.
If the phys_disk pointer is NULL, send the command down the RAID path.
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:28 +0000 (15:16 -0600)]
hpsa: correct abort tmf for hba devices
Aborts were not being sent down to HBA devices
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:22 +0000 (15:16 -0600)]
hpsa: correct lun data caching bitmap definition
The bitmap was changed after this definition was added to the
driver. Correcting the bitmap definition.
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:15 +0000 (15:16 -0600)]
hpsa: add SMR drive support
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 23 Feb 2016 21:16:09 +0000 (15:16 -0600)]
hpsa: do not get enclosure info for external devices
Stop annoying "Error, could not get enclosure information"
messages.
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Justin Lindley <justin.lindley@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:20 +0000 (09:17 +0100)]
scsi_dh_alua: Update version to 2.0
[mkp: Fixed merge due to patches 20-22 of series being postponed]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:16 +0000 (09:17 +0100)]
scsi_dh: add 'rescan' callback
If a device needs to be rescanned the device_handler might need
to be rechecked, too.
So add a 'rescan' callback to the device handler and call it
upon scsi_rescan_device(). The rescan callback will be invoked
from the Unit Attention handling of ASC/ASCQ 3F 03
(INQUIRY DATA HAS CHANGED).
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:15 +0000 (09:17 +0100)]
scsi_dh_alua: Send TEST UNIT READY to poll for transitioning
Sending a 'REPORT TARGET PORT GROUPS' command is a costly operation,
as the array has to gather information about all ports.
So instead of using RTPG to poll for a status update when a port
is in the transitioning state, we should send a TEST UNIT READY and
wait for the sense code to report success.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
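The polling idea in the commit above can be sketched in userspace C as follows. This is a simplified model, not the driver code: tur_result, tur_fn, poll_stats and poll_transition() are hypothetical names, the poll limit is an assumption of the sketch, and so is the choice to issue exactly one RTPG once TEST UNIT READY stops reporting a transition.

```c
/* Outcome of one (modelled) TEST UNIT READY command. */
enum tur_result { TUR_READY, TUR_TRANSITIONING };

/* Hypothetical hook that sends one TEST UNIT READY and reports the result. */
typedef enum tur_result (*tur_fn)(void *ctx);

struct poll_stats {
    unsigned int tur_sent;  /* cheap commands used for polling */
    unsigned int rtpg_sent; /* costly commands, at most one in this model */
    int timed_out;          /* set when max_polls was exhausted */
};

/*
 * Poll with TEST UNIT READY while the port is transitioning and fall back
 * to a single RTPG only after the device reports ready.
 */
struct poll_stats poll_transition(tur_fn send_tur, void *ctx,
                                  unsigned int max_polls)
{
    struct poll_stats st = { 0, 0, 0 };

    while (st.tur_sent < max_polls) {
        st.tur_sent++;
        if (send_tur(ctx) == TUR_READY) {
            st.rtpg_sent = 1; /* read the final state once */
            return st;
        }
    }
    st.timed_out = 1;
    return st;
}
```

In this model the number of costly RTPG commands stays at one no matter how long the transition lasts; only the count of TEST UNIT READY commands grows.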
Hannes Reinecke [Fri, 19 Feb 2016 08:17:14 +0000 (09:17 +0100)]
scsi_dh_alua: update all port states
When we read in the target port group state we should
update all affected port groups; otherwise we risk
getting out of sync.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:13 +0000 (09:17 +0100)]
scsi_dh_alua: Recheck state on unit attention
When we receive a unit attention code of 'ALUA state changed'
we should recheck the state, as it might be due to an implicit
ALUA state transition. This allows us to return NEEDS_RETRY
instead of ADD_TO_MLQUEUE, so that the retries can be terminated
after a certain time.
At the same time a workqueue item might already be queued, which
should be started immediately to avoid any delays.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:12 +0000 (09:17 +0100)]
scsi_dh_alua: Add new blacklist flag 'BLIST_SYNC_ALUA'
Add a new blacklist flag BLIST_SYNC_ALUA to instruct the
alua device handler to use synchronous command submission
for ALUA commands.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:11 +0000 (09:17 +0100)]
scsi_dh_alua: Allow workqueue to run synchronously
Some arrays may only be capable of handling one STPG at a time,
so this patch adds a single-threaded workqueue for STPGs to be
submitted synchronously.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:10 +0000 (09:17 +0100)]
scsi_dh_alua: Use workqueue for RTPG
The current ALUA device_handler has two drawbacks:
- We're sending a 'SET TARGET PORT GROUPS' command to every LUN,
disregarding the fact that several LUNs might be in a port group
and will be automatically switched whenever _any_ LUN within
that port group receives the command.
- Whenever a LUN is in 'transitioning' mode we cannot block I/O
to that LUN; instead the controller has to abort the command.
This leads to increased traffic across the wire and heavy load
on the controller during switchover.
With this patch the RTPG handling is moved to a per-portgroup
workqueue. This reduces the number of 'REPORT TARGET PORT GROUPS'
and 'SET TARGET PORT GROUPS' commands sent to the controller, as we're
now sending them per port group and not per device as previously.
It also allows us to block I/O to any LUN / port group found to be
in 'transitioning' ALUA mode, as the workqueue item will be requeued
until the controller moves out of transitioning.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
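To make the saving claimed in the commit above concrete, the following userspace sketch compares the number of commands needed per LUN with the number needed per port group. It is only an illustrative model: lun_model, commands_per_lun() and commands_per_group() are hypothetical names, and identifying a port group by a plain integer is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical model: a LUN only records which port group it belongs to. */
struct lun_model {
    int lun;
    int group_id;
};

/* Old scheme: one command for every LUN. */
size_t commands_per_lun(size_t n)
{
    return n;
}

/*
 * New scheme: one command per distinct port group. A group is counted
 * the first time one of its LUNs is seen.
 */
size_t commands_per_group(const struct lun_model *luns, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        for (size_t j = 0; j < i; j++) {
            if (luns[j].group_id == luns[i].group_id) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            count++;
    }
    return count;
}
```

For six LUNs spread over two port groups, this model issues two commands instead of six.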
Hannes Reinecke [Fri, 19 Feb 2016 08:17:09 +0000 (09:17 +0100)]
scsi_dh_alua: remove 'rel_port' from alua_dh_data structure
The 'relative port' field is not used, and might get stale when
the port group changes. So remove the field altogether.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:08 +0000 (09:17 +0100)]
scsi_dh_alua: move optimize_stpg evaluation
When the optimize_stpg module option is set we should just set it
once during port_group allocation. Doing so allows us to override
it later with device specific settings.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:07 +0000 (09:17 +0100)]
revert commit a8e5a2d593cb ("[SCSI] scsi_dh_alua: ALUA handler attach should succeed while TPG is transitioning")
This reverts commit a8e5a2d593cbfccf530c3382c2c328d2edaa7b66.
Obsoleted by the next patch.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:06 +0000 (09:17 +0100)]
scsi_dh_alua: simplify alua_initialize()
Rework alua_check_vpd() to use scsi_vpd_get_tpg()
and move the port group selection into the function, too.
With that we can simplify alua_initialize() to just
call alua_check_tpgs() and alua_check_vpd().
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:05 +0000 (09:17 +0100)]
scsi_dh_alua: use unique device id
Use scsi_vpd_lun_id() to assign a unique device identification
to the alua port group structure.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:04 +0000 (09:17 +0100)]
scsi_dh_alua: Use separate alua_port_group structure
The port group needs to be a separate structure as several
LUNs might belong to the same group.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:03 +0000 (09:17 +0100)]
scsi_dh_alua: allocate RTPG buffer separately
The RTPG buffer will only be evaluated within alua_rtpg(),
so we can allocate it locally there and avoid having to
put it into the global structure.
Reviewed-by: Ewan Milne <emilne@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:02 +0000 (09:17 +0100)]
scsi_dh_alua: switch to scsi_execute_req_flags()
All commands are issued synchronously, so no need to open-code
scsi_execute_req_flags() anymore. And we can get rid of the
static sense code structure element. scsi_execute_req_flags()
will be setting REQ_QUIET and REQ_PREEMPT, but that is
perfectly fine as we're evaluating and logging any errors
ourselves and we really need to send the command even if
the device is quiesced.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:01 +0000 (09:17 +0100)]
scsi_dh_alua: call alua_rtpg() if stpg fails
If the call to SET TARGET PORT GROUPS fails we have no idea what
state the array is left in, so we need to issue a call to
REPORT TARGET PORT GROUPS in these cases.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:17:00 +0000 (09:17 +0100)]
scsi_dh_alua: Make stpg synchronous
The 'activate_complete' function needs to be executed after
stpg has finished, so we might as well execute stpg synchronously
and call the function directly.
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 19 Feb 2016 08:16:59 +0000 (09:16 +0100)]
scsi_dh_alua: separate out alua_stpg()
Separate out the SET TARGET PORT GROUPS functionality into a
separate function, alua_stpg().
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>