Robert Love [Sat, 9 Oct 2010 00:12:46 +0000 (17:12 -0700)]
[SCSI] fcoe: Fix broken NPIV with correction to MAC validation
A previous patch attempted to validate the destination
MAC address of a FCoE frame by checking that MAC
address against the received port's MAC address. The
implementation seems fine on the surface, but any
VN_Ports added using the NPIV feature will have their
own MAC addresses and these MACs were not being checked,
which prevented any NPIV VN_Ports from receiving frames.
In other words, the following patch has broken NPIV.
519e5135e2537c9dbc1cbcc0891b0a936ff5dcd2
[SCSI] fcoe: adds src and dest mac address
checking for fcoe frames
Part of the offending patch is correct, but the part
that broke NPIV was attempting to satisfy FC-BB-5
section D.5, 2.1-
(discard frames that) "contain a destination MAC
address/destination N_Port_ID pair that was not
assigned by an FCF to one of the VN_Ports on the ENode"
The language does _not_ say to compare the destination
FC-MAP/destination N_Port_ID, but instead to compare
the destination MAC address/destination N_Port_ID.
From the FC-BB-5 specification,
"A properly formed FPMA is one in which the 24 most
significant bits equal the Fabric’s FC-MAP value and
the least significant 24 bits equal the N_Port_ID
assigned to the VN_Port by the FCF."
This means that we need to compare the FC Frame's
destination FCID against the embedded FCID in the
destination MAC address. This patch checks the lower
24 bits of the destination MAC address against
destination FCID in the Fibre Channel frame.
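As a rough illustration of the comparison described above (not the code from the patch), the check can be sketched as a small standalone helper. The names fpma_matches_d_id and be24_to_cpu are hypothetical and chosen only for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address in bytes */

/* Pack three big-endian bytes into a 24-bit value. */
static uint32_t be24_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/*
 * Hypothetical helper: return true when the least significant 24 bits
 * of the destination MAC equal the frame's destination FCID (D_ID).
 * The upper 24 bits (the FC-MAP) are deliberately not examined here.
 */
static bool fpma_matches_d_id(const uint8_t dest_mac[ETH_ALEN], uint32_t d_id)
{
	assert(dest_mac != NULL);
	return be24_to_cpu(&dest_mac[3]) == (d_id & 0xffffffu);
}
```

A frame whose destination MAC ends in 01:02:03 would pass for D_ID 0x010203 and be discarded for any other D_ID, regardless of which VN_Port on the ENode owns that MAC.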
For MAC validation the first line of defense is the
hardware MAC filtering. Each VN_Port will have a
unicast MAC address added to the hardware's
filtering table. The Ethernet driver should drop any
frames not destined for a programmed MAC. This patch
adds a second line of defense that very specifically
compares an element in the FC frame against an element
in the Ethernet header, which is appropriate for the
FCoE layer.
Many alternative approaches were considered, including
a LLD callback from libfc. The second most reasonable
approach seemed to be walking the list of NPIV ports
and checking each of their MAC addresses against the
destination MAC address of the received frame. The
problem with this approach is that performance would
likely suffer as more NPIV ports are added to the
system, since every received frame would need to
walk this list, comparing each entry's MAC.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Kiran Patil [Sat, 9 Oct 2010 00:12:41 +0000 (17:12 -0700)]
[SCSI] libfcoe: VN2VN connection setup causing stack memory corruption.
Fix: When a FIP frame is received, fcoe_ctlr_vn_recv calls
fcoe_ctlr_vn_parse, which does a memset on the address &buf.rdata, and this
leads to memory corruption. The code was trying to treat "buf" as a struct, but
it was defined as a union. The fix is to change "buf" from a union to a struct
in fcoe_ctlr_vn_recv.
Technical Details: N/A
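The difference between the two layouts can be sketched in isolation. This is a simplified illustration, not the libfcoe code: the member types and sizes are made up, and it assumes the parser touches both members as if they were laid out one after the other, which is only safe when "buf" is a struct:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two objects that "buf" must hold. */
struct part_a { unsigned char bytes[32]; };
struct part_b { unsigned char bytes[16]; };

/* In a union the members overlap: the size is that of the largest member. */
union buf_union {
	struct part_a a;
	struct part_b b;
};

/* In a struct each member has its own storage: the sizes add up. */
struct buf_struct {
	struct part_a a;
	struct part_b b;
};

/* Bytes touched by code that treats b as stored right after a. */
static size_t bytes_needed(void)
{
	return sizeof(struct part_a) + sizeof(struct part_b);
}

/* Would clearing both parts stay inside a buffer of buf_size bytes? */
static bool clear_fits(size_t buf_size)
{
	return bytes_needed() <= buf_size;
}
```

With these made-up sizes, clear_fits(sizeof(union buf_union)) is false and clear_fits(sizeof(struct buf_struct)) is true, which is the kind of mismatch that overwrites neighbouring stack memory.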
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Acked-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Bhanu Prakash Gollapudi [Sat, 9 Oct 2010 00:12:36 +0000 (17:12 -0700)]
[SCSI] libfc: Do not let disc work cancel itself
When the number of NPIV ports created is greater than the number of
xids allocated per pool (e.g., creating 255 NPIV ports on a
system with nr_cpu_ids of 32, with each pool containing 128
xids), generating a link event on the switch port (e.g.,
shutdown/no shutdown) causes a hang
with the following stack trace.
Call Trace:
schedule_timeout+0x19d/0x230
wait_for_common+0xc0/0x170
__cancel_work_timer+0xcf/0x1b0
fc_disc_stop+0x16/0x30 [libfc]
fc_lport_reset_locked+0x47/0x90 [libfc]
fc_lport_enter_reset+0x67/0xe0 [libfc]
fc_lport_disc_callback+0xbc/0xe0 [libfc]
fc_disc_done+0xa8/0xf0 [libfc]
fc_disc_timeout+0x29/0x40 [libfc]
run_workqueue+0xb8/0x140
worker_thread+0x96/0x110
kthread+0x96/0xa0
child_rip+0xa/0x20
Fix is to not cancel the disc_work if discovery is already
stopped, thus allowing lport state machine to restart and try
discovery again.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Acked-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vasu Dev [Sat, 9 Oct 2010 00:12:31 +0000 (17:12 -0700)]
[SCSI] libfc: possible race could panic system due to NULL fsp->cmd
This is unlikely, but if it is hit it causes a panic
due to a NULL cmd pointer. So far only one instance has been seen,
recently with ESX, though the bug was introduced long ago by this commit:
commit c1ecb90a66c5afc7cc5c9349f9c3714eef4a5cfb
Author: Chris Leech <christopher.leech@intel.com>
Date: Thu Dec 10 09:59:26 2009 -0800
[SCSI] libfc: reduce hold time on SCSI host lock
Currently fsp->cmd is set to NULL without holding scsi_queue_lock,
before the packet is dequeued from scsi_pkt_queue. As a result,
fc_fcp_cleanup_each_cmd can see a NULL fsp->cmd for a cmd that
completes after fc_fcp_cleanup_each_cmd has taken its
reference. There is no need to set fsp->cmd to NULL, as this is also
protected by fc_fcp_lock_pkt(); for the race above,
fc_fcp_lock_pkt() in fc_fcp_cleanup_each_cmd() will fail
because that cmd is already done.
Mike mentioned the same issue at
http://www.open-fcoe.org/pipermail/devel/2010-September/010533.html
Similarly moved sc_cmd->SCp.ptr = NULL under scsi_queue_lock so
that scsi abort error handler won't abort on completed cmds.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vasu Dev [Sat, 9 Oct 2010 00:12:25 +0000 (17:12 -0700)]
[SCSI] fcoe: set default FIP mode as FIP_MODE_FABRIC
The current FIP_MODE_AUTO mode sometimes falls back to non-FIP
mode while the DCB link is still getting ready in fabric mode with
its peer switch; it falls back after a few libfc FLOGI retries.
That is not what we want when working with FIP-enabled
switches in FABRIC mode, so set the default to FIP_MODE_FABRIC,
as discussed and agreed earlier in this mail thread:
http://www.open-fcoe.org/pipermail/devel/2010-August/010511.html
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vasu Dev [Sat, 9 Oct 2010 00:12:20 +0000 (17:12 -0700)]
[SCSI] libfc: adds flogi retry in case DID is zero in RJT
Sometimes a switch in NPV mode rejects the FLOGI request with a DID
of zero. In that case the FLOGI is not tried again and the port
remains offline. This patch checks for a non-zero DID only
together with an ACC response, so that an RJT with DID=0 goes
through the FLOGI retry path and the FLOGI can succeed on the next try.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vasu Dev [Sat, 9 Oct 2010 00:12:15 +0000 (17:12 -0700)]
[SCSI] libfc: use DID_TRANSPORT_DISRUPTED while lport not ready
This is per Mike Christie's feedback: in this case IO
could get retried for tape devices, so DID_REQUEUE
cannot be used. More details are in this thread:
http://marc.info/?l=linux-scsi&m=127970522630136&w=2
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Mike Christie [Sat, 9 Oct 2010 00:12:10 +0000 (17:12 -0700)]
[SCSI] libfc: fix setting of rport dev loss
There does not seem to be a reason why libfc adds a 5
second delay to the user requested value for the dev loss
tmo. There also does not seem to be a reason to allow
setting it to 0 (or really close).
This patch removes the extra 5 sec delay, and for 0 it
sets it to 1 like other fc drivers. We should actually
be able to set it to 0 since the queue_delayed_work API
will just call queue_work, but other drivers set it to 1 in
that case.
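The resulting rule is small enough to sketch as a pure function. This is an illustration only; effective_dev_loss_tmo is a hypothetical name, not a libfc symbol:

```c
/*
 * Hypothetical helper: map the user-requested dev loss timeout
 * (in seconds) to the value that would actually be stored.
 */
static unsigned int effective_dev_loss_tmo(unsigned int requested)
{
	/* No extra 5-second padding; a request of 0 is bumped to 1. */
	return requested ? requested : 1;
}
```

Under this rule a request of 30 stays 30, where the old behaviour described above would have used 35.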
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Dan Carpenter [Fri, 8 Oct 2010 07:03:07 +0000 (09:03 +0200)]
[SCSI] gdth: integer overflow in ioctl
gdth_ioctl_alloc() takes the size variable as an int.
copy_from_user() takes the size variable as an unsigned long.
gen.data_len and gen.sense_len are unsigned longs.
On x86_64 longs are 64 bit and ints are 32 bit.
We could pass in a very large number and the allocation would truncate
the size to 32 bits and allocate a small buffer. Then when we do the
copy_from_user(), it would result in a memory corruption.
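One way to guard against this class of bug, sketched here in userspace C, is to validate the lengths while they are still unsigned long and only then narrow them. The helper names are hypothetical and this is not necessarily how the gdth patch does it:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Would this length survive being passed through an int parameter? */
static bool fits_in_int(unsigned long len)
{
	return len <= (unsigned long)INT_MAX;
}

/* Hypothetical check for a request made of two user-supplied lengths. */
static bool request_len_ok(unsigned long data_len, unsigned long sense_len)
{
	/* Check each term first so the addition below cannot wrap. */
	if (!fits_in_int(data_len) || !fits_in_int(sense_len))
		return false;
	return fits_in_int(data_len + sense_len);
}

/* Narrow a length that the caller has already validated. */
static int len_to_int(unsigned long len)
{
	assert(fits_in_int(len));
	return (int)len;
}
```

The important property is that the size used for the allocation and the size used for the copy are derived from the same validated value.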
CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Christof Schmitt [Wed, 6 Oct 2010 11:19:44 +0000 (13:19 +0200)]
[SCSI] Fix race when removing SCSI devices
Removing SCSI devices through
echo 1 > /sys/bus/scsi/devices/ ... /delete
while the FC transport class removes the SCSI target can lead to an
oops:
Unable to handle kernel pointer dereference at virtual kernel address 00000000b6815000
Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: sunrpc qeth_l3 binfmt_misc dm_multipath scsi_dh dm_mod ipv6 qeth ccwgroup [last unloaded: scsi_wait_scan]
CPU: 1 Not tainted 2.6.35.5-45.x.20100924-s390xdefault #1
Process fc_wq_0 (pid: 861, task: 00000000b7331240, ksp: 00000000b735bac0)
Krnl PSW : 0704200180000000 00000000003ff6e4 (__scsi_remove_device+0x24/0xd0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000000 00000000b6815000 00000000bc24a8c0
00000000003ff7c8 000000000056dbb8 0000000000000002 0000000000835d80
ffffffff00000000 0000000000001000 00000000b6815000 00000000bc24a7f0
00000000b68151a0 00000000b6815000 00000000b735bc20 00000000b735bbf8
Krnl Code: 00000000003ff6d6: a7840001 brc 8,3ff6d8
00000000003ff6da: a7fbffd8 aghi %r15,-40
00000000003ff6de: e3e0f0980024 stg %r14,152(%r15)
>00000000003ff6e4: e31021200004 lg %r1,288(%r2)
00000000003ff6ea: a71f0000 cghi %r1,0
00000000003ff6ee: a7a40011 brc 10,3ff710
00000000003ff6f2: a7390003 lghi %r3,3
00000000003ff6f6: c0e5ffffc8b1 brasl %r14,3f8858
Call Trace:
([<0000000000001000>] 0x1000)
[<00000000003ff7d2>] scsi_remove_device+0x42/0x54
[<00000000003ff8ba>] __scsi_remove_target+0xca/0xfc
[<00000000003ff99a>] __remove_child+0x3a/0x48
[<00000000003e3246>] device_for_each_child+0x72/0xbc
[<00000000003ff93a>] scsi_remove_target+0x4e/0x74
[<0000000000406586>] fc_rport_final_delete+0xb2/0x23c
[<000000000015d080>] worker_thread+0x200/0x344
[<000000000016330c>] kthread+0xa0/0xa8
[<0000000000106c1a>] kernel_thread_starter+0x6/0xc
[<0000000000106c14>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000003ff7cc>] scsi_remove_device+0x3c/0x54
The function __scsi_remove_target iterates through the SCSI devices on
the host, but it drops the host_lock before calling
scsi_remove_device. When the SCSI device is deleted from another
thread, the pointer to the SCSI device in scsi_remove_device can
become invalid. Fix this by getting a reference to the SCSI device
before dropping the host_lock to keep the SCSI device alive for the
call to scsi_remove_device.
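The pattern can be modelled with a toy reference count. The sketch below is single-threaded and only simulates the interleaving; struct obj, obj_get, obj_put and alive_when_used are hypothetical names, not SCSI midlayer functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a SCSI device. */
struct obj {
	int refcount;
	bool freed;
};

static void obj_get(struct obj *o)
{
	assert(!o->freed);
	o->refcount++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->freed = true; /* a real release function would free memory here */
}

/*
 * Simulate the remover: optionally pin the object while "locked", let a
 * concurrent deleter drop its reference while "unlocked", then report
 * whether the object is still alive at the point it would be used.
 */
static bool alive_when_used(struct obj *o, bool take_ref)
{
	bool alive;

	/* lock held: object found on the list */
	if (take_ref)
		obj_get(o);
	/* lock dropped: another thread deletes the device */
	obj_put(o);
	alive = !o->freed;
	/* the removal call would run here, then our own reference is dropped */
	if (take_ref)
		obj_put(o);
	return alive;
}
```

With take_ref set, the object outlives the concurrent deletion and is released only when the remover drops its own reference; without it, the object is already gone at the point of use.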
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Martin K. Petersen [Fri, 8 Oct 2010 05:36:24 +0000 (01:36 -0400)]
[SCSI] sd: Export effective protection mode in sysfs
Create a sysfs entry that reports the negotiated DIX/DIF protection mode
for a SCSI disk. This depends on the protection type the disk is
formatted with as well as the protection capabilities advertised by the
controller.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:51:21 +0000 (22:51 -0700)]
[SCSI] qla4xxx: Update driver version to 5.02.00-k4
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Mike Christie [Thu, 7 Oct 2010 05:51:17 +0000 (22:51 -0700)]
[SCSI] qla4xxx: grab hardware_lock in eh_abort before accessing srb
Grab hardware_lock in eh_abort before accessing the srb, to avoid a
race between command completion and taking a refcount on the srb.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:51:09 +0000 (22:51 -0700)]
[SCSI] qla4xxx: remove unwanted check for bad spd
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:50:56 +0000 (22:50 -0700)]
[SCSI] qla4xxx: update AER support for ISP82XX
* Cleanup qla4xxx_pci_mmio_enabled():
don't want to return PCI_ERS_NEED_RESET if firmware hung.
IDC will take care of it.
* Request irq after initialize_adapter() in qla82xx_error_recovery().
* Return all active commands from qla4xxx_pci_error_detected().
* Cleanup ql4_def.h
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Shyam Sundar [Thu, 7 Oct 2010 05:50:51 +0000 (22:50 -0700)]
[SCSI] qla4xxx: Clear the rom lock if the firmware died while holding it.
There is a possibility that the firmware dies while the rom
lock is held. The only way to recover from this condition is
to forcefully unlock.
Signed-off-by: Shyam Sundar <shyam.sundar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Shyam Sundar [Thu, 7 Oct 2010 05:50:29 +0000 (22:50 -0700)]
[SCSI] qla4xxx: use CRB Register for Request Queue in-pointer
Switching from doorbell mechanism to CRB register based
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Shyam Sundar <shyam.sundar@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Karen Higgins [Thu, 7 Oct 2010 05:50:21 +0000 (22:50 -0700)]
[SCSI] qla4xxx: dump mailbox registers on System Error
Signed-off-by: Karen Higgins <karen.higgins@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Shyam Sundar [Thu, 7 Oct 2010 05:49:40 +0000 (22:49 -0700)]
[SCSI] qla4xxx: Add support for 8130/8131 AENs.
AEN 8130 Corresponds to an event representing the insertion (detection)
of a transceiver. It also reports the type of the SFP+.
AEN 8131 corresponds to the removal of a transceiver.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Shyam Sundar <shyam.sundar@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Lalit Chandivade [Thu, 7 Oct 2010 05:49:32 +0000 (22:49 -0700)]
[SCSI] qla4xxx: Reset seconds_since_last_heartbeat correctly.
The seconds_since_last_heartbeat should be checked for consecutive
heartbeat checks. Currently it could happen that it gets set to
max (2 seconds) for non-consecutive heartbeat checks.
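The intended behaviour can be sketched as a small state machine. This is an illustration under assumptions: struct hb_state and hb_check are hypothetical names, and the check is assumed to run once per second with a limit of 2:

```c
#include <stdbool.h>

struct hb_state {
	unsigned int last_counter;   /* counter value seen at the previous check */
	unsigned int stalled_checks; /* consecutive checks with no change */
};

/* Returns true when the counter has not advanced for `limit` consecutive checks. */
static bool hb_check(struct hb_state *s, unsigned int counter, unsigned int limit)
{
	if (counter == s->last_counter) {
		s->stalled_checks++;
	} else {
		s->stalled_checks = 0; /* progress seen: start counting again */
		s->last_counter = counter;
	}
	return s->stalled_checks >= limit;
}
```

Resetting the count whenever the counter advances is what keeps two stalled checks separated by a healthy one from being treated as a firmware hang.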
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Nilesh Javali [Thu, 7 Oct 2010 05:49:20 +0000 (22:49 -0700)]
[SCSI] qla4xxx: On firmware hang do not wait for the outstanding commands to complete
Signed-off-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:49:08 +0000 (22:49 -0700)]
[SCSI] qla4xxx: free_irqs on failed initialize_adapter
Since interrupts are registered in start_firmware(load_risc) for 82xx,
free them if init_firmware fails.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Karen Higgins <karen.higgins@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:48:53 +0000 (22:48 -0700)]
[SCSI] qla4xxx: correct data type of sense_len in qla4xxx_status_cont_entry
change data type of sense_len from uint8_t to uint16_t
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:48:24 +0000 (22:48 -0700)]
[SCSI] qla4xxx: remove "ha->retry_reset_ha_cnt" from wait_for_hba_online
Remove "ha->retry_reset_ha_cnt" from wait_for_hba_online, as it is
initialized to zero at driver init time, so wait_for_hba_online()
could always return QLA_ERROR without waiting for the hba to
come online.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:48:07 +0000 (22:48 -0700)]
[SCSI] qla4xxx: honor return status of qla4xxx_hw_reset
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Vikas Chaudhary [Thu, 7 Oct 2010 05:47:48 +0000 (22:47 -0700)]
[SCSI] qla4xxx: Trivial cleanup
* cleanup function qla4xxx_recovery_timeout
- No need to wakeup dpc thread from function
qla4xxx_recovery_timeout() as we are not doing anything
in do_dpc() thread when wakeup from
qla4xxx_recovery_timeout()
* cleanup function qla4xxx_wait_for_hba_online
- Remove hard coded value from qla4xxx_wait_for_hba_online().
* cleanup function qla4xxx_start_firmware_from_flash
- display seconds
* cleanup function qla4_8xxx_load_risc
- Remove redundant code.
* cleanup function qla4xxx_get_firmware_status
- update debug statement
* cleanup function qla4_8xxx_try_start_fw
- update return status
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Mike Christie [Wed, 6 Oct 2010 08:10:59 +0000 (03:10 -0500)]
[SCSI] Fix regressions in scsi_internal_device_block
Deleting a SCSI device on a blocked fc_remote_port (before
fast_io_fail_tmo fires) results in a hanging thread:
STACK:
0 schedule+1108 [0x5cac48]
1 schedule_timeout+528 [0x5cb7fc]
2 wait_for_common+266 [0x5ca6be]
3 blk_execute_rq+160 [0x354054]
4 scsi_execute+324 [0x3b7ef4]
5 scsi_execute_req+162 [0x3b80ca]
6 sd_sync_cache+138 [0x3cf662]
7 sd_shutdown+138 [0x3cf91a]
8 sd_remove+112 [0x3cfe4c]
9 __device_release_driver+124 [0x3a08b8]
10 device_release_driver+60 [0x3a0a5c]
11 bus_remove_device+266 [0x39fa76]
12 device_del+340 [0x39d818]
13 __scsi_remove_device+204 [0x3bcc48]
14 scsi_remove_device+66 [0x3bcc8e]
15 sysfs_schedule_callback_work+50 [0x260d66]
16 worker_thread+622 [0x162326]
17 kthread+160 [0x1680b0]
18 kernel_thread_starter+6 [0x10aaea]
During the delete, the SCSI device is moved to SDEV_CANCEL. When
the FC transport class later calls scsi_target_unblock, this has no
effect, since scsi_internal_device_unblock ignores SCSI devices in this
state.
It looks like all these are regressions caused by:
5c10e63c943b4c67561ddc6bf61e01d4141f881f
[SCSI] limit state transitions in scsi_internal_device_unblock
Fix by rejecting offline and cancel in the state transition.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
[jejb: Original patch by Christof Schmitt, modified by Mike Christie]
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Christof Schmitt [Tue, 5 Oct 2010 15:12:55 +0000 (17:12 +0200)]
[SCSI] zfcp: Use correct length for FCP_RSP_INFO
Use the FCP_RSP_INFO length to correctly skip the FCP_RSP_INFO field.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Christof Schmitt [Tue, 5 Oct 2010 15:12:54 +0000 (17:12 +0200)]
[SCSI] zfcp: Call get_device on port before calling put_device
zfcp_unit_release calls put_device on the port. Ensure that get_device
has been called before possibly triggering the release function
through put_device or device_unregister.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Swen Schillig [Tue, 5 Oct 2010 15:12:53 +0000 (17:12 +0200)]
[SCSI] zfcp: Fix adapter activation on link down
If an exchange config is executed while the local link is down, the
request succeeds but the returned data is incomplete. Proceeding with
the adapter activation leads to unpredictable behaviour (e.g.
kernel panic) caused by invalid values. In such a scenario the
recommended ERP is to retry the action and wait for a link up event.
If the issue persists the activation has to fail.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Linus Torvalds [Sun, 24 Oct 2010 20:41:39 +0000 (13:41 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
Update broken web addresses in arch directory.
Update broken web addresses in the kernel.
Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget
Revert "Fix typo: configuation => configuration" partially
ida: document IDA_BITMAP_LONGS calculation
ext2: fix a typo on comment in ext2/inode.c
drivers/scsi: Remove unnecessary casts of private_data
drivers/s390: Remove unnecessary casts of private_data
net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data
drivers/infiniband: Remove unnecessary casts of private_data
drivers/gpu/drm: Remove unnecessary casts of private_data
kernel/pm_qos_params.c: Remove unnecessary casts of private_data
fs/ecryptfs: Remove unnecessary casts of private_data
fs/seq_file.c: Remove unnecessary casts of private_data
arm: uengine.c: remove C99 comments
arm: scoop.c: remove C99 comments
Fix typo configue => configure in comments
Fix typo: configuation => configuration
Fix typo interrest[ing|ed] => interest[ing|ed]
Fix various typos of valid in comments
...
Fix up trivial conflicts in:
drivers/char/ipmi/ipmi_si_intf.c
drivers/usb/gadget/rndis.c
net/irda/irnet/irnet_ppp.c
Linus Torvalds [Sun, 24 Oct 2010 20:06:57 +0000 (13:06 -0700)]
Merge branch 'devel' of git://git./linux/kernel/git/mchehab/edac
* 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits)
i7300_edac: Properly initialize per-csrow memory size
V4L/DVB: i7300_edac: better initialize page counts
MAINTAINERS: Add maintainer for i7300-edac driver
i7300-edac: CodingStyle cleanup
i7300_edac: Improve comments
i7300_edac: Cleanup: reorganize the file contents
i7300_edac: Properly detect channel on CE errors
i7300_edac: enrich FBD error info for corrected errors
i7300_edac: enrich FBD error info for fatal errors
i7300_edac: pre-allocate a buffer used to prepare err messages
i7300_edac: Fix MTR x4/x8 detection logic
i7300_edac: Make the debug messages coherent with the others
i7300_edac: Cleanup: remove get_error_info logic
i7300_edac: Add a code to cleanup error registers
i7300_edac: Add support for reporting FBD errors
i7300_edac: Properly detect the type of error correction
i7300_edac: Detect if the device is on single mode
i7300_edac: Adds detection for enhanced scrub mode on x8
i7300_edac: Clear the error bit after reading
i7300_edac: Add error detection code for global errors
...
Linus Torvalds [Sun, 24 Oct 2010 19:47:55 +0000 (12:47 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (27 commits)
SLUB: Fix memory hotplug with !NUMA
slub: Move functions to reduce #ifdefs
slub: Enable sysfs support for !CONFIG_SLUB_DEBUG
SLUB: Optimize slab_free() debug check
slub: Move NUMA-related functions under CONFIG_NUMA
slub: Add lock release annotation
slub: Fix signedness warnings
slub: extract common code to remove objects from partial list without locking
SLUB: Pass active and inactive redzone flags instead of boolean to debug functions
slub: reduce differences between SMP and NUMA
Revert "Slub: UP bandaid"
percpu: clear memory allocated with the km allocator
percpu: use percpu allocator on UP too
percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
vmalloc: pcpu_get/free_vm_areas() aren't needed on UP
SLUB: Fix merged slab cache names
Slub: UP bandaid
slub: fix SLUB_RESILIENCY_TEST for dynamic kmalloc caches
slub: Fix up missing kmalloc_cache -> kmem_cache_node case for memoryhotplug
slub: Add dummy functions for the !SLUB_DEBUG case
...
Linus Torvalds [Sun, 24 Oct 2010 19:47:25 +0000 (12:47 -0700)]
Merge branch 'kvm-updates/2.6.37' of git://git./virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
KVM: Fix signature of kvm_iommu_map_pages stub
KVM: MCE: Send SRAR SIGBUS directly
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
KVM: fix typo in copyright notice
KVM: Disable interrupts around get_kernel_ns()
KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
KVM: MMU: move access code parsing to FNAME(walk_addr) function
KVM: MMU: audit: check whether have unsync sps after root sync
KVM: MMU: audit: introduce audit_printk to cleanup audit code
KVM: MMU: audit: unregister audit tracepoints before module unloaded
KVM: MMU: audit: fix vcpu's spte walking
KVM: MMU: set access bit for direct mapping
KVM: MMU: cleanup for error mask set while walk guest page table
KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
KVM: x86: Fix constant type in kvm_get_time_scale
KVM: VMX: Add AX to list of registers clobbered by guest switch
KVM guest: Move a printk that's using the clock before it's ready
KVM: x86: TSC catchup mode
...
Linus Torvalds [Sun, 24 Oct 2010 19:46:24 +0000 (12:46 -0700)]
Merge branch 'i2c-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-viapro: Don't log nacks
i2c/pca954x: Remove __devinit and __devexit from probe and remove functions
MAINTAINERS: Add maintainer for PCA9541 I2C bus master selector driver
i2c/mux: Driver for PCA9541 I2C Master Selector
i2c: Optimize function i2c_detect()
i2c: Discard warning message on device instantiation from user-space
i2c-amd8111: Add proper error handling
i2c: Change to new flag variable
i2c: Remove unneeded inclusions of <linux/i2c-id.h>
i2c: Let i2c_parent_is_i2c_adapter return the parent adapter
i2c: Simplify i2c_parent_is_i2c_adapter
i2c-pca-platform: Change device name of request_irq
i2c: Fix Kconfig dependencies
Linus Torvalds [Sun, 24 Oct 2010 19:44:59 +0000 (12:44 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (47 commits)
HID: fix mismerge in hid-lg
HID: hidraw: fix window in hidraw_release
HID: hid-sony: override usbhid_output_raw_report for Sixaxis
HID: add absolute axis resolution calculation
HID: force feedback support for Logitech RumblePad gamepad
HID: support STmicroelectronics and Sitronix with hid-stantuml driver
HID: magicmouse: Adjust major / minor axes to scale
HID: Fix for problems with eGalax/DWAV multi-touch-screen
HID: waltop: add support for Waltop Slim Tablet 12.1 inch
HID: add NOGET quirk for AXIS 295 Video Surveillance Joystick
HID: usbhid: remove unused hiddev_driver
HID: magicmouse: Use hid-input parsing rather than bypassing it
HID: trivial formatting fix
HID: Add support for Logitech Speed Force Wireless gaming wheel
HID: don't Send Feature Reports on Interrupt Endpoint
HID: 3m: Adjust major / minor axes to scale
HID: 3m: Correct touchscreen emulation
HID: 3m: Convert to MT slots
HID: 3m: Output proper orientation range
HID: 3m: Adjust to sequential MT HID protocol
...
Linus Torvalds [Sun, 24 Oct 2010 19:44:34 +0000 (12:44 -0700)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Makefile - replace the use of <module>-objs with <module>-y
crypto: hifn_795x - use cancel_delayed_work_sync()
crypto: talitos - sparse check endian fixes
crypto: talitos - fix checkpatch warning
crypto: talitos - fix warning: 'alg' may be used uninitialized in this function
crypto: cryptd - Adding the AEAD interface type support to cryptd
crypto: n2_crypto - Niagara2 driver needs to depend upon CRYPTO_DES
crypto: Kconfig - update broken web addresses
crypto: omap-sham - Adjust DMA parameters
crypto: fips - FIPS requires algorithm self-tests
crypto: omap-aes - OMAP2/3 AES hw accelerator driver
crypto: updates to enable omap aes
padata: add missing __percpu markup in include/linux/padata.h
MAINTAINERS: Add maintainer entries for padata/pcrypt
Pekka Enberg [Sun, 24 Oct 2010 16:57:05 +0000 (19:57 +0300)]
Merge branch 'master' into for-linus
Conflicts:
include/linux/percpu.h
mm/percpu.c
Jean Delvare [Sun, 24 Oct 2010 16:16:59 +0000 (18:16 +0200)]
i2c-viapro: Don't log nacks
Transactions not acked can happen every now and then, in particular
during device detection, and various transaction types can be used for
this purpose. So stop logging this event, except when debugging is
enabled. This is what other similar drivers (e.g. i2c-i801 or
i2c-piix4) do.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Guenter Roeck [Sun, 24 Oct 2010 16:16:59 +0000 (18:16 +0200)]
i2c/pca954x: Remove __devinit and __devexit from probe and remove functions
The underlying I2C adapter may or may not be present when this driver
gets initialized, and may disappear later, so there is no safe time at
which the probe and remove functions can be discarded.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Guenter Roeck [Sun, 24 Oct 2010 16:16:59 +0000 (18:16 +0200)]
MAINTAINERS: Add maintainer for PCA9541 I2C bus master selector driver
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Guenter Roeck [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c/mux: Driver for PCA9541 I2C Master Selector
This patch adds support for PCA9541, an I2C Bus Master Selector.
The driver is modeled as single channel I2C Multiplexer to be able to utilize
the I2C multiplexer framework.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Reviewed-by: Tom Grennan <tom.grennan@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Optimize function i2c_detect()
Check the class flags before allocating the temporary i2c_client
structure, to avoid allocating it when we don't need it.
Also optimize the inner loop a bit.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Jean Delvare [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Discard warning message on device instantiation from user-space
The "new_device" sysfs interface has been there for quite some time
now, nobody complained about it so it must be good enough. Time to
remove the warning and call it stable.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Michael Lawnick <ml.lawnick@gmx.de>
Julia Lawall [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c-amd8111: Add proper error handling
The functions amd_ec_wait_write and amd_ec_wait_read have an
unsigned return type, but return a negative constant to indicate an error
condition.
A semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@exists@
identifier f;
constant C;
@@
unsigned f(...)
{ <+...
* return -C;
...+> }
// </smpl>
Fixing amd_ec_wait_write and amd_ec_wait_read leads to the need to adjust
the return type of the functions amd_ec_write and amd_ec_read, which are
the only functions that call amd_ec_wait_write and amd_ec_wait_read.
amd_ec_write and amd_ec_read, in turn, are only called from within the
function amd8111_access, which already returns a signed value. Each
of the calls to amd_ec_write and amd_ec_read is updated using the
following semantic patch:
// <smpl>
@@
@@
+ status = amd_ec_write
- amd_ec_write
(...);
+ if (status) return status;
@@
@@
+ status = amd_ec_read
- amd_ec_read
(...);
+ if (status) return status;
// </smpl>
The patch also adds the declaration of the status variable.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
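The class of bug found by the semantic match above can be illustrated with a
minimal user-space sketch. The function names and the timed_out parameter are
hypothetical, and ETIMEDOUT is chosen only for illustration; the pattern
itself (an unsigned function returning a negative constant, and callers
propagating a status value) is what the commit message describes.

```c
#include <errno.h>

/* Buggy pattern: the negative error code is converted to a large
 * positive value because the return type is unsigned, so a caller
 * testing "ret < 0" can never see the error. */
static unsigned int wait_ready_unsigned(int timed_out)
{
	if (timed_out)
		return -ETIMEDOUT;	/* becomes UINT_MAX - ETIMEDOUT + 1 */
	return 0;
}

/* Fixed pattern: a signed return type preserves the negative errno. */
static int wait_ready_signed(int timed_out)
{
	if (timed_out)
		return -ETIMEDOUT;
	return 0;
}

/* Caller in the style of the semantic patch: store the status and
 * propagate any failure to the next level up. */
static int do_access(int timed_out)
{
	int status;

	status = wait_ready_signed(timed_out);
	if (status)
		return status;
	return 0;
}
```

With the signed variant, do_access() returns -ETIMEDOUT on failure and 0 on
success, which is what a signed caller such as amd8111_access can pass on.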
matt mooney [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Change to new flag variable
Replace EXTRA_CFLAGS with ccflags-y.
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Remove unneeded inclusions of <linux/i2c-id.h>
These drivers don't use anything which is defined in <linux/i2c-id.h>.
This header file was never meant to be included directly anyway, and
will be deleted soon.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Acked-by: Dave Airlie <airlied@linux.ie>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Jean Delvare [Sun, 24 Oct 2010 16:16:57 +0000 (18:16 +0200)]
i2c: Let i2c_parent_is_i2c_adapter return the parent adapter
This makes the calling site's code clearer IMHO.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Michael Lawnick <ml.lawnick@gmx.de>
Jean Delvare [Sun, 24 Oct 2010 16:16:57 +0000 (18:16 +0200)]
i2c: Simplify i2c_parent_is_i2c_adapter
Only i2c devices can have their type set to i2c_adapter_type, so
testing the bus type is redundant.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Michael Lawnick <ml.lawnick@gmx.de>
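The combined effect of the two i2c_parent_is_i2c_adapter changes above can be
sketched in user space as follows. The structures are heavily reduced,
hypothetical stand-ins for the kernel ones, and parent_i2c_adapter is an
illustrative name, not the kernel function.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct device_type { const char *name; };

struct device {
	struct device *parent;
	const struct device_type *type;
};

struct i2c_adapter {
	struct device dev;	/* first member, so the cast below is valid */
};

static const struct device_type i2c_adapter_type = { "i2c_adapter" };

/* Return the parent adapter if the parent is an I2C adapter, else NULL.
 * Only the device type is tested; the bus type is not checked. */
static struct i2c_adapter *parent_i2c_adapter(const struct i2c_adapter *adap)
{
	struct device *parent = adap->dev.parent;

	if (parent != NULL && parent->type == &i2c_adapter_type)
		return (struct i2c_adapter *)parent;
	return NULL;
}
```

Returning the pointer instead of a boolean lets a caller test for a parent and
use it in one step, without converting the parent device a second time.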
Nobuhiro Iwamatsu [Sun, 24 Oct 2010 16:16:57 +0000 (18:16 +0200)]
i2c-pca-platform: Change device name of request_irq
i2c->adap.name shouldn't be used in request_irq.
Instead the driver name "i2c-pca-platform" should be used.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: stable@kernel.org
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 24 Oct 2010 16:16:57 +0000 (18:16 +0200)]
i2c: Fix Kconfig dependencies
drivers/i2c/algos/Kconfig makes all the algorithms dependent on
!I2C_HELPER_AUTO, which triggers a Kconfig warning about broken
dependencies when some driver selects one of the algorithms. Ideally
we would make only the prompts dependent on !I2C_HELPER_AUTO, however
Kconfig doesn't currently support that. So we have to redefine the
symbols separately for the I2C_HELPER_AUTO=y case.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Michal Marek <mmarek@suse.cz>
Jan Kiszka [Mon, 18 Oct 2010 13:38:40 +0000 (15:38 +0200)]
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
We also have to call kvm_iommu_map_pages for CONFIG_AMD_IOMMU. So drop
the dependency on Intel IOMMU; kvm_iommu_map_pages will be a nop anyway
if CONFIG_IOMMU_API is not defined.
KVM-Stable-Tag.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Thu, 14 Oct 2010 11:59:04 +0000 (13:59 +0200)]
KVM: Fix signature of kvm_iommu_map_pages stub
Breaks otherwise if CONFIG_IOMMU_API is not set.
KVM-Stable-Tag.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Fri, 8 Oct 2010 08:24:15 +0000 (16:24 +0800)]
KVM: MCE: Send SRAR SIGBUS directly
Originally, SRAR SIGBUS was sent to QEMU-KVM by touching the poisoned
page. But commit 96054569190bdec375fe824e48ca1f4e3b53dd36 prevents the
signal from being sent. So now the signal is sent via
force_sig_info_fault directly.
[marcelo: use send_sig_info instead]
Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Fri, 8 Oct 2010 08:24:14 +0000 (16:24 +0800)]
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
Now that we have MCG_SER_P (and corresponding SRAO/SRAR MCE) support in
the kernel and QEMU-KVM, MCG_SER_P should be added to
KVM_MCE_CAP_SUPPORTED to make all this code actually work.
Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Nicolas Kaiser [Wed, 6 Oct 2010 12:23:22 +0000 (14:23 +0200)]
KVM: fix typo in copyright notice
Fix typo in copyright notice.
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Mon, 4 Oct 2010 10:55:49 +0000 (12:55 +0200)]
KVM: Disable interrupts around get_kernel_ns()
get_kernel_ns() wants preemption disabled. It doesn't make a lot of sense
during the get/set ioctls (no way to make them non-racy) but the callee wants
it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 3 Oct 2010 16:51:39 +0000 (18:51 +0200)]
KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Tue, 28 Sep 2010 09:03:14 +0000 (17:03 +0800)]
KVM: MMU: move access code parsing to FNAME(walk_addr) function
Move access code parsing from caller site to FNAME(walk_addr) function
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:09:29 +0000 (18:09 +0800)]
KVM: MMU: audit: check whether have unsync sps after root sync
After the root is synced, all unsync sps are synced. This patch adds a
check to make sure there are no unsync sps in the VCPU's page table.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:07:59 +0000 (18:07 +0800)]
KVM: MMU: audit: introduce audit_printk to cleanup audit code
Introduce audit_printk, and record the audit point instead of the audit name.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:07:07 +0000 (18:07 +0800)]
KVM: MMU: audit: unregister audit tracepoints before module unloaded
fix:
Call Trace:
[<ffffffffa01e46ba>] ? kvm_mmu_pte_write+0x229/0x911 [kvm]
[<ffffffffa01c6ba9>] ? gfn_to_memslot+0x39/0xa0 [kvm]
[<ffffffffa01c6c26>] ? mark_page_dirty+0x16/0x2e [kvm]
[<ffffffffa01c6d6f>] ? kvm_write_guest_page+0x67/0x7f [kvm]
[<ffffffff81066fbd>] ? local_clock+0x2a/0x3b
[<ffffffffa01d52ce>] emulator_write_phys+0x46/0x54 [kvm]
......
Code: Bad RIP value.
RIP [<ffffffffa0172056>] 0xffffffffa0172056
RSP <ffff880134f69a70>
CR2: ffffffffa0172056
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:06:16 +0000 (18:06 +0800)]
KVM: MMU: audit: fix vcpu's spte walking
With nested nested paging, long mode may be used to shadow a 32-bit/PAE
paging guest, so this patch fixes the spte walk accordingly.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:05:00 +0000 (18:05 +0800)]
KVM: MMU: set access bit for direct mapping
Set the access bit while setting up the direct page table if the guest is
non-paging or NPT is enabled; this is good for the CPU's speculative accesses.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:03:27 +0000 (18:03 +0800)]
KVM: MMU: cleanup for error mask set while walk guest page table
Small cleanup for setting the page fault error code.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Mon, 27 Sep 2010 10:02:12 +0000 (18:02 +0800)]
KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
The value of 'vcpu->arch.mmu.pae_root' is not modified, so we can update
'root_hpa' out of the loop.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 28 Sep 2010 08:33:32 +0000 (16:33 +0800)]
KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
Eliminate:
arch/x86/kvm/emulate.c:801: warning: ‘sv’ may be used uninitialized in this function
on gcc 4.1.2.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Sun, 26 Sep 2010 11:00:53 +0000 (13:00 +0200)]
KVM: x86: Fix constant type in kvm_get_time_scale
Older gcc versions complain about the improper type (for x86-32); 4.5
seems to fix this silently. However, we had better use the right type
from the start.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Tue, 28 Sep 2010 14:37:42 +0000 (16:37 +0200)]
KVM: VMX: Add AX to list of registers clobbered by guest switch
By chance this caused no harm so far. We overwrite AX during switch
to/from guest context, so we must declare this.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Arjan Koers [Mon, 2 Aug 2010 21:35:28 +0000 (23:35 +0200)]
KVM guest: Move a printk that's using the clock before it's ready
Fix a hang during SMP kernel boot on KVM that showed up
after commit 489fb490dbf8dab0249ad82b56688ae3842a79e8 (2.6.35) and
59aab522154a2f17b25335b63c1cf68a51fb6ae0 (2.6.34.1). The problem only
occurs when CONFIG_PRINTK_TIME is set.
KVM-Stable-Tag.
Signed-off-by: Arjan Koers <0h61vkll2ly8@xutrox.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Zachary Amsden [Sun, 19 Sep 2010 00:38:15 +0000 (14:38 -1000)]
KVM: x86: TSC catchup mode
Negate the effects of AN TYM spell while kvm thread is preempted by tracking
conversion factor to the highest TSC rate and catching the TSC up when it has
fallen behind the kernel view of time. Note that once triggered, we don't
turn off catchup mode.
A slightly more clever version of this is possible, which only does catchup
when TSC rate drops, and which specifically targets only CPUs with broken
TSC, but since these all are considered unstable_tsc(), this patch covers
all necessary cases.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Zachary Amsden [Sun, 19 Sep 2010 00:38:14 +0000 (14:38 -1000)]
KVM: x86: Rename timer function
This just changes some names to better reflect the usage they
will be given. Separated out to keep confusion to a minimum.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Zachary Amsden [Sun, 19 Sep 2010 00:38:13 +0000 (14:38 -1000)]
KVM: x86: Make math work for other scales
The math in kvm_get_time_scale relies on the fact that
NSEC_PER_SEC < 2^32. To use the same function to compute
arbitrary time scales, we must extend the first reduction
step to shrink the base rate to a 32-bit value, and
possibly reduce the scaled rate into a 32-bit value as well.
Note we must take care to avoid an arithmetic overflow
when scaling up the tps32 value (this could not happen
with the fixed scaled value of NSEC_PER_SEC, but can
happen with scaled rates above 2^31).
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
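The two reduction steps described above can be sketched in user space as
follows. This is an illustration under assumptions, not the kernel code: the
names get_time_scale and scale_ticks are hypothetical, rates are taken in Hz,
and both rates are assumed to be non-zero and below 2^63. Only tps32 and the
overall shift-plus-multiplier scheme come from the commit message.

```c
#include <stdint.h>

/* Compute a shift and a 0.32 fixed-point multiplier such that
 * scaled_ticks = ((base_ticks shifted by shift) * mult) >> 32.
 * A negative shift means a right shift. */
static void get_time_scale(uint64_t scaled_hz, uint64_t base_hz,
			   int *pshift, uint32_t *pmult)
{
	uint64_t scaled64 = scaled_hz;
	uint64_t tps64 = base_hz;
	uint32_t tps32;
	int shift = 0;

	/* First reduction: shrink the base rate until it fits in 32 bits
	 * and is at most twice the scaled rate. */
	while (tps64 > scaled64 * 2 || (tps64 >> 32) != 0) {
		tps64 >>= 1;
		shift--;
	}

	tps32 = (uint32_t)tps64;
	/* Second step: bring the scaled rate into 32 bits and make the
	 * base strictly larger. If tps32 already has its top bit set,
	 * halve the scaled rate instead of overflowing tps32. */
	while (tps32 <= scaled64 || (scaled64 >> 32) != 0) {
		if ((scaled64 >> 32) != 0 || (tps32 & 0x80000000u))
			scaled64 >>= 1;
		else
			tps32 <<= 1;
		shift++;
	}

	*pshift = shift;
	/* scaled64 < tps32 here, so the quotient fits in 32 bits. */
	*pmult = (uint32_t)((scaled64 << 32) / tps32);
}

/* Apply the scale; assumes the intermediate product fits in 64 bits. */
static uint64_t scale_ticks(uint64_t ticks, int shift, uint32_t mult)
{
	if (shift < 0)
		ticks >>= -shift;
	else
		ticks <<= shift;
	return (ticks * mult) >> 32;
}
```

For a 2 GHz base rate scaled to nanoseconds this yields a shift of 0 and a
multiplier of 2^31, so 1000 ticks map to 500 ns.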
Avi Kivity [Tue, 21 Sep 2010 17:59:44 +0000 (19:59 +0200)]
KVM: cpu_relax() during spin waiting for reboot
It doesn't really matter, but if we spin, we should spin in a more relaxed
manner. This way, if something goes wrong at least it won't contribute to
global warming.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Sun, 19 Sep 2010 12:34:08 +0000 (14:34 +0200)]
KVM: VMX: Respect interrupt window in big real mode
If an interrupt is pending, we need to stop emulation so we
can inject it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Mohammed Gamal [Sun, 19 Sep 2010 12:34:07 +0000 (14:34 +0200)]
KVM: VMX: Emulated real mode interrupt injection
Replace the inject-as-software-interrupt hack we currently have with
emulated injection.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Mohammed Gamal [Sun, 19 Sep 2010 12:34:06 +0000 (14:34 +0200)]
KVM: Add kvm_inject_realmode_interrupt() wrapper
This adds a wrapper function kvm_inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.
[avi: initialize operand and address sizes before emulating interrupts]
[avi: initialize rip for real mode interrupt injection]
[avi: clear interrupt pending flag after emulating interrupt injection]
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Mohammed Gamal [Sun, 19 Sep 2010 12:34:05 +0000 (14:34 +0200)]
KVM: x86 emulator: Expose emulate_int_real()
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Hillf Danton [Sat, 18 Sep 2010 00:41:02 +0000 (08:41 +0800)]
KVM: MMU: fix counting of rmap entries in rmap_add()
It seems that rmap entries are undercounted.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Mon, 20 Sep 2010 14:17:48 +0000 (22:17 +0800)]
KVM: document 'kvm.mmu_audit' parameter
Document this parameter in Documentation/kernel-parameters.txt.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Mon, 20 Sep 2010 14:16:45 +0000 (22:16 +0800)]
KVM: fix the description of kvm-amd.nested in documentation
The default state of 'kvm-amd.nested' is now enabled, so fix the documentation.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Gleb Natapov [Mon, 20 Sep 2010 08:15:32 +0000 (10:15 +0200)]
KVM: SVM: do not generate "external interrupt exit" if other exit is pending
Nested SVM checks for an external interrupt after injecting a nested
exception. If an external interrupt is pending, the code generates an
"external interrupt exit" and overwrites the previous exit info. If the
previously injected exception already generated an exit, it will be lost.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Sun, 19 Sep 2010 16:44:07 +0000 (18:44 +0200)]
KVM: Convert PIC lock from raw spinlock to ordinary spinlock
The PIC code used to be called from preempt_disable() context, which
wasn't very good for PREEMPT_RT. That is no longer the case, so move
back from raw_spinlock_t to spinlock_t.
Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Zachary Amsden [Sun, 19 Sep 2010 00:38:12 +0000 (14:38 -1000)]
KVM: x86: Fix kvmclock bug
If preempted after kvmclock values are updated, but before hardware
virtualization is entered, the last tsc time as read by the guest is
never set. It underflows the next time kvmclock is updated if there
has not yet been a successful entry / exit into hardware virt.
Fix this by simply setting last_tsc to the newly read tsc value so
that any computed nsec advance of kvmclock is nulled.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
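The underflow described in the kvmclock fix above can be illustrated with
unsigned arithmetic in user space. The structure, its field names and both
helpers are hypothetical; the sketch only assumes that the clock update
subtracts the published timestamp from the last TSC value seen by the guest.

```c
#include <stdint.h>

struct vclock {
	uint64_t tsc_timestamp;	/* TSC value published with the clock */
	uint64_t last_tsc;	/* last TSC value seen by the guest */
};

/* TSC cycles the guest has advanced past the published timestamp.
 * The subtraction wraps around if last_tsc is stale. */
static uint64_t guest_advance(const struct vclock *c)
{
	return c->last_tsc - c->tsc_timestamp;
}

/* Publish a new timestamp. With fix != 0, last_tsc is reset as well,
 * so the advance computed before the next guest entry is zero. */
static void update_clock(struct vclock *c, uint64_t now, int fix)
{
	c->tsc_timestamp = now;
	if (fix)
		c->last_tsc = now;
}
```

Without the reset, a clock updated at TSC value 1000 with last_tsc still 0
reports an advance of 2^64 - 1000 cycles; with the reset it reports 0.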
Joerg Roedel [Tue, 14 Sep 2010 15:46:12 +0000 (17:46 +0200)]
KVM: MMU: Don't track nested fault info in error-code
This patch moves the detection of whether a page-fault was
nested or not out of the error code and into a
separate variable in the fault struct.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 22 Jul 2010 10:09:54 +0000 (13:09 +0300)]
KVM: VMX: Move fixup_rmode_irq() to avoid forward declaration
No code changes.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 20 Jul 2010 12:06:17 +0000 (15:06 +0300)]
KVM: Non-atomic interrupt injection
Change the interrupt injection code to work from preemptible, interrupts
enabled context. This works by adding a ->cancel_injection() operation
that undoes an injection in case we were not able to actually enter the guest
(this condition could never happen with atomic injection).
Signed-off-by: Avi Kivity <avi@redhat.com>
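The role of a cancel operation in non-atomic injection can be sketched as
follows. Everything here apart from the cancel_injection name is hypothetical:
the structures, the demo callbacks and the entry_possible flag are simplified
stand-ins chosen for illustration.

```c
#include <stdbool.h>

struct vcpu;

struct inject_ops {
	void (*inject)(struct vcpu *v);
	void (*cancel_injection)(struct vcpu *v);
};

struct vcpu {
	const struct inject_ops *ops;
	bool event_queued;	/* event waiting in the software queue */
	bool event_in_hw;	/* event programmed for the next guest entry */
};

/* Move a queued event into the (simulated) hardware entry fields. */
static void demo_inject(struct vcpu *v)
{
	if (v->event_queued) {
		v->event_queued = false;
		v->event_in_hw = true;
	}
}

/* Undo an injection: requeue the event for the next attempt. */
static void demo_cancel(struct vcpu *v)
{
	if (v->event_in_hw) {
		v->event_in_hw = false;
		v->event_queued = true;
	}
}

static const struct inject_ops demo_ops = { demo_inject, demo_cancel };

/* Inject first, then back out if the guest cannot be entered after all. */
static bool try_enter_guest(struct vcpu *v, bool entry_possible)
{
	v->ops->inject(v);
	if (!entry_possible) {
		v->ops->cancel_injection(v);
		return false;
	}
	return true;
}
```

Because injection now happens before the point of no return, the event must
not be lost when entry is abandoned; the cancel callback restores the queue.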
Avi Kivity [Tue, 20 Jul 2010 11:43:23 +0000 (14:43 +0300)]
KVM: VMX: Parameterize vmx_complete_interrupts() for both exit and entry
Currently vmx_complete_interrupts() can decode event information from vmx
exit fields into the generic kvm event queues. Make it able to decode
the information from the entry fields as well by parametrizing it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 22 Jul 2010 09:54:21 +0000 (12:54 +0300)]
KVM: VMX: Move real-mode interrupt injection fixup to vmx_complete_interrupts()
This allows reuse of vmx_complete_interrupts() for cancelling injections.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 20 Jul 2010 11:31:20 +0000 (14:31 +0300)]
KVM: VMX: Split up vmx_complete_interrupts()
vmx_complete_interrupts() does too much, split it up:
- vmx_vcpu_run() gets the "cache important vmcs fields" part
- a new vmx_complete_atomic_exit() gets the parts that must be done atomically
- a new vmx_recover_nmi_blocking() does what its name says
- vmx_complete_interrupts() retains the event injection recovery code
This helps in reducing the work done in atomic context.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 27 Jul 2010 09:30:24 +0000 (12:30 +0300)]
KVM: Check for pending events before attempting injection
Instead of blindly attempting to inject an event before each guest entry,
check for a possible event first in vcpu->requests. Sites that can trigger
event injection are modified to set KVM_REQ_EVENT:
- interrupt, nmi window opening
- ppr updates
- i8259 output changes
- local apic irr changes
- rflags updates
- gif flag set
- event set on exit
This improves non-injecting entry performance, and sets the stage for
non-atomic injection.
Signed-off-by: Avi Kivity <avi@redhat.com>
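The request-bit scheme described above can be sketched in user space as
follows. REQ_EVENT is a hypothetical bit number standing in for KVM_REQ_EVENT,
the structure is a reduced stand-in, and the bit operations are plain
(non-atomic) for brevity, which real concurrent code could not get away with.

```c
#include <stdbool.h>

#define REQ_EVENT 0	/* hypothetical bit number for the event request */

struct vcpu {
	unsigned long requests;	/* bitmask of pending requests */
	int injections;		/* counts how often injection was attempted */
};

/* Called by any site that may have made an event injectable. */
static void make_request(struct vcpu *v, int req)
{
	v->requests |= 1UL << req;
}

/* Test and clear a request bit; returns true if it was set. */
static bool check_request(struct vcpu *v, int req)
{
	unsigned long mask = 1UL << req;
	bool was_set = (v->requests & mask) != 0;

	v->requests &= ~mask;
	return was_set;
}

/* Guest entry path: only attempt injection when an event was flagged. */
static void enter_guest(struct vcpu *v)
{
	if (check_request(v, REQ_EVENT))
		v->injections++;	/* stands in for the injection code */
}
```

An entry with no request pending skips the injection path entirely, which is
where the performance gain for non-injecting entries comes from.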
Avi Kivity [Mon, 13 Sep 2010 14:45:28 +0000 (16:45 +0200)]
KVM: MMU: Fix regression with ept memory types merged into non-ept page tables
Commit "KVM: MMU: Make tdp_enabled a mmu-context parameter" made real-mode
set ->direct_map, and changed the code that merges in the memory type to
depend on direct_map instead of tdp_enabled. However, in this case what really
matters is tdp, not direct_map, since tdp changes the pte format regardless
of whether the mapping is direct or not.
As a result, real-mode shadow mappings got corrupted with ept memory types.
The result was a huge slowdown, likely due to the cache being disabled.
Change it back as the simplest fix for the regression (real fix is to move
all that to vmx code, and not use tdp_enabled as a synonym for ept).
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Sun, 12 Sep 2010 14:39:11 +0000 (16:39 +0200)]
KVM: Document that KVM_GET_SUPPORTED_CPUID may return emulated values
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:06 +0000 (17:31 +0200)]
KVM: X86: Report SVM bit to userspace only when supported
This patch fixes a bug in KVM where it _always_ reports the
support of the SVM feature to userspace. But KVM only
supports SVM on AMD hardware and only when it is enabled in
the kernel module. This patch fixes the wrong reporting.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
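The intended reporting behaviour can be sketched as a simple mask over the
CPUID value handed to userspace. The function name and the svm_supported
parameter are hypothetical, and the sketch does not claim to mirror how the
patch itself is structured; the bit position (CPUID function 0x80000001, ECX
bit 2) is the architectural SVM feature flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_8000_0001_ECX_SVM (1u << 2)	/* SVM feature flag */

/* Report the SVM flag only when the caller says it is supported,
 * i.e. on AMD hardware with nested SVM enabled in the module. */
static uint32_t report_ecx(uint32_t host_ecx, bool svm_supported)
{
	if (!svm_supported)
		host_ecx &= ~CPUID_8000_0001_ECX_SVM;
	return host_ecx;
}
```

Other feature bits in the same register pass through unchanged.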
Joerg Roedel [Fri, 10 Sep 2010 15:31:05 +0000 (17:31 +0200)]
KVM: SVM: Report Nested Paging support to userspace
This patch implements the reporting of the nested paging
feature support to userspace.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:04 +0000 (17:31 +0200)]
KVM: SVM: Expect two more candidates for exit_int_info
This patch adds INTR and NMI intercepts to the list of
expected intercepts with an exit_int_info set. While this
can't happen on bare metal, it is architecturally legal and may
happen with KVM's SVM emulation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:03 +0000 (17:31 +0200)]
KVM: SVM: Initialize Nested Nested MMU context on VMRUN
This patch adds code to initialize the Nested Nested Paging
MMU context when the L1 guest executes a VMRUN instruction
and has nested paging enabled in its VMCB.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:02 +0000 (17:31 +0200)]
KVM: SVM: Implement MMU helper functions for Nested Nested Paging
This patch adds the helper functions which will be used in
the mmu context for handling nested nested page faults.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:01 +0000 (17:31 +0200)]
KVM: MMU: Track NX state in struct kvm_mmu
With Nested Paging emulation the NX state between the two
MMU contexts may differ. To make sure that the right fault
error code is always recorded, this patch moves the NX state
into struct kvm_mmu so that the code can distinguish between
L1 and L2 NX state.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 10 Sep 2010 15:31:00 +0000 (17:31 +0200)]
KVM: MMU: Allow long mode shadows for legacy page tables
Currently the KVM softmmu implementation cannot shadow a 32-bit
legacy or PAE page table with a long mode page table.
This is a required feature for nested paging emulation
because the nested page table must always be in host format.
So this patch implements the missing pieces to allow long
mode page tables for these page table types.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>