Jiang Liu [Tue, 15 Apr 2014 02:35:35 +0000 (10:35 +0800)]
iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptors
Commit "
59ce0515cdaf iommu/vt-d: Update DRHD/RMRR/ATSR device scope
caches when PCI hotplug happens" introduces a bug, which fails to
match PCI devices with DMAR device scope entries if PCI path array
in the entry has more than one level.
For example, it fails to handle
[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved : 0000
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02
And cause DMA failure on HP DL980 as:
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr
7f61e000
Reported-and-tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Tue, 15 Apr 2014 05:01:30 +0000 (22:01 -0700)]
iommu/vt-d: Fix get_domain_for_dev() handling of upstream PCIe bridges
Commit
146922ec79 ("iommu/vt-d: Make get_domain_for_dev() take struct
device") introduced new variables bridge_bus and bridge_devfn to
identify the upstream PCIe to PCI bridge responsible for the given
target device. Leaving the original bus/devfn variables to identify
the target device itself, now that it is no longer assumed to be PCI
and we can no longer trivially find that information.
However, the patch failed to correctly use the new variables in all
cases; instead using the as-yet-uninitialised 'bus' and 'devfn'
variables.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Jiang Liu [Wed, 9 Apr 2014 02:20:39 +0000 (10:20 +0800)]
iommu/vt-d: fix memory leakage caused by commit
ea8ea46
Commit
ea8ea46 "iommu/vt-d: Clean up and fix page table clear/free
behaviour" introduces possible leakage of DMA page tables due to:
for (pte = page_address(pg); !first_pte_in_page(pte); pte++) {
if (dma_pte_present(pte) && !dma_pte_superpage(pte))
freelist = dma_pte_list_pagetables(domain, level - 1,
pte, freelist);
}
For the first pte in a page, first_pte_in_page(pte) will always be true,
thus dma_pte_list_pagetables() will never be called and leak DMA page
tables if level is bigger than 1.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Tue, 1 Apr 2014 13:58:36 +0000 (14:58 +0100)]
iommu/vt-d: Fix error handling in ANDD processing
If we failed to find an ACPI device to correspond to an ANDD record, we
would fail to increment our pointer and would just process the same record
over and over again, with predictable results.
Turn it from a while() loop into a for() loop to let the 'continue' in
the error paths work correctly.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Dan Carpenter [Fri, 28 Mar 2014 08:29:50 +0000 (11:29 +0300)]
iommu/vt-d: returning free pointer in get_domain_for_dev()
If we hit this error condition then we want to return a NULL pointer and
not a freed variable.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 28 Mar 2014 11:28:40 +0000 (11:28 +0000)]
iommu/vt-d: Only call dmar_acpi_dev_scope_init() if DRHD units present
As pointed out by Jörg and fixed in commit
11f1a7768 ("iommu/vt-d: Check
for NULL pointer in dmar_acpi_dev_scope_init(), this code path can
bizarrely get exercised even on AMD IOMMU systems with IRQ remapping
enabled.
In addition to the defensive check for NULL which Jörg added, let's also
just avoid calling the function at all if there aren't an Intel IOMMU
units in the system.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Joerg Roedel [Tue, 25 Mar 2014 19:16:40 +0000 (20:16 +0100)]
iommu/vt-d: Check for NULL pointer in dmar_acpi_dev_scope_init()
When ir_dev_scope_init() is called via a rootfs initcall it
will check for irq_remapping_enabled before it calls
(indirectly) into dmar_acpi_dev_scope_init() which uses the
dmar_tbl pointer without any checks.
The AMD IOMMU driver also sets the irq_remapping_enabled
flag which causes the dmar_acpi_dev_scope_init() function to
be called on systems with AMD IOMMU hardware too, causing a
boot-time kernel crash.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
David Woodhouse [Fri, 21 Mar 2014 16:49:04 +0000 (16:49 +0000)]
iommu/vt-d: Include ACPI devices in iommu=pt
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Tue, 11 Mar 2014 03:04:11 +0000 (20:04 -0700)]
iommu/vt-d: Finally enable translation for non-PCI devices
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Tue, 11 Mar 2014 03:01:21 +0000 (20:01 -0700)]
iommu/vt-d: Remove to_pci_dev() in intel_map_page()
It might not be...
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:31:06 +0000 (16:31 -0700)]
iommu/vt-d: Remove pdev from intel_iommu_attach_device()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:29:55 +0000 (16:29 -0700)]
iommu/vt-d: Remove pdev from iommu_no_mapping()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:27:31 +0000 (16:27 -0700)]
iommu/vt-d: Make domain_add_dev_info() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:19:13 +0000 (16:19 -0700)]
iommu/vt-d: Make domain_remove_one_dev_info() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:14:00 +0000 (16:14 -0700)]
iommu/vt-d: Rename 'hwdev' variables to 'dev' now that that's the norm
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:12:32 +0000 (16:12 -0700)]
iommu/vt-d: Remove some pointless to_pci_dev() calls
Mostly made redundant by using dev_name() instead of pci_name(), and one
instance of using *dev->dma_mask instead of pdev->dma_mask.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:07:40 +0000 (16:07 -0700)]
iommu/vt-d: Make get_valid_domain_for_dev() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 23:03:08 +0000 (16:03 -0700)]
iommu/vt-d: Make iommu_should_identity_map() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 22:48:15 +0000 (15:48 -0700)]
iommu/vt-d: Handle RMRRs for non-PCI devices
Should hopefully never happen (RMRRs are an abomination) but while we're
busy eliminating all the PCI assumptions, we might as well do it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 22:44:17 +0000 (15:44 -0700)]
iommu/vt-d: Make get_domain_for_dev() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 22:24:46 +0000 (15:24 -0700)]
iommu/vt-d: Make domain_context_mapp{ed,ing}() take struct device
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 21:00:57 +0000 (14:00 -0700)]
iommu/vt-d: Make device_to_iommu() cope with non-PCI devices
Pass the struct device to it, and also make it return the bus/devfn to use,
since that is also stored in the DMAR table.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 21:03:28 +0000 (14:03 -0700)]
iommu/vt-d: Make identity_mapping() take struct device not struct pci_dev
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:55:54 +0000 (13:55 -0700)]
iommu/vt-d: Remove segment from struct device_domain_info()
It's accessible via info->iommu->segment so this is redundant.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:49:45 +0000 (13:49 -0700)]
iommu/vt-d: Store PCI segment number in struct intel_iommu
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:33:06 +0000 (13:33 -0700)]
iommu/vt-d: Remove device_to_iommu() call from domain_remove_dev_info()
This was problematic because it works by domain/bus/devfn and we want
to make device_to_iommu() use only a struct device * (for handling non-PCI
devices). Now that the iommu pointer is reliably stored in the
device_domain_info, we don't need to look it up.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:52:37 +0000 (13:52 -0700)]
iommu/vt-d: Simplify iommu check in domain_remove_one_dev_info()
Now we store the iommu in the device_domain_info, we don't need to do a
lookup.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:31:18 +0000 (13:31 -0700)]
iommu/vt-d: Always store iommu in device_domain_info
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:25:07 +0000 (13:25 -0700)]
iommu/vt-d: Use domain_remove_one_dev_info() in domain_add_dev_info() error path
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:19:22 +0000 (13:19 -0700)]
iommu/vt-d: use dmar_insert_dev_info() from dma_add_dev_info()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 20:11:33 +0000 (13:11 -0700)]
iommu/vt-d: Stop dmar_insert_dev_info() freeing domains on losing race
By moving this into get_domain_for_dev() we can make dmar_insert_dev_info()
suitable for use with "special" domains such as the si_domain, which
currently use domain_add_dev_info().
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Sun, 9 Mar 2014 19:52:30 +0000 (12:52 -0700)]
iommu/vt-d: Pass iommu to domain_context_mapping_one() and iommu_support_dev_iotlb()
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Thu, 6 Mar 2014 17:12:03 +0000 (17:12 +0000)]
iommu/vt-d: Use struct device in device_domain_info, not struct pci_dev
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Thu, 6 Mar 2014 16:19:30 +0000 (16:19 +0000)]
iommu/vt-d: Make dmar_insert_dev_info() take struct device instead of struct pci_dev
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Thu, 6 Mar 2014 15:59:26 +0000 (15:59 +0000)]
iommu/vt-d: Make iommu_dummy() take struct device instead of struct pci_dev
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 7 Mar 2014 23:15:42 +0000 (23:15 +0000)]
iommu/vt-d: Add ACPI devices into dmaru->devices[] array
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 7 Mar 2014 15:08:36 +0000 (15:08 +0000)]
iommu/vt-d: Change scope lists to struct device, bus, devfn
It's not only for PCI devices any more, and the scope information for an
ACPI device provides the bus and devfn so that has to be stored here too.
It is the device pointer itself which needs to be protected with RCU,
so the __rcu annotation follows it into the definition of struct
dmar_dev_scope, since we're no longer just passing arrays of device
pointers around.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 7 Mar 2014 14:39:27 +0000 (14:39 +0000)]
iommu/vt-d: Allocate space for ACPI devices
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 7 Mar 2014 14:34:38 +0000 (14:34 +0000)]
iommu/vt-d: Parse ANDD records
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Fri, 7 Mar 2014 12:43:40 +0000 (12:43 +0000)]
iommu/vt-d: Add ACPI namespace device reporting structures
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Wed, 12 Mar 2014 00:10:29 +0000 (17:10 -0700)]
iommu/vt-d: Be less pessimistic about domain coherency where possible
In commit
2e12bc29 ("intel-iommu: Default to non-coherent for domains
unattached to iommus") we decided to err on the side of caution and
always assume that it's possible that a device will be attached which is
behind a non-coherent IOMMU.
In some cases, however, that just *cannot* happen. If there *are* no
IOMMUs in the system which are non-coherent, then we don't need to do
it. And flushing the dcache is a *significant* performance hit.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Wed, 19 Mar 2014 10:38:49 +0000 (10:38 +0000)]
iommu/vt-d: Honour intel_iommu=sp_off for non-VMM domains
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Wed, 5 Mar 2014 17:09:32 +0000 (17:09 +0000)]
iommu/vt-d: Clean up and fix page table clear/free behaviour
There is a race condition between the existing clear/free code and the
hardware. The IOMMU is actually permitted to cache the intermediate
levels of the page tables, and doesn't need to walk the table from the
very top of the PGD each time. So the existing back-to-back calls to
dma_pte_clear_range() and dma_pte_free_pagetable() can lead to a
use-after-free where the IOMMU reads from a freed page table.
When freeing page tables we actually need to do the IOTLB flush, with
the 'invalidation hint' bit clear to indicate that it's not just a
leaf-node flush, after unlinking each page table page from the next level
up but before actually freeing it.
So in the rewritten domain_unmap() we just return a list of pages (using
pg->freelist to make a list of them), and then the caller is expected to
do the appropriate IOTLB flush (or tear down the domain completely,
whatever), before finally calling dma_free_pagelist() to free the pages.
As an added bonus, we no longer need to flush the CPU's data cache for
pages which are about to be *removed* from the page table hierarchy anyway,
in the non-cache-coherent case. This drastically improves the performance
of large unmaps.
As a side-effect of all these changes, this also fixes the fact that
intel_iommu_unmap() was neglecting to free the page tables for the range
in question after clearing them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse [Wed, 19 Mar 2014 16:07:49 +0000 (16:07 +0000)]
iommu/vt-d: Clean up size handling for intel_iommu_unmap()
We have this horrid API where iommu_unmap() can unmap more than it's asked
to, if the IOVA in question happens to be mapped with a large page.
Instead of propagating this nonsense to the point where we end up returning
the page order from dma_pte_clear_range(), let's just do it once and adjust
the 'size' parameter accordingly.
Augment pfn_to_dma_pte() to return the level at which the PTE was found,
which will also be useful later if we end up changing the API for
iommu_iova_to_phys() to behave the same way as is being discussed upstream.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Jiang Liu [Wed, 19 Feb 2014 06:07:37 +0000 (14:07 +0800)]
iommu/vt-d: Update IOMMU state when memory hotplug happens
If static identity domain is created, IOMMU driver needs to update
si_domain page table when memory hotplug event happens. Otherwise
PCI device DMA operations can't access the hot-added memory regions.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:36 +0000 (14:07 +0800)]
iommu/vt-d: Unify the way to process DMAR device scope array
Now we have a PCI bus notification based mechanism to update DMAR
device scope array, we could extend the mechanism to support boot
time initialization too, which will help to unify and simplify
the implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:35 +0000 (14:07 +0800)]
iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens
Current Intel DMAR/IOMMU driver assumes that all PCI devices associated
with DMAR/RMRR/ATSR device scope arrays are created at boot time and
won't change at runtime, so it caches pointers of associated PCI device
object. That assumption may be wrong now due to:
1) introduction of PCI host bridge hotplug
2) PCI device hotplug through sysfs interfaces.
Wang Yijing has tried to solve this issue by caching <bus, dev, func>
tupple instead of the PCI device object pointer, but that's still
unreliable because PCI bus number may change in case of hotplug.
Please refer to http://lkml.org/lkml/2013/11/5/64
Message from Yingjing's mail:
after remove and rescan a pci device
[ 611.857095] dmar: DRHD: handling fault status reg 2
[ 611.857109] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr
ffff7000
[ 611.857109] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857524] dmar: DRHD: handling fault status reg 102
[ 611.857534] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr
ffff6000
[ 611.857534] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857936] dmar: DRHD: handling fault status reg 202
[ 611.857947] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr
ffff5000
[ 611.857947] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.858351] dmar: DRHD: handling fault status reg 302
[ 611.858362] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr
ffff4000
[ 611.858362] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.860819] IPv6: ADDRCONF(NETDEV_UP): eth3: link is not ready
[ 611.860983] dmar: DRHD: handling fault status reg 402
[ 611.860995] dmar: INTR-REMAP: Request device [[86:00.3] fault index a4
[ 611.860995] INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
This patch introduces a new mechanism to update the DRHD/RMRR/ATSR device scope
caches by hooking PCI bus notification.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:34 +0000 (14:07 +0800)]
iommu/vt-d: Use RCU to protect global resources in interrupt context
Global DMA and interrupt remapping resources may be accessed in
interrupt context, so use RCU instead of rwsem to protect them
in such cases.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:33 +0000 (14:07 +0800)]
iommu/vt-d: Introduce a rwsem to protect global data structures
Introduce a global rwsem dmar_global_lock, which will be used to
protect DMAR related global data structures from DMAR/PCI/memory
device hotplug operations in process context.
DMA and interrupt remapping related data structures are read most,
and only change when memory/PCI/DMAR hotplug event happens.
So a global rwsem solution is adopted for balance between simplicity
and performance.
For interrupt remapping driver, function intel_irq_remapping_supported(),
dmar_table_init(), intel_enable_irq_remapping(), disable_irq_remapping(),
reenable_irq_remapping() and enable_drhd_fault_handling() etc
are called during booting, suspending and resuming with interrupt
disabled, so no need to take the global lock.
For interrupt remapping entry allocation, the locking model is:
down_read(&dmar_global_lock);
/* Find corresponding iommu */
iommu = map_hpet_to_ir(id);
if (iommu)
/*
* Allocate remapping entry and mark entry busy,
* the IOMMU won't be hot-removed until the
* allocated entry has been released.
*/
index = alloc_irte(iommu, irq, 1);
up_read(&dmar_global_lock);
For DMA remmaping driver, we only uses the dmar_global_lock rwsem to
protect functions which are only called in process context. For any
function which may be called in interrupt context, we will use RCU
to protect them in following patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:32 +0000 (14:07 +0800)]
iommu/vt-d: Introduce macro for_each_dev_scope() to walk device scope entries
Introduce for_each_dev_scope()/for_each_active_dev_scope() to walk
{active} device scope entries. This will help following RCU lock
related patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:31 +0000 (14:07 +0800)]
iommu/vt-d: Fix error in detect ATS capability
Current Intel IOMMU driver only matches a PCIe root port with the first
DRHD unit with the samge segment number. It will report false result
if there are multiple DRHD units with the same segment number, thus fail
to detect ATS capability for some PCIe devices.
This patch refines function dmar_find_matched_atsr_unit() to search all
DRHD units with the same segment number.
An example DMAR table entries as below:
[1D0h 0464 2] Subtable Type : 0002 <Root Port ATS Capability>
[1D2h 0466 2] Length : 0028
[1D4h 0468 1] Flags : 00
[1D5h 0469 1] Reserved : 00
[1D6h 0470 2] PCI Segment Number : 0000
[1D8h 0472 1] Device Scope Entry Type : 02
[1D9h 0473 1] Entry Length : 08
[1DAh 0474 2] Reserved : 0000
[1DCh 0476 1] Enumeration ID : 00
[1DDh 0477 1] PCI Bus Number : 00
[1DEh 0478 2] PCI Path : [02, 00]
[1E0h 0480 1] Device Scope Entry Type : 02
[1E1h 0481 1] Entry Length : 08
[1E2h 0482 2] Reserved : 0000
[1E4h 0484 1] Enumeration ID : 00
[1E5h 0485 1] PCI Bus Number : 00
[1E6h 0486 2] PCI Path : [03, 00]
[1E8h 0488 1] Device Scope Entry Type : 02
[1E9h 0489 1] Entry Length : 08
[1EAh 0490 2] Reserved : 0000
[1ECh 0492 1] Enumeration ID : 00
[1EDh 0493 1] PCI Bus Number : 00
[1EEh 0494 2] PCI Path : [03, 02]
[1F0h 0496 1] Device Scope Entry Type : 02
[1F1h 0497 1] Entry Length : 08
[1F2h 0498 2] Reserved : 0000
[1F4h 0500 1] Enumeration ID : 00
[1F5h 0501 1] PCI Bus Number : 00
[1F6h 0502 2] PCI Path : [03, 03]
[1F8h 0504 2] Subtable Type : 0002 <Root Port ATS Capability>
[1FAh 0506 2] Length : 0020
[1FCh 0508 1] Flags : 00
[1FDh 0509 1] Reserved : 00
[1FEh 0510 2] PCI Segment Number : 0000
[200h 0512 1] Device Scope Entry Type : 02
[201h 0513 1] Entry Length : 08
[202h 0514 2] Reserved : 0000
[204h 0516 1] Enumeration ID : 00
[205h 0517 1] PCI Bus Number : 40
[206h 0518 2] PCI Path : [02, 00]
[208h 0520 1] Device Scope Entry Type : 02
[209h 0521 1] Entry Length : 08
[20Ah 0522 2] Reserved : 0000
[20Ch 0524 1] Enumeration ID : 00
[20Dh 0525 1] PCI Bus Number : 40
[20Eh 0526 2] PCI Path : [02, 02]
[210h 0528 1] Device Scope Entry Type : 02
[211h 0529 1] Entry Length : 08
[212h 0530 2] Reserved : 0000
[214h 0532 1] Enumeration ID : 00
[215h 0533 1] PCI Bus Number : 40
[216h 0534 2] PCI Path : [03, 00]
[218h 0536 2] Subtable Type : 0002 <Root Port ATS Capability>
[21Ah 0538 2] Length : 0020
[21Ch 0540 1] Flags : 00
[21Dh 0541 1] Reserved : 00
[21Eh 0542 2] PCI Segment Number : 0000
[220h 0544 1] Device Scope Entry Type : 02
[221h 0545 1] Entry Length : 08
[222h 0546 2] Reserved : 0000
[224h 0548 1] Enumeration ID : 00
[225h 0549 1] PCI Bus Number : 80
[226h 0550 2] PCI Path : [02, 00]
[228h 0552 1] Device Scope Entry Type : 02
[229h 0553 1] Entry Length : 08
[22Ah 0554 2] Reserved : 0000
[22Ch 0556 1] Enumeration ID : 00
[22Dh 0557 1] PCI Bus Number : 80
[22Eh 0558 2] PCI Path : [02, 02]
[230h 0560 1] Device Scope Entry Type : 02
[231h 0561 1] Entry Length : 08
[232h 0562 2] Reserved : 0000
[234h 0564 1] Enumeration ID : 00
[235h 0565 1] PCI Bus Number : 80
[236h 0566 2] PCI Path : [03, 00]
[238h 0568 2] Subtable Type : 0002 <Root Port ATS Capability>
[23Ah 0570 2] Length : 0020
[23Ch 0572 1] Flags : 00
[23Dh 0573 1] Reserved : 00
[23Eh 0574 2] PCI Segment Number : 0000
[240h 0576 1] Device Scope Entry Type : 02
[241h 0577 1] Entry Length : 08
[242h 0578 2] Reserved : 0000
[244h 0580 1] Enumeration ID : 00
[245h 0581 1] PCI Bus Number : C0
[246h 0582 2] PCI Path : [02, 00]
[248h 0584 1] Device Scope Entry Type : 02
[249h 0585 1] Entry Length : 08
[24Ah 0586 2] Reserved : 0000
[24Ch 0588 1] Enumeration ID : 00
[24Dh 0589 1] PCI Bus Number : C0
[24Eh 0590 2] PCI Path : [02, 02]
[250h 0592 1] Device Scope Entry Type : 02
[251h 0593 1] Entry Length : 08
[252h 0594 2] Reserved : 0000
[254h 0596 1] Enumeration ID : 00
[255h 0597 1] PCI Bus Number : C0
[256h 0598 2] PCI Path : [03, 00]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:30 +0000 (14:07 +0800)]
iommu/vt-d: Check for NULL pointer when freeing IOMMU data structure
Domain id 0 will be assigned to invalid translation without allocating
domain data structure if DMAR unit supports caching mode. So in function
free_dmar_iommu(), we should check whether the domain pointer is NULL,
otherwise it will cause system crash as below:
[ 6.790519] BUG: unable to handle kernel NULL pointer dereference at
00000000000000c8
[ 6.799520] IP: [<
ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[ 6.806493] PGD 0
[ 6.817972] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 6.823303] Modules linked in:
[ 6.826862] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #126
[ 6.834252] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.
1402050741 02/05/2014
[ 6.845951] task:
ffff880455a80000 ti:
ffff880455a88000 task.ti:
ffff880455a88000
[ 6.854437] RIP: 0010:[<
ffffffff810e2dc8>] [<
ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[ 6.864154] RSP: 0000:
ffff880455a89ce0 EFLAGS:
00010046
[ 6.870179] RAX:
0000000000000046 RBX:
0000000000000002 RCX:
0000000000000000
[ 6.878249] RDX:
0000000000000000 RSI:
0000000000000000 RDI:
00000000000000c8
[ 6.886318] RBP:
ffff880455a89d40 R08:
0000000000000002 R09:
0000000000000001
[ 6.894387] R10:
0000000000000000 R11:
0000000000000001 R12:
ffff880455a80000
[ 6.902458] R13:
0000000000000000 R14:
00000000000000c8 R15:
0000000000000000
[ 6.910520] FS:
0000000000000000(0000) GS:
ffff88045b800000(0000) knlGS:
0000000000000000
[ 6.919687] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 6.926198] CR2:
00000000000000c8 CR3:
0000000001e0e000 CR4:
00000000001407f0
[ 6.934269] Stack:
[ 6.936588]
ffffffffffffff10 ffffffff810f59db 0000000000000010 0000000000000246
[ 6.945219]
ffff880455a89d10 0000000000000000 ffffffff82bcb980 0000000000000046
[ 6.953850]
0000000000000000 0000000000000000 0000000000000002 0000000000000000
[ 6.962482] Call Trace:
[ 6.965300] [<
ffffffff810f59db>] ? vprintk_emit+0x4fb/0x5a0
[ 6.971716] [<
ffffffff810e3185>] lock_acquire+0x185/0x200
[ 6.977941] [<
ffffffff821fbbee>] ? init_dmars+0x839/0xa1d
[ 6.984167] [<
ffffffff81870b06>] _raw_spin_lock_irqsave+0x56/0x90
[ 6.991158] [<
ffffffff821fbbee>] ? init_dmars+0x839/0xa1d
[ 6.997380] [<
ffffffff821fbbee>] init_dmars+0x839/0xa1d
[ 7.003410] [<
ffffffff8147d575>] ? pci_get_dev_by_id+0x75/0xd0
[ 7.010119] [<
ffffffff821fc146>] intel_iommu_init+0x2f0/0x502
[ 7.016735] [<
ffffffff821a7947>] ? iommu_setup+0x27d/0x27d
[ 7.023056] [<
ffffffff821a796f>] pci_iommu_init+0x28/0x52
[ 7.029282] [<
ffffffff81002162>] do_one_initcall+0xf2/0x220
[ 7.035702] [<
ffffffff810a4a29>] ? parse_args+0x2c9/0x450
[ 7.041919] [<
ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b
[ 7.048919] [<
ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a
[ 7.055336] [<
ffffffff8184d3f0>] ? rest_init+0x150/0x150
[ 7.061461] [<
ffffffff8184d3fe>] kernel_init+0xe/0x100
[ 7.067393] [<
ffffffff8187b5fc>] ret_from_fork+0x7c/0xb0
[ 7.073518] [<
ffffffff8184d3f0>] ? rest_init+0x150/0x150
[ 7.079642] Code: 01 76 18 89 05 46 04 36 01 41 be 01 00 00 00 e9 2f 02 00 00 0f 1f 80 00 00 00 00 41 be 01 00 00 00 e9 1d 02 00 00 0f 1f 44 00 00 <49> 81 3e c0 31 34 82 b8 01 00 00 00 0f 44 d8 41 83 ff 01 0f 87
[ 7.104944] RIP [<
ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[ 7.112008] RSP <
ffff880455a89ce0>
[ 7.115988] CR2:
00000000000000c8
[ 7.119784] ---[ end trace
13d756f0f462c538 ]---
[ 7.125034] note: swapper/0[1] exited with preempt_count 1
[ 7.131285] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 7.131285]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:29 +0000 (14:07 +0800)]
iommu/vt-d: Fix incorrect iommu_count for si_domain
The iommu_count field in si_domain(static identity domain) is
initialized to zero and never increases. It will underflow
when tearing down iommu unit in function free_dmar_iommu()
and leak memory. So refine code to correctly manage
si_domain->iommu_count.
Warning message caused by si_domain memory leak:
[ 14.609681] IOMMU: Setting RMRR:
[ 14.613496] Ignoring identity map for HW passthrough device 0000:00:1a.0 [0xbdcfd000 - 0xbdd1dfff]
[ 14.623809] Ignoring identity map for HW passthrough device 0000:00:1d.0 [0xbdcfd000 - 0xbdd1dfff]
[ 14.634162] IOMMU: Prepare 0-16MiB unity mapping for LPC
[ 14.640329] Ignoring identity map for HW passthrough device 0000:00:1f.0 [0x0 - 0xffffff]
[ 14.673360] IOMMU: dmar init failed
[ 14.678157] kmem_cache_destroy iommu_devinfo: Slab cache still has objects
[ 14.686076] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[ 14.694176] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.
091020121352 09/10/2012
[ 14.707412]
0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff880c2cc37c00
[ 14.716407]
ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[ 14.725468]
ffffffff81dc7a6a ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[ 14.734464] Call Trace:
[ 14.737453] [<
ffffffff8156223d>] dump_stack+0x4d/0x66
[ 14.743430] [<
ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[ 14.750279] [<
ffffffff81dc7a6a>] intel_iommu_init+0x122/0x56a
[ 14.757035] [<
ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[ 14.763491] [<
ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[ 14.769846] [<
ffffffff81000342>] do_one_initcall+0x122/0x180
[ 14.776506] [<
ffffffff81077738>] ? parse_args+0x1e8/0x320
[ 14.782866] [<
ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[ 14.789994] [<
ffffffff81d84833>] ? do_early_param+0x88/0x88
[ 14.796556] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 14.802626] [<
ffffffff8154ffce>] kernel_init+0xe/0x130
[ 14.808698] [<
ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[ 14.814963] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 14.821640] kmem_cache_destroy iommu_domain: Slab cache still has objects
[ 14.829456] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[ 14.837562] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.
091020121352 09/10/2012
[ 14.850803]
0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88102c1ee3c0
[ 14.861222]
ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[ 14.870284]
ffffffff81dc7a76 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[ 14.879271] Call Trace:
[ 14.882227] [<
ffffffff8156223d>] dump_stack+0x4d/0x66
[ 14.888197] [<
ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[ 14.895034] [<
ffffffff81dc7a76>] intel_iommu_init+0x12e/0x56a
[ 14.901781] [<
ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[ 14.908238] [<
ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[ 14.914594] [<
ffffffff81000342>] do_one_initcall+0x122/0x180
[ 14.921244] [<
ffffffff81077738>] ? parse_args+0x1e8/0x320
[ 14.927598] [<
ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[ 14.934738] [<
ffffffff81d84833>] ? do_early_param+0x88/0x88
[ 14.941309] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 14.947380] [<
ffffffff8154ffce>] kernel_init+0xe/0x130
[ 14.953430] [<
ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[ 14.959689] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 14.966299] kmem_cache_destroy iommu_iova: Slab cache still has objects
[ 14.973923] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[ 14.982020] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.
091020121352 09/10/2012
[ 14.995263]
0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88042cb5c980
[ 15.004265]
ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[ 15.013322]
ffffffff81dc7a82 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[ 15.022318] Call Trace:
[ 15.025238] [<
ffffffff8156223d>] dump_stack+0x4d/0x66
[ 15.031202] [<
ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[ 15.038038] [<
ffffffff81dc7a82>] intel_iommu_init+0x13a/0x56a
[ 15.044786] [<
ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[ 15.051242] [<
ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[ 15.057601] [<
ffffffff81000342>] do_one_initcall+0x122/0x180
[ 15.064254] [<
ffffffff81077738>] ? parse_args+0x1e8/0x320
[ 15.070608] [<
ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[ 15.077747] [<
ffffffff81d84833>] ? do_early_param+0x88/0x88
[ 15.084300] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 15.090362] [<
ffffffff8154ffce>] kernel_init+0xe/0x130
[ 15.096431] [<
ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[ 15.102693] [<
ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[ 15.189273] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:28 +0000 (14:07 +0800)]
iommu/vt-d: Reduce duplicated code to handle virtual machine domains
Reduce duplicated code to handle virtual machine domains, there's no
functionality changes. It also improves code readability.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:27 +0000 (14:07 +0800)]
iommu/vt-d: Free resources if failed to create domain for PCIe endpoint
Enhance function get_domain_for_dev() to release allocated resources
if failed to create domain for PCIe endpoint, otherwise the allocated
resources will get lost.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:26 +0000 (14:07 +0800)]
iommu/vt-d: Simplify function get_domain_for_dev()
Function get_domain_for_dev() is a little complex, simplify it
by factoring out dmar_search_domain_by_dev_info() and
dmar_insert_dev_info().
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:25 +0000 (14:07 +0800)]
iommu/vt-d: Move private structures and variables into intel-iommu.c
Move private structures and variables into intel-iommu.c, which will
help to simplify locking policy for hotplug. Also delete redundant
declarations.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:24 +0000 (14:07 +0800)]
iommu/vt-d: Factor out dmar_alloc_dev_scope() for later reuse
Factor out function dmar_alloc_dev_scope() from dmar_parse_dev_scope()
for later reuse.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:23 +0000 (14:07 +0800)]
iommu/vt-d: Avoid caching stale domain_device_info when hot-removing PCI device
Function device_notifier() in intel-iommu.c only remove domain_device_info
data structure associated with a PCI device when handling PCI device
driver unbinding events. If a PCI device has never been bound to a PCI
device driver, there won't be BUS_NOTIFY_UNBOUND_DRIVER event when
hot-removing the PCI device. So associated domain_device_info data
structure may get lost.
On the other hand, if iommu_pass_through is enabled, function
iommu_prepare_static_indentify_mapping() will create domain_device_info
data structure for each PCIe to PCIe bridge and PCIe endpoint,
no matter whether there are drivers associated with those PCIe devices
or not. So those domain_device_info data structures will get lost when
hot-removing the assocated PCIe devices if they have never bound to
any PCI device driver.
To be even worse, it's not only an memory leak issue, but also an
caching of stale information bug because the memory are kept in
device_domain_list and domain->devices lists.
Fix the bug by trying to remove domain_device_info data structure when
handling BUS_NOTIFY_DEL_DEVICE event.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:22 +0000 (14:07 +0800)]
iommu/vt-d: Avoid caching stale domain_device_info and fix memory leak
Function device_notifier() in intel-iommu.c fails to remove
device_domain_info data structures for PCI devices if they are
associated with si_domain because iommu_no_mapping() returns true
for those PCI devices. This will cause memory leak and caching of
stale information in domain->devices list.
So fix the issue by not calling iommu_no_mapping() and skipping check
of iommu_pass_through.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Jiang Liu [Wed, 19 Feb 2014 06:07:21 +0000 (14:07 +0800)]
iommu/vt-d: Avoid double free of g_iommus on error recovery path
Array 'g_iommus' may be freed twice on error recovery path in function
init_dmars() and free_dmar_iommu(), thus cause random system crash as
below.
[ 6.774301] IOMMU: dmar init failed
[ 6.778310] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 6.785615] software IO TLB [mem 0x76bcf000-0x7abcf000] (64MB) mapped at [
ffff880076bcf000-
ffff88007abcefff]
[ 6.796887] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 6.804173] Modules linked in:
[ 6.807731] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #108
[ 6.815122] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.
1402050741 02/05/2014
[ 6.836000] task:
ffff880455a80000 ti:
ffff880455a88000 task.ti:
ffff880455a88000
[ 6.844487] RIP: 0010:[<
ffffffff8143eea6>] [<
ffffffff8143eea6>] memcpy+0x6/0x110
[ 6.853039] RSP: 0000:
ffff880455a89cc8 EFLAGS:
00010293
[ 6.859064] RAX:
ffff006568636163 RBX:
ffff00656863616a RCX:
0000000000000005
[ 6.867134] RDX:
0000000000000005 RSI:
ffffffff81cdc439 RDI:
ffff006568636163
[ 6.875205] RBP:
ffff880455a89d30 R08:
000000000001bc3b R09:
0000000000000000
[ 6.883275] R10:
0000000000000000 R11:
ffffffff81cdc43e R12:
ffff880455a89da8
[ 6.891338] R13:
ffff006568636163 R14:
0000000000000005 R15:
ffffffff81cdc439
[ 6.899408] FS:
0000000000000000(0000) GS:
ffff88045b800000(0000) knlGS:
0000000000000000
[ 6.908575] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 6.915088] CR2:
ffff88047e1ff000 CR3:
0000000001e0e000 CR4:
00000000001407f0
[ 6.923160] Stack:
[ 6.925487]
ffffffff8143c904 ffff88045b407e00 ffff006568636163 ffff006568636163
[ 6.934113]
ffffffff8120a1a9 ffffffff81cdc43e 0000000000000007 0000000000000000
[ 6.942747]
ffff880455a89da8 ffff006568636163 0000000000000007 ffffffff81cdc439
[ 6.951382] Call Trace:
[ 6.954197] [<
ffffffff8143c904>] ? vsnprintf+0x124/0x6f0
[ 6.960323] [<
ffffffff8120a1a9>] ? __kmalloc_track_caller+0x169/0x360
[ 6.967716] [<
ffffffff81440e1b>] kvasprintf+0x6b/0x80
[ 6.973552] [<
ffffffff81432bf1>] kobject_set_name_vargs+0x21/0x70
[ 6.980552] [<
ffffffff8143393d>] kobject_init_and_add+0x4d/0x90
[ 6.987364] [<
ffffffff812067c9>] ? __kmalloc+0x169/0x370
[ 6.993492] [<
ffffffff8102dbbc>] ? cache_add_dev+0x17c/0x4f0
[ 7.000005] [<
ffffffff8102ddfa>] cache_add_dev+0x3ba/0x4f0
[ 7.006327] [<
ffffffff821a87ca>] ? i8237A_init_ops+0x14/0x14
[ 7.012842] [<
ffffffff821a87f8>] cache_sysfs_init+0x2e/0x61
[ 7.019260] [<
ffffffff81002162>] do_one_initcall+0xf2/0x220
[ 7.025679] [<
ffffffff810a4a29>] ? parse_args+0x2c9/0x450
[ 7.031903] [<
ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b
[ 7.038904] [<
ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a
[ 7.045322] [<
ffffffff8184d5e0>] ? rest_init+0x150/0x150
[ 7.051447] [<
ffffffff8184d5ee>] kernel_init+0xe/0x100
[ 7.057380] [<
ffffffff8187b87c>] ret_from_fork+0x7c/0xb0
[ 7.063503] [<
ffffffff8184d5e0>] ? rest_init+0x150/0x150
[ 7.069628] Code: 89 e5 53 48 89 fb 75 16 80 7f 3c 00 75 05 e8 d2 f9 ff ff 48 8b 43 58 48 2b 43 50 88 43 4e 5b 5d c3 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b
[ 7.094960] RIP [<
ffffffff8143eea6>] memcpy+0x6/0x110
[ 7.100856] RSP <
ffff880455a89cc8>
[ 7.104864] ---[ end trace
b5d3fdc6c6c28083 ]---
[ 7.110142] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 7.110142]
[ 7.120540] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Linus Torvalds [Mon, 3 Mar 2014 02:56:16 +0000 (18:56 -0800)]
Linux 3.14-rc5
Linus Torvalds [Sun, 2 Mar 2014 23:25:45 +0000 (15:25 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Not a huge amount happening, some MAINTAINERS updates, radeon, vmwgfx
and tegra fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/vmwgfx: avoid null pointer dereference at failure paths
drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.
drm/vmwgfx: Remove some unused surface formats
drm/radeon: enable speaker allocation setup on dce3.2
drm/radeon: change audio enable logic
drm/radeon: fix audio disable on dce6+
drm/radeon: free uvd ring on unload
drm/radeon: disable pll sharing for DP on DCE4.1
drm/radeon: fix missing bo reservation
drm/radeon: print the supported atpx function mask
MAINTAINERS: update drm git tree entry
MAINTAINERS: add entry for drm radeon driver
drm/tegra: Add guard to avoid double disable/enable of RGB outputs
gpu: host1x: do not check previously handled gathers
drm/tegra: fix typo 'CONFIG_TEGRA_DRM_FBDEV'
Linus Torvalds [Sun, 2 Mar 2014 23:15:07 +0000 (15:15 -0800)]
Merge tag 'usb-3.14-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are 2 USB patches for 3.14-rc5, one a new device id, and the
other fixes a reported problem with threaded irqs and the USB EHCI
driver"
* tag 'usb-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: ehci: fix deadlock when threadirqs option is used
USB: ftdi_sio: add Cressi Leonardo PID
Linus Torvalds [Sun, 2 Mar 2014 23:13:41 +0000 (15:13 -0800)]
Merge tag 'driver-core-3.14-rc5' of git://git./linux/kernel/git/gregkh/driver-core
Pull sysfs fix from Greg KH:
"Here is a single sysfs fix for 3.14-rc5. It fixes a reported problem
with the namespace code in sysfs"
* tag 'driver-core-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: fix namespace refcnt leak
Linus Torvalds [Sun, 2 Mar 2014 23:12:54 +0000 (15:12 -0800)]
Merge tag 'staging-3.14-rc5' of git://git./linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
"Here are a few IIO fixes, and a new device id for a staging driver for
3.14-rc5. All have been in linux-next for a while, I did a final
merge to get the IIO fixes into this tree, they were incorrectly in
the char-misc tree for a few weeks, and I forgot to tell you to pull
them from there. This makes it a single pull request for you"
* tag 'staging-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8188eu: Add new device ID
staging:iio:adc:MXS:LRADC: fix touchscreen statemachine
iio:gyro: bug on L3GD20H gyroscope support
iio: cm32181: Change cm32181 ambient light sensor driver
iio: cm36651: Fix read/write integration time function.
Dave Airlie [Sun, 2 Mar 2014 23:04:41 +0000 (09:04 +1000)]
Merge branch 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
more radeon fixes
* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: enable speaker allocation setup on dce3.2
drm/radeon: change audio enable logic
drm/radeon: fix audio disable on dce6+
drm/radeon: free uvd ring on unload
drm/radeon: disable pll sharing for DP on DCE4.1
drm/radeon: fix missing bo reservation
drm/radeon: print the supported atpx function mask
Greg Kroah-Hartman [Sun, 2 Mar 2014 22:04:01 +0000 (14:04 -0800)]
Merge iio fixes into staging-linus
These I forgot about before, but need to get into 3.14-final.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Sun, 2 Mar 2014 17:37:07 +0000 (11:37 -0600)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc fixes, most of them on the tooling side"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix strict alias issue for find_first_bit
perf tools: fix BFD detection on opensuse
perf: Fix hotplug splat
perf/x86: Fix event scheduling
perf symbols: Destroy unused symsrcs
perf annotate: Check availability of annotate when processing samples
Dave Airlie [Sun, 2 Mar 2014 10:54:31 +0000 (20:54 +1000)]
Merge tag 'vmwgfx-fixes-3.14-2014-03-02' of git://people.freedesktop.org/~thomash/linux into drm-fixes
A couple of minor fixes.
Pull request of 2014-03-02
* tag 'vmwgfx-fixes-3.14-2014-03-02' of git://people.freedesktop.org/~thomash/linux:
drm/vmwgfx: avoid null pointer dereference at failure paths
drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.
drm/vmwgfx: Remove some unused surface formats
Alexey Khoroshilov [Fri, 28 Feb 2014 21:20:18 +0000 (01:20 +0400)]
drm/vmwgfx: avoid null pointer dereference at failure paths
vmw_takedown_otable_base() and vmw_mob_unbind() check for
potential vmw_fifo_reserve() failure and print error message,
but then immediately dereference NULL pointer.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Thomas Hellstrom [Fri, 28 Feb 2014 12:33:21 +0000 (13:33 +0100)]
drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.
Backing mob contents is propagated to user-space, so make sure backing
mobs are cleared when allocated. This also accidently fix rendering errors
with celestia when emulating legacy mode.
Also update driver date.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Fri, 28 Feb 2014 12:31:04 +0000 (13:31 +0100)]
drm/vmwgfx: Remove some unused surface formats
These formats are deprecated.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Linus Torvalds [Sun, 2 Mar 2014 04:48:14 +0000 (22:48 -0600)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"The VMCOREINFO patch I'll pushing for this release to avoid having a
release with kASLR and but without that information.
I was hoping to include the FPU patches from Suresh, but ran into a
problem (see other thread); will try to make them happen next week"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, kaslr: add missed "static" declarations
x86, kaslr: export offset in VMCOREINFO ELF notes
Linus Torvalds [Sun, 2 Mar 2014 03:33:09 +0000 (21:33 -0600)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"The bulk of the series are bugfixes for qla2xxx target NPIV support
that went in for v3.14-rc1. Also included are a few DIF related
fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
protocol version related fix from Venkatesh.
Also just a heads up that a series to address a number of issues with
iser-target active I/O reset/shutdown is still being tested, and will
be included in a separate -rc6 PULL request"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
qla2xxx: Fix kernel panic on selective retransmission request
Target/sbc: Don't use sg as iterator in sbc_verify_read
target: Add DIF sense codes in transport_generic_request_failure
target/sbc: Fix sbc_dif_copy_prot addr offset bug
tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
qla2xxx: Fix qlt_lport_register base_vha callback race
Linus Torvalds [Sun, 2 Mar 2014 03:30:43 +0000 (21:30 -0600)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dma fixes from Vinod Koul:
"This request brings you two small fixes. First one for fixing
dereference of freed descriptor and second for fixing sdma bindings
for it to work for imx25.
I was planning to send this about 10days ago but then I had to proceed
on my paternity leave and didnt get chance to send this. Now got a
bit of time from dady duties :)"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dma: sdma: Add imx25 compatible
dma: ste_dma40: don't dereference free:d descriptor
Linus Torvalds [Sun, 2 Mar 2014 03:28:38 +0000 (21:28 -0600)]
Merge tag 'pm+acpi-3.14-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These three commits fix a recent intel_pstate regression and two old
bugs that should be fixed in -stable too, one in the ACPI processor
driver and one in the firmare loader.
Specifics:
- One of the recent intel_pstate driver fixes introduced a rounding
error that on some systems causes the frequency to be stuck at the
lowest level forever. Fix from Dirk Brandewie.
- The firmware_class driver's PM notifier doesn't handle the
PM_RESTORE_PREPARE event during hibernation image restore and that
leads to a deadlock on umhelper_sem in __usermodehelper_disable().
Fix from Sebastian Capella.
- acpi_processor_set_throttling() abuses set_cpus_allowed_ptr() in a
nasty way which triggers the WARN_ON_ONCE() in wq_worker_waking_up()
among other things. Fix from Lan Tianyu"
* tag 'pm+acpi-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / processor: Rework processor throttling with work_on_cpu()
PM / hibernate: Fix restore hang in freeze_processes()
intel_pstate: Change busy calculation to use fixed point math.
Ingo Molnar [Sat, 1 Mar 2014 09:13:25 +0000 (10:13 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent build fixes for certain distro environments, from Arnaldo Carvalho de Melo:
* Problem on recent gcc on x86-32 related to strict alias issue for
find_first_bit (Jiri Olsa).
* OpenSuSE: BFD detection problems related to not explicitely listing all
required libraries (Andi Kleen)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Greg Kroah-Hartman [Sat, 1 Mar 2014 01:08:03 +0000 (17:08 -0800)]
Merge tag 'fixes-for-3.14d' of git://git./linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
Fourth set of IIO fixes for the 3.14 kernel.
A single line patch fixing a regression that was introduced in 3.13 in the
reworking of the mxs touch screen and ADC drivers to be interrupt rather
than polling driven. It resulted in a stray double reporting of the release
coordinate in the touch screen driver. The bug lay in the adc side
of the driver which left the statemachine in the wrong state.
Russell King [Fri, 28 Feb 2014 22:40:53 +0000 (22:40 +0000)]
MAINTAINERS: add maintainer entry for Armada DRM driver
Add a maintainers entry for the Armada DRM driver.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 28 Feb 2014 19:53:33 +0000 (11:53 -0800)]
Merge tag 'dm-3.14-fixes-1' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"A few dm-cache fixes, an invalid ioctl handling fix for dm multipath,
a couple immutable biovec fixups for dm mirror, and a few dm-thin
fixes.
There will likely be additional dm-thin metadata and data resize fixes
to include in 3.14-rc6 next week.
Note to stable-minded folks: Immutable biovecs were introduced in
3.14, so the related fixups for dm mirror are not needed in stable@
kernels"
* tag 'dm-3.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: fix truncation bug when mapping I/O to >2TB fast device
dm thin: allow metadata space larger than supported to go unused
dm mpath: fix stalls when handling invalid ioctls
dm thin: fix the error path for the thin device constructor
dm raid1: fix immutable biovec related BUG when retrying read bio
dm io: fix I/O to multiple destinations
dm thin: avoid metadata commit if a pool's thin devices haven't changed
dm cache: do not add migration to completed list before unhooking bio
dm cache: move hook_info into common portion of per_bio_data structure
Linus Torvalds [Fri, 28 Feb 2014 19:50:32 +0000 (11:50 -0800)]
Merge tag 'sound-3.14-rc5' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"It's a bad habit to get a higher volume of fixes often lately, but
things happen again.
All commits found here are real bug fixes, and are mostly trivial.
Most of changes in ASoC are the fixes for enum items due to the wrong
API usages, in addition to a few DAPM mutex deadlock and other fixes.
In HD-audio, only fixups for HP laptops. Although diffstat shows
much, the changes are simple: there are just so many different device
entries there"
* tag 'sound-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: sta32x: Fix wrong enum for limiter2 release rate
ASoC: da732x: Mark DC offset control registers volatile
ALSA: hda/realtek - Add more entry for enable HP mute led
ALSA: hda - Add a fixup for HP Folio 13 mute LED
ASoC: wm8958-dsp: Fix firmware block loading
ASoC: sta32x: Fix cache sync
ALSA: hda/realtek - Add more entry for enable HP mute led
ASoC: dapm: Add locking to snd_soc_dapm_xxxx_pin functions
Input - arizona-haptics: Fix double lock of dapm_mutex
ASoC: wm8400: Fix the wrong number of enum items
ASoC: isabelle: Fix the wrong number of items in enum ctls
ASoC: ad1980: Fix wrong number of items for capture source
ASoC: wm8994: Fix the wrong number of enum items
ASoC: wm8900: Fix the wrong number of enum items
ASoC: wm8770: Fix wrong number of enum items
ASoC: sta32x: Fix array access overflow
ASoC: dapm: Correct regulator bypass error messages
Linus Torvalds [Fri, 28 Feb 2014 19:49:09 +0000 (11:49 -0800)]
Merge tag 'edac_fixes_for_3.14' of git://git./linux/kernel/git/bp/bp
Pull EDAC fixes from Borislav Petkov:
"Two fixes below for PCI devices disappearing when a reference count
underflow happens after a couple of insmod/rmmod cycles in succession"
* tag 'edac_fixes_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
i7300_edac: Fix device reference count
i7core_edac: Fix PCI device reference count
Linus Torvalds [Fri, 28 Feb 2014 19:45:03 +0000 (11:45 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Three x86 fixes and one for ARM/ARM64.
In particular, nested virtualization on Intel is broken in 3.13 and
fixed by this pull request"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm, vmx: Really fix lazy FPU on nested guest
kvm: x86: fix emulator buffer overflow (CVE-2014-0049)
arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT
KVM: MMU: drop read-only large sptes when creating lower level sptes
Linus Torvalds [Fri, 28 Feb 2014 19:43:42 +0000 (11:43 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull ARM64 fixes from Catalin Marinas:
- !CONFIG_SMP build fix
- pte bit testing macros conversion fix (int truncates top bits of
long)
- stack unwinding PC calculation fix
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix !CONFIG_SMP kernel build
arm64: mm: Add double logical invert to pte accessors
ARM64: unwind: Fix PC calculation
Linus Torvalds [Fri, 28 Feb 2014 19:42:33 +0000 (11:42 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here are a few more powerpc fixes for 3.14.
Most of these are also CC'ed to stable and fix bugs in new
functionality introduced in the last 2 or 3 versions"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/powernv: Fix indirect XSCOM unmangling
powerpc/powernv: Fix opal_xscom_{read,write} prototype
powerpc/powernv: Refactor PHB diag-data dump
powerpc/powernv: Dump PHB diag-data immediately
powerpc: Increase stack redzone for 64-bit userspace to 512 bytes
powerpc/ftrace: bugfix for test_24bit_addr
powerpc/crashdump : Fix page frame number check in copy_oldmem_page
powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly
Catalin Marinas [Fri, 28 Feb 2014 16:12:25 +0000 (16:12 +0000)]
arm64: Fix !CONFIG_SMP kernel build
Commit
fb4a96029c8a (arm64: kernel: fix per-cpu offset restore on
resume) uses per_cpu_offset() unconditionally during CPU wakeup,
however, this is only defined for the SMP case.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Dave P Martin <Dave.Martin@arm.com>
Steve Capper [Tue, 25 Feb 2014 11:38:53 +0000 (11:38 +0000)]
arm64: mm: Add double logical invert to pte accessors
Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.
For example:
gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.
This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Heinz Mauelshagen [Thu, 27 Feb 2014 21:46:48 +0000 (22:46 +0100)]
dm cache: fix truncation bug when mapping I/O to >2TB fast device
When remapping a block to the cache's fast device that is larger than
2TB we must not truncate the destination sector to 32bits. The 32bit
temporary result of from_cblock() was being overflowed in
remap_to_cache() due to the logical left shift.
Use an intermediate 64bit type to store the 32bit from_cblock() result
to fix the overflow.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Jiri Olsa [Wed, 26 Feb 2014 17:14:26 +0000 (18:14 +0100)]
perf tools: Fix strict alias issue for find_first_bit
When compiling perf tool code with gcc 4.4.7 I'm getting
following error:
CC util/session.o
cc1: warnings being treated as errors
util/session.c: In function ‘perf_session_deliver_event’:
tools/perf/util/include/linux/bitops.h:109: error: dereferencing pointer ‘p’ does break strict-aliasing rules
tools/perf/util/include/linux/bitops.h:101: error: dereferencing pointer ‘p’ does break strict-aliasing rules
util/session.c:697: note: initialized from here
tools/perf/util/include/linux/bitops.h:101: note: initialized from here
make[1]: *** [util/session.o] Error 1
make: *** [util/session.o] Error 2
The aliased types here are u64 and unsigned long pointers, which is safe
for the find_first_bit processing.
This error shows up for me only for gcc 4.4 on 32bit x86, even for
-Wstrict-aliasing=3, while newer gcc are quiet and scream here for
-Wstrict-aliasing={2,1}. Looks like newer gcc changed the rules for
strict alias warnings.
The gcc documentation offers workaround for valid aliasing by using
__may_alias__ attribute:
http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gcc/Type-Attributes.html
Using this workaround for the find_first_bit function.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1393434867-20271-1-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Benjamin Herrenschmidt [Fri, 28 Feb 2014 05:20:38 +0000 (16:20 +1100)]
powerpc/powernv: Fix indirect XSCOM unmangling
We need to unmangle the full address, not just the register
number, and we also need to support the real indirect bit
being set for in-kernel uses.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.13]
Benjamin Herrenschmidt [Fri, 28 Feb 2014 05:20:29 +0000 (16:20 +1100)]
powerpc/powernv: Fix opal_xscom_{read,write} prototype
The OPAL firmware functions opal_xscom_read and opal_xscom_write
take a 64-bit argument for the XSCOM (PCB) address in order to
support the indirect mode on P8.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.13]
Gavin Shan [Tue, 25 Feb 2014 07:28:38 +0000 (15:28 +0800)]
powerpc/powernv: Refactor PHB diag-data dump
As Ben suggested, the patch prints PHB diag-data with multiple
fields in one line and omits the line if the fields of that
line are all zero.
With the patch applied, the PHB3 diag-data dump looks like:
PHB3 PHB#3 Diag-data (Version: 1)
brdgCtl:
00000002
RootSts:
0000000f 00400000 b0830008 00100147 00002000
nFir:
0000000000000000 0030006e00000000 0000000000000000
PhbSts:
0000001c00000000 0000000000000000
Lem:
0000000000100000 42498e327f502eae 0000000000000000
InAErr:
8000000000000000 8000000000000000 0402030000000000 0000000000000000
PE[ 8] A/B:
8480002b00000000 8000000000000000
[ The current diag data is so big that it overflows the printk
buffer pretty quickly in cases when we get a handful of errors
at once which can happen. --BenH
]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Gavin Shan [Tue, 25 Feb 2014 07:28:37 +0000 (15:28 +0800)]
powerpc/powernv: Dump PHB diag-data immediately
The PHB diag-data is important to help locating the root cause for
EEH errors such as frozen PE or fenced PHB. However, the EEH core
enables IO path by clearing part of HW registers before collecting
this data causing it to be corrupted.
This patch fixes this by dumping the PHB diag-data immediately when
frozen/fenced state on PE or PHB is detected for the first time in
eeh_ops::get_state() or next_error() backend.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Wed, 26 Feb 2014 06:07:38 +0000 (17:07 +1100)]
powerpc: Increase stack redzone for 64-bit userspace to 512 bytes
The new ELFv2 little-endian ABI increases the stack redzone -- the
area below the stack pointer that can be used for storing data --
from 288 bytes to 512 bytes. This means that we need to allow more
space on the user stack when delivering a signal to a 64-bit process.
To make the code a bit clearer, we define new USER_REDZONE_SIZE and
KERNEL_REDZONE_SIZE symbols in ptrace.h. For now, we leave the
kernel redzone size at 288 bytes, since increasing it to 512 bytes
would increase the size of interrupt stack frames correspondingly.
Gcc currently only makes use of 288 bytes of redzone even when
compiling for the new little-endian ABI, and the kernel cannot
currently be compiled with the new ABI anyway.
In the future, hopefully gcc will provide an option to control the
amount of redzone used, and then we could reduce it even more.
This also changes the code in arch_compat_alloc_user_space() to
preserve the expanded redzone. It is not clear why this function would
ever be used on a 64-bit process, though.
Signed-off-by: Paul Mackerras <paulus@samba.org>
CC: <stable@vger.kernel.org> [v3.13]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Liu Ping Fan [Wed, 26 Feb 2014 02:23:01 +0000 (10:23 +0800)]
powerpc/ftrace: bugfix for test_24bit_addr
The branch target should be the func addr, not the addr of func_descr_t.
So using ppc_function_entry() to generate the right target addr.
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Laurent Dufour [Mon, 24 Feb 2014 16:30:55 +0000 (17:30 +0100)]
powerpc/crashdump : Fix page frame number check in copy_oldmem_page
In copy_oldmem_page, the current check using max_pfn and min_low_pfn to
decide if the page is backed or not, is not valid when the memory layout is
not continuous.
This happens when running as a QEMU/KVM guest, where RTAS is mapped higher
in the memory. In that case max_pfn points to the end of RTAS, and a hole
between the end of the kdump kernel and RTAS is not backed by PTEs. As a
consequence, the kdump kernel is crashing in copy_oldmem_page when accessing
in a direct way the pages in that hole.
This fix relies on the memblock's service memblock_is_region_memory to
check if the read page is part or not of the directly accessible memory.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tony Breeds [Thu, 20 Feb 2014 10:13:52 +0000 (21:13 +1100)]
powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly
Currently we're storing a host endian RTAS token in
rtas_stop_self_args.token. We then pass that directly to rtas. This is
fine on big endian however on little endian the token is not what we
expect.
This will typically result in hitting:
panic("Alas, I survived.\n");
To fix this we always use the stop-self token in host order and always
convert it to be32 before passing this to rtas.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rafael J. Wysocki [Thu, 27 Feb 2014 23:14:11 +0000 (00:14 +0100)]
Merge branches 'pm-cpufreq', 'pm-hibernate' and 'acpi-processor'
* pm-cpufreq:
intel_pstate: Change busy calculation to use fixed point math.
* pm-hibernate:
PM / hibernate: Fix restore hang in freeze_processes()
* acpi-processor:
ACPI / processor: Rework processor throttling with work_on_cpu()
Paolo Bonzini [Thu, 27 Feb 2014 21:54:11 +0000 (22:54 +0100)]
kvm, vmx: Really fix lazy FPU on nested guest
Commit
e504c9098ed6 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13)
highlighted a real problem, but the fix was subtly wrong.
nested_read_cr0 is the CR0 as read by L2, but here we want to look at
the CR0 value reflecting L1's setup. In other words, L2 might think
that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually
running it with TS=1, we should inject the fault into L1.
The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, use
it.
Fixes:
e504c9098ed6acd9e1079c5e10e4910724ad429f
Reported-by: Kashyap Chamarty <kchamart@redhat.com>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Tested-by: Kashyap Chamarty <kchamart@redhat.com>
Tested-by: Anthoine Bourgeois <bourgeois@bertin.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>