Christoph Hellwig [Thu, 22 Dec 2016 18:20:45 +0000 (19:20 +0100)]
block: add back plugging in __blkdev_direct_IO
This allows sending larger than 1 MB requests to devices that support
large I/O sizes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Linus Torvalds [Thu, 22 Dec 2016 18:31:30 +0000 (10:31 -0800)]
Merge tag 'leds_for_4.10_email_update' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED maintainer email update from Jacek Anaszewski:
"Update Jacek Anaszewski's email address"
* tag 'leds_for_4.10_email_update' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: Update Jacek Anaszewski's email address
Linus Torvalds [Thu, 22 Dec 2016 18:23:39 +0000 (10:23 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"Just a set of small fixes that have either been queued up after the
original pull for this merge window, or just missed the original pull
request.
- a few bcache fixes/changes from Eric and Kent
- add WRITE_SAME to the command filter whitelist from Mauricio
- kill an unused struct member from Ritesh
- partition IO alignment fix from Stefan
- nvme sysfs printf fix from Stephen"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: check partition alignment
nvme : Use correct scnprintf in cmb show
block: allow WRITE_SAME commands with the SG_IO ioctl
block: Remove unused member (busy) from struct blk_queue_tag
bcache: partition support: add 16 minors per bcacheN device
bcache: Make gc wakeup sane, remove set_task_state()
Linus Torvalds [Thu, 22 Dec 2016 18:19:32 +0000 (10:19 -0800)]
Merge tag 'acpi-extra-4.10-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"Here are new versions of two ACPICA changes that were deferred
previously due to a problem they had introduced, two cleanups on top
of them and the removal of a useless warning message from the ACPI
core.
Specifics:
- Move some Linux-specific functionality to upstream ACPICA and
update the in-kernel users of it accordingly (Lv Zheng)
- Drop a useless warning (triggered by the lack of an optional
object) from the ACPI namespace scanning code (Zhang Rui)"
* tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
ACPICA: Tables: Allow FADT to be customized with virtual address
ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
ACPI: do not warn if _BQC does not exist
Linus Torvalds [Thu, 22 Dec 2016 18:15:05 +0000 (10:15 -0800)]
Merge tag 'pm-fixes-4.10-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"They fix one bug introduced recently, a build warning and a kerneldoc
function description.
Specifics:
- Prevent the acpi-cpufreq driver from crashing on exit by fixing a
check against the __cpuhp_setup_state() return value and fix the
kerneldoc description of that function to make it clear that it may
return positive numbers on success too (Boris Ostrovsky)
- Drop an incorrect __init annotation of a function in the s3c64xx
cpufreq driver and fix a build warning generated (by older
compilers) because of it (Arnd Bergmann)"
* tag 'pm-fixes-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: s3c64xx: remove incorrect __init annotation
cpufreq: Remove CPU hotplug callbacks only if they were initialized
CPU/hotplug: Clarify description of __cpuhp_setup_state() return value
Linus Torvalds [Thu, 22 Dec 2016 18:13:04 +0000 (10:13 -0800)]
Merge tag 'mmc-v4.10-3' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- further fix thread wake-up for requests
- use a bounce buffer to fix DMA issue for SSR register read
MMC host:
- sdhci: Fix a regression for runtime PM
- sdhci-cadence: Add a proper SoC specific DT compatible"
* tag 'mmc-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sd: Meet alignment requirements for raw_ssr DMA
mmc: core: Further fix thread wake-up
mmc: sdhci: Fix to handle MMC_POWER_UNDEFINED
mmc: sdhci-cadence: add Socionext UniPhier specific compatible string
Linus Torvalds [Thu, 22 Dec 2016 18:03:52 +0000 (10:03 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security
Pull SElinux fix from James Morris:
"From Paul:
'A small SELinux patch to fix some clang/llvm compiler warnings and
ensure the tools under scripts work well in the face of kernel
changes'"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: use the kernel headers when building scripts/selinux
Linus Torvalds [Thu, 22 Dec 2016 17:25:45 +0000 (09:25 -0800)]
Merge branch 'x86-cache-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 cache allocation interface from Thomas Gleixner:
"This provides support for Intel's Cache Allocation Technology, a cache
partitioning mechanism.
The interface is odd, but the hardware interface of that CAT stuff is
odd as well.
We tried hard to come up with an abstraction, but that only allows
rather simple partitioning, but no way of sharing and dealing with the
per package nature of this mechanism.
In the end we decided to expose the allocation bitmaps directly so all
combinations of the hardware can be utilized.
There are two ways of associating a cache partition:
- Task
A task can be added to a resource group. It uses the cache
partition associated to the group.
- CPU
All tasks which are not members of a resource group use the group
with which the CPU they are running on is associated.
That allows for simple CPU based partitioning schemes.
The main expected users are:
- Virtualization, so a VM can only trash the associated part of
the cache w/o disturbing others
- Real-Time systems to separate RT and general workloads.
- Latency sensitive enterprise workloads
- In theory this also can be used to protect against cache side
channel attacks"
[ Intel RDT is "Resource Director Technology". The interface really is
rather odd and very specific, which delayed this pull request while I
was thinking about it. The pull request itself came in early during
the merge window, I just delayed it until things had calmed down and I
had more time.
But people tell me they'll use this, and the good news is that it is
_so_ specific that it's rather independent of anything else, and no
user is going to depend on the interface since it's pretty rare. So if
push comes to shove, we can just remove the interface and nothing will
break ]
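The two association modes described in the pull message above can be sketched as a simple lookup: a task that belongs to a resource group uses that group's partition, and any other task falls back to the group of the CPU it is running on. This is a minimal userspace sketch, not the kernel code. The names `struct task`, `cpu_closid` and `effective_closid()` are hypothetical, and treating ID 0 as "no explicit group" is an assumption made for illustration.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a task: closid 0 means "no explicit group". */
struct task {
    unsigned int closid;
};

/* Hypothetical per-CPU table: the group each CPU is associated with. */
static unsigned int cpu_closid[NR_CPUS] = { 0, 0, 2, 2 };

/* Pick the partition ID for a task about to run on the given CPU. */
static unsigned int effective_closid(const struct task *t, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    if (t->closid != 0)
        return t->closid;      /* task-based association takes priority */
    return cpu_closid[cpu];    /* otherwise use the CPU's group */
}
```

With this table, a task with `closid` 3 keeps partition 3 on any CPU, while an ungrouped task picks up partition 2 when it runs on CPU 2 or 3.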
* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
x86/intel_rdt: Implement show_options() for resctrlfs
x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled
x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
x86/intel_rdt: Fix setting of closid when adding CPUs to a group
x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee
x86/intel_rdt: Reset per cpu closids on unmount
x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A
x86/intel_rdt: Prevent deadlock against hotplug lock
x86/intel_rdt: Protect info directory from removal
x86/intel_rdt: Add info files to Documentation
x86/intel_rdt: Export the minimum number of set mask bits in sysfs
x86/intel_rdt: Propagate error in rdt_mount() properly
x86/intel_rdt: Add a missing #include
MAINTAINERS: Add maintainer for Intel RDT resource allocation
x86/intel_rdt: Add scheduler hook
x86/intel_rdt: Add schemata file
x86/intel_rdt: Add tasks files
x86/intel_rdt: Add cpus file
x86/intel_rdt: Add mkdir to resctrl file system
x86/intel_rdt: Add "info" files to resctrl file system
...
Jacek Anaszewski [Thu, 22 Dec 2016 17:03:25 +0000 (18:03 +0100)]
MAINTAINERS: Update Jacek Anaszewski's email address
My previous email address is no longer valid.
From now on, jacek.anaszewski@gmail.com should be used instead.
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Rafael J. Wysocki [Thu, 22 Dec 2016 13:34:55 +0000 (14:34 +0100)]
Merge branch 'pm-cpufreq'
* pm-cpufreq:
cpufreq: s3c64xx: remove incorrect __init annotation
cpufreq: Remove CPU hotplug callbacks only if they were initialized
CPU/hotplug: Clarify description of __cpuhp_setup_state() return value
Rafael J. Wysocki [Thu, 22 Dec 2016 13:34:24 +0000 (14:34 +0100)]
Merge branches 'acpica' and 'acpi-scan'
* acpica:
ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
ACPICA: Tables: Allow FADT to be customized with virtual address
ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
* acpi-scan:
ACPI: do not warn if _BQC does not exist
Gertjan van Wingerde [Wed, 21 Dec 2016 22:35:24 +0000 (23:35 +0100)]
CREDITS: Remove outdated address information
This address hasn't been accurate for several years now.
Simply remove it.
Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 21 Dec 2016 18:59:34 +0000 (10:59 -0800)]
splice: reinstate SIGPIPE/EPIPE handling
Commit 8924feff66f3 ("splice: lift pipe_lock out of splice_to_pipe()")
caused a regression when there were no more readers left on a pipe that
was being spliced into: rather than the expected SIGPIPE and -EPIPE
return value, the writer would end up waiting forever for space to free
up (which obviously was not going to happen with no readers around).
Fixes: 8924feff66f3 ("splice: lift pipe_lock out of splice_to_pipe()")
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Debugged-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # v4.9
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
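The behaviour this commit restores can be illustrated from user space. The sketch below uses a plain write() rather than splice(), on the assumption that both report the same condition for a pipe with no readers. The helper name `write_to_readerless_pipe()` is hypothetical. SIGPIPE is ignored so that the failure is visible as an errno value instead of terminating the process.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write one byte into a pipe whose read end is already closed.
 * Returns the errno seen by the writer, 0 if the write succeeded,
 * or -1 if the pipe could not be created.
 */
static int write_to_readerless_pipe(void)
{
    int fds[2];
    int err = 0;

    /* Process-wide: ignore SIGPIPE so the error shows up as an errno. */
    signal(SIGPIPE, SIG_IGN);

    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);                 /* no readers are left on the pipe */

    if (write(fds[1], "x", 1) < 0)
        err = errno;               /* the writer should fail, not block */

    close(fds[1]);
    return err;
}
```

A writer that blocks forever in this situation, as described in the regression, would never return from the call.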
Linus Torvalds [Wed, 21 Dec 2016 18:47:13 +0000 (10:47 -0800)]
Merge branch 'parisc-4.10-1' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
- add Kernel address space layout randomization support
- re-enable interrupts earlier now that we have a working IRQ stack
- optimize the timer interrupt function to better cope with missed
timer irqs
- fix error return code in parisc perf code (by Dan Carpenter)
- fix PAT debug code
* 'parisc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Optimize timer interrupt function
parisc: perf: return -EFAULT on error
parisc: Enhance CPU detection code on PAT machines
parisc: Re-enable interrupts early
parisc: Enable KASLR
Linus Torvalds [Wed, 21 Dec 2016 18:40:30 +0000 (10:40 -0800)]
Merge tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull more NFS client updates from Trond Myklebust:
"Highlights include:
- further attribute cache improvements to make revalidation more fine
grained
- NFSv4 locking improvements
Bugfixes:
- nfs4_fl_prepare_ds must be careful about reporting success in files
layout
- pNFS/flexfiles: Instead of marking a device inactive, remove it
from the cache"
* tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES
NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES
NFSv4: Place the GETATTR operation before the CLOSE
NFSv4: Also ask for attributes when downgrading to a READ-only state
NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()
pNFS: Return RW layouts on OPEN_DOWNGRADE
NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE
NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID
NFSv4: ensure __nfs4_find_lock_state returns consistent result.
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
pNFS/flexfiles: delete deviceid, don't mark inactive
NFS: Clean up nfs_attribute_timeout()
NFS: Remove unused function nfs_revalidate_inode_rcu()
NFS: Fix and clean up the access cache validity checking
NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
NFS: Clean up cache validity checking
NFS: Don't revalidate the file on close if we hold a delegation
NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN
NFSv4: Update the attribute cache info in update_changeattr
Linus Torvalds [Wed, 21 Dec 2016 18:16:05 +0000 (10:16 -0800)]
Merge branch 'scsi-target-for-v4.10' of git://git./linux/kernel/git/bvanassche/linux
Pull scsi target cleanups from Bart Van Assche:
"The changes here are:
- a few small bug fixes for the iSCSI and user space target drivers.
- minimize the target build time by about 30% by rearranging #include
directives
- fix the second argument passed to percpu_ida_alloc()
- reduce the number of false positive warnings reported by sparse
These patches pass Wu Fengguang's build bot tests and also the
linux-next tests"
* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
iscsi-target: Return error if unable to add network portal
target: Fix spelling mistake and unwrap multi-line text
target/iscsi: Fix double free in lio_target_tiqn_addtpg()
target/user: Fix use-after-free of tcmu_cmds if they are expired
target: Minimize #include directives
target/user: Add an #include directive
cxgbit: Add an #include directive
ibmvscsi_tgt: Add two #include directives
sbp-target: Add an #include directive
qla2xxx: Add an #include directive
configfs: Minimize #include directives
usb: gadget: Fix second argument of percpu_ida_alloc()
sbp-target: Fix second argument of percpu_ida_alloc()
target/user: Fix a data type in tcmu_queue_cmd()
target: Use NULL instead of 0 to represent a pointer
Takashi Iwai [Wed, 21 Dec 2016 10:28:28 +0000 (11:28 +0100)]
Revert "ALSA: usb-audio: Fix race at stopping the stream"
This reverts commit 16200948d8353fe29a473a394d7d26790deae0e7.
The commit was intended to cover the race condition, but it introduced
yet another regression for devices with the implicit feedback, leading
to a kernel panic due to NULL-dereference in an irq context.
As the race condition that was addressed by the commit is very rare
and the regression is much worse, let's revert the commit for rc1, and
fix the issue properly in a later patch.
Fixes: 16200948d835 ("ALSA: usb-audio: Fix race at stopping the stream")
Reported-by: Ioan-Adrian Ratiu <adi@adirat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Moore [Wed, 21 Dec 2016 15:39:25 +0000 (10:39 -0500)]
selinux: use the kernel headers when building scripts/selinux
Commit 3322d0d64f4e ("selinux: keep SELinux in sync with new capability
definitions") added a check on the defined capabilities without
explicitly including the capability header file which caused problems
when building genheaders for users of clang/llvm. Resolve this by
using the kernel headers when building genheaders, which is arguably
the right thing to do regardless, and explicitly including the
kernel's capability.h header file in classmap.h. We also update the
mdp build; even though it wasn't causing an error, we really should
be using the headers from the kernel we are building.
Reported-by: Nicolas Iooss <nicolas.iooss@m4x.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Paul Burton [Fri, 11 Nov 2016 14:22:36 +0000 (14:22 +0000)]
mmc: sd: Meet alignment requirements for raw_ssr DMA
The mmc_read_ssr() function results in DMA to the raw_ssr member of
struct mmc_card, which is not guaranteed to be cache line aligned & thus
might not meet the requirements set out in Documentation/DMA-API.txt:
Warnings: Memory coherency operates at a granularity called the cache
line width. In order for memory mapped by this API to operate
correctly, the mapped region must begin exactly on a cache line
boundary and end exactly on one (to prevent two separately mapped
regions from sharing a single cache line). Since the cache line size
may not be known at compile time, the API will not enforce this
requirement. Therefore, it is recommended that driver writers who
don't take special care to determine the cache line size at run time
only map virtual regions that begin and end on page boundaries (which
are guaranteed also to be cache line boundaries).
On some systems where DMA is non-coherent this can lead to us losing
data that shares cache lines with the raw_ssr array.
Fix this by kmalloc'ing a temporary buffer to perform DMA into. kmalloc
will ensure the buffer is suitably aligned, allowing the DMA to be
performed without any loss of data.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 5275a652d296 ("mmc: sd: Export SD Status via “ssr” device attribute")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
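The bounce-buffer approach described in this commit can be sketched in user space as follows. This is an illustration only, not the driver code: `struct card`, `fake_dma_read()` and `read_ssr_bounced()` are hypothetical names, the 64-byte cache line is an assumed value, and C11 aligned_alloc() stands in for the kernel's kmalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64   /* assumed cache line size for this sketch */
#define SSR_BYTES  64   /* assumed size of the status block */

/* Hypothetical card structure: raw_ssr shares cache lines with neighbours. */
struct card {
    uint32_t before;
    uint32_t raw_ssr[SSR_BYTES / 4];
    uint32_t after;
};

/* Stand-in for a device transfer: fills len bytes at dst. */
static void fake_dma_read(void *dst, size_t len)
{
    memset(dst, 0xAB, len);
}

/* Read the status through a separately allocated, aligned bounce buffer. */
static int read_ssr_bounced(struct card *c)
{
    void *buf = aligned_alloc(CACHE_LINE, SSR_BYTES);

    if (!buf)
        return -1;
    assert(((uintptr_t)buf % CACHE_LINE) == 0);

    fake_dma_read(buf, SSR_BYTES);                /* device writes here only */
    memcpy(c->raw_ssr, buf, sizeof(c->raw_ssr));  /* CPU copies the result */
    free(buf);
    return 0;
}
```

Because the transfer targets a buffer that owns its cache lines, the members next to `raw_ssr` are never touched by the device.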
Arnd Bergmann [Fri, 16 Dec 2016 09:06:15 +0000 (10:06 +0100)]
cpufreq: s3c64xx: remove incorrect __init annotation
s3c64xx_cpufreq_config_regulator is incorrectly annotated
as __init, since the caller is also not init:
WARNING: vmlinux.o(.text+0x92fe1c): Section mismatch in reference from the function s3c64xx_cpufreq_driver_init() to the function .init.text:s3c64xx_cpufreq_config_regulator()
With modern gcc versions, the function gets inlined, so we don't
see the warning; this only happens with gcc-4.6 and older.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Boris Ostrovsky [Thu, 15 Dec 2016 15:00:58 +0000 (10:00 -0500)]
cpufreq: Remove CPU hotplug callbacks only if they were initialized
Since CPU hotplug callbacks are requested for CPUHP_AP_ONLINE_DYN state,
successful callback initialization will result in cpuhp_setup_state()
returning a positive value. Therefore acpi_cpufreq_online being zero
indicates that callbacks have not been installed.
This means that acpi_cpufreq_boost_exit() should only remove them if
acpi_cpufreq_online is positive. Trying to call
cpuhp_remove_state_nocalls(0) will cause a BUG().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
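The guard described in this commit can be modelled with a small sketch. All names here (`fake_setup_dyn_state()`, `fake_remove_state()`, `driver_exit()`) are hypothetical stand-ins rather than kernel APIs, and the assert() stands in for the BUG() mentioned above.

```c
#include <assert.h>

/* Hypothetical model: dynamic state slots are numbered from 1 upward. */
static int next_dyn_state = 1;
static int removed_count;

/* Returns a positive state number on success, a negative value on failure. */
static int fake_setup_dyn_state(int fail)
{
    return fail ? -12 : next_dyn_state++;
}

/* Removing a state that was never assigned is treated as a fatal error. */
static void fake_remove_state(int state)
{
    assert(state > 0);
    removed_count++;
}

/* Exit path: only undo the registration if it actually happened. */
static void driver_exit(int online_state)
{
    if (online_state > 0)
        fake_remove_state(online_state);
}
```

Calling `driver_exit(0)` is then a no-op, whereas an unguarded removal of state 0 would trip the assertion.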
Boris Ostrovsky [Thu, 15 Dec 2016 15:00:57 +0000 (10:00 -0500)]
CPU/hotplug: Clarify description of __cpuhp_setup_state() return value
When invoked with CPUHP_AP_ONLINE_DYN state __cpuhp_setup_state()
is expected to return positive value which is the hotplug state that
the routine assigns.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lv Zheng [Wed, 14 Dec 2016 07:04:46 +0000 (15:04 +0800)]
ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
Since all users are cleaned up, remove the 2 deprecated APIs due to no
users.
As a Linux variable rather than an ACPICA variable, acpi_gbl_permanent_mmap
is renamed to acpi_permanent_mmap to have a consistent coding style across
entire Linux ACPI subsystem.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lv Zheng [Wed, 14 Dec 2016 07:04:39 +0000 (15:04 +0800)]
ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
This patch removes the users of the deprecated APIs:
acpi_get_table_with_size()
early_acpi_os_unmap_memory()
The following APIs should be used instead:
acpi_get_table()
acpi_put_table()
The deprecated APIs were invented to be a replacement of acpi_get_table()
during the early stage so that the early mapped pointer will not be stored
in ACPICA core and thus the late stage acpi_get_table() won't return a
wrong pointer. The mapping size is returned just because it is required by
early_acpi_os_unmap_memory() to unmap the pointer during early stage.
But as the mapping size equals the acpi_table_header.length
(see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), when
such a convenient result is returned, driver code will start to use it
instead of accessing acpi_table_header to obtain the length.
Thus this patch cleans up the drivers by replacing returned table size with
acpi_table_header.length, and should be a no-op.
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lv Zheng [Wed, 14 Dec 2016 07:04:33 +0000 (15:04 +0800)]
ACPICA: Tables: Allow FADT to be customized with virtual address
ACPICA commit d98de9ca14891130efc5dcdc871b97eb27b4b0f5
FADT parsing code requires FADT to be installed as
ACPI_TABLE_ORIGIN_INTERNAL_PHYSICAL, using new
acpi_tb_get_table()/acpi_tb_put_table(), other address types can also be allowed,
thus facilitates FADT customization with virtual address. Lv Zheng.
Link: https://github.com/acpica/acpica/commit/d98de9ca
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lv Zheng [Wed, 14 Dec 2016 07:04:25 +0000 (15:04 +0800)]
ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
ACPICA commit cac6790954d4d752a083e6122220b8a22febcd07
This patch back ports Linux acpi_get_table_with_size() and
early_acpi_os_unmap_memory() into ACPICA upstream to reduce divergences.
The 2 APIs have been used by Linux as table management APIs for a long time; they
contain a hidden logic that, during the early stage, the mapped tables
should be unmapped before the early stage ends.
During the early stage, tables are handled by the following sequence:
acpi_get_table_with_size();
parse the table
early_acpi_os_unmap_memory();
During the late stage, tables are handled by the following sequence:
acpi_get_table();
parse the table
Linux uses acpi_gbl_permanent_mmap to distinguish the early stage and the
late stage.
The reasoning of introducing acpi_get_table_with_size() is: ACPICA will
remember the early mapped pointer in acpi_get_table() and Linux isn't able to
prevent ACPICA from using the wrong early mapped pointer during the late
stage as there is no API provided from ACPICA to be an inverse of
acpi_get_table() to forget the early mapped pointer.
But how can ACPICA work with the early/late stage requirement? Inside of
ACPICA, tables are ensured to remain in "INSTALLED" state during the
early stage, and they are carefully not transitioned to "VALIDATED" state
until the late stage. So the same logic is in fact implemented inside of
ACPICA in a different way. The gap is only that the feature is not provided
to the OSPMs in an accessible external API style.
It then is possible to fix the gap by providing an inverse of
acpi_get_table() from ACPICA, so that the two Linux sequences can be
combined:
acpi_get_table();
parse the table
acpi_put_table();
In order to work easier with the current Linux code, acpi_get_table() and
acpi_put_table() are implemented in a usage counting based style:
1. When the usage count of the table is increased from 0 to 1, table is
mapped and .Pointer is set with the mapping address (VALIDATED);
2. When the usage count of the table is decreased from 1 to 0, .Pointer
is unset and the mapping address is unmapped (INVALIDATED).
So that we can deploy the new APIs to Linux with minimal effort by just
invoking acpi_get_table() in acpi_get_table_with_size() and invoking
acpi_put_table() in early_acpi_os_unmap_memory(). Lv Zheng.
Link: https://github.com/acpica/acpica/commit/cac67909
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
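The usage-counting scheme described in this commit (map on the 0 to 1 transition, unmap on the 1 to 0 transition) can be sketched as below. This is a simplified model, not the ACPICA implementation: `struct table_desc`, `table_get()` and `table_put()` are hypothetical names, and mapping is reduced to copying a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table descriptor, loosely following the description above. */
struct table_desc {
    unsigned int use_count;
    void *pointer;          /* non-NULL only while the table is mapped */
    void *backing;          /* stands in for the table's real location */
};

/* Take a reference; the first user triggers the mapping. */
static void *table_get(struct table_desc *d)
{
    if (d->use_count++ == 0)
        d->pointer = d->backing;   /* 0 -> 1: map, set .pointer */
    return d->pointer;
}

/* Drop a reference; the last user triggers the unmapping. */
static void table_put(struct table_desc *d)
{
    assert(d->use_count > 0);
    if (--d->use_count == 0)
        d->pointer = NULL;         /* 1 -> 0: unset .pointer, unmap */
}
```

Nested get/put pairs leave the mapping in place until the outermost put, which is what allows the early-stage and late-stage sequences to share one code path.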
Linus Torvalds [Tue, 20 Dec 2016 23:48:34 +0000 (15:48 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes and cleanups from David Miller:
1) Use rb_entry() instead of hardcoded container_of(), from Geliang
Tang.
2) Use correct memory barriers in stmmac driver, from Pavel Machek.
3) Fix assoc bind address handling in SCTP, from Xin Long.
4) Make the length check for UFO handling consistent between
__ip_append_data() and ip_finish_output(), from Zheng Li.
5) HSI driver compatible strings were busted for hix5hd2, from Dongpo
Li.
6) Handle devm_ioremap() errors properly in cavium driver, from Arvind
Yadav.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
RDS: use rb_entry()
net_sched: sch_netem: use rb_entry()
net_sched: sch_fq: use rb_entry()
net/mlx5: use rb_entry()
ethernet: sfc: Add Kconfig entry for vendor Solarflare
sctp: not copying duplicate addrs to the assoc's bind address list
sctp: reduce indent level in sctp_copy_local_addr_list
ARM: dts: hix5hd2: don't change the existing compatible string
net: hix5hd2_gmac: fix compatible strings name
openvswitch: Add a missing break statement.
net: netcp: ethss: fix 10gbe host port tx pri map configuration
net: netcp: ethss: fix errors in ethtool ops
fsl/fman: enable compilation on ARM64
fsl/fman: A007273 only applies to PPC SoCs
powerpc: fsl/fman: remove fsl,fman from of_device_ids[]
fsl/fman: fix 1G support for QSGMII interfaces
dt: bindings: net: use boolean dt properties for eee broken modes
net: phy: use boolean dt properties for eee broken modes
net: phy: fix sign type error in genphy_config_eee_advert
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
...
Linus Torvalds [Tue, 20 Dec 2016 23:24:32 +0000 (15:24 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge final set of updates from Andrew Morton:
- a series to make IMA play better across kexec
- a handful of random fixes
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
printk: fix typo in CONSOLE_LOGLEVEL_DEFAULT help text
ratelimit: fix WARN_ON_RATELIMIT return value
kcov: make kcov work properly with KASLR enabled
arm64: setup: introduce kaslr_offset()
mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED
ima: platform-independent hash value
ima: define a canonical binary_runtime_measurements list format
ima: support restoring multiple template formats
ima: store the builtin/custom template definitions in a list
ima: on soft reboot, save the measurement list
powerpc: ima: send the kexec buffer to the next kernel
ima: maintain memory size needed for serializing the measurement list
ima: permit duplicate measurement list entries
ima: on soft reboot, restore the measurement list
powerpc: ima: get the kexec buffer passed by the previous kernel
Linus Torvalds [Tue, 20 Dec 2016 23:22:01 +0000 (15:22 -0800)]
Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
- new features (poll and SRAM usage) added to the mailbox-test driver
- major update of Broadcom's PDC controller driver
- minor fix for auto-loading test and STI driver modules
* 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: mailbox-test: allow reserved areas in SRAM
mailbox: mailbox-test: add support for fasync/poll
mailbox: bcm-pdc: Remove unnecessary void* casts
mailbox: bcm-pdc: Simplify interrupt handler logic
mailbox: bcm-pdc: Performance improvements
mailbox: bcm-pdc: Don't use iowrite32 to write DMA descriptors
mailbox: bcm-pdc: Convert from threaded IRQ to tasklet
mailbox: bcm-pdc: Try to improve branch prediction
mailbox: bcm-pdc: streamline rx code
mailbox: bcm-pdc: Convert from interrupts to poll for tx done
mailbox: bcm-pdc: PDC driver leaves debugfs files after removal
mailbox: bcm-pdc: Changes so mbox client can be removed / re-inserted
mailbox: bcm-pdc: Use octal permissions rather than symbolic
mailbox: sti: Fix module autoload for OF registration
mailbox: mailbox-test: Fix module autoload
Linus Torvalds [Tue, 20 Dec 2016 23:19:55 +0000 (15:19 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang.
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mux: mlxcpld: fix i2c mux selection caching
i2c: designware: fix wrong Tx/Rx FIFO for ACPI
i2c: xgene: Fix missing code of DTB support
i2c: mux: pca954x: fix i2c mux selection caching
i2c: octeon: thunderx: Limit register access retries
Linus Torvalds [Tue, 20 Dec 2016 23:17:55 +0000 (15:17 -0800)]
Merge tag 'doc-4.10-3' of git://git.lwn.net/linux
Pull documentation fix from Jonathan Corbet:
"A single fix for the build system.
It would appear that the docutils developers, in their wisdom, broke
the API in the 0.13 release. This fix detects the breakage and allows
the docs to be built with both the old and new versions"
* tag 'doc-4.10-3' of git://git.lwn.net/linux:
docs: sphinx-extensions: make rstFlatTable work with docutils 0.13
Linus Torvalds [Tue, 20 Dec 2016 23:16:00 +0000 (15:16 -0800)]
Merge tag 'microblaze-4.10-rc1' of git://git.monstr.eu/linux-2.6-microblaze
Pull arch/microblaze updates from Michal Simek:
- wire-up new syscalls
- add new codes and fpga families
- fix a return value
* tag 'microblaze-4.10-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Add new fpga families
microblaze: Add missing release version code v9.6 and v10
microblaze: Add missing syscalls
microblaze: Fix return value from xilinx_timer_init
Linus Torvalds [Tue, 20 Dec 2016 22:48:53 +0000 (14:48 -0800)]
Merge tag 'xtensa-20161219' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa updates from Max Filippov:
- enable HAVE_DMA_CONTIGUOUS, configure shared DMA pool reservation in
kc705 DTS
- update xtensa DMA-related Documentation/features entries
- clean up arch/xtensa/kernel/setup.c: move S32C1I self-test out of it,
remove unused declarations, fix screen_info definition
* tag 'xtensa-20161219' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: update DMA-related Documentation/features entries
xtensa: configure shared DMA pool reservation in kc705 DTS
xtensa: enable HAVE_DMA_CONTIGUOUS
xtensa: move S32C1I self-test to a separate file
xtensa: fix screen_info, clean up unused declarations in setup.c
Helge Deller [Tue, 20 Dec 2016 19:51:10 +0000 (20:51 +0100)]
parisc: Optimize timer interrupt function
Restructure the timer interrupt function to better cope with missed timer irqs.
Optimize the calculation when the next interrupt should happen and skip irqs if
they would happen too shortly after exit of the irq function.
The update_process_times() call is done anyway at every timer irq, so we can
safely drop the prof_counter and prof_multiplier variables from the per_cpu
structure.
Signed-off-by: Helge Deller <deller@gmx.de>
Geliang Tang [Tue, 20 Dec 2016 14:02:18 +0000 (22:02 +0800)]
RDS: use rb_entry()
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
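The relationship between the two macros named in this commit can be shown with simplified userspace versions. The macro definitions below omit the kernel's type checking, `struct rb_node` is reduced to a minimal stand-in, and `struct flow` and `flow_from_node()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace versions of the kernel macros (no type checking). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

/* Minimal stand-in for the kernel's struct rb_node. */
struct rb_node {
    struct rb_node *left, *right;
};

/* Hypothetical object linked into a tree through its node member. */
struct flow {
    int id;
    struct rb_node node;
};

/* Recover the containing object from a pointer to its embedded node. */
static struct flow *flow_from_node(struct rb_node *n)
{
    assert(n != NULL);
    return rb_entry(n, struct flow, node);
}
```

Both spellings compute the same address; rb_entry() simply states at the call site that the pointer is an rbtree node.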
Geliang Tang [Tue, 20 Dec 2016 14:02:16 +0000 (22:02 +0800)]
net_sched: sch_netem: use rb_entry()
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Tue, 20 Dec 2016 14:02:15 +0000 (22:02 +0800)]
net_sched: sch_fq: use rb_entry()
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Tue, 20 Dec 2016 14:02:14 +0000 (22:02 +0800)]
net/mlx5: use rb_entry()
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
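The rb_entry() conversions in the four commits above do not change behaviour, because the kernel defines rb_entry() as a thin wrapper around container_of(). A minimal userspace sketch of that relationship follows; struct flow and flow_from_node() are hypothetical names used only for illustration, and the rb_node and container_of() definitions are simplified stand-ins for the kernel ones.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct rb_node. */
struct rb_node {
	struct rb_node *rb_left;
	struct rb_node *rb_right;
};

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* rb_entry() is container_of() under a name that signals rbtree use. */
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

/* Hypothetical structure that embeds an rbtree node. */
struct flow {
	int id;
	struct rb_node node;
};

/* Recover the enclosing struct flow from its embedded rb_node. */
static struct flow *flow_from_node(struct rb_node *n)
{
	assert(n != NULL);
	return rb_entry(n, struct flow, node);
}
```

Both spellings compute the same pointer; the rb_entry() name only tells the reader that the member is an rbtree node.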
Tobias Klauser [Tue, 20 Dec 2016 13:38:26 +0000 (14:38 +0100)]
ethernet: sfc: Add Kconfig entry for vendor Solarflare
Since commit 5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support
into new sfc-falcon driver") there are two drivers for Solarflare devices,
but both still show up directly beneath "Ethernet driver support" in the
Kconfig. Follow the pattern of other vendors and group them beneath their
own vendor Kconfig entry for Solarflare.
Cc: Edward Cree <ecree@solarflare.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Dec 2016 19:15:45 +0000 (14:15 -0500)]
Merge branch 'sctp-fixes'
Xin Long says:
====================
sctp: fix the issue that may copy duplicate addrs into assoc's bind address list
Patch 1/2 cleans up an indent level.
Given that we have kernels out there with this issue, patch 2/2 also
fixes sctp_raw_to_bind_addrs.
v1 -> v2:
Explain why we didn't filter the duplicate addresses when global
address list gets updated in patch 2/2 changelog.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 20 Dec 2016 05:49:50 +0000 (13:49 +0800)]
sctp: not copying duplicate addrs to the assoc's bind address list
sctp.local_addr_list is a global address list that is supposed to include
all the local addresses. sctp updates this list according to NETDEV_UP/
NETDEV_DOWN notifications.
However, if multiple NICs have the same address, the global list would
have duplicate addresses. Even with a single NIC, promoting secondaries
in __inet_del_ifa can also lead to accumulating duplicate addresses.
When sctp binds address 'ANY' and creates a connection, it copies all
the addresses from global list into asoc's bind addr list, which makes
sctp pack the duplicate addresses into INIT/INIT_ACK packets.
This patch filters the duplicate addresses when copying the addrs
from global list in sctp_copy_local_addr_list and unpacking addr_param
from cookie in sctp_raw_to_bind_addrs to asoc's bind addr list.
Note that we can't filter the duplicate addrs when global address list
gets updated, as a NETDEV_DOWN event may remove an addr that still exists
on another NIC.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
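The filtering described above can be sketched in userspace as a copy loop that skips addresses already present in the destination. This is only an illustration under simplifying assumptions: addr_t, addr_in_list() and copy_unique_addrs() are hypothetical names, addresses are plain IPv4 integers, and the lists are arrays rather than the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified address type: one IPv4 address per entry. */
typedef uint32_t addr_t;

/* Return 1 if address a is already among the first n entries of list. */
static int addr_in_list(const addr_t *list, size_t n, addr_t a)
{
	for (size_t i = 0; i < n; i++)
		if (list[i] == a)
			return 1;
	return 0;
}

/*
 * Copy src into dst while dropping duplicates. The source list is left
 * untouched, mirroring the point that the global list itself cannot be
 * deduplicated. Returns the number of entries written to dst.
 */
static size_t copy_unique_addrs(addr_t *dst, size_t cap,
				const addr_t *src, size_t n)
{
	size_t used = 0;

	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < n; i++) {
		if (addr_in_list(dst, used, src[i]))
			continue;	/* duplicate: skip it */
		if (used == cap)
			break;		/* destination is full */
		dst[used++] = src[i];
	}
	return used;
}
```

The early continue and break also keep the loop body flat, which is the same style the indent cleanup in patch 1/2 aims for.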
Xin Long [Tue, 20 Dec 2016 05:49:49 +0000 (13:49 +0800)]
sctp: reduce indent level in sctp_copy_local_addr_list
This patch reduces the indent level by using continue when the addr
is not allowed, and also drops end_copy by using break.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Dec 2016 19:12:30 +0000 (14:12 -0500)]
Merge branch 'hix5hd2_gmac-compatible-string'
Dongpo Li says:
====================
net: hix5hd2_gmac: keep the compatible string not changed
This patch series fixes the patch:
d0fb6ba75dc0 ("net: hix5hd2_gmac: add generic compatible string")
The SoC hix5hd2 compatible string has the suffix "-gmac" and
we should not change its compatible string.
So we should name all the compatible strings with the suffix "-gmac".
Creating a new name suffix "-gemac" is unnecessary.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dongpo Li [Tue, 20 Dec 2016 02:09:29 +0000 (10:09 +0800)]
ARM: dts: hix5hd2: don't change the existing compatible string
The SoC hix5hd2 compatible string has the suffix "-gmac" and
we should not change it.
We should only add the generic compatible string "hisi-gmac-v1".
Fixes: 0855950ba580 ("ARM: dts: hix5hd2: add gmac generic compatible and clock names")
Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dongpo Li [Tue, 20 Dec 2016 02:09:28 +0000 (10:09 +0800)]
net: hix5hd2_gmac: fix compatible strings name
The SoC hix5hd2 compatible string has the suffix "-gmac" and
we should not change its compatible string.
So we should name all the compatible strings with the suffix "-gmac".
Creating a new name suffix "-gemac" is unnecessary.
We also add another SoC compatible string in dt binding documentation
and describe which generic version the SoC belongs to.
Fixes: d0fb6ba75dc0 ("net: hix5hd2_gmac: add generic compatible string")
Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarno Rajahalme [Tue, 20 Dec 2016 01:06:33 +0000 (17:06 -0800)]
openvswitch: Add a missing break statement.
Add a break statement to prevent fall-through from
OVS_KEY_ATTR_ETHERNET to OVS_KEY_ATTR_TUNNEL. Without the break,
actions setting ethernet addresses fail to validate, with log messages
complaining about invalid tunnel attributes.
Fixes: 0a6410fbde ("openvswitch: netlink: support L3 packets")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
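The effect of the missing break can be reproduced with a small switch statement. The sketch below is not the openvswitch code: enum key_attr, validate_tunnel() and the two validate_*() functions are hypothetical, and the tunnel check is reduced to a flag.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical attribute types standing in for the OVS key attributes. */
enum key_attr { KEY_ATTR_ETHERNET, KEY_ATTR_TUNNEL };

/* Hypothetical tunnel check: fails unless valid tunnel data is present. */
static int validate_tunnel(int tunnel_ok)
{
	return tunnel_ok ? 0 : -EINVAL;
}

/* Buggy variant: the ethernet case runs on into the tunnel validation. */
static int validate_buggy(enum key_attr a, int tunnel_ok)
{
	int err = 0;

	switch (a) {
	case KEY_ATTR_ETHERNET:
		err = 0;	/* ethernet needs no further checks here */
		/* fall through - this is the bug */
	case KEY_ATTR_TUNNEL:
		err = validate_tunnel(tunnel_ok);
		break;
	}
	return err;
}

/* Fixed variant: the break keeps the two cases independent. */
static int validate_fixed(enum key_attr a, int tunnel_ok)
{
	int err = 0;

	switch (a) {
	case KEY_ATTR_ETHERNET:
		err = 0;
		break;
	case KEY_ATTR_TUNNEL:
		err = validate_tunnel(tunnel_ok);
		break;
	}
	return err;
}
```

With no tunnel data present, the buggy variant rejects an ethernet attribute with a tunnel error, which matches the misleading log messages the commit describes.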
WingMan Kwok [Mon, 19 Dec 2016 22:55:57 +0000 (17:55 -0500)]
net: netcp: ethss: fix 10gbe host port tx pri map configuration
This patch adds the missing 10gbe host port tx priority map
configurations.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WingMan Kwok [Mon, 19 Dec 2016 22:55:56 +0000 (17:55 -0500)]
net: netcp: ethss: fix errors in ethtool ops
The ethtool ops need to retrieve the corresponding ethss module (gbe or
xgbe) from the net_device structure. Prior to this patch, the lookup
only checked for the gbe module. This patch fixes the issue by checking
the xgbe module if the net_device structure does not correspond to the
gbe module.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Dec 2016 18:55:35 +0000 (13:55 -0500)]
Merge branch 'fsl-fixes'
Madalin Bucur says:
====================
fsl/fman: fixes for ARM
The patch set fixes advertised speeds for QSGMII interfaces, disables
A007273 erratum workaround on non-PowerPC platforms where it does not
apply, enables compilation on ARM64 and addresses a probing issue on
non PPC platforms.
Changes from v3: removed redundant comment, added ack by Scott
Changes from v2: merged fsl/fman changes to avoid a point of failure
Changes from v1: unifying probing on all supported platforms
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 19 Dec 2016 20:42:46 +0000 (22:42 +0200)]
fsl/fman: enable compilation on ARM64
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 19 Dec 2016 20:42:45 +0000 (22:42 +0200)]
fsl/fman: A007273 only applies to PPC SoCs
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 19 Dec 2016 20:42:44 +0000 (22:42 +0200)]
powerpc: fsl/fman: remove fsl,fman from of_device_ids[]
The fsl/fman drivers will use of_platform_populate() on all
supported platforms. Call of_platform_populate() to probe the
FMan sub-nodes.
Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 19 Dec 2016 20:42:43 +0000 (22:42 +0200)]
fsl/fman: fix 1G support for QSGMII interfaces
QSGMII ports were not advertising 1G speed.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Dec 2016 18:50:51 +0000 (13:50 -0500)]
Merge branch 'phy-broken-modes'
Jerome Brunet says:
====================
phy: Fix integration of eee-broken-modes
The purpose of this series is to fix the integration of the ethernet phy
property "eee-broken-modes" [0]
The v3 of this series has been merged, missing a fix (error reported by
kbuild robot) available in the v4 [1]
More importantly, Florian opposed adding a DT property mapping a device
register this directly [2]. The concern was that the property could be
abused to implement platform configuration policy. After discussing it,
I think we agreed that such information about the HW (defect) should appear
in the platform DT. However, the preferred way is to add a boolean property
for each EEE broken mode.
[0]: http://lkml.kernel.org/r/1480326409-25419-1-git-send-email-jbrunet@baylibre.com
[1]: http://lkml.kernel.org/r/1480348229-25672-1-git-send-email-jbrunet@baylibre.com
[2]: http://lkml.kernel.org/r/e14a3b0c-dc34-be14-48b3-518a0ad0c080@gmail.com
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
jbrunet [Mon, 19 Dec 2016 15:05:38 +0000 (16:05 +0100)]
dt: bindings: net: use boolean dt properties for eee broken modes
The patches regarding eee-broken-modes were merged before all people
involved could find an agreement on the best way to move forward.
While we agreed on having a DT property to mark particular modes as broken,
the value used for eee-broken-modes mapped the phy register in a very direct
way. Because of this, the concern is that it could be used to implement
configuration policies instead of describing broken HW.
In the end, having a boolean property for each mode seems to be preferred
over one bit field value mapping the register (too) directly.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
jbrunet [Mon, 19 Dec 2016 15:05:37 +0000 (16:05 +0100)]
net: phy: use boolean dt properties for eee broken modes
The patches regarding eee-broken-modes were merged before all people
involved could find an agreement on the best way to move forward.
While we agreed on having a DT property to mark particular modes as broken,
the value used for eee-broken-modes mapped the phy register in a very direct
way. Because of this, the concern is that it could be used to implement
configuration policies instead of describing broken HW.
In the end, having a boolean property for each mode seems to be preferred
over one bit field value mapping the register (too) directly.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
jbrunet [Mon, 19 Dec 2016 15:05:36 +0000 (16:05 +0100)]
net: phy: fix sign type error in genphy_config_eee_advert
In genphy_config_eee_advert, the return value of phy_read_mmd_indirect is
checked to know if the register could be accessed but the result is
assigned to a 'u32'.
Changing to 'int' to correctly get errors from phy_read_mmd_indirect.
Fixes: d853d145ea3e ("net: phy: add an option to disable EEE advertisement")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
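The sign problem fixed above is easy to reproduce outside the kernel. In this sketch, read_reg() is a hypothetical stand-in for a register read that returns either a 16-bit value or a negative errno; the two wrappers show why the result must be kept in an int before it is tested.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical register read: a 16-bit value on success, -errno on failure. */
static int read_reg(int fail)
{
	return fail ? -EIO : 0x0006;
}

/* Buggy pattern: the result is stored in an unsigned variable first. */
static int read_checked_u32(int fail)
{
	uint32_t val = (uint32_t)read_reg(fail);

	if (val < 0)		/* never true for an unsigned type */
		return -EIO;
	return 0;		/* a failed read is treated as valid data */
}

/* Fixed pattern: an int keeps negative error codes visible. */
static int read_checked_int(int fail)
{
	int val = read_reg(fail);

	if (val < 0)
		return val;
	assert(val <= 0xffff);	/* successful reads fit in 16 bits */
	return 0;
}
```

Compilers can flag the dead comparison in the buggy variant when type-limit warnings are enabled, which is one way such errors get reported.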
Borislav Petkov [Tue, 20 Dec 2016 00:23:15 +0000 (16:23 -0800)]
printk: fix typo in CONSOLE_LOGLEVEL_DEFAULT help text
s/prink/printk/
Link: http://lkml.kernel.org/r/20161215170111.19075-1-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Tue, 20 Dec 2016 00:23:12 +0000 (16:23 -0800)]
ratelimit: fix WARN_ON_RATELIMIT return value
The macro is meant to be used like WARN_ON:
  if (WARN_ON_RATELIMIT(condition, state))
          do_something();
One would expect only 'condition' to affect the 'if', but internally
WARN_ON_RATELIMIT only does:
  WARN_ON((condition) && __ratelimit(state))
So the 'if' is affected by the ratelimiting state too. Fix this by
returning 'condition' in any case.
Note that nobody uses WARN_ON_RATELIMIT yet, so there is nothing to
worry about. But I was about to use it and was a bit surprised.
Link: http://lkml.kernel.org/r/20161215093224.23126-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
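The difference between the old and the fixed return value can be modelled in userspace. Everything in this sketch is a hypothetical stand-in: struct rl_state and rl_allow() imitate a rate limiter that permits a fixed number of hits, and warn_on() merely counts warnings instead of printing a backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rate limiter: lets the first `burst` hits through. */
struct rl_state {
	int burst;
	int hits;
};

static bool rl_allow(struct rl_state *rs)
{
	assert(rs != NULL);
	return rs->hits++ < rs->burst;
}

/* Stand-in for WARN_ON(): counts warnings and returns the condition. */
static int warnings_emitted;

static bool warn_on(bool cond)
{
	if (cond)
		warnings_emitted++;
	return cond;
}

/* Old behaviour: the result depends on the rate limiter state. */
static bool warn_on_ratelimit_old(bool cond, struct rl_state *rs)
{
	return warn_on(cond && rl_allow(rs));
}

/* Fixed behaviour: only the warning is rate limited, not the result. */
static bool warn_on_ratelimit_new(bool cond, struct rl_state *rs)
{
	if (cond && rl_allow(rs))
		warn_on(true);
	return cond;
}
```

With the old form, a caller's error handling silently stops running once the rate limit kicks in; with the fixed form it runs whenever the condition holds.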
Alexander Popov [Tue, 20 Dec 2016 00:23:09 +0000 (16:23 -0800)]
kcov: make kcov work properly with KASLR enabled
Subtract KASLR offset from the kernel addresses reported by kcov.
Tested on x86_64 and AArch64 (Hikey LeMaker).
Link: http://lkml.kernel.org/r/1481417456-28826-3-git-send-email-alex.popov@linux.com
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Jon Masters <jcm@redhat.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Nicolai Stange <nicstange@gmail.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Popov [Tue, 20 Dec 2016 00:23:06 +0000 (16:23 -0800)]
arm64: setup: introduce kaslr_offset()
Introduce kaslr_offset() similar to x86_64 to fix kcov.
[ Updated by Will Deacon ]
Link: http://lkml.kernel.org/r/1481417456-28826-2-git-send-email-alex.popov@linux.com
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Jon Masters <jcm@redhat.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Nicolai Stange <nicstange@gmail.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Tue, 20 Dec 2016 00:23:03 +0000 (16:23 -0800)]
mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED
When FADV_DONTNEED cannot drop all pages in the range, it observes that
some pages might still be on per-cpu LRU caches after recent
instantiation and so initiates remote calls to all CPUs to flush their
local caches. However, in most cases, the fadvise happens from the same
context that instantiated the pages, and any pre-LRU pages in the
specified range are most likely sitting on the local CPU's LRU cache,
and so in many cases this results in unnecessary remote calls, which, in
a loaded system, can hold up the fadvise() call significantly.
[ I didn't record it in the extreme case we observed at Facebook,
unfortunately. We had a slow-to-respond system and noticed
lru_add_drain_all() leading the profile during fadvise calls. This
patch came out of thinking about the code and how we commonly call
FADV_DONTNEED.
FWIW, I wrote a silly directory tree walker/searcher that recurses
through /usr to read and FADV_DONTNEED each file it finds. On a 2
socket 40 ht machine, over 1% is spent in lru_add_drain_all(). With
the patch, that cost is gone; the local drain cost shows at 0.09%. ]
Try to avoid the remote call by flushing the local LRU cache before even
attempting to invalidate anything. It's a cheap operation, and the
local LRU cache is the most likely to hold any pre-LRU pages in the
specified fadvise range.
Link: http://lkml.kernel.org/r/20161214210017.GA1465@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
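For context, the usage pattern the commit above optimizes for is a task that reads a file and then immediately tells the kernel it no longer needs the cached pages. A minimal userspace sketch of that pattern is shown here; read_and_drop() is a hypothetical helper and is not the tree walker mentioned in the commit message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a file to the end, then advise the kernel that its cached pages
 * are not needed. Returns 0 on success and -1 on any failure.
 */
static int read_and_drop(const char *path)
{
	char buf[4096];
	ssize_t n;
	int fd, err;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Reading instantiates page cache pages from this context. */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;

	/* Offset 0 with length 0 covers the whole file. The call returns
	 * an error number directly rather than setting errno. */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);

	return (n < 0 || err != 0) ? -1 : 0;
}
```

Because the read and the advice come from the same context, any pages that have not yet reached the LRU are most likely on the local CPU's cache, which is the case the local drain targets.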
Andreas Steffen [Tue, 20 Dec 2016 00:23:00 +0000 (16:23 -0800)]
ima: platform-independent hash value
For remote attestation it is important for the ima measurement values to
be platform-independent. Therefore integer fields to be hashed must be
converted to canonical format.
Link: http://lkml.kernel.org/r/1480554346-29071-11-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:57 +0000 (16:22 -0800)]
ima: define a canonical binary_runtime_measurements list format
The IMA binary_runtime_measurements list is currently in platform native
format.
To allow restoring a measurement list carried across kexec with a
different endianness than the targeted kernel, this patch defines
little-endian as the canonical format. For big endian systems wanting
to save/restore the measurement list from a system with a different
endianness, a new boot command line parameter named "ima_canonical_fmt"
is defined.
Considerations: use of the "ima_canonical_fmt" boot command line option
will break existing userspace applications on big endian systems
expecting the binary_runtime_measurements list to be in platform native
format.
Link: http://lkml.kernel.org/r/1480554346-29071-10-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
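The idea behind a canonical little-endian format can be shown with explicit byte-wise serialization, which gives the same bytes on any host. This is a generic sketch rather than the IMA code; put_le32() and get_le32() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian order, whatever the host is. */
static void put_le32(uint8_t *out, uint32_t v)
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value, whatever the host is. */
static uint32_t get_le32(const uint8_t *in)
{
	assert(in != NULL);
	return (uint32_t)in[0] |
	       ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) |
	       ((uint32_t)in[3] << 24);
}
```

A digest computed over bytes produced this way does not depend on the endianness of the machine that recorded the measurement, so a list saved on one system can be verified on another.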
Mimi Zohar [Tue, 20 Dec 2016 00:22:54 +0000 (16:22 -0800)]
ima: support restoring multiple template formats
The configured IMA measurement list template format can be replaced at
runtime on the boot command line, including a custom template format.
This patch adds support for restoring a measurement list containing
multiple builtin/custom template formats.
Link: http://lkml.kernel.org/r/1480554346-29071-9-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:51 +0000 (16:22 -0800)]
ima: store the builtin/custom template definitions in a list
The builtin and single custom templates are currently stored in an
array. In preparation for being able to restore a measurement list
containing multiple builtin/custom templates, this patch stores the
builtin and custom templates as a linked list. This will permit
defining more than one custom template per boot.
Link: http://lkml.kernel.org/r/1480554346-29071-8-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:48 +0000 (16:22 -0800)]
ima: on soft reboot, save the measurement list
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.
This patch uses the kexec buffer passing mechanism to pass the
serialized IMA binary_runtime_measurements to the next kernel.
Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thiago Jung Bauermann [Tue, 20 Dec 2016 00:22:45 +0000 (16:22 -0800)]
powerpc: ima: send the kexec buffer to the next kernel
The IMA kexec buffer allows the currently running kernel to pass the
measurement list via a kexec segment to the kernel that will be kexec'd.
This is the architecture-specific part of setting up the IMA kexec
buffer for the next kernel. It will be used in the next patch.
Link: http://lkml.kernel.org/r/1480554346-29071-6-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:42 +0000 (16:22 -0800)]
ima: maintain memory size needed for serializing the measurement list
In preparation for serializing the binary_runtime_measurements, this
patch maintains the amount of memory required.
Link: http://lkml.kernel.org/r/1480554346-29071-5-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:38 +0000 (16:22 -0800)]
ima: permit duplicate measurement list entries
Measurements carried across kexec need to be added to the IMA
measurement list, but should not prevent measurements of the newly
booted kernel from being added to the measurement list. This patch adds
support for allowing duplicate measurements.
The "boot_aggregate" measurement entry is the delimiter between soft
boots.
Link: http://lkml.kernel.org/r/1480554346-29071-4-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mimi Zohar [Tue, 20 Dec 2016 00:22:35 +0000 (16:22 -0800)]
ima: on soft reboot, restore the measurement list
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot. This
patch restores the measurement list.
Link: http://lkml.kernel.org/r/1480554346-29071-3-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thiago Jung Bauermann [Tue, 20 Dec 2016 00:22:32 +0000 (16:22 -0800)]
powerpc: ima: get the kexec buffer passed by the previous kernel
Patch series "ima: carry the measurement list across kexec", v8.
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and then restored on the
subsequent boot, possibly of a different architecture.
The existing securityfs binary_runtime_measurements file conveniently
provides a serialized format of the IMA measurement list. This patch
set serializes the measurement list in this format and restores it.
Up to now, the binary_runtime_measurements was defined as architecture
native format, on the assumption that userspace could and would
handle any architecture conversions. With the ability of carrying the
measurement list across kexec, possibly from one architecture to a
different one, the per boot architecture information is lost and with it
the ability of recalculating the template digest hash. To resolve this
problem, without breaking the existing ABI, this patch set introduces
the boot command line option "ima_canonical_fmt", which is arbitrarily
defined as little endian.
The need for this boot command line option will be limited to the
existing version 1 format of the binary_runtime_measurements.
Subsequent formats will be defined as canonical format (eg. TPM 2.0
support for larger digests).
A simplified method of Thiago Bauermann's "kexec buffer handover" patch
series for carrying the IMA measurement list across kexec is included in
this patch set. The simplified method requires all file measurements be
taken prior to executing the kexec load, as subsequent measurements will
not be carried across the kexec and restored.
This patch (of 10):
The IMA kexec buffer allows the currently running kernel to pass the
measurement list via a kexec segment to the kernel that will be kexec'd.
The second kernel can check whether the previous kernel sent the buffer
and retrieve it.
This is the architecture-specific part which enables IMA to receive the
measurement list passed by the previous kernel. It will be used in the
next patch.
The change in machine_kexec_64.c is to factor out the logic of removing
an FDT memory reservation so that it can be used by remove_ima_buffer.
Link: http://lkml.kernel.org/r/1480554346-29071-2-git-send-email-zohar@linux.vnet.ibm.com
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andreas Steffen <andreas.steffen@strongswan.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zheng li [Mon, 12 Dec 2016 01:56:05 +0000 (09:56 +0800)]
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output
The __ip_append_data and ip_finish_output functions use inconsistent
conditions to decide on IP fragmentation. The variable length in
__ip_append_data only includes the application's payload and the UDP
header, not the IP header, whereas ip_finish_output tests
(skb->len > ip_skb_dst_mtu(skb)), and skb->len includes the IP header.
As a result, UDP payloads whose length falls between (MTU - IP Header)
and MTU are fragmented by ip_fragment even though the rst->dev supports
the UFO feature.
Add the length of the IP header to length in __ip_append_data so that its
condition is consistent with the one ip_finish_output uses for IP
fragmentation.
Signed-off-by: Zheng Li <james.z.li@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
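The mismatch can be checked with plain arithmetic. The helpers below are hypothetical and only model the three comparisons described in the commit message; they assume a 20-byte IP header without options and a 1500-byte MTU for the worked example.

```c
#include <assert.h>
#include <stdbool.h>

/* Append-path check before the fix: the IP header is not counted. */
static bool exceeds_mtu_old(unsigned int length, unsigned int mtu)
{
	return length > mtu;
}

/* Append-path check after the fix: the IP header length is added. */
static bool exceeds_mtu_new(unsigned int length, unsigned int ip_hdr_len,
			    unsigned int mtu)
{
	return length + ip_hdr_len > mtu;
}

/* Output-path check: skb_len already includes the IP header. */
static bool output_path_fragments(unsigned int skb_len, unsigned int mtu)
{
	return skb_len > mtu;
}
```

For a UDP header plus payload of 1490 bytes, the old check sees 1490 <= 1500 and does not treat the packet as oversized, while the output path sees 1510 > 1500 and fragments it. With the header added, both sides agree.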
Adrian Hunter [Mon, 19 Dec 2016 13:57:34 +0000 (15:57 +0200)]
mmc: core: Further fix thread wake-up
Commit e0097cf5f2f1 ("mmc: queue: Fix queue thread wake-up") did not go far
enough. mmc_wait_for_data_req_done() still contains some problems and can
be further simplified. First it should not touch
context_info->is_waiting_last_req because that is a wake-up control used by
the owner of the context. Secondly, it should always return when one of its
wake-up conditions is met because, again, that is controlled by the owner of
the context.
While the current block driver does not have an issue, these problems were
exposed during testing of the Software Command Queue patches.
Fixes: e0097cf5f2f1 ("mmc: queue: Fix queue thread wake-up")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Harjani Ritesh <riteshh@codeaurora.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Mon, 19 Dec 2016 13:33:11 +0000 (15:33 +0200)]
mmc: sdhci: Fix to handle MMC_POWER_UNDEFINED
Since commit c2c24819b280 ("mmc: core: Don't power off the card when
starting the host"), the power state can still be MMC_POWER_UNDEFINED after
mmc_start_host() is called. That can trigger a warning in SDHCI during
runtime resume as it tries to restore the I/O state. Handle
MMC_POWER_UNDEFINED simply by not updating the I/O state in that case.
Fixes: c2c24819b280 ("mmc: core: Don't power off the card when starting the host")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Masahiro Yamada [Wed, 14 Dec 2016 02:10:46 +0000 (11:10 +0900)]
mmc: sdhci-cadence: add Socionext UniPhier specific compatible string
Add a Socionext SoC specific compatible (suggested by Rob Herring).
No SoC specific data are associated with the compatible strings for
now, but other SoC vendors may use this IP and want to differentiate
IP variants in the future.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Trond Myklebust [Mon, 19 Dec 2016 15:23:10 +0000 (10:23 -0500)]
NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES
If our DELEGRETURN RPC call is rejected with an EACCES error, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Mon, 19 Dec 2016 15:34:14 +0000 (10:34 -0500)]
NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES
If our CLOSE RPC call is rejected with an EACCES error, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
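The retry behaviour described in the two EACCES commits above can be sketched
as follows. This is a toy model rather than the NFS client code: the operation
list, the send callback and the strict_server helper are all hypothetical
names introduced for the example.

```python
import errno


def send_with_getattr_fallback(ops, send):
    """Send a compound; on EACCES, drop the embedded GETATTR and retry once.

    `ops` is an illustrative list of operation names and `send` is a
    hypothetical callback returning 0 or a negative errno value.
    """
    status = send(ops)
    if status == -errno.EACCES and "GETATTR" in ops:
        ops = [op for op in ops if op != "GETATTR"]
        status = send(ops)
    return status, ops


def strict_server(ops):
    """Hypothetical server that denies attribute reads to this credential."""
    return -errno.EACCES if "GETATTR" in ops else 0
```

With strict_server, a compound that carries a GETATTR fails the first time and
succeeds on the retry without it. A compound that has no GETATTR is not
retried, so an unrelated EACCES is still reported to the caller.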
Trond Myklebust [Mon, 19 Dec 2016 17:14:44 +0000 (12:14 -0500)]
NFSv4: Place the GETATTR operation before the CLOSE
In order to benefit from the DENY share lock protection, we should
put the GETATTR operation before the CLOSE. Otherwise, we might race
with a Windows machine that thinks it is now safe to modify the file.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Mon, 19 Dec 2016 16:36:41 +0000 (11:36 -0500)]
NFSv4: Also ask for attributes when downgrading to a READ-only state
If we're downgrading from a READ+WRITE mode to a READ-only mode, then
ask for cache consistency attributes so that we avoid the revalidation
in nfs_close_context()
Fixes: 3947b74d0f9d ("NFSv4: Don't request a GETATTR on open_downgrade.")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Mon, 19 Dec 2016 14:47:32 +0000 (09:47 -0500)]
NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()
The NFS_INO_REVAL_FORCED flag now really only has meaning for the
case when we've just been handed a delegation for a file that was already
cached, and we're unsure about that cache.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Mon, 21 Nov 2016 15:56:38 +0000 (10:56 -0500)]
pNFS: Return RW layouts on OPEN_DOWNGRADE
If the client holds no more writeable open state, and does not hold a
write delegation, then send a layoutreturn as part of the OPEN_DOWNGRADE.
We do this only for writes, since some layout drivers may require you to
also hold a read layout if you are doing a R/W workload.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
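The condition in the pNFS commit above can be written as a small predicate.
This is a sketch under assumptions: the function name, its parameters and the
"RW"/"READ" labels are hypothetical and do not correspond to kernel
identifiers.

```python
def layouts_to_return(has_writeable_open_state, holds_write_delegation,
                      held_layouts):
    """Return the set of layout types to hand back on OPEN_DOWNGRADE.

    Only the RW layout is returned, and only when the client can no longer
    write; read layouts are kept because some layout drivers may need them.
    """
    if has_writeable_open_state or holds_write_delegation:
        return set()
    return {layout for layout in held_layouts if layout == "RW"}
```

For example, a client holding both "RW" and "READ" layouts with no writeable
open state and no write delegation would return only the "RW" layout.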
Trond Myklebust [Sun, 20 Nov 2016 18:34:16 +0000 (13:34 -0500)]
NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE
While we do not need to return the RW layout when downgrading from a
read/write open state to read-only, we might want to do so in order
to reduce the burden on the metadataserver so that it does not need
to check for changed data when responding to GETATTR requests.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
NeilBrown [Mon, 19 Dec 2016 00:48:23 +0000 (11:48 +1100)]
NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID
When an NFS4ERR_BAD_SEQID is received the open-owner is removed from
the ->state_owners rbtree so that it will no longer be used.
If any stateids attached to this open-owner are still in use, and if a
request using one gets an NFS4ERR_BAD_STATEID reply, this can go bad.
The state is marked as needing recovery and the nfs4_state_manager()
is scheduled to clean up. nfs4_state_manager() finds states to be
recovered by walking the state_owners rbtree. As the open-owner is
not in the rbtree, the bad state is not found so nfs4_state_manager()
completes having done nothing. The request is then retried, with a
predictable result (indefinite retries).
If the stateid is for a delegation, this open_owner will be used
to open files when the delegation is returned. For that to work,
a new open-owner needs to be presented to the server.
This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner
in the rbtree but updates the 'create_time' so it looks like a new
open-owner. With this the indefinite retries no longer happen.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
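The failure mode in the NFS4ERR_BAD_SEQID commit above can be modelled in a
few lines. This is a toy sketch, not the NFS client: a dict stands in for the
->state_owners rbtree, and the class, function and field names other than
create_time are hypothetical.

```python
class OpenOwner:
    """Illustrative stand-in for an open-owner with states to recover."""

    def __init__(self, name, create_time):
        self.name = name
        self.create_time = create_time
        self.states_needing_recovery = []


def handle_bad_seqid_old(owners, owner):
    """Old behaviour: drop the open-owner from the lookup structure."""
    owners.pop(owner.name, None)


def handle_bad_seqid_new(owners, owner, now):
    """New behaviour: keep it, but refresh create_time so it looks new."""
    owner.create_time = now


def state_manager(owners):
    """Recover every flagged state reachable by walking the owners."""
    recovered = []
    for owner in owners.values():
        recovered.extend(owner.states_needing_recovery)
        owner.states_needing_recovery = []
    return recovered
```

In this model, a state flagged for recovery on a removed owner is never seen
by state_manager(), which mirrors the "completes having done nothing" case.
Keeping the owner reachable lets the walk find and recover the state.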
NeilBrown [Mon, 19 Dec 2016 00:33:13 +0000 (11:33 +1100)]
NFSv4: ensure __nfs4_find_lock_state returns consistent result.
If a file has both flock locks and OFD locks, then it is possible that
two different nfs4 lock states could apply to file accesses from a
single process.
It is not possible to know, efficiently, which one is "correct".
Presumably the state which represents a lock that covers the region
undergoing IO would be the "correct" one to use, but finding that has
a non-trivial cost and would provide minuscule value.
Currently we just return whichever is first in the list, which could
result in inconsistent behaviour if an application ever put itself in
this position. As consistent behaviour is preferable (when perfectly
correct behaviour is not available), change the search to return a
consistent result in this circumstance.
Specifically: if there is both a flock and OFD lock state, always return
the flock one.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
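The difference between "first in the list" and a consistent choice, as
described in the __nfs4_find_lock_state commit above, can be sketched like
this. It is a simplified model: lock states are plain dicts, and the function
and parameter names are assumptions, not the kernel's.

```python
def find_first(lock_states, flock_owner, ofd_owner):
    """Old style: return whichever matching state comes first in the list."""
    for state in lock_states:
        if state["owner"] in (flock_owner, ofd_owner):
            return state
    return None


def find_consistent(lock_states, flock_owner, ofd_owner):
    """New style: always prefer the flock state, fall back to the OFD one."""
    fallback = None
    for state in lock_states:
        if state["owner"] == flock_owner:
            return state
        if state["owner"] == ofd_owner:
            fallback = state
    return fallback
```

With one flock state and one OFD state, find_first() depends on list order,
while find_consistent() returns the flock state for either ordering.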
NeilBrown [Mon, 19 Dec 2016 00:19:31 +0000 (11:19 +1100)]
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
Various places assume that if nfs4_fl_prepare_ds() returns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.
This is not necessarily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.
So add a test for ds_clp == NULL and return NULL in that case.
Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Olga Kornievskaia <aglo@umich.edu>
Acked-by: Adamson, Andy <William.Adamson@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Weston Andros Adamson [Wed, 14 Dec 2016 21:31:55 +0000 (16:31 -0500)]
pNFS/flexfiles: delete deviceid, don't mark inactive
Instead of marking a device inactive, remove it from the cache entirely.
Flexfiles has a way to report errors back to the server, so we don't want
to stop devices from being tried again for 120 seconds.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 23:51:15 +0000 (18:51 -0500)]
NFS: Clean up nfs_attribute_timeout()
It can be made static.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 23:49:38 +0000 (18:49 -0500)]
NFS: Remove unused function nfs_revalidate_inode_rcu()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 23:40:03 +0000 (18:40 -0500)]
NFS: Fix and clean up the access cache validity checking
The access cache needs to check whether or not the mode bits, ownership,
or ACL has changed or the cache has timed out.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 23:04:47 +0000 (18:04 -0500)]
NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
Just like in nfs_check_verifier(), we want to use
nfs_mapping_need_revalidate_inode() to check that our knowledge of the
change attribute is up to date.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Thu, 8 Dec 2016 23:18:38 +0000 (18:18 -0500)]
NFS: Clean up cache validity checking
Consolidate the open-coded checking of NFS_I(inode)->cache_validity
into a couple of helper functions.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 22:39:58 +0000 (17:39 -0500)]
NFS: Don't revalidate the file on close if we hold a delegation
If we're holding a delegation, we can skip sending the close-to-open
GETATTR until we're returning that delegation.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Sat, 17 Dec 2016 00:48:09 +0000 (19:48 -0500)]
NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN
DELEGRETURN will always carry a reference to the inode except when
the latter is being freed, so let's ensure that we always use that
inode information to ensure close-to-open cache consistency, even
when the DELEGRETURN call is asynchronous.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond Myklebust [Fri, 16 Dec 2016 21:55:55 +0000 (16:55 -0500)]
NFSv4: Update the attribute cache info in update_changeattr
If we successfully updated the change attribute, we should timestamp the
cache. While we do know that the other attributes are not completely up
to date, we have the NFS_INO_INVALID_ATTR flag that lets us know that,
so it is valid to say that the cache has not timed out.
We can also clear NFS_INO_REVAL_PAGECACHE, since our change attribute
is now known to be valid.
Conversely, if the change attribute did not match, we should make sure to
also revalidate the access and ACL caches.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Linus Torvalds [Mon, 19 Dec 2016 16:23:53 +0000 (08:23 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs
Pull quota, fsnotify and ext2 updates from Jan Kara:
"Changes to locking of some quota operations from dedicated quota mutex
to s_umount semaphore, a fsnotify fix and a simple ext2 fix"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Fix bogus warning in dquot_disable()
fsnotify: Fix possible use-after-free in inode iteration on umount
ext2: reject inodes with negative size
quota: Remove dqonoff_mutex
ocfs2: Use s_umount for quota recovery protection
quota: Remove dqonoff_mutex from dquot_scan_active()
ocfs2: Protect periodic quota syncing with s_umount semaphore
quota: Use s_umount protection for quota operations
quota: Hold s_umount in exclusive mode when enabling / disabling quotas
fs: Provide function to get superblock with exclusive s_umount
Linus Torvalds [Mon, 19 Dec 2016 16:21:29 +0000 (08:21 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Early fixes for x86.
Instead of the (botched) revert, the lockdep/might_sleep splat has a
real fix provided by Andrea"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
kvm: take srcu lock around kvm_steal_time_set_preempted()
kvm: fix schedule in atomic in kvm_steal_time_set_preempted()
KVM: hyperv: fix locking of struct kvm_hv fields
KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.
kvm: nVMX: Correct a VMX instruction error code for VMPTRLD
Stefan Haberland [Mon, 19 Dec 2016 16:15:50 +0000 (17:15 +0100)]
block: check partition alignment
Partitions that are not aligned to the blocksize of a device may cause
invalid I/O requests because the block layer cares only about alignment
within the partition when building requests on partitions.
device
|--------4096--------|--------4096--------|--------4096--------|
partition offset 512 bytes
|-512-|--------4096--------|--------4096--------|--------4096--------|
When reading/writing one 4k block of the partition, this maps to
reading/writing at an offset of 512 bytes on the device, leading to
unaligned requests for the device, which in turn may cause unexpected
behavior of the device driver.
For DASD devices we have to translate the block number into a cylinder,
head, record format. The unaligned requests lead to wrong calculation
and therefore to misdirected I/O. In a "good" case this leads to I/O
errors because the underlying hardware detects the wrong addressing.
In a worst case scenario this might destroy data on the device.
To prevent partitions that are not aligned to the physical blocksize
of a device, check for the alignment in the blkpg_ioctl.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
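The arithmetic behind the partition alignment commit above can be sketched as
below. This only models the example in the message (4096-byte device blocks,
partition starting 512 bytes in); the function names are hypothetical and the
sketch does not reproduce the actual check in blkpg_ioctl.

```python
def device_offset(partition_start_bytes, block_index, block_size):
    """Device byte offset of a partition-relative block."""
    return partition_start_bytes + block_index * block_size


def start_is_aligned(partition_start_bytes, block_size):
    """True if the partition starts on a device block boundary."""
    return partition_start_bytes % block_size == 0
```

For a partition starting at byte 512 on a device with 4096-byte blocks,
device_offset(512, 0, 4096) is 512, which is not a multiple of 4096, so every
block of the partition straddles two device blocks. A start of 8192 bytes
passes start_is_aligned() and keeps partition blocks on device block
boundaries.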
Linus Torvalds [Mon, 19 Dec 2016 16:18:58 +0000 (08:18 -0800)]
Merge branch 'dmi-for-linus' of git://git./linux/kernel/git/jdelvare/staging
Pull dmi fix from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Always show system identification string
Linus Torvalds [Mon, 19 Dec 2016 16:16:26 +0000 (08:16 -0800)]
Merge tag 'mfd-for-linus-4.10' of git://git./linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Device Support
- Add support for Ricoh RC5T619 PMIC to rn5t618
- Add support for PM8821 PMIC to qcom-pm8xxx
New Functionality:
- Add support for GPIO to lpc_ich
- Add support for GPADC to sun4i
- Add ability for rk808 to shutdown
Fix-ups:
- Simplify/strip unnecessary code; tps65218, palmas, tps65217
- Device Tree binding updates; tps65218, altera-a10sr
- Provide/export device ID info; tps65218, axp20x-i2c, hi655x-pmic,
fsl-imx25-tsadc, intel_soc_pmic_bxtwc
- Use MFD API instead of of_platform_populate(); tps65218
- Generalise name-space; pm8xxx
- Supply/edit regmap configuration; axp20x, cs47l24-tables, axp20x
- Enable compile testing; max77620, max77686, exynos-lpass,
abx500-core
- Coding style issues; wm8994-core, wm5102-tables
- Supply endian support; syscon
- Remove module support; ab3100-core, ab8500-debugfs, ab8500-gpadc,
abx500-core
Bug Fixes:
- Fix ordering issues; wm8994
- Fix dependencies (build-time/run-time); exynos_lpass, sun4i-gpadc
- Fix compiler warnings; sun4i-gpadc
- Fix leaks; mfd-core
- Fix page fault during module unload; tps65217"
* tag 'mfd-for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (49 commits)
mfd: tps65217: Support an interrupt pin as the system wakeup
mfd: tps65217: Make an interrupt handler simpler
mfd: tps65217: Update register interrupt mask bits instead of writing operation
mfd: tps65217: Specify the IRQ name
mfd: tps65217: Fix page fault on unloading modules
mfd: palmas: Remove redundant check in palmas_power_off
mfd: arizona: Disable IRQs during driver remove
mfd: pm8xxx: add support to pm8821
mfd: intel-lpss: Try to enable Memory-Write-Invalidate
mfd: rn5t618: Add Ricoh RC5T619 PMIC support
mfd: axp20x: Add address extension registers for AXP806 regmap
mfd: intel_soc_pmic_bxtwc: Fix a typo in MODULE_DEVICE_TABLE()
mfd: core: Fix device reference leak in mfd_clone_cell
mfd: bcm590xx: Simplify a test
mfd: sun4i-gpadc: Select regmap-irq
mfd: abx500-core: drop unused MODULE_ tags from non-modular code
mfd: ab8500: make sysctrl explicitly non-modular
mfd: ab8500-gpadc: Make it explicitly non-modular
mfd: ab8500-debugfs: Make it explicitly non-modular
mfd: ab8500-core: Make it explicitly non-modular
...