Mark Rutland [Wed, 6 Jan 2016 11:05:27 +0000 (11:05 +0000)]
arm64: head.S: use memset to clear BSS
Currently we use an open-coded memzero to clear the BSS. As it is a
trivial implementation, it is sub-optimal.
Our optimised memset doesn't use the stack, is position-independent, and
for the memzero case can make use of DC ZVA to clear large blocks
efficiently. In __mmap_switched the MMU is on and there are no live
caller-saved registers, so we can safely call an uninstrumented memset.
This patch changes __mmap_switched to use memset when clearing the BSS.
We use the __pi_memset alias so as to avoid any instrumentation in all
kernel configurations.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Wed, 23 Dec 2015 09:29:28 +0000 (10:29 +0100)]
efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
This moves the DISABLE_BRANCH_PROFILING define from the x86 specific
to the general CFLAGS definition for the stub. This fixes build errors
when building for arm64 with CONFIG_PROFILE_ALL_BRANCHES_ENABLED.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Tue, 5 Jan 2016 17:33:34 +0000 (17:33 +0000)]
arm64: entry: remove pointless SPSR mode check
In work_pending, we may skip work if the stacked SPSR value represents
anything other than an EL0 context. We then immediately invoke the
kernel_exit 0 macro as part of ret_to_user, assuming a return to EL0.
This is somewhat confusing.
We use work_pending as part of the ret_to_user/ret_fast_syscall state
machine. We only use ret_fast_syscall in the return from an SVC issued
from EL0. We use ret_to_user for return from EL0 exception handlers and
also for return from ret_from_fork in the case the task was not a kernel
thread (i.e. it is a user task).
Thus in all cases the stacked SPSR value must represent an EL0 context,
and the check is redundant. This patch removes it, along with the now
unused no_work_pending label.
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 5 Jan 2016 15:36:59 +0000 (15:36 +0000)]
arm64: mm: move pgd_cache initialisation to pgtable_cache_init
Initialising the support for EFI runtime services requires us to
allocate a pgd off the back of an early_initcall. On systems where the
PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the
pgd_cache isn't initialised at this stage, and we panic with a NULL
dereference during boot:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
__create_mapping.isra.5+0x84/0x350
create_pgd_mapping+0x20/0x28
efi_create_mapping+0x5c/0x6c
arm_enable_runtime_services+0x154/0x1e4
do_one_initcall+0x8c/0x190
kernel_init_freeable+0x84/0x1ec
kernel_init+0x10/0xe0
ret_from_fork+0x10/0x50
This patch fixes the problem by initialising the pgd_cache earlier, in
the pgtable_cache_init callback, which sounds suspiciously like what it
was intended for.
Reported-by: Dennis Chen <dennis.chen@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Tue, 5 Jan 2016 09:18:52 +0000 (10:18 +0100)]
arm64: module: avoid undefined shift behavior in reloc_data()
Compilers may engage the improbability drive when encountering shifts
by a distance that is a multiple of the size of the operand type. Since
the required bounds check is very simple here, we can get rid of all the
fuzzy masking, shifting and comparing, and use the documented bounds
directly.
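As a rough illustration of checking against documented bounds directly, the sketch below tests a relocation value without any shifting. It is not the kernel's reloc_data(): the helper name reloc_fits is hypothetical, and the ranges assume that a 16-bit or 32-bit data relocation accepts both the signed and the unsigned interpretation of the value, i.e. -2^(N-1) <= X < 2^N.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): report whether sval fits
 * a data relocation that is `len` bits wide. The limits are compared
 * directly, so no shift by the operand width is ever evaluated.
 */
static bool reloc_fits(int64_t sval, int len)
{
	switch (len) {
	case 16:
		return sval >= INT16_MIN && sval <= UINT16_MAX;
	case 32:
		return sval >= INT32_MIN && sval <= UINT32_MAX;
	case 64:
		return true;	/* every 64-bit value fits */
	default:
		return false;	/* width not handled by this sketch */
	}
}
```

Because the 64-bit case needs no comparison at all, the construct that would have shifted a 64-bit value by 64 never appears.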
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Tue, 5 Jan 2016 09:18:51 +0000 (10:18 +0100)]
arm64: module: fix relocation of movz instruction with negative immediate
The test whether a movz instruction with a signed immediate should be
turned into a movn instruction (i.e., when the immediate is negative)
is flawed, since the value of imm is always positive. Also, the
subsequent bounds check is incorrect since the limit update never
executes, due to the fact that the imm_type comparison will always be
false for negative signed immediates.
Let's fix this by performing the sign test on sval directly, and
replacing the bounds check with a simple comparison against U16_MAX.
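A minimal sketch of the corrected logic, under simplifying assumptions: only the lowest 16-bit group is considered (no shift), and the names mov_enc and encode_signed_mov are hypothetical rather than taken from the kernel. The sign test is made on the signed value itself, a negative value is inverted for MOVN, and the result is then compared against the 16-bit maximum.

```c
#include <stdbool.h>
#include <stdint.h>

enum mov_op { MOV_N, MOV_Z };

/* Hypothetical result type for this sketch. */
struct mov_enc {
	enum mov_op op;
	uint16_t imm16;
};

/*
 * Choose MOVZ or MOVN for a signed immediate. MOVN materialises the
 * bitwise inverse of its immediate, so a negative value is encoded as
 * MOVN with ~sval. Returns false if the value does not fit in 16 bits.
 */
static bool encode_signed_mov(int64_t sval, struct mov_enc *out)
{
	uint64_t imm = (uint64_t)sval;
	enum mov_op op = MOV_Z;

	if (sval < 0) {		/* sign test on the signed value itself */
		op = MOV_N;
		imm = ~imm;
	}
	if (imm > UINT16_MAX)	/* single bounds check for both cases */
		return false;

	out->op = op;
	out->imm16 = (uint16_t)imm;
	return true;
}
```

With this shape, -1 becomes MOVN with an immediate of 0, and -65536 becomes MOVN with an immediate of 0xffff.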
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: tidied up use of sval, renamed MOVK enum value to MOVKZ]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Mon, 21 Dec 2015 16:44:27 +0000 (16:44 +0000)]
arm64: traps: address fallout from printk -> pr_* conversion
Commit ac7b406c1a9d ("arm64: Use pr_* instead of printk") was a fairly
mindless s/printk/pr_*/ change driven by a complaint from checkpatch.
As is usual with such changes, this has led to some odd behaviour on
arm64:
* syslog now picks up the "pr_emerg" line from dump_backtrace, but not
the actual trace, which leads to a bunch of "kernel:Call trace:"
lines in the log
* __{pte,pmd,pgd}_error print at KERN_CRIT, as opposed to KERN_ERR
which is used by other architectures.
This patch restores the original printk behaviour for dump_backtrace
and downgrades the pgtable error macros to KERN_ERR.
Signed-off-by: Will Deacon <will.deacon@arm.com>
AKASHI Takahiro [Tue, 15 Dec 2015 08:33:41 +0000 (17:33 +0900)]
arm64: ftrace: fix a stack tracer's output under function graph tracer
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function return. This will result in many useless entries
(return_to_handler) showing up in
a) a stack tracer's output
b) perf call graph (with perf record -g)
c) dump_backtrace (at panic et al.)
For example, in case of a),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo 1 > /proc/sys/kernel/stack_trace_enabled
$ cat /sys/kernel/debug/tracing/stack_trace
      Depth    Size   Location    (54 entries)
      -----    ----   --------
  0)   4504      16   gic_raise_softirq+0x28/0x150
  1)   4488      80   smp_cross_call+0x38/0xb8
  2)   4408      48   return_to_handler+0x0/0x40
  3)   4360      32   return_to_handler+0x0/0x40
...
In case of b),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ perf record -e mem:XXX:x -ag -- sleep 10
$ perf report
...
| | |--0.22%-- 0x550f8
| | | 0x10888
| | | el0_svc_naked
| | | sys_openat
| | | return_to_handler
| | | return_to_handler
...
In case of c),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo c > /proc/sysrq-trigger
...
Call trace:
[<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
[<ffffffc000092250>] return_to_handler+0x0/0x40
[<ffffffc000092250>] return_to_handler+0x0/0x40
...
This patch replaces such entries with real addresses preserved in
current->ret_stack[] at unwind_frame(). This way, we can cover all
the cases.
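The substitution can be sketched roughly as follows. This is a simplified user-space model, not the kernel's unwind_frame(): the struct layouts, the RETURN_TO_HANDLER stand-in address and the helper name fixup_pc are all assumptions made for illustration. Each time the unwinder sees the hook address, it takes the next saved return address from the per-task array instead.

```c
#include <stddef.h>

/* Stand-in for the address of return_to_handler (assumed value). */
#define RETURN_TO_HANDLER 0xdeadUL

/* Simplified models of the per-task return stack and an unwind frame. */
struct ret_entry {
	unsigned long ret;	/* original return address */
};

struct task {
	struct ret_entry *ret_stack;
};

struct frame {
	unsigned long pc;
	int graph;		/* index of the next ret_stack entry to use */
};

/* Replace a hooked return address with the preserved original one. */
static void fixup_pc(const struct task *tsk, struct frame *f)
{
	if (tsk && tsk->ret_stack && f->graph >= 0 &&
	    f->pc == RETURN_TO_HANDLER)
		f->pc = tsk->ret_stack[f->graph--].ret;
}
```

The index counts down because the most recently hooked function is the first one met when walking back from the innermost frame.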
Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[will: fixed minor context changes conflicting with irq stack bits]
Signed-off-by: Will Deacon <will.deacon@arm.com>
AKASHI Takahiro [Tue, 15 Dec 2015 08:33:40 +0000 (17:33 +0900)]
arm64: pass a task parameter to unwind_frame()
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function's return. This will result in many useless entries
(return_to_handler) showing up in a call stack list.
We will fix this problem in a later patch ("arm64: ftrace: fix a stack
tracer's output under function graph tracer"). But since real return
addresses are saved in ret_stack[] array in struct task_struct,
unwind functions need to be told, in addition to a stack pointer
address, which task is being traced in order to find the real return
addresses.
This patch extends unwind functions' interfaces by adding an extra
argument of a pointer to task_struct.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
AKASHI Takahiro [Tue, 15 Dec 2015 08:33:39 +0000 (17:33 +0900)]
arm64: ftrace: modify a stack frame in a safe way
Function graph tracer modifies a return address (LR) in a stack frame by
calling ftrace_prepare_return() in a traced function's function prologue.
The current code does this modification before preserving an original
address at ftrace_push_return_trace() and there is always a small window
of inconsistency when an interrupt occurs.
This doesn't matter as long as an interrupt stack is in use, because
the stack tracer won't be invoked in an interrupt context. But it would be
better to proactively minimize such a window by moving the LR modification
after ftrace_push_return_trace().
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Fri, 18 Dec 2015 16:01:47 +0000 (16:01 +0000)]
arm64: remove irq_count and do_softirq_own_stack()
sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
irq_stack implementation wrongly assumed this would only ever happen
via the softirq path, allowing it to update irq_count late, in
do_softirq_own_stack().
This means if an irq occurs in sysrq_handle_reboot(), during
emergency_restart() the stack will be corrupted, as irq_count wasn't
updated.
Lose the optimisation, and instead of moving the adding/subtracting of
irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This tells us
whether we are on a task stack; if so, we can safely switch to the irq stack.
Finally, remove do_softirq_own_stack(), as we don't need it anymore.
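The comparison can be sketched in C as below. The real check lives in assembly in entry.S; this version is only a model, and it assumes a 16KB THREAD_SIZE and a struct thread_info placed at the base of a THREAD_SIZE-aligned task stack. The helper name on_task_stack is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SIZE ((uintptr_t)16384)	/* assumed: 16KB task stacks */

/*
 * Mask the stack pointer down to the base of its THREAD_SIZE-aligned
 * region and compare that with the thread_info pointer held in sp_el0.
 * A match means sp lies within the current task's stack.
 */
static bool on_task_stack(uintptr_t sp, uintptr_t thread_info)
{
	return (sp & ~(THREAD_SIZE - 1)) == thread_info;
}
```

A stack pointer on the irq stack masks down to a different base, so the comparison fails and no second switch is attempted.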
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: use get_thread_info macro]
Signed-off-by: Will Deacon <will.deacon@arm.com>
David Woods [Thu, 17 Dec 2015 19:31:26 +0000 (14:31 -0500)]
arm64: hugetlb: add support for PTE contiguous bit
The arm64 MMU supports a Contiguous bit which is a hint that the TTE
is one of a set of contiguous entries which can be cached in a single
TLB entry. Supporting this bit adds new intermediate huge page sizes.
The set of huge page sizes available depends on the base page size.
Without using contiguous pages the huge page sizes are as follows.
4KB: 2MB 1GB
64KB: 512MB
With a 4KB granule, the contiguous bit groups together sets of 16 pages
and with a 64KB granule it groups sets of 32 pages. This enables two new
huge page sizes in each case, so that the full set of available sizes
is as follows.
4KB: 64KB 2MB 32MB 1GB
64KB: 2MB 512MB 16GB
If a 16KB granule is used then the contiguous bit groups 128 pages
at the PTE level and 32 pages at the PMD level.
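The sizes listed above follow from simple arithmetic, sketched here. The struct and function names are hypothetical. The sketch assumes 8-byte descriptors, so a table holds page_size / 8 entries and one PMD-level block covers page_size * (page_size / 8) bytes.

```c
#include <stdint.h>

/* Hypothetical description of one translation granule. */
struct granule {
	uint64_t page_size;	/* base page size in bytes */
	unsigned cont_ptes;	/* pages grouped by the contiguous bit */
	unsigned cont_pmds;	/* PMD blocks grouped by the contiguous bit */
};

/* One PMD entry maps a full table of pages (8-byte descriptors). */
static uint64_t pmd_size(uint64_t page_size)
{
	return page_size * (page_size / 8);
}

/* Huge page size made of contiguous PTEs. */
static uint64_t cont_pte_size(const struct granule *g)
{
	return g->cont_ptes * g->page_size;
}

/* Huge page size made of contiguous PMD blocks. */
static uint64_t cont_pmd_size(const struct granule *g)
{
	return g->cont_pmds * pmd_size(g->page_size);
}
```

For a 4KB granule with groups of 16 this gives 64KB and 32MB; for a 64KB granule with groups of 32 it gives 2MB and 16GB. With the 16KB figures (128 and 32) the same arithmetic gives 2MB and 1GB.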
If the base page size is set to 64KB then 2MB pages are enabled by
default. It is possible in the future to make 2MB the default huge
page size for both 4KB and 64KB granules.
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: David Woods <dwoods@ezchip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ashok Kumar [Thu, 17 Dec 2015 09:38:32 +0000 (01:38 -0800)]
arm64: Use PoU cache instr for I/D coherency
In systems with three levels of cache (PoU at L1 and PoC at L3),
PoC cache flush instructions flush the L2 and L3 caches, which could affect
performance.
For cache flushes for I and D coherency, PoU should suffice.
So change all I and D coherency related cache flushes to PoU.
Introduce a new __clean_dcache_area_pou API for dcache flush to the PoU
and provide a common macro for __flush_dcache_area and
__clean_dcache_area_pou.
Also, in __sync_icache_dcache, icache invalidation for a non-aliasing
VIPT icache is now done only for that particular page instead of the earlier
__flush_icache_all.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ashok Kumar [Thu, 17 Dec 2015 09:38:31 +0000 (01:38 -0800)]
arm64: Defer dcache flush in __cpu_copy_user_page
Defer dcache flushing to __sync_icache_dcache by calling
flush_dcache_page which clears PG_dcache_clean flag.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ashok Kumar <ashoks@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Tue, 15 Dec 2015 11:21:25 +0000 (11:21 +0000)]
arm64: reduce stack use in irq_handler
The code for switching to irq_stack stores three pieces of information on
the stack: fp and lr, as a fake stack frame (that lets us walk back onto the
interrupted task's stack frame), and the address of the struct pt_regs that
contains the register values from kernel entry (which dump_backtrace()
will print in any stack trace).
To reduce this, we store only fp and the pointer to the struct pt_regs.
unwind_frame() can recognise this as the irq_stack dummy frame (as it only
appears at the top of the irq_stack), and use the struct pt_regs values
to find the missing interrupted link-register.
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 15 Dec 2015 10:59:03 +0000 (10:59 +0000)]
Merge branch 'aarch64/efi' into aarch64/for-next/core
Merge in EFI memblock changes from Ard, which form the preparatory work
for UEFI support on 32-bit ARM.
Will Deacon [Thu, 10 Dec 2015 16:05:36 +0000 (16:05 +0000)]
arm64: mm: ensure that the zero page is visible to the page table walker
In paging_init, we allocate the zero page, memset it to zero and then
point TTBR0 to it in order to avoid speculative fetches through the
identity mapping.
In order to guarantee that the freshly zeroed page is indeed visible to
the page table walker, we need to execute a dsb instruction prior to
writing the TTBR.
Cc: <stable@vger.kernel.org> # v3.14+, for older kernels need to drop the 'ishst'
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 17 Nov 2015 14:45:47 +0000 (14:45 +0000)]
arm64: Documentation: add list of software workarounds for errata
It's not immediately obvious which hardware errata are worked around in
the Linux kernel for an arbitrary kernel tree, so add a file to keep
track of what we're working around.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Fri, 11 Dec 2015 11:04:31 +0000 (11:04 +0000)]
arm64: mm: place __cpu_setup in .text
We drop __cpu_setup in .text.init, which ends up being part of .text.
The .text.init section was a legacy section name which has been unused
elsewhere for a long time.
The ".text.init" name is misleading if read as a synonym for
".init.text". Any CPU may execute __cpu_setup before turning the MMU on,
so it should simply live in .text.
Remove the pointless section assignment. This will leave __cpu_setup in
the .text section.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Brown [Thu, 10 Dec 2015 16:54:32 +0000 (16:54 +0000)]
arm64: cmpxchg: Don't include linux/mmdebug.h
The arm64 asm/cmpxchg.h includes linux/mmdebug.h but doesn't so far as I
can tell actually use anything from it. Removing the inclusion reduces
spurious header dependency rebuilds and also avoids issues with
recursive inclusions of headers causing build breaks due to attempts to
use things before they are defined if linux/mmdebug.h starts pulling in
more low level headers.
Such errors have happened in -next recently, for example:
In file included from include/linux/completion.h:11:0,
from include/linux/rcupdate.h:43,
from include/linux/tracepoint.h:19,
from include/linux/mmdebug.h:6,
from ./arch/arm64/include/asm/cmpxchg.h:22,
from ./arch/arm64/include/asm/atomic.h:41,
from include/linux/atomic.h:4,
from include/linux/spinlock.h:406,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:19,
from arch/arm64/kernel/asm-offsets.c:21:
include/linux/wait.h: In function 'wait_on_atomic_t':
include/linux/wait.h:1218:2: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]
if (atomic_read(val) == 0)
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Wed, 9 Dec 2015 12:44:38 +0000 (12:44 +0000)]
arm64: mm: fold alternatives into .init
Currently we treat the alternatives separately from other data that's
only used during initialisation, using separate .altinstructions and
.altinstr_replacement linker sections. These are freed for general
allocation separately from .init*. This is problematic as:
* We do not remove execute permissions, as we do for .init, leaving the
memory executable.
* We pad between them, making the kernel Image binary up to PAGE_SIZE
bytes larger than necessary.
This patch moves the two sections into the contiguous region used for
.init*. This saves some memory, ensures that we remove execute
permissions, and allows us to remove some code made redundant by this
reorganisation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Wed, 9 Dec 2015 12:44:37 +0000 (12:44 +0000)]
arm64: Remove redundant padding from linker script
Currently we place an ALIGN_DEBUG_RO between text and data for the .text
and .init sections, and depending on configuration each of these may
result in up to SECTION_SIZE bytes worth of padding (for
DEBUG_RODATA_ALIGN).
We make no distinction between the text and data in each of these
sections at any point when creating the initial page tables in head.S.
We also make no distinction when modifying the tables; __map_memblock,
fixup_executable, mark_rodata_ro, and fixup_init only work at section
granularity. Thus this padding is unnecessary.
For the split between init text and data we impose a minimum alignment of
16 bytes, but this is also unnecessary. The init data is output
immediately after the padding before any symbols are defined, so this is
not required to keep a symbol for a linker section array correctly
associated with the data. Any objects within the section will be given
at least their usual alignment regardless.
This patch removes the redundant padding.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Wed, 9 Dec 2015 12:44:36 +0000 (12:44 +0000)]
arm64: mm: remove pointless PAGE_MASKing
As pgd_offset{,_k} shift the input address by PGDIR_SHIFT, the sub-page
bits will always be shifted out. There is no need to apply PAGE_MASK
before this.
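A quick way to see this is the sketch below. The constants are assumptions chosen for illustration (4KB pages, a PGDIR_SHIFT of 39 and 512 top-level entries) and the function is a user-space stand-in, not the kernel macro. Since PGDIR_SHIFT is larger than PAGE_SHIFT, every bit that PAGE_MASK would clear is discarded by the shift anyway.

```c
#include <stdint.h>

/* Assumed configuration for this sketch: 4KB pages, 48-bit VA. */
#define PAGE_SHIFT	12
#define PAGE_MASK	(~((UINT64_C(1) << PAGE_SHIFT) - 1))
#define PGDIR_SHIFT	39
#define PTRS_PER_PGD	512

/* Index of the top-level table entry covering addr. */
static unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}
```

For any address, pgd_index(addr) and pgd_index(addr & PAGE_MASK) return the same index.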
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Thu, 10 Dec 2015 10:22:41 +0000 (10:22 +0000)]
arm64: don't call C code with el0's fp register
On entry from el0, we save all the registers on the kernel stack, and
restore them before returning. x29 remains unchanged when we call out
to C code, which will store x29 as the frame-pointer on the stack.
Instead, write 0 into x29 after entry from el0, to avoid any risk of
tracing into user space.
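Why a zero frame pointer helps can be shown with a simplified frame walker. This is a user-space model with hypothetical names, assuming frame records of the form {previous fp, return address}. The walk ends when it reaches a null fp, so zeroing x29 at kernel entry means the walker never follows a pointer supplied by user space.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified frame record: saved frame pointer and return address. */
struct frame_record {
	const struct frame_record *fp;
	uintptr_t lr;
};

/*
 * Collect return addresses by following the fp chain. A null fp marks
 * the outermost frame and terminates the walk.
 */
static size_t walk_frames(const struct frame_record *fp,
			  uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->lr;
		fp = fp->fp;
	}
	return n;
}
```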
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Thu, 10 Dec 2015 10:22:40 +0000 (10:22 +0000)]
arm64: when walking onto the task stack, check sp & fp are in current->stack
When unwind_frame() reaches the bottom of the irq_stack, the last fp
points to the original task stack. unwind_frame() uses
IRQ_STACK_TO_TASK_STACK() to find the sp value. If either value is
wrong, we may end up walking a corrupt stack.
Check these values are sane by testing if they are both on the stack
pointed to by current->stack.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Thu, 10 Dec 2015 10:22:39 +0000 (10:22 +0000)]
arm64: Add this_cpu_ptr() assembler macro for use in entry.S
irq_stack is a per_cpu variable that needs to be accessed from entry.S.
Use an assembler macro instead of the unreadable details.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Mon, 30 Nov 2015 12:28:19 +0000 (13:28 +0100)]
arm64/efi: refactor EFI init and runtime code for reuse by 32-bit ARM
This refactors the EFI init and runtime code that will be shared
between arm64 and ARM so that it can be built for both archs.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Mon, 30 Nov 2015 12:28:18 +0000 (13:28 +0100)]
arm64/efi: split off EFI init and runtime code for reuse by 32-bit ARM
This splits off the early EFI init and runtime code that
- discovers the EFI params and the memory map from the FDT, and installs
the memblocks and config tables.
- prepares and installs the EFI page tables so that UEFI Runtime Services
can be invoked at the virtual address installed by the stub.
This will allow it to be reused for 32-bit ARM.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Mon, 30 Nov 2015 12:28:17 +0000 (13:28 +0100)]
arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP
Change the EFI memory reservation logic to use memblock_mark_nomap()
rather than memblock_reserve() to mark UEFI reserved regions as
occupied. In addition to reserving them against allocations done by
memblock, this will also prevent them from being covered by the linear
mapping.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Mon, 30 Nov 2015 12:28:16 +0000 (13:28 +0100)]
arm64: only consider memblocks with NOMAP cleared for linear mapping
Take the new memblock attribute MEMBLOCK_NOMAP into account when
deciding whether a certain region is or should be covered by the
kernel direct mapping.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Mon, 30 Nov 2015 12:28:15 +0000 (13:28 +0100)]
mm/memblock: add MEMBLOCK_NOMAP attribute to memblock memory table
This introduces the MEMBLOCK_NOMAP attribute and the required plumbing
to make it usable as an indicator that some parts of normal memory
should not be covered by the kernel direct mapping. It is up to the
arch to actually honor the attribute when laying out this mapping,
but the memblock code itself is modified to disregard these regions
for allocations and other general use.
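The intent can be modelled with a small sketch. The types, the flag value and the helper name are hypothetical and do not mirror the real memblock structures. Regions stay in the memory table, but anything flagged as no-map is skipped when deciding what the direct mapping should cover.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch. */
enum region_flags {
	REGION_NONE  = 0,
	REGION_NOMAP = 1 << 0,	/* keep out of the direct mapping */
};

struct region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Total bytes that the direct mapping should cover. */
static uint64_t mappable_bytes(const struct region *regions, size_t count)
{
	uint64_t total = 0;

	for (size_t i = 0; i < count; i++) {
		if (regions[i].flags & REGION_NOMAP)
			continue;	/* still listed as memory, but not mapped */
		total += regions[i].size;
	}
	return total;
}
```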
Cc: linux-mm@kvack.org
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Wed, 9 Dec 2015 13:58:42 +0000 (13:58 +0000)]
arm64: irq: fix walking from irq stack to task stack
Running with CONFIG_DEBUG_SPINLOCK=y can trigger a BUG with the new IRQ
stack code:
BUG: spinlock lockup suspected on CPU#1
This is due to the IRQ_STACK_TO_TASK_STACK macro incorrectly retrieving
the task stack pointer stashed at the top of the IRQ stack.
Sayeth James:
| Yup, this is what is happening. It's an off-by-one due to broken
| thinking about how the stack works. My broken thinking was:
|
| >   top ------------
| >       | dummy_lr | <- irq_stack_ptr
| >       ------------
| >       |   x29    |
| >       ------------
| >       |   x19    | <- irq_stack_ptr - 0x10
| >       ------------
| >       |   xzr    |
| >       ------------
|
| But the stack-pointer is decreased before use. So it actually looks
| like this:
|
| >       ------------
| >       |          | <- irq_stack_ptr
| >   top ------------
| >       | dummy_lr |
| >       ------------
| >       |   x29    | <- irq_stack_ptr - 0x10
| >       ------------
| >       |   x19    |
| >       ------------
| >       |   xzr    | <- irq_stack_ptr - 0x20
| >       ------------
|
| The value being used as the original stack is x29, which in all the
| tests is sp but without the current frame's data, hence there are no
| missing frames in the output.
|
| Jungseok Lee picked it up with a 32bit user space because aarch32
| can't use x29, so it remains 0 forever. The fix he posted is correct.
This patch fixes the macro and adds some of this wisdom to a comment,
so that the layout of the IRQ stack is well understood.
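The second diagram can be reproduced with a small model of a full-descending stack, where the pointer is decremented before each store. The helper names and the idea of modelling the stack as an array of 64-bit slots are assumptions for illustration; this is not the kernel macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Push onto a full-descending stack: decrement first, then store. */
static uint64_t *push(uint64_t *sp, uint64_t value)
{
	*--sp = value;
	return sp;
}

/* Read the 64-bit slot that starts `offset` bytes below ptr. */
static uint64_t slot_below(const uint64_t *ptr, size_t offset)
{
	return ptr[-(ptrdiff_t)(offset / sizeof(uint64_t))];
}
```

Pushing dummy_lr, x29, x19 and xzr in that order from irq_stack_ptr puts them at offsets 0x08, 0x10, 0x18 and 0x20 below irq_stack_ptr, so reading at irq_stack_ptr - 0x10 yields x29 rather than x19.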
Cc: James Morse <james.morse@arm.com>
Reported-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
James Morse [Fri, 4 Dec 2015 11:02:27 +0000 (11:02 +0000)]
arm64: Add do_softirq_own_stack() and enable irq_stacks
entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
irq_count is used to detect recursive interrupts on the irq_stack; it is
updated late by do_softirq_own_stack(), when called on the irq_stack, before
__do_softirq() re-enables interrupts to process softirqs.
do_softirq_own_stack() is added by this patch, but does not yet switch
stack.
This patch adds the dummy stack frame and data needed by the previous
stack tracing patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
AKASHI Takahiro [Fri, 4 Dec 2015 11:02:26 +0000 (11:02 +0000)]
arm64: Modify stack trace and dump for use with irq_stack
This patch allows unwind_frame() to traverse from interrupt stack to task
stack correctly. It requires data from a dummy stack frame, created
during irq_stack_entry(), added by a later patch.
A similar approach is taken to modify dump_backtrace(), which expects to
find struct pt_regs underneath any call to functions marked __exception.
When on an irq_stack, the struct pt_regs is stored on the old task stack,
the location of which is stored in the dummy stack frame.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[james.morse: merged two patches, reworked for per_cpu irq_stacks, and
no alignment guarantees, added irq_stack definitions]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Jungseok Lee [Fri, 4 Dec 2015 11:02:25 +0000 (11:02 +0000)]
arm64: Store struct thread_info in sp_el0
We need to decide how to manage struct thread_info data once the IRQ
stack is introduced. Under the current thread_info calculation logic,
struct thread_info would have to be copied to the IRQ stack on every
context switch, which is too expensive to keep supporting.
Instead, this patch makes use of sp_el0, which is an unused scratch
register in EL1 context. Using sp_el0 not only simplifies the
management, but also keeps the text section from growing significantly
due to the statically allocated IRQ stack, as it removes the masking
operation using THREAD_SIZE in many places.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
John Blackwood [Mon, 7 Dec 2015 11:50:34 +0000 (11:50 +0000)]
arm64: Clear out any singlestep state on a ptrace detach operation
Make sure to clear out any ptrace singlestep state when a ptrace(2)
PTRACE_DETACH call is made on arm64 systems.
Otherwise, the previously ptraced task will die off with a SIGTRAP
signal if the debugger just previously singlestepped the ptraced task.
Cc: <stable@vger.kernel.org>
Signed-off-by: John Blackwood <john.blackwood@ccur.com>
[will: added comment to justify why this is in the arch code]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Catalin Marinas [Fri, 4 Dec 2015 12:42:29 +0000 (12:42 +0000)]
arm64: Add trace_hardirqs_off annotation in ret_to_user
When a kernel is built with CONFIG_TRACE_IRQFLAGS the following warning
is produced when entering userspace for the first time:
WARNING: at /work/Linux/linux-2.6-aarch64/kernel/locking/lockdep.c:3519
Modules linked in:
CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc3+ #639
Hardware name: Juno (DT)
task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
PC is at check_flags.part.22+0x19c/0x1a8
LR is at check_flags.part.22+0x19c/0x1a8
pc : [<ffffffc0000fba6c>] lr : [<ffffffc0000fba6c>] pstate: 600001c5
sp : ffffffc9768abe10
x29: ffffffc9768abe10 x28: ffffffc9768a8000
x27: 0000000000000000 x26: 0000000000000001
x25: 00000000000000a6 x24: ffffffc00064be6c
x23: ffffffc0009f249e x22: ffffffc9768a0000
x21: ffffffc97fea5480 x20: 00000000000001c0
x19: ffffffc00169a000 x18: 0000005558cc7b58
x17: 0000007fb78e3180 x16: 0000005558d2e238
x15: ffffffffffffffff x14: 0ffffffffffffffd
x13: 0000000000000008 x12: 0101010101010101
x11: 7f7f7f7f7f7f7f7f x10: fefefefefefeff63
x9 : 7f7f7f7f7f7f7f7f x8 : 6e655f7371726964
x7 : 0000000000000001 x6 : ffffffc0001079c4
x5 : 0000000000000000 x4 : 0000000000000001
x3 : ffffffc001698438 x2 : 0000000000000000
x1 : ffffffc9768a0000 x0 : 000000000000002e
Call trace:
[<ffffffc0000fba6c>] check_flags.part.22+0x19c/0x1a8
[<ffffffc0000fc440>] lock_is_held+0x80/0x98
[<ffffffc00064bafc>] __schedule+0x404/0x730
[<ffffffc00064be6c>] schedule+0x44/0xb8
[<ffffffc000085bb0>] ret_to_user+0x0/0x24
possible reason: unannotated irqs-off.
irq event stamp: 502169
hardirqs last enabled at (502169): [<ffffffc000085a98>] el0_irq_naked+0x1c/0x24
hardirqs last disabled at (502167): [<ffffffc0000bb3bc>] __do_softirq+0x17c/0x298
softirqs last enabled at (502168): [<ffffffc0000bb43c>] __do_softirq+0x1fc/0x298
softirqs last disabled at (502143): [<ffffffc0000bb830>] irq_exit+0xa0/0xf0
This happens because we disable interrupts in ret_to_user before calling
schedule() in work_resched. This patch adds the necessary
trace_hardirqs_off annotation.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Li Bin [Fri, 4 Dec 2015 03:38:40 +0000 (11:38 +0800)]
arm64: ftrace: fix the comments for ftrace_modify_code
There is no need to worry about module and __init text disappearing,
because ftrace has a module notifier that is called when a module is
being unloaded, before the text goes away. This code grabs the
ftrace_lock mutex and removes the module functions from the ftrace
list, such that it will no longer do any modifications to that
module's text. The update to make functions be traced or not is done
under the ftrace_lock mutex as well.
__init section code should not be modified by ftrace either, because
it is blacklisted in recordmcount.c and ignored by ftrace.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Li Bin [Fri, 4 Dec 2015 03:38:39 +0000 (11:38 +0800)]
arm64: ftrace: stop using kstop_machine to enable/disable tracing
For ftrace on arm64, kstop_machine, which is hugely disruptive to a
running system, is not needed to convert NOPs to ftrace calls or back.
The instructions being modified (NOP, B and BL) all fall under what
the architecture calls "concurrent modification and execution of
instructions": they can be executed by one thread of execution while
they are being modified by another thread of execution, without
requiring explicit synchronization.
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Thu, 19 Nov 2015 17:48:31 +0000 (17:48 +0000)]
arm64: spinlock: serialise spin_unlock_wait against concurrent lockers
Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
on architectures implementing spin_lock with LL/SC sequences and acquire
semantics:
| CPU 1                   CPU 2                     CPU 3
| ==================      ====================      ==============
|                                                   spin_unlock(&lock);
|                         spin_lock(&lock):
|                           r1 = *lock; // r1 == 0;
|                         o = READ_ONCE(object); // reordered here
| object = NULL;
| smp_mb();
| spin_unlock_wait(&lock);
|                           *lock = 1;
| smp_mb();
| o->dead = true;
|                         if (o) // true
|                           BUG_ON(o->dead); // true!!
The crux of the problem is that spin_unlock_wait(&lock) can return on
CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
to serialise against a concurrent locker and giving it acquire semantics
in the process (although it is not at all clear whether this is needed -
different callers seem to assume different things about the barrier
semantics and architectures are similarly disjoint in their
implementations of the macro).
This patch implements spin_unlock_wait using an LL/SC sequence with
acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
exclusive writeback is omitted, since the spin_lock operation is
indivisible and no intermediate state can be observed.
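The idea of turning the wait into a lock-like operation can be sketched in portable C11 atomics. This is only an illustration under stated assumptions, not the arm64 assembly from the patch: the lock word is the simple 0/1 value used in the diagram above, and sketch_lock_t and sketch_spin_unlock_wait are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked (as in the diagram). */
typedef struct {
	atomic_uint val;
} sketch_lock_t;

/* Wait until the lock is observed free, then write the same value back. */
static void sketch_spin_unlock_wait(sketch_lock_t *lock)
{
	unsigned int v;

	for (;;) {
		v = atomic_load_explicit(&lock->val, memory_order_acquire);
		if (v != 0)
			continue;	/* still held: keep waiting */
		/*
		 * Observed unlocked. Writing the unchanged value back as an
		 * atomic read-modify-write only succeeds if the word still
		 * reads as unlocked, which orders this check against any
		 * concurrent locker's update of the same word.
		 */
		if (atomic_compare_exchange_weak_explicit(&lock->val, &v, v,
							  memory_order_acquire,
							  memory_order_relaxed))
			return;
	}
}
```

A plain load, by contrast, may return while another CPU is part-way through taking the lock, which is the window the diagram shows.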
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Mon, 23 Nov 2015 15:12:59 +0000 (15:12 +0000)]
arm64: enable HAVE_IRQ_TIME_ACCOUNTING
arm64 relies on the arm_arch_timer for sched_clock, so we can select
HAVE_IRQ_TIME_ACCOUNTING and have the core sched-clock code enable the
feature at runtime based on the rate.
Reported-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Yury Norov [Wed, 2 Dec 2015 14:00:10 +0000 (14:00 +0000)]
arm64: fix COMPAT_SHMLBA definition for large pages
ARM glibc uses (4 * __getpagesize()) for SHMLBA, which is correct for
4KB pages and works fine for 64KB pages, but the kernel uses a hardcoded
16KB that is too small for 64KB page based kernels. This changes the
definition to what user space sees when using 64KB pages.
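The arithmetic behind the mismatch can be sketched as follows. The helper name sketch_compat_shmlba is hypothetical and this is not the kernel's definition, only the 4 * page-size rule described above.

```c
#include <stddef.h>

/* Hypothetical helper: SHMLBA as ARM glibc computes it, 4 * page size. */
static size_t sketch_compat_shmlba(size_t page_size)
{
	return 4 * page_size;
}
```

With 4KB pages this gives 16KB, matching the old hardcoded value; with 64KB pages it gives 256KB, which the hardcoded 16KB falls short of.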
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Jisheng Zhang [Fri, 20 Nov 2015 09:59:10 +0000 (17:59 +0800)]
arm64: add __init/__initdata section marker to some functions/variables
These functions/variables are not needed after booting, so mark them
as __init or __initdata.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 30 Oct 2015 18:56:19 +0000 (18:56 +0000)]
arm64: pgtable: implement pte_accessible()
This patch implements the pte_accessible() macro, which can be used to
test whether or not a given pte is a candidate for allocation in the
TLB.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Mon, 23 Nov 2015 13:26:20 +0000 (13:26 +0000)]
arm64: mm: allow sections for unaligned bases
Callees of __create_mapping may decide to create section mappings if
sufficient low bits of the physical and virtual addresses they were
passed are zero. While __create_mapping rounds the virtual base address
down, it does not similarly round the physical base address down, and
hence non-zero bits in the physical address can prevent use of a section
mapping, even where a whole next-level table would be used instead.
Round down the physical base address in __create_mapping to enable all
callees to always create section mappings when such a mapping is
possible.
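A minimal sketch of why the rounding matters, assuming 4KB pages and 2MB sections. The constants and helper names are illustrative and are not taken from the kernel sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	UINT64_C(0x1000)	/* assumed 4KB pages */
#define SKETCH_SECTION_SIZE	UINT64_C(0x200000)	/* assumed 2MB sections */

/* Round an address down to the start of its page. */
static uint64_t sketch_page_round_down(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* A section mapping is possible only if both addresses are section aligned. */
static bool sketch_can_use_section(uint64_t phys, uint64_t virt)
{
	return ((phys | virt) & (SKETCH_SECTION_SIZE - 1)) == 0;
}
```

If only the virtual address is rounded down, a physical address such as 0x40200010 still has low bits set and the section check fails; rounding both down lets it pass.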
Cc: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Mon, 23 Nov 2015 13:26:19 +0000 (13:26 +0000)]
arm64: mm: detect bad __create_mapping uses
If a caller of __create_mapping provides a PA and VA which have
different sub-page offsets, it is not clear which offset they expect to
apply to the mapping, and is indicative of a bad caller.
In some cases, the region we wish to map may validly have a sub-page
offset in the physical and virtual addresses. For example, EFI runtime
regions have 4K granularity, yet may be mapped by a 64K page kernel. So
long as the physical and virtual offsets are the same, the region will
be mapped at the expected VAs.
Disallow calls with differing sub-page offsets, and WARN when they are
encountered, so that we can detect and fix such cases.
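The check can be sketched as below, assuming 64KB pages as in the EFI example. The macro and function names are hypothetical, and a real implementation would WARN rather than just return a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(0x10000)	/* assumed 64KB pages */

/* True if the PA and VA share the same offset within a page. */
static bool sketch_offsets_match(uint64_t phys, uint64_t virt)
{
	return ((phys ^ virt) & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

A 4K-aligned region at PA 0x80001000 mapped at a VA ending in 0x1000 passes; the same PA paired with a VA ending in 0x2000 does not.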
Cc: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Linus Torvalds [Mon, 30 Nov 2015 02:58:26 +0000 (18:58 -0800)]
Linux 4.4-rc3
Linus Torvalds [Mon, 30 Nov 2015 01:38:08 +0000 (17:38 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull nouveau and radeon fixes from Dave Airlie:
"Just some nouveau and radeon/amdgpu fixes.
The nouveau fixes look large as the firmware context files are
regenerated, but the actual change is quite small"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: make some dpm errors debug only
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
drm/radeon: make rv770_set_sw_state failures non-fatal
drm/amdgpu: move dependency handling out of atomic section v2
drm/amdgpu: optimize scheduler fence handling
drm/amdgpu: remove vm->mutex
drm/amdgpu: add mutex for ba_va->valids/invalids
drm/amdgpu: adapt vce session create interface changes
drm/amdgpu: vce use multiple cache surface starting from stoney
drm/amdgpu: reset vce trap interrupt flag
Linus Torvalds [Mon, 30 Nov 2015 01:30:41 +0000 (17:30 -0800)]
Merge tag 'rtc-4.4-2' of git://git./linux/kernel/git/abelloni/linux
Pull RTC fixes from Alexandre Belloni:
"Two fixes for the ds1307 alarm and wakeup"
* tag 'rtc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: ds1307: fix alarm reading at probe time
rtc: ds1307: fix kernel splat due to wakeup irq handling
Linus Torvalds [Mon, 30 Nov 2015 01:24:35 +0000 (17:24 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fix from Ralf Baechle:
"Just a fix for empty loops that may be removed by non-antique GCC"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix delay loops which may be removed by GCC.
Linus Torvalds [Mon, 30 Nov 2015 01:18:41 +0000 (17:18 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
"Summary:
- Add missing initialization of max_pfn, which is needed to make
selftests/vm/mlock2-tests succeed,
- Wire up new mlock2 syscall"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up mlock2
m68knommu: Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: sun3 - Add missing initialization of max_pfn and {min,max}_low_pfn
m68k/mm: m54xx - Add missing initialization of max_pfn
m68k/mm: motorola - Add missing initialization of max_pfn
Linus Torvalds [Mon, 30 Nov 2015 01:13:07 +0000 (17:13 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Just two changes this time around:
- wire up the new mlock2 syscall added during the last merge window
- fix a build problem with certain configurations provoked by making
CONFIG_OF user selectable"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8454/1: OF implies OF_FLATTREE
ARM: wire up mlock2 syscall
Linus Torvalds [Sun, 29 Nov 2015 17:03:57 +0000 (09:03 -0800)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
- fix tcm-user backend driver expired cmd time processing (agrover)
- eliminate kref_put_spinlock_irqsave() for I/O completion (bart)
- fix iscsi login kthread failure case hung task regression (nab)
- fix COMPARE_AND_WRITE completion use-after-free race (nab)
- fix COMPARE_AND_WRITE with SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC non zero
SGL offset data corruption. (Jan + Doug)
- fix >= v4.4-rc1 regression for tcm_qla2xxx enable configfs attribute
(Himanshu + HCH)
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target/stat: print full t10_wwn.model buffer
target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
qla2xxx: Fix regression introduced by target configFS changes
kref: Remove kref_put_spinlock_irqsave()
target: Invoke release_cmd() callback without holding a spinlock
target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
iscsi-target: Fix rx_login_comp hang after login failure
iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
target/user: Do not set unused fields in tcmu_ops
target/user: Fix time calc in expired cmd processing
Linus Torvalds [Sun, 29 Nov 2015 16:58:48 +0000 (08:58 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
"Specifics:
- several fixes and cleanups on Rockchip thermal drivers.
- add the missing support of RK3368 SoCs in Rockchip driver.
- small fixes on of-thermal, power_allocator, rcar driver, IMX, and
QCOM drivers, and also compilation fixes, on thermal.h, when thermal
is not selected"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
imx: thermal: use CPU temperature grade info for thresholds
thermal: fix thermal_zone_bind_cooling_device prototype
Revert "thermal: qcom_spmi: allow compile test"
thermal: rcar_thermal: remove redundant operation
thermal: of-thermal: Reduce log level for message when can't fine thermal zone
thermal: power_allocator: Use temperature reading from tz
thermal: rockchip: Support the RK3368 SoCs in thermal driver
thermal: rockchip: consistently use int for temperatures
thermal: rockchip: Add the sort mode for adc value increment or decrement
thermal: rockchip: improve the conversion function
thermal: rockchip: trivial: fix typo in commit
thermal: rockchip: better to compatible the driver for different SoCs
dt-bindings: rockchip-thermal: Support the RK3368 SoCs compatible
David Disseldorp [Fri, 27 Nov 2015 17:37:47 +0000 (18:37 +0100)]
target/stat: print full t10_wwn.model buffer
Cut 'n paste error saw it only process sizeof(t10_wwn.vendor) characters.
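This class of bug is easy to reproduce in isolation. The sketch below uses assumed field sizes (8 and 16 bytes) and hypothetical names; it is not the target code itself.

```c
#include <stddef.h>
#include <string.h>

/* Assumed field sizes for illustration; not copied from the kernel headers. */
struct sketch_wwn {
	char vendor[8];
	char model[16];
};

/* Copy len characters of a fixed-width, unterminated field and terminate it. */
static void sketch_copy_field(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
	dst[len] = '\0';
}
```

Passing sizeof(w.vendor) when copying w.model truncates the output to 8 characters; passing sizeof(w.model) yields the full 16.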
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Jan Engelhardt [Mon, 23 Nov 2015 16:46:32 +0000 (17:46 +0100)]
target: fix COMPARE_AND_WRITE non zero SGL offset data corruption
target_core_sbc's compare_and_write functionality suffers from taking
data at the wrong memory location when writing a CAW request to disk
when a SGL offset is non-zero.
This can happen with loopback and vhost-scsi fabric drivers when
SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC is used to map existing user-space
SGL memory into COMPARE_AND_WRITE READ/WRITE payload buffers.
Given the following sample LIO subtopology,
% targetcli ls /loopback/
o- loopback ................................. [1 Target]
o- naa.6001405ebb8df14a ....... [naa.60014059143ed2b3]
o- luns ................................... [2 LUNs]
o- lun0 ................ [iblock/ram0 (/dev/ram0)]
o- lun1 ................ [iblock/ram1 (/dev/ram1)]
% lsscsi -g
[3:0:1:0] disk LIO-ORG IBLOCK 4.0 /dev/sdc /dev/sg3
[3:0:1:1] disk LIO-ORG IBLOCK 4.0 /dev/sdd /dev/sg4
the following bug can be observed in Linux 4.3 and 4.4~rc1:
% perl -e 'print chr$_ for 0..255,reverse 0..255' >rand
% perl -e 'print "\0" x 512' >zero
% cat rand >/dev/sdd
% sg_compare_and_write -i rand -D zero --lba 0 /dev/sdd
% sg_compare_and_write -i zero -D rand --lba 0 /dev/sdd
Miscompare reported
% hexdump -Cn 512 /dev/sdd
00000000 0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00000200
Rather than writing all-zeroes as instructed with the -D file, it
corrupts the data in the sector by splicing some of the original
bytes in. The page of the first entry of cmd->t_data_sg includes the
CDB, and sg->offset is set to a position past the CDB. I presume that
sg->offset is also the right choice to use for subsequent sglist
members.
Signed-off-by: Jan Engelhardt <jengelh@netitwork.de>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Himanshu Madhani [Tue, 24 Nov 2015 17:20:15 +0000 (12:20 -0500)]
qla2xxx: Fix regression introduced by target configFS changes
This patch fixes the following regression:
# targetcli
[Errno 13] Permission denied: '/sys/kernel/config/target/qla2xxx/21:00:00:0e:1e:08:c7:20/tpgt_1/enable'
Fixes: 2eafd72939fd ("target: use per-attribute show and store methods")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Bart Van Assche [Thu, 22 Oct 2015 23:02:14 +0000 (16:02 -0700)]
kref: Remove kref_put_spinlock_irqsave()
The last user is gone. Hence remove this function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Bart Van Assche [Thu, 22 Oct 2015 22:57:04 +0000 (15:57 -0700)]
target: Invoke release_cmd() callback without holding a spinlock
This patch fixes the following kernel warning by ensuring that IRQs
are not disabled while ft_release_cmd() is invoked (fc_seq_set_resp()
invokes spin_unlock_bh()):
WARNING: CPU: 3 PID: 117 at kernel/softirq.c:150 __local_bh_enable_ip+0xaa/0x110()
Call Trace:
[<ffffffff814f71eb>] dump_stack+0x4f/0x7b
[<ffffffff8105e56a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff8105e65a>] warn_slowpath_null+0x1a/0x20
[<ffffffff81062b2a>] __local_bh_enable_ip+0xaa/0x110
[<ffffffff814ff229>] _raw_spin_unlock_bh+0x39/0x40
[<ffffffffa03a7f94>] fc_seq_set_resp+0xe4/0x100 [libfc]
[<ffffffffa02e604a>] ft_free_cmd+0x4a/0x90 [tcm_fc]
[<ffffffffa02e6972>] ft_release_cmd+0x12/0x20 [tcm_fc]
[<ffffffffa042bd66>] target_release_cmd_kref+0x56/0x90 [target_core_mod]
[<ffffffffa042caf0>] target_put_sess_cmd+0xc0/0x110 [target_core_mod]
[<ffffffffa042cb81>] transport_release_cmd+0x41/0x70 [target_core_mod]
[<ffffffffa042d975>] transport_generic_free_cmd+0x35/0x420 [target_core_mod]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joern Engel <joern@logfs.org>
Reviewed-by: Andy Grover <agrover@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Fri, 6 Nov 2015 07:37:59 +0000 (23:37 -0800)]
target: Fix race for SCF_COMPARE_AND_WRITE_POST checking
This patch addresses a race + use after free where the first
stage of COMPARE_AND_WRITE in compare_and_write_callback()
is rescheduled after the backend sends the secondary WRITE,
resulting in second stage compare_and_write_post() callback
completing in target_complete_ok_work() before the first
can return.
Because current code depends on checking se_cmd->se_cmd_flags
after return from se_cmd->transport_complete_callback(),
this results in first stage having SCF_COMPARE_AND_WRITE_POST
set, which incorrectly falls through into second stage CAW
processing code, eventually triggering a NULL pointer
dereference due to use after free.
To address this bug, pass in a new *post_ret parameter into
se_cmd->transport_complete_callback(), and depend upon this
value instead of ->se_cmd_flags to determine when to return
or fall through into ->queue_status() code for CAW.
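The shape of the fix can be sketched as below. This is a simplified illustration with hypothetical names and return conventions, not the target core code: the point is that the caller decides from a local out-parameter rather than re-reading the command after the callback has run.

```c
#include <stdbool.h>

/* Placeholder command; flags is deliberately not consulted after the callback. */
struct sketch_cmd {
	unsigned int flags;
};

/* Hypothetical callback type: *post_ret tells the caller whether to continue. */
typedef int (*sketch_complete_cb)(struct sketch_cmd *cmd, bool success,
				  int *post_ret);

/* First stage: a later stage finishes the command, so the caller must stop. */
static int sketch_first_stage(struct sketch_cmd *cmd, bool success,
			      int *post_ret)
{
	(void)cmd;
	(void)success;
	*post_ret = 0;
	return 0;
}

/* Second stage: ask the caller to go on to status queueing. */
static int sketch_second_stage(struct sketch_cmd *cmd, bool success,
			       int *post_ret)
{
	(void)cmd;
	(void)success;
	*post_ret = 1;
	return 0;
}

/* Returns true if the caller should fall through to queueing status. */
static bool sketch_complete(struct sketch_cmd *cmd, sketch_complete_cb cb)
{
	int post_ret = 0;
	int rc = cb(cmd, true, &post_ret);

	/* Only the local post_ret is examined; cmd may already be gone. */
	return rc == 0 && post_ret != 0;
}
```

Because post_ret lives on the caller's stack, its value cannot be changed by a second stage completing on another CPU while the first stage is still returning.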
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Nicholas Bellinger [Thu, 5 Nov 2015 22:11:59 +0000 (14:11 -0800)]
iscsi-target: Fix rx_login_comp hang after login failure
This patch addresses a case where iscsi_target_do_tx_login_io()
fails sending the last login response PDU, after the RX/TX
threads have already been started.
The case centers around iscsi_target_rx_thread() not invoking
allow_signal(SIGINT) before the send_sig(SIGINT, ...) occurs
from the failure path, resulting in RX thread hanging
indefinitely on iscsi_conn->rx_login_comp.
Note this bug is a regression introduced by:
commit e54198657b65625085834847ab6271087323ffea
Author: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Wed Jul 22 23:14:19 2015 -0700
iscsi-target: Fix iscsit_start_kthreads failure OOPs
To address this bug, complete ->rx_login_complete for good
measure in the failure path, and immediately return from
RX thread context if connection state did not actually reach
full feature phase (TARG_CONN_STATE_LOGGED_IN).
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Luis de Bethencourt [Mon, 19 Oct 2015 20:18:24 +0000 (21:18 +0100)]
iscsi-target: return -ENOMEM instead of -1 in case of failed kmalloc()
Smatch complains about returning hard-coded error codes. Silence this
warning by returning -ENOMEM.
drivers/target/iscsi/iscsi_target_parameters.c:211
iscsi_create_default_params() warn: returning -1 instead of -ENOMEM is sloppy
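The convention Smatch is asking for looks roughly like this in a userspace sketch. The function names and the injectable allocator are hypothetical, added only so the failure path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*sketch_alloc_fn)(size_t);

/* Stand-in allocator that always fails, to exercise the error path. */
static void *sketch_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Allocate a parameter block; report failure as -ENOMEM, not a bare -1. */
static int sketch_create_params(void **out, sketch_alloc_fn alloc, size_t size)
{
	void *p = alloc(size);

	if (!p)
		return -ENOMEM;
	*out = p;
	return 0;
}
```

Callers can then propagate the return value unchanged and still tell an allocation failure apart from other errors.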
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Fri, 13 Nov 2015 18:42:20 +0000 (10:42 -0800)]
target/user: Do not set unused fields in tcmu_ops
TCMU sets TRANSPORT_FLAG_PASSTHROUGH, so INQUIRY commands will not be
emulated by LIO but passed up to userspace. Therefore TCMU should not
set these, just like pscsi doesn't.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Andy Grover [Fri, 13 Nov 2015 18:42:19 +0000 (10:42 -0800)]
target/user: Fix time calc in expired cmd processing
Reversed arguments meant that we were doing nothing for cmds whose deadline
had passed.
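The effect of swapping the arguments can be seen with a userspace restatement of the usual wrap-safe comparison. The helper names below are hypothetical and the sketch is not the TCMU code.

```c
#include <stdbool.h>

/* Wrap-safe comparison: true if time a is after time b. */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* A command has expired once the current time is past its deadline. */
static bool sketch_cmd_expired(unsigned long now, unsigned long deadline)
{
	return sketch_time_after(now, deadline);
}
```

Calling sketch_time_after(deadline, now) instead answers the opposite question, so a command whose deadline has passed is never reported as expired.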
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Arnd Bergmann [Thu, 19 Nov 2015 12:20:54 +0000 (13:20 +0100)]
ARM: 8454/1: OF implies OF_FLATTREE
On the ARM architecture, individual platforms select CONFIG_USE_OF if they
need it, but all device tree code is keyed off CONFIG_OF. When building
a platform without DT support and manually enabling CONFIG_OF, we now
get a number of build errors, e.g.
arch/arm/kernel/devtree.c: In function 'setup_machine_fdt':
arch/arm/kernel/devtree.c:215:19: error: implicit declaration of function 'early_init_dt_verify' [-Werror=implicit-function-declaration]
We could now try to separate the use case of booting from DT vs. the
case of using the dynamic implementation, but that seems more complicated
than it can gain us.
This simply changes the ARM Kconfig file to always enable OF_RESERVED_MEM
and OF_EARLY_FLATTREE when CONFIG_OF is enabled. These options add a little
extra code when we just want the dynamic OF implementation, but that seems
like a rather obscure case, and this version solves all CONFIG_OF related
randconfig regressions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0166dc11be91 ("of: make CONFIG_OF user selectable")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sat, 28 Nov 2015 21:07:41 +0000 (13:07 -0800)]
Merge tag 'pci-v4.4-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Here are a few fixes I'd like to have in v4.4: a generic one for sysfs
and three for HiSilicon and DesignWare host controllers.
Summary:
NUMA:
- Prevent out of bounds access in numa_node override (Mathias Krause)
HiSilicon host bridge driver:
- Fix deferred probing (Arnd Bergmann)
Synopsys DesignWare host bridge driver:
- Remove incorrect io_base assignment (Stanimir Varbanov)
- Move align_resource function pointer to pci_host_bridge structure
(Gabriele Paoloni)"
* tag 'pci-v4.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ARM/PCI: Move align_resource function pointer to pci_host_bridge structure
PCI: hisi: Fix deferred probing
PCI: designware: Remove incorrect io_base assignment
PCI: Prevent out of bounds access in numa_node override
Linus Torvalds [Sat, 28 Nov 2015 01:22:47 +0000 (17:22 -0800)]
Merge tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable patches:
- Fix a NFSv4 callback identifier leak that was also causing client
crashes
- Fix NFSv4 callback decoding issues when incoming requests are
truncated
- Don't declare the attribute cache valid when we call
nfs_update_inode with an empty attribute structure.
- Resend LAYOUTGET when there is a race that changes the seqid
Bugfixes:
- Fix a number of issues with the NFSv4.2 CLONE ioctl()
- Properly set NFS v4.2 NFSDBG_FACILITY
- NFSv4 referrals are broken; Cleanup FATTR4_WORD0_FS_LOCATIONS after
decoding success
- Use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
- Ensure that attrcache is revalidated after a SETATTR"
* tag 'nfs-for-4.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs4: resend LAYOUTGET when there is a race that changes the seqid
nfs: if we have no valid attrs, then don't declare the attribute cache valid
nfs: ensure that attrcache is revalidated after a SETATTR
nfs4: limit callback decoding to received bytes
nfs4: start callback_ident at idr 1
nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
NFS: Properly set NFS v4.2 NFSDBG_FACILITY
nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
nfs: use btrfs ioctl defintions for clone
nfs: allow intra-file CLONE
nfs: offer native ioctls even if CONFIG_COMPAT is set
nfs: pass on count for CLONE operations
Linus Torvalds [Fri, 27 Nov 2015 23:53:23 +0000 (15:53 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
- a null pointer dereference fix for omap_wdt
- some clock related fixes for pnx4008
- an underflow fix in wdt_set_timeout() for w83977f_wdt
- restart fix for tegra wdt
- Kconfig change to support Freescale Layerscape platforms
- fix for stopping the mtk_wdt watchdog
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog
watchdog: Add support for Freescale Layerscape platforms
watchdog: tegra: Stop watchdog first if restarting
watchdog: w83977f_wdt: underflow in wdt_set_timeout()
watchdog: pnx4008: make global wdt_clk static
watchdog: pnx4008: fix warnings caused by enabling unprepared clock
watchdog: omap_wdt: fix null pointer dereference
Linus Torvalds [Fri, 27 Nov 2015 23:45:45 +0000 (15:45 -0800)]
Merge branch 'for-linus-4.4' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This has Mark Fasheh's patches to fix quota accounting during subvol
deletion, which we've been working on for a while now. The patch is
pretty small but it's a key fix.
Otherwise it's a random assortment"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix balance range usage filters in 4.4-rc
btrfs: qgroup: account shared subtree during snapshot delete
Btrfs: use btrfs_get_fs_root in resolve_indirect_ref
btrfs: qgroup: fix quota disable during rescan
Btrfs: fix race between cleaner kthread and space cache writeout
Btrfs: fix scrub preventing unused block groups from being deleted
Btrfs: fix race between scrub and block group deletion
btrfs: fix rcu warning during device replace
btrfs: Continue replace when set_block_ro failed
btrfs: fix clashing number of the enhanced balance usage filter
Btrfs: fix the number of transaction units needed to remove a block group
Btrfs: use global reserve when deleting unused block group after ENOSPC
Btrfs: tests: checking for NULL instead of IS_ERR()
btrfs: fix signed overflows in btrfs_sync_file
Linus Torvalds [Fri, 27 Nov 2015 23:27:52 +0000 (15:27 -0800)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security
Pull security layer fixes from James Morris:
"A fix for SELinux policy processing (regression introduced by
commit fa1aa143ac4a: "selinux: extended permissions for ioctls"), as
well as a fix for the user-triggerable oops in the Keys code"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix handling of stored error in a negatively instantiated user key
selinux: fix bug in conditional rules handling
Linus Torvalds [Fri, 27 Nov 2015 22:22:03 +0000 (14:22 -0800)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"There is a small backlog of at91 patches here, the most significant is
the addition of some sama5d2 Xplained nodes that were waiting on an
MFD include file to get merged through another tree.
We normally try to sort those out before the merge window opens, but
the maintainer wasn't aware of that here and I decided to merge the
changes this time as an exception.
On OMAP a series of audio changes for dra7 missed the merge window but
turned out to be necessary to fix a boot time imprecise external abort
error and to get audio working.
The other changes are the usual simple changes, here is a list sorted
by platform:
at91:
removal of a useless defconfig option
removal of some legacy DT pieces
use of the proper watchdog compatible string
update of the MAINTAINERS entries for some Atmel drivers
drivers/scpi:
hide get_scpi_ops in module from built-in code
imx:
add missing .irq_set_type for i.MX GPC irq_chip.
fix the wrong spi-num-chipselects settings for Vybrid DSPI devices.
fix a merge error in Vybrid dts regarding the ADC device property
keystone:
fix the optional PDSP firmware loading
fix linking RAM setup for QMs
fix crash with clk_ignore_unused
mediatek:
Enable SCPSYS power domain driver by default
mvebu:
fix QNAP TS219 power-off in dts
fix legacy get_irqnr_and_base for dove and orion5x
omap:
fix l4 related boot time errors for dm81xx
use lockless clkdm/pwrdm api in omap4_boot_secondary
remove t410 abort handler to avoid hiding other critical errors
mark cpuidle tracepoints as _rcuidle
fix module alias for omap-ocp2scp
pxa:
palm: Fix typos in PWM lookup table code
renesas:
missing __initconst annotation for r8a7793_boards_compat_dt
rockchip:
disable mmc-tuning on the veyron-minnie board
adding the init state for the over-temperature-protection
zx:
only build power domain code when CONFIG_PM=y"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
ARM: OMAP4+: SMP: use lockless clkdm/pwrdm api in omap4_boot_secondary
arm: omap2+: add missing HWMOD_NO_IDLEST in 81xx hwmod data
ARM: orion5x: Fix legacy get_irqnr_and_base
ARM: dove: Fix legacy get_irqnr_and_base
soc: Mediatek: Enable SCPSYS power domain driver by default
ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
ARM: dts: keystone: k2l: fix kernel crash when clk_ignore_unused is not in bootargs
soc: ti: knav_qmss_queue: Fix linking RAM setup for queue managers
soc: ti: use request_firmware_direct() as acc firmware is optional
ARM: imx: add platform irq type setting in gpc
ARM: dts: vfxxx: Fix erroneous property in esdhc0 node
ARM: shmobile: r8a7793: proper constness with __initconst
scpi: hide get_scpi_ops in module from built-in code
ARM: zx: only build power domain code when CONFIG_PM=y
ARM: pxa: palm: Fix typos in PWM lookup table code
ARM: dts: Kirkwood: Fix QNAP TS219 power-off
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
MAINTAINERS: Atmel drivers: change NAND and ISI entries
ARM: at91/dt: sama5d2 Xplained: add several devices
...
Linus Torvalds [Fri, 27 Nov 2015 21:12:42 +0000 (13:12 -0800)]
Merge tag 'pm+acpi-4.4-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI fixes from Rafael Wysocki:
"These fix one recent regression (cpufreq core), fix up two features
added recently (ACPI CPPC support, SCPI support in the arm_big_little
cpufreq driver) and fix three older bugs in the intel_pstate driver.
Specifics:
- Fix a recent regression in the cpufreq core causing it to fail to
clean up sysfs directories properly on cpufreq driver removal
(Viresh Kumar).
- Fix a build problem in the SCPI support code recently added to the
arm_big_little cpufreq driver (Punit Agrawal).
- Fix up the recently added CPPC cpufreq frontend to process the CPU
coordination information provided by the platform firmware
correctly (Ashwin Chaugule).
- Fix the intel_pstate driver to behave as intended when switched
over to the "performance" mode via sysfs if hardware-driven P-state
selection (HWP) is enabled (Alexandra Yates).
- Fix two rounding errors in the intel_pstate driver that sometimes
cause it to use lower P-states than requested (Prarit Bhargava)"
* tag 'pm+acpi-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_pstate: Fix "performance" mode behavior with HWP enabled
cpufreq: SCPI: Depend on SCPI clk driver
cpufreq: intel_pstate: Fix limits->max_perf rounding error
cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
Dave Airlie [Fri, 27 Nov 2015 20:50:34 +0000 (06:50 +1000)]
Merge branch 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
Ben Skeggs wrote:
A couple of regression fixes, some more boards whitelisted for a hw bug
workaround, gr/ucode fixes for hangs a user is seeing.
The changes look larger than they actually are due to the ucode binaries
(*.fucN.h) being regenerated.
* 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
Linus Torvalds [Fri, 27 Nov 2015 19:59:02 +0000 (11:59 -0800)]
Merge tag 'sound-4.4-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are no big surprises but just all small fixes, mostly
device-specific quirks for HD-audio and USB-audio:
- Fix for detection of FireWire DICE Loud devices
- Intel Broxton HDMI/DP PCI IDs and relevant quirks
- Noise fixes: Dell XPS13 2015 model, Dell Latitude E6440, Gigabyte
Z170X mobo
- Fix the headphone mixer assignment on HP laptops for PulseAudio
- USB-MIDI fixes for Medeli DD305 and CH345
- Apply fixup for Acer Aspire One Cloudbook 14"
* tag 'sound-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix noise on Gigabyte Z170X mobo
ALSA: hda - Fix headphone noise after Dell XPS 13 resume back from S3
ALSA: hda - Apply HP headphone fixups more generically
ALSA: hda - Add fixup for Acer Aspire One Cloudbook 14
ALSA: hda - apply SKL display power request/release patch to BXT
ALSA: hda - add PCI IDs for Intel Broxton
ALSA: usb-audio: work around CH345 input SysEx corruption
ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
ALSA: usb-audio: add packet size quirk for the Medeli DD305
ALSA: dice: fix detection of Loud devices
ALSA: hda - Fix noise on Dell Latitude E6440
Linus Torvalds [Fri, 27 Nov 2015 19:09:59 +0000 (11:09 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Build fix when !CONFIG_UID16 (the patch is touching generic files but
it only affects arm64 builds; submitted by Arnd Bergmann)
- EFI fixes to deal with early_memremap() returning NULL and correctly
mapping run-time regions
- Fix CPUID register extraction of unsigned fields (not to be
sign-extended)
- ASID allocator fix to deal with long-running tasks over multiple
generation roll-overs
- Revert support for marking page ranges as contiguous PTEs (it leads
to TLB conflicts and requires additional non-trivial kernel changes)
- Proper early_alloc() failure check
- Disable KASan for 48-bit VA and 16KB page configuration (the pgd is
larger than the KASan shadow memory)
- Update the fault_info table (original descriptions based on early
engineering spec)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: efi: fix initcall return values
arm64: efi: deal with NULL return value of early_memremap()
arm64: debug: Treat the BRPs/WRPs as unsigned
arm64: cpufeature: Track unsigned fields
arm64: cpufeature: Add helpers for extracting unsigned values
Revert "arm64: Mark kernel page ranges contiguous"
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
arm64: efi: correctly map runtime regions
arm64: mm: fix fault_info table xFSC decoding
arm64: fix building without CONFIG_UID16
arm64: early_alloc: Fix check for allocation failure
Linus Torvalds [Fri, 27 Nov 2015 19:05:50 +0000 (11:05 -0800)]
Merge tag 'nios2-v4.4-rc3' of git://git./linux/kernel/git/lftan/nios2
Pull nios2 fix from Ley Foon Tan:
"nios2: fix cache coherency"
* tag 'nios2-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: fix cache coherency
Ralf Baechle [Fri, 27 Nov 2015 18:17:01 +0000 (19:17 +0100)]
MIPS: Fix delay loops which may be removed by GCC.
GCC 4.1 and newer remove empty loops. This becomes a problem when delay
loops get removed. Fixed by rewriting to use the proper Linux interface
for such delays.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: John Crispin <blogic@openwrt.org>
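A rough userspace illustration of the problem described above, not the code from this commit: an empty counting loop has no side effects, so an optimising compiler may delete it, whereas a delay tied to a clock cannot be dropped. In kernel code the analogous fix would be a delay helper such as udelay() or mdelay(); the commit text does not say which one was used. The names fragile_delay(), now_ns() and delay_us() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Fragile: the loop body is empty, so the compiler is free to remove it. */
static void fragile_delay(unsigned long loops)
{
	unsigned long i;

	for (i = 0; i < loops; i++)
		;
}

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: busy-waits until at least 'us' microseconds passed. */
static void delay_us(unsigned int us)
{
	uint64_t deadline = now_ns() + (uint64_t)us * 1000u;

	/* Each iteration calls into the clock, so the loop cannot be elided. */
	while (now_ns() < deadline)
		;
}
```

The second form states the intended duration directly instead of relying on how long a given number of iterations happens to take.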
Linus Torvalds [Fri, 27 Nov 2015 18:08:31 +0000 (10:08 -0800)]
Merge tag 'arc-4.4-rc3-fixes' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- Fix for perf callgraph unwinding causing RCU stalls
- Fix to enable Linux to run on non-default Interrupt priority 0
- Removal of pointless SYNC from __switch_to()
* tag 'arc-4.4-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: dw2 unwind: Remove falllback linear search thru FDE entries
ARC: remove SYNC from __switch_to()
ARCv2: Use the default irq priority for idle sleep
ARC: Abstract out ISA specific SLEEP args
ARC: comments update
ARC: switch to arc-linux- CROSS_COMPILE prefix across all configs
Arnd Bergmann [Fri, 27 Nov 2015 16:41:48 +0000 (17:41 +0100)]
Merge tag 'v4.4-rockchip-dts32-fixes1' of git://git./linux/kernel/git/mmind/linux-rockchip into fixes
Merge "ARM: rockchip: devicetree fixes for 4.4" from Heiko Stuebner:
Two fixes to Rockchip devicetree files, disabling the mmc-tuning
on the veyron-minnie board for now and adding the init state for
the over-temperature-protection to prevent glitches making the
system reboot sometimes.
* tag 'v4.4-rockchip-dts32-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
ARM: dts: rockchip: Add OTP gpio pinctrl to rk3288 tsadc node
ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie
Arnd Bergmann [Fri, 27 Nov 2015 16:28:41 +0000 (17:28 +0100)]
Merge tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for 4.4 (part 1)" from Jason Cooper:
- Fix QNAP TS219 power-off in dts
- Fix legacy get_irqnr_and_base for dove and orion5x
* tag 'mvebu-fixes-4.4-1' of git://git.infradead.org/linux-mvebu:
ARM: orion5x: Fix legacy get_irqnr_and_base
ARM: dove: Fix legacy get_irqnr_and_base
ARM: dts: Kirkwood: Fix QNAP TS219 power-off
Arnd Bergmann [Fri, 27 Nov 2015 16:28:10 +0000 (17:28 +0100)]
Merge tag 'renesas-fixes-for-v4.4' of git://git./linux/kernel/git/horms/renesas into fixes
Merge "Renesas ARM Based SoC Fixes for v4.4" from Simon Horman:
* r8a7793 SoC: Annotate r8a7793_boards_compat_dt with __initconst
Aside from being correct this fixes builds that otherwise
fail with section mismatch errors.
* tag 'renesas-fixes-for-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7793: proper constness with __initconst
Rafael J. Wysocki [Fri, 27 Nov 2015 15:23:59 +0000 (16:23 +0100)]
Merge branches 'pm-cpufreq' and 'acpi-cppc'
* pm-cpufreq:
intel_pstate: Fix "performance" mode behavior with HWP enabled
cpufreq: SCPI: Depend on SCPI clk driver
cpufreq: intel_pstate: Fix limits->max_perf rounding error
cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
* acpi-cppc:
cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
Linus Torvalds [Thu, 26 Nov 2015 19:42:25 +0000 (11:42 -0800)]
Merge tag 'for-linus-4.4-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix gntdev and numa balancing.
- Fix x86 boot crash due to unallocated legacy irq descs.
- Fix overflow in evtchn device when > 1024 event channels.
* tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: dynamically grow pending event channel ring
xen/events: Always allocate legacy interrupts on PV guests
xen/gntdev: Grant maps should not be subject to NUMA balancing
Linus Torvalds [Thu, 26 Nov 2015 19:19:59 +0000 (11:19 -0800)]
Merge tag 'powerpc-4.4-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- tm: Block signal return from setting invalid MSR state from Michael
Neuling
- tm: Check for already reclaimed tasks from Michael Neuling
* tag 'powerpc-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/tm: Check for already reclaimed tasks
powerpc/tm: Block signal return setting invalid MSR state
David Vrabel [Thu, 26 Nov 2015 16:14:35 +0000 (16:14 +0000)]
xen/evtchn: dynamically grow pending event channel ring
If more than 1024 event channels are bound to an evtchn device then it
is possible (even with well behaved applications) for the ring to
overflow and events to be lost (reported as an -EFBIG error).
Dynamically increase the size of the ring so there is always enough
space for all bound events. Well behaved applications that only unmask
events after draining them from the ring can thus no longer lose
events.
However, an application could unmask an event before draining it,
allowing multiple entries per port to accumulate in the ring, and an
overflow could still occur. So the overflow detection and reporting
is retained.
The ring size is initially only 64 entries so the common use case of
an application only binding a few events will use less memory than
before. The ring size may grow to 512 KiB (enough for all 2^17
possible channels). This order 7 kmalloc() may fail due to memory
fragmentation, so we fall back to trying vmalloc().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
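The sizes quoted above can be checked with a small sketch. It assumes 4-byte ring entries, 4 KiB pages, and growth by doubling; none of these is stated explicitly in the message, although 2^17 entries filling 512 KiB and that size being an order 7 allocation are consistent with them. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_MIN_ENTRIES 64u		/* initial ring size from the text */
#define MAX_CHANNELS	 (1u << 17)	/* 2^17 possible channels */
#define ENTRY_BYTES	 4u		/* assumption: 32-bit port per entry */
#define PAGE_BYTES	 4096u		/* assumption: 4 KiB pages */

/* Smallest doubled ring (starting at 64) with one slot per bound port. */
static unsigned int ring_entries_for(unsigned int bound_ports)
{
	unsigned int entries = RING_MIN_ENTRIES;

	while (entries < bound_ports && entries < MAX_CHANNELS)
		entries *= 2;
	return entries;
}

/* Allocation order: smallest n such that (PAGE_BYTES << n) >= bytes. */
static unsigned int alloc_order(size_t bytes)
{
	unsigned int order = 0;

	while (((size_t)PAGE_BYTES << order) < bytes)
		order++;
	return order;
}
```

Under these assumptions an application binding a handful of ports stays at 64 entries (256 bytes), while the worst case of 2^17 ports needs 2^17 * 4 bytes = 512 KiB, which is 128 pages, i.e. order 7.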
Ard Biesheuvel [Mon, 23 Nov 2015 07:43:24 +0000 (08:43 +0100)]
arm64: efi: fix initcall return values
Even though initcall return values are typically ignored, the
prototype is to return 0 on success or a negative errno value on
error. So fix the arm_enable_runtime_services() implementation to
return 0 on conditions that are not in fact errors, and return a
meaningful error code otherwise.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
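A minimal sketch of the return convention described above, written as ordinary userspace C rather than the actual arm_enable_runtime_services() code. The function example_init() and its parameters are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Initcall-style function: 0 on success, negative errno value on error. */
static int example_init(bool feature_present, size_t table_bytes,
			void **table_out)
{
	void *table;

	/* Nothing to set up is not an error, so report success. */
	if (!feature_present)
		return 0;

	/* A real failure gets a meaningful error code. */
	if (table_bytes == 0)
		return -EINVAL;

	table = calloc(1, table_bytes);
	if (!table)
		return -ENOMEM;

	*table_out = table;
	return 0;
}
```

The point is that "feature absent" and "setup failed" are distinguishable by the caller, even if most callers ignore the value.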
Ard Biesheuvel [Mon, 23 Nov 2015 07:43:23 +0000 (08:43 +0100)]
arm64: efi: deal with NULL return value of early_memremap()
Add NULL return value checks to two invocations of early_memremap()
in the UEFI init code. For the UEFI configuration tables, we just
warn since we have a better chance of being able to report the issue
in a way that can actually be noticed by a human operator if we don't
abort right away. For the UEFI memory map, however, all we can do is
panic() since we cannot proceed without a description of memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:58 +0000 (17:08 +0000)]
arm64: debug: Treat the BRPs/WRPs as unsigned
ID_AA64DFR0_EL1: BRPs and WRPs are unsigned values. Use
the appropriate helpers to extract those fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:57 +0000 (17:08 +0000)]
arm64: cpufeature: Track unsigned fields
Some of the feature bits have unsigned values and need
to be treated accordingly to avoid errors. Add the property
to the feature bits and use the appropriate field extract helpers.
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Boris Ostrovsky [Fri, 20 Nov 2015 16:25:04 +0000 (11:25 -0500)]
xen/events: Always allocate legacy interrupts on PV guests
After commit
8c058b0b9c34 ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Suzuki K. Poulose [Wed, 18 Nov 2015 17:08:56 +0000 (17:08 +0000)]
arm64: cpufeature: Add helpers for extracting unsigned values
The cpuid_feature_extract_field() extracts the feature value
as a signed integer. This could be problematic for features
whose values are unsigned, e.g. ID_AA64DFR0_EL1:BRPs. Add
an unsigned variant for the unsigned fields.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
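To illustrate the difference, here is a sketch of signed versus unsigned extraction of a 4-bit field. It is not the kernel's helper code; the function names are hypothetical, and the use of bit offset 12 for BRPs in the usage note is an assumption taken from the architecture documentation rather than from this message.

```c
#include <stdint.h>

/* Extract a 4-bit field as a signed value in the range [-8, 7]. */
static int extract_field_signed(uint64_t reg, unsigned int shift)
{
	int v = (int)((reg >> shift) & 0xf);

	return (v ^ 0x8) - 0x8;	/* sign-extend from bit 3 */
}

/* Extract the same 4-bit field as an unsigned value in the range [0, 15]. */
static unsigned int extract_field_unsigned(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}
```

With a field value of 0xf at offset 12, the signed variant yields -1 while the unsigned variant yields 15. For a count-style field such as BRPs the signed result is meaningless, which is the class of error the unsigned helper avoids.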
Boris Ostrovsky [Tue, 10 Nov 2015 20:10:33 +0000 (15:10 -0500)]
xen/gntdev: Grant maps should not be subject to NUMA balancing
Doing so will cause the grant to be unmapped and then, during
fault handling, the fault to be mistakenly treated as a NUMA hint
fault.
In addition, even if those maps could participate in NUMA
balancing, it wouldn't provide any benefit since we are unable
to determine the physical page's node (even if/when VNUMA is
implemented).
Marking grant maps' VMAs as VM_IO will exclude them from being
part of NUMA balancing.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Simon Guinot [Thu, 26 Nov 2015 14:37:13 +0000 (15:37 +0100)]
rtc: ds1307: fix alarm reading at probe time
With the current code, read_alarm() always returns -EINVAL when called
during the RTC device registration. This prevents retrieving an
alarm already configured in hardware.
This patch fixes the issue by moving the HAS_ALARM bit configuration
(if supported by the hardware) above the rtc_device_register() call.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Catalin Marinas [Thu, 26 Nov 2015 15:42:41 +0000 (15:42 +0000)]
Revert "arm64: Mark kernel page ranges contiguous"
This reverts commit
348a65cdcbbf243073ee39d1f7d4413081ad7eab.
Incorrect page table manipulation that does not respect the ARM ARM
recommended break-before-make sequence may lead to TLB conflicts. The
contiguous PTE patch makes the system even more susceptible to such
errors by changing the mapping from a single page to a contiguous range
of pages. An additional TLB invalidation would reduce the risk window,
however, the correct fix is to switch to a temporary swapper_pg_dir.
Once the correct workaround is done, the reverted commit will be
re-applied.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Will Deacon [Thu, 26 Nov 2015 13:49:39 +0000 (13:49 +0000)]
arm64: mm: keep reserved ASIDs in sync with mm after multiple rollovers
Under some unusual context-switching patterns, it is possible to end up
with multiple threads from the same mm running concurrently with
different ASIDs:
1. CPU x schedules task t with mm p containing ASID a and generation g
This task doesn't block and the CPU doesn't context switch.
So:
* per_cpu(active_asid, x) = {g,a}
* p->context.id = {g,a}
2. Some other CPU generates an ASID rollover. The global generation is
now (g + 1). CPU x is still running t, with no context switch and
so per_cpu(reserved_asid, x) = {g,a}
3. CPU y schedules task t', which shares mm p with t. The generation
mismatches, so we take the slowpath and hit the reserved ASID from
CPU x. p is then updated so that p->context.id = {g + 1,a}
4. CPU y schedules some other task u, which has an mm != p.
5. Some other CPU generates *another* ASID rollover. The global
generation is now (g + 2). CPU x is still running t, with no context
switch and so per_cpu(reserved_asid, x) = {g,a}.
6. CPU y once again schedules task t', but now *fails* to hit the
reserved ASID from CPU x because of the generation mismatch. This
results in a new ASID being allocated, despite the fact that t is
still running on CPU x with the same mm.
Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
between the two threads.
This patch fixes the problem by updating all of the matching reserved
ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
the reserved ASIDs in-sync with the mm and avoids the problem.
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
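The fix can be modelled in a few lines. This is a simplified userspace sketch of the idea (update every matching reserved entry on a slowpath hit), not the allocator itself; the array, the 16-bit generation/ASID split in mk_id(), and the CPU count are assumptions made for the model.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for the model */

/* One reserved {generation, ASID} id per CPU. */
static uint64_t reserved_asid[NR_CPUS_MODEL];

/* Pack a generation and an ASID into one id (16-bit split is assumed). */
static uint64_t mk_id(uint64_t generation, uint64_t asid)
{
	return (generation << 16) | (asid & 0xffff);
}

/*
 * Slowpath check: if any CPU has 'old_id' reserved, rewrite every matching
 * entry to 'new_id' so the reserved copies follow the mm's new generation.
 */
static bool check_update_reserved(uint64_t old_id, uint64_t new_id)
{
	bool hit = false;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (reserved_asid[cpu] == old_id) {
			hit = true;
			reserved_asid[cpu] = new_id;
		}
	}
	return hit;
}
```

In terms of the steps above: after step 3 the reserved entry for CPU x becomes {g + 1,a} along with p->context.id, so the lookup in step 6 (now comparing against {g + 1,a}) still hits and t' keeps ASID a instead of being given a new one.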
Andrey Ryabinin [Tue, 17 Nov 2015 15:47:08 +0000 (18:47 +0300)]
arm64: KASAN depends on !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
On KASAN + 16K_PAGES + 48BIT_VA
arch/arm64/mm/kasan_init.c: In function ‘kasan_early_init’:
include/linux/compiler.h:484:38: error: call to ‘__compiletime_assert_95’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE)
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
Currently KASAN will not work on 16K_PAGES and 48BIT_VA, so
forbid such a configuration to avoid the above build failure.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Suzuki K. Poulose <Suzuki.Poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
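The failing alignment check can be reproduced arithmetically. The sketch below assumes 8-byte page table descriptors, four translation levels, one shadow byte per 8 bytes of address space, and a shadow region that starts on a top-level boundary, so that its end is aligned only if its size is a multiple of the top-level entry size. These assumptions and the helper names are illustrative and not taken from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when 'x' is a multiple of 'align'; 'align' must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/* Bytes mapped by one top-level entry, assuming 8-byte descriptors. */
static uint64_t pgdir_size(unsigned int page_shift, unsigned int levels)
{
	unsigned int bits_per_level = page_shift - 3;

	return 1ull << (page_shift + bits_per_level * (levels - 1));
}

/* Shadow size, assuming one shadow byte per 8 bytes of address space. */
static uint64_t shadow_size(unsigned int va_bits)
{
	return 1ull << (va_bits - 3);
}
```

Under these assumptions a 48-bit space needs 2^45 bytes of shadow. With 16K pages (page shift 14) one top-level entry covers 2^47 bytes, so the shadow cannot end on a top-level boundary; with 4K pages (page shift 12) an entry covers 2^39 bytes and the check passes.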
Ley Foon Tan [Thu, 26 Nov 2015 14:25:58 +0000 (22:25 +0800)]
nios2: fix cache coherency
There is an intermittent cache coherency issue caught in toolchain tests.
Revert to using flushd.
Signed-off-by: Ley Foon Tan <lftan@altera.com>
James Morris [Thu, 26 Nov 2015 04:04:19 +0000 (15:04 +1100)]
Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2
Dave Airlie [Thu, 26 Nov 2015 02:42:15 +0000 (12:42 +1000)]
Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Radeon and amdgpu fixes for 4.4:
- DPM fixes for r7xx devices
- VCE fixes for Stoney
- GPUVM fixes
- Scheduler fixes
* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: make some dpm errors debug only
drm/radeon: make rv770_set_sw_state failures non-fatal
drm/amdgpu: move dependency handling out of atomic section v2
drm/amdgpu: optimize scheduler fence handling
drm/amdgpu: remove vm->mutex
drm/amdgpu: add mutex for ba_va->valids/invalids
drm/amdgpu: adapt vce session create interface changes
drm/amdgpu: vce use multiple cache surface starting from stoney
drm/amdgpu: reset vce trap interrupt flag
Linus Torvalds [Wed, 25 Nov 2015 23:11:08 +0000 (15:11 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes for sendfile lockups caught by Dmitry + a fix for
ancient sysvfs symlink breakage"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Avoid softlockups with sendfile(2)
vfs: Make sendfile(2) killable even better
fix sysvfs symlinks