Michael Ellerman [Wed, 23 Aug 2017 05:37:42 +0000 (15:37 +1000)]
powerpc/configs: Drop MEMORY_HOTREMOVE from ppc64/cell
In commit 577ec789a79e ("powerpc/cell: Drop select of MEMORY_HOTPLUG")
we removed the last traces of any dependency between
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:41 +0000 (15:37 +1000)]
powerpc/configs: Drop unnecessary CONFIG_POWERNV_OP_PANEL
In commit 43a1dd9b5fc6 ("powerpc/powernv: Add driver for operator
panel on FSP machines") we added CONFIG_POWERNV_OP_PANEL=m to the
powernv defconfig, but it's default m so that's not necessary.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:40 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed PCI_MSI on powernv
In commit a311e738b6d8 ("powerpc/powernv: Make PCI non-optional") we
made PCI (and therefore PCI_MSI) non-optional on powernv, so it
doesn't need to be in the defconfig anymore.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:39 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_SMP for pseries/ppc64/powernv
In commit 40e275653e2c ("powerpc/powernv: Always enable SMP when
building powernv") and 270e2dc9b803 ("powerpc/pseries: Always enable
SMP when building pseries") we forced CONFIG_SMP on for some configs.
Therefore we don't need to set it in those configs anymore.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:38 +0000 (15:37 +1000)]
powerpc/configs: Drop unnecessary CONFIG_UPROBE_EVENT
In commit 6b0b7551428e ("perf/core: Rename CONFIG_[UK]PROBE_EVENT to
CONFIG_[UK]PROBE_EVENTS") it was renamed to CONFIG_UPROBE_EVENTS.
Additionally it's default y, and we have the prerequisites enabled, so
we don't need it in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:37 +0000 (15:37 +1000)]
powerpc/configs: Drop unnecessary CONFIG_NUMA_BALANCING_DEFAULT_ENABLED
In commit 9654f95a081a ("powerpc: Enable NUMA balancing in
pseries[_le]_defconfig") we added CONFIG_NUMA_BALANCING_DEFAULT_ENABLED
to our defconfigs. But it's already enabled by default, so drop it.
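The check behind this kind of cleanup can be modelled roughly as below. This is only a sketch in Python, not how the kernel build does it: the `redundant_entries()` helper and the two dicts are hypothetical, and the default values shown are example data rather than values read from the Kconfig files.

```python
def redundant_entries(defconfig, kconfig_defaults):
    """Return symbols whose defconfig value equals the Kconfig default."""
    return sorted(
        sym for sym, value in defconfig.items()
        if kconfig_defaults.get(sym) == value
    )


# Example values only; the real defaults come from the Kconfig files.
defconfig = {
    "CONFIG_NUMA_BALANCING": "y",
    "CONFIG_NUMA_BALANCING_DEFAULT_ENABLED": "y",
}
kconfig_defaults = {
    "CONFIG_NUMA_BALANCING": "n",
    "CONFIG_NUMA_BALANCING_DEFAULT_ENABLED": "y",
}

print(redundant_entries(defconfig, kconfig_defaults))
```

An entry reported here carries no information in the defconfig, since the default already yields the same value.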
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:36 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES
Since commit eedf265aa003 ("devpts: Make each mount of devpts an
independent filesystem.") we no longer need to set
CONFIG_DEVPTS_MULTIPLE_INSTANCES in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:35 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_GCM
Since commit 00b9cfa3ff38 ("mac80211: Add GCMP and GCMP-256 ciphers")
we no longer need to set CONFIG_CRYPTO_GCM in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:34 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_NULL in g5 / c2k
Since commit 3491244c6298 ("crypto: echainiv - Set Kconfig default to
m") we no longer need to set CONFIG_CRYPTO_NULL in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:33 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_NULL
Since commit 00b9cfa3ff38 ("mac80211: Add GCMP and GCMP-256 ciphers")
we no longer need to set CONFIG_CRYPTO_NULL in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:32 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_SHA256
Since commit 826775bbf38f ("crypto: drbg - Add select on sha256") we
no longer need to set CONFIG_CRYPTO_SHA256 in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:31 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_ECB
Since commit 12cb3a1c4184 ("crypto: xts - Add ECB dependency") we no
longer need to set CONFIG_CRYPTO_ECB in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:30 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_HMAC
Since commit 401e4238f35c ("crypto: rng - Make DRBG the default RNG")
we no longer need to set CONFIG_CRYPTO_HMAC in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:29 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_DEV_VMX_ENCRYPT
Since commit ccf5c442a1b8 ("crypto: vmx - Convert to CPU feature based
module autoloading") we no longer need to set
CONFIG_CRYPTO_DEV_VMX_ENCRYPT in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:28 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_NF_CT_PROTO_(SCTP|UDPLITE)=y
In commit a85406afeb3e ("netfilter: conntrack: built-in support for
SCTP"), NF_CT_PROTO_SCTP switched from tristate to bool and became
default y. Similarly in commit 9b91c96c5d1f ("netfilter: conntrack:
built-in support for UDPlite"), NF_CT_PROTO_UDPLITE switched from
tristate to bool and became default y.
We had a few configs which set them to =m, which is no longer valid.
We don't need to change them to =y because both symbols are default y
and are enabled automatically based on the other symbols in the
affected defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:27 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_FIXED_PHY being selected by CONFIG_OF_MDIO
In commit a5e4bd991362 ("of_mdio: select fixed phy support
unconditionally"), CONFIG_OF_MDIO began selecting CONFIG_FIXED_PHY.
That means we no longer need to set it in some of our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:26 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_DEBUG_FS being selected via CONFIG_RCU_TRACE
In commit 961518259b3b ("rcu: Enable RCU tracepoints by default to aid
in debugging"), CONFIG_RCU_TRACE was made default y (if CONFIG_TREE_RCU=y,
which it is for some of our configs).
That in turn causes CONFIG_TREE_RCU_TRACE to be enabled, which selects
CONFIG_DEBUG_FS. The end result is that CONFIG_DEBUG_FS is forced on,
meaning we don't have to enable it in some of our configs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:25 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_DEVKMEM
Since commit e334cd69fa65 ("Move CONFIG_DEVKMEM default to n") we no
longer need to set CONFIG_DEVKMEM in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:24 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_FHANDLE
Since commit f76be61755c5 ("Make CONFIG_FHANDLE default y") we no
longer need to set CONFIG_FHANDLE in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:23 +0000 (15:37 +1000)]
powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEV
In commit 73d8ef76006b ("Input: mousedev - stop offering PS/2 to userspace by
default") (Jan 2017), CONFIG_INPUT_MOUSEDEV was switched from default y to
default n, with the explanation:
Evdev interface has been available for many years and by now everyone
is switched to using it, so let's stop offering /dev/input/mouseN
and /dev/psaux by default.
We had a number of configs which had it enabled, but going by the above
explanation probably don't need it enabled anymore.
So drop the last remnants of it from our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:22 +0000 (15:37 +1000)]
powerpc/configs: Drop unneeded CONFIG_CRYPTO_ANSI_CPRNG
Since commit 401e4238f35c ("crypto: rng - Make DRBG the default RNG") we
no longer need to set CONFIG_CRYPTO_ANSI_CPRNG in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:21 +0000 (15:37 +1000)]
powerpc/configs: Update for symbol movement only
Update defconfigs for symbols that have moved around, without their
value changing.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:24 +0000 (23:56 +1000)]
powerpc/oops: Line up NIP & MSR with other rows
This is purely cosmetic, but does look nicer IMHO:
Before:
  task: c000000001453400 task.stack: c000000001c6c000
  NIP: c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220
  REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000
  CFAR: c0000000000b3488 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0
After:
  task: c000000001453400 task.stack: c000000001c6c000
  NIP:  c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220
  REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81-dirty)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000
  CFAR: c0000000000b34a4 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:23 +0000 (23:56 +1000)]
powerpc/oops: Print CR/XER on same line as MSR
Somehow we missed this when the pr_cont() changes went in. Fix CR/XER
to go on the same line as MSR, as they have historically, eg:
  MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 4804408a XER: 20000000
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:22 +0000 (23:56 +1000)]
powerpc/oops: Use IS_ENABLED() for oops markers
Just because it looks less gross.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:21 +0000 (23:56 +1000)]
powerpc/oops: Print the kernel's endian in the oops
Although the MSR tells you what endian you're in it's possible that
isn't the same endian the kernel was built for, and if that happens
you're usually having a very bad day. So print a marker to make
it 100% clear which endian the kernel was built for.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:20 +0000 (23:56 +1000)]
powerpc/oops: Fix the oops markers to use pr_cont()
When we oops we print a few markers for significant config options
such as PREEMPT, SMP etc. Currently these appear on separate lines
because we're not using pr_cont() properly. Fix it.
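As a rough illustration of the intended result (a Python model, not the kernel code), the markers can be thought of as fragments appended to a single line. The `oops_marker_line()` helper, the `config` dict and the particular marker list are assumptions made for this example.

```python
def oops_marker_line(config):
    """Join the markers for enabled options into a single line.

    `config` is a hypothetical dict of option name -> bool. In the kernel
    each fragment would be emitted with pr_cont() so that it continues the
    current line instead of starting a new one.
    """
    markers = [
        ("PREEMPT", "PREEMPT"),
        ("SMP", "SMP"),
        ("NUMA", "NUMA"),
    ]
    parts = [text for option, text in markers if config.get(option)]
    return " ".join(parts)


print(oops_marker_line({"PREEMPT": False, "SMP": True, "NUMA": True}))
```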
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
LABBE Corentin [Wed, 16 Aug 2017 12:34:44 +0000 (14:34 +0200)]
powerpc/powernv: Fix build error in opal-imc.c when NUMA=n
When building a random powerpc kernel I hit this build error:
arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment
discards « const » qualifier from pointer target type
[-Werror=discarded-qualifiers]
l_cpumask = cpumask_of_node(nid);
^
This happens because when CONFIG_NUMA=n cpumask_of_node() returns a
const pointer.
This patch simply adds const to l_cpumask to fix this issue.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Thu, 1 Jun 2017 05:34:40 +0000 (15:34 +1000)]
Add documentation for the powerpc memtrace debugfs files
CONFIG_PPC_MEMTRACE must be set to use this feature. This can only be
used on powernv platforms.
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
[mpe: Update dates and kernel versions, mention size is in bytes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Thu, 1 Jun 2017 05:34:38 +0000 (15:34 +1000)]
powerpc/powernv: Enable removal of memory for in memory tracing
The hardware trace macro feature requires access to a chunk of real
memory. This patch provides a debugfs interface to do this. By
writing an integer containing the size of memory to be unplugged into
/sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to
remove that much memory from the end of each NUMA node.
This patch also adds additional debugfs files for each node that
allows the tracer to interact with the removed memory, as well as
a trace file that allows userspace to read the generated trace.
Note that this patch does not invoke the hardware trace macro, it
only allows memory to be removed during runtime for the trace macro
to utilise.
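A minimal sketch of driving the enable file from userspace, assuming the size is given in bytes as a plain decimal integer. The `request_trace_memory()` helper is hypothetical; writing to the real path needs root, a mounted debugfs and a powernv kernel with CONFIG_PPC_MEMTRACE, so the example exercises a temporary file instead.

```python
import os
import tempfile

# Path taken from the change log above.
MEMTRACE_ENABLE = "/sys/kernel/debug/powerpc/memtrace/enable"


def request_trace_memory(size_bytes, path=MEMTRACE_ENABLE):
    """Write the amount of memory to remove from each node to the enable file."""
    if size_bytes <= 0:
        raise ValueError("size must be a positive number of bytes")
    with open(path, "w") as f:
        f.write(str(size_bytes))


# Dry run against a temporary file so the sketch can be tried anywhere.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "enable")
    request_trace_memory(1 << 30, path=fake)
    with open(fake) as f:
        print(f.read())
```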
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
[mpe: Minor formatting etc fixups]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Naveen N. Rao [Wed, 14 Jun 2017 15:44:00 +0000 (21:14 +0530)]
powerpc/uprobes: Implement arch_uretprobe_is_alive()
This helper is used to detect if a uprobe'd function has returned
through a setjmp/longjmp, rather than branching to the LR that was
updated previously by us. This fixes a SIGSEGV that gets generated when
programs use setjmp/longjmp with uretprobes.
We use the arm64 model (arch/arm64/kernel/probes/uprobes.c:
arch_uretprobe_is_alive()) for detecting when stack frames have been
removed from under us.
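The idea can be sketched as a stack pointer comparison. This is a Python model loosely following the arm64 helper named above, not the powerpc implementation; the parameter names and the `chain_call` distinction are assumptions for illustration.

```python
def uretprobe_is_alive(saved_sp, current_sp, chain_call=False):
    """Return True if the frame that registered the return probe still exists.

    The stack grows downwards, so a frame recorded at `saved_sp` is gone
    once the current stack pointer has moved above it, as happens after
    longjmp() unwinds past it.
    """
    if chain_call:
        return current_sp <= saved_sp
    return current_sp < saved_sp


print(uretprobe_is_alive(0x1000, 0x0F00))  # frame still below the saved sp
print(uretprobe_is_alive(0x1000, 0x1100))  # unwound past it
```

Return instances that fail this check can be discarded rather than used to redirect a return that will never happen.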
Reference:
https://marc.info/?l=linux-kernel&m=143748610330073
commit 7b868e4802a86 ("uprobes/x86: Reimplement arch_uretprobe_is_alive()")
commit db087ef69a2b1 ("uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more
clever")
Tested with the test program from:
https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.base/bz5274.c;hb=HEAD
And this script:
$ cat test.sh
#!/bin/bash
perf probe -x ./bz5274 -a bz5274_main_return=main%return
perf probe -x ./bz5274 -a bz5274_funca_return=funca%return
perf probe -x ./bz5274 -a bz5274_funcb_return=funcb%return
perf probe -x ./bz5274 -a bz5274_funcc_return=funcc%return
perf probe -x ./bz5274 -a bz5274_funcd_return=funcd%return
perf record -e 'probe_bz5274:*' -aR ./bz5274
Reported-by: Gustavo Luiz Duarte <gduarte@redhat.com>
Reported-by: zsun@redhat.com
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Naveen N. Rao [Thu, 8 Jun 2017 19:16:55 +0000 (00:46 +0530)]
powerpc/kprobes: Don't save/restore DAR/DSISR to/from pt_regs for optprobes
We don't save/restore these across a trap, or with KPROBES_ON_FTRACE.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cédric Le Goater [Tue, 8 Aug 2017 09:02:49 +0000 (11:02 +0200)]
powerpc/xive: Fix the size of the cpumask used in xive_find_target_in_mask()
When called from xive_irq_startup(), the size of the cpumask can be
larger than nr_cpu_ids. This can result in a WARN_ON such as:
WARNING: CPU: 10 PID: 1 at ../arch/powerpc/sysdev/xive/common.c:476 xive_find_target_in_mask+0x110/0x2f0
...
NIP [c00000000008a310] xive_find_target_in_mask+0x110/0x2f0
LR [c00000000008a2e4] xive_find_target_in_mask+0xe4/0x2f0
Call Trace:
xive_find_target_in_mask+0x74/0x2f0 (unreliable)
xive_pick_irq_target.isra.1+0x200/0x230
xive_irq_startup+0x60/0x180
irq_startup+0x70/0xd0
__setup_irq+0x7bc/0x880
request_threaded_irq+0x14c/0x2c0
request_event_sources_irqs+0x100/0x180
__machine_initcall_pseries_init_ras_IRQ+0x104/0x134
do_one_initcall+0x68/0x1d0
kernel_init_freeable+0x290/0x374
kernel_init+0x24/0x170
ret_from_kernel_thread+0x5c/0x74
This happens because we're being called with our affinity mask set to
irq_default_affinity. That in turn was populated using
cpumask_setall(), which sets NR_CPUS worth of bits, not nr_cpu_ids
worth. Finally cpumask_weight() will return > nr_cpu_ids when passed a
mask which has > nr_cpu_ids bits set.
Fix it by limiting the value returned by cpumask_weight().
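The effect of the clamp can be modelled in a few lines. This is a Python sketch with masks as plain integers; `usable_weight()` is a hypothetical name and the NR_CPUS and nr_cpu_ids values are example numbers.

```python
def cpumask_weight(mask):
    """Number of set bits in the mask."""
    return bin(mask).count("1")


def usable_weight(mask, nr_cpu_ids):
    """Weight clamped to the number of possible CPU ids."""
    return min(cpumask_weight(mask), nr_cpu_ids)


NR_CPUS = 2048                # compile-time maximum (example value)
nr_cpu_ids = 16               # CPUs possible on this boot (example value)
setall = (1 << NR_CPUS) - 1   # what cpumask_setall() produces in this model

print(cpumask_weight(setall), usable_weight(setall, nr_cpu_ids))
```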
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[mpe: Add change log details on actual cause]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:07 +0000 (02:39 +1000)]
powerpc/64: Optimise set/clear of CTRL[RUN] (runlatch)
On modern CPUs the CTRL register is read-only except bit 63 which is
the run latch control. This means it can be updated with a mtspr
rather than mfspr/mtspr.
To accommodate older CPUs (Cell at least), where there are other bits
in the register, we still do a read/modify/write on pre 2.06 CPUs.
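The two paths can be modelled as follows. This is a Python sketch, not the kernel code: `ctrl_write_value()` and its parameters are hypothetical, and the register is represented by a plain integer.

```python
CTRL_RUNLATCH = 0x1  # bit 63 in IBM numbering, i.e. the least significant bit


def ctrl_write_value(ctrl, arch_206_or_later, on):
    """Return the value to write to CTRL to set or clear the run latch.

    `ctrl` stands in for the current register contents. On 2.06+ CPUs the
    other bits are read-only, so no read is needed before the write; on
    older CPUs the other bits must be preserved.
    """
    if arch_206_or_later:
        return CTRL_RUNLATCH if on else 0   # single write (mtspr)
    if on:
        return ctrl | CTRL_RUNLATCH         # read/modify/write
    return ctrl & ~CTRL_RUNLATCH


print(hex(ctrl_write_value(0xF0, True, True)))
print(hex(ctrl_write_value(0xF0, False, True)))
```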
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Update change log to mention 2.06 workaround]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:06 +0000 (02:39 +1000)]
powerpc/64s: Remove spurious IRQ reason in IRQ replay
HVI interrupts have always used 0x500, so remove the dead branch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:05 +0000 (02:39 +1000)]
powerpc/64: Remove redundant instruction in interrupt replay
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:04 +0000 (02:39 +1000)]
powerpc/64s: Use the HV handler for external IRQ replay in HV mode on POWER9
POWER9 host external interrupts use the h_virt_irq_common handler, so
use that to replay them rather than using the hardware_interrupt_common
handler. Both call do_IRQ, but using the correct handler reduces
i-cache footprint.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:03 +0000 (02:39 +1000)]
powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replay
This results in smaller code, and fewer branches. This relies on the
fact that both the 0xe80 and 0xa00 handlers call the same upper level
code, namely doorbell_exception().
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:02 +0000 (02:39 +1000)]
powerpc/64: Cleanup __check_irq_replay()
Move the clearing of irq_happened bits into the condition where they
were found to be set. This reduces instruction count slightly, and
reduces stores into irq_happened.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:01 +0000 (02:39 +1000)]
powerpc/64s: masked_interrupt() returns to kernel so avoid restoring r13
Places in the kernel where r13 is not the PACA pointer must have
maskable interrupts disabled, so r13 does not have to be restored when
returning from a soft-masked interrupt. We should never have
interrupts soft disabled when we're in user space.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:00 +0000 (02:39 +1000)]
powerpc/64s: Optimise clearing of MSR_EE in masked_[H]interrupt()
MSR_EE is always enabled in SRR1 for masked interrupts, so we can use
xor to clear it.
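The reasoning is that XOR with a bit that is known to be set gives the same result as masking it out. A small Python check of that identity, with the MSR_EE value and the sample SRR1 contents assumed for the example:

```python
MSR_EE = 0x8000  # external interrupt enable bit (value assumed for the example)


def clear_ee_known_set(srr1):
    """Clear MSR_EE with XOR; only valid when the bit is known to be set."""
    assert srr1 & MSR_EE, "XOR would set the bit if it were clear"
    return srr1 ^ MSR_EE


srr1 = 0x8000000000009033
print(hex(clear_ee_known_set(srr1)))
print(clear_ee_known_set(srr1) == srr1 & ~MSR_EE)
```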
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:38:59 +0000 (02:38 +1000)]
powerpc/64s: Avoid a branch in masked_[H]interrupt()
Interrupts which do not require EE to be cleared can all be tested
with a single bitwise test.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:28:03 +0000 (14:28 +1000)]
powerpc/mm: Make switch_mm_irqs_off() out of line
It's too big to be inline, there is no reason to keep it
that way.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework to incorporate the comment changes via fixes branch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:28:02 +0000 (14:28 +1000)]
powerpc/mm: Optimize detection of thread local mm's
Instead of comparing the whole CPU mask every time, let's
keep a counter of how many bits are set in the mask. Thus
testing for a local mm only requires testing if that counter
is 1 and the current CPU bit is set in the mask.
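A toy model of the counter scheme, in Python. The class and the `active_cpus` field name are illustrative assumptions, and a set stands in for the CPU mask.

```python
class MmContext:
    """Toy model of an mm's CPU mask plus a count of the bits set in it."""

    def __init__(self):
        self.cpumask = set()
        self.active_cpus = 0

    def mark_cpu(self, cpu):
        # Only bump the counter the first time a CPU is added to the mask.
        if cpu not in self.cpumask:
            self.cpumask.add(cpu)
            self.active_cpus += 1

    def is_thread_local(self, current_cpu):
        # One counter compare and one bit test replace a full mask compare.
        return self.active_cpus == 1 and current_cpu in self.cpumask


mm = MmContext()
mm.mark_cpu(3)
print(mm.is_thread_local(3))
mm.mark_cpu(5)
print(mm.is_thread_local(3))
```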
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:28:01 +0000 (14:28 +1000)]
powerpc/mm: Use mm_is_thread_local() instead of open-coding
We open-code testing for the mm being local to the current CPU
in a few places. Use our existing helper instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:27:59 +0000 (14:27 +1000)]
powerpc/mm: Avoid double irq save/restore in activate_mm
It calls switch_mm() which already does the irq save/restore
these days.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:27:58 +0000 (14:27 +1000)]
powerpc/mm: Move pgdir setting into a helper
Makes switch_mm_irqs_off() a bit more readable
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 22 Aug 2017 01:51:37 +0000 (11:51 +1000)]
powerpc/64s: Fix replay interrupt return label name
In __replay_interrupt() we take the address of a local label so we can
return to it later. However the assembler turns the local label into a
symbol with a name like ".L1^B42" - where "^B" is literally "\002".
This does not make for pleasant stack traces. Fix it by giving the
label a sensible name.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rob Herring [Mon, 21 Aug 2017 15:16:49 +0000 (10:16 -0500)]
powerpc: pseries: remove dlpar_attach_node dependency on full path
In preparation to stop storing the full node path in full_name, remove the
dependency on full_name from dlpar_attach_node(). Callers of
dlpar_attach_node() already have the parent device_node, so just pass the
parent node into dlpar_attach_node instead of the path. This avoids doing
a lookup of the parent node by the path.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rob Herring [Mon, 21 Aug 2017 15:16:47 +0000 (10:16 -0500)]
powerpc: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: Scott Wood <oss@buserror.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 22 Aug 2017 05:14:50 +0000 (15:14 +1000)]
powerpc/vio: Use device_type to detect family
Currently in the vio.c code we use a comparison against the parent
device node's full path to decide if the device is a PFO or VIO family
device.
Both the ibm,platform-facilities and vdevice nodes are defined by PAPR,
and must have a matching device_type. So instead of using the path we
can instead compare the device_type.
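In outline the new check looks like the sketch below. It is a Python model only: the `vio_family()` helper and its return strings are hypothetical, and it assumes the device_type strings are identical to the node names given above.

```python
def vio_family(parent_device_type):
    """Pick the device family from the parent node's device_type."""
    if parent_device_type == "ibm,platform-facilities":
        return "PFO"
    if parent_device_type == "vdevice":
        return "VIO"
    return None  # unknown parent: not a device this code handles


print(vio_family("vdevice"))
print(vio_family("ibm,platform-facilities"))
```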
I've checked Qemu and kvmtool both do this correctly, and all the
PowerVM systems I have access to do also. So it seems to be safe.
This removes the dependency on full_name, which is being removed
upstream.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 12:20:10 +0000 (22:20 +1000)]
Merge branch 'fixes' into next
There's a non-trivial dependency between some commits we want to put in
next and the KVM prefetch work around that went into fixes. So merge
fixes into next.
Arvind Yadav [Mon, 19 Jun 2017 05:44:25 +0000 (11:14 +0530)]
macintosh/rack-meter: Make of_device_ids const
of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.
File size before:
   text    data     bss     dec     hex filename
    407     576       0     983     3d7 drivers/macintosh/rack-meter.o
File size after constify rackmeter_match.
   text    data     bss     dec     hex filename
    807     176       0     983     3d7 drivers/macintosh/rack-meter.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 24 Jul 2017 04:28:00 +0000 (14:28 +1000)]
powerpc/mm: Ensure cpumask update is ordered
There is no guarantee that the various isync's involved with
the context switch will order the update of the CPU mask with
the first TLB entry for the new context being loaded by the HW.
Be safe here and add a memory barrier to order any subsequent
load/store which may bring entries into the TLB.
The corresponding barrier on the other side already exists as
pte updates use pte_xchg() which uses __cmpxchg_u64 which has
a sync after the atomic operation.
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add comments in the code]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Thu, 27 Jul 2017 06:24:55 +0000 (11:54 +0530)]
powerpc/mm/cxl: Add the fault handling cpu to mm cpumask
We use mm cpumask for serializing against lockless page table walk.
Anybody who is doing a lockless page table walk is expected to disable
irq and only cpus in mm cpumask are expected to do the lockless walk. This
ensures that a THP split can send IPI to only cpus in the mm cpumask,
to make sure there are no parallel lockless page table walks.
Add the CAPI fault handling cpu to the mm cpumask so that we can do
the lockless page table walk while inserting hash page table entries.
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Thu, 27 Jul 2017 06:24:54 +0000 (11:54 +0530)]
powerpc/mm: Don't send IPI to all cpus on THP updates
Now that we made sure that lockless walk of linux page table is mostly
limited to current task(current->mm->pgdir) we can update the THP
update sequence to only send IPI to CPUs on which this task has run.
This helps in reducing the IPI overload on systems with large number
of CPUs.
WRT kvm even though kvm is walking page table with vcpu->arch.pgdir,
it is done only on secondary CPUs and in that case we have primary CPU
added to task's mm cpumask. Sending an IPI to primary will force the
secondary to do a vm exit and hence this mm cpumask usage is safe
here.
WRT CAPI, we still end up walking linux page table with capi context
MM. For now the pte lookup serialization sends an IPI to all CPUs if
CAPI is in use. We can further improve this by adding the CAPI
interrupt handling CPU to task mm cpumask. That will be done in a
later patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Thu, 17 Aug 2017 13:14:17 +0000 (23:14 +1000)]
Merge branch 'topic/ppc-kvm' into next
Bring in the commit to rename find_linux_pte_or_hugepte() which touches
arch and KVM code, and might need to be merged with the kvmppc tree to
avoid conflicts.
Aneesh Kumar K.V [Thu, 27 Jul 2017 06:24:53 +0000 (11:54 +0530)]
powerpc/mm: Rename find_linux_pte_or_hugepte()
Add newer helpers to make the function usage simpler. It is always
recommended to use find_current_mm_pte() for walking the page table.
If we cannot use find_current_mm_pte(), it should be documented why
the said usage of __find_linux_pte() is safe against a parallel THP
split.
For now we have KVM code using __find_linux_pte(). This is because kvm
code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
with PACA soft_enabled = 1. We may want to fix that later and make
sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
we can switch kvm to use find_linux_pte().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Naveen N. Rao [Mon, 27 Mar 2017 19:37:41 +0000 (01:07 +0530)]
powerpc/bpf: Use memset32() to pre-fill traps in BPF page(s)
Use the newly introduced memset32() to pre-fill BPF page(s) with trap
instructions.
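What the pre-fill amounts to can be shown with a small Python model. The trap encoding, the little-endian byte order and the `fill_with_traps()` helper are assumptions for the example; the kernel does this with memset32() on the BPF image.

```python
import struct

TRAP_INSN = 0x7FE00008  # assumed encoding of the trap instruction


def fill_with_traps(size_bytes, insn=TRAP_INSN):
    """Return a buffer of `size_bytes` pre-filled with 32-bit trap words."""
    if size_bytes % 4:
        raise ValueError("size must be a multiple of 4 bytes")
    word = struct.pack("<I", insn)   # little-endian for the example
    return word * (size_bytes // 4)


page = fill_with_traps(4096)
print(len(page), hex(struct.unpack_from("<I", page, 0)[0]))
```

Any jump into a part of the page that was never overwritten with generated code then hits a trap instead of executing stale data.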
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Naveen N. Rao [Mon, 27 Mar 2017 19:37:40 +0000 (01:07 +0530)]
powerpc/string: Implement optimized memset variants
Based on Matthew Wilcox's patches for other architectures.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Geoff Levand [Mon, 7 Aug 2017 20:09:20 +0000 (20:09 +0000)]
block/ps3vram: Check return of ps3vram_cache_init
Cc: Markus Elfring <elfring@users.sourceforge.net>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Markus Elfring [Mon, 7 Aug 2017 20:09:20 +0000 (20:09 +0000)]
block/ps3vram: Delete an error message for a failed memory allocation in ps3vram_cache_init()
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Sam Bobroff [Thu, 17 Aug 2017 01:06:47 +0000 (11:06 +1000)]
selftests/powerpc: Improve tm-resched-dscr
The tm-resched-dscr self test can, in some situations, run for
several minutes before being successfully interrupted by the context
switch it needs in order to perform the test. This often seems to
occur when the test is being run in a virtual machine.
Improve the test by running it under eat_cpu() to guarantee
contention for the CPU and increase the chance of a context switch.
In practice this seems to reduce the test time, in some cases, from
more than two minutes to under a second.
Also remove the "progress dots" so that if the test does run for a
long time, it doesn't produce large amounts of unnecessary output.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Madhavan Srinivasan [Wed, 16 Aug 2017 16:21:34 +0000 (21:51 +0530)]
powerpc/perf: Fix usage of nest_imc_refc
nest_imc_refc is a reference count struct, used to track number of
active perf sessions using the nest units.
Currently the code accesses nest_imc_refc using node_id, which is
incorrect: the array is indexed by node number. This means that in the
case of sparse node ids we index off the end of the array.
Fix it to use get_nest_pmu_ref() which uses the existing per-cpu
variable local_nest_imc_refc.
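The problem and the fix can be modelled as below. This is a Python sketch with made-up node ids and CPU layout; `NestRef`, `nest_refs` and `local_nest_ref` are stand-ins for the kernel structures, not their real definitions.

```python
class NestRef:
    """Stand-in for one nest_imc_refc entry."""

    def __init__(self):
        self.refc = 0


node_ids = [0, 8]                          # sparse node ids (example values)
nest_refs = [NestRef() for _ in node_ids]  # one entry per node, densely packed
cpu_to_node = {0: 0, 1: 0, 2: 8, 3: 8}     # hypothetical CPU layout

# Set up once: each CPU remembers the entry that belongs to its node.
local_nest_ref = {
    cpu: nest_refs[node_ids.index(node)] for cpu, node in cpu_to_node.items()
}


def get_nest_pmu_ref(cpu):
    """Look up the entry through the per-CPU pointer, never by node id."""
    return local_nest_ref[cpu]


get_nest_pmu_ref(2).refc += 1
print(nest_refs[1].refc)
```

Indexing `nest_refs` directly with node id 8 would run past the two-entry array, which is the failure described above.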
Fixes: 885dcd709ba91 ('powerpc/perf: Add nest IMC PMU support')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Tweak change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Bhumika Goyal [Wed, 2 Aug 2017 18:07:38 +0000 (23:37 +0530)]
powerpc: Add const to bin_attribute structures
Declare bin_attribute structures as const as they are only passed as an
argument to the function sysfs_create_bin_file. This argument is of
type const, so declare the structure as const.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Wed, 16 Aug 2017 06:01:18 +0000 (16:01 +1000)]
powerpc: Remove more redundant VSX save/tests
__giveup_vsx/save_vsx are completely equivalent to testing MSR_FP
and MSR_VEC and calling the corresponding giveup/save function, so
just remove the spurious VSX cases. Also add WARN_ONs checking that
we never have VSX enabled without the other two.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Wed, 16 Aug 2017 06:01:17 +0000 (16:01 +1000)]
powerpc: Remove redundant clear of MSR_VSX in __giveup_vsx()
__giveup_fpu() already does it and we cannot have MSR_VSX set
without having MSR_FP also set.
This also adds a warning to check we indeed do
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Wed, 16 Aug 2017 06:01:16 +0000 (16:01 +1000)]
powerpc: Remove redundant FP/Altivec giveup code
__giveup_vsx() already calls those two functions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Wed, 16 Aug 2017 06:01:15 +0000 (16:01 +1000)]
powerpc: Fix missing newline before {
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dou Liyang [Wed, 26 Jul 2017 13:34:30 +0000 (21:34 +0800)]
powerpc/topology: Remove the unused parent_node() macro
Commit a7be6e5a7f8d ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node(),
so the parent_node() macro on the POWERPC platform is now unnecessary.
Remove it as a cleanup.
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Wed, 16 Aug 2017 06:01:14 +0000 (16:01 +1000)]
powerpc: Fix VSX enabling/flushing to also test MSR_FP and MSR_VEC
VSX uses a combination of the old vector registers, the old FP
registers and new "second halves" of the FP registers.
Thus when we need to see the VSX state in the thread struct
(flush_vsx_to_thread()) or when we'll use the VSX in the kernel
(enable_kernel_vsx()) we need to ensure they are all flushed into
the thread struct if either of them is individually enabled.
Unfortunately we only tested if the whole VSX was enabled, not if they
were individually enabled.
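The difference between the old and the new test can be sketched as follows. This is only an illustration: the SK_ bit values and both function names are made up for the example and are not the real MSR bit positions or kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only (assumption); not the real MSR layout. */
#define SK_MSR_FP  0x1u
#define SK_MSR_VEC 0x2u
#define SK_MSR_VSX 0x4u

static_assert((SK_MSR_FP & SK_MSR_VEC) == 0 && (SK_MSR_FP & SK_MSR_VSX) == 0,
              "sketch bits must be distinct");

/* Old behaviour: only flush when the whole VSX facility is enabled. */
static bool old_needs_flush(unsigned int msr)
{
    return (msr & SK_MSR_VSX) != 0;
}

/* Fixed behaviour: flush when FP, VEC or VSX is individually enabled. */
static bool new_needs_flush(unsigned int msr)
{
    return (msr & (SK_MSR_VSX | SK_MSR_FP | SK_MSR_VEC)) != 0;
}
```

With only the FP bit set, the old test reports nothing to flush while the new one does, which is the case the fix is about.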
Fixes: 72cd7b44bc99 ("powerpc: Uncomment and make enable_kernel_vsx() routine available")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Fri, 28 Jul 2017 05:01:27 +0000 (10:31 +0530)]
powerpc/mm/hugetlb: Allow runtime allocation of 16G.
Now that we have GIGANTIC_PAGE enabled on powerpc, use this for 16G hugepages
with hash translation mode. Depending on the total system memory we have, we may
be able to allocate 16G hugepages at runtime. This also removes the hugetlb setup
difference between hash/radix translation mode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Fri, 28 Jul 2017 05:01:26 +0000 (10:31 +0530)]
powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line
With commit aa888a74977a8 ("hugetlb: support larger than MAX_ORDER") we added
support for allocating gigantic hugepages via kernel command line. Switch
ppc64 arch specific code to use that.
W.r.t FSL support, we now limit our allocation range using BOOTMEM_ALLOC_ACCESSIBLE.
We use the kernel command line to do reservation of hugetlb pages on powernv
platforms. On pseries hash mmu mode the supported gigantic huge page size is
16GB and that can only be allocated with hypervisor assist. For pseries the
command line option doesn't do the allocation. Instead pseries does gigantic
hugepage allocation based on hypervisor hint that is specified via
"ibm,expected#pages" property of the memory node.
Cc: Scott Wood <oss@buserror.net>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Aneesh Kumar K.V [Fri, 28 Jul 2017 05:01:25 +0000 (10:31 +0530)]
mm/hugetlb: Allow arch to override and call the weak function
When running in guest mode ppc64 supports a different mechanism for hugetlb
allocation/reservation. The LPAR management application called HMC can
be used to reserve a set of hugepages and we pass the details of
reserved pages via device tree to the guest (more details in
htab_dt_scan_hugepage_blocks()). We do the memblock_reserve of the range
and later in the boot sequence, we add the reserved range to huge_boot_pages.
But to enable 16G hugetlb on baremetal config (when we are not running as guest)
we want to do memblock reservation during boot. Generic code already does this
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 15:03:42 +0000 (17:03 +0200)]
powerpc/hugetlb: fix page rights verification in gup_hugepte()
gup_hugepte() checks if pages are present and readable, and
when 'write' is set, also checks if the pages are writable.
Initially this was done by checking if _PAGE_PRESENT and
_PAGE_READ were set. In addition, _PAGE_WRITE was verified for write
accesses.
The problem is that we have to handle the three following cases:
1/ The target defines _PAGE_READ and _PAGE_WRITE
2/ The target defines _PAGE_RW
3/ The target defines _PAGE_RO
In case 1/, this is obvious.
In case 2/, _PAGE_READ is defined as 0 and _PAGE_WRITE as _PAGE_RW
so it works as well.
But in case 3/, _PAGE_RW is defined as 0, which means _PAGE_WRITE is 0
and then the test returns true (page writable) in all cases.
A first correction was attempted in commit 6b8cb66a6a7cc ("powerpc: Fix
usage of _PAGE_RO in hugepage"), but that fix is wrong:
instead of checking that the page is writable when write is requested,
it checks that the page is NOT writable when write is NOT requested.
This patch adds a new pte_read() helper to check whether a page is
readable or not. This avoids handling all possible cases in
gup_hugepte().
Then gup_hugepte() is modified to use pte_present(), pte_read()
and pte_write() instead of the raw flags.
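Case 3/ can be illustrated with a small standalone sketch. The SK_ flag values and the function names below are assumptions made for the example; they are not the kernel definitions, and the read check is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical target that only provides a read-only bit (case 3/). */
#define SK_PAGE_PRESENT 0x1u
#define SK_PAGE_RO      0x2u
#define SK_PAGE_RW      0x0u        /* not provided by the target, so 0 */
#define SK_PAGE_WRITE   SK_PAGE_RW

static_assert(SK_PAGE_WRITE == 0, "case 3/: the write bit collapses to 0");

/* Raw-flag check: with a write bit of 0 the mask test always passes. */
static bool raw_flags_allow(unsigned int pte, bool write)
{
    unsigned int mask = SK_PAGE_PRESENT;

    if (write)
        mask |= SK_PAGE_WRITE;
    return (pte & mask) == mask;
}

/* Helper that knows how this target encodes writability. */
static bool sk_pte_write(unsigned int pte)
{
    return (pte & SK_PAGE_RO) == 0;
}

/* Helper-based check, in the spirit of using pte_present()/pte_write(). */
static bool helpers_allow(unsigned int pte, bool write)
{
    if ((pte & SK_PAGE_PRESENT) == 0)
        return false;
    if (write && !sk_pte_write(pte))
        return false;
    return true;
}
```

For a present, read-only page the raw-flag check wrongly allows a write, while the helper-based check refuses it.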
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 2 Aug 2017 13:51:09 +0000 (15:51 +0200)]
powerpc/mm: Simplify __set_fixmap()
__set_fixmap() uses __fix_to_virt() then does the boundary checks
by itself. Instead, we can use fix_to_virt() which does the
verification at build time. For this, we need to use it inline
so that GCC can see the real value of idx at build time.
In the meantime, we remove the 'fixmaps' variable.
This variable is set but has never been used from the beginning
(commit 2c419bdeca1d9 ("[POWERPC] Port fixmap from x86 and use
for kmap_atomic"))
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 2 Aug 2017 13:51:07 +0000 (15:51 +0200)]
powerpc/mm: declare some local functions static
get_pteptr() and __mapin_ram_chunk() are only used locally,
so define them static
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 2 Aug 2017 13:51:05 +0000 (15:51 +0200)]
powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32
This patch implements STRICT_KERNEL_RWX on PPC32.
As for CONFIG_DEBUG_PAGEALLOC, it deactivates BAT and LTLB mappings
in order to allow page protection setup at the level of each page.
As BAT/LTLB mappings are deactivated, there might be a performance
impact.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 2 Aug 2017 13:51:03 +0000 (15:51 +0200)]
powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32
As seen below, although the init sections have been freed, the
associated memory area is still marked as executable in the
page tables.
~ dmesg
[ 5.860093] Freeing unused kernel memory: 592K (c0570000 - c0604000)
~ cat /sys/kernel/debug/kernel_page_tables
---[ Start of kernel VM ]---
0xc0000000-0xc0497fff 4704K rw X present dirty accessed shared
0xc0498000-0xc056ffff 864K rw present dirty accessed shared
0xc0570000-0xc059ffff 192K rw X present dirty accessed shared
0xc05a0000-0xc7ffffff 125312K rw present dirty accessed shared
---[ vmalloc() Area ]---
This patch fixes that.
The implementation is done by reusing the change_page_attr()
function implemented for CONFIG_DEBUG_PAGEALLOC
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 2 Aug 2017 13:51:01 +0000 (15:51 +0200)]
powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs
__change_page_attr() uses flush_tlb_page().
flush_tlb_page() uses tlbie instruction, which also invalidates
pinned TLBs, which is not what we expect.
This patch modifies the implementation to use flush_tlb_kernel_range()
instead. This will make use of tlbia which will preserve pinned TLBs.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:57 +0000 (12:08 +0200)]
powerpc/8xx: Reduce DTLB miss handler by one insn
This reduces the DTLB miss handler hot path (user address path)
by one instruction by preserving r10.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:55 +0000 (12:08 +0200)]
powerpc/8xx: mark init functions with __init
setup_initial_memory_limit() is only called during init.
mmu_patch_cmp_limit() is only called from 8xx_mmu.c
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:53 +0000 (12:08 +0200)]
powerpc/8xx: Do not allow Pinned TLBs with STRICT_KERNEL_RWX or DEBUG_PAGEALLOC
Pinning TLBs bypasses STRICT_KERNEL_RWX or DEBUG_PAGEALLOC protections
so it should only be allowed when those are not selected
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:51 +0000 (12:08 +0200)]
powerpc/8xx: Make pinning of ITLBs optional
As stated in a comment in head_8xx.S, today we "Always pin the first
8 MB ITLB to prevent ITLB misses while mucking around with SRR0/SRR1
in asm".
This issue has just been cleared by the preceding patch, therefore
we can make this pinning optional (on by default) and independent
of DATA pinning.
This patch also makes pinning of IMMR independent of pinning of DATA.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:49 +0000 (12:08 +0200)]
powerpc/32: Avoid risk of unrecoverable TLBmiss inside entry_32.S
By default, the 8xx pins an ITLB on the first 8M of memory in order
to avoid any ITLB miss on kernel code.
However, with some debug functions like DEBUG_PAGEALLOC and
DEBUG_RODATA, pinning TLBs is contradictory.
In order to avoid any ITLB miss in a critical section without pinning
TLBs, we have to ensure that there is no page boundary crossed between
the setup of a new value in SRR0/SRR1 and the associated RFI.
The functions modifying srr0/srr1 are all located in entry_32.S.
They are spread over almost 4kbytes.
The patch forces a 12 bits (4kbytes) alignment for those
functions. This guarantees that the functions remain in a
single 4k page.
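The reasoning behind the alignment can be checked with a small sketch: a region that starts on a 4k boundary and is no longer than 4k cannot cross a page boundary. The helper name, the SK_PAGE_SHIFT macro and the addresses used with it are illustrative assumptions, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12u   /* 12 bits, i.e. 4 kbyte pages */

/* True if [start, start + len) lies entirely within one 4k page. */
static bool fits_in_one_page(uint32_t start, uint32_t len)
{
    assert(len > 0);
    return (start >> SK_PAGE_SHIFT) == ((start + len - 1) >> SK_PAGE_SHIFT);
}
```

A 4k-aligned start with a length of up to 4096 bytes always passes this check, whereas an unaligned start may not.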
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:47 +0000 (12:08 +0200)]
powerpc/8xx: Remove macro that checks kernel address
The macro to check if an address is a kernel address or not is
not used anymore in DTLBmiss handler. It is used in ITLB miss handler
and in DTLB error handler. DTLB error handler is not a hot path, it
doesn't need such optimisation.
In order to simplify a following patch which will rework ITLB miss
handler, we remove the macros and reintroduce them inside the handler.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Wed, 12 Jul 2017 10:08:45 +0000 (12:08 +0200)]
powerpc/8xx: Ensures RAM mapped with LTLB is seen as block mapped on 8xx.
On the 8xx, the RAM mapped with LTLBs must be seen as block mapped,
just like areas mapped with BATs on standard PPC32.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Julia Lawall [Sun, 13 Aug 2017 13:24:23 +0000 (15:24 +0200)]
powerpc/chrp: Store the intended structure
Normally the values in the resource field and the argument to ARRAY_SIZE
in the num_resources are the same. In this case, the value in the resource
field is the same as the one in the previous platform_device structure, and
appears to be a copy-paste error. Replace the value in the resource field
with the argument to the local call to ARRAY_SIZE.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Andreas Schwab [Mon, 14 Aug 2017 18:42:43 +0000 (20:42 +0200)]
powerpc/l2cr_6xx: Fix invalid use of register expressions
This fixes another invalid use of register expressions.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 8 Aug 2017 07:06:32 +0000 (17:06 +1000)]
powerpc/iommu: Avoid undefined right shift in iommu_range_alloc()
In iommu_range_alloc() we generate a mask by right shifting ~0,
however if the specified alignment is 0 then we right shift by 64,
which is undefined. UBSAN tells us so:
UBSAN: Undefined behaviour in ../arch/powerpc/kernel/iommu.c:193:35
shift exponent 64 is too large for 64-bit type 'long unsigned int'
We can avoid it by instead generating the mask with:
align_mask = (1ull << align_order) - 1;
That will also generate an undefined shift if align_order is 64 or
greater, but that shouldn't be a problem for a while.
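A minimal standalone sketch of the mask computation, with the remaining limitation made explicit. The function name is hypothetical and this is not the code in iommu.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a mask with the low align_order bits set. */
static uint64_t align_mask_from_order(unsigned int align_order)
{
    /* Shifting a 64-bit value by 64 or more is undefined in C. */
    assert(align_order < 64);
    return (UINT64_C(1) << align_order) - 1;
}
```

An align_order of 0 now yields a mask of 0 through a well-defined shift by 0, instead of a right shift by 64.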
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Anju T [Mon, 14 Aug 2017 11:42:23 +0000 (17:12 +0530)]
powerpc/perf/imc: Fix nest events on multi socket system
In a multi node system with discontiguous node ids, nest event values
are not showing up properly. eg. lscpu output:
NUMA node0 CPU(s): 0-15
NUMA node8 CPU(s): 16-31
Nest event values on such systems can be counted on CPUs <= 15:
$./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 0-14 -I 1000 sleep 1000
# time counts unit events
1.000294577 30,17,24,42,880 nest_powerbus0_imc/PM_PB_CYC/
But not on CPUs >= 16:
$./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000
# time counts unit events
1.000049902 <not supported> nest_powerbus0_imc/PM_PB_CYC/
This is because, when fetching the reference count, the node id (which
may be sparse) is used as the array index, not the node number (which
is 0 based and contiguous).
Fix it by using the node number as the array index.
$./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000
# time counts unit events
1.000241961 26,12,35,28,704 nest_powerbus0_imc/PM_PB_CYC/
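The indexing problem can be sketched with a two node system like the one in the lscpu output. The table, the array and the function names are hypothetical and only illustrate the id to number translation; they are not the kernel data structures.

```c
#include <assert.h>

#define SK_NR_NODES 2   /* assumption: two nodes, with ids 0 and 8 */

/* Hypothetical table: dense node number -> sparse node id. */
static const int sk_node_ids[SK_NR_NODES] = { 0, 8 };

/* One reference count per node, indexed by node number. */
static int sk_refcount[SK_NR_NODES];

/* Return the dense node number for a node id, or -1 if unknown. */
static int node_number_of(int node_id)
{
    for (int i = 0; i < SK_NR_NODES; i++)
        if (sk_node_ids[i] == node_id)
            return i;
    return -1;
}

/* Look up the reference count via the node number, never the raw id. */
static int *refc_for_node_id(int node_id)
{
    int n = node_number_of(node_id);

    assert(n >= 0);
    return &sk_refcount[n];
}
```

Indexing sk_refcount directly with node id 8 would read past the end of the two entry array, whereas the translated node number 1 stays in bounds.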
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
[mpe: Change log tweaks for clarity and brevity]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 15 Aug 2017 10:02:56 +0000 (20:02 +1000)]
powerpc/mm/nohash: Move definition of PGALLOC_GFP to fix build errors
In some obscure Book3E configs (randconfig) we can end up missing a
definition for PGALLOC_GFP in pgtable_64.c.
Fix it by moving the definition to asm/pgalloc.h.
Fixes: de3b87611dd1 ("powerpc/mm/book(e)(3s)/64: Add page table accounting")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Naveen N. Rao [Wed, 2 Aug 2017 18:25:38 +0000 (23:55 +0530)]
powerpc/xmon: Exclude all of xmon from ftrace
Exclude core xmon files from ftrace (along with an xmon xive helper
outside of xmon/) to minimize impact of ftrace while within xmon.
Before:
/sys/kernel/debug/tracing# grep -ci xmon available_filter_functions
26
After:
/sys/kernel/debug/tracing# grep -ci xmon available_filter_functions
0
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Use $(subst ..) on KBUILD_CFLAGS rather than CFLAGS_REMOVE_xxx]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Breno Leitao [Wed, 2 Aug 2017 20:14:06 +0000 (17:14 -0300)]
powerpc/xmon: Disable tracing when entering xmon
If tracing is enabled and you get into xmon, the tracing buffer
continues to be updated, causing possible loss of data and unnecessary
tracing information coming from xmon functions.
This patch simply disables tracing when entering xmon, and re-enables it
if the kernel is resumed (with 'x').
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Breno Leitao [Wed, 2 Aug 2017 20:14:05 +0000 (17:14 -0300)]
powerpc/xmon: Dump ftrace buffers for the current CPU only
The current xmon 'dt' command dumps the tracing buffer for all the CPUs,
which makes it very hard to read because most powerpc machines
currently have many CPUs. In addition, the CPU
lines are interleaved in the ftrace log.
This new option just dumps the ftrace buffer for the current CPU.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Bhumika Goyal [Fri, 11 Aug 2017 17:38:45 +0000 (23:08 +0530)]
drivers/macintosh: Make wf_control_ops and wf_pid_param const
Make wf_control_ops const as they are only stored in the ops field of a
wf_control structure, which is const.
Make wf_pid_param const as they are only used during a copy operation.
Done using Coccinelle.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dan Carpenter [Fri, 11 Aug 2017 20:05:41 +0000 (23:05 +0300)]
powerpc/perf: Fix double unlock in imc_common_cpuhp_mem_free()
This function is not called with the nest_init_lock held, and it also
unlocks the nest_init_lock immediately below, so it's fairly clear
that this is a typo and should be locking the lock.
Fixes: 885dcd709ba9 ("powerpc/perf: Add nest IMC PMU support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Christophe Leroy [Mon, 14 Aug 2017 07:14:19 +0000 (09:14 +0200)]
powerpc/8xx: Fix two CONFIG_8xx left behind
Commit 968159c0031ac ("powerpc/8xx: Getting rid of remaining use of
CONFIG_8xx") removed all but 2 references to 8xx in Kconfigs.
This patch removes the two remaining ones.
Fixes: 968159c0031a ("powerpc/8xx: Getting rid of remaining use of CONFIG_8xx")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 8 Aug 2017 11:44:14 +0000 (21:44 +1000)]
powerpc/xive: Fix section mismatch warnings
Both xive_core_init() and xive_native_init() are called from and call
__init routines, so they should also be __init.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 8 Aug 2017 11:44:08 +0000 (21:44 +1000)]
powerpc/mm: Fix section mismatch warning in early_check_vec5()
early_check_vec5() is called from and calls __init routines, so should
also be __init.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>