Peter Zijlstra [Thu, 26 Mar 2015 16:45:37 +0000 (17:45 +0100)]
locking: Remove atomicity checks from {READ,WRITE}_ONCE
The fact that volatile allows for atomic load/stores is a special case
not a requirement for {READ,WRITE}_ONCE(). Their primary purpose is to
force the compiler to emit load/stores _once_.
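A minimal sketch of the idea, assuming scalar types and a GCC/Clang compiler. The macro names READ_ONCE_SKETCH and WRITE_ONCE_SKETCH are hypothetical and are not the kernel's definitions; the volatile cast only asks the compiler to emit a single access and promises nothing about atomicity:

```c
/* Illustrative only: simplified, scalar-only stand-ins for the kernel macros. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int shared_flag;

/* The store is emitted exactly once and is not merged or elided. */
static void publish_flag(int v)
{
    WRITE_ONCE_SKETCH(shared_flag, v);
}

/* The load is emitted exactly once and is not cached across calls. */
static int read_flag(void)
{
    return READ_ONCE_SKETCH(shared_flag);
}
```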
Change-Id: I86e34ae44535576859d66411208e7c8a13c0ec3a
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Git-commit: 7bd3e239d6c6d1cad276e8f130b386df4234dcd7
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Andrew Morton [Wed, 3 Jul 2013 22:02:11 +0000 (15:02 -0700)]
UPSTREAM: include/linux/mm.h: add PAGE_ALIGNED() helper
To test whether an address is aligned to PAGE_SIZE.
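A userspace sketch of such a helper, assuming a 4 KiB page size. SKETCH_PAGE_SIZE and page_aligned_sketch() are hypothetical names; the kernel helper is a macro built on its own PAGE_SIZE:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* An address is page aligned when its low bits below the page size are zero. */
static bool page_aligned_sketch(uintptr_t addr)
{
    return (addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```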
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 0fa73b86ef0797ca4fde5334117ca0b330f08030)
Bug: 36007193
Change-Id: I7e912bb0dbd8c9737fb13c5b48acb54ee39dd5fc
Al Viro [Sat, 9 May 2015 02:53:15 +0000 (22:53 -0400)]
path_openat(): fix double fput()
[ Upstream commit f15133df088ecadd141ea1907f2c96df67c729f0 ]
path_openat() jumps to the wrong place after do_tmpfile() - it has
already done path_cleanup() (as part of path_lookupat() called by
do_tmpfile()), so doing that again can lead to double fput().
Change-Id: Ia74c130ae5e379b512532c0feebea871b5f73668
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Al Viro [Tue, 11 Jun 2013 04:23:01 +0000 (08:23 +0400)]
allow build_open_flags() to return an error
Change-Id: I6e900fc3facf5a3febefe138fea7db493bc383d8
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 6 Jun 2013 13:12:33 +0000 (09:12 -0400)]
do_last(): fix missing checks for LAST_BIND case
/proc/self/cwd with O_CREAT should fail with EISDIR. /proc/self/exe, OTOH,
should fail with ENOTDIR when opened with O_DIRECTORY.
Change-Id: I01c85a6a3894c6854c604f192f221175edc19867
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ethan Chen [Tue, 4 Sep 2018 02:11:39 +0000 (19:11 -0700)]
uapi: Define __BITS_PER_LONG based on compiler target
* We may compile 32-bit ARM code against these kernel headers in many
situations, so provide a compiler-defined method of obtaining the width
of long.
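One way this could look, assuming a GCC or Clang toolchain, which predefine __CHAR_BIT__ and __SIZEOF_LONG__. SKETCH_BITS_PER_LONG is a hypothetical name chosen to avoid clashing with the real header, and the fallback of 32 is an assumption:

```c
#include <limits.h>

/* Derive the width of long from compiler-provided macros when available. */
#if defined(__CHAR_BIT__) && defined(__SIZEOF_LONG__)
#define SKETCH_BITS_PER_LONG (__CHAR_BIT__ * __SIZEOF_LONG__)
#else
#define SKETCH_BITS_PER_LONG 32 /* assumed fallback for other compilers */
#endif

static int bits_per_long_sketch(void)
{
    return SKETCH_BITS_PER_LONG;
}
```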
Change-Id: Iac5e48200d70f1258ab3caca1a8f1eb6e8f7f2d3
Eric Rannaud [Thu, 30 Oct 2014 08:51:01 +0000 (01:51 -0700)]
fs: allow open(dir, O_TMPFILE|..., 0) with mode 0
The man page for open(2) indicates that when O_CREAT is specified, the
'mode' argument applies only to future accesses to the file:
Note that this mode applies only to future accesses of the newly
created file; the open() call that creates a read-only file
may well return a read/write file descriptor.
The man page for open(2) implies that 'mode' is treated identically by
O_CREAT and O_TMPFILE.
O_TMPFILE, however, behaves differently:
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
assert(fd == -1);
assert(errno == EACCES);
int fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
assert(fd > 0);
For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:
    if (*opened & FILE_CREATED) {
            /* Don't check for write permission, don't truncate */
            open_flag &= ~O_TRUNC;
            will_truncate = false;
            acc_mode = MAY_OPEN;
            path_to_nameidata(path, nd);
            goto finish_open_created;
    }
But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
may_open().
This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
inode is created, may_open() is called with acc_mode = MAY_OPEN, in
do_tmpfile().
A different, but related glibc bug revealed the discrepancy:
https://sourceware.org/bugzilla/show_bug.cgi?id=17523
The glibc lazily loads the 'mode' argument of open() and openat() using
va_arg() only if O_CREAT is present in 'flags' (to support both the 2
argument and the 3 argument forms of open; same idea for openat()).
However, the glibc ignores the 'mode' argument if O_TMPFILE is in
'flags'.
On x86_64, for open(), it magically works anyway, as 'mode' is in
RDX when entering open(), and is still in RDX on SYSCALL, which is where
the kernel looks for the 3rd argument of a syscall.
But openat() is not quite so lucky: 'mode' is in RCX when entering the
glibc wrapper for openat(), while the kernel looks for the 4th argument
of a syscall in R10. Indeed, the syscall calling convention differs from
the regular calling convention in this respect on x86_64. So the kernel
sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
fails with EACCES.
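The glibc side of the problem can be sketched as a predicate deciding when a wrapper must fetch 'mode' with va_arg(). This is an illustration, not glibc's code: the SK_ constants are hypothetical names whose octal values mirror asm-generic/fcntl.h, and open_needs_mode_sketch() is a made-up helper.

```c
#include <stdbool.h>

/* Assumed values, copied from the generic Linux flag layout. */
#define SK_O_CREAT     00000100
#define SK_O_DIRECTORY 00200000
#define SK__O_TMPFILE  020000000
#define SK_O_TMPFILE   (SK__O_TMPFILE | SK_O_DIRECTORY)

/* 'mode' is meaningful for O_CREAT and for the full O_TMPFILE combination. */
static bool open_needs_mode_sketch(int flags)
{
    return (flags & SK_O_CREAT) != 0 ||
           (flags & SK_O_TMPFILE) == SK_O_TMPFILE;
}
```

A wrapper that tests only O_CREAT would skip va_arg() for O_TMPFILE and pass whatever happens to be in the argument register.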
Change-Id: If41388897576291d3cfb0515fd71efb70bf2b3df
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Fri, 2 Aug 2013 04:07:52 +0000 (21:07 -0700)]
fs: Fix file mode for O_TMPFILE
O_TMPFILE, like O_CREAT, should respect the requested mode and should
create regular files.
This fixes two bugs: O_TMPFILE required privilege (because the mode
ended up as 000) and it produced bogus inodes with no type.
Change-Id: I4d045c5b3a07e3d3114897c5f3d2448ab6c3a0a5
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Zheng Liu [Thu, 25 Jul 2013 00:13:19 +0000 (08:13 +0800)]
vfs: add missing check for __O_TMPFILE in fcntl_init()
As the comment in include/uapi/asm-generic/fcntl.h describes, when
introducing new O_* bits, we need to check their uniqueness in
fcntl_init(). The __O_TMPFILE bit is missing from that check, so add it.
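The uniqueness requirement can be illustrated with a runtime check. This is a stand-in under stated assumptions: the kernel performs its check at build time, and flag_bits_unique_sketch() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when no two entries in flags[] share a bit. */
static bool flag_bits_unique_sketch(const unsigned int *flags, size_t n)
{
    unsigned int seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (seen & flags[i])
            return false; /* this bit was already claimed by an earlier flag */
        seen |= flags[i];
    }
    return true;
}
```

Composite flags would fail such a check by construction, which is why the raw __O_TMPFILE bit is the one to test rather than O_TMPFILE.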
Change-Id: I372d41d2bc15b007595eaf763e200c4c06de177f
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 19 Jul 2013 23:11:32 +0000 (03:11 +0400)]
allow O_TMPFILE to work with O_WRONLY
Change-Id: If75a4f1b8f1ba485f6073be4058b59126cef034b
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 13 Jul 2013 09:26:37 +0000 (13:26 +0400)]
Safer ABI for O_TMPFILE
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...
Change-Id: I4818563d79ca1abf9ea99f5ccea9317eb2f3b678
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 7 Jun 2013 05:20:27 +0000 (01:20 -0400)]
it's still short a few helpers, but infrastructure should be OK now...
Change-Id: I9e003fabb858fd901fd922cd891ca29966ccdf3a
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Lorenzo Colitti [Thu, 3 Nov 2016 17:23:42 +0000 (02:23 +0900)]
net: core: add UID to flows, rules, and routes
- Define a new FIB rule attribute, FRA_UID_RANGE, to describe a
range of UIDs.
- Define a RTA_UID attribute for per-UID route lookups and dumps.
- Support passing these attributes to and from userspace via
rtnetlink. The value INVALID_UID indicates no UID was
specified.
- Add a UID field to the flow structures.
[Backport of net-next 622ec2c9d52405973c9f1ca5116eb1c393adfc7d]
Bug: 16355602
Change-Id: I7e3ab388ed862c4b7e39dc8b0209d977cb1129ac
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stricted [Fri, 7 Sep 2018 09:46:28 +0000 (11:46 +0200)]
Revert "net: core: Support UID-based routing."
This reverts commit 99a6ea48b591877d1cd6a51732c40a1d5321d961.
Stricted [Fri, 7 Sep 2018 09:46:22 +0000 (11:46 +0200)]
Revert "Handle 'sk' being NULL in UID-based routing."
This reverts commit 455b09d66a9ccfc572497ae88375ae343ff9ae66.
Change-Id: I1d95864d8b48ae3ca418cfd790cfd62c07f54f66
Cong Wang [Tue, 15 Apr 2014 23:25:34 +0000 (16:25 -0700)]
ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
As suggested by Julian:
Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.
because in fib_rule_match() we do:
if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
goto out;
flowi4_iif should be LOOPBACK_IFINDEX by default.
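The effect of the quoted check can be sketched in isolation. The struct and function names here are hypothetical; only the comparison follows the snippet above, and SK_LOOPBACK_IFINDEX mirrors the kernel's value of 1:

```c
#include <stdbool.h>

#define SK_LOOPBACK_IFINDEX 1

struct rule_sketch {
    int iifindex; /* 0 means the rule does not restrict the input interface */
};

/* Mirrors the iif test: a rule with an iif only matches flows with that iif. */
static bool rule_iif_matches_sketch(const struct rule_sketch *rule, int flow_iif)
{
    if (rule->iifindex && rule->iifindex != flow_iif)
        return false;
    return true;
}
```

With a flow iif of 0, every rule that names an input interface is skipped, including rules for "iif lo"; defaulting to the loopback index lets those rules match locally generated traffic.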
We need to move LOOPBACK_IFINDEX to include/net/flow.h:
1) It is mostly used by flowi_iif
2) Fix the following compile error if we use it in flow.h
in later patches:
In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
[Backport of net-next 6a662719c9868b3d6c7d26b3a085f0cd3cc15e64]
Change-Id: Ib7a0a08d78c03800488afa1b2c170cb70e34cfd9
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Stricted [Tue, 6 Aug 2019 11:33:01 +0000 (11:33 +0000)]
defconfig: s5neolte: disable cpusets
Stricted [Mon, 29 Jul 2019 04:10:08 +0000 (04:10 +0000)]
defconfig: s5neolte: regenerate defconfig
Ruchi Kandoi [Thu, 23 Apr 2015 19:09:09 +0000 (12:09 -0700)]
nf: IDLETIMER: Adds the uid field in the msg
Message notifications contain an additional uid field. This field
represents the uid that was responsible for waking the radio, and hence
it is present only in notifications stating that the radio is now
active.
Change-Id: I18fc73eada512e370d7ab24fc9f890845037b729
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Bug: 20264396
Danny Wood [Fri, 3 May 2019 10:07:52 +0000 (11:07 +0100)]
bcmdhd: Fix android version check
Danny Wood [Fri, 3 May 2019 08:57:38 +0000 (09:57 +0100)]
dma: dmaengine: Do not skip opening a 'busy' DMA channel as it crashes the kernel and doesn't seem to be necessary
Danny Wood [Sat, 4 May 2019 09:36:48 +0000 (10:36 +0100)]
sec_battery: update prev_safety_time to fix the non-charging issue when the device is not plugged in for 20+ hours
Danny Wood [Thu, 25 Apr 2019 10:16:33 +0000 (11:16 +0100)]
a5xelte: add initial defconfig
Danny Wood [Mon, 8 Apr 2019 08:59:40 +0000 (09:59 +0100)]
cpufreq: interactive: check speedchange_task pointer before waking it to avoid a kernel panic
Danny Wood [Sun, 7 Apr 2019 08:21:44 +0000 (09:21 +0100)]
trace: exynos-ss: fix mode permissions in calls to 'ptrace_may_access'
Rohit Gupta [Sat, 7 Mar 2015 02:46:04 +0000 (18:46 -0800)]
cpufreq: interactive: Rearm governor timer at max freq
Interactive governor doesn't rearm per-cpu timer if target_freq is
equal to policy->max. However, this does not have clear performance
benefits. Profiling doesn't show any difference in benchmarks, games
or other workloads, if timers are always rearmed.
At the same time, there are a few issues caused by not rearming timer
at policy->max.
1) min_sample_time enforcement is inconsistent
For target frequency that is lower than policy->max, it will not
drop until min_sample_time has passed since last frequency evaluation
selected current frequency. However, for policy->max, it will
always drop immediately as long as CPU has been run for longer than
min_sample_time. This is because timer is not running and thus
floor_freq and floor_validate_time are not updated.
Example: assume min_sample_time is 59ms and timer_rate is 20ms.
Frequency X < Y. Let's say CPU would pick the following frequencies
before accounting for min_sample_time in each 20ms sampling window.
Y, Y, Y, Y, X, X, X, X, X
If Y is not policy->max, the final target_freq after considering
min_sample_time will be Y, Y, Y, Y, *Y, *Y, X, X, X
* marks the windows where frequency is prevented from dropping.
If Y is policy->max, the final target_freq will be
Y, Y, Y, Y, X, X, X, X, X
2) Rearm timer in IDLE_START does not work as intended
IDLE_START/END is sent in arch_cpu_idle_enter/exit(). However, next
wake up is decided in tick_nohz_idle_enter(), which traverses the
timer list before idle notification is sent out. Therefore, rearming
timer in idle notification won't take effect until CPU wakes up at
least once. In rare scenarios when a CPU goes to idle and sleeps for a
long time immediately after a heavy load stops, it may not wake up
to drop its frequency vote for a long time, defeating the purpose of
having a slack_timer.
3) Need to rearm timer for policy->max change
commit 535a553fc1c4b4c3627c73214ade6326615a7463
(cpufreq: interactive: restructure CPUFREQ_GOV_LIMITS) mentions the
problem of timer getting indefinitely pushed back due to frequency
changes in policy->min/max. However, it still cancels and rearms timer
if policy->max is increased, and same problem could still happen if
policy->max is frequently changing after the fix. The best solution is
to always rearm timer for each CPU even if it's running at
policy->max.
Rearming timers even if target_freq is policy->max solves these
problems cleanly. It also simplifies the design and code of interactive
governor.
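The example in issue 1 can be reproduced with a small simulation. This is a simplified model, not the governor's code: apply_min_sample_time_sketch() is a hypothetical helper that holds a frequency until min_sample_time has passed since the last evaluation that selected it, and it assumes the timer fires in every window.

```c
#include <stddef.h>

/*
 * raw[i] is the frequency picked in window i before min_sample_time is
 * considered; out[i] receives the frequency after the floor is applied.
 */
static void apply_min_sample_time_sketch(const unsigned int *raw,
                                         unsigned int *out, size_t n,
                                         unsigned int timer_rate_ms,
                                         unsigned int min_sample_time_ms)
{
    unsigned int cur = 0;
    unsigned int hold_since_ms = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int now_ms = (unsigned int)i * timer_rate_ms;

        if (raw[i] < cur && now_ms - hold_since_ms < min_sample_time_ms) {
            out[i] = cur; /* drop blocked: floor has not expired yet */
            continue;
        }
        cur = raw[i];
        hold_since_ms = now_ms; /* this evaluation selected the frequency */
        out[i] = cur;
    }
}
```

With Y = 2, X = 1, a 20 ms timer and a 59 ms min_sample_time, the input Y, Y, Y, Y, X, X, X, X, X yields Y, Y, Y, Y, Y, Y, X, X, X, matching the case where the timer keeps running.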
Change-Id: I973853d2375ea6f697fa4cee04a89efe6b8bf735
Reviewed-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
Sultan Alsawaf [Sun, 3 Jun 2018 17:47:51 +0000 (10:47 -0700)]
kernel: Fix massive cpufreq stats memory leaks
Every time _cpu_up() is called for a CPU, idle_thread_get() is called which
then re-initializes a CPU's idle thread that was already previously created
and cached in a global variable in smpboot.c. idle_thread_get() calls
init_idle() which then calls __sched_fork(). __sched_fork() is where
cpufreq_task_stats_init() is, and cpufreq_task_stats_init() allocates
512 bytes of memory to a pointer in the task struct.
Since idle_thread_get() reuses a task struct instance that was already
previously created, this means that every time it calls init_idle(),
cpufreq_task_stats_init() allocates 512 bytes again and overwrites the
existing 512-byte allocation that the idle thread already had.
This causes 512 bytes to be leaked every time a CPU is onlined. This is
significant when non-boot CPUs are enabled during resume from suspend; this
means that (NR_CPUS - 1) * 512 bytes are leaked every time the device exits
suspend (this turned out to be ~500 kiB leaked in 20 minutes with the
device left on a desk with the screen off).
In order to fix this, don't initialize cpufreq stats at all for the idle
threads. The cpufreq stats interface is intended to be used for tracking
userspace tasks, so we can safely remove it from the kernel's idle threads
without killing any functionality.
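The size of the leak follows directly from the figures above. A small sketch of the arithmetic, where leaked_bytes_sketch() is a hypothetical helper and the CPU count passed in is an assumption about the device:

```c
#include <stddef.h>

#define SK_STATS_ALLOC_BYTES 512u /* per-task allocation size from the text */

/* Each resume re-onlines every non-boot CPU and leaks one allocation per CPU. */
static size_t leaked_bytes_sketch(unsigned int nr_cpus, unsigned int resumes)
{
    if (nr_cpus == 0)
        return 0;
    return (size_t)(nr_cpus - 1) * SK_STATS_ALLOC_BYTES * resumes;
}
```

On a hypothetical 8-CPU device this gives 3584 bytes per resume, so a few hundred suspend/resume cycles are enough to leak on the order of a megabyte.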
Change-Id: I12fe7611fc88eb7f6c39f8f7629ad27b6ec4722c
Signed-off-by: Sultan Alsawaf <sultanxda@gmail.com>
NeilBrown [Fri, 13 Feb 2015 04:49:17 +0000 (15:49 +1100)]
sched: Prevent recursion in io_schedule()
commit 9cff8adeaa34b5d2802f03f89803da57856b3b72 upstream.
io_schedule() calls blk_flush_plug() which, depending on the
contents of current->plug, can initiate arbitrary blk-io requests.
Note that this contrasts with blk_schedule_flush_plug() which requires
all non-trivial work to be handed off to a separate thread.
This makes it possible for io_schedule() to recurse, and initiating
block requests could possibly call mempool_alloc() which, in times of
memory pressure, uses io_schedule().
Apart from any stack usage issues, io_schedule() will not behave
correctly when called recursively as delayacct_blkio_start() does
not allow for repeated calls.
So:
- use ->in_iowait to detect recursion. Set it earlier, and restore
it to the old value.
- move the call to "raw_rq" after the call to blk_flush_plug().
As this is some sort of per-cpu thing, we want some chance that
we are on the right CPU
- When io_schedule() is called recursively, use blk_schedule_flush_plug()
which cannot further recurse.
- as this makes io_schedule() a lot more complex and as io_schedule()
must match io_schedule_timeout(), put all the changes in io_schedule_timeout()
and make io_schedule a simple wrapper for that.
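The recursion guard described in the list above can be modeled in userspace. This is a sketch, not the scheduler's code: struct task_sketch, io_wait_sketch() and the two counters are hypothetical, and the nested call stands in for a flush that ends up waiting for I/O again.

```c
struct task_sketch {
    int in_iowait;        /* set while the task is inside the I/O wait path */
    int direct_flushes;   /* flushes that may issue block requests directly */
    int deferred_flushes; /* flushes handed off, which cannot recurse */
};

static void io_wait_sketch(struct task_sketch *t, int nested_calls)
{
    int old_iowait = t->in_iowait; /* remember the caller's state */

    t->in_iowait = 1;              /* set early so nested entries see it */
    if (old_iowait) {
        t->deferred_flushes++;     /* recursive entry: take the safe path */
    } else {
        t->direct_flushes++;
        if (nested_calls > 0)      /* flushing re-enters the wait path */
            io_wait_sketch(t, nested_calls - 1);
    }
    t->in_iowait = old_iowait;     /* restore rather than clear */
}
```

However deep the flush would like to recurse, the model takes one direct flush and one deferred flush, then unwinds with in_iowait back at its original value.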
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Moved the now rudimentary io_schedule() into sched.h. ]
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change-Id: Ic32d78863e35db89c3e8f0b29dcf20cb563dc70f
Junjie Wu [Tue, 5 Jan 2016 19:09:41 +0000 (11:09 -0800)]
cpufreq: interactive: Use wake_up_process_no_notif to wake up tasks
Scheduler could send a notification to governor each time a task wakes
up. If governor wakes up another task as a response to such a
notification, it could result in endless recursive notifications.
Use wake_up_process_no_notif to ensure scheduler won't send another
notification for speedchange task woken up by the governor.
Change-Id: I697affcbdf79e2ad0cfe843eb880d304960682f4
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Junjie Wu [Tue, 5 Jan 2016 18:53:30 +0000 (10:53 -0800)]
sched: Provide a wake up API without sending freq notifications
Each time a task wakes up, scheduler evaluates its load and notifies
governor if the resulting frequency of destination CPU is larger than
a threshold. However, some governor wakes up a separate task that
handles frequency change, which again calls wake_up_process().
This is dangerous because if the task being woken up meets the
threshold and ends up being moved around, there is a potential for
endless recursive notifications.
Introduce a new API for waking up a task without triggering
frequency notification.
Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Danny Wood [Tue, 2 Apr 2019 08:27:16 +0000 (09:27 +0100)]
arm64: rebuild configs to reflect CONFIG_UID_SYS_STATS change
Danny Wood [Sun, 31 Mar 2019 08:41:21 +0000 (09:41 +0100)]
cpufreq: interactive: use CPUFREQ_RELATION_C to choose closest frequency instead of lowest
Minsung Kim [Sat, 29 Nov 2014 12:43:53 +0000 (21:43 +0900)]
cpufreq: interactive: don't skip waking up speedchange_task if target_freq > policy->cur
When __cpufreq_driver_target() in speedchange_task fails for some reason,
policy->cur can be lower than the target_freq. The governor then skips the
frequency change if the target_freq is equal to the next_freq at the next
sample time.
Add a check to prevent the CPU from staying at a speed lower than the
target_freq for a long duration.
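The added condition can be sketched as a predicate. The function name and parameters are hypothetical, and this is an interpretation of the description rather than the governor's actual code:

```c
#include <stdbool.h>

/*
 * Waking speedchange_task may be skipped only when the newly computed
 * frequency equals the current target AND the hardware has reached it.
 */
static bool can_skip_speedchange_sketch(unsigned int target_freq,
                                        unsigned int new_freq,
                                        unsigned int cur_freq)
{
    return target_freq == new_freq && target_freq <= cur_freq;
}
```

If an earlier frequency change failed and left the current frequency below the target, the predicate returns false and the task is woken to retry.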
Change-Id: Ibfdcd193b8280390b8f8374a63218aa31267f310
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>
Viresh Kumar [Mon, 1 May 2017 18:32:28 +0000 (18:32 +0000)]
cpufreq: Optimize cpufreq_frequency_table_verify()
cpufreq_frequency_table_verify() is rewritten here to make it more logical
and efficient.
- merge multiple lines for variable declarations together.
- quit early if any frequency between min/max is found.
- don't call cpufreq_verify_within_limits() in case any valid freq is
found as it is of no use.
- rename the count variable as found and change its type to boolean.
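The shape of the rewritten loop might look roughly like the following. It is a simplified sketch under assumptions: struct limits_sketch and table_verify_sketch() are hypothetical, and returning false when no frequency lies at or above min is a simplification of whatever the real function does in that case.

```c
#include <stdbool.h>
#include <stddef.h>

struct limits_sketch {
    unsigned int min;
    unsigned int max;
};

static bool table_verify_sketch(const unsigned int *table, size_t n,
                                struct limits_sketch *lim)
{
    unsigned int next_larger = ~0u;
    bool found = false;

    for (size_t i = 0; i < n; i++) {
        unsigned int freq = table[i];

        if (freq >= lim->min && freq <= lim->max) {
            found = true; /* quit early: one valid frequency is enough */
            break;
        }
        if (freq > lim->max && freq < next_larger)
            next_larger = freq; /* smallest frequency above max so far */
    }

    if (!found) {
        if (next_larger == ~0u)
            return false;       /* simplification: nothing usable */
        lim->max = next_larger; /* widen max to the next table entry */
    }
    return true;
}
```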
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Used commit: https://github.com/Noxxxious/Zero/commit/a04a517aa911804b26f5970e49d48d35e1a62b24
mythos234 [Tue, 22 Mar 2016 18:09:47 +0000 (18:09 +0000)]
Introduce CPUFREQ_RELATION_C
It selects the frequency with the minimum euclidean distance to
target. In case of equal distance between 2 frequencies, it will
select the greater freq.
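The selection rule can be sketched over a plain array. closest_freq_sketch() is a hypothetical helper that assumes a non-empty table; it is not the cpufreq table-walking code:

```c
#include <stddef.h>

/* Pick the entry nearest to target; on a tie, prefer the higher frequency. */
static unsigned int closest_freq_sketch(const unsigned int *table, size_t n,
                                        unsigned int target)
{
    unsigned int best = table[0];
    unsigned int best_diff = best > target ? best - target : target - best;

    for (size_t i = 1; i < n; i++) {
        unsigned int f = table[i];
        unsigned int diff = f > target ? f - target : target - f;

        if (diff < best_diff || (diff == best_diff && f > best)) {
            best = f;
            best_diff = diff;
        }
    }
    return best;
}
```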
Created by Stratos Karafotis <stratosk@semaphore.gr>
Original commit: https://github.com/XileForce/Vindicator-S6/commit/4b113d38e8f62ea5810d2b4d134cc6f7a81198dd
Viresh Kumar [Sun, 4 Mar 2018 13:25:38 +0000 (16:25 +0300)]
cpufreq: Move get_cpu_idle_time() to cpufreq.c
Governors other than ondemand and conservative can also use get_cpu_idle_time()
and they aren't required to compile cpufreq_governor.c. So, move these
independent routines to cpufreq.c instead.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This fixes compiling errors without Ondemand
Based on https://patchwork.kernel.org/patch/2582231/
Jason Hrycay [Fri, 22 Mar 2019 21:01:50 +0000 (22:01 +0100)]
cpufreq_stats: Fix stats leak during update policy
When the cpufreq policy is moved from one CPU to another, the percpu
stats_table is overwritten and leaked. Properly free the old stats table
and ensure it's protected in the non-sysfs paths of update policy and
acct_update_power. The sysfs entries are removed in the cpufreq core
driver before migrating the policy. We introduce a new spinlock
specifically for these operations to avoid needing to convert all the
other spinlocks into the irq safe variants since acct_update_power is
typically called in ISR context.
[ported to apq8084-common by Corinna Vinschen <xda@vinschen.de>]
Change-Id: I95ff24c07834065cd0fd3c763a488a9843097a1d
Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com>
Reviewed-on: https://gerrit.mot.com/921752
SLTApproved: Slta Waiver <sltawvr@motorola.com>
SME-Granted: SME Approvals Granted
Reviewed-by: Igor Kovalenko <igork@motorola.com>
Andres Oportus [Sat, 20 May 2017 00:59:42 +0000 (17:59 -0700)]
ANDROID: cpufreq_stats: Fix task time in state locking
The task->time_in_state pointer is written to at task creation
and exiting, protection is needed for concurrent reads e.g. during
sysfs accesses. Added spin lock such that the task's time_in_state
pointer used is set to either allocated memory or null.
Bug: 38463235
Test: Torture concurrent sysfs reads with short lived tasks
Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Iaa6402bf50a33489506f2170e4dfabe535d79e15
Andres Oportus [Mon, 5 Jun 2017 19:46:44 +0000 (12:46 -0700)]
ANDROID: cpufreq: stats: add uid removal for uid_time_in_state
Bug: 62295304
Bug: 34133340
Test: Boot and test uid removal by writing to remove_uid_range
Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Ic51fc9480716a8aad88fb55549c2b69021038a11
Conflicts:
drivers/cpufreq/cpufreq_stats.c
include/linux/cpufreq.h
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Mon, 22 May 2017 19:08:06 +0000 (12:08 -0700)]
uid_sys_stats: defer io stats calculation for dead tasks
Store sum of dead task io stats in uid_entry and defer uid io
calculation until next uid proc stat change or dumpsys.
Bug: 37754877
Change-Id: I970f010a4c841c5ca26d0efc7e027414c3c952e0
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Fri, 14 Apr 2017 00:07:58 +0000 (17:07 -0700)]
uid_sys_stats: reduce update_io_stats overhead
Replaced read_lock with rcu_read_lock to reduce time that preemption
is disabled.
Added a function to update io stats for specific uid and moved
hash table lookup, user_namespace out of loops.
Bug: 37319300
Change-Id: I2b81b5cd3b6399b40d08c3c14b42cad044556970
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Wei Wang [Mon, 13 Mar 2017 19:22:21 +0000 (12:22 -0700)]
uid_sys_stats: change to use rt_mutex
We see this happen multiple times under heavy workload in systrace,
with AMS stuck on uid_lock.
Running process: Process 953
Running thread: android.ui
State: Uninterruptible Sleep
Start: 1,025.628 ms
Duration: 27,955.949 ms
On CPU:
Running instead: system_server
Args: {kernel callsite when blocked:: "uid_procstat_write+0xb8/0x144"}
Changing to rt_mutex can mitigate the priority inversion
Bug: 34991231
Bug: 34193533
Test: on marlin
Change-Id: I28eb3971331cea60b1075740c792ab87d103262c
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Thu, 2 Mar 2017 21:39:43 +0000 (13:39 -0800)]
ANDROID: uid_sys_stats: account for fsync syscalls
Change-Id: Ie888d8a0f4ec7a27dea86dc4afba8e6fd4203488
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Tue, 28 Feb 2017 23:09:42 +0000 (15:09 -0800)]
ANDROID: uid_sys_stats: fix negative write bytes.
A task can cancel writes made by other tasks. In rare cases,
cancelled_write_bytes is larger than write_bytes if the task
itself didn't make any write. This doesn't affect total size
but may cause confusion when looking at IO usage on individual
tasks.
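The fix amounts to clamping the difference at zero. A minimal sketch, with a hypothetical function name and plain integers in place of the task's I/O accounting fields:

```c
#include <stdint.h>

/* Net bytes written by a task, never negative even if cancels exceed writes. */
static uint64_t net_write_bytes_sketch(uint64_t write_bytes,
                                       uint64_t cancelled_write_bytes)
{
    if (write_bytes <= cancelled_write_bytes)
        return 0;
    return write_bytes - cancelled_write_bytes;
}
```

Without the clamp, an unsigned subtraction would wrap around to a huge value instead of going negative.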
Bug: 35851986
Change-Id: If6cb549aeef9e248e18d804293401bb2b91918ca
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Fri, 17 Feb 2017 02:07:05 +0000 (18:07 -0800)]
ANDROID: uid_sys_stats: remove unnecessary code in procstat switch
No need to aggregate the switched uid separately since
update_io_stats_locked covers all uids.
Bug: 34198239
Change-Id: Ifed347264b910de02e3f3c8dec95d1a2dbde58c0
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Fri, 20 Jan 2017 00:34:34 +0000 (16:34 -0800)]
ANDROID: uid_sys_stats: return full size when state is not changed.
Userspace keeps retrying when it sees nothing is written.
Bug: 34364961
Change-Id: Ie288c90c6a206fb863dcad010094fcd1373767aa
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Wed, 18 Jan 2017 01:26:07 +0000 (17:26 -0800)]
ANDROID: uid_sys_stats: allow writing same state
Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34360629
Change-Id: Ia748351e07910b1febe54f0484ca1be58c4eb9c7
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Wed, 11 Jan 2017 00:11:07 +0000 (16:11 -0800)]
ANDROID: uid_sys_stats: rename uid_cputime.c to uid_sys_stats.c
This module tracks cputime and io stats.
Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: I9ee7d9e915431e0bb714b36b5a2282e1fdcc7342
Conflicts:
drivers/misc/Kconfig
drivers/misc/Makefile
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Jin Qian [Wed, 11 Jan 2017 00:10:35 +0000 (16:10 -0800)]
ANDROID: uid_cputime: add per-uid IO usage accounting
IO usages are accounted in foreground and background buckets.
For each uid, io usage is calculated in two steps.
delta = current total of all uid tasks - previous total
current bucket += delta
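The two steps can be sketched as follows. The struct layout, the enum values for the two buckets, and the guard against a shrinking total are assumptions made for illustration and are not taken from the driver:

```c
#include <stdint.h>

enum { SK_FOREGROUND = 0, SK_BACKGROUND = 1, SK_NR_STATES = 2 };

struct uid_io_sketch {
    uint64_t bucket[SK_NR_STATES]; /* accumulated IO per state */
    uint64_t prev_total;           /* total seen at the last update */
    int state;                     /* bucket currently being charged */
};

static void update_io_sketch(struct uid_io_sketch *u, uint64_t current_total)
{
    /* Step 1: delta since the previous update (guarded, as an assumption). */
    uint64_t delta = current_total >= u->prev_total
                         ? current_total - u->prev_total
                         : 0;

    /* Step 2: charge the delta to whichever bucket is current. */
    u->bucket[u->state] += delta;
    u->prev_total = current_total;
}
```

Switching the state between updates is what splits one running total into foreground and background usage.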
Bucket is determined by current uid stat. Userspace writes to
/proc/uid_procstat/set <uid> <stat> when uid stat is updated.
/proc/uid_io/stats shows IO usage in this format.
<uid> <foreground IO> <background IO>
Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: I3369e59e063b1e5ee0dfe3804c711d93cb937c0c
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Ruchi Kandoi [Sat, 24 Oct 2015 00:49:11 +0000 (17:49 -0700)]
uid_cputime: Check for the range while removing range of UIDs.
Checking if the uid_entry->uid matches the uid intended to be removed will
prevent deleting unwanted uid_entry.
Type cast the key for the hashtable to the same size, as when they were
inserted. This will make sure that we can find the uid_entry we want.
Bug: 25195548
Change-Id: I567942123cfb20e4b61ad624da19ec4cc84642c1
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Jin Qian [Tue, 14 Jul 2015 01:16:55 +0000 (18:16 -0700)]
uid_cputime: fix cputime overflow
Converting cputime_t to usec caused overflow when the value is greater
than 1 hour. Use msec and convert to unsigned long long to support bigger
range.
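The overflow can be illustrated under the assumption that the microsecond value was held in a 32-bit unsigned quantity, which wraps after 2^32 microseconds (a little over 71 minutes). Both helper names are hypothetical:

```c
#include <stdint.h>

/* Microseconds squeezed into 32 bits: wraps once the value exceeds 2^32 - 1. */
static uint32_t usec_32bit_sketch(uint64_t seconds)
{
    return (uint32_t)(seconds * 1000000ULL);
}

/* Milliseconds in a 64-bit type: ample range for any realistic cputime. */
static unsigned long long msec_64bit_sketch(uint64_t seconds)
{
    return seconds * 1000ULL;
}
```

One hour still fits in the 32-bit microsecond value, but two hours does not, while the 64-bit millisecond value is exact in both cases.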
Bug: 22461683
Change-Id: I853fe3e8e7dbf0d3e2cc5c6f9688a5a6e1f1fb3e
Signed-off-by: Jin Qian <jinqian@google.com>
Git-commit: 2abf710d2849737b6a435f371395481da628f746
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Ruchi Kandoi [Fri, 31 Jul 2015 17:17:54 +0000 (10:17 -0700)]
uid_cputime: Iterates over all the threads instead of processes.
Bug: 22833116
Change-Id: I775a18f61bd2f4df2bec23d01bd49421d0969f87
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: 35ef14095795ea331361034a1f7087bdf07f76f7
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Ruchi Kandoi [Fri, 26 Jun 2015 21:19:21 +0000 (14:19 -0700)]
uid_cputime: Avoids double accounting of process stime, utime and cpu_power in task exit.
This avoids a race in which a process is terminating while show_uid_stats is
being read. At that point the task_struct still exists, so the terminating
process is counted as one of the active tasks, even though its stats have
already been added in the task exit callback.
When a task is terminated, its cpu_power is added to the total for terminated
tasks. Before the task releases all its resources, the CPU may reschedule the
task or a timer interrupt may fire. The additional time would then be added to
the process, skewing the accounting. This change avoids that race condition.
Bug: 22064385
Change-Id: Id2ae04b33fcd230eda9683a41b6019d4dd8f5d85
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: 344377047daa5832ef798af697adee388e367d57
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Ruchi Kandoi [Fri, 17 Apr 2015 23:52:54 +0000 (16:52 -0700)]
uid_cputime: Extends the cputime functionality to report power per uid
/proc/uid_cputime/show_uid_stats shows a third field, power, for each of
the uids. It represents the power in units of uAusec.
Bug: 21498425
Change-Id: I52fdc5e59647e9dc97561a26d56f462a2689ba9c
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: b8d3311e8d41c9109f9a4ba3c4d0f7e594539c68
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Andres Oportus [Thu, 11 May 2017 15:57:39 +0000 (08:57 -0700)]
ANDROID: Fix cpufreq stats table creation
cpufreq stats does not correctly support multiple cpus per profile.
For instance Marlin/Sailfish per cpu stats struct does not get created
for all cpus (only one per policy). This change does not provide full
support for multiple cpus per profile but allows stats creation per
cpu to allow b/34133340 to be completed.
Bug: 38244231
Bug: 34133340
Test: Boot Sailfish
Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: I72ea548a199f57ed841618b08b9c41e99b493376
Conflicts:
drivers/cpufreq/cpufreq_stats.c
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Andres Oportus [Wed, 1 Feb 2017 21:34:54 +0000 (13:34 -0800)]
ANDROID: cpufreq_stat: add per task/uid/freq stats
Adds per process nodes in /proc/PID/time_in_state showing per
frequency times and adds a global /proc/uid_time_in_state
showing per frequency per uid times.
Bug: 34133340
Bug: 38320164
Tests: boot sailfish and reading /proc/uid_time_in_state and
/proc/$$/time_in_state
Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Ideb22b608b9a5e7bd2200a3a6df0f110b635f96a
Ruchi Kandoi [Fri, 27 Jan 2017 22:23:57 +0000 (22:23 +0000)]
uid_cputime: Avoids double accounting of process stime, utime and cpu_power in task exit.
This avoids a race in which a process is terminating while show_uid_stats is
being read. At that point the task_struct still exists, so the terminating
process is counted as one of the active tasks, even though its stats have
already been added in the task exit callback.
When a task is terminated, its cpu_power is added to the total for terminated
tasks. Before the task releases all its resources, the CPU may reschedule the
task or a timer interrupt may fire. The additional time would then be added to
the process, skewing the accounting. This change avoids that race condition.
Bug:
22064385
Change-Id: Id2ae04b33fcd230eda9683a41b6019d4dd8f5d85
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit:
344377047daa5832ef798af697adee388e367d57
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Change-Id: I405733725d535b0a864088516bf52fa3638ee6aa
Used commit: https://github.com/omnirom/android_kernel_lge_msm8992/commit/
e36bbc4edde59683e980c816fb4a432fbbe594bb
Ruchi Kandoi [Sat, 10 Dec 2016 19:07:22 +0000 (19:07 +0000)]
sched: cpufreq: Adds a field cpu_power in the task_struct
cpu_power has been added to keep track of amount of power each task is
consuming. cpu_power is updated whenever stime and utime are updated for
a task. Power is computed by taking into account the frequency at which
the current core was running and the current drawn by the cpu when actively
running at that frequency.
Change-Id: Ic535941e7b339aab5cae9081a34049daeb44b248
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Original commit: https://github.com/dianlujitao/CAF_kernel_msm-3.10/commit/
85a6bd2bc4c903df43186e6f41209746aa6fdf05
Ruchi Kandoi [Sat, 10 Dec 2016 18:07:17 +0000 (18:07 +0000)]
cpufreq_stats: Adds the functionality to load current values for each frequency for all the cores.
The current values for the cpu cores need to be added to the device
tree for this functionality to work. It loads the current values for each
frequency in uA for all the cores.
Change-Id: If03311aaeb3e4c09375dd0beb9ad4fbb254b5c08
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Original commit: https://github.com/dianlujitao/CAF_kernel_msm-3.10/commit/
8d12562a74922eac859dcec9c43d34d8fd1a9fd1
Sultanxda [Sun, 25 Jun 2017 07:39:02 +0000 (07:39 +0000)]
Fix memory leaks when updating stats table
The address for an element in the per-cpu cpufreq_stats_table variable
is overwritten in cpufreq_stats_update_policy_cpu(), but the memory
allocated at the address that gets overwritten is not freed beforehand.
Free the allocated memory beforehand to fix the memory leaks.
Signed-off-by: Sultanxda <sultanxda@gmail.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Signed-off-by: Luca Grifo <lg@linux.com>
Signed-off-by: djb77 <dwayne.bakewell@gmail.com>
Used commit: https://github.com/Siddhant-Naik/TheFlash-Kernel-A5-A7-2017/commit/
98bd2052dfc9757832c49b7a8c68157b8baea13f
Chris Redpath [Mon, 17 Jun 2013 17:36:56 +0000 (18:36 +0100)]
cpufreq: interactive governor drops bits in time calculation
Keep time calculation in 64-bit throughout. If we have long times
between idle calculations this can result in deltas > 32 bits
which causes incorrect load percentage calculations and selecting
the wrong frequencies if we truncate here.
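As a rough user-space illustration of the truncation described above (not the governor's code; the helper names, the microsecond unit and the sample values are assumptions made for this sketch), a delta that no longer fits in 32 bits wraps around and distorts the load percentage:

```c
#include <stdint.h>

/* Hypothetical helper: load percentage with the delta kept in 64 bits. */
static unsigned int load_pct_64(uint64_t busy_us, uint64_t delta_us)
{
	if (delta_us == 0)
		return 0;
	return (unsigned int)(busy_us * 100 / delta_us);
}

/* Same calculation, but both values are truncated to 32 bits first. */
static unsigned int load_pct_truncated(uint64_t busy_us, uint64_t delta_us)
{
	uint32_t busy32 = (uint32_t)busy_us;
	uint32_t delta32 = (uint32_t)delta_us;

	if (delta32 == 0)
		return 0;
	return (unsigned int)((uint64_t)busy32 * 100 / delta32);
}
```

With one second of busy time inside a window of 2^32 + 1000000 us, the 64-bit version reports a load of 0 while the truncated version sees a one-second window and reports 100.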
Change-Id: Iac1e5646d58485737538edbb9e7a6d2246b56023
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Alex Naidis <alex.naidis@linux.com>
Derek Basehore [Fri, 27 Mar 2015 20:59:30 +0000 (13:59 -0700)]
CHROMIUM: cpufreq: interactive: calculate load before freq change
The update to cpu load for cpufreq rate changes was happening after the rate
change instead of before. This switches it to update the cpu load before the
cpufreq rate change so the old frequency is used for calculating the speed
adjusted cpu load. The old frequency was used for that active time, so it should
be used for calculating the speed adjusted cpu load.
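The ordering issue can be sketched in user space as follows. This is only an illustration under assumptions: the struct, its fields and both helpers are hypothetical and do not mirror the governor's data structures. The point is that active time must be weighted by the frequency it actually ran at:

```c
#include <stdint.h>

/* Hypothetical per-cpu accounting state. */
struct toy_cpu_acct {
	uint64_t speedadj;	/* running sum of active_us * freq_khz */
	unsigned int cur_khz;	/* frequency the cpu is currently running at */
};

/* Weight the elapsed active time by the frequency in effect. */
static void toy_account(struct toy_cpu_acct *c, uint64_t active_us)
{
	c->speedadj += active_us * c->cur_khz;
}

/* Intended order: account at the old frequency, then switch. */
static void toy_change_freq_load_first(struct toy_cpu_acct *c,
				       uint64_t active_us, unsigned int new_khz)
{
	toy_account(c, active_us);
	c->cur_khz = new_khz;
}

/* Problematic order: the past interval gets weighted by the new frequency. */
static void toy_change_freq_load_last(struct toy_cpu_acct *c,
				      uint64_t active_us, unsigned int new_khz)
{
	c->cur_khz = new_khz;
	toy_account(c, active_us);
}
```

For 1000 us of activity at 500000 kHz followed by a switch to 2000000 kHz, the first variant accumulates 500000000 while the second accumulates 2000000000, overstating the speed adjusted load.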
BUG=chrome-os-partner:37673
TEST=power_LoadTest on Jerry
check we spend most of our time at the lower cpu frequencies
Change-Id: I2fff785561ad8aebd354ddd305e91d96c799a617
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/262925
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Nirmal Abraham [Wed, 7 Oct 2015 10:53:43 +0000 (16:23 +0530)]
cpufreq: Correct the data reported in all_time_in_state
Commit
bd9474e059bbb2bb62f7e93894cfc3d3ba473fb2 (cpufreq_stats:
Adds the functionality to load current values for each frequency
for all the cores) introduced a change by which
'cpufreq_allstats_create' gets called at initialization (from
'cpufreq_stats_init' instead of 'cpufreq_stat_notifier_policy').
This causes 'cpufreq_allstats_create' to be called before the
freq_table is allocated from 'create_all_freq_table'. Due to
this, the data for cpus which are online at boot is not
added to the 'all_freq_table', leading to incorrect
reporting of data when the below sysfs command is run -
'cat /sys/devices/system/cpu/cpufreq/all_time_in_state'.
Correct this behaviour by altering the cpufreq_stats init
sequence by which the memory for 'all_freq_table' is allocated
before the 'cpufreq_allstats_create' function is called.
Change-Id: I2232dacdc0deec4d1987c418e833fe28f74623fc
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Junjie Wu [Wed, 25 Mar 2015 21:05:49 +0000 (14:05 -0700)]
cpufreq: interactive: Ramp up directly if cpu_load exceeds 100
When governor is using regular busy time tracking, cpu_load will
never exceed 100 because busy time will never exceed elapsed time in
any one sampling window. The only exception is when frequency is
reduced in the middle of a window (e.g. due to thermal throttling). In
this case, cpu_load is likely irrelevant since the frequency the
governor has been voting for is already higher than what the target can
run at.
However, on a heterogeneous CPU system with scheduler input enabled
to track the load of migrated tasks, cpu_load could also exceed 100
when a task migrates from more capable CPU to slower CPU. When this
happens, governor already knows the exact frequency required to handle
this load. There is no need to progressively ramp up frequency in order
to assess the load's real demand. It's not desirable to starve such a
migrating task by forcing it through ramping up process on the slower
CPU.
Directly jump beyond hispeed_freq and ignore above_hispeed_delay if
cpu_load exceeds 100.
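A simplified sketch of that decision, written for illustration only: the function name, its parameters and the linear scaling are assumptions, the above_hispeed_delay handling is left out, and a real governor would still map the result onto its frequency table.

```c
#include <stdint.h>

/* Hypothetical frequency pick; all names are made up for this sketch. */
static unsigned int toy_pick_khz(unsigned int cur_khz, unsigned int hispeed_khz,
				 unsigned int cpu_load,
				 unsigned int go_hispeed_load)
{
	/* Frequency at which this load would amount to exactly 100%. */
	unsigned int needed_khz =
		(unsigned int)((uint64_t)cur_khz * cpu_load / 100);

	/* Load above 100: the demand is already known, so skip the ramp. */
	if (cpu_load > 100)
		return needed_khz;

	/* Regular path: first step up to hispeed_khz when the load is high. */
	if (cpu_load >= go_hispeed_load && cur_khz < hispeed_khz)
		return hispeed_khz;

	return needed_khz;
}
```

With cur_khz = 800000 and hispeed_khz = 1200000, a load of 95 stops at 1200000, whereas a load of 250 goes straight to 2000000.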
Change-Id: Ib87057e4f00732fad943ab595a33e3059494ef15
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Alexander Alexeev [Thu, 24 Jan 2019 19:04:15 +0000 (22:04 +0300)]
cpufreq: interactive: fix permanent locking on high CPU frequency
This removes Samsung's improper changes.
Andrew Bresticker [Tue, 13 Aug 2013 20:39:48 +0000 (13:39 -0700)]
CHROMIUM: cpufreq: interactive: validate above_hispeed_delay
We rely on the frequencies in above_hispeed_delay being in ascending
order. Ensure that is the case for the values the user gives us.
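The check can be sketched as below. This is a minimal sketch, assuming the value has already been tokenized into the layout "delay [freq:delay ...]" and that frequencies must be strictly ascending; the function name and the array representation are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/*
 * tok[] holds: delay0, freq1, delay1, freq2, delay2, ...
 * so frequencies sit at the odd indices.
 */
static int toy_validate_above_hispeed_delay(const unsigned int *tok,
					    size_t ntok)
{
	size_t i;

	/* A well-formed table always has an odd number of tokens. */
	if (ntok == 0 || (ntok % 2) == 0)
		return -EINVAL;

	/* Each frequency must be higher than the one before it. */
	for (i = 3; i < ntok; i += 2) {
		if (tok[i] <= tok[i - 2])
			return -EINVAL;
	}
	return 0;
}
```

The tokens 20000, 1800000, 20000, 700000, 0 are rejected because 700000 follows 1800000.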
BUG=chrome-os-partner:20830
TEST=Boot Pit; write invalid above_hispeed_delay value:
localhost ~ # echo "20000 1800000:20000 700000:0" > above_hispeed_delay
-bash: echo: write error: Invalid argument
Change-Id: Ifb431e58ad4b6b371152d7b09fcbefa127b0cbe2
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Reviewed-on: https://gerrit.chromium.org/gerrit/65742
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
Junjie Wu [Tue, 5 Aug 2014 18:24:32 +0000 (11:24 -0700)]
cpufreq: interactive: BUG_ON when tunables are NULL after init
When tunables are not available for events other than
CPUFREQ_GOV_POLICY_INIT in cpufreq_governor_interactive(), trigger a
panic instead of throwing a warning.
When the original warning happens, some race condition must have
occurred, and governor will be in a bad state even if it might still
run for a while. Panic directly so that it's easier to catch the
first race event.
Change-Id: I2dc1185cabfe72a63739452731fe242924d2cf45
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Junjie Wu [Tue, 27 Oct 2015 17:30:57 +0000 (18:30 +0100)]
cpufreq: interactive: Don't set floor_validate_time during boost
Frequency selection algorithm guarantees its chosen frequency
is not lower than hispeed_freq as long as boost is enabled.
Setting floor_freq and floor_validate_time during boost could block
CPU frequency from going below hispeed_freq even after
boostpulse_duration expires, if min_sample_time is higher than
boostpulse_duration. This conflicts with the intention of commit
de091367ead15b6e95dd1d0743a18f0da5a07ee5
(cpufreq: interactive: specify duration of CPU speed boost pulse)
to allow CPU to ramp down immediately after boost expires. It also
makes boost behavior inconsistent since it depends on min_sample_time.
Avoid setting floor_freq and floor_validate_time when boost starts.
Change-Id: I12852998af46cfbfaf8661eb5e8d5301b6f631e7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Bálint Czobor <czoborbalint@gmail.com>
Junjie Wu [Fri, 15 Aug 2014 23:20:54 +0000 (16:20 -0700)]
cpufreq: interactive: Use del_timer/add_timer_on to rearm timers
Replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU.
If cpufreq_interactive_timer_resched() is called from another CPU,
the timer will be armed on the wrong CPU.
Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another
CPU reschedules the timer. This would provide more flexibility
for future changes.
Change-Id: I3a10be37632afc0ea4e0cc9c86323b9783b216b1
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Sridhar Ancha [Thu, 14 Apr 2016 15:50:43 +0000 (21:20 +0530)]
net: core: To send ARP probe when neighbor state is NUD_STALE
Featurizing to send an ARP probe to the connected client when
the neighbor state moves to NUD_STALE. This triggers the
neighbor state to move back to NUD_REACHABLE if the ARP request
is resolved and prevents a RTM_DELNEIGH from being triggered
Change-Id: I27aba004a180dfbff5b1fcee2d04047c8523fb8a
Signed-off-by: Sridhar Ancha <sancha@codeaurora.org>
Ravinder Konka [Thu, 9 Apr 2015 06:12:00 +0000 (11:42 +0530)]
net: core: Send ARP probe and trigger RTM_NEWNEIGH
Send ARP probe and generate RTM_NEWNEIGH if the neighbor
state is not NUD_REACHABLE. Also featurize changes for
sending neighbor probe.
Change-Id: I633285b8e0cbcd49291d5e52136f11e20f2388bc
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
Erik Kline [Mon, 18 May 2015 10:44:41 +0000 (19:44 +0900)]
neigh: Better handling of transition to NUD_PROBE state
[1] When entering NUD_PROBE state via neigh_update(), perhaps received
from userspace, correctly (re)initialize the probes count to zero.
This is useful for forcing revalidation of a neighbor (for example
if the host is attempting to do DNA [IPv4 4436, IPv6 6059]).
[2] Notify listeners when a neighbor goes into NUD_PROBE state.
By sending notifications on entry to NUD_PROBE state listeners get
more timely warnings of imminent connectivity issues.
The current notifications on entry to NUD_STALE have somewhat
limited usefulness: NUD_STALE is a perfectly normal state, as is
NUD_DELAY, whereas notifications on entry to NUD_FAILURE come after
a neighbor reachability problem has been confirmed (typically after
three probes).
Change-Id: I1d01d40ef3bc4753b0eaa79da2b27235425b1934
Signed-off-by: Erik Kline <ek@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git-commit:
e4a6d6ba5a9e9e1796bbe6efe4f20ce7072df667
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Ravinder Konka [Wed, 10 Dec 2014 11:46:31 +0000 (17:16 +0530)]
net: core: To send ARP probe when neighbor state is NUD_STALE
Send an ARP probe to the connected client when the neighbor
state moves to NUD_STALE. This triggers the neighbor state
to move back to NUD_REACHABLE if the ARP request is resolved
and prevents a RTM_DELNEIGH from being triggered
Change-Id: I4d17c8f24d47931524904d0db74fa812a4f235f6
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
Signed-off-by: Skylar Chang <chiaweic@codeaurora.org>
Martijn Coenen [Thu, 10 Aug 2017 11:56:16 +0000 (13:56 +0200)]
ANDROID: binder: don't queue async transactions to thread.
This can cause issues with processes using the poll()
interface:
1) client sends two oneway transactions
2) the second one gets queued on async_todo
(because the server didn't handle the first one
yet)
3) server returns from poll(), picks up the
first transaction and does transaction work
4) server is done with the transaction, sends
BC_FREE_BUFFER, and the second transaction gets
moved to thread->todo
5) libbinder's handlePolledCommands() only handles
the commands in the current data buffer, so
doesn't see the new transaction
6) the server continues running and issues a new
outgoing transaction. Now, it suddenly finds
the incoming oneway transaction on its thread
todo, and returns that to userspace.
7) userspace does not expect this to happen; it
may be holding a lock while making the outgoing
transaction, and if handling the incoming
transaction requires taking the same lock,
userspace will deadlock.
By queueing the async transaction to the proc
workqueue, we make sure it's only picked up when
a thread is ready for proc work.
Bug:
38201220
Bug:
63075553
Bug:
63079216
Change-Id: I84268cc112f735d7e3173793873dfdb4b268468b
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Thu, 10 Aug 2017 11:50:52 +0000 (13:50 +0200)]
ANDROID: binder: don't enqueue death notifications to thread todo.
This allows userspace to request death notifications without
having to worry about getting an immediate callback on the same
thread; one scenario where this would be problematic is if the
death recipient handler grabs a lock that was already taken
earlier (eg as part of a nested transaction).
Bug:
23525545
Test: binderLibTest.DeathNotificationThread passes
Change-Id: I955e16306fe3110dacb9a391ffff1bf869249495
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Thu, 10 Aug 2017 10:32:00 +0000 (12:32 +0200)]
ANDROID: binder: call poll_wait() unconditionally.
Because we're not guaranteed that subsequent calls
to poll() will have a poll_table_struct parameter
with _qproc set. When _qproc is not set, poll_wait()
is a noop, and we won't be woken up correctly.
Bug:
64552728
Change-Id: I5b904c9886b6b0994d1631a636f5c5e5f6327950
Test: binderLibTest stops hanging with new test
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Thu, 27 Jul 2017 21:52:24 +0000 (23:52 +0200)]
ANDROID: binder: Don't BUG_ON(!spin_is_locked()).
Because spin_is_locked() always returns false on UP
systems.
Use assert_spin_locked() instead, and remove the
WARN_ON() instances, since those were easy to verify.
Bug:
64073116
Change-Id: I9080991c6d67e91928282a3ee64db23e50c7d66a
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Fri, 26 May 2017 17:48:56 +0000 (10:48 -0700)]
ANDROID: binder: don't check prio permissions on restore.
Because we have disabled RT priority inheritance for
the regular binder domain, the following can happen:
1) thread A (prio 98) calls into thread B
2) because RT prio inheritance is disabled, thread B
runs at the lowest nice (prio 100) instead
3) thread B calls back into A; A will run at prio 100
for the duration of the transaction
4) When thread A is done with the call from B, we will
try to restore the prio back to 98. But, we fail
because the process doesn't hold CAP_SYS_NICE,
neither is RLIMIT_RT_PRIO set.
While the proper fix going forward will be to
correctly apply CAP_SYS_NICE or RLIMIT_RT_PRIO,
for now it seems reasonable to not check permissions
on the restore path.
Change-Id: Ibede5960c9b7bb786271c001e405de50be64d944
Signed-off-by: Martijn Coenen <maco@android.com>
Colin Cross [Tue, 20 Jun 2017 20:54:44 +0000 (13:54 -0700)]
Add BINDER_GET_NODE_DEBUG_INFO ioctl
The BINDER_GET_NODE_DEBUG_INFO ioctl will return debug info on
a node. Each successive call reusing the previous return value
will return the next node. The data will be used by
libmemunreachable to mark the pointers with kernel references
as reachable.
Bug:
28275695
Change-Id: Idbbafa648a33822dc023862cd92b51a595cf7c1c
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Fri, 23 Jun 2017 17:13:43 +0000 (10:13 -0700)]
ANDROID: binder: add RT inheritance flag to node.
Allows a binder node to specify whether it wants to
inherit real-time scheduling policy from a caller.
Change-Id: I375b6094bf441c19f19cba06d5a6be02cd07d714
Signed-off-by: Martijn Coenen <maco@android.com>
Martijn Coenen [Wed, 7 Jun 2017 17:02:12 +0000 (10:02 -0700)]
ANDROID: binder: improve priority inheritance.
By raising the priority of a thread selected for
a transaction *before* we wake it up.
Delay restoring the priority when doing a reply
until after we wake-up the process receiving
the reply.
Change-Id: Ic332e4e0ed7d2d3ca6ab1034da4629c9eadd3405
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Wed, 7 Jun 2017 16:29:14 +0000 (09:29 -0700)]
ANDROID: binder: add min sched_policy to node.
This change adds flags to flat_binder_object.flags
to allow indicating a minimum scheduling policy for
the node. It also clarifies the valid value range
for the priority bits in the flags.
Internally, we use the priority map that the kernel
uses, e.g. [0..99] for real-time policies and [100..139]
for the SCHED_NORMAL/SCHED_BATCH policies.
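For reference, the mapping onto that range can be sketched as follows. The two constants carry the values the scheduler uses for MAX_RT_PRIO and DEFAULT_PRIO; the helper names are made up for this sketch.

```c
#define TOY_MAX_RT_PRIO		100	/* real-time priorities occupy 0..99 */
#define TOY_DEFAULT_PRIO	120	/* kernel priority of nice 0 */

/* nice -20..19 maps to kernel priority 100..139. */
static int toy_nice_to_prio(int nice)
{
	return TOY_DEFAULT_PRIO + nice;
}

/* rt_priority 1..99 maps to kernel priority 98..0 (lower is stronger). */
static int toy_rt_to_prio(int rt_priority)
{
	return TOY_MAX_RT_PRIO - 1 - rt_priority;
}
```

So the lowest nice value, -20, lands at 100, directly next to the real-time range.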
Bug:
34461621
Bug:
37293077
Change-Id: I12438deecb53df432da18c6fc77460768ae726d2
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Wed, 7 Jun 2017 00:04:42 +0000 (17:04 -0700)]
ANDROID: binder: add support for RT prio inheritance.
Adds support for SCHED_BATCH/SCHED_FIFO/SCHED_RR
priority inheritance.
Change-Id: I71f356e476be2933713a0ecfa2cc31aa141e2dc6
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Tue, 6 Jun 2017 22:17:46 +0000 (15:17 -0700)]
ANDROID: binder: push new transactions to waiting threads.
Instead of pushing new transactions to the process
waitqueue, select a thread that is waiting on proc
work to handle the transaction. This will make it
easier to improve priority inheritance in future
patches, by setting the priority before we wake up
a thread.
If we can't find a waiting thread, submit the work
to the proc waitqueue instead as we did previously.
Change-Id: I23cbfcca867bed7b86007e22137d0a8fad4b4001
Signed-off-by: Martijn Coenen <maco@google.com>
Martijn Coenen [Fri, 2 Jun 2017 18:15:44 +0000 (11:15 -0700)]
ANDROID: binder: remove proc waitqueue
Removes the process waitqueue, so that threads
can only wait on the thread waitqueue. Whenever
there is process work to do, pick a thread and
wake it up.
This also fixes an issue with using epoll(),
since we no longer have to block on different
waitqueues.
Bug:
34461621
Change-Id: I2950b9de6fa078ee72d53c667a03cbaf587f0849
Signed-off-by: Martijn Coenen <maco@google.com>
Todd Kjos [Mon, 14 Nov 2016 19:37:41 +0000 (11:37 -0800)]
FROMLIST: binder: remove global binder lock
(from https://patchwork.kernel.org/patch/
9817773/)
Remove global mutex and rely on fine-grained locking
Change-Id: Idd1ae2e52d654e5dd76d443a1ff97522e687fd4c
Signed-off-by: Todd Kjos <tkjos@google.com>
Martijn Coenen [Mon, 22 May 2017 18:26:23 +0000 (11:26 -0700)]
FROMLIST: binder: fix death race conditions
(from https://patchwork.kernel.org/patch/
9817765/)
A race existed where one thread could register
a death notification for a node, while another
thread was cleaning up that node and sending
out death notifications for its references,
causing simultaneous access to ref->death
because different locks were held.
Test: boots, manual testing
Change-Id: Iff73312f34f70374f417beba4c4c82dd33cac119
Signed-off-by: Martijn Coenen <maco@google.com>
Todd Kjos [Fri, 21 Apr 2017 21:32:11 +0000 (14:32 -0700)]
FROMLIST: binder: protect against stale pointers in print_binder_transaction
(from https://patchwork.kernel.org/patch/
9817761/)
When printing transactions there were several race conditions
that could cause a stale pointer to be dereferenced. Fixed by
reading the pointer once and using it if valid (which is
safe). The transaction buffer also needed protection via proc
lock, so it is only printed if we are holding the correct lock.
Bug:
36650912
Test: tested manually
Change-Id: I78240f99cc1a070d70a841c0d84d4306e2fd528d
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Thu, 20 Oct 2016 23:43:34 +0000 (16:43 -0700)]
FROMLIST: binder: protect binder_ref with outer lock
(from https://patchwork.kernel.org/patch/
9817771/)
Use proc->outer_lock to protect the binder_ref structure.
The outer lock allows functions operating on the binder_ref
to do nested acquires of node and inner locks as necessary
to attach refs to nodes atomically.
Binder refs must never be accessed without holding the
outer lock.
Change-Id: Icf6add0eddf70473b39239960b2d9a524775b53a
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Fri, 26 May 2017 00:35:02 +0000 (17:35 -0700)]
FROMLIST: binder: use inner lock to protect thread accounting
(from https://patchwork.kernel.org/patch/
9817763/)
Use the inner lock to protect thread accounting fields in
proc structure: max_threads, requested_threads,
requested_threads_started and ready_threads.
Change-Id: I5a17eb68812702f803d4e2806e7887de0b3af18e
Signed-off-by: Todd Kjos <tkjos@google.com>
Martijn Coenen [Fri, 2 Jun 2017 20:36:52 +0000 (13:36 -0700)]
FROMLIST: binder: protect transaction_stack with inner lock.
(from https://patchwork.kernel.org/patch/
9817779/)
This makes future changes to priority inheritance
easier, since we want to be able to look at a thread's
transaction stack when selecting a thread to inherit
priority for.
It also allows us to take just a single lock in a
few paths, where we used to take two in succession.
Change-Id: Idb1b6e9faa5c669978b2b3011fe326be8aece586
Signed-off-by: Martijn Coenen <maco@google.com>
Todd Kjos [Thu, 25 May 2017 22:52:17 +0000 (15:52 -0700)]
FROMLIST: binder: protect proc->threads with inner_lock
(from https://patchwork.kernel.org/patch/
9817775/)
proc->threads will need to be accessed with higher
locks of other processes held so use proc->inner_lock
to protect it. proc->tmp_ref now needs to be protected
by proc->inner_lock.
Change-Id: I176cfeca16bf7c9b34b428c16405f93db81d2ff8
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Mon, 12 Jun 2017 19:07:26 +0000 (12:07 -0700)]
FROMLIST: binder: protect proc->nodes with inner lock
(from https://patchwork.kernel.org/patch/
9817783/)
When locks for binder_ref handling are added, proc->nodes
will need to be modified while holding the outer lock.
Change-Id: I17b39e981c55130c14a62fe49900eceff6e3642b
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Thu, 8 Jun 2017 20:45:59 +0000 (13:45 -0700)]
FROMLIST: binder: add spinlock to protect binder_node
(from https://patchwork.kernel.org/patch/
9817769/)
node->node_lock is used to protect elements of node. No
need to acquire for fields that are invariant: debug_id,
ptr, cookie.
Change-Id: Ib7738e52fa7689767f17136e18cc05ff548b5717
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Thu, 20 Oct 2016 17:33:00 +0000 (10:33 -0700)]
FROMLIST: binder: add spinlocks to protect todo lists
(from https://patchwork.kernel.org/patch/
9817769/)
The todo lists in the proc, thread, and node structures
are accessed by other procs/threads to place work
items on the queue.
The todo lists are protected by the new proc->inner_lock.
No locks should ever be nested under these locks. As the
name suggests, an outer lock will be introduced in
a later patch.
Change-Id: I7720bacf5ebae4af177e22fcab0900d54c94c11a
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Tue, 21 Mar 2017 20:06:01 +0000 (13:06 -0700)]
FROMLIST: binder: use inner lock to sync work dq and node counts
(from https://patchwork.kernel.org/patch/
9817789/)
For correct behavior we need to hold the inner lock when
dequeuing and processing node work in binder_thread_read.
We now hold the inner lock when we enter the switch statement
and release it after processing anything that might be
affected by other threads.
We also need to hold the inner lock to protect the node
weak/strong ref tracking fields as long as node->proc
is non-NULL (if it is NULL then we are guaranteed that
we don't have any node work queued).
This means that other functions that manipulate these fields
must hold the inner lock. Refactored these functions to use
the inner lock.
Change-Id: I02c5cfdd3ab6dadea7f07f2a275faf3e27be77ad
Test: tested manually
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Mon, 29 May 2017 23:44:24 +0000 (16:44 -0700)]
FROMLIST: binder: introduce locking helper functions
(from https://patchwork.kernel.org/patch/
9817791/)
There are 3 main spinlocks which must be acquired in this
order:
1) proc->outer_lock : protects most fields of binder_proc,
binder_thread, and binder_ref structures. binder_proc_lock()
and binder_proc_unlock() are used to acq/rel.
2) node->lock : protects most fields of binder_node.
binder_node_lock() and binder_node_unlock() are
used to acq/rel
3) proc->inner_lock : protects the thread and node lists
(proc->threads, proc->nodes) and all todo lists associated
with the binder_proc (proc->todo, thread->todo,
proc->delivered_death and node->async_todo).
binder_inner_proc_lock() and binder_inner_proc_unlock()
are used to acq/rel
Any lock under procA must never be nested under any lock at the same
level or below on procB.
Functions that require a lock held on entry indicate which lock
in the suffix of the function name:
foo_olocked() : requires proc->outer_lock
foo_nlocked() : requires node->lock
foo_ilocked() : requires proc->inner_lock
foo_iolocked(): requires proc->outer_lock and proc->inner_lock
foo_nilocked(): requires node->lock and proc->inner_lock
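The acquisition order for the locks of a single proc can be modelled with a small user-space checker. This is purely an illustration: the enum, the struct and both functions are hypothetical, no real locking takes place, and the cross-proc rule is not modelled.

```c
/* Levels in the order they must be taken: outer, then node, then inner. */
enum toy_lock_level {
	TOY_LVL_OUTER = 1,	/* proc->outer_lock */
	TOY_LVL_NODE = 2,	/* node->lock */
	TOY_LVL_INNER = 3	/* proc->inner_lock */
};

/* Levels currently held by one thread, most recent last. */
struct toy_held {
	enum toy_lock_level stack[3];
	int depth;
};

/* Returns 0 if taking lvl respects the order, -1 if it would violate it. */
static int toy_order_acquire(struct toy_held *h, enum toy_lock_level lvl)
{
	if (h->depth > 0 && h->stack[h->depth - 1] >= lvl)
		return -1;
	h->stack[h->depth++] = lvl;
	return 0;
}

/* Drops the most recently taken level. */
static void toy_order_release(struct toy_held *h)
{
	if (h->depth > 0)
		h->depth--;
}
```

Taking outer, node and inner in that sequence succeeds; trying to take the node level while the inner level is held is reported as a violation.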
Change-Id: Ied42674486092a0e3bdde64356e45b2494844558
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Tue, 9 May 2017 18:08:05 +0000 (11:08 -0700)]
FROMLIST: binder: use node->tmp_refs to ensure node safety
(from https://patchwork.kernel.org/patch/
9817795/)
When obtaining a node via binder_get_node(),
binder_get_node_from_ref() or binder_new_node(),
increment node->tmp_refs to take a
temporary reference on the node to ensure the node
persists while being used. binder_put_node() must
be called to remove the temporary reference.
Change-Id: I962b39d5cd80b2d7e4786bb87236ede7914e2db7
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Mon, 8 May 2017 16:16:27 +0000 (09:16 -0700)]
FROMLIST: binder: refactor binder ref inc/dec for thread safety
(from https://patchwork.kernel.org/patch/
9817781/)
Once locks are added, binder_ref's will only be accessed
safely with the proc lock held. Refactor the inc/dec paths
to make them atomic with the binder_get_ref* paths and
node inc/dec. For example, instead of:
ref = binder_get_ref(proc, handle, strong);
...
binder_dec_ref(ref, strong);
we now have:
ret = binder_dec_ref_for_handle(proc, handle, strong, &rdata);
Since the actual ref is no longer exposed to callers, a
new struct binder_ref_data is introduced which can be used
to return a copy of ref state.
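The shape of such a combined lookup-and-decrement can be sketched in user space. Everything here is hypothetical (the toy_ types, the fixed-size table, the field names) and the locking is only indicated by comments; it merely shows why callers get a copy of the state instead of the ref itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_ref_data {
	uint32_t handle;
	int strong;
	int weak;
};

struct toy_ref {
	struct toy_ref_data data;
	bool in_use;
};

struct toy_proc {
	struct toy_ref refs[4];	/* real code would protect this with a lock */
};

/* Look up the handle and drop one count in a single step. */
static int toy_dec_ref_for_handle(struct toy_proc *proc, uint32_t handle,
				  bool strong, struct toy_ref_data *rdata)
{
	size_t i;

	/* The proc lock would be taken here ... */
	for (i = 0; i < 4; i++) {
		struct toy_ref *ref = &proc->refs[i];
		int *count;

		if (!ref->in_use || ref->data.handle != handle)
			continue;
		count = strong ? &ref->data.strong : &ref->data.weak;
		if (*count == 0)
			return -EINVAL;
		(*count)--;
		if (rdata)
			*rdata = ref->data;	/* copy out, never the ref */
		if (ref->data.strong == 0 && ref->data.weak == 0)
			ref->in_use = false;	/* ref goes away with last count */
		return 0;
	}
	/* ... and released on every return path. */
	return -EINVAL;
}
```

Because the caller never holds a pointer to the ref, the ref may be torn down inside the helper without leaving a stale pointer behind.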
Change-Id: I7de22107f8ebc967cee63251d584fceb4ea56250
Signed-off-by: Todd Kjos <tkjos@google.com>
Todd Kjos [Fri, 12 May 2017 21:42:55 +0000 (14:42 -0700)]
FROMLIST: binder: make sure accesses to proc/thread are safe
(from https://patchwork.kernel.org/patch/
9817787/)
binder_thread and binder_proc may be accessed by other
threads when processing a transaction. Therefore they
must be prevented from being freed while a transaction
is in progress that references them.
This is done by introducing a temporary reference
counter for threads and procs that indicates that the
object is in use and must not be freed. binder_thread_dec_tmpref()
and binder_proc_dec_tmpref() are used to decrement
the temporary reference.
It is safe to free a binder_thread if there
is no reference and it has been released
(indicated by thread->is_dead).
It is safe to free a binder_proc if it has no
remaining threads and no reference.
A spinlock is added to the binder_transaction
to safely access and set references for t->from
and for debug code to safely access t->to_thread
and t->to_proc.
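The freeing rule for a thread can be sketched as a single-threaded model. The toy_ names and the freed flag are assumptions made for illustration; the real code performs these steps under the appropriate locks and actually frees the object.

```c
#include <stdbool.h>

struct toy_thread {
	int tmp_ref;	/* temporary references held by in-flight users */
	bool is_dead;	/* set once the thread has been released */
	bool freed;	/* stands in for actually freeing the object */
};

/* Free only when released and no temporary reference remains. */
static void toy_thread_maybe_free(struct toy_thread *t)
{
	if (t->is_dead && t->tmp_ref == 0)
		t->freed = true;
}

static void toy_thread_inc_tmpref(struct toy_thread *t)
{
	t->tmp_ref++;
}

static void toy_thread_dec_tmpref(struct toy_thread *t)
{
	t->tmp_ref--;
	toy_thread_maybe_free(t);
}

static void toy_thread_release(struct toy_thread *t)
{
	t->is_dead = true;
	toy_thread_maybe_free(t);
}
```

If a transaction still holds a temporary reference when the thread is released, the free is deferred until the matching decrement.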
Change-Id: I0a00a0294c3e93aea8b3f141c6f18e77ad244078
Signed-off-by: Todd Kjos <tkjos@google.com>