git.stricted.de - GitHub/LineageOS/android_kernel_samsung

sched: Prevent recursion in io_schedule()

commit 9cff8adeaa34b5d2802f03f89803da57856b3b72 upstream.

io_schedule() calls blk_flush_plug() which, depending on the
contents of current->plug, can initiate arbitrary blk-io requests.

Note that this contrasts with blk_schedule_flush_plug() which requires
all non-trivial work to be handed off to a separate thread.

This makes it possible for io_schedule() to recurse, and initiating
block requests could possibly call mempool_alloc() which, in times of
memory pressure, uses io_schedule().

Apart from any stack usage issues, io_schedule() will not behave
correctly when called recursively as delayacct_blkio_start() does
not allow for repeated calls.

So:
- use ->in_iowait to detect recursion.  Set it earlier, and restore
   it to the old value.
- move the call to "raw_rq" after the call to blk_flush_plug().
   As this is some sort of per-cpu thing, we want some chance that
   we are on the right CPU
- When io_schedule() is called recurively, use blk_schedule_flush_plug()
   which cannot further recurse.
- as this makes io_schedule() a lot more complex and as io_schedule()
   must match io_schedule_timeout(), but all the changes in io_schedule_timeout()
   and make io_schedule a simple wrapper for that.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Moved the now rudimentary io_schedule() into sched.h. ]
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change-Id: Ic32d78863e35db89c3e8f0b29dcf20cb563dc70f

cpufreq: interactive: Use wake_up_process_no_notif to wake up tasks

Scheduler could send a notification to governor each time a task wakes
up. If governor wakes up another task as a response to such a
notification, it could result in endless recursive notifications.

Use wake_up_process_no_notif to ensure scheduler won't send another
notification for speedchange task woken up by the governor.

Change-Id: I697affcbdf79e2ad0cfe843eb880d304960682f4
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>

sched: Provide a wake up API without sending freq notifications

Each time a task wakes up, scheduler evaluates its load and notifies
governor if the resulting frequency of destination CPU is larger than
a threshold. However, some governor wakes up a separate task that
handles frequency change, which again calls wake_up_process().

This is dangerous because if the task being woken up meets the
threshold and ends up being moved around, there is a potential for
endless recursive notifications.

Introduce a new API for waking up a task without triggering
frequency notification.

Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>

arm64: rebuild configs to reflect CONFIG_UID_SYS_STATS change

cpufreq: interactive: use CPUFREQ_RELATION_C to choose closest frequency insted of lowest

cpufreq: interactive: don't skip waking up speedchange_task if target_freq > policy->cur

When __cpufreq_driver_target() in speedchange_task failed for some reason, the
policy->cur could be lower than the target_freq. The governor misses to change
the target_freq if the target_freq is equal to the next_freq at the next sample
time.

Added a check to prevent the CPU to stay at the speed that is lower than the
target_freq for long duration.

Change-Id: Ibfdcd193b8280390b8f8374a63218aa31267f310
Signed-off-by: Minsung Kim <ms925.kim@samsung.com>

cpufreq: Optimize cpufreq_frequency_table_verify()

cpufreq_frequency_table_verify() is rewritten here to make it more logical
and efficient.
- merge multiple lines for variable declarations together.
- quit early if any frequency between min/max is found.
- don't call cpufreq_verify_within_limits() in case any valid freq is
found as it is of no use.
- rename the count variable as found and change its type to boolean.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Used commit: https://github.com/Noxxxious/Zero/commit/a04a517aa911804b26f5970e49d48d35e1a62b24

Introduce CPUFREQ_RELATION_C
It selects the frequency with the minimum euclidean distance to
target. In case of equal distance between 2 frequencies, it will
select the greater freq.

Created by Stratos Karafotis <stratosk@semaphore.gr>

Original commit: https://github.com/XileForce/Vindicator-S6/commit/4b113d38e8f62ea5810d2b4d134cc6f7a81198dd

cpufreq: Move get_cpu_idle_time() to cpufreq.c

Governors other than ondemand and conservative can also use get_cpu_idle_time()
and they aren't required to compile cpufreq_governor.c. So, move these
independent routines to cpufreq.c instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This fixes compiling errors without Ondemand

Based on https://patchwork.kernel.org/patch/2582231/

cpufreq_stats: Fix stats leak during update policy

When the cpufreq policy is moved from one CPU to another, the percpu
stats_table is overwritten and leaked. Properly free the old stats table
and ensure its protected in the non-sysfs paths of update policy and
acct_update_power. The sysfs entries are removed in the cpufreq core
driver before migrating the policy. We introduce a new spinlock
specifically for these operations to avoid needed to convert all the
other spinlocks into the irq safe variants since acct_update_power is
typically called in ISR context.

[ported to apq8084-common by Corinna Vinschen <xda@vinschen.de>]

Change-Id: I95ff24c07834065cd0fd3c763a488a9843097a1d
Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com>
Reviewed-on: https://gerrit.mot.com/921752
SLTApproved: Slta Waiver <sltawvr@motorola.com>
SME-Granted: SME Approvals Granted
Reviewed-by: Igor Kovalenko <igork@motorola.com>

ANDROID: cpufreq_stats: Fix task time in state locking

The task->time_in_state pointer is written to at task creation
and exiting, protection is needed for concurrent reads e.g. during
sysfs accesses. Added spin lock such that the task's time_in_state
pointer used is set to either allocated memory or null.

Bug: 38463235
Test: Torture concurrent sysfs reads with short lived tasks
Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Iaa6402bf50a33489506f2170e4dfabe535d79e15

ANDROID: cpufreq: stats: add uid removal for uid_time_in_state

Bug: 62295304
Bug: 34133340
Test: Boot and test uid removal by writing to remove_uid_range

Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Ic51fc9480716a8aad88fb55549c2b69021038a11

Conflicts:
drivers/cpufreq/cpufreq_stats.c
include/linux/cpufreq.h
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

uid_sys_stats: defer io stats calulation for dead tasks

Store sum of dead task io stats in uid_entry and defer uid io
calulation until next uid proc stat change or dumpsys.

Bug: 37754877
Change-Id: I970f010a4c841c5ca26d0efc7e027414c3c952e0
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

uid_sys_stats: reduce update_io_stats overhead

Replaced read_lock with rcu_read_lock to reduce time that preemption
is disabled.

Added a function to update io stats for specific uid and moved
hash table lookup, user_namespace out of loops.

Bug: 37319300
Change-Id: I2b81b5cd3b6399b40d08c3c14b42cad044556970
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

uid_sys_stats: change to use rt_mutex

We see this happens multiple times in heavy workload in systrace
and AMS stuck in uid_lock.

Running process:        Process 953
Running thread: android.ui
State:  Uninterruptible Sleep
Start:
1,025.628 ms
Duration:
27,955.949 ms
On CPU:
Running instead:        system_server
Args:
{kernel callsite when blocked:: "uid_procstat_write+0xb8/0x144"}

Changing to rt_mutex can mitigate the priority inversion

Bug: 34991231
Bug: 34193533
Test: on marlin
Change-Id: I28eb3971331cea60b1075740c792ab87d103262c
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: account for fsync syscalls

Change-Id: Ie888d8a0f4ec7a27dea86dc4afba8e6fd4203488
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: fix negative write bytes.

A task can cancel writes made by other tasks. In rare cases,
cancelled_write_bytes is larger than write_bytes if the task
itself didn't make any write. This doesn't affect total size
but may cause confusion when looking at IO usage on individual
tasks.

Bug: 35851986
Change-Id: If6cb549aeef9e248e18d804293401bb2b91918ca
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: remove unnecessary code in procstat switch

No need to aggregate the switched uid separately since
update_io_stats_locked covers all uids.

Bug: 34198239
Change-Id: Ifed347264b910de02e3f3c8dec95d1a2dbde58c0
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: return full size when state is not changed.

Userspace keeps retrying when it sees nothing is written.

Bug: 34364961
Change-Id: Ie288c90c6a206fb863dcad010094fcd1373767aa
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: allow writing same state

Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34360629
Change-Id: Ia748351e07910b1febe54f0484ca1be58c4eb9c7
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_sys_stats: rename uid_cputime.c to uid_sys_stats.c

This module tracks cputime and io stats.

Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: I9ee7d9e915431e0bb714b36b5a2282e1fdcc7342

Conflicts:
drivers/misc/Kconfig
drivers/misc/Makefile
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: uid_cputime: add per-uid IO usage accounting

IO usages are accounted in foreground and background buckets.
For each uid, io usage is calculated in two steps.

delta = current total of all uid tasks - previus total
current bucket += delta

Bucket is determined by current uid stat. Userspace writes to
/proc/uid_procstat/set <uid> <stat> when uid stat is updated.

/proc/uid_io/stats shows IO usage in this format.
<uid> <foreground IO> <background IO>

Signed-off-by: Jin Qian <jinqian@google.com>
Bug: 34198239
Change-Id: I3369e59e063b1e5ee0dfe3804c711d93cb937c0c
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

uid_cputime: Check for the range while removing range of UIDs.

Checking if the uid_entry->uid matches the uid intended to be removed will
prevent deleting unwanted uid_entry.
Type cast the key for the hashtable to the same size, as when they were
inserted. This will make sure that we can find the uid_entry we want.

Bug: 25195548
Change-Id: I567942123cfb20e4b61ad624da19ec4cc84642c1
Signed-off: Ruchi kandoi <kandoiruchi@google.com>

uid_cputime: fix cputime overflow

Converting cputime_t to usec caused overflow when the value is greater
than 1 hour. Use msec and convert to unsigned long long to support bigger
range.

Bug: 22461683

Change-Id: I853fe3e8e7dbf0d3e2cc5c6f9688a5a6e1f1fb3e
Signed-off-by: Jin Qian <jinqian@google.com>
Git-commit: 2abf710d2849737b6a435f371395481da628f746
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>

uid_cputime: Iterates over all the threads instead of processes.

Bug: 22833116
Change-Id: I775a18f61bd2f4df2bec23d01bd49421d0969f87
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: 35ef14095795ea331361034a1f7087bdf07f76f7
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>

uid_cputime: Avoids double accounting of process stime, utime and cpu_power in task exit.

This avoids the race where a particular process is terminating and we read the
show_uid_stats. At this time since the task_struct still exists and we will account
for the terminating process as one of the active task, where as the stats would have
been added in the task exit callback.

When the task is terminated, the cpu_power for that particular task is added to the
terminated tasks. It is possible that before the task releases all the resources, cpu
reschedules the task or a timer interrupt is fired. At this point we will try to add
the additional time to the process, which will cause the accounting to be skewed. This
avoids that race condition.

Bug: 22064385
Change-Id: Id2ae04b33fcd230eda9683a41b6019d4dd8f5d85
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: 344377047daa5832ef798af697adee388e367d57
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>

uid_cputime: Extends the cputime functionality to report power per uid

/proc/uid_cputime/show_uid_stats shows a third field power for each of
the uids.It represents the power in the units (uAusec)

Bug: 21498425
Change-Id: I52fdc5e59647e9dc97561a26d56f462a2689ba9c
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: b8d3311e8d41c9109f9a4ba3c4d0f7e594539c68
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>

ANDROID: Fix cpufreq stats table creation

cpufreq stats does not correctly supports multiple cpus per profile.
For instance Marlin/Sailfish per cpu stats struct does not get created
for all cpus (only one per policy). This change does not provide full
support for multiple cpus per profile but allows stats creation per
cpu to allow b/34133340 to be completed.

Bug: 38244231
Bug: 34133340
Test: Boot Sailfish

Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: I72ea548a199f57ed841618b08b9c41e99b493376

Conflicts:
drivers/cpufreq/cpufreq_stats.c
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>

ANDROID: cpufreq_stat: add per task/uid/freq stats

Adds per process nodes in /proc/PID/time_in_state showing per
frequency times and adds a global /proc/uid_time_in_state
showing per frequency per uid times.

Bug: 34133340
Bug: 38320164

Tests: boot sailfish and reading /proc/uid_time_in_state and
/proc/$$/time_in_state

Signed-off-by: Andres Oportus <andresoportus@google.com>
Change-Id: Ideb22b608b9a5e7bd2200a3a6df0f110b635f96a

uid_cputime: Avoids double accounting of process stime, utime and cpu…
…_power in task exit.

This avoids the race where a particular process is terminating and we
read the show_uid_stats. At this time since the task_struct still exists
and we will account for the terminating process as one of the active
task, where as the stats would have been added in the task exit
callback.

When the task is terminated, the cpu_power for that particular task is
added to the terminated tasks. It is possible that before the task releases all
the resources, cpu reschedules the task or a timer interrupt is fired. At this
point we will try to add the additional time to the process, which will cause
the accounting to be skewed. This avoids that race condition.

Bug: 22064385
Change-Id: Id2ae04b33fcd230eda9683a41b6019d4dd8f5d85
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Git-commit: 344377047daa5832ef798af697adee388e367d57
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Change-Id: I405733725d535b0a864088516bf52fa3638ee6aa

Used commit: https://github.com/omnirom/android_kernel_lge_msm8992/commit/e36bbc4edde59683e980c816fb4a432fbbe594bb

sched: cpufreq: Adds a field cpu_power in the task_struct

cpu_power has been added to keep track of amount of power each task is
consuming. cpu_power is updated whenever stime and utime are updated for
a task. power is computed by taking into account the frequency at which
the current core was running and the current for cpu actively
running at hat frequency.

Change-Id: Ic535941e7b339aab5cae9081a34049daeb44b248
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Original commit: https://github.com/dianlujitao/CAF_kernel_msm-3.10/commit/85a6bd2bc4c903df43186e6f41209746aa6fdf05

cpufreq_stats: Adds the fucntionality to load current values for each frequency for all the cores.

The current values for the cpu cores needs to be added to the device
tree for this functionaly to work. It loads the current values for each
frequecy in uA for all the cores.

Change-Id: If03311aaeb3e4c09375dd0beb9ad4fbb254b5c08
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Original commit: https://github.com/dianlujitao/CAF_kernel_msm-3.10/commit/8d12562a74922eac859dcec9c43d34d8fd1a9fd1

Fix memory leaks when updating stats table

The address for an element in the per-cpu cpufreq_stats_table variable
is overwritten in cpufreq_stats_update_policy_cpu(), but the memory
allocated at the address that gets overwritten is not freed beforehand.

Free the allocated memory beforehand to fix the memory leaks.

Signed-off-by: Sultanxda <sultanxda@gmail.com>
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Signed-off-by: Luca Grifo <lg@linux.com>
Signed-off-by: djb77 <dwayne.bakewell@gmail.com>
Used commit: https://github.com/Siddhant-Naik/TheFlash-Kernel-A5-A7-2017/commit/98bd2052dfc9757832c49b7a8c68157b8baea13f

cpufreq: interactive governor drops bits in time calculation

Keep time calculation in 64-bit throughout. If we have long times
between idle calculations this can result in deltas > 32 bits
which causes incorrect load percentage calculations and selecting
the wrong frequencies if we truncate here.

Change-Id: Iac1e5646d58485737538edbb9e7a6d2246b56023
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Alex Naidis <alex.naidis@linux.com>

CHROMIUM: cpufreq: interactive: calculate load before freq change

The update to cpu load for cpufreq rate changes was happening after the rate
change instead of before. This switches it to update the cpu load before the
cpufreq rate change so the old frequency is used for calculating the speed
adjusted cpu load. The old frequency was used for that active time, so it should
be used for calculating the speed adjusted cpu load.

BUG=chrome-os-partner:37673
TEST=power_LoadTest on Jerry
check we spend most of our time at the lower cpu frequencies

Change-Id: I2fff785561ad8aebd354ddd305e91d96c799a617
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/262925
Reviewed-by: Douglas Anderson <dianders@chromium.org>

cpufreq: Correct the data reported in all_time_in_state

Commit bd9474e059bbb2bb62f7e93894cfc3d3ba473fb2 (cpufreq_stats:
Adds the fucntionality to load current values for each frequency
for all the cores) introduced a change by which
'cpufreq_allstats_create' gets called at initialization (from
'cpufreq_stats_init' instead of 'cpufreq_stat_notifier_policy').
This causes 'cpufreq_allstate_create' to be called before the
freq_table is allocated from 'create_all_freq_table'. Due to
this, the data for cpu's which are online at boot are not
added to the 'all_freq_table' leading to the incorrect
reporting of data when the below sysfs command is run -
'cat sys/devices/system/cpu/cpufreq/all_time_in_state'.

Correct this behaviour by altering the cpufreq_stats init
sequence by which the memory for 'all_freq_table' is allocated
before the 'cpufreq_allstats_create' function is called.

Change-Id: I2232dacdc0deec4d1987c418e833fe28f74623fc
Signed-off-by: Nirmal Abraham <nabrah@codeaurora.org>

cpufreq: interactive: Ramp up directly if cpu_load exceeds 100

When governor is using regular busy time tracking, cpu_load will
never exceed 100 because busy time will never exceed elapsed time in
any one sampling window. The only exception is when frequency is
reduced in middle of a window (e.g. due to thermal throttling). In
this case, cpu_load is likely irrelevant since current frequency
governor has been voting is already higher than what target can run
at.

However, on a heterogeneous CPU system with scheduler input enabled
to track the load of migrated tasks, cpu_load could also exceed 100
when a task migrates from more capable CPU to slower CPU. When this
happens, governor already knows the exact frequency required to handle
this load. There is no need to progressively ramp up frequency in order
to assess the load's real demand. It's not desirable to starve such a
migrating task by forcing it through ramping up process on the slower
CPU.

Direclty jump beyond hispeed_freq and ignore above_hispeed_delay if
cpu_load exceeds 100.

Change-Id: Ib87057e4f00732fad943ab595a33e3059494ef15
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>

cpufreq: interactive: fix permanent locking on high CPU frequency

This removed improper Samsung's changes

CHROMIUM: cpufreq: interactive: validate above_hispeed_delay

We rely on the frequencies in above_hispeed_delay being in ascending
order. Ensure that is the case for the values the user gives us.

BUG=chrome-os-partner:20830
TEST=Boot Pit; write invalid above_hispeed_delay value:
localhost ~ # echo "20000 1800000:20000 700000:0" > above_hispeed_delay
-bash: echo: write error: Invalid argument

Change-Id: Ifb431e58ad4b6b371152d7b09fcbefa127b0cbe2
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Reviewed-on: https://gerrit.chromium.org/gerrit/65742
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

cpufreq: interactive: BUG_ON when tunables are NULL after init

When tunables are not available for events other than
CPUFREQ_GOV_POLICY_INIT in cpufreq_governor_interactive(), trigger a
panic instead of throwing a warning.

When the original warning happens, some race condition must have
occurred, and governor will be in a bad state even if it might still
run for a while. Panic directly so that it's easier to catch the
first race event.

Change-Id: I2dc1185cabfe72a63739452731fe242924d2cf45
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>

cpufreq: interactive: Don't set floor_validate_time during boost

Frequency selection algorithm guarantees its chosen frequency
is not lower than hispeed_freq as long as boost is enabled.

Setting floor_freq and floor_validate_time during boost could block
CPU frequency from going below hispeed_freq even after
boostpulse_duration expires, if min_sample_time is higher than
boostpulse_duration. This conflicts with the intention of commit
de091367ead15b6e95dd1d0743a18f0da5a07ee5
(cpufreq: interactive: specify duration of CPU speed boost pulse)
to allow CPU to ramp down immediately after boost expires. It also
makes boost behavior inconsistent since it depends on min_sample_time.

Avoid setting floor_freq and floor_validate_time when boost starts.

Change-Id: I12852998af46cfbfaf8661eb5e8d5301b6f631e7
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Signed-off-by: Bálint Czobor <czoborbalint@gmail.com>

cpufreq: interactive: Use del_timer/add_timer_on to rearm timers

Replace mod_timer_pinned() with del_timer(), add_timer_on().
mod_timer_pinned() always adds timer onto current CPU. Interactive
governor expects each CPU's timers to be running on the same CPU.
If cpufreq_interactive_timer_resched() is called from another CPU,
the timer will be armed on the wrong CPU.

Replacing mod_timer_pinned() with del_timer() and add_timer_on()
guarantees timers are still run on the right CPU even if another
CPU reschedules the timer. This would provide more flexibility
for future changes.

Change-Id: I3a10be37632afc0ea4e0cc9c86323b9783b216b1
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>

net: core: To send ARP probe when neighbor state is NUD_STALE

Featurizing to send an ARP probe to the connected client when
the neighbor state moves to NUD_STALE. This triggers the
neighbor state to move back to NUD_REACHABLE if the ARP request
is resolved and prevents a RTM_DELNEIGH from being triggered

Change-Id: I27aba004a180dfbff5b1fcee2d04047c8523fb8a
Signed-off-by: Sridhar Ancha <sancha@codeaurora.org>

net: core: Send ARP probe and trigger RTM_NEWNEIGH

Send ARP probe and generate RTM_NEWNEIGH if the neighbor
state is not NUD_REACHABLE. Also featurize changes for
sending neighbor probe.

Change-Id: I633285b8e0cbcd49291d5e52136f11e20f2388bc
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>

neigh: Better handling of transition to NUD_PROBE state

[1] When entering NUD_PROBE state via neigh_update(), perhaps received
    from userspace, correctly (re)initialize the probes count to zero.

    This is useful for forcing revalidation of a neighbor (for example
    if the host is attempting to do DNA [IPv4 4436, IPv6 6059]).

[2] Notify listeners when a neighbor goes into NUD_PROBE state.

    By sending notifications on entry to NUD_PROBE state listeners get
    more timely warnings of imminent connectivity issues.

    The current notifications on entry to NUD_STALE have somewhat
    limited usefulness: NUD_STALE is a perfectly normal state, as is
    NUD_DELAY, whereas notifications on entry to NUD_FAILURE come after
    a neighbor reachability problem has been confirmed (typically after
    three probes).

Change-Id: I1d01d40ef3bc4753b0eaa79da2b27235425b1934
Signed-off-by: Erik Kline <ek@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git-commit: e4a6d6ba5a9e9e1796bbe6efe4f20ce7072df667
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>

net: core: To send ARP probe when neighbor state is NUD_STALE

Send an ARP probe to the connected client when the neighbor
state moves to NUD_STALE. This triggers the neighbor state
to move back to NUD_REACHABLE if the ARP request is resolved
and prevents a RTM_DELNEIGH from being triggered

Change-Id: I4d17c8f24d47931524904d0db74fa812a4f235f6
Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
Signed-off-by: Skylar Chang <chiaweic@codeaurora.org>

ANDROID: binder: don't queue async transactions to thread.

This can cause issues with processes using the poll()
interface:

1) client sends two oneway transactions
2) the second one gets queued on async_todo
   (because the server didn't handle the first one
    yet)
3) server returns from poll(), picks up the
   first transaction and does transaction work
4) server is done with the transaction, sends
   BC_FREE_BUFFER, and the second transaction gets
   moved to thread->todo
5) libbinder's handlePolledCommands() only handles
   the commands in the current data buffer, so
   doesn't see the new transaction
6) the server continues running and issues a new
   outgoing transaction. Now, it suddenly finds
   the incoming oneway transaction on its thread
   todo, and returns that to userspace.
7) userspace does not expect this to happen; it
   may be holding a lock while making the outgoing
   transaction, and if handling the incoming
   trasnaction requires taking the same lock,
   userspace will deadlock.

By queueing the async transaction to the proc
workqueue, we make sure it's only picked up when
a thread is ready for proc work.

Bug: 38201220
Bug: 63075553
Bug: 63079216

Change-Id: I84268cc112f735d7e3173793873dfdb4b268468b
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: don't enqueue death notifications to thread todo.

This allows userspace to request death notifications without
having to worry about getting an immediate callback on the same
thread; one scenario where this would be problematic is if the
death recipient handler grabs a lock that was already taken
earlier (eg as part of a nested transaction).

Bug: 23525545
Test: binderLibTest.DeathNotificationThread passes
Change-Id: I955e16306fe3110dacb9a391ffff1bf869249495
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: call poll_wait() unconditionally.

Because we're not guaranteed that subsequent calls
to poll() will have a poll_table_struct parameter
with _qproc set. When _qproc is not set, poll_wait()
is a noop, and we won't be woken up correctly.

Bug: 64552728
Change-Id: I5b904c9886b6b0994d1631a636f5c5e5f6327950
Test: binderLibTest stops hanging with new test
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: Don't BUG_ON(!spin_is_locked()).

Because is_spin_locked() always returns false on UP
systems.

Use assert_spin_locked() instead, and remove the
WARN_ON() instances, since those were easy to verify.

Bug: 64073116
Change-Id: I9080991c6d67e91928282a3ee64db23e50c7d66a
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: don't check prio permissions on restore.

Because we have disabled RT priority inheritance for
the regular binder domain, the following can happen:

1) thread A (prio 98) calls into thread B
2) because RT prio inheritance is disabled, thread B
   runs at the lowest nice (prio 100) instead
3) thread B calls back into A; A will run at prio 100
   for the duration of the transaction
4) When thread A is done with the call from B, we will
   try to restore the prio back to 98. But, we fail
   because the process doesn't hold CAP_SYS_NICE,
   neither is RLIMIT_RT_PRIO set.

While the proper fix going forward will be to
correctly apply CAP_SYS_NICE or RLIMIT_RT_PRIO,
for now it seems reasonable to not check permissions
on the restore path.

Change-Id: Ibede5960c9b7bb786271c001e405de50be64d944
Signed-off-by: Martijn Coenen <maco@android.com>

Add BINDER_GET_NODE_DEBUG_INFO ioctl

The BINDER_GET_NODE_DEBUG_INFO ioctl will return debug info on
a node. Each successive call reusing the previous return value
will return the next node. The data will be used by
libmemunreachable to mark the pointers with kernel references
as reachable.

Bug: 28275695
Change-Id: Idbbafa648a33822dc023862cd92b51a595cf7c1c
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: add RT inheritance flag to node.

Allows a binder node to specify whether it wants to
inherit real-time scheduling policy from a caller.

Change-Id: I375b6094bf441c19f19cba06d5a6be02cd07d714
Signed-off-by: Martijn Coenen <maco@android.com>

ANDROID: binder: improve priority inheritance.

By raising the priority of a thread selected for
a transaction *before* we wake it up.

Delay restoring the priority when doing a reply
until after we wake-up the process receiving
the reply.

Change-Id: Ic332e4e0ed7d2d3ca6ab1034da4629c9eadd3405
Signed-off-by: Martijn Coenen <maco@google.com>

ANDROID: binder: add min sched_policy to node.

This change adds flags to flat_binder_object.flags
to allow indicating a minimum scheduling policy for
the node. It also clarifies the valid value range
for the priority bits in the flags.

Internally, we use the priority map that the kernel
uses, e.g. [0..99] for real-time policies and [100..139]
for the SCHED_NORMAL/SCHED_BATCH policies.

Bug: 34461621
Bug: 37293077
Change-Id: I12438deecb53df432da18c6fc77460768ae726d2
Signed-off-by: Martijn Coenen <maco@google.com>

ANDROID: binder: add support for RT prio inheritance.

Adds support for SCHED_BATCH/SCHED_FIFO/SCHED_RR
priority inheritance.

Change-Id: I71f356e476be2933713a0ecfa2cc31aa141e2dc6
Signed-off-by: Martijn Coenen <maco@google.com>

ANDROID: binder: push new transactions to waiting threads.

Instead of pushing new transactions to the process
waitqueue, select a thread that is waiting on proc
work to handle the transaction. This will make it
easier to improve priority inheritance in future
patches, by setting the priority before we wake up
a thread.

If we can't find a waiting thread, submit the work
to the proc waitqueue instead as we did previously.

Change-Id: I23cbfcca867bed7b86007e22137d0a8fad4b4001
Signed-off-by: Martijn Coenen <maco@google.com>

ANDROID: binder: remove proc waitqueue

Removes the process waitqueue, so that threads
can only wait on the thread waitqueue. Whenever
there is process work to do, pick a thread and
wake it up.

This also fixes an issue with using epoll(),
since we no longer have to block on different
waitqueues.

Bug: 34461621
Change-Id: I2950b9de6fa078ee72d53c667a03cbaf587f0849
Signed-off-by: Martijn Coenen <maco@google.com>

FROMLIST: binder: remove global binder lock

(from https://patchwork.kernel.org/patch/9817773/)

Remove global mutex and rely on fine-grained locking

Change-Id: Idd1ae2e52d654e5dd76d443a1ff97522e687fd4c
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: fix death race conditions

(from https://patchwork.kernel.org/patch/9817765/)

A race existed where one thread could register
a death notification for a node, while another
thread was cleaning up that node and sending
out death notifications for its references,
causing simultaneous access to ref->death
because different locks were held.

Test: boots, manual testing
Change-Id: Iff73312f34f70374f417beba4c4c82dd33cac119
Signed-off-by: Martijn Coenen <maco@google.com>

FROMLIST: binder: protect against stale pointers in print_binder_transaction

(from https://patchwork.kernel.org/patch/9817761/)

When printing transactions there were several race conditions
that could cause a stale pointer to be deferenced. Fixed by
reading the pointer once and using it if valid (which is
safe). The transaction buffer also needed protection via proc
lock, so it is only printed if we are holding the correct lock.

Bug: 36650912
Test: tested manually
Change-Id: I78240f99cc1a070d70a841c0d84d4306e2fd528d
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: protect binder_ref with outer lock

(from https://patchwork.kernel.org/patch/9817771/)

Use proc->outer_lock to protect the binder_ref structure.
The outer lock allows functions operating on the binder_ref
to do nested acquires of node and inner locks as necessary
to attach refs to nodes atomically.

Binder refs must never be accesssed without holding the
outer lock.

Change-Id: Icf6add0eddf70473b39239960b2d9a524775b53a
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: use inner lock to protect thread accounting

(from https://patchwork.kernel.org/patch/9817763/)

Use the inner lock to protect thread accounting fields in
proc structure: max_threads, requested_threads,
requested_threads_started and ready_threads.

Change-Id: I5a17eb68812702f803d4e2806e7887de0b3af18e
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: protect transaction_stack with inner lock.

(from https://patchwork.kernel.org/patch/9817779/)

This makes future changes to priority inheritance
easier, since we want to be able to look at a thread's
transaction stack when selecting a thread to inherit
priority for.

It also allows us to take just a single lock in a
few paths, where we used to take two in succession.

Change-Id: Idb1b6e9faa5c669978b2b3011fe326be8aece586
Signed-off-by: Martijn Coenen <maco@google.com>

FROMLIST: binder: protect proc->threads with inner_lock

(from https://patchwork.kernel.org/patch/9817775/)

proc->threads will need to be accessed with higher
locks of other processes held so use proc->inner_lock
to protect it. proc->tmp_ref now needs to be protected
by proc->inner_lock.

Change-Id: I176cfeca16bf7c9b34b428c16405f93db81d2ff8
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: protect proc->nodes with inner lock

(from https://patchwork.kernel.org/patch/9817783/)

When locks for binder_ref handling are added, proc->nodes
will need to be modified while holding the outer lock

Change-Id: I17b39e981c55130c14a62fe49900eceff6e3642b
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: add spinlock to protect binder_node

(from https://patchwork.kernel.org/patch/9817769/)

node->node_lock is used to protect elements of node. No
need to acquire for fields that are invariant: debug_id,
ptr, cookie.

Change-Id: Ib7738e52fa7689767f17136e18cc05ff548b5717
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: add spinlocks to protect todo lists

(from https://patchwork.kernel.org/patch/9817769/)

The todo lists in the proc, thread, and node structures
are accessed by other procs/threads to place work
items on the queue.

The todo lists are protected by the new proc->inner_lock.
No locks should ever be nested under these locks. As the
name suggests, an outer lock will be introduced in
a later patch.

Change-Id: I7720bacf5ebae4af177e22fcab0900d54c94c11a
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: use inner lock to sync work dq and node counts

(from https://patchwork.kernel.org/patch/9817789/)

For correct behavior we need to hold the inner lock when
dequeuing and processing node work in binder_thread_read.
We now hold the inner lock when we enter the switch statement
and release it after processing anything that might be
affected by other threads.

We also need to hold the inner lock to protect the node
weak/strong ref tracking fields as long as node->proc
is non-NULL (if it is NULL then we are guaranteed that
we don't have any node work queued).

This means that other functions that manipulate these fields
must hold the inner lock. Refactored these functions to use
the inner lock.

Change-Id: I02c5cfdd3ab6dadea7f07f2a275faf3e27be77ad
Test: tested manually
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: introduce locking helper functions

(from https://patchwork.kernel.org/patch/9817791/)

There are 3 main spinlocks which must be acquired in this
order:
1) proc->outer_lock : protects most fields of binder_proc,
binder_thread, and binder_ref structures. binder_proc_lock()
and binder_proc_unlock() are used to acq/rel.
2) node->lock : protects most fields of binder_node.
binder_node_lock() and binder_node_unlock() are
used to acq/rel
3) proc->inner_lock : protects the thread and node lists
(proc->threads, proc->nodes) and all todo lists associated
with the binder_proc (proc->todo, thread->todo,
proc->delivered_death and node->async_todo).
binder_inner_proc_lock() and binder_inner_proc_unlock()
are used to acq/rel

Any lock under procA must never be nested under any lock at the same
level or below on procB.

Functions that require a lock held on entry indicate which lock
in the suffix of the function name:

foo_olocked() : requires node->outer_lock
foo_nlocked() : requires node->lock
foo_ilocked() : requires proc->inner_lock
foo_iolocked(): requires proc->outer_lock and proc->inner_lock
foo_nilocked(): requires node->lock and proc->inner_lock

Change-Id: Ied42674486092a0e3bdde64356e45b2494844558
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: use node->tmp_refs to ensure node safety

(from https://patchwork.kernel.org/patch/9817795/)

When obtaining a node via binder_get_node(),
binder_get_node_from_ref() or binder_new_node(),
increment node->tmp_refs to take a
temporary reference on the node to ensure the node
persists while being used. binder_put_node() must
be called to remove the temporary reference.

Change-Id: I962b39d5cd80b2d7e4786bb87236ede7914e2db7
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: refactor binder ref inc/dec for thread safety

(from https://patchwork.kernel.org/patch/9817781/)

Once locks are added, binder_ref's will only be accessed
safely with the proc lock held. Refactor the inc/dec paths
to make them atomic with the binder_get_ref* paths and
node inc/dec. For example, instead of:

  ref = binder_get_ref(proc, handle, strong);
  ...
  binder_dec_ref(ref, strong);

we now have:

  ret = binder_dec_ref_for_handle(proc, handle, strong, &rdata);

Since the actual ref is no longer exposed to callers, a
new struct binder_ref_data is introduced which can be used
to return a copy of ref state.

Change-Id: I7de22107f8ebc967cee63251d584fceb4ea56250
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: make sure accesses to proc/thread are safe

(from https://patchwork.kernel.org/patch/9817787/)

binder_thread and binder_proc may be accessed by other
threads when processing transaction. Therefore they
must be prevented from being freed while a transaction
is in progress that references them.

This is done by introducing a temporary reference
counter for threads and procs that indicates that the
object is in use and must not be freed. binder_thread_dec_tmpref()
and binder_proc_dec_tmpref() are used to decrement
the temporary reference.

It is safe to free a binder_thread if there
is no reference and it has been released
(indicated by thread->is_dead).

It is safe to free a binder_proc if it has no
remaining threads and no reference.

A spinlock is added to the binder_transaction
to safely access and set references for t->from
and for debug code to safely access t->to_thread
and t->to_proc.

Change-Id: I0a00a0294c3e93aea8b3f141c6f18e77ad244078
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: make sure target_node has strong ref

(from https://patchwork.kernel.org/patch/9817787/)

When initiating a transaction, the target_node must
have a strong ref on it. Then we take a second
strong ref to make sure the node survives until the
transaction is complete.

Change-Id: If7429cb43eda520ab89d45df6c19327cee97c60c
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: guarantee txn complete / errors delivered in-order

(from https://patchwork.kernel.org/patch/9817805/)

Since errors are tracked in the return_error/return_error2
fields of the binder_thread object and BR_TRANSACTION_COMPLETEs
can be tracked either in those fields or via the thread todo
work list, it is possible for errors to be reported ahead
of the associated txn complete.

Use the thread todo work list for errors to guarantee
order. Also changed binder_send_failed_reply to pop
the transaction even if it failed to send a reply.

Bug: 37218618
Test: tested manually
Change-Id: I196cfaeed09fdcd697f8ab25eea4e04241fdb08f
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: refactor binder_pop_transaction

(from https://lkml.org/lkml/2017/6/29/754)

binder_pop_transaction needs to be split into 2 pieces to
to allow the proc lock to be held on entry to dequeue the
transaction stack, but no lock when kfree'ing the transaction.

Split into binder_pop_transaction_locked and binder_free_transaction
(the actual locks are still to be added).

Change-Id: I848ae994cc27b3cd083cff2dbd1071762784f4a3
Test: tested manually
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: use atomic for transaction_log index

(from https://patchwork.kernel.org/patch/9817807/)

The log->next index for the transaction log was
not protected when incremented. This led to a
case where log->next++ resulted in an index
larger than ARRAY_SIZE(log->entry) and eventually
a bad access to memory.

Fixed by making the log index an atomic64 and
converting to an array by using "% ARRAY_SIZE(log->entry)"

Also added "complete" field to the log entry which is
written last to tell the print code whether the
entry is complete

Bug: 62038227
Test: tested manually
Change-Id: I1bb1c1a332a6ac458a626f5bedd05022b56b91f2
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: protect against two threads freeing buffer

(from https://patchwork.kernel.org/patch/9817815/)

Adds protection against malicious user code freeing
the same buffer at the same time which could cause
a crash. Cannot happen under normal use.

Bug: 36650912
Change-Id: I43e078cbf31c0789aaff5ceaf8f1a94c75f79d45
Test: tested manually
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: remove dead code in binder_get_ref_for_node

(from https://patchwork.kernel.org/patch/9817819/)

node is always non-NULL in binder_get_ref_for_node so the
conditional and else clause are not needed

Change-Id: I23f011ba59e1869d9577e6bf28e1f1dd38f45713
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: don't modify thread->looper from other threads

(from https://patchwork.kernel.org/patch/9817799/)

The looper member of struct binder_thread is a bitmask
of control bits. All of the existing bits are modified
by the affected thread except for BINDER_LOOPER_STATE_NEED_RETURN
which can be modified in binder_deferred_flush() by
another thread.

To avoid adding a spinlock around all read-mod-writes to
modify a bit, the BINDER_LOOPER_STATE_NEED_RETURN flag
is replaced by a separate field in struct binder_thread.

Bug: 33250092 32225111
Change-Id: Ia4cefbdbd683c6cb17c323ba7d278de5f2ca0745
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: avoid race conditions when enqueuing txn

(from https://patchwork.kernel.org/patch/9817813/)

Currently, the transaction complete work item is queued
after the transaction. This means that it is possible
for the transaction to be handled and a reply to be
enqueued in the current thread before the transaction
complete is enqueued, which violates the protocol
with userspace who may not expect the transaction
complete. Fixed by always enqueing the transaction
complete first.

Also, once the transaction is enqueued, it is unsafe
to access since it might be freed. Currently,
t->flags is accessed to determine whether a sync
wake is needed. Changed to access tr->flags
instead.

Change-Id: I6c01566e167a39cf17c9027c3817618182e56975
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: refactor queue management in binder_thread_read

(from https://patchwork.kernel.org/patch/9817757/)

In binder_thread_read, the BINDER_WORK_NODE command is used
to communicate the references on the node to userspace. It
can take a couple of iterations in the loop to construct
the list of commands for user space. When locking is added,
the lock would need to be release on each iteration which
means the state could change. The work item is not dequeued
during this process which prevents a simpler queue management
that can just dequeue up front and handle the work item.

Fixed by changing the BINDER_WORK_NODE algorithm in
binder_thread_read to determine which commands to send
to userspace atomically in 1 pass so it stays consistent
with the kernel view.

The work item is now dequeued immediately since only
1 pass is needed.

Change-Id: I9b4109997b2d53ba661867b14d7336cd076be06d
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: add log information for binder transaction failures

(from https://patchwork.kernel.org/patch/9817751/)

Add additional information to determine the cause of binder
failures. Adds the following to failed transaction log and
kernel messages:
return_error : value returned for transaction
return_error_param : errno returned by binder allocator
return_error_line : line number where error detected

Also, return BR_DEAD_REPLY if an allocation error indicates
a dead proc (-ESRCH)

Bug: 36406078
Change-Id: Ifc8881fa5adfcced3f2d67f9030fbd3efa3e2cab
Test: tested manually
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: make binder_last_id an atomic

(from https://patchwork.kernel.org/patch/9817809/)

Change-Id: I12a505091d377ca9034861317b7e68c2e75f7256
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: change binder_stats to atomics

(from https://patchwork.kernel.org/patch/9817755/)

Use atomics for stats to avoid needing to lock for
increments/decrements

Bug: 33250092 32225111
Change-Id: I13e69b7f0485ccf16673e25091455781e1933a98
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: add protection for non-perf cases

(from https://patchwork.kernel.org/patch/9817749/)

Add binder_dead_nodes_lock, binder_procs_lock, and
binder_context_mgr_node_lock to protect the associated global lists

Bug: 33250092 32225111
Change-Id: I9d1158536783763e0fa93b18e19fba2c488d8cfc
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: remove binder_debug_no_lock mechanism

(from https://patchwork.kernel.org/patch/9817811/)

With the global lock, there was a mechanism to acceess
binder driver debugging information with the global
lock disabled to debug deadlocks or other issues.
This mechanism is rarely (if ever) used anymore
and wasn't needed during the development of
fine-grained locking in the binder driver.
Removing it.

Change-Id: Ie1cf45748cefa433607a97c2ef322e7906efe0f7
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: move binder_alloc to separate file

(from https://patchwork.kernel.org/patch/9817753/)

Move the binder allocator functionality to its own file

Continuation of splitting the binder allocator from the binder
driver. Split binder_alloc functions from normal binder functions.

Add kernel doc comments to functions declared extern in
binder_alloc.h

Change-Id: I8f1a967375359078b8e63c7b6b88a752c374a64a
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: separate out binder_alloc functions

(from https://patchwork.kernel.org/patch/9817753/)

Continuation of splitting the binder allocator from the binder
driver. Separate binder_alloc functions from normal binder
functions. Protect the allocator with a separate mutex.

Bug: 33250092 32225111
Change-Id: I634637415aa03c74145159d84f1749bbe785a8f3
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: remove unneeded cleanup code

(from https://patchwork.kernel.org/patch/9817817/)

The buffer's transaction has already been freed before
binder_deferred_release. No need to do it again.

Change-Id: I412709c23879c5e45a6c7304a588bba0a5223fc3
Signed-off-by: Todd Kjos <tkjos@google.com>

FROMLIST: binder: separate binder allocator structure from binder proc

(from https://patchwork.kernel.org/patch/9817745/)

The binder allocator is logically separate from the rest
of the binder drivers. Separating the data structures
to prepare for splitting into separate file with separate
locking.

Change-Id: I5ca4df567ac4b8c8d6ee2116ae67c0c1d75c9747
Signed-off-by: Todd Kjos <tkjos@google.com>

Revert "android: binder: move global binder state into context struct."

This reverts commit 9f28e23f6e38e2cf55704fbbae18f44b04aeb64e.

Revert "CHROMIUM: android: binder: Fix potential scheduling-while-atomic"

This reverts commit 0e13ca5f2efc20fb5edf1985f0fe724f037923f5.

Revert "android: binder: use copy_from_user_preempt_disabled"

This reverts commit 10cfee25e9e1509d2b34a73435356db05969256a.

Revert "android: binder: Disable preemption while holding the global binder lock."

This reverts commit 6fd130fe5ba15d200251d27aabab0bf4c94d8f13.

Revert "binder: blacklist %p kptr_restrict"

This reverts commit 1bbb3f5143779897496587ff2895f9ae98a808c1.

Revert "binder: NULL pointer reference"

This reverts commit 354ac4b46f48765cd0e88ae750e810dec33a7458.

Revert "binder: prevent kptr leak by using %pK format specifier"

This reverts commit f9ddba81a5ecaf920d61d4d333300ade97614c7d.

kernel: make READ_ONCE() valid on const arguments

[ Upstream commit dd36929720f40f17685e841ae0d4c581c165ea60 ]

The use of READ_ONCE() causes lots of warnings witht he pending paravirt
spinlock fixes, because those ends up having passing a member to a
'const' structure to READ_ONCE().

There should certainly be nothing wrong with using READ_ONCE() with a
const source, but the helper function __read_once_size() would cause
warnings because it would drop the 'const' qualifier, but also because
the destination would be marked 'const' too due to the use of 'typeof'.

Use a union of types in READ_ONCE() to avoid this issue.

Also make sure to use parenthesis around the macro arguments to avoid
possible operator precedence issues.

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)

[ Upstream commit 43239cbe79fc369f5d2160bd7f69e28b5c50a58c ]

Feedback has shown that WRITE_ONCE(x, val) is easier to use than
ASSIGN_ONCE(val,x).
There are no in-tree users yet, so lets change it for 3.19.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>