From ef9823939e5acd5d323ff61fbc427ef998dd203e Mon Sep 17 00:00:00 2001
From: Guillaume Morin
Date: Mon, 7 Apr 2014 15:38:31 -0700
Subject: [PATCH] kernel/exit.c: call proc_exit_connector() after exit_state is set

The process events connector delivers a notification when a process
exits.  This is really convenient for a process that spawns and wants to
monitor its children through an epoll-able() interface.

Unfortunately, there is a small window between when the event is
delivered and the child becomes wait()-able.

This creates a race if the parent wants to make sure that it knows about
the exit, e.g.

	pid_t pid = fork();
	if (pid > 0) {
		register_interest_for_pid(pid);
		if (waitpid(pid, NULL, WNOHANG) > 0) {
			/* We might have raced with exit() */
		}
		return;
	}

	/* Child */
	execve(...)

register_interest_for_pid() would be telling the connector socket reader
to pay attention to events related to pid.

Though this is not a bug, I think it would make the connector a bit more
usable if this race was closed by simply moving the call to
proc_exit_connector() from just before exit_notify() to right after.

Oleg said:

: Even with this patch the code above is still "racy" if the child is
: multi-threaded.  Plus it should obviously filter-out subthreads.  And
: afaics there is no way to make it reliable, even if you change the code
: above so that waitpid() is called only after the last thread exits WNOHANG
: still can fail.

Signed-off-by: Guillaume Morin
Cc: Matt Helsley
Cc: Oleg Nesterov
Cc: David S. Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 171c9a9d7b00..decf648574f6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -802,13 +802,13 @@ void do_exit(long code)
 
 	module_put(task_thread_info(tsk)->exec_domain->module);
 
-	proc_exit_connector(tsk);
 	/*
 	 * FIXME: do that only when needed, using sched_exit tracepoint
 	 */
 	flush_ptrace_hw_breakpoint(tsk);
 
 	exit_notify(tsk, group_dead);
+	proc_exit_connector(tsk);
 #ifdef CONFIG_NUMA
 	task_lock(tsk);
 	mpol_put(tsk->mempolicy);
-- 
2.20.1