+++ /dev/null
-The execve system call can grant a newly-started program privileges that
-its parent did not have. The most obvious examples are setuid/setgid
-programs and file capabilities. To prevent the parent program from
-gaining these privileges as well, the kernel and user code must be
-careful to prevent the parent from doing anything that could subvert the
-child. For example:
-
- - The dynamic loader handles LD_* environment variables differently if
- a program is setuid.
-
- - chroot is disallowed to unprivileged processes, since it would allow
- /etc/passwd to be replaced from the point of view of a process that
- inherited chroot.
-
- - The exec code has special handling for ptrace.
-
-These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a
-new, generic mechanism to make it safe for a process to modify its
-execution environment in a manner that persists across execve. Any task
-can set no_new_privs. Once the bit is set, it is inherited across fork,
-clone, and execve and cannot be unset. With no_new_privs set, execve
-promises not to grant the privilege to do anything that could not have
-been done without the execve call. For example, the setuid and setgid
-bits will no longer change the uid or gid; file capabilities will not
-add to the permitted set, and LSMs will not relax constraints after
-execve.
-
-To set no_new_privs, use prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0).
-
-Be careful, though: LSMs might also not tighten constraints on exec
-in no_new_privs mode. (This means that setting up a general-purpose
-service launcher to set no_new_privs before execing daemons may
-interfere with LSM-based sandboxing.)
-
-Note that no_new_privs does not prevent privilege changes that do not
-involve execve. An appropriately privileged task can still call
-setuid(2) and receive SCM_RIGHTS datagrams.
-
-There are two main use cases for no_new_privs so far:
-
- - Filters installed for the seccomp mode 2 sandbox persist across
- execve and can change the behavior of newly-executed programs.
- Unprivileged users are therefore only allowed to install such filters
- if no_new_privs is set.
-
- - By itself, no_new_privs can be used to reduce the attack surface
- available to an unprivileged user. If everything running with a
- given uid has no_new_privs set, then that uid will be unable to
- escalate its privileges by directly attacking setuid, setgid, and
- fcap-using binaries; it will need to compromise something without the
- no_new_privs bit set first.
-
-In the future, other potentially dangerous kernel features could become
-available to unprivileged tasks if no_new_privs is set. In principle,
-several options to unshare(2) and clone(2) would be safe when
-no_new_privs is set, and no_new_privs + chroot is considerable less
-dangerous than chroot by itself.
--- /dev/null
+======================
+No New Privileges Flag
+======================
+
+The execve system call can grant a newly-started program privileges that
+its parent did not have. The most obvious examples are setuid/setgid
+programs and file capabilities. To prevent the parent program from
+gaining these privileges as well, the kernel and user code must be
+careful to prevent the parent from doing anything that could subvert the
+child. For example:
+
+ - The dynamic loader handles ``LD_*`` environment variables differently if
+   a program is setuid.
+
+ - chroot is disallowed to unprivileged processes, since it would allow
+   ``/etc/passwd`` to be replaced from the point of view of a process that
+   inherited chroot.
+
+ - The exec code has special handling for ptrace.
+
+These are all ad-hoc fixes. The ``no_new_privs`` bit (since Linux 3.5) is a
+new, generic mechanism to make it safe for a process to modify its
+execution environment in a manner that persists across execve. Any task
+can set ``no_new_privs``. Once the bit is set, it is inherited across fork,
+clone, and execve and cannot be unset. With ``no_new_privs`` set, ``execve()``
+promises not to grant the privilege to do anything that could not have
+been done without the execve call. For example, the setuid and setgid
+bits will no longer change the uid or gid; file capabilities will not
+add to the permitted set, and LSMs will not relax constraints after
+execve.
+
+To set ``no_new_privs``, use::
+
+ prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+
+Be careful, though: LSMs might also not tighten constraints on exec
+in ``no_new_privs`` mode. (This means that setting up a general-purpose
+service launcher to set ``no_new_privs`` before execing daemons may
+interfere with LSM-based sandboxing.)
+
+Note that ``no_new_privs`` does not prevent privilege changes that do not
+involve ``execve()``. An appropriately privileged task can still call
+``setuid(2)`` and receive SCM_RIGHTS datagrams.
+
+There are two main use cases for ``no_new_privs`` so far:
+
+ - Filters installed for the seccomp mode 2 sandbox persist across
+   execve and can change the behavior of newly-executed programs.
+   Unprivileged users are therefore only allowed to install such filters
+   if ``no_new_privs`` is set.
+
+ - By itself, ``no_new_privs`` can be used to reduce the attack surface
+   available to an unprivileged user. If everything running with a
+   given uid has ``no_new_privs`` set, then that uid will be unable to
+   escalate its privileges by directly attacking setuid, setgid, and
+   fcap-using binaries; it will need to compromise something without the
+   ``no_new_privs`` bit set first.
+
+In the future, other potentially dangerous kernel features could become
+available to unprivileged tasks if ``no_new_privs`` is set. In principle,
+several options to ``unshare(2)`` and ``clone(2)`` would be safe when
+``no_new_privs`` is set, and ``no_new_privs`` + ``chroot`` is considerably less
+dangerous than chroot by itself.
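
As an illustration of the ``prctl()`` call described in the document, the following is a minimal sketch of setting the bit and reading it back. The helper names ``enable_no_new_privs()`` and ``no_new_privs_is_set()`` are hypothetical and exist only for this example. ``PR_GET_NO_NEW_PRIVS`` is not mentioned in the text above; it is the companion query operation, and the fallback ``#define`` values are assumed to match the kernel's uapi header for toolchains whose headers predate Linux 3.5.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for old userspace headers (assumed to match linux/prctl.h). */
#ifndef PR_SET_NO_NEW_PRIVS
#define PR_SET_NO_NEW_PRIVS 38
#endif
#ifndef PR_GET_NO_NEW_PRIVS
#define PR_GET_NO_NEW_PRIVS 39
#endif

/*
 * Hypothetical helper: set no_new_privs on the calling thread.
 * Returns 0 on success, -1 on failure (for example, on a kernel
 * that does not know this prctl operation).
 */
int enable_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
		fprintf(stderr, "PR_SET_NO_NEW_PRIVS: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}

/*
 * Hypothetical helper: query the bit for the calling thread.
 * Returns 1 if set, 0 if clear, -1 on error.
 */
int no_new_privs_is_set(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

A launcher would typically call ``enable_no_new_privs()`` before installing a seccomp filter or calling ``execve()``. Because the bit cannot be unset, setting it a second time is harmless, so the helper can be called unconditionally.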