Eric W. Biederman [Wed, 8 Feb 2012 02:52:57 +0000 (18:52 -0800)]
userns: Convert vfs posix_acl support to use kuids and kgids
- In setxattr if we are setting a posix acl convert uids and gids from
the current user namespace into the initial user namespace, before
the xattrs are passed to the underlying filesystem.
Untranslatable uids and gids are represented as -1 which
posix_acl_from_xattr will represent as INVALID_UID or INVALID_GID.
posix_acl_valid will fail if an acl from userspace has any
INVALID_UID or INVALID_GID values. In net this guarantees that
untranslatable posix acls will not be stored by filesystems.
- In getxattr if we are reading a posix acl convert uids and gids from
the initial user namespace into the current user namespace.
Uids and gids that can not be tranlsated into the current user namespace
will be represented as -1.
- Replace e_id in struct posix_acl_entry with an anymouns union of
e_uid and e_gid. For the short term retain the e_id field
until all of the users are converted.
- Don't set struct posix_acl.e_id in the cases where the acl type
does not use e_id. Greatly reducing the use of ACL_UNDEFINED_ID.
- Rework the ordering checks in posix_acl_valid so that I use kuid_t
and kgid_t types throughout the code, and so that I don't need
arithmetic on uid and gid types.
Cc: Theodore Tso <tytso@mit.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Tue, 13 Mar 2012 23:02:19 +0000 (16:02 -0700)]
userns: Teach trace to use from_kuid
- When tracing capture the kuid.
- When displaying the data to user space convert the kuid into the
user namespace of the process that opened the report file.
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:54:50 +0000 (16:54 -0800)]
userns: Convert bsd process accounting to use kuid and kgid where appropriate
BSD process accounting conveniently passes the file the accounting
records will be written into to do_acct_process. The file credentials
captured the user namespace of the opener of the file. Use the file
credentials to format the uid and the gid of the current process into
the user namespace of the user that started the bsd process
accounting.
Cc: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 01:56:49 +0000 (17:56 -0800)]
userns: Convert taskstats to handle the user and pid namespaces.
- Explicitly limit exit task stat broadcast to the initial user and
pid namespaces, as it is already limited to the initial network
namespace.
- For broadcast task stats explicitly generate all of the idenitiers
in terms of the initial user namespace and the initial pid
namespace.
- For request stats report them in terms of the current user namespace
and the current pid namespace. Netlink messages are delivered
syncrhonously to the kernel allowing us to get the user namespace
and the pid namespace from the current task.
- Pass the namespaces for representing pids and uids and gids
into bacct_add_task.
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:53:48 +0000 (16:53 -0800)]
userns: Convert audit to work with user namespaces enabled
- Explicitly format uids gids in audit messges in the initial user
namespace. This is safe because auditd is restrected to be in
the initial user namespace.
- Convert audit_sig_uid into a kuid_t.
- Enable building the audit code and user namespaces at the same time.
The net result is that the audit subsystem now uses kuid_t and kgid_t whenever
possible making it almost impossible to confuse a raw uid_t with a kuid_t
preventing bugs.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 05:39:43 +0000 (22:39 -0700)]
userns: Convert the audit loginuid to be a kuid
Always store audit loginuids in type kuid_t.
Print loginuids by converting them into uids in the appropriate user
namespace, and then printing the resulting uid.
Modify audit_get_loginuid to return a kuid_t.
Modify audit_set_loginuid to take a kuid_t.
Modify /proc/<pid>/loginuid on read to convert the loginuid into the
user namespace of the opener of the file.
Modify /proc/<pid>/loginud on write to convert the loginuid
rom the user namespace of the opener of the file.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Paul Moore <paul@paul-moore.com> ?
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 09:18:08 +0000 (02:18 -0700)]
audit: Add typespecific uid and gid comparators
The audit filter code guarantees that uid are always compared with
uids and gids are always compared with gids, as the comparason
operations are type specific. Take advantage of this proper to define
audit_uid_comparator and audit_gid_comparator which use the type safe
comparasons from uidgid.h.
Build on audit_uid_comparator and audit_gid_comparator and replace
audit_compare_id with audit_compare_uid and audit_compare_gid. This
is one of those odd cases where being type safe and duplicating code
leads to simpler shorter and more concise code.
Don't allow bitmask operations in uid and gid comparisons in
audit_data_to_entry. Bitmask operations are already denined in
audit_rule_to_entry.
Convert constants in audit_rule_to_entry and audit_data_to_entry into
kuids and kgids when appropriate.
Convert the uid and gid field in struct audit_names to be of type
kuid_t and kgid_t respectively, so that the new uid and gid comparators
can be applied in a type safe manner.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 07:24:49 +0000 (00:24 -0700)]
audit: Don't pass pid or uid to audit_log_common_recv_msg
The only place we use the uid and the pid that we calculate in
audit_receive_msg is in audit_log_common_recv_msg so move the
calculation of these values into the audit_log_common_recv_msg.
Simplify the calcuation of the current pid and uid by
reading them from current instead of reading them from
NETLINK_CREDS.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 07:19:06 +0000 (00:19 -0700)]
audit: Remove the unused uid parameter from audit_receive_filter
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 07:12:29 +0000 (00:12 -0700)]
audit: Properly set the origin port id of audit messages.
For user generated audit messages set the portid field in the netlink
header to the netlink port where the user generated audit message came
from. Reporting the process id in a port id field was just nonsense.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 06:43:14 +0000 (23:43 -0700)]
audit: Simply AUDIT_TTY_SET and AUDIT_TTY_GET
Use current instead of looking up the current up the current task by
process identifier. Netlink requests are processed in trhe context of
the sending task so this is safe.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 06:31:17 +0000 (23:31 -0700)]
audit: kill audit_prepare_user_tty
Now that netlink messages are processed in the context of the sender
tty_audit_push_task can be called directly and audit_prepare_user_tty
which only added looking up the task of the tty by process id is
not needed.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 06:10:16 +0000 (23:10 -0700)]
audit: Use current instead of NETLINK_CREDS() in audit_filter
Get caller process uid and gid and pid values from the current task
instead of the NETLINK_CB. This is simpler than passing NETLINK_CREDS
from from audit_receive_msg to audit_filter_user_rules and avoid the
chance of being hit by the occassional bugs in netlink uid/gid
credential passing. This is a safe changes because all netlink
requests are processed in the task of the sending process.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Tue, 11 Sep 2012 06:20:20 +0000 (23:20 -0700)]
audit: Limit audit requests to processes in the initial pid and user namespaces.
This allows the code to safely make the assumption that all of the
uids gids and pids that need to be send in audit messages are in the
initial namespaces.
If someone cares we may lift this restriction someday but start with
limiting access so at least the code is always correct.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 22:37:54 +0000 (16:37 -0600)]
userns: net: Call key_alloc with GLOBAL_ROOT_UID, GLOBAL_ROOT_GID instead of 0, 0
In net/dns_resolver/dns_key.c and net/rxrpc/ar-key.c make them
work with user namespaces enabled where key_alloc takes kuids and kgids.
Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID instead of bare 0's.
Cc: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: linux-afs@lists.infradead.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 15:53:04 +0000 (07:53 -0800)]
userns: Convert security/keys to the new userns infrastructure
- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
keys in the user namespace of the opener of key status proc files.
Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:47:26 +0000 (16:47 -0800)]
userns: Convert drm to use kuid and kgid and struct pid where appropriate
Blink Blink this had not been converted to use struct pid ages ago?
- On drm open capture the openers kuid and struct pid.
- On drm close release the kuid and struct pid
- When reporting the uid and pid convert the kuid and struct pid
into values in the appropriate namespace.
Cc: dri-devel@lists.freedesktop.org
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:54:11 +0000 (16:54 -0800)]
userns: Convert ipc to use kuid and kgid where appropriate
- Store the ipc owner and creator with a kuid
- Store the ipc group and the crators group with a kgid.
- Add error handling to ipc_update_perms, allowing it to
fail if the uids and gids can not be converted to kuids
or kgids.
- Modify the proc files to display the ipc creator and
owner in the user namespace of the opener of the proc file.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:48:16 +0000 (16:48 -0800)]
userns: Convert process event connector to handle kuids and kgids
- Only allow asking for events from the initial user and pid namespace,
where we generate the events in.
- Convert kuids and kgids into the initial user namespace to report
them via the process event connector.
Cc: David Miller <davem@davemloft.net>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Tue, 3 Apr 2012 21:01:31 +0000 (14:01 -0700)]
userns: Convert debugfs to use kuid/kgid where appropriate.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Tue, 24 Apr 2012 00:06:34 +0000 (17:06 -0700)]
userns: Make credential debugging user namespace safe.
Cc: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 16 Aug 2012 00:46:22 +0000 (17:46 -0700)]
userns: Enable building of pf_key sockets when user namespace support is enabled.
Enable building of pf_key sockets and user namespace support at the
same time. This combination builds successfully so there is no reason
to forbid it.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Dan Carpenter [Thu, 16 Aug 2012 13:15:02 +0000 (16:15 +0300)]
ipv6: move dereference after check in fl_free()
There is a dereference before checking for NULL bug here. Generally
free() functions should accept NULL pointers. For example, fl_create()
can pass a NULL pointer to fl_free() on the error path.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:48:55 +0000 (16:48 -0800)]
userns: Convert tun/tap to use kuid and kgid where appropriate
Cc: Maxim Krasnyansky <maxk@qualcomm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 10 Feb 2012 22:01:03 +0000 (14:01 -0800)]
userns: Make the airo wireless driver use kuids for proc uids and gids
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: John W. Linville <linville@tuxdriver.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 3 Feb 2012 01:33:59 +0000 (17:33 -0800)]
userns: xt_owner: Add basic user namespace support.
- Only allow adding matches from the initial user namespace
- Add the appropriate conversion functions to handle matches
against sockets in other user namespaces.
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 22:26:52 +0000 (16:26 -0600)]
userns xt_recent: Specify the owner/group of ip_list_perms in the initial user namespace
xt_recent creates a bunch of proc files and initializes their uid
and gids to the values of ip_list_uid and ip_list_gid. When
initialize those proc files convert those values to kuids so they
can continue to reside on the /proc inode.
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jan Engelhardt <jengelh@medozas.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 21:50:59 +0000 (15:50 -0600)]
userns: Convert xt_LOG to print socket kuids and kgids as uids and gids
xt_LOG always writes messages via sb_add via printk. Therefore when
xt_LOG logs the uid and gid of a socket a packet came from the
values should be converted to be in the initial user namespace.
Thus making xt_LOG as user namespace safe as possible.
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 19:49:36 +0000 (13:49 -0600)]
userns: Convert cls_flow to work with user namespaces enabled
The flow classifier can use uids and gids of the sockets that
are transmitting packets and do insert those uids and gids
into the packet classification calcuation. I don't fully
understand the details but it appears that we can depend
on specific uids and gids when making traffic classification
decisions.
To work with user namespaces enabled map from kuids and kgids
into uids and gids in the initial user namespace giving raw
integer values the code can play with and depend on.
To avoid issues of userspace depending on uids and gids in
packet classifiers installed from other user namespaces
and getting confused deny all packet classifiers that
use uids or gids that are not comming from a netlink socket
in the initial user namespace.
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Changli Gao <xiaosuo@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 19:42:45 +0000 (13:42 -0600)]
net sched: Pass the skb into change so it can access NETLINK_CB
cls_flow.c plays with uids and gids. Unless I misread that
code it is possible for classifiers to depend on the specific uid and
gid values. Therefore I need to know the user namespace of the
netlink socket that is installing the packet classifiers. Pass
in the rtnetlink skb so I can access the NETLINK_CB of the passed
packet. In particular I want access to sk_user_ns(NETLINK_CB(in_skb).ssk).
Pass in not the user namespace but the incomming rtnetlink skb into
the the classifier change routines as that is generally the more useful
parameter.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 25 May 2012 16:42:54 +0000 (10:42 -0600)]
userns: nfnetlink_log: Report socket uids in the log sockets user namespace
At logging instance creation capture the peer netlink socket's user
namespace. Use the captured peer user namespace when reporting socket
uids to the peer.
The peer socket's user namespace is guaranateed to be valid until the user
closes the netlink socket. nfnetlink_log removes instances during the final
close of a socket. __build_packet_message does not get called after an
instance is destroyed. Therefore it is safe to let the peer netlink socket
take care of the user namespace reference counting for us.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 23:58:08 +0000 (17:58 -0600)]
userns: Teach inet_diag to work with user namespaces
Compute the user namespace of the socket that we are replying to
and translate the kuids of reported sockets into that user namespace.
Cc: Andrew Vagin <avagin@openvz.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 23:56:43 +0000 (17:56 -0600)]
userns: Implement sk_user_ns
Add a helper sk_user_ns to make it easy to find the user namespace
of the process that opened a socket.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 23:21:27 +0000 (17:21 -0600)]
netlink: Make the sending netlink socket availabe in NETLINK_CB
The sending socket of an skb is already available by it's port id
in the NETLINK_CB. If you want to know more like to examine the
credentials on the sending socket you have to look up the sending
socket by it's port id and all of the needed functions and data
structures are static inside of af_netlink.c. So do the simple
thing and pass the sending socket to the receivers in the NETLINK_CB.
I intend to use this to get the user namespace of the sending socket
in inet_diag so that I can report uids in the context of the process
who opened the socket, the same way I report uids in the contect
of the process who opens files.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 18:55:00 +0000 (12:55 -0600)]
userns: Convert net/ax25 to use kuid_t where appropriate
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Sat, 4 Aug 2012 02:11:22 +0000 (19:11 -0700)]
pidns: Export free_pid_ns
There is a least one modular user so export free_pid_ns so modules can
capture and use the pid namespace on the very rare occasion when it
makes sense.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 16:37:59 +0000 (10:37 -0600)]
net ip6 flowlabel: Make owner a union of struct pid * and kuid_t
Correct a long standing omission and use struct pid in the owner
field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS.
This guarantees we don't have issues when pid wraparound occurs.
Use a kuid_t in the owner field of struct ip6_flowlabel when the
share type is IPV6_FL_S_USER to add user namespace support.
In /proc/net/ip6_flowlabel capture the current pid namespace when
opening the file and release the pid namespace when the file is
closed ensuring we print the pid owner value that is meaning to
the reader of the file. Similarly use from_kuid_munged to print
uid values that are meaningful to the reader of the file.
This requires exporting pid_nr_ns so that ipv6 can continue to built
as a module. Yoiks what silliness
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 16:34:21 +0000 (10:34 -0600)]
userns: Use kgids for sysctl_ping_group_range
- Store sysctl_ping_group_range as a paire of kgid_t values
instead of a pair of gid_t values.
- Move the kgid conversion work from ping_init_sock into ipv4_ping_group_range
- For invalid cases reset to the default disabled state.
With the kgid_t conversion made part of the original value sanitation
from userspace understand how the code will react becomes clearer
and it becomes possible to set the sysctl ping group range from
something other than the initial user namespace.
Cc: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 07:10:10 +0000 (01:10 -0600)]
userns: Print out socket uids in a user namespace aware fashion.
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Thu, 24 May 2012 00:01:20 +0000 (18:01 -0600)]
userns: Make seq_file's user namespace accessible
struct file already has a user namespace associated with it
in file->f_cred->user_ns, unfortunately because struct
seq_file has no struct file backpointer associated with
it, it is difficult to get at the user namespace in seq_file
context. Therefore add a helper function seq_user_ns to return
the associated user namespace and a user_ns field to struct
seq_file to be used in implementing seq_user_ns.
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 23 May 2012 23:33:47 +0000 (17:33 -0600)]
userns: Allow USER_NS and NET simultaneously in Kconfig
Now that the networking core is user namespace safe allow
networking and user namespaces to be built at the same time.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 23 May 2012 23:16:53 +0000 (17:16 -0600)]
userns: Convert sock_i_uid to return a kuid_t
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Wed, 23 May 2012 23:01:57 +0000 (17:01 -0600)]
userns: Convert __dev_set_promiscuity to use kuids in audit logs
Cc: Klaus Heinrich Kiwi <klausk@br.ibm.com>
Cc: Eric Paris <eparis@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Wed, 23 May 2012 22:39:45 +0000 (16:39 -0600)]
userns: Convert net/core/scm.c to use kuids and kgids
With the existence of kuid_t and kgid_t we can take this further
and remove the usage of struct cred altogether, ensuring we
don't get cache line misses from reference counts. For now
however start simply and do a straight forward conversion
I can be certain is correct.
In cred_to_ucred use from_kuid_munged and from_kgid_munged
as these values are going directly to userspace and we want to use
the userspace safe values not -1 when reporting a value that does not
map. The earlier conversion that used from_kuid was buggy in that
respect. Oops.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Eric W. Biederman [Fri, 3 Aug 2012 16:38:08 +0000 (09:38 -0700)]
userns: Fix link restrictions to use uid_eq
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Wed, 8 Feb 2012 00:31:49 +0000 (16:31 -0800)]
userns: Allow the usernamespace support to build after the removal of usbfs
The user namespace code has an explicit "depends on USB_DEVICEFS = n"
dependency to prevent building code that is not yet user namespace safe. With
the removal of usbfs from the kernel it is now impossible to satisfy the
USB_DEFICEFS = n dependency and thus it is impossible to enable user
namespace support in 3.5-rc1. So remove the now useless depedency.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Linus Torvalds [Thu, 2 Aug 2012 23:38:10 +0000 (16:38 -0700)]
Linux 3.6-rc1
Linus Torvalds [Thu, 2 Aug 2012 18:52:39 +0000 (11:52 -0700)]
Merge branch 'for-linus-3.6' of git://dev.laptop.org/users/dilinger/linux-olpc
Pull OLPC platform updates from Andres Salomon:
"These move the OLPC Embedded Controller driver out of
arch/x86/platform and into drivers/platform/olpc.
OLPC machines are now ARM-based (which means lots of x86 and ARM
changes), but are typically pretty self-contained.. so it makes more
sense to go through a separate OLPC tree after getting the appropriate
review/ACKs."
* 'for-linus-3.6' of git://dev.laptop.org/users/dilinger/linux-olpc:
x86: OLPC: move s/r-related EC cmds to EC driver
Platform: OLPC: move global variables into priv struct
Platform: OLPC: move debugfs support from x86 EC driver
x86: OLPC: switch over to using new EC driver on x86
Platform: OLPC: add a suspended flag to the EC driver
Platform: OLPC: turn EC driver into a platform_driver
Platform: OLPC: allow EC cmd to be overridden, and create a workqueue to call it
drivers: OLPC: update various drivers to include olpc-ec.h
Platform: OLPC: add a stub to drivers/platform/ for the OLPC EC driver
Linus Torvalds [Thu, 2 Aug 2012 18:50:24 +0000 (11:50 -0700)]
Merge tag 'dt2' of git://git./linux/kernel/git/arm/arm-soc
Pull arm-soc Marvell Orion device-tree updates from Olof Johansson:
"This contains a set of device-tree conversions for Marvell Orion
platforms that were staged early but took a few tries to get the
branch into a format where it was suitable for us to pick up.
Given that most people working on these platforms are hobbyists with
limited time, we were a bit more flexible with merging it even though
it came in late."
* tag 'dt2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
ARM: Kirkwood: Replace mrvl with marvell
ARM: Kirkwood: Describe GoFlex Net LEDs and SATA in DT.
ARM: Kirkwood: Describe Dreamplug LEDs in DT.
ARM: Kirkwood: Describe iConnects LEDs in DT.
ARM: Kirkwood: Describe iConnects temperature sensor in DT.
ARM: Kirkwood: Describe IB62x0 LEDs in DT.
ARM: Kirkwood: Describe IB62x0 gpio-keys in DT.
ARM: Kirkwood: Describe DNS32? gpio-keys in DT.
ARM: Kirkwood: Move common portions into a kirkwood-dnskw.dtsi
ARM: Kirkwood: Replace DNS-320/DNS-325 leds with dt bindings
ARM: Kirkwood: Describe DNS325 temperature sensor in DT.
ARM: Kirkwood: Use DT to configure SATA device.
ARM: kirkwood: use devicetree for SPI on dreamplug
ARM: kirkwood: Add LS-XHL and LS-CHLv2 support
ARM: Kirkwood: Initial DTS support for Kirkwood GoFlex Net
ARM: Kirkwood: Add basic device tree support for QNAP TS219.
ATA: sata_mv: Add device tree support
ARM: Orion: DTify the watchdog timer.
ARM: Orion: Add arch support needed for I2C via DT.
ARM: kirkwood: use devicetree for orion-spi
...
Conflicts:
drivers/watchdog/orion_wdt.c
Linus Torvalds [Thu, 2 Aug 2012 18:48:54 +0000 (11:48 -0700)]
Merge tag 'pm2' of git://git./linux/kernel/git/arm/arm-soc
Pull arm-soc cpuidle enablement for OMAP from Olof Johansson:
"Coupled cpuidle was meant to merge for 3.5 through Len Brown's tree,
but didn't go in because the pull request ended up rejected. So it
just got merged, and we got this staged branch that enables the
coupled cpuidle code on OMAP.
With a stable git workflow from the other maintainer we could have
staged this earlier, but that wasn't the case so we have had to merge
it late.
The alternative is to hold it off until 3.7 but given that the code is
well-isolated to OMAP and they are eager to see it go in, I didn't
push back hard in that direction."
* tag 'pm2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP4: CPUidle: Open broadcast clock-event device.
ARM: OMAP4: CPUidle: add synchronization for coupled idle states
ARM: OMAP4: CPUidle: Use coupled cpuidle states to implement SMP cpuidle.
ARM: OMAP: timer: allow gp timer clock-event to be used on both cpus
Linus Torvalds [Thu, 2 Aug 2012 18:48:20 +0000 (11:48 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few fixes for merge window fallout, and a bugfix for timer resume on
PRIMA2."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: mmp: add missing irqs.h
arm: mvebu: fix typo in .dtsi comment for Armada XP SoCs
ARM: PRIMA2: delete redundant codes to restore LATCHED when timer resumes
ARM: mxc: Include missing irqs.h header
Linus Torvalds [Thu, 2 Aug 2012 18:45:42 +0000 (11:45 -0700)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (24 commits)
sh: explicitly include sh_dma.h in setup-sh7722.c
sh: ecovec: care CN5 VBUS if USB host mode
sh: sh7724: fixup renesas_usbhs clock settings
sh: intc: initial irqdomain support.
sh: pfc: Fix up init ordering mess.
serial: sh-sci: fix compilation breakage, when DMA is enabled
dmaengine: shdma: restore partial transfer calculation
sh: modify the sh_dmae_slave_config for RSPI in setup-sh7757
sh: Fix up recursive fault in oops with unset TTB.
sh: pfc: Build fix for pinctrl_remove_gpio_range() changes.
sh: select the fixed regulator driver on several boards
sh: ecovec: switch MMC power control to regulators
sh: add fixed voltage regulators to se7724
sh: add fixed voltage regulators to sdk7786
sh: add fixed voltage regulators to rsk
sh: add fixed voltage regulators to migor
sh: add fixed voltage regulators to kfr2r09
sh: add fixed voltage regulators to ap325rxa
sh: add fixed voltage regulators to sh7757lcr
sh: add fixed voltage regulators to sh2007
...
Linus Torvalds [Thu, 2 Aug 2012 18:34:40 +0000 (11:34 -0700)]
Merge tag 'md-3.6' of git://neil.brown.name/md
Pull additional md update from NeilBrown:
"This contains a few patches that depend on plugging changes in the
block layer so needed to wait for those.
It also contains a Kconfig fix for the new RAID10 support in dm-raid."
* tag 'md-3.6' of git://neil.brown.name/md:
md/dm-raid: DM_RAID should select MD_RAID10
md/raid1: submit IO from originating thread instead of md thread.
raid5: raid5d handle stripe in batch way
raid5: make_request use batch stripe release
Linus Torvalds [Thu, 2 Aug 2012 17:57:31 +0000 (10:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull two ceph fixes from Sage Weil:
"The first patch fixes up the old crufty open intent code to use the
atomic_open stuff properly, and the second fixes a possible null deref
and memory leak with the crypto keys."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix crypto key null deref, memory leak
ceph: simplify+fix atomic_open
Linus Torvalds [Thu, 2 Aug 2012 17:56:34 +0000 (10:56 -0700)]
Merge tag 'ecryptfs-3.6-rc1-fixes' of git://git./linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fixes from Tyler Hicks:
- Fixes a bug when the lower filesystem mount options include 'acl',
but the eCryptfs mount options do not
- Cleanups in the messaging code
- Better handling of empty files in the lower filesystem to improve
usability. Failed file creations are now cleaned up and empty lower
files are converted into eCryptfs during open().
- The write-through cache changes are being reverted due to bugs that
are not easy to fix. Stability outweighs the performance
enhancements here.
- Improvement to the mount code to catch unsupported ciphers specified
in the mount options
* tag 'ecryptfs-3.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: check for eCryptfs cipher support at mount
eCryptfs: Revert to a writethrough cache model
eCryptfs: Initialize empty lower files when opening them
eCryptfs: Unlink lower inode when ecryptfs_create() fails
eCryptfs: Make all miscdev functions use daemon ptr in file private_data
eCryptfs: Remove unused messaging declarations and function
eCryptfs: Copy up POSIX ACL and read-only flags from lower mount
Linus Torvalds [Thu, 2 Aug 2012 17:54:11 +0000 (10:54 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS update from Steve French:
"Adds SMB2 rmdir/mkdir capability to the SMB2/SMB2.1 support in cifs.
I am holding up a few more days on merging the remainder of the
SMB2/SMB2.1 enablement although it is nearing review completion, in
order to address some review comments from Jeff Layton on a few of the
subsequent SMB2 patches, and also to debug an unrelated cifs problem
that Pavel discovered."
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Add SMB2 support for rmdir
CIFS: Move rmdir code to ops struct
CIFS: Add SMB2 support for mkdir operation
CIFS: Separate protocol specific part from mkdir
CIFS: Simplify cifs_mkdir call
Linus Torvalds [Thu, 2 Aug 2012 17:37:03 +0000 (10:37 -0700)]
mm: remove node_start_pfn checking in new WARN_ON for now
Borislav Petkov reports that the new warning added in commit
88fdf75d1bb5 ("mm: warn if pg_data_t isn't initialized with zero")
triggers for him, and it is the node_start_pfn field that has already
been initialized once.
The call trace looks like this:
x86_64_start_kernel ->
x86_64_start_reservations ->
start_kernel ->
setup_arch ->
paging_init ->
zone_sizes_init ->
free_area_init_nodes ->
free_area_init_node
and (with the warning replaced by debug output), Borislav sees
On node 0 totalpages:
4193848
DMA zone: 64 pages used for memmap
DMA zone: 6 pages reserved
DMA zone: 3890 pages, LIFO batch:0
DMA32 zone: 16320 pages used for memmap
DMA32 zone: 798464 pages, LIFO batch:31
Normal zone: 52736 pages used for memmap
Normal zone:
3322368 pages, LIFO batch:31
free_area_init_node: pgdat->node_start_pfn:
4423680 <----
On node 1 totalpages:
4194304
Normal zone: 65536 pages used for memmap
Normal zone:
4128768 pages, LIFO batch:31
free_area_init_node: pgdat->node_start_pfn:
8617984 <----
On node 2 totalpages:
4194304
Normal zone: 65536 pages used for memmap
Normal zone:
4128768 pages, LIFO batch:31
free_area_init_node: pgdat->node_start_pfn:
12812288 <----
On node 3 totalpages:
4194304
Normal zone: 65536 pages used for memmap
Normal zone:
4128768 pages, LIFO batch:31
so remove the bogus warning for now to avoid annoying people. Minchan
Kim is looking at it.
Reported-by: Borislav Petkov <bp@amd64.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Haojian Zhuang [Mon, 30 Jul 2012 14:20:34 +0000 (22:20 +0800)]
ARM: mmp: add missing irqs.h
arch/arm/mach-mmp/gplugd.c:195:13: error: ‘MMP_NR_IRQS’ undeclared here
(not in a function)
make[1]: *** [arch/arm/mach-mmp/gplugd.o] Error 1
Include <mach/irqs.h> to fix this issue.
Signed-off-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Thomas Petazzoni [Thu, 2 Aug 2012 15:13:47 +0000 (17:13 +0200)]
arm: mvebu: fix typo in .dtsi comment for Armada XP SoCs
The comment was wrongly referring to Armada 370 while the file is
related to Armada XP.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Barry Song [Mon, 30 Jul 2012 05:29:30 +0000 (13:29 +0800)]
ARM: PRIMA2: delete redundant codes to restore LATCHED when timer resumes
The only way to write LATCHED registers to write LATCH_BIT to LATCH register,
that will latch COUNTER into LATCHED.e.g.
writel_relaxed(SIRFSOC_TIMER_LATCH_BIT, sirfsoc_timer_base +
SIRFSOC_TIMER_LATCH);
Writing values to LATCHED registers directly is useless at all.
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Sylvain Munaut [Thu, 2 Aug 2012 16:12:59 +0000 (09:12 -0700)]
libceph: fix crypto key null deref, memory leak
Avoid crashing if the crypto key payload was NULL, as when it was not correctly
allocated and initialized. Also, avoid leaking it.
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Sage Weil [Tue, 31 Jul 2012 18:27:36 +0000 (11:27 -0700)]
ceph: simplify+fix atomic_open
The initial ->atomic_open op was carried over from the old intent code,
which was incomplete and didn't really work. Replace it with a fresh
method. In particular:
* always attempt to do an atomic open+lookup, both for the create case
and for lookups of existing files.
* fix symlink handling by returning 1 to the VFS so that we can follow
the link to its destination. This fixes a longstanding ceph bug (#2392).
Signed-off-by: Sage Weil <sage@inktank.com>
Guennadi Liakhovetski [Wed, 1 Aug 2012 08:44:57 +0000 (10:44 +0200)]
sh: explicitly include sh_dma.h in setup-sh7722.c
setup-sh7722.c defines several objects, whose types are defined in
sh_dma.h, so, it has to be included explicitly.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Wed, 1 Aug 2012 23:47:15 +0000 (16:47 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
"The lion share of this pull request are fixes for clk-related breakage
caused by other changes during this merge window. For some platforms
the fix was as simple as selecting HAVE_CLK, for others like the
Loongson 2 significant restructuring was required.
The remainder are changes required to get the Lantiq code to work
again."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Loongson 2: Sort out clock managment.
MIPS: Loongson 1: more clk support and add select HAVE_CLK
MIPS: txx9: Fix redefinition of clk_* by adding select HAVE_CLK
MIPS: BCM63xx: Fix redefinition of clk_* by adding select HAVE_CLK
MIPS: AR7: Fix redefinition of clk_* by adding select HAVE_CLK
MIPS: Lantiq: Platform specific CLK fixup
MIPS: Lantiq: Add device_tree_init function
MIPS: Lantiq: Fix interface clock and PCI control register offset
Linus Torvalds [Wed, 1 Aug 2012 23:45:02 +0000 (16:45 -0700)]
Merge branch 'for-linus-3.6-rc1' of git://git./linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"This patch set contains mostly fixes and cleanups. The UML tty driver
uses now tty_port and is no longer broken like hell :-)"
* 'for-linus-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Add arch/x86/um to MAINTAINERS
um: pass siginfo to guest process
um: fix ubd_file_size for read-only files
um: pull interrupt_end() into userspace()
um: split syscall_trace(), pass pt_regs to it
um: switch UPT_SET_RETURN_VALUE and regs_return_value to pt_regs
um: set BLK_CGROUP=y in defconfig
um: remove count_lock
um: fully use tty_port
um: Remove dead code
um: remove line_ioctl()
TTY: um/line, use tty from tty_port
TTY: um/line, add tty_port
Linus Torvalds [Wed, 1 Aug 2012 23:41:07 +0000 (16:41 -0700)]
Merge branch 'dmaengine' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM DMA engine updates from Russell King:
"This looks scary at first glance, but what it is is:
- a rework of the sa11x0 DMA engine driver merged during the previous
cycle, to extract a common set of helper functions for DMA engine
implementations.
- conversion of amba-pl08x.c to use these helper functions.
- addition of OMAP DMA engine driver (using these helper functions),
and conversion of some of the OMAP DMA users to use DMA engine.
Nothing in the helper functions is ARM specific, so I hope that other
implementations can consolidate some of their code by making use of
these helpers.
This has been sitting in linux-next most of the merge cycle, and has
been tested by several OMAP folk. I've tested it on sa11x0 platforms,
and given it my best shot on my broken platforms which have the
amba-pl08x controller.
The last point is the addition to feature-removal-schedule.txt, which
will have a merge conflict. Between myself and TI, we're planning to
remove the old TI DMA implementation next year."
Fix up trivial add/add conflicts in Documentation/feature-removal-schedule.txt
and drivers/dma/{Kconfig,Makefile}
* 'dmaengine' of git://git.linaro.org/people/rmk/linux-arm: (53 commits)
ARM: 7481/1: OMAP2+: omap2plus_defconfig: enable OMAP DMA engine
ARM: 7464/1: mmc: omap_hsmmc: ensure probe returns error if DMA channel request fails
Add feature removal of old OMAP private DMA implementation
mtd: omap2: remove private DMA API implementation
mtd: omap2: add DMA engine support
spi: omap2-mcspi: remove private DMA API implementation
spi: omap2-mcspi: add DMA engine support
ARM: omap: remove mmc platform data dma_mask and initialization
mmc: omap: remove private DMA API implementation
mmc: omap: add DMA engine support
mmc: omap_hsmmc: remove private DMA API implementation
mmc: omap_hsmmc: add DMA engine support
dmaengine: omap: add support for cyclic DMA
dmaengine: omap: add support for setting fi
dmaengine: omap: add support for returning residue in tx_state method
dmaengine: add OMAP DMA engine driver
dmaengine: sa11x0-dma: add cyclic DMA support
dmaengine: sa11x0-dma: fix DMA residue support
dmaengine: PL08x: ensure all descriptors are freed when channel is released
dmaengine: PL08x: get rid of write only pool_ctr and free_txd locking
...
Linus Torvalds [Wed, 1 Aug 2012 23:35:37 +0000 (16:35 -0700)]
Merge branch 'audit' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM audit/signal updates from Russell King:
"ARM audit/signal handling updates from Al and Will. This improves on
the work Viro did last merge window, and sorts out some of the issues
found with that work."
* 'audit' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7475/1: sys_trace: allow all syscall arguments to be updated via ptrace
ARM: 7474/1: get rid of TIF_SYSCALL_RESTARTSYS
ARM: 7473/1: deal with handlerless restarts without leaving the kernel
ARM: 7472/1: pull all work_pending logics into C function
ARM: 7471/1: Revert "7442/1: Revert "remove unused restart trampoline""
ARM: 7470/1: Revert "7443/1: Revert "new way of handling ERESTART_RESTARTBLOCK""
Linus Torvalds [Wed, 1 Aug 2012 23:30:45 +0000 (16:30 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"This fixes various issues found during July"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7479/1: mm: avoid NULL dereference when flushing gate_vma with VIVT caches
ARM: Fix undefined instruction exception handling
ARM: 7480/1: only call smp_send_stop() on SMP
ARM: 7478/1: errata: extend workaround for erratum #720789
ARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UP
ARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspend
ARM: 7468/1: ftrace: Trace function entry before updating index
ARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+
ARM: 7466/1: disable interrupt before spinning endlessly
ARM: 7465/1: Handle >4GB memory sizes in device tree and mem=size@start option
Richard Weinberger [Wed, 1 Aug 2012 23:00:47 +0000 (01:00 +0200)]
um: Add arch/x86/um to MAINTAINERS
Signed-off-by: Richard Weinberger <richard@nod.at>
Martin Pärtel [Wed, 1 Aug 2012 22:49:17 +0000 (00:49 +0200)]
um: pass siginfo to guest process
UML guest processes now get correct siginfo_t for SIGTRAP, SIGFPE,
SIGILL and SIGBUS. Specifically, si_addr and si_code are now correct
where previously they were si_addr = NULL and si_code = 128.
Signed-off-by: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Martin Pärtel [Wed, 1 Aug 2012 22:44:22 +0000 (00:44 +0200)]
um: fix ubd_file_size for read-only files
Made ubd_file_size not request write access. Fixes use of read-only images.
Signed-off-by: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
NeilBrown [Wed, 1 Aug 2012 22:35:43 +0000 (08:35 +1000)]
md/dm-raid: DM_RAID should select MD_RAID10
Now that DM_RAID supports raid10, it needs to select that code
to ensure it is included.
Cc: Jonathan Brassow <jbrassow@redhat.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 1 Aug 2012 22:33:20 +0000 (08:33 +1000)]
md/raid1: submit IO from originating thread instead of md thread.
queuing writes to the md thread means that all requests go through the
one processor which may not be able to keep up with very high request
rates.
So use the plugging infrastructure to submit all requests on unplug.
If a 'schedule' is needed, we fall back on the old approach of handing
the requests to the thread for it to handle.
Signed-off-by: NeilBrown <neilb@suse.de>
Shaohua Li [Wed, 1 Aug 2012 22:33:15 +0000 (08:33 +1000)]
raid5: raid5d handle stripe in batch way
Let raid5d handle stripe in batch way to reduce conf->device_lock locking.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Shaohua Li [Wed, 1 Aug 2012 22:33:00 +0000 (08:33 +1000)]
raid5: make_request use batch stripe release
make_request() does stripe release for every stripe and the stripe usually has
count 1, which makes previous release_stripe() optimization not work. In my
test, this release_stripe() becomes the heaviest pleace to take
conf->device_lock after previous patches applied.
Below patch makes stripe release batch. All the stripes will be released in
unplug. The STRIPE_ON_UNPLUG_LIST bit is to protect concurrent access stripe
lru.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Al Viro [Wed, 23 May 2012 04:25:15 +0000 (00:25 -0400)]
um: pull interrupt_end() into userspace()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Al Viro [Wed, 23 May 2012 04:18:33 +0000 (00:18 -0400)]
um: split syscall_trace(), pass pt_regs to it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[richard@nod.at: Fixed some minor build issues]
Signed-off-by: Richard Weinberger <richard@nod.at>
Al Viro [Wed, 23 May 2012 01:16:35 +0000 (21:16 -0400)]
um: switch UPT_SET_RETURN_VALUE and regs_return_value to pt_regs
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Linus Torvalds [Wed, 1 Aug 2012 17:45:12 +0000 (10:45 -0700)]
Merge tag 'fbdev-updates-for-3.6' of git://github.com/schandinat/linux-2.6
Pull fbdev updates from Florian Tobias Schandinat:
- large updates for OMAP
- support for LCD3 overlay manager (omap5)
- omapdss output cleanup
- removal of passive matrix LCD support as there are no drivers for
such panels for DSS or DSS2 and nobody complained (cleanup)
- large updates for SH Mobile
- overlay support
- separating MERAM (cache) from framebuffer driver
- some updates for Exynos and da8xx-fb
- various other small patches
* tag 'fbdev-updates-for-3.6' of git://github.com/schandinat/linux-2.6: (78 commits)
da8xx-fb: fix compile issue due to missing include
fbdev: Make pixel_to_pat() failure mode more friendly
da8xx-fb: do not turn ON LCD backlight unless LCDC is enabled
fbdev: sh_mobile_lcdc: Fix vertical panning step
video: exynos mipi dsi: Fix mipi dsi regulators handling issue
video: da8xx-fb: do clock reset of revision 2 LCDC before enabling
arm: da850: configure LCDC fifo threshold
video: da8xx-fb: configure FIFO threshold to reduce underflow errors
video: da8xx-fb: fix flicker due to 1 frame delay in updated frame
video: da8xx-fb rev2: fix disabling of palette completion interrupt
da8xx-fb: add missing FB_BLANK operations
video: exynos_dp: use usleep_range instead of delay
video: exynos_dp: check the only INTERLANE_ALIGN_DONE bit during Link Training
fb: epson1355fb: Fix section mismatch
video: exynos_dp: fix wrong DPCD address during Link Training
video/smscufx: fix line counting in fb_write
aty128fb: Fix coding style issues
fbdev: sh_mobile_lcdc: Fix pan offset computation in YUV mode
fbdev: sh_mobile_lcdc: Fix overlay registers update during pan operation
fbdev: sh_mobile_lcdc: Support horizontal panning
...
Linus Torvalds [Wed, 1 Aug 2012 17:42:26 +0000 (10:42 -0700)]
Merge tag 'sound-3.6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes that have been found recently. Most of
the commits are regression fixes in HD-audio and some other random
drivers."
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix clock source validity index
ALSA: hda - Fix mute-LED GPIO initialization for IDT codecs
ALSA: hda - Add descriptions for missing IDT 92HD83x models
ALSA: hda - Fix polarity of mute LED on HP Mini 210
ALSA: es1688 - freeup resources on init failure
ALSA: hda - Workaround for silent output on VAIO Z with ALC889
ALSA: hda - Fix WARNING from HDMI/DP parser
ALSA: hda - Detach from converter at closing in patch_hdmi.c
ALSA: hda - Fix mute-LED GPIO setup for HP Mini 210
ALSA: mpu401: Fix missing initialization of irq field
ALSA: hda - Fix invalid D3 of headphone DAC on VT202x codecs
Linus Torvalds [Wed, 1 Aug 2012 17:26:23 +0000 (10:26 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull second vfs pile from Al Viro:
"The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
deadlock reproduced by xfstests 068), symlink and hardlink restriction
patches, plus assorted cleanups and fixes.
Note that another fsfreeze deadlock (emergency thaw one) is *not*
dealt with - the series by Fernando conflicts a lot with Jan's, breaks
userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
for massive vfsmount leak; this is going to be handled next cycle.
There probably will be another pull request, but that stuff won't be
in it."
Fix up trivial conflicts due to unrelated changes next to each other in
drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
delousing target_core_file a bit
Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
fs: Remove old freezing mechanism
ext2: Implement freezing
btrfs: Convert to new freezing mechanism
nilfs2: Convert to new freezing mechanism
ntfs: Convert to new freezing mechanism
fuse: Convert to new freezing mechanism
gfs2: Convert to new freezing mechanism
ocfs2: Convert to new freezing mechanism
xfs: Convert to new freezing code
ext4: Convert to new freezing mechanism
fs: Protect write paths by sb_start_write - sb_end_write
fs: Skip atime update on frozen filesystem
fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
fs: Improve filesystem freezing handling
switch the protection of percpu_counter list to spinlock
nfsd: Push mnt_want_write() outside of i_mutex
btrfs: Push mnt_want_write() outside of i_mutex
fat: Push mnt_want_write() outside of i_mutex
...
Ralf Baechle [Wed, 1 Aug 2012 15:15:32 +0000 (17:15 +0200)]
MIPS: Loongson 2: Sort out clock managment.
For unexplainable reasons the Loongson 2 clock API was implemented in a
module so fixing this involved shifting large amounts of code around.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Wed, 1 Aug 2012 16:06:47 +0000 (09:06 -0700)]
Merge branch 'for-3.6/drivers' of git://git.kernel.dk/linux-block
Pull block driver changes from Jens Axboe:
- Making the plugging support for drivers a bit more sane from Neil.
This supersedes the plugging change from Shaohua as well.
- The usual round of drbd updates.
- Using a tail add instead of a head add in the request completion for
ndb, making us find the most completed request more quickly.
- A few floppy changes, getting rid of a duplicated flag and also
running the floppy init async (since it takes forever in boot terms)
from Andi.
* 'for-3.6/drivers' of git://git.kernel.dk/linux-block:
floppy: remove duplicated flag FD_RAW_NEED_DISK
blk: pass from_schedule to non-request unplug functions.
block: stack unplug
blk: centralize non-request unplug handling.
md: remove plug_cnt feature of plugging.
block/nbd: micro-optimization in nbd request completion
drbd: announce FLUSH/FUA capability to upper layers
drbd: fix max_bio_size to be unsigned
drbd: flush drbd work queue before invalidate/invalidate remote
drbd: fix potential access after free
drbd: call local-io-error handler early
drbd: do not reset rs_pending_cnt too early
drbd: reset congestion information before reporting it in /proc/drbd
drbd: report congestion if we are waiting for some userland callback
drbd: differentiate between normal and forced detach
drbd: cleanup, remove two unused global flags
floppy: Run floppy initialization asynchronous
Linus Torvalds [Wed, 1 Aug 2012 16:02:41 +0000 (09:02 -0700)]
Merge branch 'for-3.6/core' of git://git.kernel.dk/linux-block
Pull core block IO bits from Jens Axboe:
"The most complicated part if this is the request allocation rework by
Tejun, which has been queued up for a long time and has been in
for-next ditto as well.
There are a few commits from yesterday and today, mostly trivial and
obvious fixes. So I'm pretty confident that it is sound. It's also
smaller than usual."
* 'for-3.6/core' of git://git.kernel.dk/linux-block:
block: remove dead func declaration
block: add partition resize function to blkpg ioctl
block: uninitialized ioc->nr_tasks triggers WARN_ON
block: do not artificially constrain max_sectors for stacking drivers
blkcg: implement per-blkg request allocation
block: prepare for multiple request_lists
block: add q->nr_rqs[] and move q->rq.elvpriv to q->nr_rqs_elvpriv
blkcg: inline bio_blkcg() and friends
block: allocate io_context upfront
block: refactor get_request[_wait]()
block: drop custom queue draining used by scsi_transport_{iscsi|fc}
mempool: add @gfp_mask to mempool_create_node()
blkcg: make root blkcg allocation use %GFP_KERNEL
blkcg: __blkg_lookup_create() doesn't need radix preload
Linus Torvalds [Wed, 1 Aug 2012 16:02:01 +0000 (09:02 -0700)]
Merge branch 'for-next' of git://neil.brown.name/md
Pull md updates from NeilBrown.
* 'for-next' of git://neil.brown.name/md:
DM RAID: Add support for MD RAID10
md/RAID1: Add missing case for attempting to repair known bad blocks.
md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE.
md/raid1: don't abort a resync on the first badblock.
md: remove duplicated test on ->openers when calling do_md_stop()
raid5: Add R5_ReadNoMerge flag which prevent bio from merging at block layer
md/raid1: prevent merging too large request
md/raid1: read balance chooses idlest disk for SSD
md/raid1: make sequential read detection per disk based
MD RAID10: Export md_raid10_congested
MD: Move macros from raid1*.h to raid1*.c
MD RAID1: rename mirror_info structure
MD RAID10: rename mirror_info structure
MD RAID10: Fix compiler warning.
raid5: add a per-stripe lock
raid5: remove unnecessary bitmap write optimization
raid5: lockless access raid5 overrided bi_phys_segments
raid5: reduce chance release_stripe() taking device_lock
J. Bruce Fields [Wed, 1 Aug 2012 11:56:16 +0000 (07:56 -0400)]
locks: remove unused lm_release_private
In commit
3b6e2723f32d ("locks: prevent side-effects of
locks_release_private before file_lock is initialized") we removed the
last user of lm_release_private without removing the field itself.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yoichi Yuasa [Wed, 1 Aug 2012 06:42:16 +0000 (15:42 +0900)]
MIPS: Loongson 1: more clk support and add select HAVE_CLK
This fixes a redefinition of clk_*:
arch/mips/loongson1/common/clock.c:23:13: error: redefinition of 'clk_get'
include/linux/clk.h:281:27: note: previous definition of 'clk_get' was here
arch/mips/loongson1/common/clock.c:41:15: error: redefinition of 'clk_get_rate'
include/linux/clk.h:302:29: note: previous definition of 'clk_get_rate' was here
make[3]: *** [arch/mips/loongson1/common/clock.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Reviewed-by: John Crispin <blogic@openwrt.org>
Acked-by: Kelvin Cheung <keguang.zhang@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4143/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Wed, 1 Aug 2012 06:41:06 +0000 (15:41 +0900)]
MIPS: txx9: Fix redefinition of clk_* by adding select HAVE_CLK
arch/mips/txx9/generic/setup.c:87:13: error: redefinition of 'clk_get'
include/linux/clk.h:281:27: note: previous definition of 'clk_get' was here
arch/mips/txx9/generic/setup.c:97:5: error: redefinition of 'clk_enable'
include/linux/clk.h:295:19: note: previous definition of 'clk_enable' was here
arch/mips/txx9/generic/setup.c:103:6: error: redefinition of 'clk_disable'
include/linux/clk.h:300:20: note: previous definition of 'clk_disable' was here
arch/mips/txx9/generic/setup.c:108:15: error: redefinition of 'clk_get_rate'
include/linux/clk.h:302:29: note: previous definition of 'clk_get_rate' was here
arch/mips/txx9/generic/setup.c:114:6: error: redefinition of 'clk_put'
include/linux/clk.h:291:20: note: previous definition of 'clk_put' was here
make[3]: *** [arch/mips/txx9/generic/setup.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Reviewed-by: John Crispin <blogic@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/4142/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Wed, 1 Aug 2012 06:39:52 +0000 (15:39 +0900)]
MIPS: BCM63xx: Fix redefinition of clk_* by adding select HAVE_CLK
arch/mips/bcm63xx/clk.c:249:5: error: redefinition of 'clk_enable'
include/linux/clk.h:295:19: note: previous definition of 'clk_enable' was here
arch/mips/bcm63xx/clk.c:259:6: error: redefinition of 'clk_disable'
include/linux/clk.h:300:20: note: previous definition of 'clk_disable' was here
arch/mips/bcm63xx/clk.c:268:15: error: redefinition of 'clk_get_rate'
include/linux/clk.h:302:29: note: previous definition of 'clk_get_rate' was here
arch/mips/bcm63xx/clk.c:275:13: error: redefinition of 'clk_get'
include/linux/clk.h:281:27: note: previous definition of 'clk_get' was here
arch/mips/bcm63xx/clk.c:302:6: error: redefinition of 'clk_put'
include/linux/clk.h:291:20: note: previous definition of 'clk_put' was here
make[2]: *** [arch/mips/bcm63xx/clk.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Reviewed-by: John Crispin <blogic@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/4141/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Wed, 1 Aug 2012 06:38:00 +0000 (15:38 +0900)]
MIPS: AR7: Fix redefinition of clk_* by adding select HAVE_CLK
arch/mips/ar7/clock.c:420:5: error: redefinition of 'clk_enable'
include/linux/clk.h:295:19: note: previous definition of 'clk_enable' was here
arch/mips/ar7/clock.c:426:6: error: redefinition of 'clk_disable'
include/linux/clk.h:300:20: note: previous definition of 'clk_disable' was here
arch/mips/ar7/clock.c:431:15: error: redefinition of 'clk_get_rate'
include/linux/clk.h:302:29: note: previous definition of 'clk_get_rate' was here
arch/mips/ar7/clock.c:437:13: error: redefinition of 'clk_get'
include/linux/clk.h:281:27: note: previous definition of 'clk_get' was here
arch/mips/ar7/clock.c:454:6: error: redefinition of 'clk_put'
include/linux/clk.h:291:20: note: previous definition of 'clk_put' was here
make[2]: *** [arch/mips/ar7/clock.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips@linux-mips.org
Reviewed-by: John Crispin <blogic@openwrt.org>
Acked-by: Florian Fainelli <florian@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/4140/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
John Crispin [Sun, 22 Jul 2012 06:56:01 +0000 (08:56 +0200)]
MIPS: Lantiq: Platform specific CLK fixup
As we use CLKDEV_LOOKUP but dont have support for COMMON_CLK yet, we need to
provide our own version of of_clk_get_from_provider().
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4117/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
John Crispin [Sun, 22 Jul 2012 06:56:00 +0000 (08:56 +0200)]
MIPS: Lantiq: Add device_tree_init function
Add a lantiq specific version of device_tree_init. The generic MIPS version
was removed by.
commit
594e966bc412d64eec9282d28ce511bdd62fea39
Author: David Daney <david.daney@cavium.com>
Date: Thu Jul 5 18:12:38 2012 +0200
MIPS: Prune some target specific code out of prom.c
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4116/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
John Crispin [Sun, 22 Jul 2012 06:55:57 +0000 (08:55 +0200)]
MIPS: Lantiq: Fix interface clock and PCI control register offset
The XRX200 based SoC have a different register offset for the interface
clock and PCI control registers. This patch detects the SoC and sets the
register offset at runtime. This make PCI work on the VR9 SoC.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4113/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Al Viro [Wed, 1 Aug 2012 09:07:04 +0000 (13:07 +0400)]
delousing target_core_file a bit
* set_fs(KERNEL_DS) + getname() is probably the weirdest implementation
of strdup() I've seen. Especially since they don't to copy it at all...
* filp_open() never returns NULL; it's ERR_PTR(-E...) on failure.
* file->f_dentry is never going to be NULL, TYVM.
* match_strdup() + snprintf() + kfree() is a bloody weird way to spell
match_strlcpy().
Pox on cargo-cult programmers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jonathan Brassow [Wed, 1 Aug 2012 02:44:26 +0000 (21:44 -0500)]
DM RAID: Add support for MD RAID10
Support the MD RAID10 personality through dm-raid.c
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 1 Aug 2012 10:40:02 +0000 (20:40 +1000)]
Merge commit '
c039c332f23e794deb6d6f37b9f07ff3b27fb2cf' into md
Pull in pre-requisites for adding raid10 support to dm-raid.
Yuanhan Liu [Wed, 1 Aug 2012 10:25:54 +0000 (12:25 +0200)]
block: remove dead func declaration
__generic_unplug_device() function is removed with commit
7eaceaccab5f40bbfda044629a6298616aeaed50, which forgot to
remove the declaration at meantime. Here remove it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vivek Goyal [Wed, 1 Aug 2012 10:24:18 +0000 (12:24 +0200)]
block: add partition resize function to blkpg ioctl
Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that
allows altering the size of an existing partition, even if it is currently
in use.
This patch converts hd_struct->nr_sects into sequence counter because
One might extend a partition while IO is happening to it and update of
nr_sects can be non-atomic on 32bit machines with 64bit sector_t. This
can lead to issues like reading inconsistent size of a partition. Sequence
counter have been used so that readers don't have to take bdev mutex lock
as we call sector_in_part() very frequently.
Now all the access to hd_struct->nr_sects should happen using sequence
counter read/update helper functions part_nr_sects_read/part_nr_sects_write.
There is one exception though, set_capacity()/get_capacity(). I think
theoritically race should exist there too but this patch does not
modify set_capacity()/get_capacity() due to sheer number of call sites
and I am afraid that change might break something. I have left that as a
TODO item. We can handle it later if need be. This patch does not introduce
any new races as such w.r.t set_capacity()/get_capacity().
v2: Add CONFIG_LBDAF test to UP preempt case as suggested by Phillip.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Phillip Susi <psusi@ubuntu.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Olof Johansson [Wed, 1 Aug 2012 10:17:27 +0000 (12:17 +0200)]
block: uninitialized ioc->nr_tasks triggers WARN_ON
Hi,
I'm using the old-fashioned 'dump' backup tool, and I noticed that it spews the
below warning as of 3.5-rc1 and later (3.4 is fine):
[ 10.886893] ------------[ cut here ]------------
[ 10.886904] WARNING: at include/linux/iocontext.h:140 copy_process+0x1488/0x1560()
[ 10.886905] Hardware name: Bochs
[ 10.886906] Modules linked in:
[ 10.886908] Pid: 2430, comm: dump Not tainted 3.5.0-rc7+ #27
[ 10.886908] Call Trace:
[ 10.886911] [<
ffffffff8107ce8a>] warn_slowpath_common+0x7a/0xb0
[ 10.886912] [<
ffffffff8107ced5>] warn_slowpath_null+0x15/0x20
[ 10.886913] [<
ffffffff8107c088>] copy_process+0x1488/0x1560
[ 10.886914] [<
ffffffff8107c244>] do_fork+0xb4/0x340
[ 10.886918] [<
ffffffff8108effa>] ? recalc_sigpending+0x1a/0x50
[ 10.886919] [<
ffffffff8108f6b2>] ? __set_task_blocked+0x32/0x80
[ 10.886920] [<
ffffffff81091afa>] ? __set_current_blocked+0x3a/0x60
[ 10.886923] [<
ffffffff81051db3>] sys_clone+0x23/0x30
[ 10.886925] [<
ffffffff8179bd73>] stub_clone+0x13/0x20
[ 10.886927] [<
ffffffff8179baa2>] ? system_call_fastpath+0x16/0x1b
[ 10.886928] ---[ end trace
32a14af7ee6a590b ]---
Reproducing is easy, I can hit it on a KVM system with a very basic
config (x86_64 make defconfig + enable the drivers needed). To hit it,
just install dump (on debian/ubuntu, not sure what the package might be
called on Fedora), and:
dump -o -f /tmp/foo /
You'll see the warning in dmesg once it forks off the I/O process and
starts dumping filesystem contents.
I bisected it down to the following commit:
commit
f6e8d01bee036460e03bd4f6a79d014f98ba712e
Author: Tejun Heo <tj@kernel.org>
Date: Mon Mar 5 13:15:26 2012 -0800
block: add io_context->active_ref
Currently ioc->nr_tasks is used to decide two things - whether an ioc
is done issuing IOs and whether it's shared by multiple tasks. This
patch separate out the first into ioc->active_ref, which is acquired
and released using {get|put}_io_context_active() respectively.
This will be used to associate bio's with a given task. This patch
doesn't introduce any visible behavior change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It seems like the init of ioc->nr_tasks was removed in that patch,
so it starts out at 0 instead of 1.
Tejun, is the right thing here to add back the init, or should something else
be done?
The below patch removes the warning, but I haven't done any more extensive
testing on it.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mike Snitzer [Wed, 1 Aug 2012 08:44:28 +0000 (10:44 +0200)]
block: do not artificially constrain max_sectors for stacking drivers
blk_set_stacking_limits is intended to allow stacking drivers to build
up the limits of the stacked device based on the underlying devices'
limits. But defaulting 'max_sectors' to BLK_DEF_MAX_SECTORS (1024)
doesn't allow the stacking driver to inherit a max_sectors larger than
1024 -- due to blk_stack_limits' use of min_not_zero.
It is now clear that this artificial limit is getting in the way so
change blk_set_stacking_limits's max_sectors to UINT_MAX (which allows
stacking drivers like dm-multipath to inherit 'max_sectors' from the
underlying paths).
Reported-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Tested-by: Vijay Chauhan <vijay.chauhan@netapp.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>