Joerg Roedel [Fri, 25 Mar 2011 08:44:49 +0000 (09:44 +0100)]
KVM: X86: Implement call-back to propagate virtual_tsc_khz
This patch implements a call-back into the architecture code
to allow the propagation of changes to the virtual tsc_khz
of the vcpu.
On SVM it updates the tsc_ratio variable, on VMX it does
nothing.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:48 +0000 (09:44 +0100)]
KVM: X86: Make tsc_delta calculation a function of guest tsc
The calculation of the tsc_delta value to ensure a
forward-going tsc for the guest is a function of the
host-tsc. This works as long as the guest's tsc_khz is equal
to the host's tsc_khz. With tsc-scaling hardware support this
is no longer true and the tsc_delta needs to be calculated
using guest_tsc values.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
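As a rough illustration of why the delta has to be computed in guest units once the two frequencies differ, the sketch below scales a host TSC reading to a guest frequency. The function names and the plain integer-ratio arithmetic are assumptions made for this example; they are not the code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): convert a host TSC reading
 * to the value a guest running at guest_khz would observe.
 * The multiplication is split into quotient and remainder parts so the
 * remainder product stays within 64 bits for kHz values below 2^32.
 */
uint64_t scale_host_to_guest_tsc(uint64_t host_tsc,
                                 uint32_t host_khz,
                                 uint32_t guest_khz)
{
    uint64_t q = host_tsc / host_khz;
    uint64_t r = host_tsc % host_khz;

    return q * guest_khz + (r * guest_khz) / host_khz;
}

/* Elapsed cycles between two host readings, expressed in guest units. */
uint64_t guest_tsc_delta(uint64_t host_then, uint64_t host_now,
                         uint32_t host_khz, uint32_t guest_khz)
{
    return scale_host_to_guest_tsc(host_now, host_khz, guest_khz) -
           scale_host_to_guest_tsc(host_then, host_khz, guest_khz);
}
```

With equal frequencies the two deltas coincide, which is why the host-based calculation was sufficient before scaling support.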
Joerg Roedel [Fri, 25 Mar 2011 08:44:47 +0000 (09:44 +0100)]
KVM: X86: Let kvm-clock report the right tsc frequency
This patch changes the kvm_guest_time_update function to use
the TSC frequency the guest actually has for updating its clock.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 25 Mar 2011 08:44:46 +0000 (09:44 +0100)]
KVM: SVM: Implement infrastructure for TSC_RATE_MSR
This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:25:20 +0000 (16:25 +0300)]
KVM: x86 emulator: Drop EFER.SVME requirement from VMMCALL
VMMCALL requires EFER.SVME to be enabled in the host, not in the guest, which
is what check_svme() checks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 5 Apr 2011 13:21:58 +0000 (16:21 +0300)]
KVM: x86 emulator: Re-add VendorSpecific tag to VMMCALL insn
VMMCALL needs the VendorSpecific tag so that #UD emulation
(called if a guest running on AMD was migrated to an Intel host)
is allowed to process the instruction.
Signed-off-by: Avi Kivity <avi@redhat.com>
Bharat Bhushan [Fri, 25 Mar 2011 05:02:13 +0000 (10:32 +0530)]
KVM: PPC: Fix issue clearing exit timing counters
The following dump is observed on the host when clearing the exit timing counters:
[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo D 0ff5bf94 0 1276 1190 0x00000000
Call Trace:
[c2157e40] [c0007908] __switch_to+0x9c/0xc4
[c2157e50] [c040293c] schedule+0x1b4/0x3bc
[c2157e90] [c04032dc] __mutex_lock_slowpath+0x74/0xc0
[c2157ec0] [c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[c2157ed0] [c0036b00] kvmppc_exit_timing_write+0x84/0x98
[c2157ef0] [c00b9f90] vfs_write+0xc0/0x16c
[c2157f10] [c00ba284] sys_write+0x4c/0x90
[c2157f40] [c000e320] ret_from_syscall+0x0/0x3c
The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc.) and the same lock was
used when clearing the stats (in kvmppc_init_timing_stats()). When the guest
is idle, it holds the vcpu->mutex, so the exit timing write waits for the
guest to release the vcpu->mutex and a hang state is reached.
Now use a separate lock for the exit timing stats.
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
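The hang and its fix can be modelled in user space with two toy locks. Everything below (the struct, field, and function names) is hypothetical and only mirrors the locking pattern described in this commit: the stats path stops depending on the lock that the run loop holds.

```c
#include <stdbool.h>
#include <string.h>

/* Toy non-blocking lock: models ownership only, not real synchronization. */
struct toy_lock {
    bool held;
};

bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void toy_unlock(struct toy_lock *l)
{
    l->held = false;
}

struct toy_vcpu {
    struct toy_lock run_lock;   /* stands in for vcpu->mutex */
    struct toy_lock stats_lock; /* dedicated lock for the timing stats */
    unsigned long exit_counts[4];
};

/*
 * Returns true if the counters were cleared. Only stats_lock is taken,
 * so this succeeds even while the run loop holds run_lock.
 */
bool toy_clear_timing_stats(struct toy_vcpu *v)
{
    if (!toy_trylock(&v->stats_lock))
        return false;
    memset(v->exit_counts, 0, sizeof(v->exit_counts));
    toy_unlock(&v->stats_lock);
    return true;
}
```

Had the clear path taken run_lock instead, it could not make progress for as long as the run loop kept that lock, which is the situation the trace shows.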
Xiao Guangrong [Mon, 28 Mar 2011 02:29:27 +0000 (10:29 +0800)]
KVM: MMU: remove mmu_seq verification on pte update path
The mmu_seq verification can be removed since we get the pfn in the
protection of mmu_lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 28 Mar 2011 14:57:49 +0000 (16:57 +0200)]
KVM: x86 emulator: do not open code return values from the emulator
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Justin P. Mattock [Wed, 30 Mar 2011 16:54:47 +0000 (09:54 -0700)]
KVM: Remove base_addresss in kvm_pit since it is unused
The patch below removes unsigned long base_addresss; in i8254.h
since it is unused.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:36 +0000 (12:39 +0200)]
KVM: SVM: Remove nested sel_cr0_write handling code
This patch removes all the old code which handled the nested
selective cr0 write intercepts. This code was only in place
as a work-around until the instruction emulator is capable
of doing the same. This is the case with this patch-set and
so the code can be removed.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:35 +0000 (12:39 +0200)]
KVM: SVM: Add checks for IO instructions
This patch adds code to check for IOIO intercepts on
instructions decoded by the KVM instruction emulator.
[avi: fix build error due to missing #define D2bvIP]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:34 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for one-byte instructions
This patch adds intercept checks for emulated one-byte
instructions to the KVM instruction emulation path.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:33 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining twobyte instructions
This patch adds intercept checks for the remaining twobyte
instructions to the KVM instruction emulator.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:32 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for remaining group7 instructions
This patch implements the emulator intercept checks for the
RDTSCP, MONITOR, and MWAIT instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:31 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for SVM instructions
This patch adds the necessary code changes in the
instruction emulator and the extensions to svm.c to
implement intercept checks for the svm instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:30 +0000 (12:39 +0200)]
KVM: SVM: Add intercept checks for descriptor table accesses
This patch adds intercept checks into the KVM instruction
emulator to check for the 8 instructions that access the
descriptor table addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:29 +0000 (12:39 +0200)]
KVM: SVM: Add intercept check for accessing dr registers
This patch adds the intercept checks for instructions
accessing the debug registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:28 +0000 (12:39 +0200)]
KVM: SVM: Add intercept check for emulated cr accesses
This patch adds all necessary intercept checks for
instructions that access the crX registers.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:27 +0000 (12:39 +0200)]
KVM: x86: Add x86 callback for intercept check
This patch adds a callback into kvm_x86_ops so that svm and
vmx code can do intercept checks on emulated instructions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:26 +0000 (12:39 +0200)]
KVM: x86 emulator: Add flag to check for protected mode instructions
This patch adds a flag for the opcodes to tag instructions
which are only recognized in protected mode. The necessary
check is added too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:25 +0000 (12:39 +0200)]
KVM: x86 emulator: Add check_perm callback
This patch adds a check_perm callback for each opcode into
the instruction emulator. This will be used to do all
necessary permission checks on instructions before checking
whether they are intercepted or not.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Mon, 4 Apr 2011 10:39:24 +0000 (12:39 +0200)]
KVM: x86 emulator: Don't write-back cpu-state on X86EMUL_INTERCEPTED
This patch prevents the changed CPU state from being written back
when the emulator detected that the instruction was
intercepted by the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 4 Apr 2011 10:39:23 +0000 (12:39 +0200)]
KVM: x86 emulator: add SVM intercepts
Add intercept codes for instructions defined by SVM as
interceptable.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 4 Apr 2011 10:39:22 +0000 (12:39 +0200)]
KVM: x86 emulator: add framework for instruction intercepts
When running in guest mode, certain instructions can be intercepted by
hardware. This also holds for nested guests running on emulated
virtualization hardware, in particular instructions emulated by kvm
itself.
This patch adds a framework for intercepting instructions. If an
instruction is marked for interception, and if we're running in guest
mode, a callback is called to check whether an intercept is needed or
not. The callback is called at three points in time: immediately after
beginning execution, after checking privilege exceptions, and after
checking memory exceptions. This suits the different interception points
defined for different instructions and for the various virtualization
instruction sets.
In addition, a new X86EMUL_INTERCEPT is defined, which any callback or
memory access may define, allowing the more complicated intercepts to be
implemented in existing callbacks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
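The three call points described in this commit can be sketched as a small loop over stages. All names below are hypothetical and the emulator's real data structures differ; the sketch only shows the control flow of "marked for interception, in guest mode, ask the callback at each stage".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stage names for the three points in time. */
enum intercept_stage {
    STAGE_PRE_EXCEPT,     /* immediately after beginning execution */
    STAGE_POST_EXCEPT,    /* after checking privilege exceptions */
    STAGE_POST_MEMACCESS  /* after checking memory exceptions */
};

enum emul_result { EMUL_CONTINUE, EMUL_INTERCEPTED };

struct insn {
    bool check_intercept;        /* instruction is marked for interception */
    enum intercept_stage wanted; /* stage at which the intercept applies */
};

typedef enum emul_result (*intercept_cb)(const struct insn *insn,
                                         enum intercept_stage stage);

enum emul_result emulate_insn(const struct insn *insn, bool guest_mode,
                              intercept_cb cb)
{
    static const enum intercept_stage stages[] = {
        STAGE_PRE_EXCEPT, STAGE_POST_EXCEPT, STAGE_POST_MEMACCESS
    };
    size_t i;

    for (i = 0; i < sizeof(stages) / sizeof(stages[0]); i++) {
        /* The checks that precede this stage would run here. */
        if (guest_mode && insn->check_intercept && cb &&
            cb(insn, stages[i]) == EMUL_INTERCEPTED)
            return EMUL_INTERCEPTED; /* no execution, no write-back */
    }
    /* Execution and state write-back would happen here. */
    return EMUL_CONTINUE;
}

/* Example callback: intercept only at the stage the instruction asks for. */
enum emul_result stage_match_cb(const struct insn *insn,
                                enum intercept_stage stage)
{
    return stage == insn->wanted ? EMUL_INTERCEPTED : EMUL_CONTINUE;
}
```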
Avi Kivity [Wed, 20 Jan 2010 16:09:23 +0000 (18:09 +0200)]
KVM: x86 emulator: implement movdqu instruction (f3 0f 6f, f3 0f 7f)
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Mar 2011 09:41:27 +0000 (11:41 +0200)]
KVM: x86 emulator: SSE support
Add support for marking an instruction as SSE, switching registers used
to the SSE register file.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Mar 2011 09:34:38 +0000 (11:34 +0200)]
KVM: x86 emulator: Specialize decoding for insns with 66/f2/f3 prefixes
Most SIMD instructions use the 66/f2/f3 prefixes to distinguish between
different variants of the same instruction. Usually the encoding is quite
regular, but in some cases (including non-SIMD instructions) the prefixes
generate very different instructions. Examples include XCHG/PAUSE,
MOVQ/MOVDQA/MOVDQU, and MOVBE/CRC32.
Allow the emulator to handle these special cases by splitting such opcodes
into groups, with different decode flags and execution functions for different
prefixes.
Signed-off-by: Avi Kivity <avi@redhat.com>
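One way to picture the prefix split is a per-opcode table with one slot per prefix. The struct layout, the use of 0x00 for "no prefix", and the string payloads are assumptions for illustration; the emulator stores decode flags and execution functions rather than names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table layout: one entry per prefix variant of an opcode. */
struct prefix_group {
    const char *pfx_none; /* no 66/f2/f3 prefix */
    const char *pfx_66;
    const char *pfx_f2;
    const char *pfx_f3;
};

/* 0f 6f: MOVQ without prefix, MOVDQA with 66, MOVDQU with f3. */
const struct prefix_group op_0f_6f = { "movq", "movdqa", NULL, "movdqu" };

/* Pick the variant for the prefix byte seen during decode (0x00 = none). */
const char *select_variant(const struct prefix_group *g, uint8_t prefix)
{
    switch (prefix) {
    case 0x00: return g->pfx_none;
    case 0x66: return g->pfx_66;
    case 0xf2: return g->pfx_f2;
    case 0xf3: return g->pfx_f3;
    default:   return NULL;
    }
}
```

A NULL slot stands for a prefix combination that the group does not define.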
Avi Kivity [Mon, 28 Mar 2011 14:53:59 +0000 (16:53 +0200)]
KVM: x86 emulator: define callbacks for using the guest fpu within the emulator
Needed for emulating fpu instructions.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Jan 2010 14:00:35 +0000 (16:00 +0200)]
KVM: x86 emulator: do not munge rep prefix
Currently we store a rep prefix as 1 or 2 depending on whether it is a REPE or
REPNE. Since sse instructions depend on the prefix value, store it as the
original opcode to simplify things further on.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 20 Jan 2010 10:01:20 +0000 (12:01 +0200)]
KVM: 16-byte mmio support
Since sse instructions can issue 16-byte mmios, we need to support them. We
can't increase the kvm_run mmio buffer size to 16 bytes without breaking
compatibility, so instead we break the large mmios into two smaller 8-byte
ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees.
Signed-off-by: Avi Kivity <avi@redhat.com>
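A minimal sketch of the splitting step, assuming an 8-byte limit per fragment as stated in this commit message. The struct and function names are made up for the example and do not correspond to kvm_run fields.

```c
#include <stddef.h>
#include <stdint.h>

#define MMIO_CHUNK 8u /* largest access that fits the existing buffer */

/* Hypothetical descriptor for one piece of a larger access. */
struct mmio_fragment {
    uint64_t gpa;
    unsigned int len;
};

/*
 * Split [gpa, gpa + len) into fragments of at most MMIO_CHUNK bytes.
 * Returns the number of fragments written, or 0 if more than max
 * fragments would be needed.
 */
size_t split_mmio(uint64_t gpa, unsigned int len,
                  struct mmio_fragment *out, size_t max)
{
    size_t n = 0;

    while (len > 0) {
        unsigned int chunk = len < MMIO_CHUNK ? len : MMIO_CHUNK;

        if (n == max)
            return 0;
        out[n].gpa = gpa;
        out[n].len = chunk;
        n++;
        gpa += chunk;
        len -= chunk;
    }
    return n;
}
```

A 16-byte access at 0x1000 yields two fragments, at 0x1000 and 0x1008, each 8 bytes long.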
Avi Kivity [Tue, 19 Jan 2010 12:20:10 +0000 (14:20 +0200)]
KVM: Split mmio completion into a function
Make room for sse mmio completions.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 19 Jan 2010 10:51:22 +0000 (12:51 +0200)]
KVM: extend in-kernel mmio to handle >8 byte transactions
Needed for coalesced mmio using sse.
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Fri, 1 Apr 2011 14:26:29 +0000 (11:26 -0300)]
KVM: x86: better fix for race between nmi injection and enabling nmi window
Fix race between nmi injection and enabling nmi window in a simpler way.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Fri, 1 Apr 2011 14:25:03 +0000 (11:25 -0300)]
Revert "KVM: Fix race between nmi injection and enabling nmi window"
This reverts commit f86368493ec038218e8663cc1b6e5393cd8e008a.
Simpler fix to follow.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Glauber Costa [Wed, 23 Mar 2011 16:40:42 +0000 (13:40 -0300)]
KVM: expose async pf through our standard mechanism
As Avi recently mentioned, the new standard mechanism for exposing features
is KVM_GET_SUPPORTED_CPUID, not spamming CAPs. For some reason async pf
missed that.
So expose async_pf here.
Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Gleb Natapov <gleb@redhat.com>
CC: Avi Kivity <avi@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Wed, 23 Mar 2011 13:02:47 +0000 (15:02 +0200)]
KVM: VMX: simplify NMI mask management
Use vmx_set_nmi_mask() instead of open-coding management of
the hardware bit and the software hint (nmi_known_unmasked).
There's a slight change of behaviour when running without
hardware virtual NMI support - we now clear the NMI mask if
NMI delivery faulted in that case as well. This improves
emulation accuracy.
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Thu, 24 Mar 2011 08:45:10 +0000 (09:45 +0100)]
KVM: SVM: Remove unused svm_features
We use boot_cpu_has now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:39:45 +0000 (17:39 +0200)]
KVM: VMX: Use cached VM_EXIT_INTR_INFO in handle_exception
vmx_complete_atomic_exit() cached it for us, so we can use it here.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:37:37 +0000 (17:37 +0200)]
KVM: VMX: Don't VMREAD VM_EXIT_INTR_INFO unconditionally
Only read it if we're going to use it later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:24:54 +0000 (17:24 +0200)]
KVM: VMX: Refactor vmx_complete_atomic_exit()
Move the exit reason checks to the front of the function, for early
exit in the common case.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 15:20:29 +0000 (17:20 +0200)]
KVM: VMX: Qualify check for host NMI
Check for the exit reason first; this allows us, later,
to avoid a VMREAD for VM_EXIT_INTR_INFO_FIELD.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 14:52:07 +0000 (16:52 +0200)]
KVM: VMX: Avoid vmx_recover_nmi_blocking() when unneeded
When we haven't injected an interrupt, we don't need to recover
the nmi blocking state (since the guest can't set it by itself).
This allows us to avoid a VMREAD later on.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 13:26:44 +0000 (15:26 +0200)]
KVM: VMX: Cache cpl
We may read the cpl quite often in the same vmexit (instruction privilege
check, memory access checks for instruction and operands), so we gain
a bit if we cache the value.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 12:54:28 +0000 (14:54 +0200)]
KVM: VMX: Optimize vmx_get_cpl()
In long mode, vm86 mode is disallowed, so we need not check for
it. Reading rflags.vm may require a VMREAD, so it is expensive.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 7 Mar 2011 10:51:22 +0000 (12:51 +0200)]
KVM: VMX: Optimize vmx_get_rflags()
If called several times within the same exit, return cached results.
Signed-off-by: Avi Kivity <avi@redhat.com>
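The caching idea can be modelled with a valid flag that is cleared on every exit. This is a user-space sketch with hypothetical names; fake_vmread() merely stands in for an expensive VMREAD and counts how often it is called.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a lazily cached register. */
struct toy_vmx {
    uint64_t hw_rflags;     /* value held "in hardware" */
    uint64_t cached_rflags; /* last value read */
    bool rflags_valid;      /* cleared on every VM exit */
    unsigned int vmreads;   /* number of expensive reads performed */
};

uint64_t fake_vmread(struct toy_vmx *v)
{
    v->vmreads++;
    return v->hw_rflags;
}

/* Read through the cache: at most one expensive read per exit. */
uint64_t toy_get_rflags(struct toy_vmx *v)
{
    if (!v->rflags_valid) {
        v->cached_rflags = fake_vmread(v);
        v->rflags_valid = true;
    }
    return v->cached_rflags;
}

/* Invalidate the cache when a new exit begins. */
void toy_on_vmexit(struct toy_vmx *v)
{
    v->rflags_valid = false;
}
```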
Avi Kivity [Mon, 2 Aug 2010 12:30:20 +0000 (15:30 +0300)]
KVM: Use kvm_get_rflags() and kvm_set_rflags() instead of the raw versions
Some rflags bits are owned by the host, not guest, so we need to use
kvm_get_rflags() to strip those bits away or kvm_set_rflags() to add them
back.
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 9 Mar 2011 07:41:59 +0000 (15:41 +0800)]
KVM: cleanup memslot_id function
We can get memslot id from memslot->id directly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Linus Torvalds [Wed, 11 May 2011 00:39:01 +0000 (17:39 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
slcan: fix ldisc->open retval
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
xfrm: Don't allow esn with disabled anti replay detection
xfrm: Assign the inner mode output function to the dst entry
net: dev_close() should check IFF_UP
vlan: fix GVRP at dismantle time
netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0
netfilter: IPv6: fix DSCP mangle code
netfilter: IPv6: initialize TOS field in REJECT target module
IPVS: init and cleanup restructuring
IPVS: Change of socket usage to enable name space exit.
netfilter: ebtables: only call xt_compat_add_offset once per rule
netfilter: fix ebtables compat support
netfilter: ctnetlink: fix timestamp support for new conntracks
pch_gbe: support ML7223 IOH
PCH_GbE : Fixed the issue of checksum judgment
PCH_GbE : Fixed the issue of collision detection
NET: slip, fix ldisc->open retval
be2net: Fixed bugs related to PVID.
ehea: fix wrongly reported speed and port
...
David Rientjes [Wed, 11 May 2011 00:08:54 +0000 (17:08 -0700)]
slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM"
This reverts commit 4a5fa3590f09, which did not allow SLUB to be used
on architectures that use DISCONTIGMEM without compiling NUMA support
unless CONFIG_BROKEN is also set.
The slub panic that it was intended to prevent is addressed by
d9b41e0b54fd ("[PARISC] set memory ranges in N_NORMAL_MEMORY when
onlined") on parisc, so there are no further slub issues with such a
configuration.
The revert now allows SLUB to be used on such architectures since
there haven't been any reports of additional errors.
Cc: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Tue, 10 May 2011 22:04:35 +0000 (15:04 -0700)]
Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6
Oliver Hartkopp [Tue, 10 May 2011 20:12:30 +0000 (13:12 -0700)]
slcan: fix ldisc->open retval
TTY layer expects 0 if the ldisc->open operation succeeded.
Reported-by: Matvejchikov Ilya <matvejchikov@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Mon, 9 May 2011 07:43:20 +0000 (07:43 +0000)]
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
Like other mobile broadband device ethernet interfaces, mark the LG
VL600 with the 'wwan' devtype so userspace knows it needs additional
configuration via the AT port before the interface can be used.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Mon, 9 May 2011 19:43:05 +0000 (19:43 +0000)]
xfrm: Don't allow esn with disabled anti replay detection
Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:
Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.
So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
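The rejected combination can be expressed as a small validation function. The flag value, the struct, and the choice of a zero replay window to model "anti replay disabled" are assumptions for this sketch, not the xfrm data structures.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_STATE_ESN 0x1u /* hypothetical flag bit: ESN requested */

/* Hypothetical stand-in for the relevant parts of an SA. */
struct toy_sa {
    uint32_t flags;
    uint32_t replay_window; /* 0 models disabled anti-replay detection */
};

/* Reject ESN when anti-replay detection is disabled. */
int toy_validate_sa(const struct toy_sa *sa)
{
    if ((sa->flags & TOY_STATE_ESN) && sa->replay_window == 0)
        return -EINVAL;
    return 0;
}
```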
Steffen Klassert [Mon, 9 May 2011 19:36:38 +0000 (19:36 +0000)]
xfrm: Assign the inner mode output function to the dst entry
As it is, we assign the outer mode's output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.
With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 10 May 2011 19:26:06 +0000 (12:26 -0700)]
net: dev_close() should check IFF_UP
Commit 443457242beb (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close().
The following actions trigger a BUG:
modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy
dev_close() must not close a non IFF_UP device.
With help from Frank Blaschka and Einar EL Lueck
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
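The restored test amounts to an early return for devices that are already down. The sketch below uses a hypothetical device struct and flag constant to show the guard; it is not the networking core code.

```c
#define TOY_IFF_UP 0x1u /* hypothetical "interface is up" flag */

/* Hypothetical stand-in for a network device. */
struct toy_netdev {
    unsigned int flags;
    unsigned int stop_calls; /* how often the stop path ran */
};

/* Close the device only if it is currently up. */
int toy_dev_close(struct toy_netdev *dev)
{
    if (!(dev->flags & TOY_IFF_UP))
        return 0;              /* already down: nothing to do */
    dev->stop_calls++;         /* stands in for the driver stop path */
    dev->flags &= ~TOY_IFF_UP;
    return 0;
}
```

Calling it twice in a row runs the stop path once, which is the behaviour that the missing test had broken.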
Eric Dumazet [Tue, 10 May 2011 19:22:54 +0000 (12:22 -0700)]
vlan: fix GVRP at dismantle time
ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3 # driver providing eth2
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
PGD 11d251067 PUD 11b9e0067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
CPU 0
Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]
Pid: 11494, comm: rmmod Tainted: G W 2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
RIP: 0010:[<ffffffffa0030c9e>] [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
RSP: 0018:ffff88007a19bae8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
Stack:
ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
Call Trace:
[<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
[<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
[<ffffffff8137e427>] __dev_close_many+0x87/0xe0
[<ffffffff8137e507>] dev_close_many+0x87/0x110
[<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
[<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
[<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
[<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
[<ffffffff81479d03>] notifier_call_chain+0x53/0x80
[<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
[<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
[<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
[<ffffffff8137e85f>] rollback_registered+0x2f/0x40
[<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
[<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
[<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]
We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 May 2011 19:00:53 +0000 (12:00 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (28 commits)
MIPS: Alchemy: fix xxs1500 build error
MIPS: Invalidate old TLB mappings when updating huge page PTEs.
MIPS: Hibernation: Fixes for PAGE_SIZE >= 64kb
MIPS: JZ4740: Set one-shot feature flag for the clockevent
MIPS: JZ4740: Export symbols to the watchdog driver module
MIPS: JZ4740: Fix GCC 4.6.0 build error.
MIPS: Audit: Fix success success argument pass to audit_syscall_exit
MIPS: Fix calc_vmlinuz_load_addr build warnings.
MIPS: Alchemy: Fix GCC 4.6.0 build error.
MIPS: Document former use of timerfd(2) syscall number.
MIPS: IP27: Fix GCC 4.6.0 build error.
MIPS: IP27: Fix GCC 4.6.0 build error.
MIPS: bcm63xx: Fix header_crc comment in bcm963xx_tag.h
MIPS: Octeon: Guard the Kconfig body with CPU_CAVIUM_OCTEON
MIPS: Octeon: Cleanup Kconfig IRQ_CPU* symbols.
MIPS: Rename .data..mostly and properly handle it in linker script
MIPS: MSP: Fix build error
MIPS: MSP71xx: Fix typo in msp_per_irq_controller
MIPS: Loongson: Fix GCC 2.6.0 build error.
MIPS: Jazz: Fix GCC 4.6.0 build error
...
Linus Torvalds [Tue, 10 May 2011 18:56:35 +0000 (11:56 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix race condition in AIL push trigger
xfs: make AIL target updates and compares 32bit safe.
xfs: always push the AIL to the target
xfs: exit AIL push work correctly when AIL is empty
xfs: ensure reclaim cursor is reset correctly at end of AG
Manuel Lauss [Sat, 7 May 2011 11:55:19 +0000 (13:55 +0200)]
MIPS: Alchemy: fix xxs1500 build error
This fixes:
alchemy/xxs1500/init.c: In function 'prom_init':
alchemy/xxs1500/init.c:57:17: error: ignoring return value of 'kstrtoul', declared with attribute warn_unused_result
Signed-off-by: Manuel Lauss <manuel.lauss@googlemail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2340/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Wed, 27 Apr 2011 23:39:28 +0000 (16:39 -0700)]
MIPS: Invalidate old TLB mappings when updating huge page PTEs.
Without this, stale Icache or TLB entries may be used.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
https://patchwork.linux-mips.org/patch/2318/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Wu Zhangjin [Sat, 23 Apr 2011 21:56:59 +0000 (05:56 +0800)]
MIPS: Hibernation: Fixes for PAGE_SIZE >= 64kb
PAGE_SIZE >= 64kb (1 << 16) is too big to be the immediate of the
addiu/daddiu instruction, so, use addu/daddu instruction instead.
The following compiling error is fixed:
AS arch/mips/power/hibernate.o
arch/mips/power/hibernate.S: Assembler messages:
arch/mips/power/hibernate.S:38: Error: expression out of range
make[2]: *** [arch/mips/power/hibernate.o] Error 1
make[1]: *** [arch/mips/power] Error 2
Reported-by: Roman Mamedov <rm@romanrm.ru>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2313/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
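The assembler error follows from the width of the immediate field: addiu and daddiu take a 16-bit sign-extended immediate. A small range check, written here as a hypothetical helper, shows why a 64 KB page size cannot be encoded that way while smaller sizes such as 4 KB or 16 KB can.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if value fits the 16-bit sign-extended
 * immediate of addiu/daddiu, i.e. the range -32768..32767.
 */
bool fits_simm16(int64_t value)
{
    return value >= -32768 && value <= 32767;
}
```

Values outside that range have to be loaded into a register first and then added with addu/daddu, which is the approach the commit describes.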
Lars-Peter Clausen [Thu, 31 Mar 2011 18:52:20 +0000 (20:52 +0200)]
MIPS: JZ4740: Set one-shot feature flag for the clockevent
The code for supporting one-shot mode for the clockevent is already there,
only the feature flag was not set. Setting the one-shot flag allows the
kernel to run in tickless mode.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2261/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 18 Apr 2011 10:19:32 +0000 (11:19 +0100)]
MIPS: JZ4740: Export symbols to the watchdog driver module
MODPOST 356 modules
ERROR: "jz4740_timer_disable_watchdog" [drivers/watchdog/jz4740_wdt.ko] undefined!
ERROR: "jz4740_timer_enable_watchdog" [drivers/watchdog/jz4740_wdt.ko] undefined!
make[1]: *** [__modpost] Error 1
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 18 Apr 2011 10:16:42 +0000 (11:16 +0100)]
MIPS: JZ4740: Fix GCC 4.6.0 build error.
CC arch/mips/jz4740/dma.o
arch/mips/jz4740/dma.c: In function 'jz4740_dma_chan_irq':
arch/mips/jz4740/dma.c:245:11: error: variable 'status' set but not used [-Werror=unused-but-set-variable]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 21:51:23 +0000 (23:51 +0200)]
MIPS: Audit: Fix success success argument pass to audit_syscall_exit
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 19:49:54 +0000 (21:49 +0200)]
MIPS: Fix calc_vmlinuz_load_addr build warnings.
HOSTCC arch/mips/boot/compressed/calc_vmlinuz_load_addr
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c: In function 'main':
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:35:2: warning: format '%llx' expects type 'long long unsigned int *', but argument 3 has type 'uint64_t *'
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:54:2: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'uint64_t'
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
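Warnings of this kind appear because uint64_t is not necessarily unsigned long long on the build host. One common way to avoid them, shown here as a sketch and not necessarily the change made in this commit, is to use the <inttypes.h> format macros. The function names are made up for the example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a uint64_t as hex without assuming its underlying type. */
int format_load_addr(char *buf, size_t len, uint64_t addr)
{
    return snprintf(buf, len, "0x%" PRIx64, addr);
}

/* Parse a hex string into a uint64_t with the matching scanf macro. */
int parse_hex_u64(const char *s, uint64_t *out)
{
    return sscanf(s, "%" SCNx64, out) == 1 ? 0 : -1;
}
```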
Ralf Baechle [Wed, 13 Apr 2011 19:15:09 +0000 (21:15 +0200)]
MIPS: Alchemy: Fix GCC 4.6.0 build error.
CC arch/mips/alchemy/devboards/db1x00/board_setup.o
arch/mips/alchemy/devboards/db1x00/board_setup.c: In function 'board_setup':
arch/mips/alchemy/devboards/db1x00/board_setup.c:130:6: error: variable 'pin_func' set but not used [-Werror=unused-but-set-variable]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 18:50:46 +0000 (20:50 +0200)]
MIPS: Document former use of timerfd(2) syscall number.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 11 Apr 2011 09:48:31 +0000 (11:48 +0200)]
MIPS: IP27: Fix GCC 4.6.0 build error.
CC arch/mips/sgi-ip27/ip27-hubio.o
arch/mips/sgi-ip27/ip27-hubio.c: In function 'hub_pio_map':
arch/mips/sgi-ip27/ip27-hubio.c:32:20: error: variable 'junk' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 11 Apr 2011 09:37:15 +0000 (11:37 +0200)]
MIPS: IP27: Fix GCC 4.6.0 build error.
CC arch/mips/sgi-ip27/ip27-hubio.o
arch/mips/sgi-ip27/ip27-hubio.c: In function 'hub_pio_map':
arch/mips/sgi-ip27/ip27-hubio.c:32:20: error: variable 'junk' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Jonas Gorski [Fri, 8 Apr 2011 12:32:15 +0000 (14:32 +0200)]
MIPS: bcm63xx: Fix header_crc comment in bcm963xx_tag.h
The CRC32 actually includes the tag_version.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2275/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Fri, 18 Feb 2011 02:23:32 +0000 (18:23 -0800)]
MIPS: Octeon: Guard the Kconfig body with CPU_CAVIUM_OCTEON
Instead of making each Octeon-specific option depend on
CPU_CAVIUM_OCTEON, gate the body of the entire file with
CPU_CAVIUM_OCTEON. With this change, CAVIUM_OCTEON_SPECIFIC_OPTIONS
becomes useless, so get rid of it as well.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2091/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Thu, 17 Feb 2011 22:04:33 +0000 (14:04 -0800)]
MIPS: Octeon: Cleanup Kconfig IRQ_CPU* symbols.
Octeon doesn't use IRQ_CPU, so don't select it.
IRQ_CPU_OCTEON is an unused symbol, so remove it completely.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2086/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Catalin Marinas [Tue, 29 Mar 2011 10:40:06 +0000 (11:40 +0100)]
MIPS: Rename .data..mostly and properly handle it in linker script
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 14:09:25 +0000 (16:09 +0200)]
MIPS: MSP: Fix build error
Reported and original patch by Yoichi Yuasa <yuasa@linux-mips.org>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Tue, 29 Mar 2011 06:53:56 +0000 (15:53 +0900)]
MIPS: MSP71xx: Fix typo in msp_per_irq_controller
CC arch/mips/pmc-sierra/msp71xx/msp_irq_per.o
arch/mips/pmc-sierra/msp71xx/msp_irq_per.c:101:2: error: expected identifier before '.' token
make[2]: *** [arch/mips/pmc-sierra/msp71xx/msp_irq_per.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2246/
Cc: linux-mips <linux-mips@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 10:32:55 +0000 (12:32 +0200)]
MIPS: Loongson: Fix GCC 4.6.0 build error.
CC arch/mips/loongson/common/env.o
arch/mips/loongson/common/env.c: In function 'prom_init_env':
arch/mips/loongson/common/env.c:50:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:51:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:52:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:53:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 10:09:51 +0000 (12:09 +0200)]
MIPS: Jazz: Fix GCC 4.6.0 build error
CC arch/mips/jazz/jazzdma.o
arch/mips/jazz/jazzdma.c: In function 'vdma_remap':
arch/mips/jazz/jazzdma.c:214:20: error: variable 'npages' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:57:11 +0000 (11:57 +0200)]
MIPS: SNI: Fix GCC 4.6.0 build error
CC arch/mips/sni/time.o
arch/mips/sni/time.c: In function 'dosample':
arch/mips/sni/time.c:98:19: error: variable 'lsb' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:48:22 +0000 (11:48 +0200)]
MIPS: Malta: Fix GCC 4.6.0 build error
CC arch/mips/mti-malta/malta-int.o
arch/mips/mti-malta/malta-int.c: In function 'mips_pcibios_iack':
arch/mips/mti-malta/malta-int.c:59:6: error: variable 'dummy' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:43:19 +0000 (11:43 +0200)]
MIPS: Malta: Fix GCC 4.6.0 build error
CC arch/mips/mti-malta/malta-init.o
arch/mips/mti-malta/malta-init.c: In function 'prom_init':
arch/mips/mti-malta/malta-init.c:196:6: error: variable 'result' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:06:49 +0000 (11:06 +0200)]
MIPS: IP22: Fix GCC 4.6.0 build error
CC arch/mips/sgi-ip22/ip22-platform.o
arch/mips/sgi-ip22/ip22-platform.c: In function 'sgiseeq_devinit':
arch/mips/sgi-ip22/ip22-platform.c:135:15: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
While at it rename the variable to pbdma for readability; there is a
local variable tmp of different type being used in two nested blocks.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:00:44 +0000 (11:00 +0200)]
MIPS: IP22: Fix GCC 4.6.0 build error
CC arch/mips/sgi-ip22/ip22-time.o
arch/mips/sgi-ip22/ip22-time.c: In function 'dosample':
arch/mips/sgi-ip22/ip22-time.c:35:10: error: variable 'lsb' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 08:54:54 +0000 (10:54 +0200)]
MIPS: tlbex: Fix GCC 4.6.0 build error
CC arch/mips/mm/tlbex.o
arch/mips/mm/tlbex.c: In function 'build_r4000_tlb_refill_handler':
arch/mips/mm/tlbex.c:1155:22: error: variable 'vmalloc_mode' set but not used [-Werror=unused-but-set-variable]
arch/mips/mm/tlbex.c:1154:28: error: variable 'htlb_info' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 08:50:38 +0000 (10:50 +0200)]
MIPS: c-r4k: Fix GCC 4.6.0 build error
CC arch/mips/mm/c-r4k.o
arch/mips/mm/c-r4k.c: In function 'probe_scache':
arch/mips/mm/c-r4k.c:1078:6: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Older GCC versions didn't warn about the unused variable tmp because it was
getting initialized.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
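The -Wunused-but-set-variable diagnostic behind this series of build errors is new in GCC 4.6. A minimal sketch of the pattern and one way to silence it follows; fake_read(), reads, and probe_sketch() are hypothetical names, and the real fixes in each file may differ.

```c
/* Hypothetical stand-in for a register or cache read with side effects. */
static int reads;

static unsigned int fake_read(void)
{
	reads++;
	return 0xdeadU;
}

/*
 * The pattern GCC 4.6 complains about looks like this:
 *
 *	unsigned int tmp;
 *	tmp = fake_read();	// 'tmp' set but not used
 *
 * If only the side effect of the read matters, the result can be
 * discarded explicitly and the variable dropped.
 */
static void probe_sketch(void)
{
	(void)fake_read();
}
```

With -Werror in effect, as in arch/mips, the warning stops the build, which is why each affected file needed a change.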
David Daney [Tue, 28 Dec 2010 21:21:37 +0000 (13:21 -0800)]
MIPS: Mask jump target in ftrace_dyn_arch_init_insns().
The current code is abusing the uasm interface by passing jump target
addresses with high bits set. Mask the addresses to avoid annoying
messages at boot time.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/1922/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
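As background, a MIPS J instruction carries only a 26-bit word index, so it can encode bits 27..2 of the target address. The sketch below shows the masking idea for a 32-bit address; encode_j() and JUMP_RANGE_MASK_SKETCH are illustrative names and this is not the uasm code itself.

```c
#include <stdint.h>

/* Assumed name: keeps the low 28 bits that a J instruction can encode. */
#define JUMP_RANGE_MASK_SKETCH ((UINT32_C(1) << 28) - 1)

/*
 * Hypothetical encoder for a MIPS "j target" instruction.
 * Opcode 2 sits in bits 31..26; the remaining 26 bits hold the
 * target address shifted right by 2. Masking first drops the high
 * address bits, which the CPU takes from the PC at run time.
 */
static uint32_t encode_j(uint32_t target)
{
	uint32_t masked = target & JUMP_RANGE_MASK_SKETCH;

	return (UINT32_C(2) << 26) | (masked >> 2);
}
```

For example, a target of 0x80100000 masks down to 0x00100000 and encodes as 0x08040000.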
Linus Torvalds [Tue, 10 May 2011 16:41:03 +0000 (09:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/ryusuke/nilfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix infinite loop in nilfs_palloc_freev function
Linus Torvalds [Tue, 10 May 2011 16:39:11 +0000 (09:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: Handle get_user_pages_fast return properly
Ryusuke Konishi [Tue, 10 May 2011 11:59:34 +0000 (20:59 +0900)]
nilfs2: fix infinite loop in nilfs_palloc_freev function
After having applied commit 9954e7af14868b8b ("nilfs2: add free
entries count only if clear bit operation succeeded"), a free routine
of nilfs came to fall into an infinite loop, outputting the same
message endlessly:
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed ...
That patch broke the routine so that a loop counter is never updated
in an abnormal state. This fixes the regression.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
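The failure mode described above can be sketched in isolation: if the "already freed" path skips the loop counter update, the same entry is retried forever. The code below is a simplified illustration, not the nilfs2 code; test_and_clear() and free_entries() are hypothetical helpers, and entry numbers are assumed to be below 32.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a test-and-clear bit operation.
 * Clears bit 'nr' and returns nonzero if it was set before.
 */
static int test_and_clear(uint32_t *bitmap, unsigned int nr)
{
	uint32_t bit = UINT32_C(1) << nr;
	int was_set = (*bitmap & bit) != 0;

	*bitmap &= ~bit;
	return was_set;
}

/*
 * Frees the listed entries and returns how many were actually freed.
 * An entry whose bit is already clear is the abnormal case.
 */
static size_t free_entries(uint32_t *bitmap, const unsigned int *nrs, size_t n)
{
	size_t i = 0, freed = 0;

	while (i < n) {
		if (test_and_clear(bitmap, nrs[i]))
			freed++;
		/*
		 * Advance on both paths. Leaving this out of the
		 * "already freed" branch would spin on the same entry.
		 */
		i++;
	}
	return freed;
}
```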
Pablo Neira Ayuso [Tue, 10 May 2011 10:13:36 +0000 (12:13 +0200)]
netfilter: revert a2361c8735e07322023aedc36e4938b35af31eb0
This patch reverts a2361c8735e07322023aedc36e4938b35af31eb0:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"
Florian Westphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fernando Luis Vazquez Cao [Tue, 10 May 2011 08:00:21 +0000 (10:00 +0200)]
netfilter: IPv6: fix DSCP mangle code
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
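The mask semantics described above can be sketched as follows. This is an illustration of the bit arithmetic only, not the kernel's code; tos_mangle() is a hypothetical helper, and combining the new value with XOR is an assumption of the sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 'mask' names the bits to zero out, so it is
 * inverted before being ANDed with the original TOS byte. The new
 * value is then XORed in.
 */
static uint8_t tos_mangle(uint8_t orig, uint8_t mask, uint8_t value)
{
	return (uint8_t)((orig & (uint8_t)~mask) ^ value);
}
```

With orig = 0xff, mask = 0xfc and value = 0x28, the inverted mask keeps only the low two bits (0x03) and the result is 0x2b. Applying the mask without inverting it would instead keep the very bits that were meant to be cleared.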
Fernando Luis Vazquez Cao [Tue, 10 May 2011 07:55:44 +0000 (09:55 +0200)]
netfilter: IPv6: initialize TOS field in REJECT target module
The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.
We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Hans Schillstrom [Tue, 3 May 2011 20:09:31 +0000 (22:09 +0200)]
IPVS: init and cleanup restructuring
DESCRIPTION
This patch tries to restore the init and cleanup
sequences that existed before the namespace patch.
Netns also requires action when net devices unregister,
which has never been implemented. I.e., this patch also
covers the case where a device moves into a network
namespace and has to be released.
IMPLEMENTATION
The number of calls to register_pernet_device has been
reduced to one for ip_vs.ko.
Schedulers still have their own calls.
This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.
The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
Hans Schillstrom [Tue, 3 May 2011 20:09:30 +0000 (22:09 +0200)]
IPVS: Change of socket usage to enable name space exit.
If the sync daemons are running in a name space when it crashes
or gets killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.
Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release() can't be used; sk_release_kernel() should be used instead.
Thanks to Eric W. Biederman for his advice.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
Florian Westphal [Thu, 21 Apr 2011 08:58:25 +0000 (10:58 +0200)]
netfilter: ebtables: only call xt_compat_add_offset once per rule
The optimizations in commit 255d0dc34068a976
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.
ebtables, however, called it for each match/target found in a rule.
The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.
While at it, also get rid of the unused COMPAT iterator macros.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Eric Dumazet [Thu, 21 Apr 2011 08:57:21 +0000 (10:57 +0200)]
netfilter: fix ebtables compat support
Commit 255d0dc34068a976 (netfilter: x_table: speedup compat operations)
made ebtables stop working:
1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call
Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Pablo Neira Ayuso [Thu, 21 Apr 2011 08:55:07 +0000 (10:55 +0200)]
netfilter: ctnetlink: fix timestamp support for new conntracks
This patch fixes the missing initialization of the start time if
the timestamp support is enabled.
libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
M. Mohan Kumar [Fri, 15 Apr 2011 08:29:33 +0000 (13:59 +0530)]
net/9p: Handle get_user_pages_fast return properly
Use proper data type to handle get_user_pages_fast error condition. Also
do not treat EFAULT error as fatal.
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Linus Torvalds [Tue, 10 May 2011 02:33:54 +0000 (19:33 -0700)]
Linux 2.6.39-rc7