GitHub/moto-9609/android_kernel_motorola_exynos9610.git
14 years agox86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct...
Yinghai Lu [Wed, 25 Aug 2010 20:39:18 +0000 (13:39 -0700)]
x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve

memblock_memory_size() returns the memory size in memblock.memory.region.
memblock_free_memory_size() returns the free memory size in memblock.memory.region.

So we can get the exact reserved size in a specified range.

Set the size right after initmem_init(), because the bootmem API will later
allocate areas above 16M (except for some fallbacks).

Later, after we remove bootmem, we could call that just before paging_init().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86: Remove not used early_res code
Yinghai Lu [Wed, 25 Aug 2010 20:39:18 +0000 (13:39 -0700)]
x86: Remove not used early_res code

Also remove some functions in e820.c that are not used anymore.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Replace e820_/_early string with memblock_
Yinghai Lu [Wed, 25 Aug 2010 20:39:17 +0000 (13:39 -0700)]
x86, memblock: Replace e820_/_early string with memblock_

1. Include linux/memblock.h directly, so that e820.h references can be reduced later.
2. This patch is mainly done by sed scripts.

-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86: Use memblock to replace early_res
Yinghai Lu [Wed, 25 Aug 2010 20:39:17 +0000 (13:39 -0700)]
x86: Use memblock to replace early_res

1. Replace find_e820_area with memblock_find_in_range.
2. Replace reserve_early with memblock_x86_reserve_range.
3. Replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. Use _e820 and _early wrappers in this patch; a following patch will
   replace them all.
6. Because memblock_x86_free_range supports partial free, we can remove some special-case handling.
7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill(),
   so move some calls later in setup.c::setup_arch()
   -- corruption_check and mptable_update

-v2: Move reserve_brk() early
    Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range().
    That could happen when we have more than 128 RAM entries in the E820 tables, and
    memblock_x86_fill() could use memblock_find_in_range() to find a new place for
    the memblock.memory.region array.
    We also don't need to use extend_brk() after fill_memblock_area(),
    so move reserve_brk() early, before fill_memblock_area().
-v3: Move find_smp_config early
    To make sure memblock_find_in_range does not find the wrong place if the BIOS doesn't put
    the mptable in the right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory; those ranges are already in
    memblock.reserved.
    Use __NOT_KEEP_MEMBLOCK to make sure memblock related code can be freed later.
-v5: The generic version of __memblock_find_in_range() goes from high to low, and for 32bit
    the active_region does include high pages, so we
    need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
-v6: Use current_limit instead
-v7: Check against MEMBLOCK_ERROR instead of -1ULL or -1L
-v8: Set memblock_can_resize early to handle EFI with more RAM entries
-v9: Update after kmemleak changes in mainline

Suggested-by: David S. Miller <davem@davemloft.net>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Use memblock_debug to control debug message print out
Yinghai Lu [Wed, 25 Aug 2010 20:39:17 +0000 (13:39 -0700)]
x86, memblock: Use memblock_debug to control debug message print out

Also let memblock_x86_reserve_range/memblock_x86_free_range print out the name if memblock=debug is
specified.

The name will also be printed when reserve_memblock_area/free_memblock_area are called.

-v2: according to Ingo, put " if (memblock_debug) " in one place

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_memory_in_range()
Yinghai Lu [Wed, 25 Aug 2010 20:39:17 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_memory_in_range()

It returns the memory size in the specified range according to memblock.memory.region.

Try to share some code with memblock_x86_free_memory_in_range() by passing get_free to
__memblock_x86_memory_in_range().

-v2: Ben wants _in_range in the name instead of size

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_free_memory_in_range()
Yinghai Lu [Wed, 25 Aug 2010 20:39:16 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_free_memory_in_range()

It returns the free memory size in the specified range.

We cannot use memory_size - reserved_size here, because some reserved areas
may not be in the scope of memblock.memory.region.

Subtract memblock.reserved.region from memblock.memory.region to get the free range array,
then count the size of all free ranges.
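
The reasoning above can be sketched in plain C. This is only an illustration,
not the code from this patch: struct region, overlap_len() and
free_memory_in_range() are hypothetical names, and the sketch counts the free
bytes directly instead of building a free range array first. It assumes that
regions within each list do not overlap each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memblock region: [base, base + size). */
struct region {
    uint64_t base;
    uint64_t size;
};

/* Length of the overlap between [a_start, a_end) and [b_start, b_end). */
static uint64_t overlap_len(uint64_t a_start, uint64_t a_end,
                            uint64_t b_start, uint64_t b_end)
{
    uint64_t lo = a_start > b_start ? a_start : b_start;
    uint64_t hi = a_end < b_end ? a_end : b_end;

    return hi > lo ? hi - lo : 0;
}

/*
 * Free bytes inside [start, end): clip every memory region to the range,
 * then subtract only the reserved bytes that fall inside that clipped
 * memory region. Reserved areas outside memory are never subtracted.
 */
static uint64_t free_memory_in_range(const struct region *mem, size_t nr_mem,
                                     const struct region *rsv, size_t nr_rsv,
                                     uint64_t start, uint64_t end)
{
    uint64_t free_bytes = 0;

    assert(start <= end);
    for (size_t i = 0; i < nr_mem; i++) {
        uint64_t m_start = mem[i].base > start ? mem[i].base : start;
        uint64_t m_end = mem[i].base + mem[i].size;
        uint64_t covered = 0;

        if (m_end > end)
            m_end = end;
        if (m_end <= m_start)
            continue;
        for (size_t j = 0; j < nr_rsv; j++)
            covered += overlap_len(m_start, m_end, rsv[j].base,
                                   rsv[j].base + rsv[j].size);
        free_bytes += (m_end - m_start) - covered;
    }
    return free_bytes;
}
```

With one memory region [0x1000, 0xA000) and reserved regions [0x2000, 0x3000)
and [0xF000, 0x10000), memory_size - reserved_size gives 0x7000, while the
sketch gives 0x8000 because the second reserved area lies outside memory.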

-v2: Ben insists on using _in_range

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_find_in_range_node()
Yinghai Lu [Wed, 25 Aug 2010 20:39:16 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_find_in_range_node()

It can be used to find NODE_DATA for NUMA.

Need to make sure early_node_map[] is filled before it is called; otherwise
it will fall back to memblock_find_in_range() with the node range.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agomemblock: Add find_memory_core_early()
Yinghai Lu [Wed, 25 Aug 2010 20:39:16 +0000 (13:39 -0700)]
memblock: Add find_memory_core_early()

Use __memblock_find_in_range() to find a free range according to the node range
in early_node_map[].

Will be used by memblock_x86_find_in_range_node()

memblock_x86_find_in_range_node will be used to find the right buffer for NODE_DATA

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_register_active_regions() and memblock_x86_hole_size()
Yinghai Lu [Wed, 25 Aug 2010 20:39:16 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_register_active_regions() and memblock_x86_hole_size()

memblock_x86_register_active_regions() will be used to fill early_node_map;
the result will be memblock.memory.region AND NUMA data.

memblock_x86_hole_size() will be used to find the hole size in memblock.memory.region
within the specified range.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add get_free_all_memory_range()
Yinghai Lu [Wed, 25 Aug 2010 20:39:16 +0000 (13:39 -0700)]
x86, memblock: Add get_free_all_memory_range()

get_free_all_memory_range() is for CONFIG_NO_BOOTMEM=y, and will be called by
free_all_memory_core_early().

It subtracts memblock.reserved from early_node_map (aka the active ranges) to
get all free ranges, and those ranges will be converted to slab pages.

-v4: increase range size

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_reserve_range/memblock_x86_free_range
Yinghai Lu [Wed, 25 Aug 2010 20:39:15 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_reserve_range/memblock_x86_free_range

They are wrappers for the core versions, which take start/end/name instead
of base/size.  This will make the x86 conversion easier.

More debug printout could be added.
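
A rough sketch of such a wrapper, for illustration only. core_reserve(),
reserve_range(), debug_enabled, last_base and last_size are hypothetical
names that stand in for the core base/size function, the start/end/name
wrapper and the debug switch; the real functions are not reproduced here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug switch, standing in for a memblock=debug style flag. */
static int debug_enabled;

/* Records the last base/size handed to the core, so callers can inspect it. */
static uint64_t last_base, last_size;

/* Hypothetical core function that works on base/size. */
static int core_reserve(uint64_t base, uint64_t size)
{
    last_base = base;
    last_size = size;
    return 0;
}

/* Wrapper taking start/end/name; end is exclusive in this sketch. */
static int reserve_range(uint64_t start, uint64_t end, const char *name)
{
    assert(start <= end);
    if (start == end)
        return 0; /* empty range: nothing to reserve */
    if (debug_enabled)
        printf("reserve: [%#010" PRIx64 "-%#010" PRIx64 "] %s\n",
               start, end - 1, name);
    return core_reserve(start, end - start);
}
```

Callers that already track start/end pairs, as the old early_res users did,
can then be converted by renaming the call, and the name only costs anything
when debug output is enabled.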

-v2: change get_max_mapped() to memblock.default_alloc_limit according to Michael
      Ellerman and Ben
     change to memblock_x86_reserve_range and memblock_x86_free_range according to Michael Ellerman
-v3: call check_and_double after reserve/free, so we can avoid using
      find_memblock_area. Suggested by Michael Ellerman

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_to_bootmem()
Yinghai Lu [Wed, 25 Aug 2010 20:39:15 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_to_bootmem()

memblock_x86_to_bootmem() will reserve memblock.reserved.region in
bootmem after bootmem is set up.

We can use it with all arches that support memblock later.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agobootmem, x86: Add weak version of reserve_bootmem_generic
Yinghai Lu [Wed, 25 Aug 2010 20:39:15 +0000 (13:39 -0700)]
bootmem, x86: Add weak version of reserve_bootmem_generic

It will be used when converting with memblock_x86_to_bootmem.

It is a wrapper for reserve_bootmem; x86 64bit uses a special one.

Also clean up that version for x86_64. We don't need to take care of the numa
path for that; bootmem can handle it now.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agox86, memblock: Add memblock_x86_find_in_range_size()
Yinghai Lu [Wed, 25 Aug 2010 20:39:15 +0000 (13:39 -0700)]
x86, memblock: Add memblock_x86_find_in_range_size()

The size is returned according to the free range.
Will be used to find free ranges for early_memtest and memory corruption check

Do not mess it up with lib/memblock.c yet.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agomemblock: Add memblock_free/reserve_reserved_regions()
Yinghai Lu [Wed, 25 Aug 2010 20:39:14 +0000 (13:39 -0700)]
memblock: Add memblock_free/reserve_reserved_regions()

So we can avoid exporting memblock_reserved_init_regions().
Suggested by Ben.

-v2: use __init_memblock attribute

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
14 years agomemblock: Add memblock_find_in_range()
Yinghai Lu [Wed, 28 Jul 2010 05:38:40 +0000 (15:38 +1000)]
memblock: Add memblock_find_in_range()

This is a wrapper for memblock_find_base() using slightly different
arguments (start,end instead of start,size for example) in order to
make it easier to convert existing arch/x86 code.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Option for the architecture to put memblock into the .init section
Yinghai Lu [Wed, 28 Jul 2010 05:43:02 +0000 (15:43 +1000)]
memblock: Option for the architecture to put memblock into the .init section

Arch code can define ARCH_DISCARD_MEMBLOCK in asm/memblock.h,
which in turns causes memblock code and data to go respectively
into the .init and .initdata sections. This will be used by the
x86 architecture.

If ARCH_DISCARD_MEMBLOCK is defined, the debugfs files to inspect
the memblock arrays after boot are not created.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Protect memblock.h with CONFIG_HAVE_MEMBLOCK
Yinghai Lu [Wed, 28 Jul 2010 05:28:21 +0000 (15:28 +1000)]
memblock: Protect memblock.h with CONFIG_HAVE_MEMBLOCK

This should make it easier to catch/debug incorrect use when
the CONFIG_ option isn't set.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Make MEMBLOCK_ERROR be 0
Benjamin Herrenschmidt [Wed, 28 Jul 2010 05:25:10 +0000 (15:25 +1000)]
memblock: Make MEMBLOCK_ERROR be 0

And ensure we don't hand out 0 as a valid allocation. We put the
low limit at PAGE_SIZE arbitrarily.
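
A minimal sketch of the idea, under stated assumptions: SKETCH_PAGE_SIZE,
SKETCH_ERROR and find_in_free_range() are hypothetical names, the page size
is assumed to be 4096, and the search is reduced to a single free range. It
is not the memblock code itself.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_ERROR 0u        /* "no allocation" sentinel */

/*
 * Find size bytes aligned to align (a power of two) inside the free range
 * [free_start, free_end). The search never starts below one page, so
 * address 0 is never handed out and can safely mean "failure".
 */
static uint64_t find_in_free_range(uint64_t free_start, uint64_t free_end,
                                   uint64_t size, uint64_t align)
{
    uint64_t base;

    assert(align != 0 && (align & (align - 1)) == 0);
    if (free_start < SKETCH_PAGE_SIZE)
        free_start = SKETCH_PAGE_SIZE; /* keep 0 out of the results */
    base = (free_start + align - 1) & ~(align - 1);
    if (size == 0 || base >= free_end || free_end - base < size)
        return SKETCH_ERROR;
    return base;
}
```

Callers can then write "if (!addr)" to test for failure, which is the usual
convention for allocators that return pointers.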

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Export MEMBLOCK_ERROR
Yinghai Lu [Wed, 28 Jul 2010 05:20:58 +0000 (15:20 +1000)]
memblock: Export MEMBLOCK_ERROR

Will be used by x86 memblock_x86_find_in_range_node and the nobootmem replacement

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Improve debug output when resizing the reserve array
Yinghai Lu [Wed, 28 Jul 2010 05:13:22 +0000 (15:13 +1000)]
memblock: Improve debug output when resizing the reserve array

Print out the location info in addition to which array is being
resized. Also use memblock_dbg() to put that under control of
the memblock_debug flag.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Expose some memblock bits for use by x86
Yinghai Lu [Wed, 28 Jul 2010 05:07:21 +0000 (15:07 +1000)]
memblock: Expose some memblock bits for use by x86

This exposes memblock_debug and associated memblock_dbg() macro,
along with memblock_can_resize so that x86 can use these when
ported to use memblock

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Add debugfs files to dump the arrays content
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:19 +0000 (15:39 -0700)]
memblock: Add debugfs files to dump the arrays content

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Make memblock_alloc_try_nid() fallback to MEMBLOCK_ALLOC_ANYWHERE
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:18 +0000 (15:39 -0700)]
memblock: Make memblock_alloc_try_nid() fallback to MEMBLOCK_ALLOC_ANYWHERE

memblock_alloc_nid() used to fallback to allocating anywhere by using
memblock_alloc() as a fallback.

However, some of my previous patches limit memblock_alloc() to the region
covered by MEMBLOCK_ALLOC_ACCESSIBLE which is not quite what we want
for memblock_alloc_try_nid().

So we fix it by explicitly using MEMBLOCK_ALLOC_ANYWHERE.

Note that so far only sparc uses memblock_alloc_nid() and it hasn't been updated
to clamp the accessible zone yet. Thus the temporary "breakage" should have
no effect.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Separate memblock_alloc_nid() and memblock_alloc_try_nid()
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:17 +0000 (15:39 -0700)]
memblock: Separate memblock_alloc_nid() and memblock_alloc_try_nid()

The former is now strict: it will fail if it cannot honor the allocation
within the node, while the latter implements the previous semantics, which
fall back to allocating anywhere.
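
The split can be sketched as follows. This is a simplified illustration, not
the memblock code: alloc_nid() and alloc_try_nid() are hypothetical names,
each node is modelled as one bump-allocated range, and "anywhere" is
approximated by trying the other nodes in order. 0 is used as the failure
value.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES 2

/* Hypothetical per-node free ranges: [node_next[n], node_end[n]). */
static uint64_t node_next[NR_NODES] = { 0x1000, 0x100000 };
static uint64_t node_end[NR_NODES] = { 0x3000, 0x200000 };

/* Strict variant: succeed only inside the requested node, else return 0. */
static uint64_t alloc_nid(uint64_t size, int nid)
{
    uint64_t base;

    assert(nid >= 0 && nid < NR_NODES);
    if (size == 0 || node_end[nid] - node_next[nid] < size)
        return 0;
    base = node_next[nid];
    node_next[nid] += size;
    return base;
}

/* Relaxed variant: prefer the node, then fall back to any other node. */
static uint64_t alloc_try_nid(uint64_t size, int nid)
{
    uint64_t base = alloc_nid(size, nid);

    for (int n = 0; base == 0 && n < NR_NODES; n++)
        if (n != nid)
            base = alloc_nid(size, n);
    return base;
}
```

A caller that needs node-local memory checks the result of the strict call,
while a caller that only prefers locality uses the relaxed one.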

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: NUMA allocate can now use early_pfn_map
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:16 +0000 (15:39 -0700)]
memblock: NUMA allocate can now use early_pfn_map

We now provide a default (weak) implementation of memblock_nid_range()
which uses the early_pfn_map[] if CONFIG_ARCH_POPULATES_NODE_MAP
is set. Sparc still needs to use its own method due to the way
the pages can be scattered between nodes.

This implementation is inefficient due to our main algorithm and
callback construct wanting to work on an ascending address basis
while early_pfn_map[] would rather work with nid's (it's unsorted
at that stage). But it should work and we can look into improving
it subsequently, possibly using arch compile options to choose a
different algorithm altogether.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Add "start" argument to memblock_find_base()
Benjamin Herrenschmidt [Mon, 12 Jul 2010 05:00:34 +0000 (15:00 +1000)]
memblock: Add "start" argument to memblock_find_base()

To constrain the search of a region between two boundaries,
which will be used by the new NUMA aware allocator among others.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Add arch function to control coalescing of memblock memory regions
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:14 +0000 (15:39 -0700)]
memblock: Add arch function to control coalescing of memblock memory regions

Some archs such as ARM want to avoid coalescing across things such
as the lowmem/highmem boundary or similar. This provides the option
to control it via an arch callback for which a weak default is provided
which always allows coalescing.
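
For illustration, the hook can be modelled in portable C with a function
pointer instead of a weak symbol. All names here (can_coalesce,
always_coalesce, refuse_across_boundary, split_boundary, try_merge) are
hypothetical, and the boundary value is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

struct region {
    uint64_t base;
    uint64_t size;
};

typedef int (*can_coalesce_fn)(uint64_t base1, uint64_t size1,
                               uint64_t base2, uint64_t size2);

/* Default policy: always allow coalescing. */
static int always_coalesce(uint64_t base1, uint64_t size1,
                           uint64_t base2, uint64_t size2)
{
    (void)base1; (void)size1; (void)base2; (void)size2;
    return 1;
}

/* The hook an architecture would override; points at the default here. */
static can_coalesce_fn can_coalesce = always_coalesce;

/* Example arch policy: never merge across a (made up) boundary address. */
static uint64_t split_boundary = 0x3000;

static int refuse_across_boundary(uint64_t base1, uint64_t size1,
                                  uint64_t base2, uint64_t size2)
{
    (void)base1; (void)size1; (void)size2;
    return base2 != split_boundary;
}

/* Merge b into a if b starts where a ends and the policy allows it. */
static int try_merge(struct region *a, const struct region *b)
{
    assert(a && b);
    if (a->base + a->size != b->base)
        return 0;
    if (!can_coalesce(a->base, a->size, b->base, b->size))
        return 0;
    a->size += b->size;
    return 1;
}
```

With the default policy two adjacent regions merge; after installing
refuse_across_boundary, a region that starts at split_boundary stays separate.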

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Add array resizing support
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:13 +0000 (15:39 -0700)]
memblock: Add array resizing support

When one of the arrays gets full, we resize it. After much thinking and
a few iterations of that code, I went back to on-demand resizing using
the (new) internal memblock_find_base() function, which is pretty much what
Yinghai initially proposed, though there are some differences in the details.

To work, this relies on the default alloc limit being set sensibly by
the architecture.
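
On-demand resizing can be sketched like this. It is an illustration under
stated assumptions: region_array, double_array() and add_region() are
hypothetical names, and realloc() stands in for finding a new home for the
array with memblock_find_base(), which a boot-time allocator cannot rely on.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct region {
    uint64_t base;
    uint64_t size;
};

struct region_array {
    struct region *regions;
    size_t cnt; /* entries in use */
    size_t max; /* allocated capacity */
};

/* Double the capacity. Returns 0 on success, -1 if no memory was found. */
static int double_array(struct region_array *a)
{
    size_t new_max = a->max ? a->max * 2 : 4;
    struct region *p = realloc(a->regions, new_max * sizeof(*p));

    if (!p)
        return -1;
    a->regions = p;
    a->max = new_max;
    return 0;
}

/* Append a region, growing the array only when it is actually full. */
static int add_region(struct region_array *a, uint64_t base, uint64_t size)
{
    assert(a != NULL);
    if (a->cnt == a->max && double_array(a) != 0)
        return -1;
    a->regions[a->cnt].base = base;
    a->regions[a->cnt].size = size;
    a->cnt++;
    return 0;
}
```

The dependency on a sensible allocation limit shows up at the point where
realloc() is called here: the new array must land in memory the caller can
actually address.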

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Move functions around into a more sensible order
Benjamin Herrenschmidt [Mon, 12 Jul 2010 04:36:48 +0000 (14:36 +1000)]
memblock: Move functions around into a more sensible order

Some shuffling is needed for doing array resize so we may as well
put some sense into the ordering of the functions in the whole memblock.c
file. No code change. Added some comments.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: split memblock_find_base() out of __memblock_alloc_base()
Benjamin Herrenschmidt [Mon, 12 Jul 2010 04:24:57 +0000 (14:24 +1000)]
memblock: split memblock_find_base() out of __memblock_alloc_base()

This will be used by the array resize code and might prove useful
to some arch code as well at which point it can be made non-static.

Also add comment as to why aligning size is important

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2. Fix loss of size alignment
v3. Fix result code

14 years agomemblock: Move memblock_init() to the bottom of the file
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:10 +0000 (15:39 -0700)]
memblock: Move memblock_init() to the bottom of the file

It's a real PITA to have to search for it in the middle

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Define MEMBLOCK_ERROR internally instead of using ~(phys_addr_t)0
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:09 +0000 (15:39 -0700)]
memblock: Define MEMBLOCK_ERROR internally instead of using ~(phys_addr_t)0

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Make memblock_find_region() out of memblock_alloc_region()
Benjamin Herrenschmidt [Mon, 12 Jul 2010 03:28:15 +0000 (13:28 +1000)]
memblock: Make memblock_find_region() out of memblock_alloc_region()

This function will be used to locate a free area to put the new memblock
arrays when attempting to resize them. memblock_alloc_region() is gone,
the two callsites now call memblock_add_region().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2. Fix membase_alloc_nid_region() conversion

14 years agomemblock: Add debug markers at the end of the array
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:07 +0000 (15:39 -0700)]
memblock: Add debug markers at the end of the array

Since we allocate one more than needed, why not do a bit of sanity checking
here to ensure we don't walk past the end of the array?

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Move memblock arrays to static storage in memblock.c and make their size...
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:06 +0000 (15:39 -0700)]
memblock: Move memblock arrays to static storage in memblock.c and make their size a variable

This is in preparation for having resizable arrays.

Note that we still allocate one more than needed, this is unchanged from
the previous implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove memblock_type.size and add memblock.memory_size instead
Benjamin Herrenschmidt [Wed, 28 Jul 2010 04:31:29 +0000 (14:31 +1000)]
memblock: Remove memblock_type.size and add memblock.memory_size instead

Right now, both the "memory" and "reserved" memblock_type structures have
a "size" member. It represents the calculated memory size in the former
case and is unused in the latter.

This moves it out to the main memblock structure instead

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove unused memblock.debug struct member
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:04 +0000 (15:39 -0700)]
memblock: Remove unused memblock.debug struct member

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Change u64 to phys_addr_t
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:34:42 +0000 (13:34 +1000)]
memblock: Change u64 to phys_addr_t

Let's not waste space and cycles on archs that don't support >32-bit
physical address space.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove rmo_size, burry it in arch/powerpc where it belongs
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:02 +0000 (15:39 -0700)]
memblock: Remove rmo_size, burry it in arch/powerpc where it belongs

The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.

We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Introduce default allocation limit and use it to replace explicit ones
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:01 +0000 (15:39 -0700)]
memblock: Introduce default allocation limit and use it to replace explicit ones

This introduces memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).

The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.

It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.

Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that a memblock_alloc() at any time
results in something that is accessible with a simple __va().
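
How the limit is resolved can be sketched as below. The names
(SKETCH_ALLOC_ANYWHERE, SKETCH_ALLOC_ACCESSIBLE, set_current_limit,
effective_limit) are hypothetical, and using 0 for the "accessible" marker is
an assumption of this sketch, not something stated in this message.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ALLOC_ANYWHERE (~(uint64_t)0)
#define SKETCH_ALLOC_ACCESSIBLE ((uint64_t)0) /* assumed marker value */

/* Default limit is "anywhere" until the architecture sets a real one. */
static uint64_t current_limit = SKETCH_ALLOC_ANYWHERE;

/* Called by arch code during boot, e.g. once the linear mapping is known. */
static void set_current_limit(uint64_t limit)
{
    assert(limit != SKETCH_ALLOC_ACCESSIBLE);
    current_limit = limit;
}

/* Translate the max_addr argument of an allocation into a real bound. */
static uint64_t effective_limit(uint64_t max_addr)
{
    if (max_addr == SKETCH_ALLOC_ACCESSIBLE)
        return current_limit;
    return max_addr; /* explicit bound, including "anywhere" */
}
```

An allocation that passes the "accessible" marker is therefore bounded by
whatever the architecture configured, while an explicit "anywhere" request
bypasses the limit.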

The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Expose MEMBLOCK_ALLOC_ANYWHERE
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:39:00 +0000 (15:39 -0700)]
memblock: Expose MEMBLOCK_ALLOC_ANYWHERE

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Factor the lowest level alloc function
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:38:59 +0000 (15:38 -0700)]
memblock: Factor the lowest level alloc function

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove nid_range argument, arch provides memblock_nid_range() instead
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:38:58 +0000 (15:38 -0700)]
memblock: Remove nid_range argument, arch provides memblock_nid_range() instead

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove memblock_find()
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:52:55 +0000 (13:52 +1000)]
memblock: Remove memblock_find()

Nobody uses it anymore. It's semantics were ... weird

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Remove obsolete accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:52:25 +0000 (13:52 +1000)]
memblock: Remove obsolete accessors

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/arm: Use new accessors
Benjamin Herrenschmidt [Thu, 5 Aug 2010 02:55:55 +0000 (12:55 +1000)]
memblock/arm: Use new accessors

CC: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/powerpc: Use new accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:43:53 +0000 (13:43 +1000)]
memblock/powerpc: Use new accessors

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/sparc: Use new accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:43:31 +0000 (13:43 +1000)]
memblock/sparc: Use new accessors

CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/sh: Use new accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:11:04 +0000 (14:11 +1000)]
memblock/sh: Use new accessors

CC: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/microblaze: Use new accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:13:06 +0000 (14:13 +1000)]
memblock/microblaze: Use new accessors

CC: Michal Simek <monstr@monstr.eu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Introduce for_each_memblock() and new accessors
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:40:38 +0000 (13:40 +1000)]
memblock: Introduce for_each_memblock() and new accessors

Walk memblocks using for_each_memblock() and use memblock_region_base/end_pfn() for
getting to PFNs.
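
A usage sketch of this style of iteration. for_each_region(),
region_base_pfn(), region_end_pfn() and total_pages() are hypothetical
stand-ins written for this example; the 4 KiB page size and the rounding
(base rounded up, end rounded down, so only whole pages are counted) are
assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

struct region {
    uint64_t base;
    uint64_t size;
};

struct region_array {
    size_t cnt;
    struct region *regions;
};

/* Iterate over every region of an array, in the spirit of for_each_memblock(). */
#define for_each_region(reg, arr) \
    for ((reg) = (arr)->regions; (reg) < (arr)->regions + (arr)->cnt; (reg)++)

/* First whole page frame inside the region (base rounded up). */
static uint64_t region_base_pfn(const struct region *r)
{
    return (r->base + ((uint64_t)1 << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
}

/* One past the last whole page frame inside the region (end rounded down). */
static uint64_t region_end_pfn(const struct region *r)
{
    return (r->base + r->size) >> SKETCH_PAGE_SHIFT;
}

/* Count the whole pages covered by all regions. */
static uint64_t total_pages(const struct region_array *arr)
{
    const struct region *reg;
    uint64_t pages = 0;

    assert(arr != NULL);
    for_each_region(reg, arr) {
        uint64_t start = region_base_pfn(reg);
        uint64_t end = region_end_pfn(reg);

        if (end > start)
            pages += end - start;
    }
    return pages;
}
```

Arch code written this way no longer indexes the region array by hand, which
is what allows the array layout to change later.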

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/arm: Use memblock_region_is_memory() for omap fb
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:09:23 +0000 (14:09 +1000)]
memblock/arm: Use memblock_region_is_memory() for omap fb

Instead of the deprecated memblock_find()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock/arm: pfn_valid uses memblock_is_memory()
Benjamin Herrenschmidt [Wed, 4 Aug 2010 03:23:02 +0000 (13:23 +1000)]
memblock/arm: pfn_valid uses memblock_is_memory()

The implementation is pretty much similar. There is a -small- added
overhead by having another function call and the address shift.

If that becomes a concern, I suppose we could actually have memblock
itself expose a memblock_pfn_valid() which then ARM can use directly
with an appropriate #define...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Implement memblock_is_memory and memblock_is_region_memory
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:38:47 +0000 (14:38 +1000)]
memblock: Implement memblock_is_memory and memblock_is_region_memory

To make it fast, we steal ARM's binary search for memblock_is_memory()
and we also use that to replace the existing implementation of
memblock_is_reserved().
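
The lookup can be sketched as a binary search over regions sorted by base
address. search_regions() and is_memory() are hypothetical names for this
illustration, and the regions are assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    uint64_t base;
    uint64_t size;
};

/*
 * Return the index of the region containing addr, or -1 if there is none.
 * The regions must be sorted by base and must not overlap.
 */
static int search_regions(const struct region *r, size_t cnt, uint64_t addr)
{
    size_t left = 0, right = cnt;

    assert(r != NULL || cnt == 0);
    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (addr < r[mid].base)
            right = mid;
        else if (addr >= r[mid].base + r[mid].size)
            left = mid + 1;
        else
            return (int)mid;
    }
    return -1;
}

/* Boolean helper built on the search, usable for any region list. */
static int is_memory(const struct region *r, size_t cnt, uint64_t addr)
{
    return search_regions(r, cnt, addr) != -1;
}
```

The same search works for the reserved list, since both lists are kept
sorted; only the array passed in changes.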

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: No reason to include asm/memblock.h late
Benjamin Herrenschmidt [Tue, 6 Jul 2010 22:38:56 +0000 (15:38 -0700)]
memblock: No reason to include asm/memblock.h late

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Rename memblock_region to memblock_type and memblock_property to memblock_r...
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:06:41 +0000 (14:06 +1000)]
memblock: Rename memblock_region to memblock_type and memblock_property to memblock_region

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agomemblock: Fix memblock_is_region_reserved() to return a boolean
Benjamin Herrenschmidt [Wed, 4 Aug 2010 04:17:17 +0000 (14:17 +1000)]
memblock: Fix memblock_is_region_reserved() to return a boolean

All callers expect a boolean result which is true if the region
overlaps a reserved region. However, the implementation actually
returns -1 if there is no overlap, and a region index (0 based)
if there is.

Make it behave as callers (and common sense) expect.
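
The mismatch is easy to see in a small sketch. region_overlap_index() models
the old style result (index or -1) and is_region_reserved() the boolean one;
both names and the struct are hypothetical and only illustrate the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    uint64_t base;
    uint64_t size;
};

/* Old style: index of the first overlapping region, or -1 if none. */
static long region_overlap_index(const struct region *r, size_t cnt,
                                 uint64_t base, uint64_t size)
{
    assert(r != NULL || cnt == 0);
    for (size_t i = 0; i < cnt; i++) {
        uint64_t r_end = r[i].base + r[i].size;

        if (base < r_end && r[i].base < base + size)
            return (long)i;
    }
    return -1;
}

/* New style: a plain boolean, true when any reserved region overlaps. */
static int is_region_reserved(const struct region *r, size_t cnt,
                              uint64_t base, uint64_t size)
{
    return region_overlap_index(r, cnt, base, size) >= 0;
}
```

Used directly in an if (), the old result is false for an overlap with
region 0 and true (-1) when nothing overlaps, which is the opposite of what
callers expect.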

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
Linus Torvalds [Tue, 3 Aug 2010 21:40:10 +0000 (14:40 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
  GFS2: Fix recovery stuck bug (try #2)
  GFS2: Fix typo in stuffed file data copy handling
  Revert "GFS2: recovery stuck on transaction lock"
  GFS2: Make "try" lock not try quite so hard
  GFS2: remove dependency on __GFP_NOFAIL
  GFS2: Simplify gfs2_write_alloc_required
  GFS2: Wait for journal id on mount if not specified on mount command line
  GFS2: Use nobh_writepage

14 years agoMerge branch 'linux-next' of git://git.infradead.org/ubi-2.6
Linus Torvalds [Tue, 3 Aug 2010 21:37:26 +0000 (14:37 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6

* 'linux-next' of git://git.infradead.org/ubi-2.6:
  UBI: do not warn unnecessarily
  UBI: do not print message about corruptes PEBs if we have none of them
  UBI: improve delete-compatible volumes handling
  UBI: fix error message and compilation warnings
  UBI: generate random image_seq when formatting MTD devices
  UBI: improve ECC error message
  UBI: improve corrupted flash handling
  UBI: introduce eraseblock counter variables
  UBI: introduce a new IO return code
  UBI: simplify IO error codes

14 years agoMerge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
Linus Torvalds [Tue, 3 Aug 2010 21:37:02 +0000 (14:37 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6

* 'linux-next' of git://git.infradead.org/ubifs-2.6:
  UBIFS: fix a memory leak on error path.
  UBIFS: fix GC LEB recovery
  UBIFS: use ERR_CAST
  UBIFS: check return code

14 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh...
Linus Torvalds [Tue, 3 Aug 2010 21:36:16 +0000 (14:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (22 commits)
  9p: fix sparse warnings in new xattr code
  fs/9p: remove sparse warning in vfs_inode
  fs/9p: destroy fid on failed remove
  fs/9p: Prevent parallel rename when doing fid_lookup
  fs/9p: Add support user. xattr
  net/9p: Implement TXATTRCREATE 9p call
  net/9p: Implement attrwalk 9p call
  9p: Implement LOPEN
  fs/9p: This patch implements TLCREATE for 9p2000.L protocol.
  9p: Implement TMKDIR
  9p: Implement TMKNOD
  9p: Define and implement TSYMLINK for 9P2000.L
  9p: Define and implement TLINK for 9P2000.L
  9p: Define and implement TLINK for 9P2000.L
  9p: Implement client side of setattr for 9P2000.L protocol.
  9p: getattr client implementation for 9P2000.L protocol.
  fs/9p: Pass the correct user credentials during attach
  net/9p: Handle the server returned error properly
  9p: readdir implementation for 9p2000.L
  9p: Make use of iounit for read/write
  ...

14 years agoMerge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
Linus Torvalds [Tue, 3 Aug 2010 21:33:38 +0000 (14:33 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs

* 'for-linus' of git://oss.sgi.com/xfs/xfs: (49 commits)
  xfs simplify and speed up direct I/O completions
  xfs: move aio completion after unwritten extent conversion
  direct-io: move aio_complete into ->end_io
  xfs: fix big endian build
  xfs: clean up xfs_bmap_get_bp
  xfs: simplify xfs_truncate_file
  xfs: kill the b_strat callback in xfs_buf
  xfs: remove obsolete osyncisosync mount option
  xfs: clean up filestreams helpers
  xfs: fix gcc 4.6 set but not read and unused statement warnings
  xfs: Fix build when CONFIG_XFS_POSIX_ACL=n
  xfs: fix unsigned underflow in xfs_free_eofblocks
  xfs: use GFP_NOFS for page cache allocation
  xfs: fix memory reclaim recursion deadlock on locked inode buffer
  xfs: fix xfs_trans_add_item() lockdep warnings
  xfs: simplify and remove xfs_ireclaim
  xfs: don't block on buffer read errors
  xfs: move inode shrinker unregister even earlier
  xfs: remove a dmapi leftover
  xfs: writepage always has buffers
  ...

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Tue, 3 Aug 2010 21:33:09 +0000 (14:33 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (29 commits)
  cifs: fsc should not default to "on"
  [CIFS] remove redundant path walking in dfs_do_refmount
  cifs: ignore the "mand", "nomand" and "_netdev" mount options
  cifs: map NT_STATUS_ERROR_WRITE_PROTECTED to -EROFS
  cifs: don't allow cifs_iget to match inodes of the wrong type
  [CIFS] relinquish fscache cookie before freeing CIFSTconInfo
  cifs: add separate cred_uid field to sesInfo
  fs: cifs: check kmalloc() result
  [CIFS] Missing ifdef
  [CIFS] Missing line from previous commit
  [CIFS] Fix build break when CONFIG_CIFS_FSCACHE disabled
  cifs: add mount option to enable local caching
  cifs: read pages from FS-Cache
  cifs: store pages into local cache
  cifs: FS-Cache page management
  cifs: define inode-level cache object and register them
  cifs: define superblock-level cache index objects and register them
  cifs: remove unused cifsUidInfo struct
  cifs: clean up cifs_find_smb_ses (try #2)
  cifs: match secType when searching for existing tcp session
  ...

14 years agoMerge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Tue, 3 Aug 2010 21:31:24 +0000 (14:31 -0700)]
Merge branch 'devel' of /home/rmk/linux-2.6-arm

* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (291 commits)
  ARM: AMBA: Add pclk support to AMBA bus infrastructure
  ARM: 6278/2: fix regression in RealView after the introduction of pclk
  ARM: 6277/1: mach-shmobile: Allow users to select HZ, default to 128
  ARM: 6276/1: mach-shmobile: remove duplicate NR_IRQS_LEGACY
  ARM: 6246/1: mmci: support larger MMCIDATALENGTH register
  ARM: 6245/1: mmci: enable hardware flow control on Ux500 variants
  ARM: 6244/1: mmci: add variant data and default MCICLOCK support
  ARM: 6243/1: mmci: pass power_mode to the translate_vdd callback
  ARM: 6274/1: add global control registers definition header file for nuc900
  mx2_camera: fix type of dma buffer virtual address pointer
  mx2_camera: Add soc_camera support for i.MX25/i.MX27
  arm/imx/gpio: add spinlock protection
  ARM: Add support for the LPC32XX arch
  ARM: LPC32XX: Arch config menu supoport and makefiles
  ARM: LPC32XX: Phytec 3250 platform support
  ARM: LPC32XX: Misc support functions
  ARM: LPC32XX: Serial support code
  ARM: LPC32XX: System suspend support
  ARM: LPC32XX: GPIO, timer, and IRQ drivers
  ARM: LPC32XX: Clock driver
  ...

14 years agoPARISC: led.c - fix potential stack overflow in led_proc_write()
Helge Deller [Mon, 2 Aug 2010 20:46:41 +0000 (22:46 +0200)]
PARISC: led.c - fix potential stack overflow in led_proc_write()

Avoid a potential stack overflow by correctly checking the count parameter.

Reported-by: Ilja <ilja@netric.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoUBIFS: fix a memory leak on error path.
Matthieu CASTET [Mon, 2 Aug 2010 09:36:06 +0000 (11:36 +0200)]
UBIFS: fix a memory leak on error path.

In 'mount_ubifs()', free the allocated resources if 'ubifs_leb_unmap()'
fails.

Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
14 years ago9p: fix sparse warnings in new xattr code
Eric Van Hensbergen [Mon, 2 Aug 2010 16:36:18 +0000 (11:36 -0500)]
9p: fix sparse warnings in new xattr code

fixes:

  CHECK   fs/9p/xattr.c
fs/9p/xattr.c:73:6: warning: Using plain integer as NULL pointer
fs/9p/xattr.c:135:6: warning: Using plain integer as NULL pointer

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agofs/9p: remove sparse warning in vfs_inode
Eric Van Hensbergen [Tue, 27 Jul 2010 19:49:43 +0000 (14:49 -0500)]
fs/9p: remove sparse warning in vfs_inode

make v9fs_dentry_from_dir_inode static

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agofs/9p: destroy fid on failed remove
Aneesh Kumar K.V [Fri, 2 Jul 2010 06:51:20 +0000 (12:21 +0530)]
fs/9p: destroy fid on failed remove

9P spec says:
"It is correct to consider remove to be a clunk with the
side effect of removing the file if permissions allow. "

So even if remove fails we need to destroy the fid.

Without this patch, an rmdir on a non-empty directory leaves the
newly cloned directory fid attached to the fidlist. On umount
we dump the fids on the fidlist:

~# rmdir /mnt2/test4/
rmdir: failed to remove `/mnt2/test4/': Directory not empty
~# umount /mnt2/
~# dmesg
[  228.474323] Found fid 3 not clunked

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agofs/9p: Prevent parallel rename when doing fid_lookup
Aneesh Kumar K.V [Wed, 30 Jun 2010 13:48:50 +0000 (19:18 +0530)]
fs/9p: Prevent parallel rename when doing fid_lookup

During fid lookup we need to make sure that dentry->d_parent doesn't
change, so that we can safely walk the parent dentries. To ensure that,
we need to prevent cross-directory renames during fid_lookup. Add a
per-superblock rename_sem rw_semaphore so that fid lookup and rename
cannot run in parallel.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agofs/9p: Add support user. xattr
Aneesh Kumar K.V [Mon, 31 May 2010 07:52:56 +0000 (13:22 +0530)]
fs/9p: Add support user. xattr

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agonet/9p: Implement TXATTRCREATE 9p call
Aneesh Kumar K.V [Mon, 31 May 2010 07:52:50 +0000 (13:22 +0530)]
net/9p: Implement TXATTRCREATE 9p call

TXATTRCREATE:  Prepare a fid for setting xattr value on a file system object.

 size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
 size[4] RXATTRCREATE tag[2]

txattrcreate gets a fid pointing to xattr. This fid can later be
used to set the xattr value.

The flags value is derived from the Linux setxattr call. The manpage says
"The flags parameter can be used to refine the semantics of the operation.
XATTR_CREATE specifies a pure create, which fails if the named attribute
exists already. XATTR_REPLACE specifies a pure replace operation, which
fails if the named attribute does not already exist. By default (no flags),
the extended attribute will be created if need be, or will simply replace
the value if the attribute exists."

The actual setxattr operation happens when the fid is clunked. At that point
the written byte count and the attr_size specified in TXATTRCREATE must be
the same; otherwise an error is returned.
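
The clunk-time rule above can be sketched as follows. This is only a model of the bookkeeping: the struct, the function names and the choice of error codes are hypothetical and are not taken from the kernel client or from any 9P server.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-fid state a server might keep after TXATTRCREATE. */
struct xattr_fid {
    uint64_t attr_size; /* attr_size[8] announced in TXATTRCREATE */
    uint64_t written;   /* bytes written to the fid so far */
};

/* Record a write of `count` bytes; refuse writes that overrun attr_size. */
static int xattr_fid_write(struct xattr_fid *f, uint64_t count)
{
    assert(f->written <= f->attr_size);
    if (count > f->attr_size - f->written)
        return -ENOSPC; /* error code is an assumption */
    f->written += count;
    return 0;
}

/* At clunk time the setxattr is attempted only if the sizes match. */
static int xattr_fid_clunk(const struct xattr_fid *f)
{
    return f->written == f->attr_size ? 0 : -EINVAL; /* assumed errno */
}
```

A fid that announced 8 bytes but received only 5 would fail at clunk in this model, which is the mismatch case the message describes.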

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agonet/9p: Implement attrwalk 9p call
Aneesh Kumar K.V [Mon, 31 May 2010 07:52:45 +0000 (13:22 +0530)]
net/9p: Implement attrwalk 9p call

TXATTRWALK: Descend an ATTR namespace

 size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
 size[4] RXATTRWALK tag[2] size[8]

txattrwalk gets a fid pointing to xattr. This fid can later be
used to read the xattr value. If name is NULL, the returned fid
can be used to get the list of extended attributes associated with
the file system object.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Implement LOPEN
M. Mohan Kumar [Tue, 22 Jun 2010 14:17:50 +0000 (19:47 +0530)]
9p: Implement LOPEN

Implement the 9p2000.L version of the open (LOPEN) interface in the 9p client.

For LOPEN, there is no need to convert the flags between 9p mode and VFS mode.

Synopsis:

    size[4] Tlopen tag[2] fid[4] mode[4]

    size[4] Rlopen tag[2] qid[13] iounit[4]

[Fix mode bit format - jvrao@linux.vnet.ibm.com]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>
14 years agofs/9p: This patch implements TLCREATE for 9p2000.L protocol.
Venkateswararao Jujjuri (JV) [Fri, 18 Jun 2010 01:27:46 +0000 (18:27 -0700)]
fs/9p: This patch implements TLCREATE for 9p2000.L protocol.

SYNOPSIS

    size[4] Tlcreate tag[2] fid[4] name[s] flags[4] mode[4] gid[4]

    size[4] Rlcreate tag[2] qid[13] iounit[4]

DESCRIPTION

The Tlcreate request asks the file server to create a new regular file with the
name supplied, in the directory (dir) represented by fid.
The mode argument specifies the permissions to use. The new file is created with
the uid of the fid and with the supplied gid.

The flags argument represents the Linux access mode flags with which the caller
is requesting to open the file. The protocol allows all the Linux access
modes, but it is up to the server to allow or disallow any of them.
If the server doesn't support a requested access mode, it is expected to
return an error.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Implement TMKDIR
M. Mohan Kumar [Wed, 16 Jun 2010 08:57:22 +0000 (14:27 +0530)]
9p: Implement TMKDIR

Implement TMKDIR as part of 2000.L Work

Synopsis

    size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]

    size[4] Rmkdir tag[2] qid[13]

Description

    mkdir asks the file server to create a directory with given name,
    mode and gid. The qid for the new directory is returned with
    the mkdir reply message.

Note: 72 is selected as the opcode for TMKDIR from the reserved list.
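
The Tmkdir layout above can be turned into bytes roughly as follows. This is a sketch, not the kernel's encoder: the helper names are hypothetical. It relies on the general 9P framing rules that integers are little-endian, that a string field name[s] is a 2-byte length followed by the bytes, and that a one-byte message type sits between size[4] and tag[2] (the synopsis leaves it implicit). The opcode 72 is the one quoted in the note above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TMKDIR_OPCODE 72 /* opcode quoted in the commit note */

/* Store `v` little-endian in `n` bytes at `p`; return the next position. */
static uint8_t *put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        *p++ = (uint8_t)(v >> (8 * i));
    return p;
}

/* Encode a Tmkdir request; returns the message size, or 0 if it won't fit. */
static size_t encode_tmkdir(uint8_t *buf, size_t cap, uint16_t tag,
                            uint32_t fid, const char *name,
                            uint32_t mode, uint32_t gid)
{
    size_t nlen = strlen(name);
    /* size[4] type[1] tag[2] fid[4] name[s] mode[4] gid[4] */
    size_t total = 4 + 1 + 2 + 4 + 2 + nlen + 4 + 4;

    if (nlen > UINT16_MAX || total > cap)
        return 0;

    uint8_t *p = buf;
    p = put_le(p, total, 4);
    p = put_le(p, TMKDIR_OPCODE, 1);
    p = put_le(p, tag, 2);
    p = put_le(p, fid, 4);
    p = put_le(p, nlen, 2);      /* string length prefix */
    memcpy(p, name, nlen);       /* string bytes, no terminator */
    p += nlen;
    p = put_le(p, mode, 4);
    p = put_le(p, gid, 4);
    assert((size_t)(p - buf) == total);
    return total;
}
```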

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Implement TMKNOD
M. Mohan Kumar [Wed, 16 Jun 2010 08:57:01 +0000 (14:27 +0530)]
9p: Implement TMKNOD

Synopsis

    size[4] Tmknod tag[2] fid[4] name[s] mode[4] major[4] minor[4] gid[4]

    size[4] Rmknod tag[2] qid[13]

Description

    mknod asks the file server to create a device node with given major and
    minor number, mode and gid. The qid for the new device node is returned
    with the mknod reply message.

[sripathik@in.ibm.com: Fix error handling code]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Define and implement TSYMLINK for 9P2000.L
Venkateswararao Jujjuri (JV) [Wed, 9 Jun 2010 22:59:31 +0000 (15:59 -0700)]
9p: Define and implement TSYMLINK for 9P2000.L

Create a symbolic link

SYNOPSIS

size[4] Tsymlink tag[2] fid[4] name[s] symtgt[s] gid[4]

size[4] Rsymlink tag[2] qid[13]

DESCRIPTION

Create a symbolic link named 'name' pointing to 'symtgt'.
gid represents the effective group id of the caller.
The permissions of a symbolic link are irrelevant, hence they are omitted
from the protocol.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Define and implement TLINK for 9P2000.L
Venkateswararao Jujjuri (JV) [Thu, 3 Jun 2010 22:16:59 +0000 (15:16 -0700)]
9p: Define and implement TLINK for 9P2000.L

This patch adds a helper function to get the dentry from an inode and
uses it in creating a hard link.

SYNOPSIS

size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]

size[4] Rlink tag[2]

DESCRIPTION

Create a link 'newpath' in the directory pointed to by dfid, linking to the oldfid path.

[sripathik@in.ibm.com : p9_client_link should not free req structure
if p9_client_rpc has returned an error.]

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Define and implement TLINK for 9P2000.L
Eric Van Hensbergen [Mon, 2 Aug 2010 19:28:09 +0000 (14:28 -0500)]
9p: Define and implement TLINK for 9P2000.L

This patch adds a helper function to get the dentry from an inode and
uses it in creating a hard link.

SYNOPSIS

size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]

size[4] Rlink tag[2]

DESCRIPTION

Create a link 'newpath' in the directory pointed to by dfid, linking to the oldfid path.

[sripathik@in.ibm.com : p9_client_link should not free req structure
if p9_client_rpc has returned an error.]

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Implement client side of setattr for 9P2000.L protocol.
Sripathi Kodi [Fri, 18 Jun 2010 06:20:10 +0000 (11:50 +0530)]
9p: Implement client side of setattr for 9P2000.L protocol.

    SYNOPSIS

      size[4] Tsetattr tag[2] attr[n]

      size[4] Rsetattr tag[2]

    DESCRIPTION

      The setattr command changes some of the file status information.
      attr resembles the iattr structure used in Linux kernel. It
      specifies which status parameter is to be changed and to what
      value. It is laid out as follows:

         valid[4]
            specifies which status information is to be changed. Possible
            values are:
            ATTR_MODE       (1 << 0)
            ATTR_UID        (1 << 1)
            ATTR_GID        (1 << 2)
            ATTR_SIZE       (1 << 3)
            ATTR_ATIME      (1 << 4)
            ATTR_MTIME      (1 << 5)
            ATTR_ATIME_SET  (1 << 7)
            ATTR_MTIME_SET  (1 << 8)

            The last two bits indicate whether the time information
            is being sent by the client's user space. In the absence
            of these bits the server always uses the server's time.

         mode[4]
            File permission bits

         uid[4]
            Owner id of file

         gid[4]
            Group id of the file

         size[8]
            File size

         atime_sec[8]
            Time of last file access, seconds

         atime_nsec[8]
            Time of last file access, nanoseconds

         mtime_sec[8]
            Time of last file modification, seconds

         mtime_nsec[8]
            Time of last file modification, nanoseconds
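
The valid[4] bits listed above combine as an ordinary bit mask. The sketch below copies the bit values from the list; the helper names are hypothetical and the functions only illustrate the rule that a time field without its *_SET bit is stamped with the server's clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the valid[4] list in the commit message. */
enum {
    ATTR_MODE      = 1 << 0,
    ATTR_UID       = 1 << 1,
    ATTR_GID       = 1 << 2,
    ATTR_SIZE      = 1 << 3,
    ATTR_ATIME     = 1 << 4,
    ATTR_MTIME     = 1 << 5,
    ATTR_ATIME_SET = 1 << 7,
    ATTR_MTIME_SET = 1 << 8,
};

#define ATTR_KNOWN (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_SIZE | \
                    ATTR_ATIME | ATTR_MTIME | ATTR_ATIME_SET | ATTR_MTIME_SET)

/* Hypothetical helper: mask for a time update, optionally client-supplied. */
static uint32_t valid_for_time_update(bool client_supplies_times)
{
    uint32_t valid = ATTR_ATIME | ATTR_MTIME;

    if (client_supplies_times)
        valid |= ATTR_ATIME_SET | ATTR_MTIME_SET;
    assert((valid & ~(uint32_t)ATTR_KNOWN) == 0);
    return valid;
}

/* True if mtime is to change and the server should use its own clock. */
static bool mtime_from_server_clock(uint32_t valid)
{
    return (valid & ATTR_MTIME) && !(valid & ATTR_MTIME_SET);
}
```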

Explanation of the patches:
--------------------------

*) The kernel just copies the relevant contents of the iattr structure to
   the p9_iattr_dotl structure and passes it down to the client. The
   only check it performs is a call to inode_change_ok().
*) The p9_iattr_dotl structure does not have ctime and ia_file
   parameters because I don't think these are needed in our case.
   The client user space can request updating just ctime by calling
   chown(fd, -1, -1). This is handled on server side without a need
   for putting ctime on the wire.
*) The server currently supports changing mode, time, ownership and
   size of the file.
*) 9P RFC says "Either all the changes in wstat request happen, or
   none of them does: if the request succeeds, all changes were made;
   if it fails, none were."
   I have not done anything to implement this specifically because I
   don't see a reason.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: getattr client implementation for 9P2000.L protocol.
Sripathi Kodi [Mon, 12 Jul 2010 14:37:23 +0000 (20:07 +0530)]
9p: getattr client implementation for 9P2000.L protocol.

        SYNOPSIS

              size[4] Tgetattr tag[2] fid[4] request_mask[8]

              size[4] Rgetattr tag[2] lstat[n]

        DESCRIPTION

              The getattr transaction inquires about the file identified by fid.
              request_mask is a bit mask that specifies which fields of the
              stat structure the client is interested in.

              The reply will contain a machine-independent directory entry,
              laid out as follows:

                 st_result_mask[8]
                    Bit mask that indicates which fields in the stat structure
                    have been populated by the server

                 qid.type[1]
                    the type of the file (directory, etc.), represented as a bit
                    vector corresponding to the high 8 bits of the file's mode
                    word.

                 qid.vers[4]
                    version number for given path

                 qid.path[8]
                    the file server's unique identification for the file

                 st_mode[4]
                    Permission and flags

                 st_uid[4]
                    User id of owner

                 st_gid[4]
                    Group ID of owner

                 st_nlink[8]
                    Number of hard links

                 st_rdev[8]
                    Device ID (if special file)

                 st_size[8]
                    Size, in bytes

                 st_blksize[8]
                    Block size for file system IO

                 st_blocks[8]
                    Number of file system blocks allocated

                 st_atime_sec[8]
                    Time of last access, seconds

                 st_atime_nsec[8]
                    Time of last access, nanoseconds

                 st_mtime_sec[8]
                    Time of last modification, seconds

                 st_mtime_nsec[8]
                    Time of last modification, nanoseconds

                 st_ctime_sec[8]
                    Time of last status change, seconds

                 st_ctime_nsec[8]
                    Time of last status change, nanoseconds

                 st_btime_sec[8]
                    Time of creation (birth) of file, seconds

                 st_btime_nsec[8]
                    Time of creation (birth) of file, nanoseconds

                 st_gen[8]
                    Inode generation

                 st_data_version[8]
                    Data version number

              request_mask and result_mask bit masks contain the following bits
                 #define P9_STATS_MODE          0x00000001ULL
                 #define P9_STATS_NLINK         0x00000002ULL
                 #define P9_STATS_UID           0x00000004ULL
                 #define P9_STATS_GID           0x00000008ULL
                 #define P9_STATS_RDEV          0x00000010ULL
                 #define P9_STATS_ATIME         0x00000020ULL
                 #define P9_STATS_MTIME         0x00000040ULL
                 #define P9_STATS_CTIME         0x00000080ULL
                 #define P9_STATS_INO           0x00000100ULL
                 #define P9_STATS_SIZE          0x00000200ULL
                 #define P9_STATS_BLOCKS        0x00000400ULL

                 #define P9_STATS_BTIME         0x00000800ULL
                 #define P9_STATS_GEN           0x00001000ULL
                 #define P9_STATS_DATA_VERSION  0x00002000ULL

                 #define P9_STATS_BASIC         0x000007ffULL
                 #define P9_STATS_ALL           0x00003fffULL
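
A client can compare the two masks to see which requested fields the server did not fill in. The sketch below reuses a few of the constants listed above; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the list in the commit message. */
#define P9_STATS_SIZE          0x00000200ULL
#define P9_STATS_BTIME         0x00000800ULL
#define P9_STATS_GEN           0x00001000ULL
#define P9_STATS_DATA_VERSION  0x00002000ULL
#define P9_STATS_BASIC         0x000007ffULL
#define P9_STATS_ALL           0x00003fffULL

/* Hypothetical helper: fields that were requested but not populated. */
static uint64_t p9_stats_missing(uint64_t request_mask, uint64_t result_mask)
{
    assert((request_mask & ~P9_STATS_ALL) == 0);
    return request_mask & ~result_mask;
}
```

With a server that returns only the basic fields, a request for P9_STATS_ALL leaves exactly the BTIME, GEN and DATA_VERSION bits in the result of this helper.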

        This patch implements the client side of getattr implementation for
        9P2000.L. It introduces a new structure p9_stat_dotl for getting
        Linux stat information along with QID. The data layout is similar to
        stat structure in Linux user space with the following major
        differences:

        inode (st_ino) is not part of data. Instead qid is.

        device (st_dev) is not part of data because this doesn't make sense
        on the client.

        All time variables are 64 bits wide on the wire. The kernel seems to use
        32-bit variables for these fields. However, some architectures
        use 64-bit variables, and glibc exposes 64-bit variables to user
        space on some architectures. Hence, to be on the safe side, we have made
        these 64 bits wide in the protocol. Refer to the comments in
        include/asm-generic/stat.h

        There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
        st_data_version apart from the bitmask, st_result_mask. The bit mask
        is filled by the server to indicate which stat fields have been
        populated by the server. Currently there is no clean way for the
        server to obtain these additional fields, so it sends back just the
        basic fields.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>
14 years agofs/9p: Pass the correct user credentials during attach
Aneesh Kumar K.V [Tue, 1 Jun 2010 09:26:18 +0000 (09:26 +0000)]
fs/9p: Pass the correct user credentials during attach

We need to make sure we pass the right uid value
during attach. dotl is similar to dotu in this regard.
Without this, the mapped security model on dotl doesn't work.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agonet/9p: Handle the server returned error properly
Aneesh Kumar K.V [Tue, 1 Jun 2010 09:26:17 +0000 (09:26 +0000)]
net/9p: Handle the server returned error properly

We need to get the negative errno value in the kernel
even for dotl.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: readdir implementation for 9p2000.L
Sripathi Kodi [Fri, 4 Jun 2010 13:41:26 +0000 (13:41 +0000)]
9p: readdir implementation for 9p2000.L

This patch implements the kernel part of readdir() for 9p2000.L.

    Change from V3: Instead of inode, server now sends qids for each dirent

    SYNOPSIS

    size[4] Treaddir tag[2] fid[4] offset[8] count[4]
    size[4] Rreaddir tag[2] count[4] data[count]

    DESCRIPTION

    The readdir request asks the server to read the directory specified by 'fid'
    at an offset specified by 'offset' and return as many dirent structures as
    possible that fit into count bytes. Each dirent structure is laid out as
    follows.

            qid.type[1]
              the type of the file (directory, etc.), represented as a bit
              vector corresponding to the high 8 bits of the file's mode
              word.

            qid.vers[4]
              version number for given path

            qid.path[8]
              the file server's unique identification for the file

            offset[8]
              offset into the next dirent.

            type[1]
              type of this directory entry.

            name[256]
              name of this directory entry.

    This patch adds v9fs_dir_readdir_dotl() as the readdir() call for 9p2000.L.
    This function sends P9_TREADDIR command to the server. In response the server
    sends a buffer filled with dirent structures. This is different from the
    existing v9fs_dir_readdir() call which receives stat structures from the server.
    This results in significant speedup of readdir() on large directories.
    For example, doing 'ls >/dev/null' on a directory with 10000 files on my
    laptop takes 1.088 seconds with the existing code, but only takes 0.339 seconds
    with the new readdir.
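
Each dirent described above starts with a 13-byte qid. A minimal decoder for that prefix might look like the sketch below. It assumes the usual 9P little-endian integer encoding; the struct and function names are hypothetical, and the remaining dirent fields are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of qid.type[1] qid.vers[4] qid.path[8]. */
struct qid {
    uint8_t  type;
    uint32_t vers;
    uint64_t path;
};

/* Read an `n`-byte little-endian integer starting at `p`. */
static uint64_t get_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Decode one qid; returns a pointer to the byte after it. */
static const uint8_t *decode_qid(const uint8_t *p, struct qid *q)
{
    q->type = p[0];
    q->vers = (uint32_t)get_le(p + 1, 4);
    q->path = get_le(p + 5, 8);
    return p + 13;
}
```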

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: Make use of iounit for read/write
M. Mohan Kumar [Fri, 4 Jun 2010 11:59:07 +0000 (11:59 +0000)]
9p: Make use of iounit for read/write

Change the v9fs_file_readn function to limit the maximum transfer size
based on the iounit or msize.

Also remove the redundant check for limiting the transfer size in
v9fs_file_write. This check is done by p9_client_write.
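
The size limit described above amounts to a clamp. The sketch below is an assumption about its shape, not the code from this patch: the function name is hypothetical, and the per-message header overhead is passed in as a parameter because the log does not state it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest payload for one read/write request.
 * Uses iounit when the server advertised a usable one, otherwise
 * falls back to what fits in msize after the message header.
 */
static uint32_t max_io_chunk(uint32_t iounit, uint32_t msize, uint32_t hdr)
{
    assert(msize > hdr);

    uint32_t limit = msize - hdr;

    if (iounit == 0 || iounit > limit)
        return limit;
    return iounit;
}
```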

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years ago9p: strlen() doesn't count the terminator
Dan Carpenter [Sat, 10 Jul 2010 09:51:54 +0000 (11:51 +0200)]
9p: strlen() doesn't count the terminator

This is an off-by-one bug because strlen() doesn't count the NUL
terminator.  We strcpy() addr into a fixed-length array of size
UNIX_PATH_MAX later on.

The addr variable is the name of the device being mounted.
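
The off-by-one can be illustrated as follows. The log does not show the original check, so the "buggy" variant here is an illustrative guess, and PATH_BUF_SIZE is a stand-in for UNIX_PATH_MAX with an assumed value; all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PATH_BUF_SIZE 108 /* stand-in for UNIX_PATH_MAX; value assumed */

/* Off by one: accepts a string whose terminator lands one past the buffer. */
static bool addr_fits_buggy(const char *addr)
{
    return strlen(addr) <= PATH_BUF_SIZE;
}

/* Correct: leaves room for the terminating NUL that strcpy() also writes. */
static bool addr_fits(const char *addr)
{
    return strlen(addr) < PATH_BUF_SIZE;
}

/* Copy addr into a fixed-size buffer only when it fits, terminator included. */
static bool copy_addr(char dst[PATH_BUF_SIZE], const char *addr)
{
    if (!addr_fits(addr))
        return false;
    strcpy(dst, addr);
    assert(dst[strlen(addr)] == '\0');
    return true;
}
```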

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agovirtio_9p.h needs <linux/types.h>
Fang Wenqi [Tue, 1 Jun 2010 02:43:06 +0000 (02:43 +0000)]
virtio_9p.h needs <linux/types.h>

Found with make headers_check:
include/linux/virtio_9p.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
14 years agoMerge branch 'v2.6.35'
Alex Elder [Mon, 2 Aug 2010 15:24:57 +0000 (10:24 -0500)]
Merge branch 'v2.6.35'

14 years agocifs: fsc should not default to "on"
Jeff Layton [Mon, 26 Jul 2010 18:25:08 +0000 (14:25 -0400)]
cifs: fsc should not default to "on"

I'm not sure why this was merged with this flag hardcoded on, but it
seems quite dangerous. Turn it off.

Also, mount.cifs hands unrecognized options off to the kernel so there
should be no need for changes there in order to support this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years ago[CIFS] remove redundant path walking in dfs_do_refmount
Steve French [Mon, 26 Jul 2010 18:20:16 +0000 (18:20 +0000)]
[CIFS] remove redundant path walking in dfs_do_refmount

Reviewed-by: Dave Howells <dhowells@redhat.com>
Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: ignore the "mand", "nomand" and "_netdev" mount options
Jeff Layton [Mon, 26 Jul 2010 14:29:58 +0000 (10:29 -0400)]
cifs: ignore the "mand", "nomand" and "_netdev" mount options

These are all handled by the userspace mount programs, but older versions
of mount.cifs also handed them off to the kernel. Ignore them.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: map NT_STATUS_ERROR_WRITE_PROTECTED to -EROFS
Jeff Layton [Mon, 26 Jul 2010 14:29:57 +0000 (10:29 -0400)]
cifs: map NT_STATUS_ERROR_WRITE_PROTECTED to -EROFS

Seems like a more sensible mapping than -EIO.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: don't allow cifs_iget to match inodes of the wrong type
Jeff Layton [Mon, 19 Jul 2010 22:00:17 +0000 (18:00 -0400)]
cifs: don't allow cifs_iget to match inodes of the wrong type

If the type is different from what we think it should be, then don't
match the existing inode.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years ago[CIFS] relinquish fscache cookie before freeing CIFSTconInfo
Steve French [Fri, 23 Jul 2010 20:37:53 +0000 (20:37 +0000)]
[CIFS] relinquish fscache cookie before freeing CIFSTconInfo

Doh, fix a use after free bug.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agocifs: add separate cred_uid field to sesInfo
Jeff Layton [Mon, 19 Jul 2010 22:00:17 +0000 (18:00 -0400)]
cifs: add separate cred_uid field to sesInfo

Right now, there's no clear separation between the uid that owns the
credentials used to do the mount and the overriding owner of the files
on that mount.

Add a separate cred_uid field that is set to the real uid
of the mount user. Unlike the linux_uid, the uid= option does not
override this parameter. The parm is sent to cifs.upcall, which can then
preferentially use the creduid= parm instead of the uid= parm for
finding credentials.

This is not the only way to solve this. We could try to do all of this
in kernel instead by having a module parameter that affects what gets
passed in the uid= field of the upcall. That said, we have a lot more
flexibility to change things in userspace so I think it probably makes
sense to do it this way.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agofs: cifs: check kmalloc() result
Kulikov Vasiliy [Fri, 16 Jul 2010 16:15:25 +0000 (20:15 +0400)]
fs: cifs: check kmalloc() result

If kmalloc() fails, exit with -ENOMEM.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years ago[CIFS] Missing ifdef
Steve French [Fri, 16 Jul 2010 04:31:02 +0000 (04:31 +0000)]
[CIFS] Missing ifdef

Signed-off-by: Steve French <sfrench@us.ibm.com>