GitHub/LineageOS/G12/android_kernel_amlogic_linux-4.9.git
11 years agoiscsi-target: Fix processing of OOO commands
Shlomo Pongratz [Sun, 5 May 2013 14:36:26 +0000 (17:36 +0300)]
iscsi-target: Fix processing of OOO commands

Fix two issues in OOO commands processing done at iscsit_attach_ooo_cmdsn.

Handle command serial numbers wrap around by using iscsi_sna_lt and not regular comparisson.

The routine iterates until it finds an entry whose serial number is greater than the serial number of
the new one, thus the new entry should be inserted before that entry and not after.

Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Make buf param of iscsit_do_crypto_hash_buf() const void *
Geert Uytterhoeven [Fri, 3 May 2013 21:15:57 +0000 (23:15 +0200)]
iscsi-target: Make buf param of iscsit_do_crypto_hash_buf() const void *

Make the "buf" input param of iscsit_do_crypto_hash_buf() "const void *".
This allows to remove lots of casts in its callers.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Fix NULL pointer dereference in iscsit_send_reject
Nicholas Bellinger [Fri, 3 May 2013 23:46:41 +0000 (16:46 -0700)]
iscsi-target: Fix NULL pointer dereference in iscsit_send_reject

Fix up a NULL pointer dereference regression in iscsit_send_reject()
introduced by from commit 2ec5a8c11.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Have dev/enable show if TCM device is configured
Andy Grover [Tue, 30 Apr 2013 18:59:15 +0000 (11:59 -0700)]
target: Have dev/enable show if TCM device is configured

User tools need to know if the device is properly configured, since if
not, some other attributes are invalid.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Use FD_MAX_SECTORS/FD_BLOCKSIZE for blockdevs using fileio
Andy Grover [Fri, 26 Apr 2013 18:09:03 +0000 (11:09 -0700)]
target: Use FD_MAX_SECTORS/FD_BLOCKSIZE for blockdevs using fileio

We can still see the error reported in

https://patchwork.kernel.org/patch/2338981/

when using fileio backed by a block device.

I'm assuming this will get us past that error (from sbc_parse_cdb),
and also assuming it's OK to have our max_sectors be larger than
the block's queue max hw sectors?

Reported-by: Eric Harney <eharney@redhat.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Remove unused struct members in se_dev_entry
Andy Grover [Thu, 25 Apr 2013 22:04:02 +0000 (15:04 -0700)]
target: Remove unused struct members in se_dev_entry

Some were incremented, but never used anywhere from what I could tell.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiser-target: Add iSCSI Extensions for RDMA (iSER) target driver
Nicholas Bellinger [Thu, 7 Mar 2013 08:56:19 +0000 (00:56 -0800)]
iser-target: Add iSCSI Extensions for RDMA (iSER) target driver

This patch adds support for iSCSI Extensions for RDMA target mode,
and includes CQ pooling per isert_device context distributed across
multiple active iser target sessions.

It also uses cmwq process context for RX / TX ib_post_cq() polling
via isert_cq_desc->cq_[rx,tx]_work invoked by isert_cq_[rx,tx]_callback()
hardIRQ context callbacks.

v5 changes:

- Use ISER_RECV_DATA_SEG_LEN instead of hardcoded value in ISER_RX_PAD_SIZE (Or)
- Fix make W=1 warnings (Or)
- Add missing depends on NET && INFINIBAND_ADDR_TRANS in Kconfig (Randy + Or)
- Make isert_device_find_by_ib_dev() return proper ERR_PTR (Wei Yongjun)
- Properly setup iscsi_np->np_sockaddr in isert_setup_np() (Shlomi + nab)
- Add special case for early ISCSI_OP_SCSI_CMD exception handling (nab)

v4 changes:
- Mark isert_cq_rx_work as static (Or)
- Drop unnecessary ib_dma_sync_single_for_cpu + ib_dma_sync_single_for_device
  calls for isert_cmd->sense_buf_dma from isert_put_response (Or)
- Use 12288 for ISER_RX_PAD_SIZE base to save extra page per
  struct iser_rx_desc (Or + nab)
- Drop now unnecessary isert_rx_desc usage, and convert RX users to
  iser_rx_desc (Or + nab)
- Move isert_[alloc,free]_rx_descriptors() ahead of
  isert_create_device_ib_res() usage (nab)
- Mark isert_cq_[rx,tx]_callback() + prototypes as static
- Fix 'warning: 'ret' may be used uninitialized' warning for
  isert_create_device_ib_res on powerpc allmodconfig (fengguang + nab)
- Fix 'warning: 'ret' may be used uninitialized' warning for
  isert_connect_request on i386 allyesconfig (fengguang + nab)
- Fix pr_debug conversion specification in isert_rx_completion()
  (fengguang + nab)
- Drop unnecessary isert_conn->conn_cm_id != NULL check in
  isert_connect_release causing the build warning:
  "variable dereferenced before check 'isert_conn->conn_cm_id'"
- Fix isert_lid + isert_np leak in isert_setup_np failure path
- Add isert_conn->conn_wait_comp_err usage in isert_free_conn()
  for isert_cq_comp_err completion path
- Add isert_conn->logout_posted bit to determine decrement of
  isert_conn->post_send_buf_count from logout response completion
- Always set ISER_CONN_DOWN from isert_disconnect_work() callback

v3 changes:

- Convert to use per isert_cq_desc->cq_[rx,tx]_work + drop tasklets (Or + nab)
- Move IB_EVENT_QP_LAST_WQE_REACHED warn into correct
  isert_qp_event_callback (Or)
- Drop unnecessary IB_ACCESS_REMOTE_* access flag usage in
  isert_create_device_ib_res (Or)
- Add common isert_init_send_wr(), and convert isert_put_* calls (Or)
- Move to verbs+core logic to single ib_isert.[c,h]  (Or + nab)
- Add kmem_cache isert_cmd_cache usage for descriptor allocation (nab)
- Move common ib_post_send() logic used by isert_put_*() to
  isert_post_response() (nab)
- Add isert_put_reject call in isert_response_queue() for posting
  ISCSI_REJECT response. (nab)
- Add ISTATE_SEND_REJECT checking in isert_do_control_comp. (nab)

v2 changes:

- Drop unused ISERT_ADDR_ROUTE_TIMEOUT define
- Add rdma_notify() call for IB_EVENT_COMM_EST in isert_qp_event_callback()
- Make isert_query_device() less verbose
- Drop unused RDMA_CM_EVENT_ADDR_ERROR and RDMA_CM_EVENT_ROUTE_ERROR
  cases from isert_cma_handler()
- Drop unused rdma/ib_fmr_pool.h include
- Update isert_conn_setup_qp() to assign cq based upon least used
- Add isert_create_device_ib_res() to setup PD, CQs and MRs for each
  underlying struct ib_device, instead of using per isert_conn resources.
- Add isert_free_device_ib_res() to release PD, CQs and MRs for each
  underlying struct ib_device.
- Add isert_device_find_by_ib_dev()
- Change isert_connect_request() to drop PD, CQs and MRs allocation,
  and use isert_device_find_by_ib_dev() instead.
- Add isert_device_try_release()
- Change isert_connect_release() to decrement cq_active_qps, and drop
  PD, CQs and MRs resource release.
- Update isert_connect_release() to call isert_device_try_release()
- Make isert_create_device_ib_res() determine device->cqs_used based
  upon num_online_cpus()
- Drop misleading isert_dump_ib_wc() usage
- Drop unused rdma/ib_fmr_pool.h include
- Use proper xfer_len for login PDUs in isert_rx_completion()
- Add isert_release_cmd() usage
- Change isert_alloc_cmd() to setup iscsi_cmd.release_cmd() pointer
- Change isert_put_cmd() to perform per iscsi_opcode specific release
  logic
- Add isert_unmap_cmd() call for ISCSI_OP_SCSI_CMD from isert_put_cmd()
- Change isert_send_completion() to call
  atomic_dec(&isert_conn->post_send_buf_count)
  based upon per iscsi_opcode logic
- Drop ISTATE_REMOVE processing from isert_immediate_queue()
- Drop ISTATE_SEND_DATAIN processing from isert_response_queue()
- Drop ISTATE_SEND_STATUS processing from isert_response_queue()
- Drop iscsit_transport->iscsit_unmap_cmd() and ->iscsit_free_cmd()
- Convert iser_cq_tx_tasklet() to use struct isert_cq_desc pooling logic
- Convert isert_cq_tx_callback() to use struct isert_cq_desc pooling
  logic
- Convert iser_cq_rx_tasklet() to use struct isert_cq_desc pooling logic
- Convert isert_cq_rx_callback() to use struct isert_cq_desc pooling
  logic
- Add explict iscsit_stop_dataout_timer() call to
  isert_do_rdma_read_comp()
- Use isert_get_dataout() for iscsit_transport->iscsit_get_dataout()
  caller
- Drop ISTATE_SEND_R2T processing from isert_immediate_queue()
- Drop unused rdma/ib_fmr_pool.h include
- Drop isert_cmd->cmd_kref in favor of se_cmd->cmd_kref usage
- Add struct isert_device in order to support multiple EQs + CQ pooling
- Add struct isert_cq_desc
- Drop tasklets and cqs from isert_conn
- Bump ISERT_MAX_CQ to 64
- Various minor checkpatch fixes

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Enable VIRTIO_SCSI_F_HOTPLUG
Asias He [Thu, 25 Apr 2013 07:35:23 +0000 (15:35 +0800)]
tcm_vhost: Enable VIRTIO_SCSI_F_HOTPLUG

Everything for hotplug is ready. Let's enable the feature bit.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Add ioctl to get and set events missed flag
Asias He [Thu, 25 Apr 2013 07:35:22 +0000 (15:35 +0800)]
tcm_vhost: Add ioctl to get and set events missed flag

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Add hotplug/hotunplug support
Asias He [Thu, 25 Apr 2013 07:35:21 +0000 (15:35 +0800)]
tcm_vhost: Add hotplug/hotunplug support

In commit 365a7150094 ([SCSI] virtio-scsi: hotplug support for
virtio-scsi), hotplug support is added to virtio-scsi.

This patch adds hotplug and hotunplug support to tcm_vhost.

You can create or delete a LUN in targetcli to hotplug or hotunplug a
LUN in guest.

Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Refactor the lock nesting rule
Asias He [Thu, 25 Apr 2013 07:35:20 +0000 (15:35 +0800)]
tcm_vhost: Refactor the lock nesting rule

We want to use tcm_vhost_mutex to make sure hotplug/hotunplug will not
happen when set_endpoint/clear_endpoint is in process.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_fc: Check for aborted sequence
Mark Rustad [Mon, 22 Apr 2013 16:49:55 +0000 (09:49 -0700)]
tcm_fc: Check for aborted sequence

Add a check for an aborted sequence, which has a
NULL sequence pointer, to avoid target crashes.
The most relevant messages from the crash (entered
from video capture) include:

BUG: unable to handle kernel paging request at ffffffffffffffdf
IP: [<ffffffffa02d514c>] fc_seq_send+0x3c/0x150 [libfc]
...
Call Trace:
 [<ffffffffa0443de6>] ft_queue_data_in+0x266/0x560 [tcm_fc]

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Add iser network portal attribute
Nicholas Bellinger [Thu, 7 Mar 2013 06:20:36 +0000 (22:20 -0800)]
iscsi-target: Add iser network portal attribute

This patch adds a new network portal attribute for iser, that lives
under existing iscsi-target configfs layout at:

   /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT/np/$PORTAL/iser

When lio_target_np_store_iser() is enabled, iscsit_tpg_add_network_portal()
will attempt to start an rdma_cma network portal for iser-target, only if
the external ib_isert module transport has been loaded.

When disabled, iscsit_tpg_del_network_portal() will cease iser login service
on the network portal, and release any external ib_isert module reference.

v4 changes:

- Add request_module for ib_isert to lio_target_np_store_iser()

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Refactor TX queue logic + export response PDU creation
Nicholas Bellinger [Wed, 20 Mar 2013 22:29:15 +0000 (15:29 -0700)]
iscsi-target: Refactor TX queue logic + export response PDU creation

This patch refactors TX immediate + response queue handling to use
the new iscsit_transport API callers, and exports the necessary
traditional iscsi PDU response creation functions for iser-target
to utilize.

This includes:

- Add iscsit_build_datain_pdu() for DATAIN PDU init + convert
  iscsit_build_datain_pdu()
- Add iscsit_build_logout_rsp() for LOGOUT_RSP PDU init + convert
  iscsit_send_logout()
- Add iscsit_build_nopin_rsp() for NOPIN_RSP PDU init + convert
  iscsit_send_nopin()
- Add iscsit_build_rsp_pdu() for SCSI_RSP PDU init + convert
  iscsit_send_response()
- Add iscsit_build_task_mgt_rsp for TM_RSP PDU init + convert
  iscsit_send_task_mgt_rsp()
- Refactor immediate queue state switch into iscsit_immediate_queue()
- Convert handle_immediate_queue() to use iscsit_transport caller
- Refactor response queue state switch into iscsit_response_queue()
- Convert handle_response_queue to use iscsit_transport caller
- Export iscsit_logout_post_handler(), iscsit_increment_maxcmdsn()
  and iscsit_tmr_post_handler() for external transport module usage

v5 changes:

- Fix solicited NopIN handling with RDMAExtensions=No (nab)

v3 changes:
- Add iscsit_build_reject for REJECT PDU init + convert
  iscsit_send_reject()

v2 changes:

- Add iscsit_queue_rsp() for iscsit_transport->iscsit_queue_data_in()
  and iscsit_transport->iscsit_queue_status()
- Update lio_queue_data_in() to use ->iscsit_queue_data_in()
- Update lio_queue_status() to use ->iscsit_queue_status()
- Use mutex_trylock() in iscsit_increment_maxcmdsn()

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Refactor RX PDU logic + export request PDU handling
Nicholas Bellinger [Thu, 7 Mar 2013 06:18:24 +0000 (22:18 -0800)]
iscsi-target: Refactor RX PDU logic + export request PDU handling

This patch refactors existing traditional iscsi RX side PDU handling
to use iscsit_transport, and exports the necessary logic for external
transport modules.

This includes:

- Refactor iscsit_handle_scsi_cmd() into PDU setup / processing
- Add updated iscsit_handle_scsi_cmd() for tradtional iscsi code
- Add iscsit_set_unsoliticed_dataout() wrapper
- Refactor iscsit_handle_data_out() into PDU check / processing
- Add updated iscsit_handle_data_out() for tradtional iscsi code
- Add iscsit_handle_nop_out() + iscsit_handle_task_mgt_cmd() to
  accept pre-allocated struct iscsi_cmd
- Add iscsit_build_r2ts_for_cmd() caller for iscsi_target_transport
  to handle ISTATE_SEND_R2T for TX immediate queue
- Refactor main traditional iscsi iscsi_target_rx_thread() PDU switch
  into iscsi_target_rx_opcode() using iscsit_allocate_cmd()
- Turn iscsi_target_rx_thread() process context into NOP for
  ib_isert side work-queue.

v5 changes:

- Make iscsit_handle_scsi_cmd() static (Fengguang)
- Fix iscsit_handle_scsi_cmd() exception se_cmd leak (nab)

v3 changes:
- Add extra target_put_sess_cmd call in iscsit_add_reject_from_cmd
  after completion

v2 changes:

- Disable iscsit_ack_from_expstatsn() usage for RDMAExtentions=Yes
- Disable iscsit_allocate_datain_req() usage for RDMAExtentions=Yes
- Add target_get_sess_cmd() reference counting to
  iscsit_setup_scsi_cmd()
- Add TFO->lio_check_stop_free() fabric API caller
- Add export of iscsit_stop_dataout_timer() symbol
- Add iscsit_build_r2ts_for_cmd() for iscsit_transport->iscsit_get_dataout()
- Convert existing usage of iscsit_build_r2ts_for_cmd() to
  ->iscsit_get_dataout()
- Drop RDMAExtentions=Yes specific check in iscsit_build_r2ts_for_cmd()
- Fix RDMAExtentions -> RDMAExtensions typo (andy)
- Pass correct dump_payload value into iscsit_get_immediate_data()
  for iscsit_handle_scsi_cmd()

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Add per transport iscsi_cmd alloc/free
Nicholas Bellinger [Thu, 7 Mar 2013 06:09:17 +0000 (22:09 -0800)]
iscsi-target: Add per transport iscsi_cmd alloc/free

This patch converts struct iscsi_cmd memory allocation + free to use
->iscsit_alloc_cmd() iscsit_transport API caller, and export
iscsit_allocate_cmd() symbols

Also add iscsi_cmd->release_cmd() to be used seperately from
iscsit_transport for connection/session shutdown.

v2 changes:

- Remove unnecessary checks in iscsit_alloc_cmd (asias)
- Drop iscsit_transport->iscsit_free_cmd() usage
- Drop iscsit_transport->iscsit_unmap_cmd() usage
- Add iscsi_cmd->release_cmd()
- Convert lio_release_cmd() to use iscsi_cmd->release_cmd()

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Add iser-target parameter keys + setup during login
Nicholas Bellinger [Thu, 7 Mar 2013 05:54:38 +0000 (21:54 -0800)]
iscsi-target: Add iser-target parameter keys + setup during login

This patch adds RDMAExtensions, InitiatorRecvDataSegmentLength and
TargetRecvDataSegmentLength parameters keys necessary for iser-target
login to occur.

This includes setting the necessary parameters during login path
code within iscsi_login_zero_tsih_s2(), and currently PAGE_SIZE
aligning the target's advertised MRDSL for immediate data and
unsolicited data-out incoming payloads.

v3 changes:
- Add iscsi_post_login_start_timers FIXME for ISER

v2 changes:

- Fix RDMAExtentions -> RDMAExtensions typo (andy)
- Drop unnecessary '== true' conditional checks for type bool

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Initial traditional TCP conversion to iscsit_transport
Nicholas Bellinger [Thu, 7 Mar 2013 05:54:13 +0000 (21:54 -0800)]
iscsi-target: Initial traditional TCP conversion to iscsit_transport

This patch performs the initial conversion of existing traditional iscsi
to use iscsit_transport API callers.  This includes:

- iscsi-np cleanups for iscsit_transport_type
- Add iscsi-np transport calls w/ ->iscsit_setup_up() and ->iscsit_free_np()
- Convert login thread process context to use ->iscsit_accept_np() for
  connections with pre-allocated struct iscsi_conn
- Convert existing socket accept code to iscsit_accept_np()
- Convert login RX/TX callers to use ->iscsit_get_login_rx() and
  ->iscsit_put_login_tx() to exchange request/response PDUs
- Convert existing socket login RX/TX calls into iscsit_get_login_rx()
  and iscsit_put_login_tx()
- Change iscsit_close_connection() to invoke ->iscsit_free_conn() +
  iscsit_put_transport() calls.
- Add iscsit_register_transport() + iscsit_unregister_transport() calls
  to module init/exit

v4 changes:

- Add missing iscsit_put_transport() call in iscsi_target_setup_login_socket()
  failure case

v2 changes:

- Update module init/exit to use register_transport() + unregister_transport()

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoiscsi-target: Add iscsit_transport API template
Nicholas Bellinger [Thu, 7 Mar 2013 05:53:26 +0000 (21:53 -0800)]
iscsi-target: Add iscsit_transport API template

Add basic struct iscsit_transport API template to allow iscsi-target for
running with external transport modules using existing iscsi_target_core.h
code.

For all external modules, this calls try_module_get() and module_put()
to obtain + release an external iscsit_transport module reference count.

Also include the iscsi-target symbols necessary in iscsi_transport.h to
allow external transport modules to function.

v3 changes:
- Add iscsit_build_reject export for ISTATE_SEND_REJECT usage

v2 changes:

- Drop unnecessary export of iscsit_get_transport + iscsit_put_transport (roland)
- Add ->iscsit_queue_data_in() to remove extra context switch on RDMA_WRITE
- Add ->iscsit_queue_status() to remove extra context switch on IB_SEND status
- Add ->iscsit_get_dataout() to remove extra context switch on RDMA_READ
- Drop ->iscsit_free_cmd()
- Drop ->iscsit_unmap_cmd()
- Rename iscsit_create_transport() -> iscsit_register_transport() (andy)
- Rename iscsit_destroy_transport() -> iscsit_unregister_transport() (andy)

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Add export of target_get_sess_cmd symbol
Nicholas Bellinger [Fri, 22 Mar 2013 05:54:28 +0000 (22:54 -0700)]
target: Add export of target_get_sess_cmd symbol

Export target_get_sess_cmd() symbol so that it can be used by
iscsi-target.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Change default sense key of NOT_READY
Jörn Engel [Mon, 18 Mar 2013 22:30:31 +0000 (18:30 -0400)]
target: Change default sense key of NOT_READY

As the comment sais, this allows Solaris initiators to survive
intermittent errors.  The comment from someone reading the Solaris
sources seem to imply that multipathing would be broken without this
patch as well.

Signed-off-by: Joern Engel <joern@logfs.org>
Cc: Brian Bunker <brian@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/file: Set is_nonrot attribute
Asias He [Thu, 25 Apr 2013 02:38:11 +0000 (19:38 -0700)]
target/file: Set is_nonrot attribute

Set is_nonrot attribute according to the block queue if the backend
device is a block device.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Add sbc_execute_unmap() helper
Asias He [Mon, 25 Feb 2013 06:03:46 +0000 (14:03 +0800)]
target: Add sbc_execute_unmap() helper

iblock_execute_unmap() and fd_execute_unmap share a lot of code.
Add sbc_execute_unmap() helper to remove duplicated code for
iblock_execute_unmap() and fd_execute_unmap().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/iblock: Add iblock_do_unmap() helper
Asias He [Mon, 25 Feb 2013 06:03:45 +0000 (14:03 +0800)]
target/iblock: Add iblock_do_unmap() helper

Add helper iblock_do_unmap() to remove duplicated code in
iblock_execute_write_same_unmap() and iblock_execute_unmap().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/file: Add fd_do_unmap() helper
Asias He [Mon, 25 Feb 2013 06:03:44 +0000 (14:03 +0800)]
target/file: Add fd_do_unmap() helper

Add helper fd_do_unmap() to remove duplicated code in
fd_execute_write_same_unmap() and fd_execute_unmap().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/file: Add UNMAP emulation support
Asias He [Mon, 25 Feb 2013 06:03:43 +0000 (14:03 +0800)]
target/file: Add UNMAP emulation support

This patch adds support for emulation of UNMAP within
fd_execute_unmap() backend code.

If the FILEIO backend is normal file, the emulation uses fallocate to
punch hole to reclaim the free space used by the file. If the FILEIO
backend is block device, the emulation uses blkdev_issue_discard().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/file: Add WRITE_SAME w/ UNMAP=1 emulation support
Asias He [Mon, 25 Feb 2013 06:03:42 +0000 (14:03 +0800)]
target/file: Add WRITE_SAME w/ UNMAP=1 emulation support

This patch adds support for emulation of WRITE_SAME w/ UNMAP=1 within
fd_execute_write_same_unmap() backend code.

If the FILEIO backend is normal file, the emulation uses fallocate to
punch hole to reclaim the free space used by the file. If the FILEIO
backend is block device, the emulation uses blkdev_issue_discard().

Tested with 512, 1k, 2k, and 4k block_sizes.

Changes in v2:
- Set the various dev->dev_attrib.*unmap* values (nab)

Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoqla2xxx: Remove unused function
Jörn Engel [Mon, 18 Mar 2013 22:29:53 +0000 (18:29 -0400)]
qla2xxx: Remove unused function

It was already unused when first introduced in 2d70c103.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_fc: using kfree_rcu() to simplify the code
Wei Yongjun [Mon, 11 Mar 2013 13:48:28 +0000 (21:48 +0800)]
tcm_fc: using kfree_rcu() to simplify the code

The callback function of call_rcu() just calls a kfree(), so we
can use kfree_rcu() instead of call_rcu() + callback function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/iscsi: Use ISCSI_LOGIN_CURRENT/NEXT_STAGE macros
Andy Grover [Mon, 4 Mar 2013 21:52:08 +0000 (13:52 -0800)]
target/iscsi: Use ISCSI_LOGIN_CURRENT/NEXT_STAGE macros

Fix bit-clearing in login_rsp->flags for case 0.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/iscsi: Remove chap_set_random()
Andy Grover [Mon, 4 Mar 2013 21:52:07 +0000 (13:52 -0800)]
target/iscsi: Remove chap_set_random()

The result from get_random_bytes should already be random, so further
manipulation and mixing should not be needed.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs
Nicholas Bellinger [Wed, 10 Apr 2013 22:00:27 +0000 (15:00 -0700)]
target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs

This patch fixes a bug where a handful of informational / control CDBs
that should be allowed during ALUA access state Standby/Offline/Transition
where incorrectly returning CHECK_CONDITION + ASCQ_04H_ALUA_TG_PT_*.

This includes INQUIRY + REPORT_LUNS, which would end up preventing LUN
registration when LUN scanning occured during these ALUA access states.

Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Send bad target to guest when cmd fails
Asias He [Wed, 10 Apr 2013 07:06:16 +0000 (15:06 +0800)]
tcm_vhost: Send bad target to guest when cmd fails

Send bad target to guest in case:
1) we can not allocate the cmd
2) fail to submit the cmd

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Add vhost_scsi_send_bad_target() helper
Asias He [Wed, 10 Apr 2013 07:06:15 +0000 (15:06 +0800)]
tcm_vhost: Add vhost_scsi_send_bad_target() helper

Share the send bad target code with other use cases.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Fix tv_cmd leak in vhost_scsi_handle_vq
Asias He [Wed, 10 Apr 2013 07:06:14 +0000 (15:06 +0800)]
tcm_vhost: Fix tv_cmd leak in vhost_scsi_handle_vq

If we fail to submit the allocated tv_vmd to tcm_vhost_submission_work,
we will leak the tv_vmd. Free tv_vmd on fail path.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Remove double check of response
Asias He [Wed, 10 Apr 2013 07:06:13 +0000 (15:06 +0800)]
tcm_vhost: Remove double check of response

We did the length of response check twice.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Initialize vq->last_used_idx when set endpoint
Asias He [Wed, 3 Apr 2013 06:17:38 +0000 (14:17 +0800)]
tcm_vhost: Initialize vq->last_used_idx when set endpoint

This patch fixes guest hang when booting seabios and guest.

  [    0.576238] scsi0 : Virtio SCSI HBA
  [    0.616754] virtio_scsi virtio1: request:id 0 is not a head!

vq->last_used_idx is initialized only when /dev/vhost-scsi is
opened or closed.

   vhost_scsi_open -> vhost_dev_init() -> vhost_vq_reset()
   vhost_scsi_release() -> vhost_dev_cleanup -> vhost_vq_reset()

So, when guest talks to tcm_vhost after seabios does, vq->last_used_idx
still contains the old valule for seabios. This confuses guest.

Fix this by calling vhost_init_used() to init vq->last_used_idx when
we set endpoint.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Use vq->private_data to indicate if the endpoint is setup
Asias He [Wed, 3 Apr 2013 06:17:37 +0000 (14:17 +0800)]
tcm_vhost: Use vq->private_data to indicate if the endpoint is setup

Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
not. It is set or cleared in vhost_scsi_set_endpoint() or
vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
we check it in vhost_scsi_handle_vq(), we ignored the lock.

Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
indicate the status of the endpoint, we use per virtqueue
vq->private_data to indicate it. In this way, we can only take the
vq->mutex lock which is per queue and make the concurrent multiqueue
process having less lock contention. Further, in the read side of
vq->private_data, we can even do not take the lock if it is accessed in
the vhost worker thread, because it is protected by "vhost rcu".

(nab: Do s/VHOST_FEATURES/~VHOST_SCSI_FEATURES)

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Use ACCESS_ONCE for vs->vs_tpg[target] access
Asias He [Tue, 2 Apr 2013 15:31:37 +0000 (23:31 +0800)]
tcm_vhost: Use ACCESS_ONCE for vs->vs_tpg[target] access

In vhost_scsi_handle_vq:

      tv_tpg = vs->vs_tpg[target];
      if (!tv_tpg) {
              ....
              return
      }

      tv_cmd = vhost_scsi_allocate_cmd(tv_tpg, &v_req,

1) vs->vs_tpg[target] might change after the NULL check and 2) the above
line might access tv_tpg from vs->vs_tpg[target]. To prevent 2), use
ACCESS_ONCE. Thanks mst for catching this up!

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: Fix RESERVATION_CONFLICT status regression for iscsi-target special case
Nicholas Bellinger [Fri, 29 Mar 2013 06:06:00 +0000 (23:06 -0700)]
target: Fix RESERVATION_CONFLICT status regression for iscsi-target special case

This patch fixes a regression introduced in v3.8-rc1 code where a failed
target_check_reservation() check in target_setup_cmd_from_cdb() was causing
an incorrect SAM_STAT_GOOD status to be returned during a WRITE operation
performed by an unregistered / unreserved iscsi initiator port.

This regression is only effecting iscsi-target due to a special case check
for TCM_RESERVATION_CONFLICT within iscsi_target_erl1.c:iscsit_execute_cmd(),
and was still correctly disallowing WRITE commands from backend submission
for unregistered / unreserved initiator ports, while returning the incorrect
SAM_STAT_GOOD status due to the missing SAM_STAT_RESERVATION_CONFLICT
assignment.

This regression was first introduced with:

commit de103c93aff0bed0ae984274e5dc8b95899badab
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Nov 6 12:24:09 2012 -0800

    target: pass sense_reason as a return value

Go ahead and re-add the missing SAM_STAT_RESERVATION_CONFLICT assignment
during a target_check_reservation() failure, so that iscsi-target code
sends the correct SCSI status.

All other fabrics using target_submit_cmd_*() with a RESERVATION_CONFLICT
call to transport_generic_request_failure() are not effected by this bug.

Reported-by: Jeff Leung <jleung@curriegrad2004.ca>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Avoid VIRTIO_RING_F_EVENT_IDX feature bit
Nicholas Bellinger [Thu, 28 Mar 2013 00:23:41 +0000 (17:23 -0700)]
tcm_vhost: Avoid VIRTIO_RING_F_EVENT_IDX feature bit

This patch adds a VHOST_SCSI_FEATURES mask minus VIRTIO_RING_F_EVENT_IDX
so that vhost-scsi-pci userspace will strip this feature bit once
GET_FEATURES reports it as being unsupported on the host.

This is to avoid a bug where ->handle_kicks() are missed when EVENT_IDX
is enabled by default in userspace code.

(mst: Rename to VHOST_SCSI_FEATURES + add comment)

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Asias He <asias@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/pscsi: Reject cross page boundary case in pscsi_map_sg
Asias He [Tue, 19 Mar 2013 04:55:16 +0000 (12:55 +0800)]
target/pscsi: Reject cross page boundary case in pscsi_map_sg

We can only have one page of data in each sg element, so we can not
cross a page boundary. Fail this case.

The 'while (len > 0 && data_len > 0) {}' loop is not necessary. The loop
can only be executed once.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/file: Bump FD_MAX_SECTORS to 2048 to handle 1M sized I/Os
Nicholas Bellinger [Mon, 18 Mar 2013 20:15:57 +0000 (13:15 -0700)]
target/file: Bump FD_MAX_SECTORS to 2048 to handle 1M sized I/Os

This patch bumps the default FILEIO backend FD_MAX_SECTORS value from
1024 -> 2048 in order to allow block_size=512 to handle 1M sized I/Os.

The current default rejects I/Os larger than 512K in sbc_parse_cdb():

[12015.915146] SCSI OP 2ah with too big sectors 1347 exceeds backend
hw_max_sectors: 1024
[12015.977744] SCSI OP 2ah with too big sectors 2048 exceeds backend
hw_max_sectors: 1024

This issue is present in >= v3.5 based kernels, introduced after the
removal of se_task logic.

Reported-by: Viljami Ilola <azmulx@netikka.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Flush vhost_work in vhost_scsi_flush()
Asias He [Fri, 15 Mar 2013 01:14:06 +0000 (09:14 +0800)]
tcm_vhost: Flush vhost_work in vhost_scsi_flush()

We also need to flush the vhost_works. It is the completion vhost_work
currently.

Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
Asias He [Fri, 15 Mar 2013 01:14:05 +0000 (09:14 +0800)]
tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()

tv_tpg->tv_tpg_vhost_count should be protected by tv_tpg->tv_tpg_mutex.

Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget: fix possible memory leak in core_tpg_register()
Wei Yongjun [Fri, 15 Mar 2013 09:19:26 +0000 (17:19 +0800)]
target: fix possible memory leak in core_tpg_register()

'se_tpg->tpg_lun_list' is malloced in core_tpg_register() and should be freed
before leaving from the error handling cases, otherwise it will cause memory
leak.
'se_tpg' is malloced out of this function, and will be freed if we return error, so
remove free for 'se_tpg'.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget/iscsi: Fix mutual CHAP auth on big-endian arches
Andy Grover [Mon, 4 Mar 2013 21:52:09 +0000 (13:52 -0800)]
target/iscsi: Fix mutual CHAP auth on big-endian arches

See https://bugzilla.redhat.com/show_bug.cgi?id=916290

Used a temp var since we take its address in sg_init_one.

Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agotarget_core_sbc: use noop for SYNCHRONIZE_CACHE
Hannes Reinecke [Mon, 4 Mar 2013 13:08:06 +0000 (14:08 +0100)]
target_core_sbc: use noop for SYNCHRONIZE_CACHE

Windows does not expect SYNCHRONIZE_CACHE to be not supported,
and will generate a BSOD upon shutdown when using rd_mcp backend.
So better use a noop here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
11 years agoLinux 3.9-rc3
Linus Torvalds [Sun, 17 Mar 2013 22:59:32 +0000 (15:59 -0700)]
Linux 3.9-rc3

11 years agoperf,x86: fix link failure for non-Intel configs
David Rientjes [Sun, 17 Mar 2013 22:49:10 +0000 (15:49 -0700)]
perf,x86: fix link failure for non-Intel configs

Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:

arch/x86/power/built-in.o: In function `restore_processor_state':
(.text+0x45c): undefined reference to `perf_restore_debug_store'

Fix it by defining the dummy function appropriately.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoperf,x86: fix wrmsr_on_cpu() warning on suspend/resume
Linus Torvalds [Sun, 17 Mar 2013 22:44:43 +0000 (15:44 -0700)]
perf,x86: fix wrmsr_on_cpu() warning on suspend/resume

Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") fixed a crash when doing PEBS performance profiling
after resuming, but in using init_debug_store_on_cpu() to restore the
DS_AREA mtrr it also resulted in a new WARN_ON() triggering.

init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
cross-calls to do the MSR update.  Which is not really valid at the
early resume stage, and the warning is quite reasonable.  Now, it all
happens to _work_, for the simple reason that smp_call_function_single()
ends up just doing the call directly on the CPU when the CPU number
matches, but we really should just do the wrmsr() directly instead.

This duplicates the wrmsr() logic, but hopefully we can just remove the
wrmsr_on_cpu() version eventually.

Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Sun, 17 Mar 2013 18:04:14 +0000 (11:04 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
 "Eric's rcu barrier patch fixes a long standing problem with our
  unmount code hanging on to devices in workqueue helpers.  Liu Bo
  nailed down a difficult assertion for in-memory extent mappings."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix warning of free_extent_map
  Btrfs: fix warning when creating snapshots
  Btrfs: return as soon as possible when edquot happens
  Btrfs: return EIO if we have extent tree corruption
  btrfs: use rcu_barrier() to wait for bdev puts at unmount
  Btrfs: remove btrfs_try_spin_lock
  Btrfs: get better concurrency for snapshot-aware defrag work

11 years agoBtrfs: fix warning of free_extent_map
Liu Bo [Fri, 15 Mar 2013 14:46:39 +0000 (08:46 -0600)]
Btrfs: fix warning of free_extent_map

Users report that an extent map's list is still linked when it's actually
going to be freed from cache.

The story is that

a) when we're going to drop an extent map and may split this large one into
smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means
that it's on the list to be logged, then the smaller ems split from it will also
be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.

b) we'll keep ems from unlinking the list and freeing when they are flagged with
EXTENT_FLAG_LOGGING, because the log code holds one reference.

The end result is the warning, but the truth is that we set the flag
EXTENT_FLAG_LOGGING only during fsync.

So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.

Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agoMerge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Sat, 16 Mar 2013 01:06:55 +0000 (18:06 -0700)]
Merge branch 'kbuild' of git://git./linux/kernel/git/mmarek/kbuild

Pull kbuild fix from Michal Marek:
 "One fix for for make headers_install/headers_check to not require make
  3.81.  The requirement has been accidentally introduced in 3.7."

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: fix make headers_check with make 3.80

11 years agoMerge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux
Linus Torvalds [Sat, 16 Mar 2013 01:05:37 +0000 (18:05 -0700)]
Merge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux

Pull OpenRISC bug fixes from Jonas Bonn:

 - The GPIO descriptor work has exposed how broken the non-GPIOLIB bits
   for OpenRISC were.  We now require GPIOLIB as this is the preferred
   way forward.

 - The system.h split introduced a bug in llist.h for arches using
   asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
   The patch here moves two defines from asm-generic/atomic.h to
   asm-generic/cmpxchg.h to make things work as they should.

 - The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
   not have the virt_to_bus methods, so there's a patch to remove it
   again.

* tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux:
  openrisc: remove HAVE_VIRT_TO_BUS
  asm-generic: move cmpxchg*_local defs to cmpxchg.h
  openrisc: require gpiolib

11 years agoMerge tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sat, 16 Mar 2013 01:04:38 +0000 (18:04 -0700)]
Merge tag 'char-misc-3.9-rc2' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg Kroah-Hartman:
 "Here are some tiny fixes for the w1 drivers and the final removal
  patch for getting rid of CONFIG_EXPERIMENTAL (all users of it are now
  gone from your tree, this just drops the Kconfig item itself.)

  All have been in the linux-next tree for a while"

* tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  final removal of CONFIG_EXPERIMENTAL
  w1: fix oops when w1_search is called from netlink connector
  w1-gpio: fix unused variable warning
  w1-gpio: remove erroneous __exit and __exit_p()
  ARM: w1-gpio: fix erroneous gpio requests

11 years agoMerge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Sat, 16 Mar 2013 00:35:49 +0000 (17:35 -0700)]
Merge tag 'sound-3.9' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes, as expected for the middle rc:
   - A couple of fixes for potential NULL dereferences and out-of-range
     array accesses revealed by static code parsers
   - A fix for the wrong error handling detected by trinity
   - A regression fix for missing audio on some MacBooks
   - CA0132 DSP loader fixes
   - Fix for EAPD control of IDT codecs on machines w/o speaker
   - Fix a regression in the HD-audio widget list parser code
   - Workaround for the NuForce UDH-100 USB audio"

* tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
  sound: sequencer: cap array index in seq_chn_common_event()
  ALSA: hda/ca0132 - Remove extra setting of dsp_state.
  ALSA: hda/ca0132 - Check download state of DSP.
  ALSA: hda/ca0132 - Check if dspload_image succeeded.
  ALSA: hda - Disable IDT eapd_switch if there are no internal speakers
  ALSA: hda - Fix snd_hda_get_num_raw_conns() to return a correct value
  ALSA: usb-audio: add a workaround for the NuForce UDH-100
  ALSA: asihpi - fix potential NULL pointer dereference
  ALSA: seq: Fix missing error handling in snd_seq_timer_open()

11 years agoMerge branch 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma...
Linus Torvalds [Sat, 16 Mar 2013 00:35:03 +0000 (17:35 -0700)]
Merge branch 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping

Pull DMA-mapping fix from Marek Szyprowski:
 "An important fix for all ARM architectures which use ZONE_DMA.
  Without it dma_alloc_* calls with GFP_ATOMIC flag might have allocated
  buffers outsize DMA zone."

* 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation

11 years agoMerge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo...
Linus Torvalds [Sat, 16 Mar 2013 00:34:01 +0000 (17:34 -0700)]
Merge tag 'mfd-fixes-3.9-1' of git://git./linux/kernel/git/sameo/mfd-fixes

Pull MFD fixes from Samuel Ortiz:
 "This is the first batch of MFD fixes for 3.9.

  With this one we have:

   - An ab8500 build failure fix.
   - An ab8500 device tree parsing fix.
   - A fix for twl4030_madc remove routine to work properly (when
     built-in).
   - A fix for properly registering palmas interrupt handler.
   - A fix for omap-usb init routine to actually write into the
     hostconfig register.
   - A couple of warning fixes for ab8500-gpadc and tps65912"

* tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
  mfd: twl4030-madc: Remove __exit_p annotation
  mfd: ab8500: Kill "reg" property from binding
  mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
  mfd: wm831x: Don't forward declare enum wm831x_auxadc
  mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
  mfd: tps65912: Declare and use tps65912_irq_exit()
  mfd: palmas: Provide irq flags through DT/platform data
  mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
  mfd: omap-usb-host: Actually update hostconfig

11 years agoMerge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Sat, 16 Mar 2013 00:33:13 +0000 (17:33 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Bug fixes for pmbus, ltc2978, and lineage-pem drivers

  Added specific maintainer for some hwmon drivers"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/ltc2978) Fix temperature reporting
  hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()
  hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes
  MAINTAINERS: Add maintainer for MAX6697, INA209, and INA2XX drivers

11 years agoperf,x86: fix kernel crash with PEBS/BTS after suspend/resume
Stephane Eranian [Fri, 15 Mar 2013 13:26:07 +0000 (14:26 +0100)]
perf,x86: fix kernel crash with PEBS/BTS after suspend/resume

This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
Takashi Iwai [Fri, 15 Mar 2013 13:23:32 +0000 (14:23 +0100)]
ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs

During the transition to the generic parser, the hook to the codec
specific automute function was forgotten.  This resulted in the silent
output on some MacBooks.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agosound: sequencer: cap array index in seq_chn_common_event()
Dan Carpenter [Fri, 15 Mar 2013 06:14:22 +0000 (09:14 +0300)]
sound: sequencer: cap array index in seq_chn_common_event()

"chn" here is a number between 0 and 255, but ->chn_info[] only has
16 elements so there is a potential write beyond the end of the
array.

If the seq_mode isn't SEQ_2 then we let the individual drivers
(either opl3.c or midi_synth.c) handle it.  Those functions all
do a bounds check on "chn" so I haven't changed anything here.
The opl3.c driver has up to 18 channels and not 16.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agomfd: twl4030-madc: Remove __exit_p annotation
Arnd Bergmann [Thu, 14 Mar 2013 21:56:38 +0000 (22:56 +0100)]
mfd: twl4030-madc: Remove __exit_p annotation

4740f73fe5 "mfd: remove use of __devexit" removed the __devexit annotation
on the twl4030_madc_remove function, but left an __exit_p() present on the
pointer to this function. Using __exit_p was as wrong with the devexit in
place as it is now, but now we get a gcc warning about an unused function.

In order for the twl4030_madc_remove to work correctly in built-in code, we
have to remove the __exit_p.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
11 years agoALSA: hda/ca0132 - Remove extra setting of dsp_state.
Dylan Reid [Fri, 15 Mar 2013 00:27:46 +0000 (17:27 -0700)]
ALSA: hda/ca0132 - Remove extra setting of dsp_state.

spec->dsp_state is initialized to DSP_DOWNLOAD_INIT, no need to reset
and check it in ca0132_download_dsp().

Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoALSA: hda/ca0132 - Check download state of DSP.
Dylan Reid [Fri, 15 Mar 2013 00:27:45 +0000 (17:27 -0700)]
ALSA: hda/ca0132 - Check download state of DSP.

Instead of using the dspload_is_loaded() function, check the dsp_state
that is kept in the spec.  The dspload_is_loaded() function returns
true if the DSP transfer was never started.  This false-positive leads
to multiple second delays when ca0132_setup_efaults() times out on
each write.

Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoALSA: hda/ca0132 - Check if dspload_image succeeded.
Dylan Reid [Fri, 15 Mar 2013 00:27:44 +0000 (17:27 -0700)]
ALSA: hda/ca0132 - Check if dspload_image succeeded.

If dspload_image() fails, it was ignored and dspload_wait_loaded() was
still called.  dsp_loaded should never be set to true in this case,
skip it.  The check in dspload_wait_loaded() return true if the DSP is
loaded or if it never started.

Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agomm/fremap.c: fix possible oops on error path
Michel Lespinasse [Thu, 14 Mar 2013 23:50:02 +0000 (16:50 -0700)]
mm/fremap.c: fix possible oops on error path

The vm_flags introduced in 6d7825b10dbe ("mm/fremap.c: fix oops on error
path") is supposed to avoid a compiler warning about unitialized
vm_flags without changing the generated code.

However I am concerned that this is going to be very brittle, and fail
with some compiler versions. The failure could be either of:

- compiler could actually load vma->vm_flags before checking for the
  !vma condition, thus reintroducing the oops

- compiler could optimize out the !vma check, since the pointer just got
  dereferenced shortly before (so the compiler knows it can't be NULL!)

I propose reversing this part of the change and initializing vm_flags to 0
just to avoid the bogus uninitialized use warning.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Linus Torvalds [Thu, 14 Mar 2013 21:53:07 +0000 (14:53 -0700)]
Merge branch 'rcu/urgent' of git://git./linux/kernel/git/paulmck/linux-rcu

Pull fix for hlist_entry_safe() regression from Paul McKenney:
 "This contains a single commit that fixes a regression in
  hlist_entry_safe().  This macro references its argument twice, which
  can cause NULL-pointer errors.  This commit applies a gcc statement
  expression, creating a temporary variable to avoid the double
  reference.  This has been posted to LKML at

    https://lkml.org/lkml/2013/3/9/75.

  Kudos to CAI Qian, whose testing uncovered this, to Eric Dumazet, who
  spotted root cause, and to Li Zefan, who tested this commit."

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  list: Fix double fetch of pointer in hlist_entry_safe()

11 years agolist: Fix double fetch of pointer in hlist_entry_safe()
Paul E. McKenney [Sat, 9 Mar 2013 15:38:41 +0000 (07:38 -0800)]
list: Fix double fetch of pointer in hlist_entry_safe()

The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure.  This is OK for normal lock-based use because in
that case, the pointer cannot change.  However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time.  This use case can result in the following sequence
of events:

1. CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
pointer as sees that it is non-NULL.

2. CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
just fetched a pointer to.  Because this is the last entry
in the list, the pointer fetched by CPU 0 is now NULL.

3. CPU 0 refetches the pointer, obtains NULL, and then gets a
NULL-pointer crash.

This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events.  Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.

Many thanks to Eric Dumazet for spotting root cause!

Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>
11 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Thu, 14 Mar 2013 19:11:28 +0000 (12:11 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull ext2, ext3, reiserfs, quota fixes from Jan Kara:
 "A fix for regression in ext2, and a format string issue in ext3.  The
  rest isn't too serious."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: Fix BUG_ON in evict() on inode deletion
  reiserfs: Use kstrdup instead of kmalloc/strcpy
  ext3: Fix format string issues
  quota: add missing use of dq_data_lock in __dquot_initialize

11 years agoBtrfs: fix warning when creating snapshots
Liu Bo [Wed, 13 Mar 2013 13:43:03 +0000 (07:43 -0600)]
Btrfs: fix warning when creating snapshots

Creating snapshot passes extent_root to commit its transaction,
but it can lead to the warning of checking root for quota in
the __btrfs_end_transaction() when someone else is committing
the current transaction.  Since we've recorded the needed root
in trans_handle, just use it to get rid of the warning.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agoBtrfs: return as soon as possible when edquot happens
Wang Shilong [Wed, 6 Mar 2013 11:51:47 +0000 (11:51 +0000)]
Btrfs: return as soon as possible when edquot happens

If one of qgroup fails to reserve firstly, we should return immediately,
it is unnecessary to continue check.

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agoBtrfs: return EIO if we have extent tree corruption
Josef Bacik [Fri, 8 Mar 2013 20:41:02 +0000 (15:41 -0500)]
Btrfs: return EIO if we have extent tree corruption

The callers of lookup_inline_extent_info all handle getting an error back
properly, so return an error if we have corruption instead of being a jerk and
panicing.  Still WARN_ON() since this is kind of crucial and I've been seeing it
a bit too much recently for my taste, I think we're doing something wrong
somewhere.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agobtrfs: use rcu_barrier() to wait for bdev puts at unmount
Eric Sandeen [Sat, 9 Mar 2013 15:18:39 +0000 (15:18 +0000)]
btrfs: use rcu_barrier() to wait for bdev puts at unmount

Doing this would reliably fail with -EBUSY for me:

# mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2
...
unable to open /dev/sdb2: Device or resource busy

because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it.

Using systemtap to track bdev gets & puts shows a kworker thread doing a
blkdev put after mkfs attempts a get; this is left over from the unmount
path:

btrfs_close_devices
__btrfs_close_devices
call_rcu(&device->rcu, free_device);
free_device
INIT_WORK(&device->rcu_work, __free_device);
schedule_work(&device->rcu_work);

so unmount might complete before __free_device fires & does its blkdev_put.

Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait
until all blkdev_put()s are done, and the device is truly free once
unmount completes.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agoBtrfs: remove btrfs_try_spin_lock
Liu Bo [Mon, 11 Mar 2013 09:37:45 +0000 (09:37 +0000)]
Btrfs: remove btrfs_try_spin_lock

Remove a useless function declaration

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agoBtrfs: get better concurrency for snapshot-aware defrag work
Liu Bo [Mon, 11 Mar 2013 09:20:58 +0000 (09:20 +0000)]
Btrfs: get better concurrency for snapshot-aware defrag work

Using spinning case instead of blocking will result in better concurrency
overall.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
11 years agohwmon: (pmbus/ltc2978) Fix temperature reporting
Guenter Roeck [Thu, 21 Feb 2013 18:27:54 +0000 (10:27 -0800)]
hwmon: (pmbus/ltc2978) Fix temperature reporting

On LTC2978, only READ_TEMPERATURE is supported. It reports
the internal junction temperature. This register is unpaged.

On LTC3880, READ_TEMPERATURE and READ_TEMPERATURE2 are supported.
READ_TEMPERATURE is paged and reports external temperatures.
READ_TEMPERATURE2 is unpaged and reports the internal junction
temperature.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # 3.2+
Acked-by: Jean Delvare <khali@linux-fr.org>
11 years agoALSA: hda - Disable IDT eapd_switch if there are no internal speakers
David Henningsson [Thu, 14 Mar 2013 14:28:29 +0000 (15:28 +0100)]
ALSA: hda - Disable IDT eapd_switch if there are no internal speakers

If there are no internal speakers, we should not turn the eapd switch
off, because it might be necessary to keep high for Headphone.

BugLink: https://bugs.launchpad.net/bugs/1155016
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agohwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()
David Woodhouse [Thu, 14 Mar 2013 13:30:20 +0000 (13:30 +0000)]
hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()

If krealloc() returns NULL, it *doesn't* free the original. So any code
of the form 'foo = krealloc(foo, …);' is almost certainly a bug.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
11 years agohwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes
Axel Lin [Thu, 14 Mar 2013 08:27:18 +0000 (16:27 +0800)]
hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: stable@vger.kernel.org
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
11 years agoARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
Marek Szyprowski [Tue, 26 Feb 2013 06:46:24 +0000 (07:46 +0100)]
ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation

Atomic pool should always be allocated from DMA zone if such zone is
available in the system to avoid issues caused by limited dma mask of
any of the devices used for making an atomic allocation.

Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Stable <stable@vger.kernel.org> [v3.6+]
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm...
Linus Torvalds [Wed, 13 Mar 2013 22:47:50 +0000 (15:47 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace

Pull namespace bugfixes from Eric Biederman:
 "This tree includes a partial revert for "fs: Limit sys_mount to only
  request filesystem modules." When I added the new style module aliases
  to the filesystems I deleted the old ones.  A bad move.  It turns out
  that distributions like Arch linux use module aliases when
  constructing ramdisks.  Which meant ultimately that an ext3 filesystem
  mounted with ext4 would not result in the ext4 module being put into
  the ramdisk.

  The other change in this tree adds a handful of filesystem module
  alias I simply failed to add the first time.  Which inconvinienced a
  few folks using cifs.

  I don't want to inconvinience folks any longer than I have to so here
  are these trivial fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  fs: Readd the fs module aliases.
  fs: Limit sys_mount to only request filesystem modules. (Part 3)

11 years agoMerge branch 'akpm' (fixes from Andrew)
Linus Torvalds [Wed, 13 Mar 2013 22:21:57 +0000 (15:21 -0700)]
Merge branch 'akpm' (fixes from Andrew)

Merge misc fixes from Andrew Morton:

 - A bunch of fixes

 - Finish off the idr API conversions before someone starts to use the
   old interfaces again.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
  UAPI: fix endianness conditionals in M32R's asm/stat.h
  UAPI: fix endianness conditionals in linux/raid/md_p.h
  UAPI: fix endianness conditionals in linux/acct.h
  UAPI: fix endianness conditionals in linux/aio_abi.h
  decompressors: fix typo "POWERPC"
  mm/fremap.c: fix oops on error path
  idr: deprecate idr_pre_get() and idr_get_new[_above]()
  tidspbridge: convert to idr_alloc()
  zcache: convert to idr_alloc()
  mlx4: remove leftover idr_pre_get() call
  workqueue: convert to idr_alloc()
  nfsd: convert to idr_alloc()
  nfsd: remove unused get_new_stid()
  kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
  signal: always clear sa_restorer on execve
  mm: remove_memory(): fix end_pfn setting
  include/linux/res_counter.h needs errno.h

11 years agoidr: idr_alloc() shouldn't trigger lowmem warning when preloaded
Tejun Heo [Wed, 13 Mar 2013 21:59:49 +0000 (14:59 -0700)]
idr: idr_alloc() shouldn't trigger lowmem warning when preloaded

GFP_NOIO is often used for idr_alloc() inside preloaded section as the
allocation mask doesn't really matter.  If the idr tree needs to be
expanded, idr_alloc() first tries to allocate using the specified
allocation mask and if it fails falls back to the preloaded buffer.  This
order prevent non-preloading idr_alloc() users from taking advantage of
preloading ones by using preload buffer without filling it shifting the
burden of allocation to the preload users.

Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up
generating spurious slab lowmem warning before succeeding the request from
the preload buffer.

This patch makes idr_layer_alloc() add __GFP_NOWARN to the first
kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after
allocation from preload_buffer fails so that lowmem warning is generated
if not suppressed by the original @gfp_mask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Teigland <teigland@redhat.com>
Tested-by: David Teigland <teigland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoUAPI: fix endianness conditionals in M32R's asm/stat.h
David Howells [Wed, 13 Mar 2013 21:59:48 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in M32R's asm/stat.h

In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).

However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.

The definition of struct stat64 in M32R's asm/stat.h is wrong in this way.
 Note that userspace will likely interpret the field order incorrectly as
the big-endian variant on little-endian machines - depending on header
inclusion order.

[!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
be better to fix the ordering of st_blocks and __pad4 in struct stat64.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoUAPI: fix endianness conditionals in linux/raid/md_p.h
David Howells [Wed, 13 Mar 2013 21:59:47 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/raid/md_p.h

In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).

However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.

The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
this way.  Note that userspace will likely interpret the ordering of the
fields incorrectly as the big-endian variant on a little-endian machines -
depending on header inclusion order.

[!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
be better to fix the ordering of events_hi, events_lo, cp_events_hi and
cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoUAPI: fix endianness conditionals in linux/acct.h
David Howells [Wed, 13 Mar 2013 21:59:46 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/acct.h

In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).

However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.

The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
Note that userspace will likely interpret this incorrectly as the
big-endian variant on little-endian machines - depending on header
inclusion order.

[!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
be better to fix the value of ACCT_BYTEORDER.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoUAPI: fix endianness conditionals in linux/aio_abi.h
David Howells [Wed, 13 Mar 2013 21:59:45 +0000 (14:59 -0700)]
UAPI: fix endianness conditionals in linux/aio_abi.h

In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).

However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.

The definition of PADDED() in linux/aio_abi.h is wrong in this way.  Note
that userspace will likely interpret this and thus the order of fields in
struct iocb incorrectly as the little-endian variant on big-endian
machines - depending on header inclusion order.

[!!!] NOTE [!!!]  This patch may adversely change the userspace API.  It might
be better to fix the ordering of aio_key and aio_reserved1 in struct iocb.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodecompressors: fix typo "POWERPC"
Paul Bolle [Wed, 13 Mar 2013 21:59:44 +0000 (14:59 -0700)]
decompressors: fix typo "POWERPC"

Commit 5dc49c75a26b ("decompressors: make the default XZ_DEC_* config
match the selected architecture") added

default y if POWERPC

to lib/xz/Kconfig.  But there is no Kconfig symbol POWERPC.  The most
general Kconfig symbol for the powerpc architecture is PPC.  So let's
use that.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Florian Fainelli <florian@openwrt.org>
Cc: Lasse Collin <lasse.collin@tukaani.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm/fremap.c: fix oops on error path
Andrew Morton [Wed, 13 Mar 2013 21:59:43 +0000 (14:59 -0700)]
mm/fremap.c: fix oops on error path

If find_vma() fails, sys_remap_file_pages() will dereference `vma', which
contains NULL.  Fix it by checking the pointer.

(We could alternatively check for err==0, but this seems more direct)

(The vm_flags change is to squish a bogus used-uninitialised warning
without adding extra code).

Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoidr: deprecate idr_pre_get() and idr_get_new[_above]()
Tejun Heo [Wed, 13 Mar 2013 21:59:42 +0000 (14:59 -0700)]
idr: deprecate idr_pre_get() and idr_get_new[_above]()

Now that all in-kernel users are converted to ues the new alloc
interface, mark the old interface deprecated.  We should be able to
remove these in a few releases.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agotidspbridge: convert to idr_alloc()
Tejun Heo [Wed, 13 Mar 2013 21:59:41 +0000 (14:59 -0700)]
tidspbridge: convert to idr_alloc()

idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

There are some peculiarities and possible bugs in the converted
functions.  This patch preserves those.

* drv_insert_node_res_element() returns -ENOMEM on alloc failure,
  -EFAULT if id space is exhausted.  -EFAULT is at best misleading.

* drv_proc_insert_strm_res_element() is even weirder.  It returns
  -EFAULT if kzalloc() fails, -ENOMEM if idr preloading fails and
  -EPERM if id space is exhausted.  What's going on here?

* drv_proc_insert_strm_res_element() doesn't free *pstrm_res after
  failure.

Only compile tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Cc: Rene Sapiens <rene.sapiens@ti.com>
Cc: Armando Uribe <x0095078@ti.com>
Cc: Omar Ramirez Luna <omar.ramirez@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agozcache: convert to idr_alloc()
Tejun Heo [Wed, 13 Mar 2013 21:59:40 +0000 (14:59 -0700)]
zcache: convert to idr_alloc()

idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Only compile tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomlx4: remove leftover idr_pre_get() call
Tejun Heo [Wed, 13 Mar 2013 21:59:39 +0000 (14:59 -0700)]
mlx4: remove leftover idr_pre_get() call

Commit 6a9200603d76 ("IB/mlx4: convert to idr_alloc()") forgot to remove
idr_pre_get() call in mlx4_ib_cm_paravirt_init().  It's unnecessary and
idr_pre_get() will soon be deprecated.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoworkqueue: convert to idr_alloc()
Tejun Heo [Wed, 13 Mar 2013 21:59:38 +0000 (14:59 -0700)]
workqueue: convert to idr_alloc()

idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agonfsd: convert to idr_alloc()
Tejun Heo [Wed, 13 Mar 2013 21:59:37 +0000 (14:59 -0700)]
nfsd: convert to idr_alloc()

idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Only compile-tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agonfsd: remove unused get_new_stid()
Tejun Heo [Wed, 13 Mar 2013 21:59:36 +0000 (14:59 -0700)]
nfsd: remove unused get_new_stid()

get_new_stid() is no longer used since commit 3abdb607125 ("nfsd4:
simplify idr allocation").  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agokernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
Andrew Morton [Wed, 13 Mar 2013 21:59:34 +0000 (14:59 -0700)]
kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER

__ARCH_HAS_SA_RESTORER is the preferred conditional for use in 3.9 and
later kernels, per Kees.

Cc: Emese Revfy <re.emese@gmail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Julien Tinnes <jln@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agosignal: always clear sa_restorer on execve
Kees Cook [Wed, 13 Mar 2013 21:59:33 +0000 (14:59 -0700)]
signal: always clear sa_restorer on execve

When the new signal handlers are set up, the location of sa_restorer is
not cleared, leaking a parent process's address space location to
children.  This allows for a potential bypass of the parent's ASLR by
examining the sa_restorer value returned when calling sigaction().

Based on what should be considered "secret" about addresses, it only
matters across the exec not the fork (since the VMAs haven't changed
until the exec).  But since exec sets SIG_DFL and keeps sa_restorer,
this is where it should be fixed.

Given the few uses of sa_restorer, a "set" function was not written
since this would be the only use.  Instead, we use
__ARCH_HAS_SA_RESTORER, as already done in other places.

Example of the leak before applying this patch:

  $ cat /proc/$$/maps
  ...
  7fb9f3083000-7fb9f3238000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
  ...
  $ ./leak
  ...
  7f278bc74000-7f278be29000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
  ...
  1 0 (nil) 0x7fb9f30b94a0
  2 4000000 (nil) 0x7f278bcaa4a0
  3 4000000 (nil) 0x7f278bcaa4a0
  4 0 (nil) 0x7fb9f30b94a0
  ...

[akpm@linux-foundation.org: use SA_RESTORER for backportability]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Emese Revfy <re.emese@gmail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Julien Tinnes <jln@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>