dm thin: Use last transaction's pmd->root when commit failed
authorZhihao Cheng <chengzhihao1@huawei.com>
Thu, 8 Dec 2022 14:28:02 +0000 (22:28 +0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 7 Jan 2023 11:07:36 +0000 (12:07 +0100)
commit 7991dbff6849f67e823b7cc0c15e5a90b0549b9f upstream.

Recently we found a softlock up problem in dm thin pool btree lookup
code due to corrupted metadata:

 Kernel panic - not syncing: softlockup: hung tasks
 CPU: 7 PID: 2669225 Comm: kworker/u16:3
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 Workqueue: dm-thin do_worker [dm_thin_pool]
 Call Trace:
   <IRQ>
   dump_stack+0x9c/0xd3
   panic+0x35d/0x6b9
   watchdog_timer_fn.cold+0x16/0x25
   __run_hrtimer+0xa2/0x2d0
   </IRQ>
   RIP: 0010:__relink_lru+0x102/0x220 [dm_bufio]
   __bufio_new+0x11f/0x4f0 [dm_bufio]
   new_read+0xa3/0x1e0 [dm_bufio]
   dm_bm_read_lock+0x33/0xd0 [dm_persistent_data]
   ro_step+0x63/0x100 [dm_persistent_data]
   btree_lookup_raw.constprop.0+0x44/0x220 [dm_persistent_data]
   dm_btree_lookup+0x16f/0x210 [dm_persistent_data]
   dm_thin_find_block+0x12c/0x210 [dm_thin_pool]
   __process_bio_read_only+0xc5/0x400 [dm_thin_pool]
   process_thin_deferred_bios+0x1a4/0x4a0 [dm_thin_pool]
   process_one_work+0x3c5/0x730

Following process may generate a broken btree mixed with fresh and
stale btree nodes, which could get dm thin trapped in an infinite loop
while looking up data block:
 Transaction 1: pmd->root = A, A->B->C   // One path in btree
                pmd->root = X, X->Y->Z   // Copy-up
 Transaction 2: X,Z is updated on disk, Y write failed.
                // Commit failed, dm thin becomes read-only.
                process_bio_read_only
 dm_thin_find_block
  __find_block
   dm_btree_lookup(pmd->root)
The pmd->root points to a broken btree, Y may contain stale node
pointing to any block, for example X, which gets dm thin trapped into
a dead loop while looking up Z.

Fix this by setting pmd->root in __open_metadata(), so that dm thin
will use the last transaction's pmd->root if commit failed.

Fetch a reproducer in [Link].

Linke: https://bugzilla.kernel.org/show_bug.cgi?id=216790
Cc: stable@vger.kernel.org
Fixes: 991d9fa02da0 ("dm: add thin provisioning target")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/md/dm-thin-metadata.c

index b5bf2ecfaf9132a526ed2fad020af35c8a35fd0a..0b56c8412fdf23b20aa6c4ab7b10167e1c3ccd5b 100644 (file)
@@ -661,6 +661,15 @@ static int __open_metadata(struct dm_pool_metadata *pmd)
                goto bad_cleanup_data_sm;
        }
 
+       /*
+        * For pool metadata opening process, root setting is redundant
+        * because it will be set again in __begin_transaction(). But dm
+        * pool aborting process really needs to get last transaction's
+        * root to avoid accessing broken btree.
+        */
+       pmd->root = le64_to_cpu(disk_super->data_mapping_root);
+       pmd->details_root = le64_to_cpu(disk_super->device_details_root);
+
        __setup_btree_details(pmd);
        dm_bm_unlock(sblock);