If we have a file with an implicit hole (NO_HOLES feature enabled) that
has an extent following the hole, delayed writes against regions of the
file behind the hole happened before but were not yet flushed and then
we truncate the file to a smaller size that lies inside the hole, we
end up persisting a wrong disk_i_size value for our inode that leads to
data loss after umounting and mounting again the filesystem or after
the inode is evicted and loaded again.
This happens because at inode.c:btrfs_truncate_inode_items() we end up
setting last_size to the offset of the extent that we deleted and that
followed the hole. We then pass that value to btrfs_ordered_update_i_size()
which updates the inode's disk_i_size to a value smaller then the offset
of the buffered (delayed) writes.
Example reproducer:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x01 0K 32K" /mnt/foo
$ xfs_io -d -c "pwrite -S 0x02 -b 32K 64K 32K" /mnt/foo
$ xfs_io -c "truncate 60K" /mnt/foo
--> inode's disk_i_size updated to 0
$ md5sum /mnt/foo
3c5ca3c3ab42f4b04d7e7eb0b0d4d806 /mnt/foo
$ umount /dev/sdb
$ mount /dev/sdb /mnt
$ md5sum /mnt/foo
d41d8cd98f00b204e9800998ecf8427e /mnt/foo
--> Empty file, all data lost!
Cc: <stable@vger.kernel.org> # 3.14+
Fixes:
16e7549f045d ("Btrfs: incompatible format change to remove hole extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
if (found_type > min_type) {
del_item = 1;
} else {
- if (item_end < new_size) {
- /*
- * With NO_HOLES mode, for the following mapping
- *
- * [0-4k][hole][8k-12k]
- *
- * if truncating isize down to 6k, it ends up
- * isize being 8k.
- */
- if (btrfs_fs_incompat(root->fs_info, NO_HOLES))
- last_size = new_size;
+ if (item_end < new_size)
break;
- }
if (found_key.offset >= new_size)
del_item = 1;
else
btrfs_abort_transaction(trans, ret);
}
error:
- if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
+ if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) {
+ ASSERT(last_size >= new_size);
+ if (!err && last_size > new_size)
+ last_size = new_size;
btrfs_ordered_update_i_size(inode, last_size, NULL);
+ }
btrfs_free_path(path);