Commit 81a3ca9
fs: fix lazytime expiration handling in __writeback_single_inode()
ANBZ: torvalds#752
commit 1e249cb upstream.
When lazytime is enabled and an inode is being written due to its
in-memory updated timestamps having expired, either due to a sync() or
syncfs() system call or due to dirtytime_expire_interval having elapsed,
the VFS needs to inform the filesystem so that the filesystem can copy
the inode's timestamps out to the on-disk data structures.
This is done by __writeback_single_inode() calling
mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
However, this occurs after __writeback_single_inode() has already
cleared the dirty flags from ->i_state. This causes two bugs:
- mark_inode_dirty_sync() redirties the inode, causing it to remain
dirty. This wastefully causes the inode to be written twice. But
more importantly, it breaks cases where sync_filesystem() is expected
to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
ioctl (as reported at
https://lore.kernel.org/r/[email protected]), as well
as possibly filesystem freezing (freeze_super()).
- Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
called from __writeback_single_inode() for lazytime expiration,
xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
lazytime expirations, and it assumes that i_state will contain
I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't
persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
Fix this by moving the call to mark_inode_dirty_sync() to earlier in
__writeback_single_inode(), before the dirty flags are cleared from
i_state. This makes filesystems be properly notified of the timestamp
expiration, and it avoids incorrectly redirtying the inode.
This fixes xfstest generic/580 (which tests
FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
enabled. It also fixes the new lazytime xfstest I've proposed, which
reproduces the above-mentioned XFS bug
(https://lore.kernel.org/r/[email protected]).
Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
right thing to do because mark_inode_dirty_sync() now knows not to move
the inode to a writeback list if it is currently queued for sync.
Fixes: 0ae45f6 ("vfs: add support for a lazytime mount option")
Cc: [email protected]
Depends-on: 5afced3 ("writeback: Avoid skipping inode writeback")
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Joseph Qi <[email protected]>
Reviewed-by: Gao Xiang <[email protected]>1 parent 2030fca commit 81a3ca9
1 file changed
+13
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1537 | 1537 | | |
1538 | 1538 | | |
1539 | 1539 | | |
1540 | | - | |
1541 | | - | |
1542 | | - | |
| 1540 | + | |
| 1541 | + | |
| 1542 | + | |
1543 | 1543 | | |
1544 | | - | |
1545 | | - | |
1546 | | - | |
1547 | 1544 | | |
1548 | | - | |
1549 | | - | |
| 1545 | + | |
1550 | 1546 | | |
1551 | 1547 | | |
1552 | | - | |
1553 | 1548 | | |
| 1549 | + | |
1554 | 1550 | | |
| 1551 | + | |
| 1552 | + | |
| 1553 | + | |
| 1554 | + | |
| 1555 | + | |
| 1556 | + | |
| 1557 | + | |
| 1558 | + | |
1555 | 1559 | | |
1556 | 1560 | | |
1557 | 1561 | | |
| |||
1572 | 1576 | | |
1573 | 1577 | | |
1574 | 1578 | | |
1575 | | - | |
1576 | | - | |
1577 | 1579 | | |
1578 | 1580 | | |
1579 | 1581 | | |
| |||
0 commit comments