From: Zhang Yi

The data=ordered mode introduces two fundamental conflicts with the
iomap buffered write path, leading to potential deadlocks.

1) Lock ordering conflict

In the iomap writeback path, each folio is processed sequentially: the
folio lock is acquired first, followed by starting a transaction to
create block mappings. In data=ordered mode, writeback triggered by the
journal commit process may attempt to acquire a folio lock that is
already held by iomap. Meanwhile, iomap, under that same folio lock, may
start a new transaction and wait for the currently committing
transaction to finish, resulting in a deadlock.

2) Partial folio submission not supported

When block size is smaller than folio size, a folio may contain both
mapped and unmapped blocks. In data=ordered mode, if the journal waits
for such a folio to be written back while the regular writeback process
has already started committing it (with the writeback flag set), mapping
the remaining unmapped blocks can deadlock. This is because the
writeback flag is cleared only after the entire folio is processed and
committed.

To support data=ordered mode, the iomap core would need two invasive
changes:

 - Acquire the transaction handle before locking any folio for
   writeback.
 - Support partial folio submission.

Both changes are complicated and risk performance regressions.
Therefore, we must avoid using data=ordered mode when converting to the
iomap path.

Currently, data=ordered mode is used in three scenarios:

 - Append write
 - Post-EOF partial block truncate-up followed by append write
 - Online defragmentation

We can address the first two without data=ordered mode:

 - For append write: always allocate unwritten blocks (i.e. always
   enable dioread_nolock), preserving the behavior of current
   extent-type inodes.
 - For post-EOF truncate-up + append write: postpone updating
   i_disksize until after the zeroed partial block has been written
   back.

Online defragmentation does not yet support iomap; this can be resolved
separately in the future.

Signed-off-by: Zhang Yi
---
 fs/ext4/ext4_jbd2.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 63d17c5201b5..26999f173870 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -383,7 +383,12 @@ static inline int ext4_should_journal_data(struct inode *inode)
 
 static inline int ext4_should_order_data(struct inode *inode)
 {
-	return ext4_inode_journal_mode(inode) & EXT4_INODE_ORDERED_DATA_MODE;
+	/*
+	 * inodes using the iomap buffered I/O path do not use the
+	 * data=ordered mode.
+	 */
+	return !ext4_inode_buffered_iomap(inode) &&
+		(ext4_inode_journal_mode(inode) & EXT4_INODE_ORDERED_DATA_MODE);
 }
 
 static inline int ext4_should_writeback_data(struct inode *inode)
-- 
2.52.0
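
For reviewers, the effect of the change on ext4_should_order_data() can be
modelled in userspace. The following is only a sketch: struct model_inode,
the MODEL_* constants and both model_* functions are hypothetical stand-ins
invented for illustration, not the kernel definitions. It assumes the
journal mode is a single bit flag and that the iomap buffered state is a
per-inode boolean.

```c
#include <stdbool.h>

/* Illustrative stand-ins only; these are NOT the kernel definitions. */
enum model_journal_mode {
	MODEL_JOURNAL_DATA_MODE   = 0x01,
	MODEL_ORDERED_DATA_MODE   = 0x02,
	MODEL_WRITEBACK_DATA_MODE = 0x04,
};

/* Hypothetical reduced inode: just the two inputs the predicate reads. */
struct model_inode {
	int journal_mode;     /* one of the MODEL_*_DATA_MODE values */
	bool buffered_iomap;  /* true if the inode uses the iomap buffered I/O path */
};

/* Predicate shape before the patch: depends on the journal mode only. */
static inline int model_should_order_data_old(const struct model_inode *inode)
{
	return inode->journal_mode & MODEL_ORDERED_DATA_MODE;
}

/* Predicate shape after the patch: iomap buffered inodes never order data. */
static inline int model_should_order_data_new(const struct model_inode *inode)
{
	return !inode->buffered_iomap &&
	       (inode->journal_mode & MODEL_ORDERED_DATA_MODE);
}
```

In this model, the only input for which the two predicates disagree is an
inode in ordered mode with buffered_iomap set: the old form reports ordered
data, the new form does not. All other combinations are unchanged.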