File 0019-LU-0000-ldiskfs-add-suse-patches-for-SLE15-SP4.patch of Package lustre_2_15
From d3055d907ee9f4a5a1eacb50eb5188149bd46be6 Mon Sep 17 00:00:00 2001 From: Mr NeilBrown <neilb@suse.de> Date: Wed, 30 Nov 2022 16:59:52 +1100 Subject: [PATCH 19/30] LU-0000 ldiskfs: add suse patches for SLE15-SP4 This patches update from GA to latest maint. Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I330c0dc0c4f5868af88b9c070f130a20a702ddb0 --- ldiskfs/kernel_patches/patches/base/ext4-delayed-iput.patch | 52 +- ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-BUG_ON-in-ext4_bread-when-write-quota-data.patch | 108 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-check-for-block-being-out-of-directory-size.patch | 35 + ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-deadlock-during-directory-rename.patch | 86 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-possible-corruption-when-moving-a-directory.patch | 59 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-reusing-stale-buffer-heads-from-last-failed.patch | 122 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-EA_INODE-checking-to-ext4_iget.patch | 179 +++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-EXT4_IGET_BAD-flag-to-prevent-unexpected-ba.patch | 81 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-helper-to-check-quota-inums.patch | 83 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-inode-table-check-in-__ext4_get_inode_loc-t.patch | 89 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-lockdep-annotations-for-i_data_sem-for-ea_i.patch | 63 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-missing-validation-of-fast-commit-record-le.patch | 97 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-new-helper-interface-ext4_try_to_trim_range.patch | 161 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-add-reserved-GDT-blocks-check.patch | 79 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-allocate-extended-attribute-value-in-vmalloc-ar.patch | 50 ++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-BUG_ON-when-creating-xattrs.patch | 58 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-cycles-in-directory-h-tree.patch | 86 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-deadlock-in-fs-reclaim-with-page-writebac.patch | 232 ++++++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-trim-error-on-fs-with-small-groups.patch | 72 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-unaccounted-block-allocation-when-expandi.patch | 47 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-bail-out-of-ext4_xattr_ibody_get-fails-for-any-.patch | 36 + ldiskfs/kernel_patches/patches/patches.suse/ext4-check-if-directory-block-is-within-i_size.patch | 57 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-check-iomap-type-only-if-ext4_iomap_begin-does-.patch | 45 + ldiskfs/kernel_patches/patches/patches.suse/ext4-destroy-ext4_fc_dentry_cachep-kmemcache-on-modu.patch | 79 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-disable-fast-commit-of-encrypted-dir-operations.patch | 151 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-disallow-ea_inodes-with-extended-attributes.patch | 39 + ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-allow-journal-inode-to-have-encrypt-flag.patch | 58 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-increase-iversion-counter-for-ea_inodes.patch | 49 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-set-up-encryption-key-during-jbd2-transac.patch | 158 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-use-the-orphan-list-when-migrating-an-ino.patch | 88 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-ext4_read_bh_lock-should-submit-IO-if-the-buffe.patch | 81 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-f2fs-fix-readahead-of-verity-data.patch | 48 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-factor-out-ext4_fc_get_tl.patch | 148 ++++++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-fail-ext4_iget-if-special-inode-unallocated.patch | 76 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fast-commit-may-miss-tracking-unwritten-range-d.patch | 44 + ldiskfs/kernel_patches/patches/patches.suse/ext4-filter-out-EXT4_FC_REPLAY-from-on-disk-superblo.patch | 62 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-BUG_ON-when-directory-entry-has-invalid-rec.patch | 74 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-RENAME_WHITEOUT-handling-for-inline-directo.patch | 93 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-WARNING-in-ext4_update_inline_data.patch | 114 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-WARNING-in-mb_find_extent.patch | 135 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-a-possible-ABBA-deadlock-due-to-busy-PA.patch | 159 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-an-use-after-free-issue-about-data-journal-.patch | 135 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-another-off-by-one-fsmap-error-on-1k-block-.patch | 127 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bad-checksum-after-online-resize.patch | 54 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-ext4_mb_use_inode_pa.patch | 103 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-__es_tree_search-caused-by-bad-bo.patch | 102 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-__es_tree_search.patch | 144 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-ext4_writepages.patch | 113 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-cgroup-writeback-accounting-with-fs-layer-e.patch | 71 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-corruption-when-online-resizing-a-1K-bigall.patch | 58 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-data-races-when-using-cached-status-extents.patch | 86 +++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-deadlock-due-to-mbcache-entry-corruption.patch | 92 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-deadlock-when-converting-an-inline-director.patch | 69 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-delayed-allocation-bug-in-ext4_clu_mapped-f.patch | 63 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-dir-corruption-when-ext4_dx_add_entry-fails.patch | 100 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-code-return-to-user-space-in-ext4_get.patch | 57 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_fc_record_modified_i.patch | 187 ++++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_restore_inline_data.patch | 65 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-ext4_fc_stats-trace-point.patch | 136 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fallocate-to-use-file_modified-to-update-pe.patch | 171 +++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fast-commit-may-miss-tracking-range-for-FAL.patch | 62 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fs-corruption-when-tring-to-remove-a-non-em.patch | 163 +++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-i_disksize-exceeding-i_size-problem-in-pari.patch | 75 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-incorrect-options-show-of-original-mount_op.patch | 109 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-incorrect-type-issue-during-replay_del_rang.patch | 42 + ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-inode-leak-in-ext4_xattr_inode_create-on-an.patch | 59 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-kernel-BUG-in-ext4_write_inline_data_end.patch | 108 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-leaking-uninitialized-memory-in-fast-commit.patch | 48 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-lockdep-warning-when-enabling-MMP.patch | 89 +++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-miss-release-buffer-head-in-ext4_fc_write_i.patch | 62 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-null-ptr-deref-in-__ext4_journal_ensure_cre.patch | 95 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-off-by-one-errors-in-fast-commit-block-fill.patch | 171 +++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-overhead-calculation-to-account-for-the-res.patch | 41 + ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-possible-double-unlock-when-moving-a-direct.patch | 38 + ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_mod.patch | 50 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_reg.patch | 54 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-out-of-bound-read-in-ext4_fc_repl.patch | 97 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-race-condition-between-ext4_write-and-ext4_.patch | 133 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-reserved-cluster-accounting-in-__es_remove_.patch | 96 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-super-block-checksum-incorrect-after-mount.patch | 78 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-symlink-file-size-not-match-to-file-content.patch | 56 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-task-hung-in-ext4_xattr_delete_inode.patch | 97 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-to-check-return-value-of-freeze_bdev-in-ext.patch | 49 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-unaligned-memory-access-in-ext4_fc_reserve_.patch | 104 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-undefined-behavior-in-bit-shift-for-ext4_ch.patch | 54 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-uninititialized-value-in-ext4_evict_inode.patch | 97 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_ext_shift_extents.patch 
| 106 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_orphan_cleanup.patch | 81 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_rename_dir_prepare.patch | 130 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_search_dir.patch | 131 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-read-in-ext4_find_extent-for.patch | 94 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-warning-in-ext4_da_release_space.patch | 108 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-warning-in-ext4_handle_inode_extension.patch | 113 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-fixup-pages-without-buffers.patch | 74 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-force-overhead-calculation-if-the-s_overhead_cl.patch | 50 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-goto-right-label-failed_mount3a.patch | 68 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-improve-error-handling-from-ext4_dirhash.patch | 163 +++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-improve-error-recovery-code-paths-in-__ext4_rem.patch | 67 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-init-quota-for-old.inode-in-ext4_rename.patch | 83 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-initialize-err_blk-before-calling-__ext4_get_in.patch | 43 + ldiskfs/kernel_patches/patches/patches.suse/ext4-initialize-quota-before-expanding-inode-in-setp.patch | 53 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-introduce-EXT4_FC_TAG_BASE_LEN-helper.patch | 192 ++++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-limit-length-to-bitmap_maxbytes-blocksize-in-pu.patch | 66 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-ext4_append-always-allocates-new-bloc.patch | 63 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-quota-gets-properly-shutdown-on-error.patch | 56 ++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-to-reset-inode-lockdep-class-when-quo.patch | 55 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-make-variable-count-signed.patch | 40 + ldiskfs/kernel_patches/patches/patches.suse/ext4-mark-group-as-trimmed-only-if-it-was-fully-scan.patch | 101 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-modify-the-logic-of-ext4_mb_new_blocks_simple.patch | 79 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-move-where-set-the-MAY_INLINE_DATA-flag-is-set.patch | 61 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-only-update-i_reserved_data_blocks-on-successfu.patch | 100 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-place-buffer-head-allocation-before-handle-star.patch | 49 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-prevent-used-blocks-from-being-allocated-during.patch | 117 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-recover-csum-seed-of-tmp_inode-after-migrating-.patch | 77 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-refuse-to-create-ea-block-when-umounted.patch | 45 + ldiskfs/kernel_patches/patches/patches.suse/ext4-reject-the-commit-option-on-ext2-filesystems.patch | 40 + ldiskfs/kernel_patches/patches/patches.suse/ext4-remove-EA-inode-entry-from-mbcache-on-inode-evi.patch | 116 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-set-lockdep-subclass-for-the-ea_inode-in-ext4_x.patch | 39 + ldiskfs/kernel_patches/patches/patches.suse/ext4-silence-the-warning-when-evicting-inode-with-di.patch | 99 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-turn-quotas-off-if-mount-failed-after-enabling-.patch | 76 +++ ldiskfs/kernel_patches/patches/patches.suse/ext4-unconditionally-enable-the-i_version-counter.patch | 121 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-unindent-codeblock-in-ext4_xattr_block_set.patch | 123 +++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-update-s_journal_inum-if-it-changes-after-journ.patch | 53 ++ 
ldiskfs/kernel_patches/patches/patches.suse/ext4-update-state-fc_regions_size-after-successful-m.patch | 51 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-use-ext4_ext_remove_space-for-fast-commit-repla.patch | 66 ++ ldiskfs/kernel_patches/patches/patches.suse/ext4-use-ext4_fc_tl_mem-in-fast-commit-replay-path.patch | 143 ++++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-verify-dir-block-before-splitting-it.patch | 93 ++++ ldiskfs/kernel_patches/patches/patches.suse/ext4-zero-i_disksize-when-initializing-the-bootloade.patch | 65 ++ ldiskfs/kernel_patches/patches/patches.suse/fs-ext4-initialize-fsdata-in-pagecache_write.patch | 38 + ldiskfs/kernel_patches/patches/sles15sp4/ext4-data-in-dirent.patch | 135 ++--- ldiskfs/kernel_patches/patches/sles15sp4/ext4-pdirop.patch | 109 ++-- ldiskfs/kernel_patches/series/ldiskfs-5.14.21-sles15sp4.series | 127 +++++ 132 files changed, 11492 insertions(+), 160 deletions(-) create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-BUG_ON-in-ext4_bread-when-write-quota-data.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-check-for-block-being-out-of-directory-size.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-add-new-helper-interface-ext4_try_to_trim_range.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-add-reserved-GDT-blocks-check.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-cycles-in-directory-h-tree.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-trim-error-on-fs-with-small-groups.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-check-if-directory-block-is-within-i_size.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-destroy-ext4_fc_dentry_cachep-kmemcache-on-modu.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-use-the-orphan-list-when-migrating-an-ino.patch create mode 
100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fast-commit-may-miss-tracking-unwritten-range-d.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-filter-out-EXT4_FC_REPLAY-from-on-disk-superblo.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-a-possible-ABBA-deadlock-due-to-busy-PA.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-an-use-after-free-issue-about-data-journal-.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-ext4_mb_use_inode_pa.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-__es_tree_search.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-ext4_writepages.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_fc_record_modified_i.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_restore_inline_data.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-ext4_fc_stats-trace-point.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fallocate-to-use-file_modified-to-update-pe.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fast-commit-may-miss-tracking-range-for-FAL.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fs-corruption-when-tring-to-remove-a-non-em.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-incorrect-type-issue-during-replay_del_rang.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-null-ptr-deref-in-__ext4_journal_ensure_cre.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-overhead-calculation-to-account-for-the-res.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-race-condition-between-ext4_write-and-ext4_.patch create mode 
100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-super-block-checksum-incorrect-after-mount.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-symlink-file-size-not-match-to-file-content.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_rename_dir_prepare.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_search_dir.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-warning-in-ext4_handle_inode_extension.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-force-overhead-calculation-if-the-s_overhead_cl.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-initialize-err_blk-before-calling-__ext4_get_in.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-limit-length-to-bitmap_maxbytes-blocksize-in-pu.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-ext4_append-always-allocates-new-bloc.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-quota-gets-properly-shutdown-on-error.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-to-reset-inode-lockdep-class-when-quo.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-make-variable-count-signed.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-mark-group-as-trimmed-only-if-it-was-fully-scan.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-modify-the-logic-of-ext4_mb_new_blocks_simple.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-prevent-used-blocks-from-being-allocated-during.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-recover-csum-seed-of-tmp_inode-after-migrating-.patch create mode 100644 
ldiskfs/kernel_patches/patches/patches.suse/ext4-reject-the-commit-option-on-ext2-filesystems.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-remove-EA-inode-entry-from-mbcache-on-inode-evi.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-unindent-codeblock-in-ext4_xattr_block_set.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-use-ext4_ext_remove_space-for-fast-commit-repla.patch create mode 100644 ldiskfs/kernel_patches/patches/patches.suse/ext4-verify-dir-block-before-splitting-it.patch --- a/ldiskfs/kernel_patches/patches/base/ext4-delayed-iput.patch +++ b/ldiskfs/kernel_patches/patches/base/ext4-delayed-iput.patch @@ -1,8 +1,13 @@ -Index: b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/ext4.h -=================================================================== ---- b2_15_linux-4.18.0-425.3.1.el8.orig/fs/ext4/ext4.h -+++ b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/ext4.h -@@ -1534,8 +1534,11 @@ struct ext4_sb_info { +--- + fs/ext4/ext4.h | 7 +++++-- + fs/ext4/page-io.c | 2 +- + fs/ext4/super.c | 15 ++++++++------- + fs/ext4/xattr.c | 39 +++++++++++++++++++++++++++++++++++++-- + 4 files changed, 51 insertions(+), 12 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -1628,8 +1628,11 @@ struct ext4_sb_info { struct flex_groups * __rcu *s_flex_groups; ext4_group_t s_flex_groups_allocated; @@ -16,11 +21,9 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs /* timer for periodic error stats printing */ struct timer_list s_err_report; -Index: b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/page-io.c -=================================================================== ---- b2_15_linux-4.18.0-425.3.1.el8.orig/fs/ext4/page-io.c -+++ b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/page-io.c -@@ -207,7 +207,7 @@ static void ext4_add_complete_io(ext4_io +--- a/fs/ext4/page-io.c ++++ b/fs/ext4/page-io.c +@@ -230,7 +230,7 @@ static void ext4_add_complete_io(ext4_io WARN_ON(!(io_end->flag & EXT4_IO_END_UNWRITTEN)); 
WARN_ON(!io_end->handle && sbi->s_journal); spin_lock_irqsave(&ei->i_completed_io_lock, flags); @@ -29,23 +32,22 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs if (list_empty(&ei->i_rsv_conversion_list)) queue_work(wq, &ei->i_rsv_conversion_work); list_add_tail(&io_end->list, &ei->i_rsv_conversion_list); -Index: b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/super.c -=================================================================== ---- b2_15_linux-4.18.0-425.3.1.el8.orig/fs/ext4/super.c -+++ b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/super.c -@@ -1022,9 +1022,10 @@ static void ext4_put_super(struct super_ +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -1171,10 +1171,11 @@ static void ext4_put_super(struct super_ int i, err; ext4_unregister_li_request(sb); + flush_workqueue(sbi->s_misc_wq); ext4_quota_off_umount(sb); + flush_work(&sbi->s_error_work); - destroy_workqueue(sbi->rsv_conversion_wq); + destroy_workqueue(sbi->s_misc_wq); /* * Unregister sysfs before destroying jbd2 journal. -@@ -4588,9 +4589,9 @@ no_journal: +@@ -5001,9 +5002,9 @@ no_journal: * The maximum number of concurrent works can be high and * concurrency isn't really necessary. Limit it to 1. 
*/ @@ -58,7 +60,7 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs printk(KERN_ERR "EXT4-fs: failed to create workqueue\n"); ret = -ENOMEM; goto failed_mount4; -@@ -4785,8 +4786,8 @@ failed_mount4a: +@@ -5236,8 +5237,8 @@ failed_mount4a: sb->s_root = NULL; failed_mount4: ext4_msg(sb, KERN_ERR, "mount failed"); @@ -69,7 +71,7 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs failed_mount_wq: ext4_xattr_destroy_cache(sbi->s_ea_inode_cache); sbi->s_ea_inode_cache = NULL; -@@ -5285,7 +5286,7 @@ static int ext4_sync_fs(struct super_blo +@@ -5802,7 +5803,7 @@ static int ext4_sync_fs(struct super_blo return 0; trace_ext4_sync_fs(sb, wait); @@ -78,11 +80,9 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs /* * Writeback quota in non-journalled quota case - journalled quota has * no dirty dquots -Index: b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/xattr.c -=================================================================== ---- b2_15_linux-4.18.0-425.3.1.el8.orig/fs/ext4/xattr.c -+++ b2_15_linux-4.18.0-425.3.1.el8/fs/ext4/xattr.c -@@ -1579,6 +1579,36 @@ static int ext4_xattr_inode_lookup_creat +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1588,6 +1588,36 @@ static int ext4_xattr_inode_lookup_creat return 0; } @@ -119,7 +119,7 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs /* * Reserve min(block_size/8, 1024) bytes for xattr entries/names if ea_inode * feature is enabled. -@@ -1596,6 +1626,7 @@ static int ext4_xattr_set_entry(struct e +@@ -1605,6 +1635,7 @@ static int ext4_xattr_set_entry(struct e int in_inode = i->in_inode; struct inode *old_ea_inode = NULL; struct inode *new_ea_inode = NULL; @@ -127,7 +127,7 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs size_t old_size, new_size; int ret; -@@ -1672,7 +1703,11 @@ static int ext4_xattr_set_entry(struct e +@@ -1681,7 +1712,11 @@ static int ext4_xattr_set_entry(struct e * Finish that work before doing any modifications to the xattr data. 
*/ if (!s->not_found && here->e_value_inum) { @@ -140,7 +140,7 @@ Index: b2_15_linux-4.18.0-425.3.1.el8/fs le32_to_cpu(here->e_value_inum), le32_to_cpu(here->e_hash), &old_ea_inode); -@@ -1825,7 +1860,7 @@ update_hash: +@@ -1834,7 +1869,7 @@ update_hash: ret = 0; out: --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-BUG_ON-in-ext4_bread-when-write-quota-data.patch @@ -0,0 +1,108 @@ +From 380a0091cab482489e9b19e07f2a166ad2b76d5c Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 23 Dec 2021 09:55:06 +0800 +Subject: [PATCH] ext4: Fix BUG_ON in ext4_bread when write quota data +Git-commit: 380a0091cab482489e9b19e07f2a166ad2b76d5c +Patch-mainline: v5.17-rc1 +References: bsc#1197755 + +We got issue as follows when run syzkaller: +[ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user +[ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only +[ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0' +[ 167.983601] ------------[ cut here ]------------ +[ 167.984245] kernel BUG at fs/ext4/inode.c:847! 
+[ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI +[ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 +[ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 +[ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 +[ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 +[ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 +[ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 +[ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 +[ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 +[ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 +[ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 +[ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 +[ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +[ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 +[ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +[ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +[ 168.001899] Call Trace: +[ 168.002235] <TASK> +[ 168.007167] ext4_bread+0xd/0x53 +[ 168.007612] ext4_quota_write+0x20c/0x5c0 +[ 168.010457] write_blk+0x100/0x220 +[ 168.010944] remove_free_dqentry+0x1c6/0x440 +[ 168.011525] free_dqentry.isra.0+0x565/0x830 +[ 168.012133] remove_tree+0x318/0x6d0 +[ 168.014744] remove_tree+0x1eb/0x6d0 +[ 168.017346] remove_tree+0x1eb/0x6d0 +[ 168.019969] remove_tree+0x1eb/0x6d0 +[ 168.022128] qtree_release_dquot+0x291/0x340 +[ 168.023297] v2_release_dquot+0xce/0x120 +[ 168.023847] dquot_release+0x197/0x3e0 +[ 168.024358] ext4_release_dquot+0x22a/0x2d0 +[ 168.024932] 
dqput.part.0+0x1c9/0x900 +[ 168.025430] __dquot_drop+0x120/0x190 +[ 168.025942] ext4_clear_inode+0x86/0x220 +[ 168.026472] ext4_evict_inode+0x9e8/0xa22 +[ 168.028200] evict+0x29e/0x4f0 +[ 168.028625] dispose_list+0x102/0x1f0 +[ 168.029148] evict_inodes+0x2c1/0x3e0 +[ 168.030188] generic_shutdown_super+0xa4/0x3b0 +[ 168.030817] kill_block_super+0x95/0xd0 +[ 168.031360] deactivate_locked_super+0x85/0xd0 +[ 168.031977] cleanup_mnt+0x2bc/0x480 +[ 168.033062] task_work_run+0xd1/0x170 +[ 168.033565] do_exit+0xa4f/0x2b50 +[ 168.037155] do_group_exit+0xef/0x2d0 +[ 168.037666] __x64_sys_exit_group+0x3a/0x50 +[ 168.038237] do_syscall_64+0x3b/0x90 +[ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae + +In order to reproduce this problem, the following conditions need to be met: +1. Ext4 filesystem with no journal; +2. Filesystem image with incorrect quota data; +3. Abort filesystem forced by user; +4. umount filesystem; + +As in ext4_quota_write: +... + if (EXT4_SB(sb)->s_journal && !handle) { + ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" + " cancelled because transaction is not started", + (unsigned long long)off, (unsigned long long)len); + return -EIO; + } +... +We only check handle if NULL when filesystem has journal. There is need +check handle if NULL even when filesystem has no journal. 
+ +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 499d1734818d..b72f8f6084e4 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -6940,7 +6940,7 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type, + struct buffer_head *bh; + handle_t *handle = journal_current_handle(); + +- if (EXT4_SB(sb)->s_journal && !handle) { ++ if (!handle) { + ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" + " cancelled because transaction is not started", + (unsigned long long)off, (unsigned long long)len); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-check-for-block-being-out-of-directory-size.patch @@ -0,0 +1,35 @@ +From 9fd3f1868e9e3bb249546edba45889c18d40bd6d Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Mon, 22 Aug 2022 13:20:59 +0200 +Subject: [PATCH] ext4: Fix check for block being out of directory size +References: bsc#1198577 CVE-2022-1184 +Patch-mainline: Submitted, Aug 22 2022 + +The check in __ext4_read_dirblock() for block being outside of directory +size was wrong because it compared block number against directory size +in bytes. Fix it. 
+ +Fixes: 65f8ea4cd57d ("ext4: check if directory block is within i_size") +CVE: CVE-2022-1184 +CC: stable@vger.kernel.org +Signed-off-by: Jan Kara <jack@suse.cz> +--- + fs/ext4/namei.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 3a31b662f661..bc2e0612ec32 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -126,7 +126,7 @@ static struct buffer_head *__ext4_read_dirblock(struct inode *inode, + struct ext4_dir_entry *dirent; + int is_dx_block = 0; + +- if (block >= inode->i_size) { ++ if (block >= inode->i_size >> inode->i_blkbits) { + ext4_error_inode(inode, func, line, block, + "Attempting to read directory block (%u) that is past i_size (%llu)", + block, inode->i_size); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-deadlock-during-directory-rename.patch @@ -0,0 +1,86 @@ +From 3c92792da8506a295afb6d032b4476e46f979725 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 1 Mar 2023 15:10:04 +0100 +Subject: [PATCH] ext4: Fix deadlock during directory rename +Git-commit: 3c92792da8506a295afb6d032b4476e46f979725 +Patch-mainline: v6.3-rc2 +References: bsc#1210763 + +As lockdep properly warns, we should not be locking i_rwsem while having +transactions started as the proper lock ordering used by all directory +handling operations is i_rwsem -> transaction start. Fix the lock +ordering by moving the locking of the directory earlier in +ext4_rename(). 
+ +Reported-by: syzbot+9d16c39efb5fade84574@syzkaller.appspotmail.com +Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") +Link: https://syzkaller.appspot.com/bug?extid=9d16c39efb5fade84574 +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230301141004.15087-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 26 +++++++++++++++++--------- + 1 file changed, 17 insertions(+), 9 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index e8f429330f3c..7cc3918e2f18 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3813,10 +3813,20 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + return retval; + } + ++ /* ++ * We need to protect against old.inode directory getting converted ++ * from inline directory format into a normal one. ++ */ ++ if (S_ISDIR(old.inode->i_mode)) ++ inode_lock_nested(old.inode, I_MUTEX_NONDIR2); ++ + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, + &old.inlined); +- if (IS_ERR(old.bh)) +- return PTR_ERR(old.bh); ++ if (IS_ERR(old.bh)) { ++ retval = PTR_ERR(old.bh); ++ goto unlock_moved_dir; ++ } ++ + /* + * Check for inode number is _not_ due to possible IO errors. + * We might rmdir the source, keep it as pwd of some process +@@ -3873,11 +3883,6 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + if (new.dir != old.dir && EXT4_DIR_LINK_MAX(new.dir)) + goto end_rename; + } +- /* +- * We need to protect against old.inode directory getting +- * converted from inline directory format into a normal one. 
+- */ +- inode_lock_nested(old.inode, I_MUTEX_NONDIR2); + retval = ext4_rename_dir_prepare(handle, &old); + if (retval) { + inode_unlock(old.inode); +@@ -4014,12 +4019,15 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + } else { + ext4_journal_stop(handle); + } +- if (old.dir_bh) +- inode_unlock(old.inode); + release_bh: + brelse(old.dir_bh); + brelse(old.bh); + brelse(new.bh); ++ ++unlock_moved_dir: ++ if (S_ISDIR(old.inode->i_mode)) ++ inode_unlock(old.inode); ++ + return retval; + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-possible-corruption-when-moving-a-directory.patch @@ -0,0 +1,59 @@ +From 0813299c586b175d7edb25f56412c54b812d0379 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Thu, 26 Jan 2023 12:22:21 +0100 +Subject: [PATCH] ext4: Fix possible corruption when moving a directory +Git-commit: 0813299c586b175d7edb25f56412c54b812d0379 +Patch-mainline: v6.3-rc1 +References: bsc#1210763 + +When we are renaming a directory to a different directory, we need to +update '..' entry in the moved directory. However nothing prevents moved +directory from being modified and even converted from the inline format +to the normal format. When such race happens the rename code gets +confused and we crash. Fix the problem by locking the moved directory. 
+ +Cc: stable@vger.kernel.org +Fixes: 32f7f22c0b52 ("ext4: let ext4_rename handle inline dir") +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230126112221.11866-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 11 ++++++++++- + 1 file changed, 10 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index dd28453d6ea3..270fbcba75b6 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3872,9 +3872,16 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + if (new.dir != old.dir && EXT4_DIR_LINK_MAX(new.dir)) + goto end_rename; + } ++ /* ++ * We need to protect against old.inode directory getting ++ * converted from inline directory format into a normal one. ++ */ ++ inode_lock_nested(old.inode, I_MUTEX_NONDIR2); + retval = ext4_rename_dir_prepare(handle, &old); +- if (retval) ++ if (retval) { ++ inode_unlock(old.inode); + goto end_rename; ++ } + } + /* + * If we're renaming a file within an inline_data dir and adding or +@@ -4006,6 +4013,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + } else { + ext4_journal_stop(handle); + } ++ if (old.dir_bh) ++ inode_unlock(old.inode); + release_bh: + brelse(old.dir_bh); + brelse(old.bh); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-Fix-reusing-stale-buffer-heads-from-last-failed.patch @@ -0,0 +1,122 @@ +From 26fb5290240dc31cae99b8b4dd2af7f46dfcba6b Mon Sep 17 00:00:00 2001 +From: Zhihao Cheng <chengzhihao1@huawei.com> +Date: Wed, 15 Mar 2023 09:31:23 +0800 +Subject: [PATCH] ext4: Fix reusing stale buffer heads from last failed + mounting +Git-commit: 26fb5290240dc31cae99b8b4dd2af7f46dfcba6b +Patch-mainline: v6.5-rc1 +References: bsc#1213020 + +Following process makes ext4 load stale buffer heads from last failed +mounting in a new mounting operation: +mount_bdev + ext4_fill_super + | 
ext4_load_and_init_journal + | ext4_load_journal + | jbd2_journal_load + | load_superblock + | journal_get_superblock + | set_buffer_verified(bh) // buffer head is verified + | jbd2_journal_recover // failed caused by EIO + | goto failed_mount3a // skip 'sb->s_root' initialization + deactivate_locked_super + kill_block_super + generic_shutdown_super + if (sb->s_root) + // false, skip ext4_put_super->invalidate_bdev-> + // invalidate_mapping_pages->mapping_evict_folio-> + // filemap_release_folio->try_to_free_buffers, which + // cannot drop buffer head. + blkdev_put + blkdev_put_whole + if (atomic_dec_and_test(&bdev->bd_openers)) + // false, systemd-udev happens to open the device. Then + // blkdev_flush_mapping->kill_bdev->truncate_inode_pages-> + // truncate_inode_folio->truncate_cleanup_folio-> + // folio_invalidate->block_invalidate_folio-> + // filemap_release_folio->try_to_free_buffers will be skipped, + // dropping buffer head is missed again. + +Second mount: +ext4_fill_super + ext4_load_and_init_journal + ext4_load_journal + ext4_get_journal + jbd2_journal_init_inode + journal_init_common + bh = getblk_unmovable + bh = __find_get_block // Found stale bh in last failed mounting + journal->j_sb_buffer = bh + jbd2_journal_load + load_superblock + journal_get_superblock + if (buffer_verified(bh)) + // true, skip journal->j_format_version = 2, value is 0 + jbd2_journal_recover + do_one_pass + next_log_block += count_tags(journal, bh) + // According to journal_tag_bytes(), 'tag_bytes' calculating is + // affected by jbd2_has_feature_csum3(), jbd2_has_feature_csum3() + // returns false because 'j->j_format_version >= 2' is not true, + // then we get wrong next_log_block. The do_one_pass may exit + // early whenoccuring non JBD2_MAGIC_NUMBER in 'next_log_block'. + +The filesystem is corrupted here, journal is partially replayed, and +new journal sequence number actually is already used by last mounting. 
+ +The invalidate_bdev() can drop all buffer heads even racing with bare +reading block device(eg. systemd-udev), so we can fix it by invalidating +bdev in error handling path in __ext4_fill_super(). + +Fetch a reproducer in [Link]. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=217171 +Fixes: 25ed6e8a54df ("jbd2: enable journal clients to enable v2 checksumming") +Cc: stable@vger.kernel.org # v3.5 +Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230315013128.3911115-2-chengzhihao1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -1096,6 +1096,12 @@ static void ext4_blkdev_remove(struct ex + struct block_device *bdev; + bdev = sbi->s_journal_bdev; + if (bdev) { ++ /* ++ * Invalidate the journal device's buffers. We don't want them ++ * floating about in memory - the physical journal device may ++ * hotswapped, and it breaks the `ro-after' testing code. ++ */ ++ invalidate_bdev(bdev); + ext4_blkdev_put(bdev); + sbi->s_journal_bdev = NULL; + } +@@ -1233,13 +1239,7 @@ static void ext4_put_super(struct super_ + sync_blockdev(sb->s_bdev); + invalidate_bdev(sb->s_bdev); + if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) { +- /* +- * Invalidate the journal device's buffers. We don't want them +- * floating about in memory - the physical journal device may +- * hotswapped, and it breaks the `ro-after' testing code. 
+- */ + sync_blockdev(sbi->s_journal_bdev); +- invalidate_bdev(sbi->s_journal_bdev); + ext4_blkdev_remove(sbi); + } + +@@ -5256,6 +5256,7 @@ failed_mount: + brelse(bh); + ext4_blkdev_remove(sbi); + out_fail: ++ invalidate_bdev(sb->s_bdev); + sb->s_fs_info = NULL; + kfree(sbi->s_blockgroup_lock); + out_free_base: --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-EA_INODE-checking-to-ext4_iget.patch @@ -0,0 +1,179 @@ +From b3e6bcb94590dea45396b9481e47b809b1be4afa Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Tue, 23 May 2023 23:49:48 -0400 +Subject: [PATCH] ext4: add EA_INODE checking to ext4_iget() +Git-commit: b3e6bcb94590dea45396b9481e47b809b1be4afa +Patch-mainline: v6.4-rc5 +References: bsc#1213106 + +Add a new flag, EXT4_IGET_EA_INODE which indicates whether the inode +is expected to have the EA_INODE flag or not. If the flag is not +set/clear as expected, then fail the iget() operation and mark the +file system as corrupted. + +This commit also makes the ext4_iget() always perform the +is_bad_inode() check even when the inode is already inode cache. This +allows us to remove the is_bad_inode() check from the callers of +ext4_iget() in the ea_inode code. 
+ +Reported-by: syzbot+cbb68193bdb95af4340a@syzkaller.appspotmail.com +Reported-by: syzbot+62120febbd1ee3c3c860@syzkaller.appspotmail.com +Reported-by: syzbot+edce54daffee36421b4c@syzkaller.appspotmail.com +Cc: stable@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Link: https://lore.kernel.org/r/20230524034951.779531-2-tytso@mit.edu +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 3 ++- + fs/ext4/inode.c | 31 ++++++++++++++++++++++++++----- + fs/ext4/xattr.c | 36 +++++++----------------------------- + 3 files changed, 35 insertions(+), 35 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2976,7 +2976,8 @@ typedef enum { + EXT4_IGET_NORMAL = 0, + EXT4_IGET_SPECIAL = 0x0001, /* OK to iget a system inode */ + EXT4_IGET_HANDLE = 0x0002, /* Inode # is from a handle */ +- EXT4_IGET_BAD = 0x0004 /* Allow to iget a bad inode */ ++ EXT4_IGET_BAD = 0x0004, /* Allow to iget a bad inode */ ++ EXT4_IGET_EA_INODE = 0x0008 /* Inode should contain an EA value */ + } ext4_iget_flags; + + extern struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4635,6 +4635,21 @@ static inline u64 ext4_inode_peek_iversi + return inode_peek_iversion(inode); + } + ++static const char *check_igot_inode(struct inode *inode, ext4_iget_flags flags) ++ ++{ ++ if (flags & EXT4_IGET_EA_INODE) { ++ if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) ++ return "missing EA_INODE flag"; ++ } else { ++ if ((EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) ++ return "unexpected EA_INODE flag"; ++ } ++ if (is_bad_inode(inode) && !(flags & EXT4_IGET_BAD)) ++ return "unexpected bad inode w/o EXT4_IGET_BAD"; ++ return NULL; ++} ++ + struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, + ext4_iget_flags flags, const char *function, + unsigned int line) +@@ -4643,6 +4658,7 @@ struct inode *__ext4_iget(struct super_b + struct ext4_inode *raw_inode; + struct ext4_inode_info *ei; + struct inode *inode; ++ 
const char *err_str; + journal_t *journal = EXT4_SB(sb)->s_journal; + long ret; + loff_t size; +@@ -4666,8 +4682,14 @@ struct inode *__ext4_iget(struct super_b + inode = iget_locked(sb, ino); + if (!inode) + return ERR_PTR(-ENOMEM); +- if (!(inode->i_state & I_NEW)) ++ if (!(inode->i_state & I_NEW)) { ++ if ((err_str = check_igot_inode(inode, flags)) != NULL) { ++ ext4_error_inode(inode, function, line, 0, err_str); ++ iput(inode); ++ return ERR_PTR(-EFSCORRUPTED); ++ } + return inode; ++ } + + ei = EXT4_I(inode); + iloc.bh = NULL; +@@ -4936,10 +4958,9 @@ struct inode *__ext4_iget(struct super_b + if (IS_CASEFOLDED(inode) && !ext4_has_feature_casefold(inode->i_sb)) + ext4_error_inode(inode, function, line, 0, + "casefold flag without casefold feature"); +- if (is_bad_inode(inode) && !(flags & EXT4_IGET_BAD)) { +- ext4_error_inode(inode, function, line, 0, +- "bad inode without EXT4_IGET_BAD flag"); +- ret = -EUCLEAN; ++ if ((err_str = check_igot_inode(inode, flags)) != NULL) { ++ ext4_error_inode(inode, function, line, 0, err_str); ++ ret = -EFSCORRUPTED; + goto bad_inode; + } + +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -397,7 +397,7 @@ static int ext4_xattr_inode_iget(struct + return -EFSCORRUPTED; + } + +- inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_NORMAL); ++ inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_EA_INODE); + if (IS_ERR(inode)) { + err = PTR_ERR(inode); + ext4_error(parent->i_sb, +@@ -405,23 +405,6 @@ static int ext4_xattr_inode_iget(struct + err); + return err; + } +- +- if (is_bad_inode(inode)) { +- ext4_error(parent->i_sb, +- "error while reading EA inode %lu is_bad_inode", +- ea_ino); +- err = -EIO; +- goto error; +- } +- +- if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) { +- ext4_error(parent->i_sb, +- "EA inode %lu does not have EXT4_EA_INODE_FL flag", +- ea_ino); +- err = -EINVAL; +- goto error; +- } +- + ext4_xattr_inode_set_class(inode); + + /* +@@ -442,9 +425,6 @@ static int ext4_xattr_inode_iget(struct + + *ea_inode = 
inode; + return 0; +-error: +- iput(inode); +- return err; + } + + /* Remove entry from mbcache when EA inode is getting evicted */ +@@ -1507,11 +1487,10 @@ ext4_xattr_inode_cache_find(struct inode + + while (ce) { + ea_inode = ext4_iget(inode->i_sb, ce->e_value, +- EXT4_IGET_NORMAL); +- if (!IS_ERR(ea_inode) && +- !is_bad_inode(ea_inode) && +- (EXT4_I(ea_inode)->i_flags & EXT4_EA_INODE_FL) && +- i_size_read(ea_inode) == value_len && ++ EXT4_IGET_EA_INODE); ++ if (IS_ERR(ea_inode)) ++ goto next_entry; ++ if (i_size_read(ea_inode) == value_len && + !ext4_xattr_inode_read(ea_inode, ea_data, value_len) && + !ext4_xattr_inode_verify_hashes(ea_inode, NULL, ea_data, + value_len) && +@@ -1521,9 +1500,8 @@ ext4_xattr_inode_cache_find(struct inode + kvfree(ea_data); + return ea_inode; + } +- +- if (!IS_ERR(ea_inode)) +- iput(ea_inode); ++ iput(ea_inode); ++ next_entry: + ce = mb_cache_entry_find_next(ea_inode_cache, ce); + } + kvfree(ea_data); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-EXT4_IGET_BAD-flag-to-prevent-unexpected-ba.patch @@ -0,0 +1,81 @@ +From 63b1e9bccb71fe7d7e3ddc9877dbdc85e5d2d023 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 26 Oct 2022 12:23:09 +0800 +Subject: [PATCH] ext4: add EXT4_IGET_BAD flag to prevent unexpected bad inode +Git-commit: 63b1e9bccb71fe7d7e3ddc9877dbdc85e5d2d023 +Patch-mainline: v6.2-rc1 +References: bsc#1207619 + +There are many places that will get unhappy (and crash) when ext4_iget() +returns a bad inode. However, if iget the boot loader inode, allows a bad +inode to be returned, because the inode may not be initialized. This +mechanism can be used to bypass some checks and cause panic. To solve this +problem, we add a special iget flag EXT4_IGET_BAD. 
Only with this flag +we'd be returning bad inode from ext4_iget(), otherwise we always return +the error code if the inode is bad inode.(suggested by Jan Kara) + +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jason Yan <yanaijie@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221026042310.3839669-4-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 3 ++- + fs/ext4/inode.c | 8 +++++++- + fs/ext4/ioctl.c | 3 ++- + 3 files changed, 11 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 8d5453852f98..2b574b143bde 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2964,7 +2964,8 @@ int do_journal_get_write_access(handle_t *handle, struct inode *inode, + typedef enum { + EXT4_IGET_NORMAL = 0, + EXT4_IGET_SPECIAL = 0x0001, /* OK to iget a system inode */ +- EXT4_IGET_HANDLE = 0x0002 /* Inode # is from a handle */ ++ EXT4_IGET_HANDLE = 0x0002, /* Inode # is from a handle */ ++ EXT4_IGET_BAD = 0x0004 /* Allow to iget a bad inode */ + } ext4_iget_flags; + + extern struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 6d16241666e6..a4bf643aa08b 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5058,8 +5058,14 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, + if (IS_CASEFOLDED(inode) && !ext4_has_feature_casefold(inode->i_sb)) + ext4_error_inode(inode, function, line, 0, + "casefold flag without casefold feature"); +- brelse(iloc.bh); ++ if (is_bad_inode(inode) && !(flags & EXT4_IGET_BAD)) { ++ ext4_error_inode(inode, function, line, 0, ++ "bad inode without EXT4_IGET_BAD flag"); ++ ret = -EUCLEAN; ++ goto bad_inode; ++ } + ++ brelse(iloc.bh); + unlock_new_inode(inode); + return inode; + +diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c +index 95dfea28bf4e..9ed7b9fe2132 100644 +--- 
a/fs/ext4/ioctl.c ++++ b/fs/ext4/ioctl.c +@@ -374,7 +374,8 @@ static long swap_inode_boot_loader(struct super_block *sb, + blkcnt_t blocks; + unsigned short bytes; + +- inode_bl = ext4_iget(sb, EXT4_BOOT_LOADER_INO, EXT4_IGET_SPECIAL); ++ inode_bl = ext4_iget(sb, EXT4_BOOT_LOADER_INO, ++ EXT4_IGET_SPECIAL | EXT4_IGET_BAD); + if (IS_ERR(inode_bl)) + return PTR_ERR(inode_bl); + ei_bl = EXT4_I(inode_bl); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-helper-to-check-quota-inums.patch @@ -0,0 +1,83 @@ +From 07342ec259df2a35d6a34aebce010567a80a0e15 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 26 Oct 2022 12:23:08 +0800 +Subject: [PATCH] ext4: add helper to check quota inums +Git-commit: 07342ec259df2a35d6a34aebce010567a80a0e15 +Patch-mainline: v6.2-rc1 +References: bsc#1207618 + +Before quota is enabled, a check on the preset quota inums in +ext4_super_block is added to prevent wrong quota inodes from being loaded. +In addition, when the quota fails to be enabled, the quota type and quota +inum are printed to facilitate fault locating. 
+ +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jason Yan <yanaijie@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221026042310.3839669-3-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 28 +++++++++++++++++++++++++--- + 1 file changed, 25 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 0f71542cf453..6f944f5e4b3f 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -6886,6 +6886,20 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, + return err; + } + ++static inline bool ext4_check_quota_inum(int type, unsigned long qf_inum) ++{ ++ switch (type) { ++ case USRQUOTA: ++ return qf_inum == EXT4_USR_QUOTA_INO; ++ case GRPQUOTA: ++ return qf_inum == EXT4_GRP_QUOTA_INO; ++ case PRJQUOTA: ++ return qf_inum >= EXT4_GOOD_OLD_FIRST_INO; ++ default: ++ BUG(); ++ } ++} ++ + static int ext4_quota_enable(struct super_block *sb, int type, int format_id, + unsigned int flags) + { +@@ -6902,9 +6916,16 @@ static int ext4_quota_enable(struct super_block *sb, int type, int format_id, + if (!qf_inums[type]) + return -EPERM; + ++ if (!ext4_check_quota_inum(type, qf_inums[type])) { ++ ext4_error(sb, "Bad quota inum: %lu, type: %d", ++ qf_inums[type], type); ++ return -EUCLEAN; ++ } ++ + qf_inode = ext4_iget(sb, qf_inums[type], EXT4_IGET_SPECIAL); + if (IS_ERR(qf_inode)) { +- ext4_error(sb, "Bad quota inode # %lu", qf_inums[type]); ++ ext4_error(sb, "Bad quota inode: %lu, type: %d", ++ qf_inums[type], type); + return PTR_ERR(qf_inode); + } + +@@ -6943,8 +6964,9 @@ int ext4_enable_quotas(struct super_block *sb) + if (err) { + ext4_warning(sb, + "Failed to enable quota tracking " +- "(type=%d, err=%d). Please run " +- "e2fsck to fix.", type, err); ++ "(type=%d, err=%d, ino=%lu). 
" ++ "Please run e2fsck to fix.", type, ++ err, qf_inums[type]); + for (type--; type >= 0; type--) { + struct inode *inode; + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-inode-table-check-in-__ext4_get_inode_loc-t.patch @@ -0,0 +1,89 @@ +From eee22187b53611e173161e38f61de1c7ecbeb876 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 17 Aug 2022 21:27:01 +0800 +Subject: [PATCH] ext4: add inode table check in __ext4_get_inode_loc to aovid + possible infinite loop +Git-commit: eee22187b53611e173161e38f61de1c7ecbeb876 +Patch-mainline: v6.2-rc1 +References: bsc#1207617 + +In do_writepages, if the value returned by ext4_writepages is "-ENOMEM" +and "wbc->sync_mode == WB_SYNC_ALL", retry until the condition is not met. + +In __ext4_get_inode_loc, if the bh returned by sb_getblk is NULL, +the function returns -ENOMEM. + +In __getblk_slow, if the return value of grow_buffers is less than 0, +the function returns NULL. + +When the three processes are connected in series like the following stack, +an infinite loop may occur: + +do_writepages <--- keep retrying + ext4_writepages + mpage_map_and_submit_extent + mpage_map_one_extent + ext4_map_blocks + ext4_ext_map_blocks + ext4_ext_handle_unwritten_extents + ext4_ext_convert_to_initialized + ext4_split_extent + ext4_split_extent_at + __ext4_ext_dirty + __ext4_mark_inode_dirty + ext4_reserve_inode_write + ext4_get_inode_loc + __ext4_get_inode_loc <--- return -ENOMEM + sb_getblk + __getblk_gfp + __getblk_slow <--- return NULL + grow_buffers + grow_dev_page <--- return -ENXIO + ret = (block < end_block) ? 1 : -ENXIO; + +In this issue, bg_inode_table_hi is overwritten as an incorrect value. +As a result, `block < end_block` cannot be met in grow_dev_page. +Therefore, __ext4_get_inode_loc always returns '-ENOMEM' and do_writepages +keeps retrying. As a result, the writeback process is in the D state due +to an infinite loop. 
+ +Add a check on inode table block in the __ext4_get_inode_loc function by +referring to ext4_read_inode_bitmap to avoid this infinite loop. + +Cc: stable@kernel.org +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> +Link: https://lore.kernel.org/r/20220817132701.3015912-3-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 10 +++++++++- + 1 file changed, 9 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 121ce2fccfab..6d16241666e6 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4479,9 +4479,17 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, + inodes_per_block = EXT4_SB(sb)->s_inodes_per_block; + inode_offset = ((ino - 1) % + EXT4_INODES_PER_GROUP(sb)); +- block = ext4_inode_table(sb, gdp) + (inode_offset / inodes_per_block); + iloc->offset = (inode_offset % inodes_per_block) * EXT4_INODE_SIZE(sb); + ++ block = ext4_inode_table(sb, gdp); ++ if ((block <= le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block)) || ++ (block >= ext4_blocks_count(EXT4_SB(sb)->s_es))) { ++ ext4_error(sb, "Invalid inode table block %llu in " ++ "block_group %u", block, iloc->block_group); ++ return -EFSCORRUPTED; ++ } ++ block += (inode_offset / inodes_per_block); ++ + bh = sb_getblk(sb, block); + if (unlikely(!bh)) + return -ENOMEM; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-lockdep-annotations-for-i_data_sem-for-ea_i.patch @@ -0,0 +1,63 @@ +From aff3bea95388299eec63440389b4545c8041b357 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Tue, 23 May 2023 23:49:51 -0400 +Subject: [PATCH] ext4: add lockdep annotations for i_data_sem for ea_inode's +Git-commit: aff3bea95388299eec63440389b4545c8041b357 +Patch-mainline: v6.4-rc5 +References: bsc#1213109 + +Treat i_data_sem for ea_inodes as being in their own lockdep class to 
+avoid lockdep complaints about ext4_setattr's use of inode_lock() on +normal inodes potentially causing lock ordering with i_data_sem on +ea_inodes in ext4_xattr_inode_write(). However, ea_inodes will be +operated on by ext4_setattr(), so this isn't a problem. + +Cc: stable@kernel.org +Link: https://syzkaller.appspot.com/bug?extid=298c5d8fb4a128bc27b0 +Reported-by: syzbot+298c5d8fb4a128bc27b0@syzkaller.appspotmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Link: https://lore.kernel.org/r/20230524034951.779531-5-tytso@mit.edu +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 2 ++ + fs/ext4/xattr.c | 4 ++++ + 2 files changed, 6 insertions(+) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 9525c52b78dc..8104a21b001a 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -918,11 +918,13 @@ do { \ + * where the second inode has larger inode number + * than the first + * I_DATA_SEM_QUOTA - Used for quota inodes only ++ * I_DATA_SEM_EA - Used for ea_inodes only + */ + enum { + I_DATA_SEM_NORMAL = 0, + I_DATA_SEM_OTHER, + I_DATA_SEM_QUOTA, ++ I_DATA_SEM_EA + }; + + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index ff7ab63c5b4f..13d7f17a9c8c 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -121,7 +121,11 @@ ext4_expand_inode_array(struct ext4_xattr_inode_array **ea_inode_array, + #ifdef CONFIG_LOCKDEP + void ext4_xattr_inode_set_class(struct inode *ea_inode) + { ++ struct ext4_inode_info *ei = EXT4_I(ea_inode); ++ + lockdep_set_subclass(&ea_inode->i_rwsem, 1); ++ (void) ei; /* shut up clang warning if !CONFIG_LOCKDEP */ ++ lockdep_set_subclass(&ei->i_data_sem, I_DATA_SEM_EA); + } + #endif + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-missing-validation-of-fast-commit-record-le.patch @@ -0,0 +1,97 @@ +From 64b4a25c3de81a69724e888ec2db3533b43816e2 Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:38 -0800 +Subject: [PATCH] ext4: add missing 
validation of fast-commit record lengths +Git-commit: 64b4a25c3de81a69724e888ec2db3533b43816e2 +Patch-mainline: v6.2-rc1 +References: bsc#1207626 + +Validate the inode and filename lengths in fast-commit journal records +so that a malicious fast-commit journal cannot cause a crash by having +invalid values for these. Also validate EXT4_FC_TAG_DEL_RANGE. + +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-5-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 38 +++++++++++++++++++------------------- + fs/ext4/fast_commit.h | 2 +- + 2 files changed, 20 insertions(+), 20 deletions(-) + +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1925,32 +1925,31 @@ void ext4_fc_replay_cleanup(struct super + kfree(sbi->s_fc_replay_state.fc_modified_inodes); + } + +-static inline bool ext4_fc_tag_len_isvalid(struct ext4_fc_tl *tl, +- u8 *val, u8 *end) ++static bool ext4_fc_value_len_isvalid(struct ext4_sb_info *sbi, ++ int tag, int len) + { +- if (val + tl->fc_len > end) +- return false; +- +- /* Here only check ADD_RANGE/TAIL/HEAD which will read data when do +- * journal rescan before do CRC check. Other tags length check will +- * rely on CRC check. 
+- */ +- switch (tl->fc_tag) { ++ switch (tag) { + case EXT4_FC_TAG_ADD_RANGE: +- return (sizeof(struct ext4_fc_add_range) == tl->fc_len); +- case EXT4_FC_TAG_TAIL: +- return (sizeof(struct ext4_fc_tail) <= tl->fc_len); +- case EXT4_FC_TAG_HEAD: +- return (sizeof(struct ext4_fc_head) == tl->fc_len); ++ return len == sizeof(struct ext4_fc_add_range); + case EXT4_FC_TAG_DEL_RANGE: ++ return len == sizeof(struct ext4_fc_del_range); ++ case EXT4_FC_TAG_CREAT: + case EXT4_FC_TAG_LINK: + case EXT4_FC_TAG_UNLINK: +- case EXT4_FC_TAG_CREAT: ++ len -= sizeof(struct ext4_fc_dentry_info); ++ return len >= 1 && len <= EXT4_NAME_LEN; + case EXT4_FC_TAG_INODE: ++ len -= sizeof(struct ext4_fc_inode); ++ return len >= EXT4_GOOD_OLD_INODE_SIZE && ++ len <= sbi->s_inode_size; + case EXT4_FC_TAG_PAD: +- default: +- return true; ++ return true; /* padding can have any length */ ++ case EXT4_FC_TAG_TAIL: ++ return len >= sizeof(struct ext4_fc_tail); ++ case EXT4_FC_TAG_HEAD: ++ return len == sizeof(struct ext4_fc_head); + } ++ return false; + } + + /* +@@ -2013,7 +2012,8 @@ static int ext4_fc_replay_scan(journal_t + cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { + ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; +- if (!ext4_fc_tag_len_isvalid(&tl, val, end)) { ++ if (tl.fc_len > end - val || ++ !ext4_fc_value_len_isvalid(sbi, tl.fc_tag, tl.fc_len)) { + ret = state->fc_replay_num_tags ? + JBD2_FC_REPLAY_STOP : -ECANCELED; + goto out_err; +--- a/fs/ext4/fast_commit.h ++++ b/fs/ext4/fast_commit.h +@@ -58,7 +58,7 @@ struct ext4_fc_dentry_info { + __u8 fc_dname[0]; + }; + +-/* Value structure for EXT4_FC_TAG_INODE and EXT4_FC_TAG_INODE_PARTIAL. */ ++/* Value structure for EXT4_FC_TAG_INODE. 
*/ + struct ext4_fc_inode { + __le32 fc_ino; + __u8 fc_raw_inode[0]; --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-new-helper-interface-ext4_try_to_trim_range.patch @@ -0,0 +1,161 @@ +From 6920b3913235f517728bb69abe9b39047a987113 Mon Sep 17 00:00:00 2001 +From: Wang Jianchao <wangjianchao@kuaishou.com> +Date: Sat, 24 Jul 2021 15:41:21 +0800 +Subject: [PATCH] ext4: add new helper interface ext4_try_to_trim_range() +Git-commit: 6920b3913235f517728bb69abe9b39047a987113 +Patch-mainline: v5.15-rc1 +References: bsc#1202783 + +There is no functional change in this patch but just split the +codes, which serachs free block and does trim, into a new function +ext4_try_to_trim_range. This is preparing for the following async +backgroup discard. + +Reviewed-by: Andreas Dilger <adilger@dilger.ca> +Signed-off-by: Wang Jianchao <wangjianchao@kuaishou.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20210724074124.25731-3-jianchao.wan9@gmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/mballoc.c | 106 +++++++++++++++++++++++++++++------------------------- + 1 file changed, 58 insertions(+), 48 deletions(-) + +--- a/fs/ext4/mballoc.c ++++ b/fs/ext4/mballoc.c +@@ -6218,6 +6218,55 @@ __acquires(bitlock) + return ret; + } + ++static int ext4_try_to_trim_range(struct super_block *sb, ++ struct ext4_buddy *e4b, ext4_grpblk_t start, ++ ext4_grpblk_t max, ext4_grpblk_t minblocks) ++{ ++ ext4_grpblk_t next, count, free_count; ++ void *bitmap; ++ int ret = 0; ++ ++ bitmap = e4b->bd_bitmap; ++ start = (e4b->bd_info->bb_first_free > start) ? 
++ e4b->bd_info->bb_first_free : start; ++ count = 0; ++ free_count = 0; ++ ++ while (start <= max) { ++ start = mb_find_next_zero_bit(bitmap, max + 1, start); ++ if (start > max) ++ break; ++ next = mb_find_next_bit(bitmap, max + 1, start); ++ ++ if ((next - start) >= minblocks) { ++ ret = ext4_trim_extent(sb, start, next - start, ++ e4b->bd_group, e4b); ++ if (ret && ret != -EOPNOTSUPP) ++ break; ++ ret = 0; ++ count += next - start; ++ } ++ free_count += next - start; ++ start = next + 1; ++ ++ if (fatal_signal_pending(current)) { ++ count = -ERESTARTSYS; ++ break; ++ } ++ ++ if (need_resched()) { ++ ext4_unlock_group(sb, e4b->bd_group); ++ cond_resched(); ++ ext4_lock_group(sb, e4b->bd_group); ++ } ++ ++ if ((e4b->bd_info->bb_free - free_count) < minblocks) ++ break; ++ } ++ ++ return count; ++} ++ + /** + * ext4_trim_all_free -- function to trim all free space in alloc. group + * @sb: super block for file system +@@ -6241,10 +6290,8 @@ ext4_trim_all_free(struct super_block *s + ext4_grpblk_t start, ext4_grpblk_t max, + ext4_grpblk_t minblocks) + { +- void *bitmap; +- ext4_grpblk_t next, count = 0, free_count = 0; + struct ext4_buddy e4b; +- int ret = 0; ++ int ret; + + trace_ext4_trim_all_free(sb, group, start, max); + +@@ -6254,58 +6301,21 @@ ext4_trim_all_free(struct super_block *s + ret, group); + return ret; + } +- bitmap = e4b.bd_bitmap; +- + ext4_lock_group(sb, group); +- if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) && +- minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks)) +- goto out; +- +- start = (e4b.bd_info->bb_first_free > start) ? 
+- e4b.bd_info->bb_first_free : start; +- +- while (start <= max) { +- start = mb_find_next_zero_bit(bitmap, max + 1, start); +- if (start > max) +- break; +- next = mb_find_next_bit(bitmap, max + 1, start); +- +- if ((next - start) >= minblocks) { +- ret = ext4_trim_extent(sb, start, +- next - start, group, &e4b); +- if (ret && ret != -EOPNOTSUPP) +- break; +- ret = 0; +- count += next - start; +- } +- free_count += next - start; +- start = next + 1; +- +- if (fatal_signal_pending(current)) { +- count = -ERESTARTSYS; +- break; +- } +- +- if (need_resched()) { +- ext4_unlock_group(sb, group); +- cond_resched(); +- ext4_lock_group(sb, group); +- } +- +- if ((e4b.bd_info->bb_free - free_count) < minblocks) +- break; ++ if (!EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) || ++ minblocks < atomic_read(&EXT4_SB(sb)->s_last_trim_minblks)) { ++ ret = ext4_try_to_trim_range(sb, &e4b, start, max, minblocks); ++ if (ret >= 0) ++ EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info); ++ } else { ++ ret = 0; + } + +- if (!ret) { +- ret = count; +- EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info); +- } +-out: + ext4_unlock_group(sb, group); + ext4_mb_unload_buddy(&e4b); + + ext4_debug("trimmed %d blocks in the group %d\n", +- count, group); ++ ret, group); + + return ret; + } --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-add-reserved-GDT-blocks-check.patch @@ -0,0 +1,79 @@ +From b55c3cd102a6f48b90e61c44f7f3dda8c290c694 Mon Sep 17 00:00:00 2001 +From: Zhang Yi <yi.zhang@huawei.com> +Date: Wed, 1 Jun 2022 17:27:17 +0800 +Subject: [PATCH] ext4: add reserved GDT blocks check +Git-commit: b55c3cd102a6f48b90e61c44f7f3dda8c290c694 +Patch-mainline: v5.19-rc3 +References: bsc#1202712 + +We capture a NULL pointer issue when resizing a corrupt ext4 image which +is freshly clear resize_inode feature (not run e2fsck). It could be +simply reproduced by following steps. 
The problem is because of the +resize_inode feature was cleared, and it will convert the filesystem to +meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was +not reduced to zero, so could we mistakenly call reserve_backup_gdb() +and passing an uninitialized resize_inode to it when adding new group +descriptors. + + mkfs.ext4 /dev/sda 3G + tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck + mount /dev/sda /mnt + resize2fs /dev/sda 8G + + ======== + BUG: kernel NULL pointer dereference, address: 0000000000000028 + CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748 + ... + RIP: 0010:ext4_flex_group_add+0xe08/0x2570 + ... + Call Trace: + <TASK> + ext4_resize_fs+0xbec/0x1660 + __ext4_ioctl+0x1749/0x24e0 + ext4_ioctl+0x12/0x20 + __x64_sys_ioctl+0xa6/0x110 + do_syscall_64+0x3b/0x90 + entry_SYSCALL_64_after_hwframe+0x44/0xae + RIP: 0033:0x7f2dd739617b + ======== + +The fix is simple, add a check in ext4_resize_begin() to make sure that +the es->s_reserved_gdt_blocks is zero when the resize_inode feature is +disabled. + +Cc: stable@kernel.org +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220601092717.763694-1-yi.zhang@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/resize.c | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c +index 90a941d20dff..8b70a4701293 100644 +--- a/fs/ext4/resize.c ++++ b/fs/ext4/resize.c +@@ -53,6 +53,16 @@ int ext4_resize_begin(struct super_block *sb) + if (!capable(CAP_SYS_RESOURCE)) + return -EPERM; + ++ /* ++ * If the reserved GDT blocks is non-zero, the resize_inode feature ++ * should always be set. 
++ */ ++ if (EXT4_SB(sb)->s_es->s_reserved_gdt_blocks && ++ !ext4_has_feature_resize_inode(sb)) { ++ ext4_error(sb, "resize_inode disabled but reserved GDT blocks non-zero"); ++ return -EFSCORRUPTED; ++ } ++ + /* + * If we are not using the primary superblock/GDT copy don't resize, + * because the user tools have no way of handling this. Probably a +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-allocate-extended-attribute-value-in-vmalloc-ar.patch @@ -0,0 +1,50 @@ +From cc12a6f25e07ed05d5825a1664b67a970842b2ca Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 8 Dec 2022 10:32:31 +0800 +Subject: [PATCH] ext4: allocate extended attribute value in vmalloc area +Git-commit: cc12a6f25e07ed05d5825a1664b67a970842b2ca +Patch-mainline: v6.2-rc1 +References: bsc#1207635 + +Now, extended attribute value maximum length is 64K. The memory +requested here does not need continuous physical addresses, so it is +appropriate to use kvmalloc to request memory. At the same time, it +can also cope with the situation that the extended attribute will +become longer in the future. 
+ +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221208023233.1231330-3-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index 6bdd502527f8..b666d3bf8b38 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -2548,7 +2548,7 @@ static int ext4_xattr_move_to_block(handle_t *handle, struct inode *inode, + + is = kzalloc(sizeof(struct ext4_xattr_ibody_find), GFP_NOFS); + bs = kzalloc(sizeof(struct ext4_xattr_block_find), GFP_NOFS); +- buffer = kmalloc(value_size, GFP_NOFS); ++ buffer = kvmalloc(value_size, GFP_NOFS); + b_entry_name = kmalloc(entry->e_name_len + 1, GFP_NOFS); + if (!is || !bs || !buffer || !b_entry_name) { + error = -ENOMEM; +@@ -2600,7 +2600,7 @@ static int ext4_xattr_move_to_block(handle_t *handle, struct inode *inode, + error = 0; + out: + kfree(b_entry_name); +- kfree(buffer); ++ kvfree(buffer); + if (is) + brelse(is->iloc.bh); + if (bs) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-BUG_ON-when-creating-xattrs.patch @@ -0,0 +1,58 @@ +From b40ebaf63851b3a401b0dc9263843538f64f5ce6 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Mon, 21 Nov 2022 14:09:29 +0100 +Subject: [PATCH] ext4: avoid BUG_ON when creating xattrs +Git-commit: b40ebaf63851b3a401b0dc9263843538f64f5ce6 +Patch-mainline: v6.2-rc1 +References: bsc#1205496 + +Commit fb0a387dcdcd ("ext4: limit block allocations for indirect-block +files to < 2^32") added code to try to allocate xattr block with 32-bit +block number for indirect block based files on the grounds that these +files cannot use larger block numbers. It also added BUG_ON when +allocated block could not fit into 32 bits. 
This is however bogus +reasoning because xattr block is stored in inode->i_file_acl and +inode->i_file_acl_hi and as such even indirect block based files can +happily use full 48 bits for xattr block number. The proper handling +seems to be there basically since 64-bit block number support was added. +So remove the bogus limitation and BUG_ON. + +Cc: Eric Sandeen <sandeen@redhat.com> +Fixes: fb0a387dcdcd ("ext4: limit block allocations for indirect-block files to < 2^32") +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221121130929.32031-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 8 -------- + 1 file changed, 8 deletions(-) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index 718ef3987f94..4d1c701f0eec 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -2071,19 +2071,11 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode, + + goal = ext4_group_first_block_no(sb, + EXT4_I(inode)->i_block_group); +- +- /* non-extent files can't have physical blocks past 2^32 */ +- if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) +- goal = goal & EXT4_MAX_BLOCK_FILE_PHYS; +- + block = ext4_new_meta_blocks(handle, inode, goal, 0, + NULL, &error); + if (error) + goto cleanup; + +- if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) +- BUG_ON(block > EXT4_MAX_BLOCK_FILE_PHYS); +- + ea_idebug(inode, "creating block %llu", + (unsigned long long)block); + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-cycles-in-directory-h-tree.patch @@ -0,0 +1,86 @@ +From 3ba733f879c2a88910744647e41edeefbc0d92b2 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 18 May 2022 11:33:29 +0200 +Subject: [PATCH] ext4: avoid cycles in directory h-tree +Git-commit: 3ba733f879c2a88910744647e41edeefbc0d92b2 +Patch-mainline: v5.19-rc1 +References: bsc#1198577 CVE-2022-1184 + +A maliciously corrupted 
filesystem can contain cycles in the h-tree +stored inside a directory. That can easily lead to the kernel corrupting +tree nodes that were already verified under its hands while doing a node +split and consequently accessing unallocated memory. Fix the problem by +verifying traversed block numbers are unique. + +Cc: stable@vger.kernel.org +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220518093332.13986-2-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 22 +++++++++++++++++++--- + 1 file changed, 19 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 7286472e9558..47d0ca4c795b 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -777,12 +777,14 @@ static struct dx_frame * + dx_probe(struct ext4_filename *fname, struct inode *dir, + struct dx_hash_info *hinfo, struct dx_frame *frame_in) + { +- unsigned count, indirect; ++ unsigned count, indirect, level, i; + struct dx_entry *at, *entries, *p, *q, *m; + struct dx_root *root; + struct dx_frame *frame = frame_in; + struct dx_frame *ret_err = ERR_PTR(ERR_BAD_DX_DIR); + u32 hash; ++ ext4_lblk_t block; ++ ext4_lblk_t blocks[EXT4_HTREE_LEVEL]; + + memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0])); + frame->bh = ext4_read_dirblock(dir, 0, INDEX); +@@ -854,6 +856,8 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, + } + + dxtrace(printk("Look up %x", hash)); ++ level = 0; ++ blocks[0] = 0; + while (1) { + count = dx_get_count(entries); + if (!count || count > dx_get_limit(entries)) { +@@ -882,15 +886,27 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, + dx_get_block(at))); + frame->entries = entries; + frame->at = at; +- if (!indirect--) ++ ++ block = dx_get_block(at); ++ for (i = 0; i <= level; i++) { ++ if (blocks[i] == block) { ++ ext4_warning_inode(dir, ++ "dx entry: tree cycle block %u points back to block %u", ++ blocks[level], block); ++ goto fail; 
++ } ++ } ++ if (++level > indirect) + return frame; ++ blocks[level] = block; + frame++; +- frame->bh = ext4_read_dirblock(dir, dx_get_block(at), INDEX); ++ frame->bh = ext4_read_dirblock(dir, block, INDEX); + if (IS_ERR(frame->bh)) { + ret_err = (struct dx_frame *) frame->bh; + frame->bh = NULL; + goto fail; + } ++ + entries = ((struct dx_node *) frame->bh->b_data)->entries; + + if (dx_get_limit(entries) != dx_node_limit(dir)) { +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-deadlock-in-fs-reclaim-with-page-writebac.patch @@ -0,0 +1,232 @@ +From 00d873c17e29cc32d90ca852b82685f1673acaa5 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Thu, 4 May 2023 14:47:23 +0200 +Subject: [PATCH] ext4: avoid deadlock in fs reclaim with page writeback +Git-commit: 00d873c17e29cc32d90ca852b82685f1673acaa5 +Patch-mainline: v6.4-rc2 +References: bsc#1213016 + +Ext4 has a filesystem wide lock protecting ext4_writepages() calls to +avoid races with switching of journalled data flag or inode format. This +lock can however cause a deadlock like: + +CPU0 CPU1 + +ext4_writepages() + percpu_down_read(sbi->s_writepages_rwsem); + ext4_change_inode_journal_flag() + percpu_down_write(sbi->s_writepages_rwsem); + - blocks, all readers block from now on + ext4_do_writepages() + ext4_init_io_end() + kmem_cache_zalloc(io_end_cachep, GFP_KERNEL) + fs_reclaim frees dentry... + dentry_unlink_inode() + iput() - last ref => + iput_final() - inode dirty => + write_inode_now()... + ext4_writepages() tries to acquire sbi->s_writepages_rwsem + and blocks forever + +Make sure we cannot recurse into filesystem reclaim from writeback code +to avoid the deadlock. 
+ +Reported-by: syzbot+6898da502aef574c5f8a@syzkaller.appspotmail.com +Link: https://lore.kernel.org/all/0000000000004c66b405fa108e27@google.com +Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages") +Cc: stable@vger.kernel.org +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230504124723.20205-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 25 +++++++++++++++++++++++++ + fs/ext4/inode.c | 18 ++++++++++-------- + fs/ext4/migrate.c | 11 ++++++----- + 3 files changed, 41 insertions(+), 13 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -36,6 +36,7 @@ + #include <linux/falloc.h> + #include <linux/percpu-rwsem.h> + #include <linux/fiemap.h> ++#include <linux/sched/mm.h> + #ifdef __KERNEL__ + #include <linux/compat.h> + #endif +@@ -1720,6 +1721,30 @@ static inline struct ext4_inode_info *EX + return container_of(inode, struct ext4_inode_info, vfs_inode); + } + ++static inline int ext4_writepages_down_read(struct super_block *sb) ++{ ++ percpu_down_read(&EXT4_SB(sb)->s_writepages_rwsem); ++ return memalloc_nofs_save(); ++} ++ ++static inline void ext4_writepages_up_read(struct super_block *sb, int ctx) ++{ ++ memalloc_nofs_restore(ctx); ++ percpu_up_read(&EXT4_SB(sb)->s_writepages_rwsem); ++} ++ ++static inline int ext4_writepages_down_write(struct super_block *sb) ++{ ++ percpu_down_write(&EXT4_SB(sb)->s_writepages_rwsem); ++ return memalloc_nofs_save(); ++} ++ ++static inline void ext4_writepages_up_write(struct super_block *sb, int ctx) ++{ ++ memalloc_nofs_restore(ctx); ++ percpu_up_write(&EXT4_SB(sb)->s_writepages_rwsem); ++} ++ + static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino) + { + return ino == EXT4_ROOT_INO || +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -2668,11 +2668,12 @@ static int ext4_writepages(struct addres + struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb); + 
struct blk_plug plug; + bool give_up_on_write = false; ++ int alloc_ctx; + + if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) + return -EIO; + +- percpu_down_read(&sbi->s_writepages_rwsem); ++ alloc_ctx = ext4_writepages_down_read(inode->i_sb); + trace_ext4_writepages(inode, wbc); + + /* +@@ -2882,7 +2883,7 @@ unplug: + out_writepages: + trace_ext4_writepages_result(inode, wbc, ret, + nr_to_write - wbc->nr_to_write); +- percpu_up_read(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_read(inode->i_sb, alloc_ctx); + return ret; + } + +@@ -2893,17 +2894,18 @@ static int ext4_dax_writepages(struct ad + long nr_to_write = wbc->nr_to_write; + struct inode *inode = mapping->host; + struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb); ++ int alloc_ctx; + + if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) + return -EIO; + +- percpu_down_read(&sbi->s_writepages_rwsem); ++ alloc_ctx = ext4_writepages_down_read(inode->i_sb); + trace_ext4_writepages(inode, wbc); + + ret = dax_writeback_mapping_range(mapping, sbi->s_daxdev, wbc); + trace_ext4_writepages_result(inode, wbc, ret, + nr_to_write - wbc->nr_to_write); +- percpu_up_read(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_read(inode->i_sb, alloc_ctx); + return ret; + } + +@@ -6008,7 +6010,7 @@ int ext4_change_inode_journal_flag(struc + journal_t *journal; + handle_t *handle; + int err; +- struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); ++ int alloc_ctx; + + /* + * We have to be very careful here: changing a data block's +@@ -6046,7 +6048,7 @@ int ext4_change_inode_journal_flag(struc + } + } + +- percpu_down_write(&sbi->s_writepages_rwsem); ++ alloc_ctx = ext4_writepages_down_write(inode->i_sb); + jbd2_journal_lock_updates(journal); + + /* +@@ -6063,7 +6065,7 @@ int ext4_change_inode_journal_flag(struc + err = jbd2_journal_flush(journal, 0); + if (err < 0) { + jbd2_journal_unlock_updates(journal); +- percpu_up_write(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_write(inode->i_sb, alloc_ctx); + 
return err; + } + ext4_clear_inode_flag(inode, EXT4_INODE_JOURNAL_DATA); +@@ -6071,7 +6073,7 @@ int ext4_change_inode_journal_flag(struc + ext4_set_aops(inode); + + jbd2_journal_unlock_updates(journal); +- percpu_up_write(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_write(inode->i_sb, alloc_ctx); + + if (val) + filemap_invalidate_unlock(inode->i_mapping); +--- a/fs/ext4/migrate.c ++++ b/fs/ext4/migrate.c +@@ -409,7 +409,6 @@ static int free_ext_block(handle_t *hand + + int ext4_ext_migrate(struct inode *inode) + { +- struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); + handle_t *handle; + int retval = 0, i; + __le32 *i_data; +@@ -419,6 +418,7 @@ int ext4_ext_migrate(struct inode *inode + unsigned long max_entries; + __u32 goal, tmp_csum_seed; + uid_t owner[2]; ++ int alloc_ctx; + + /* + * If the filesystem does not support extents, or the inode +@@ -435,7 +435,7 @@ int ext4_ext_migrate(struct inode *inode + */ + return retval; + +- percpu_down_write(&sbi->s_writepages_rwsem); ++ alloc_ctx = ext4_writepages_down_write(inode->i_sb); + + /* + * Worst case we can touch the allocation bitmaps and a block +@@ -587,7 +587,7 @@ out_tmp_inode: + unlock_new_inode(tmp_inode); + iput(tmp_inode); + out_unlock: +- percpu_up_write(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_write(inode->i_sb, alloc_ctx); + return retval; + } + +@@ -606,6 +606,7 @@ int ext4_ind_migrate(struct inode *inode + ext4_fsblk_t blk; + handle_t *handle; + int ret, ret2 = 0; ++ int alloc_ctx; + + if (!ext4_has_feature_extents(inode->i_sb) || + (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) +@@ -622,7 +623,7 @@ int ext4_ind_migrate(struct inode *inode + if (test_opt(inode->i_sb, DELALLOC)) + ext4_alloc_da_blocks(inode); + +- percpu_down_write(&sbi->s_writepages_rwsem); ++ alloc_ctx = ext4_writepages_down_write(inode->i_sb); + + handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1); + if (IS_ERR(handle)) { +@@ -666,6 +667,6 @@ errout: + ext4_journal_stop(handle); + 
up_write(&EXT4_I(inode)->i_data_sem); + out_unlock: +- percpu_up_write(&sbi->s_writepages_rwsem); ++ ext4_writepages_up_write(inode->i_sb, alloc_ctx); + return ret; + } --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-trim-error-on-fs-with-small-groups.patch @@ -0,0 +1,72 @@ +From 173b6e383d2a204c9921ffc1eca3b87aa2106c33 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Fri, 12 Nov 2021 16:22:02 +0100 +Subject: [PATCH] ext4: avoid trim error on fs with small groups +Git-commit: 173b6e383d2a204c9921ffc1eca3b87aa2106c33 +Patch-mainline: v5.17-rc1 +References: bsc#1191271 + +A user reported FITRIM ioctl failing for him on ext4 on some devices +without apparent reason. After some debugging we've found out that +these devices (being LVM volumes) report rather large discard +granularity of 42MB and the filesystem had 1k blocksize and thus group +size of 8MB. Because ext4 FITRIM implementation puts discard +granularity into minlen, ext4_trim_fs() declared the trim request as +invalid. However just silently doing nothing seems to be a more +appropriate reaction to such combination of parameters since user did +not specify anything wrong. 
+ +Cc: Lukas Czerner <lczerner@redhat.com> +Fixes: 5c2ed62fd447 ("ext4: Adjust minlen with discard_granularity in the FITRIM ioctl") +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20211112152202.26614-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ioctl.c | 2 -- + fs/ext4/mballoc.c | 8 ++++++++ + 2 files changed, 8 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c +index 1366afb59fba..798d9d828795 100644 +--- a/fs/ext4/ioctl.c ++++ b/fs/ext4/ioctl.c +@@ -1114,8 +1114,6 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) + sizeof(range))) + return -EFAULT; + +- range.minlen = max((unsigned int)range.minlen, +- q->limits.discard_granularity); + ret = ext4_trim_fs(sb, &range); + if (ret < 0) + return ret; +diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c +index 3dd9b9e2f967..ea764137462e 100644 +--- a/fs/ext4/mballoc.c ++++ b/fs/ext4/mballoc.c +@@ -6400,6 +6400,7 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group, + */ + int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range) + { ++ struct request_queue *q = bdev_get_queue(sb->s_bdev); + struct ext4_group_info *grp; + ext4_group_t group, first_group, last_group; + ext4_grpblk_t cnt = 0, first_cluster, last_cluster; +@@ -6418,6 +6419,13 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range) + start >= max_blks || + range->len < sb->s_blocksize) + return -EINVAL; ++ /* No point to try to trim less than discard granularity */ ++ if (range->minlen < q->limits.discard_granularity) { ++ minlen = EXT4_NUM_B2C(EXT4_SB(sb), ++ q->limits.discard_granularity >> sb->s_blocksize_bits); ++ if (minlen > EXT4_CLUSTERS_PER_GROUP(sb)) ++ goto out; ++ } + if (end >= max_blks) + end = max_blks - 1; + if (end <= first_data_blk) +-- +2.34.1 + --- /dev/null +++ 
b/ldiskfs/kernel_patches/patches/patches.suse/ext4-avoid-unaccounted-block-allocation-when-expandi.patch @@ -0,0 +1,47 @@ +From 8994d11395f8165b3deca1971946f549f0822630 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 7 Dec 2022 12:59:28 +0100 +Subject: [PATCH] ext4: avoid unaccounted block allocation when expanding inode +Git-commit: 8994d11395f8165b3deca1971946f549f0822630 +Patch-mainline: v6.2-rc1 +References: bsc#1207634 + +When expanding inode space in ext4_expand_extra_isize_ea() we may need +to allocate external xattr block. If quota is not initialized for the +inode, the block allocation will not be accounted into quota usage. Make +sure the quota is initialized before we try to expand inode space. + +Reported-by: Pengfei Xu <pengfei.xu@intel.com> +Link: https://lore.kernel.org/all/Y5BT+k6xWqthZc1P@xpf.sh.intel.com +Signed-off-by: Jan Kara <jack@suse.cz> +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20221207115937.26601-2-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 38b7db452d37..5a92e5186313 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5945,6 +5945,14 @@ static int __ext4_expand_extra_isize(struct inode *inode, + return 0; + } + ++ /* ++ * We may need to allocate external xattr block so we need quotas ++ * initialized. Here we can be called with various locks held so we ++ * cannot affort to initialize quotas ourselves. So just bail. 
++ */ ++ if (dquot_initialize_needed(inode)) ++ return -EAGAIN; ++ + /* try to expand with EAs present */ + error = ext4_expand_extra_isize_ea(inode, new_extra_isize, + raw_inode, handle); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-bail-out-of-ext4_xattr_ibody_get-fails-for-any-.patch @@ -0,0 +1,36 @@ +From 2a534e1d0d1591e951f9ece2fb460b2ff92edabd Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Fri, 12 May 2023 15:16:27 -0400 +Subject: [PATCH] ext4: bail out of ext4_xattr_ibody_get() fails for any reason +Git-commit: 2a534e1d0d1591e951f9ece2fb460b2ff92edabd +Patch-mainline: v6.4-rc2 +References: bsc#1213018 + +In ext4_update_inline_data(), if ext4_xattr_ibody_get() fails for any +reason, it's best if we just fail as opposed to stumbling on, +especially if the failure is EFSCORRUPTED. + +Cc: stable@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inline.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c +index f47adb284e90..5854bd5a3352 100644 +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -360,7 +360,7 @@ static int ext4_update_inline_data(handle_t *handle, struct inode *inode, + + error = ext4_xattr_ibody_get(inode, i.name_index, i.name, + value, len); +- if (error == -ENODATA) ++ if (error < 0) + goto out; + + BUFFER_TRACE(is.iloc.bh, "get_write_access"); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-check-if-directory-block-is-within-i_size.patch @@ -0,0 +1,57 @@ +From 65f8ea4cd57dbd46ea13b41dc8bac03176b04233 Mon Sep 17 00:00:00 2001 +From: Lukas Czerner <lczerner@redhat.com> +Date: Mon, 4 Jul 2022 16:27:20 +0200 +Subject: [PATCH] ext4: check if directory block is within i_size +Git-commit: 65f8ea4cd57dbd46ea13b41dc8bac03176b04233 +Patch-mainline: v6.0-rc1 +References: bsc#1198577 CVE-2022-1184 + +Currently ext4 directory handling code 
implicitly assumes that the +directory blocks are always within the i_size. In fact ext4_append() +will attempt to allocate next directory block based solely on i_size and +the i_size is then appropriately increased after a successful +allocation. + +However, for this to work it requires i_size to be correct. If, for any +reason, the directory inode i_size is corrupted in a way that the +directory tree refers to a valid directory block past i_size, we could +end up corrupting parts of the directory tree structure by overwriting +already used directory blocks when modifying the directory. + +Fix it by catching the corruption early in __ext4_read_dirblock(). + +Addresses Red-Hat-Bugzilla: #2070205 + +Cve: CVE-2022-1184 +Signed-off-by: Lukas Czerner <lczerner@redhat.com> +Cc: stable@vger.kernel.org +Reviewed-by: Andreas Dilger <adilger@dilger.ca> +Link: https://lore.kernel.org/r/20220704142721.157985-1-lczerner@redhat.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 1c6725ecca1a..7fced54e2891 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -110,6 +110,13 @@ static struct buffer_head *__ext4_read_dirblock(struct inode *inode, + struct ext4_dir_entry *dirent; + int is_dx_block = 0; + ++ if (block >= inode->i_size) { ++ ext4_error_inode(inode, func, line, block, ++ "Attempting to read directory block (%u) that is past i_size (%llu)", ++ block, inode->i_size); ++ return ERR_PTR(-EFSCORRUPTED); ++ } ++ + if (ext4_simulate_fail(inode->i_sb, EXT4_SIM_DIRBLOCK_EIO)) + bh = ERR_PTR(-EIO); + else +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-check-iomap-type-only-if-ext4_iomap_begin-does-.patch @@ -0,0 +1,45 @@ +From fa83c34e3e56b3c672af38059e066242655271b1 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Fri, 5 May 2023 21:24:29 +0800 +Subject: 
[PATCH] ext4: check iomap type only if ext4_iomap_begin() does not + fail +Git-commit: fa83c34e3e56b3c672af38059e066242655271b1 +Patch-mainline: v6.4-rc2 +References: bsc#1213103 + +When ext4_iomap_overwrite_begin() calls ext4_iomap_begin() map blocks may +fail for some reason (e.g. memory allocation failure, bare disk write), and +later because "iomap->type ! = IOMAP_MAPPED" triggers WARN_ON(). When ext4 +iomap_begin() returns an error, it is normal that the type of iomap->type +may not match the expectation. Therefore, we only determine if iomap->type +is as expected when ext4_iomap_begin() is executed successfully. + +Cc: stable@kernel.org +Reported-by: syzbot+08106c4b7d60702dbc14@syzkaller.appspotmail.com +Link: https://lore.kernel.org/all/00000000000015760b05f9b4eee9@google.com +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230505132429.714648-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 3cb774d9e3f1..ce5f21b6c2b3 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -3377,7 +3377,7 @@ static int ext4_iomap_overwrite_begin(struct inode *inode, loff_t offset, + */ + flags &= ~IOMAP_WRITE; + ret = ext4_iomap_begin(inode, offset, length, flags, iomap, srcmap); +- WARN_ON_ONCE(iomap->type != IOMAP_MAPPED); ++ WARN_ON_ONCE(!ret && iomap->type != IOMAP_MAPPED); + return ret; + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-destroy-ext4_fc_dentry_cachep-kmemcache-on-modu.patch @@ -0,0 +1,79 @@ +From ab047d516dea72f011c15c04a929851e4d053109 Mon Sep 17 00:00:00 2001 +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 23 Dec 2021 17:44:36 +0100 +Subject: [PATCH] ext4: destroy ext4_fc_dentry_cachep kmemcache on module + removal +Git-commit: 
ab047d516dea72f011c15c04a929851e4d053109 +Patch-mainline: v5.17-rc1 +References: bsc#1197917 + +The kmemcache for ext4_fc_dentry_cachep remains registered after module +removal. + +Destroy ext4_fc_dentry_cachep kmemcache on module removal. + +Fixes: aa75f4d3daaeb ("ext4: main fast-commit commit path") +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Reviewed-by: Lukas Czerner <lczerner@redhat.com> +Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/20211110134640.lyku5vklvdndw6uk@linutronix.de +Link: https://lore.kernel.org/r/YbiK3JetFFl08bd7@linutronix.de +Link: https://lore.kernel.org/r/20211223164436.2628390-1-bigeasy@linutronix.de +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 1 + + fs/ext4/fast_commit.c | 5 +++++ + fs/ext4/super.c | 2 ++ + 3 files changed, 8 insertions(+) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 82fa51d6f145..714201fa9e6f 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2932,6 +2932,7 @@ bool ext4_fc_replay_check_excluded(struct super_block *sb, ext4_fsblk_t block); + void ext4_fc_replay_cleanup(struct super_block *sb); + int ext4_fc_commit(journal_t *journal, tid_t commit_tid); + int __init ext4_fc_init_dentry_cache(void); ++void ext4_fc_destroy_dentry_cache(void); + + /* mballoc.c */ + extern const struct seq_operations ext4_mb_seq_groups_ops; +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 28ddeb1d6afb..a6d647325742 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -2153,3 +2153,8 @@ int __init ext4_fc_init_dentry_cache(void) + + return 0; + } ++ ++void ext4_fc_destroy_dentry_cache(void) ++{ ++ kmem_cache_destroy(ext4_fc_dentry_cachep); ++} +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index acdfd9c0d091..499d1734818d 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -7118,6 +7118,7 @@ static int __init ext4_init_fs(void) + out: + 
unregister_as_ext2(); + unregister_as_ext3(); ++ ext4_fc_destroy_dentry_cache(); + out05: + destroy_inodecache(); + out1: +@@ -7144,6 +7145,7 @@ static void __exit ext4_exit_fs(void) + unregister_as_ext2(); + unregister_as_ext3(); + unregister_filesystem(&ext4_fs_type); ++ ext4_fc_destroy_dentry_cache(); + destroy_inodecache(); + ext4_exit_mballoc(); + ext4_exit_sysfs(); +-- +2.34.1 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-disable-fast-commit-of-encrypted-dir-operations.patch @@ -0,0 +1,151 @@ +From 0fbcb5251fc81b58969b272c4fb7374a7b922e3e Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:35 -0800 +Subject: [PATCH] ext4: disable fast-commit of encrypted dir operations +Git-commit: 0fbcb5251fc81b58969b272c4fb7374a7b922e3e +Patch-mainline: v6.2-rc1 +References: bsc#1207623 + +fast-commit of create, link, and unlink operations in encrypted +directories is completely broken because the unencrypted filenames are +being written to the fast-commit journal instead of the encrypted +filenames. These operations can't be replayed, as encryption keys +aren't present at journal replay time. It is also an information leak. + +Until if/when we can get this working properly, make encrypted directory +operations ineligible for fast-commit. + +Note that fast-commit operations on encrypted regular files continue to +be allowed, as they seem to work. 
+ +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-2-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 41 +++++++++++++++++++++++++---------------- + fs/ext4/fast_commit.h | 1 + + include/trace/events/ext4.h | 7 +++++-- + 3 files changed, 31 insertions(+), 18 deletions(-) + +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -432,25 +432,34 @@ static int __track_dentry_update(struct + struct __track_dentry_update_args *dentry_update = + (struct __track_dentry_update_args *)arg; + struct dentry *dentry = dentry_update->dentry; +- struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); ++ struct inode *dir = dentry->d_parent->d_inode; ++ struct super_block *sb = inode->i_sb; ++ struct ext4_sb_info *sbi = EXT4_SB(sb); + + mutex_unlock(&ei->i_fc_lock); ++ ++ if (IS_ENCRYPTED(dir)) { ++ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_ENCRYPTED_FILENAME, ++ NULL); ++ mutex_lock(&ei->i_fc_lock); ++ return -EOPNOTSUPP; ++ } ++ + node = kmem_cache_alloc(ext4_fc_dentry_cachep, GFP_NOFS); + if (!node) { +- ext4_fc_mark_ineligible(inode->i_sb, EXT4_FC_REASON_NOMEM); ++ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_NOMEM); + mutex_lock(&ei->i_fc_lock); + return -ENOMEM; + } + + node->fcd_op = dentry_update->op; +- node->fcd_parent = dentry->d_parent->d_inode->i_ino; ++ node->fcd_parent = dir->i_ino; + node->fcd_ino = inode->i_ino; + if (dentry->d_name.len > DNAME_INLINE_LEN) { + node->fcd_name.name = kmalloc(dentry->d_name.len, GFP_NOFS); + if (!node->fcd_name.name) { + kmem_cache_free(ext4_fc_dentry_cachep, node); +- ext4_fc_mark_ineligible(inode->i_sb, +- EXT4_FC_REASON_NOMEM); ++ ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_NOMEM); + mutex_lock(&ei->i_fc_lock); + return -ENOMEM; + } +@@ -2205,17 +2214,17 @@ void ext4_fc_init(struct super_block 
*sb + journal->j_fc_cleanup_callback = ext4_fc_cleanup; + } + +-static const char *fc_ineligible_reasons[] = { +- "Extended attributes changed", +- "Cross rename", +- "Journal flag changed", +- "Insufficient memory", +- "Swap boot", +- "Resize", +- "Dir renamed", +- "Falloc range op", +- "Data journalling", +- "FC Commit Failed" ++static const char * const fc_ineligible_reasons[] = { ++ [EXT4_FC_REASON_XATTR] = "Extended attributes changed", ++ [EXT4_FC_REASON_CROSS_RENAME] = "Cross rename", ++ [EXT4_FC_REASON_JOURNAL_FLAG_CHANGE] = "Journal flag changed", ++ [EXT4_FC_REASON_NOMEM] = "Insufficient memory", ++ [EXT4_FC_REASON_SWAP_BOOT] = "Swap boot", ++ [EXT4_FC_REASON_RESIZE] = "Resize", ++ [EXT4_FC_REASON_RENAME_DIR] = "Dir renamed", ++ [EXT4_FC_REASON_FALLOC_RANGE] = "Falloc range op", ++ [EXT4_FC_REASON_INODE_JOURNAL_DATA] = "Data journalling", ++ [EXT4_FC_REASON_ENCRYPTED_FILENAME] = "Encrypted filename", + }; + + int ext4_fc_info_show(struct seq_file *seq, void *v) +--- a/fs/ext4/fast_commit.h ++++ b/fs/ext4/fast_commit.h +@@ -99,6 +99,7 @@ enum { + EXT4_FC_REASON_FALLOC_RANGE, + EXT4_FC_REASON_INODE_JOURNAL_DATA, + EXT4_FC_COMMIT_FAILED, ++ EXT4_FC_REASON_ENCRYPTED_FILENAME, + EXT4_FC_REASON_MAX + }; + +--- a/include/trace/events/ext4.h ++++ b/include/trace/events/ext4.h +@@ -104,6 +104,7 @@ TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE) + TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR); + TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE); + TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_ENCRYPTED_FILENAME); + TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX); + + #define show_fc_reason(reason) \ +@@ -116,7 +117,8 @@ TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX); + { EXT4_FC_REASON_RESIZE, "RESIZE"}, \ + { EXT4_FC_REASON_RENAME_DIR, "RENAME_DIR"}, \ + { EXT4_FC_REASON_FALLOC_RANGE, "FALLOC_RANGE"}, \ +- { EXT4_FC_REASON_INODE_JOURNAL_DATA, "INODE_JOURNAL_DATA"}) ++ { EXT4_FC_REASON_INODE_JOURNAL_DATA, "INODE_JOURNAL_DATA"}, \ ++ { 
EXT4_FC_REASON_ENCRYPTED_FILENAME, "ENCRYPTED_FILENAME"}) + + TRACE_EVENT(ext4_other_inode_update_time, + TP_PROTO(struct inode *inode, ino_t orig_ino), +@@ -2764,7 +2766,7 @@ TRACE_EVENT(ext4_fc_stats, + ), + + TP_printk("dev %d,%d fc ineligible reasons:\n" +- "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u " ++ "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u" + "num_commits:%lu, ineligible: %lu, numblks: %lu", + MAJOR(__entry->dev), MINOR(__entry->dev), + FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), +@@ -2776,6 +2778,7 @@ TRACE_EVENT(ext4_fc_stats, + FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), + FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), + FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_ENCRYPTED_FILENAME), + __entry->fc_commits, __entry->fc_ineligible_commits, + __entry->fc_numblks) + ); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-disallow-ea_inodes-with-extended-attributes.patch @@ -0,0 +1,39 @@ +From 2bc7e7c1a3bc9bd0cbf0f71006f6fe7ef24a00c2 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Tue, 23 May 2023 23:49:50 -0400 +Subject: [PATCH] ext4: disallow ea_inodes with extended attributes +Git-commit: 2bc7e7c1a3bc9bd0cbf0f71006f6fe7ef24a00c2 +Patch-mainline: v6.4-rc5 +References: bsc#1213108 + +An ea_inode stores the value of an extended attribute; it can not have +extended attributes itself, or this will cause recursive nightmares. +Add a check in ext4_iget() to make sure this is the case. 
+ +Cc: stable@kernel.org +Reported-by: syzbot+e44749b6ba4d0434cd47@syzkaller.appspotmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Link: https://lore.kernel.org/r/20230524034951.779531-4-tytso@mit.edu +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 258f3cbed347..02de439bf1f0 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4647,6 +4647,9 @@ static const char *check_igot_inode(struct inode *inode, ext4_iget_flags flags) + if (flags & EXT4_IGET_EA_INODE) { + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + return "missing EA_INODE flag"; ++ if (ext4_test_inode_state(inode, EXT4_STATE_XATTR) || ++ EXT4_I(inode)->i_file_acl) ++ return "ea_inode with extended attributes"; + } else { + if ((EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + return "unexpected EA_INODE flag"; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-allow-journal-inode-to-have-encrypt-flag.patch @@ -0,0 +1,58 @@ +From 105c78e12468413e426625831faa7db4284e1fec Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Tue, 1 Nov 2022 22:33:12 -0700 +Subject: [PATCH] ext4: don't allow journal inode to have encrypt flag +Git-commit: 105c78e12468413e426625831faa7db4284e1fec +Patch-mainline: v6.2-rc1 +References: bsc#1207621 + +Mounting a filesystem whose journal inode has the encrypt flag causes a +NULL dereference in fscrypt_limit_io_blocks() when the 'inlinecrypt' +mount option is used. + +The problem is that when jbd2_journal_init_inode() calls bmap(), it +eventually finds its way into ext4_iomap_begin(), which calls +fscrypt_limit_io_blocks(). fscrypt_limit_io_blocks() requires that if +the inode is encrypted, then its encryption key must already be set up. +That's not the case here, since the journal inode is never "opened" like +a normal file would be. Hence the crash. 
+ +A reproducer is: + + mkfs.ext4 -F /dev/vdb + debugfs -w /dev/vdb -R "set_inode_field <8> flags 0x80808" + mount /dev/vdb /mnt -o inlinecrypt + +To fix this, make ext4 consider journal inodes with the encrypt flag to +be invalid. (Note, maybe other flags should be rejected on the journal +inode too. For now, this is just the minimal fix for the above issue.) + +I've marked this as fixing the commit that introduced the call to +fscrypt_limit_io_blocks(), since that's what made an actual crash start +being possible. But this fix could be applied to any version of ext4 +that supports the encrypt feature. + +Reported-by: syzbot+ba9dac45bc76c490b7c3@syzkaller.appspotmail.com +Fixes: 38ea50daa7a4 ("ext4: support direct I/O with fscrypt using blk-crypto") +Cc: stable@vger.kernel.org +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221102053312.189962-1-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5302,7 +5302,7 @@ static struct inode *ext4_get_journal_in + + jbd_debug(2, "Journal inode found at %p: %lld bytes\n", + journal_inode, journal_inode->i_size); +- if (!S_ISREG(journal_inode->i_mode)) { ++ if (!S_ISREG(journal_inode->i_mode) || IS_ENCRYPTED(journal_inode)) { + ext4_msg(sb, KERN_ERR, "invalid journal inode"); + iput(journal_inode); + return NULL; --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-increase-iversion-counter-for-ea_inodes.patch @@ -0,0 +1,49 @@ +From 50f094a5580e6297bf10a807d16f0ee23fa576cf Mon Sep 17 00:00:00 2001 +From: Lukas Czerner <lczerner@redhat.com> +Date: Wed, 24 Aug 2022 18:03:47 +0200 +Subject: [PATCH] ext4: don't increase iversion counter for ea_inodes +Git-commit: 50f094a5580e6297bf10a807d16f0ee23fa576cf +Patch-mainline: v6.1-rc1 +References: bsc#1207605 + +ea_inodes 
are using i_version for storing part of the reference count so +we really need to leave it alone. + +The problem can be reproduced by xfstest ext4/026 when iversion is +enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL +inodes in ext4_mark_iloc_dirty(). + +Cc: stable@kernel.org +Signed-off-by: Lukas Czerner <lczerner@redhat.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Jeff Layton <jlayton@kernel.org> +Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> +Link: https://lore.kernel.org/r/20220824160349.39664-1-lczerner@redhat.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 601214453c3a..2a220be34caa 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5731,7 +5731,12 @@ int ext4_mark_iloc_dirty(handle_t *handle, + } + ext4_fc_track_inode(handle, inode); + +- if (IS_I_VERSION(inode)) ++ /* ++ * ea_inodes are using i_version for storing reference count, don't ++ * mess with it ++ */ ++ if (IS_I_VERSION(inode) && ++ !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + inode_inc_iversion(inode); + + /* the do_update_inode consumes one bh->b_count */ +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-set-up-encryption-key-during-jbd2-transac.patch @@ -0,0 +1,158 @@ +From 4c0d5778385cb3618ff26a561ce41de2b7d9de70 Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:36 -0800 +Subject: [PATCH] ext4: don't set up encryption key during jbd2 transaction +Git-commit: 4c0d5778385cb3618ff26a561ce41de2b7d9de70 +Patch-mainline: v6.2-rc1 +References: bsc#1207624 + +Commit a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") +extended the scope of the transaction in ext4_unlink() too far, making +it include the call to ext4_find_entry(). 
However, ext4_find_entry() +can deadlock when called from within a transaction because it may need +to set up the directory's encryption key. + +Fix this by restoring the transaction to its original scope. + +Reported-by: syzbot+1a748d0007eeac3ab079@syzkaller.appspotmail.com +Fixes: a80f7fcf1867 ("ext4: fixup ext4_fc_track_* functions' signature") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-3-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 4 ++-- + fs/ext4/fast_commit.c | 2 +- + fs/ext4/namei.c | 44 ++++++++++++++++++++++++-------------------- + 3 files changed, 27 insertions(+), 23 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -3605,8 +3605,8 @@ extern void ext4_initialize_dirent_tail( + unsigned int blocksize); + extern int ext4_handle_dirty_dirblock(handle_t *handle, struct inode *inode, + struct buffer_head *bh); +-extern int __ext4_unlink(handle_t *handle, struct inode *dir, const struct qstr *d_name, +- struct inode *inode); ++extern int __ext4_unlink(struct inode *dir, const struct qstr *d_name, ++ struct inode *inode, struct dentry *dentry); + extern int __ext4_link(struct inode *dir, struct inode *inode, + struct dentry *dentry); + +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1334,7 +1334,7 @@ static int ext4_fc_replay_unlink(struct + return 0; + } + +- ret = __ext4_unlink(NULL, old_parent, &entry, inode); ++ ret = __ext4_unlink(old_parent, &entry, inode, NULL); + /* -ENOENT ok coz it might not exist anymore. 
*/ + if (ret == -ENOENT) + ret = 0; +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3389,14 +3389,20 @@ end_rmdir: + return retval; + } + +-int __ext4_unlink(handle_t *handle, struct inode *dir, const struct qstr *d_name, +- struct inode *inode) ++int __ext4_unlink(struct inode *dir, const struct qstr *d_name, ++ struct inode *inode, ++ struct dentry *dentry /* NULL during fast_commit recovery */) + { + int retval = -ENOENT; + struct buffer_head *bh; + struct ext4_dir_entry_2 *de; ++ handle_t *handle; + int skip_remove_dentry = 0; + ++ /* ++ * Keep this outside the transaction; it may have to set up the ++ * directory's encryption key, which isn't GFP_NOFS-safe. ++ */ + bh = ext4_find_entry(dir, d_name, &de, NULL); + if (IS_ERR(bh)) + return PTR_ERR(bh); +@@ -3413,7 +3419,14 @@ int __ext4_unlink(handle_t *handle, stru + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) + skip_remove_dentry = 1; + else +- goto out; ++ goto out_bh; ++ } ++ ++ handle = ext4_journal_start(dir, EXT4_HT_DIR, ++ EXT4_DATA_TRANS_BLOCKS(dir->i_sb)); ++ if (IS_ERR(handle)) { ++ retval = PTR_ERR(handle); ++ goto out_bh; + } + + if (IS_DIRSYNC(dir)) +@@ -3422,12 +3435,12 @@ int __ext4_unlink(handle_t *handle, stru + if (!skip_remove_dentry) { + retval = ext4_delete_entry(handle, dir, de, bh); + if (retval) +- goto out; ++ goto out_handle; + dir->i_ctime = dir->i_mtime = current_time(dir); + ext4_update_dx_flag(dir); + retval = ext4_mark_inode_dirty(handle, dir); + if (retval) +- goto out; ++ goto out_handle; + } else { + retval = 0; + } +@@ -3440,15 +3453,17 @@ int __ext4_unlink(handle_t *handle, stru + ext4_orphan_add(handle, inode); + inode->i_ctime = current_time(inode); + retval = ext4_mark_inode_dirty(handle, inode); +- +-out: ++ if (dentry && !retval) ++ ext4_fc_track_unlink(handle, dentry); ++out_handle: ++ ext4_journal_stop(handle); ++out_bh: + brelse(bh); + return retval; + } + + static int ext4_unlink(struct inode *dir, struct dentry *dentry) + { +- handle_t *handle; + int 
retval; + + if (unlikely(ext4_forced_shutdown(EXT4_SB(dir->i_sb)))) +@@ -3466,16 +3481,7 @@ static int ext4_unlink(struct inode *dir + if (retval) + goto out_trace; + +- handle = ext4_journal_start(dir, EXT4_HT_DIR, +- EXT4_DATA_TRANS_BLOCKS(dir->i_sb)); +- if (IS_ERR(handle)) { +- retval = PTR_ERR(handle); +- goto out_trace; +- } +- +- retval = __ext4_unlink(handle, dir, &dentry->d_name, d_inode(dentry)); +- if (!retval) +- ext4_fc_track_unlink(handle, dentry); ++ retval = __ext4_unlink(dir, &dentry->d_name, d_inode(dentry), dentry); + #ifdef CONFIG_UNICODE + /* VFS negative dentries are incompatible with Encoding and + * Case-insensitiveness. Eventually we'll want avoid +@@ -3486,8 +3492,6 @@ static int ext4_unlink(struct inode *dir + if (IS_CASEFOLDED(dir)) + d_invalidate(dentry); + #endif +- if (handle) +- ext4_journal_stop(handle); + + out_trace: + trace_ext4_unlink_exit(dentry, retval); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-don-t-use-the-orphan-list-when-migrating-an-ino.patch @@ -0,0 +1,88 @@ +From 6eeaf88fd586f05aaf1d48cb3a139d2a5c6eb055 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Wed, 5 Jan 2022 23:59:56 -0500 +Subject: [PATCH] ext4: don't use the orphan list when migrating an inode +Git-commit: 6eeaf88fd586f05aaf1d48cb3a139d2a5c6eb055 +Patch-mainline: v5.17-rc1 +References: bsc#1197756 + +We probably want to remove the indirect block to extents migration +feature after a deprecation window, but until then, let's fix a +potential data loss problem caused by the fact that we put the +tmp_inode on the orphan list. In the unlikely case where we crash and +do a journal recovery, the data blocks belonging to the inode being +migrated are also represented in the tmp_inode on the orphan list --- +and so its data blocks will get marked unallocated, and available for +reuse. + +Instead, stop putting the tmp_inode on the oprhan list. 
So in the +case where we crash while migrating the inode, we'll leak an inode, +which is not a disaster. It will be easily fixed the next time we run +fsck, and it's better than potentially having blocks getting claimed +by two different files, and losing data as a result. + +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Reviewed-by: Lukas Czerner <lczerner@redhat.com> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/migrate.c | 19 ++++--------------- + 1 file changed, 4 insertions(+), 15 deletions(-) + +diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c +index 36dfc88ce05b..ff8916e1d38e 100644 +--- a/fs/ext4/migrate.c ++++ b/fs/ext4/migrate.c +@@ -437,12 +437,12 @@ int ext4_ext_migrate(struct inode *inode) + percpu_down_write(&sbi->s_writepages_rwsem); + + /* +- * Worst case we can touch the allocation bitmaps, a bgd +- * block, and a block to link in the orphan list. We do need +- * need to worry about credits for modifying the quota inode. ++ * Worst case we can touch the allocation bitmaps and a block ++ * group descriptor block. We do need need to worry about ++ * credits for modifying the quota inode. + */ + handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, +- 4 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb)); ++ 3 + EXT4_MAXQUOTAS_TRANS_BLOCKS(inode->i_sb)); + + if (IS_ERR(handle)) { + retval = PTR_ERR(handle); +@@ -463,10 +463,6 @@ int ext4_ext_migrate(struct inode *inode) + * Use the correct seed for checksum (i.e. the seed from 'inode'). This + * is so that the metadata blocks will have the correct checksum after + * the migration. +- * +- * Note however that, if a crash occurs during the migration process, +- * the recovery process is broken because the tmp_inode checksums will +- * be wrong and the orphans cleanup will fail. 
+ */ + ei = EXT4_I(inode); + EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed; +@@ -478,7 +474,6 @@ int ext4_ext_migrate(struct inode *inode) + clear_nlink(tmp_inode); + + ext4_ext_tree_init(handle, tmp_inode); +- ext4_orphan_add(handle, tmp_inode); + ext4_journal_stop(handle); + + /* +@@ -503,12 +498,6 @@ int ext4_ext_migrate(struct inode *inode) + + handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1); + if (IS_ERR(handle)) { +- /* +- * It is impossible to update on-disk structures without +- * a handle, so just rollback in-core changes and live other +- * work to orphan_list_cleanup() +- */ +- ext4_orphan_del(NULL, tmp_inode); + retval = PTR_ERR(handle); + goto out_tmp_inode; + } +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-ext4_read_bh_lock-should-submit-IO-if-the-buffe.patch @@ -0,0 +1,81 @@ +From 0b73284c564d3ae4feef4bc920292f004acf4980 Mon Sep 17 00:00:00 2001 +From: Zhang Yi <yi.zhang@huawei.com> +Date: Wed, 31 Aug 2022 15:46:29 +0800 +Subject: [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer + isn't uptodate +Git-commit: 0b73284c564d3ae4feef4bc920292f004acf4980 +Patch-mainline: v6.1-rc1 +References: bsc#1207606 + +Recently we notice that ext4 filesystem would occasionally fail to read +metadata from disk and report error message, but the disk and block +layer looks fine. After analyse, we lockon commit 88dbcbb3a484 +("blkdev: avoid migration stalls for blkdev pages"). It provide a +migration method for the bdev, we could move page that has buffers +without extra users now, but it lock the buffers on the page, which +breaks the fragile metadata read operation on ext4 filesystem, +ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the +assumption of that locked buffer means it is under IO. So it just +trylock the buffer and skip submit IO if it lock failed, after +wait_on_buffer() we conclude IO error because the buffer is not +uptodate. 
+ +This issue could be easily reproduced by add some delay just after +buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do +fsstress on ext4 filesystem. + + EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193: + comm fsstress: reading directory lblock 0 + EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334: + comm fsstress: reading directory lblock 0 + +Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock +the buffer and submit IO if it's not uptodate, and also leave over +readahead helper. + +Cc: stable@kernel.org +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220831074629.3755110-1-yi.zhang@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 16 +++++----------- + 1 file changed, 5 insertions(+), 11 deletions(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -187,19 +187,12 @@ int ext4_read_bh(struct buffer_head *bh, + + int ext4_read_bh_lock(struct buffer_head *bh, int op_flags, bool wait) + { +- if (trylock_buffer(bh)) { +- if (wait) +- return ext4_read_bh(bh, op_flags, NULL); ++ lock_buffer(bh); ++ if (!wait) { + ext4_read_bh_nowait(bh, op_flags, NULL); + return 0; + } +- if (wait) { +- wait_on_buffer(bh); +- if (buffer_uptodate(bh)) +- return 0; +- return -EIO; +- } +- return 0; ++ return ext4_read_bh(bh, op_flags, NULL); + } + + /* +@@ -246,7 +239,8 @@ void ext4_sb_breadahead_unmovable(struct + struct buffer_head *bh = sb_getblk_gfp(sb, block, 0); + + if (likely(bh)) { +- ext4_read_bh_lock(bh, REQ_RAHEAD, false); ++ if (trylock_buffer(bh)) ++ ext4_read_bh_nowait(bh, REQ_RAHEAD, NULL); + brelse(bh); + } + } --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-f2fs-fix-readahead-of-verity-data.patch @@ -0,0 +1,48 @@ +From 4fa0e3ff217f775cb58d2d6d51820ec519243fb9 Mon Sep 17 00:00:00 2001 +From: "Matthew Wilcox (Oracle)" <willy@infradead.org> 
+Date: Wed, 12 Oct 2022 20:34:19 +0100 +Subject: [PATCH] ext4,f2fs: fix readahead of verity data +Git-commit: 4fa0e3ff217f775cb58d2d6d51820ec519243fb9 +Patch-mainline: v6.1-rc1 +References: bsc#1207648 + +The recent change of page_cache_ra_unbounded() arguments was buggy in the +two callers, causing us to readahead the wrong pages. Move the definition +of ractl down to after the index is set correctly. This affected +performance on configurations that use fs-verity. + +Link: https://lkml.kernel.org/r/20221012193419.1453558-1-willy@infradead.org +Fixes: 73bb49da50cd ("mm/readahead: make page_cache_ra_unbounded take a readahead_control") +Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> +Reported-by: Jintao Yin <nicememory@gmail.com> +Signed-off-by: Andrew Morton <akpm@linux-foundation.org> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/verity.c | 3 ++- + fs/f2fs/verity.c | 3 ++- + 2 files changed, 4 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c +index b051d19b5c8a..94442c690ca7 100644 +--- a/fs/ext4/verity.c ++++ b/fs/ext4/verity.c +@@ -365,13 +365,14 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode, + pgoff_t index, + unsigned long num_ra_pages) + { +- DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index); + struct page *page; + + index += ext4_verity_metadata_pos(inode) >> PAGE_SHIFT; + + page = find_get_page_flags(inode->i_mapping, index, FGP_ACCESSED); + if (!page || !PageUptodate(page)) { ++ DEFINE_READAHEAD(ractl, NULL, NULL, inode->i_mapping, index); ++ + if (page) + put_page(page); + else if (num_ra_pages > 1) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-factor-out-ext4_fc_get_tl.patch @@ -0,0 +1,148 @@ +From dcc5827484d6e53ccda12334f8bbfafcc593ceda Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Sat, 24 Sep 2022 15:52:32 +0800 +Subject: [PATCH] ext4: factor out ext4_fc_get_tl() +Git-commit: 
dcc5827484d6e53ccda12334f8bbfafcc593ceda +Patch-mainline: v6.1-rc1 +References: bsc#1207615 + +Factor out ext4_fc_get_tl() to fill 'tl' with host byte order. + +Signed-off-by: Ye Bin <yebin10@huawei.com> +Link: https://lore.kernel.org/r/20220924075233.2315259-3-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 46 +++++++++++++++++++++++++--------------------- + 1 file changed, 25 insertions(+), 21 deletions(-) + +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1275,7 +1275,7 @@ struct dentry_info_args { + }; + + static inline void tl_to_darg(struct dentry_info_args *darg, +- struct ext4_fc_tl *tl, u8 *val) ++ struct ext4_fc_tl *tl, u8 *val) + { + struct ext4_fc_dentry_info fcd; + +@@ -1284,8 +1284,14 @@ static inline void tl_to_darg(struct den + darg->parent_ino = le32_to_cpu(fcd.fc_parent_ino); + darg->ino = le32_to_cpu(fcd.fc_ino); + darg->dname = val + offsetof(struct ext4_fc_dentry_info, fc_dname); +- darg->dname_len = le16_to_cpu(tl->fc_len) - +- sizeof(struct ext4_fc_dentry_info); ++ darg->dname_len = tl->fc_len - sizeof(struct ext4_fc_dentry_info); ++} ++ ++static inline void ext4_fc_get_tl(struct ext4_fc_tl *tl, u8 *val) ++{ ++ memcpy(tl, val, EXT4_FC_TAG_BASE_LEN); ++ tl->fc_len = le16_to_cpu(tl->fc_len); ++ tl->fc_tag = le16_to_cpu(tl->fc_tag); + } + + /* Unlink replay function */ +@@ -1450,7 +1456,7 @@ static int ext4_fc_replay_inode(struct s + struct ext4_inode *raw_fc_inode; + struct inode *inode = NULL; + struct ext4_iloc iloc; +- int inode_len, ino, ret, tag = le16_to_cpu(tl->fc_tag); ++ int inode_len, ino, ret, tag = tl->fc_tag; + struct ext4_extent_header *eh; + + memcpy(&fc_inode, val, sizeof(fc_inode)); +@@ -1475,7 +1481,7 @@ static int ext4_fc_replay_inode(struct s + if (ret) + goto out; + +- inode_len = le16_to_cpu(tl->fc_len) - sizeof(struct ext4_fc_inode); ++ inode_len = tl->fc_len - sizeof(struct ext4_fc_inode); + raw_inode = 
ext4_raw_inode(&iloc); + + memcpy(raw_inode, raw_fc_inode, offsetof(struct ext4_inode, i_block)); +@@ -1962,12 +1968,12 @@ static int ext4_fc_replay_scan(journal_t + + state->fc_replay_expected_off++; + for (cur = start; cur < end; +- cur = cur + EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)) { +- memcpy(&tl, cur, EXT4_FC_TAG_BASE_LEN); ++ cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { ++ ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; + jbd_debug(3, "Scan phase, tag:%s, blk %lld\n", +- tag2str(le16_to_cpu(tl.fc_tag)), bh->b_blocknr); +- switch (le16_to_cpu(tl.fc_tag)) { ++ tag2str(tl.fc_tag), bh->b_blocknr); ++ switch (tl.fc_tag) { + case EXT4_FC_TAG_ADD_RANGE: + memcpy(&ext, val, sizeof(ext)); + ex = (struct ext4_extent *)&ext.fc_ex; +@@ -1987,7 +1993,7 @@ static int ext4_fc_replay_scan(journal_t + case EXT4_FC_TAG_PAD: + state->fc_cur_tag++; + state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur, +- EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)); ++ EXT4_FC_TAG_BASE_LEN + tl.fc_len); + break; + case EXT4_FC_TAG_TAIL: + state->fc_cur_tag++; +@@ -2020,7 +2026,7 @@ static int ext4_fc_replay_scan(journal_t + } + state->fc_cur_tag++; + state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur, +- EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)); ++ EXT4_FC_TAG_BASE_LEN + tl.fc_len); + break; + default: + ret = state->fc_replay_num_tags ? 
+@@ -2076,8 +2082,8 @@ static int ext4_fc_replay(journal_t *jou + end = (__u8 *)bh->b_data + journal->j_blocksize - 1; + + for (cur = start; cur < end; +- cur = cur + EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)) { +- memcpy(&tl, cur, EXT4_FC_TAG_BASE_LEN); ++ cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { ++ ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; + + if (state->fc_replay_num_tags == 0) { +@@ -2085,10 +2091,9 @@ static int ext4_fc_replay(journal_t *jou + ext4_fc_set_bitmaps_and_counters(sb); + break; + } +- jbd_debug(3, "Replay phase, tag:%s\n", +- tag2str(le16_to_cpu(tl.fc_tag))); ++ jbd_debug(3, "Replay phase, tag:%s\n", tag2str(tl.fc_tag)); + state->fc_replay_num_tags--; +- switch (le16_to_cpu(tl.fc_tag)) { ++ switch (tl.fc_tag) { + case EXT4_FC_TAG_LINK: + ret = ext4_fc_replay_link(sb, &tl, val); + break; +@@ -2109,19 +2114,18 @@ static int ext4_fc_replay(journal_t *jou + break; + case EXT4_FC_TAG_PAD: + trace_ext4_fc_replay(sb, EXT4_FC_TAG_PAD, 0, +- le16_to_cpu(tl.fc_len), 0); ++ tl.fc_len, 0); + break; + case EXT4_FC_TAG_TAIL: +- trace_ext4_fc_replay(sb, EXT4_FC_TAG_TAIL, 0, +- le16_to_cpu(tl.fc_len), 0); ++ trace_ext4_fc_replay(sb, EXT4_FC_TAG_TAIL, ++ 0, tl.fc_len, 0); + memcpy(&tail, val, sizeof(tail)); + WARN_ON(le32_to_cpu(tail.fc_tid) != expected_tid); + break; + case EXT4_FC_TAG_HEAD: + break; + default: +- trace_ext4_fc_replay(sb, le16_to_cpu(tl.fc_tag), 0, +- le16_to_cpu(tl.fc_len), 0); ++ trace_ext4_fc_replay(sb, tl.fc_tag, 0, tl.fc_len, 0); + ret = -ECANCELED; + break; + } --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fail-ext4_iget-if-special-inode-unallocated.patch @@ -0,0 +1,76 @@ +From 5cd740287ae5e3f9d1c46f5bfe8778972fd6d3fe Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Sat, 7 Jan 2023 11:21:25 +0800 +Subject: [PATCH] ext4: fail ext4_iget if special inode unallocated +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit 
+Git-commit: 5cd740287ae5e3f9d1c46f5bfe8778972fd6d3fe +Patch-mainline: v6.3-rc1 +References: bsc#1213010 + +In ext4_fill_super(), EXT4_ORPHAN_FS flag is cleared after +ext4_orphan_cleanup() is executed. Therefore, when __ext4_iget() is +called to get an inode whose i_nlink is 0 when the flag exists, no error +is returned. If the inode is a special inode, a null pointer dereference +may occur. If the value of i_nlink is 0 for any inodes (except boot loader +inodes) got by using the EXT4_IGET_SPECIAL flag, the current file system +is corrupted. Therefore, make the ext4_iget() function return an error if +it gets such an abnormal special inode. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=199179 +Link: https://bugzilla.kernel.org/show_bug.cgi?id=216541 +Link: https://bugzilla.kernel.org/show_bug.cgi?id=216539 +Reported-by: Luís Henriques <lhenriques@suse.de> +Suggested-by: Theodore Ts'o <tytso@mit.edu> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230107032126.4165860-2-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 18 ++++++++---------- + 1 file changed, 8 insertions(+), 10 deletions(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 9df913bdb416..b65dadfe3b45 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4872,13 +4872,6 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, + goto bad_inode; + raw_inode = ext4_raw_inode(&iloc); + +- if ((ino == EXT4_ROOT_INO) && (raw_inode->i_links_count == 0)) { +- ext4_error_inode(inode, function, line, 0, +- "iget: root inode unallocated"); +- ret = -EFSCORRUPTED; +- goto bad_inode; +- } +- + if ((flags & EXT4_IGET_HANDLE) && + (raw_inode->i_links_count == 0) && (raw_inode->i_mode == 0)) { + ret = -ESTALE; +@@ -4951,11 +4944,16 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, + * NeilBrown 1999oct15 + */ + 
if (inode->i_nlink == 0) { +- if ((inode->i_mode == 0 || ++ if ((inode->i_mode == 0 || flags & EXT4_IGET_SPECIAL || + !(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ORPHAN_FS)) && + ino != EXT4_BOOT_LOADER_INO) { +- /* this inode is deleted */ +- ret = -ESTALE; ++ /* this inode is deleted or unallocated */ ++ if (flags & EXT4_IGET_SPECIAL) { ++ ext4_error_inode(inode, function, line, 0, ++ "iget: special inode unallocated"); ++ ret = -EFSCORRUPTED; ++ } else ++ ret = -ESTALE; + goto bad_inode; + } + /* The only unlinked inodes we let through here have +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fast-commit-may-miss-tracking-unwritten-range-d.patch @@ -0,0 +1,44 @@ +From 9725958bb75cdfa10f2ec11526fdb23e7485e8e4 Mon Sep 17 00:00:00 2001 +From: Xin Yin <yinxin.x@bytedance.com> +Date: Thu, 23 Dec 2021 11:23:37 +0800 +Subject: [PATCH] ext4: fast commit may miss tracking unwritten range during + ftruncate +Git-commit: 9725958bb75cdfa10f2ec11526fdb23e7485e8e4 +Patch-mainline: v5.17-rc1 +References: bsc#1202759 + +If FALLOC_FL_KEEP_SIZE is used to allocate an unwritten range at the end of +the file, inode->i_size will not include the unwritten range. When +ftruncate is called with fast commit enabled, it fails to track the +unwritten range. + +Change to track the full range during ftruncate. + +Signed-off-by: Xin Yin <yinxin.x@bytedance.com> +Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/20211223032337.5198-3-yinxin.x@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 4895909de21b..08a90e25b78b 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5424,8 +5424,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, + ext4_fc_track_range(handle, inode, + (attr->ia_size > 0 ? 
attr->ia_size - 1 : 0) >> + inode->i_sb->s_blocksize_bits, +- (oldsize > 0 ? oldsize - 1 : 0) >> +- inode->i_sb->s_blocksize_bits); ++ EXT_MAX_BLOCKS - 1); + else + ext4_fc_track_range( + handle, inode, +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-filter-out-EXT4_FC_REPLAY-from-on-disk-superblo.patch @@ -0,0 +1,62 @@ +From c878bea3c9d724ddfa05a813f30de3d25a0ba83f Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Tue, 17 May 2022 13:27:55 -0400 +Subject: [PATCH] ext4: filter out EXT4_FC_REPLAY from on-disk superblock field + s_state +Git-commit: c878bea3c9d724ddfa05a813f30de3d25a0ba83f +Patch-mainline: v5.19-rc1 +References: bsc#1202771 + +The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that +we are in the middle of replaying the fast commit journal. This was +actually a mistake, since the sbi->s_mount_state is initialized from +es->s_state. Arguably s_mount_state is misleadingly named, but the +name is historical --- s_mount_state and s_state date back to ext2. + +What should have been used is the ext4_{set,clear,test}_mount_flag() +inline functions, which set EXT4_MF_* bits in sbi->s_mount_flags. + +The problem with using EXT4_FC_REPLAY is that a maliciously corrupted +superblock could result in EXT4_FC_REPLAY getting set in +s_mount_state. This bypasses some sanity checks, and this can trigger +a BUG() in ext4_es_cache_extent(). As an easy-to-backport fix, filter +out the EXT4_FC_REPLAY bit for now. We should eventually transition +away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY. 
+ +Cc: stable@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Link: https://lore.kernel.org/r/20220420192312.1655305-1-phind.uet@gmail.com +Link: https://lore.kernel.org/r/20220517174028.942119-1-tytso@mit.edu +Reported-by: syzbot+c7358a3cd05ee786eb31@syzkaller.appspotmail.com +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 7f6cd2473163..9cbb22045379 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -4770,7 +4770,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) + sbi->s_inodes_per_block; + sbi->s_desc_per_block = blocksize / EXT4_DESC_SIZE(sb); + sbi->s_sbh = bh; +- sbi->s_mount_state = le16_to_cpu(es->s_state); ++ sbi->s_mount_state = le16_to_cpu(es->s_state) & ~EXT4_FC_REPLAY; + sbi->s_addr_per_block_bits = ilog2(EXT4_ADDR_PER_BLOCK(sb)); + sbi->s_desc_per_block_bits = ilog2(EXT4_DESC_PER_BLOCK(sb)); + +@@ -6333,7 +6333,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) + if (err) + goto restore_opts; + } +- sbi->s_mount_state = le16_to_cpu(es->s_state); ++ sbi->s_mount_state = (le16_to_cpu(es->s_state) & ++ ~EXT4_FC_REPLAY); + + err = ext4_setup_super(sb, es, 0); + if (err) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-BUG_ON-when-directory-entry-has-invalid-rec.patch @@ -0,0 +1,74 @@ +From 17a0bc9bd697f75cfdf9b378d5eb2d7409c91340 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Lu=C3=ADs=20Henriques?= <lhenriques@suse.de> +Date: Wed, 12 Oct 2022 14:13:30 +0100 +Subject: [PATCH] ext4: fix BUG_ON() when directory entry has invalid rec_len +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit +Git-commit: 17a0bc9bd697f75cfdf9b378d5eb2d7409c91340 +Patch-mainline: v6.1-rc4 +References: bsc#1206886 + +The rec_len field in the directory entry has to be a multiple of 4. 
A +corrupted filesystem image can be used to hit a BUG() in +ext4_rec_len_to_disk(), called from make_indexed_dir(). + + ------------[ cut here ]------------ + kernel BUG at fs/ext4/ext4.h:2413! + ... + RIP: 0010:make_indexed_dir+0x53f/0x5f0 + ... + Call Trace: + <TASK> + ? add_dirent_to_buf+0x1b2/0x200 + ext4_add_entry+0x36e/0x480 + ext4_add_nondir+0x2b/0xc0 + ext4_create+0x163/0x200 + path_openat+0x635/0xe90 + do_filp_open+0xb4/0x160 + ? __create_object.isra.0+0x1de/0x3b0 + ? _raw_spin_unlock+0x12/0x30 + do_sys_openat2+0x91/0x150 + __x64_sys_open+0x6c/0xa0 + do_syscall_64+0x3c/0x80 + entry_SYSCALL_64_after_hwframe+0x46/0xb0 + +The fix simply adds a call to ext4_check_dir_entry() to validate the +directory entry, returning -EFSCORRUPTED if the entry is invalid. + +Cc: stable@kernel.org +Link: https://bugzilla.kernel.org/show_bug.cgi?id=216540 +Signed-off-by: Luís Henriques <lhenriques@suse.de> +Link: https://lore.kernel.org/r/20221012131330.32456-1-lhenriques@suse.de +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 10 +++++++++- + 1 file changed, 9 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 4183a4cb4a21..be8136aafa22 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -2259,8 +2259,16 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname, + memset(de, 0, len); /* wipe old data */ + de = (struct ext4_dir_entry_2 *) data2; + top = data2 + len; +- while ((char *)(de2 = ext4_next_entry(de, blocksize)) < top) ++ while ((char *)(de2 = ext4_next_entry(de, blocksize)) < top) { ++ if (ext4_check_dir_entry(dir, NULL, de, bh2, data2, len, ++ (data2 + (blocksize - csum_size) - ++ (char *) de))) { ++ brelse(bh2); ++ brelse(bh); ++ return -EFSCORRUPTED; ++ } + de = de2; ++ } + de->rec_len = ext4_rec_len_to_disk(data2 + (blocksize - csum_size) - + (char *) de, blocksize); + +-- +2.35.3 + --- /dev/null +++ 
b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-RENAME_WHITEOUT-handling-for-inline-directo.patch @@ -0,0 +1,93 @@ +From c9f62c8b2dbf7240536c0cc9a4529397bb8bf38e Mon Sep 17 00:00:00 2001 +From: Eric Whitney <enwlinux@gmail.com> +Date: Fri, 10 Feb 2023 12:32:44 -0500 +Subject: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories +Git-commit: c9f62c8b2dbf7240536c0cc9a4529397bb8bf38e +Patch-mainline: v6.3-rc2 +References: bsc#1210766 + +A significant number of xfstests can cause ext4 to log one or more +warning messages when they are run on a test file system where the +inline_data feature has been enabled. An example: + +"EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode + #16385: comm fsstress: No space for directory leaf checksum. Please +run e2fsck -D." + +The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068, +070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585, +589, 626, 631, and 650. + +In this situation, the warning message indicates a bug in the code that +performs the RENAME_WHITEOUT operation on a directory entry that has +been stored inline. It doesn't detect that the directory is stored +inline, and incorrectly attempts to compute a dirent block checksum on +the whiteout inode when creating it. This attempt fails as a result +of the integrity checking in get_dirent_tail (usually due to a failure +to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message +is then emitted. + +Fix this by simply collecting the inlined data state at the time the +search for the source directory entry is performed. Existing code +handles the rest, and this is sufficient to eliminate all spurious +warning messages produced by the tests above. Go one step further +and do the same in the code that resets the source directory entry in +the event of failure. 
The inlined state should be present in the +"old" struct, but given the possibility of a race there's no harm +in taking a conservative approach and getting that information again +since the directory entry is being reread anyway. + +Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout") +Cc: stable@kernel.org +Signed-off-by: Eric Whitney <enwlinux@gmail.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230210173244.679890-1-enwlinux@gmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 13 +++++++------ + 1 file changed, 7 insertions(+), 6 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 270fbcba75b6..e8f429330f3c 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir, + int has_inline_data = 1; + ret = ext4_find_inline_entry(dir, fname, res_dir, + &has_inline_data); +- if (has_inline_data) { +- if (inlined) +- *inlined = 1; ++ if (inlined) ++ *inlined = has_inline_data; ++ if (has_inline_data) + goto cleanup_and_exit; +- } + } + + if ((namelen <= 2) && (name[0] == '.') && +@@ -3646,7 +3645,8 @@ static void ext4_resetent(handle_t *handle, struct ext4_renament *ent, + * so the old->de may no longer valid and need to find it again + * before reset old inode info. 
+ */ +- old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL); ++ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, ++ &old.inlined); + if (IS_ERR(old.bh)) + retval = PTR_ERR(old.bh); + if (!old.bh) +@@ -3813,7 +3813,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + return retval; + } + +- old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL); ++ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, ++ &old.inlined); + if (IS_ERR(old.bh)) + return PTR_ERR(old.bh); + /* +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-WARNING-in-ext4_update_inline_data.patch @@ -0,0 +1,114 @@ +From 2b96b4a5d9443ca4cad58b0040be455803c05a42 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Tue, 7 Mar 2023 09:52:53 +0800 +Subject: [PATCH] ext4: fix WARNING in ext4_update_inline_data +Git-commit: 2b96b4a5d9443ca4cad58b0040be455803c05a42 +Patch-mainline: v6.3-rc2 +References: bsc#1213012 + +Syzbot found the following issue: +EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. 
+ +Fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni" +Fscrypt: AES-256-XTS using implementation "xts-aes-aesni" +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 +Modules linked in: +CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 +RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 +RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246 +RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000 +RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248 +RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220 +R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40 +R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c +FS: 0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + <TASK> + __alloc_pages_node include/linux/gfp.h:237 [inline] + alloc_pages_node include/linux/gfp.h:260 [inline] + __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113 + __do_kmalloc_node mm/slab_common.c:956 [inline] + __kmalloc+0xfe/0x190 mm/slab_common.c:981 + kmalloc include/linux/slab.h:584 [inline] + kzalloc include/linux/slab.h:720 [inline] + ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346 + ext4_update_inline_dir fs/ext4/inline.c:1115 [inline] + ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307 + ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385 + ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772 + ext4_create+0x36c/0x560 fs/ext4/namei.c:2817 + lookup_open fs/namei.c:3413 [inline] + 
open_last_lookups fs/namei.c:3481 [inline] + path_openat+0x12ac/0x2dd0 fs/namei.c:3711 + do_filp_open+0x264/0x4f0 fs/namei.c:3741 + do_sys_openat2+0x124/0x4e0 fs/open.c:1310 + do_sys_open fs/open.c:1326 [inline] + __do_sys_openat fs/open.c:1342 [inline] + __se_sys_openat fs/open.c:1337 [inline] + __x64_sys_openat+0x243/0x290 fs/open.c:1337 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + +Above issue happens as follows: +ext4_iget + ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60 +ext4_try_add_inline_entry + __ext4_mark_inode_dirty + ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44 + ext4_xattr_shift_entries + ->after shift i_inline_off is incorrect, actually is change to 176 +ext4_try_add_inline_entry + ext4_update_inline_dir + get_max_inline_xattr_value_size + if (EXT4_I(inode)->i_inline_off) + entry = (struct ext4_xattr_entry *)((void *)raw_inode + + EXT4_I(inode)->i_inline_off); + free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)); + ->As entry is incorrect, then 'free' may be negative + ext4_update_inline_data + value = kzalloc(len, GFP_NOFS); + -> len is unsigned int, maybe very large, then trigger warning when + 'kzalloc()' + +To resolve the above issue we need to update 'i_inline_off' after +'ext4_xattr_shift_entries()'. We do not need to set +EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty() +already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA +when it is needed may trigger a BUG_ON in ext4_writepages(). 
+ +Reported-by: syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/xattr.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index 863c15388848..2a006e4db467 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -2851,6 +2851,9 @@ int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize, + (void *)header, total_ino); + EXT4_I(inode)->i_extra_isize = new_extra_isize; + ++ if (ext4_has_inline_data(inode)) ++ error = ext4_find_inline_data_nolock(inode); ++ + cleanup: + if (error && (mnt_count != le16_to_cpu(sbi->s_es->s_mnt_count))) { + ext4_warning(inode->i_sb, "Unable to expand inode %lu. Delete some EAs or run e2fsck.", +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-WARNING-in-mb_find_extent.patch @@ -0,0 +1,135 @@ +From fa08a7b61dff8a4df11ff1e84abfc214b487caf7 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Mon, 16 Jan 2023 10:00:15 +0800 +Subject: [PATCH] ext4: fix WARNING in mb_find_extent +Git-commit: fa08a7b61dff8a4df11ff1e84abfc214b487caf7 +Patch-mainline: v6.4-rc2 +References: bsc#1213099 + +Syzbot found the following issue: + +Ext4-fs: Warning: mounting with data=journal disables delayed allocation, dioread_nolock, O_DIRECT and fast_commit support! 
+EXT4-fs (loop0): orphan cleanup on readonly fs + +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +WARNING: CPU: 1 PID: 5067 at fs/ext4/mballoc.c:1869 mb_find_extent+0x8a1/0xe30 +Modules linked in: +CPU: 1 PID: 5067 Comm: syz-executor307 Not tainted 6.2.0-rc1-syzkaller #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 +RIP: 0010:mb_find_extent+0x8a1/0xe30 fs/ext4/mballoc.c:1869 +RSP: 0018:ffffc90003c9e098 EFLAGS: 00010293 +RAX: ffffffff82405731 RBX: 0000000000000041 RCX: ffff8880783457c0 +RDX: 0000000000000000 RSI: 0000000000000041 RDI: 0000000000000040 +RBP: 0000000000000040 R08: ffffffff82405723 R09: ffffed10053c9402 +R10: ffffed10053c9402 R11: 1ffff110053c9401 R12: 0000000000000000 +R13: ffffc90003c9e538 R14: dffffc0000000000 R15: ffffc90003c9e2cc +FS: 0000555556665300(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 000056312f6796f8 CR3: 0000000022437000 CR4: 00000000003506e0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + <TASK> + ext4_mb_complex_scan_group+0x353/0x1100 fs/ext4/mballoc.c:2307 + ext4_mb_regular_allocator+0x1533/0x3860 fs/ext4/mballoc.c:2735 + ext4_mb_new_blocks+0xddf/0x3db0 fs/ext4/mballoc.c:5605 + ext4_ext_map_blocks+0x1868/0x6880 fs/ext4/extents.c:4286 + ext4_map_blocks+0xa49/0x1cc0 fs/ext4/inode.c:651 + ext4_getblk+0x1b9/0x770 fs/ext4/inode.c:864 + ext4_bread+0x2a/0x170 fs/ext4/inode.c:920 + ext4_quota_write+0x225/0x570 fs/ext4/super.c:7105 + write_blk fs/quota/quota_tree.c:64 [inline] + get_free_dqblk+0x34a/0x6d0 fs/quota/quota_tree.c:130 + do_insert_tree+0x26b/0x1aa0 fs/quota/quota_tree.c:340 + do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 + do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 + do_insert_tree+0x722/0x1aa0 fs/quota/quota_tree.c:375 + dq_insert_tree fs/quota/quota_tree.c:401 [inline] + 
qtree_write_dquot+0x3b6/0x530 fs/quota/quota_tree.c:420 + v2_write_dquot+0x11b/0x190 fs/quota/quota_v2.c:358 + dquot_acquire+0x348/0x670 fs/quota/dquot.c:444 + ext4_acquire_dquot+0x2dc/0x400 fs/ext4/super.c:6740 + dqget+0x999/0xdc0 fs/quota/dquot.c:914 + __dquot_initialize+0x3d0/0xcf0 fs/quota/dquot.c:1492 + ext4_process_orphan+0x57/0x2d0 fs/ext4/orphan.c:329 + ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 + __ext4_fill_super fs/ext4/super.c:5516 [inline] + ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 + get_tree_bdev+0x400/0x620 fs/super.c:1282 + vfs_get_tree+0x88/0x270 fs/super.c:1489 + do_new_mount+0x289/0xad0 fs/namespace.c:3145 + do_mount fs/namespace.c:3488 [inline] + __do_sys_mount fs/namespace.c:3697 [inline] + __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + +Add some debug information: +mb_find_extent: mb_find_extent block=41, order=0 needed=64 next=0 ex=0/41/1@3735929054 64 64 7 +block_bitmap: ff 3f 0c 00 fc 01 00 00 d2 3d 00 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + +Actually, blocks per group is 64, but the block bitmap indicates at least +128 blocks. Currently, ext4_validate_block_bitmap() does not check whether +the bitmap bits for invalid (out-of-group) blocks are set. +To resolve the above issue, add a check like fsck's "Padding at end of block bitmap is +not set". 
+ +Cc: stable@kernel.org +Reported-by: syzbot+68223fe9f6c95ad43bed@syzkaller.appspotmail.com +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230116020015.1506120-1-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/balloc.c | 25 +++++++++++++++++++++++++ + 1 file changed, 25 insertions(+) + +diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c +index 094269488183..c49e612e3975 100644 +--- a/fs/ext4/balloc.c ++++ b/fs/ext4/balloc.c +@@ -305,6 +305,22 @@ struct ext4_group_desc * ext4_get_group_desc(struct super_block *sb, + return desc; + } + ++static ext4_fsblk_t ext4_valid_block_bitmap_padding(struct super_block *sb, ++ ext4_group_t block_group, ++ struct buffer_head *bh) ++{ ++ ext4_grpblk_t next_zero_bit; ++ unsigned long bitmap_size = sb->s_blocksize * 8; ++ unsigned int offset = num_clusters_in_group(sb, block_group); ++ ++ if (bitmap_size <= offset) ++ return 0; ++ ++ next_zero_bit = ext4_find_next_zero_bit(bh->b_data, bitmap_size, offset); ++ ++ return (next_zero_bit < bitmap_size ? next_zero_bit : 0); ++} ++ + /* + * Return the block number which was discovered to be invalid, or 0 if + * the block bitmap is valid. 
+@@ -402,6 +418,15 @@ static int ext4_validate_block_bitmap(struct super_block *sb, + EXT4_GROUP_INFO_BBITMAP_CORRUPT); + return -EFSCORRUPTED; + } ++ blk = ext4_valid_block_bitmap_padding(sb, block_group, bh); ++ if (unlikely(blk != 0)) { ++ ext4_unlock_group(sb, block_group); ++ ext4_error(sb, "bg %u: block %llu: padding at end of block bitmap is not set", ++ block_group, blk); ++ ext4_mark_group_bitmap_corrupted(sb, block_group, ++ EXT4_GROUP_INFO_BBITMAP_CORRUPT); ++ return -EFSCORRUPTED; ++ } + set_buffer_verified(bh); + verified: + ext4_unlock_group(sb, block_group); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-a-possible-ABBA-deadlock-due-to-busy-PA.patch @@ -0,0 +1,159 @@ +From 8c80fb312d7abf8bcd66cca1d843a80318a2c522 Mon Sep 17 00:00:00 2001 +From: Chunguang Xu <brookxu@tencent.com> +Date: Tue, 23 Nov 2021 09:17:57 +0800 +Subject: [PATCH] ext4: fix a possible ABBA deadlock due to busy PA +Git-commit: 8c80fb312d7abf8bcd66cca1d843a80318a2c522 +Patch-mainline: v5.17-rc1 +References: bsc#1202762 + +We found on older kernel (3.10) that in the scenario of insufficient +disk space, system may trigger an ABBA deadlock problem, it seems that +this problem still exists in latest kernel, try to fix it here. The +main process triggered by this problem is that task A occupies the PA +and waits for the jbd2 transaction finish, the jbd2 transaction waits +for the completion of task B's IO (plug_list), but task B waits for +the release of PA by task A to finish discard, which indirectly forms +an ABBA deadlock. 
The related calltrace is as follows: + + Task A + vfs_write + ext4_mb_new_blocks() + ext4_mb_mark_diskspace_used() JBD2 + jbd2_journal_get_write_access() -> jbd2_journal_commit_transaction() + ->schedule() filemap_fdatawait() + | | + | Task B | + | do_unlinkat() | + | ext4_evict_inode() | + | jbd2_journal_begin_ordered_truncate() | + | filemap_fdatawrite_range() | + | ext4_mb_new_blocks() | + -ext4_mb_discard_group_preallocations() <----- + +Here, try to cancel ext4_mb_discard_group_preallocations() internal +retry due to PA busy, and do a limited number of retries inside +ext4_mb_discard_preallocations(), which can circumvent the above +problems, but also has some advantages: + +1. Since the PA is in a busy state, if other groups have free PAs, + keeping the current PA may help to reduce fragmentation. +2. Continue to traverse forward instead of waiting for the current + group PA to be released. In most scenarios, the PA discard time + can be reduced. + +However, in the case of smaller free space, if only a few groups have +space, then due to multiple traversals of the group, it may increase +CPU overhead. But in contrast, I feel that the overall benefit is +better than the cost. 
+ +Signed-off-by: Chunguang Xu <brookxu@tencent.com> +Reported-by: kernel test robot <lkp@intel.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/1637630277-23496-1-git-send-email-brookxu.cn@gmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/mballoc.c | 40 ++++++++++++++++++---------------------- + 1 file changed, 18 insertions(+), 22 deletions(-) + +diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c +index 215b7068f548..3dd9b9e2f967 100644 +--- a/fs/ext4/mballoc.c ++++ b/fs/ext4/mballoc.c +@@ -4814,7 +4814,7 @@ ext4_mb_release_group_pa(struct ext4_buddy *e4b, + */ + static noinline_for_stack int + ext4_mb_discard_group_preallocations(struct super_block *sb, +- ext4_group_t group, int needed) ++ ext4_group_t group, int *busy) + { + struct ext4_group_info *grp = ext4_get_group_info(sb, group); + struct buffer_head *bitmap_bh = NULL; +@@ -4822,8 +4822,7 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, + struct list_head list; + struct ext4_buddy e4b; + int err; +- int busy = 0; +- int free, free_total = 0; ++ int free = 0; + + mb_debug(sb, "discard preallocation for group %u\n", group); + if (list_empty(&grp->bb_prealloc_list)) +@@ -4846,19 +4845,14 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, + goto out_dbg; + } + +- if (needed == 0) +- needed = EXT4_CLUSTERS_PER_GROUP(sb) + 1; +- + INIT_LIST_HEAD(&list); +-repeat: +- free = 0; + ext4_lock_group(sb, group); + list_for_each_entry_safe(pa, tmp, + &grp->bb_prealloc_list, pa_group_list) { + spin_lock(&pa->pa_lock); + if (atomic_read(&pa->pa_count)) { + spin_unlock(&pa->pa_lock); +- busy = 1; ++ *busy = 1; + continue; + } + if (pa->pa_deleted) { +@@ -4898,22 +4892,13 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, + call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback); + } + +- free_total += free; +- +- /* if we still need more blocks and some PAs were used, try again */ +- 
if (free_total < needed && busy) { +- ext4_unlock_group(sb, group); +- cond_resched(); +- busy = 0; +- goto repeat; +- } + ext4_unlock_group(sb, group); + ext4_mb_unload_buddy(&e4b); + put_bh(bitmap_bh); + out_dbg: + mb_debug(sb, "discarded (%d) blocks preallocated for group %u bb_free (%d)\n", +- free_total, group, grp->bb_free); +- return free_total; ++ free, group, grp->bb_free); ++ return free; + } + + /* +@@ -5455,13 +5440,24 @@ static int ext4_mb_discard_preallocations(struct super_block *sb, int needed) + { + ext4_group_t i, ngroups = ext4_get_groups_count(sb); + int ret; +- int freed = 0; ++ int freed = 0, busy = 0; ++ int retry = 0; + + trace_ext4_mb_discard_preallocations(sb, needed); ++ ++ if (needed == 0) ++ needed = EXT4_CLUSTERS_PER_GROUP(sb) + 1; ++ repeat: + for (i = 0; i < ngroups && needed > 0; i++) { +- ret = ext4_mb_discard_group_preallocations(sb, i, needed); ++ ret = ext4_mb_discard_group_preallocations(sb, i, &busy); + freed += ret; + needed -= ret; ++ cond_resched(); ++ } ++ ++ if (needed > 0 && busy && ++retry < 3) { ++ busy = 0; ++ goto repeat; + } + + return freed; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-an-use-after-free-issue-about-data-journal-.patch @@ -0,0 +1,135 @@ +From 5c48a7df91499e371ef725895b2e2d21a126e227 Mon Sep 17 00:00:00 2001 +From: Zhang Yi <yi.zhang@huawei.com> +Date: Sat, 25 Dec 2021 17:09:37 +0800 +Subject: [PATCH] ext4: fix an use-after-free issue about data=journal + writeback mode +Git-commit: 5c48a7df91499e371ef725895b2e2d21a126e227 +Patch-mainline: v5.17-rc1 +References: bsc#1195482 + +Our syzkaller report an use-after-free issue that accessing the freed +buffer_head on the writeback page in __ext4_journalled_writepage(). 
The +problem is that if there was a truncate racing with the data=journalled +writeback procedure, the writeback length could become zero and +bget_one() refuse to get buffer_head's refcount, then the truncate +procedure release buffer once we drop page lock, finally, the last +ext4_walk_page_buffers() trigger the use-after-free problem. + +sync truncate +ext4_sync_file() + file_write_and_wait_range() + ext4_setattr(0) + inode->i_size = 0 + ext4_writepage() + len = 0 + __ext4_journalled_writepage() + page_bufs = page_buffers(page) + ext4_walk_page_buffers(bget_one) <- does not get refcount + do_invalidatepage() + free_buffer_head() + ext4_walk_page_buffers(page_bufs) <- trigger use-after-free + +After commit bdf96838aea6 ("ext4: fix race between truncate and +__ext4_journalled_writepage()"), we have already handled the racing +case, so the bget_one() and bput_one() are not needed. So this patch +simply remove these hunk, and recheck the i_size to make it safe. + +Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()") +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Cc: stable@vger.kernel.org +Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 37 ++++++++++--------------------------- + 1 file changed, 10 insertions(+), 27 deletions(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index bca9951634d9..68070f34f0cf 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -1845,30 +1845,16 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock, + return 0; + } + +-static int bget_one(handle_t *handle, struct inode *inode, +- struct buffer_head *bh) +-{ +- get_bh(bh); +- return 0; +-} +- +-static int bput_one(handle_t *handle, struct inode *inode, +- struct buffer_head *bh) +-{ +- put_bh(bh); +- return 0; +-} +- + static int __ext4_journalled_writepage(struct page *page, + unsigned int len) 
+ { + struct address_space *mapping = page->mapping; + struct inode *inode = mapping->host; +- struct buffer_head *page_bufs = NULL; + handle_t *handle = NULL; + int ret = 0, err = 0; + int inline_data = ext4_has_inline_data(inode); + struct buffer_head *inode_bh = NULL; ++ loff_t size; + + ClearPageChecked(page); + +@@ -1878,14 +1864,6 @@ static int __ext4_journalled_writepage(struct page *page, + inode_bh = ext4_journalled_write_inline_data(inode, len, page); + if (inode_bh == NULL) + goto out; +- } else { +- page_bufs = page_buffers(page); +- if (!page_bufs) { +- BUG(); +- goto out; +- } +- ext4_walk_page_buffers(handle, inode, page_bufs, 0, len, +- NULL, bget_one); + } + /* + * We need to release the page lock before we start the +@@ -1906,7 +1884,8 @@ static int __ext4_journalled_writepage(struct page *page, + + lock_page(page); + put_page(page); +- if (page->mapping != mapping) { ++ size = i_size_read(inode); ++ if (page->mapping != mapping || page_offset(page) > size) { + /* The page got truncated from under us */ + ext4_journal_stop(handle); + ret = 0; +@@ -1916,6 +1895,13 @@ static int __ext4_journalled_writepage(struct page *page, + if (inline_data) { + ret = ext4_mark_inode_dirty(handle, inode); + } else { ++ struct buffer_head *page_bufs = page_buffers(page); ++ ++ if (page->index == size >> PAGE_SHIFT) ++ len = size & ~PAGE_MASK; ++ else ++ len = PAGE_SIZE; ++ + ret = ext4_walk_page_buffers(handle, inode, page_bufs, 0, len, + NULL, do_journal_get_write_access); + +@@ -1936,9 +1922,6 @@ static int __ext4_journalled_writepage(struct page *page, + out: + unlock_page(page); + out_no_pagelock: +- if (!inline_data && page_bufs) +- ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len, +- NULL, bput_one); + brelse(inode_bh); + return ret; + } +-- +2.34.1 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-another-off-by-one-fsmap-error-on-1k-block-.patch @@ -0,0 +1,127 @@ +From c993799baf9c5861f8df91beb80e1611b12efcbd Mon Sep 17 
00:00:00 2001 +From: "Darrick J. Wong" <djwong@kernel.org> +Date: Thu, 16 Feb 2023 10:55:48 -0800 +Subject: [PATCH] ext4: fix another off-by-one fsmap error on 1k block + filesystems +Git-commit: c993799baf9c5861f8df91beb80e1611b12efcbd +Patch-mainline: v6.3-rc2 +References: bsc#1210767 + +Apparently syzbot figured out that issuing this FSMAP call: + +struct fsmap_head cmd = { + .fmh_count = ...; + .fmh_keys = { + { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, + { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, + }, +... +}; +ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd); + +Produces this crash if the underlying filesystem is a 1k-block ext4 +filesystem: + +kernel BUG at fs/ext4/ext4.h:3331! +invalid opcode: 0000 [#1] PREEMPT SMP +CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx +Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 +RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4] +RSP: 0018:ffffc90007c03998 EFLAGS: 00010246 +RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000 +RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11 +RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400 +R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001 +R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398 +FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0 +Call Trace: + <TASK> + ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] + ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] + ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] + ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] + __ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] + __x64_sys_ioctl+0x82/0xa0 + do_syscall_64+0x2b/0x80 + 
entry_SYSCALL_64_after_hwframe+0x46/0xb0 +RIP: 0033:0x7fdf20558aff +RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 +RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff +RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003 +RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001 +R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010 +R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000 + +For GETFSMAP calls, the caller selects a physical block device by +writing its device number into fsmap_head.fmh_keys[01].fmr_device. +To query mappings for a subrange of the device, the starting byte of the +range is written to fsmap_head.fmh_keys[0].fmr_physical and the last +byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical. + +IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd +set the inputs as follows: + + fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3}, + fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14}, + +Which would return you whatever is mapped in the 12 bytes starting at +physical offset 3. + +The crash is due to insufficient range validation of keys[1] in +ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of +the filesystem, which means that s_first_data_block is nonzero. +ext4_get_group_no_and_offset subtracts this quantity from the blocknr +argument before cracking it into a group number and a block number +within a group. IOWs, block group 0 spans blocks 1-8192 (1-based) +instead of 0-8191 (0-based) like what happens with larger blocksizes. + +The net result of this encoding is that blocknr < s_first_data_block is +not a valid input to this function. The end_fsb variable is set from +the keys that are copied from userspace, which means that in the above +example, its value is zero. 
That leads to an underflow here: + + blocknr = blocknr - le32_to_cpu(es->s_first_data_block); + +The division then operates on -1: + + offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> + EXT4_SB(sb)->s_cluster_bits; + +Leaving an impossibly large group number (2^32-1) in blocknr. +ext4_getfsmap_check_keys checked that keys[0].fmr_physical and +keys[1].fmr_physical are in increasing order, but +ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least +s_first_data_block. This implies that we have to check it again after +the adjustment, which is the piece that I forgot. + +Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com +Fixes: 4a4956249dac ("ext4: fix off-by-one fsmap error on 1k block filesystems") +Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002 +Cc: stable@vger.kernel.org +Signed-off-by: Darrick J. Wong <djwong@kernel.org> +Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fsmap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c +index 4493ef0c715e..cdf9bfe10137 100644 +--- a/fs/ext4/fsmap.c ++++ b/fs/ext4/fsmap.c +@@ -486,6 +486,8 @@ static int ext4_getfsmap_datadev(struct super_block *sb, + keys[0].fmr_physical = bofs; + if (keys[1].fmr_physical >= eofs) + keys[1].fmr_physical = eofs - 1; ++ if (keys[1].fmr_physical < keys[0].fmr_physical) ++ return 0; + start_fsb = keys[0].fmr_physical; + end_fsb = keys[1].fmr_physical; + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bad-checksum-after-online-resize.patch @@ -0,0 +1,54 @@ +From a408f33e895e455f16cf964cb5cd4979b658db7b Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Thu, 17 Nov 2022 12:03:39 +0800 +Subject: [PATCH] ext4: fix bad checksum after online resize +Git-commit: a408f33e895e455f16cf964cb5cd4979b658db7b +Patch-mainline: 
v6.2-rc1 +References: bsc#1210762 bsc#1208076 + +When online resizing is performed twice consecutively, the error message +"Superblock checksum does not match superblock" is displayed for the +second time. Here's the reproducer: + + mkfs.ext4 -F /dev/sdb 100M + mount /dev/sdb /tmp/test + resize2fs /dev/sdb 5G + resize2fs /dev/sdb 6G + +To solve this issue, we moved the update of the checksum after the +es->s_overhead_clusters is updated. + +Fixes: 026d0d27c488 ("ext4: reduce computation of overhead during resize") +Fixes: de394a86658f ("ext4: update s_overhead_clusters in the superblock during an on-line resize") +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Darrick J. Wong <djwong@kernel.org> +Reviewed-by: Jan Kara <jack@suse.cz> +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20221117040341.1380702-2-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/resize.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/fs/ext4/resize.c ++++ b/fs/ext4/resize.c +@@ -1445,8 +1445,6 @@ static void ext4_update_super(struct sup + * active. 
*/ + ext4_r_blocks_count_set(es, ext4_r_blocks_count(es) + + reserved_blocks); +- ext4_superblock_csum_set(sb); +- unlock_buffer(sbi->s_sbh); + + /* Update the free space counts */ + percpu_counter_add(&sbi->s_freeclusters_counter, +@@ -1474,6 +1472,8 @@ static void ext4_update_super(struct sup + ext4_calculate_overhead(sb); + es->s_overhead_clusters = cpu_to_le32(sbi->s_overhead); + ++ ext4_superblock_csum_set(sb); ++ unlock_buffer(sbi->s_sbh); + if (test_opt(sb, DEBUG)) + printk(KERN_DEBUG "EXT4-fs: added group %u:" + "%llu blocks(%llu free %llu reserved)\n", flex_gd->count, --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-ext4_mb_use_inode_pa.patch @@ -0,0 +1,103 @@ +From a08f789d2ab5242c07e716baf9a835725046be89 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Sat, 28 May 2022 19:00:15 +0800 +Subject: [PATCH] ext4: fix bug_on ext4_mb_use_inode_pa +Git-commit: a08f789d2ab5242c07e716baf9a835725046be89 +Patch-mainline: v5.19-rc3 +References: bsc#1200810 + +Hulk Robot reported a BUG_ON: +================================================================== +kernel BUG at fs/ext4/mballoc.c:3211! +[...] +Rip: 0010:ext4_mb_mark_diskspace_used.cold+0x85/0x136f +[...] +Call Trace: + ext4_mb_new_blocks+0x9df/0x5d30 + ext4_ext_map_blocks+0x1803/0x4d80 + ext4_map_blocks+0x3a4/0x1a10 + ext4_writepages+0x126d/0x2c30 + do_writepages+0x7f/0x1b0 + __filemap_fdatawrite_range+0x285/0x3b0 + file_write_and_wait_range+0xb1/0x140 + ext4_sync_file+0x1aa/0xca0 + vfs_fsync_range+0xfb/0x260 + do_fsync+0x48/0xa0 +[...] 
+================================================================== + +Above issue may happen as follows: + +Acked-by: Jan Kara <jack@suse.cz> + +------------------------------------- +do_fsync + vfs_fsync_range + ext4_sync_file + file_write_and_wait_range + __filemap_fdatawrite_range + do_writepages + ext4_writepages + mpage_map_and_submit_extent + mpage_map_one_extent + ext4_map_blocks + ext4_mb_new_blocks + ext4_mb_normalize_request + >>> start + size <= ac->ac_o_ex.fe_logical + ext4_mb_regular_allocator + ext4_mb_simple_scan_group + ext4_mb_use_best_found + ext4_mb_new_preallocation + ext4_mb_new_inode_pa + ext4_mb_use_inode_pa + >>> set ac->ac_b_ex.fe_len <= 0 + ext4_mb_mark_diskspace_used + >>> BUG_ON(ac->ac_b_ex.fe_len <= 0); + +we can easily reproduce this problem with the following commands: + `fallocate -l100M disk` + `mkfs.ext4 -b 1024 -g 256 disk` + `mount disk /mnt` + `fsstress -d /mnt -l 0 -n 1000 -p 1` + +The size must be smaller than or equal to EXT4_BLOCKS_PER_GROUP. +Therefore, "start + size <= ac->ac_o_ex.fe_logical" may occur +when the size is truncated. So start should be the start position of +the group where ac_o_ex.fe_logical is located after alignment. +In addition, when the value of fe_logical or EXT4_BLOCKS_PER_GROUP +is very large, the value calculated by start_off is more accurate. 
+ +Cc: stable@kernel.org +Fixes: cd648b8a8fd5 ("ext4: trim allocation requests to group size") +Reported-by: Hulk Robot <hulkci@huawei.com> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> +Link: https://lore.kernel.org/r/20220528110017.354175-2-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/mballoc.c | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c +index 9f12f29bc346..4d3740fdff90 100644 +--- a/fs/ext4/mballoc.c ++++ b/fs/ext4/mballoc.c +@@ -4104,6 +4104,15 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac, + size = size >> bsbits; + start = start_off >> bsbits; + ++ /* ++ * For tiny groups (smaller than 8MB) the chosen allocation ++ * alignment may be larger than group size. Make sure the ++ * alignment does not move allocation to a different group which ++ * makes mballoc fail assertions later. ++ */ ++ start = max(start, rounddown(ac->ac_o_ex.fe_logical, ++ (ext4_lblk_t)EXT4_BLOCKS_PER_GROUP(ac->ac_sb))); ++ + /* don't cover already allocated blocks in selected range */ + if (ar->pleft && start <= ar->lleft) { + size -= ar->lleft + 1 - start; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-__es_tree_search-caused-by-bad-bo.patch @@ -0,0 +1,102 @@ +From 991ed014de0840c5dc405b679168924afb2952ac Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 26 Oct 2022 12:23:10 +0800 +Subject: [PATCH] ext4: fix bug_on in __es_tree_search caused by bad boot + loader inode +Git-commit: 991ed014de0840c5dc405b679168924afb2952ac +Patch-mainline: v6.2-rc1 +References: bsc#1207620 + +We got an issue as follows: +================================================================== + kernel BUG at fs/ext4/extents_status.c:203! 
+ invalid opcode: 0000 [#1] PREEMPT SMP + CPU: 1 PID: 945 Comm: cat Not tainted 6.0.0-next-20221007-dirty #349 + RIP: 0010:ext4_es_end.isra.0+0x34/0x42 + RSP: 0018:ffffc9000143b768 EFLAGS: 00010203 + RAX: 0000000000000000 RBX: ffff8881769cd0b8 RCX: 0000000000000000 + RDX: 0000000000000000 RSI: ffffffff8fc27cf7 RDI: 00000000ffffffff + RBP: ffff8881769cd0bc R08: 0000000000000000 R09: ffffc9000143b5f8 + R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881769cd0a0 + R13: ffff8881768e5668 R14: 00000000768e52f0 R15: 0000000000000000 + FS: 00007f359f7f05c0(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 + CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 + CR2: 00007f359f5a2000 CR3: 000000017130c000 CR4: 00000000000006e0 + DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 + DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 + Call Trace: + <TASK> + __es_tree_search.isra.0+0x6d/0xf5 + ext4_es_cache_extent+0xfa/0x230 + ext4_cache_extents+0xd2/0x110 + ext4_find_extent+0x5d5/0x8c0 + ext4_ext_map_blocks+0x9c/0x1d30 + ext4_map_blocks+0x431/0xa50 + ext4_mpage_readpages+0x48e/0xe40 + ext4_readahead+0x47/0x50 + read_pages+0x82/0x530 + page_cache_ra_unbounded+0x199/0x2a0 + do_page_cache_ra+0x47/0x70 + page_cache_ra_order+0x242/0x400 + ondemand_readahead+0x1e8/0x4b0 + page_cache_sync_ra+0xf4/0x110 + filemap_get_pages+0x131/0xb20 + filemap_read+0xda/0x4b0 + generic_file_read_iter+0x13a/0x250 + ext4_file_read_iter+0x59/0x1d0 + vfs_read+0x28f/0x460 + ksys_read+0x73/0x160 + __x64_sys_read+0x1e/0x30 + do_syscall_64+0x35/0x80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + </TASK> +================================================================== + +In the above issue, ioctl invokes the swap_inode_boot_loader function to +swap inode<5> and inode<12>. However, inode<5> contains an incorrect imode and +disordered extents, and i_nlink is set to 1. The extents check for inode in +the ext4_iget function can be bypassed because 5 is EXT4_BOOT_LOADER_INO. 
+While links_count is set to 1, the extents are not initialized in +swap_inode_boot_loader. After the ioctl command is executed successfully, +the extents are swapped to inode<12>, in this case, run the `cat` command +to view inode<12>. And BUG_ON is triggered due to the incorrect extents. + +When the boot loader inode is not initialized, its imode can be one of the +following: +1) the imode is a bad type, which is marked as bad_inode in ext4_iget and + set to S_IFREG. +2) the imode is good type but not S_IFREG. +3) the imode is S_IFREG. + +The BUG_ON may be triggered by bypassing the check in cases 1 and 2. +Therefore, when the boot loader inode is bad_inode or its imode is not +S_IFREG, initialize the inode to avoid triggering the BUG. + +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jason Yan <yanaijie@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221026042310.3839669-5-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ioctl.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c +index 9ed7b9fe2132..e5f60057db5b 100644 +--- a/fs/ext4/ioctl.c ++++ b/fs/ext4/ioctl.c +@@ -425,7 +425,7 @@ static long swap_inode_boot_loader(struct super_block *sb, + /* Protect extent tree against block allocations via delalloc */ + ext4_double_down_write_data_sem(inode, inode_bl); + +- if (inode_bl->i_nlink == 0) { ++ if (is_bad_inode(inode_bl) || !S_ISREG(inode_bl->i_mode)) { + /* this inode has never been used as a BOOT_LOADER */ + set_nlink(inode_bl, 1); + i_uid_write(inode_bl, 0); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-__es_tree_search.patch @@ -0,0 +1,144 @@ +From d36f6ed761b53933b0b4126486c10d3da7751e7f Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 18 May 2022 20:08:16 +0800 +Subject: [PATCH] 
ext4: fix bug_on in __es_tree_search +Git-commit: d36f6ed761b53933b0b4126486c10d3da7751e7f +Patch-mainline: v5.19-rc1 +References: bsc#1200809 + +Hulk Robot reported a BUG_ON: +================================================================== +kernel BUG at fs/ext4/extents_status.c:199! +[...] +Rip: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline] +Rip: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217 +[...] +Call Trace: + ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766 + ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561 + ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964 + ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384 + ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567 + ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980 + ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031 + ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257 + v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63 + v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82 + vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368 + dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490 + ext4_quota_enable fs/ext4/super.c:6137 [inline] + ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163 + ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754 + mount_bdev+0x2e9/0x3b0 fs/super.c:1158 + mount_fs+0x4b/0x1e4 fs/super.c:1261 +[...] 
+================================================================== + +Above issue may happen as follows: + +Acked-by: Jan Kara <jack@suse.cz> + +------------------------------------- +ext4_fill_super + ext4_enable_quotas + ext4_quota_enable + ext4_iget + __ext4_iget + ext4_ext_check_inode + ext4_ext_check + __ext4_ext_check + ext4_valid_extent_entries + Check for overlapping extents doesn't take effect + dquot_enable + vfs_load_quota_inode + v2_check_quota_file + v2_read_header + ext4_quota_read + ext4_bread + ext4_getblk + ext4_map_blocks + ext4_ext_map_blocks + ext4_find_extent + ext4_cache_extents + ext4_es_cache_extent + ext4_es_cache_extent + __es_tree_search + ext4_es_end + BUG_ON(es->es_lblk + es->es_len < es->es_lblk) + +The erroneous ext4 extents are as follows: +0af3 0300 0400 0000 00000000 extent_header +00000000 0100 0000 12000000 extent1 +00000000 0100 0000 18000000 extent2 +02000000 0400 0000 14000000 extent3 + +In the ext4_valid_extent_entries function, +if prev is 0, no error is returned even if lblock<=prev. +This was intended to skip the check on the first extent, but +in the error image above, prev=0+1-1=0 when checking the second extent, +so even though lblock<=prev, the function does not return an error. +As a result, BUG_ON occurs in __es_tree_search and the system panics. + +To solve this problem, we only need to check that: +1. The lblock of the first extent is not less than 0. +2. The lblock of the next extent is not less than + the next block of the previous extent. +The same applies to extent_idx. 
+ +Cc: stable@kernel.org +Fixes: 5946d089379a ("ext4: check for overlapping extents in ext4_valid_extent_entries()") +Reported-by: Hulk Robot <hulkci@huawei.com> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/extents.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index 474479ce76e0..c148bb97b527 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -372,7 +372,7 @@ static int ext4_valid_extent_entries(struct inode *inode, + { + unsigned short entries; + ext4_lblk_t lblock = 0; +- ext4_lblk_t prev = 0; ++ ext4_lblk_t cur = 0; + + if (eh->eh_entries == 0) + return 1; +@@ -396,11 +396,11 @@ static int ext4_valid_extent_entries(struct inode *inode, + + /* Check for overlapping extents */ + lblock = le32_to_cpu(ext->ee_block); +- if ((lblock <= prev) && prev) { ++ if (lblock < cur) { + *pblk = ext4_ext_pblock(ext); + return 0; + } +- prev = lblock + ext4_ext_get_actual_len(ext) - 1; ++ cur = lblock + ext4_ext_get_actual_len(ext); + ext++; + entries--; + } +@@ -420,13 +420,13 @@ static int ext4_valid_extent_entries(struct inode *inode, + + /* Check for overlapping index extents */ + lblock = le32_to_cpu(ext_idx->ei_block); +- if ((lblock <= prev) && prev) { ++ if (lblock < cur) { + *pblk = ext4_idx_pblock(ext_idx); + return 0; + } + ext_idx++; + entries--; +- prev = lblock; ++ cur = lblock + 1; + } + } + return 1; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-bug_on-in-ext4_writepages.patch @@ -0,0 +1,113 @@ +From ef09ed5d37b84d18562b30cf7253e57062d0db05 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Mon, 16 May 2022 20:26:34 +0800 +Subject: [PATCH] ext4: fix bug_on in ext4_writepages +Git-commit: ef09ed5d37b84d18562b30cf7253e57062d0db05 
+Patch-mainline: v5.19-rc1 +References: bsc#1200872 + +we got issue as follows: +EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls + +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +kernel BUG at fs/ext4/inode.c:2708! +invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI +CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155 +RIP: 0010:ext4_writepages+0x1977/0x1c10 +RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246 +RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000 +RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002 +RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000 +R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001 +R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028 +FS: 00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + <TASK> + do_writepages+0x130/0x3a0 + filemap_fdatawrite_wbc+0x83/0xa0 + filemap_flush+0xab/0xe0 + ext4_alloc_da_blocks+0x51/0x120 + __ext4_ioctl+0x1534/0x3210 + __x64_sys_ioctl+0x12c/0x170 + do_syscall_64+0x3b/0x90 + +It may happen as follows: +1. write inline_data inode +vfs_write + new_sync_write + ext4_file_write_iter + ext4_buffered_write_iter + generic_perform_write + ext4_da_write_begin + ext4_da_write_inline_data_begin -> If inline data size too + small will allocate block to write, then mapping will has + dirty page + ext4_da_convert_inline_data_to_extent ->clear EXT4_STATE_MAY_INLINE_DATA +2. 
fallocate +do_vfs_ioctl + ioctl_preallocate + vfs_fallocate + ext4_fallocate + ext4_convert_inline_data + ext4_convert_inline_data_nolock + ext4_map_blocks -> fail will goto restore data + ext4_restore_inline_data + ext4_create_inline_data + ext4_write_inline_data + ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA +3. writepages +__ext4_ioctl + ext4_alloc_da_blocks + filemap_flush + filemap_fdatawrite_wbc + do_writepages + ext4_writepages + if (ext4_has_inline_data(inode)) + BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) + +The root cause of this issue is that we do not destroy inline data until +ext4_writepages is called under delayed allocation mode. But the data may +already have been converted from inline to extent. To solve this issue, we call +filemap_flush first. + +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/inline.c | 12 ++++++++++++ + 1 file changed, 12 insertions(+) + +diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c +index 13a63e3639b1..513762c087a9 100644 +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -2005,6 +2005,18 @@ int ext4_convert_inline_data(struct inode *inode) + if (!ext4_has_inline_data(inode)) { + ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); + return 0; ++ } else if (!ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) { ++ /* ++ * Inode has inline data but EXT4_STATE_MAY_INLINE_DATA is ++ * cleared. This means we are in the middle of moving of ++ * inline data to delay allocated block. Just force writeout ++ * here to finish conversion. 
++ */ ++ error = filemap_flush(inode->i_mapping); ++ if (error) ++ return error; ++ if (!ext4_has_inline_data(inode)) ++ return 0; + } + + needed_blocks = ext4_writepage_trans_blocks(inode); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-cgroup-writeback-accounting-with-fs-layer-e.patch @@ -0,0 +1,71 @@ +From ffec85d53d0f39ee4680a2cf0795255e000e1feb Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Thu, 2 Feb 2023 16:55:03 -0800 +Subject: [PATCH] ext4: fix cgroup writeback accounting with fs-layer + encryption +Git-commit: ffec85d53d0f39ee4680a2cf0795255e000e1feb +Patch-mainline: v6.3-rc2 +References: bsc#1210765 + +When writing a page from an encrypted file that is using +filesystem-layer encryption (not inline encryption), ext4 encrypts the +pagecache page into a bounce page, then writes the bounce page. + +It also passes the bounce page to wbc_account_cgroup_owner(). That's +incorrect, because the bounce page is a newly allocated temporary page +that doesn't have the memory cgroup of the original pagecache page. +This makes wbc_account_cgroup_owner() not account the I/O to the owner +of the pagecache page as it should. + +Fix this by always passing the pagecache page to +wbc_account_cgroup_owner(). 
+ +Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support") +Cc: stable@vger.kernel.org +Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org> +Signed-off-by: Eric Biggers <ebiggers@google.com> +Acked-by: Tejun Heo <tj@kernel.org> +Link: https://lore.kernel.org/r/20230203005503.141557-1-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/page-io.c | 11 ++++++----- + 1 file changed, 6 insertions(+), 5 deletions(-) + +--- a/fs/ext4/page-io.c ++++ b/fs/ext4/page-io.c +@@ -413,7 +413,8 @@ static void io_submit_init_bio(struct ex + + static void io_submit_add_bh(struct ext4_io_submit *io, + struct inode *inode, +- struct page *page, ++ struct page *pagecache_page, ++ struct page *bounce_page, + struct buffer_head *bh) + { + int ret; +@@ -427,10 +428,11 @@ submit_and_retry: + io_submit_init_bio(io, bh); + io->io_bio->bi_write_hint = inode->i_write_hint; + } +- ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh)); ++ ret = bio_add_page(io->io_bio, bounce_page ?: pagecache_page, ++ bh->b_size, bh_offset(bh)); + if (ret != bh->b_size) + goto submit_and_retry; +- wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size); ++ wbc_account_cgroup_owner(io->io_wbc, pagecache_page, bh->b_size); + io->io_next_block++; + } + +@@ -548,8 +550,7 @@ int ext4_bio_write_page(struct ext4_io_s + do { + if (!buffer_async_write(bh)) + continue; +- io_submit_add_bh(io, inode, +- bounce_page ? 
bounce_page : page, bh); ++ io_submit_add_bh(io, inode, page, bounce_page, bh); + nr_submitted++; + clear_buffer_dirty(bh); + } while ((bh = bh->b_this_page) != head); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-corruption-when-online-resizing-a-1K-bigall.patch @@ -0,0 +1,58 @@ +From 0aeaa2559d6d53358fca3e3fce73807367adca74 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Thu, 17 Nov 2022 12:03:41 +0800 +Subject: [PATCH] ext4: fix corruption when online resizing a 1K bigalloc fs +Git-commit: 0aeaa2559d6d53358fca3e3fce73807367adca74 +Patch-mainline: v6.2-rc1 +References: bsc#1206891 + +When a backup superblock is updated in update_backups(), the primary +superblock's offset in the group (that is, sbi->s_sbh->b_blocknr) is used +as the backup superblock's offset in its group. However, when the block +size is 1K and bigalloc is enabled, the two offsets are not equal. This +causes the backup group descriptors to be overwritten by the superblock +in update_backups(). Moreover, if meta_bg is enabled, the file system will +be corrupted because this feature uses backup group descriptors. + +To solve this issue, we use a more accurate ext4_group_first_block_no() as +the offset of the backup superblock in its group. 
+ +Fixes: d77147ff443b ("ext4: add support for online resizing with bigalloc") +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20221117040341.1380702-4-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/resize.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c +index d460440d6dee..6b91443d6bf3 100644 +--- a/fs/ext4/resize.c ++++ b/fs/ext4/resize.c +@@ -1604,8 +1604,8 @@ static int ext4_flex_group_add(struct super_block *sb, + int meta_bg = ext4_has_feature_meta_bg(sb); + sector_t old_gdb = 0; + +- update_backups(sb, sbi->s_sbh->b_blocknr, (char *)es, +- sizeof(struct ext4_super_block), 0); ++ update_backups(sb, ext4_group_first_block_no(sb, 0), ++ (char *)es, sizeof(struct ext4_super_block), 0); + for (; gdb_num <= gdb_num_end; gdb_num++) { + struct buffer_head *gdb_bh; + +@@ -1816,7 +1816,7 @@ static int ext4_group_extend_no_check(struct super_block *sb, + if (test_opt(sb, DEBUG)) + printk(KERN_DEBUG "EXT4-fs: extended group to %llu " + "blocks\n", ext4_blocks_count(es)); +- update_backups(sb, EXT4_SB(sb)->s_sbh->b_blocknr, ++ update_backups(sb, ext4_group_first_block_no(sb, 0), + (char *)es, sizeof(struct ext4_super_block), 0); + } + return err; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-data-races-when-using-cached-status-extents.patch @@ -0,0 +1,86 @@ +From 492888df0c7b42fc0843631168b0021bc4caee84 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Thu, 4 May 2023 14:55:24 +0200 +Subject: [PATCH] ext4: fix data races when using cached status extents +Git-commit: 492888df0c7b42fc0843631168b0021bc4caee84 +Patch-mainline: v6.4-rc2 +References: bsc#1213102 + +When using cached extent stored in extent status tree in tree->cache_es +another process holding ei->i_es_lock for reading can be 
racing with us +setting new value of tree->cache_es. If the compiler would decide to +refetch tree->cache_es at an unfortunate moment, it could result in a +bogus in_range() check. Fix the possible race by using READ_ONCE() when +using tree->cache_es only under ei->i_es_lock for reading. + +Cc: stable@kernel.org +Reported-by: syzbot+4a03518df1e31b537066@syzkaller.appspotmail.com +Link: https://lore.kernel.org/all/000000000000d3b33905fa0fd4a6@google.com +Suggested-by: Dmitry Vyukov <dvyukov@google.com> +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230504125524.10802-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents_status.c | 30 +++++++++++++----------------- + 1 file changed, 13 insertions(+), 17 deletions(-) + +diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c +index 7bc221038c6c..595abb9e7d74 100644 +--- a/fs/ext4/extents_status.c ++++ b/fs/ext4/extents_status.c +@@ -267,14 +267,12 @@ static void __es_find_extent_range(struct inode *inode, + + /* see if the extent has been cached */ + es->es_lblk = es->es_len = es->es_pblk = 0; +- if (tree->cache_es) { +- es1 = tree->cache_es; +- if (in_range(lblk, es1->es_lblk, es1->es_len)) { +- es_debug("%u cached by [%u/%u) %llu %x\n", +- lblk, es1->es_lblk, es1->es_len, +- ext4_es_pblock(es1), ext4_es_status(es1)); +- goto out; +- } ++ es1 = READ_ONCE(tree->cache_es); ++ if (es1 && in_range(lblk, es1->es_lblk, es1->es_len)) { ++ es_debug("%u cached by [%u/%u) %llu %x\n", ++ lblk, es1->es_lblk, es1->es_len, ++ ext4_es_pblock(es1), ext4_es_status(es1)); ++ goto out; + } + + es1 = __es_tree_search(&tree->root, lblk); +@@ -293,7 +291,7 @@ static void __es_find_extent_range(struct inode *inode, + } + + if (es1 && matching_fn(es1)) { +- tree->cache_es = es1; ++ WRITE_ONCE(tree->cache_es, es1); + es->es_lblk = es1->es_lblk; + es->es_len = es1->es_len; + es->es_pblk = es1->es_pblk; +@@ -931,14 +929,12 @@ int 
ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk, + + /* find extent in cache firstly */ + es->es_lblk = es->es_len = es->es_pblk = 0; +- if (tree->cache_es) { +- es1 = tree->cache_es; +- if (in_range(lblk, es1->es_lblk, es1->es_len)) { +- es_debug("%u cached by [%u/%u)\n", +- lblk, es1->es_lblk, es1->es_len); +- found = 1; +- goto out; +- } ++ es1 = READ_ONCE(tree->cache_es); ++ if (es1 && in_range(lblk, es1->es_lblk, es1->es_len)) { ++ es_debug("%u cached by [%u/%u)\n", ++ lblk, es1->es_lblk, es1->es_len); ++ found = 1; ++ goto out; + } + + node = tree->root.rb_node; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-deadlock-due-to-mbcache-entry-corruption.patch @@ -0,0 +1,92 @@ +From a44e84a9b7764c72896f7241a0ec9ac7e7ef38dd Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 23 Nov 2022 20:39:50 +0100 +Subject: [PATCH] ext4: fix deadlock due to mbcache entry corruption +Git-commit: a44e84a9b7764c72896f7241a0ec9ac7e7ef38dd +Patch-mainline: v6.2-rc1 +References: bsc#1207653 + +When manipulating xattr blocks, we can deadlock infinitely looping +inside ext4_xattr_block_set() where we constantly keep finding xattr +block for reuse in mbcache but we are unable to reuse it because its +reference count is too big. This happens because cache entry for the +xattr block is marked as reusable (e_reusable set) although its +reference count is too big. When this inconsistency happens, this +inconsistent state is kept indefinitely and so ext4_xattr_block_set() +keeps retrying indefinitely. + +The inconsistent state is caused by non-atomic update of e_reusable bit. +e_reusable is part of a bitfield and e_reusable update can race with +update of e_referenced bit in the same bitfield resulting in loss of one +of the updates. Fix the problem by using atomic bitops instead. 
+ +This bug has been around for many years, but it became *much* easier +to hit after commit 65f8b80053a1 ("ext4: fix race when reusing xattr +blocks"). + +Cc: stable@vger.kernel.org +Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries") +Fixes: 65f8b80053a1 ("ext4: fix race when reusing xattr blocks") +Reported-and-tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> +Reported-by: Thilo Fromm <t-lo@linux.microsoft.com> +Link: https://lore.kernel.org/r/c77bf00f-4618-7149-56f1-b8d1664b9d07@linux.microsoft.com/ +Signed-off-by: Jan Kara <jack@suse.cz> +Reviewed-by: Andreas Dilger <adilger@dilger.ca> +Link: https://lore.kernel.org/r/20221123193950.16758-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 27 ++++++++++++++++++++++++--- + 1 file changed, 24 insertions(+), 3 deletions(-) + +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1209,6 +1209,24 @@ ext4_xattr_inode_dec_ref_all(handle_t *h + } + } + ++/* Cache entry flags */ ++enum { ++ MBE_REFERENCED_B = 0, ++ MBE_REUSABLE_B ++}; ++struct mb_cache_entry_alt { ++ /* List of entries in cache - protected by cache->c_list_lock */ ++ struct list_head e_list; ++ /* Hash table list - protected by hash chain bitlock */ ++ struct hlist_bl_node e_hash_list; ++ atomic_t e_refcnt; ++ /* Key in hash - stable during lifetime of the entry */ ++ u32 e_key; ++ unsigned long e_flags; ++ /* User provided value - stable during lifetime of the entry */ ++ u64 e_value; ++}; ++ + /* + * Release the xattr block BH: If the reference count is > 1, decrement it; + * otherwise free the block. 
+@@ -1264,7 +1282,8 @@ ext4_xattr_release_block(handle_t *handl + ce = mb_cache_entry_get(ea_block_cache, hash, + bh->b_blocknr); + if (ce) { +- ce->e_reusable = 1; ++ struct mb_cache_entry_alt *cea = (void*)ce; ++ set_bit(MBE_REUSABLE_B, &cea->e_flags); + mb_cache_entry_put(ea_block_cache, ce); + } + } +@@ -2019,8 +2038,10 @@ inserted: + } + ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1; + BHDR(new_bh)->h_refcount = cpu_to_le32(ref); +- if (ref >= EXT4_XATTR_REFCOUNT_MAX) +- ce->e_reusable = 0; ++ if (ref >= EXT4_XATTR_REFCOUNT_MAX) { ++ struct mb_cache_entry_alt *cea = (void*)ce; ++ clear_bit(MBE_REUSABLE_B, &cea->e_flags); ++ } + ea_bdebug(new_bh, "reusing; refcount now=%d", + ref); + ext4_xattr_block_csum_set(inode, new_bh); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-deadlock-when-converting-an-inline-director.patch @@ -0,0 +1,69 @@ +From f4ce24f54d9cca4f09a395f3eecce20d6bec4663 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Sat, 6 May 2023 21:04:01 -0400 +Subject: [PATCH] ext4: fix deadlock when converting an inline directory in + nojournal mode +Git-commit: f4ce24f54d9cca4f09a395f3eecce20d6bec4663 +Patch-mainline: v6.4-rc2 +References: bsc#1213105 + +In no journal mode, ext4_finish_convert_inline_dir() can self-deadlock +by calling ext4_handle_dirty_dirblock() when it already has taken the +directory lock. There is a similar self-deadlock in +ext4_incvert_inline_data_nolock() for data files which we'll fix at +the same time. + +A simple reproducer demonstrating the problem: + + mke2fs -Fq -t ext2 -O inline_data -b 4k /dev/vdc 64 + mount -t ext4 -o dirsync /dev/vdc /vdc + cd /vdc + mkdir file0 + cd file0 + touch file0 + touch file1 + attr -s BurnSpaceInEA -V abcde . 
+ touch supercalifragilisticexpialidocious + +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20230507021608.1290720-1-tytso@mit.edu +Reported-by: syzbot+91dccab7c64e2850a4e5@syzkaller.appspotmail.com +Link: https://syzkaller.appspot.com/bug?id=ba84cc80a9491d65416bc7877e1650c87530fe8a +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inline.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c +index 859bc4e2c9b0..d3dfc51a43c5 100644 +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -1175,6 +1175,7 @@ static int ext4_finish_convert_inline_dir(handle_t *handle, + ext4_initialize_dirent_tail(dir_block, + inode->i_sb->s_blocksize); + set_buffer_uptodate(dir_block); ++ unlock_buffer(dir_block); + err = ext4_handle_dirty_dirblock(handle, inode, dir_block); + if (err) + return err; +@@ -1249,6 +1250,7 @@ static int ext4_convert_inline_data_nolock(handle_t *handle, + if (!S_ISDIR(inode->i_mode)) { + memcpy(data_bh->b_data, buf, inline_size); + set_buffer_uptodate(data_bh); ++ unlock_buffer(data_bh); + error = ext4_handle_dirty_metadata(handle, + inode, data_bh); + } else { +@@ -1256,7 +1258,6 @@ static int ext4_convert_inline_data_nolock(handle_t *handle, + buf, inline_size); + } + +- unlock_buffer(data_bh); + out_restore: + if (error) + ext4_restore_inline_data(handle, inode, iloc, buf, inline_size); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-delayed-allocation-bug-in-ext4_clu_mapped-f.patch @@ -0,0 +1,63 @@ +From 131294c35ed6f777bd4e79d42af13b5c41bf2775 Mon Sep 17 00:00:00 2001 +From: Eric Whitney <enwlinux@gmail.com> +Date: Thu, 17 Nov 2022 10:22:07 -0500 +Subject: [PATCH] ext4: fix delayed allocation bug in ext4_clu_mapped for + bigalloc + inline +Git-commit: 131294c35ed6f777bd4e79d42af13b5c41bf2775 +Patch-mainline: v6.2-rc1 +References: bsc#1207631 + +When converting files with inline data to extents, delayed 
allocations +made on a file system created with both the bigalloc and inline options +can result in invalid extent status cache content, incorrect reserved +cluster counts, kernel memory leaks, and potential kernel panics. + +With bigalloc, the code that determines whether a block must be +delayed allocated searches the extent tree to see if that block maps +to a previously allocated cluster. If not, the block is delayed +allocated, and otherwise, it isn't. However, if the inline option is +also used, and if the file containing the block is marked as able to +store data inline, there isn't a valid extent tree associated with +the file. The current code in ext4_clu_mapped() calls +ext4_find_extent() to search the non-existent tree for a previously +allocated cluster anyway, which typically finds nothing, as desired. +However, a side effect of the search can be to cache invalid content +from the non-existent tree (garbage) in the extent status tree, +including bogus entries in the pending reservation tree. + +To fix this, avoid searching the extent tree when allocating blocks +for bigalloc + inline files that are being converted from inline to +extent mapped. 
+ +Signed-off-by: Eric Whitney <enwlinux@gmail.com> +Link: https://lore.kernel.org/r/20221117152207.2424-1-enwlinux@gmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index ec008278d970..9de1c9d1a13d 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -5797,6 +5797,14 @@ int ext4_clu_mapped(struct inode *inode, ext4_lblk_t lclu) + struct ext4_extent *extent; + ext4_lblk_t first_lblk, first_lclu, last_lclu; + ++ /* ++ * if data can be stored inline, the logical cluster isn't ++ * mapped - no physical clusters have been allocated, and the ++ * file has no extents ++ */ ++ if (ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) ++ return 0; ++ + /* search for the extent closest to the first block in the cluster */ + path = ext4_find_extent(inode, EXT4_C2B(sbi, lclu), NULL, 0); + if (IS_ERR(path)) { +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-dir-corruption-when-ext4_dx_add_entry-fails.patch @@ -0,0 +1,100 @@ +From 7177dd009c7c04290891e9a534cd47d1b620bd04 Mon Sep 17 00:00:00 2001 +From: Zhihao Cheng <chengzhihao1@huawei.com> +Date: Sun, 11 Sep 2022 12:52:04 +0800 +Subject: [PATCH] ext4: fix dir corruption when ext4_dx_add_entry() fails +Git-commit: 7177dd009c7c04290891e9a534cd47d1b620bd04 +Patch-mainline: v6.1-rc1 +References: bsc#1207608 + +Following process may lead to fs corruption: +1. ext4_create(dir/foo) + ext4_add_nondir + ext4_add_entry + ext4_dx_add_entry + a. add_dirent_to_buf + ext4_mark_inode_dirty + ext4_handle_dirty_metadata // dir inode bh is recorded into journal + b. 
ext4_append // dx_get_count(entries) == dx_get_limit(entries) + ext4_bread(EXT4_GET_BLOCKS_CREATE) + ext4_getblk + ext4_map_blocks + ext4_ext_map_blocks + ext4_mb_new_blocks + dquot_alloc_block + dquot_alloc_space_nodirty + inode_add_bytes // update dir's i_blocks + ext4_ext_insert_extent + ext4_ext_dirty // record extent bh into journal + ext4_handle_dirty_metadata(bh) + // record new block into journal + inode->i_size += inode->i_sb->s_blocksize // new size(in mem) + c. ext4_handle_dirty_dx_node(bh2) + // record dir's new block(dx_node) into journal + d. ext4_handle_dirty_dx_node((frame - 1)->bh) + e. ext4_handle_dirty_dx_node(frame->bh) + f. do_split // ret err! + g. add_dirent_to_buf + ext4_mark_inode_dirty(dir) // update raw_inode on disk(skipped) +2. fsck -a /dev/sdb + drop last block(dx_node) which beyonds dir's i_size. + /dev/sdb: recovering journal + /dev/sdb contains a file system with errors, check forced. + /dev/sdb: Inode 12, end of extent exceeds allowed value + (logical block 128, physical block 3938, len 1) +3. fsck -fn /dev/sdb + dx_node->entry[i].blk > dir->i_size + Pass 2: Checking directory structure + Problem in HTREE directory inode 12 (/dir): bad block number 128. + Clear HTree index? no + Problem in HTREE directory inode 12: block #3 has invalid depth (2) + Problem in HTREE directory inode 12: block #3 has bad max hash + Problem in HTREE directory inode 12: block #3 not referenced + +Fix it by marking inode dirty directly inside ext4_append(). +Fetch a reproducer in [Link]. 
+ +Link: https://bugzilla.kernel.org/show_bug.cgi?id=216466 +Cc: stable@vger.kernel.org +Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220911045204.516460-1-chengzhihao1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 15 ++++++++++----- + 1 file changed, 10 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index bc2e0612ec32..4183a4cb4a21 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -85,15 +85,20 @@ static struct buffer_head *ext4_append(handle_t *handle, + return bh; + inode->i_size += inode->i_sb->s_blocksize; + EXT4_I(inode)->i_disksize = inode->i_size; ++ err = ext4_mark_inode_dirty(handle, inode); ++ if (err) ++ goto out; + BUFFER_TRACE(bh, "get_write_access"); + err = ext4_journal_get_write_access(handle, inode->i_sb, bh, + EXT4_JTR_NONE); +- if (err) { +- brelse(bh); +- ext4_std_error(inode->i_sb, err); +- return ERR_PTR(err); +- } ++ if (err) ++ goto out; + return bh; ++ ++out: ++ brelse(bh); ++ ext4_std_error(inode->i_sb, err); ++ return ERR_PTR(err); + } + + static int ext4_dx_csum_verify(struct inode *inode, +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-code-return-to-user-space-in-ext4_get.patch @@ -0,0 +1,57 @@ +From 26d75a16af285a70863ba6a81f85d81e7e65da50 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Lu=C3=ADs=20Henriques?= <lhenriques@suse.de> +Date: Wed, 9 Nov 2022 18:14:45 +0000 +Subject: [PATCH] ext4: fix error code return to user-space in + ext4_get_branch() +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit +Git-commit: 26d75a16af285a70863ba6a81f85d81e7e65da50 +Patch-mainline: v6.2-rc1 +References: bsc#1207630 + +If a block is out of range in ext4_get_branch(), -ENOMEM will be returned +to user-space. Obviously, this error code isn't really useful. 
This +patch fixes it by making sure the right error code (-EFSCORRUPTED) is +propagated to user-space. EUCLEAN is more informative than ENOMEM. + +Signed-off-by: Luís Henriques <lhenriques@suse.de> +Link: https://lore.kernel.org/r/20221109181445.17843-1-lhenriques@suse.de +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/indirect.c | 9 ++++++++- + 1 file changed, 8 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c +index 860fc5119009..c68bebe7ff4b 100644 +--- a/fs/ext4/indirect.c ++++ b/fs/ext4/indirect.c +@@ -148,6 +148,7 @@ static Indirect *ext4_get_branch(struct inode *inode, int depth, + struct super_block *sb = inode->i_sb; + Indirect *p = chain; + struct buffer_head *bh; ++ unsigned int key; + int ret = -EIO; + + *err = 0; +@@ -156,7 +157,13 @@ static Indirect *ext4_get_branch(struct inode *inode, int depth, + if (!p->key) + goto no_block; + while (--depth) { +- bh = sb_getblk(sb, le32_to_cpu(p->key)); ++ key = le32_to_cpu(p->key); ++ if (key > ext4_blocks_count(EXT4_SB(sb)->s_es)) { ++ /* the block was out of range */ ++ ret = -EFSCORRUPTED; ++ goto failure; ++ } ++ bh = sb_getblk(sb, key); + if (unlikely(!bh)) { + ret = -ENOMEM; + goto failure; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_fc_record_modified_i.patch @@ -0,0 +1,187 @@ +From cdce59a1549190b66f8e3fe465c2b2f714b98a94 Mon Sep 17 00:00:00 2001 +From: Ritesh Harjani <riteshh@linux.ibm.com> +Date: Mon, 17 Jan 2022 17:41:49 +0530 +Subject: [PATCH] ext4: fix error handling in ext4_fc_record_modified_inode() +Git-commit: cdce59a1549190b66f8e3fe465c2b2f714b98a94 +Patch-mainline: v5.17-rc3 +References: bsc#1202767 + +Current code does not fully takes care of krealloc() error case, which +could lead to silent memory corruption or a kernel bug. This patch +fixes that.
+ +Also it cleans up some duplicated error handling logic from various +functions in fast_commit.c file. + +Reported-by: luo penghao <luo.penghao@zte.com.cn> +Suggested-by: Lukas Czerner <lczerner@redhat.com> +Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/62e8b6a1cce9359682051deb736a3c0953c9d1e9.1642416995.git.riteshh@linux.ibm.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 64 ++++++++++++++++++++----------------------- + 1 file changed, 29 insertions(+), 35 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index ccd2b216d6ba..5934c23e153e 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1410,14 +1410,15 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino) + if (state->fc_modified_inodes[i] == ino) + return 0; + if (state->fc_modified_inodes_used == state->fc_modified_inodes_size) { +- state->fc_modified_inodes_size += +- EXT4_FC_REPLAY_REALLOC_INCREMENT; + state->fc_modified_inodes = krealloc( +- state->fc_modified_inodes, sizeof(int) * +- state->fc_modified_inodes_size, +- GFP_KERNEL); ++ state->fc_modified_inodes, ++ sizeof(int) * (state->fc_modified_inodes_size + ++ EXT4_FC_REPLAY_REALLOC_INCREMENT), ++ GFP_KERNEL); + if (!state->fc_modified_inodes) + return -ENOMEM; ++ state->fc_modified_inodes_size += ++ EXT4_FC_REPLAY_REALLOC_INCREMENT; + } + state->fc_modified_inodes[state->fc_modified_inodes_used++] = ino; + return 0; +@@ -1449,7 +1450,9 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl, + } + inode = NULL; + +- ext4_fc_record_modified_inode(sb, ino); ++ ret = ext4_fc_record_modified_inode(sb, ino); ++ if (ret) ++ goto out; + + raw_fc_inode = (struct ext4_inode *) + (val + offsetof(struct ext4_fc_inode, fc_raw_inode)); +@@ -1649,6 +1652,8 @@ static int ext4_fc_replay_add_range(struct 
super_block *sb, + } + + ret = ext4_fc_record_modified_inode(sb, inode->i_ino); ++ if (ret) ++ goto out; + + start = le32_to_cpu(ex->ee_block); + start_pblk = ext4_ext_pblock(ex); +@@ -1666,18 +1671,14 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + map.m_pblk = 0; + ret = ext4_map_blocks(NULL, inode, &map, 0); + +- if (ret < 0) { +- iput(inode); +- return 0; +- } ++ if (ret < 0) ++ goto out; + + if (ret == 0) { + /* Range is not mapped */ + path = ext4_find_extent(inode, cur, NULL, 0); +- if (IS_ERR(path)) { +- iput(inode); +- return 0; +- } ++ if (IS_ERR(path)) ++ goto out; + memset(&newex, 0, sizeof(newex)); + newex.ee_block = cpu_to_le32(cur); + ext4_ext_store_pblock( +@@ -1691,10 +1692,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + up_write((&EXT4_I(inode)->i_data_sem)); + ext4_ext_drop_refs(path); + kfree(path); +- if (ret) { +- iput(inode); +- return 0; +- } ++ if (ret) ++ goto out; + goto next; + } + +@@ -1707,10 +1706,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + ret = ext4_ext_replay_update_ex(inode, cur, map.m_len, + ext4_ext_is_unwritten(ex), + start_pblk + cur - start); +- if (ret) { +- iput(inode); +- return 0; +- } ++ if (ret) ++ goto out; + /* + * Mark the old blocks as free since they aren't used + * anymore. We maintain an array of all the modified +@@ -1730,10 +1727,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + ext4_ext_is_unwritten(ex), map.m_pblk); + ret = ext4_ext_replay_update_ex(inode, cur, map.m_len, + ext4_ext_is_unwritten(ex), map.m_pblk); +- if (ret) { +- iput(inode); +- return 0; +- } ++ if (ret) ++ goto out; + /* + * We may have split the extent tree while toggling the state. + * Try to shrink the extent tree now. 
+@@ -1745,6 +1740,7 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + } + ext4_ext_replay_shrink_inode(inode, i_size_read(inode) >> + sb->s_blocksize_bits); ++out: + iput(inode); + return 0; + } +@@ -1774,6 +1770,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + } + + ret = ext4_fc_record_modified_inode(sb, inode->i_ino); ++ if (ret) ++ goto out; + + jbd_debug(1, "DEL_RANGE, inode %ld, lblk %d, len %d\n", + inode->i_ino, le32_to_cpu(lrange.fc_lblk), +@@ -1783,10 +1781,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + map.m_len = remaining; + + ret = ext4_map_blocks(NULL, inode, &map, 0); +- if (ret < 0) { +- iput(inode); +- return 0; +- } ++ if (ret < 0) ++ goto out; + if (ret > 0) { + remaining -= ret; + cur += ret; +@@ -1801,15 +1797,13 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + ret = ext4_ext_remove_space(inode, lrange.fc_lblk, + lrange.fc_lblk + lrange.fc_len - 1); + up_write(&EXT4_I(inode)->i_data_sem); +- if (ret) { +- iput(inode); +- return 0; +- } ++ if (ret) ++ goto out; + ext4_ext_replay_shrink_inode(inode, + i_size_read(inode) >> sb->s_blocksize_bits); + ext4_mark_inode_dirty(NULL, inode); ++out: + iput(inode); +- + return 0; + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-error-handling-in-ext4_restore_inline_data.patch @@ -0,0 +1,65 @@ +From 897026aaa73eb2517dfea8d147f20ddb0b813044 Mon Sep 17 00:00:00 2001 +From: Ritesh Harjani <riteshh@linux.ibm.com> +Date: Mon, 17 Jan 2022 17:41:47 +0530 +Subject: [PATCH] ext4: fix error handling in ext4_restore_inline_data() +Git-commit: 897026aaa73eb2517dfea8d147f20ddb0b813044 +Patch-mainline: v5.17-rc3 +References: bsc#1197757 + +While running "./check -I 200 generic/475" it sometimes gives below +kernel BUG(). Ideally we should not call ext4_write_inline_data() if +ext4_create_inline_data() has failed. 
+ +<log snip> +[73131.453234] kernel BUG at fs/ext4/inline.c:223! + +<code snip> + 212 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc, + 213 void *buffer, loff_t pos, unsigned int len) + 214 { +<...> + 223 BUG_ON(!EXT4_I(inode)->i_inline_off); + 224 BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); + +This patch handles the error and prints out a emergency msg saying potential +data loss for the given inode (since we couldn't restore the original +inline_data due to some previous error). + +[ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30) + +Reported-by: Eric Whitney <enwlinux@gmail.com> +Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inline.c | 10 +++++++++- + 1 file changed, 9 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c +index 39a1ab129fdc..d091133a4b46 100644 +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -1133,7 +1133,15 @@ static void ext4_restore_inline_data(handle_t *handle, struct inode *inode, + struct ext4_iloc *iloc, + void *buf, int inline_size) + { +- ext4_create_inline_data(handle, inode, inline_size); ++ int ret; ++ ++ ret = ext4_create_inline_data(handle, inode, inline_size); ++ if (ret) { ++ ext4_msg(inode->i_sb, KERN_EMERG, ++ "error restoring inline_data for inode -- potential data loss! 
(inode %lu, error %d)", ++ inode->i_ino, ret); ++ return; ++ } + ext4_write_inline_data(inode, iloc, buf, 0, inline_size); + ext4_set_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); + } +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-ext4_fc_stats-trace-point.patch @@ -0,0 +1,136 @@ +From: Ritesh Harjani <riteshh@linux.ibm.com> +Date: Sat, 12 Mar 2022 11:09:47 +0530 +Subject: ext4: fix ext4_fc_stats trace point +Git-commit: 7af1974af0a9ba8a8ed2e3e947d87dd4d9a78d27 +Patch-mainline: v5.17 or v5.17-rc9 (next release) +References: git-fixes + +ftrace's __print_symbolic() requires that any enum values used in the +symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM +so that the enum value can be decoded from the ftrace ring buffer by +user space tooling. + +This patch also fixes few other problems found in this trace point. +e.g. dereferencing structures in TP_printk which should not be done +at any cost. + +Also to avoid checkpatch warnings, this patch removes those +whitespaces/tab stops issues. 
+ +Cc: stable@kernel.org +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Reported-by: Steven Rostedt <rostedt@goodmis.org> +Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> +Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/b4b9691414c35c62e570b723e661c80674169f9a.1647057583.git.riteshh@linux.ibm.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Miroslav Benes <mbenes@suse.cz> +--- + include/trace/events/ext4.h | 78 ++++++++++++++++++++++++++++----------------- + 1 file changed, 49 insertions(+), 29 deletions(-) + +diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h +index 19e957b7f941..1a0b7030f72a 100644 +--- a/include/trace/events/ext4.h ++++ b/include/trace/events/ext4.h +@@ -95,6 +95,17 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B); + { FALLOC_FL_COLLAPSE_RANGE, "COLLAPSE_RANGE"}, \ + { FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"}) + ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA); ++TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX); ++ + #define show_fc_reason(reason) \ + __print_symbolic(reason, \ + { EXT4_FC_REASON_XATTR, "XATTR"}, \ +@@ -2723,41 +2734,50 @@ TRACE_EVENT(ext4_fc_commit_stop, + + #define FC_REASON_NAME_STAT(reason) \ + show_fc_reason(reason), \ +- __entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason] ++ __entry->fc_ineligible_rc[reason] + + TRACE_EVENT(ext4_fc_stats, +- TP_PROTO(struct super_block *sb), +- +- TP_ARGS(sb), ++ TP_PROTO(struct super_block *sb), + +- TP_STRUCT__entry( +- 
__field(dev_t, dev) +- __field(struct ext4_sb_info *, sbi) +- __field(int, count) +- ), ++ TP_ARGS(sb), + +- TP_fast_assign( +- __entry->dev = sb->s_dev; +- __entry->sbi = EXT4_SB(sb); +- ), ++ TP_STRUCT__entry( ++ __field(dev_t, dev) ++ __array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX) ++ __field(unsigned long, fc_commits) ++ __field(unsigned long, fc_ineligible_commits) ++ __field(unsigned long, fc_numblks) ++ ), + +- TP_printk("dev %d:%d fc ineligible reasons:\n" +- "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; " +- "num_commits:%ld, ineligible: %ld, numblks: %ld", +- MAJOR(__entry->dev), MINOR(__entry->dev), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), +- FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), +- __entry->sbi->s_fc_stats.fc_num_commits, +- __entry->sbi->s_fc_stats.fc_ineligible_commits, +- __entry->sbi->s_fc_stats.fc_numblks) ++ TP_fast_assign( ++ int i; + ++ __entry->dev = sb->s_dev; ++ for (i = 0; i < EXT4_FC_REASON_MAX; i++) { ++ __entry->fc_ineligible_rc[i] = ++ EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i]; ++ } ++ __entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits; ++ __entry->fc_ineligible_commits = ++ EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits; ++ __entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks; ++ ), ++ ++ TP_printk("dev %d,%d fc ineligible reasons:\n" ++ "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u " ++ "num_commits:%lu, ineligible: %lu, numblks: %lu", ++ MAJOR(__entry->dev), MINOR(__entry->dev), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME), ++ 
FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE), ++ FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA), ++ __entry->fc_commits, __entry->fc_ineligible_commits, ++ __entry->fc_numblks) + ); + + #define DEFINE_TRACE_DENTRY_EVENT(__type) \ + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fallocate-to-use-file_modified-to-update-pe.patch @@ -0,0 +1,171 @@ +From ad5cd4f4ee4d5fcdb1bfb7a0c073072961e70783 Mon Sep 17 00:00:00 2001 +From: "Darrick J. Wong" <djwong@kernel.org> +Date: Tue, 8 Mar 2022 10:50:43 -0800 +Subject: [PATCH] ext4: fix fallocate to use file_modified to update + permissions consistently +Git-commit: ad5cd4f4ee4d5fcdb1bfb7a0c073072961e70783 +Patch-mainline: v5.18-rc4 +References: bsc#1202769 + +Since the initial introduction of (posix) fallocate back at the turn of +the century, it has been possible to use this syscall to change the +user-visible contents of files. This can happen by extending the file +size during a preallocation, or through any of the newer modes (punch, +zero, collapse, insert range). Because the call can be used to change +file contents, we should treat it like we do any other modification to a +file -- update the mtime, and drop set[ug]id privileges/capabilities. + +The VFS function file_modified() does all this for us if pass it a +locked inode, so let's make fallocate drop permissions correctly. + +Signed-off-by: Darrick J. 
Wong <djwong@kernel.org> +Link: https://lore.kernel.org/r/20220308185043.GA117678@magnolia +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 2 +- + fs/ext4/extents.c | 32 +++++++++++++++++++++++++------- + fs/ext4/inode.c | 7 ++++++- + 3 files changed, 32 insertions(+), 9 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2980,7 +2980,7 @@ extern int ext4_inode_attach_jinode(stru + extern int ext4_can_truncate(struct inode *inode); + extern int ext4_truncate(struct inode *); + extern int ext4_break_layouts(struct inode *); +-extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length); ++extern int ext4_punch_hole(struct file *file, loff_t offset, loff_t length); + extern void ext4_set_inode_flags(struct inode *, bool init); + extern int ext4_alloc_da_blocks(struct inode *inode); + extern void ext4_set_aops(struct inode *inode); +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -4503,9 +4503,9 @@ retry: + return ret > 0 ? 
ret2 : ret; + } + +-static int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len); ++static int ext4_collapse_range(struct file *file, loff_t offset, loff_t len); + +-static int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len); ++static int ext4_insert_range(struct file *file, loff_t offset, loff_t len); + + static long ext4_zero_range(struct file *file, loff_t offset, + loff_t len, int mode) +@@ -4577,6 +4577,10 @@ static long ext4_zero_range(struct file + /* Wait all existing dio workers, newcomers will block on i_mutex */ + inode_dio_wait(inode); + ++ ret = file_modified(file); ++ if (ret) ++ goto out_mutex; ++ + /* Preallocate the range including the unaligned edges */ + if (partial_begin || partial_end) { + ret = ext4_alloc_file_blocks(file, +@@ -4695,7 +4699,7 @@ long ext4_fallocate(struct file *file, i + ext4_fc_start_update(inode); + + if (mode & FALLOC_FL_PUNCH_HOLE) { +- ret = ext4_punch_hole(inode, offset, len); ++ ret = ext4_punch_hole(file, offset, len); + goto exit; + } + +@@ -4704,12 +4708,12 @@ long ext4_fallocate(struct file *file, i + goto exit; + + if (mode & FALLOC_FL_COLLAPSE_RANGE) { +- ret = ext4_collapse_range(inode, offset, len); ++ ret = ext4_collapse_range(file, offset, len); + goto exit; + } + + if (mode & FALLOC_FL_INSERT_RANGE) { +- ret = ext4_insert_range(inode, offset, len); ++ ret = ext4_insert_range(file, offset, len); + goto exit; + } + +@@ -4745,6 +4749,10 @@ long ext4_fallocate(struct file *file, i + /* Wait all existing dio workers, newcomers will block on i_mutex */ + inode_dio_wait(inode); + ++ ret = file_modified(file); ++ if (ret) ++ goto out; ++ + ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags); + if (ret) + goto out; +@@ -5247,8 +5255,9 @@ out: + * This implements the fallocate's collapse range functionality for ext4 + * Returns: 0 and non-zero on error. 
+ */ +-static int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) ++static int ext4_collapse_range(struct file *file, loff_t offset, loff_t len) + { ++ struct inode *inode = file_inode(file); + struct super_block *sb = inode->i_sb; + struct address_space *mapping = inode->i_mapping; + ext4_lblk_t punch_start, punch_stop; +@@ -5300,6 +5309,10 @@ static int ext4_collapse_range(struct in + /* Wait for existing dio to complete */ + inode_dio_wait(inode); + ++ ret = file_modified(file); ++ if (ret) ++ goto out_mutex; ++ + /* + * Prevent page faults from reinstantiating pages we have released from + * page cache. +@@ -5394,8 +5407,9 @@ out_mutex: + * by len bytes. + * Returns 0 on success, error otherwise. + */ +-static int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) ++static int ext4_insert_range(struct file *file, loff_t offset, loff_t len) + { ++ struct inode *inode = file_inode(file); + struct super_block *sb = inode->i_sb; + struct address_space *mapping = inode->i_mapping; + handle_t *handle; +@@ -5452,6 +5466,10 @@ static int ext4_insert_range(struct inod + /* Wait for existing dio to complete */ + inode_dio_wait(inode); + ++ ret = file_modified(file); ++ if (ret) ++ goto out_mutex; ++ + /* + * Prevent page faults from reinstantiating pages we have released from + * page cache. 
+--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -3991,8 +3991,9 @@ int ext4_break_layouts(struct inode *ino + * Returns: 0 on success or negative on failure + */ + +-int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) ++int ext4_punch_hole(struct file *file, loff_t offset, loff_t length) + { ++ struct inode *inode = file_inode(file); + struct super_block *sb = inode->i_sb; + ext4_lblk_t first_block, stop_block; + struct address_space *mapping = inode->i_mapping; +@@ -4054,6 +4055,10 @@ int ext4_punch_hole(struct inode *inode, + /* Wait all existing dio workers, newcomers will block on i_mutex */ + inode_dio_wait(inode); + ++ ret = file_modified(file); ++ if (ret) ++ goto out_mutex; ++ + /* + * Prevent page faults from reinstantiating pages we have released from + * page cache. --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fast-commit-may-miss-tracking-range-for-FAL.patch @@ -0,0 +1,62 @@ +From 5e4d0eba1ccaf19f93222abdeda5a368be141785 Mon Sep 17 00:00:00 2001 +From: Xin Yin <yinxin.x@bytedance.com> +Date: Tue, 21 Dec 2021 10:28:39 +0800 +Subject: [PATCH] ext4: fix fast commit may miss tracking range for + FALLOC_FL_ZERO_RANGE +Git-commit: 5e4d0eba1ccaf19f93222abdeda5a368be141785 +Patch-mainline: v5.17-rc1 +References: bsc#1202757 + +when call falloc with FALLOC_FL_ZERO_RANGE, to set an range to unwritten, +which has been already initialized. If the range is align to blocksize, +fast commit will not track range for this change. + +Also track range for unwritten range in ext4_map_blocks(). 
+ +Signed-off-by: Xin Yin <yinxin.x@bytedance.com> +Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/20211221022839.374606-1-yinxin.x@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents.c | 2 -- + fs/ext4/inode.c | 7 ++++--- + 2 files changed, 4 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index 38111ea18ae1..c3e76a5de661 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -4647,8 +4647,6 @@ static long ext4_zero_range(struct file *file, loff_t offset, + ret = ext4_mark_inode_dirty(handle, inode); + if (unlikely(ret)) + goto out_handle; +- ext4_fc_track_range(handle, inode, offset >> inode->i_sb->s_blocksize_bits, +- (offset + len - 1) >> inode->i_sb->s_blocksize_bits); + /* Zero out partial block at the edges of the range */ + ret = ext4_zero_partial_blocks(handle, inode, offset, len); + if (ret >= 0) +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 82f555d26980..4895909de21b 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -741,10 +741,11 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode, + if (ret) + return ret; + } +- ext4_fc_track_range(handle, inode, map->m_lblk, +- map->m_lblk + map->m_len - 1); + } +- ++ if (retval > 0 && (map->m_flags & EXT4_MAP_UNWRITTEN || ++ map->m_flags & EXT4_MAP_MAPPED)) ++ ext4_fc_track_range(handle, inode, map->m_lblk, ++ map->m_lblk + map->m_len - 1); + if (retval < 0) + ext_debug(inode, "failed with err %d\n", retval); + return retval; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-fs-corruption-when-tring-to-remove-a-non-em.patch @@ -0,0 +1,163 @@ +From 7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Mon, 28 Feb 2022 10:48:15 +0800 +Subject: [PATCH] ext4: fix fs corruption when tring to remove a non-empty + directory with 
IO error +Git-commit: 7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a +Patch-mainline: v5.18-rc1 +References: bsc#1202768 + +We inject IO error when rmdir non empty direcory, then got issue as follows: +Step1: mkfs.ext4 -F /dev/sda +Step2: mount /dev/sda test +Step3: cd test +Step4: mkdir -p 1/2 +Step5: rmdir 1 + [ 110.920551] ext4_empty_dir: inject fault + [ 110.921926] EXT4-fs warning (device sda): ext4_rmdir:3113: inode #12: + comm rmdir: empty directory '1' has too many links (3) +Step6: cd .. +Step7: umount test +Step8: fsck.ext4 -f /dev/sda + e2fsck 1.42.9 (28-Dec-2013) + Pass 1: Checking inodes, blocks, and sizes + Pass 2: Checking directory structure + Entry '..' in .../??? (13) has deleted/unused inode 12. Clear<y>? yes + Pass 3: Checking directory connectivity + Unconnected directory inode 13 (...) + Connect to /lost+found<y>? yes + Pass 4: Checking reference counts + Inode 13 ref count is 3, should be 2. Fix<y>? yes + Pass 5: Checking group summary information + + /dev/sda: ***** FILE SYSTEM WAS MODIFIED ***** + /dev/sda: 12/131072 files (0.0% non-contiguous), 26157/524288 blocks + +ext4_rmdir + if (!ext4_empty_dir(inode)) + goto end_rmdir; +ext4_empty_dir + bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE); + if (IS_ERR(bh)) + return true; +Now if read directory block failed, 'ext4_empty_dir' will return true, assume +directory is empty. Obviously, it will lead to above issue. +To solve this issue, if read directory block failed 'ext4_empty_dir' just +return false. To avoid making things worse when file system is already +corrupted, 'ext4_empty_dir' also return false. 
+ +Signed-off-by: Ye Bin <yebin10@huawei.com> +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20220228024815.3952506-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inline.c | 9 ++++----- + fs/ext4/namei.c | 10 +++++----- + 2 files changed, 9 insertions(+), 10 deletions(-) + +diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c +index e42941803605..9c076262770d 100644 +--- a/fs/ext4/inline.c ++++ b/fs/ext4/inline.c +@@ -1783,19 +1783,20 @@ bool empty_inline_dir(struct inode *dir, int *has_inline_data) + void *inline_pos; + unsigned int offset; + struct ext4_dir_entry_2 *de; +- bool ret = true; ++ bool ret = false; + + err = ext4_get_inode_loc(dir, &iloc); + if (err) { + EXT4_ERROR_INODE_ERR(dir, -err, + "error %d getting inode %lu block", + err, dir->i_ino); +- return true; ++ return false; + } + + down_read(&EXT4_I(dir)->xattr_sem); + if (!ext4_has_inline_data(dir)) { + *has_inline_data = 0; ++ ret = true; + goto out; + } + +@@ -1804,7 +1805,6 @@ bool empty_inline_dir(struct inode *dir, int *has_inline_data) + ext4_warning(dir->i_sb, + "bad inline directory (dir #%lu) - no `..'", + dir->i_ino); +- ret = true; + goto out; + } + +@@ -1823,16 +1823,15 @@ bool empty_inline_dir(struct inode *dir, int *has_inline_data) + dir->i_ino, le32_to_cpu(de->inode), + le16_to_cpu(de->rec_len), de->name_len, + inline_size); +- ret = true; + goto out; + } + if (le32_to_cpu(de->inode)) { +- ret = false; + goto out; + } + offset += ext4_rec_len_from_disk(de->rec_len, inline_size); + } + ++ ret = true; + out: + up_read(&EXT4_I(dir)->xattr_sem); + brelse(iloc.bh); +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 8cf0a924a49b..39e223f7bf64 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -2997,14 +2997,14 @@ bool ext4_empty_dir(struct inode *inode) + if (inode->i_size < ext4_dir_rec_len(1, NULL) + + ext4_dir_rec_len(2, NULL)) { + EXT4_ERROR_INODE(inode, "invalid size"); +- return true; ++ return 
false; + } + /* The first directory block must not be a hole, + * so treat it as DIRENT_HTREE + */ + bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE); + if (IS_ERR(bh)) +- return true; ++ return false; + + de = (struct ext4_dir_entry_2 *) bh->b_data; + if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data, bh->b_size, +@@ -3012,7 +3012,7 @@ bool ext4_empty_dir(struct inode *inode) + le32_to_cpu(de->inode) != inode->i_ino || strcmp(".", de->name)) { + ext4_warning_inode(inode, "directory missing '.'"); + brelse(bh); +- return true; ++ return false; + } + offset = ext4_rec_len_from_disk(de->rec_len, sb->s_blocksize); + de = ext4_next_entry(de, sb->s_blocksize); +@@ -3021,7 +3021,7 @@ bool ext4_empty_dir(struct inode *inode) + le32_to_cpu(de->inode) == 0 || strcmp("..", de->name)) { + ext4_warning_inode(inode, "directory missing '..'"); + brelse(bh); +- return true; ++ return false; + } + offset += ext4_rec_len_from_disk(de->rec_len, sb->s_blocksize); + while (offset < inode->i_size) { +@@ -3035,7 +3035,7 @@ bool ext4_empty_dir(struct inode *inode) + continue; + } + if (IS_ERR(bh)) +- return true; ++ return false; + } + de = (struct ext4_dir_entry_2 *) (bh->b_data + + (offset & (sb->s_blocksize - 1))); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-i_disksize-exceeding-i_size-problem-in-pari.patch @@ -0,0 +1,75 @@ +From 1dedde690303c05ef732b7c5c8356fdf60a4ade3 Mon Sep 17 00:00:00 2001 +From: Zhihao Cheng <chengzhihao1@huawei.com> +Date: Tue, 21 Mar 2023 09:37:21 +0800 +Subject: [PATCH] ext4: fix i_disksize exceeding i_size problem in paritally + written case +Git-commit: 1dedde690303c05ef732b7c5c8356fdf60a4ade3 +Patch-mainline: v6.4-rc1 +References: bsc#1213015 + +It is possible for i_disksize can exceed i_size, triggering a warning. 
+ +generic_perform_write + copied = iov_iter_copy_from_user_atomic(len) // copied < len + ext4_da_write_end + | ext4_update_i_disksize + | new_i_size = pos + copied; + | WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize + | generic_write_end + | copied = block_write_end(copied, len) // copied = 0 + | if (unlikely(copied < len)) + | if (!PageUptodate(page)) + | copied = 0; + | if (pos + copied > inode->i_size) // return false + if (unlikely(copied == 0)) + goto again; + if (unlikely(iov_iter_fault_in_readable(i, bytes))) { + status = -EFAULT; + break; + } + +We get i_disksize greater than i_size here, which could trigger WARNING +check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio: + +ext4_dio_write_iter + iomap_dio_rw + __iomap_dio_rw // return err, length is not aligned to 512 + ext4_handle_inode_extension + WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops + + WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319 + CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2 + RIP: 0010:ext4_file_write_iter+0xbc7 + Call Trace: + vfs_write+0x3b1 + ksys_write+0x77 + do_syscall_64+0x39 + +Fix it by updating 'copied' value before updating i_disksize just like +ext4_write_inline_data_end() does. + +A reproducer can be found in the buganizer link below. 
+ +Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209 +Fixes: 64769240bd07 ("ext4: Add delayed allocation support in data=writeback mode") +Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230321013721.89818-1-chengzhihao1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -3095,6 +3095,8 @@ static int ext4_da_write_end(struct file + len, copied, page, fsdata); + + trace_ext4_da_write_end(inode, pos, len, copied); ++ if (unlikely(copied < len) && !PageUptodate(page)) ++ copied = 0; + start = pos & (PAGE_SIZE - 1); + end = start + copied - 1; + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-incorrect-options-show-of-original-mount_op.patch @@ -0,0 +1,109 @@ +From e3645d72f8865ffe36f9dc811540d40aa3c848d3 Mon Sep 17 00:00:00 2001 +From: Zhang Yi <yi.zhang@huawei.com> +Date: Sun, 29 Jan 2023 11:49:39 +0800 +Subject: [PATCH] ext4: fix incorrect options show of original mount_opt and + extend mount_opt2 +Git-commit: e3645d72f8865ffe36f9dc811540d40aa3c848d3 +Patch-mainline: v6.3-rc1 +References: bsc#1210764 + +Current _ext4_show_options() do not distinguish MOPT_2 flag, so it mixed +extend sbi->s_mount_opt2 options with sbi->s_mount_opt, it could lead to +show incorrect options, e.g. show fc_debug_force if we mount with +errors=continue mode and miss it if we set. 
+ + $ mkfs.ext4 /dev/pmem0 + $ mount -o errors=remount-ro /dev/pmem0 /mnt + $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force + #empty + $ mount -o remount,errors=continue /mnt + $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force + fc_debug_force + $ mount -o remount,errors=remount-ro,fc_debug_force /mnt + $ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force + #empty + +Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options") +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230129034939.3702550-1-yi.zhang@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 1 + + fs/ext4/super.c | 28 +++++++++++++++++++++------- + 2 files changed, 22 insertions(+), 7 deletions(-) + +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -1483,6 +1483,7 @@ struct ext4_sb_info { + unsigned int s_mount_opt2; + unsigned long s_mount_flags; + unsigned int s_def_mount_opt; ++ unsigned int s_def_mount_opt2; + ext4_fsblk_t s_sb_block; + atomic64_t s_resv_clusters; + kuid_t s_resuid; +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -2538,7 +2538,7 @@ static int _ext4_show_options(struct seq + { + struct ext4_sb_info *sbi = EXT4_SB(sb); + struct ext4_super_block *es = sbi->s_es; +- int def_errors, def_mount_opt = sbi->s_def_mount_opt; ++ int def_errors; + const struct mount_opts *m; + char sep = nodefs ? 
'\n' : ','; + +@@ -2550,15 +2550,28 @@ static int _ext4_show_options(struct seq + + for (m = ext4_mount_opts; m->token != Opt_err; m++) { + int want_set = m->flags & MOPT_SET; ++ int opt_2 = m->flags & MOPT_2; ++ unsigned int mount_opt, def_mount_opt; ++ + if (((m->flags & (MOPT_SET|MOPT_CLEAR)) == 0) || + (m->flags & MOPT_CLEAR_ERR) || m->flags & MOPT_SKIP) + continue; +- if (!nodefs && !(m->mount_opt & (sbi->s_mount_opt ^ def_mount_opt))) +- continue; /* skip if same as the default */ ++ ++ if (opt_2) { ++ mount_opt = sbi->s_mount_opt2; ++ def_mount_opt = sbi->s_def_mount_opt2; ++ } else { ++ mount_opt = sbi->s_mount_opt; ++ def_mount_opt = sbi->s_def_mount_opt; ++ } ++ /* skip if same as the default */ ++ if (!nodefs && !(m->mount_opt & (mount_opt ^ def_mount_opt))) ++ continue; ++ /* select Opt_noFoo vs Opt_Foo */ + if ((want_set && +- (sbi->s_mount_opt & m->mount_opt) != m->mount_opt) || +- (!want_set && (sbi->s_mount_opt & m->mount_opt))) +- continue; /* select Opt_noFoo vs Opt_Foo */ ++ (mount_opt & m->mount_opt) != m->mount_opt) || ++ (!want_set && (mount_opt & m->mount_opt))) ++ continue; + SEQ_OPTS_PRINT("%s", token2str(m->token)); + } + +@@ -2586,7 +2599,7 @@ static int _ext4_show_options(struct seq + if (nodefs || sbi->s_stripe) + SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); + if (nodefs || EXT4_MOUNT_DATA_FLAGS & +- (sbi->s_mount_opt ^ def_mount_opt)) { ++ (sbi->s_mount_opt ^ sbi->s_def_mount_opt)) { + if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) + SEQ_OPTS_PUTS("data=journal"); + else if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA) +@@ -4324,6 +4337,7 @@ static int ext4_fill_super(struct super_ + kfree(s_mount_opts); + } + sbi->s_def_mount_opt = sbi->s_mount_opt; ++ sbi->s_def_mount_opt2 = sbi->s_mount_opt2; + if (!parse_options((char *) data, sb, &parsed_opts, 0)) + goto failed_mount; + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-incorrect-type-issue-during-replay_del_rang.patch @@ -0,0 +1,42 @@ +From 
8fca8a2b0a822f7936130af7299d2fd7f0a66714 Mon Sep 17 00:00:00 2001 +From: Xin Yin <yinxin.x@bytedance.com> +Date: Wed, 26 Jan 2022 14:31:46 +0800 +Subject: [PATCH] ext4: fix incorrect type issue during replay_del_range +Git-commit: 8fca8a2b0a822f7936130af7299d2fd7f0a66714 +Patch-mainline: v5.17-rc3 +References: bsc#1202867 + +should not use fast commit log data directly, add le32_to_cpu(). + +Reported-by: kernel test robot <lkp@intel.com> +Fixes: 0b5b5a62b945 ("ext4: use ext4_ext_remove_space() for fast commit replay delete range") +Cc: stable@kernel.org +Signed-off-by: Xin Yin <yinxin.x@bytedance.com> +Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> +Link: https://lore.kernel.org/r/20220126063146.2302-1-yinxin.x@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 5934c23e153e..7964ee34e322 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1794,8 +1794,9 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + } + + down_write(&EXT4_I(inode)->i_data_sem); +- ret = ext4_ext_remove_space(inode, lrange.fc_lblk, +- lrange.fc_lblk + lrange.fc_len - 1); ++ ret = ext4_ext_remove_space(inode, le32_to_cpu(lrange.fc_lblk), ++ le32_to_cpu(lrange.fc_lblk) + ++ le32_to_cpu(lrange.fc_len) - 1); + up_write(&EXT4_I(inode)->i_data_sem); + if (ret) + goto out; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-inode-leak-in-ext4_xattr_inode_create-on-an.patch @@ -0,0 +1,59 @@ +From e4db04f7d3dbbe16680e0ded27ea2a65b10f766a Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 8 Dec 2022 10:32:33 +0800 +Subject: [PATCH] ext4: fix inode leak in ext4_xattr_inode_create() on an error + path +Git-commit: e4db04f7d3dbbe16680e0ded27ea2a65b10f766a +Patch-mainline: v6.2-rc1 +References: bsc#1207636 
+ +There is issue as follows when do setxattr with inject fault: + +[localhost]# fsck.ext4 -fn /dev/sda +e2fsck 1.46.6-rc1 (12-Sep-2022) +Pass 1: Checking inodes, blocks, and sizes +Pass 2: Checking directory structure +Pass 3: Checking directory connectivity +Pass 4: Checking reference counts +Unattached zero-length inode 15. Clear? no + +Unattached inode 15 +Connect to /lost+found? no + +Pass 5: Checking group summary information + +/dev/sda: ********** WARNING: Filesystem still has errors ********** + +/dev/sda: 15/655360 files (0.0% non-contiguous), 66755/2621440 blocks + +This occurs in 'ext4_xattr_inode_create()'. If 'ext4_mark_inode_dirty()' +fails, dropping i_nlink of the inode is needed. Or will lead to inode leak. + +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221208023233.1231330-5-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index b666d3bf8b38..7decaaf27e82 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1441,6 +1441,9 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle, + if (!err) + err = ext4_inode_attach_jinode(ea_inode); + if (err) { ++ if (ext4_xattr_inode_dec_ref(handle, ea_inode)) ++ ext4_warning_inode(ea_inode, ++ "cleanup dec ref error %d", err); + iput(ea_inode); + return ERR_PTR(err); + } +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-kernel-BUG-in-ext4_write_inline_data_end.patch @@ -0,0 +1,108 @@ +From 5c099c4fdc438014d5893629e70a8ba934433ee8 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Tue, 6 Dec 2022 22:41:34 +0800 +Subject: [PATCH] ext4: fix kernel BUG in 'ext4_write_inline_data_end()' +Git-commit: 5c099c4fdc438014d5893629e70a8ba934433ee8 +Patch-mainline: v6.2-rc1 +References: 
bsc#1206894 + +Syzbot report follow issue: + +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +kernel BUG at fs/ext4/inline.c:227! +invalid opcode: 0000 [#1] PREEMPT SMP KASAN +CPU: 1 PID: 3629 Comm: syz-executor212 Not tainted 6.1.0-rc5-syzkaller-00018-g59d0d52c30d4 #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 +RIP: 0010:ext4_write_inline_data+0x344/0x3e0 fs/ext4/inline.c:227 +RSP: 0018:ffffc90003b3f368 EFLAGS: 00010293 +RAX: 0000000000000000 RBX: ffff8880704e16c0 RCX: 0000000000000000 +RDX: ffff888021763a80 RSI: ffffffff821e31a4 RDI: 0000000000000006 +RBP: 000000000006818e R08: 0000000000000006 R09: 0000000000068199 +R10: 0000000000000079 R11: 0000000000000000 R12: 000000000000000b +R13: 0000000000068199 R14: ffffc90003b3f408 R15: ffff8880704e1c82 +FS: 000055555723e3c0(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007fffe8ac9080 CR3: 0000000079f81000 CR4: 0000000000350ee0 +Call Trace: + <TASK> + ext4_write_inline_data_end+0x2a3/0x12f0 fs/ext4/inline.c:768 + ext4_write_end+0x242/0xdd0 fs/ext4/inode.c:1313 + ext4_da_write_end+0x3ed/0xa30 fs/ext4/inode.c:3063 + generic_perform_write+0x316/0x570 mm/filemap.c:3764 + ext4_buffered_write_iter+0x15b/0x460 fs/ext4/file.c:285 + ext4_file_write_iter+0x8bc/0x16e0 fs/ext4/file.c:700 + call_write_iter include/linux/fs.h:2191 [inline] + do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 + do_iter_write+0x182/0x700 fs/read_write.c:861 + vfs_iter_write+0x74/0xa0 fs/read_write.c:902 + iter_file_splice_write+0x745/0xc90 fs/splice.c:686 + do_splice_from fs/splice.c:764 [inline] + direct_splice_actor+0x114/0x180 fs/splice.c:931 + splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886 + do_splice_direct+0x1ab/0x280 fs/splice.c:974 + do_sendfile+0xb19/0x1270 fs/read_write.c:1255 + __do_sys_sendfile64 fs/read_write.c:1323 [inline] + __se_sys_sendfile64 fs/read_write.c:1309 [inline] + 
__x64_sys_sendfile64+0x1d0/0x210 fs/read_write.c:1309 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd +---[ end trace 0000000000000000 ]--- + +Above issue may happens as follows: +ext4_da_write_begin + ext4_da_write_inline_data_begin + ext4_da_convert_inline_data_to_extent + ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); +ext4_da_write_end + +ext4_run_li_request + ext4_mb_prefetch + ext4_read_block_bitmap_nowait + ext4_validate_block_bitmap + ext4_mark_group_bitmap_corrupted(sb, block_group, EXT4_GROUP_INFO_BBITMAP_CORRUPT) + percpu_counter_sub(&sbi->s_freeclusters_counter,grp->bb_free); + -> sbi->s_freeclusters_counter become zero +ext4_da_write_begin + if (ext4_nonda_switch(inode->i_sb)) -> As freeclusters_counter is zero will return true + *fsdata = (void *)FALL_BACK_TO_NONDELALLOC; + ext4_write_begin +ext4_da_write_end + if (write_mode == FALL_BACK_TO_NONDELALLOC) + ext4_write_end + if (inline_data) + ext4_write_inline_data_end + ext4_write_inline_data + BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); + -> As inode is already convert to extent, so 'pos + len' > inline_size + -> then trigger BUG. + +To solve this issue, instead of checking ext4_has_inline_data() which +is only cleared after data has been written back, check the +EXT4_STATE_MAY_INLINE_DATA flag in ext4_write_end(). 
+ +Fixes: f19d5870cbf7 ("ext4: add normal write support for inline data") +Reported-by: syzbot+4faa160fa96bfba639f8@syzkaller.appspotmail.com +Reported-by: Jun Nie <jun.nie@linaro.org> +Signed-off-by: Ye Bin <yebin10@huawei.com> +Link: https://lore.kernel.org/r/20221206144134.1919987-1-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +--- + fs/ext4/inode.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -1294,7 +1294,8 @@ static int ext4_write_end(struct file *f + loff_t old_size = inode->i_size; + int ret = 0, ret2; + int i_size_changed = 0; +- int inline_data = ext4_has_inline_data(inode); ++ int inline_data = ext4_has_inline_data(inode) && ++ ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); + bool verity = ext4_verity_in_progress(inode); + + trace_ext4_write_end(inode, pos, len, copied); --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-leaking-uninitialized-memory-in-fast-commit.patch @@ -0,0 +1,48 @@ +From 594bc43b410316d70bb42aeff168837888d96810 Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:37 -0800 +Subject: [PATCH] ext4: fix leaking uninitialized memory in fast-commit journal +Git-commit: 594bc43b410316d70bb42aeff168837888d96810 +Patch-mainline: v6.2-rc1 +References: bsc#1207625 + +When space at the end of fast-commit journal blocks is unused, make sure +to zero it out so that uninitialized memory is not leaked to disk. 
+ +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-4-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 5 +++++ + 1 file changed, 5 insertions(+) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index da0c8228cf9c..1e8be0554239 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -737,6 +737,9 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + *crc = ext4_chksum(sbi, *crc, tl, EXT4_FC_TAG_BASE_LEN); + if (pad_len > 0) + ext4_fc_memzero(sb, tl + 1, pad_len, crc); ++ /* Don't leak uninitialized memory in the unused last byte. */ ++ *((u8 *)(tl + 1) + pad_len) = 0; ++ + ext4_fc_submit_bh(sb, false); + + ret = jbd2_fc_get_buf(EXT4_SB(sb)->s_journal, &bh); +@@ -793,6 +796,8 @@ static int ext4_fc_write_tail(struct super_block *sb, u32 crc) + dst += sizeof(tail.fc_tid); + tail.fc_crc = cpu_to_le32(crc); + ext4_fc_memcpy(sb, dst, &tail.fc_crc, sizeof(tail.fc_crc), NULL); ++ dst += sizeof(tail.fc_crc); ++ memset(dst, 0, bsize - off); /* Don't leak uninitialized memory. */ + + ext4_fc_submit_bh(sb, true); + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-lockdep-warning-when-enabling-MMP.patch @@ -0,0 +1,89 @@ +From 949f95ff39bf188e594e7ecd8e29b82eb108f5bf Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Tue, 11 Apr 2023 14:10:19 +0200 +Subject: [PATCH] ext4: fix lockdep warning when enabling MMP +Git-commit: 949f95ff39bf188e594e7ecd8e29b82eb108f5bf +Patch-mainline: v6.4-rc2 +References: bsc#1213100 + +When we enable MMP in ext4_multi_mount_protect() during mount or +remount, we end up calling sb_start_write() from write_mmp_block(). 
This +triggers lockdep warning because freeze protection ranks above s_umount +semaphore we are holding during mount / remount. The problem is harmless +because we are guaranteed the filesystem is not frozen during mount / +remount but still let's fix the warning by not grabbing freeze +protection from ext4_multi_mount_protect(). + +Cc: stable@kernel.org +Reported-by: syzbot+6b7df7d5506b32467149@syzkaller.appspotmail.com +Link: https://syzkaller.appspot.com/bug?id=ab7e5b6f400b7778d46f01841422e5718fb81843 +Signed-off-by: Jan Kara <jack@suse.cz> +Reviewed-by: Christian Brauner <brauner@kernel.org> +Link: https://lore.kernel.org/r/20230411121019.21940-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/mmp.c | 30 +++++++++++++++++++++--------- + 1 file changed, 21 insertions(+), 9 deletions(-) + +--- a/fs/ext4/mmp.c ++++ b/fs/ext4/mmp.c +@@ -39,28 +39,36 @@ static void ext4_mmp_csum_set(struct sup + * Write the MMP block using REQ_SYNC to try to get the block on-disk + * faster. + */ +-static int write_mmp_block(struct super_block *sb, struct buffer_head *bh) ++static int write_mmp_block_thawed(struct super_block *sb, ++ struct buffer_head *bh) + { + struct mmp_struct *mmp = (struct mmp_struct *)(bh->b_data); + +- /* +- * We protect against freezing so that we don't create dirty buffers +- * on frozen filesystem. +- */ +- sb_start_write(sb); + ext4_mmp_csum_set(sb, mmp); + lock_buffer(bh); + bh->b_end_io = end_buffer_write_sync; + get_bh(bh); + submit_bh(REQ_OP_WRITE, REQ_SYNC | REQ_META | REQ_PRIO, bh); + wait_on_buffer(bh); +- sb_end_write(sb); + if (unlikely(!buffer_uptodate(bh))) + return -EIO; +- + return 0; + } + ++static int write_mmp_block(struct super_block *sb, struct buffer_head *bh) ++{ ++ int err; ++ ++ /* ++ * We protect against freezing so that we don't create dirty buffers ++ * on frozen filesystem. 
++ */ ++ sb_start_write(sb); ++ err = write_mmp_block_thawed(sb, bh); ++ sb_end_write(sb); ++ return err; ++} ++ + /* + * Read the MMP block. It _must_ be read from disk and hence we clear the + * uptodate flag on the buffer. +@@ -348,7 +356,11 @@ skip: + seq = mmp_new_seq(); + mmp->mmp_seq = cpu_to_le32(seq); + +- retval = write_mmp_block(sb, bh); ++ /* ++ * On mount / remount we are protected against fs freezing (by s_umount ++ * semaphore) and grabbing freeze protection upsets lockdep ++ */ ++ retval = write_mmp_block_thawed(sb, bh); + if (retval) + goto failed; + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-miss-release-buffer-head-in-ext4_fc_write_i.patch @@ -0,0 +1,62 @@ +From ccbf8eeb39f2ff00b54726a2b20b35d788c4ecb5 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Wed, 14 Sep 2022 18:08:59 +0800 +Subject: [PATCH] ext4: fix miss release buffer head in ext4_fc_write_inode +Git-commit: ccbf8eeb39f2ff00b54726a2b20b35d788c4ecb5 +Patch-mainline: v6.1-rc1 +References: bsc#1207609 + +In 'ext4_fc_write_inode' function first call 'ext4_get_inode_loc' get 'iloc', +after use it miss release 'iloc.bh'. +So just release 'iloc.bh' before 'ext4_fc_write_inode' return. 
+ +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220914100859.1415196-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 15 +++++++++------ + 1 file changed, 9 insertions(+), 6 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 2af962cbb835..b7414a5812f6 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -874,22 +874,25 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc) + tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_INODE); + tl.fc_len = cpu_to_le16(inode_len + sizeof(fc_inode.fc_ino)); + ++ ret = -ECANCELED; + dst = ext4_fc_reserve_space(inode->i_sb, + sizeof(tl) + inode_len + sizeof(fc_inode.fc_ino), crc); + if (!dst) +- return -ECANCELED; ++ goto err; + + if (!ext4_fc_memcpy(inode->i_sb, dst, &tl, sizeof(tl), crc)) +- return -ECANCELED; ++ goto err; + dst += sizeof(tl); + if (!ext4_fc_memcpy(inode->i_sb, dst, &fc_inode, sizeof(fc_inode), crc)) +- return -ECANCELED; ++ goto err; + dst += sizeof(fc_inode); + if (!ext4_fc_memcpy(inode->i_sb, dst, (u8 *)ext4_raw_inode(&iloc), + inode_len, crc)) +- return -ECANCELED; +- +- return 0; ++ goto err; ++ ret = 0; ++err: ++ brelse(iloc.bh); ++ return ret; + } + + /* +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-null-ptr-deref-in-__ext4_journal_ensure_cre.patch @@ -0,0 +1,95 @@ +From 298b5c521746d69c07beb2757292fb5ccc1b0f85 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Fri, 24 Dec 2021 18:03:41 +0800 +Subject: [PATCH] ext4: fix null-ptr-deref in '__ext4_journal_ensure_credits' +Git-commit: 298b5c521746d69c07beb2757292fb5ccc1b0f85 +Patch-mainline: v5.17-rc1 +References: bsc#1202764 + +We got issue as follows when run syzkaller test: +[ 1901.130043] EXT4-fs error (device vda): ext4_remount:5624: comm syz-executor.5: Abort forced by user +[ 
1901.130901] Aborting journal on device vda-8. +[ 1901.131437] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.16: Detected aborted journal +[ 1901.131566] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.11: Detected aborted journal +[ 1901.132586] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.18: Detected aborted journal +[ 1901.132751] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.9: Detected aborted journal +[ 1901.136149] EXT4-fs error (device vda) in ext4_reserve_inode_write:6035: Journal has aborted +[ 1901.136837] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-fuzzer: Detected aborted journal +[ 1901.136915] ================================================================== +[ 1901.138175] BUG: KASAN: null-ptr-deref in __ext4_journal_ensure_credits+0x74/0x140 [ext4] +[ 1901.138343] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.13: Detected aborted journal +[ 1901.138398] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.1: Detected aborted journal +[ 1901.138808] Read of size 8 at addr 0000000000000000 by task syz-executor.17/968 +[ 1901.138817] +[ 1901.138852] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.30: Detected aborted journal +[ 1901.144779] CPU: 1 PID: 968 Comm: syz-executor.17 Not tainted 4.19.90-vhulk2111.1.0.h893.eulerosv2r10.aarch64+ #1 +[ 1901.146479] Hardware name: linux,dummy-virt (DT) +[ 1901.147317] Call trace: +[ 1901.147552] dump_backtrace+0x0/0x2d8 +[ 1901.147898] show_stack+0x28/0x38 +[ 1901.148215] dump_stack+0xec/0x15c +[ 1901.148746] kasan_report+0x108/0x338 +[ 1901.149207] __asan_load8+0x58/0xb0 +[ 1901.149753] __ext4_journal_ensure_credits+0x74/0x140 [ext4] +[ 1901.150579] ext4_xattr_delete_inode+0xe4/0x700 [ext4] +[ 1901.151316] ext4_evict_inode+0x524/0xba8 [ext4] +[ 1901.151985] evict+0x1a4/0x378 +[ 1901.152353] 
iput+0x310/0x428 +[ 1901.152733] do_unlinkat+0x260/0x428 +[ 1901.153056] __arm64_sys_unlinkat+0x6c/0xc0 +[ 1901.153455] el0_svc_common+0xc8/0x320 +[ 1901.153799] el0_svc_handler+0xf8/0x160 +[ 1901.154265] el0_svc+0x10/0x218 +[ 1901.154682] ================================================================== + +This issue may happen like this: + Process1 Process2 +ext4_evict_inode + ext4_journal_start + ext4_truncate + ext4_ind_truncate + ext4_free_branches + ext4_ind_truncate_ensure_credits + ext4_journal_ensure_credits_fn + ext4_journal_restart + handle->h_transaction = NULL; + mount -o remount,abort /mnt + -> trigger JBD abort + start_this_handle -> will return failed + ext4_xattr_delete_inode + ext4_journal_ensure_credits + ext4_journal_ensure_credits_fn + __ext4_journal_ensure_credits + jbd2_handle_buffer_credits + journal = handle->h_transaction->t_journal; ->null-ptr-deref + +Now, indirect truncate process didn't handle error. To solve this issue +maybe simply add check handle is abort in '__ext4_journal_ensure_credits' +is enough, and I also think this is necessary. 
+ +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Link: https://lore.kernel.org/r/20211224100341.3299128-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4_jbd2.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c +index 6def7339056d..3477a16d08ae 100644 +--- a/fs/ext4/ext4_jbd2.c ++++ b/fs/ext4/ext4_jbd2.c +@@ -162,6 +162,8 @@ int __ext4_journal_ensure_credits(handle_t *handle, int check_cred, + { + if (!ext4_handle_valid(handle)) + return 0; ++ if (is_handle_aborted(handle)) ++ return -EROFS; + if (jbd2_handle_buffer_credits(handle) >= check_cred && + handle->h_revoke_credits >= revoke_cred) + return 0; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-off-by-one-errors-in-fast-commit-block-fill.patch @@ -0,0 +1,171 @@ +From 48a6a66db82b8043d298a630f22c62d43550cae5 Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:40 -0800 +Subject: [PATCH] ext4: fix off-by-one errors in fast-commit block filling +Git-commit: 48a6a66db82b8043d298a630f22c62d43550cae5 +Patch-mainline: v6.2-rc1 +References: bsc#1207628 + +Due to several different off-by-one errors, or perhaps due to a late +change in design that wasn't fully reflected in the code that was +actually merged, there are several very strange constraints on how +fast-commit blocks are filled with tlv entries: + +- tlvs must start at least 10 bytes before the end of the block, even + though the minimum tlv length is 8. Otherwise, the replay code will + ignore them. (BUG: ext4_fc_reserve_space() could violate this + requirement if called with a len of blocksize - 9 or blocksize - 8. + Fortunately, this doesn't seem to happen currently.) + +- tlvs must end at least 1 byte before the end of the block. Otherwise + the replay code will consider them to be invalid. 
This quirk + contributed to a bug (fixed by an earlier commit) where uninitialized + memory was being leaked to disk in the last byte of blocks. + +Also, strangely these constraints don't apply to the replay code in +e2fsprogs, which will accept any tlvs in the blocks (with no bounds +checks at all, but that is a separate issue...). + +Given that this all seems to be a bug, let's fix it by just filling +blocks with tlv entries in the natural way. + +Note that old kernels will be unable to replay fast-commit journals +created by kernels that have this commit. + +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-7-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 66 +++++++++++++++++++++---------------------- + 1 file changed, 33 insertions(+), 33 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 892fa7c7a768..7ed71c652f67 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -714,43 +714,43 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + struct buffer_head *bh; + int bsize = sbi->s_journal->j_blocksize; + int ret, off = sbi->s_fc_bytes % bsize; +- int pad_len; ++ int remaining; + u8 *dst; + + /* +- * After allocating len, we should have space at least for a 0 byte +- * padding. ++ * If 'len' is too long to fit in any block alongside a PAD tlv, then we ++ * cannot fulfill the request. + */ +- if (len + EXT4_FC_TAG_BASE_LEN > bsize) ++ if (len > bsize - EXT4_FC_TAG_BASE_LEN) + return NULL; + +- if (bsize - off - 1 > len + EXT4_FC_TAG_BASE_LEN) { +- /* +- * Only allocate from current buffer if we have enough space for +- * this request AND we have space to add a zero byte padding. 
+- */ +- if (!sbi->s_fc_bh) { +- ret = jbd2_fc_get_buf(EXT4_SB(sb)->s_journal, &bh); +- if (ret) +- return NULL; +- sbi->s_fc_bh = bh; +- } +- sbi->s_fc_bytes += len; +- return sbi->s_fc_bh->b_data + off; ++ if (!sbi->s_fc_bh) { ++ ret = jbd2_fc_get_buf(EXT4_SB(sb)->s_journal, &bh); ++ if (ret) ++ return NULL; ++ sbi->s_fc_bh = bh; + } +- /* Need to add PAD tag */ + dst = sbi->s_fc_bh->b_data + off; ++ ++ /* ++ * Allocate the bytes in the current block if we can do so while still ++ * leaving enough space for a PAD tlv. ++ */ ++ remaining = bsize - EXT4_FC_TAG_BASE_LEN - off; ++ if (len <= remaining) { ++ sbi->s_fc_bytes += len; ++ return dst; ++ } ++ ++ /* ++ * Else, terminate the current block with a PAD tlv, then allocate a new ++ * block and allocate the bytes at the start of that new block. ++ */ ++ + tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD); +- pad_len = bsize - off - 1 - EXT4_FC_TAG_BASE_LEN; +- tl.fc_len = cpu_to_le16(pad_len); ++ tl.fc_len = cpu_to_le16(remaining); + ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc); +- dst += EXT4_FC_TAG_BASE_LEN; +- if (pad_len > 0) { +- ext4_fc_memzero(sb, dst, pad_len, crc); +- dst += pad_len; +- } +- /* Don't leak uninitialized memory in the unused last byte. 
*/ +- *dst = 0; ++ ext4_fc_memzero(sb, dst + EXT4_FC_TAG_BASE_LEN, remaining, crc); + + ext4_fc_submit_bh(sb, false); + +@@ -758,7 +758,7 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + if (ret) + return NULL; + sbi->s_fc_bh = bh; +- sbi->s_fc_bytes = (sbi->s_fc_bytes / bsize + 1) * bsize + len; ++ sbi->s_fc_bytes += bsize - off + len; + return sbi->s_fc_bh->b_data; + } + +@@ -789,7 +789,7 @@ static int ext4_fc_write_tail(struct super_block *sb, u32 crc) + off = sbi->s_fc_bytes % bsize; + + tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_TAIL); +- tl.fc_len = cpu_to_le16(bsize - off - 1 + sizeof(struct ext4_fc_tail)); ++ tl.fc_len = cpu_to_le16(bsize - off + sizeof(struct ext4_fc_tail)); + sbi->s_fc_bytes = round_up(sbi->s_fc_bytes, bsize); + + ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, &crc); +@@ -2056,7 +2056,7 @@ static int ext4_fc_replay_scan(journal_t *journal, + state = &sbi->s_fc_replay_state; + + start = (u8 *)bh->b_data; +- end = (__u8 *)bh->b_data + journal->j_blocksize - 1; ++ end = start + journal->j_blocksize; + + if (state->fc_replay_expected_off == 0) { + state->fc_cur_tag = 0; +@@ -2077,7 +2077,7 @@ static int ext4_fc_replay_scan(journal_t *journal, + } + + state->fc_replay_expected_off++; +- for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; ++ for (cur = start; cur <= end - EXT4_FC_TAG_BASE_LEN; + cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { + ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; +@@ -2195,9 +2195,9 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh, + #endif + + start = (u8 *)bh->b_data; +- end = (__u8 *)bh->b_data + journal->j_blocksize - 1; ++ end = start + journal->j_blocksize; + +- for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; ++ for (cur = start; cur <= end - EXT4_FC_TAG_BASE_LEN; + cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { + ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; +-- +2.35.3 + --- /dev/null +++ 
b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-overhead-calculation-to-account-for-the-res.patch @@ -0,0 +1,41 @@ +From 10b01ee92df52c8d7200afead4d5e5f55a5c58b1 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Thu, 14 Apr 2022 21:31:27 -0400 +Subject: [PATCH] ext4: fix overhead calculation to account for the reserved + gdt blocks +Git-commit: 10b01ee92df52c8d7200afead4d5e5f55a5c58b1 +Patch-mainline: v5.18-rc4 +References: bsc#1200869 + +The kernel calculation was underestimating the overhead by not taking +into account the reserved gdt blocks. With this change, the overhead +calculated by the kernel matches the overhead calculation in mke2fs. + +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index f2a5e78f93a9..23a9b2c086ed 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -4177,9 +4177,11 @@ static int count_overhead(struct super_block *sb, ext4_group_t grp, + ext4_fsblk_t first_block, last_block, b; + ext4_group_t i, ngroups = ext4_get_groups_count(sb); + int s, j, count = 0; ++ int has_super = ext4_bg_has_super(sb, grp); + + if (!ext4_has_feature_bigalloc(sb)) +- return (ext4_bg_has_super(sb, grp) + ext4_bg_num_gdb(sb, grp) + ++ return (has_super + ext4_bg_num_gdb(sb, grp) + ++ (has_super ? 
le16_to_cpu(sbi->s_es->s_reserved_gdt_blocks) : 0) + + sbi->s_itb_per_group + 2); + + first_block = le32_to_cpu(sbi->s_es->s_first_data_block) + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-possible-double-unlock-when-moving-a-direct.patch @@ -0,0 +1,38 @@ +From 70e42feab2e20618ddd0cbfc4ab4b08628236ecd Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Fri, 17 Mar 2023 21:53:52 -0400 +Subject: [PATCH] ext4: fix possible double unlock when moving a directory +Git-commit: 70e42feab2e20618ddd0cbfc4ab4b08628236ecd +Patch-mainline: v6.3-rc3 +References: bsc#1210763 + +Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") +Link: https://lore.kernel.org/r/5efbe1b9-ad8b-4a4f-b422-24824d2b775c@kili.mountain +Reported-by: Dan Carpenter <error27@gmail.com> +Reported-by: syzbot+0c73d1d8b952c5f3d714@syzkaller.appspotmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 4 +--- + 1 file changed, 1 insertion(+), 3 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 31e21de56432..a5010b5b8a8c 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3884,10 +3884,8 @@ static int ext4_rename(struct mnt_idmap *idmap, struct inode *old_dir, + goto end_rename; + } + retval = ext4_rename_dir_prepare(handle, &old); +- if (retval) { +- inode_unlock(old.inode); ++ if (retval) + goto end_rename; +- } + } + /* + * If we're renaming a file within an inline_data dir and adding or +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_mod.patch @@ -0,0 +1,50 @@ +From 9305721a309fa1bd7c194e0d4a2335bf3b29dca4 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Wed, 21 Sep 2022 14:40:38 +0800 +Subject: [PATCH] ext4: fix potential memory leak in + ext4_fc_record_modified_inode() +Git-commit: 9305721a309fa1bd7c194e0d4a2335bf3b29dca4 +Patch-mainline: 
v6.1-rc1 +References: bsc#1207611 + +As krealloc may return NULL, in this case 'state->fc_modified_inodes' +may not be freed by krealloc, but 'state->fc_modified_inodes' already +set NULL. Then will lead to 'state->fc_modified_inodes' memory leak. + +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220921064040.3693255-2-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 8 +++++--- + 1 file changed, 5 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 9217a588afd1..9555ab816d7d 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1486,13 +1486,15 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino) + if (state->fc_modified_inodes[i] == ino) + return 0; + if (state->fc_modified_inodes_used == state->fc_modified_inodes_size) { +- state->fc_modified_inodes = krealloc( +- state->fc_modified_inodes, ++ int *fc_modified_inodes; ++ ++ fc_modified_inodes = krealloc(state->fc_modified_inodes, + sizeof(int) * (state->fc_modified_inodes_size + + EXT4_FC_REPLAY_REALLOC_INCREMENT), + GFP_KERNEL); +- if (!state->fc_modified_inodes) ++ if (!fc_modified_inodes) + return -ENOMEM; ++ state->fc_modified_inodes = fc_modified_inodes; + state->fc_modified_inodes_size += + EXT4_FC_REPLAY_REALLOC_INCREMENT; + } +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_reg.patch @@ -0,0 +1,54 @@ +From 7069d105c1f15c442b68af43f7fde784f3126739 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Wed, 21 Sep 2022 14:40:39 +0800 +Subject: [PATCH] ext4: fix potential memory leak in ext4_fc_record_regions() +Git-commit: 7069d105c1f15c442b68af43f7fde784f3126739 +Patch-mainline: v6.1-rc1 +References: bsc#1207612 + +As krealloc may return NULL, in this case 
'state->fc_regions' may not be +freed by krealloc, but 'state->fc_regions' already set NULL. Then will +lead to 'state->fc_regions' memory leak. + +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220921064040.3693255-3-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 14 ++++++++------ + 1 file changed, 8 insertions(+), 6 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 9555ab816d7d..5ab58cb4ce8d 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1679,15 +1679,17 @@ int ext4_fc_record_regions(struct super_block *sb, int ino, + if (replay && state->fc_regions_used != state->fc_regions_valid) + state->fc_regions_used = state->fc_regions_valid; + if (state->fc_regions_used == state->fc_regions_size) { ++ struct ext4_fc_alloc_region *fc_regions; ++ + state->fc_regions_size += + EXT4_FC_REPLAY_REALLOC_INCREMENT; +- state->fc_regions = krealloc( +- state->fc_regions, +- state->fc_regions_size * +- sizeof(struct ext4_fc_alloc_region), +- GFP_KERNEL); +- if (!state->fc_regions) ++ fc_regions = krealloc(state->fc_regions, ++ state->fc_regions_size * ++ sizeof(struct ext4_fc_alloc_region), ++ GFP_KERNEL); ++ if (!fc_regions) + return -ENOMEM; ++ state->fc_regions = fc_regions; + } + region = &state->fc_regions[state->fc_regions_used++]; + region->ino = ino; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-potential-out-of-bound-read-in-ext4_fc_repl.patch @@ -0,0 +1,97 @@ +From 1b45cc5c7b920fd8bf72e5a888ec7abeadf41e09 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Sat, 24 Sep 2022 15:52:33 +0800 +Subject: [PATCH] ext4: fix potential out of bound read in + ext4_fc_replay_scan() +Git-commit: 1b45cc5c7b920fd8bf72e5a888ec7abeadf41e09 +Patch-mainline: v6.1-rc1 +References: bsc#1207616 + +For scan loop 
must ensure that at least EXT4_FC_TAG_BASE_LEN space. If remain +space less than EXT4_FC_TAG_BASE_LEN which will lead to out of bound read +when mounting corrupt file system image. +ADD_RANGE/HEAD/TAIL is needed to add extra check when do journal scan, as this +three tags will read data during scan, tag length couldn't less than data length +which will read. + +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Link: https://lore.kernel.org/r/20220924075233.2315259-4-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 38 ++++++++++++++++++++++++++++++++++++-- + 1 file changed, 36 insertions(+), 2 deletions(-) + +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1911,6 +1911,34 @@ void ext4_fc_replay_cleanup(struct super + kfree(sbi->s_fc_replay_state.fc_modified_inodes); + } + ++static inline bool ext4_fc_tag_len_isvalid(struct ext4_fc_tl *tl, ++ u8 *val, u8 *end) ++{ ++ if (val + tl->fc_len > end) ++ return false; ++ ++ /* Here only check ADD_RANGE/TAIL/HEAD which will read data when do ++ * journal rescan before do CRC check. Other tags length check will ++ * rely on CRC check. 
++ */ ++ switch (tl->fc_tag) { ++ case EXT4_FC_TAG_ADD_RANGE: ++ return (sizeof(struct ext4_fc_add_range) == tl->fc_len); ++ case EXT4_FC_TAG_TAIL: ++ return (sizeof(struct ext4_fc_tail) <= tl->fc_len); ++ case EXT4_FC_TAG_HEAD: ++ return (sizeof(struct ext4_fc_head) == tl->fc_len); ++ case EXT4_FC_TAG_DEL_RANGE: ++ case EXT4_FC_TAG_LINK: ++ case EXT4_FC_TAG_UNLINK: ++ case EXT4_FC_TAG_CREAT: ++ case EXT4_FC_TAG_INODE: ++ case EXT4_FC_TAG_PAD: ++ default: ++ return true; ++ } ++} ++ + /* + * Recovery Scan phase handler + * +@@ -1967,10 +1995,15 @@ static int ext4_fc_replay_scan(journal_t + } + + state->fc_replay_expected_off++; +- for (cur = start; cur < end; ++ for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; + cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { + ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; ++ if (!ext4_fc_tag_len_isvalid(&tl, val, end)) { ++ ret = state->fc_replay_num_tags ? ++ JBD2_FC_REPLAY_STOP : -ECANCELED; ++ goto out_err; ++ } + jbd_debug(3, "Scan phase, tag:%s, blk %lld\n", + tag2str(tl.fc_tag), bh->b_blocknr); + switch (tl.fc_tag) { +@@ -2081,7 +2114,7 @@ static int ext4_fc_replay(journal_t *jou + start = (u8 *)bh->b_data; + end = (__u8 *)bh->b_data + journal->j_blocksize - 1; + +- for (cur = start; cur < end; ++ for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; + cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { + ext4_fc_get_tl(&tl, cur); + val = cur + EXT4_FC_TAG_BASE_LEN; +@@ -2091,6 +2124,7 @@ static int ext4_fc_replay(journal_t *jou + ext4_fc_set_bitmaps_and_counters(sb); + break; + } ++ + jbd_debug(3, "Replay phase, tag:%s\n", tag2str(tl.fc_tag)); + state->fc_replay_num_tags--; + switch (tl.fc_tag) { --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-race-condition-between-ext4_write-and-ext4_.patch @@ -0,0 +1,133 @@ +From f87c7a4b084afc13190cbb263538e444cb2b392a Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Thu, 28 Apr 2022 21:40:31 +0800 +Subject: [PATCH] ext4: fix 
race condition between ext4_write and + ext4_convert_inline_data +Git-commit: f87c7a4b084afc13190cbb263538e444cb2b392a +Patch-mainline: v5.19-rc1 +References: bsc#1200807 + +Hulk Robot reported a BUG_ON: + ================================================================== + EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0, + block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters + kernel BUG at fs/ext4/ext4_jbd2.c:53! + invalid opcode: 0000 [#1] SMP KASAN PTI + CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1 + RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline] + RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116 + [...] + Call Trace: + ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795 + generic_perform_write+0x279/0x3c0 mm/filemap.c:3344 + ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270 + ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520 + do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732 + do_iter_write+0x107/0x430 fs/read_write.c:861 + vfs_writev fs/read_write.c:934 [inline] + do_pwritev+0x1e5/0x380 fs/read_write.c:1031 + [...] 
+ ================================================================== + +Above issue may happen as follows: + cpu1 cpu2 +__________________________|__________________________ +do_pwritev + vfs_writev + do_iter_write + ext4_file_write_iter + ext4_buffered_write_iter + generic_perform_write + ext4_da_write_begin + vfs_fallocate + ext4_fallocate + ext4_convert_inline_data + ext4_convert_inline_data_nolock + ext4_destroy_inline_data_nolock + clear EXT4_STATE_MAY_INLINE_DATA + ext4_map_blocks + ext4_ext_map_blocks + ext4_mb_new_blocks + ext4_mb_regular_allocator + ext4_mb_good_group_nolock + ext4_mb_init_group + ext4_mb_init_cache + ext4_mb_generate_buddy --> error + ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA) + ext4_restore_inline_data + set EXT4_STATE_MAY_INLINE_DATA + ext4_block_write_begin + ext4_da_write_end + ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA) + ext4_write_inline_data_end + handle=NULL + ext4_journal_stop(handle) + __ext4_journal_stop + ext4_put_nojournal(handle) + ref_cnt = (unsigned long)handle + BUG_ON(ref_cnt == 0) ---> BUG_ON + +The lock held by ext4_convert_inline_data is xattr_sem, but the lock +held by generic_perform_write is i_rwsem. Therefore, the two locks can +be concurrent. + +To solve above issue, we add inode_lock() for ext4_convert_inline_data(). +At the same time, move ext4_convert_inline_data() in front of +ext4_punch_hole(), remove similar handling from ext4_punch_hole(). 
+ +Fixes: 0c8d414f163f ("ext4: let fallocate handle inline data correctly") +Cc: stable@vger.kernel.org +Reported-by: Hulk Robot <hulkci@huawei.com> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220428134031.4153381-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents.c | 10 ++++++---- + fs/ext4/inode.c | 9 --------- + 2 files changed, 6 insertions(+), 13 deletions(-) + +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -4698,15 +4698,17 @@ long ext4_fallocate(struct file *file, i + + ext4_fc_start_update(inode); + ++ inode_lock(inode); ++ ret = ext4_convert_inline_data(inode); ++ inode_unlock(inode); ++ if (ret) ++ goto exit; ++ + if (mode & FALLOC_FL_PUNCH_HOLE) { + ret = ext4_punch_hole(file, offset, len); + goto exit; + } + +- ret = ext4_convert_inline_data(inode); +- if (ret) +- goto exit; +- + if (mode & FALLOC_FL_COLLAPSE_RANGE) { + ret = ext4_collapse_range(file, offset, len); + goto exit; +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4005,15 +4005,6 @@ int ext4_punch_hole(struct file *file, l + + trace_ext4_punch_hole(inode, offset, length, 0); + +- ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); +- if (ext4_has_inline_data(inode)) { +- filemap_invalidate_lock(mapping); +- ret = ext4_convert_inline_data(inode); +- filemap_invalidate_unlock(mapping); +- if (ret) +- return ret; +- } +- + /* + * Write out all dirty pages to avoid race conditions + * Then release them. 
--- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-reserved-cluster-accounting-in-__es_remove_.patch @@ -0,0 +1,96 @@ +From 1da18e38cb97e9521e93d63034521a9649524f64 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 8 Dec 2022 11:34:24 +0800 +Subject: [PATCH] ext4: fix reserved cluster accounting in __es_remove_extent() +Git-commit: 1da18e38cb97e9521e93d63034521a9649524f64 +Patch-mainline: v6.2-rc1 +References: bsc#1207637 + +When bigalloc is enabled, reserved cluster accounting for delayed +allocation is handled in extent_status.c. With a corrupted file +system, it's possible for this accounting to be incorrect, +discovered by Syzbot: + +EXT4-fs error (device loop0): ext4_validate_block_bitmap:398: comm rep: + bg 0: block 5: invalid block bitmap +EXT4-fs (loop0): Delayed block allocation failed for inode 18 at logical + offset 0 with max blocks 32 with error 28 +EXT4-fs (loop0): This should not happen!! Data will be lost + +EXT4-fs (loop0): Total free blocks count 0 +EXT4-fs (loop0): Free/Dirty block details +EXT4-fs (loop0): free_blocks=0 +EXT4-fs (loop0): dirty_blocks=32 +EXT4-fs (loop0): Block reservation details +EXT4-fs (loop0): i_reserved_data_blocks=2 +EXT4-fs (loop0): Inode 18 (00000000845cd634): + i_reserved_data_blocks (1) not cleared! + +Above issue happens as follows: +Assume: +sbi->s_cluster_ratio = 16 +Step1: +Insert delay block [0, 31] -> ei->i_reserved_data_blocks=2 +Step2: +ext4_writepages + mpage_map_and_submit_extent -> return failed + mpage_release_unused_pages -> to release [0, 30] + ext4_es_remove_extent -> remove lblk=0 end=30 + __es_remove_extent -> len1=0 len2=31-30=1 + __es_remove_extent: + ... + if (len2 > 0) { + ... + if (len1 > 0) { + ... + } else { + es->es_lblk = end + 1; + es->es_len = len2; + ... + } + if (count_reserved) + count_rsvd(inode, lblk, ...); + goto out; -> will return but didn't calculate 'reserved' + ... 
+Step3: +ext4_destroy_inode -> trigger "i_reserved_data_blocks (1) not cleared!" + +To solve above issue if 'len2>0' call 'get_rsvd()' before goto out. + +Reported-by: syzbot+05a0f0ccab4a25626e38@syzkaller.appspotmail.com +Fixes: 8fcc3a580651 ("ext4: rework reserved cluster accounting when invalidating pages") +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Eric Whitney <enwlinux@gmail.com> +Link: https://lore.kernel.org/r/20221208033426.1832460-2-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents_status.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c +index 97eccc0028a1..7bc221038c6c 100644 +--- a/fs/ext4/extents_status.c ++++ b/fs/ext4/extents_status.c +@@ -1369,7 +1369,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, + if (count_reserved) + count_rsvd(inode, lblk, orig_es.es_len - len1 - len2, + &orig_es, &rc); +- goto out; ++ goto out_get_reserved; + } + + if (len1 > 0) { +@@ -1411,6 +1411,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, + } + } + ++out_get_reserved: + if (count_reserved) + *reserved = get_rsvd(inode, end, es, &rc); + out: +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-super-block-checksum-incorrect-after-mount.patch @@ -0,0 +1,78 @@ +From 9b6641dd95a0c441b277dd72ba22fed8d61f76ad Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Wed, 25 May 2022 09:29:04 +0800 +Subject: [PATCH] ext4: fix super block checksum incorrect after mount +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit +Git-commit: 9b6641dd95a0c441b277dd72ba22fed8d61f76ad +Patch-mainline: v5.19-rc3 +References: bsc#1202773 + +We got issue as follows: +[home]# mount /dev/sda test +EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is 
recommended +[home]# dmesg +EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended +EXT4-fs (sda): Errors on filesystem, clearing orphan list. +EXT4-fs (sda): recovery complete +EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none. +[home]# debugfs /dev/sda +debugfs 1.46.5 (30-Dec-2021) +Checksum errors in superblock! Retrying... + +Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update +super block checksum. + +To solve above issue, defer update super block checksum after +ext4_orphan_cleanup. + +Signed-off-by: Ye Bin <yebin10@huawei.com> +Cc: stable@kernel.org +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> +Link: https://lore.kernel.org/r/20220525012904.1604737-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 16 ++++++++-------- + 1 file changed, 8 insertions(+), 8 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index b2ecae8adbfc..13d562d11235 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5302,14 +5302,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) + err = percpu_counter_init(&sbi->s_freeinodes_counter, freei, + GFP_KERNEL); + } +- /* +- * Update the checksum after updating free space/inode +- * counters. Otherwise the superblock can have an incorrect +- * checksum in the buffer cache until it is written out and +- * e2fsprogs programs trying to open a file system immediately +- * after it is mounted can fail. 
+- */ +- ext4_superblock_csum_set(sb); + if (!err) + err = percpu_counter_init(&sbi->s_dirs_counter, + ext4_count_dirs(sb), GFP_KERNEL); +@@ -5367,6 +5359,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) + EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS; + ext4_orphan_cleanup(sb, es); + EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS; ++ /* ++ * Update the checksum after updating free space/inode counters and ++ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect ++ * checksum in the buffer cache until it is written out and ++ * e2fsprogs programs trying to open a file system immediately ++ * after it is mounted can fail. ++ */ ++ ext4_superblock_csum_set(sb); + if (needs_recovery) { + ext4_msg(sb, KERN_INFO, "recovery complete"); + err = ext4_mark_recovery_complete(sb, es); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-symlink-file-size-not-match-to-file-content.patch @@ -0,0 +1,56 @@ +From a2b0b205d125f27cddfb4f7280e39affdaf46686 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Mon, 21 Mar 2022 22:44:38 +0800 +Subject: [PATCH] ext4: fix symlink file size not match to file content +Git-commit: a2b0b205d125f27cddfb4f7280e39affdaf46686 +Patch-mainline: v5.18-rc4 +References: bsc#1200868 + +We got issue as follows: +[home]# fsck.ext4 -fn ram0yb +e2fsck 1.45.6 (20-Mar-2020) +Pass 1: Checking inodes, blocks, and sizes +Pass 2: Checking directory structure +Symlink /p3/d14/d1a/l3d (inode #3494) is invalid. +Clear? no +Entry 'l3d' in /p3/d14/d1a (3383) has an incorrect filetype (was 7, should be 0). +Fix? no + +As the symlink file size does not match the file content. If the writeback +of the symlink data block failed, ext4_finish_bio() handles the end of IO. +However this function fails to mark the buffer with BH_write_io_error and +so when unmount does journal checkpoint it cannot detect the writeback +error and will cleanup the journal. 
Thus we've lost the correct data in the +journal area. To solve this issue, mark the buffer as BH_write_io_error in +ext4_finish_bio(). + +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220321144438.201685-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/page-io.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c +index 1d370364230e..40b7d8485b44 100644 +--- a/fs/ext4/page-io.c ++++ b/fs/ext4/page-io.c +@@ -134,8 +134,10 @@ static void ext4_finish_bio(struct bio *bio) + continue; + } + clear_buffer_async_write(bh); +- if (bio->bi_status) ++ if (bio->bi_status) { ++ set_buffer_write_io_error(bh); + buffer_io_error(bh); ++ } + } while ((bh = bh->b_this_page) != head); + spin_unlock_irqrestore(&head->b_uptodate_lock, flags); + if (!under_io) { +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-task-hung-in-ext4_xattr_delete_inode.patch @@ -0,0 +1,97 @@ +From 0f7bfd6f8164be32dbbdf36aa1e5d00485c53cd7 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Tue, 10 Jan 2023 21:34:36 +0800 +Subject: [PATCH] ext4: fix task hung in ext4_xattr_delete_inode +Git-commit: 0f7bfd6f8164be32dbbdf36aa1e5d00485c53cd7 +Patch-mainline: v6.3-rc1 +References: bsc#1213096 + +Syzbot reported a hung task problem: +================================================================== +INFO: task syz-executor232:5073 blocked for more than 143 seconds. + Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 +"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
+task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 +Call Trace: + <TASK> + context_switch kernel/sched/core.c:5244 [inline] + __schedule+0x995/0xe20 kernel/sched/core.c:6555 + schedule+0xcb/0x190 kernel/sched/core.c:6631 + __wait_on_freeing_inode fs/inode.c:2196 [inline] + find_inode_fast+0x35a/0x4c0 fs/inode.c:950 + iget_locked+0xb1/0x830 fs/inode.c:1273 + __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 + ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 + ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 + ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 + ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 + evict+0x2a4/0x620 fs/inode.c:664 + ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 + __ext4_fill_super fs/ext4/super.c:5516 [inline] + ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 + get_tree_bdev+0x400/0x620 fs/super.c:1282 + vfs_get_tree+0x88/0x270 fs/super.c:1489 + do_new_mount+0x289/0xad0 fs/namespace.c:3145 + do_mount fs/namespace.c:3488 [inline] + __do_sys_mount fs/namespace.c:3697 [inline] + __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd +RIP: 0033:0x7fa5406fd5ea +RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 +RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea +RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 +RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 +R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 +R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 + </TASK> +================================================================== + +The problem is that the inode contains an xattr entry with ea_inum of 15 +when cleaning up an orphan inode <15>. When evict inode <15>, the reference +counting of the corresponding EA inode is decreased.
When EA inode <15> is +found by find_inode_fast() in __ext4_iget(), it is found that the EA inode +holds the I_FREEING flag and waits for the EA inode to complete deletion. +As a result, when inode <15> is being deleted, we wait for inode <15> to +complete the deletion, resulting in an infinite loop and triggering Hung +Task. To solve this problem, we only need to check whether the ino of EA +inode and parent is the same before getting EA inode. + +Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 +Reported-by: syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230110133436.996350-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 11 +++++++++++ + 1 file changed, 11 insertions(+) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index d8fef540ca9b..863c15388848 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -422,6 +422,17 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, + struct inode *inode; + int err; + ++ /* ++ * We have to check for this corruption early as otherwise ++ * iget_locked() could wait indefinitely for the state of our ++ * parent inode. 
++ */ ++ if (parent->i_ino == ea_ino) { ++ ext4_error(parent->i_sb, ++ "Parent and EA inode have the same ino %lu", ea_ino); ++ return -EFSCORRUPTED; ++ } ++ + inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_NORMAL); + if (IS_ERR(inode)) { + err = PTR_ERR(inode); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-to-check-return-value-of-freeze_bdev-in-ext.patch @@ -0,0 +1,49 @@ +From c4d13222afd8a64bf11bc7ec68645496ee8b54b9 Mon Sep 17 00:00:00 2001 +From: Chao Yu <chao@kernel.org> +Date: Tue, 6 Jun 2023 15:32:03 +0800 +Subject: [PATCH] ext4: fix to check return value of freeze_bdev() in + ext4_shutdown() +Git-commit: c4d13222afd8a64bf11bc7ec68645496ee8b54b9 +Patch-mainline: v6.5-rc1 +References: bsc#1213021 + +freeze_bdev() can fail due to a lot of reasons, it needs to check its +reason before later process. + +Fixes: 783d94854499 ("ext4: add EXT4_IOC_GOINGDOWN ioctl") +Cc: stable@kernel.org +Signed-off-by: Chao Yu <chao@kernel.org> +Link: https://lore.kernel.org/r/20230606073203.1310389-1-chao@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ioctl.c | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c +index f9a430152063..55be1b8a6360 100644 +--- a/fs/ext4/ioctl.c ++++ b/fs/ext4/ioctl.c +@@ -797,6 +797,7 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg) + { + struct ext4_sb_info *sbi = EXT4_SB(sb); + __u32 flags; ++ int ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; +@@ -815,7 +816,9 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg) + + switch (flags) { + case EXT4_GOING_FLAGS_DEFAULT: +- freeze_bdev(sb->s_bdev); ++ ret = freeze_bdev(sb->s_bdev); ++ if (ret) ++ return ret; + set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags); + thaw_bdev(sb->s_bdev); + break; +-- +2.35.3 + --- /dev/null +++ 
b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-unaligned-memory-access-in-ext4_fc_reserve_.patch @@ -0,0 +1,104 @@ +From 8415ce07ecf0cc25efdd5db264a7133716e503cf Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Sun, 6 Nov 2022 14:48:39 -0800 +Subject: [PATCH] ext4: fix unaligned memory access in ext4_fc_reserve_space() +Git-commit: 8415ce07ecf0cc25efdd5db264a7133716e503cf +Patch-mainline: v6.2-rc1 +References: bsc#1207627 + +As is done elsewhere in the file, build the struct ext4_fc_tl on the +stack and memcpy() it into the buffer, rather than directly writing it +to a potentially-unaligned location in the buffer. + +Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") +Cc: <stable@vger.kernel.org> # v5.10+ +Signed-off-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221106224841.279231-6-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 39 +++++++++++++++++++++------------------ + 1 file changed, 21 insertions(+), 18 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index d5ad4b2b235d..892fa7c7a768 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -675,6 +675,15 @@ static void ext4_fc_submit_bh(struct super_block *sb, bool is_tail) + + /* Ext4 commit path routines */ + ++/* memcpy to fc reserved space and update CRC */ ++static void *ext4_fc_memcpy(struct super_block *sb, void *dst, const void *src, ++ int len, u32 *crc) ++{ ++ if (crc) ++ *crc = ext4_chksum(EXT4_SB(sb), *crc, src, len); ++ return memcpy(dst, src, len); ++} ++ + /* memzero and update CRC */ + static void *ext4_fc_memzero(struct super_block *sb, void *dst, int len, + u32 *crc) +@@ -700,12 +709,13 @@ static void *ext4_fc_memzero(struct super_block *sb, void *dst, int len, + */ + static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + { +- struct ext4_fc_tl *tl; ++ struct ext4_fc_tl tl; + 
struct ext4_sb_info *sbi = EXT4_SB(sb); + struct buffer_head *bh; + int bsize = sbi->s_journal->j_blocksize; + int ret, off = sbi->s_fc_bytes % bsize; + int pad_len; ++ u8 *dst; + + /* + * After allocating len, we should have space at least for a 0 byte +@@ -729,16 +739,18 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + return sbi->s_fc_bh->b_data + off; + } + /* Need to add PAD tag */ +- tl = (struct ext4_fc_tl *)(sbi->s_fc_bh->b_data + off); +- tl->fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD); ++ dst = sbi->s_fc_bh->b_data + off; ++ tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD); + pad_len = bsize - off - 1 - EXT4_FC_TAG_BASE_LEN; +- tl->fc_len = cpu_to_le16(pad_len); +- if (crc) +- *crc = ext4_chksum(sbi, *crc, tl, EXT4_FC_TAG_BASE_LEN); +- if (pad_len > 0) +- ext4_fc_memzero(sb, tl + 1, pad_len, crc); ++ tl.fc_len = cpu_to_le16(pad_len); ++ ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc); ++ dst += EXT4_FC_TAG_BASE_LEN; ++ if (pad_len > 0) { ++ ext4_fc_memzero(sb, dst, pad_len, crc); ++ dst += pad_len; ++ } + /* Don't leak uninitialized memory in the unused last byte. */ +- *((u8 *)(tl + 1) + pad_len) = 0; ++ *dst = 0; + + ext4_fc_submit_bh(sb, false); + +@@ -750,15 +762,6 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc) + return sbi->s_fc_bh->b_data; + } + +-/* memcpy to fc reserved space and update CRC */ +-static void *ext4_fc_memcpy(struct super_block *sb, void *dst, const void *src, +- int len, u32 *crc) +-{ +- if (crc) +- *crc = ext4_chksum(EXT4_SB(sb), *crc, src, len); +- return memcpy(dst, src, len); +-} +- + /* + * Complete a fast commit by writing tail tag. 
+ * +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-undefined-behavior-in-bit-shift-for-ext4_ch.patch @@ -0,0 +1,54 @@ +From 3bf678a0f9c017c9ba7c581541dbc8453452a7ae Mon Sep 17 00:00:00 2001 +From: Gaosheng Cui <cuigaosheng1@huawei.com> +Date: Mon, 31 Oct 2022 13:58:33 +0800 +Subject: [PATCH] ext4: fix undefined behavior in bit shift for + ext4_check_flag_values +Git-commit: 3bf678a0f9c017c9ba7c581541dbc8453452a7ae +Patch-mainline: v6.2-rc1 +References: bsc#1206890 + +Shifting signed 32-bit value by 31 bits is undefined, so changing +significant bit to unsigned. The UBSAN warning calltrace like below: + +UBSAN: shift-out-of-bounds in fs/ext4/ext4.h:591:2 +left shift of 1 by 31 places cannot be represented in type 'int' +Call Trace: + <TASK> + dump_stack_lvl+0x7d/0xa5 + dump_stack+0x15/0x1b + ubsan_epilogue+0xe/0x4e + __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c + ext4_init_fs+0x5a/0x277 + do_one_initcall+0x76/0x430 + kernel_init_freeable+0x3b3/0x422 + kernel_init+0x24/0x1e0 + ret_from_fork+0x1f/0x30 + </TASK> + +Fixes: 9a4c80194713 ("ext4: ensure Inode flags consistency are checked at build time") +Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> +Link: https://lore.kernel.org/r/20221031055833.3966222-1-cuigaosheng1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 2b574b143bde..3afdd99bb214 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -558,7 +558,7 @@ enum { + * + * It's not paranoia if the Murphy's Law really *is* out to get you.
:-) + */ +-#define TEST_FLAG_VALUE(FLAG) (EXT4_##FLAG##_FL == (1 << EXT4_INODE_##FLAG)) ++#define TEST_FLAG_VALUE(FLAG) (EXT4_##FLAG##_FL == (1U << EXT4_INODE_##FLAG)) + #define CHECK_FLAG_VALUE(FLAG) BUILD_BUG_ON(!TEST_FLAG_VALUE(FLAG)) + + static inline void ext4_check_flag_values(void) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-uninititialized-value-in-ext4_evict_inode.patch @@ -0,0 +1,97 @@ +From 7ea71af94eaaaf6d9aed24bc94a05b977a741cb9 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 17 Nov 2022 15:36:03 +0800 +Subject: [PATCH] ext4: fix uninititialized value in 'ext4_evict_inode' +Git-commit: 7ea71af94eaaaf6d9aed24bc94a05b977a741cb9 +Patch-mainline: v6.2-rc1 +References: bsc#1206893 + +Syzbot found the following issue: +===================================================== +BUG: KMSAN: uninit-value in ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180 + ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180 + evict+0x365/0x9a0 fs/inode.c:664 + iput_final fs/inode.c:1747 [inline] + iput+0x985/0xdd0 fs/inode.c:1773 + __ext4_new_inode+0xe54/0x7ec0 fs/ext4/ialloc.c:1361 + ext4_mknod+0x376/0x840 fs/ext4/namei.c:2844 + vfs_mknod+0x79d/0x830 fs/namei.c:3914 + do_mknodat+0x47d/0xaa0 + __do_sys_mknodat fs/namei.c:3992 [inline] + __se_sys_mknodat fs/namei.c:3989 [inline] + __ia32_sys_mknodat+0xeb/0x150 fs/namei.c:3989 + do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] + __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 + do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203 + do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246 + entry_SYSENTER_compat_after_hwframe+0x70/0x82 + +Uninit was created at: + __alloc_pages+0x9f1/0xe80 mm/page_alloc.c:5578 + alloc_pages+0xaae/0xd80 mm/mempolicy.c:2285 + alloc_slab_page mm/slub.c:1794 [inline] + allocate_slab+0x1b5/0x1010 mm/slub.c:1939 + new_slab mm/slub.c:1992 [inline] + ___slab_alloc+0x10c3/0x2d60 mm/slub.c:3180 + __slab_alloc mm/slub.c:3279
[inline] + slab_alloc_node mm/slub.c:3364 [inline] + slab_alloc mm/slub.c:3406 [inline] + __kmem_cache_alloc_lru mm/slub.c:3413 [inline] + kmem_cache_alloc_lru+0x6f3/0xb30 mm/slub.c:3429 + alloc_inode_sb include/linux/fs.h:3117 [inline] + ext4_alloc_inode+0x5f/0x860 fs/ext4/super.c:1321 + alloc_inode+0x83/0x440 fs/inode.c:259 + new_inode_pseudo fs/inode.c:1018 [inline] + new_inode+0x3b/0x430 fs/inode.c:1046 + __ext4_new_inode+0x2a7/0x7ec0 fs/ext4/ialloc.c:959 + ext4_mkdir+0x4d5/0x1560 fs/ext4/namei.c:2992 + vfs_mkdir+0x62a/0x870 fs/namei.c:4035 + do_mkdirat+0x466/0x7b0 fs/namei.c:4060 + __do_sys_mkdirat fs/namei.c:4075 [inline] + __se_sys_mkdirat fs/namei.c:4073 [inline] + __ia32_sys_mkdirat+0xc4/0x120 fs/namei.c:4073 + do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline] + __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178 + do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203 + do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246 + entry_SYSENTER_compat_after_hwframe+0x70/0x82 + +CPU: 1 PID: 4625 Comm: syz-executor.2 Not tainted 6.1.0-rc4-syzkaller-62821-gcb231e2f67ec #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 +===================================================== + +Now, 'ext4_alloc_inode()' didn't init 'ei->i_flags'. If new inode failed +before set 'ei->i_flags' in '__ext4_new_inode()', then do 'iput()'. As after +6bc0d63dad7f commit will access 'ei->i_flags' in 'ext4_evict_inode()' which +will lead to access uninit-value. +To solve above issue just init 'ei->i_flags' in 'ext4_alloc_inode()'.
+ +Reported-by: syzbot+57b25da729eb0b88177d@syzkaller.appspotmail.com +Signed-off-by: Ye Bin <yebin10@huawei.com> +Fixes: 6bc0d63dad7f ("ext4: remove EA inode entry from mbcache on inode eviction") +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221117073603.2598882-1-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 878be47faaaf..28d009151d23 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -1324,6 +1324,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb) + return NULL; + + inode_set_iversion(&ei->vfs_inode, 1); ++ ei->i_flags = 0; + spin_lock_init(&ei->i_raw_lock); + INIT_LIST_HEAD(&ei->i_prealloc_list); + atomic_set(&ei->i_prealloc_active, 0); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_ext_shift_extents.patch @@ -0,0 +1,106 @@ +From f6b1a1cf1c3ee430d3f5e47847047ce789a690aa Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Thu, 22 Sep 2022 20:04:34 +0800 +Subject: [PATCH] ext4: fix use-after-free in ext4_ext_shift_extents +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit +Git-commit: f6b1a1cf1c3ee430d3f5e47847047ce789a690aa +Patch-mainline: v6.1-rc7 +References: bsc#1206888 + +If the starting position of our insert range happens to be in the hole +between the two ext4_extent_idx, because the lblk of the ext4_extent in +the previous ext4_extent_idx is always less than the start, which leads +to the "extent" variable access across the boundary, the following UAF is +triggered: +================================================================== +BUG: KASAN: use-after-free in ext4_ext_shift_extents+0x257/0x790 +Read of size 4 at addr
ffff88819807a008 by task fallocate/8010 +CPU: 3 PID: 8010 Comm: fallocate Tainted: G E 5.10.0+ #492 +Call Trace: + dump_stack+0x7d/0xa3 + print_address_description.constprop.0+0x1e/0x220 + kasan_report.cold+0x67/0x7f + ext4_ext_shift_extents+0x257/0x790 + ext4_insert_range+0x5b6/0x700 + ext4_fallocate+0x39e/0x3d0 + vfs_fallocate+0x26f/0x470 + ksys_fallocate+0x3a/0x70 + __x64_sys_fallocate+0x4f/0x60 + do_syscall_64+0x33/0x40 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 +================================================================== + +For right shifts, we can divide them into the following situations: + +1. When the first ee_block of ext4_extent_idx is greater than or equal to + start, make right shifts directly from the first ee_block. + 1) If it is greater than start, we need to continue searching in the + previous ext4_extent_idx. + 2) If it is equal to start, we can exit the loop (iterator=NULL). + +2. When the first ee_block of ext4_extent_idx is less than start, then + traverse from the last extent to find the first extent whose ee_block + is less than start. + 1) If extent is still the last extent after traversal, it means that + the last ee_block of ext4_extent_idx is less than start, that is, + start is located in the hole between idx and (idx+1), so we can + exit the loop directly (break) without right shifts. + 2) Otherwise, make right shifts at the corresponding position of the + found extent, and then exit the loop (iterator=NULL).
+ +Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate") +Cc: stable@vger.kernel.org # v4.2+ +Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Link: https://lore.kernel.org/r/20220922120434.1294789-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents.c | 18 +++++++++++++----- + 1 file changed, 13 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index f1956288307f..6c399a8b22b3 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -5184,6 +5184,7 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle, + * and it is decreased till we reach start. + */ + again: ++ ret = 0; + if (SHIFT == SHIFT_LEFT) + iterator = &start; + else +@@ -5227,14 +5228,21 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle, + ext4_ext_get_actual_len(extent); + } else { + extent = EXT_FIRST_EXTENT(path[depth].p_hdr); +- if (le32_to_cpu(extent->ee_block) > 0) ++ if (le32_to_cpu(extent->ee_block) > start) + *iterator = le32_to_cpu(extent->ee_block) - 1; +- else +- /* Beginning is reached, end of the loop */ ++ else if (le32_to_cpu(extent->ee_block) == start) + iterator = NULL; +- /* Update path extent in case we need to stop */ +- while (le32_to_cpu(extent->ee_block) < start) ++ else { ++ extent = EXT_LAST_EXTENT(path[depth].p_hdr); ++ while (le32_to_cpu(extent->ee_block) >= start) ++ extent--; ++ ++ if (extent == EXT_LAST_EXTENT(path[depth].p_hdr)) ++ break; ++ + extent++; ++ iterator = NULL; ++ } + path[depth].p_ext = extent; + } + ret = ext4_ext_shift_path_extents(path, shift, inode, +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_orphan_cleanup.patch @@ -0,0 +1,81 @@ +From a71248b1accb2b42e4980afef4fa4a27fa0e36f5 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Wed, 2 Nov 2022 16:06:33 +0800 
+Subject: [PATCH] ext4: fix use-after-free in ext4_orphan_cleanup +Git-commit: a71248b1accb2b42e4980afef4fa4a27fa0e36f5 +Patch-mainline: v6.2-rc1 +References: bsc#1207622 + +I caught a issue as follows: +================================================================== + BUG: KASAN: use-after-free in __list_add_valid+0x28/0x1a0 + Read of size 8 at addr ffff88814b13f378 by task mount/710 + + CPU: 1 PID: 710 Comm: mount Not tainted 6.1.0-rc3-next #370 + Call Trace: + <TASK> + dump_stack_lvl+0x73/0x9f + print_report+0x25d/0x759 + kasan_report+0xc0/0x120 + __asan_load8+0x99/0x140 + __list_add_valid+0x28/0x1a0 + ext4_orphan_cleanup+0x564/0x9d0 [ext4] + __ext4_fill_super+0x48e2/0x5300 [ext4] + ext4_fill_super+0x19f/0x3a0 [ext4] + get_tree_bdev+0x27b/0x450 + ext4_get_tree+0x19/0x30 [ext4] + vfs_get_tree+0x49/0x150 + path_mount+0xaae/0x1350 + do_mount+0xe2/0x110 + __x64_sys_mount+0xf0/0x190 + do_syscall_64+0x35/0x80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd + </TASK> + [...] +================================================================== + +Above issue may happen as follows: + +Acked-by: Jan Kara <jack@suse.cz> + +------------------------------------- +ext4_fill_super + ext4_orphan_cleanup + --- loop1: assume last_orphan is 12 --- + list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan) + ext4_truncate --> return 0 + ext4_inode_attach_jinode --> return -ENOMEM + iput(inode) --> free inode<12> + --- loop2: last_orphan is still 12 --- + list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan); + // use inode<12> and trigger UAF + +To solve this issue, we need to propagate the return value of +ext4_inode_attach_jinode() appropriately. 
+ +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221102080633.1630225-1-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +--- + fs/ext4/inode.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index a4bf643aa08b..181bc161b1ac 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4231,7 +4231,8 @@ int ext4_truncate(struct inode *inode) + + /* If we zero-out tail of the page, we have to create jinode for jbd2 */ + if (inode->i_size & (inode->i_sb->s_blocksize - 1)) { +- if (ext4_inode_attach_jinode(inode) < 0) ++ err = ext4_inode_attach_jinode(inode); ++ if (err) + goto out_trace; + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_rename_dir_prepare.patch @@ -0,0 +1,130 @@ +From 0be698ecbe4471fcad80e81ec6a05001421041b3 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 14 Apr 2022 10:52:23 +0800 +Subject: [PATCH] ext4: fix use-after-free in ext4_rename_dir_prepare +Git-commit: 0be698ecbe4471fcad80e81ec6a05001421041b3 +Patch-mainline: v5.19-rc1 +References: bsc#1200871 + +We got issue as follows: +EXT4-fs (loop0): mounted filesystem without journal. 
Opts: ,errors=continue +ext4_get_first_dir_block: bh->b_data=0xffff88810bee6000 len=34478 +ext4_get_first_dir_block: *parent_de=0xffff88810beee6ae bh->b_data=0xffff88810bee6000 +ext4_rename_dir_prepare: [1] parent_de=0xffff88810beee6ae +================================================================== +BUG: KASAN: use-after-free in ext4_rename_dir_prepare+0x152/0x220 +Read of size 4 at addr ffff88810beee6ae by task rep/1895 + +CPU: 13 PID: 1895 Comm: rep Not tainted 5.10.0+ #241 +Call Trace: + dump_stack+0xbe/0xf9 + print_address_description.constprop.0+0x1e/0x220 + kasan_report.cold+0x37/0x7f + ext4_rename_dir_prepare+0x152/0x220 + ext4_rename+0xf44/0x1ad0 + ext4_rename2+0x11c/0x170 + vfs_rename+0xa84/0x1440 + do_renameat2+0x683/0x8f0 + __x64_sys_renameat+0x53/0x60 + do_syscall_64+0x33/0x40 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 +RIP: 0033:0x7f45a6fc41c9 +RSP: 002b:00007ffc5a470218 EFLAGS: 00000246 ORIG_RAX: 0000000000000108 +RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f45a6fc41c9 +RDX: 0000000000000005 RSI: 0000000020000180 RDI: 0000000000000005 +RBP: 00007ffc5a470240 R08: 00007ffc5a470160 R09: 0000000020000080 +R10: 00000000200001c0 R11: 0000000000000246 R12: 0000000000400bb0 +R13: 00007ffc5a470320 R14: 0000000000000000 R15: 0000000000000000 + +The buggy address belongs to the page: +page:00000000440015ce refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x10beee +flags: 0x200000000000000() +raw: 0200000000000000 ffffea00043ff4c8 ffffea0004325608 0000000000000000 +raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 +page dumped because: kasan: bad access detected + +Memory state around the buggy address: + ffff88810beee580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + ffff88810beee600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff +>ffff88810beee680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + ^ + ffff88810beee700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + ffff88810beee780: ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff +================================================================== +Disabling lock debugging due to kernel taint +ext4_rename_dir_prepare: [2] parent_de->inode=3537895424 +ext4_rename_dir_prepare: [3] dir=0xffff888124170140 +ext4_rename_dir_prepare: [4] ino=2 +ext4_rename_dir_prepare: ent->dir->i_ino=2 parent=-757071872 + +Reason is first directory entry which 'rec_len' is 34478, then will get illegal +parent entry. Now, we do not check directory entry after read directory block +in 'ext4_get_first_dir_block'. +To solve this issue, check directory entry in 'ext4_get_first_dir_block'. + +[ Trigger an ext4_error() instead of just warning if the directory is + missing a '.' or '..' entry. Also make sure we return an error code + if the file system is corrupted. -TYT ] + +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220414025223.4113128-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 30 +++++++++++++++++++++++++++--- + 1 file changed, 27 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index dfe5514035c1..b202626391ff 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3454,6 +3454,9 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle, + struct buffer_head *bh; + + if (!ext4_has_inline_data(inode)) { ++ struct ext4_dir_entry_2 *de; ++ unsigned int offset; ++ + /* The first directory block must not be a hole, so + * treat it as DIRENT_HTREE + */ +@@ -3462,9 +3465,30 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle, + *retval = PTR_ERR(bh); + return NULL; + } +- *parent_de = ext4_next_entry( +- (struct ext4_dir_entry_2 *)bh->b_data, +- inode->i_sb->s_blocksize); ++ ++ de = (struct ext4_dir_entry_2 *) bh->b_data; ++ if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data, ++ bh->b_size, 0) ||
++ le32_to_cpu(de->inode) != inode->i_ino || ++ strcmp(".", de->name)) { ++ EXT4_ERROR_INODE(inode, "directory missing '.'"); ++ brelse(bh); ++ *retval = -EFSCORRUPTED; ++ return NULL; ++ } ++ offset = ext4_rec_len_from_disk(de->rec_len, ++ inode->i_sb->s_blocksize); ++ de = ext4_next_entry(de, inode->i_sb->s_blocksize); ++ if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data, ++ bh->b_size, offset) || ++ le32_to_cpu(de->inode) == 0 || strcmp("..", de->name)) { ++ EXT4_ERROR_INODE(inode, "directory missing '..'"); ++ brelse(bh); ++ *retval = -EFSCORRUPTED; ++ return NULL; ++ } ++ *parent_de = de; ++ + return bh; + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-in-ext4_search_dir.patch @@ -0,0 +1,131 @@ +From c186f0887fe7061a35cebef024550ec33ef8fbd8 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 24 Mar 2022 14:48:16 +0800 +Subject: [PATCH] ext4: fix use-after-free in ext4_search_dir +Git-commit: c186f0887fe7061a35cebef024550ec33ef8fbd8 +Patch-mainline: v5.18-rc4 +References: bsc#1202710 + +We got issue as follows: +EXT4-fs (loop0): mounted filesystem without journal. 
Opts: ,errors=continue +================================================================== +BUG: KASAN: use-after-free in ext4_search_dir fs/ext4/namei.c:1394 [inline] +BUG: KASAN: use-after-free in search_dirblock fs/ext4/namei.c:1199 [inline] +BUG: KASAN: use-after-free in __ext4_find_entry+0xdca/0x1210 fs/ext4/namei.c:1553 +Read of size 1 at addr ffff8881317c3005 by task syz-executor117/2331 + +CPU: 1 PID: 2331 Comm: syz-executor117 Not tainted 5.10.0+ #1 +Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 +Call Trace: + __dump_stack lib/dump_stack.c:83 [inline] + dump_stack+0x144/0x187 lib/dump_stack.c:124 + print_address_description+0x7d/0x630 mm/kasan/report.c:387 + __kasan_report+0x132/0x190 mm/kasan/report.c:547 + kasan_report+0x47/0x60 mm/kasan/report.c:564 + ext4_search_dir fs/ext4/namei.c:1394 [inline] + search_dirblock fs/ext4/namei.c:1199 [inline] + __ext4_find_entry+0xdca/0x1210 fs/ext4/namei.c:1553 + ext4_lookup_entry fs/ext4/namei.c:1622 [inline] + ext4_lookup+0xb8/0x3a0 fs/ext4/namei.c:1690 + __lookup_hash+0xc5/0x190 fs/namei.c:1451 + do_rmdir+0x19e/0x310 fs/namei.c:3760 + do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 +RIP: 0033:0x445e59 +Code: 4d c7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b c7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 +RSP: 002b:00007fff2277fac8 EFLAGS: 00000246 ORIG_RAX: 0000000000000054 +RAX: ffffffffffffffda RBX: 0000000000400280 RCX: 0000000000445e59 +RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000200000c0 +RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000002 +R10: 00007fff2277f990 R11: 0000000000000246 R12: 0000000000000000 +R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000 + +The buggy address belongs to the page: +page:0000000048cd3304 refcount:0 mapcount:0
mapping:0000000000000000 index:0x1 pfn:0x1317c3 +flags: 0x200000000000000() +raw: 0200000000000000 ffffea0004526588 ffffea0004528088 0000000000000000 +raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 +page dumped because: kasan: bad access detected + +Memory state around the buggy address: + ffff8881317c2f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 + ffff8881317c2f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 +>ffff8881317c3000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + ^ + ffff8881317c3080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff + ffff8881317c3100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff +================================================================== + +ext4_search_dir: ... + de = (struct ext4_dir_entry_2 *)search_buf; + dlimit = search_buf + buf_size; + while ((char *) de < dlimit) { + ... + if ((char *) de + de->name_len <= dlimit && + ext4_match(dir, fname, de)) { + ... + } + ... + de_len = ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize); + if (de_len <= 0) + return -1; + offset += de_len; + de = (struct ext4_dir_entry_2 *) ((char *) de + de_len); + } + +Assume: +de=0xffff8881317c2fff +dlimit=0x0xffff8881317c3000 + +If read 'de->name_len' which address is 0xffff8881317c3005, obviously is +out of range, then will trigger use-after-free. +To solve this issue, 'dlimit' must reserve 8 bytes, as we will read +'de->name_len' to judge if '(char *) de + de->name_len' out of range.
+ +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220324064816.1209985-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 4 ++++ + fs/ext4/namei.c | 4 ++-- + 2 files changed, 6 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 1d79012c5a5b..48dc2c3247ad 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2273,6 +2273,10 @@ static inline int ext4_forced_shutdown(struct ext4_sb_info *sbi) + * Structure of a directory entry + */ + #define EXT4_NAME_LEN 255 ++/* ++ * Base length of the ext4 directory entry excluding the name length ++ */ ++#define EXT4_BASE_DIR_LEN (sizeof(struct ext4_dir_entry_2) - EXT4_NAME_LEN) + + struct ext4_dir_entry { + __le32 inode; /* Inode number */ +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index e37da8d5cd0c..767b4bfe39c3 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -1466,10 +1466,10 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, + + de = (struct ext4_dir_entry_2 *)search_buf; + dlimit = search_buf + buf_size; +- while ((char *) de < dlimit) { ++ while ((char *) de < dlimit - EXT4_BASE_DIR_LEN) { + /* this code is executed quadratically often */ + /* do minimal checking `by hand' */ +- if ((char *) de + de->name_len <= dlimit && ++ if (de->name + de->name_len <= dlimit && + ext4_match(dir, fname, de)) { + /* found a match - just to be sure, do + * a full check */ +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-use-after-free-read-in-ext4_find_extent-for.patch @@ -0,0 +1,94 @@ +From 835659598c67907b98cd2aa57bb951dfaf675c69 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Thu, 6 Apr 2023 11:16:27 +0000 +Subject: [PATCH] ext4: fix use-after-free read in ext4_find_extent for + bigalloc + inline +Git-commit: 
835659598c67907b98cd2aa57bb951dfaf675c69 +Patch-mainline: v6.4-rc1 +References: bsc#1213098 + +Syzbot found the following issue: +loop0: detected capacity change from 0 to 2048 +EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. +================================================================== +BUG: KASAN: use-after-free in ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline] +BUG: KASAN: use-after-free in ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931 +Read of size 4 at addr ffff888073644750 by task syz-executor420/5067 + +CPU: 0 PID: 5067 Comm: syz-executor420 Not tainted 6.2.0-rc1-syzkaller #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 +Call Trace: + <TASK> + __dump_stack lib/dump_stack.c:88 [inline] + dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 + print_address_description+0x74/0x340 mm/kasan/report.c:306 + print_report+0x107/0x1f0 mm/kasan/report.c:417 + kasan_report+0xcd/0x100 mm/kasan/report.c:517 + ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline] + ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931 + ext4_clu_mapped+0x117/0x970 fs/ext4/extents.c:5809 + ext4_insert_delayed_block fs/ext4/inode.c:1696 [inline] + ext4_da_map_blocks fs/ext4/inode.c:1806 [inline] + ext4_da_get_block_prep+0x9e8/0x13c0 fs/ext4/inode.c:1870 + ext4_block_write_begin+0x6a8/0x2290 fs/ext4/inode.c:1098 + ext4_da_write_begin+0x539/0x760 fs/ext4/inode.c:3082 + generic_perform_write+0x2e4/0x5e0 mm/filemap.c:3772 + ext4_buffered_write_iter+0x122/0x3a0 fs/ext4/file.c:285 + ext4_file_write_iter+0x1d0/0x18f0 + call_write_iter include/linux/fs.h:2186 [inline] + new_sync_write fs/read_write.c:491 [inline] + vfs_write+0x7dc/0xc50 fs/read_write.c:584 + ksys_write+0x177/0x2a0 fs/read_write.c:637 + do_syscall_x64 arch/x86/entry/common.c:50 [inline] + do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 + entry_SYSCALL_64_after_hwframe+0x63/0xcd +RIP: 0033:0x7f4b7a9737b9 +RSP:
002b:00007ffc5cac3668 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 +RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4b7a9737b9 +RDX: 00000000175d9003 RSI: 0000000020000200 RDI: 0000000000000004 +RBP: 00007f4b7a933050 R08: 0000000000000000 R09: 0000000000000000 +R10: 000000000000079f R11: 0000000000000246 R12: 00007f4b7a9330e0 +R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 + </TASK> + +The above issue happens when the bigalloc and inline data features are +enabled. Commit 131294c35ed6 fixed a delayed allocation bug in +ext4_clu_mapped for bigalloc + inline, but it only resolved the case where +the inode still has the EXT4_STATE_MAY_INLINE_DATA flag. If the inline data +has been converted to an extent (ext4_da_convert_inline_data_to_extent) +before writepages, that flag is no longer set, yet i_data still stores +inline data. Looking up an extent then triggers the UAF. +To resolve this, also check "ext4_has_inline_data(inode)" +in ext4_clu_mapped(). + +Fixes: 131294c35ed6 ("ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline") +Reported-by: syzbot+bf4bb7731ef73b83a3b4@syzkaller.appspotmail.com +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> +Tested-by: Tudor Ambarus <tudor.ambarus@linaro.org> +Link: https://lore.kernel.org/r/20230406111627.1916759-1-tudor.ambarus@linaro.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/extents.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index e79c767cc5e0..35703dce23a3 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -5795,7 +5795,8 @@ int ext4_clu_mapped(struct inode *inode, ext4_lblk_t lclu) + * mapped - no physical clusters have been allocated, and the + * file has no extents + */ +- if (ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) ++ if
(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA) || ++ ext4_has_inline_data(inode)) + return 0; + + /* search for the extent closest to the first block in the cluster */ +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-warning-in-ext4_da_release_space.patch @@ -0,0 +1,108 @@ +From 1b8f787ef547230a3249bcf897221ef0cc78481b Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Tue, 18 Oct 2022 10:27:01 +0800 +Subject: [PATCH] ext4: fix warning in 'ext4_da_release_space' +Git-commit: 1b8f787ef547230a3249bcf897221ef0cc78481b +Patch-mainline: v6.1-rc4 +References: bsc#1206887 + +Syzkaller report issue as follows: +EXT4-fs (loop0): Free/Dirty block details +EXT4-fs (loop0): free_blocks=0 +EXT4-fs (loop0): dirty_blocks=0 +EXT4-fs (loop0): Block reservation details +EXT4-fs (loop0): i_reserved_data_blocks=0 +EXT4-fs warning (device loop0): ext4_da_release_space:1527: ext4_da_release_space: ino 18, to_free 1 with only 0 reserved data blocks + +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +WARNING: CPU: 0 PID: 92 at fs/ext4/inode.c:1528 ext4_da_release_space+0x25e/0x370 fs/ext4/inode.c:1524 +Modules linked in: +CPU: 0 PID: 92 Comm: kworker/u4:4 Not tainted 6.0.0-syzkaller-09423-g493ffd6605b2 #0 +Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 +Workqueue: writeback wb_workfn (flush-7:0) +RIP: 0010:ext4_da_release_space+0x25e/0x370 fs/ext4/inode.c:1528 +RSP: 0018:ffffc900015f6c90 EFLAGS: 00010296 +RAX: 42215896cd52ea00 RBX: 0000000000000000 RCX: 42215896cd52ea00 +RDX: 0000000000000000 RSI: 0000000080000001 RDI: 0000000000000000 +RBP: 1ffff1100e907d96 R08: ffffffff816aa79d R09: fffff520002bece5 +R10: fffff520002bece5 R11: 1ffff920002bece4 R12: ffff888021fd2000 +R13: ffff88807483ecb0 R14: 0000000000000001 R15: ffff88807483e740 +FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
+CR2: 00005555569ba628 CR3: 000000000c88e000 CR4: 00000000003506f0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + <TASK> + ext4_es_remove_extent+0x1ab/0x260 fs/ext4/extents_status.c:1461 + mpage_release_unused_pages+0x24d/0xef0 fs/ext4/inode.c:1589 + ext4_writepages+0x12eb/0x3be0 fs/ext4/inode.c:2852 + do_writepages+0x3c3/0x680 mm/page-writeback.c:2469 + __writeback_single_inode+0xd1/0x670 fs/fs-writeback.c:1587 + writeback_sb_inodes+0xb3b/0x18f0 fs/fs-writeback.c:1870 + wb_writeback+0x41f/0x7b0 fs/fs-writeback.c:2044 + wb_do_writeback fs/fs-writeback.c:2187 [inline] + wb_workfn+0x3cb/0xef0 fs/fs-writeback.c:2227 + process_one_work+0x877/0xdb0 kernel/workqueue.c:2289 + worker_thread+0xb14/0x1330 kernel/workqueue.c:2436 + kthread+0x266/0x300 kernel/kthread.c:376 + ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 + </TASK> + +Above issue may happens as follows: +ext4_da_write_begin + ext4_create_inline_data + ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS); + ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); +__ext4_ioctl + ext4_ext_migrate -> will lead to eh->eh_entries not zero, and set extent flag +ext4_da_write_begin + ext4_da_convert_inline_data_to_extent + ext4_da_write_inline_data_begin + ext4_da_map_blocks + ext4_insert_delayed_block + if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) + if (!ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk)) + ext4_clu_mapped(inode, EXT4_B2C(sbi, lblk)); -> will return 1 + allocated = true; + ext4_es_insert_delayed_block(inode, lblk, allocated); +ext4_writepages + mpage_map_and_submit_extent(handle, &mpd, &give_up_on_write); -> return -ENOSPC + mpage_release_unused_pages(&mpd, give_up_on_write); -> give_up_on_write == 1 + ext4_es_remove_extent + ext4_da_release_space(inode, reserved); + if (unlikely(to_free > ei->i_reserved_data_blocks)) + -> to_free == 1 but ei->i_reserved_data_blocks == 0 + -> then trigger 
warning as above + +To solve above issue, forbid inode do migrate which has inline data. + +Cc: stable@kernel.org +Reported-by: syzbot+c740bb18df70ad00952e@syzkaller.appspotmail.com +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221018022701.683489-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +--- + fs/ext4/migrate.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c +index 0a220ec9862d..a19a9661646e 100644 +--- a/fs/ext4/migrate.c ++++ b/fs/ext4/migrate.c +@@ -424,7 +424,8 @@ int ext4_ext_migrate(struct inode *inode) + * already is extent-based, error out. + */ + if (!ext4_has_feature_extents(inode->i_sb) || +- (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) ++ ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) || ++ ext4_has_inline_data(inode)) + return -EINVAL; + + if (S_ISLNK(inode->i_mode) && inode->i_blocks == 0) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fix-warning-in-ext4_handle_inode_extension.patch @@ -0,0 +1,113 @@ +From f4534c9fc94d22383f187b9409abb3f9df2e3db3 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Sat, 26 Mar 2022 14:53:51 +0800 +Subject: [PATCH] ext4: fix warning in ext4_handle_inode_extension +Git-commit: f4534c9fc94d22383f187b9409abb3f9df2e3db3 +Patch-mainline: v5.19-rc1 +References: bsc#1202711 + +We got issue as follows: +EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory +EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error +EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory +EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc + +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +WARNING: CPU: 1 PID: 4361 at 
fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220 +Modules linked in: +CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1 +RIP: 0010:ext4_file_write_iter+0x11c9/0x1220 +RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282 +RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000 +RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd +RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f +R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a +R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b +FS: 00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + do_iter_readv_writev+0x2e5/0x360 + do_iter_write+0x112/0x4c0 + do_pwritev+0x1e5/0x390 + __x64_sys_pwritev2+0x7e/0xa0 + do_syscall_64+0x37/0x50 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 + +Above issue may happen as follows: +Assume +inode.i_size=4096 +EXT4_I(inode)->i_disksize=4096 + +step 1: set inode->i_isize = 8192 +ext4_setattr + if (attr->ia_size != inode->i_size) + EXT4_I(inode)->i_disksize = attr->ia_size; + rc = ext4_mark_inode_dirty + ext4_reserve_inode_write + ext4_get_inode_loc + __ext4_get_inode_loc + sb_getblk --> return -ENOMEM + ... + if (!error) ->will not update i_size + i_size_write(inode, attr->ia_size); +Now: +inode.i_size=4096 +EXT4_I(inode)->i_disksize=8192 + +step 2: Direct write 4096 bytes +ext4_file_write_iter + ext4_dio_write_iter + iomap_dio_rw ->return error + if (extend) + ext4_handle_inode_extension + WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize); +->Then trigger warning. + +To solve above issue, if mark inode dirty failed in ext4_setattr just +set 'EXT4_I(inode)->i_disksize' with old value. 
+ +Signed-off-by: Ye Bin <yebin10@huawei.com> +Link: https://lore.kernel.org/r/20220326065351.761952-1-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +--- + fs/ext4/inode.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 646ece9b3455..15165c87c915 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5398,6 +5398,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, + if (attr->ia_valid & ATTR_SIZE) { + handle_t *handle; + loff_t oldsize = inode->i_size; ++ loff_t old_disksize; + int shrink = (attr->ia_size < inode->i_size); + + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) { +@@ -5469,6 +5470,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, + inode->i_sb->s_blocksize_bits); + + down_write(&EXT4_I(inode)->i_data_sem); ++ old_disksize = EXT4_I(inode)->i_disksize; + EXT4_I(inode)->i_disksize = attr->ia_size; + rc = ext4_mark_inode_dirty(handle, inode); + if (!error) +@@ -5480,6 +5482,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, + */ + if (!error) + i_size_write(inode, attr->ia_size); ++ else ++ EXT4_I(inode)->i_disksize = old_disksize; + up_write(&EXT4_I(inode)->i_data_sem); + ext4_journal_stop(handle); + if (error) +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-fixup-pages-without-buffers.patch @@ -0,0 +1,74 @@ +From: Jan Kara <jack@suse.cz> +Subject: ext4: Fixup pages without buffers +References: bsc#1205495 +Patch-mainline: Never, upstream has page pinning tracking infrastructure which should be used for this + +When application use RDMA or various offload engines they pin pages which they +dirty once they are done with these pages. However the page may be cleaned and +buffers reclaimed before it is dirtied which confuses ext4 writeback code. +Detect such case and restore page buffers as good as we can. 
+ +Signed-off-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 35 +++++++++++++++++++++++++++++++++++ + 1 file changed, 35 insertions(+) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -2122,6 +2122,37 @@ out_no_pagelock: + } + + /* ++ * Create buffers under a dirty page. This is a hack for the case when someone ++ * pinned the page, the page got cleaned and buffers released, and then the ++ * pinning process dirtied the page ++ */ ++static void ext4_restore_page_buffers(struct inode *inode, struct page *page) ++{ ++ int bsize = 1 << inode->i_blkbits; ++ struct buffer_head *bh, *head; ++ loff_t isize = i_size_read(inode); ++ unsigned int off, end; ++ ext4_lblk_t iblock = page->index << (PAGE_SHIFT - inode->i_blkbits); ++ ++ if (page->index == isize >> PAGE_SHIFT) ++ end = isize & (PAGE_SIZE - 1); ++ else ++ end = PAGE_SIZE; ++ ++ create_empty_buffers(page, bsize, 0); ++ bh = head = page_buffers(page); ++ off = 0; ++ do { ++ if (off > end) ++ break; ++ ext4_get_block(inode, iblock, bh, 0); ++ off += bsize; ++ iblock++; ++ bh = bh->b_this_page; ++ } while (bh != head); ++} ++ ++/* + * Note that we don't need to start a transaction unless we're journaling data + * because we should have holes filled from ext4_page_mkwrite(). 
We even don't + * need to file the inode to the transaction's list in ordered mode because if +@@ -2186,6 +2217,8 @@ static int ext4_writepage(struct page *p + else + len = PAGE_SIZE; + ++ if (!page_has_buffers(page)) ++ ext4_restore_page_buffers(inode, page); + page_bufs = page_buffers(page); + /* + * We cannot do block allocation or other extent handling in this +@@ -2731,6 +2764,8 @@ static int mpage_prepare_extent_to_map(s + wait_on_page_writeback(page); + BUG_ON(PageWriteback(page)); + ++ if (!page_has_buffers(page)) ++ ext4_restore_page_buffers(mpd->inode, page); + if (mpd->map.m_len == 0) + mpd->first_page = page->index; + mpd->next_page = page->index + 1; --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-force-overhead-calculation-if-the-s_overhead_cl.patch @@ -0,0 +1,50 @@ +From 85d825dbf4899a69407338bae462a59aa9a37326 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Thu, 14 Apr 2022 21:57:49 -0400 +Subject: [PATCH] ext4: force overhead calculation if the s_overhead_cluster + makes no sense +Git-commit: 85d825dbf4899a69407338bae462a59aa9a37326 +Patch-mainline: v5.18-rc4 +References: bsc#1200870 + +If the file system does not use bigalloc, calculating the overhead is +cheap, so force the recalculation of the overhead so we don't have to +trust the precalculated overhead in the superblock. + +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 15 ++++++++++++--- + 1 file changed, 12 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 23a9b2c086ed..d08820fdfdee 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5289,9 +5289,18 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) + * Get the # of file system overhead blocks from the + * superblock if present. 
+ */ +- if (es->s_overhead_clusters) +- sbi->s_overhead = le32_to_cpu(es->s_overhead_clusters); +- else { ++ sbi->s_overhead = le32_to_cpu(es->s_overhead_clusters); ++ /* ignore the precalculated value if it is ridiculous */ ++ if (sbi->s_overhead > ext4_blocks_count(es)) ++ sbi->s_overhead = 0; ++ /* ++ * If the bigalloc feature is not enabled recalculating the ++ * overhead doesn't take long, so we might as well just redo ++ * it to make sure we are using the correct value. ++ */ ++ if (!ext4_has_feature_bigalloc(sb)) ++ sbi->s_overhead = 0; ++ if (sbi->s_overhead == 0) { + err = ext4_calculate_overhead(sb); + if (err) + goto failed_mount_wq; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-goto-right-label-failed_mount3a.patch @@ -0,0 +1,68 @@ +From 43bd6f1b49b61f43de4d4e33661b8dbe8c911f14 Mon Sep 17 00:00:00 2001 +From: Jason Yan <yanaijie@huawei.com> +Date: Fri, 16 Sep 2022 22:15:12 +0800 +Subject: [PATCH] ext4: goto right label 'failed_mount3a' +Git-commit: 43bd6f1b49b61f43de4d4e33661b8dbe8c911f14 +Patch-mainline: v6.1-rc1 +References: bsc#1207610 + +Before these two branches neither loaded the journal nor created the +xattr cache. So the right label to goto is 'failed_mount3a'. Although +this did not cause any issues because the error handler validated if the +pointer is null. However this still made me confused when reading +the code. So it's still worth to modify to goto the right label. 
+ +Signed-off-by: Jason Yan <yanaijie@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> +Link: https://lore.kernel.org/r/20220916141527.1012715-2-yanaijie@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 4eb18c4c52d7..016b3410e915 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5053,30 +5053,30 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) + ext4_has_feature_journal_needs_recovery(sb)) { + ext4_msg(sb, KERN_ERR, "required journal recovery " + "suppressed and not mounted read-only"); +- goto failed_mount_wq; ++ goto failed_mount3a; + } else { + /* Nojournal mode, all journal mount options are illegal */ + if (test_opt2(sb, EXPLICIT_JOURNAL_CHECKSUM)) { + ext4_msg(sb, KERN_ERR, "can't mount with " + "journal_checksum, fs mounted w/o journal"); +- goto failed_mount_wq; ++ goto failed_mount3a; + } + if (test_opt(sb, JOURNAL_ASYNC_COMMIT)) { + ext4_msg(sb, KERN_ERR, "can't mount with " + "journal_async_commit, fs mounted w/o journal"); +- goto failed_mount_wq; ++ goto failed_mount3a; + } + if (sbi->s_commit_interval != JBD2_DEFAULT_MAX_COMMIT_AGE*HZ) { + ext4_msg(sb, KERN_ERR, "can't mount with " + "commit=%lu, fs mounted w/o journal", + sbi->s_commit_interval / HZ); +- goto failed_mount_wq; ++ goto failed_mount3a; + } + if (EXT4_MOUNT_DATA_FLAGS & + (sbi->s_mount_opt ^ sbi->s_def_mount_opt)) { + ext4_msg(sb, KERN_ERR, "can't mount with " + "data=, fs mounted w/o journal"); +- goto failed_mount_wq; ++ goto failed_mount3a; + } + sbi->s_def_mount_opt &= ~EXT4_MOUNT_JOURNAL_CHECKSUM; + clear_opt(sb, JOURNAL_CHECKSUM); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-improve-error-handling-from-ext4_dirhash.patch @@ -0,0 +1,163 @@ +From 
4b3cb1d108bfc2aebb0d7c8a52261a53cf7f5786 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Sat, 6 May 2023 11:59:13 -0400 +Subject: [PATCH] ext4: improve error handling from ext4_dirhash() +Git-commit: 4b3cb1d108bfc2aebb0d7c8a52261a53cf7f5786 +Patch-mainline: v6.4-rc2 +References: bsc#1213104 + +The ext4_dirhash() will *almost* never fail, especially when the hash +tree feature was first introduced. However, with the addition of +support of encrypted, casefolded file names, that function can most +certainly fail today. + +So make sure the callers of ext4_dirhash() properly check for +failures, and reflect the errors back up to their callers. + +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20230506142419.984260-1-tytso@mit.edu +Reported-by: syzbot+394aa8a792cb99dbc837@syzkaller.appspotmail.com +Reported-by: syzbot+344aaa8697ebd232bfc8@syzkaller.appspotmail.com +Link: https://syzkaller.appspot.com/bug?id=db56459ea4ac4a676ae4b4678f633e55da005a9b +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/hash.c | 6 +++++- + fs/ext4/namei.c | 53 ++++++++++++++++++++++++++++++++++--------------- + 2 files changed, 42 insertions(+), 17 deletions(-) + +diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c +index 147b5241dd94..46c3423ddfa1 100644 +--- a/fs/ext4/hash.c ++++ b/fs/ext4/hash.c +@@ -277,7 +277,11 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len, + } + default: + hinfo->hash = 0; +- return -1; ++ hinfo->minor_hash = 0; ++ ext4_warning(dir->i_sb, ++ "invalid/unsupported hash tree version %u", ++ hinfo->hash_version); ++ return -EINVAL; + } + hash = hash & ~1; + if (hash == (EXT4_HTREE_EOF_32BIT << 1)) +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index a5010b5b8a8c..45b579805c95 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -674,7 +674,7 @@ static struct stats dx_show_leaf(struct inode *dir, + len = de->name_len; + if (!IS_ENCRYPTED(dir)) { + /* Directory is 
not encrypted */ +- ext4fs_dirhash(dir, de->name, ++ (void) ext4fs_dirhash(dir, de->name, + de->name_len, &h); + printk("%*.s:(U)%x.%u ", len, + name, h.hash, +@@ -709,8 +709,9 @@ static struct stats dx_show_leaf(struct inode *dir, + if (IS_CASEFOLDED(dir)) + h.hash = EXT4_DIRENT_HASH(de); + else +- ext4fs_dirhash(dir, de->name, +- de->name_len, &h); ++ (void) ext4fs_dirhash(dir, ++ de->name, ++ de->name_len, &h); + printk("%*.s:(E)%x.%u ", len, name, + h.hash, (unsigned) ((char *) de + - base)); +@@ -720,7 +721,8 @@ static struct stats dx_show_leaf(struct inode *dir, + #else + int len = de->name_len; + char *name = de->name; +- ext4fs_dirhash(dir, de->name, de->name_len, &h); ++ (void) ext4fs_dirhash(dir, de->name, ++ de->name_len, &h); + printk("%*.s:%x.%u ", len, name, h.hash, + (unsigned) ((char *) de - base)); + #endif +@@ -849,8 +851,14 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, + hinfo->seed = EXT4_SB(dir->i_sb)->s_hash_seed; + /* hash is already computed for encrypted casefolded directory */ + if (fname && fname_name(fname) && +- !(IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir))) +- ext4fs_dirhash(dir, fname_name(fname), fname_len(fname), hinfo); ++ !(IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir))) { ++ int ret = ext4fs_dirhash(dir, fname_name(fname), ++ fname_len(fname), hinfo); ++ if (ret < 0) { ++ ret_err = ERR_PTR(ret); ++ goto fail; ++ } ++ } + hash = hinfo->hash; + + if (root->info.unused_flags & 1) { +@@ -1111,7 +1119,12 @@ static int htree_dirblock_to_tree(struct file *dir_file, + hinfo->minor_hash = 0; + } + } else { +- ext4fs_dirhash(dir, de->name, de->name_len, hinfo); ++ err = ext4fs_dirhash(dir, de->name, ++ de->name_len, hinfo); ++ if (err < 0) { ++ count = err; ++ goto errout; ++ } + } + if ((hinfo->hash < start_hash) || + ((hinfo->hash == start_hash) && +@@ -1313,8 +1326,12 @@ static int dx_make_map(struct inode *dir, struct buffer_head *bh, + if (de->name_len && de->inode) { + if (ext4_hash_in_dirent(dir)) + h.hash = 
EXT4_DIRENT_HASH(de); +- else +- ext4fs_dirhash(dir, de->name, de->name_len, &h); ++ else { ++ int err = ext4fs_dirhash(dir, de->name, ++ de->name_len, &h); ++ if (err < 0) ++ return err; ++ } + map_tail--; + map_tail->hash = h.hash; + map_tail->offs = ((char *) de - base)>>2; +@@ -1452,10 +1469,9 @@ int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname, + hinfo->hash_version = DX_HASH_SIPHASH; + hinfo->seed = NULL; + if (cf_name->name) +- ext4fs_dirhash(dir, cf_name->name, cf_name->len, hinfo); ++ return ext4fs_dirhash(dir, cf_name->name, cf_name->len, hinfo); + else +- ext4fs_dirhash(dir, iname->name, iname->len, hinfo); +- return 0; ++ return ext4fs_dirhash(dir, iname->name, iname->len, hinfo); + } + #endif + +@@ -2298,10 +2314,15 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname, + fname->hinfo.seed = EXT4_SB(dir->i_sb)->s_hash_seed; + + /* casefolded encrypted hashes are computed on fname setup */ +- if (!ext4_hash_in_dirent(dir)) +- ext4fs_dirhash(dir, fname_name(fname), +- fname_len(fname), &fname->hinfo); +- ++ if (!ext4_hash_in_dirent(dir)) { ++ int err = ext4fs_dirhash(dir, fname_name(fname), ++ fname_len(fname), &fname->hinfo); ++ if (err < 0) { ++ brelse(bh2); ++ brelse(bh); ++ return err; ++ } ++ } + memset(frames, 0, sizeof(frames)); + frame = frames; + frame->entries = entries; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-improve-error-recovery-code-paths-in-__ext4_rem.patch @@ -0,0 +1,67 @@ +From 4c0b4818b1f636bc96359f7817a2d8bab6370162 Mon Sep 17 00:00:00 2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Fri, 5 May 2023 22:20:29 -0400 +Subject: [PATCH] ext4: improve error recovery code paths in __ext4_remount() +Git-commit: 4c0b4818b1f636bc96359f7817a2d8bab6370162 +Patch-mainline: v6.4-rc2 +References: bsc#1213017 + +If there are failures while changing the mount options in +__ext4_remount(), we need to restore the old mount options. 
+ +This commit fixes two problems. The first is there is a chance that we +will free the old quota file names before a potential failure leading +to a use-after-free. The second problem addressed in this commit is +if there is a failed read/write to read-only transition, if the quota +has already been suspended, we need to re-enable quota handling. + +Cc: stable@kernel.org +Link: https://lore.kernel.org/r/20230506142419.984260-2-tytso@mit.edu +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 13 ++++++++++--- + 1 file changed, 10 insertions(+), 3 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index c7bc4a2709cc..bc0b4a98b337 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -6617,9 +6617,6 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) + } + + #ifdef CONFIG_QUOTA +- /* Release old quota file names */ +- for (i = 0; i < EXT4_MAXQUOTAS; i++) +- kfree(old_opts.s_qf_names[i]); + if (enable_quota) { + if (sb_any_quota_suspended(sb)) + dquot_resume(sb, -1); +@@ -6629,6 +6626,9 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) + goto restore_opts; + } + } ++ /* Release old quota file names */ ++ for (i = 0; i < EXT4_MAXQUOTAS; i++) ++ kfree(old_opts.s_qf_names[i]); + #endif + if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks) + ext4_release_system_zone(sb); +@@ -6642,6 +6642,13 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) + return 0; + + restore_opts: ++ /* ++ * If there was a failing r/w to ro transition, we may need to ++ * re-enable quota ++ */ ++ if ((sb->s_flags & SB_RDONLY) && !(old_sb_flags & SB_RDONLY) && ++ sb_any_quota_suspended(sb)) ++ dquot_resume(sb, -1); + sb->s_flags = old_sb_flags; + sbi->s_mount_opt = old_opts.s_mount_opt; + sbi->s_mount_opt2 = old_opts.s_mount_opt2; +-- +2.35.3 + --- /dev/null +++
b/ldiskfs/kernel_patches/patches/patches.suse/ext4-init-quota-for-old.inode-in-ext4_rename.patch @@ -0,0 +1,83 @@ +From fae381a3d79bb94aa2eb752170d47458d778b797 Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Mon, 7 Nov 2022 09:53:35 +0800 +Subject: [PATCH] ext4: init quota for 'old.inode' in 'ext4_rename' +Git-commit: fae381a3d79bb94aa2eb752170d47458d778b797 +Patch-mainline: v6.2-rc1 +References: bsc#1207629 + +Syzbot found the following issue: + +ext4_parse_param: s_want_extra_isize=128 +ext4_inode_info_init: s_want_extra_isize=32 +ext4_rename: old.inode=ffff88823869a2c8 old.dir=ffff888238699828 new.inode=ffff88823869d7e8 new.dir=ffff888238699828 +__ext4_mark_inode_dirty: inode=ffff888238699828 ea_isize=32 want_ea_size=128 +__ext4_mark_inode_dirty: inode=ffff88823869a2c8 ea_isize=32 want_ea_size=128 +ext4_xattr_block_set: inode=ffff88823869a2c8 +Acked-by: Jan Kara <jack@suse.cz> + +------------[ cut here ]------------ +WARNING: CPU: 13 PID: 2234 at fs/ext4/xattr.c:2070 ext4_xattr_block_set.cold+0x22/0x980 +Modules linked in: +RIP: 0010:ext4_xattr_block_set.cold+0x22/0x980 +RSP: 0018:ffff888227d3f3b0 EFLAGS: 00010202 +RAX: 0000000000000001 RBX: ffff88823007a000 RCX: 0000000000000000 +RDX: 0000000000000a03 RSI: 0000000000000040 RDI: ffff888230078178 +RBP: 0000000000000000 R08: 000000000000002c R09: ffffed1075c7df8e +R10: ffff8883ae3efc6b R11: ffffed1075c7df8d R12: 0000000000000000 +R13: ffff88823869a2c8 R14: ffff8881012e0460 R15: dffffc0000000000 +FS: 00007f350ac1f740(0000) GS:ffff8883ae200000(0000) knlGS:0000000000000000 +CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +CR2: 00007f350a6ed6a0 CR3: 0000000237456000 CR4: 00000000000006e0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 +Call Trace: + <TASK> + ? ext4_xattr_set_entry+0x3b7/0x2320 + ? ext4_xattr_block_set+0x0/0x2020 + ? ext4_xattr_set_entry+0x0/0x2320 + ? ext4_xattr_check_entries+0x77/0x310 + ?
ext4_xattr_ibody_set+0x23b/0x340 + ext4_xattr_move_to_block+0x594/0x720 + ext4_expand_extra_isize_ea+0x59a/0x10f0 + __ext4_expand_extra_isize+0x278/0x3f0 + __ext4_mark_inode_dirty.cold+0x347/0x410 + ext4_rename+0xed3/0x174f + vfs_rename+0x13a7/0x2510 + do_renameat2+0x55d/0x920 + __x64_sys_rename+0x7d/0xb0 + do_syscall_64+0x3b/0xa0 + entry_SYSCALL_64_after_hwframe+0x72/0xdc + +As 'ext4_rename' will modify 'old.inode' ctime and mark inode dirty, +which may trigger expand 'extra_isize' and allocate block. If inode +didn't init quota will lead to warning. To solve above issue, init +'old.inode' firstly in 'ext4_rename'. + +Reported-by: syzbot+98346927678ac3059c77@syzkaller.appspotmail.com +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221107015335.2524319-1-yebin@huaweicloud.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +--- + fs/ext4/namei.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index a789ea9b61a0..1c5518a4bdf9 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -3796,6 +3796,9 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, + return -EXDEV; + + retval = dquot_initialize(old.dir); ++ if (retval) ++ return retval; ++ retval = dquot_initialize(old.inode); + if (retval) + return retval; + retval = dquot_initialize(new.dir); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-initialize-err_blk-before-calling-__ext4_get_in.patch @@ -0,0 +1,43 @@ +From c27c29c6af4f3f4ce925a2111c256733c5a5b430 Mon Sep 17 00:00:00 2001 +From: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Date: Wed, 1 Dec 2021 08:34:21 -0800 +Subject: [PATCH] ext4: initialize err_blk before calling __ext4_get_inode_loc +Git-commit: c27c29c6af4f3f4ce925a2111c256733c5a5b430 +Patch-mainline: v5.17-rc1 +References: bsc#1202763 + +It is not guaranteed that __ext4_get_inode_loc will 
definitely set +err_blk pointer when it returns EIO. To avoid using uninitialized +variables, let's first set err_blk to 0. + +Reported-by: Dan Carpenter <dan.carpenter@oracle.com> +Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/20211201163421.2631661-1-harshads@google.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -4451,7 +4451,7 @@ has_buffer: + static int __ext4_get_inode_loc_noinmem(struct inode *inode, + struct ext4_iloc *iloc) + { +- ext4_fsblk_t err_blk; ++ ext4_fsblk_t err_blk = 0; + int ret; + + ret = __ext4_get_inode_loc(inode->i_sb, inode->i_ino, iloc, 0, +@@ -4466,7 +4466,7 @@ static int __ext4_get_inode_loc_noinmem( + + int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc) + { +- ext4_fsblk_t err_blk; ++ ext4_fsblk_t err_blk = 0; + int ret; + + /* We have all inode data except xattrs in memory here. */ --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-initialize-quota-before-expanding-inode-in-setp.patch @@ -0,0 +1,53 @@ +From 1485f726c6dec1a1f85438f2962feaa3d585526f Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 7 Dec 2022 12:59:27 +0100 +Subject: [PATCH] ext4: initialize quota before expanding inode in setproject + ioctl +Git-commit: 1485f726c6dec1a1f85438f2962feaa3d585526f +Patch-mainline: v6.2-rc1 +References: bsc#1207633 + +Make sure we initialize quotas before possibly expanding inode space +(and thus maybe needing to allocate external xattr block) in +ext4_ioctl_setproject(). This prevents not accounting the necessary +block allocation. 
+
+Signed-off-by: Jan Kara <jack@suse.cz>
+Cc: stable@kernel.org
+Link: https://lore.kernel.org/r/20221207115937.26601-1-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/ioctl.c | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
+index 202953b5db49..8067ccda34e4 100644
+--- a/fs/ext4/ioctl.c
++++ b/fs/ext4/ioctl.c
+@@ -732,6 +732,10 @@ static int ext4_ioctl_setproject(struct inode *inode, __u32 projid)
+ if (ext4_is_quota_file(inode))
+ return err;
+
++ err = dquot_initialize(inode);
++ if (err)
++ return err;
++
+ err = ext4_get_inode_loc(inode, &iloc);
+ if (err)
+ return err;
+@@ -747,10 +751,6 @@ static int ext4_ioctl_setproject(struct inode *inode, __u32 projid)
+ brelse(iloc.bh);
+ }
+
+- err = dquot_initialize(inode);
+- if (err)
+- return err;
+-
+ handle = ext4_journal_start(inode, EXT4_HT_QUOTA,
+ EXT4_QUOTA_INIT_BLOCKS(sb) +
+ EXT4_QUOTA_DEL_BLOCKS(sb) + 3);
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-introduce-EXT4_FC_TAG_BASE_LEN-helper.patch
@@ -0,0 +1,192 @@
+From fdc2a3c75dd8345c5b48718af90bad1a7811bedb Mon Sep 17 00:00:00 2001
+From: Ye Bin <yebin10@huawei.com>
+Date: Sat, 24 Sep 2022 15:52:31 +0800
+Subject: [PATCH] ext4: introduce EXT4_FC_TAG_BASE_LEN helper
+Git-commit: fdc2a3c75dd8345c5b48718af90bad1a7811bedb
+Patch-mainline: v6.1-rc1
+References: bsc#1207614
+
+Introduce EXT4_FC_TAG_BASE_LEN helper for calculate length of
+struct ext4_fc_tl.
+
+Signed-off-by: Ye Bin <yebin10@huawei.com>
+Link: https://lore.kernel.org/r/20220924075233.2315259-2-yebin10@huawei.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/fast_commit.c | 54 +++++++++++++++++++++++++-------------------------
+ fs/ext4/fast_commit.h | 3 ++
+ 2 files changed, 31 insertions(+), 26 deletions(-)
+
+--- a/fs/ext4/fast_commit.c
++++ b/fs/ext4/fast_commit.c
+@@ -631,10 +631,10 @@ static u8 *ext4_fc_reserve_space(struct
+ * After allocating len, we should have space at least for a 0 byte
+ * padding.
+ */
+- if (len + sizeof(struct ext4_fc_tl) > bsize)
++ if (len + EXT4_FC_TAG_BASE_LEN > bsize)
+ return NULL;
+
+- if (bsize - off - 1 > len + sizeof(struct ext4_fc_tl)) {
++ if (bsize - off - 1 > len + EXT4_FC_TAG_BASE_LEN) {
+ /*
+ * Only allocate from current buffer if we have enough space for
+ * this request AND we have space to add a zero byte padding.
+@@ -651,10 +651,10 @@ static u8 *ext4_fc_reserve_space(struct
+ /* Need to add PAD tag */
+ tl = (struct ext4_fc_tl *)(sbi->s_fc_bh->b_data + off);
+ tl->fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD);
+- pad_len = bsize - off - 1 - sizeof(struct ext4_fc_tl);
++ pad_len = bsize - off - 1 - EXT4_FC_TAG_BASE_LEN;
+ tl->fc_len = cpu_to_le16(pad_len);
+ if (crc)
+- *crc = ext4_chksum(sbi, *crc, tl, sizeof(*tl));
++ *crc = ext4_chksum(sbi, *crc, tl, EXT4_FC_TAG_BASE_LEN);
+ if (pad_len > 0)
+ ext4_fc_memzero(sb, tl + 1, pad_len, crc);
+ ext4_fc_submit_bh(sb, false);
+@@ -696,7 +696,7 @@ static int ext4_fc_write_tail(struct sup
+ * ext4_fc_reserve_space takes care of allocating an extra block if
+ * there's no enough space on this block for accommodating this tail.
+ */
+- dst = ext4_fc_reserve_space(sb, sizeof(tl) + sizeof(tail), &crc);
++ dst = ext4_fc_reserve_space(sb, EXT4_FC_TAG_BASE_LEN + sizeof(tail), &crc);
+ if (!dst)
+ return -ENOSPC;
+
+@@ -706,8 +706,8 @@ static int ext4_fc_write_tail(struct sup
+ tl.fc_len = cpu_to_le16(bsize - off - 1 + sizeof(struct ext4_fc_tail));
+ sbi->s_fc_bytes = round_up(sbi->s_fc_bytes, bsize);
+
+- ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), &crc);
+- dst += sizeof(tl);
++ ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, &crc);
++ dst += EXT4_FC_TAG_BASE_LEN;
+ tail.fc_tid = cpu_to_le32(sbi->s_journal->j_running_transaction->t_tid);
+ ext4_fc_memcpy(sb, dst, &tail.fc_tid, sizeof(tail.fc_tid), &crc);
+ dst += sizeof(tail.fc_tid);
+@@ -729,15 +729,15 @@ static bool ext4_fc_add_tlv(struct super
+ struct ext4_fc_tl tl;
+ u8 *dst;
+
+- dst = ext4_fc_reserve_space(sb, sizeof(tl) + len, crc);
++ dst = ext4_fc_reserve_space(sb, EXT4_FC_TAG_BASE_LEN + len, crc);
+ if (!dst)
+ return false;
+
+ tl.fc_tag = cpu_to_le16(tag);
+ tl.fc_len = cpu_to_le16(len);
+
+- ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), crc);
+- ext4_fc_memcpy(sb, dst + sizeof(tl), val, len, crc);
++ ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc);
++ ext4_fc_memcpy(sb, dst + EXT4_FC_TAG_BASE_LEN, val, len, crc);
+
+ return true;
+ }
+@@ -750,8 +750,8 @@ static bool ext4_fc_add_dentry_tlv(stru
+ {
+ struct ext4_fc_dentry_info fcd;
+ struct ext4_fc_tl tl;
+- u8 *dst = ext4_fc_reserve_space(sb, sizeof(tl) + sizeof(fcd) + dlen,
+- crc);
++ u8 *dst = ext4_fc_reserve_space(sb,
++ EXT4_FC_TAG_BASE_LEN + sizeof(fcd) + dlen, crc);
+
+ if (!dst)
+ return false;
+@@ -760,8 +760,8 @@ static bool ext4_fc_add_dentry_tlv(stru
+ fcd.fc_ino = cpu_to_le32(ino);
+ tl.fc_tag = cpu_to_le16(tag);
+ tl.fc_len = cpu_to_le16(sizeof(fcd) + dlen);
+- ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), crc);
+- dst += sizeof(tl);
++ ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc);
++ dst += EXT4_FC_TAG_BASE_LEN;
+ ext4_fc_memcpy(sb, dst, &fcd, sizeof(fcd), crc);
+ dst += sizeof(fcd);
+ ext4_fc_memcpy(sb, dst, dname, dlen, crc);
+@@ -797,13 +797,13 @@ static int ext4_fc_write_inode(struct in
+
+ ret = -ECANCELED;
+ dst = ext4_fc_reserve_space(inode->i_sb,
+- sizeof(tl) + inode_len + sizeof(fc_inode.fc_ino), crc);
++ EXT4_FC_TAG_BASE_LEN + inode_len + sizeof(fc_inode.fc_ino), crc);
+ if (!dst)
+ goto err;
+
+- if (!ext4_fc_memcpy(inode->i_sb, dst, &tl, sizeof(tl), crc))
++ if (!ext4_fc_memcpy(inode->i_sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc))
+ goto err;
+- dst += sizeof(tl);
++ dst += EXT4_FC_TAG_BASE_LEN;
+ if (!ext4_fc_memcpy(inode->i_sb, dst, &fc_inode, sizeof(fc_inode), crc))
+ goto err;
+ dst += sizeof(fc_inode);
+@@ -1961,9 +1961,10 @@ static int ext4_fc_replay_scan(journal_t
+ }
+
+ state->fc_replay_expected_off++;
+- for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) {
+- memcpy(&tl, cur, sizeof(tl));
+- val = cur + sizeof(tl);
++ for (cur = start; cur < end;
++ cur = cur + EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)) {
++ memcpy(&tl, cur, EXT4_FC_TAG_BASE_LEN);
++ val = cur + EXT4_FC_TAG_BASE_LEN;
+ jbd_debug(3, "Scan phase, tag:%s, blk %lld\n",
+ tag2str(le16_to_cpu(tl.fc_tag)), bh->b_blocknr);
+ switch (le16_to_cpu(tl.fc_tag)) {
+@@ -1986,13 +1987,13 @@ static int ext4_fc_replay_scan(journal_t
+ case EXT4_FC_TAG_PAD:
+ state->fc_cur_tag++;
+ state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
+- sizeof(tl) + le16_to_cpu(tl.fc_len));
++ EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len));
+ break;
+ case EXT4_FC_TAG_TAIL:
+ state->fc_cur_tag++;
+ memcpy(&tail, val, sizeof(tail));
+ state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
+- sizeof(tl) +
++ EXT4_FC_TAG_BASE_LEN +
+ offsetof(struct ext4_fc_tail,
+ fc_crc));
+ if (le32_to_cpu(tail.fc_tid) == expected_tid &&
+@@ -2019,7 +2020,7 @@ static int ext4_fc_replay_scan(journal_t
+ }
+ state->fc_cur_tag++;
+ state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
+- sizeof(tl) + le16_to_cpu(tl.fc_len));
++ EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len));
+ break;
+ default:
+ ret = state->fc_replay_num_tags ?
+@@ -2074,9 +2075,10 @@ static int ext4_fc_replay(journal_t *jou
+ start = (u8 *)bh->b_data;
+ end = (__u8 *)bh->b_data + journal->j_blocksize - 1;
+
+- for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) {
+- memcpy(&tl, cur, sizeof(tl));
+- val = cur + sizeof(tl);
++ for (cur = start; cur < end;
++ cur = cur + EXT4_FC_TAG_BASE_LEN + le16_to_cpu(tl.fc_len)) {
++ memcpy(&tl, cur, EXT4_FC_TAG_BASE_LEN);
++ val = cur + EXT4_FC_TAG_BASE_LEN;
+
+ if (state->fc_replay_num_tags == 0) {
+ ret = JBD2_FC_REPLAY_STOP;
+--- a/fs/ext4/fast_commit.h
++++ b/fs/ext4/fast_commit.h
+@@ -70,6 +70,9 @@ struct ext4_fc_tail {
+ __le32 fc_crc;
+ };
+
++/* Tag base length */
++#define EXT4_FC_TAG_BASE_LEN (sizeof(struct ext4_fc_tl))
++
+ /*
+ * Fast commit status codes
+ */
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-limit-length-to-bitmap_maxbytes-blocksize-in-pu.patch
@@ -0,0 +1,66 @@
+From 2da376228a2427501feb9d15815a45dbdbdd753e Mon Sep 17 00:00:00 2001
+From: Tadeusz Struk <tadeusz.struk@linaro.org>
+Date: Thu, 31 Mar 2022 13:05:15 -0700
+Subject: [PATCH] ext4: limit length to bitmap_maxbytes - blocksize in
+ punch_hole
+Git-commit: 2da376228a2427501feb9d15815a45dbdbdd753e
+Patch-mainline: v5.18-rc4
+References: bsc#1200806
+
+Syzbot found an issue [1] in ext4_fallocate().
+The C reproducer [2] calls fallocate(), passing size 0xffeffeff000ul,
+and offset 0x1000000ul, which, when added together exceed the
+bitmap_maxbytes for the inode. This triggers a BUG in
+ext4_ind_remove_space(). According to the comments in this function
+the 'end' parameter needs to be one block after the last block to be
+removed. In the case when the BUG is triggered it points to the last
+block. Modify the ext4_punch_hole() function and add constraint that
+caps the length to satisfy the one before laster block requirement.
+
+Link: [1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d721331
+Link: [2] https://syzkaller.appspot.com/text?tag=ReproC&x=14ba0238700000
+
+Fixes: a4bb6b64e39a ("ext4: enable "punch hole" functionality")
+Reported-by: syzbot+7a806094edd5d07ba029@syzkaller.appspotmail.com
+Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
+Link: https://lore.kernel.org/r/20220331200515.153214-1-tadeusz.struk@linaro.org
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/inode.c | 11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index 955dd978dccf..d815502cc97c 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -3952,7 +3952,8 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
+ struct super_block *sb = inode->i_sb;
+ ext4_lblk_t first_block, stop_block;
+ struct address_space *mapping = inode->i_mapping;
+- loff_t first_block_offset, last_block_offset;
++ loff_t first_block_offset, last_block_offset, max_length;
++ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ handle_t *handle;
+ unsigned int credits;
+ int ret = 0, ret2 = 0;
+@@ -3995,6 +3996,14 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
+ offset;
+ }
+
++ /*
++ * For punch hole the length + offset needs to be within one block
++ * before last range. Adjust the length if it goes beyond that limit.
++ */
++ max_length = sbi->s_bitmap_maxbytes - inode->i_sb->s_blocksize;
++ if (offset + length > max_length)
++ length = max_length - offset;
++
+ if (offset & (sb->s_blocksize - 1) ||
+ (offset + length) & (sb->s_blocksize - 1)) {
+ /*
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-ext4_append-always-allocates-new-bloc.patch
@@ -0,0 +1,63 @@
+From b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b Mon Sep 17 00:00:00 2001
+From: Lukas Czerner <lczerner@redhat.com>
+Date: Mon, 4 Jul 2022 16:27:21 +0200
+Subject: [PATCH] ext4: make sure ext4_append() always allocates new block
+Git-commit: b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b
+Patch-mainline: v6.0-rc1
+References: bsc#1198577 CVE-2022-1184
+
+ext4_append() must always allocate a new block, otherwise we run the
+risk of overwriting existing directory block corrupting the directory
+tree in the process resulting in all manner of problems later on.
+
+Add a sanity check to see if the logical block is already allocated and
+error out if it is.
+
+Cc: stable@kernel.org
+Signed-off-by: Lukas Czerner <lczerner@redhat.com>
+Reviewed-by: Andreas Dilger <adilger@dilger.ca>
+Link: https://lore.kernel.org/r/20220704142721.157985-2-lczerner@redhat.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/namei.c | 16 ++++++++++++++++
+ 1 file changed, 16 insertions(+)
+
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 7fced54e2891..3a31b662f661 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -54,6 +54,7 @@ static struct buffer_head *ext4_append(handle_t *handle,
+ struct inode *inode,
+ ext4_lblk_t *block)
+ {
++ struct ext4_map_blocks map;
+ struct buffer_head *bh;
+ int err;
+
+@@ -63,6 +64,21 @@ static struct buffer_head *ext4_append(handle_t *handle,
+ return ERR_PTR(-ENOSPC);
+
+ *block = inode->i_size >> inode->i_sb->s_blocksize_bits;
++ map.m_lblk = *block;
++ map.m_len = 1;
++
++ /*
++ * We're appending new directory block. Make sure the block is not
++ * allocated yet, otherwise we will end up corrupting the
++ * directory.
++ */
++ err = ext4_map_blocks(NULL, inode, &map, 0);
++ if (err < 0)
++ return ERR_PTR(err);
++ if (err) {
++ EXT4_ERROR_INODE(inode, "Logical block already allocated");
++ return ERR_PTR(-EFSCORRUPTED);
++ }
+
+ bh = ext4_bread(handle, inode, *block, EXT4_GET_BLOCKS_CREATE);
+ if (IS_ERR(bh))
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-quota-gets-properly-shutdown-on-error.patch
@@ -0,0 +1,56 @@
+From 15fc69bbbbbc8c72e5f6cc4e1be0f51283c5448e Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Thu, 7 Oct 2021 17:53:35 +0200
+Subject: [PATCH] ext4: make sure quota gets properly shutdown on error
+Git-commit: 15fc69bbbbbc8c72e5f6cc4e1be0f51283c5448e
+Patch-mainline: v5.17-rc1
+References: bsc#1195480
+
+When we hit an error when enabling quotas and setting inode flags, we do
+not properly shutdown quota subsystem despite returning error from
+Q_QUOTAON quotactl. This can lead to some odd situations like kernel
+using quota file while it is still writeable for userspace. Make sure we
+properly cleanup the quota subsystem in case of error.
+
+Signed-off-by: Jan Kara <jack@suse.cz>
+Cc: stable@kernel.org
+Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/super.c | 10 ++++++----
+ 1 file changed, 6 insertions(+), 4 deletions(-)
+
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index b72f8f6084e4..863a3eae505a 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -6749,10 +6749,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
+
+ lockdep_set_quota_inode(path->dentry->d_inode, I_DATA_SEM_QUOTA);
+ err = dquot_quota_on(sb, type, format_id, path);
+- if (err) {
+- lockdep_set_quota_inode(path->dentry->d_inode,
+- I_DATA_SEM_NORMAL);
+- } else {
++ if (!err) {
+ struct inode *inode = d_inode(path->dentry);
+ handle_t *handle;
+
+@@ -6772,7 +6769,12 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id,
+ ext4_journal_stop(handle);
+ unlock_inode:
+ inode_unlock(inode);
++ if (err)
++ dquot_quota_off(sb, type);
+ }
++ if (err)
++ lockdep_set_quota_inode(path->dentry->d_inode,
++ I_DATA_SEM_NORMAL);
+ return err;
+ }
+
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-make-sure-to-reset-inode-lockdep-class-when-quo.patch
@@ -0,0 +1,55 @@
+From 4013d47a5307fdb5c13370b5392498b00fedd274 Mon Sep 17 00:00:00 2001
+From: Jan Kara <jack@suse.cz>
+Date: Thu, 7 Oct 2021 17:53:36 +0200
+Subject: [PATCH] ext4: make sure to reset inode lockdep class when quota
+ enabling fails
+Git-commit: 4013d47a5307fdb5c13370b5392498b00fedd274
+Patch-mainline: v5.17-rc1
+References: bsc#1202761
+
+When we succeed in enabling some quota type but fail to enable another
+one with quota feature, we correctly disable all enabled quota types.
+However we forget to reset i_data_sem lockdep class. When the inode gets
+freed and reused, it will inherit this lockdep class (i_data_sem is
+initialized only when a slab is created) and thus eventually lockdep
+barfs about possible deadlocks.
+
+Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
+Signed-off-by: Jan Kara <jack@suse.cz>
+Cc: stable@kernel.org
+Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.cz
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/super.c | 13 ++++++++++++-
+ 1 file changed, 12 insertions(+), 1 deletion(-)
+
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 863a3eae505a..1b55f234e006 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -6837,8 +6837,19 @@ int ext4_enable_quotas(struct super_block *sb)
+ "Failed to enable quota tracking "
+ "(type=%d, err=%d). Please run "
+ "e2fsck to fix.", type, err);
+- for (type--; type >= 0; type--)
++ for (type--; type >= 0; type--) {
++ struct inode *inode;
++
++ inode = sb_dqopt(sb)->files[type];
++ if (inode)
++ inode = igrab(inode);
+ dquot_quota_off(sb, type);
++ if (inode) {
++ lockdep_set_quota_inode(inode,
++ I_DATA_SEM_NORMAL);
++ iput(inode);
++ }
++ }
+
+ return err;
+ }
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-make-variable-count-signed.patch
@@ -0,0 +1,40 @@
+From bc75a6eb856cb1507fa907bf6c1eda91b3fef52f Mon Sep 17 00:00:00 2001
+From: Ding Xiang <dingxiang@cmss.chinamobile.com>
+Date: Mon, 30 May 2022 18:00:47 +0800
+Subject: [PATCH] ext4: make variable "count" signed
+Git-commit: bc75a6eb856cb1507fa907bf6c1eda91b3fef52f
+Patch-mainline: v5.19-rc3
+References: bsc#1200820
+
+Since dx_make_map() may return -EFSCORRUPTED now, so change "count" to
+be a signed integer so we can correctly check for an error code returned
+by dx_make_map().
+
+Fixes: 46c116b920eb ("ext4: verify dir block before splitting it")
+Cc: stable@kernel.org
+Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
+Link: https://lore.kernel.org/r/20220530100047.537598-1-dingxiang@cmss.chinamobile.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/namei.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 47d0ca4c795b..db4ba99d1ceb 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -1929,7 +1929,8 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
+ struct dx_hash_info *hinfo)
+ {
+ unsigned blocksize = dir->i_sb->s_blocksize;
+- unsigned count, continued;
++ unsigned continued;
++ int count;
+ struct buffer_head *bh2;
+ ext4_lblk_t newblock;
+ u32 hash2;
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-mark-group-as-trimmed-only-if-it-was-fully-scan.patch
@@ -0,0 +1,101 @@
+From d63c00ea435a5352f486c259665a4ced60399421 Mon Sep 17 00:00:00 2001
+From: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
+Date: Sun, 17 Apr 2022 20:03:15 +0300
+Subject: [PATCH] ext4: mark group as trimmed only if it was fully scanned
+Git-commit: d63c00ea435a5352f486c259665a4ced60399421
+Patch-mainline: v5.19-rc1
+References: bsc#1202770
+
+Otherwise nonaligned fstrim calls will works inconveniently for iterative
+scanners, for example:
+
+// trim [0,16MB] for group-1, but mark full group as trimmed
+fstrim -o $((1024*1024*128)) -l $((1024*1024*16)) ./m
+// handle [16MB,16MB] for group-1, do nothing because group already has the flag.
+fstrim -o $((1024*1024*144)) -l $((1024*1024*16)) ./m
+
+[ Update function documentation for ext4_trim_all_free -- TYT ]
+
+Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
+Link: https://lore.kernel.org/r/1650214995-860245-1-git-send-email-dmtrmonakhov@yandex-team.ru
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/mballoc.c | 18 ++++++++++++------
+ 1 file changed, 12 insertions(+), 6 deletions(-)
+
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -6281,6 +6281,7 @@ static int ext4_try_to_trim_range(struct
+ * @start: first group block to examine
+ * @max: last group block to examine
+ * @minblocks: minimum extent block count
++ * @set_trimmed: set the trimmed flag if at least one block is trimmed
+ *
+ * ext4_trim_all_free walks through group's buddy bitmap searching for free
+ * extents. When the free block is found, ext4_trim_extent is called to TRIM
+@@ -6295,7 +6296,7 @@ static int ext4_try_to_trim_range(struct
+ static ext4_grpblk_t
+ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
+ ext4_grpblk_t start, ext4_grpblk_t max,
+- ext4_grpblk_t minblocks)
++ ext4_grpblk_t minblocks, bool set_trimmed)
+ {
+ struct ext4_buddy e4b;
+ int ret;
+@@ -6312,7 +6313,7 @@ ext4_trim_all_free(struct super_block *s
+ if (!EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) ||
+ minblocks < atomic_read(&EXT4_SB(sb)->s_last_trim_minblks)) {
+ ret = ext4_try_to_trim_range(sb, &e4b, start, max, minblocks);
+- if (ret >= 0)
++ if (ret >= 0 && set_trimmed)
+ EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
+ } else {
+ ret = 0;
+@@ -6349,6 +6350,7 @@ int ext4_trim_fs(struct super_block *sb,
+ ext4_fsblk_t first_data_blk =
+ le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block);
+ ext4_fsblk_t max_blks = ext4_blocks_count(EXT4_SB(sb)->s_es);
++ bool whole_group, eof = false;
+ int ret = 0;
+
+ start = range->start >> sb->s_blocksize_bits;
+@@ -6367,8 +6369,10 @@ int ext4_trim_fs(struct super_block *sb,
+ if (minlen > EXT4_CLUSTERS_PER_GROUP(sb))
+ goto out;
+ }
+- if (end >= max_blks)
++ if (end >= max_blks - 1) {
+ end = max_blks - 1;
++ eof = true;
++ }
+ if (end <= first_data_blk)
+ goto out;
+ if (start < first_data_blk)
+@@ -6382,6 +6386,7 @@ int ext4_trim_fs(struct super_block *sb,
+
+ /* end now represents the last cluster to discard in this group */
+ end = EXT4_CLUSTERS_PER_GROUP(sb) - 1;
++ whole_group = true;
+
+ for (group = first_group; group <= last_group; group++) {
+ grp = ext4_get_group_info(sb, group);
+@@ -6398,12 +6403,13 @@ int ext4_trim_fs(struct super_block *sb,
+ * change it for the last group, note that last_cluster is
+ * already computed earlier by ext4_get_group_no_and_offset()
+ */
+- if (group == last_group)
++ if (group == last_group) {
+ end = last_cluster;
+-
++ whole_group = eof ? true : end == EXT4_CLUSTERS_PER_GROUP(sb) - 1;
++ }
+ if (grp->bb_free >= minlen) {
+ cnt = ext4_trim_all_free(sb, group, first_cluster,
+- end, minlen);
++ end, minlen, whole_group);
+ if (cnt < 0) {
+ ret = cnt;
+ break;
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-modify-the-logic-of-ext4_mb_new_blocks_simple.patch
@@ -0,0 +1,79 @@
+From 31a074a0c62dc0d2bfb9b543142db4fe27f9e5eb Mon Sep 17 00:00:00 2001
+From: Xin Yin <yinxin.x@bytedance.com>
+Date: Mon, 10 Jan 2022 11:51:41 +0800
+Subject: [PATCH] ext4: modify the logic of ext4_mb_new_blocks_simple
+Git-commit: 31a074a0c62dc0d2bfb9b543142db4fe27f9e5eb
+Patch-mainline: v5.17-rc3
+References: bsc#1202766
+
+For now in ext4_mb_new_blocks_simple, if we found a block which
+should be excluded then will switch to next group, this may
+probably cause 'group' run out of range.
+
+Change to check next block in the same group when get a block should
+be excluded. Also change the search range to EXT4_CLUSTERS_PER_GROUP
+and add error checking.
+
+Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
+Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
+Link: https://lore.kernel.org/r/20220110035141.1980-3-yinxin.x@bytedance.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Cc: stable@kernel.org
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/mballoc.c | 26 +++++++++++++++++---------
+ 1 file changed, 17 insertions(+), 9 deletions(-)
+
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index cf2fd9fc7d98..c781974df9d0 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -5753,7 +5753,8 @@ static ext4_fsblk_t ext4_mb_new_blocks_simple(handle_t *handle,
+ struct super_block *sb = ar->inode->i_sb;
+ ext4_group_t group;
+ ext4_grpblk_t blkoff;
+- int i = sb->s_blocksize;
++ ext4_grpblk_t max = EXT4_CLUSTERS_PER_GROUP(sb);
++ ext4_grpblk_t i = 0;
+ ext4_fsblk_t goal, block;
+ struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+
+@@ -5775,19 +5776,26 @@ static ext4_fsblk_t ext4_mb_new_blocks_simple(handle_t *handle,
+ ext4_get_group_no_and_offset(sb,
+ max(ext4_group_first_block_no(sb, group), goal),
+ NULL, &blkoff);
+- i = mb_find_next_zero_bit(bitmap_bh->b_data, sb->s_blocksize,
++ while (1) {
++ i = mb_find_next_zero_bit(bitmap_bh->b_data, max,
+ blkoff);
++ if (i >= max)
++ break;
++ if (ext4_fc_replay_check_excluded(sb,
++ ext4_group_first_block_no(sb, group) + i)) {
++ blkoff = i + 1;
++ } else
++ break;
++ }
+ brelse(bitmap_bh);
+- if (i >= sb->s_blocksize)
+- continue;
+- if (ext4_fc_replay_check_excluded(sb,
+- ext4_group_first_block_no(sb, group) + i))
+- continue;
+- break;
++ if (i < max)
++ break;
+ }
+
+- if (group >= ext4_get_groups_count(sb) && i >= sb->s_blocksize)
++ if (group >= ext4_get_groups_count(sb) || i >= max) {
++ *errp = -ENOSPC;
+ return 0;
++ }
+
+ block = ext4_group_first_block_no(sb, group) + i;
+ ext4_mb_mark_bb(sb, block, 1, 1);
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-move-where-set-the-MAY_INLINE_DATA-flag-is-set.patch
@@ -0,0 +1,61 @@
+From 1dcdce5919115a471bf4921a57f20050c545a236 Mon Sep 17 00:00:00 2001
+From: Ye Bin <yebin10@huawei.com>
+Date: Tue, 7 Mar 2023 09:52:52 +0800
+Subject: [PATCH] ext4: move where set the MAY_INLINE_DATA flag is set
+Git-commit: 1dcdce5919115a471bf4921a57f20050c545a236
+Patch-mainline: v6.3-rc2
+References: bsc#1213011
+
+The only caller of ext4_find_inline_data_nolock() that needs setting of
+EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In
+ext4_write_inline_data_end() we just need to update inode->i_inline_off.
+Since we are going to add one more caller that does not need to set
+EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA
+out to ext4_iget_extra_inode().
+
+Signed-off-by: Ye Bin <yebin10@huawei.com>
+Cc: stable@kernel.org
+Reviewed-by: Jan Kara <jack@suse.cz>
+Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/inline.c | 1 -
+ fs/ext4/inode.c | 7 ++++++-
+ 2 files changed, 6 insertions(+), 2 deletions(-)
+
+diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
+index 2b42ececa46d..1602d74b5eeb 100644
+--- a/fs/ext4/inline.c
++++ b/fs/ext4/inline.c
+@@ -159,7 +159,6 @@ int ext4_find_inline_data_nolock(struct inode *inode)
+ (void *)ext4_raw_inode(&is.iloc));
+ EXT4_I(inode)->i_inline_size = EXT4_MIN_INLINE_DATA_SIZE +
+ le32_to_cpu(is.s.here->e_value_size);
+- ext4_set_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
+ }
+ out:
+ brelse(is.iloc.bh);
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index b65dadfe3b45..530e420ae0e8 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -4797,8 +4797,13 @@ static inline int ext4_iget_extra_inode(struct inode *inode,
+
+ if (EXT4_INODE_HAS_XATTR_SPACE(inode) &&
+ *magic == cpu_to_le32(EXT4_XATTR_MAGIC)) {
++ int err;
++
+ ext4_set_inode_state(inode, EXT4_STATE_XATTR);
+- return ext4_find_inline_data_nolock(inode);
++ err = ext4_find_inline_data_nolock(inode);
++ if (!err && ext4_has_inline_data(inode))
++ ext4_set_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
++ return err;
+ } else
+ EXT4_I(inode)->i_inline_off = 0;
+ return 0;
+--
+2.35.3
+
--- /dev/null
+++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-only-update-i_reserved_data_blocks-on-successfu.patch
@@ -0,0 +1,100 @@
+From de25d6e9610a8b30cce9bbb19b50615d02ebca02 Mon Sep 17 00:00:00 2001
+From: Baokun Li <libaokun1@huawei.com>
+Date: Mon, 24 Apr 2023 11:38:35 +0800
+Subject: [PATCH] ext4: only update i_reserved_data_blocks on successful block
+ allocation
+Git-commit: de25d6e9610a8b30cce9bbb19b50615d02ebca02
+Patch-mainline: v6.5-rc1
+References: bsc#1213019
+
+In our fault injection test, we create an ext4 file, migrate it to
+non-extent based file, then punch a hole and finally trigger a WARN_ON
+in the ext4_da_update_reserve_space():
+
+EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
+ino 14, used 11 with only 10 reserved data blocks
+
+When writing back a non-extent based file, if we enable delalloc, the
+number of reserved blocks will be subtracted from the number of blocks
+mapped by ext4_ind_map_blocks(), and the extent status tree will be
+updated. We update the extent status tree by first removing the old
+extent_status and then inserting the new extent_status. If the block range
+we remove happens to be in an extent, then we need to allocate another
+extent_status with ext4_es_alloc_extent().
+
+ use old to remove to add new
+ |----------|------------|------------|
+ old extent_status
+
+The problem is that the allocation of a new extent_status failed due to a
+fault injection, and __es_shrink() did not get free memory, resulting in
+a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
+we map to the same extent again, and the number of reserved blocks is again
+subtracted from the number of blocks in that extent. Since the blocks in
+the same extent are subtracted twice, we end up triggering WARN_ON at
+ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.
+
+For non-extent based file, we update the number of reserved blocks after
+ext4_ind_map_blocks() is executed, which causes a problem that when we call
+ext4_ind_map_blocks() to create a block, it doesn't always create a block,
+but we always reduce the number of reserved blocks. So we move the logic
+for updating reserved blocks to ext4_ind_map_blocks() to ensure that the
+number of reserved blocks is updated only after we do succeed in allocating
+some new blocks.
+
+Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate")
+Cc: stable@kernel.org
+Signed-off-by: Baokun Li <libaokun1@huawei.com>
+Reviewed-by: Jan Kara <jack@suse.cz>
+Link: https://lore.kernel.org/r/20230424033846.4732-2-libaokun1@huawei.com
+Signed-off-by: Theodore Ts'o <tytso@mit.edu>
+Acked-by: Jan Kara <jack@suse.cz>
+
+---
+ fs/ext4/indirect.c | 8 ++++++++
+ fs/ext4/inode.c | 10 ----------
+ 2 files changed, 8 insertions(+), 10 deletions(-)
+
+diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
+index c68bebe7ff4b..a9f3716119d3 100644
+--- a/fs/ext4/indirect.c
++++ b/fs/ext4/indirect.c
+@@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
+
+ ext4_update_inode_fsync_trans(handle, inode, 1);
+ count = ar.len;
++
++ /*
++ * Update reserved blocks/metadata blocks after successful block
++ * allocation which had been deferred till now.
++ */ ++ if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) ++ ext4_da_update_reserve_space(inode, count, 1); ++ + got_it: + map->m_flags |= EXT4_MAP_MAPPED; + map->m_pblk = le32_to_cpu(chain[depth-1].key); +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index c2868282ad81..ef7ec2690b84 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -632,16 +632,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode, + */ + ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE); + } +- +- /* +- * Update reserved blocks/metadata blocks after successful +- * block allocation which had been deferred till now. We don't +- * support fallocate for non extent files. So we can update +- * reserve space here. +- */ +- if ((retval > 0) && +- (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)) +- ext4_da_update_reserve_space(inode, retval, 1); + } + + if (retval > 0) { +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-place-buffer-head-allocation-before-handle-star.patch @@ -0,0 +1,49 @@ +From d1052d236eddf6aa851434db1897b942e8db9921 Mon Sep 17 00:00:00 2001 +From: Jinke Han <hanjinke.666@bytedance.com> +Date: Sat, 3 Sep 2022 09:24:29 +0800 +Subject: [PATCH] ext4: place buffer head allocation before handle start +Git-commit: d1052d236eddf6aa851434db1897b942e8db9921 +Patch-mainline: v6.1-rc1 +References: bsc#1207607 + +In our product environment, we encounter some jbd hung waiting handles to +stop while several writters were doing memory reclaim for buffer head +allocation in delay alloc write path. Ext4 do buffer head allocation with +holding transaction handle which may be blocked too long if the reclaim +works not so smooth. According to our bcc trace, the reclaim time in +buffer head allocation can reach 258s and the jbd transaction commit also +take almost the same time meanwhile. Except for these extreme cases, +we often see several seconds delays for cgroup memory reclaim on our +servers. This is more likely to happen considering docker environment. 
+ +One thing to note, the allocation of buffer heads is as often as page +allocation or more often when blocksize less than page size. Just like +page cache allocation, we should also place the buffer head allocation +before startting the handle. + +Cc: stable@kernel.org +Signed-off-by: Jinke Han <hanjinke.666@bytedance.com> +Link: https://lore.kernel.org/r/20220903012429.22555-1-hanjinke.666@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -1177,6 +1177,13 @@ retry_grab: + page = grab_cache_page_write_begin(mapping, index, flags); + if (!page) + return -ENOMEM; ++ /* ++ * The same as page allocation, we prealloc buffer heads before ++ * starting the handle. ++ */ ++ if (!page_has_buffers(page)) ++ create_empty_buffers(page, inode->i_sb->s_blocksize, 0); ++ + unlock_page(page); + + retry_journal: --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-prevent-used-blocks-from-being-allocated-during.patch @@ -0,0 +1,117 @@ +From 599ea31d13617c5484c40cdf50d88301dc351cfc Mon Sep 17 00:00:00 2001 +From: Xin Yin <yinxin.x@bytedance.com> +Date: Mon, 10 Jan 2022 11:51:40 +0800 +Subject: [PATCH] ext4: prevent used blocks from being allocated during fast + commit replay +Git-commit: 599ea31d13617c5484c40cdf50d88301dc351cfc +Patch-mainline: v5.17-rc3 +References: bsc#1202765 + +During fast commit replay procedure, we clear inode blocks bitmap in +ext4_ext_clear_bb(), this may cause ext4_mb_new_blocks_simple() allocate +blocks still in use. + +Make ext4_fc_record_regions() also record physical disk regions used by +inodes during replay procedure. Then ext4_mb_new_blocks_simple() can +excludes these blocks in use. 
+ +Signed-off-by: Xin Yin <yinxin.x@bytedance.com> +Link: https://lore.kernel.org/r/20220110035141.1980-2-yinxin.x@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ext4.h | 3 +++ + fs/ext4/extents.c | 4 ++++ + fs/ext4/fast_commit.c | 20 +++++++++++++++----- + 3 files changed, 22 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h +index 715ee206dfe1..598ecf07652a 100644 +--- a/fs/ext4/ext4.h ++++ b/fs/ext4/ext4.h +@@ -2934,6 +2934,9 @@ void ext4_fc_replay_cleanup(struct super_block *sb); + int ext4_fc_commit(journal_t *journal, tid_t commit_tid); + int __init ext4_fc_init_dentry_cache(void); + void ext4_fc_destroy_dentry_cache(void); ++int ext4_fc_record_regions(struct super_block *sb, int ino, ++ ext4_lblk_t lblk, ext4_fsblk_t pblk, ++ int len, int replay); + + /* mballoc.c */ + extern const struct seq_operations ext4_mb_seq_groups_ops; +diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c +index 1077ce7e189f..5dd13108d4b7 100644 +--- a/fs/ext4/extents.c ++++ b/fs/ext4/extents.c +@@ -6093,11 +6093,15 @@ int ext4_ext_clear_bb(struct inode *inode) + + ext4_mb_mark_bb(inode->i_sb, + path[j].p_block, 1, 0); ++ ext4_fc_record_regions(inode->i_sb, inode->i_ino, ++ 0, path[j].p_block, 1, 1); + } + ext4_ext_drop_refs(path); + kfree(path); + } + ext4_mb_mark_bb(inode->i_sb, map.m_pblk, map.m_len, 0); ++ ext4_fc_record_regions(inode->i_sb, inode->i_ino, ++ map.m_lblk, map.m_pblk, map.m_len, 1); + } + cur = cur + map.m_len; + } +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 5ae8026a0c56..1abe78b8d84f 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1563,16 +1563,23 @@ static int ext4_fc_replay_create(struct super_block *sb, struct ext4_fc_tl *tl, + } + + /* +- * Record physical disk regions which are in use as per fast commit area. Our +- * simple replay phase allocator excludes these regions from allocation. 
++ * Record physical disk regions which are in use as per fast commit area, ++ * and used by inodes during replay phase. Our simple replay phase ++ * allocator excludes these regions from allocation. + */ +-static int ext4_fc_record_regions(struct super_block *sb, int ino, +- ext4_lblk_t lblk, ext4_fsblk_t pblk, int len) ++int ext4_fc_record_regions(struct super_block *sb, int ino, ++ ext4_lblk_t lblk, ext4_fsblk_t pblk, int len, int replay) + { + struct ext4_fc_replay_state *state; + struct ext4_fc_alloc_region *region; + + state = &EXT4_SB(sb)->s_fc_replay_state; ++ /* ++ * during replay phase, the fc_regions_valid may not same as ++ * fc_regions_used, update it when do new additions. ++ */ ++ if (replay && state->fc_regions_used != state->fc_regions_valid) ++ state->fc_regions_used = state->fc_regions_valid; + if (state->fc_regions_used == state->fc_regions_size) { + state->fc_regions_size += + EXT4_FC_REPLAY_REALLOC_INCREMENT; +@@ -1590,6 +1597,9 @@ static int ext4_fc_record_regions(struct super_block *sb, int ino, + region->pblk = pblk; + region->len = len; + ++ if (replay) ++ state->fc_regions_valid++; ++ + return 0; + } + +@@ -1937,7 +1947,7 @@ static int ext4_fc_replay_scan(journal_t *journal, + ret = ext4_fc_record_regions(sb, + le32_to_cpu(ext.fc_ino), + le32_to_cpu(ex->ee_block), ext4_ext_pblock(ex), +- ext4_ext_get_actual_len(ex)); ++ ext4_ext_get_actual_len(ex), 0); + if (ret < 0) + break; + ret = JBD2_FC_REPLAY_CONTINUE; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-recover-csum-seed-of-tmp_inode-after-migrating-.patch @@ -0,0 +1,77 @@ +From 07ea7a617d6b278fb7acedb5cbe1a81ce2de7d0c Mon Sep 17 00:00:00 2001 +From: Li Lingfeng <lilingfeng3@huawei.com> +Date: Fri, 17 Jun 2022 14:25:15 +0800 +Subject: [PATCH] ext4: recover csum seed of tmp_inode after migrating to + extents +Git-commit: 07ea7a617d6b278fb7acedb5cbe1a81ce2de7d0c +Patch-mainline: v6.0-rc1 +References: bsc#1202713 + +When migrating to extents, the checksum 
seed of temporary inode +need to be replaced by inode's, otherwise the inode checksums +will be incorrect when swapping the inodes data. + +However, the temporary inode can not match it's checksum to +itself since it has lost it's own checksum seed. + +mkfs.ext4 -F /dev/sdc +mount /dev/sdc /mnt/sdc +xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/sdc/testfile +chattr -e /mnt/sdc/testfile +chattr +e /mnt/sdc/testfile +umount /dev/sdc +fsck -fn /dev/sdc + +======== +... +Pass 1: Checking inodes, blocks, and sizes +Inode 13 passes checks, but checksum does not match inode. Fix? no +... +======== + +The fix is simple, save the checksum seed of temporary inode, and +recover it after migrating to extents. + +Fixes: e81c9302a6c3 ("ext4: set csum seed in tmp inode while migrating to extents") +Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220617062515.2113438-1-lilingfeng3@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/migrate.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) + +diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c +index 42f590518b4c..54e7d3c95fd7 100644 +--- a/fs/ext4/migrate.c ++++ b/fs/ext4/migrate.c +@@ -417,7 +417,7 @@ int ext4_ext_migrate(struct inode *inode) + struct inode *tmp_inode = NULL; + struct migrate_struct lb; + unsigned long max_entries; +- __u32 goal; ++ __u32 goal, tmp_csum_seed; + uid_t owner[2]; + + /* +@@ -465,6 +465,7 @@ int ext4_ext_migrate(struct inode *inode) + * the migration. + */ + ei = EXT4_I(inode); ++ tmp_csum_seed = EXT4_I(tmp_inode)->i_csum_seed; + EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed; + i_size_write(tmp_inode, i_size_read(inode)); + /* +@@ -575,6 +576,7 @@ int ext4_ext_migrate(struct inode *inode) + * the inode is not visible to user space. 
+ */ + tmp_inode->i_blocks = 0; ++ EXT4_I(tmp_inode)->i_csum_seed = tmp_csum_seed; + + /* Reset the extent details */ + ext4_ext_tree_init(handle, tmp_inode); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-refuse-to-create-ea-block-when-umounted.patch @@ -0,0 +1,45 @@ +From f31173c19901a96bb2ebf6bcfec8a08df7095c91 Mon Sep 17 00:00:00 2001 +From: Jun Nie <jun.nie@linaro.org> +Date: Tue, 3 Jan 2023 09:45:17 +0800 +Subject: [PATCH] ext4: refuse to create ea block when umounted +Git-commit: f31173c19901a96bb2ebf6bcfec8a08df7095c91 +Patch-mainline: v6.3-rc1 +References: bsc#1213093 + +The ea block expansion need to access s_root while it is +already set as NULL when umount is triggered. Refuse this +request to avoid panic. + +Reported-by: syzbot+2dacb8f015bf1420155f@syzkaller.appspotmail.com +Link: https://syzkaller.appspot.com/bug?id=3613786cb88c93aa1c6a279b1df6a7b201347d08 +Link: https://lore.kernel.org/r/20230103014517.495275-3-jun.nie@linaro.org +Cc: stable@kernel.org +Signed-off-by: Jun Nie <jun.nie@linaro.org> +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index 38e08b438ccb..d8fef540ca9b 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1471,6 +1471,13 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle, + uid_t owner[2] = { i_uid_read(inode), i_gid_read(inode) }; + int err; + ++ if (inode->i_sb->s_root == NULL) { ++ ext4_warning(inode->i_sb, ++ "refuse to create EA inode when umounting"); ++ WARN_ON(1); ++ return ERR_PTR(-EINVAL); ++ } ++ + /* + * Let the next inode be the goal, so we try and allocate the EA inode + * in the same group, or nearby one. 
+-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-reject-the-commit-option-on-ext2-filesystems.patch @@ -0,0 +1,40 @@ +From cb8435dc8ba33bcafa41cf2aa253794320a3b8df Mon Sep 17 00:00:00 2001 +From: Eric Biggers <ebiggers@google.com> +Date: Tue, 10 May 2022 11:32:32 -0700 +Subject: [PATCH] ext4: reject the 'commit' option on ext2 filesystems +Git-commit: cb8435dc8ba33bcafa41cf2aa253794320a3b8df +Patch-mainline: v5.19-rc1 +References: bsc#1200808 + +The 'commit' option is only applicable for ext3 and ext4 filesystems, +and has never been accepted by the ext2 filesystem driver, so the ext4 +driver shouldn't allow it on ext2 filesystems. + +This fixes a failure in xfstest ext4/053. + +Fixes: 8dc0aa8cf0f7 ("ext4: check incompatible mount options while mounting ext2/3") +Signed-off-by: Eric Biggers <ebiggers@google.com> +Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> +Reviewed-by: Lukas Czerner <lczerner@redhat.com> +Link: https://lore.kernel.org/r/20220510183232.172615-1-ebiggers@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 4b0ea8df1f5c..3f59efd3aa3e 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -1915,6 +1915,7 @@ static const struct mount_opts { + MOPT_EXT4_ONLY | MOPT_CLEAR}, + {Opt_warn_on_error, EXT4_MOUNT_WARN_ON_ERROR, MOPT_SET}, + {Opt_nowarn_on_error, EXT4_MOUNT_WARN_ON_ERROR, MOPT_CLEAR}, ++ {Opt_commit, 0, MOPT_NO_EXT2}, + {Opt_nojournal_checksum, EXT4_MOUNT_JOURNAL_CHECKSUM, + MOPT_EXT4_ONLY | MOPT_CLEAR}, + {Opt_journal_checksum, EXT4_MOUNT_JOURNAL_CHECKSUM, +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-remove-EA-inode-entry-from-mbcache-on-inode-evi.patch @@ -0,0 +1,116 @@ +From 6bc0d63dad7f9f54d381925ee855b402f652fa39 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Tue, 12 Jul 2022 12:54:22 +0200 +Subject: [PATCH] ext4: remove EA inode 
entry from mbcache on inode eviction +Git-commit: 6bc0d63dad7f9f54d381925ee855b402f652fa39 +Patch-mainline: v6.0-rc1 +References: bsc#1198971 + +Currently we remove EA inode from mbcache as soon as its xattr refcount +drops to zero. However there can be pending attempts to reuse the inode +and thus refcount handling code has to handle the situation when +refcount increases from zero anyway. So save some work and just keep EA +inode in mbcache until it is getting evicted. At that moment we are sure +following iget() of EA inode will fail anyway (or wait for eviction to +finish and load things from the disk again) and so removing mbcache +entry at that moment is fine and simplifies the code a bit. + +Cc: stable@vger.kernel.org +Fixes: 82939d7999df ("ext4: convert to mbcache2") +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220712105436.32204-3-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 2 ++ + fs/ext4/xattr.c | 24 ++++++++---------------- + fs/ext4/xattr.h | 1 + + 3 files changed, 11 insertions(+), 16 deletions(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 7db52defcb16..8204c59bdd1d 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -177,6 +177,8 @@ void ext4_evict_inode(struct inode *inode) + + trace_ext4_evict_inode(inode); + ++ if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL) ++ ext4_evict_ea_inode(inode); + if (inode->i_nlink) { + /* + * When journalling data dirty buffers are tracked only in the +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index c42b3e0d2d94..d92d50de5a01 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -436,6 +436,14 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, + return err; + } + ++/* Remove entry from mbcache when EA inode is getting evicted */ ++void ext4_evict_ea_inode(struct inode *inode) ++{ ++ if (EA_INODE_CACHE(inode)) ++ mb_cache_entry_delete(EA_INODE_CACHE(inode), ++ 
ext4_xattr_inode_get_hash(inode), inode->i_ino); ++} ++ + static int + ext4_xattr_inode_verify_hashes(struct inode *ea_inode, + struct ext4_xattr_entry *entry, void *buffer, +@@ -976,10 +984,8 @@ int __ext4_xattr_set_credits(struct super_block *sb, struct inode *inode, + static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, + int ref_change) + { +- struct mb_cache *ea_inode_cache = EA_INODE_CACHE(ea_inode); + struct ext4_iloc iloc; + s64 ref_count; +- u32 hash; + int ret; + + inode_lock(ea_inode); +@@ -1002,14 +1008,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, + + set_nlink(ea_inode, 1); + ext4_orphan_del(handle, ea_inode); +- +- if (ea_inode_cache) { +- hash = ext4_xattr_inode_get_hash(ea_inode); +- mb_cache_entry_create(ea_inode_cache, +- GFP_NOFS, hash, +- ea_inode->i_ino, +- true /* reusable */); +- } + } + } else { + WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld", +@@ -1022,12 +1020,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, + + clear_nlink(ea_inode); + ext4_orphan_add(handle, ea_inode); +- +- if (ea_inode_cache) { +- hash = ext4_xattr_inode_get_hash(ea_inode); +- mb_cache_entry_delete(ea_inode_cache, hash, +- ea_inode->i_ino); +- } + } + } + +diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h +index e29101168733..824faf0b15a8 100644 +--- a/fs/ext4/xattr.h ++++ b/fs/ext4/xattr.h +@@ -191,6 +191,7 @@ extern void ext4_xattr_inode_array_free(struct ext4_xattr_inode_array *array); + + extern int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize, + struct ext4_inode *raw_inode, handle_t *handle); ++extern void ext4_evict_ea_inode(struct inode *inode); + + extern const struct xattr_handler *ext4_xattr_handlers[]; + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-set-lockdep-subclass-for-the-ea_inode-in-ext4_x.patch @@ -0,0 +1,39 @@ +From b928dfdcb27d8fa59917b794cfba53052a2f050f Mon Sep 17 00:00:00 
2001 +From: Theodore Ts'o <tytso@mit.edu> +Date: Tue, 23 May 2023 23:49:49 -0400 +Subject: [PATCH] ext4: set lockdep subclass for the ea_inode in + ext4_xattr_inode_cache_find() +Git-commit: b928dfdcb27d8fa59917b794cfba53052a2f050f +Patch-mainline: v6.4-rc5 +References: bsc#1213107 + +If the ea_inode has been pushed out of the inode cache while there is +still a reference in the mb_cache, the lockdep subclass will not be +set on the inode, which can lead to some lockdep false positives. + +Fixes: 33d201e0277b ("ext4: fix lockdep warning about recursive inode locking") +Cc: stable@kernel.org +Reported-by: syzbot+d4b971e744b1f5439336@syzkaller.appspotmail.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Link: https://lore.kernel.org/r/20230524034951.779531-3-tytso@mit.edu +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c +index a27208129a80..ff7ab63c5b4f 100644 +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1539,6 +1539,7 @@ ext4_xattr_inode_cache_find(struct inode *inode, const void *value, + EXT4_IGET_EA_INODE); + if (IS_ERR(ea_inode)) + goto next_entry; ++ ext4_xattr_inode_set_class(ea_inode); + if (i_size_read(ea_inode) == value_len && + !ext4_xattr_inode_read(ea_inode, ea_data, value_len) && + !ext4_xattr_inode_verify_hashes(ea_inode, NULL, ea_data, +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-silence-the-warning-when-evicting-inode-with-di.patch @@ -0,0 +1,99 @@ +From bc12ac98ea2e1b70adc6478c8b473a0003b659d3 Mon Sep 17 00:00:00 2001 +From: Zhang Yi <yi.zhang@huawei.com> +Date: Wed, 29 Jun 2022 19:26:46 +0800 +Subject: [PATCH] ext4: silence the warning when evicting inode with + dioread_nolock +Git-commit: bc12ac98ea2e1b70adc6478c8b473a0003b659d3 +Patch-mainline: v6.2-rc1 +References: bsc#1206889 + +When evicting an inode with default dioread_nolock, it could be raced by +the unwritten extents converting kworker after 
writeback some new +allocated dirty blocks. It convert unwritten extents to written, the +extents could be merged to upper level and free extent blocks, so it +could mark the inode dirty again even this inode has been marked +I_FREEING. But the inode->i_io_list check and warning in +ext4_evict_inode() missing this corner case. Fortunately, +ext4_evict_inode() will wait all extents converting finished before this +check, so it will not lead to inode use-after-free problem, every thing +is OK besides this warning. The WARN_ON_ONCE was originally designed +for finding inode use-after-free issues in advance, but if we add +current dioread_nolock case in, it will become not quite useful, so fix +this warning by just remove this check. + + ====== + WARNING: CPU: 7 PID: 1092 at fs/ext4/inode.c:227 + ext4_evict_inode+0x875/0xc60 + ... + RIP: 0010:ext4_evict_inode+0x875/0xc60 + ... + Call Trace: + <TASK> + evict+0x11c/0x2b0 + iput+0x236/0x3a0 + do_unlinkat+0x1b4/0x490 + __x64_sys_unlinkat+0x4c/0xb0 + do_syscall_64+0x3b/0x90 + entry_SYSCALL_64_after_hwframe+0x46/0xb0 + RIP: 0033:0x7fa933c1115b + ====== + +rm kworker + ext4_end_io_end() +vfs_unlink() + ext4_unlink() + ext4_convert_unwritten_io_end_vec() + ext4_convert_unwritten_extents() + ext4_map_blocks() + ext4_ext_map_blocks() + ext4_ext_try_to_merge_up() + __mark_inode_dirty() + check !I_FREEING + locked_inode_to_wb_and_lock_list() + iput() + iput_final() + evict() + ext4_evict_inode() + truncate_inode_pages_final() //wait release io_end + inode_io_list_move_locked() + ext4_release_io_end() + trigger WARN_ON_ONCE() + +Cc: stable@kernel.org +Fixes: ceff86fddae8 ("ext4: Avoid freeing inodes on dirty list") +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220629112647.4141034-1-yi.zhang@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 
deletions(-) + +diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c +index 2b5ef1b64249..7c5f5dabe0fd 100644 +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -222,13 +222,13 @@ void ext4_evict_inode(struct inode *inode) + + /* + * For inodes with journalled data, transaction commit could have +- * dirtied the inode. Flush worker is ignoring it because of I_FREEING +- * flag but we still need to remove the inode from the writeback lists. ++ * dirtied the inode. And for inodes with dioread_nolock, unwritten ++ * extents converting worker could merge extents and also have dirtied ++ * the inode. Flush worker is ignoring it because of I_FREEING flag but ++ * we still need to remove the inode from the writeback lists. + */ +- if (!list_empty_careful(&inode->i_io_list)) { +- WARN_ON_ONCE(!ext4_should_journal_data(inode)); ++ if (!list_empty_careful(&inode->i_io_list)) + inode_io_list_del(inode); +- } + + /* + * Protect us against freezing - iput() caller didn't have to have any +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-turn-quotas-off-if-mount-failed-after-enabling-.patch @@ -0,0 +1,76 @@ +From d13f99632748462c32fc95d729f5e754bab06064 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Mon, 27 Mar 2023 22:16:29 +0800 +Subject: [PATCH] ext4: turn quotas off if mount failed after enabling quotas +Git-commit: d13f99632748462c32fc95d729f5e754bab06064 +Patch-mainline: v6.5-rc1 +References: bsc#1213110 + +Yi found during a review of the patch "ext4: don't BUG on inconsistent +journal feature" that when ext4_mark_recovery_complete() returns an error +value, the error handling path does not turn off the enabled quotas, +which triggers the following kmemleak: + +================================================================ +unreferenced object 0xffff8cf68678e7c0 (size 64): +comm "mount", pid 746, jiffies 4294871231 (age 11.540s) +hex dump (first 32 bytes): +00 90 ef 82 f6 8c ff ff 00 00 00 00 41 01 00 00 ............A... 
+c7 00 00 00 bd 00 00 00 0a 00 00 00 48 00 00 00 ............H... +Backtrace: +[<00000000c561ef24>] __kmem_cache_alloc_node+0x4d4/0x880 +[<00000000d4e621d7>] kmalloc_trace+0x39/0x140 +[<00000000837eee74>] v2_read_file_info+0x18a/0x3a0 +[<0000000088f6c877>] dquot_load_quota_sb+0x2ed/0x770 +[<00000000340a4782>] dquot_load_quota_inode+0xc6/0x1c0 +[<0000000089a18bd5>] ext4_enable_quotas+0x17e/0x3a0 [ext4] +[<000000003a0268fa>] __ext4_fill_super+0x3448/0x3910 [ext4] +[<00000000b0f2a8a8>] ext4_fill_super+0x13d/0x340 [ext4] +[<000000004a9489c4>] get_tree_bdev+0x1dc/0x370 +[<000000006e723bf1>] ext4_get_tree+0x1d/0x30 [ext4] +[<00000000c7cb663d>] vfs_get_tree+0x31/0x160 +[<00000000320e1bed>] do_new_mount+0x1d5/0x480 +[<00000000c074654c>] path_mount+0x22e/0xbe0 +[<0000000003e97a8e>] do_mount+0x95/0xc0 +[<000000002f3d3736>] __x64_sys_mount+0xc4/0x160 +[<0000000027d2140c>] do_syscall_64+0x3f/0x90 +================================================================ + +To solve this problem, we add a "failed_mount10" tag, and call +ext4_quota_off_umount() in this tag to release the enabled qoutas. 
+ +Fixes: 11215630aada ("ext4: don't BUG on inconsistent journal feature") +Cc: stable@kernel.org +Signed-off-by: Zhang Yi <yi.zhang@huawei.com> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230327141630.156875-2-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 6 ++++-- + 1 file changed, 4 insertions(+), 2 deletions(-) + +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5135,7 +5135,7 @@ no_journal: + ext4_msg(sb, KERN_INFO, "recovery complete"); + err = ext4_mark_recovery_complete(sb, es); + if (err) +- goto failed_mount8; ++ goto failed_mount9; + } + if (EXT4_SB(sb)->s_journal) { + if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) +@@ -5181,7 +5181,9 @@ cantfind_ext4: + ext4_msg(sb, KERN_ERR, "VFS: Can't find ext4 filesystem"); + goto failed_mount; + +-failed_mount8: ++failed_mount9: ++ ext4_quota_off_umount(sb); ++failed_mount8: __maybe_unused + ext4_unregister_sysfs(sb); + kobject_put(&sbi->s_kobj); + failed_mount7: --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-unconditionally-enable-the-i_version-counter.patch @@ -0,0 +1,121 @@ +From 1ff20307393e17dc57fde62226df625a3a3c36e9 Mon Sep 17 00:00:00 2001 +From: Jeff Layton <jlayton@kernel.org> +Date: Wed, 24 Aug 2022 18:03:49 +0200 +Subject: [PATCH] ext4: unconditionally enable the i_version counter +Git-commit: 1ff20307393e17dc57fde62226df625a3a3c36e9 +Patch-mainline: v6.1-rc1 +References: bsc#1211299 + +The original i_version implementation was pretty expensive, requiring a +log flush on every change. Because of this, it was gated behind a mount +option (implemented via the MS_I_VERSION mountoption flag). + +Commit ae5e165d855d (fs: new API for handling inode->i_version) made the +i_version flag much less expensive, so there is no longer a performance +penalty from enabling it. 
xfs and btrfs already enable it +unconditionally when the on-disk format can support it. + +Have ext4 ignore the SB_I_VERSION flag, and just enable it +unconditionally. While we're in here, mark the i_version mount +option Opt_removed. + +[ Removed leftover bits of i_version from ext4_apply_options() since it + now can't ever be set in ctx->mask_s_flags -- lczerner ] + +Cc: stable@kernel.org +Cc: Dave Chinner <david@fromorbit.com> +Cc: Benjamin Coddington <bcodding@redhat.com> +Cc: Christoph Hellwig <hch@infradead.org> +Cc: Darrick J. Wong <djwong@kernel.org> +Signed-off-by: Jeff Layton <jlayton@kernel.org> +Signed-off-by: Lukas Czerner <lczerner@redhat.com> +Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220824160349.39664-3-lczerner@redhat.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/inode.c | 5 ++--- + fs/ext4/super.c | 14 ++++++-------- + 2 files changed, 8 insertions(+), 11 deletions(-) + +--- a/fs/ext4/inode.c ++++ b/fs/ext4/inode.c +@@ -5421,7 +5421,7 @@ int ext4_setattr(struct user_namespace * + return -EINVAL; + } + +- if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) ++ if (attr->ia_size != inode->i_size) + inode_inc_iversion(inode); + + if (shrink) { +@@ -5731,8 +5731,7 @@ int ext4_mark_iloc_dirty(handle_t *handl + * ea_inodes are using i_version for storing reference count, don't + * mess with it + */ +- if (IS_I_VERSION(inode) && +- !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) ++ if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + inode_inc_iversion(inode); + + /* the do_update_inode consumes one bh->b_count */ +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -1673,7 +1673,7 @@ enum { + Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota, + Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota, + Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, +- Opt_usrquota, Opt_grpquota, 
Opt_prjquota, Opt_i_version, ++ Opt_usrquota, Opt_grpquota, Opt_prjquota, + Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, + Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, + Opt_nowarn_on_error, Opt_mblk_io_submit, +@@ -1743,7 +1743,7 @@ static const match_table_t tokens = { + {Opt_barrier, "barrier=%u"}, + {Opt_barrier, "barrier"}, + {Opt_nobarrier, "nobarrier"}, +- {Opt_i_version, "i_version"}, ++ {Opt_removed, "i_version"}, + {Opt_dax, "dax"}, + {Opt_dax_always, "dax=always"}, + {Opt_dax_inode, "dax=inode"}, +@@ -2130,9 +2130,6 @@ static int handle_mount_opt(struct super + case Opt_abort: + ext4_set_mount_flag(sb, EXT4_MF_FS_ABORTED); + return 1; +- case Opt_i_version: +- sb->s_flags |= SB_I_VERSION; +- return 1; + case Opt_lazytime: + sb->s_flags |= SB_LAZYTIME; + return 1; +@@ -2593,8 +2590,6 @@ static int _ext4_show_options(struct seq + SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); + if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) + SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); +- if (sb->s_flags & SB_I_VERSION) +- SEQ_OPTS_PUTS("i_version"); + if (nodefs || sbi->s_stripe) + SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); + if (nodefs || EXT4_MOUNT_DATA_FLAGS & +@@ -4381,6 +4376,9 @@ static int ext4_fill_super(struct super_ + sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | + (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); + ++ /* i_version is always enabled now */ ++ sb->s_flags |= SB_I_VERSION; ++ + if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && + (ext4_has_compat_features(sb) || + ext4_has_ro_compat_features(sb) || +@@ -5931,7 +5929,7 @@ static int ext4_remount(struct super_blo + * either way we need to make sure it matches in both *flags and + * s_flags. 
Copy those selected flags from *flags to s_flags + */ +- vfs_flags = SB_LAZYTIME | SB_I_VERSION; ++ vfs_flags = SB_LAZYTIME; + sb->s_flags = (sb->s_flags & ~vfs_flags) | (*flags & vfs_flags); + + if (!parse_options(data, sb, &parsed_opts, 1)) { --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-unindent-codeblock-in-ext4_xattr_block_set.patch @@ -0,0 +1,123 @@ +From fd48e9acdf26d0cbd80051de07d4a735d05d29b2 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Tue, 12 Jul 2022 12:54:23 +0200 +Subject: [PATCH] ext4: unindent codeblock in ext4_xattr_block_set() +Git-commit: fd48e9acdf26d0cbd80051de07d4a735d05d29b2 +Patch-mainline: v6.0-rc1 +References: bsc#1198971 + +Remove unnecessary else (and thus indentation level) from a code block +in ext4_xattr_block_set(). It will also make following code changes +easier. No functional changes. + +Cc: stable@vger.kernel.org +Fixes: 82939d7999df ("ext4: convert to mbcache2") +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220712105436.32204-4-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/xattr.c | 81 +++++++++++++++++++++++++++----------------------------- + 1 file changed, 40 insertions(+), 41 deletions(-) + +--- a/fs/ext4/xattr.c ++++ b/fs/ext4/xattr.c +@@ -1850,6 +1850,8 @@ ext4_xattr_block_set(handle_t *handle, s + #define header(x) ((struct ext4_xattr_header *)(x)) + + if (s->base) { ++ int offset = (char *)s->here - bs->bh->b_data; ++ + BUFFER_TRACE(bs->bh, "get_write_access"); + error = ext4_journal_get_write_access(handle, sb, bs->bh, + EXT4_JTR_NONE); +@@ -1882,50 +1884,47 @@ ext4_xattr_block_set(handle_t *handle, s + if (error) + goto cleanup; + goto inserted; +- } else { +- int offset = (char *)s->here - bs->bh->b_data; +- +- unlock_buffer(bs->bh); +- ea_bdebug(bs->bh, "cloning"); +- s->base = kmalloc(bs->bh->b_size, GFP_NOFS); +- error = -ENOMEM; +- if (s->base == NULL) ++ } ++ 
unlock_buffer(bs->bh); ++ ea_bdebug(bs->bh, "cloning"); ++ s->base = kmalloc(bs->bh->b_size, GFP_NOFS); ++ error = -ENOMEM; ++ if (s->base == NULL) ++ goto cleanup; ++ memcpy(s->base, BHDR(bs->bh), bs->bh->b_size); ++ s->first = ENTRY(header(s->base)+1); ++ header(s->base)->h_refcount = cpu_to_le32(1); ++ s->here = ENTRY(s->base + offset); ++ s->end = s->base + bs->bh->b_size; ++ ++ /* ++ * If existing entry points to an xattr inode, we need ++ * to prevent ext4_xattr_set_entry() from decrementing ++ * ref count on it because the reference belongs to the ++ * original block. In this case, make the entry look ++ * like it has an empty value. ++ */ ++ if (!s->not_found && s->here->e_value_inum) { ++ ea_ino = le32_to_cpu(s->here->e_value_inum); ++ error = ext4_xattr_inode_iget(inode, ea_ino, ++ le32_to_cpu(s->here->e_hash), ++ &tmp_inode); ++ if (error) + goto cleanup; +- memcpy(s->base, BHDR(bs->bh), bs->bh->b_size); +- s->first = ENTRY(header(s->base)+1); +- header(s->base)->h_refcount = cpu_to_le32(1); +- s->here = ENTRY(s->base + offset); +- s->end = s->base + bs->bh->b_size; + +- /* +- * If existing entry points to an xattr inode, we need +- * to prevent ext4_xattr_set_entry() from decrementing +- * ref count on it because the reference belongs to the +- * original block. In this case, make the entry look +- * like it has an empty value. +- */ +- if (!s->not_found && s->here->e_value_inum) { +- ea_ino = le32_to_cpu(s->here->e_value_inum); +- error = ext4_xattr_inode_iget(inode, ea_ino, +- le32_to_cpu(s->here->e_hash), +- &tmp_inode); +- if (error) +- goto cleanup; +- +- if (!ext4_test_inode_state(tmp_inode, +- EXT4_STATE_LUSTRE_EA_INODE)) { +- /* +- * Defer quota free call for previous +- * inode until success is guaranteed. 
+- */ +- old_ea_inode_quota = le32_to_cpu( +- s->here->e_value_size); +- } +- iput(tmp_inode); +- +- s->here->e_value_inum = 0; +- s->here->e_value_size = 0; ++ if (!ext4_test_inode_state(tmp_inode, ++ EXT4_STATE_LUSTRE_EA_INODE)) { ++ /* ++ * Defer quota free call for previous ++ * inode until success is guaranteed. ++ */ ++ old_ea_inode_quota = le32_to_cpu( ++ s->here->e_value_size); + } ++ iput(tmp_inode); ++ ++ s->here->e_value_inum = 0; ++ s->here->e_value_size = 0; + } + } else { + /* Allocate a buffer where we construct the new block. */ --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-update-s_journal_inum-if-it-changes-after-journ.patch @@ -0,0 +1,53 @@ +From 3039d8b8692408438a618fac2776b629852663c3 Mon Sep 17 00:00:00 2001 +From: Baokun Li <libaokun1@huawei.com> +Date: Sat, 7 Jan 2023 11:21:26 +0800 +Subject: [PATCH] ext4: update s_journal_inum if it changes after journal + replay +Mime-version: 1.0 +Content-type: text/plain; charset=UTF-8 +Content-transfer-encoding: 8bit +Git-commit: 3039d8b8692408438a618fac2776b629852663c3 +Patch-mainline: v6.3-rc1 +References: bsc#1213094 + +When mounting a crafted ext4 image, s_journal_inum may change after journal +replay, which is obviously unreasonable because we have successfully loaded +and replayed the journal through the old s_journal_inum. And the new +s_journal_inum bypasses some of the checks in ext4_get_journal(), which +may trigger a null pointer dereference problem. So if s_journal_inum +changes after the journal replay, we ignore the change, and rewrite the +current journal_inum to the superblock. 
+ +Link: https://bugzilla.kernel.org/show_bug.cgi?id=216541 +Reported-by: Luís Henriques <lhenriques@suse.de> +Signed-off-by: Baokun Li <libaokun1@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20230107032126.4165860-3-libaokun1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/super.c | 7 +++++-- + 1 file changed, 5 insertions(+), 2 deletions(-) + +diff --git a/fs/ext4/super.c b/fs/ext4/super.c +index 3b9e30e1afd9..45bcfd35e559 100644 +--- a/fs/ext4/super.c ++++ b/fs/ext4/super.c +@@ -5953,8 +5953,11 @@ static int ext4_load_journal(struct super_block *sb, + if (!really_read_only && journal_devnum && + journal_devnum != le32_to_cpu(es->s_journal_dev)) { + es->s_journal_dev = cpu_to_le32(journal_devnum); +- +- /* Make sure we flush the recovery flag to disk. */ ++ ext4_commit_super(sb); ++ } ++ if (!really_read_only && journal_inum && ++ journal_inum != le32_to_cpu(es->s_journal_inum)) { ++ es->s_journal_inum = cpu_to_le32(journal_inum); + ext4_commit_super(sb); + } + +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-update-state-fc_regions_size-after-successful-m.patch @@ -0,0 +1,51 @@ +From 27cd49780381c6ccbf248798e5e8fd076200ffba Mon Sep 17 00:00:00 2001 +From: Ye Bin <yebin10@huawei.com> +Date: Wed, 21 Sep 2022 14:40:40 +0800 +Subject: [PATCH] ext4: update 'state->fc_regions_size' after successful memory + allocation +Git-commit: 27cd49780381c6ccbf248798e5e8fd076200ffba +Patch-mainline: v6.1-rc1 +References: bsc#1207613 + +To avoid to 'state->fc_regions_size' mismatch with 'state->fc_regions' +when fail to reallocate 'fc_reqions',only update 'state->fc_regions_size' +after 'state->fc_regions' is allocated successfully. 
+ +Cc: stable@kernel.org +Signed-off-by: Ye Bin <yebin10@huawei.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220921064040.3693255-4-yebin10@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 9 +++++---- + 1 file changed, 5 insertions(+), 4 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 5ab58cb4ce8d..9549d89b3519 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1681,14 +1681,15 @@ int ext4_fc_record_regions(struct super_block *sb, int ino, + if (state->fc_regions_used == state->fc_regions_size) { + struct ext4_fc_alloc_region *fc_regions; + +- state->fc_regions_size += +- EXT4_FC_REPLAY_REALLOC_INCREMENT; + fc_regions = krealloc(state->fc_regions, +- state->fc_regions_size * +- sizeof(struct ext4_fc_alloc_region), ++ sizeof(struct ext4_fc_alloc_region) * ++ (state->fc_regions_size + ++ EXT4_FC_REPLAY_REALLOC_INCREMENT), + GFP_KERNEL); + if (!fc_regions) + return -ENOMEM; ++ state->fc_regions_size += ++ EXT4_FC_REPLAY_REALLOC_INCREMENT; + state->fc_regions = fc_regions; + } + region = &state->fc_regions[state->fc_regions_used++]; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-use-ext4_ext_remove_space-for-fast-commit-repla.patch @@ -0,0 +1,66 @@ +From 0b5b5a62b945a141e64011b2f90ee7e46f14be98 Mon Sep 17 00:00:00 2001 +From: Xin Yin <yinxin.x@bytedance.com> +Date: Thu, 23 Dec 2021 11:23:36 +0800 +Subject: [PATCH] ext4: use ext4_ext_remove_space() for fast commit replay + delete range +Git-commit: 0b5b5a62b945a141e64011b2f90ee7e46f14be98 +Patch-mainline: v5.17-rc1 +References: bsc#1202758 + +For now ,we use ext4_punch_hole() during fast commit replay delete range +procedure. But it will be affected by inode->i_size, which may not +correct during fast commit replay procedure. The following test will +failed. 
+ +-create & write foo (len 1000K) +-falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K) +-create & fsync bar +-falloc FALLOC_FL_PUNCH_HOLE foo (range 300K-500K) +-fsync foo +-crash before a full commit + +After the fast_commit reply procedure, the range 400K-500K will not be +removed. Because in this case, when calling ext4_punch_hole() the +inode->i_size is 0, and it just retruns with doing nothing. + +Change to use ext4_ext_remove_space() instead of ext4_punch_hole() +to remove blocks of inode directly. + +Signed-off-by: Xin Yin <yinxin.x@bytedance.com> +Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> +Link: https://lore.kernel.org/r/20211223032337.5198-2-yinxin.x@bytedance.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 13 ++++++++----- + 1 file changed, 8 insertions(+), 5 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index dd002facf6c9..28ddeb1d6afb 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1770,11 +1770,14 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + } + } + +- ret = ext4_punch_hole(inode, +- le32_to_cpu(lrange.fc_lblk) << sb->s_blocksize_bits, +- le32_to_cpu(lrange.fc_len) << sb->s_blocksize_bits); +- if (ret) +- jbd_debug(1, "ext4_punch_hole returned %d", ret); ++ down_write(&EXT4_I(inode)->i_data_sem); ++ ret = ext4_ext_remove_space(inode, lrange.fc_lblk, ++ lrange.fc_lblk + lrange.fc_len - 1); ++ up_write(&EXT4_I(inode)->i_data_sem); ++ if (ret) { ++ iput(inode); ++ return 0; ++ } + ext4_ext_replay_shrink_inode(inode, + i_size_read(inode) >> sb->s_blocksize_bits); + ext4_mark_inode_dirty(NULL, inode); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-use-ext4_fc_tl_mem-in-fast-commit-replay-path.patch @@ -0,0 +1,143 @@ +From 11768cfd98136dd8399480c60b7a5d3d3c7b109b Mon Sep 17 00:00:00 2001 +From: Eric Biggers 
<ebiggers@google.com> +Date: Fri, 16 Dec 2022 21:02:12 -0800 +Subject: [PATCH] ext4: use ext4_fc_tl_mem in fast-commit replay path +Git-commit: 11768cfd98136dd8399480c60b7a5d3d3c7b109b +Patch-mainline: v6.3-rc1 +References: bsc#1213092 + +To avoid 'sparse' warnings about missing endianness conversions, don't +store native endianness values into struct ext4_fc_tl. Instead, use a +separate struct type, ext4_fc_tl_mem. + +Fixes: dcc5827484d6 ("ext4: factor out ext4_fc_get_tl()") +Cc: Ye Bin <yebin10@huawei.com> +Signed-off-by: Eric Biggers <ebiggers@google.com> +Reviewed-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20221217050212.150665-1-ebiggers@kernel.org +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/fast_commit.c | 44 +++++++++++++++++++++++++------------------ + 1 file changed, 26 insertions(+), 18 deletions(-) + +diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c +index 4594b62f147b..b06de728b3b6 100644 +--- a/fs/ext4/fast_commit.c ++++ b/fs/ext4/fast_commit.c +@@ -1332,8 +1332,14 @@ struct dentry_info_args { + char *dname; + }; + ++/* Same as struct ext4_fc_tl, but uses native endianness fields */ ++struct ext4_fc_tl_mem { ++ u16 fc_tag; ++ u16 fc_len; ++}; ++ + static inline void tl_to_darg(struct dentry_info_args *darg, +- struct ext4_fc_tl *tl, u8 *val) ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct ext4_fc_dentry_info fcd; + +@@ -1345,16 +1351,18 @@ static inline void tl_to_darg(struct dentry_info_args *darg, + darg->dname_len = tl->fc_len - sizeof(struct ext4_fc_dentry_info); + } + +-static inline void ext4_fc_get_tl(struct ext4_fc_tl *tl, u8 *val) ++static inline void ext4_fc_get_tl(struct ext4_fc_tl_mem *tl, u8 *val) + { +- memcpy(tl, val, EXT4_FC_TAG_BASE_LEN); +- tl->fc_len = le16_to_cpu(tl->fc_len); +- tl->fc_tag = le16_to_cpu(tl->fc_tag); ++ struct ext4_fc_tl tl_disk; ++ ++ memcpy(&tl_disk, val, EXT4_FC_TAG_BASE_LEN); ++ tl->fc_len = le16_to_cpu(tl_disk.fc_len); ++ 
tl->fc_tag = le16_to_cpu(tl_disk.fc_tag); + } + + /* Unlink replay function */ +-static int ext4_fc_replay_unlink(struct super_block *sb, struct ext4_fc_tl *tl, +- u8 *val) ++static int ext4_fc_replay_unlink(struct super_block *sb, ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct inode *inode, *old_parent; + struct qstr entry; +@@ -1451,8 +1459,8 @@ static int ext4_fc_replay_link_internal(struct super_block *sb, + } + + /* Link replay function */ +-static int ext4_fc_replay_link(struct super_block *sb, struct ext4_fc_tl *tl, +- u8 *val) ++static int ext4_fc_replay_link(struct super_block *sb, ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct inode *inode; + struct dentry_info_args darg; +@@ -1506,8 +1514,8 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino) + /* + * Inode replay function + */ +-static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl, +- u8 *val) ++static int ext4_fc_replay_inode(struct super_block *sb, ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct ext4_fc_inode fc_inode; + struct ext4_inode *raw_inode; +@@ -1609,8 +1617,8 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl, + * inode for which we are trying to create a dentry here, should already have + * been replayed before we start here. 
+ */ +-static int ext4_fc_replay_create(struct super_block *sb, struct ext4_fc_tl *tl, +- u8 *val) ++static int ext4_fc_replay_create(struct super_block *sb, ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + int ret = 0; + struct inode *inode = NULL; +@@ -1708,7 +1716,7 @@ int ext4_fc_record_regions(struct super_block *sb, int ino, + + /* Replay add range tag */ + static int ext4_fc_replay_add_range(struct super_block *sb, +- struct ext4_fc_tl *tl, u8 *val) ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct ext4_fc_add_range fc_add_ex; + struct ext4_extent newex, *ex; +@@ -1828,8 +1836,8 @@ static int ext4_fc_replay_add_range(struct super_block *sb, + + /* Replay DEL_RANGE tag */ + static int +-ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, +- u8 *val) ++ext4_fc_replay_del_range(struct super_block *sb, ++ struct ext4_fc_tl_mem *tl, u8 *val) + { + struct inode *inode; + struct ext4_fc_del_range lrange; +@@ -2025,7 +2033,7 @@ static int ext4_fc_replay_scan(journal_t *journal, + struct ext4_fc_replay_state *state; + int ret = JBD2_FC_REPLAY_CONTINUE; + struct ext4_fc_add_range ext; +- struct ext4_fc_tl tl; ++ struct ext4_fc_tl_mem tl; + struct ext4_fc_tail tail; + __u8 *start, *end, *cur, *val; + struct ext4_fc_head head; +@@ -2144,7 +2152,7 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh, + { + struct super_block *sb = journal->j_private; + struct ext4_sb_info *sbi = EXT4_SB(sb); +- struct ext4_fc_tl tl; ++ struct ext4_fc_tl_mem tl; + __u8 *start, *end, *cur, *val; + int ret = JBD2_FC_REPLAY_CONTINUE; + struct ext4_fc_replay_state *state = &sbi->s_fc_replay_state; +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-verify-dir-block-before-splitting-it.patch @@ -0,0 +1,93 @@ +From 46c116b920ebec58031f0a78c5ea9599b0d2a371 Mon Sep 17 00:00:00 2001 +From: Jan Kara <jack@suse.cz> +Date: Wed, 18 May 2022 11:33:28 +0200 +Subject: [PATCH] ext4: verify dir block before splitting it +Git-commit: 
46c116b920ebec58031f0a78c5ea9599b0d2a371 +Patch-mainline: v5.19-rc1 +References: bsc#1198577 CVE-2022-1184 + +Before splitting a directory block verify its directory entries are sane +so that the splitting code does not access memory it should not. + +Cc: stable@vger.kernel.org +Signed-off-by: Jan Kara <jack@suse.cz> +Link: https://lore.kernel.org/r/20220518093332.13986-1-jack@suse.cz +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/namei.c | 32 +++++++++++++++++++++----------- + 1 file changed, 21 insertions(+), 11 deletions(-) + +diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c +index 7aca8944901d..7286472e9558 100644 +--- a/fs/ext4/namei.c ++++ b/fs/ext4/namei.c +@@ -277,9 +277,9 @@ static struct dx_frame *dx_probe(struct ext4_filename *fname, + struct dx_hash_info *hinfo, + struct dx_frame *frame); + static void dx_release(struct dx_frame *frames); +-static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, +- unsigned blocksize, struct dx_hash_info *hinfo, +- struct dx_map_entry map[]); ++static int dx_make_map(struct inode *dir, struct buffer_head *bh, ++ struct dx_hash_info *hinfo, ++ struct dx_map_entry *map_tail); + static void dx_sort_map(struct dx_map_entry *map, unsigned count); + static struct ext4_dir_entry_2 *dx_move_dirents(struct inode *dir, char *from, + char *to, struct dx_map_entry *offsets, +@@ -1249,15 +1249,23 @@ static inline int search_dirblock(struct buffer_head *bh, + * Create map of hash values, offsets, and sizes, stored at end of block. + * Returns number of entries mapped. 
+ */ +-static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, +- unsigned blocksize, struct dx_hash_info *hinfo, ++static int dx_make_map(struct inode *dir, struct buffer_head *bh, ++ struct dx_hash_info *hinfo, + struct dx_map_entry *map_tail) + { + int count = 0; +- char *base = (char *) de; ++ struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *)bh->b_data; ++ unsigned int buflen = bh->b_size; ++ char *base = bh->b_data; + struct dx_hash_info h = *hinfo; + +- while ((char *) de < base + blocksize) { ++ if (ext4_has_metadata_csum(dir->i_sb)) ++ buflen -= sizeof(struct ext4_dir_entry_tail); ++ ++ while ((char *) de < base + buflen) { ++ if (ext4_check_dir_entry(dir, NULL, de, bh, base, buflen, ++ ((char *)de) - base)) ++ return -EFSCORRUPTED; + if (de->name_len && de->inode) { + if (ext4_hash_in_dirent(dir)) + h.hash = EXT4_DIRENT_HASH(de); +@@ -1270,8 +1278,7 @@ static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de, + count++; + cond_resched(); + } +- /* XXX: do we need to check rec_len == 0 case? 
-Chris */ +- de = ext4_next_entry(de, blocksize); ++ de = ext4_next_entry(de, dir->i_sb->s_blocksize); + } + return count; + } +@@ -1943,8 +1950,11 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir, + + /* create map in the end of data2 block */ + map = (struct dx_map_entry *) (data2 + blocksize); +- count = dx_make_map(dir, (struct ext4_dir_entry_2 *) data1, +- blocksize, hinfo, map); ++ count = dx_make_map(dir, *bh, hinfo, map); ++ if (count < 0) { ++ err = count; ++ goto journal_error; ++ } + map -= count; + dx_sort_map(map, count); + /* Ensure that neither split block is over half full */ +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/ext4-zero-i_disksize-when-initializing-the-bootloade.patch @@ -0,0 +1,65 @@ +From f5361da1e60d54ec81346aee8e3d8baf1be0b762 Mon Sep 17 00:00:00 2001 +From: Zhihao Cheng <chengzhihao1@huawei.com> +Date: Wed, 8 Mar 2023 11:26:43 +0800 +Subject: [PATCH] ext4: zero i_disksize when initializing the bootloader inode +Git-commit: f5361da1e60d54ec81346aee8e3d8baf1be0b762 +Patch-mainline: v6.3-rc2 +References: bsc#1213013 + +If the boot loader inode has never been used before, the +EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the +i_size to 0. However, if the "never before used" boot loader has a +non-zero i_size, then i_disksize will be non-zero, and the +inconsistency between i_size and i_disksize can trigger a kernel +Warning: + + WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319 + CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa + RIP: 0010:ext4_file_write_iter+0xbc7/0xd10 + Call Trace: + vfs_write+0x3b1/0x5c0 + ksys_write+0x77/0x160 + __x64_sys_write+0x22/0x30 + do_syscall_64+0x39/0x80 + +Reproducer: 1. create corrupted image and mount it: + mke2fs -t ext4 /tmp/foo.img 200 + debugfs -wR "sif <5> size 25700" /tmp/foo.img + mount -t ext4 /tmp/foo.img /mnt + cd /mnt + echo 123 > file + 2. 
Run the reproducer program: + posix_memalign(&buf, 1024, 1024) + fd = open("file", O_RDWR | O_DIRECT); + ioctl(fd, EXT4_IOC_SWAP_BOOT); + write(fd, buf, 1024); + +Fix this by setting i_disksize as well as i_size to zero when +initiaizing the boot loader inode. + +Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159 +Cc: stable@kernel.org +Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> +Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/ioctl.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c +index 2e8c34036313..cc17205f7f49 100644 +--- a/fs/ext4/ioctl.c ++++ b/fs/ext4/ioctl.c +@@ -431,6 +431,7 @@ static long swap_inode_boot_loader(struct super_block *sb, + ei_bl->i_flags = 0; + inode_set_iversion(inode_bl, 1); + i_size_write(inode_bl, 0); ++ EXT4_I(inode_bl)->i_disksize = inode_bl->i_size; + inode_bl->i_mode = S_IFREG; + if (ext4_has_feature_extents(sb)) { + ext4_set_inode_flag(inode_bl, EXT4_INODE_EXTENTS); +-- +2.35.3 + --- /dev/null +++ b/ldiskfs/kernel_patches/patches/patches.suse/fs-ext4-initialize-fsdata-in-pagecache_write.patch @@ -0,0 +1,38 @@ +From 956510c0c7439e90b8103aaeaf4da92878c622f0 Mon Sep 17 00:00:00 2001 +From: Alexander Potapenko <glider@google.com> +Date: Mon, 21 Nov 2022 12:21:30 +0100 +Subject: [PATCH] fs: ext4: initialize fsdata in pagecache_write() +Git-commit: 956510c0c7439e90b8103aaeaf4da92878c622f0 +Patch-mainline: v6.2-rc1 +References: bsc#1207632 + +When aops->write_begin() does not initialize fsdata, KMSAN reports +an error passing the latter to aops->write_end(). + +Fix this by unconditionally initializing fsdata. 
+ +Cc: Eric Biggers <ebiggers@kernel.org> +Fixes: c93d8f885809 ("ext4: add basic fs-verity support") +Reported-by: syzbot+9767be679ef5016b6082@syzkaller.appspotmail.com +Signed-off-by: Alexander Potapenko <glider@google.com> +Reviewed-by: Eric Biggers <ebiggers@google.com> +Link: https://lore.kernel.org/r/20221121112134.407362-1-glider@google.com +Signed-off-by: Theodore Ts'o <tytso@mit.edu> +Cc: stable@kernel.org +Acked-by: Jan Kara <jack@suse.cz> + +--- + fs/ext4/verity.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/fs/ext4/verity.c ++++ b/fs/ext4/verity.c +@@ -76,7 +76,7 @@ static int pagecache_write(struct inode + size_t n = min_t(size_t, count, + PAGE_SIZE - offset_in_page(pos)); + struct page *page; +- void *fsdata; ++ void *fsdata = NULL; + int res; + + res = pagecache_write_begin(NULL, inode->i_mapping, pos, n, 0, --- a/ldiskfs/kernel_patches/patches/sles15sp4/ext4-data-in-dirent.patch +++ b/ldiskfs/kernel_patches/patches/sles15sp4/ext4-data-in-dirent.patch @@ -8,19 +8,17 @@ data is present. This uses dentry->d_fsdata to pass fid to ext4. so no changes in ext4_add_entry() interface required. 
--- - fs/ext4/dir.c | 9 +- - fs/ext4/ext4.h | 106 ++++++++++++++++-- - fs/ext4/fast_commit.c | 2 +- - fs/ext4/inline.c | 8 +- - fs/ext4/namei.c | 249 ++++++++++++++++++++++++++++++++---------- - fs/ext4/super.c | 4 +- + fs/ext4/dir.c | 9 + + fs/ext4/ext4.h | 106 +++++++++++++++++++-- + fs/ext4/fast_commit.c | 2 + fs/ext4/inline.c | 8 - + fs/ext4/namei.c | 249 +++++++++++++++++++++++++++++++++++++------------- + fs/ext4/super.c | 4 6 files changed, 303 insertions(+), 75 deletions(-) -diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c -index 74b172a..c6afabc 100644 --- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c -@@ -466,12 +466,17 @@ int ext4_htree_store_dirent(struct file *dir_file, __u32 hash, +@@ -466,12 +466,17 @@ int ext4_htree_store_dirent(struct file struct fname *fname, *new_fn; struct dir_private_info *info; int len; @@ -39,7 +37,7 @@ index 74b172a..c6afabc 100644 new_fn = kzalloc(len, GFP_KERNEL); if (!new_fn) return -ENOMEM; -@@ -480,7 +485,7 @@ int ext4_htree_store_dirent(struct file *dir_file, __u32 hash, +@@ -480,7 +485,7 @@ int ext4_htree_store_dirent(struct file new_fn->inode = le32_to_cpu(dirent->inode); new_fn->name_len = ent_name->len; new_fn->file_type = dirent->file_type; @@ -48,11 +46,9 @@ index 74b172a..c6afabc 100644 while (*p) { parent = *p; -diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h -index 0791a8b..f1bc21d 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h -@@ -1165,6 +1165,7 @@ struct ext4_inode_info { +@@ -1168,6 +1168,7 @@ struct ext4_inode_info { __u32 i_csum_seed; kprojid_t i_projid; @@ -60,7 +56,7 @@ index 0791a8b..f1bc21d 100644 }; /* -@@ -1186,6 +1187,7 @@ struct ext4_inode_info { +@@ -1189,6 +1190,7 @@ struct ext4_inode_info { * Mount flags set via mount options or defaults */ #define EXT4_MOUNT_NO_MBCACHE 0x00001 /* Do not use mbcache */ @@ -68,7 +64,7 @@ index 0791a8b..f1bc21d 100644 #define EXT4_MOUNT_GRPID 0x00004 /* Create files with directory's group */ #define EXT4_MOUNT_DEBUG 0x00008 /* Some debugging messages */ #define 
EXT4_MOUNT_ERRORS_CONT 0x00010 /* Continue on errors */ -@@ -2117,6 +2119,7 @@ EXT4_FEATURE_INCOMPAT_FUNCS(casefold, CASEFOLD) +@@ -2145,6 +2147,7 @@ EXT4_FEATURE_INCOMPAT_FUNCS(casefold, C EXT4_FEATURE_INCOMPAT_FLEX_BG| \ EXT4_FEATURE_INCOMPAT_EA_INODE| \ EXT4_FEATURE_INCOMPAT_MMP | \ @@ -76,7 +72,7 @@ index 0791a8b..f1bc21d 100644 EXT4_FEATURE_INCOMPAT_INLINE_DATA | \ EXT4_FEATURE_INCOMPAT_ENCRYPT | \ EXT4_FEATURE_INCOMPAT_CASEFOLD | \ -@@ -2326,6 +2329,42 @@ struct ext4_dir_entry_tail { +@@ -2354,6 +2357,42 @@ struct ext4_dir_entry_tail { #define EXT4_FT_SYMLINK 7 #define EXT4_FT_MAX 8 @@ -119,7 +115,7 @@ index 0791a8b..f1bc21d 100644 #define EXT4_FT_DIR_CSUM 0xDE -@@ -2337,6 +2376,17 @@ struct ext4_dir_entry_tail { +@@ -2365,6 +2404,17 @@ struct ext4_dir_entry_tail { #define EXT4_DIR_PAD 4 #define EXT4_DIR_ROUND (EXT4_DIR_PAD - 1) #define EXT4_MAX_REC_LEN ((1<<16)-1) @@ -137,7 +133,7 @@ index 0791a8b..f1bc21d 100644 /* * The rec_len is dependent on the type of directory. Directories that are -@@ -2344,10 +2394,10 @@ struct ext4_dir_entry_tail { +@@ -2372,10 +2422,10 @@ struct ext4_dir_entry_tail { * ext4_extended_dir_entry_2. For all entries related to '.' or '..' you should * pass NULL for dir, as those entries do not use the extra fields. 
*/ @@ -150,7 +146,7 @@ index 0791a8b..f1bc21d 100644 if (dir && ext4_hash_in_dirent(dir)) rec_len += sizeof(struct ext4_dir_entry_hash); -@@ -2821,11 +2871,13 @@ extern int ext4_find_dest_de(struct inode *dir, struct inode *inode, +@@ -2849,11 +2899,13 @@ extern int ext4_find_dest_de(struct inod struct buffer_head *bh, void *buf, int buf_size, struct ext4_filename *fname, @@ -166,7 +162,7 @@ index 0791a8b..f1bc21d 100644 static inline void ext4_update_dx_flag(struct inode *inode) { if (!ext4_has_feature_dir_index(inode->i_sb) && -@@ -2841,10 +2893,17 @@ static const unsigned char ext4_filetype_table[] = { +@@ -2869,10 +2921,17 @@ static const unsigned char ext4_filetype static inline unsigned char get_dtype(struct super_block *sb, int filetype) { @@ -186,7 +182,7 @@ index 0791a8b..f1bc21d 100644 } extern int ext4_check_all_de(struct inode *dir, struct buffer_head *bh, void *buf, int buf_size); -@@ -3048,7 +3107,8 @@ extern int ext4_ind_migrate(struct inode *inode); +@@ -3078,7 +3137,8 @@ extern int ext4_ind_migrate(struct inode /* namei.c */ extern int ext4_init_new_dir(handle_t *handle, struct inode *dir, @@ -196,7 +192,7 @@ index 0791a8b..f1bc21d 100644 extern int ext4_dirblock_csum_verify(struct inode *inode, struct buffer_head *bh); extern int ext4_orphan_add(handle_t *, struct inode *); -@@ -3059,6 +3119,8 @@ extern struct inode *ext4_create_inode(handle_t *handle, +@@ -3089,6 +3149,8 @@ extern struct inode *ext4_create_inode(h extern int ext4_delete_entry(handle_t *handle, struct inode * dir, struct ext4_dir_entry_2 *de_del, struct buffer_head *bh); @@ -205,7 +201,7 @@ index 0791a8b..f1bc21d 100644 extern int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, __u32 start_minor_hash, __u32 *next_hash); extern int ext4_search_dir(struct buffer_head *bh, -@@ -3862,6 +3924,36 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh) +@@ -3892,6 +3954,36 @@ static inline int ext4_buffer_uptodate(s return buffer_uptodate(bh); } @@ -242,11 +238,9 
@@ index 0791a8b..f1bc21d 100644 #endif /* __KERNEL__ */ #define EFSBADCRC EBADMSG /* Bad CRC detected */ -diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c -index 276d9e6..3be0f08 100644 --- a/fs/ext4/fast_commit.c +++ b/fs/ext4/fast_commit.c -@@ -1596,7 +1596,7 @@ static int ext4_fc_replay_create(struct super_block *sb, struct ext4_fc_tl *tl, +@@ -1632,7 +1632,7 @@ static int ext4_fc_replay_create(struct jbd_debug(1, "Dir %d not found.", darg.ino); goto out; } @@ -255,11 +249,9 @@ index 276d9e6..3be0f08 100644 iput(dir); if (ret) { ret = 0; -diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c -index 9626c31..ed31b5c 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c -@@ -1029,7 +1029,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle, +@@ -1028,7 +1028,7 @@ static int ext4_add_dirent_to_inline(han struct ext4_dir_entry_2 *de; err = ext4_find_dest_de(dir, inode, iloc->bh, inline_start, @@ -268,7 +260,7 @@ index 9626c31..ed31b5c 100644 if (err) return err; -@@ -1038,7 +1038,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle, +@@ -1037,7 +1037,7 @@ static int ext4_add_dirent_to_inline(han EXT4_JTR_NONE); if (err) return err; @@ -277,7 +269,7 @@ index 9626c31..ed31b5c 100644 ext4_show_inline_dir(dir, iloc->bh, inline_start, inline_size); -@@ -1396,7 +1396,7 @@ int ext4_inlinedir_to_tree(struct file *dir_file, +@@ -1396,7 +1396,7 @@ int ext4_inlinedir_to_tree(struct file * fake.name_len = 1; strcpy(fake.name, "."); fake.rec_len = ext4_rec_len_to_disk( @@ -286,7 +278,7 @@ index 9626c31..ed31b5c 100644 inline_size); ext4_set_de_type(inode->i_sb, &fake, S_IFDIR); de = &fake; -@@ -1406,7 +1406,7 @@ int ext4_inlinedir_to_tree(struct file *dir_file, +@@ -1406,7 +1406,7 @@ int ext4_inlinedir_to_tree(struct file * fake.name_len = 2; strcpy(fake.name, ".."); fake.rec_len = ext4_rec_len_to_disk( @@ -295,11 +287,9 @@ index 9626c31..ed31b5c 100644 inline_size); ext4_set_de_type(inode->i_sb, &fake, S_IFDIR); de = &fake; -diff --git a/fs/ext4/namei.c 
b/fs/ext4/namei.c -index 7f00dc3..51c950b 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c -@@ -285,13 +285,14 @@ static unsigned dx_get_count(struct dx_entry *entries); +@@ -290,13 +290,14 @@ static unsigned dx_get_count(struct dx_e static unsigned dx_get_limit(struct dx_entry *entries); static void dx_set_count(struct dx_entry *entries, unsigned value); static void dx_set_limit(struct dx_entry *entries, unsigned value); @@ -316,7 +306,7 @@ index 7f00dc3..51c950b 100644 static int dx_make_map(struct inode *dir, struct buffer_head *bh, struct dx_hash_info *hinfo, struct dx_map_entry *map_tail); -@@ -431,22 +432,23 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode, +@@ -436,22 +437,23 @@ static struct dx_countlimit *get_dx_coun { struct ext4_dir_entry *dp; struct dx_root_info *root; @@ -348,7 +338,7 @@ index 7f00dc3..51c950b 100644 if (offset) *offset = count_offset; -@@ -549,13 +551,14 @@ ext4_next_entry(struct ext4_dir_entry_2 *p, unsigned long blocksize) +@@ -554,13 +556,14 @@ ext4_next_entry(struct ext4_dir_entry_2 * Future: use high four bits of block for coalesce-on-delete flags * Mask them off for now. 
*/ @@ -366,7 +356,7 @@ index 7f00dc3..51c950b 100644 return (struct dx_root_info *)de; } -@@ -600,11 +603,16 @@ static inline void dx_set_limit(struct dx_entry *entries, unsigned value) +@@ -605,11 +608,16 @@ static inline void dx_set_limit(struct d ((struct dx_countlimit *) entries)->limit = cpu_to_le16(value); } @@ -387,7 +377,7 @@ index 7f00dc3..51c950b 100644 if (ext4_has_metadata_csum(dir->i_sb)) entry_space -= sizeof(struct dx_tail); -@@ -722,7 +730,7 @@ static struct stats dx_show_leaf(struct inode *dir, +@@ -729,7 +737,7 @@ static struct stats dx_show_leaf(struct (unsigned) ((char *) de - base)); #endif } @@ -396,7 +386,7 @@ index 7f00dc3..51c950b 100644 names++; } de = ext4_next_entry(de, size); -@@ -816,7 +824,7 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, +@@ -823,7 +831,7 @@ dx_probe(struct ext4_filename *fname, st if (IS_ERR(frame->bh)) return (struct dx_frame *) frame->bh; @@ -405,7 +395,7 @@ index 7f00dc3..51c950b 100644 if (info->hash_version != DX_HASH_TEA && info->hash_version != DX_HASH_HALF_MD4 && info->hash_version != DX_HASH_LEGACY && -@@ -872,11 +880,14 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, +@@ -885,11 +893,14 @@ dx_probe(struct ext4_filename *fname, st entries = (struct dx_entry *)(((char *)info) + info->info_length); @@ -423,7 +413,7 @@ index 7f00dc3..51c950b 100644 goto fail; } -@@ -953,7 +964,7 @@ fail: +@@ -966,7 +977,7 @@ fail: return ret_err; } @@ -432,7 +422,7 @@ index 7f00dc3..51c950b 100644 { struct dx_root_info *info; int i; -@@ -962,7 +973,7 @@ static void dx_release(struct dx_frame *frames) +@@ -975,7 +986,7 @@ static void dx_release(struct dx_frame * if (frames[0].bh == NULL) return; @@ -441,7 +431,7 @@ index 7f00dc3..51c950b 100644 /* save local copy, "info" may be freed after brelse() */ indirect_levels = info->indirect_levels; for (i = 0; i <= indirect_levels; i++) { -@@ -1263,12 +1274,12 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, +@@ -1281,12 +1292,12 @@ int 
ext4_htree_fill_tree(struct file *di (count && ((hashval & 1) == 0))) break; } @@ -456,7 +446,7 @@ index 7f00dc3..51c950b 100644 return (err); } -@@ -1801,7 +1812,7 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir, +@@ -1821,7 +1832,7 @@ static struct buffer_head * ext4_dx_find errout: dxtrace(printk(KERN_DEBUG "%s not found\n", fname->usr_fname->name)); success: @@ -465,7 +455,7 @@ index 7f00dc3..51c950b 100644 return bh; } -@@ -1925,7 +1936,7 @@ dx_move_dirents(struct inode *dir, char *from, char *to, +@@ -1945,7 +1956,7 @@ dx_move_dirents(struct inode *dir, char while (count--) { struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *) (from + (map->offs<<2)); @@ -474,7 +464,7 @@ index 7f00dc3..51c950b 100644 memcpy (to, de, rec_len); ((struct ext4_dir_entry_2 *) to)->rec_len = -@@ -1958,7 +1969,7 @@ static struct ext4_dir_entry_2 *dx_pack_dirents(struct inode *dir, char *base, +@@ -1978,7 +1989,7 @@ static struct ext4_dir_entry_2 *dx_pack_ while ((char*)de < base + blocksize) { next = ext4_next_entry(de, blocksize); if (de->inode && de->name_len) { @@ -483,7 +473,7 @@ index 7f00dc3..51c950b 100644 if (de > to) memmove(to, de, rec_len); to->rec_len = ext4_rec_len_to_disk(rec_len, blocksize); -@@ -2101,14 +2112,21 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode, +@@ -2121,14 +2132,21 @@ int ext4_find_dest_de(struct inode *dir, struct buffer_head *bh, void *buf, int buf_size, struct ext4_filename *fname, @@ -507,7 +497,7 @@ index 7f00dc3..51c950b 100644 de = (struct ext4_dir_entry_2 *)buf; top = buf + buf_size - reclen; while ((char *) de <= top) { -@@ -2117,10 +2135,31 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode, +@@ -2137,10 +2155,31 @@ int ext4_find_dest_de(struct inode *dir, return -EFSCORRUPTED; if (ext4_match(dir, fname, de)) return -EEXIST; @@ -540,7 +530,7 @@ index 7f00dc3..51c950b 100644 de = (struct ext4_dir_entry_2 *)((char *)de + rlen); offset += rlen; } -@@ -2135,12 +2174,13 @@ void 
ext4_insert_dentry(struct inode *dir, +@@ -2155,12 +2194,13 @@ void ext4_insert_dentry(struct inode *di struct inode *inode, struct ext4_dir_entry_2 *de, int buf_size, @@ -556,7 +546,7 @@ index 7f00dc3..51c950b 100644 rlen = ext4_rec_len_from_disk(de->rec_len, buf_size); if (de->inode) { struct ext4_dir_entry_2 *de1 = -@@ -2161,6 +2201,12 @@ void ext4_insert_dentry(struct inode *dir, +@@ -2181,6 +2221,12 @@ void ext4_insert_dentry(struct inode *di EXT4_DIRENT_HASHES(de)->minor_hash = cpu_to_le32(hinfo->minor_hash); } @@ -569,7 +559,7 @@ index 7f00dc3..51c950b 100644 } /* -@@ -2178,14 +2224,19 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, +@@ -2198,14 +2244,19 @@ static int add_dirent_to_buf(handle_t *h { unsigned int blocksize = dir->i_sb->s_blocksize; int csum_size = 0; @@ -591,7 +581,7 @@ index 7f00dc3..51c950b 100644 if (err) return err; } -@@ -2198,7 +2249,10 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, +@@ -2218,7 +2269,10 @@ static int add_dirent_to_buf(handle_t *h } /* By now the buffer is marked for journaling */ @@ -603,7 +593,7 @@ index 7f00dc3..51c950b 100644 /* * XXX shouldn't update any times until successful -@@ -2296,7 +2350,7 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname, +@@ -2324,7 +2378,7 @@ static int make_indexed_dir(handle_t *ha blocksize); /* initialize hashing info */ @@ -612,7 +602,7 @@ index 7f00dc3..51c950b 100644 memset(dx_info, 0, sizeof(*dx_info)); dx_info->info_length = sizeof(*dx_info); if (ext4_hash_in_dirent(dir)) -@@ -2307,7 +2361,8 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname, +@@ -2335,7 +2389,8 @@ static int make_indexed_dir(handle_t *ha entries = (void *)dx_info + sizeof(*dx_info); dx_set_block(entries, 1); dx_set_count(entries, 1); @@ -622,7 +612,7 @@ index 7f00dc3..51c950b 100644 /* Initialize as for dx_probe */ fname->hinfo.hash_version = dx_info->hash_version; -@@ -2348,7 +2403,7 @@ out_frames: 
+@@ -2381,7 +2436,7 @@ out_frames: */ if (retval) ext4_mark_inode_dirty(handle, dir); @@ -631,7 +621,7 @@ index 7f00dc3..51c950b 100644 brelse(bh2); return retval; } -@@ -2361,6 +2416,8 @@ static int ext4_update_dotdot(handle_t *handle, struct dentry *dentry, +@@ -2394,6 +2449,8 @@ static int ext4_update_dotdot(handle_t * struct buffer_head *dir_block; struct ext4_dir_entry_2 *de; int len, journal = 0, err = 0; @@ -640,7 +630,7 @@ index 7f00dc3..51c950b 100644 if (IS_ERR(handle)) return PTR_ERR(handle); -@@ -2376,21 +2433,25 @@ static int ext4_update_dotdot(handle_t *handle, struct dentry *dentry, +@@ -2409,21 +2466,25 @@ static int ext4_update_dotdot(handle_t * de = (struct ext4_dir_entry_2 *)dir_block->b_data; /* the first item must be "." */ @@ -672,7 +662,7 @@ index 7f00dc3..51c950b 100644 de = (struct ext4_dir_entry_2 *) ((char *) de + le16_to_cpu(de->rec_len)); if (!journal) { -@@ -2404,10 +2465,15 @@ static int ext4_update_dotdot(handle_t *handle, struct dentry *dentry, +@@ -2437,10 +2498,15 @@ static int ext4_update_dotdot(handle_t * if (len > 0) de->rec_len = cpu_to_le16(len); else @@ -690,7 +680,7 @@ index 7f00dc3..51c950b 100644 out_journal: if (journal) { -@@ -2445,6 +2511,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry, +@@ -2478,6 +2544,7 @@ static int ext4_add_entry(handle_t *hand ext4_lblk_t block, blocks; int csum_size = 0; @@ -698,7 +688,7 @@ index 7f00dc3..51c950b 100644 if (ext4_has_metadata_csum(inode->i_sb)) csum_size = sizeof(struct ext4_dir_entry_tail); -@@ -2687,7 +2754,7 @@ again: +@@ -2720,7 +2787,7 @@ again: dx_set_count(entries, 1); dx_set_block(entries + 0, newblock); info = dx_get_dx_info((struct ext4_dir_entry_2 *) @@ -707,7 +697,7 @@ index 7f00dc3..51c950b 100644 info->indirect_levels = 1; dxtrace(printk(KERN_DEBUG "Creating %d level index...\n", -@@ -2713,7 +2780,7 @@ journal_error: +@@ -2746,7 +2813,7 @@ journal_error: ext4_std_error(dir->i_sb, err); /* this is a no-op if err == 0 */ cleanup: brelse(bh); @@ 
-716,7 +706,7 @@ index 7f00dc3..51c950b 100644 /* @restart is true means htree-path has been changed, we need to * repeat dx_probe() to find out valid htree-path */ -@@ -3016,38 +3083,73 @@ err_unlock_inode: +@@ -3049,38 +3116,73 @@ err_unlock_inode: return err; } @@ -798,7 +788,7 @@ index 7f00dc3..51c950b 100644 struct buffer_head *dir_block = NULL; struct ext4_dir_entry_2 *de; ext4_lblk_t block = 0; -@@ -3071,7 +3173,11 @@ int ext4_init_new_dir(handle_t *handle, struct inode *dir, +@@ -3104,7 +3206,11 @@ int ext4_init_new_dir(handle_t *handle, if (IS_ERR(dir_block)) return PTR_ERR(dir_block); de = (struct ext4_dir_entry_2 *)dir_block->b_data; @@ -811,7 +801,7 @@ index 7f00dc3..51c950b 100644 set_nlink(inode, 2); if (csum_size) ext4_initialize_dirent_tail(dir_block, blocksize); -@@ -3086,6 +3192,29 @@ out: +@@ -3119,6 +3225,29 @@ out: return err; } @@ -841,7 +831,7 @@ index 7f00dc3..51c950b 100644 static int ext4_mkdir(struct user_namespace *mnt_userns, struct inode *dir, struct dentry *dentry, umode_t mode) { -@@ -3113,7 +3242,7 @@ retry: +@@ -3146,7 +3275,7 @@ retry: inode->i_op = &ext4_dir_inode_operations; inode->i_fop = &ext4_dir_operations; @@ -850,20 +840,18 @@ index 7f00dc3..51c950b 100644 if (err) goto out_clear_inode; err = ext4_mark_inode_dirty(handle, inode); -diff --git a/fs/ext4/super.c b/fs/ext4/super.c -index 4be1994..a2fcbf8 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c -@@ -1672,7 +1672,7 @@ enum { +@@ -1667,7 +1667,7 @@ enum { Opt_inlinecrypt, Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota, Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota, - Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, + Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, Opt_dirdata, - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, + Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, -@@ -1756,6 +1756,7 @@ static const 
match_table_t tokens = { +@@ -1751,6 +1751,7 @@ static const match_table_t tokens = { {Opt_nolazytime, "nolazytime"}, {Opt_debug_want_extra_isize, "debug_want_extra_isize=%u"}, {Opt_nodelalloc, "nodelalloc"}, @@ -871,7 +859,7 @@ index 4be1994..a2fcbf8 100644 {Opt_removed, "mblk_io_submit"}, {Opt_removed, "nomblk_io_submit"}, {Opt_block_validity, "block_validity"}, -@@ -2000,6 +2001,7 @@ static const struct mount_opts { +@@ -1995,6 +1996,7 @@ static const struct mount_opts { {Opt_usrjquota, 0, MOPT_Q | MOPT_STRING}, {Opt_grpjquota, 0, MOPT_Q | MOPT_STRING}, {Opt_offusrjquota, 0, MOPT_Q}, @@ -879,6 +867,3 @@ index 4be1994..a2fcbf8 100644 {Opt_offgrpjquota, 0, MOPT_Q}, {Opt_jqfmt_vfsold, QFMT_VFS_OLD, MOPT_QFMT}, {Opt_jqfmt_vfsv0, QFMT_VFS_V0, MOPT_QFMT}, --- -2.34.1 - --- a/ldiskfs/kernel_patches/patches/sles15sp4/ext4-pdirop.patch +++ b/ldiskfs/kernel_patches/patches/sles15sp4/ext4-pdirop.patch @@ -17,14 +17,12 @@ This patch contains: - integrate with osd-ldiskfs --- - fs/ext4/Makefile | 1 + - fs/ext4/ext4.h | 78 ++++++++ - fs/ext4/namei.c | 465 ++++++++++++++++++++++++++++++++++++++++++----- - fs/ext4/super.c | 1 + + fs/ext4/Makefile | 1 + fs/ext4/ext4.h | 78 +++++++++ + fs/ext4/namei.c | 465 ++++++++++++++++++++++++++++++++++++++++++++++++++----- + fs/ext4/super.c | 1 4 files changed, 504 insertions(+), 41 deletions(-) -diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile -index 49e7af6..f7ced03 100644 --- a/fs/ext4/Makefile +++ b/fs/ext4/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_EXT4_FS) += ext4.o @@ -35,8 +33,6 @@ index 49e7af6..f7ced03 100644 indirect.o inline.o inode.o ioctl.o mballoc.o migrate.o \ mmp.o move_extent.o namei.o page-io.o readpage.o resize.o \ super.o symlink.o sysfs.o xattr.o xattr_hurd.o xattr_trusted.o \ -diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h -index 54734be..fa5d5d6 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -28,6 +28,7 @@ @@ -47,7 +43,7 @@ index 54734be..fa5d5d6 100644 #include <linux/sched/signal.h> #include <linux/blockgroup_lock.h> 
#include <linux/percpu_counter.h> -@@ -1020,6 +1021,9 @@ struct ext4_inode_info { +@@ -1023,6 +1024,9 @@ struct ext4_inode_info { __u32 i_dtime; ext4_fsblk_t i_file_acl; @@ -57,7 +53,7 @@ index 54734be..fa5d5d6 100644 /* * i_block_group is the number of the block group which contains * this file's inode. Constant across the lifetime of the inode, -@@ -2509,6 +2513,72 @@ struct dx_hash_info +@@ -2537,6 +2541,72 @@ struct dx_hash_info */ #define HASH_NB_ALWAYS 1 @@ -130,7 +126,7 @@ index 54734be..fa5d5d6 100644 struct ext4_filename { const struct qstr *usr_fname; struct fscrypt_str disk_name; -@@ -2887,12 +2957,20 @@ void ext4_insert_dentry(struct inode *dir, struct inode *inode, +@@ -2915,12 +2985,20 @@ void ext4_insert_dentry(struct inode *di void *data); static inline void ext4_update_dx_flag(struct inode *inode) { @@ -151,11 +147,9 @@ index 54734be..fa5d5d6 100644 } static const unsigned char ext4_filetype_table[] = { DT_UNKNOWN, DT_REG, DT_DIR, DT_CHR, DT_BLK, DT_FIFO, DT_SOCK, DT_LNK -diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c -index 51c950b..1b8c80e 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c -@@ -56,6 +56,7 @@ struct buffer_head *ext4_append(handle_t *handle, +@@ -56,6 +56,7 @@ struct buffer_head *ext4_append(handle_t { struct ext4_map_blocks map; struct buffer_head *bh; @@ -163,7 +157,7 @@ index 51c950b..1b8c80e 100644 int err; if (unlikely(EXT4_SB(inode->i_sb)->s_max_dir_size_kb && -@@ -63,6 +64,10 @@ struct buffer_head *ext4_append(handle_t *handle, +@@ -63,6 +64,10 @@ struct buffer_head *ext4_append(handle_t EXT4_SB(inode->i_sb)->s_max_dir_size_kb))) return ERR_PTR(-ENOSPC); @@ -174,7 +168,7 @@ index 51c950b..1b8c80e 100644 *block = inode->i_size >> inode->i_sb->s_blocksize_bits; map.m_lblk = *block; map.m_len = 1; -@@ -73,21 +78,27 @@ struct buffer_head *ext4_append(handle_t *handle, +@@ -73,16 +78,21 @@ struct buffer_head *ext4_append(handle_t * directory. 
*/ err = ext4_map_blocks(NULL, inode, &map, 0); @@ -197,14 +191,16 @@ index 51c950b..1b8c80e 100644 + } inode->i_size += inode->i_sb->s_blocksize; EXT4_I(inode)->i_disksize = inode->i_size; + err = ext4_mark_inode_dirty(handle, inode); +@@ -91,6 +101,7 @@ struct buffer_head *ext4_append(handle_t BUFFER_TRACE(bh, "get_write_access"); err = ext4_journal_get_write_access(handle, inode->i_sb, bh, EXT4_JTR_NONE); + up(&ei->i_append_sem); - if (err) { - brelse(bh); - ext4_std_error(inode->i_sb, err); -@@ -291,7 +302,8 @@ static unsigned dx_node_limit(struct inode *dir); + if (err) + goto out; + return bh; +@@ -296,7 +307,8 @@ static unsigned dx_node_limit(struct ino static struct dx_frame *dx_probe(struct ext4_filename *fname, struct inode *dir, struct dx_hash_info *hinfo, @@ -214,7 +210,7 @@ index 51c950b..1b8c80e 100644 static void dx_release(struct dx_frame *frames, struct inode *dir); static int dx_make_map(struct inode *dir, struct buffer_head *bh, struct dx_hash_info *hinfo, -@@ -307,12 +319,13 @@ static void dx_insert_block(struct dx_frame *frame, +@@ -312,12 +324,13 @@ static void dx_insert_block(struct dx_fr static int ext4_htree_next_block(struct inode *dir, __u32 hash, struct dx_frame *frame, struct dx_frame *frames, @@ -231,7 +227,7 @@ index 51c950b..1b8c80e 100644 /* checksumming functions */ void ext4_initialize_dirent_tail(struct buffer_head *bh, -@@ -797,6 +810,227 @@ static inline void htree_rep_invariant_check(struct dx_entry *at, +@@ -804,6 +817,227 @@ static inline void htree_rep_invariant_c } #endif /* DX_DEBUG */ @@ -459,7 +455,7 @@ index 51c950b..1b8c80e 100644 /* * Probe for a directory leaf block to search. 
* -@@ -808,10 +1042,11 @@ static inline void htree_rep_invariant_check(struct dx_entry *at, +@@ -815,10 +1049,11 @@ static inline void htree_rep_invariant_c */ static struct dx_frame * dx_probe(struct ext4_filename *fname, struct inode *dir, @@ -473,7 +469,7 @@ index 51c950b..1b8c80e 100644 struct dx_root_info *info; struct dx_frame *frame = frame_in; struct dx_frame *ret_err = ERR_PTR(ERR_BAD_DX_DIR); -@@ -895,8 +1130,16 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, +@@ -908,8 +1143,16 @@ dx_probe(struct ext4_filename *fname, st level = 0; blocks[0] = 0; while (1) { @@ -490,7 +486,7 @@ index 51c950b..1b8c80e 100644 ext4_warning_inode(dir, "dx entry: count %u beyond limit %u", count, dx_get_limit(entries)); -@@ -923,6 +1166,74 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, +@@ -936,6 +1179,74 @@ dx_probe(struct ext4_filename *fname, st frame->entries = entries; frame->at = at; @@ -565,7 +561,7 @@ index 51c950b..1b8c80e 100644 block = dx_get_block(at); for (i = 0; i <= level; i++) { if (blocks[i] == block) { -@@ -932,8 +1243,7 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, +@@ -945,8 +1256,7 @@ dx_probe(struct ext4_filename *fname, st goto fail; } } @@ -575,7 +571,7 @@ index 51c950b..1b8c80e 100644 blocks[level] = block; frame++; frame->bh = ext4_read_dirblock(dir, block, INDEX); -@@ -1004,7 +1314,7 @@ static void dx_release(struct dx_frame *frames, struct inode *dir) +@@ -1017,7 +1327,7 @@ static void dx_release(struct dx_frame * static int ext4_htree_next_block(struct inode *dir, __u32 hash, struct dx_frame *frame, struct dx_frame *frames, @@ -584,7 +580,7 @@ index 51c950b..1b8c80e 100644 { struct dx_frame *p; struct buffer_head *bh; -@@ -1019,12 +1329,22 @@ static int ext4_htree_next_block(struct inode *dir, __u32 hash, +@@ -1032,12 +1342,22 @@ static int ext4_htree_next_block(struct * this loop, num_frames indicates the number of interior * nodes need to be read. 
*/ @@ -609,7 +605,7 @@ index 51c950b..1b8c80e 100644 p--; } -@@ -1047,6 +1367,13 @@ static int ext4_htree_next_block(struct inode *dir, __u32 hash, +@@ -1060,6 +1380,13 @@ static int ext4_htree_next_block(struct * block so no check is necessary */ while (num_frames--) { @@ -623,7 +619,7 @@ index 51c950b..1b8c80e 100644 bh = ext4_read_dirblock(dir, dx_get_block(p->at), INDEX); if (IS_ERR(bh)) return PTR_ERR(bh); -@@ -1055,6 +1382,7 @@ static int ext4_htree_next_block(struct inode *dir, __u32 hash, +@@ -1068,6 +1395,7 @@ static int ext4_htree_next_block(struct p->bh = bh; p->at = p->entries = ((struct dx_node *) bh->b_data)->entries; } @@ -631,7 +627,7 @@ index 51c950b..1b8c80e 100644 return 1; } -@@ -1216,10 +1544,10 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, +@@ -1234,10 +1562,10 @@ int ext4_htree_fill_tree(struct file *di } hinfo.hash = start_hash; hinfo.minor_hash = 0; @@ -644,7 +640,7 @@ index 51c950b..1b8c80e 100644 /* Add '.' and '..' from the htree header */ if (!start_hash && !start_minor_hash) { de = (struct ext4_dir_entry_2 *) frames[0].bh->b_data; -@@ -1259,7 +1587,7 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, +@@ -1277,7 +1605,7 @@ int ext4_htree_fill_tree(struct file *di count += ret; hashval = ~0; ret = ext4_htree_next_block(dir, HASH_NB_ALWAYS, @@ -653,7 +649,7 @@ index 51c950b..1b8c80e 100644 *next_hash = hashval; if (ret < 0) { err = ret; -@@ -1579,7 +1907,7 @@ static int is_dx_internal_node(struct inode *dir, ext4_lblk_t block, +@@ -1600,7 +1928,7 @@ static int is_dx_internal_node(struct in static struct buffer_head *__ext4_find_entry(struct inode *dir, struct ext4_filename *fname, struct ext4_dir_entry_2 **res_dir, @@ -662,7 +658,7 @@ index 51c950b..1b8c80e 100644 { struct super_block *sb; struct buffer_head *bh_use[NAMEI_RA_SIZE]; -@@ -1621,7 +1949,7 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir, +@@ -1641,7 +1969,7 @@ static struct buffer_head *__ext4_find_e goto restart; } 
if (is_dx(dir)) { @@ -671,7 +667,7 @@ index 51c950b..1b8c80e 100644 /* * On success, or if the error was file not found, * return. Otherwise, fall back to doing a search the -@@ -1631,6 +1959,7 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir, +@@ -1651,6 +1979,7 @@ static struct buffer_head *__ext4_find_e goto cleanup_and_exit; dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, " "falling back\n")); @@ -679,7 +675,7 @@ index 51c950b..1b8c80e 100644 ret = NULL; } nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb); -@@ -1721,10 +2050,10 @@ cleanup_and_exit: +@@ -1741,10 +2070,10 @@ cleanup_and_exit: return ret; } @@ -692,7 +688,7 @@ index 51c950b..1b8c80e 100644 { int err; struct ext4_filename fname; -@@ -1736,12 +2065,14 @@ static struct buffer_head *ext4_find_entry(struct inode *dir, +@@ -1756,12 +2085,14 @@ static struct buffer_head *ext4_find_ent if (err) return ERR_PTR(err); @@ -708,7 +704,7 @@ index 51c950b..1b8c80e 100644 static struct buffer_head *ext4_lookup_entry(struct inode *dir, struct dentry *dentry, struct ext4_dir_entry_2 **res_dir) -@@ -1757,7 +2088,7 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir, +@@ -1777,7 +2108,7 @@ static struct buffer_head *ext4_lookup_e if (err) return ERR_PTR(err); @@ -717,7 +713,7 @@ index 51c950b..1b8c80e 100644 ext4_fname_free_filename(&fname); return bh; -@@ -1765,7 +2096,8 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir, +@@ -1785,7 +2116,8 @@ static struct buffer_head *ext4_lookup_e static struct buffer_head * ext4_dx_find_entry(struct inode *dir, struct ext4_filename *fname, @@ -727,7 +723,7 @@ index 51c950b..1b8c80e 100644 { struct super_block * sb = dir->i_sb; struct dx_frame frames[EXT4_HTREE_LEVEL], *frame; -@@ -1776,7 +2108,7 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir, +@@ -1796,7 +2128,7 @@ static struct buffer_head * ext4_dx_find #ifdef CONFIG_FS_ENCRYPTION *res_dir = NULL; #endif @@ -736,7 +732,7 @@ index 51c950b..1b8c80e 100644 
if (IS_ERR(frame)) return (struct buffer_head *) frame; do { -@@ -1798,7 +2130,7 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir, +@@ -1818,7 +2150,7 @@ static struct buffer_head * ext4_dx_find /* Check to see if we should continue to search */ retval = ext4_htree_next_block(dir, fname->hinfo.hash, frame, @@ -745,7 +741,7 @@ index 51c950b..1b8c80e 100644 if (retval < 0) { ext4_warning_inode(dir, "error %d reading directory index block", -@@ -1987,8 +2319,9 @@ static struct ext4_dir_entry_2 *dx_pack_dirents(struct inode *dir, char *base, +@@ -2007,8 +2339,9 @@ static struct ext4_dir_entry_2 *dx_pack_ * Returns pointer to de in block into which the new entry will be inserted. */ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir, @@ -757,7 +753,7 @@ index 51c950b..1b8c80e 100644 { unsigned blocksize = dir->i_sb->s_blocksize; unsigned continued; -@@ -2065,8 +2398,14 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir, +@@ -2085,8 +2418,14 @@ static struct ext4_dir_entry_2 *do_split hash2, split, count-split)); /* Fancy dance to stay within two buffers */ @@ -774,7 +770,7 @@ index 51c950b..1b8c80e 100644 de = dx_pack_dirents(dir, data1, blocksize); de->rec_len = ext4_rec_len_to_disk(data1 + (blocksize - csum_size) - (char *) de, -@@ -2084,12 +2423,21 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir, +@@ -2104,12 +2443,21 @@ static struct ext4_dir_entry_2 *do_split dxtrace(dx_show_leaf(dir, hinfo, (struct ext4_dir_entry_2 *) data2, blocksize, 1)); @@ -801,7 +797,7 @@ index 51c950b..1b8c80e 100644 err = ext4_handle_dirty_dirblock(handle, dir, bh2); if (err) goto journal_error; -@@ -2388,7 +2736,7 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname, +@@ -2421,7 +2769,7 @@ static int make_indexed_dir(handle_t *ha if (retval) goto out_frames; @@ -810,7 +806,7 @@ index 51c950b..1b8c80e 100644 if (IS_ERR(de)) { retval = PTR_ERR(de); goto 
out_frames; -@@ -2497,8 +2845,8 @@ out: +@@ -2530,8 +2878,8 @@ out: * may not sleep between calling this and putting something into * the entry, as someone else might have used it while you slept. */ @@ -821,7 +817,7 @@ index 51c950b..1b8c80e 100644 { struct inode *dir = d_inode(dentry->d_parent); struct buffer_head *bh = NULL; -@@ -2547,9 +2895,10 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry, +@@ -2580,9 +2928,10 @@ static int ext4_add_entry(handle_t *hand if (dentry->d_name.len == 2 && memcmp(dentry->d_name.name, "..", 2) == 0) return ext4_update_dotdot(handle, dentry, inode); @@ -833,7 +829,7 @@ index 51c950b..1b8c80e 100644 /* Can we just ignore htree data? */ if (ext4_has_metadata_csum(sb)) { EXT4_ERROR_INODE(dir, -@@ -2612,12 +2961,14 @@ out: +@@ -2645,12 +2994,14 @@ out: ext4_set_inode_state(inode, EXT4_STATE_NEWENTRY); return retval; } @@ -849,7 +845,7 @@ index 51c950b..1b8c80e 100644 { struct dx_frame frames[EXT4_HTREE_LEVEL], *frame; struct dx_entry *entries, *at; -@@ -2629,7 +2980,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname, +@@ -2662,7 +3013,7 @@ static int ext4_dx_add_entry(handle_t *h again: restart = 0; @@ -858,7 +854,7 @@ index 51c950b..1b8c80e 100644 if (IS_ERR(frame)) return PTR_ERR(frame); entries = frame->entries; -@@ -2664,6 +3015,12 @@ again: +@@ -2697,6 +3048,12 @@ again: struct dx_node *node2; struct buffer_head *bh2; @@ -871,7 +867,7 @@ index 51c950b..1b8c80e 100644 while (frame > frames) { if (dx_get_count((frame - 1)->entries) < dx_get_limit((frame - 1)->entries)) { -@@ -2767,8 +3124,32 @@ again: +@@ -2800,8 +3157,32 @@ again: restart = 1; goto journal_error; } @@ -905,7 +901,7 @@ index 51c950b..1b8c80e 100644 if (IS_ERR(de)) { err = PTR_ERR(de); goto cleanup; -@@ -2779,6 +3160,8 @@ again: +@@ -2812,6 +3193,8 @@ again: journal_error: ext4_std_error(dir->i_sb, err); /* this is a no-op if err == 0 */ cleanup: @@ -914,18 +910,13 @@ index 51c950b..1b8c80e 100644 brelse(bh); 
dx_release(frames, dir); /* @restart is true means htree-path has been changed, we need to -diff --git a/fs/ext4/super.c b/fs/ext4/super.c -index a2fcbf8..82ea5f6 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c -@@ -1291,6 +1291,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb) - +@@ -1286,6 +1286,7 @@ static struct inode *ext4_alloc_inode(st inode_set_iversion(&ei->vfs_inode, 1); + ei->i_flags = 0; spin_lock_init(&ei->i_raw_lock); + sema_init(&ei->i_append_sem, 1); INIT_LIST_HEAD(&ei->i_prealloc_list); atomic_set(&ei->i_prealloc_active, 0); spin_lock_init(&ei->i_prealloc_lock); --- -2.34.1 - --- a/ldiskfs/kernel_patches/series/ldiskfs-5.14.21-sles15sp4.series +++ b/ldiskfs/kernel_patches/series/ldiskfs-5.14.21-sles15sp4.series @@ -1,3 +1,130 @@ +patches.suse/ext4-add-new-helper-interface-ext4_try_to_trim_range.patch +patches.suse/ext4-fix-fast-commit-may-miss-tracking-range-for-FAL.patch +patches.suse/ext4-use-ext4_ext_remove_space-for-fast-commit-repla.patch +patches.suse/ext4-fast-commit-may-miss-tracking-unwritten-range-d.patch +patches.suse/ext4-destroy-ext4_fc_dentry_cachep-kmemcache-on-modu.patch +patches.suse/ext4-Fix-BUG_ON-in-ext4_bread-when-write-quota-data.patch +patches.suse/ext4-make-sure-quota-gets-properly-shutdown-on-error.patch +patches.suse/ext4-make-sure-to-reset-inode-lockdep-class-when-quo.patch +patches.suse/ext4-fix-a-possible-ABBA-deadlock-due-to-busy-PA.patch +patches.suse/ext4-initialize-err_blk-before-calling-__ext4_get_in.patch +patches.suse/ext4-fix-null-ptr-deref-in-__ext4_journal_ensure_cre.patch +patches.suse/ext4-fix-an-use-after-free-issue-about-data-journal-.patch +patches.suse/ext4-avoid-trim-error-on-fs-with-small-groups.patch +patches.suse/ext4-don-t-use-the-orphan-list-when-migrating-an-ino.patch +patches.suse/ext4-prevent-used-blocks-from-being-allocated-during.patch +patches.suse/ext4-modify-the-logic-of-ext4_mb_new_blocks_simple.patch 
+patches.suse/ext4-fix-error-handling-in-ext4_restore_inline_data.patch +patches.suse/ext4-fix-error-handling-in-ext4_fc_record_modified_i.patch +patches.suse/ext4-fix-incorrect-type-issue-during-replay_del_rang.patch +patches.suse/ext4-fix-fs-corruption-when-tring-to-remove-a-non-em.patch +patches.suse/ext4-fix-ext4_fc_stats-trace-point.patch +patches.suse/ext4-fix-fallocate-to-use-file_modified-to-update-pe.patch +patches.suse/ext4-fix-symlink-file-size-not-match-to-file-content.patch +patches.suse/ext4-fix-use-after-free-in-ext4_search_dir.patch +patches.suse/ext4-limit-length-to-bitmap_maxbytes-blocksize-in-pu.patch +patches.suse/ext4-fix-overhead-calculation-to-account-for-the-res.patch +patches.suse/ext4-force-overhead-calculation-if-the-s_overhead_cl.patch +patches.suse/ext4-fix-warning-in-ext4_handle_inode_extension.patch +patches.suse/ext4-fix-use-after-free-in-ext4_rename_dir_prepare.patch +patches.suse/ext4-mark-group-as-trimmed-only-if-it-was-fully-scan.patch +patches.suse/ext4-fix-race-condition-between-ext4_write-and-ext4_.patch +patches.suse/ext4-reject-the-commit-option-on-ext2-filesystems.patch +patches.suse/ext4-fix-bug_on-in-ext4_writepages.patch +patches.suse/ext4-filter-out-EXT4_FC_REPLAY-from-on-disk-superblo.patch +patches.suse/ext4-verify-dir-block-before-splitting-it.patch +patches.suse/ext4-avoid-cycles-in-directory-h-tree.patch +patches.suse/ext4-fix-bug_on-in-__es_tree_search.patch +patches.suse/ext4-fix-super-block-checksum-incorrect-after-mount.patch +patches.suse/ext4-fix-bug_on-ext4_mb_use_inode_pa.patch +patches.suse/ext4-make-variable-count-signed.patch +patches.suse/ext4-add-reserved-GDT-blocks-check.patch +patches.suse/ext4-recover-csum-seed-of-tmp_inode-after-migrating-.patch +patches.suse/ext4-check-if-directory-block-is-within-i_size.patch +patches.suse/ext4-make-sure-ext4_append-always-allocates-new-bloc.patch +patches.suse/ext4-remove-EA-inode-entry-from-mbcache-on-inode-evi.patch 
+patches.suse/ext4-unindent-codeblock-in-ext4_xattr_block_set.patch +patches.suse/ext4-Fix-check-for-block-being-out-of-directory-size.patch +patches.suse/ext4-don-t-increase-iversion-counter-for-ea_inodes.patch +patches.suse/ext4-unconditionally-enable-the-i_version-counter.patch +patches.suse/ext4-ext4_read_bh_lock-should-submit-IO-if-the-buffe.patch +patches.suse/ext4-place-buffer-head-allocation-before-handle-star.patch +patches.suse/ext4-fix-dir-corruption-when-ext4_dx_add_entry-fails.patch +patches.suse/ext4-fix-miss-release-buffer-head-in-ext4_fc_write_i.patch +patches.suse/ext4-goto-right-label-failed_mount3a.patch +patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_mod.patch +patches.suse/ext4-fix-potential-memory-leak-in-ext4_fc_record_reg.patch +patches.suse/ext4-update-state-fc_regions_size-after-successful-m.patch +patches.suse/ext4-introduce-EXT4_FC_TAG_BASE_LEN-helper.patch +patches.suse/ext4-factor-out-ext4_fc_get_tl.patch +patches.suse/ext4-fix-potential-out-of-bound-read-in-ext4_fc_repl.patch +patches.suse/ext4-f2fs-fix-readahead-of-verity-data.patch +patches.suse/ext4-fix-BUG_ON-when-directory-entry-has-invalid-rec.patch +patches.suse/ext4-fix-warning-in-ext4_da_release_space.patch +patches.suse/ext4-fix-use-after-free-in-ext4_ext_shift_extents.patch +patches.suse/ext4-silence-the-warning-when-evicting-inode-with-di.patch +patches.suse/ext4-add-inode-table-check-in-__ext4_get_inode_loc-t.patch +patches.suse/ext4-add-helper-to-check-quota-inums.patch +patches.suse/ext4-add-EXT4_IGET_BAD-flag-to-prevent-unexpected-ba.patch +patches.suse/ext4-fix-bug_on-in-__es_tree_search-caused-by-bad-bo.patch +patches.suse/ext4-fix-undefined-behavior-in-bit-shift-for-ext4_ch.patch +patches.suse/ext4-don-t-allow-journal-inode-to-have-encrypt-flag.patch +patches.suse/ext4-fix-use-after-free-in-ext4_orphan_cleanup.patch +patches.suse/ext4-don-t-set-up-encryption-key-during-jbd2-transac.patch 
+patches.suse/ext4-fix-leaking-uninitialized-memory-in-fast-commit.patch +patches.suse/ext4-add-missing-validation-of-fast-commit-record-le.patch +patches.suse/ext4-fix-unaligned-memory-access-in-ext4_fc_reserve_.patch +patches.suse/ext4-fix-off-by-one-errors-in-fast-commit-block-fill.patch +patches.suse/ext4-init-quota-for-old.inode-in-ext4_rename.patch +patches.suse/ext4-fix-error-code-return-to-user-space-in-ext4_get.patch +patches.suse/ext4-fix-bad-checksum-after-online-resize.patch +patches.suse/ext4-fix-corruption-when-online-resizing-a-1K-bigall.patch +patches.suse/ext4-fix-uninititialized-value-in-ext4_evict_inode.patch +patches.suse/ext4-fix-delayed-allocation-bug-in-ext4_clu_mapped-f.patch +patches.suse/fs-ext4-initialize-fsdata-in-pagecache_write.patch +patches.suse/ext4-avoid-BUG_ON-when-creating-xattrs.patch +patches.suse/ext4-fix-deadlock-due-to-mbcache-entry-corruption.patch +patches.suse/ext4-fix-kernel-BUG-in-ext4_write_inline_data_end.patch +patches.suse/ext4-initialize-quota-before-expanding-inode-in-setp.patch +patches.suse/ext4-avoid-unaccounted-block-allocation-when-expandi.patch +patches.suse/ext4-allocate-extended-attribute-value-in-vmalloc-ar.patch +patches.suse/ext4-fix-inode-leak-in-ext4_xattr_inode_create-on-an.patch +patches.suse/ext4-fix-reserved-cluster-accounting-in-__es_remove_.patch +patches.suse/ext4-use-ext4_fc_tl_mem-in-fast-commit-replay-path.patch +patches.suse/ext4-refuse-to-create-ea-block-when-umounted.patch +patches.suse/ext4-fail-ext4_iget-if-special-inode-unallocated.patch +patches.suse/ext4-update-s_journal_inum-if-it-changes-after-journ.patch +patches.suse/ext4-fix-task-hung-in-ext4_xattr_delete_inode.patch +patches.suse/ext4-Fix-possible-corruption-when-moving-a-directory.patch +patches.suse/ext4-fix-incorrect-options-show-of-original-mount_op.patch +patches.suse/ext4-fix-cgroup-writeback-accounting-with-fs-layer-e.patch +patches.suse/ext4-fix-RENAME_WHITEOUT-handling-for-inline-directo.patch 
+patches.suse/ext4-fix-another-off-by-one-fsmap-error-on-1k-block-.patch +patches.suse/ext4-Fix-deadlock-during-directory-rename.patch +patches.suse/ext4-move-where-set-the-MAY_INLINE_DATA-flag-is-set.patch +patches.suse/ext4-fix-WARNING-in-ext4_update_inline_data.patch +patches.suse/ext4-zero-i_disksize-when-initializing-the-bootloade.patch +patches.suse/ext4-fix-possible-double-unlock-when-moving-a-direct.patch +patches.suse/ext4-fix-i_disksize-exceeding-i_size-problem-in-pari.patch +patches.suse/ext4-fix-use-after-free-read-in-ext4_find_extent-for.patch +patches.suse/ext4-fix-WARNING-in-mb_find_extent.patch +patches.suse/ext4-fix-lockdep-warning-when-enabling-MMP.patch +patches.suse/ext4-avoid-deadlock-in-fs-reclaim-with-page-writebac.patch +patches.suse/ext4-fix-data-races-when-using-cached-status-extents.patch +patches.suse/ext4-check-iomap-type-only-if-ext4_iomap_begin-does-.patch +patches.suse/ext4-improve-error-handling-from-ext4_dirhash.patch +patches.suse/ext4-improve-error-recovery-code-paths-in-__ext4_rem.patch +patches.suse/ext4-fix-deadlock-when-converting-an-inline-director.patch +patches.suse/ext4-bail-out-of-ext4_xattr_ibody_get-fails-for-any-.patch +patches.suse/ext4-add-EA_INODE-checking-to-ext4_iget.patch +patches.suse/ext4-set-lockdep-subclass-for-the-ea_inode-in-ext4_x.patch +patches.suse/ext4-disallow-ea_inodes-with-extended-attributes.patch +patches.suse/ext4-add-lockdep-annotations-for-i_data_sem-for-ea_i.patch +patches.suse/ext4-only-update-i_reserved_data_blocks-on-successfu.patch +patches.suse/ext4-Fix-reusing-stale-buffer-heads-from-last-failed.patch +patches.suse/ext4-turn-quotas-off-if-mount-failed-after-enabling-.patch +patches.suse/ext4-fix-to-check-return-value-of-freeze_bdev-in-ext.patch +patches.suse/ext4-fixup-pages-without-buffers.patch rhel8/ext4-inode-version.patch linux-5.4/ext4-lookup-dotdot.patch linux-5.14/ext4-print-inum-in-htree-warning.patch