Greetings,
FYI, we noticed BUG:Bad_page_state_in_process due to commit (built with gcc-11):
commit: fc581f48adffe8e6e2f1ae7822b004b0240602b3 ("[PATCH] fscrypt: Copy the memcg information to the ciphertext page")
url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/fscrypt...
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git c96618275234ad03d44eafe9f8844305bb44fda4
patch link: https://lore.kernel.org/all/20230129121851.2248378-1-willy@infradead.org/
patch subject: [PATCH] fscrypt: Copy the memcg information to the ciphertext page
in testcase: xfstests version: xfstests-x86_64-fb6575e-1_20230123 with following parameters:
        disk: 4HDD
        fs: f2fs
        test: generic-group-22
test-description: xfstests is a regression test suite for xfs and other filesystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
[ 63.714777][ T1373] run fstests generic/440 at 2023-02-03 01:47:05
[ 66.322355][ T1973] F2FS-fs (sda4): Found nat_bits in checkpoint
[ 66.920249][ T1973] F2FS-fs (sda4): Mounted with checkpoint version = 194d8365
[ 66.952346][ T1983] xfs_io (pid 1983) is setting deprecated v1 encryption policy; recommend upgrading to v2.
[ 68.956618][ T2010] F2FS-fs (sda4): Found nat_bits in checkpoint
[ 69.578430][ T2010] F2FS-fs (sda4): Mounted with checkpoint version = 62a6fc2b
[ 69.824624][ T2111] fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni"
[ 69.851641][ T1712] fscrypt: AES-256-XTS using implementation "xts-aes-aesni"
[ 70.337764][ T2125] F2FS-fs (sda4): Found nat_bits in checkpoint
[ 70.927766][ T2125] F2FS-fs (sda4): Mounted with checkpoint version = 62a6fc2e
[ 71.349898][ T2167] BUG: Bad page state in process 440 pfn:1803ec
[ 71.356070][ T2167] page:00000000a7ddf13f refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1803ec
[ 71.366105][ T2167] memcg:ffff8881f31f8000
[ 71.370178][ T2167] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 71.377365][ T2167] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 71.385759][ T2167] raw: 0000000000000000 0000000000000000 00000000ffffffff ffff8881f31f8000
[ 71.394149][ T2167] page dumped because: page still charged to cgroup
[ 71.400554][ T2167] Modules linked in: dm_mod f2fs crc32_generic ipmi_devintf ipmi_msghandler btrfs blake2b_generic xor raid6_pq zstd_compress intel_rapl_msr libcrc32c intel_rapl_common sd_mod t10_pi x86_pkg_temp_thermal intel_powerclamp crc64_rocksoft_generic crc64_rocksoft coretemp crc64 sg kvm_intel i915 kvm irqbypass crct10dif_pclmul drm_buddy crc32_pclmul intel_gtt crc32c_intel drm_display_helper ghash_clmulni_intel sha512_ssse3 ttm ahci mei_wdt rapl drm_kms_helper libahci intel_cstate wmi_bmof intel_uncore mei_me syscopyarea sysfillrect i2c_i801 video i2c_smbus mei libata sysimgblt intel_pch_thermal wmi intel_pmc_core acpi_pad drm fuse ip_tables
[ 71.457998][ T2167] CPU: 2 PID: 2167 Comm: 440 Tainted: G I 6.2.0-rc5-00206-gfc581f48adff #2
[ 71.467774][ T2167] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.1.1 10/07/2015
[ 71.475821][ T2167] Call Trace:
[ 71.478959][ T2167] <TASK>
[ 71.481754][ T2167] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 71.486091][ T2167] bad_page.cold (mm/page_alloc.c:699)
[ 71.490341][ T2167] free_pcppages_bulk (mm/page_alloc.c:1598)
[ 71.495198][ T2167] free_unref_page (arch/x86/include/asm/paravirt.h:596 arch/x86/include/asm/qspinlock.h:57 include/linux/spinlock.h:203 include/linux/spinlock_api_smp.h:142 include/linux/spinlock.h:390 mm/page_alloc.c:3488)
[ 71.499792][ T2167] __mmdrop (arch/x86/include/asm/mmu_context.h:125 (discriminator 3) kernel/fork.c:796 (discriminator 3))
[ 71.503698][ T2167] finish_task_switch+0x486/0x720
[ 71.509157][ T2167] schedule_tail (arch/x86/include/asm/current.h:41 kernel/sched/core.c:5231)
[ 71.513408][ T2167] ret_from_fork (arch/x86/entry/entry_64.S:295)
[ 71.517573][ T2167] </TASK>
[ 71.520440][ T2167] Disabling lock debugging due to kernel taint
[ 71.668880][ T2169] F2FS-fs (sda4): Found nat_bits in checkpoint
[ 72.266606][ T2169] F2FS-fs (sda4): Mounted with checkpoint version = 62a6fc31
[ 74.907644][ T2213] F2FS-fs (sda4): Found nat_bits in checkpoint
[ 75.506786][ T2213] F2FS-fs (sda4): Mounted with checkpoint version = 62a6fc33
[ 75.713226][ T244] generic/440 _check_dmesg: something found in dmesg (see /lkp/benchmarks/xfstests/results//generic/440.dmesg)
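For anyone triaging a saved dmesg by hand, a minimal Python sketch along the following lines may help pull out "Bad page state" reports and the reason the page was dumped. It is an illustration only, not part of lkp-tests or xfstests: the function name find_bad_pages and the regular expressions are assumptions based on the three log lines quoted in the sample, and other kernel versions may format these messages differently.

```python
import re

# Patterns are assumptions derived from the log lines quoted in this report.
BAD_PAGE = re.compile(r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-fA-F]+)")
REASON = re.compile(r"page dumped because: (.+)")


def find_bad_pages(log_text):
    """Return one dict per 'Bad page state' report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = BAD_PAGE.search(line)
        if match:
            reports.append({
                "process": match.group(1),
                "pfn": int(match.group(2), 16),
                "reason": None,
            })
            continue
        match = REASON.search(line)
        # Attach the reason to the most recent report that lacks one.
        if match and reports and reports[-1]["reason"] is None:
            reports[-1]["reason"] = match.group(1).strip()
    return reports


# Sample lines copied from the dmesg excerpt in this report.
SAMPLE = """\
[ 71.349898][ T2167] BUG: Bad page state in process 440 pfn:1803ec
[ 71.366105][ T2167] memcg:ffff8881f31f8000
[ 71.394149][ T2167] page dumped because: page still charged to cgroup
"""

reports = find_bad_pages(SAMPLE)
for report in reports:
    print(report["process"], hex(report["pfn"]), report["reason"])
```

The sketch expects one log entry per line, as in a dmesg file saved from the test machine.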
Please note that this issue is not 100% reproducible in our tests. It reproduced in about 50% of runs across multiple rounds of testing.
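Because the failure is intermittent, a single clean run says little about whether a fix works. As a rough guide, the Python sketch below estimates how many consecutive clean runs are needed before an unfixed kernel would be unlikely to have passed them all by chance. It assumes that runs are independent and that the failure rate stays fixed at the observed value. Both are simplifying assumptions, and the helper name clean_runs_needed is hypothetical.

```python
def clean_runs_needed(fail_rate, confidence):
    """Smallest n such that (1 - fail_rate)**n <= 1 - confidence.

    Assumes independent runs with a constant per-run failure rate.
    """
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    runs = 0
    p_all_pass = 1.0  # chance that an unfixed kernel passes every run so far
    while p_all_pass > 1.0 - confidence:
        p_all_pass *= 1.0 - fail_rate
        runs += 1
    return runs


for confidence in (0.95, 0.99):
    print(confidence, clean_runs_needed(0.5, confidence))
```

With the roughly 50% rate reported here, that comes to 5 consecutive clean runs for 95% confidence and 7 for 99%.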
If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <yujie.liu@intel.com>
| Link: https://lore.kernel.org/oe-lkp/202302031333.e7a563c1-yujie.liu@intel.com
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.