This is a backport of the series that fixes the way deadline bandwidth
restoration is done which is causing noticeable delay on resume path. It also
converts the cpuset lock back into a mutex which some users on Android too.
I lack the details but AFAIU the read/write semaphore was slower on high
contention.
Compile tested against some randconfig for different archs. Only boot tested on
x86 qemu.
Based on v6.4.11
Original series:
https://lore.kernel.org/lkml/20230508075854.17215-1-juri.lelli@redhat.com/
Thanks!
--
Qais Yousef
Dietmar Eggemann (2):
sched/deadline: Create DL BW alloc, free & check overflow interface
cgroup/cpuset: Free DL BW in case can_attach() fails
Juri Lelli (4):
cgroup/cpuset: Rename functions dealing with DEADLINE accounting
sched/cpuset: Bring back cpuset_mutex
sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
cgroup/cpuset: Iterate only if DEADLINE tasks are present
include/linux/cpuset.h | 12 +-
include/linux/sched.h | 4 +-
kernel/cgroup/cgroup.c | 4 +
kernel/cgroup/cpuset.c | 244 ++++++++++++++++++++++++++--------------
kernel/sched/core.c | 41 +++----
kernel/sched/deadline.c | 67 ++++++++---
kernel/sched/sched.h | 2 +-
7 files changed, 246 insertions(+), 128 deletions(-)
--
2.34.1
From: "Paul E. McKenney" <paulmck(a)kernel.org>
[ Upstream commit 10f84c2cfb5045e37d78cb5d4c8e8321e06ae18f ]
Currently, the various torture tests sometimes react to an early-boot
bug by rebooting. This is almost always counterproductive, needlessly
consuming CPU time and bloating the console log. This commit therefore
adds the "-no-reboot" argument to qemu so that reboot requests will
cause qemu to exit.
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index 6dc2b49b85ea..bdd747dc61f2 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -9,7 +9,7 @@
#
# Usage: kvm-test-1-run.sh config builddir resdir seconds qemu-args boot_args
#
-# qemu-args defaults to "-enable-kvm -nographic", along with arguments
+# qemu-args defaults to "-enable-kvm -nographic -no-reboot", along with arguments
# specifying the number of CPUs and other options
# generated from the underlying CPU architecture.
# boot_args defaults to value returned by the per_version_boot_params
@@ -132,7 +132,7 @@ then
fi
# Generate -smp qemu argument.
-qemu_args="-enable-kvm -nographic $qemu_args"
+qemu_args="-enable-kvm -nographic -no-reboot $qemu_args"
cpu_count=`configNR_CPUS.sh $resdir/ConfigFragment`
cpu_count=`configfrag_boot_cpus "$boot_args" "$config_template" "$cpu_count"`
if test "$cpu_count" -gt "$TORTURE_ALLOTED_CPUS"
--
2.42.0.rc1.204.g551eb34607-goog
I'm announcing the release of the 6.1.49 kernel.
This upgrade is only for all users of the 6.1 series that use the x86
platform OR the F2FS file system. If that's not you, feel free to
ignore this release.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
fs/f2fs/f2fs.h | 1
fs/f2fs/file.c | 5 ++++
fs/f2fs/node.c | 14 +----------
fs/f2fs/super.c | 43 +++++++++++------------------------
include/linux/f2fs_fs.h | 1
tools/objtool/arch/x86/decode.c | 11 +++++---
tools/objtool/check.c | 22 +++++++++++++++++
tools/objtool/include/objtool/arch.h | 1
tools/objtool/include/objtool/elf.h | 1
10 files changed, 53 insertions(+), 48 deletions(-)
Greg Kroah-Hartman (4):
Revert "f2fs: don't reset unchangable mount option in f2fs_remount()"
Revert "f2fs: fix to set flush_merge opt and show noflush_merge"
Revert "f2fs: fix to do sanity check on direct node in truncate_dnode()"
Linux 6.1.49
Peter Zijlstra (1):
objtool/x86: Fix SRSO mess
Cc: stable(a)vger.kernel.org # v5.11+
Fixes: e7e0545299d8 ("x86/sgx: Initialize metadata for Enclave Page Cache (EPC) sections")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308221542.11UpkVfp-lkp@intel.com/
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
arch/x86/kernel/cpu/sgx/main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 166692f2d501..388350b8f5e3 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -732,6 +732,10 @@ int arch_memory_failure(unsigned long pfn, int flags)
}
/**
+ * sgx_calc_section_metric() - Calculate an EPC section metric
+ * @low: low 32-bit word from CPUID:0x12:{2, ...}
+ * @high: high 32-bit word from CPUID:0x12:{2, ...}
+ *
* A section metric is concatenated in a way that @low bits 12-31 define the
* bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the
* metric.
--
2.39.2
Patch series "don't use mapcount() to check large folio sharing", v2.
In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
folio_mapcount() is used to check whether the folio is shared. But it's
not correct as folio_mapcount() returns total mapcount of large folio.
Use folio_estimated_sharers() here as the estimated number is enough.
This patchset will fix the cases:
User space application call madvise() with MADV_FREE, MADV_COLD and
MADV_PAGEOUT for specific address range. There are THP mapped to the
range. Without the patchset, the THP is skipped. With the patch, the
THP will be split and handled accordingly.
David reported the cow self test skip some cases because of MADV_PAGEOUT
skip THP:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel…
and I confirmed this patchset make it work again.
This patch (of 3):
Commit 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range()
to use folios") replaced the page_mapcount() with folio_mapcount() to
check whether the folio is shared by other mapping.
It's not correct for large folio. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.
Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.
User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.
NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.
Link: https://lkml.kernel.org/r/20230808020917.2230692-1-fengwei.yin@intel.com
Link: https://lkml.kernel.org/r/20230808020917.2230692-2-fengwei.yin@intel.com
Fixes: 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range() to use folios")
Signed-off-by: Yin Fengwei <fengwei.yin(a)intel.com>
Reviewed-by: Yu Zhao <yuzhao(a)google.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
(cherry picked from commit 2f406263e3e954aa24c1248edcfa9be0c1bb30fa)
---
mm/madvise.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index b5ffbaf616f5..6adee363a9fa 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -375,7 +375,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
folio = pfn_folio(pmd_pfn(orig_pmd));
/* Do not interfere with other mappings of this folio */
- if (folio_mapcount(folio) != 1)
+ if (folio_estimated_sharers(folio) != 1)
goto huge_unlock;
if (pageout_anon_only_filter && !folio_test_anon(folio))
@@ -447,7 +447,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
* are sure it's worth. Split it if we are only owner.
*/
if (folio_test_large(folio)) {
- if (folio_mapcount(folio) != 1)
+ if (folio_estimated_sharers(folio) != 1)
break;
if (pageout_anon_only_filter && !folio_test_anon(folio))
break;
--
2.39.2
The patch below does not apply to the 6.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.4.y
git checkout FETCH_HEAD
git cherry-pick -x 0e0e9bd5f7b9d40fd03b70092367247d52da1db0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023082614-choice-mongoose-0731@gregkh' --subject-prefix 'PATCH 6.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0e0e9bd5f7b9d40fd03b70092367247d52da1db0 Mon Sep 17 00:00:00 2001
From: Yin Fengwei <fengwei.yin(a)intel.com>
Date: Tue, 8 Aug 2023 10:09:17 +0800
Subject: [PATCH] madvise:madvise_free_pte_range(): don't use mapcount()
against large folio for sharing check
Commit 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a
folio") replaced the page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mapping.
It's not correct for large folios. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.
Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.
User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.
NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.
Link: https://lkml.kernel.org/r/20230808020917.2230692-4-fengwei.yin@intel.com
Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio")
Signed-off-by: Yin Fengwei <fengwei.yin(a)intel.com>
Reviewed-by: Yu Zhao <yuzhao(a)google.com>
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/madvise.c b/mm/madvise.c
index 46802b4cf65a..ec30f48f8f2e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -680,7 +680,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
if (folio_test_large(folio)) {
int err;
- if (folio_mapcount(folio) != 1)
+ if (folio_estimated_sharers(folio) != 1)
break;
if (!folio_trylock(folio))
break;