From: Raghavendra K T <raghavendra.kt(a)amd.com>
[ Upstream commit 84db47ca7146d7bd00eb5cf2b93989a971c84650 ]
Since commit fc137c0ddab2 ("sched/numa: enhance vma scanning logic"),
NUMA Balancing allows updating PTEs to trap NUMA hinting faults if the
task had previously accessed the VMA. However, unconditional scanning of
VMAs is allowed during the initial phase of VMA creation, until the process's
mm numa_scan_seq reaches 2, even though the current task had not accessed the VMA.
Rationale:
- Without an initial scan, subsequent PTE updates may never happen.
- Give all the VMAs a fair opportunity to be scanned, and subsequently
understand the access pattern of all the VMAs.
However, there is a corner case: if a VMA is created after some time,
the process's mm numa_scan_seq could already be greater than 2.
For example, the values of mm numa_scan_seq at VMA creation time, observed
while briefly running the mmtests autonuma benchmark, look like:
start_seq=0 : 459
start_seq=2 : 138
start_seq=3 : 144
start_seq=4 : 8
start_seq=8 : 1
start_seq=9 : 1
This results in no unconditional PTE updates for those VMAs created after
some time.
Fix:
- Note down the initial value of mm numa_scan_seq in per VMA start_seq.
- Allow unconditional scan till start_seq + 2.
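The change in the eligibility window can be sketched in userspace as below. This is only an illustrative model: struct mm_model, struct vma_model and both helper functions are hypothetical stand-ins, not kernel code, and the real check lives in vma_is_accessed().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mm_model {
	int numa_scan_seq;	/* per-mm scan sequence counter */
};

struct vma_model {
	int start_scan_seq;	/* mm scan sequence recorded for this VMA */
};

/* Old rule: only the first two scan sequences of the mm qualify. */
static bool scan_unconditionally_old(const struct mm_model *mm)
{
	return mm->numa_scan_seq < 2;
}

/* New rule: each VMA gets two sequences counted from its own start. */
static bool scan_unconditionally_new(const struct mm_model *mm,
				     const struct vma_model *vma)
{
	return (mm->numa_scan_seq - vma->start_scan_seq) < 2;
}
```

In this model, a VMA recorded at sequence 4 is never scanned unconditionally under the old rule, while under the new rule it is at sequences 4 and 5 and no longer at 6.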
Result:
SUT: AMD EPYC Milan with 2 NUMA nodes, 256 CPUs.
base kernel: upstream 6.6-rc6 with Mel's patches [1] applied.
kernbench
==========                            base                 patched  %gain
Amean elsp-128                      165.09 ( 0.00%)         164.78  * 0.19%*
Duration User                     41404.28                41375.08
Duration System                    9862.22                 9768.48
Duration Elapsed                    519.87                  518.72
Ops NUMA PTE updates            1041416.00               831536.00
Ops NUMA hint faults             263296.00               220966.00
Ops NUMA pages migrated          258021.00               212769.00
Ops AutoNUMA cost                  1328.67                 1114.69
autonumabench
NUMA01_THREADLOCAL
==================
Amean elsp-NUMA01_THREADLOCAL        81.79 (0.00%)           67.74  * 17.18%*
Duration User                     54832.73                47379.67
Duration System                      75.00                  185.75
Duration Elapsed                    576.72                  476.09
Ops NUMA PTE updates             394429.00             11121044.00
Ops NUMA hint faults               1001.00              8906404.00
Ops NUMA pages migrated             288.00              2998694.00
Ops AutoNUMA cost                     7.77                44666.84
Signed-off-by: Raghavendra K T <raghavendra.kt(a)amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Acked-by: Mel Gorman <mgorman(a)suse.de>
Link: https://lore.kernel.org/r/2ea7cbce80ac7c62e90cbfb9653a7972f902439f.16978166…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
include/linux/mm_types.h | 3 +++
kernel/sched/fair.c | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 957ce38768b2..950df415d7de 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -600,6 +600,9 @@ struct vma_numab_state {
*/
unsigned long pids_active[2];
+ /* MM scan sequence ID when scan first started after VMA creation */
+ int start_scan_seq;
+
/*
* MM scan sequence ID when the VMA was last completely scanned.
* A VMA is not eligible for scanning if prev_scan_seq == numa_scan_seq
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d7a3c63a2171..44b5262b6657 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3164,7 +3164,7 @@ static bool vma_is_accessed(struct mm_struct *mm, struct vm_area_struct *vma)
* This is also done to avoid any side effect of task scanning
* amplifying the unfairness of disjoint set of VMAs' access.
*/
- if (READ_ONCE(current->mm->numa_scan_seq) < 2)
+ if ((READ_ONCE(current->mm->numa_scan_seq) - vma->numab_state->start_scan_seq) < 2)
return true;
pids = vma->numab_state->pids_active[0] | vma->numab_state->pids_active[1];
@@ -3307,6 +3307,8 @@ static void task_numa_work(struct callback_head *work)
if (!vma->numab_state)
continue;
+ vma->numab_state->start_scan_seq = mm->numa_scan_seq;
+
vma->numab_state->next_scan = now +
msecs_to_jiffies(sysctl_numa_balancing_scan_delay);
--
2.43.0
From: Kunwu Chan <chentao(a)kylinos.cn>
[ Upstream commit f46c8a75263f97bda13c739ba1c90aced0d3b071 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure. Ensure the allocation was successful
by checking the pointer validity.
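A userspace sketch of the same pattern follows. It assumes hypothetical helpers: format_name() stands in for kasprintf(), and cache_create() and add_cache() are invented for illustration. The sketch returns NULL to the caller, whereas the kernel function panics.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for kasprintf(): returns NULL on failure. */
static char *format_name(int shift)
{
	int len = snprintf(NULL, 0, "pgtable-2^%d", shift);
	char *buf;

	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;
	snprintf(buf, (size_t)len + 1, "pgtable-2^%d", shift);
	return buf;
}

/* Hypothetical consumer that keeps a reference to the name. */
struct cache {
	const char *name;
};

static struct cache *cache_create(const char *name)
{
	struct cache *c = malloc(sizeof(*c));

	if (c)
		c->name = name;
	return c;
}

/* Only hand the name on once it is known to be non-NULL. */
static struct cache *add_cache(int shift)
{
	struct cache *created = NULL;
	char *name = format_name(shift);

	if (name)
		created = cache_create(name);
	if (!created)
		free(name);	/* free(NULL) is a no-op */
	return created;
}
```

Initialising the result pointer to NULL lets a single failure check cover both the name allocation and the cache creation, which is the shape the patch gives pgtable_cache_add().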
Suggested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Suggested-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Kunwu Chan <chentao(a)kylinos.cn>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/20231204023223.2447523-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/powerpc/mm/init-common.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
index 2b656e67f2ea..927703af49be 100644
--- a/arch/powerpc/mm/init-common.c
+++ b/arch/powerpc/mm/init-common.c
@@ -65,7 +65,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
* as to leave enough 0 bits in the address to contain it. */
unsigned long minalign = max(MAX_PGTABLE_INDEX_SIZE + 1,
HUGEPD_SHIFT_MASK + 1);
- struct kmem_cache *new;
+ struct kmem_cache *new = NULL;
/* It would be nice if this was a BUILD_BUG_ON(), but at the
* moment, gcc doesn't seem to recognize is_power_of_2 as a
@@ -78,7 +78,8 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
align = max_t(unsigned long, align, minalign);
name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
- new = kmem_cache_create(name, table_size, align, 0, ctor);
+ if (name)
+ new = kmem_cache_create(name, table_size, align, 0, ctor);
if (!new)
panic("Could not allocate pgtable cache for order %d", shift);
--
2.43.0
From: Kunwu Chan <chentao(a)kylinos.cn>
[ Upstream commit f46c8a75263f97bda13c739ba1c90aced0d3b071 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure. Ensure the allocation was successful
by checking the pointer validity.
Suggested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Suggested-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Kunwu Chan <chentao(a)kylinos.cn>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/20231204023223.2447523-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/powerpc/mm/init-common.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
index a84da92920f7..e7b9cc90fd9e 100644
--- a/arch/powerpc/mm/init-common.c
+++ b/arch/powerpc/mm/init-common.c
@@ -104,7 +104,7 @@ void pgtable_cache_add(unsigned int shift)
* as to leave enough 0 bits in the address to contain it. */
unsigned long minalign = max(MAX_PGTABLE_INDEX_SIZE + 1,
HUGEPD_SHIFT_MASK + 1);
- struct kmem_cache *new;
+ struct kmem_cache *new = NULL;
/* It would be nice if this was a BUILD_BUG_ON(), but at the
* moment, gcc doesn't seem to recognize is_power_of_2 as a
@@ -117,7 +117,8 @@ void pgtable_cache_add(unsigned int shift)
align = max_t(unsigned long, align, minalign);
name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
- new = kmem_cache_create(name, table_size, align, 0, ctor(shift));
+ if (name)
+ new = kmem_cache_create(name, table_size, align, 0, ctor(shift));
if (!new)
panic("Could not allocate pgtable cache for order %d", shift);
--
2.43.0
The return value of 'to_amdgpu_crtc', which is container_of(...), can't be
NULL, so the NULL check on 'acrtc' is dropped.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9302 amdgpu_dm_atomic_commit_tail() error: we previously assumed 'acrtc' could be null (see line 9299)
Add a NULL check on 'new_crtc_state', the result of
'drm_atomic_get_new_crtc_state', which retrieves the new state for a CRTC,
while enabling writeback requests.
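For readers unfamiliar with container_of(), a simplified userspace sketch is below. The macro is a reduced version of the kernel's, and struct base_crtc, struct wrapper_crtc and to_wrapper() are hypothetical names, not the amdgpu definitions. It shows that the macro only does pointer arithmetic on the member address, so for a valid member pointer it yields the enclosing object rather than a value that needs a NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical embedded base object. */
struct base_crtc {
	int id;
};

/* Hypothetical driver object embedding the base object. */
struct wrapper_crtc {
	int wb_enabled;
	struct base_crtc base;
};

/* Recover the enclosing object from a pointer to its embedded member. */
static struct wrapper_crtc *to_wrapper(struct base_crtc *crtc)
{
	return container_of(crtc, struct wrapper_crtc, base);
}
```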
Cc: stable(a)vger.kernel.org
Cc: Alex Hung <alex.hung(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 95ff3800fc87..8eb381d5f6b8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9294,10 +9294,10 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
if (!new_con_state->writeback_job)
continue;
- new_crtc_state = NULL;
+ new_crtc_state = drm_atomic_get_new_crtc_state(state, &acrtc->base);
- if (acrtc)
- new_crtc_state = drm_atomic_get_new_crtc_state(state, &acrtc->base);
+ if (!new_crtc_state)
+ continue;
if (acrtc->wb_enabled)
continue;
--
2.34.1
This is the start of the stable review cycle for the 5.10.208 release.
There are 43 patches in this series; all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Mon, 15 Jan 2024 09:41:55 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.208-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.208-rc1
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "nvme: use command_id instead of req->tag in trace_nvme_complete_rq()"
Bartosz Pawlowski <bartosz.pawlowski(a)intel.com>
PCI: Disable ATS for specific Intel IPU E2000 devices
Bartosz Pawlowski <bartosz.pawlowski(a)intel.com>
PCI: Extract ATS disabling to a helper function
Phil Sutter <phil(a)nwl.cc>
netfilter: nf_tables: Reject tables of unsupported family
Wander Lairson Costa <wander(a)redhat.com>
drm/qxl: fix UAF on handle creation
Jon Maxwell <jmaxwell37(a)gmail.com>
ipv6: remove max_size check inline with ipv4
John Fastabend <john.fastabend(a)gmail.com>
net: tls, update curr on splice as well
Aditya Gupta <adityag(a)linux.ibm.com>
powerpc: update ppc_save_regs to save current r1 in pt_regs
Wenchao Chen <wenchao.chen(a)unisoc.com>
mmc: sdhci-sprd: Fix eMMC init failure after hw reset
Geert Uytterhoeven <geert+renesas(a)glider.be>
mmc: core: Cancel delayed work before releasing host
Jorge Ramirez-Ortiz <jorge(a)foundries.io>
mmc: rpmb: fixes pause retune on all RPMB partitions.
Ziyang Huang <hzyitc(a)outlook.com>
mmc: meson-mx-sdhc: Fix initialization frozen issue
Jiajun Xie <jiajun.xie.sh(a)gmail.com>
mm: fix unmap_mapping_range high bits shift bug
Benjamin Bara <benjamin.bara(a)skidata.com>
i2c: core: Fix atomic xfer check for non-preempt config
Jinghao Jia <jinghao7(a)illinois.edu>
x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
Matthew Wilcox (Oracle) <willy(a)infradead.org>
mm/memory-failure: check the mapcount of the precise page
Thomas Lange <thomas(a)corelatus.se>
net: Implement missing SO_TIMESTAMPING_NEW cmsg support
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
Chen Ni <nichen(a)iscas.ac.cn>
asix: Add check for usbnet_get_endpoints
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
net/qla3xxx: switch from 'pci_' to 'dma_' API
Andrii Staikov <andrii.staikov(a)intel.com>
i40e: Restore VF MSI-X state during PCI reset
Mark Brown <broonie(a)kernel.org>
ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux
Mark Brown <broonie(a)kernel.org>
ASoC: meson: g12a-toacodec: Fix event generation
Mark Brown <broonie(a)kernel.org>
ASoC: meson: g12a-tohdmitx: Validate written enum values
Mark Brown <broonie(a)kernel.org>
ASoC: meson: g12a-toacodec: Validate written enum values
Ke Xiao <xiaoke(a)sangfor.com.cn>
i40e: fix use-after-free in i40e_aqc_add_filters()
Marc Dionne <marc.dionne(a)auristor.com>
net: Save and restore msg_namelen in sock_sendmsg
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nft_immediate: drop chain reference counter on error
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nftables: add loop check helper function
Adrian Cinal <adriancinal(a)gmail.com>
net: bcmgenet: Fix FCS generation for fragmented skbuffs
Zhipeng Lu <alexious(a)zju.edu.cn>
sfc: fix a double-free bug in efx_probe_filters
Stefan Wahren <wahrenst(a)gmx.net>
ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init
Jörn-Thorben Hinz <jthinz(a)mailbox.tu-berlin.de>
net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)
Hangyu Hua <hbh25y(a)gmail.com>
net: sched: em_text: fix possible memory leak in em_text_destroy()
Sudheer Mogilappagari <sudheer.mogilappagari(a)intel.com>
i40e: Fix filter input checks to prevent config with invalid values
Khaled Almahallawy <khaled.almahallawy(a)intel.com>
drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern
Suman Ghosh <sumang(a)marvell.com>
octeontx2-af: Fix marking couple of structure as __packed
Siddh Raman Pant <code(a)siddh.me>
nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local
Siddhesh Dharme <siddheshdharme18(a)gmail.com>
ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6
Sarthak Kukreti <sarthakkukreti(a)chromium.org>
block: Don't invalidate pagecache for invalid falloc modes
Edward Adam Davis <eadavis(a)qq.com>
keys, dns: Fix missing size check of V1 server-list header
-------------
Diffstat:
Makefile | 4 +-
arch/arm/mach-sunxi/mc_smp.c | 4 +-
arch/powerpc/kernel/ppc_save_regs.S | 6 +-
arch/x86/kernel/kprobes/core.c | 3 +-
drivers/firewire/ohci.c | 51 ++++++
drivers/gpu/drm/i915/display/intel_dp.c | 2 +-
drivers/gpu/drm/qxl/qxl_drv.h | 2 +-
drivers/gpu/drm/qxl/qxl_dumb.c | 5 +-
drivers/gpu/drm/qxl/qxl_gem.c | 25 ++-
drivers/gpu/drm/qxl/qxl_ioctl.c | 6 +-
drivers/i2c/i2c-core.h | 4 +-
drivers/mmc/core/block.c | 7 +-
drivers/mmc/core/host.c | 1 +
drivers/mmc/host/meson-mx-sdhc-mmc.c | 26 +--
drivers/mmc/host/sdhci-sprd.c | 10 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +-
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 11 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 34 +++-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h | 3 +
drivers/net/ethernet/marvell/octeontx2/af/npc.h | 4 +-
drivers/net/ethernet/qlogic/qla3xxx.c | 198 +++++++++------------
drivers/net/ethernet/sfc/rx_common.c | 4 +-
drivers/net/usb/ax88172a.c | 4 +-
drivers/nvme/host/trace.h | 2 +-
drivers/pci/quirks.c | 28 ++-
fs/block_dev.c | 21 ++-
include/net/dst_ops.h | 2 +-
mm/memory-failure.c | 6 +-
mm/memory.c | 4 +-
net/core/dst.c | 8 +-
net/core/sock.c | 12 +-
net/dns_resolver/dns_key.c | 19 +-
net/ipv6/route.c | 13 +-
net/netfilter/nf_tables_api.c | 57 +++++-
net/netfilter/nft_immediate.c | 2 +-
net/nfc/llcp_core.c | 39 +++-
net/sched/em_text.c | 4 +-
net/socket.c | 2 +
net/tls/tls_sw.c | 2 +
sound/pci/hda/patch_realtek.c | 1 +
sound/soc/meson/g12a-toacodec.c | 5 +-
sound/soc/meson/g12a-tohdmitx.c | 8 +-
43 files changed, 429 insertions(+), 228 deletions(-)