This is the start of the stable review cycle for the 5.15.88 release.
There are 10 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 14 Jan 2023 13:53:18 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.88-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.88-rc1
Paolo Abeni <pabeni(a)redhat.com>
net/ulp: prevent ULP without clone op from entering the LISTEN status
Frederick Lawler <fred(a)cloudflare.com>
net: sched: disallow noqueue for qdisc classes
Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
serial: fixup backport of "serial: Deassert Transmit Enable on probe in driver-specific way"
Kyle Huey <me(a)kylehuey.com>
selftests/vm/pkeys: Add a regression test for setting PKRU through ptrace
Kyle Huey <me(a)kylehuey.com>
x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU bit is not set
Kyle Huey <me(a)kylehuey.com>
x86/fpu: Allow PKRU to be (once again) written by ptrace.
Kyle Huey <me(a)kylehuey.com>
x86/fpu: Add a pkru argument to copy_uabi_to_xstate()
Kyle Huey <me(a)kylehuey.com>
x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().
Kyle Huey <me(a)kylehuey.com>
x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()
Helge Deller <deller(a)gmx.de>
parisc: Align parisc MADV_XXX constants with all other architectures
-------------
Diffstat:
Makefile | 4 +-
arch/parisc/include/uapi/asm/mman.h | 27 +++---
arch/parisc/kernel/sys_parisc.c | 27 ++++++
arch/parisc/kernel/syscalls/syscall.tbl | 2 +-
arch/x86/include/asm/fpu/xstate.h | 4 +-
arch/x86/kernel/fpu/regset.c | 2 +-
arch/x86/kernel/fpu/signal.c | 2 +-
arch/x86/kernel/fpu/xstate.c | 41 ++++++++-
drivers/tty/serial/fsl_lpuart.c | 2 +-
drivers/tty/serial/serial_core.c | 3 +-
net/ipv4/inet_connection_sock.c | 16 +++-
net/ipv4/tcp_ulp.c | 4 +
net/sched/sch_api.c | 5 +
tools/arch/parisc/include/uapi/asm/mman.h | 12 +--
tools/perf/bench/bench.h | 12 ---
tools/testing/selftests/vm/pkey-x86.h | 12 +++
tools/testing/selftests/vm/protection_keys.c | 131 ++++++++++++++++++++++++++-
17 files changed, 257 insertions(+), 49 deletions(-)
From: Sean Christopherson <seanjc(a)google.com>
Since neither VMX nor SVM updates the control bits if exits are
disabled after vCPUs are created, only allow setting the exits
disable flag before vCPU creation.
Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT
intercepts")
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Kechen Lu <kechenl(a)nvidia.com>
Cc: stable(a)vger.kernel.org
---
Documentation/virt/kvm/api.rst | 1 +
arch/x86/kvm/x86.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9807b05a1b57..fb0fcc566d5a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7087,6 +7087,7 @@ branch to guests' 0x200 interrupt vector.
:Architectures: x86
:Parameters: args[0] defines which exits are disabled
:Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+ or if any vCPU has already been created
Valid bits in args[0] are::
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index da4bbd043a7b..c8ae9c4f9f08 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6227,6 +6227,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
break;
+ mutex_lock(&kvm->lock);
+ if (kvm->created_vcpus)
+ goto disable_exits_unlock;
+
if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
kvm_can_mwait_in_guest())
kvm->arch.mwait_in_guest = true;
@@ -6237,6 +6241,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
kvm->arch.cstate_in_guest = true;
r = 0;
+disable_exits_unlock:
+ mutex_unlock(&kvm->lock);
break;
case KVM_CAP_MSR_PLATFORM_INFO:
kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
--
2.34.1
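As a rough user-space illustration of the ordering the KVM patch above enforces (validate the requested bits, then refuse the request once any vCPU exists), here is a minimal sketch. Everything prefixed with model_ or MODEL_ is hypothetical and only stands in for the kernel structures; the real code does the created_vcpus check while holding kvm->lock, which this single-threaded model leaves out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM_X86_DISABLE_EXITS_* bits. */
#define MODEL_DISABLE_EXITS_MWAIT (UINT64_C(1) << 0)
#define MODEL_DISABLE_EXITS_HLT   (UINT64_C(1) << 1)
#define MODEL_DISABLE_VALID_EXITS \
	(MODEL_DISABLE_EXITS_MWAIT | MODEL_DISABLE_EXITS_HLT)

/* Hypothetical, much-reduced model of struct kvm. */
struct model_vm {
	unsigned int created_vcpus;
	bool mwait_in_guest;
	bool hlt_in_guest;
};

/* Returns 0 on success, -EINVAL for invalid bits or a late call. */
static int model_disable_exits(struct model_vm *vm, uint64_t exits)
{
	assert(vm != NULL);

	/* Reject bits outside the valid set, as before the patch. */
	if (exits & ~MODEL_DISABLE_VALID_EXITS)
		return -EINVAL;

	/* New with the patch: too late once a vCPU has been created. */
	if (vm->created_vcpus)
		return -EINVAL;

	if (exits & MODEL_DISABLE_EXITS_MWAIT)
		vm->mwait_in_guest = true;
	if (exits & MODEL_DISABLE_EXITS_HLT)
		vm->hlt_in_guest = true;
	return 0;
}
```

In this model, a call made before any vCPU exists succeeds and sets the flags, while the same call made afterwards returns -EINVAL and leaves the flags untouched.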
After the qmp phy driver was split, it looks like 5.15.y stable kernels
aren't getting fixes like commit 7a7d86d14d07 ("phy: qcom-qmp-combo: fix
broken power on"), which is tagged for stable 5.10. Trogdor boards use
the qmp phy on 5.15.y kernels, so I backported the fixes I could find
that looked like ones we might trip over at some point.
USB and DP work on my Trogdor.Lazor board with this set.
Johan Hovold (4):
phy: qcom-qmp-combo: disable runtime PM on unbind
phy: qcom-qmp-combo: fix memleak on probe deferral
phy: qcom-qmp-combo: fix broken power on
phy: qcom-qmp-combo: fix runtime suspend
drivers/phy/qualcomm/phy-qcom-qmp.c | 72 ++++++++++++++---------------
1 file changed, 36 insertions(+), 36 deletions(-)
Cc: Johan Hovold <johan+linaro(a)kernel.org>
Cc: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Cc: Vinod Koul <vkoul(a)kernel.org>
base-commit: d57287729e229188e7d07ef0117fe927664e08cb
--
https://chromeos.dev
Hi!
> Results from Linaro’s test farm.
> Regressions on arm64 Raspberry Pi 4 Model B.
>
> Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
>
> While running the LTP controllers cgroup_fj_stress_blkio test cases,
> "Insufficient stack space to handle exception!" occurred, followed
> by a kernel panic, on arm64 Raspberry Pi 4 Model B with a
> clang-15 built kernel Image.
>
> The full boot and test log is attached to this email, and the build
> and Kconfig links are provided at the bottom of this email.
The full log is 11MB. That's rather... big for an email. Please post such
stuff as a link or at least compress it...
Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
From: Clement Lecigne <clecigne(a)google.com>
[ Note: this is a fix that works around the bug in a way equivalent to
the two upstream commits:
1fa4445f9adf ("ALSA: control - introduce snd_ctl_notify_one() helper")
56b88b50565c ("ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF")
but in a simpler way to fit with older stable trees -- tiwai ]
Add missing locking in ctl_elem_read_user/ctl_elem_write_user, whose
absence can be easily triggered and turned into a use-after-free.
Example code paths with SNDRV_CTL_IOCTL_ELEM_READ:
64-bits:
snd_ctl_ioctl
snd_ctl_elem_read_user
[takes controls_rwsem]
snd_ctl_elem_read [lock properly held, all good]
[drops controls_rwsem]
32-bits (compat):
snd_ctl_ioctl_compat
snd_ctl_elem_write_read_compat
ctl_elem_write_read
snd_ctl_elem_read [missing lock, not good]
CVE-2023-0266 was assigned for this issue.
Signed-off-by: Clement Lecigne <clecigne(a)google.com>
Cc: stable(a)kernel.org # 5.12 and older
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
---
Greg, this is a patch for the last ALSA PCM UCM fix for the older
stable trees. Please take this to 5.10.y and older stable trees.
Thanks!
sound/core/control_compat.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/sound/core/control_compat.c b/sound/core/control_compat.c
index 97467f6a32a1..980ab3580f1b 100644
--- a/sound/core/control_compat.c
+++ b/sound/core/control_compat.c
@@ -304,7 +304,9 @@ static int ctl_elem_read_user(struct snd_card *card,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_read(&card->controls_rwsem);
err = snd_ctl_elem_read(card, data);
+ up_read(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
@@ -332,7 +334,9 @@ static int ctl_elem_write_user(struct snd_ctl_file *file,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_write(&card->controls_rwsem);
err = snd_ctl_elem_write(card, file, data);
+ up_write(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
--
2.35.3
On Tue, Jan 03, 2023 at 11:58:48AM +0100, Ard Biesheuvel wrote:
> On Tue, 3 Jan 2023 at 03:13, Linus Torvalds
> <torvalds(a)linux-foundation.org> wrote:
> >
> > On Mon, Jan 2, 2023 at 5:45 PM Guenter Roeck <linux(a)roeck-us.net> wrote:
> > >
> > > ... and reverting commit 99cb0d917ff indeed fixes the problem.
> >
> > Hmm. My gut feel is that this just exposes some bug in binutils.
> >
> > That said, maybe that commit should not have added its own /DISCARDS/
> > thing, and instead just added that "*(.note.GNU-stack)" to the general
> > /DISCARDS/ thing that is defined by the
> >
> > #define DISCARDS ..
> >
> > a little bit later, so that we only end up with one single DISCARD
> > list. Something like this (broken patch on purpose):
> >
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -897,5 +897,4 @@
> > */
> > #define NOTES \
> > - /DISCARD/ : { *(.note.GNU-stack) } \
> > .notes : AT(ADDR(.notes) - LOAD_OFFSET) { \
> > BOUNDED_SECTION_BY(.note.*, _notes) \
> > @@ -1016,4 +1015,5 @@
> > #define DISCARDS \
> > /DISCARD/ : { \
> > + *(.note.GNU-stack) \
> > EXIT_DISCARDS \
> > EXIT_CALL \
> >
> > But maybe that DISCARDS macro ends up being used too late?
> >
>
> Masahiro's v1 did something like this, and it caused an issue on
> RISC-V, which is why we ended up with this approach instead.
>
> > It really shouldn't matter, but here we are, with a build problem with
> > some random old binutils on an odd platform..
> >
>
> AIUI, the way ld.bfd used to combine output sections may also affect
> the /DISCARD/ pseudo-section, and so introducing it much earlier
> results in these discards being interpreted in a different order.
>
> The purpose of this change is to prevent .note.GNU-stack from deciding
> the section type of the .notes output section, and so keeping it in
> its own section should be sufficient. E.g.,
>
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -896,7 +896,7 @@
> * Otherwise, the type of .notes section would become PROGBITS
> instead of NOTES.
> */
> #define NOTES \
> - /DISCARD/ : { *(.note.GNU-stack) } \
> + .note.GNU-stack : { *(.note.GNU-stack) } \
> .notes : AT(ADDR(.notes) - LOAD_OFFSET) { \
> BOUNDED_SECTION_BY(.note.*, _notes) \
> } NOTES_HEADERS \
>
> The .note.GNU-stack has zero size, so the result should be the same.
+Greg +Nick
This also fixes Build ID on arm64 for stable 5.15, 5.10, and 5.4
which has been broken since backport of:
0d362be5b142 ("Makefile: link with -z noexecstack --no-warn-rwx-segments")
Discussed here:
https://lore.kernel.org/stable/3df32572ec7016e783d37e185f88495831671f5d.167…
https://lore.kernel.org/stable/cover.1670358255.git.tom.saeger@oracle.com/
Perhaps add:
Cc: <stable(a)vger.kernel.org> # 5.15, 5.10, 5.4
for stable 5.15, 5.10, 5.4
Tested-by: Tom Saeger <tom.saeger(a)oracle.com>
From: Yunfei Wang <yf.wang(a)mediatek.com>
In __alloc_and_insert_iova_range(), retry_pfn can overflow. The value
of iovad->anchor.pfn_hi is ~0UL, so when iovad->cached_node is
iovad->anchor, curr_iova->pfn_hi + 1 wraps around to 0. As a result,
if the retry logic is executed, low_pfn is updated to 0, and then
new_pfn < low_pfn evaluates to false, so the allocation wrongly
succeeds.
This issue occurs in the following two situations:
1. The first iova size exceeds the domain size. When initializing
iova domain, iovad->cached_node is assigned as iovad->anchor. For
example, the iova domain size is 10M, start_pfn is 0x1_F000_0000,
and the iova size allocated for the first time is 11M. The
following is the log information, new->pfn_lo is smaller than
iovad->cached_node.
Example log as follows:
[ 223.798112][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range
start_pfn:0x1f0000,retry_pfn:0x0,size:0xb00,limit_pfn:0x1f0a00
[ 223.799590][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range
success start_pfn:0x1f0000,new->pfn_lo:0x1efe00,new->pfn_hi:0x1f08ff
2. When the node with the largest iova->pfn_lo value in the iova domain
is deleted, iovad->cached_node is updated to iovad->anchor, and the
next allocation then requests a size that exceeds the maximum iova
size that can be allocated in the domain.
Fix the overflow by adding 1 to retry_pfn only after checking that
retry_pfn is less than limit_pfn.
Signed-off-by: jianjiao zeng <jianjiao.zeng(a)mediatek.com>
Signed-off-by: Yunfei Wang <yf.wang(a)mediatek.com>
Cc: <stable(a)vger.kernel.org> # 5.15.*
---
v2: Update patch
1. Cc stable(a)vger.kernel.org
This patch needs to be merged into the stable branch, so
add stable(a)vger.kernel.org to the mail list.
2. Update the patch per Robin's suggestion.
---
drivers/iommu/iova.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index a44ad92fc5eb..fe452ce46642 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -197,7 +197,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
curr = __get_cached_rbnode(iovad, limit_pfn);
curr_iova = to_iova(curr);
- retry_pfn = curr_iova->pfn_hi + 1;
+ retry_pfn = curr_iova->pfn_hi;
retry:
do {
@@ -211,7 +211,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
if (high_pfn < size || new_pfn < low_pfn) {
if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
high_pfn = limit_pfn;
- low_pfn = retry_pfn;
+ low_pfn = retry_pfn + 1;
curr = iova_find_limit(iovad, limit_pfn);
curr_iova = to_iova(curr);
goto retry;
--
2.18.0
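The wrap-around described in the iova patch above can be reproduced in plain C, since unsigned arithmetic wraps modulo 2^N. The sketch below is a user-space model only: the function names are hypothetical, the low_pfn == iovad->start_pfn half of the kernel's condition is omitted, and the PFN values are taken from the example log in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Pre-patch ordering: add 1 first, then compare against limit_pfn.
 * For the anchor node (pfn_hi == ~0UL) the sum wraps to 0, so the
 * comparison passes and low_pfn becomes 0.
 */
static bool model_retry_old(unsigned long cached_pfn_hi,
			    unsigned long limit_pfn,
			    unsigned long *low_pfn)
{
	unsigned long retry_pfn = cached_pfn_hi + 1;

	assert(low_pfn != NULL);
	if (retry_pfn < limit_pfn) {
		*low_pfn = retry_pfn;
		return true;
	}
	return false;
}

/*
 * Post-patch ordering: compare the unmodified pfn_hi first and add 1
 * only once it is known to be below limit_pfn, so no wrap can occur.
 */
static bool model_retry_new(unsigned long cached_pfn_hi,
			    unsigned long limit_pfn,
			    unsigned long *low_pfn)
{
	unsigned long retry_pfn = cached_pfn_hi;

	assert(low_pfn != NULL);
	if (retry_pfn < limit_pfn) {
		*low_pfn = retry_pfn + 1;
		return true;
	}
	return false;
}
```

With cached_pfn_hi set to ~0UL and limit_pfn set to 0x1f0a00, the old ordering allows a retry with low_pfn equal to 0, while the new ordering refuses the retry. For an ordinary cached node below the limit, both orderings produce the same low_pfn.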
From: ChiYuan Huang <cy_huang(a)richtek.com>
There is an altmode re-registration issue after data role
swap (DR_SWAP).
Compared to USBPD 2.0, USBPD 3.0 relaxes the restriction that only the
DFP can initiate the VDM command to get partner identity information.
A USBPD 3.0 UFP device may already have obtained the identity
information from its port partner before DR_SWAP. If a DR_SWAP is sent
or received in the meantime, the 'send_discover' flag is raised again.
This restarts the discover identity action when entering the ready
state. After all discover actions are done, 'tcpm_register_altmodes'
is called. If the old altmodes have not been unregistered, the sysfs
creation fails.
In the 'DR_SWAP_CHANGE_DR' state, only the DFP unregisters altmodes.
For a UFP, the original altmodes stay registered.
This patch fixes the logic so that after DR_SWAP,
'tcpm_unregister_altmodes' is called whatever the current data role is.
Reviewed-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Fixes: ae8a2ca8a221 ("usb: typec: Group all TCPCI/TCPM code together")
Reported-by: TommyYl Chen <tommyyl.chen(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: ChiYuan Huang <cy_huang(a)richtek.com>
---
Since v2:
- Correct the mail sent from Richtek.
- Add 'Reviewed-by' tag.
Hi, Greg:
Please check this one. I have strongly requested our MIS to remove the confidential string.
ChiYuan Huang.
---
drivers/usb/typec/tcpm/tcpm.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 904c7b4..59b366b 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -4594,14 +4594,13 @@ static void run_state_machine(struct tcpm_port *port)
tcpm_set_state(port, ready_state(port), 0);
break;
case DR_SWAP_CHANGE_DR:
- if (port->data_role == TYPEC_HOST) {
- tcpm_unregister_altmodes(port);
+ tcpm_unregister_altmodes(port);
+ if (port->data_role == TYPEC_HOST)
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_DEVICE);
- } else {
+ else
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_HOST);
- }
tcpm_ams_finish(port);
tcpm_set_state(port, ready_state(port), 0);
break;
--
2.7.4
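To make the tcpm change above concrete, here is a small user-space model of the DR_SWAP_CHANGE_DR handling after the patch: altmodes are dropped unconditionally, then the data role flips. All names are hypothetical stand-ins for the tcpm structures, and the -1 return only models the failing sysfs creation mentioned in the commit message; it is not the kernel's error code.

```c
#include <assert.h>
#include <stddef.h>

enum model_data_role { MODEL_ROLE_DEVICE, MODEL_ROLE_HOST };

/* Hypothetical, much-reduced model of struct tcpm_port. */
struct model_port {
	enum model_data_role data_role;
	int altmodes_registered;
};

static void model_unregister_altmodes(struct model_port *port)
{
	port->altmodes_registered = 0;
}

/* Fails if stale altmodes are still registered. */
static int model_register_altmodes(struct model_port *port, int count)
{
	if (port->altmodes_registered)
		return -1;
	port->altmodes_registered = count;
	return 0;
}

/* Post-patch behaviour: unregister for both roles, then swap. */
static void model_dr_swap_change_dr(struct model_port *port)
{
	assert(port != NULL);

	model_unregister_altmodes(port);
	if (port->data_role == MODEL_ROLE_HOST)
		port->data_role = MODEL_ROLE_DEVICE;
	else
		port->data_role = MODEL_ROLE_HOST;
}
```

In this model, a port that starts as a UFP with altmodes registered can register them again after the swap, because the swap always clears the old set first.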
Hello!
This is an experimental semi-automated report about issues detected by
Coverity from a scan of next-20230111 as part of the linux-next scan project:
https://scan.coverity.com/projects/linux-next-weekly-scan
You're getting this email because you were associated with the identified
lines of code (noted below) that were touched by commits:
Mon Jan 9 16:05:21 2023 +0100
291e9da91403 ("ALSA: usb-audio: Always initialize fixed_rate in snd_usb_find_implicit_fb_sync_format()")
Coverity reported the following:
*** CID 1530547: Null pointer dereferences (REVERSE_INULL)
sound/usb/pcm.c:166 in snd_usb_pcm_has_fixed_rate()
160 bool snd_usb_pcm_has_fixed_rate(struct snd_usb_substream *subs)
161 {
162 const struct audioformat *fp;
163 struct snd_usb_audio *chip = subs->stream->chip;
164 int rate = -1;
165
vvv CID 1530547: Null pointer dereferences (REVERSE_INULL)
vvv Null-checking "subs" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
166 if (!subs)
167 return false;
168 if (!(chip->quirk_flags & QUIRK_FLAG_FIXED_RATE))
169 return false;
170 list_for_each_entry(fp, &subs->fmt_list, list) {
171 if (fp->rates & SNDRV_PCM_RATE_CONTINUOUS)
If this is a false positive, please let us know so we can mark it as
such, or teach the Coverity rules to be smarter. If not, please make
sure fixes get into linux-next. :) For patches fixing this, please
include these lines (but double-check the "Fixes" first):
Reported-by: coverity-bot <keescook+coverity-bot(a)chromium.org>
Addresses-Coverity-ID: 1530547 ("Null pointer dereferences")
Fixes: 291e9da91403 ("ALSA: usb-audio: Always initialize fixed_rate in snd_usb_find_implicit_fb_sync_format()")
Thanks for your attention!
--
Coverity-bot
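For readers unfamiliar with the REVERSE_INULL class reported above, the sketch below shows one way such a finding is commonly resolved: move the dereference after the NULL check. It uses hypothetical model_ types rather than the real ALSA structures, and it is not a proposed fix; if subs can never be NULL at this call site, dropping the check would be the other option, and that decision belongs to the maintainers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QUIRK_FLAG_FIXED_RATE. */
#define MODEL_QUIRK_FIXED_RATE (1u << 0)

/* Hypothetical, reduced models of the chip/stream/substream chain. */
struct model_chip { unsigned int quirk_flags; };
struct model_stream { struct model_chip *chip; };
struct model_substream { struct model_stream *stream; };

static bool model_has_fixed_rate_quirk(const struct model_substream *subs)
{
	const struct model_chip *chip;

	/* Check first: nothing has touched subs before this point. */
	if (!subs)
		return false;

	/* Only now follow the pointer chain. */
	chip = subs->stream->chip;
	return (chip->quirk_flags & MODEL_QUIRK_FIXED_RATE) != 0;
}
```

Called with NULL, the function returns false without dereferencing anything; called with a populated chain, it reports whether the quirk bit is set.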
Set the framebuffer info for drivers that support VGA switcheroo. Only
affects the amdgpu and nouveau drivers, which use VGA switcheroo and
generic fbdev emulation. For other drivers, this does nothing.
This fixes a potential regression in the console code. Both amdgpu and
nouveau invoked vga_switcheroo_client_fb_set() from their internal fbdev
code, but the call got lost when the drivers switched to the generic
emulation.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers")
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Evan Quan <evan.quan(a)amd.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Hawking Zhang <Hawking.Zhang(a)amd.com>
Cc: Likun Gao <Likun.Gao(a)amd.com>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: Stanley Yang <Stanley.Yang(a)amd.com>
Cc: "Tianci.Yin" <tianci.yin(a)amd.com>
Cc: Xiaojian Du <Xiaojian.Du(a)amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Cc: YiPeng Chai <YiPeng.Chai(a)amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram(a)amd.com>
Cc: Bokun Zhang <Bokun.Zhang(a)amd.com>
Cc: Guchun Chen <guchun.chen(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Solomon Chiu <solomon.chiu(a)amd.com>
Cc: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: "Marek Olšák" <marek.olsak(a)amd.com>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: nouveau(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.17+
---
drivers/gpu/drm/drm_fb_helper.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 427631706128..5e445c61252d 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -30,7 +30,9 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/console.h>
+#include <linux/pci.h>
#include <linux/sysrq.h>
+#include <linux/vga_switcheroo.h>
#include <drm/drm_atomic.h>
#include <drm/drm_drv.h>
@@ -1940,6 +1942,7 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
int preferred_bpp)
{
struct drm_client_dev *client = &fb_helper->client;
+ struct drm_device *dev = fb_helper->dev;
struct drm_fb_helper_surface_size sizes;
int ret;
@@ -1961,6 +1964,11 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
return ret;
strcpy(fb_helper->fb->comm, "[fbcon]");
+
+ /* Set the fb info for vgaswitcheroo clients. Does nothing otherwise. */
+ if (dev_is_pci(dev->dev))
+ vga_switcheroo_client_fb_set(to_pci_dev(dev->dev), fb_helper->info);
+
return 0;
}
--
2.39.0
Always allow switching away via vga-switcheroo if the display is
uninitialized. Instead prevent switching to i915 if the device has
not been initialized.
This issue was introduced by commit 5df7bd130818 ("drm/i915: skip
display initialization when there is no display"), which
protects code paths from being executed on uninitialized devices.
In the case of vga-switcheroo, we want to allow a switch away from
i915's device. So run vga_switcheroo_process_delayed_switch() and
test in the switcheroo callbacks if the i915 device is available.
Fixes: 5df7bd130818 ("drm/i915: skip display initialization when there is no display")
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Radhakrishna Sripada <radhakrishna.sripada(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
Cc: Manasi Navare <manasi.d.navare(a)intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy(a)intel.com>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: "Jouni Högander" <jouni.hogander(a)intel.com>
Cc: Uma Shankar <uma.shankar(a)intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal(a)intel.com>
Cc: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Ramalingam C <ramalingam.c(a)intel.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Andi Shyti <andi.shyti(a)linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda(a)intel.com>
Cc: "José Roberto de Souza" <jose.souza(a)intel.com>
Cc: Julia Lawall <Julia.Lawall(a)inria.fr>
Cc: intel-gfx(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.14+
---
drivers/gpu/drm/i915/i915_driver.c | 3 +--
drivers/gpu/drm/i915/i915_switcheroo.c | 6 +++++-
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index c1e427ba57ae..33e231b120c1 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1075,8 +1075,7 @@ static void i915_driver_lastclose(struct drm_device *dev)
intel_fbdev_restore_mode(dev);
- if (HAS_DISPLAY(i915))
- vga_switcheroo_process_delayed_switch();
+ vga_switcheroo_process_delayed_switch();
}
static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
diff --git a/drivers/gpu/drm/i915/i915_switcheroo.c b/drivers/gpu/drm/i915/i915_switcheroo.c
index 23777d500cdf..f45bd6b6cede 100644
--- a/drivers/gpu/drm/i915/i915_switcheroo.c
+++ b/drivers/gpu/drm/i915/i915_switcheroo.c
@@ -19,6 +19,10 @@ static void i915_switcheroo_set_state(struct pci_dev *pdev,
dev_err(&pdev->dev, "DRM not initialized, aborting switch.\n");
return;
}
+ if (!HAS_DISPLAY(i915)) {
+ dev_err(&pdev->dev, "Device state not initialized, aborting switch.\n");
+ return;
+ }
if (state == VGA_SWITCHEROO_ON) {
drm_info(&i915->drm, "switched on\n");
@@ -44,7 +48,7 @@ static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
* locking inversion with the driver load path. And the access here is
* completely racy anyway. So don't bother with locking for now.
*/
- return i915 && atomic_read(&i915->drm.open_count) == 0;
+ return i915 && HAS_DISPLAY(i915) && atomic_read(&i915->drm.open_count) == 0;
}
static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
--
2.39.0
When wait_event_interruptible() has been interrupted by a signal, the
tx.state value might not be ISOTP_IDLE. Force the state machines
into idle state to prevent the timer handlers from continuing to work.
Cc: stable(a)vger.kernel.org # >= v5.15
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
---
V2: fixed checkpatch warnings m(
net/can/isotp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 0476a506d4a4..fc81d77724a1 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1150,10 +1150,14 @@ static int isotp_release(struct socket *sock)
net = sock_net(sk);
/* wait for complete transmission of current pdu */
wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
+ /* force state machines to be idle also when a signal occurred */
+ so->tx.state = ISOTP_IDLE;
+ so->rx.state = ISOTP_IDLE;
+
spin_lock(&isotp_notifier_lock);
while (isotp_busy_notifier == so) {
spin_unlock(&isotp_notifier_lock);
schedule_timeout_uninterruptible(1);
spin_lock(&isotp_notifier_lock);
--
2.30.2
When wait_event_interruptible() has been interrupted by a signal, the
tx.state value might not be ISOTP_IDLE. Force the state machines
into idle state to prevent the timer handlers from continuing to work.
Fixes: 866337865f37 ("can: isotp: fix tx state handling for echo tx
processing")
Cc: stable(a)vger.kernel.org
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
---
V2: fixed checkpatch warnings m(
V3: added 'Fixes:' tag
V4: change 'Fixes:' tag to reduce WARN(1) possibility
net/can/isotp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 0476a506d4a4..fc81d77724a1 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1150,10 +1150,14 @@ static int isotp_release(struct socket *sock)
net = sock_net(sk);
/* wait for complete transmission of current pdu */
wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
+ /* force state machines to be idle also when a signal occurred */
+ so->tx.state = ISOTP_IDLE;
+ so->rx.state = ISOTP_IDLE;
+
spin_lock(&isotp_notifier_lock);
while (isotp_busy_notifier == so) {
spin_unlock(&isotp_notifier_lock);
schedule_timeout_uninterruptible(1);
spin_lock(&isotp_notifier_lock);
--
2.30.2
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We don't have proper sub-pixel coordinate support (some platforms
simply can't do it, for others we've not implemented it). This
can cause us to treat a < 1 pixel source width/height as zero
which is not valid for the hardware, and can also cause a div
by zero in some cases.
Refuse < 1x1 plane source size to avoid these problems.
Cc: stable(a)vger.kernel.org
Cc: Juha-Pekka Heikkila <juhapekka.heikkila(a)gmail.com>
Reported-by: Drew Davenport <ddavenport(a)chromium.org>
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
The other option would be to clamp the source size to >=1x1 pixels,
but dunno if that has any real benefits.
drivers/gpu/drm/i915/display/intel_atomic_plane.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_atomic_plane.c b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
index 10e1fc9d0698..c6e43d684458 100644
--- a/drivers/gpu/drm/i915/display/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
@@ -921,6 +921,21 @@ int intel_atomic_plane_check_clipping(struct intel_plane_state *plane_state,
*/
plane_state->uapi.visible = drm_rect_clip_scaled(src, dst, clip);
+ /*
+ * Avoid zero source size when we later
+ * discard the fractional coords.
+ *
+ * FIXME add proper sub-pixel coordinate handling
+ * for platforms/planes that support it.
+ */
+ if (plane_state->uapi.visible &&
+ (drm_rect_width(src) < 0x10000 || drm_rect_height(src) < 0x10000)) {
+ drm_dbg_kms(&i915->drm, "Plane source must be at least 1x1 pixels\n");
+ drm_rect_debug_print("src: ", src, true);
+ drm_rect_debug_print("dst: ", dst, false);
+ return -EINVAL;
+ }
+
drm_rect_rotate_inv(src, fb->width << 16, fb->height << 16, rotation);
if (!can_position && plane_state->uapi.visible &&
--
2.38.2
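The 0x10000 threshold in the i915 patch above follows from the 16.16 fixed-point format of plane source coordinates, where 1 << 16 represents one whole pixel. The sketch below models that in user space; struct model_rect and the helper names are hypothetical stand-ins for struct drm_rect and its accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a rectangle in 16.16 fixed point. */
struct model_rect {
	int32_t x1, y1, x2, y2;
};

static int32_t model_rect_width(const struct model_rect *r)
{
	return r->x2 - r->x1;
}

static int32_t model_rect_height(const struct model_rect *r)
{
	return r->y2 - r->y1;
}

/* Mirrors the patch's test: less than one whole pixel either way. */
static bool model_src_too_small(const struct model_rect *src)
{
	assert(src != NULL);
	return model_rect_width(src) < 0x10000 ||
	       model_rect_height(src) < 0x10000;
}

/* Discards the fractional 16 bits of a non-negative 16.16 value. */
static int32_t model_whole_pixels(int32_t fixed)
{
	assert(fixed >= 0);
	return fixed >> 16;
}
```

A source rectangle half a pixel wide (0x8000) truncates to a width of zero whole pixels, which is the case the patch rejects; a width of exactly 0x10000 truncates to one pixel and passes.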
Requesting an interrupt with IRQF_ONESHOT will run the primary handler
in the hard-IRQ context even in the force-threaded mode. The
force-threaded mode is used by PREEMPT_RT in order to avoid acquiring
sleeping locks (spinlock_t) in hard-IRQ context. This combination
defeats that purpose and leads to "sleeping while atomic" warnings.
Use one interrupt handler for both handlers (primary and secondary)
and drop the IRQF_ONESHOT flag which is not needed.
Fixes: e359b4411c283 ("serial: stm32: fix threaded interrupt handling")
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Tested-by: Valentin Caron <valentin.caron(a)foss.st.com> # V3
Signed-off-by: Marek Vasut <marex(a)denx.de>
Cc: stable(a)vger.kernel.org
---
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: Erwan Le Ray <erwan.leray(a)foss.st.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Maxime Coquelin <mcoquelin.stm32(a)gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Valentin Caron <valentin.caron(a)foss.st.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
To: linux-serial(a)vger.kernel.org
---
V2: - Update patch subject, was:
serial: stm32: Move hard IRQ handling to threaded interrupt context
- Use request_irq() instead, rename the IRQ handler function
V3: - Update the commit message per suggestion from Sebastian
- Add RB from Sebastian
- Add Fixes tag
V4: - Remove uart_console() deadlock check from
stm32_usart_of_dma_rx_probe()
- Use plain spin_lock()/spin_unlock() instead of the
_irqsave/_irqrestore variants in IRQ handler
- Add TB from Valentin
V5: - Add CC stable@
- Do not move the sr variable, removes one useless hunk from the patch
---
drivers/tty/serial/stm32-usart.c | 31 ++++---------------------------
1 file changed, 4 insertions(+), 27 deletions(-)
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index a1490033aa164..1e24bee2b0ef7 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -797,23 +797,9 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
spin_unlock(&port->lock);
}
- if (stm32_usart_rx_dma_enabled(port))
- return IRQ_WAKE_THREAD;
- else
- return IRQ_HANDLED;
-}
-
-static irqreturn_t stm32_usart_threaded_interrupt(int irq, void *ptr)
-{
- struct uart_port *port = ptr;
- struct tty_port *tport = &port->state->port;
- struct stm32_port *stm32_port = to_stm32_port(port);
- unsigned int size;
- unsigned long flags;
-
/* Receiver timeout irq for DMA RX */
- if (!stm32_port->throttled) {
- spin_lock_irqsave(&port->lock, flags);
+ if (stm32_usart_rx_dma_enabled(port) && !stm32_port->throttled) {
+ spin_lock(&port->lock);
size = stm32_usart_receive_chars(port, false);
uart_unlock_and_check_sysrq_irqrestore(port, flags);
if (size)
@@ -1015,10 +1001,8 @@ static int stm32_usart_startup(struct uart_port *port)
u32 val;
int ret;
- ret = request_threaded_irq(port->irq, stm32_usart_interrupt,
- stm32_usart_threaded_interrupt,
- IRQF_ONESHOT | IRQF_NO_SUSPEND,
- name, port);
+ ret = request_irq(port->irq, stm32_usart_interrupt,
+ IRQF_NO_SUSPEND, name, port);
if (ret)
return ret;
@@ -1601,13 +1585,6 @@ static int stm32_usart_of_dma_rx_probe(struct stm32_port *stm32port,
struct dma_slave_config config;
int ret;
- /*
- * Using DMA and threaded handler for the console could lead to
- * deadlocks.
- */
- if (uart_console(port))
- return -ENODEV;
-
stm32port->rx_buf = dma_alloc_coherent(dev, RX_BUF_L,
&stm32port->rx_dma_buf,
GFP_KERNEL);
--
2.39.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status")
e276d62dcfde ("net/ulp: remove SOCK_SUPPORT_ZC from tls sockets")
e7049395b1c3 ("dccp/tcp: Remove an unused argument in inet_csk_listen_start().")
53632e111946 ("bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02d41d71f90a5168391b6a5f2954112ba2307c Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Tue, 3 Jan 2023 12:19:17 +0100
Subject: [PATCH] net/ulp: prevent ULP without clone op from entering the
LISTEN status
When an ULP-enabled socket enters the LISTEN status, the listener ULP data
pointer is copied inside the child/accepted sockets by sk_clone_lock().
The relevant ULP can take care of de-duplicating the context pointer via
the clone() operation, but only MPTCP and SMC implement such op.
Other ULPs may end up with a double-free at socket disposal time.
We can't simply clear the ULP data at clone time, as TLS replaces the
socket ops with custom ones assuming a valid TLS ULP context is
available.
Instead completely prevent clone-less ULP sockets from entering the
LISTEN status.
Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Reported-by: slipper <slipper.alive(a)gmail.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.16727406…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
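The rule described above can be modelled in userspace C. This is only a sketch, not kernel code: `struct ulp_ops_model`, `struct sock_model` and `ulp_can_listen()` are hypothetical stand-ins for the ULP ops table, the socket, and the `inet_ulp_can_listen()` helper added by the patch. It shows that a socket with a ULP attached may enter LISTEN only when that ULP provides a clone op.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the ULP ops table: only the clone op matters here. */
struct ulp_ops_model {
	void (*clone)(void *child_ctx);
};

/* Hypothetical stand-in for the socket: NULL ops means no ULP is attached. */
struct sock_model {
	const struct ulp_ops_model *ulp_ops;
};

/* Placeholder clone op, used only so that a "clone-capable" ULP can be built. */
void dummy_clone(void *child_ctx)
{
	(void)child_ctx;
}

/* Same shape as the check in the patch: refuse LISTEN if a ULP lacks clone(). */
int ulp_can_listen(const struct sock_model *sk)
{
	assert(sk != NULL);

	if (sk->ulp_ops && !sk->ulp_ops->clone)
		return -EINVAL;

	return 0;
}
```

A socket without any ULP, or with a ULP that has a clone op, gets 0; a socket whose ULP has no clone op gets -EINVAL, which is the case the patch rejects.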
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 848ffc3e0239..d1f837579398 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1200,12 +1200,26 @@ void inet_csk_prepare_forced_close(struct sock *sk)
}
EXPORT_SYMBOL(inet_csk_prepare_forced_close);
+static int inet_ulp_can_listen(const struct sock *sk)
+{
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+ if (icsk->icsk_ulp_ops && !icsk->icsk_ulp_ops->clone)
+ return -EINVAL;
+
+ return 0;
+}
+
int inet_csk_listen_start(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct inet_sock *inet = inet_sk(sk);
int err;
+ err = inet_ulp_can_listen(sk);
+ if (unlikely(err))
+ return err;
+
reqsk_queue_alloc(&icsk->icsk_accept_queue);
sk->sk_ack_backlog = 0;
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 9ae50b1bd844..05b6077b9f2c 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -139,6 +139,10 @@ static int __tcp_set_ulp(struct sock *sk, const struct tcp_ulp_ops *ulp_ops)
if (sk->sk_socket)
clear_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
+ err = -EINVAL;
+ if (!ulp_ops->clone && sk->sk_state == TCP_LISTEN)
+ goto out_err;
+
err = ulp_ops->init(sk);
if (err)
goto out_err;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status")
e276d62dcfde ("net/ulp: remove SOCK_SUPPORT_ZC from tls sockets")
e7049395b1c3 ("dccp/tcp: Remove an unused argument in inet_csk_listen_start().")
53632e111946 ("bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP")
8b27dae5a2e8 ("tcp: add one skb cache for rx")
a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02d41d71f90a5168391b6a5f2954112ba2307c Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Tue, 3 Jan 2023 12:19:17 +0100
Subject: [PATCH] net/ulp: prevent ULP without clone op from entering the
LISTEN status
When an ULP-enabled socket enters the LISTEN status, the listener ULP data
pointer is copied inside the child/accepted sockets by sk_clone_lock().
The relevant ULP can take care of de-duplicating the context pointer via
the clone() operation, but only MPTCP and SMC implement such op.
Other ULPs may end up with a double-free at socket disposal time.
We can't simply clear the ULP data at clone time, as TLS replaces the
socket ops with custom ones assuming a valid TLS ULP context is
available.
Instead completely prevent clone-less ULP sockets from entering the
LISTEN status.
Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Reported-by: slipper <slipper.alive(a)gmail.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.16727406…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 848ffc3e0239..d1f837579398 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1200,12 +1200,26 @@ void inet_csk_prepare_forced_close(struct sock *sk)
}
EXPORT_SYMBOL(inet_csk_prepare_forced_close);
+static int inet_ulp_can_listen(const struct sock *sk)
+{
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+ if (icsk->icsk_ulp_ops && !icsk->icsk_ulp_ops->clone)
+ return -EINVAL;
+
+ return 0;
+}
+
int inet_csk_listen_start(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct inet_sock *inet = inet_sk(sk);
int err;
+ err = inet_ulp_can_listen(sk);
+ if (unlikely(err))
+ return err;
+
reqsk_queue_alloc(&icsk->icsk_accept_queue);
sk->sk_ack_backlog = 0;
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 9ae50b1bd844..05b6077b9f2c 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -139,6 +139,10 @@ static int __tcp_set_ulp(struct sock *sk, const struct tcp_ulp_ops *ulp_ops)
if (sk->sk_socket)
clear_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
+ err = -EINVAL;
+ if (!ulp_ops->clone && sk->sk_state == TCP_LISTEN)
+ goto out_err;
+
err = ulp_ops->init(sk);
if (err)
goto out_err;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status")
e276d62dcfde ("net/ulp: remove SOCK_SUPPORT_ZC from tls sockets")
e7049395b1c3 ("dccp/tcp: Remove an unused argument in inet_csk_listen_start().")
53632e111946 ("bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP")
8b27dae5a2e8 ("tcp: add one skb cache for rx")
a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2c02d41d71f90a5168391b6a5f2954112ba2307c Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Tue, 3 Jan 2023 12:19:17 +0100
Subject: [PATCH] net/ulp: prevent ULP without clone op from entering the
LISTEN status
When an ULP-enabled socket enters the LISTEN status, the listener ULP data
pointer is copied inside the child/accepted sockets by sk_clone_lock().
The relevant ULP can take care of de-duplicating the context pointer via
the clone() operation, but only MPTCP and SMC implement such op.
Other ULPs may end up with a double-free at socket disposal time.
We can't simply clear the ULP data at clone time, as TLS replaces the
socket ops with custom ones assuming a valid TLS ULP context is
available.
Instead completely prevent clone-less ULP sockets from entering the
LISTEN status.
Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Reported-by: slipper <slipper.alive(a)gmail.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.16727406…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 848ffc3e0239..d1f837579398 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1200,12 +1200,26 @@ void inet_csk_prepare_forced_close(struct sock *sk)
}
EXPORT_SYMBOL(inet_csk_prepare_forced_close);
+static int inet_ulp_can_listen(const struct sock *sk)
+{
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+ if (icsk->icsk_ulp_ops && !icsk->icsk_ulp_ops->clone)
+ return -EINVAL;
+
+ return 0;
+}
+
int inet_csk_listen_start(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct inet_sock *inet = inet_sk(sk);
int err;
+ err = inet_ulp_can_listen(sk);
+ if (unlikely(err))
+ return err;
+
reqsk_queue_alloc(&icsk->icsk_accept_queue);
sk->sk_ack_backlog = 0;
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 9ae50b1bd844..05b6077b9f2c 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -139,6 +139,10 @@ static int __tcp_set_ulp(struct sock *sk, const struct tcp_ulp_ops *ulp_ops)
if (sk->sk_socket)
clear_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
+ err = -EINVAL;
+ if (!ulp_ops->clone && sk->sk_state == TCP_LISTEN)
+ goto out_err;
+
err = ulp_ops->init(sk);
if (err)
goto out_err;
Hi all,
Since updating to 6.0.16 the bind() system call no longer fails with
EADDRINUSE when the address is already in use.
Instead bind() returns 1 in such a case, which is not a valid return
value for this system call.
It works with the 6.0.15 kernel and earlier, 6.1.4 and 6.2-rc3 also
seem to work.
Fedora bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2159066
To reproduce you can just run `ncat -l 5000` two times, the second one
should fail. However it just uses a random port instead.
As far as I can tell this problem is caused by
https://lore.kernel.org/stable/20221228144337.512799851@linuxfoundation.org/
which did not backport commit 7a7160edf1bf properly.
The line `int ret = -EADDRINUSE, port = snum, l3mdev;` is missing in
net/ipv4/inet_connection_sock.c.
This is the working 6.1 patch:
https://lore.kernel.org/all/20221228144339.969733443@linuxfoundation.org/
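For anyone who wants to check this without ncat, here is a minimal C sketch of the same test. The function name `check_bind_conflict()` and its return convention are made up for this example; it uses only the standard socket calls. It binds one listening TCP socket to a kernel-chosen port on the wildcard address and then tries to bind a second socket to the same port.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 0 if the second bind() failed with EADDRINUSE as expected,
 * 1 if it behaved differently (for instance by returning a positive
 * value), and -1 if the test could not be set up.
 */
int check_bind_conflict(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int first, second, ret, saved_errno, result = -1;

	first = socket(AF_INET, SOCK_STREAM, 0);
	if (first < 0)
		return -1;
	second = socket(AF_INET, SOCK_STREAM, 0);
	if (second < 0) {
		close(first);
		return -1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (bind(first, (struct sockaddr *)&addr, sizeof(addr)) != 0)
		goto out;
	if (listen(first, 1) != 0)
		goto out;
	/* Read back the port the kernel assigned to the first socket. */
	if (getsockname(first, (struct sockaddr *)&addr, &len) != 0)
		goto out;
	assert(len == sizeof(addr));

	/* Neither socket sets SO_REUSEADDR, so this bind must be refused. */
	errno = 0;
	ret = bind(second, (struct sockaddr *)&addr, sizeof(addr));
	saved_errno = errno;
	result = (ret == -1 && saved_errno == EADDRINUSE) ? 0 : 1;
out:
	close(second);
	close(first);
	return result;
}
```

On a kernel without the regression the function should return 0; a return value of 1 would match the behaviour described above.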
Best regards,
Paul
The code to extract a peripheral's currently supported Pin Assignments
is repeated in a couple of locations. Factor it out into a separate
function.
This will also make it easier to add fixes (we only need to update 1
location instead of 2).
Fixes: c1e5c2f0cb8a ("usb: typec: altmodes/displayport: correct pin assignment for UFP receptacles")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Prashant Malani <pmalani(a)chromium.org>
---
While this patch doesn't fix anything, it is required by the actual
fix (which is Patch 2/3 in this series). So, I've added the "Fixes" tag
and "Cc stable" tag to ensure that both patches are picked.
If this is the incorrect approach and there is a better way, my
apologies, and please let me know the appropriate process.
drivers/usb/typec/altmodes/displayport.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 06fb4732f8cd..f9d4a7648bc9 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -420,6 +420,18 @@ static const char * const pin_assignments[] = {
[DP_PIN_ASSIGN_F] = "F",
};
+/*
+ * Helper function to extract a peripheral's currently supported
+ * Pin Assignments from its DisplayPort alternate mode state.
+ */
+static u8 get_current_pin_assignments(struct dp_altmode *dp)
+{
+ if (DP_CONF_CURRENTLY(dp->data.conf) == DP_CONF_DFP_D)
+ return DP_CAP_UFP_D_PIN_ASSIGN(dp->alt->vdo);
+ else
+ return DP_CAP_DFP_D_PIN_ASSIGN(dp->alt->vdo);
+}
+
static ssize_t
pin_assignment_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t size)
@@ -446,10 +458,7 @@ pin_assignment_store(struct device *dev, struct device_attribute *attr,
goto out_unlock;
}
- if (DP_CONF_CURRENTLY(dp->data.conf) == DP_CONF_DFP_D)
- assignments = DP_CAP_UFP_D_PIN_ASSIGN(dp->alt->vdo);
- else
- assignments = DP_CAP_DFP_D_PIN_ASSIGN(dp->alt->vdo);
+ assignments = get_current_pin_assignments(dp);
if (!(DP_CONF_GET_PIN_ASSIGN(conf) & assignments)) {
ret = -EINVAL;
@@ -486,10 +495,7 @@ static ssize_t pin_assignment_show(struct device *dev,
cur = get_count_order(DP_CONF_GET_PIN_ASSIGN(dp->data.conf));
- if (DP_CONF_CURRENTLY(dp->data.conf) == DP_CONF_DFP_D)
- assignments = DP_CAP_UFP_D_PIN_ASSIGN(dp->alt->vdo);
- else
- assignments = DP_CAP_DFP_D_PIN_ASSIGN(dp->alt->vdo);
+ assignments = get_current_pin_assignments(dp);
for (i = 0; assignments; assignments >>= 1, i++) {
if (assignments & 1) {
--
2.39.0.314.g84b9a713c41-goog
From: Ard Biesheuvel <ardb(a)kernel.org>
commit 196dff2712ca5a2e651977bb2fe6b05474111a83 upstream.
Instead of blindly creating the EFI random seed configuration table if
the RNG protocol is implemented and works, check whether such an EFI
configuration table was provided by an earlier boot stage and if so,
concatenate the existing and the new seeds, leaving it up to the core
code to mix it in and credit it the way it sees fit.
This can be used for, e.g., systemd-boot, to pass an additional seed to
Linux in a way that can be consumed by the kernel very early. In that
case, the following definitions should be used to pass the seed to the
EFI stub:
struct linux_efi_random_seed {
u32 size; // of the 'seed' array in bytes
u8 seed[];
};
The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY
pool memory, and the address of the struct in memory should be installed
as an EFI configuration table using the following GUID:
LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b
Note that doing so is safe even on kernels that were built without this
patch applied, but the seed will simply be overwritten with a seed
derived from the EFI RNG protocol, if available. The recommended seed
size is 32 bytes, and seeds larger than 512 bytes are considered
corrupted and ignored entirely.
In order to preserve forward secrecy, seeds from previous bootloaders
are memzero'd out, and in order to preserve memory, those older seeds
are also freed from memory. Freeing from memory without first memzeroing
is not safe to do, as it's possible that nothing else will ever
overwrite those pages used by EFI.
Reviewed-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
[ardb: incorporate Jason's followup changes to extend the maximum seed
size on the consumer end, memzero() it and drop a needless printk]
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
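As a rough illustration of the concatenation described in the commit message, here is a userspace sketch. It is not the stub code: `struct seed_table`, `concat_seeds()`, `wipe()` and the two size macros are names invented for this example, `malloc()` stands in for the EFI pool allocator, and the caller keeps ownership of the previous table (the stub frees it). The fresh seed goes first and the previous seed is appended, provided its size passes the 512-byte sanity check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NEW_SEED_SIZE 32u	/* recommended seed size from the text */
#define MAX_PREV_SEED 512u	/* larger previous seeds count as corrupted */

/* Same layout as the struct in the commit message. */
struct seed_table {
	uint32_t size;		/* length of bits[] in bytes */
	uint8_t bits[];
};

/* Zero a buffer through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t n)
{
	volatile uint8_t *b = p;

	while (n--)
		*b++ = 0;
}

/*
 * Build a new table holding the fresh seed followed by the previous one.
 * The previous seed bytes are wiped once copied; freeing 'prev' is left
 * to the caller in this model.
 */
struct seed_table *concat_seeds(const uint8_t *fresh, struct seed_table *prev)
{
	uint32_t prev_size = 0;
	struct seed_table *out;

	assert(fresh != NULL);

	if (prev && prev->size <= MAX_PREV_SEED)
		prev_size = prev->size;

	out = malloc(sizeof(*out) + NEW_SEED_SIZE + prev_size);
	if (!out)
		return NULL;

	out->size = NEW_SEED_SIZE + prev_size;
	memcpy(out->bits, fresh, NEW_SEED_SIZE);
	if (prev_size) {
		memcpy(out->bits + NEW_SEED_SIZE, prev->bits, prev_size);
		wipe(prev->bits, prev_size);
	}
	return out;
}
```

With a 4-byte previous seed the result holds 36 bytes; with a previous table claiming 513 bytes the old seed is ignored and the result holds only the 32 fresh bytes.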
drivers/firmware/efi/efi.c | 4 +--
drivers/firmware/efi/libstub/efistub.h | 2 ++
drivers/firmware/efi/libstub/random.c | 42 ++++++++++++++++++++++----
include/linux/efi.h | 2 --
4 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index a06decee51e0..a6e9968a2ddc 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -608,7 +608,7 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
seed = early_memremap(efi_rng_seed, sizeof(*seed));
if (seed != NULL) {
- size = min(seed->size, EFI_RANDOM_SEED_SIZE);
+ size = min_t(u32, seed->size, SZ_1K); // sanity check
early_memunmap(seed, sizeof(*seed));
} else {
pr_err("Could not map UEFI random seed!\n");
@@ -617,8 +617,8 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
seed = early_memremap(efi_rng_seed,
sizeof(*seed) + size);
if (seed != NULL) {
- pr_notice("seeding entropy pool\n");
add_bootloader_randomness(seed->bits, size);
+ memzero_explicit(seed->bits, size);
early_memunmap(seed, sizeof(*seed) + size);
} else {
pr_err("Could not map UEFI random seed!\n");
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index b0ae0a454404..0ce2bf4b8b58 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -873,6 +873,8 @@ efi_status_t efi_get_random_bytes(unsigned long size, u8 *out);
efi_status_t efi_random_alloc(unsigned long size, unsigned long align,
unsigned long *addr, unsigned long random_seed);
+efi_status_t efi_random_get_seed(void);
+
efi_status_t check_platform_features(void);
void *get_efi_config_table(efi_guid_t guid);
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
index 33ab56769595..f85d2c066877 100644
--- a/drivers/firmware/efi/libstub/random.c
+++ b/drivers/firmware/efi/libstub/random.c
@@ -67,27 +67,43 @@ efi_status_t efi_random_get_seed(void)
efi_guid_t rng_proto = EFI_RNG_PROTOCOL_GUID;
efi_guid_t rng_algo_raw = EFI_RNG_ALGORITHM_RAW;
efi_guid_t rng_table_guid = LINUX_EFI_RANDOM_SEED_TABLE_GUID;
+ struct linux_efi_random_seed *prev_seed, *seed = NULL;
+ int prev_seed_size = 0, seed_size = EFI_RANDOM_SEED_SIZE;
efi_rng_protocol_t *rng = NULL;
- struct linux_efi_random_seed *seed = NULL;
efi_status_t status;
status = efi_bs_call(locate_protocol, &rng_proto, NULL, (void **)&rng);
if (status != EFI_SUCCESS)
return status;
+ /*
+ * Check whether a seed was provided by a prior boot stage. In that
+ * case, instead of overwriting it, let's create a new buffer that can
+ * hold both, and concatenate the existing and the new seeds.
+ * Note that we should read the seed size with caution, in case the
+ * table got corrupted in memory somehow.
+ */
+ prev_seed = get_efi_config_table(LINUX_EFI_RANDOM_SEED_TABLE_GUID);
+ if (prev_seed && prev_seed->size <= 512U) {
+ prev_seed_size = prev_seed->size;
+ seed_size += prev_seed_size;
+ }
+
/*
* Use EFI_ACPI_RECLAIM_MEMORY here so that it is guaranteed that the
* allocation will survive a kexec reboot (although we refresh the seed
* beforehand)
*/
status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
- sizeof(*seed) + EFI_RANDOM_SEED_SIZE,
+ struct_size(seed, bits, seed_size),
(void **)&seed);
- if (status != EFI_SUCCESS)
- return status;
+ if (status != EFI_SUCCESS) {
+ efi_warn("Failed to allocate memory for RNG seed.\n");
+ goto err_warn;
+ }
status = efi_call_proto(rng, get_rng, &rng_algo_raw,
- EFI_RANDOM_SEED_SIZE, seed->bits);
+ EFI_RANDOM_SEED_SIZE, seed->bits);
if (status == EFI_UNSUPPORTED)
/*
@@ -100,14 +116,28 @@ efi_status_t efi_random_get_seed(void)
if (status != EFI_SUCCESS)
goto err_freepool;
- seed->size = EFI_RANDOM_SEED_SIZE;
+ seed->size = seed_size;
+ if (prev_seed_size)
+ memcpy(seed->bits + EFI_RANDOM_SEED_SIZE, prev_seed->bits,
+ prev_seed_size);
+
status = efi_bs_call(install_configuration_table, &rng_table_guid, seed);
if (status != EFI_SUCCESS)
goto err_freepool;
+ if (prev_seed_size) {
+ /* wipe and free the old seed if we managed to install the new one */
+ memzero_explicit(prev_seed->bits, prev_seed_size);
+ efi_bs_call(free_pool, prev_seed);
+ }
return EFI_SUCCESS;
err_freepool:
+ memzero_explicit(seed, struct_size(seed, bits, seed_size));
efi_bs_call(free_pool, seed);
+ efi_warn("Failed to obtain seed from EFI_RNG_PROTOCOL\n");
+err_warn:
+ if (prev_seed)
+ efi_warn("Retaining bootloader-supplied seed only");
return status;
}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index f87b2f5db9f8..4f51616f01b2 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1139,8 +1139,6 @@ void efi_check_for_embedded_firmwares(void);
static inline void efi_check_for_embedded_firmwares(void) { }
#endif
-efi_status_t efi_random_get_seed(void);
-
#define arch_efi_call_virt(p, f, args...) ((p)->f(args))
/*
--
2.39.0
The ACPI PRM address space handler calls efi_call_virt_pointer() to
execute PRM firmware code, but doing so is only permitted when the EFI
runtime environment is available. Otherwise, such calls are guaranteed
to result in a crash, and must therefore be avoided.
Cc: <stable(a)vger.kernel.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: linux-acpi(a)vger.kernel.org
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
drivers/acpi/prmt.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/acpi/prmt.c b/drivers/acpi/prmt.c
index 998101cf16e47145..74f924077866ae69 100644
--- a/drivers/acpi/prmt.c
+++ b/drivers/acpi/prmt.c
@@ -236,6 +236,11 @@ static acpi_status acpi_platformrt_space_handler(u32 function,
efi_status_t status;
struct prm_context_buffer context;
+ if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
+ pr_err("PRM: EFI runtime services unavailable\n");
+ return AE_NOT_IMPLEMENTED;
+ }
+
/*
* The returned acpi_status will always be AE_OK. Error values will be
* saved in the first byte of the PRM message buffer to be used by ASL.
--
2.39.0
commit 27c0d217340e47ec995557f61423ef415afba987 upstream.
When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.
This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
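To make the effect easier to see, here is a small userspace model. All names in it (`struct dev_model`, `for_each_dev()`, `attach_propagate_error()`, `attach_skip_on_error()`) are hypothetical; the iterator only assumes what the commit message states, namely that a nonzero return from the per-device callback stops the walk over the remaining devices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: match_result is what the bus match callback returns. */
struct dev_model {
	int match_result;	/* >0 match, 0 no match, <0 error */
	int bound;
};

typedef int (*attach_fn)(struct dev_model *dev);

/* Walk the devices, stopping as soon as the callback returns nonzero. */
int for_each_dev(struct dev_model *devs, size_t n, attach_fn fn)
{
	int ret = 0;

	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n && !ret; i++)
		ret = fn(&devs[i]);

	return ret;
}

/* Old behaviour: a match error is propagated and aborts the walk. */
int attach_propagate_error(struct dev_model *dev)
{
	if (dev->match_result < 0)
		return dev->match_result;
	if (dev->match_result > 0)
		dev->bound = 1;
	return 0;
}

/* Patched behaviour: a match error only skips this one device. */
int attach_skip_on_error(struct dev_model *dev)
{
	if (dev->match_result > 0)
		dev->bound = 1;
	return 0;
}
```

With three devices whose match results are 1, -22 and 1, the first callback binds only the first device, while the second callback binds the first and the third.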
drivers/base/dd.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index a7bcbb99e820..0f006cad2be7 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -785,8 +785,12 @@ static int __driver_attach(struct device *dev, void *data)
*/
return 0;
} else if (ret < 0) {
- dev_dbg(dev, "Bus failed to match device: %d", ret);
- return ret;
+ dev_dbg(dev, "Bus failed to match device: %d\n", ret);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} /* ret > 0 means positive match */
if (dev->parent) /* Needed for USB */
--
2.39.0.314.g84b9a713c41-goog
Greg -
Here are backports of the MPTCP patches, and one prerequisite, that
recently failed to apply to the 5.10 stable tree. They prevent IPv6
memory leaks with MPTCP.
Thanks!
Florian Westphal (1):
mptcp: mark ops structures as ro_after_init
Matthieu Baerts (3):
mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
mptcp: dedicated request sock for subflow in v6
mptcp: use proper req destructor for IPv6
include/net/mptcp.h | 12 +++++--
net/ipv4/syncookies.c | 7 ++--
net/mptcp/subflow.c | 76 +++++++++++++++++++++++++++++++++----------
3 files changed, 71 insertions(+), 24 deletions(-)
--
2.39.0
From: "Tyler Hicks" <code(a)tyhicks.com>
When attempting to build kselftests with a separate output directory, a
number of the tests fail to build.
For example,
$ rm -rf build && \
make INSTALL_HDR_PATH=build/usr headers_install > /dev/null && \
make O=build FORCE_TARGETS=1 TARGETS=breakpoints -C tools/testing/selftests > /dev/null
/usr/bin/ld: cannot open output file
build/kselftest/breakpoints/step_after_suspend_test: No such file or directory
collect2: error: ld returned 1 exit status
make[1]: *** [../lib.mk:146: build/kselftest/breakpoints/step_after_suspend_test] Error 1
make: *** [Makefile:163: all] Error 2
This has already been addressed upstream with v5.18 commit 5ad51ab618de
("selftests: set the BUILD variable to absolute path"). It does not
cleanly cherry pick to the linux-5.4.y branch without v5.7 commit
29e911ef7b70 ("selftests: Fix kselftest O=objdir build from cluttering
top level objdir"). Commit 5ad51ab618de was written in a way that
assumes that the kselftests aren't built in the top level objdir, so it
makes sense to bring the pre-req commit back, but it does represent a
slight change in behavior since the kselftests will now be built in a
subdir of the specified objdir (O=).
Tyler
Muhammad Usama Anjum (1):
selftests: set the BUILD variable to absolute path
Shuah Khan (1):
selftests: Fix kselftest O=objdir build from cluttering top level
objdir
tools/testing/selftests/Makefile | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
--
2.34.1
From: Indan Zupancic <Indan.Zupancic(a)mep-info.com>
[ Upstream commit 401fb66a355eb0f22096cf26864324f8e63c7d78 ]
If an irq is pending when devm_request_irq() is called, the irq
handler will cause a NULL pointer access because initialisation
is not done yet.
Fixes: 9d7ee0e28da59 ("tty: serial: lpuart: avoid report NULL interrupt")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Indan Zupancic <Indan.Zupancic(a)mep-info.com>
Link: https://lore.kernel.org/r/20220505114750.45423-1-Indan.Zupancic@mep-info.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[5.10 did not have lpuart_global_reset or anything after
uart_add_one_port(), so add the remove call in cleanup manually]
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
---
This was originally intended as a prerequisite to backport the patch
submitted in [1] for 5.10, but even with that part of the patch gone it
makes sense as a fix on its own.
[1] https://lkml.kernel.org/r/20221222114414.1886632-1-linux@rasmusvillemoes.dk
drivers/tty/serial/fsl_lpuart.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 43aca5a2ef0f..223695947b65 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -2586,6 +2586,7 @@ static int lpuart_probe(struct platform_device *pdev)
struct device_node *np = pdev->dev.of_node;
struct lpuart_port *sport;
struct resource *res;
+ irq_handler_t handler;
int ret;
sport = devm_kzalloc(&pdev->dev, sizeof(*sport), GFP_KERNEL);
@@ -2658,17 +2659,12 @@ static int lpuart_probe(struct platform_device *pdev)
if (lpuart_is_32(sport)) {
lpuart_reg.cons = LPUART32_CONSOLE;
- ret = devm_request_irq(&pdev->dev, sport->port.irq, lpuart32_int, 0,
- DRIVER_NAME, sport);
+ handler = lpuart32_int;
} else {
lpuart_reg.cons = LPUART_CONSOLE;
- ret = devm_request_irq(&pdev->dev, sport->port.irq, lpuart_int, 0,
- DRIVER_NAME, sport);
+ handler = lpuart_int;
}
- if (ret)
- goto failed_irq_request;
-
ret = uart_get_rs485_mode(&sport->port);
if (ret)
goto failed_get_rs485;
@@ -2684,11 +2680,17 @@ static int lpuart_probe(struct platform_device *pdev)
if (ret)
goto failed_attach_port;
+ ret = devm_request_irq(&pdev->dev, sport->port.irq, handler, 0,
+ DRIVER_NAME, sport);
+ if (ret)
+ goto failed_irq_request;
+
return 0;
+failed_irq_request:
+ uart_remove_one_port(&lpuart_reg, &sport->port);
failed_get_rs485:
failed_attach_port:
-failed_irq_request:
lpuart_disable_clks(sport);
return ret;
}
--
2.35.1
When 7c7f9bc986e6 ("serial: Deassert Transmit Enable on probe in
driver-specific way") got backported to 5.15.y, there known as
b079d3775237, this hunk was accidentally left out. So if the "goto
failed_get_rs485;" is hit, the cleanup will do uart_remove_one_port()
despite uart_add_one_port() not having been called.
Add the missing hunk.
Fixes: b079d3775237 ("serial: Deassert Transmit Enable on probe in driver-specific way")
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
---
Not quite sure how to submit patches for a specific -stable series
only, or if the Fixes tag is appropriate and correct. Please let me
know if you'd have preferred anything different.
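For readers unfamiliar with the goto-unwind pattern involved, here is a userspace model of the label ordering after this fix. Nothing in it is driver code: `model_probe()`, `struct probe_log` and `enum fail_point` are made up for illustration. The point is that a label reached before the port was added must sit below the call that removes the port.

```c
#include <assert.h>
#include <string.h>

/* Records which setup and cleanup steps ran in one probe attempt. */
struct probe_log {
	int port_added;
	int port_removed;
	int clks_disabled;
};

enum fail_point { FAIL_NONE, FAIL_RS485, FAIL_ADD_PORT, FAIL_IRQ };

/* Model of a probe function with the label order used after the fix. */
int model_probe(enum fail_point fail, struct probe_log *log)
{
	assert(log != NULL);
	memset(log, 0, sizeof(*log));

	if (fail == FAIL_RS485)		/* fails before the port is added */
		goto failed_get_rs485;
	if (fail == FAIL_ADD_PORT)	/* adding the port itself fails */
		goto failed_attach_port;
	log->port_added = 1;

	if (fail == FAIL_IRQ)		/* fails after the port was added */
		goto failed_irq_request;

	return 0;

failed_irq_request:
	log->port_removed = 1;		/* only undo what was actually done */
failed_attach_port:
failed_get_rs485:
	log->clks_disabled = 1;
	return -1;
}
```

A failure at the RS485 step disables the clocks without touching the port, while a failure at the IRQ step removes the port first and then disables the clocks.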
drivers/tty/serial/fsl_lpuart.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 595430aedc0d..fc311df9f1c9 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -2784,9 +2784,9 @@ static int lpuart_probe(struct platform_device *pdev)
return 0;
failed_irq_request:
-failed_get_rs485:
uart_remove_one_port(&lpuart_reg, &sport->port);
failed_attach_port:
+failed_get_rs485:
failed_reset:
lpuart_disable_clks(sport);
return ret;
base-commit: fd6d66840b4269da4e90e1ea807ae3197433bc66
--
2.37.2
The recent ext4 fast-commit fixes with 'Cc stable' didn't apply to 5.10
due to conflicts. Since the fast-commit support in 5.10 is rudimentary
and hard to backport fixes to, this series backports the two most
important fixes only. Please apply to 5.10-stable.
Eric Biggers (2):
ext4: disable fast-commit of encrypted dir operations
ext4: don't set up encryption key during jbd2 transaction
fs/ext4/ext4.h | 4 ++--
fs/ext4/fast_commit.c | 42 +++++++++++++++++++++--------------
fs/ext4/fast_commit.h | 1 +
fs/ext4/namei.c | 44 ++++++++++++++++++++-----------------
include/trace/events/ext4.h | 7 ++++--
5 files changed, 57 insertions(+), 41 deletions(-)
--
2.39.0
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
0b3a551fa58b ("nfsd: fix handling of cached open files in nfsd4_open codepath")
ac3a2585f018 ("nfsd: rework refcounting in filecache")
d7064eaf688c ("NFSD: Add an nfsd_file_fsync tracepoint")
821411858988 ("nfsd: reorganize filecache.c")
1f696e230ea5 ("nfsd: remove the pages_flushed statistic from filecache")
4d1ea8455716 ("NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection")
dcf3f80965ca ("NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately"")
c252849082ff ("NFSD: Pass the target nfsd_file to nfsd_commit()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0b3a551fa58b4da941efeb209b3770868e2eddd7 Mon Sep 17 00:00:00 2001
From: Jeff Layton <jlayton(a)kernel.org>
Date: Thu, 5 Jan 2023 14:55:56 -0500
Subject: [PATCH] nfsd: fix handling of cached open files in nfsd4_open
codepath
Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a
regular NFSv4 file") added the ability to cache an open fd over a
compound. There are a couple of problems with the way this currently
works:
It's racy, as a newly-created nfsd_file can end up with its PENDING bit
cleared while the nf is hashed, and the nf_file pointer is still zeroed
out. Other tasks can find it in this state and they expect to see a
valid nf_file, and can oops if nf_file is NULL.
Also, there is no guarantee that we'll end up creating a new nfsd_file
if one is already in the hash. If an extant entry is in the hash with a
valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with
the value of op_file and the old nf_file will leak.
Fix both issues by making a new nfsd_file_acquire_opened variant that
takes an optional file pointer. If one is present when this is called,
we'll take a new reference to it instead of trying to open the file. If
the nfsd_file already has a valid nf_file, we'll just ignore the
optional file and pass the nfsd_file back as-is.
Also rework the tracepoints a bit to allow for an "opened" variant and
don't try to avoid counting acquisitions in the case where we already
have a cached open file.
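The "optional file pointer" rule described above can be sketched in plain user-space C. This is only a toy model, not the nfsd code: `toy_file`, `toy_entry`, `toy_open()` and `toy_acquire_opened()` are hypothetical names, and the hash lookup, locking and PENDING bit are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct file and struct nfsd_file (hypothetical). */
struct toy_file { int refs; };
struct toy_entry { struct toy_file *file; };

/* Hypothetical fallback "open" used when no cached file is supplied. */
static struct toy_file *toy_open(struct toy_file *storage)
{
	storage->refs = 1;
	return storage;
}

/*
 * Acquire an entry, optionally handing in an already-open file.
 *  - entry already has a file: ignore 'cached', return the entry as-is
 *  - 'cached' is non-NULL:     take a new reference and install it
 *  - otherwise:                open the file the usual way
 */
static void toy_acquire_opened(struct toy_entry *e, struct toy_file *cached,
			       struct toy_file *open_storage)
{
	if (e->file)
		return;
	if (cached) {
		cached->refs++;
		e->file = cached;
	} else {
		e->file = toy_open(open_storage);
	}
}
```

In this model an existing file pointer is never overwritten, which is the property that avoids the leak described above, and the caller keeps its own reference to the cached file.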
Fixes: fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file")
Cc: Trond Myklebust <trondmy(a)hammerspace.com>
Reported-by: Stanislav Saner <ssaner(a)redhat.com>
Reported-and-Tested-by: Ruben Vestergaard <rubenv(a)drcmr.dk>
Reported-and-Tested-by: Torkil Svensgaard <torkil(a)drcmr.dk>
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 45b2c9e3f636..0ef070349014 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -1071,8 +1071,8 @@ nfsd_file_is_cached(struct inode *inode)
static __be32
nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
- unsigned int may_flags, struct nfsd_file **pnf,
- bool open, bool want_gc)
+ unsigned int may_flags, struct file *file,
+ struct nfsd_file **pnf, bool want_gc)
{
struct nfsd_file_lookup_key key = {
.type = NFSD_FILE_KEY_FULL,
@@ -1147,8 +1147,7 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
status = nfserrno(nfsd_open_break_lease(file_inode(nf->nf_file), may_flags));
out:
if (status == nfs_ok) {
- if (open)
- this_cpu_inc(nfsd_file_acquisitions);
+ this_cpu_inc(nfsd_file_acquisitions);
*pnf = nf;
} else {
if (refcount_dec_and_test(&nf->nf_ref))
@@ -1158,20 +1157,23 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
out_status:
put_cred(key.cred);
- if (open)
- trace_nfsd_file_acquire(rqstp, key.inode, may_flags, nf, status);
+ trace_nfsd_file_acquire(rqstp, key.inode, may_flags, nf, status);
return status;
open_file:
trace_nfsd_file_alloc(nf);
nf->nf_mark = nfsd_file_mark_find_or_create(nf, key.inode);
if (nf->nf_mark) {
- if (open) {
+ if (file) {
+ get_file(file);
+ nf->nf_file = file;
+ status = nfs_ok;
+ trace_nfsd_file_opened(nf, status);
+ } else {
status = nfsd_open_verified(rqstp, fhp, may_flags,
&nf->nf_file);
trace_nfsd_file_open(nf, status);
- } else
- status = nfs_ok;
+ }
} else
status = nfserr_jukebox;
/*
@@ -1207,7 +1209,7 @@ __be32
nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **pnf)
{
- return nfsd_file_do_acquire(rqstp, fhp, may_flags, pnf, true, true);
+ return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, true);
}
/**
@@ -1228,28 +1230,30 @@ __be32
nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **pnf)
{
- return nfsd_file_do_acquire(rqstp, fhp, may_flags, pnf, true, false);
+ return nfsd_file_do_acquire(rqstp, fhp, may_flags, NULL, pnf, false);
}
/**
- * nfsd_file_create - Get a struct nfsd_file, do not open
+ * nfsd_file_acquire_opened - Get a struct nfsd_file using existing open file
* @rqstp: the RPC transaction being executed
* @fhp: the NFS filehandle of the file just created
* @may_flags: NFSD_MAY_ settings for the file
+ * @file: cached, already-open file (may be NULL)
* @pnf: OUT: new or found "struct nfsd_file" object
*
- * The nfsd_file_object returned by this API is reference-counted
- * but not garbage-collected. The object is released immediately
- * one RCU grace period after the final nfsd_file_put().
+ * Acquire a nfsd_file object that is not GC'ed. If one doesn't already exist,
+ * and @file is non-NULL, use it to instantiate a new nfsd_file instead of
+ * opening a new one.
*
* Returns nfs_ok and sets @pnf on success; otherwise an nfsstat in
* network byte order is returned.
*/
__be32
-nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
- unsigned int may_flags, struct nfsd_file **pnf)
+nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ unsigned int may_flags, struct file *file,
+ struct nfsd_file **pnf)
{
- return nfsd_file_do_acquire(rqstp, fhp, may_flags, pnf, false, false);
+ return nfsd_file_do_acquire(rqstp, fhp, may_flags, file, pnf, false);
}
/*
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index b7efb2c3ddb1..41516a4263ea 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -60,7 +60,8 @@ __be32 nfsd_file_acquire_gc(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **nfp);
__be32 nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int may_flags, struct nfsd_file **nfp);
-__be32 nfsd_file_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
- unsigned int may_flags, struct nfsd_file **nfp);
+__be32 nfsd_file_acquire_opened(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ unsigned int may_flags, struct file *file,
+ struct nfsd_file **nfp);
int nfsd_file_cache_stats_show(struct seq_file *m, void *v);
#endif /* _FS_NFSD_FILECACHE_H */
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e1e85c21f12b..313f666d5357 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5262,18 +5262,10 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp, struct nfs4_file *fp,
if (!fp->fi_fds[oflag]) {
spin_unlock(&fp->fi_lock);
- if (!open->op_filp) {
- status = nfsd_file_acquire(rqstp, cur_fh, access, &nf);
- if (status != nfs_ok)
- goto out_put_access;
- } else {
- status = nfsd_file_create(rqstp, cur_fh, access, &nf);
- if (status != nfs_ok)
- goto out_put_access;
- nf->nf_file = open->op_filp;
- open->op_filp = NULL;
- trace_nfsd_file_create(rqstp, access, nf);
- }
+ status = nfsd_file_acquire_opened(rqstp, cur_fh, access,
+ open->op_filp, &nf);
+ if (status != nfs_ok)
+ goto out_put_access;
spin_lock(&fp->fi_lock);
if (!fp->fi_fds[oflag]) {
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index c852ae8eaf37..8f9c82d9e075 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -981,43 +981,6 @@ TRACE_EVENT(nfsd_file_acquire,
)
);
-TRACE_EVENT(nfsd_file_create,
- TP_PROTO(
- const struct svc_rqst *rqstp,
- unsigned int may_flags,
- const struct nfsd_file *nf
- ),
-
- TP_ARGS(rqstp, may_flags, nf),
-
- TP_STRUCT__entry(
- __field(const void *, nf_inode)
- __field(const void *, nf_file)
- __field(unsigned long, may_flags)
- __field(unsigned long, nf_flags)
- __field(unsigned long, nf_may)
- __field(unsigned int, nf_ref)
- __field(u32, xid)
- ),
-
- TP_fast_assign(
- __entry->nf_inode = nf->nf_inode;
- __entry->nf_file = nf->nf_file;
- __entry->may_flags = may_flags;
- __entry->nf_flags = nf->nf_flags;
- __entry->nf_may = nf->nf_may;
- __entry->nf_ref = refcount_read(&nf->nf_ref);
- __entry->xid = be32_to_cpu(rqstp->rq_xid);
- ),
-
- TP_printk("xid=0x%x inode=%p may_flags=%s ref=%u nf_flags=%s nf_may=%s nf_file=%p",
- __entry->xid, __entry->nf_inode,
- show_nfsd_may_flags(__entry->may_flags),
- __entry->nf_ref, show_nf_flags(__entry->nf_flags),
- show_nfsd_may_flags(__entry->nf_may), __entry->nf_file
- )
-);
-
TRACE_EVENT(nfsd_file_insert_err,
TP_PROTO(
const struct svc_rqst *rqstp,
@@ -1079,8 +1042,8 @@ TRACE_EVENT(nfsd_file_cons_err,
)
);
-TRACE_EVENT(nfsd_file_open,
- TP_PROTO(struct nfsd_file *nf, __be32 status),
+DECLARE_EVENT_CLASS(nfsd_file_open_class,
+ TP_PROTO(const struct nfsd_file *nf, __be32 status),
TP_ARGS(nf, status),
TP_STRUCT__entry(
__field(void *, nf_inode) /* cannot be dereferenced */
@@ -1104,6 +1067,17 @@ TRACE_EVENT(nfsd_file_open,
__entry->nf_file)
)
+#define DEFINE_NFSD_FILE_OPEN_EVENT(name) \
+DEFINE_EVENT(nfsd_file_open_class, name, \
+ TP_PROTO( \
+ const struct nfsd_file *nf, \
+ __be32 status \
+ ), \
+ TP_ARGS(nf, status))
+
+DEFINE_NFSD_FILE_OPEN_EVENT(nfsd_file_open);
+DEFINE_NFSD_FILE_OPEN_EVENT(nfsd_file_opened);
+
TRACE_EVENT(nfsd_file_is_cached,
TP_PROTO(
const struct inode *inode,
Hi Greg,
The following commits fix a regression introduced into the kernel in
5.14 (in e84ba47e313d). Please cherry-pick them for both the 6.1.y and
the 6.0.y branches.
6a877d2450ac x86/fpu: Take task_struct* in copy_sigframe_from_user_to_xstate()
1c813ce03055 x86/fpu: Add a pkru argument to copy_uabi_from_kernel_to_xstate().
2c87767c35ee x86/fpu: Add a pkru argument to copy_uabi_to_xstate()
4a804c4f8356 x86/fpu: Allow PKRU to be (once again) written by ptrace.
d7e5aceace51 x86/fpu: Emulate XRSTOR's behavior if the xfeatures PKRU
bit is not set
6ea25770b043 selftests/vm/pkeys: Add a regression test for setting
PKRU through ptrace
5.15.y requires adjusted patches and will be sent separately.
- Kyle
Hi Greg,
can you please consider adding this patch to the 6.0-stable series
(actually kernels 5.18 up to and including 6.0)?
It's a backport of upstream commit 71bdea6f798b425bc0003780b13e3fdecb16a010
Thanks!
Helge
From: Helge Deller <deller(a)gmx.de>
Subject: [PATCH] parisc: Align parisc MADV_XXX constants with all other architectures
Adjust some MADV_XXX constants to be in sync with their values on
all other platforms. There is currently no reason for parisc to have its
own numbering, but it requires workarounds in many userspace
sources (e.g. glibc, qemu, ...) - which are often forgotten and thus
introduce bugs and different behaviour on parisc.
A wrapper avoids an ABI breakage for existing userspace applications by
translating any old values to the new ones, so this change allows us to
move over all programs to the new ABI over time.
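The translation such a wrapper performs can be sketched in user-space C. This is an illustration only: `translate_legacy_madv()` and the `NEW_MADV_*` names are hypothetical, while the numeric values are the old and new parisc values that appear in the diff.

```c
#include <assert.h>

/* New parisc values, identical to the other architectures (from the patch). */
enum {
	NEW_MADV_MERGEABLE   = 12,
	NEW_MADV_UNMERGEABLE = 13,
	NEW_MADV_HUGEPAGE    = 14,
	NEW_MADV_NOHUGEPAGE  = 15,
	NEW_MADV_DONTDUMP    = 16,
	NEW_MADV_DODUMP      = 17,
	NEW_MADV_WIPEONFORK  = 18,
	NEW_MADV_KEEPONFORK  = 19,
};

/* Hypothetical helper: map a legacy parisc value (65..72) to the new one. */
static int translate_legacy_madv(int behavior)
{
	switch (behavior) {
	case 65: return NEW_MADV_MERGEABLE;
	case 66: return NEW_MADV_UNMERGEABLE;
	case 67: return NEW_MADV_HUGEPAGE;
	case 68: return NEW_MADV_NOHUGEPAGE;
	case 69: return NEW_MADV_DONTDUMP;
	case 70: return NEW_MADV_DODUMP;
	case 71: return NEW_MADV_WIPEONFORK;
	case 72: return NEW_MADV_KEEPONFORK;
	default: return behavior;	/* already a current value */
	}
}

/* Old binaries and rebuilt binaries end up with the same behavior value. */
static void check_translation(void)
{
	assert(translate_legacy_madv(67) == NEW_MADV_HUGEPAGE);
	assert(translate_legacy_madv(14) == NEW_MADV_HUGEPAGE);
	assert(translate_legacy_madv(4) == 4);
}
```

Because the old range 65..72 does not overlap any value in the new numbering, the mapping is unambiguous and values outside it pass through untouched.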
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index a7ea3204a5fa..3fff360ae4a0 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -49,6 +49,19 @@
#define MADV_DONTFORK 10 /* don't inherit across fork */
#define MADV_DOFORK 11 /* do inherit across fork */
+#define MADV_MERGEABLE 12 /* KSM may merge identical pages */
+#define MADV_UNMERGEABLE 13 /* KSM may not merge identical pages */
+
+#define MADV_HUGEPAGE 14 /* Worth backing with hugepages */
+#define MADV_NOHUGEPAGE 15 /* Not worth backing with hugepages */
+
+#define MADV_DONTDUMP 16 /* Explicity exclude from the core dump,
+ overrides the coredump filter bits */
+#define MADV_DODUMP 17 /* Clear the MADV_NODUMP flag */
+
+#define MADV_WIPEONFORK 18 /* Zero memory on fork, child only */
+#define MADV_KEEPONFORK 19 /* Undo MADV_WIPEONFORK */
+
#define MADV_COLD 20 /* deactivate these pages */
#define MADV_PAGEOUT 21 /* reclaim these pages */
@@ -57,25 +70,11 @@
#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */
-#define MADV_MERGEABLE 65 /* KSM may merge identical pages */
-#define MADV_UNMERGEABLE 66 /* KSM may not merge identical pages */
-
-#define MADV_HUGEPAGE 67 /* Worth backing with hugepages */
-#define MADV_NOHUGEPAGE 68 /* Not worth backing with hugepages */
-
-#define MADV_DONTDUMP 69 /* Explicity exclude from the core dump,
- overrides the coredump filter bits */
-#define MADV_DODUMP 70 /* Clear the MADV_NODUMP flag */
-
-#define MADV_WIPEONFORK 71 /* Zero memory on fork, child only */
-#define MADV_KEEPONFORK 72 /* Undo MADV_WIPEONFORK */
-
#define MADV_HWPOISON 100 /* poison a page for testing */
#define MADV_SOFT_OFFLINE 101 /* soft offline page for testing */
/* compatibility flags */
#define MAP_FILE 0
-#define MAP_VARIABLE 0
#define PKEY_DISABLE_ACCESS 0x1
#define PKEY_DISABLE_WRITE 0x2
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 2b34294517a1..306dde7f0d3a 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -465,3 +465,30 @@ asmlinkage long parisc_inotify_init1(int flags)
flags = FIX_O_NONBLOCK(flags);
return sys_inotify_init1(flags);
}
+
+/*
+ * madvise() wrapper
+ *
+ * Up to kernel v6.1 parisc has different values than all other
+ * platforms for the MADV_xxx flags listed below.
+ * To keep binary compatibility with existing userspace programs
+ * translate the former values to the new values.
+ *
+ * XXX: Remove this wrapper in year 2025 (or later)
+ */
+
+asmlinkage notrace long parisc_madvise(unsigned long start, size_t len_in, int behavior)
+{
+ switch (behavior) {
+ case 65: behavior = MADV_MERGEABLE; break;
+ case 66: behavior = MADV_UNMERGEABLE; break;
+ case 67: behavior = MADV_HUGEPAGE; break;
+ case 68: behavior = MADV_NOHUGEPAGE; break;
+ case 69: behavior = MADV_DONTDUMP; break;
+ case 70: behavior = MADV_DODUMP; break;
+ case 71: behavior = MADV_WIPEONFORK; break;
+ case 72: behavior = MADV_KEEPONFORK; break;
+ }
+
+ return sys_madvise(start, len_in, behavior);
+}
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 8a99c998da9b..0e42fceb2d5e 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -131,7 +131,7 @@
116 common sysinfo sys_sysinfo compat_sys_sysinfo
117 common shutdown sys_shutdown
118 common fsync sys_fsync
-119 common madvise sys_madvise
+119 common madvise parisc_madvise
120 common clone sys_clone_wrapper
121 common setdomainname sys_setdomainname
122 common sendfile sys_sendfile compat_sys_sendfile
diff --git a/tools/arch/parisc/include/uapi/asm/mman.h b/tools/arch/parisc/include/uapi/asm/mman.h
index 506c06a6536f..4cc88a642e10 100644
--- a/tools/arch/parisc/include/uapi/asm/mman.h
+++ b/tools/arch/parisc/include/uapi/asm/mman.h
@@ -1,20 +1,20 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
-#define MADV_DODUMP 70
+#define MADV_DODUMP 17
#define MADV_DOFORK 11
-#define MADV_DONTDUMP 69
+#define MADV_DONTDUMP 16
#define MADV_DONTFORK 10
#define MADV_DONTNEED 4
#define MADV_FREE 8
-#define MADV_HUGEPAGE 67
-#define MADV_MERGEABLE 65
-#define MADV_NOHUGEPAGE 68
+#define MADV_HUGEPAGE 14
+#define MADV_MERGEABLE 12
+#define MADV_NOHUGEPAGE 15
#define MADV_NORMAL 0
#define MADV_RANDOM 1
#define MADV_REMOVE 9
#define MADV_SEQUENTIAL 2
-#define MADV_UNMERGEABLE 66
+#define MADV_UNMERGEABLE 13
#define MADV_WILLNEED 3
#define MAP_ANONYMOUS 0x10
#define MAP_DENYWRITE 0x0800
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 6cefb4315d75..a5d49b3b6a09 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -10,25 +10,13 @@ extern struct timeval bench__start, bench__end, bench__runtime;
* The madvise transparent hugepage constants were added in glibc
* 2.13. For compatibility with older versions of glibc, define these
* tokens if they are not already defined.
- *
- * PA-RISC uses different madvise values from other architectures and
- * needs to be special-cased.
*/
-#ifdef __hppa__
-# ifndef MADV_HUGEPAGE
-# define MADV_HUGEPAGE 67
-# endif
-# ifndef MADV_NOHUGEPAGE
-# define MADV_NOHUGEPAGE 68
-# endif
-#else
# ifndef MADV_HUGEPAGE
# define MADV_HUGEPAGE 14
# endif
# ifndef MADV_NOHUGEPAGE
# define MADV_NOHUGEPAGE 15
# endif
-#endif
int bench_numa(int argc, const char **argv);
int bench_sched_messaging(int argc, const char **argv);
This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
sysfb device registration when removing conflicting FBs"), where we
remove the sysfb when loading a driver for an unrelated pci device,
resulting in the user losing their efifb console or similar.
Note that in practice this only is a problem with the nvidia blob,
because that's the only gpu driver people might install which does not
come with an fbdev driver of its own. For everyone else the real gpu
driver will restore a working console.
Also note that in the referenced bug there's confusion that this same
bug also happens on amdgpu. But that was just another amdgpu specific
regression, which just happened to happen at roughly the same time and
with the same user-observable symptoms. That bug is fixed now, see
https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
For the above reasons the cc: stable is just notional; this patch
will need a backport and that's up to nvidia if they care enough.
References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Aaron Plattner <aplattner(a)nvidia.com>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Helge Deller <deller(a)gmx.de>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v5.19+ (if someone else does the backport)
---
drivers/video/aperture.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
index ba565515480d..a1821d369bb1 100644
--- a/drivers/video/aperture.c
+++ b/drivers/video/aperture.c
@@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
primary = pdev == vga_default_device();
+ if (primary)
+ sysfb_disable();
+
for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
continue;
base = pci_resource_start(pdev, bar);
size = pci_resource_len(pdev, bar);
- ret = aperture_remove_conflicting_devices(base, size, name);
- if (ret)
- return ret;
+ aperture_detach_devices(base, size);
}
if (!primary)
--
2.39.0
The quilt patch titled
Subject: nommu: fix split_vma() map_count error
has been removed from the -mm tree. Its filename was
nommu-fix-split_vma-map_count-error.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Liam Howlett <liam.howlett(a)oracle.com>
Subject: nommu: fix split_vma() map_count error
Date: Mon, 9 Jan 2023 20:58:20 +0000
During the maple tree conversion of nommu, an error in counting the VMAs
was introduced by counting the existing VMA again. The counting used to
be decremented by one and incremented by two, but now it only increments
by two. Fix the counting error by moving the increment outside the
setup_vma_to_mm() function to the callers.
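The arithmetic behind this fix can be shown with a toy counter in user-space C. This is a sketch under assumptions: `toy_mm` and the helper names are hypothetical, and the model assumes, as the description implies, that a split runs the setup step once for each of the two resulting VMAs.

```c
#include <assert.h>

struct toy_mm { int map_count; };

/* Pre-fix behaviour: the setup step itself bumped the counter. */
static void setup_counting(struct toy_mm *mm)
{
	mm->map_count++;
}

/* Post-fix behaviour: the setup step leaves the counter alone. */
static void setup_plain(struct toy_mm *mm)
{
	(void)mm;
}

/* Split before the fix: both halves are counted, the old VMA twice. */
static void split_before_fix(struct toy_mm *mm)
{
	setup_counting(mm);	/* existing VMA, already counted */
	setup_counting(mm);	/* new VMA */
}

/* Split after the fix: the caller counts only the new VMA. */
static void split_after_fix(struct toy_mm *mm)
{
	setup_plain(mm);	/* existing VMA */
	setup_plain(mm);	/* new VMA */
	mm->map_count++;
}
```

Starting from one mapped VMA, the pre-fix split leaves the counter at 3 while the post-fix split leaves it at 2, which matches the two VMAs that actually exist.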
Link: https://lkml.kernel.org/r/20230109205809.956325-1-Liam.Howlett@oracle.com
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/nommu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/nommu.c~nommu-fix-split_vma-map_count-error
+++ a/mm/nommu.c
@@ -559,7 +559,6 @@ void vma_mas_remove(struct vm_area_struc
static void setup_vma_to_mm(struct vm_area_struct *vma, struct mm_struct *mm)
{
- mm->map_count++;
vma->vm_mm = mm;
/* add the VMA to the mapping */
@@ -587,6 +586,7 @@ static void mas_add_vma_to_mm(struct ma_
BUG_ON(!vma->vm_region);
setup_vma_to_mm(vma, mm);
+ mm->map_count++;
/* add the VMA to the tree */
vma_mas_store(vma, mas);
@@ -1347,6 +1347,7 @@ int split_vma(struct mm_struct *mm, stru
if (vma->vm_file)
return -ENOMEM;
+ mm = vma->vm_mm;
if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
@@ -1398,6 +1399,7 @@ int split_vma(struct mm_struct *mm, stru
mas_set_range(&mas, vma->vm_start, vma->vm_end - 1);
mas_store(&mas, vma);
vma_mas_store(new, &mas);
+ mm->map_count++;
return 0;
err_mas_preallocate:
_
Patches currently in -mm which might be from liam.howlett(a)oracle.com are
maple_tree-fix-mas_empty_area_rev-lower-bound-validation.patch
maple_tree-remove-gfp_zero-from-kmem_cache_alloc-and-kmem_cache_alloc_bulk.patch
The quilt patch titled
Subject: nommu: fix do_munmap() error path
has been removed from the -mm tree. Its filename was
nommu-fix-do_munmap-error-path.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Liam Howlett <liam.howlett(a)oracle.com>
Subject: nommu: fix do_munmap() error path
Date: Mon, 9 Jan 2023 20:57:21 +0000
When removing a VMA from the tree fails due to no memory, do not free the
VMA since a reference still exists.
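The rule "only free after a successful removal" can be sketched in user-space C. The names `toy_vma`, `toy_tree_remove()` and `toy_unmap()` are hypothetical, and the `simulate_enomem` flag is an assumption standing in for a real allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_vma { bool in_tree; };

/* Hypothetical removal step that can fail under memory pressure. */
static int toy_tree_remove(struct toy_vma *vma, bool simulate_enomem)
{
	if (simulate_enomem)
		return -ENOMEM;
	vma->in_tree = false;
	return 0;
}

/*
 * Free the object only if it was really removed; on failure the tree
 * still holds a reference to it, so freeing would leave a dangling pointer.
 */
static int toy_unmap(struct toy_vma *vma, bool simulate_enomem)
{
	if (toy_tree_remove(vma, simulate_enomem))
		return -ENOMEM;
	free(vma);
	return 0;
}
```

After a failed call the object is still valid and still marked as in the tree, so the caller can retry later.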
Link: https://lkml.kernel.org/r/20230109205708.956103-1-Liam.Howlett@oracle.com
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/nommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/nommu.c~nommu-fix-do_munmap-error-path
+++ a/mm/nommu.c
@@ -1509,7 +1509,8 @@ int do_munmap(struct mm_struct *mm, unsi
erase_whole_vma:
if (delete_vma_from_mm(vma))
ret = -ENOMEM;
- delete_vma(mm, vma);
+ else
+ delete_vma(mm, vma);
return ret;
}
_
The quilt patch titled
Subject: nommu: fix memory leak in do_mmap() error path
has been removed from the -mm tree. Its filename was
nommu-fix-memory-leak-in-do_mmap-error-path.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Liam Howlett <liam.howlett(a)oracle.com>
Subject: nommu: fix memory leak in do_mmap() error path
Date: Mon, 9 Jan 2023 20:55:21 +0000
The preallocation of the maple tree nodes may leak if the error path to
"error_just_free" is taken. Fix this by moving the freeing of the maple
tree nodes to a shared location for all error paths.
Link: https://lkml.kernel.org/r/20230109205507.955577-1-Liam.Howlett@oracle.com
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/nommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/nommu.c~nommu-fix-memory-leak-in-do_mmap-error-path
+++ a/mm/nommu.c
@@ -1240,6 +1240,7 @@ share:
error_just_free:
up_write(&nommu_region_sem);
error:
+ mas_destroy(&mas);
if (region->vm_file)
fput(region->vm_file);
kmem_cache_free(vm_region_jar, region);
@@ -1250,7 +1251,6 @@ error:
sharing_violation:
up_write(&nommu_region_sem);
- mas_destroy(&mas);
pr_warn("Attempt to share mismatched mappings\n");
ret = -EINVAL;
goto error;
_
The quilt patch titled
Subject: nilfs2: fix general protection fault in nilfs_btree_insert()
has been removed from the -mm tree. Its filename was
nilfs2-fix-general-protection-fault-in-nilfs_btree_insert.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix general protection fault in nilfs_btree_insert()
Date: Thu, 5 Jan 2023 14:53:56 +0900
If nilfs2 reads a corrupted disk image and tries to read a b-tree node
block by calling __nilfs_btree_get_block() against an invalid virtual
block address, it returns -ENOENT because conversion of the virtual block
address to a disk block address fails. However, this return value is the
same as the internal code that b-tree lookup routines return to indicate
that the block being searched does not exist, so functions that operate on
that b-tree may misbehave.
When nilfs_btree_insert() receives this spurious 'not found' code from
nilfs_btree_do_lookup(), it misunderstands that the 'not found' check was
successful and continues the insert operation using incomplete lookup path
data, causing the following crash:
general protection fault, probably for non-canonical address
0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
...
RIP: 0010:nilfs_btree_get_nonroot_node fs/nilfs2/btree.c:418 [inline]
RIP: 0010:nilfs_btree_prepare_insert fs/nilfs2/btree.c:1077 [inline]
RIP: 0010:nilfs_btree_insert+0x6d3/0x1c10 fs/nilfs2/btree.c:1238
Code: bc 24 80 00 00 00 4c 89 f8 48 c1 e8 03 42 80 3c 28 00 74 08 4c 89
ff e8 4b 02 92 fe 4d 8b 3f 49 83 c7 28 4c 89 f8 48 c1 e8 03 <42> 80 3c
28 00 74 08 4c 89 ff e8 2e 02 92 fe 4d 8b 3f 49 83 c7 02
...
Call Trace:
<TASK>
nilfs_bmap_do_insert fs/nilfs2/bmap.c:121 [inline]
nilfs_bmap_insert+0x20d/0x360 fs/nilfs2/bmap.c:147
nilfs_get_block+0x414/0x8d0 fs/nilfs2/inode.c:101
__block_write_begin_int+0x54c/0x1a80 fs/buffer.c:1991
__block_write_begin fs/buffer.c:2041 [inline]
block_write_begin+0x93/0x1e0 fs/buffer.c:2102
nilfs_write_begin+0x9c/0x110 fs/nilfs2/inode.c:261
generic_perform_write+0x2e4/0x5e0 mm/filemap.c:3772
__generic_file_write_iter+0x176/0x400 mm/filemap.c:3900
generic_file_write_iter+0xab/0x310 mm/filemap.c:3932
call_write_iter include/linux/fs.h:2186 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x7dc/0xc50 fs/read_write.c:584
ksys_write+0x177/0x2a0 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
...
</TASK>
This patch fixes the root cause of this problem by replacing the error
code that __nilfs_btree_get_block() returns on block address conversion
failure from -ENOENT to another internal code -EINVAL which means that the
b-tree metadata is corrupted.
By returning -EINVAL, it propagates without glitches, and for all relevant
b-tree operations, functions in the upper bmap layer output an error
message indicating corrupted b-tree metadata via
nilfs_bmap_convert_error(), and code -EIO will be eventually returned as
it should be.
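The error-code flow described above can be sketched in user-space C. This is a simplified model, not the nilfs2 code: `get_block_status()`, `convert_error()` and `insert_allowed()` are hypothetical names, and returning 0 for -EEXIST stands in for "the buffer already exists, carry on".

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the b-tree block read step. 'submit_ret' is
 * the result of submitting the block read:
 *   -EEXIST  the buffer is already present, not an error here
 *   -ENOENT  block address translation failed (corrupted metadata)
 */
static int get_block_status(int submit_ret)
{
	if (submit_ret == -EEXIST)
		return 0;
	if (submit_ret == -ENOENT)
		return -EINVAL;	/* corruption, not "key not found" */
	return submit_ret;
}

/* Hypothetical stand-in for the bmap layer converting internal codes. */
static int convert_error(int err)
{
	return err == -EINVAL ? -EIO : err;
}

/* An insert may proceed only when lookup reports a genuine "not found". */
static int insert_allowed(int lookup_ret)
{
	return lookup_ret == -ENOENT;
}
```

With the remapping in place, a failed address translation can no longer be mistaken for "not found": the insert is refused and the caller eventually sees -EIO.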
Link: https://lkml.kernel.org/r/000000000000bd89e205f0e38355@google.com
Link: https://lkml.kernel.org/r/20230105055356.8811-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+ede796cecd5296353515(a)syzkaller.appspotmail.com
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/btree.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--- a/fs/nilfs2/btree.c~nilfs2-fix-general-protection-fault-in-nilfs_btree_insert
+++ a/fs/nilfs2/btree.c
@@ -480,9 +480,18 @@ static int __nilfs_btree_get_block(const
ret = nilfs_btnode_submit_block(btnc, ptr, 0, REQ_OP_READ, &bh,
&submit_ptr);
if (ret) {
- if (ret != -EEXIST)
- return ret;
- goto out_check;
+ if (likely(ret == -EEXIST))
+ goto out_check;
+ if (ret == -ENOENT) {
+ /*
+ * Block address translation failed due to invalid
+ * value of 'ptr'. In this case, return internal code
+ * -EINVAL (broken bmap) to notify bmap layer of fatal
+ * metadata corruption.
+ */
+ ret = -EINVAL;
+ }
+ return ret;
}
if (ra) {
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
The quilt patch titled
Subject: mm/hugetlb: pre-allocate pgtable pages for uffd wr-protects
has been removed from the -mm tree. Its filename was
mm-hugetlb-pre-allocate-pgtable-pages-for-uffd-wr-protects.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/hugetlb: pre-allocate pgtable pages for uffd wr-protects
Date: Wed, 4 Jan 2023 17:52:05 -0500
Userfaultfd-wp uses pte markers to mark wr-protected pages for both shmem
and hugetlb. Shmem has pre-allocation ready for markers, but the hugetlb
path was overlooked.
Fix this by calling huge_pte_alloc() if the initial pgtable walk fails to
find the huge ptep. huge_pte_alloc() can fail under high memory pressure;
in that case stop the loop immediately and fail silently. This is not
the ideal solution, but it matches what we do with shmem and it avoids
the splat in dmesg.
Link: https://lkml.kernel.org/r/20230104225207.1066932-2-peterx@redhat.com
Fixes: 60dfaad65aa9 ("mm/hugetlb: allow uffd wr-protect none ptes")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: James Houghton <jthoughton(a)google.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: James Houghton <jthoughton(a)google.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-pre-allocate-pgtable-pages-for-uffd-wr-protects
+++ a/mm/hugetlb.c
@@ -6660,8 +6660,17 @@ unsigned long hugetlb_change_protection(
spinlock_t *ptl;
ptep = huge_pte_offset(mm, address, psize);
if (!ptep) {
- address |= last_addr_mask;
- continue;
+ if (!uffd_wp) {
+ address |= last_addr_mask;
+ continue;
+ }
+ /*
+ * Userfaultfd wr-protect requires pgtable
+ * pre-allocations to install pte markers.
+ */
+ ptep = huge_pte_alloc(mm, vma, address, psize);
+ if (!ptep)
+ break;
}
ptl = huge_pte_lock(h, mm, ptep);
if (huge_pmd_unshare(mm, vma, address, ptep)) {
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-uffd-fix-pte-marker-when-fork-without-fork-event.patch
mm-fix-a-few-rare-cases-of-using-swapin-error-pte-marker.patch
mm-uffd-always-wr-protect-pte-in-ptepmd_mkuffd_wp.patch
mm-hugetlb-let-vma_offset_start-to-return-start.patch
mm-hugetlb-dont-wait-for-migration-entry-during-follow-page.patch
mm-hugetlb-document-huge_pte_offset-usage.patch
mm-hugetlb-move-swap-entry-handling-into-vma-lock-when-faulted.patch
mm-hugetlb-make-userfaultfd_huge_must_wait-safe-to-pmd-unshare.patch
mm-hugetlb-make-hugetlb_follow_page_mask-safe-to-pmd-unshare.patch
mm-hugetlb-make-follow_hugetlb_page-safe-to-pmd-unshare.patch
mm-hugetlb-make-walk_hugetlb_range-safe-to-pmd-unshare.patch
mm-hugetlb-introduce-hugetlb_walk.patch
mm-mprotect-use-long-for-page-accountings-and-retval.patch
mm-uffd-detect-pgtable-allocation-failures.patch
The quilt patch titled
Subject: hugetlb: unshare some PMDs when splitting VMAs
has been removed from the -mm tree. Its filename was
hugetlb-unshare-some-pmds-when-splitting-vmas.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: James Houghton <jthoughton(a)google.com>
Subject: hugetlb: unshare some PMDs when splitting VMAs
Date: Wed, 4 Jan 2023 23:19:10 +0000
PMD sharing can only be done in PUD_SIZE-aligned pieces of VMAs; however,
it is possible that HugeTLB VMAs are split without unsharing the PMDs
first.
Without this fix, it is possible to hit the uffd-wp-related WARN_ON_ONCE
in hugetlb_change_protection [1]. The key there is that
hugetlb_unshare_all_pmds will not attempt to unshare PMDs in
non-PUD_SIZE-aligned sections of the VMA.
It might seem ideal to unshare in hugetlb_vm_op_open, but we need to
unshare in both the new and old VMAs, so unsharing in hugetlb_vm_op_split
seems natural.
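
The interval arithmetic described above can be sketched in user space as
follows. This is only an illustration, not the kernel code: the 1 GiB
PUD_SIZE is an assumption (x86-64 with 4 KiB base pages), and every
sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 1 GiB PUD_SIZE, as on x86-64 with 4 KiB base pages. */
#define SKETCH_PUD_SIZE (UINT64_C(1) << 30)
#define SKETCH_PUD_MASK (~(SKETCH_PUD_SIZE - 1))

/* Hypothetical result type: which range, if any, needs PMD unsharing. */
struct sketch_range {
    bool     unshare; /* true if PMDs in [start, end) should be unshared */
    uint64_t start;
    uint64_t end;
};

/*
 * Given a VMA [vm_start, vm_end) and a split address, return the
 * PUD_SIZE-aligned interval surrounding addr, provided the split breaks
 * PUD_SIZE alignment and the interval lies fully inside the VMA.
 */
static struct sketch_range sketch_split_unshare_range(uint64_t vm_start,
                                                      uint64_t vm_end,
                                                      uint64_t addr)
{
    struct sketch_range r = { false, 0, 0 };

    assert(vm_start < vm_end);

    /* A PUD_SIZE-aligned split keeps both halves aligned: nothing to do. */
    if (!(addr & ~SKETCH_PUD_MASK))
        return r;

    uint64_t pud_floor = addr & SKETCH_PUD_MASK;
    uint64_t pud_ceil = pud_floor + SKETCH_PUD_SIZE;

    /* Only a fully contained PUD interval can hold shared PMDs. */
    if (pud_floor >= vm_start && pud_ceil <= vm_end) {
        r.unshare = true;
        r.start = pud_floor;
        r.end = pud_ceil;
    }
    return r;
}
```

For a VMA covering [0, 4 GiB) split at 1 GiB + 2 MiB, the sketch selects
[1 GiB, 2 GiB), the one PUD interval that both resulting VMAs would
otherwise share without being PUD_SIZE-aligned.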
[1]: https://lore.kernel.org/linux-mm/CADrL8HVeOkj0QH5VZZbRzybNE8CG-tEGFshnA+bG9…
Link: https://lkml.kernel.org/r/20230104231910.1464197-1-jthoughton@google.com
Fixes: 6dfeaff93be1 ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when register wp")
Signed-off-by: James Houghton <jthoughton(a)google.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)
--- a/mm/hugetlb.c~hugetlb-unshare-some-pmds-when-splitting-vmas
+++ a/mm/hugetlb.c
@@ -94,6 +94,8 @@ static int hugetlb_acct_memory(struct hs
static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma);
+static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end);
static inline bool subpool_is_free(struct hugepage_subpool *spool)
{
@@ -4834,6 +4836,25 @@ static int hugetlb_vm_op_split(struct vm
{
if (addr & ~(huge_page_mask(hstate_vma(vma))))
return -EINVAL;
+
+ /*
+ * PMD sharing is only possible for PUD_SIZE-aligned address ranges
+ * in HugeTLB VMAs. If we will lose PUD_SIZE alignment due to this
+ * split, unshare PMDs in the PUD_SIZE interval surrounding addr now.
+ */
+ if (addr & ~PUD_MASK) {
+ /*
+ * hugetlb_vm_op_split is called right before we attempt to
+ * split the VMA. We will need to unshare PMDs in the old and
+ * new VMAs, so let's unshare before we split.
+ */
+ unsigned long floor = addr & PUD_MASK;
+ unsigned long ceil = floor + PUD_SIZE;
+
+ if (floor >= vma->vm_start && ceil <= vma->vm_end)
+ hugetlb_unshare_pmds(vma, floor, ceil);
+ }
+
return 0;
}
@@ -7322,26 +7343,21 @@ void move_hugetlb_state(struct folio *ol
}
}
-/*
- * This function will unconditionally remove all the shared pmd pgtable entries
- * within the specific vma for a hugetlbfs memory range.
- */
-void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
+static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
+ unsigned long start,
+ unsigned long end)
{
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
struct mm_struct *mm = vma->vm_mm;
struct mmu_notifier_range range;
- unsigned long address, start, end;
+ unsigned long address;
spinlock_t *ptl;
pte_t *ptep;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- start = ALIGN(vma->vm_start, PUD_SIZE);
- end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);
-
if (start >= end)
return;
@@ -7373,6 +7389,16 @@ void hugetlb_unshare_all_pmds(struct vm_
mmu_notifier_invalidate_range_end(&range);
}
+/*
+ * This function will unconditionally remove all the shared pmd pgtable entries
+ * within the specific vma for a hugetlbfs memory range.
+ */
+void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
+{
+ hugetlb_unshare_pmds(vma, ALIGN(vma->vm_start, PUD_SIZE),
+ ALIGN_DOWN(vma->vm_end, PUD_SIZE));
+}
+
#ifdef CONFIG_CMA
static bool cma_reserve_called __initdata;
_
Patches currently in -mm which might be from jthoughton(a)google.com are
The quilt patch titled
Subject: mm/shmem: restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE
has been removed from the -mm tree. Its filename was
mm-shmem-restore-shmem_huge_deny-precedence-over-madv_collapse.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Zach O'Keefe" <zokeefe(a)google.com>
Subject: mm/shmem: restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE
Date: Sat, 24 Dec 2022 00:20:35 -0800
SHMEM_HUGE_DENY is for emergency use by the admin, to disable allocation
of shmem huge pages if, for example, a dangerous bug is found in their
usage: see "deny" in Documentation/mm/transhuge.rst. An app using
madvise(,,MADV_COLLAPSE) should not be allowed to override it: restore its
precedence over shmem_huge_force.
Restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE.
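
A minimal user-space sketch of the restored check order is given here for
illustration. It is not the kernel's shmem_is_huge(): the enum, the
function names, and the single sb_huge_always flag (standing in for the
per-superblock policy switch) are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the global shmem_huge setting. */
enum sketch_shmem_huge {
    SKETCH_HUGE_NEVER,
    SKETCH_HUGE_ALWAYS,
    SKETCH_HUGE_DENY,
    SKETCH_HUGE_FORCE,
};

/*
 * Order of checks after the fix: "deny" is tested first, so neither the
 * per-call force flag (as passed for MADV_COLLAPSE) nor the global
 * "force" setting can override it.
 */
static bool sketch_shmem_is_huge(enum sketch_shmem_huge global,
                                 bool huge_force, bool sb_huge_always)
{
    if (global == SKETCH_HUGE_DENY)
        return false;
    if (huge_force || global == SKETCH_HUGE_FORCE)
        return true;
    /* Simplification: the real code switches on the superblock policy. */
    return sb_huge_always;
}

/* Usage example for the sketch above. */
static void sketch_shmem_demo(void)
{
    /* "deny" wins even when the caller asks to force huge pages. */
    assert(!sketch_shmem_is_huge(SKETCH_HUGE_DENY, true, true));
    /* Without "deny", the force flag still enables huge pages. */
    assert(sketch_shmem_is_huge(SKETCH_HUGE_NEVER, true, false));
}
```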
Link: https://lkml.kernel.org/r/20221224082035.3197140-2-zokeefe@google.com
Fixes: 7c6c6cc4d3a2 ("mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()")
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/mm/shmem.c~mm-shmem-restore-shmem_huge_deny-precedence-over-madv_collapse
+++ a/mm/shmem.c
@@ -478,12 +478,10 @@ bool shmem_is_huge(struct vm_area_struct
if (vma && ((vma->vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
return false;
- if (shmem_huge_force)
- return true;
- if (shmem_huge == SHMEM_HUGE_FORCE)
- return true;
if (shmem_huge == SHMEM_HUGE_DENY)
return false;
+ if (shmem_huge_force || shmem_huge == SHMEM_HUGE_FORCE)
+ return true;
switch (SHMEM_SB(inode->i_sb)->huge) {
case SHMEM_HUGE_ALWAYS:
_
Patches currently in -mm which might be from zokeefe(a)google.com are
The quilt patch titled
Subject: mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end
has been removed from the -mm tree. Its filename was
mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Zach O'Keefe" <zokeefe(a)google.com>
Subject: mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end
Date: Sat, 24 Dec 2022 00:20:34 -0800
MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until
it has collapsed all eligible memory contained within the bounds supplied
by the user.
At the top of each hugepage iteration we (re)lock mmap_lock and
(re)validate the VMA for eligibility and update variables that might have
changed while mmap_lock was dropped. One thing that might occur is that
the VMA could be resized, and as such, we refetch vma->vm_end to make sure
we don't collapse past the end of the VMA's new end.
However, when refetching vma->vm_end, we may expand the region acted on
by MADV_COLLAPSE if vma->vm_end is greater than the end of the range
supplied by the user.
The consequence here is that we may attempt to collapse more memory than
requested, possibly yielding either "too much success" or "false failure"
user-visible results. An example of the former is if we MADV_COLLAPSE the
first 4MiB of a 2TiB mmap()'d file, the incorrect refetch would cause the
operation to block for much longer than anticipated as we attempt to
collapse the entire TiB region. An example of the latter is that applying
MADV_COLLAPSE to a 4MiB file mapped to the start of a 6MiB VMA will
successfully collapse the first 4MiB, then incorrectly attempt to collapse
the last hugepage-aligned/sized region -- fail (since readahead/page cache
lookup will fail) -- and report a failure to the user.
I don't believe there is a kernel stability concern here as we always
(re)validate the VMA / region accordingly. Also as Hugh mentions, the
user-visible effects are: we try to collapse more memory than requested
by the user, and/or failing an operation that should have otherwise
succeeded. An example is trying to collapse a 4MiB file contained
within a 12MiB VMA.
Don't expand the acted-on region when refetching vma->vm_end.
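
The effect of the clamp can be worked through with the 4MiB-in-6MiB
example from above. The following is a user-space sketch only: the 2 MiB
huge page size is an assumption (x86-64), and the sketch_* helpers are
hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD-sized huge pages, as on x86-64. */
#define SKETCH_HPAGE_PMD_SIZE (UINT64_C(2) << 20)
#define SKETCH_HPAGE_PMD_MASK (~(SKETCH_HPAGE_PMD_SIZE - 1))

static uint64_t sketch_min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Before the fix: the refetch overwrites the end requested by the user. */
static uint64_t sketch_refetch_hend_old(uint64_t hend, uint64_t vm_end)
{
    (void)hend; /* the requested end is ignored, which is the bug */
    return vm_end & SKETCH_HPAGE_PMD_MASK;
}

/* After the fix: the refetch may shrink the end but never grow it. */
static uint64_t sketch_refetch_hend_new(uint64_t hend, uint64_t vm_end)
{
    /* hend is expected to be hugepage-aligned already. */
    assert((hend & ~SKETCH_HPAGE_PMD_MASK) == 0);
    return sketch_min_u64(hend, vm_end & SKETCH_HPAGE_PMD_MASK);
}
```

With hend = 4 MiB and vm_end = 6 MiB, the old computation yields 6 MiB
(one extra hugepage-sized region is attempted), while the new one stays
at 4 MiB. If the VMA shrinks to 2 MiB in the meantime, the new
computation still follows it down to 2 MiB.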
Link: https://lkml.kernel.org/r/20221224082035.3197140-1-zokeefe@google.com
Fixes: 4d24de9425f7 ("mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock")
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
Reported-by: Hugh Dickins <hughd(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/khugepaged.c~mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end
+++ a/mm/khugepaged.c
@@ -2647,7 +2647,7 @@ int madvise_collapse(struct vm_area_stru
goto out_nolock;
}
- hend = vma->vm_end & HPAGE_PMD_MASK;
+ hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
}
mmap_assert_locked(mm);
memset(cc->node_load, 0, sizeof(cc->node_load));
_
Patches currently in -mm which might be from zokeefe(a)google.com are
The quilt patch titled
Subject: mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA
has been removed from the -mm tree. Its filename was
mm-userfaultfd-enable-writenotify-while-userfaultfd-wp-is-enabled-for-a-vma.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA
Date: Fri, 9 Dec 2022 09:09:12 +0100
Currently, we don't enable writenotify when enabling userfaultfd-wp on a
shared writable mapping (for now only shmem and hugetlb). The consequence
is that vma->vm_page_prot will still include write permissions, to be set
as default for all PTEs that get remapped (e.g., mprotect(), NUMA hinting,
page migration, ...).
So far, vma->vm_page_prot is assumed to be a safe default, meaning that we
only add permissions (e.g., mkwrite) but not remove permissions (e.g.,
wrprotect). For example, when enabling softdirty tracking, we enable
writenotify. With uffd-wp on shared mappings, that changed. More details
on vma->vm_page_prot semantics were summarized in [1].
This is problematic for uffd-wp: we'd have to manually check for uffd-wp
PTEs/PMDs and manually write-protect PTEs/PMDs, which is error prone.
Prone to such issues is any code that uses vma->vm_page_prot to set PTE
permissions: primarily pte_modify() and mk_pte().
Instead, let's enable writenotify such that PTEs/PMDs/... will be mapped
write-protected as default and we will only allow selected PTEs that are
definitely safe to be mapped without write-protection (see
can_change_pte_writable()) to be writable. In the future, we might want
to enable write-bit recovery -- e.g., can_change_pte_writable() -- at more
locations, for example, also when removing uffd-wp protection.
This fixes two known cases:
(a) remove_migration_pte() mapping uffd-wp'ed PTEs writable, resulting
in uffd-wp not triggering on write access.
(b) do_numa_page() / do_huge_pmd_numa_page() mapping uffd-wp'ed PTEs/PMDs
writable, resulting in uffd-wp not triggering on write access.
Note that do_numa_page() / do_huge_pmd_numa_page() can be reached even
without NUMA hinting (which currently doesn't seem to be applicable to
shmem), for example, by using uffd-wp with a PROT_WRITE shmem VMA. On
such a VMA, userfaultfd-wp is currently non-functional.
Note that when enabling userfaultfd-wp, there is no need to walk page
tables to enforce the new default protection for the PTEs: we know that
they cannot be uffd-wp'ed yet, because that can only happen after enabling
uffd-wp for the VMA in general.
Also note that this makes mprotect() on ranges with uffd-wp'ed PTEs not
accidentally set the write bit -- which would result in uffd-wp not
triggering on later write access. This commit makes uffd-wp on shmem
behave just like uffd-wp on anonymous memory in that regard, even though
mixing mprotect with uffd-wp is controversial.
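
The idea of recalculating the default protection only when the uffd-wp
flag actually changes on a shared mapping can be sketched in user space
as follows. The flag bit values, struct sketch_vma, and the prot_recalcs
counter (standing in for a call to vma_set_page_prot()) are hypothetical
and chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the real VM_* constants may differ. */
#define SKETCH_VM_SHARED  (UINT64_C(1) << 3)
#define SKETCH_VM_UFFD_WP (UINT64_C(1) << 12)

struct sketch_vma {
    uint64_t     vm_flags;
    unsigned int prot_recalcs; /* counts simulated page-prot recalculations */
};

/*
 * Store the new flags and, for shared mappings only, recompute the
 * default page protection when the uffd-wp bit was toggled.
 */
static void sketch_set_vm_flags(struct sketch_vma *vma, uint64_t flags)
{
    assert(vma != NULL);

    /* XOR isolates the bits that differ between old and new flags. */
    const bool uffd_wp_changed =
        ((vma->vm_flags ^ flags) & SKETCH_VM_UFFD_WP) != 0;

    vma->vm_flags = flags;
    if ((vma->vm_flags & SKETCH_VM_SHARED) && uffd_wp_changed)
        vma->prot_recalcs++;
}
```

Setting the same flags twice triggers only one recalculation, and a
private mapping never triggers one in this sketch.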
[1] https://lkml.kernel.org/r/92173bad-caa3-6b43-9d1e-9a471fdbc184@redhat.com
Link: https://lkml.kernel.org/r/20221209080912.7968-1-david@redhat.com
Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Ives van Hoorne <ives(a)codesandbox.io>
Debugged-by: Peter Xu <peterx(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 28 ++++++++++++++++++++++------
mm/mmap.c | 4 ++++
2 files changed, 26 insertions(+), 6 deletions(-)
--- a/fs/userfaultfd.c~mm-userfaultfd-enable-writenotify-while-userfaultfd-wp-is-enabled-for-a-vma
+++ a/fs/userfaultfd.c
@@ -108,6 +108,21 @@ static bool userfaultfd_is_initialized(s
return ctx->features & UFFD_FEATURE_INITIALIZED;
}
+static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
+ vm_flags_t flags)
+{
+ const bool uffd_wp_changed = (vma->vm_flags ^ flags) & VM_UFFD_WP;
+
+ vma->vm_flags = flags;
+ /*
+ * For shared mappings, we want to enable writenotify while
+ * userfaultfd-wp is enabled (see vma_wants_writenotify()). We'll simply
+ * recalculate vma->vm_page_prot whenever userfaultfd-wp changes.
+ */
+ if ((vma->vm_flags & VM_SHARED) && uffd_wp_changed)
+ vma_set_page_prot(vma);
+}
+
static int userfaultfd_wake_function(wait_queue_entry_t *wq, unsigned mode,
int wake_flags, void *key)
{
@@ -618,7 +633,8 @@ static void userfaultfd_event_wait_compl
for_each_vma(vmi, vma) {
if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) {
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
- vma->vm_flags &= ~__VM_UFFD_FLAGS;
+ userfaultfd_set_vm_flags(vma,
+ vma->vm_flags & ~__VM_UFFD_FLAGS);
}
}
mmap_write_unlock(mm);
@@ -652,7 +668,7 @@ int dup_userfaultfd(struct vm_area_struc
octx = vma->vm_userfaultfd_ctx.ctx;
if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
- vma->vm_flags &= ~__VM_UFFD_FLAGS;
+ userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
return 0;
}
@@ -733,7 +749,7 @@ void mremap_userfaultfd_prep(struct vm_a
} else {
/* Drop uffd context if remap feature not enabled */
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
- vma->vm_flags &= ~__VM_UFFD_FLAGS;
+ userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
}
}
@@ -895,7 +911,7 @@ static int userfaultfd_release(struct in
prev = vma;
}
- vma->vm_flags = new_flags;
+ userfaultfd_set_vm_flags(vma, new_flags);
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
}
mmap_write_unlock(mm);
@@ -1463,7 +1479,7 @@ static int userfaultfd_register(struct u
* the next vma was merged into the current one and
* the current one has not been updated yet.
*/
- vma->vm_flags = new_flags;
+ userfaultfd_set_vm_flags(vma, new_flags);
vma->vm_userfaultfd_ctx.ctx = ctx;
if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma))
@@ -1651,7 +1667,7 @@ static int userfaultfd_unregister(struct
* the next vma was merged into the current one and
* the current one has not been updated yet.
*/
- vma->vm_flags = new_flags;
+ userfaultfd_set_vm_flags(vma, new_flags);
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
skip:
--- a/mm/mmap.c~mm-userfaultfd-enable-writenotify-while-userfaultfd-wp-is-enabled-for-a-vma
+++ a/mm/mmap.c
@@ -1524,6 +1524,10 @@ int vma_wants_writenotify(struct vm_area
if (vma_soft_dirty_enabled(vma) && !is_vm_hugetlb_page(vma))
return 1;
+ /* Do we need write faults for uffd-wp tracking? */
+ if (userfaultfd_wp(vma))
+ return 1;
+
/* Specialty mapping? */
if (vm_flags & VM_PFNMAP)
return 0;
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-userfaultfd-rely-on-vma-vm_page_prot-in-uffd_wp_range.patch
mm-userfaultfd-rely-on-vma-vm_page_prot-in-uffd_wp_range-fix.patch
mm-mprotect-drop-pgprot_t-parameter-from-change_protection.patch
mm-mprotect-drop-pgprot_t-parameter-from-change_protection-fix.patch
selftests-vm-cow-add-cow-tests-for-collapsing-of-pte-mapped-anon-thp.patch
mm-nommu-factor-out-check-for-nommu-shared-mappings-into-is_nommu_shared_mapping.patch
mm-nommu-dont-use-vm_mayshare-for-map_private-mappings.patch
drivers-misc-open-dice-dont-touch-vm_mayshare.patch
selftests-mm-define-madv_pageout-to-fix-compilation-issues.patch
The quilt patch titled
Subject: mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
has been removed from the -mm tree. Its filename was
mm-khugepaged-fix-collapse_pte_mapped_thp-to-allow-anon_vma.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
Date: Thu, 22 Dec 2022 12:41:50 -0800 (PST)
uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd,
when removing a breakpoint from hugepage text: vma->anon_vma is always set
in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able
to collapse some page tables in a vma which happens to have anon_vma set
from CoWing elsewhere.
Is anon_vma lock required? Almost not: if any page other than expected
subpage of the non-anon huge page is found in the page table, collapse is
aborted without making any change. However, it is possible that an anon
page was CoWed from this extent in another mm or vma, in which case a
concurrent lookup might look here: so keep it away while clearing pmd (but
perhaps we shall go back to using pmd_lock() there in future).
Note that collapse_pte_mapped_thp() is exceptional in freeing a page table
without having cleared its ptes: I'm uneasy about that, and had thought
pte_clear()ing appropriate; but exclusive i_mmap lock does fix the
problem, and we would have to move the mmu_notification if clearing those
ptes.
What this fixes is not a dangerous instability. But I suggest Cc stable
because uprobes "healing" has regressed in that way, so this should follow
8d3c106e19e8 into those stable releases where it was backported (and may
want adjustment there - I'll supply backports as needed).
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com
Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Zach O'Keefe <zokeefe(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
--- a/mm/khugepaged.c~mm-khugepaged-fix-collapse_pte_mapped_thp-to-allow-anon_vma
+++ a/mm/khugepaged.c
@@ -1460,14 +1460,6 @@ int collapse_pte_mapped_thp(struct mm_st
if (!hugepage_vma_check(vma, vma->vm_flags, false, false, false))
return SCAN_VMA_CHECK;
- /*
- * Symmetry with retract_page_tables(): Exclude MAP_PRIVATE mappings
- * that got written to. Without this, we'd have to also lock the
- * anon_vma if one exists.
- */
- if (vma->anon_vma)
- return SCAN_VMA_CHECK;
-
/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
if (userfaultfd_wp(vma))
return SCAN_PTE_UFFD_WP;
@@ -1567,8 +1559,14 @@ int collapse_pte_mapped_thp(struct mm_st
}
/* step 4: remove pte entries */
+ /* we make no change to anon, but protect concurrent anon page lookup */
+ if (vma->anon_vma)
+ anon_vma_lock_write(vma->anon_vma);
+
collapse_and_free_pmd(mm, vma, haddr, pmd);
+ if (vma->anon_vma)
+ anon_vma_unlock_write(vma->anon_vma);
i_mmap_unlock_write(vma->vm_file->f_mapping);
maybe_install_pmd:
_
Patches currently in -mm which might be from hughd(a)google.com are
The quilt patch titled
Subject: mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection()
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-uffd-wp-handling-for-migration-entries-in-hugetlb_change_protection.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection()
Date: Thu, 22 Dec 2022 21:55:11 +0100
We have to update the uffd-wp SWP PTE bit independent of the type of
migration entry. Currently, if we're unlucky and we want to install/clear
the uffd-wp bit just while we're migrating a read-only mapped hugetlb
page, we would fail to set/clear the uffd-wp bit.
Further, if we're processing a readable-exclusive migration entry and
neither want to set or clear the uffd-wp bit, we could currently end up
losing the uffd-wp bit. Note that the same would hold for writable
migration entries; however, having a writable migration entry with the
uffd-wp bit set would already mean that something went wrong.
Note that the change from !is_readable_migration_entry ->
writable_migration_entry is harmless and actually cleaner, as raised by
Miaohe Lin and discussed in [1].
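
The intended behavior can be modeled in user space as below. This is a
simplified sketch, not the kernel code: the enum, the struct, and the
function are hypothetical, and the model assumes that rebuilding a PTE
from a swap entry starts with the uffd-wp bit clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a hugetlb migration entry. */
enum sketch_mig {
    SKETCH_MIG_READ,
    SKETCH_MIG_READ_EXCL,
    SKETCH_MIG_WRITE,
};

struct sketch_swp_pte {
    enum sketch_mig kind;
    bool            uffd_wp; /* models the uffd-wp SWP PTE bit */
};

/*
 * Downgrade writable entries, then apply the uffd-wp request to every
 * kind of migration entry. *wrote reports whether the PTE would be
 * rewritten at all.
 */
static struct sketch_swp_pte sketch_change_prot(struct sketch_swp_pte pte,
                                                bool page_is_anon,
                                                bool uffd_wp,
                                                bool uffd_wp_resolve,
                                                bool *wrote)
{
    struct sketch_swp_pte newpte = pte;

    assert(!(uffd_wp && uffd_wp_resolve));

    if (pte.kind == SKETCH_MIG_WRITE) {
        newpte.kind = page_is_anon ? SKETCH_MIG_READ_EXCL : SKETCH_MIG_READ;
        /* Assumption: a PTE rebuilt from the entry has the bit clear. */
        newpte.uffd_wp = false;
    }

    if (uffd_wp)
        newpte.uffd_wp = true;
    else if (uffd_wp_resolve)
        newpte.uffd_wp = false;

    *wrote = newpte.kind != pte.kind || newpte.uffd_wp != pte.uffd_wp;
    return newpte;
}
```

In this model a read-only entry now picks up the uffd-wp bit, and a
readable-exclusive entry that already carries the bit is left untouched
when neither setting nor clearing is requested.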
[1] https://lkml.kernel.org/r/90dd6a93-4500-e0de-2bf0-bf522c311b0c@huawei.com
Link: https://lkml.kernel.org/r/20221222205511.675832-3-david@redhat.com
Fixes: 60dfaad65aa9 ("mm/hugetlb: allow uffd wr-protect none ptes")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-uffd-wp-handling-for-migration-entries-in-hugetlb_change_protection
+++ a/mm/hugetlb.c
@@ -6662,10 +6662,9 @@ unsigned long hugetlb_change_protection(
} else if (unlikely(is_hugetlb_entry_migration(pte))) {
swp_entry_t entry = pte_to_swp_entry(pte);
struct page *page = pfn_swap_entry_to_page(entry);
+ pte_t newpte = pte;
- if (!is_readable_migration_entry(entry)) {
- pte_t newpte;
-
+ if (is_writable_migration_entry(entry)) {
if (PageAnon(page))
entry = make_readable_exclusive_migration_entry(
swp_offset(entry));
@@ -6673,13 +6672,15 @@ unsigned long hugetlb_change_protection(
entry = make_readable_migration_entry(
swp_offset(entry));
newpte = swp_entry_to_pte(entry);
- if (uffd_wp)
- newpte = pte_swp_mkuffd_wp(newpte);
- else if (uffd_wp_resolve)
- newpte = pte_swp_clear_uffd_wp(newpte);
- set_huge_pte_at(mm, address, ptep, newpte);
pages++;
}
+
+ if (uffd_wp)
+ newpte = pte_swp_mkuffd_wp(newpte);
+ else if (uffd_wp_resolve)
+ newpte = pte_swp_clear_uffd_wp(newpte);
+ if (!pte_same(pte, newpte))
+ set_huge_pte_at(mm, address, ptep, newpte);
} else if (unlikely(is_pte_marker(pte))) {
/* No other markers apply for now. */
WARN_ON_ONCE(!pte_marker_uffd_wp(pte));
_
The quilt patch titled
Subject: mm/hugetlb: fix PTE marker handling in hugetlb_change_protection()
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-pte-marker-handling-in-hugetlb_change_protection.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/hugetlb: fix PTE marker handling in hugetlb_change_protection()
Date: Thu, 22 Dec 2022 21:55:10 +0100
Patch series "mm/hugetlb: uffd-wp fixes for hugetlb_change_protection()".
Playing with virtio-mem and background snapshots (using uffd-wp) on
hugetlb in QEMU, I managed to trigger a VM_BUG_ON(). Looking into the
details, hugetlb_change_protection() seems to not handle uffd-wp correctly
in all cases.
Patch #1 fixes my test case. I don't have reproducers for patch #2, as it
requires running into migration entries.
I did not yet check in detail if !hugetlb code requires similar care.
This patch (of 2):
There are two problematic cases when stumbling over a PTE marker in
hugetlb_change_protection():
(1) We protect an uffd-wp PTE marker a second time using uffd-wp: we will
end up in the "!huge_pte_none(pte)" case and mess up the PTE marker.
(2) We unprotect a uffd-wp PTE marker: we will similarly end up in the
"!huge_pte_none(pte)" case even though we cleared the PTE, because
the "pte" variable is stale. We'll mess up the PTE marker.
For example, if we later stumble over such a "wrongly modified" PTE marker,
we'll treat it like a present PTE that maps some garbage page.
This can, for example, be triggered by mapping a memfd backed by huge
pages, registering uffd-wp, uffd-wp'ing an unmapped page and (a)
uffd-wp'ing it a second time; or (b) uffd-unprotecting it; or (c)
unregistering uffd-wp. Then, if we trigger fallocate(FALLOC_FL_PUNCH_HOLE)
on that file range, we will run into a VM_BUG_ON:
[ 195.039560] page:00000000ba1f2987 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x0
[ 195.039565] flags: 0x7ffffc0001000(reserved|node=0|zone=0|lastcpupid=0x1fffff)
[ 195.039568] raw: 0007ffffc0001000 ffffe742c0000008 ffffe742c0000008 0000000000000000
[ 195.039569] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 195.039569] page dumped because: VM_BUG_ON_PAGE(compound && !PageHead(page))
[ 195.039573] ------------[ cut here ]------------
[ 195.039574] kernel BUG at mm/rmap.c:1346!
[ 195.039579] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 195.039581] CPU: 7 PID: 4777 Comm: qemu-system-x86 Not tainted 6.0.12-200.fc36.x86_64 #1
[ 195.039583] Hardware name: LENOVO 20WNS1F81N/20WNS1F81N, BIOS N35ET50W (1.50 ) 09/15/2022
[ 195.039584] RIP: 0010:page_remove_rmap+0x45b/0x550
[ 195.039588] Code: [...]
[ 195.039589] RSP: 0018:ffffbc03c3633ba8 EFLAGS: 00010292
[ 195.039591] RAX: 0000000000000040 RBX: ffffe742c0000000 RCX: 0000000000000000
[ 195.039592] RDX: 0000000000000002 RSI: ffffffff8e7aac1a RDI: 00000000ffffffff
[ 195.039592] RBP: 0000000000000001 R08: 0000000000000000 R09: ffffbc03c3633a08
[ 195.039593] R10: 0000000000000003 R11: ffffffff8f146328 R12: ffff9b04c42754b0
[ 195.039594] R13: ffffffff8fcc6328 R14: ffffbc03c3633c80 R15: ffff9b0484ab9100
[ 195.039595] FS: 00007fc7aaf68640(0000) GS:ffff9b0bbf7c0000(0000) knlGS:0000000000000000
[ 195.039596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 195.039597] CR2: 000055d402c49110 CR3: 0000000159392003 CR4: 0000000000772ee0
[ 195.039598] PKRU: 55555554
[ 195.039599] Call Trace:
[ 195.039600] <TASK>
[ 195.039602] __unmap_hugepage_range+0x33b/0x7d0
[ 195.039605] unmap_hugepage_range+0x55/0x70
[ 195.039608] hugetlb_vmdelete_list+0x77/0xa0
[ 195.039611] hugetlbfs_fallocate+0x410/0x550
[ 195.039612] ? _raw_spin_unlock_irqrestore+0x23/0x40
[ 195.039616] vfs_fallocate+0x12e/0x360
[ 195.039618] __x64_sys_fallocate+0x40/0x70
[ 195.039620] do_syscall_64+0x58/0x80
[ 195.039623] ? syscall_exit_to_user_mode+0x17/0x40
[ 195.039624] ? do_syscall_64+0x67/0x80
[ 195.039626] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 195.039628] RIP: 0033:0x7fc7b590651f
[ 195.039653] Code: [...]
[ 195.039654] RSP: 002b:00007fc7aaf66e70 EFLAGS: 00000293 ORIG_RAX: 000000000000011d
[ 195.039655] RAX: ffffffffffffffda RBX: 0000558ef4b7f370 RCX: 00007fc7b590651f
[ 195.039656] RDX: 0000000018000000 RSI: 0000000000000003 RDI: 000000000000000c
[ 195.039657] RBP: 0000000008000000 R08: 0000000000000000 R09: 0000000000000073
[ 195.039658] R10: 0000000008000000 R11: 0000000000000293 R12: 0000000018000000
[ 195.039658] R13: 00007fb8bbe00000 R14: 000000000000000c R15: 0000000000001000
[ 195.039661] </TASK>
Fix it by not going into the "!huge_pte_none(pte)" case if we stumble over
an exclusive marker. spin_unlock() + continue would get the job done.
However, instead, make it clearer that there are no fall-through
statements: we process each case (hwpoison, migration, marker, !none,
none) and then unlock the page table to continue with the next PTE. Let's
avoid "continue" statements and use a single spin_unlock() at the end.
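
The "exactly one case per PTE" structure can be illustrated with a small
user-space classifier. This is a sketch under assumptions: the enums and
the function are hypothetical, and the behavior of the final "none"
branch (installing a marker when write-protecting) is assumed from the
commit named in the Fixes: tag rather than shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical PTE states and resulting actions, for illustration only. */
enum sketch_pte {
    SKETCH_PTE_NONE,
    SKETCH_PTE_HWPOISON,
    SKETCH_PTE_MIGRATION,
    SKETCH_PTE_MARKER_UFFD_WP,
    SKETCH_PTE_PRESENT,
};

enum sketch_action {
    SKETCH_DO_NOTHING,
    SKETCH_UPDATE_MIGRATION,
    SKETCH_CLEAR_MARKER,
    SKETCH_MODIFY_PRESENT,
    SKETCH_INSTALL_MARKER,
};

/*
 * Each PTE state is handled by exactly one branch. A marker can no
 * longer fall through into the "present" branch, whether it is
 * protected a second time or unprotected.
 */
static enum sketch_action sketch_classify(enum sketch_pte pte,
                                          bool uffd_wp,
                                          bool uffd_wp_resolve)
{
    assert(!(uffd_wp && uffd_wp_resolve));

    if (pte == SKETCH_PTE_HWPOISON) {
        return SKETCH_DO_NOTHING;
    } else if (pte == SKETCH_PTE_MIGRATION) {
        return SKETCH_UPDATE_MIGRATION;
    } else if (pte == SKETCH_PTE_MARKER_UFFD_WP) {
        /* Re-protecting an existing marker leaves it alone. */
        return uffd_wp_resolve ? SKETCH_CLEAR_MARKER : SKETCH_DO_NOTHING;
    } else if (pte != SKETCH_PTE_NONE) {
        return SKETCH_MODIFY_PRESENT;
    } else {
        /* Assumed none-PTE handling: install a marker when protecting. */
        return uffd_wp ? SKETCH_INSTALL_MARKER : SKETCH_DO_NOTHING;
    }
}
```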
Link: https://lkml.kernel.org/r/20221222205511.675832-1-david@redhat.com
Link: https://lkml.kernel.org/r/20221222205511.675832-2-david@redhat.com
Fixes: 60dfaad65aa9 ("mm/hugetlb: allow uffd wr-protect none ptes")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-pte-marker-handling-in-hugetlb_change_protection
+++ a/mm/hugetlb.c
@@ -6658,10 +6658,8 @@ unsigned long hugetlb_change_protection(
}
pte = huge_ptep_get(ptep);
if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) {
- spin_unlock(ptl);
- continue;
- }
- if (unlikely(is_hugetlb_entry_migration(pte))) {
+ /* Nothing to do. */
+ } else if (unlikely(is_hugetlb_entry_migration(pte))) {
swp_entry_t entry = pte_to_swp_entry(pte);
struct page *page = pfn_swap_entry_to_page(entry);
@@ -6682,18 +6680,13 @@ unsigned long hugetlb_change_protection(
set_huge_pte_at(mm, address, ptep, newpte);
pages++;
}
- spin_unlock(ptl);
- continue;
- }
- if (unlikely(pte_marker_uffd_wp(pte))) {
- /*
- * This is changing a non-present pte into a none pte,
- * no need for huge_ptep_modify_prot_start/commit().
- */
+ } else if (unlikely(is_pte_marker(pte))) {
+ /* No other markers apply for now. */
+ WARN_ON_ONCE(!pte_marker_uffd_wp(pte));
if (uffd_wp_resolve)
+ /* Safe to modify directly (non-present->none). */
huge_pte_clear(mm, address, ptep, psize);
- }
- if (!huge_pte_none(pte)) {
+ } else if (!huge_pte_none(pte)) {
pte_t old_pte;
unsigned int shift = huge_page_shift(hstate_vma(vma));
_
The patch titled
Subject: mm/khugepaged: fix ->anon_vma race
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-khugepaged-fix-anon_vma-race.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jann Horn <jannh(a)google.com>
Subject: mm/khugepaged: fix ->anon_vma race
Date: Wed, 11 Jan 2023 14:33:51 +0100
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked. retract_page_tables() bails out if an ->anon_vma is
attached, but does this check before holding the mmap lock (as the comment
above the check explains).
If we racily merge an existing ->anon_vma (shared with a child process)
from a neighboring VMA, subsequent rmap traversals on pages belonging to
the child will be able to see the page tables that we are concurrently
removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
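For illustration only (not part of the patch): a minimal userspace C
sketch of the "check, take the lock, check again" pattern described above.
Every name here (vma_model, retract_model, race_hook, attach_anon_vma) is
hypothetical, and a single atomic flag stands in for the mmap lock, which
is a simplification of the kernel's read/write semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum scan_result { SCAN_SUCCEED, SCAN_PAGE_ANON, SCAN_LOCK_BUSY };

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct vma_model {
	atomic_bool locked;		/* models the mmap lock held for writing */
	_Atomic(void *) anon_vma;	/* may be set by a racing thread */
	bool page_tables_present;
};

void vma_model_init(struct vma_model *v)
{
	atomic_init(&v->locked, false);
	atomic_init(&v->anon_vma, NULL);
	v->page_tables_present = true;
}

/*
 * race_hook, if non-NULL, runs between the unlocked pre-check and the
 * trylock to simulate another thread attaching an anon_vma in that window.
 */
enum scan_result retract_model(struct vma_model *v,
			       void (*race_hook)(struct vma_model *))
{
	enum scan_result result = SCAN_SUCCEED;

	assert(v);

	/* Cheap, racy pre-check done without the lock. */
	if (atomic_load(&v->anon_vma))
		return SCAN_PAGE_ANON;

	if (race_hook)
		race_hook(v);

	/* Trylock: give up if someone else holds the lock. */
	if (atomic_exchange(&v->locked, true))
		return SCAN_LOCK_BUSY;

	/* Re-check under the lock before touching the page tables. */
	if (atomic_load(&v->anon_vma))
		result = SCAN_PAGE_ANON;
	else
		v->page_tables_present = false;

	atomic_store(&v->locked, false);
	return result;
}

static int dummy_anon_vma;

/* Simulated racing writer: attaches an anon_vma to the VMA. */
void attach_anon_vma(struct vma_model *v)
{
	atomic_store(&v->anon_vma, &dummy_anon_vma);
}
```

Calling retract_model(&v, attach_anon_vma) models the race: the pre-check
passes, the anon_vma appears, and only the second check keeps the page
tables from being removed.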
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- a/mm/khugepaged.c~mm-khugepaged-fix-anon_vma-race
+++ a/mm/khugepaged.c
@@ -1642,7 +1642,7 @@ static int retract_page_tables(struct ad
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma) {
+ if (READ_ONCE(vma->anon_vma)) {
result = SCAN_PAGE_ANON;
goto next;
}
@@ -1671,6 +1671,18 @@ static int retract_page_tables(struct ad
if ((cc->is_khugepaged || is_target) &&
mmap_write_trylock(mm)) {
/*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ result = SCAN_PAGE_ANON;
+ goto unlock_next;
+ }
+ /*
* When a vma is registered with uffd-wp, we can't
* recycle the pmd pgtable because there can be pte
* markers installed. Skip it only, so the rest mm/vma
_
Patches currently in -mm which might be from jannh(a)google.com are
mm-khugepaged-fix-anon_vma-race.patch
The patch titled
Subject: maple_tree: fix mas_empty_area_rev() lower bound validation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
maple_tree-fix-mas_empty_area_rev-lower-bound-validation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Liam Howlett <liam.howlett(a)oracle.com>
Subject: maple_tree: fix mas_empty_area_rev() lower bound validation
Date: Wed, 11 Jan 2023 20:02:07 +0000
mas_empty_area_rev() was not correctly validating the start of a gap
against the lower limit. This could lead to the range starting lower than
the requested minimum.
Fix the issue by better validating a gap once one is found.
This commit also adds tests to the maple tree test suite for this issue
and tests the mas_empty_area() function for similar bound checking.
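For illustration only (not the maple tree code): a small userspace C
sketch of a reverse gap search that validates each gap against both ends
of the requested window, which is the property the fix restores. It
assumes a sorted array of occupied ranges instead of a tree; used_range,
fit_gap and find_gap_rev are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical model: one occupied inclusive range, not a maple tree node. */
struct used_range {
	unsigned long first;
	unsigned long last;
};

/*
 * Clip the gap [gap_first, gap_last] to the window [min, max] and report
 * the highest size-long slot in it.  Returns 1 on success, 0 if the
 * clipped gap is too small.
 */
static int fit_gap(unsigned long gap_first, unsigned long gap_last,
		   unsigned long min, unsigned long max, unsigned long size,
		   unsigned long *index, unsigned long *last)
{
	unsigned long lo = gap_first > min ? gap_first : min;
	unsigned long hi = gap_last < max ? gap_last : max;

	if (lo > hi || hi - lo < size - 1)
		return 0;
	*last = hi;
	*index = hi - (size - 1);
	return 1;
}

/*
 * used[] must be sorted and non-overlapping.  Walk the gaps from the top
 * down and return 0 with [*index, *last] set, or -EBUSY if nothing fits.
 */
int find_gap_rev(const struct used_range *used, size_t n,
		 unsigned long min, unsigned long max, unsigned long size,
		 unsigned long *index, unsigned long *last)
{
	unsigned long gap_last = ULONG_MAX;
	size_t i = n;

	assert(index && last);
	if (size == 0 || min > max)
		return -EINVAL;

	for (;;) {
		/* The gap below gap_last starts after used[i - 1], or at 0. */
		if (i == 0) {
			if (fit_gap(0, gap_last, min, max, size, index, last))
				return 0;
			break;
		}
		if (used[i - 1].last < gap_last &&
		    fit_gap(used[i - 1].last + 1, gap_last, min, max, size,
			    index, last))
			return 0;
		if (used[i - 1].first == 0)
			break;
		gap_last = used[i - 1].first - 1;
		i--;
	}
	return -EBUSY;
}
```

With ranges 10-159 and 170-209 occupied, as in the test added by the
patch, a request for 4 entries in the window 167-200 is rejected in this
model because only 167-169 of the 160-169 gap lies inside the window.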
Link: https://lkml.kernel.org/r/20230111200136.1851322-1-Liam.Howlett@oracle.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216911
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: <amanieu(a)gmail.com>
Link: https://lore.kernel.org/linux-mm/0b9f5425-08d4-8013-aa4c-e620c3b10bb2@leemh…
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 17 +++----
lib/test_maple_tree.c | 89 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 97 insertions(+), 9 deletions(-)
--- a/lib/maple_tree.c~maple_tree-fix-mas_empty_area_rev-lower-bound-validation
+++ a/lib/maple_tree.c
@@ -4887,7 +4887,7 @@ static bool mas_rev_awalk(struct ma_stat
unsigned long *pivots, *gaps;
void __rcu **slots;
unsigned long gap = 0;
- unsigned long max, min, index;
+ unsigned long max, min;
unsigned char offset;
if (unlikely(mas_is_err(mas)))
@@ -4909,8 +4909,7 @@ static bool mas_rev_awalk(struct ma_stat
min = mas_safe_min(mas, pivots, --offset);
max = mas_safe_pivot(mas, pivots, offset, type);
- index = mas->index;
- while (index <= max) {
+ while (mas->index <= max) {
gap = 0;
if (gaps)
gap = gaps[offset];
@@ -4941,10 +4940,8 @@ static bool mas_rev_awalk(struct ma_stat
min = mas_safe_min(mas, pivots, offset);
}
- if (unlikely(index > max)) {
- mas_set_err(mas, -EBUSY);
- return false;
- }
+ if (unlikely((mas->index > max) || (size - 1 > max - mas->index)))
+ goto no_space;
if (unlikely(ma_is_leaf(type))) {
mas->offset = offset;
@@ -4961,9 +4958,11 @@ static bool mas_rev_awalk(struct ma_stat
return false;
ascend:
- if (mte_is_root(mas->node))
- mas_set_err(mas, -EBUSY);
+ if (!mte_is_root(mas->node))
+ return false;
+no_space:
+ mas_set_err(mas, -EBUSY);
return false;
}
--- a/lib/test_maple_tree.c~maple_tree-fix-mas_empty_area_rev-lower-bound-validation
+++ a/lib/test_maple_tree.c
@@ -2517,6 +2517,91 @@ static noinline void check_bnode_min_spa
mt_set_non_kernel(0);
}
+static noinline void check_empty_area_window(struct maple_tree *mt)
+{
+ unsigned long i, nr_entries = 20;
+ MA_STATE(mas, mt, 0, 0);
+
+ for (i = 1; i <= nr_entries; i++)
+ mtree_store_range(mt, i*10, i*10 + 9,
+ xa_mk_value(i), GFP_KERNEL);
+
+ /* Create another hole besides the one at 0 */
+ mtree_store_range(mt, 160, 169, NULL, GFP_KERNEL);
+
+ /* Check lower bounds that don't fit */
+ rcu_read_lock();
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 5, 90, 10) != -EBUSY);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 6, 90, 5) != -EBUSY);
+
+ /* Check lower bound that does fit */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 5, 90, 5) != 0);
+ MT_BUG_ON(mt, mas.index != 5);
+ MT_BUG_ON(mt, mas.last != 9);
+ rcu_read_unlock();
+
+ /* Check one gap that doesn't fit and one that does */
+ rcu_read_lock();
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 5, 217, 9) != 0);
+ MT_BUG_ON(mt, mas.index != 161);
+ MT_BUG_ON(mt, mas.last != 169);
+
+ /* Check one gap that does fit above the min */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 100, 218, 3) != 0);
+ MT_BUG_ON(mt, mas.index != 216);
+ MT_BUG_ON(mt, mas.last != 218);
+
+ /* Check size that doesn't fit any gap */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 100, 218, 16) != -EBUSY);
+
+ /*
+ * Check size that doesn't fit the lower end of the window but
+ * does fit the gap
+ */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 167, 200, 4) != -EBUSY);
+
+ /*
+ * Check size that doesn't fit the upper end of the window but
+ * does fit the gap
+ */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area_rev(&mas, 100, 162, 4) != -EBUSY);
+
+ /* Check mas_empty_area forward */
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, 100, 9) != 0);
+ MT_BUG_ON(mt, mas.index != 0);
+ MT_BUG_ON(mt, mas.last != 8);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, 100, 4) != 0);
+ MT_BUG_ON(mt, mas.index != 0);
+ MT_BUG_ON(mt, mas.last != 3);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, 100, 11) != -EBUSY);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 5, 100, 6) != -EBUSY);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 0, 8, 10) != -EBUSY);
+
+ mas_reset(&mas);
+ mas_empty_area(&mas, 100, 165, 3);
+
+ mas_reset(&mas);
+ MT_BUG_ON(mt, mas_empty_area(&mas, 100, 163, 6) != -EBUSY);
+ rcu_read_unlock();
+}
+
static DEFINE_MTREE(tree);
static int maple_tree_seed(void)
{
@@ -2765,6 +2850,10 @@ static int maple_tree_seed(void)
check_bnode_min_spanning(&tree);
mtree_destroy(&tree);
+ mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE);
+ check_empty_area_window(&tree);
+ mtree_destroy(&tree);
+
#if defined(BENCH)
skip:
#endif
_
Patches currently in -mm which might be from liam.howlett(a)oracle.com are
nommu-fix-memory-leak-in-do_mmap-error-path.patch
nommu-fix-do_munmap-error-path.patch
nommu-fix-split_vma-map_count-error.patch
maple_tree-fix-mas_empty_area_rev-lower-bound-validation.patch
maple_tree-remove-gfp_zero-from-kmem_cache_alloc-and-kmem_cache_alloc_bulk.patch
Happy new 2023,
I normally watch [1], which I regard as the official site, for the
announcement of the next LTS Linux kernel.
On the debian-kernel mailing list I read that Linux 6.1 will be the
official kernel for Debian 12 aka bookworm.
I saw a phoronix article about EOL of Linux-4.9 [3] which points to [2].
[2] says:
After being prompted on the kernel mailing list, Linux stable
maintainer Greg Kroah-Hartman commented:
> I usually pick the "last kernel of the year", and based on the normal release cycle, yes, 6.1 will be that kernel.
> But I can't promise anything until it is released, for obvious reasons.
This is not a clear statement to me, and it may have been made before
6.1 was released.
If you have published a clear statement, please point me to it.
If so, please also update [1] accordingly.
( 4.9 was recently dropped from the LTS list at [1] - I guess Konstantin or
someone from helpdesk did that - so [1] is actively maintained. )
Please, a clear statement.
Thanks.
Regards,
-Sedat-
P.S.: Just for the records: I am not subscribed to LKML or
linux-stable mailing-lists and may miss such a clear statement.
[1] https://kernel.org/category/releases.html
[2] https://www.phoronix.com/news/Linux-6.1-Likely-LTS
[3] https://www.phoronix.com/news/Linux-4.9.337-LTS-Over
[4] https://release.debian.org/ > Key release dates
The onboard_hub 'driver' consists of two drivers, a platform
driver and a USB driver. Currently when the onboard hub driver
is initialized it first registers the platform driver, then the
USB driver. This results in a race condition when the 'attach'
work is executed, which is scheduled when the platform device
is probed. The purpose of the 'attach' work is to bind eligible
USB hub devices to the onboard_hub USB driver. This fails if
the work runs before the USB driver has been registered.
Register the USB driver first, then the platform driver. This
increases the chances that the onboard_hub USB devices are probed
before their corresponding platform device, which the USB driver
tries to locate in _probe(). The driver already handles this
situation and defers probing if the onboard hub platform device
doesn't exist yet.
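For illustration only: a userspace C sketch of the registration order and
error unwinding described above. The hub_model structure and the model_*
functions are hypothetical and are not driver-core APIs. The 'attach' work
is run synchronously inside the platform registration, which models the
worst case of the race.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two registrations; not kernel APIs. */
struct hub_model {
	bool usb_registered;
	bool platform_registered;
	bool fail_platform;	/* inject a platform registration failure */
	bool attach_ran;
	bool attach_bound;	/* did 'attach' find the USB driver? */
};

int model_usb_register(struct hub_model *m)
{
	m->usb_registered = true;
	return 0;
}

void model_usb_deregister(struct hub_model *m)
{
	m->usb_registered = false;
}

/* Registering the platform driver probes the device, which runs 'attach'. */
int model_platform_register(struct hub_model *m)
{
	if (m->fail_platform)
		return -ENOMEM;
	m->platform_registered = true;
	m->attach_ran = true;
	m->attach_bound = m->usb_registered;
	return 0;
}

/* USB driver first, then the platform driver; unwind in reverse on error. */
int model_hub_init(struct hub_model *m)
{
	int ret;

	assert(m);
	ret = model_usb_register(m);
	if (ret)
		return ret;

	ret = model_platform_register(m);
	if (ret)
		model_usb_deregister(m);

	return ret;
}
```

In this model 'attach' always finds the USB driver registered, and a
failed platform registration leaves nothing registered behind.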
Cc: stable(a)vger.kernel.org
Fixes: 8bc063641ceb ("usb: misc: Add onboard_usb_hub driver")
Link: https://lore.kernel.org/lkml/Y6W00vQm3jfLflUJ@hovoldconsulting.com/T/#m0d64…
Reported-by: Alexander Stein <alexander.stein(a)ew.tq-group.com>
Signed-off-by: Matthias Kaehlcke <mka(a)chromium.org>
---
(no changes since v1)
drivers/usb/misc/onboard_usb_hub.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/misc/onboard_usb_hub.c b/drivers/usb/misc/onboard_usb_hub.c
index 94e7966e199d..db0844b30bbd 100644
--- a/drivers/usb/misc/onboard_usb_hub.c
+++ b/drivers/usb/misc/onboard_usb_hub.c
@@ -433,13 +433,13 @@ static int __init onboard_hub_init(void)
{
int ret;
- ret = platform_driver_register(&onboard_hub_driver);
+ ret = usb_register_device_driver(&onboard_hub_usbdev_driver, THIS_MODULE);
if (ret)
return ret;
- ret = usb_register_device_driver(&onboard_hub_usbdev_driver, THIS_MODULE);
+ ret = platform_driver_register(&onboard_hub_driver);
if (ret)
- platform_driver_unregister(&onboard_hub_driver);
+ usb_deregister_device_driver(&onboard_hub_usbdev_driver);
return ret;
}
--
2.39.0.314.g84b9a713c41-goog
The self-refresh helper framework overloads "disable" to sometimes mean
"go into self-refresh mode," and this mode activates automatically
(e.g., after some period of unchanging display output). In such cases,
the display pipe is still considered "on", and user-space is not aware
that we went into self-refresh mode. Thus, users may expect that
vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work
properly.
However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave
vblank enabled.
Add a new exception, such that we allow CRTCs to be "disabled" (with
self-refresh active) with vblank interrupts still enabled.
Cc: <stable(a)vger.kernel.org> # dependency for subsequent patch
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
---
drivers/gpu/drm/drm_atomic_helper.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index d579fd8f7cb8..7b5eddadebd5 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1207,6 +1207,12 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *old_state)
if (!drm_dev_has_vblank(dev))
continue;
+ /*
+ * Self-refresh is not a true "disable"; let vblank remain
+ * enabled.
+ */
+ if (new_crtc_state->self_refresh_active)
+ continue;
ret = drm_crtc_vblank_get(crtc);
WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n");
--
2.39.0.314.g84b9a713c41-goog
On Wed, Jan 11, 2023, at 07:16, Naresh Kamboju wrote:
> On Tue, 10 Jan 2023 at 23:36, Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> wrote:
>>
>
> Results from Linaro’s test farm.
> Regressions on arm64 Raspberry Pi 4 Model B.
>
> Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
>
> While running LTP controllers cgroup_fj_stress_blkio test cases
> the Insufficient stack space to handle exception! occurred and
> followed by kernel panic on arm64 Raspberry Pi 4 Model B with
> clang-15 built kernel Image.
>
> The full boot and test log attached to this email and build and
> Kconfig links provided in the bottom of this email.
>
> I will try to reproduce this reported issue and get back to you.
I looked at the log between 6.0.18 and 6.0.19-rc1, but don't see
any arm64 or memory management patches that could result in this.
Do you know if 6.0.18 ran successfully?
> [ 2893.044339] Insufficient stack space to handle exception!
> [ 2893.044351] ESR: 0x0000000096000047 -- DABT (current EL)
> [ 2893.044360] FAR: 0xffff8000128180d0
> [ 2893.044364] Task stack: [0xffff800012a18000..0xffff800012a1c000]
> [ 2893.044370] IRQ stack: [0xffff80000a798000..0xffff80000a79c000]
> [ 2893.044375] Overflow stack: [0xffff0000f77c4310..0xffff0000f77c5310]
...
> [ 2893.044413] pc : el1h_64_sync+0x0/0x68
> [ 2893.044430] lr : wp_page_copy+0xf8/0x90c
> [ 2893.044445] sp : ffff8000128180d0
...
> [ 2893.044692] el1h_64_sync+0x0/0x68
> [ 2893.044700] do_wp_page+0x4a0/0x5c8
> [ 2893.044708] handle_mm_fault+0x7fc/0x14dc
> [ 2893.044718] do_page_fault+0x29c/0x450
> [ 2893.044727] do_mem_abort+0x4c/0xf8
> [ 2893.044741] el0_da+0x48/0xa8
> [ 2893.044750] el0t_64_sync_handler+0xcc/0xf0
> [ 2893.044759] el0t_64_sync+0x18c/0x190
It claims that the stack overflow happened in do_wp_page(),
but that has a really short call chain. It would be good
to have the source line for do_wp_page+0x4a0/0x5c8 and
wp_page_copy+0xf8/0x90c to see where exactly it was.
> [ 2893.285975] WARNING: CPU: 2 PID: 315758 at kernel/sched/core.c:3119
> set_task_cpu+0x14c/0x208
....
> [ 2893.286117] CPU: 2 PID: 315758 Comm: cgroup_fj_stres Not tainted
> [ 2893.286416] arch_timer_handler_phys+0x44/0x54
> [ 2893.286427] handle_percpu_devid_irq+0x90/0x220
> [ 2893.286439] generic_handle_domain_irq+0x38/0x50
> [ 2893.286447] gic_handle_irq+0x68/0xe8
> [ 2893.286455] el1_interrupt+0x88/0xc8
> [ 2893.286464] el1h_64_irq_handler+0x18/0x24
> [ 2893.286474] el1h_64_irq+0x64/0x68
> [ 2893.286482] panic+0x2d8/0x374
This is apparently a second unrelated bug -- it still processes timer
interrupts after calling panic() and this apparently fails because
the system is already unusable.
> artifact-location:
> https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBe…
file not found. I tried to get the vmlinux file to look at the disassembly
but the artifacts appear to be gone already.
Arnd
While experimenting with applying noqueue to a classful queue discipline,
we discovered a NULL pointer dereference in the __dev_queue_xmit()
path that generates a kernel OOPS:
# dev=enp0s5
# tc qdisc replace dev $dev root handle 1: htb default 1
# tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
# tc qdisc add dev $dev parent 1:1 handle 10: noqueue
# ping -I $dev -w 1 -c 1 1.1.1.1
[ 2.172856] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2.173217] #PF: supervisor instruction fetch in kernel mode
...
[ 2.178451] Call Trace:
[ 2.178577] <TASK>
[ 2.178686] htb_enqueue+0x1c8/0x370
[ 2.178880] dev_qdisc_enqueue+0x15/0x90
[ 2.179093] __dev_queue_xmit+0x798/0xd00
[ 2.179305] ? _raw_write_lock_bh+0xe/0x30
[ 2.179522] ? __local_bh_enable_ip+0x32/0x70
[ 2.179759] ? ___neigh_create+0x610/0x840
[ 2.179968] ? eth_header+0x21/0xc0
[ 2.180144] ip_finish_output2+0x15e/0x4f0
[ 2.180348] ? dst_output+0x30/0x30
[ 2.180525] ip_push_pending_frames+0x9d/0xb0
[ 2.180739] raw_sendmsg+0x601/0xcb0
[ 2.180916] ? _raw_spin_trylock+0xe/0x50
[ 2.181112] ? _raw_spin_unlock_irqrestore+0x16/0x30
[ 2.181354] ? get_page_from_freelist+0xcd6/0xdf0
[ 2.181594] ? sock_sendmsg+0x56/0x60
[ 2.181781] sock_sendmsg+0x56/0x60
[ 2.181958] __sys_sendto+0xf7/0x160
[ 2.182139] ? handle_mm_fault+0x6e/0x1d0
[ 2.182366] ? do_user_addr_fault+0x1e1/0x660
[ 2.182627] __x64_sys_sendto+0x1b/0x30
[ 2.182881] do_syscall_64+0x38/0x90
[ 2.183085] entry_SYSCALL_64_after_hwframe+0x63/0xcd
...
[ 2.187402] </TASK>
Previously in commit d66d6c3152e8 ("net: sched: register noqueue
qdisc"), NULL was set for the noqueue discipline on noqueue init
so that __dev_queue_xmit() falls through for the noqueue case. This
also sets a bypass of the enqueue NULL check in the
register_qdisc() function for the struct noqueue_disc_ops.
Classful queue disciplines make it past the NULL check in
__dev_queue_xmit() because the discipline is set to htb (in this case),
and then in the call to __dev_xmit_skb(), it calls into htb_enqueue()
which grabs a leaf node for a class and then calls qdisc_enqueue() by
passing in a queue discipline which assumes ->enqueue() is not set to NULL.
Fix this by not allowing classes to be assigned to the noqueue
discipline. Linux TC Notes states that classes cannot be set to
the noqueue discipline. [1] Let's enforce that here.
Links:
1. https://linux-tc-notes.sourceforge.net/tc/doc/sch_noqueue.txt
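For illustration only: a userspace C sketch of why an unconditional call
through a NULL ->enqueue pointer is fatal for a classful parent, and of a
graft-time check in the spirit of the fix. All types and functions here
(qdisc_model, class_model, graft_model, class_enqueue_model) are
hypothetical simplifications, not the net/sched structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct qdisc_model;

/* Hypothetical ops table; only its address is used for identification. */
struct qdisc_ops_model {
	const char *id;
};

struct qdisc_model {
	const struct qdisc_ops_model *ops;
	int (*enqueue)(struct qdisc_model *q, int len);	/* NULL for noqueue */
	int backlog;
};

struct class_model {
	struct qdisc_model *leaf;
};

const struct qdisc_ops_model fifo_ops_model = { "fifo" };
const struct qdisc_ops_model noqueue_ops_model = { "noqueue" };

int fifo_enqueue(struct qdisc_model *q, int len)
{
	q->backlog += len;
	return 0;
}

/* Refuse to attach noqueue to a class, mirroring the check in the patch. */
int graft_model(struct class_model *cl, struct qdisc_model *new)
{
	assert(cl);
	if (new && new->ops == &noqueue_ops_model)
		return -EINVAL;
	cl->leaf = new;
	return 0;
}

/*
 * A classful parent calls its leaf's enqueue without a NULL check, so a
 * noqueue leaf must never get this far.
 */
int class_enqueue_model(struct class_model *cl, int len)
{
	if (!cl->leaf)
		return -ENOENT;	/* no leaf attached in this model */
	return cl->leaf->enqueue(cl->leaf, len);
}
```

Without the check in graft_model(), a noqueue leaf would make
class_enqueue_model() jump through a NULL function pointer, which is the
model's counterpart of the instruction fetch fault in the trace above.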
Fixes: d66d6c3152e8 ("net: sched: register noqueue qdisc")
Cc: stable(a)vger.kernel.org
Signed-off-by: Frederick Lawler <fred(a)cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub(a)cloudflare.com>
---
net/sched/sch_api.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2317db02c764..72d2c204d5f3 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1133,6 +1133,11 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
return -ENOENT;
}
+ if (new && new->ops == &noqueue_qdisc_ops) {
+ NL_SET_ERR_MSG(extack, "Cannot assign noqueue to a class");
+ return -EINVAL;
+ }
+
err = cops->graft(parent, cl, new, &old, extack);
if (err)
return err;
--
2.34.1
The patch titled
Subject: aio: fix mremap after fork null-deref
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
aio-fix-mremap-after-fork-null-deref.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Seth Jenkins <sethjenkins(a)google.com>
Subject: aio: fix mremap after fork null-deref
Date: Fri, 4 Nov 2022 17:25:19 -0400
Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced
a null-deref if mremap is called on an old aio mapping after fork as
mm->ioctx_table will be set to NULL.
Link: https://lkml.kernel.org/r/20221104212519.538108-1-sethjenkins@google.com
Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring")
Signed-off-by: Seth Jenkins <sethjenkins(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl(a)kvack.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Pavel Emelyanov <xemul(a)parallels.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/aio.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
--- a/fs/aio.c~aio-fix-mremap-after-fork-null-deref
+++ a/fs/aio.c
@@ -361,16 +361,18 @@ static int aio_ring_mremap(struct vm_are
spin_lock(&mm->ioctx_lock);
rcu_read_lock();
table = rcu_dereference(mm->ioctx_table);
- for (i = 0; i < table->nr; i++) {
- struct kioctx *ctx;
+ if (table) {
+ for (i = 0; i < table->nr; i++) {
+ struct kioctx *ctx;
- ctx = rcu_dereference(table->table[i]);
- if (ctx && ctx->aio_ring_file == file) {
- if (!atomic_read(&ctx->dead)) {
- ctx->user_id = ctx->mmap_base = vma->vm_start;
- res = 0;
+ ctx = rcu_dereference(table->table[i]);
+ if (ctx && ctx->aio_ring_file == file) {
+ if (!atomic_read(&ctx->dead)) {
+ ctx->user_id = ctx->mmap_base = vma->vm_start;
+ res = 0;
+ }
+ break;
}
- break;
}
}
_
Patches currently in -mm which might be from sethjenkins(a)google.com are
aio-fix-mremap-after-fork-null-deref.patch
When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.
resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.
Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.
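For illustration only: a userspace C sketch of why the previous counter
value has to be snapshotted when an RMID is reused. The hardware counter
is modeled as a plain integer passed in by the caller; mbm_state_model,
reset_rmid_zero, reset_rmid_snapshot and read_rmid are hypothetical names,
and the width-limited subtraction is an assumption about how a wrapping
counter of 'width' bits can be differenced.

```c
#include <assert.h>
#include <stdint.h>

/* Difference of two readings of a counter that is 'width' bits wide. */
uint64_t overflow_count(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift;

	assert(width >= 1 && width <= 64);
	shift = 64 - width;
	return ((cur << shift) - (prev << shift)) >> shift;
}

/* Hypothetical per-RMID software state. */
struct mbm_state_model {
	uint64_t prev_msr;	/* last raw counter value seen */
	uint64_t chunks;	/* accumulated count reported to the group */
};

/* Old behaviour: forget everything, including what the counter holds. */
void reset_rmid_zero(struct mbm_state_model *s)
{
	s->prev_msr = 0;
	s->chunks = 0;
}

/* Fixed behaviour: record the counter's current, possibly stale, value. */
void reset_rmid_snapshot(struct mbm_state_model *s, uint64_t hw_counter)
{
	s->chunks = 0;
	s->prev_msr = hw_counter;
}

/* Accumulate the delta since the previous reading and return the total. */
uint64_t read_rmid(struct mbm_state_model *s, uint64_t hw_counter,
		   unsigned int width)
{
	s->chunks += overflow_count(s->prev_msr, hw_counter, width);
	s->prev_msr = hw_counter;
	return s->chunks;
}
```

If the counter already holds 1000 when the new group is created and then
advances to 1050, the zero-initialized state reports 1050 while the
snapshotted state reports 50.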
Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman <peternewman(a)google.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Cc: stable(a)vger.kernel.org
---
v3:
- add changelog
- CC stable
v2:
- move error bit processing into __rmid_read()
v1: https://lore.kernel.org/lkml/20221207112924.3602960-1-peternewman@google.co…
v2: https://lore.kernel.org/lkml/20221214160856.2164207-1-peternewman@google.co…
---
arch/x86/kernel/cpu/resctrl/monitor.c | 49 ++++++++++++++++++---------
1 file changed, 33 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index efe0c30d3a12..77538abeb72a 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -146,6 +146,30 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
return entry;
}
+static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
+{
+ u64 msr_val;
+
+ /*
+ * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
+ * with a valid event code for supported resource type and the bits
+ * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
+ * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
+ * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
+ * are error bits.
+ */
+ wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
+ rdmsrl(MSR_IA32_QM_CTR, msr_val);
+
+ if (msr_val & RMID_VAL_ERROR)
+ return -EIO;
+ if (msr_val & RMID_VAL_UNAVAIL)
+ return -EINVAL;
+
+ *val = msr_val;
+ return 0;
+}
+
static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
u32 rmid,
enum resctrl_event_id eventid)
@@ -172,8 +196,12 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
struct arch_mbm_state *am;
am = get_arch_mbm_state(hw_dom, rmid, eventid);
- if (am)
+ if (am) {
memset(am, 0, sizeof(*am));
+
+ /* Record any initial, non-zero count value. */
+ __rmid_read(rmid, eventid, &am->prev_msr);
+ }
}
static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
@@ -191,25 +219,14 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct arch_mbm_state *am;
u64 msr_val, chunks;
+ int ret;
if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
return -EINVAL;
- /*
- * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
- * with a valid event code for supported resource type and the bits
- * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
- * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
- * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
- * are error bits.
- */
- wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
- rdmsrl(MSR_IA32_QM_CTR, msr_val);
-
- if (msr_val & RMID_VAL_ERROR)
- return -EIO;
- if (msr_val & RMID_VAL_UNAVAIL)
- return -EINVAL;
+ ret = __rmid_read(rmid, eventid, &msr_val);
+ if (ret)
+ return ret;
am = get_arch_mbm_state(hw_dom, rmid, eventid);
if (am) {
base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
--
2.39.0.314.g84b9a713c41-goog
The preallocation of the maple tree nodes may leak if the error path to
"error_just_free" is taken. Fix this by moving the freeing of the maple
tree nodes to a shared location for all error paths.
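For illustration only: a userspace C sketch of the "free in the shared
error label" idea described above. The allocation counter and the names
model_alloc, model_free and do_mmap_model are hypothetical; malloc()
stands in for the preallocated tree nodes and the region.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

int live_allocs;	/* allocations not yet freed, for the demonstration */

void *model_alloc(void)
{
	void *p = malloc(16);

	if (p)
		live_allocs++;
	return p;
}

void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

enum fail_point { FAIL_NONE, FAIL_JUST_FREE, FAIL_SHARING };

int do_mmap_model(enum fail_point fail)
{
	void *nodes = model_alloc();	/* stands in for preallocated nodes */
	void *region = model_alloc();
	int ret = -ENOMEM;

	if (!nodes || !region)
		goto error;
	if (fail == FAIL_JUST_FREE)
		goto error_just_free;
	if (fail == FAIL_SHARING) {
		ret = -EINVAL;
		goto error;
	}

	/* Success: the model releases what the real code would keep or use. */
	model_free(nodes);
	model_free(region);
	return 0;

error_just_free:
	/* path-specific unlocking would happen here */
error:
	model_free(nodes);	/* shared, so no error path can skip it */
	model_free(region);
	assert(ret < 0);
	return ret;
}
```

Because the release sits under the label that every failure reaches,
live_allocs returns to zero whichever error path is taken.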
Cc: stable(a)vger.kernel.org
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
Changes since v1:
- Added 'Cc: stable(a)vger.kernel.org' to commit message
mm/nommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/nommu.c b/mm/nommu.c
index 214c70e1d059..c8252f01d5db 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1240,6 +1240,7 @@ unsigned long do_mmap(struct file *file,
error_just_free:
up_write(&nommu_region_sem);
error:
+ mas_destroy(&mas);
if (region->vm_file)
fput(region->vm_file);
kmem_cache_free(vm_region_jar, region);
@@ -1250,7 +1251,6 @@ unsigned long do_mmap(struct file *file,
sharing_violation:
up_write(&nommu_region_sem);
- mas_destroy(&mas);
pr_warn("Attempt to share mismatched mappings\n");
ret = -EINVAL;
goto error;
--
2.35.1
When removing a VMA from the tree fails due to no memory, do not free
the VMA since a reference still exists.
Cc: stable(a)vger.kernel.org
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
Changes since v1:
- Added 'Cc: stable(a)vger.kernel.org' to commit message
mm/nommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/nommu.c b/mm/nommu.c
index c8252f01d5db..844af5be7640 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1509,7 +1509,8 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len, struct list
erase_whole_vma:
if (delete_vma_from_mm(vma))
ret = -ENOMEM;
- delete_vma(mm, vma);
+ else
+ delete_vma(mm, vma);
return ret;
}
--
2.35.1
During the maple tree conversion of nommu, an error in counting the VMAs
was introduced by counting the existing VMA again. The counting used to
be decremented by one and incremented by two, but now it only increments
by two. Fix the counting error by moving the increment outside the
setup_vma_to_mm() function to the callers.
Cc: stable(a)vger.kernel.org
Fixes: 8220543df148 ("nommu: remove uses of VMA linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
Changes since v1:
- Added 'Cc: stable(a)vger.kernel.org' to commit message
mm/nommu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/nommu.c b/mm/nommu.c
index 844af5be7640..5b83938ecb67 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -559,7 +559,6 @@ void vma_mas_remove(struct vm_area_struct *vma, struct ma_state *mas)
static void setup_vma_to_mm(struct vm_area_struct *vma, struct mm_struct *mm)
{
- mm->map_count++;
vma->vm_mm = mm;
/* add the VMA to the mapping */
@@ -587,6 +586,7 @@ static void mas_add_vma_to_mm(struct ma_state *mas, struct mm_struct *mm,
BUG_ON(!vma->vm_region);
setup_vma_to_mm(vma, mm);
+ mm->map_count++;
/* add the VMA to the tree */
vma_mas_store(vma, mas);
@@ -1347,6 +1347,7 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
if (vma->vm_file)
return -ENOMEM;
+ mm = vma->vm_mm;
if (mm->map_count >= sysctl_max_map_count)
return -ENOMEM;
@@ -1398,6 +1399,7 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
mas_set_range(&mas, vma->vm_start, vma->vm_end - 1);
mas_store(&mas, vma);
vma_mas_store(new, &mas);
+ mm->map_count++;
return 0;
err_mas_preallocate:
--
2.35.1
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
b389286d0234 ("drm/mgag200: Fix PLL setup for G200_SE_A rev >=4")
877507bb954e ("drm/mgag200: Provide per-device callbacks for PIXPLLC")
8aeeb3144fe2 ("drm/mgag200: Provide per-device callbacks for BMC synchronization")
f639f74a7895 ("drm/mgag200: Add per-device callbacks")
1baf9127c482 ("drm/mgag200: Replace simple-KMS with regular atomic helpers")
4f4dc37e374c ("drm/mgag200: Reorganize before dropping simple-KMS helpers")
ed2ef21f1089 ("drm/mgag200: Store primary plane's color format in CRTC state")
2d70b9a1482e ("drm/mgag200: Acquire I/O-register lock in atomic_commit_tail function")
1ee181fe958a ("drm/mgag200: Move DAC-register setup into model-specific code")
44373151ab42 ("drm/mgag200: Split mgag200_modeset_init()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b389286d0234e1edbaf62ed8bc0892a568c33662 Mon Sep 17 00:00:00 2001
From: Jocelyn Falempe <jfalempe(a)redhat.com>
Date: Thu, 13 Oct 2022 15:28:10 +0200
Subject: [PATCH] drm/mgag200: Fix PLL setup for G200_SE_A rev >=4
For G200_SE_A, the PLL M setting is wrong, which leads to a blank screen
or "signal out of range" on a VGA display.
The previous code had "m |= 0x80", which was changed to
m |= ((pixpllcn & BIT(8)) >> 1);
Tested on G200_SE_A rev 42.
This line of code was moved to another file with
commit 877507bb954e ("drm/mgag200: Provide per-device callbacks for
PIXPLLC") but can be easily backported before this commit.
v2: * Put BIT(7) first to respect MSB-to-LSB order (Thomas)
* Add a comment to explain that this bit must be set (Thomas)
Fixes: 2dd040946ecf ("drm/mgag200: Store values (not bits) in struct mgag200_pll_values")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jocelyn Falempe <jfalempe(a)redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20221013132810.521945-1-jfale…
diff --git a/drivers/gpu/drm/mgag200/mgag200_g200se.c b/drivers/gpu/drm/mgag200/mgag200_g200se.c
index be389ed91cbd..bd6e573c9a1a 100644
--- a/drivers/gpu/drm/mgag200/mgag200_g200se.c
+++ b/drivers/gpu/drm/mgag200/mgag200_g200se.c
@@ -284,7 +284,8 @@ static void mgag200_g200se_04_pixpllc_atomic_update(struct drm_crtc *crtc,
pixpllcp = pixpllc->p - 1;
pixpllcs = pixpllc->s;
- xpixpllcm = pixpllcm | ((pixpllcn & BIT(8)) >> 1);
+ // For G200SE A, BIT(7) should be set unconditionally.
+ xpixpllcm = BIT(7) | pixpllcm;
xpixpllcn = pixpllcn;
xpixpllcp = (pixpllcs << 3) | pixpllcp;
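To make the difference between the two expressions in the mgag200 diff concrete, here is a small user-space sketch. The function names and sample register values are illustrative assumptions; only the two expressions themselves come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Pre-fix form: bit 7 of M is taken from bit 8 of N, so it is often clear. */
static uint8_t xpixpllcm_old(uint32_t pixpllcm, uint32_t pixpllcn)
{
	return (uint8_t)(pixpllcm | ((pixpllcn & BIT(8)) >> 1));
}

/* Fixed form: bit 7 is set regardless of N. */
static uint8_t xpixpllcm_fixed(uint32_t pixpllcm)
{
	return (uint8_t)(BIT(7) | pixpllcm);
}

/* Sample values (made up for illustration) showing where the forms differ. */
static void xpixpllcm_demo(void)
{
	/* N has bit 8 clear: the old form leaves bit 7 unset. */
	assert(xpixpllcm_old(0x05, 0x040) == 0x05);
	assert(xpixpllcm_fixed(0x05) == 0x85);
	/* N has bit 8 set: both forms agree. */
	assert(xpixpllcm_old(0x05, 0x140) == 0x85);
}
```

The two forms only coincide when N happens to have bit 8 set, which is consistent with the blank-screen symptom appearing for some modes and not others.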
[Public]
Hi,
The following commit fixes some s2idle wakeup problems with certain firmware on AMD Rembrandt systems with Qualcomm WLAN cards.
3f9b09ccf7d5 ("wifi: ath11k: Send PME message during wakeup from D3cold")
Can you please bring this to 6.1.y?
Thanks,
KVM_SEV_SEND_UPDATE_DATA and KVM_SEV_RECEIVE_UPDATE_DATA have an integer
overflow issue. params.guest_len and offset are both 32 bits wide, so with a
large params.guest_len the check to confirm a page boundary is not
crossed can falsely pass:
/* Check if we are crossing the page boundary */
offset = params.guest_uaddr & (PAGE_SIZE - 1);
if ((params.guest_len + offset > PAGE_SIZE))
Add an additional check to this conditional to confirm that
params.guest_len itself is not greater than PAGE_SIZE.
The current code can only overflow with a params.guest_len greater
than 0xfffff000, and the FW spec says these commands fail with lengths
greater than 16KB, so this issue should not be a security concern.
Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command")
Fixes: d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
Reported-by: Andy Nguyen <theflow(a)google.com>
Suggested-by: Thomas Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Peter Gonda <pgonda(a)google.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
---
arch/x86/kvm/svm/sev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 273cba809328..9451de72f917 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1294,7 +1294,7 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
/* Check if we are crossing the page boundary */
offset = params.guest_uaddr & (PAGE_SIZE - 1);
- if ((params.guest_len + offset > PAGE_SIZE))
+ if (params.guest_len > PAGE_SIZE || (params.guest_len + offset > PAGE_SIZE))
return -EINVAL;
/* Pin guest memory */
@@ -1474,7 +1474,7 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
/* Check if we are crossing the page boundary */
offset = params.guest_uaddr & (PAGE_SIZE - 1);
- if ((params.guest_len + offset > PAGE_SIZE))
+ if (params.guest_len > PAGE_SIZE || (params.guest_len + offset > PAGE_SIZE))
return -EINVAL;
hdr = psp_copy_user_blob(params.hdr_uaddr, params.hdr_len);
--
2.39.0.314.g84b9a713c41-goog
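The wraparound described in the KVM SEV patch above can be reproduced in user space. The sketch below assumes a 4 KiB PAGE_SIZE and uses hypothetical helper names; the two conditions are the pre-patch and post-patch checks from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Pre-patch check: the 32-bit sum can wrap around and look small. */
static bool crosses_page_old(uint64_t guest_uaddr, uint32_t guest_len)
{
	uint32_t offset = (uint32_t)(guest_uaddr & (PAGE_SIZE - 1));

	return guest_len + offset > PAGE_SIZE;
}

/* Post-patch check: reject an oversized length before adding. */
static bool crosses_page_fixed(uint64_t guest_uaddr, uint32_t guest_len)
{
	uint32_t offset = (uint32_t)(guest_uaddr & (PAGE_SIZE - 1));

	return guest_len > PAGE_SIZE || guest_len + offset > PAGE_SIZE;
}

/* 0xfffff001 + 0xfff wraps to 0 in 32 bits, so the old check is fooled. */
static void crosses_page_demo(void)
{
	assert(!crosses_page_old(0x1fff, 0xfffff001u));
	assert(crosses_page_fixed(0x1fff, 0xfffff001u));
}
```

For ordinary lengths the two checks agree; they only diverge once guest_len exceeds 0xfffff000, matching the bound given in the commit message.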
From: "Tyler Hicks" <code(a)tyhicks.com>
When attempting to build kselftests with a separate output directory, a
number of the tests fail to build.
For example,
$ rm -rf build && \
make INSTALL_HDR_PATH=build/usr headers_install > /dev/null && \
make O=build FORCE_TARGETS=1 TARGETS=breakpoints -C tools/testing/selftests > /dev/null
/usr/bin/ld: cannot open output file
build/kselftest/breakpoints/step_after_suspend_test: No such file or directory
collect2: error: ld returned 1 exit status
make[1]: *** [../lib.mk:146: build/kselftest/breakpoints/step_after_suspend_test] Error 1
make: *** [Makefile:163: all] Error 2
This has already been addressed upstream with v5.18 commit 5ad51ab618de
("selftests: set the BUILD variable to absolute path"). It is a clean
cherry pick to the linux-5.15.y and linux-5.10.y branches.
Tyler
Muhammad Usama Anjum (1):
selftests: set the BUILD variable to absolute path
tools/testing/selftests/Makefile | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
--
2.34.1
Please apply commit 105c78e12468 ("ext4: don't allow journal inode to have
encrypt flag") to the 5.15, 5.10, 5.4, and 4.19 LTS kernels, where it applies
cleanly.
It didn't get applied automatically because for the Fixes tag, I used a commit
in 5.18. However, that was the commit that exposed the problem, not the root
cause. IMO it makes sense to apply this to earlier kernels too, especially
because some people have backported the 5.18 commit.
- Eric
Greg -
Here are backports of two MPTCP patches that recently failed to apply to
the 5.15 stable tree. Two prerequisite patches are already queued in
5.15.87-rc1:
mptcp: mark ops structures as ro_after_init
mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
These patches prevent IPv6 memory leaks with MPTCP.
Thanks!
Matthieu Baerts (2):
mptcp: dedicated request sock for subflow in v6
mptcp: use proper req destructor for IPv6
net/mptcp/subflow.c | 53 +++++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 11 deletions(-)
--
2.39.0
A number of AMD based Rembrandt laptops are not working properly in
suspend/resume. This has been root caused to be from the BIOS
implementation not populating code for the AMD GUID in uPEP, but
instead only the Microsoft one.
In later kernels this has been fixed by using the Microsoft GUID
instead.
The following series of patches has fixed it in newer kernels:
commit ed470febf837 ("ACPI: PM: s2idle: Add support for upcoming AMD uPEP
HID AMDI008")
commit 1a2dcab517cb ("ACPI: PM: s2idle: Use LPS0 idle if
ACPI_FADT_LOW_POWER_S0 is unset")
commit 100a57379380 ("ACPI: x86: s2idle: Move _HID handling for AMD
systems into structures")
commit fd894f05cf30 ("ACPI: x86: s2idle: If a new AMD _HID is missing
assume Rembrandt")
commit a0bc002393d4 ("ACPI: x86: s2idle: Add module parameter to prefer
Microsoft GUID")
commit d0f61e89f08d ("ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming
A17 FA707RE")
commit ddeea2c3cb88 ("ACPI: x86: s2idle: Add a quirk for ASUS ROG
Zephyrus G14")
commit 888ca9c7955e ("ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7
Pro 14ARH7")
commit 631b54519e8e ("ACPI: x86: s2idle: Add a quirk for ASUSTeK
COMPUTER INC. ROG Flow X13")
commit 39f81776c680 ("ACPI: x86: s2idle: Fix a NULL pointer dereference")
commit 54bd1e548701 ("ACPI: x86: s2idle: Add another ID to
s2idle_dmi_table")
commit 577821f756cf ("ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP
Elitebook 865")
commit e6d180a35bc0 ("ACPI: x86: s2idle: Stop using AMD specific codepath
for Rembrandt+")
This is needlessly complex for 5.15.y though. To accomplish the same
effective result revert commit f0c6225531e4 ("ACPI: PM: Add support for
upcoming AMD uPEP HID AMDI007") instead.
Link: https://lore.kernel.org/stable/MN0PR12MB61015DB3D6EDBFD841157918E2F59@MN0PR…
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/acpi/x86/s2idle.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c
index e0185e841b2a..2af1ae172102 100644
--- a/drivers/acpi/x86/s2idle.c
+++ b/drivers/acpi/x86/s2idle.c
@@ -378,16 +378,13 @@ static int lps0_device_attach(struct acpi_device *adev,
* AMDI0006:
* - should use rev_id 0x0
* - function mask = 0x3: Should use Microsoft method
- * AMDI0007:
- * - Should use rev_id 0x2
- * - Should only use AMD method
*/
const char *hid = acpi_device_hid(adev);
- rev_id = strcmp(hid, "AMDI0007") ? 0 : 2;
+ rev_id = 0;
lps0_dsm_func_mask = validate_dsm(adev->handle,
ACPI_LPS0_DSM_UUID_AMD, rev_id, &lps0_dsm_guid);
lps0_dsm_func_mask_microsoft = validate_dsm(adev->handle,
- ACPI_LPS0_DSM_UUID_MICROSOFT, 0,
+ ACPI_LPS0_DSM_UUID_MICROSOFT, rev_id,
&lps0_dsm_guid_microsoft);
if (lps0_dsm_func_mask > 0x3 && (!strcmp(hid, "AMD0004") ||
!strcmp(hid, "AMD0005") ||
@@ -395,9 +392,6 @@ static int lps0_device_attach(struct acpi_device *adev,
lps0_dsm_func_mask = (lps0_dsm_func_mask << 1) | 0x1;
acpi_handle_debug(adev->handle, "_DSM UUID %s: Adjusted function mask: 0x%x\n",
ACPI_LPS0_DSM_UUID_AMD, lps0_dsm_func_mask);
- } else if (lps0_dsm_func_mask_microsoft > 0 && !strcmp(hid, "AMDI0007")) {
- lps0_dsm_func_mask_microsoft = -EINVAL;
- acpi_handle_debug(adev->handle, "_DSM Using AMD method\n");
}
} else {
rev_id = 1;
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
cdfb2fef522d ("ksmbd: send proper error response in smb2_tree_connect()")
cb4517201b8a ("ksmbd: remove smb2_buf_length in smb2_hdr")
341b16014bf8 ("ksmdb: use cmd helper variable in smb2_get_ksmbd_tcon()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cdfb2fef522d0c3f9cf293db51de88e9b3d46846 Mon Sep 17 00:00:00 2001
From: Marios Makassikis <mmakassikis(a)freebox.fr>
Date: Fri, 23 Dec 2022 11:59:31 +0100
Subject: [PATCH] ksmbd: send proper error response in smb2_tree_connect()
Currently, smb2_tree_connect doesn't send an error response packet on
error.
This causes libsmb2 to skip the specific error code and fail with the
following:
smb2_service failed with : Failed to parse fixed part of command
payload. Unexpected size of Error reply. Expected 9, got 8
Signed-off-by: Marios Makassikis <mmakassikis(a)freebox.fr>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
index 14d7f3599c63..38fbda52e06f 100644
--- a/fs/ksmbd/smb2pdu.c
+++ b/fs/ksmbd/smb2pdu.c
@@ -1928,13 +1928,13 @@ int smb2_tree_connect(struct ksmbd_work *work)
if (conn->posix_ext_supported)
status.tree_conn->posix_extensions = true;
-out_err1:
rsp->StructureSize = cpu_to_le16(16);
+ inc_rfc1001_len(work->response_buf, 16);
+out_err1:
rsp->Capabilities = 0;
rsp->Reserved = 0;
/* default manual caching */
rsp->ShareFlags = SMB2_SHAREFLAG_MANUAL_CACHING;
- inc_rfc1001_len(work->response_buf, 16);
if (!IS_ERR(treename))
kfree(treename);
@@ -1967,6 +1967,9 @@ int smb2_tree_connect(struct ksmbd_work *work)
rsp->hdr.Status = STATUS_ACCESS_DENIED;
}
+ if (status.ret != KSMBD_TREE_CONN_STATUS_OK)
+ smb2_set_err_rsp(work);
+
return rc;
}
Hi, Sasha Levin
Please rebase the queue-6.1 patch (btrfs: fix an error handling path in btrfs_defrag_leaves),
just as was done for queue-6.0, and then drop its 8 dependency patches.
Two of the 8 dependency patches are file renames; won't they make later dependent patches
difficult to apply?
#btrfs-move-btrfs_get_block_group-helper-out-of-disk-.patch
#btrfs-move-flush-related-definitions-to-space-info.h.patch
#btrfs-move-btrfs_print_data_csum_error-into-inode.c.patch
#btrfs-move-fs-wide-helpers-out-of-ctree.h.patch
#btrfs-move-assert-helpers-out-of-ctree.h.patch
#btrfs-move-the-printk-helpers-out-of-ctree.h.patch
#**btrfs-rename-struct-funcs.c-to-accessors.c.patch
#**btrfs-rename-tree-defrag.c-to-defrag.c.patch
The patch (btrfs: fix an error handling path in btrfs_defrag_leaves) is small,
so a rebase would be a good choice.
Best Regards
Wang Yugui (wangyugui(a)e16-tech.com)
2023/01/10
From: Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
clk_cifout is derived from clk_cifout_src through an integer divider
limited to 32. clk_cifout_src is a child of either cpll, gpll or npll
without any possibility of a divider of any sort. The default clock
parent is cpll.
Let's allow clk_cifout to ask its parent clk_cifout_src to reparent in
order to find the real closest possible rate for clk_cifout and not one
derived from cpll only.
Cc: stable(a)vger.kernel.org # 4.10+
Fixes: fd8bc829336a ("clk: rockchip: fix the rk3399 cifout clock")
Signed-off-by: Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
---
clk: rockchip: rk3399: allow clk_cifout to force clk_cifout_src to reparent
This used to be correct before v4.10 but commit fd8bc829336a ("clk: rockchip:
fix the rk3399 cifout clock") incorrectly removed this ability while reworking
it.
Note: this has been tested on top of v6.0.2 only but no changes were made to
this driver since. As for older stable releases, the git context seems identical
and there does not seem to have been any logical change introduced since v4.10
so it should be pretty safe to apply.
To: Michael Turquette <mturquette(a)baylibre.com>
To: Stephen Boyd <sboyd(a)kernel.org>
To: Heiko Stuebner <heiko(a)sntech.de>
To: Xing Zheng <zhengxing(a)rock-chips.com>
Cc: linux-clk(a)vger.kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-rockchip(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
---
drivers/clk/rockchip/clk-rk3399.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/rockchip/clk-rk3399.c b/drivers/clk/rockchip/clk-rk3399.c
index 306910a3a0d38..9ebd6c451b3db 100644
--- a/drivers/clk/rockchip/clk-rk3399.c
+++ b/drivers/clk/rockchip/clk-rk3399.c
@@ -1263,7 +1263,7 @@ static struct rockchip_clk_branch rk3399_clk_branches[] __initdata = {
RK3399_CLKSEL_CON(56), 6, 2, MFLAGS,
RK3399_CLKGATE_CON(10), 7, GFLAGS),
- COMPOSITE_NOGATE(SCLK_CIF_OUT, "clk_cifout", mux_clk_cif_p, 0,
+ COMPOSITE_NOGATE(SCLK_CIF_OUT, "clk_cifout", mux_clk_cif_p, CLK_SET_RATE_PARENT,
RK3399_CLKSEL_CON(56), 5, 1, MFLAGS, 0, 5, DFLAGS),
/* gic */
---
base-commit: cc675d22e422442f6d230654a55a5fc5682ea018
change-id: 20221117-rk3399-cifout-set-rate-parent-1fbf0173ef2d
Best regards,
--
Quentin Schulz <quentin.schulz(a)theobroma-systems.com>
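As a rough illustration of why letting clk_cifout_src reparent can yield a closer rate, here is a user-space sketch that searches integer dividers up to 32 over several candidate parent rates. The parent rates used in the demo are made-up values, not the real cpll/gpll/npll rates, and the closest-rate policy is an assumption; the clk framework's actual rounding rules may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIV 32u	/* divider limit stated in the commit message */

static unsigned long abs_diff(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/* Closest rate reachable from one parent with an integer divider 1..MAX_DIV. */
static unsigned long best_rate_one_parent(unsigned long parent,
					  unsigned long target)
{
	unsigned long best = parent;

	for (unsigned int div = 1; div <= MAX_DIV; div++) {
		unsigned long rate = parent / div;

		if (abs_diff(rate, target) < abs_diff(best, target))
			best = rate;
	}
	return best;
}

/* Closest rate when the source mux may pick any of the candidate parents. */
static unsigned long best_rate_any_parent(const unsigned long *parents,
					  size_t n, unsigned long target)
{
	unsigned long best;

	assert(n > 0);
	best = best_rate_one_parent(parents[0], target);
	for (size_t i = 1; i < n; i++) {
		unsigned long rate = best_rate_one_parent(parents[i], target);

		if (abs_diff(rate, target) < abs_diff(best, target))
			best = rate;
	}
	return best;
}

/* Hypothetical rates: 27 MHz is exact from 594 MHz but not from 800 MHz. */
static void best_rate_demo(void)
{
	const unsigned long parents[] = { 800000000UL, 594000000UL };

	assert(best_rate_one_parent(parents[0], 27000000UL) == 26666666UL);
	assert(best_rate_any_parent(parents, 2, 27000000UL) == 27000000UL);
}
```

Being stuck on the first parent corresponds to the behaviour before this patch; searching all parents corresponds to what CLK_SET_RATE_PARENT is meant to permit.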
Backports the following three patches to fix the issue of IMA mishandling
LSM based rule during LSM policy update, causing a file to match an
unexpected rule.
v6:
Removed the redundant i in ima_free_rule().
v5:
Go back to ima_lsm_free_rule() instead, to avoid freeing
rule->fsname.
v4:
Make use of the existing ima_free_rule() instead of the backported
ima_lsm_free_rule(), which resolves additional memory leak issues.
v3:
Backport "LSM: switch to blocking policy update notifiers" as well, as
the prerequisite of "ima: use the lsm policy update notifier".
v2:
Re-adjust the backported logic.
GUO Zihua (1):
ima: Handle -ESTALE returned by ima_filter_rule_match()
Janne Karhunen (2):
LSM: switch to blocking policy update notifiers
ima: use the lsm policy update notifier
drivers/infiniband/core/device.c | 4 +-
include/linux/security.h | 12 +--
security/integrity/ima/ima.h | 2 +
security/integrity/ima/ima_main.c | 8 ++
security/integrity/ima/ima_policy.c | 151 ++++++++++++++++++++++------
security/security.c | 23 +++--
security/selinux/hooks.c | 2 +-
security/selinux/selinuxfs.c | 2 +-
8 files changed, 155 insertions(+), 49 deletions(-)
--
2.17.1
Backports the following three patches to fix the issue of IMA mishandling
LSM based rule during LSM policy update, causing a file to match an
unexpected rule.
v7:
Fixed the target for free in ima_lsm_copy_rule().
v6:
Removed the redundant i in ima_free_rule().
v5:
Go back to ima_lsm_free_rule() instead, to avoid freeing
rule->fsname.
v4:
Make use of the existing ima_free_rule() instead of the backported
ima_lsm_free_rule(), which resolves additional memory leak issues.
v3:
Backport "LSM: switch to blocking policy update notifiers" as well, as
the prerequisite of "ima: use the lsm policy update notifier".
v2:
Re-adjust the backported logic.
GUO Zihua (1):
ima: Handle -ESTALE returned by ima_filter_rule_match()
Janne Karhunen (2):
LSM: switch to blocking policy update notifiers
ima: use the lsm policy update notifier
drivers/infiniband/core/device.c | 4 +-
include/linux/security.h | 12 +--
security/integrity/ima/ima.h | 2 +
security/integrity/ima/ima_main.c | 8 ++
security/integrity/ima/ima_policy.c | 151 ++++++++++++++++++++++------
security/security.c | 23 +++--
security/selinux/hooks.c | 2 +-
security/selinux/selinuxfs.c | 2 +-
8 files changed, 155 insertions(+), 49 deletions(-)
--
2.17.1
Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
restricted on asymmetric systems"), the setting and clearing of
user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
protection. Since sched_setaffinity() can be invoked from another
process, the process being modified may be undergoing fork() at
the same time. When racing with the clearing of user_cpus_ptr in
__set_cpus_allowed_ptr_locked(), it can lead to use-after-free and
possibly double-free in the arm64 kernel.
Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never
be cleared in a task's lifetime. However, this bug was re-introduced
in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
do_set_cpus_allowed(). This time, it will affect all arches.
Fix this bug by always clearing the user_cpus_ptr of the newly
cloned/forked task before the copying process starts and checking the
user_cpus_ptr state of the source task under pi_lock.
Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.
Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable(a)vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3(a)xiaomi.com>
Signed-off-by: Waiman Long <longman(a)redhat.com>
---
kernel/sched/core.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b6ee5f..b93d030b9fd5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2612,19 +2612,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
int node)
{
+ cpumask_t *user_mask;
unsigned long flags;
- if (!src->user_cpus_ptr)
+ /*
+ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+ * may differ by now due to racing.
+ */
+ dst->user_cpus_ptr = NULL;
+
+ /*
+ * This check is racy and losing the race is a valid situation.
+ * It is not worth the extra overhead of taking the pi_lock on
+ * every fork/clone.
+ */
+ if (data_race(!src->user_cpus_ptr))
return 0;
- dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
- if (!dst->user_cpus_ptr)
+ user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+ if (!user_mask)
return -ENOMEM;
- /* Use pi_lock to protect content of user_cpus_ptr */
+ /*
+ * Use pi_lock to protect content of user_cpus_ptr
+ *
+ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+ * do_set_cpus_allowed().
+ */
raw_spin_lock_irqsave(&src->pi_lock, flags);
- cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ if (src->user_cpus_ptr) {
+ swap(dst->user_cpus_ptr, user_mask);
+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ }
raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+ if (unlikely(user_mask))
+ kfree(user_mask);
+
return 0;
}
--
2.31.1
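The shape of the new dup_user_cpus_ptr() (allocate outside the lock, re-check the source under the lock, free the buffer if it went unused) can be sketched in user space. Everything below is a hypothetical stand-in: struct toy_task, the atomic_flag spinlock playing the role of pi_lock, and the fixed MASK_BYTES size are assumptions for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define MASK_BYTES 16	/* assumption: fixed mask size for the sketch */

struct toy_task {
	atomic_flag lock;		/* stands in for pi_lock */
	unsigned char *user_mask;	/* stands in for user_cpus_ptr */
};

static void toy_task_init(struct toy_task *t)
{
	atomic_flag_clear(&t->lock);
	t->user_mask = NULL;
}

static void toy_lock(struct toy_task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_task *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Returns 0 on success, -1 on allocation failure. */
static int toy_dup_user_mask(struct toy_task *dst, struct toy_task *src)
{
	unsigned char *new_mask;

	/* Always start from a clean destination. */
	dst->user_mask = NULL;

	/*
	 * Cheap unlocked peek; losing the race is acceptable. A real
	 * concurrent program would need an atomic read here.
	 */
	if (!src->user_mask)
		return 0;

	/* Allocate before taking the lock. */
	new_mask = malloc(MASK_BYTES);
	if (!new_mask)
		return -1;

	toy_lock(src);
	if (src->user_mask) {	/* re-check under the lock */
		memcpy(new_mask, src->user_mask, MASK_BYTES);
		dst->user_mask = new_mask;
		new_mask = NULL;
	}
	toy_unlock(src);

	free(new_mask);		/* no-op unless the source mask vanished */
	return 0;
}
```

The key property is that the destination pointer is only ever set to a buffer whose contents were copied while the source was locked, so a concurrent clear of the source cannot leave the destination pointing at freed memory.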
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
3a08576b96e3 ("io_uring: remove check_cq checking from hot paths")
ed29b0b4fd83 ("io_uring: move to separate directory")
155bc9505dbd ("io_uring: return an error when cqe is dropped")
10988a0a67ba ("io_uring: use constants for cq_overflow bitfield")
3e813c902672 ("io_uring: rework io_uring_enter to simplify return value")
cef216fc32d7 ("io_uring: explicitly keep a CQE in io_kiocb")
b4f20bb4e6d5 ("io_uring: move finish_wait() outside of loop in cqring_wait()")
d487b43cd327 ("io_uring: optimise mutex locking for submit+iopoll")
773697b610bf ("io_uring: pre-calculate syscall iopolling decision")
f81440d33cc6 ("io_uring: split off IOPOLL argument verifiction")
b605a7fabb60 ("io_uring: move poll recycling later in compl flushing")
a538be5be328 ("io_uring: optimise io_free_batch_list")
c0713540f6d5 ("io_uring: fix leaks on IOPOLL and CQE_SKIP")
323b190ba2de ("io_uring: free iovec if file assignment fails")
7179c3ce3dbf ("io_uring: fix poll error reporting")
cce64ef01308 ("io_uring: fix poll file assign deadlock")
82733d168cbd ("io_uring: stop using io_wq_work as an fd placeholder")
2804ecd8d3e3 ("io_uring: move apoll->events cache")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Thu, 5 Jan 2023 10:49:15 +0000
Subject: [PATCH] io_uring: fix CQ waiting timeout handling
Jiffy to ktime CQ waiting conversion broke how we treat timeouts, in
particular we rearm it anew every time we get into
io_cqring_wait_schedule() without adjusting the timeout. Waiting for 2
CQEs and getting a task_work in the middle may double the timeout value,
or even worse in some cases task may wait indefinitely.
Cc: stable(a)vger.kernel.org
Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.16729156…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 472574192dd6..2ac1cd8d23ea 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2470,7 +2470,7 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
/* when returns >0, the caller should retry */
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
- ktime_t timeout)
+ ktime_t *timeout)
{
int ret;
unsigned long check_cq;
@@ -2488,7 +2488,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
return -EBADR;
}
- if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
+ if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
/*
@@ -2564,7 +2564,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
}
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
- ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
+ ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
if (__io_cqring_events_user(ctx) >= min_events)
break;
cond_resched();
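The failure mode described in the io_uring commit message (a timeout re-armed in full after every early wakeup) can be modelled with a fake clock. This is only a sketch of the arithmetic, not of the kernel's hrtimer code; give_up_time() and its parameters are hypothetical names.

```c
#include <assert.h>

/*
 * Returns the time at which a waiter finally gives up with a timeout.
 * early[i] is how long the i-th wait ran before something (for example
 * task_work) woke it early. With absolute == 0 the full timeout is
 * re-armed on every wait; with absolute != 0 one deadline is fixed at
 * time 0 and reused.
 */
static long give_up_time(long timeout, const long *early, int n, int absolute)
{
	long now = 0;
	long deadline = timeout;	/* only meaningful in absolute mode */

	for (int i = 0; i < n; i++) {
		long expires = absolute ? deadline : now + timeout;

		if (now + early[i] >= expires)
			return expires;	/* timer fired before the wakeup */
		now += early[i];	/* woken early, go around again */
	}
	return absolute ? deadline : now + timeout;
}

/* One early wakeup at t=99 with a timeout of 100 time units. */
static void give_up_time_demo(void)
{
	const long early[] = { 99 };

	assert(give_up_time(100, early, 1, 0) == 199);	/* nearly doubled */
	assert(give_up_time(100, early, 1, 1) == 100);	/* deadline honoured */
}
```

With more early wakeups the re-armed variant drifts further, and a steady stream of them would keep it from timing out at all, which matches the "wait indefinitely" case in the message.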
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
f3c23bea598a ("drm/amd/display: Uninitialized variables causing 4k60 UCLK to stay at DPM1 and not DPM0")
6d4727c80947 ("drm/amd/display: Add check for DET fetch latency hiding for dcn32")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f3c23bea598ab7e8e4b8c5ca66598921310f718e Mon Sep 17 00:00:00 2001
From: Samson Tam <samson.tam(a)amd.com>
Date: Mon, 5 Dec 2022 11:08:40 -0500
Subject: [PATCH] drm/amd/display: Uninitialized variables causing 4k60 UCLK to
stay at DPM1 and not DPM0
[Why]
SwathSizePerSurfaceY[] and SwathSizePerSurfaceC[] values are uninitialized
because we are using += instead of = operator.
[How]
Assign values in loop with = operator.
Acked-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Samson Tam <samson.tam(a)amd.com>
Reviewed-by: Aric Cyr <aric.cyr(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.0.x, 6.1.x
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
index 5af601cff1a0..b53feeaf5cf1 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
@@ -6257,12 +6257,12 @@ bool dml32_CalculateDETSwathFillLatencyHiding(unsigned int NumberOfActiveSurface
double SwathSizePerSurfaceC[DC__NUM_DPP__MAX];
bool NotEnoughDETSwathFillLatencyHiding = false;
- /* calculate sum of single swath size for all pipes in bytes*/
+ /* calculate sum of single swath size for all pipes in bytes */
for (k = 0; k < NumberOfActiveSurfaces; k++) {
- SwathSizePerSurfaceY[k] += SwathHeightY[k] * SwathWidthY[k] * BytePerPixelInDETY[k] * NumOfDPP[k];
+ SwathSizePerSurfaceY[k] = SwathHeightY[k] * SwathWidthY[k] * BytePerPixelInDETY[k] * NumOfDPP[k];
if (SwathHeightC[k] != 0)
- SwathSizePerSurfaceC[k] += SwathHeightC[k] * SwathWidthC[k] * BytePerPixelInDETC[k] * NumOfDPP[k];
+ SwathSizePerSurfaceC[k] = SwathHeightC[k] * SwathWidthC[k] * BytePerPixelInDETC[k] * NumOfDPP[k];
else
SwathSizePerSurfaceC[k] = 0;
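The effect of switching from += to = can be shown with a small stand-alone sketch. swath_sizes() and its parameter names are hypothetical simplifications of the DML code; the point is only that plain assignment gives each element a defined value no matter what the output array held before.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SURFACES 8	/* assumption: small fixed bound for the sketch */

/* Fill out[k] with the swath size in bytes for each active surface. */
static void swath_sizes(size_t n, const double *height, const double *width,
			const double *bytes_per_pixel, const double *num_dpp,
			double *out)
{
	assert(n <= MAX_SURFACES);
	for (size_t k = 0; k < n; k++) {
		/*
		 * '=' overwrites whatever was in out[k]; '+=' would add to
		 * stale or indeterminate contents instead.
		 */
		out[k] = height[k] * width[k] * bytes_per_pixel[k] * num_dpp[k];
	}
}

/* Pre-filled garbage in out[] must not leak into the result. */
static void swath_sizes_demo(void)
{
	const double h[] = { 8 }, w[] = { 1920 }, bpp[] = { 4 }, dpp[] = { 1 };
	double out[] = { 1e9 };

	swath_sizes(1, h, w, bpp, dpp, out);
	assert(out[0] == 61440.0);
}
```

In the kernel function the arrays are stack locals with no initializer, so the pre-fix += read indeterminate values rather than a known garbage constant as in this demo.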
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
f3c23bea598a ("drm/amd/display: Uninitialized variables causing 4k60 UCLK to stay at DPM1 and not DPM0")
6d4727c80947 ("drm/amd/display: Add check for DET fetch latency hiding for dcn32")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f3c23bea598ab7e8e4b8c5ca66598921310f718e Mon Sep 17 00:00:00 2001
From: Samson Tam <samson.tam(a)amd.com>
Date: Mon, 5 Dec 2022 11:08:40 -0500
Subject: [PATCH] drm/amd/display: Uninitialized variables causing 4k60 UCLK to
stay at DPM1 and not DPM0
[Why]
SwathSizePerSurfaceY[] and SwathSizePerSurfaceC[] values are uninitialized
because we are using += instead of = operator.
[How]
Assign values in loop with = operator.
Acked-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Samson Tam <samson.tam(a)amd.com>
Reviewed-by: Aric Cyr <aric.cyr(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.0.x, 6.1.x
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
index 5af601cff1a0..b53feeaf5cf1 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
@@ -6257,12 +6257,12 @@ bool dml32_CalculateDETSwathFillLatencyHiding(unsigned int NumberOfActiveSurface
double SwathSizePerSurfaceC[DC__NUM_DPP__MAX];
bool NotEnoughDETSwathFillLatencyHiding = false;
- /* calculate sum of single swath size for all pipes in bytes*/
+ /* calculate sum of single swath size for all pipes in bytes */
for (k = 0; k < NumberOfActiveSurfaces; k++) {
- SwathSizePerSurfaceY[k] += SwathHeightY[k] * SwathWidthY[k] * BytePerPixelInDETY[k] * NumOfDPP[k];
+ SwathSizePerSurfaceY[k] = SwathHeightY[k] * SwathWidthY[k] * BytePerPixelInDETY[k] * NumOfDPP[k];
if (SwathHeightC[k] != 0)
- SwathSizePerSurfaceC[k] += SwathHeightC[k] * SwathWidthC[k] * BytePerPixelInDETC[k] * NumOfDPP[k];
+ SwathSizePerSurfaceC[k] = SwathHeightC[k] * SwathWidthC[k] * BytePerPixelInDETC[k] * NumOfDPP[k];
else
SwathSizePerSurfaceC[k] = 0;
From: Takashi Iwai <tiwai(a)suse.de>
commit 8423f0b6d513b259fdab9c9bf4aaa6188d054c2d upstream.
There is a small race window at snd_pcm_oss_sync() that is called from
OSS PCM SNDCTL_DSP_SYNC ioctl; namely the function calls
snd_pcm_oss_make_ready() at first, then takes the params_lock mutex
for the rest. When the stream is set up again by another thread
between them, it leads to inconsistency, and may result in unexpected
results such as NULL dereference of OSS buffer as a fuzzer spotted
recently.
The fix is simply to cover snd_pcm_oss_make_ready() call into the same
params_lock mutex with snd_pcm_oss_make_ready_locked() variant.
Reported-and-tested-by: butt3rflyh4ck <butterflyhuangxx(a)gmail.com>
Reviewed-by: Jaroslav Kysela <perex(a)perex.cz>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/CAFcO6XN7JDM4xSXGhtusQfS2mSBcx50VJKwQpCq=WeLt57aa…
Link: https://lore.kernel.org/r/20220905060714.22549-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Zubin Mithra <zsm(a)google.com>
---
Note:
* 8423f0b6d513 is present in linux-5.15.y and linux-5.4.y; missing in
linux-5.10.y.
* Backport addresses conflict due to surrounding context.
* Tests run: build and boot.
sound/core/oss/pcm_oss.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index f88de74da1eb..de6f94bee50b 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -1662,13 +1662,14 @@ static int snd_pcm_oss_sync(struct snd_pcm_oss_file *pcm_oss_file)
runtime = substream->runtime;
if (atomic_read(&substream->mmap_count))
goto __direct;
- if ((err = snd_pcm_oss_make_ready(substream)) < 0)
- return err;
atomic_inc(&runtime->oss.rw_ref);
if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
atomic_dec(&runtime->oss.rw_ref);
return -ERESTARTSYS;
}
+ err = snd_pcm_oss_make_ready_locked(substream);
+ if (err < 0)
+ goto unlock;
format = snd_pcm_oss_format_from(runtime->oss.format);
width = snd_pcm_format_physical_width(format);
if (runtime->oss.buffer_used > 0) {
--
2.38.0.rc1.362.ged0d419d3c-goog
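The locking change described in the ALSA patch above can be sketched in user space as follows. This is a minimal illustration, not the kernel code: struct demo_stream and every demo_* function are hypothetical names, and a pthread mutex stands in for params_lock. It only shows the shape of the fix, where the readiness check and the buffer access share one critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stream state; params_lock mirrors the mutex named in the commit. */
struct demo_stream {
	pthread_mutex_t params_lock;
	bool ready;
	int buffer_used;
};

/* Caller must hold params_lock. Returns 0 on success. */
static int demo_make_ready_locked(struct demo_stream *s)
{
	if (!s->ready) {
		s->buffer_used = 0;
		s->ready = true;
	}
	return 0;
}

/* Wrapper for callers that do not already hold the lock. */
static int demo_make_ready(struct demo_stream *s)
{
	int err;

	pthread_mutex_lock(&s->params_lock);
	err = demo_make_ready_locked(s);
	pthread_mutex_unlock(&s->params_lock);
	return err;
}

/* Sync path: the readiness check and the buffer access share one lock hold. */
static int demo_sync(struct demo_stream *s)
{
	int err;

	pthread_mutex_lock(&s->params_lock);
	err = demo_make_ready_locked(s);
	if (err < 0)
		goto unlock;
	/* No other thread can reconfigure the stream between check and use. */
	s->buffer_used = 0;
unlock:
	pthread_mutex_unlock(&s->params_lock);
	return err;
}
```

Calling demo_make_ready() before taking the lock in demo_sync() would reopen the window the commit message describes, because another thread could change the stream between the two lock holds.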
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Thu, 5 Jan 2023 10:49:15 +0000
Subject: [PATCH] io_uring: fix CQ waiting timeout handling
The jiffies-to-ktime conversion of CQ waiting broke how we treat
timeouts: we rearm the timeout anew every time we enter
io_cqring_wait_schedule() without adjusting it. Waiting for 2 CQEs and
getting a task_work in the middle may double the timeout value or, even
worse, in some cases the task may wait indefinitely.
Cc: stable(a)vger.kernel.org
Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.16729156…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 472574192dd6..2ac1cd8d23ea 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2470,7 +2470,7 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
/* when returns >0, the caller should retry */
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
- ktime_t timeout)
+ ktime_t *timeout)
{
int ret;
unsigned long check_cq;
@@ -2488,7 +2488,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
return -EBADR;
}
- if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
+ if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
/*
@@ -2564,7 +2564,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
}
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
- ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
+ ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
if (__io_cqring_events_user(ctx) >= min_events)
break;
cond_resched();
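The failure mode in the io_uring commit message above (a timeout re-armed on every pass through the wait loop) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the kernel implementation: the simulated clock and all demo_* names are hypothetical, and no hrtimer API is called. Wake-up times are assumed to be absolute and in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated clock in nanoseconds (hypothetical stand-in for a real clock). */
static uint64_t demo_now_ns;

/*
 * Simulate one sleep. Returns 1 if an early wake-up arrives before the
 * deadline, 0 if the deadline expires. Advances the simulated clock.
 */
static int demo_sleep_until(uint64_t deadline_ns, uint64_t wake_at_ns)
{
	if (wake_at_ns < deadline_ns) {
		demo_now_ns = wake_at_ns;
		return 1;
	}
	demo_now_ns = deadline_ns;
	return 0;
}

/* Re-arms a relative timeout on each pass: every early wake-up restarts it. */
static uint64_t demo_wait_rearmed(uint64_t timeout_ns,
				  const uint64_t *wakeups, size_t n)
{
	size_t i = 0;

	demo_now_ns = 0;
	for (;;) {
		uint64_t deadline = demo_now_ns + timeout_ns; /* recomputed */
		uint64_t wake = i < n ? wakeups[i] : UINT64_MAX;

		if (!demo_sleep_until(deadline, wake))
			return demo_now_ns; /* time at which the wait gives up */
		i++;
	}
}

/* Computes one absolute deadline up front and reuses it on every pass. */
static uint64_t demo_wait_absolute(uint64_t timeout_ns,
				   const uint64_t *wakeups, size_t n)
{
	uint64_t deadline;
	size_t i = 0;

	demo_now_ns = 0;
	deadline = demo_now_ns + timeout_ns; /* fixed for the whole wait */
	for (;;) {
		uint64_t wake = i < n ? wakeups[i] : UINT64_MAX;

		if (!demo_sleep_until(deadline, wake))
			return demo_now_ns;
		i++;
	}
}
```

With a 1000 ns timeout and a single early wake-up at 900 ns, the re-armed variant gives up at 1900 ns, nearly double the requested timeout, while the absolute-deadline variant gives up at 1000 ns.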
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
12521a5d5cb7 ("io_uring: fix CQ waiting timeout handling")
35d90f95cfa7 ("io_uring: include task_work run after scheduling in wait for events")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Thu, 5 Jan 2023 10:49:15 +0000
Subject: [PATCH] io_uring: fix CQ waiting timeout handling
The jiffies-to-ktime conversion of CQ waiting broke how we treat
timeouts: we rearm the timeout anew every time we enter
io_cqring_wait_schedule() without adjusting it. Waiting for 2 CQEs and
getting a task_work in the middle may double the timeout value or, even
worse, in some cases the task may wait indefinitely.
Cc: stable(a)vger.kernel.org
Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.16729156…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 472574192dd6..2ac1cd8d23ea 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2470,7 +2470,7 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
/* when returns >0, the caller should retry */
static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
struct io_wait_queue *iowq,
- ktime_t timeout)
+ ktime_t *timeout)
{
int ret;
unsigned long check_cq;
@@ -2488,7 +2488,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
return -EBADR;
}
- if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
+ if (!schedule_hrtimeout(timeout, HRTIMER_MODE_ABS))
return -ETIME;
/*
@@ -2564,7 +2564,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
}
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
TASK_INTERRUPTIBLE);
- ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
+ ret = io_cqring_wait_schedule(ctx, &iowq, &timeout);
if (__io_cqring_events_user(ctx) >= min_events)
break;
cond_resched();
The patch below does not apply to the 6.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a26116c1e740 ("virtio_blk: Fix signedness bug in virtblk_prep_rq()")
258896fcc786 ("virtio-blk: use a helper to handle request queuing errors")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a26116c1e74028914f281851488546c91cbae57d Mon Sep 17 00:00:00 2001
From: Rafael Mendonca <rafaelmendsr(a)gmail.com>
Date: Fri, 21 Oct 2022 17:41:26 -0300
Subject: [PATCH] virtio_blk: Fix signedness bug in virtblk_prep_rq()
The virtblk_map_data() function returns negative error codes. However,
the 'nents' field of vbr->sg_table is an unsigned int, so a 'nents < 0'
check can never be true and the error handling does not work correctly.
Cc: stable(a)vger.kernel.org
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Signed-off-by: Rafael Mendonca <rafaelmendsr(a)gmail.com>
Message-Id: <20221021204126.927603-1-rafaelmendsr(a)gmail.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare(a)redhat.com>
Reviewed-by: Suwan Kim <suwan.kim027(a)gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha(a)redhat.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index dcbf86cd2155..6a77fa917428 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -334,14 +334,16 @@ static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx,
struct virtblk_req *vbr)
{
blk_status_t status;
+ int num;
status = virtblk_setup_cmd(vblk->vdev, req, vbr);
if (unlikely(status))
return status;
- vbr->sg_table.nents = virtblk_map_data(hctx, req, vbr);
- if (unlikely(vbr->sg_table.nents < 0))
+ num = virtblk_map_data(hctx, req, vbr);
+ if (unlikely(num < 0))
return virtblk_fail_to_queue(req, -ENOMEM);
+ vbr->sg_table.nents = num;
blk_mq_start_request(req);
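The signedness problem fixed by the virtio_blk patch above can be reproduced in a few lines of standalone C. This is a minimal sketch: struct demo_sg_table and the demo_* functions are hypothetical stand-ins, and unlike the real patch (which returns through virtblk_fail_to_queue()) the fixed variant simply propagates the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scatterlist table; nents is unsigned. */
struct demo_sg_table {
	unsigned int nents;
};

/* Hypothetical mapper: returns a segment count or a negative errno. */
static int demo_map_data(bool fail)
{
	return fail ? -ENOMEM : 3;
}

/* Buggy pattern: the negative errno is stored directly in the unsigned field. */
static unsigned int demo_store_buggy(struct demo_sg_table *t, bool fail)
{
	t->nents = (unsigned int)demo_map_data(fail);
	return t->nents; /* on failure: a huge positive value, never < 0 */
}

/* Fixed pattern: test a signed temporary first, store only on success. */
static int demo_store_fixed(struct demo_sg_table *t, bool fail)
{
	int num = demo_map_data(fail);

	if (num < 0)
		return num;
	t->nents = (unsigned int)num;
	return 0;
}
```

In the buggy variant a failed mapping leaves nents holding the wrapped errno, so a later `nents < 0` test is always false. The fixed variant leaves nents untouched when the mapping fails.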
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a26116c1e740 ("virtio_blk: Fix signedness bug in virtblk_prep_rq()")
258896fcc786 ("virtio-blk: use a helper to handle request queuing errors")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a26116c1e74028914f281851488546c91cbae57d Mon Sep 17 00:00:00 2001
From: Rafael Mendonca <rafaelmendsr(a)gmail.com>
Date: Fri, 21 Oct 2022 17:41:26 -0300
Subject: [PATCH] virtio_blk: Fix signedness bug in virtblk_prep_rq()
The virtblk_map_data() function returns negative error codes. However,
the 'nents' field of vbr->sg_table is an unsigned int, so a 'nents < 0'
check can never be true and the error handling does not work correctly.
Cc: stable(a)vger.kernel.org
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Signed-off-by: Rafael Mendonca <rafaelmendsr(a)gmail.com>
Message-Id: <20221021204126.927603-1-rafaelmendsr(a)gmail.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare(a)redhat.com>
Reviewed-by: Suwan Kim <suwan.kim027(a)gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha(a)redhat.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index dcbf86cd2155..6a77fa917428 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -334,14 +334,16 @@ static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx,
struct virtblk_req *vbr)
{
blk_status_t status;
+ int num;
status = virtblk_setup_cmd(vblk->vdev, req, vbr);
if (unlikely(status))
return status;
- vbr->sg_table.nents = virtblk_map_data(hctx, req, vbr);
- if (unlikely(vbr->sg_table.nents < 0))
+ num = virtblk_map_data(hctx, req, vbr);
+ if (unlikely(num < 0))
return virtblk_fail_to_queue(req, -ENOMEM);
+ vbr->sg_table.nents = num;
blk_mq_start_request(req);