This is an automatic generated email to let you know that the following patch were queued:
Subject: media: v4l: Fix missing tabular column hint for Y14P format
Author: Jean-Michel Hautbois <jeanmichel.hautbois(a)yoseli.org>
Date: Sat Jun 8 18:41:27 2024 +0200
The original patch added two columns in the flat-table of Luma-Only
Image Formats, without updating hints to latex: above it. This results
in wrong column count in the output of Sphinx's latex builder.
Fix it.
Reported-by: Akira Yokosawa <akiyks(a)gmail.com>
Closes: https://lore.kernel.org/linux-media/bdbc27ba-5098-49fb-aabf-753c81361cc7@gm…
Fixes: adb1d4655e53 ("media: v4l: Add V4L2-PIX-FMT-Y14P format")
Cc: stable(a)vger.kernel.org # for v6.10
Signed-off-by: Jean-Michel Hautbois <jeanmichel.hautbois(a)yoseli.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
---
diff --git a/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst b/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst
index f02e6cf3516a..74df19be91f6 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-yuv-luma.rst
@@ -21,9 +21,9 @@ are often referred to as greyscale formats.
.. raw:: latex
- \scriptsize
+ \tiny
-.. tabularcolumns:: |p{3.6cm}|p{3.0cm}|p{1.3cm}|p{2.6cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|
+.. tabularcolumns:: |p{3.6cm}|p{2.4cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|p{1.3cm}|
.. flat-table:: Luma-Only Image Formats
:header-rows: 1
From: Rameshkumar Sundaram <quic_ramess(a)quicinc.com>
[ Upstream commit 57b341e9ab13e5688491bfd54f8b5502416c8905 ]
Stations can update bandwidth/NSS change in
VHT action frame with action type Operating Mode Notification.
(IEEE Std 802.11-2020 - 9.4.1.53 Operating Mode field)
For Operating Mode Notification, an RX NSS change to a value
greater than AP's maximum NSS should not be allowed.
During fuzz testing, by forcefully sending VHT Op. mode notif.
frames from STA with random rx_nss values, it is found that AP
accepts rx_nss values greater that APs maximum NSS instead of
discarding such NSS change.
Hence allow NSS change only up to maximum NSS that is negotiated
and capped to AP's capability during association.
Signed-off-by: Rameshkumar Sundaram <quic_ramess(a)quicinc.com>
Link: https://lore.kernel.org/r/20230207114146.10567-1-quic_ramess@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
---
net/mac80211/vht.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/vht.c b/net/mac80211/vht.c
index f7526be8a1c7..b3a5c3e96a72 100644
--- a/net/mac80211/vht.c
+++ b/net/mac80211/vht.c
@@ -637,7 +637,7 @@ u32 __ieee80211_vht_handle_opmode(struct ieee80211_sub_if_data *sdata,
enum ieee80211_sta_rx_bandwidth new_bw;
struct sta_opmode_info sta_opmode = {};
u32 changed = 0;
- u8 nss;
+ u8 nss, cur_nss;
/* ignore - no support for BF yet */
if (opmode & IEEE80211_OPMODE_NOTIF_RX_NSS_TYPE_BF)
@@ -648,10 +648,25 @@ u32 __ieee80211_vht_handle_opmode(struct ieee80211_sub_if_data *sdata,
nss += 1;
if (link_sta->pub->rx_nss != nss) {
- link_sta->pub->rx_nss = nss;
- sta_opmode.rx_nss = nss;
- changed |= IEEE80211_RC_NSS_CHANGED;
- sta_opmode.changed |= STA_OPMODE_N_SS_CHANGED;
+ cur_nss = link_sta->pub->rx_nss;
+ /* Reset rx_nss and call ieee80211_sta_set_rx_nss() which
+ * will set the same to max nss value calculated based on capability.
+ */
+ link_sta->pub->rx_nss = 0;
+ ieee80211_sta_set_rx_nss(link_sta);
+ /* Do not allow an nss change to rx_nss greater than max_nss
+ * negotiated and capped to APs capability during association.
+ */
+ if (nss <= link_sta->pub->rx_nss) {
+ link_sta->pub->rx_nss = nss;
+ sta_opmode.rx_nss = nss;
+ changed |= IEEE80211_RC_NSS_CHANGED;
+ sta_opmode.changed |= STA_OPMODE_N_SS_CHANGED;
+ } else {
+ link_sta->pub->rx_nss = cur_nss;
+ pr_warn_ratelimited("Ignoring NSS change in VHT Operating Mode Notification from %pM with invalid nss %d",
+ link_sta->pub->addr, nss);
+ }
}
switch (opmode & IEEE80211_OPMODE_NOTIF_CHANWIDTH_MASK) {
--
2.45.2
This patchset serves to prevent an AB/BA deadlock:
thread 0:
* (lock A) acquire substream lock by
snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) wait for tasklet to finish by calling
tasklet_unlock_spin_wait() in
tasklet_disable_in_atomic() in
ohci_flush_iso_completions() of ohci.c
thread 1:
* (lock B) enter tasklet
* (lock A) attempt to acquire substream lock,
waiting for it to be released:
snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed() in
update_pcm_pointers() in
process_ctx_payloads() in
process_rx_packets() of amdtp-stream.c
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
The issue has been reported as a regression of kernel 5.14:
Link: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg7…
("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock with
Fireface 800")
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
Commit b5b519965c4c ("ALSA: firewire-lib: obsolete workqueue for period
update") belongs to the same patch series and removed
the now-unused workqueue entirely.
Though being observed on RME Fireface 800, this issue would affect all
Firewire audio interfaces using ohci amdtp + pcm streaming.
ALSA streaming, especially under intensive CPU load will reveal this issue
the soonest due to issuing more hardIRQs, with time to occurrence ranging
from 2 secons to 30 minutes after starting playback.
to reproduce the issue:
direct ALSA playback to the device:
mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac
Time to occurrence: 2s to 30m
Likelihood increased by:
- high CPU load
stress --cpu $(nproc)
- switching between applications via workspaces
tested with i915 in Xfce
PulsaAudio / PipeWire conceal the issue as they run PCM substream
without period wakeup mode, issuing less hardIRQs.
Cc: stable(a)vger.kernel.org
Backport note:
Also applies to and fixes on (tested):
6.10.2, 6.9.12, 6.6.43, 6.1.102, 5.15.164
Edmund Raile (3):
Revert "ALSA: firewire-lib: obsolete workqueue for period update"
Revert "ALSA: firewire-lib: operate for period elapse event in process
context"
ALSA: firewire-lib: amdtp-stream work queue inline description
sound/firewire/amdtp-stream.c | 38 ++++++++++++++++++++++-------------
sound/firewire/amdtp-stream.h | 1 +
2 files changed, 25 insertions(+), 14 deletions(-)
--
2.45.2
This patchset serves to prevent an AB/BA deadlock:
thread 0:
* (lock A) acquire substream lock by
snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) wait for tasklet to finish by calling
tasklet_unlock_spin_wait() in
tasklet_disable_in_atomic() in
ohci_flush_iso_completions() of ohci.c
thread 1:
* (lock B) enter tasklet
* (lock A) attempt to acquire substream lock,
waiting for it to be released:
snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed() in
update_pcm_pointers() in
process_ctx_payloads() in
process_rx_packets() of amdtp-stream.c
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
The issue has been reported as a regression of kernel 5.14:
Link: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg7…
("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock with
Fireface 800")
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
Commit b5b519965c4c ("ALSA: firewire-lib: obsolete workqueue for period
update") belongs to the same patch series and removed
the now-unused workqueue entirely.
Though being observed on RME Fireface 800, this issue would affect all
Firewire audio interfaces using ohci amdtp + pcm streaming.
ALSA streaming, especially under intensive CPU load will reveal this issue
the soonest due to issuing more hardIRQs, with time to occurrence ranging
from 2 secons to 30 minutes after starting playback.
to reproduce the issue:
direct ALSA playback to the device:
mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac
Time to occurrence: 2s to 30m
Likelihood increased by:
- high CPU load
stress --cpu $(nproc)
- switching between applications via workspaces
tested with i915 in Xfce
PulsaAudio / PipeWire conceal the issue as they run PCM substream
without period wakeup mode, issuing less hardIRQs.
Cc: stable(a)vger.kernel.org
Backport note:
Also applies to and fixes on (tested):
6.10.2, 6.9.12, 6.6.43, 6.1.102, 5.15.164
Edmund Raile (3):
Revert "ALSA: firewire-lib: obsolete workqueue for period update"
Revert "ALSA: firewire-lib: operate for period elapse event in process
context"
ALSA: firewire-lib: amdtp-stream work queue inline description
sound/firewire/amdtp-stream.c | 38 ++++++++++++++++++++++-------------
sound/firewire/amdtp-stream.h | 1 +
2 files changed, 25 insertions(+), 14 deletions(-)
--
2.45.2
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 525bd65aa759ec320af1dc06e114ed69733e9e23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072914-sierra-aflutter-e231@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
525bd65aa759 ("fuse: verify {g,u}id mount options correctly")
84c215075b57 ("fuse: name fs_context consistently")
1dd539577c42 ("virtiofs: add a mount option to enable dax")
f4fd4ae354ba ("virtiofs: get rid of no_mount_options")
b330966f79fb ("fuse: reject options on reconfigure via fsconfig(2)")
e8b20a474cf2 ("fuse: ignore 'data' argument of mount(..., MS_REMOUNT)")
0189a2d367f4 ("fuse: use ->reconfigure() instead of ->remount_fs()")
7fd3abfa8dd7 ("virtiofs: do not use fuse_fill_super_common() for device installation")
c9d35ee049b4 ("Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 525bd65aa759ec320af1dc06e114ed69733e9e23 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Tue, 2 Jul 2024 17:22:41 -0500
Subject: [PATCH] fuse: verify {g,u}id mount options correctly
As was done in
0200679fc795 ("tmpfs: verify {g,u}id mount options correctly")
we need to validate that the requested uid and/or gid is representable in
the filesystem's idmapping.
Cribbing from the above commit log,
The contract for {g,u}id mount options and {g,u}id values in general set
from userspace has always been that they are translated according to the
caller's idmapping. In so far, fuse has been doing the correct thing.
But since fuse is mountable in unprivileged contexts it is also
necessary to verify that the resulting {k,g}uid is representable in the
namespace of the superblock.
Fixes: c30da2e981a7 ("fuse: convert to use the new mount API")
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Link: https://lore.kernel.org/r/8f07d45d-c806-484d-a2e3-7a2199df1cd2@redhat.com
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 99e44ea7d875..32fe6fa72f46 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -755,6 +755,8 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
struct fs_parse_result result;
struct fuse_fs_context *ctx = fsc->fs_private;
int opt;
+ kuid_t kuid;
+ kgid_t kgid;
if (fsc->purpose == FS_CONTEXT_FOR_RECONFIGURE) {
/*
@@ -799,16 +801,30 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
break;
case OPT_USER_ID:
- ctx->user_id = make_kuid(fsc->user_ns, result.uint_32);
- if (!uid_valid(ctx->user_id))
+ kuid = make_kuid(fsc->user_ns, result.uint_32);
+ if (!uid_valid(kuid))
return invalfc(fsc, "Invalid user_id");
+ /*
+ * The requested uid must be representable in the
+ * filesystem's idmapping.
+ */
+ if (!kuid_has_mapping(fsc->user_ns, kuid))
+ return invalfc(fsc, "Invalid user_id");
+ ctx->user_id = kuid;
ctx->user_id_present = true;
break;
case OPT_GROUP_ID:
- ctx->group_id = make_kgid(fsc->user_ns, result.uint_32);
- if (!gid_valid(ctx->group_id))
+ kgid = make_kgid(fsc->user_ns, result.uint_32);;
+ if (!gid_valid(kgid))
return invalfc(fsc, "Invalid group_id");
+ /*
+ * The requested gid must be representable in the
+ * filesystem's idmapping.
+ */
+ if (!kgid_has_mapping(fsc->user_ns, kgid))
+ return invalfc(fsc, "Invalid group_id");
+ ctx->group_id = kgid;
ctx->group_id_present = true;
break;
All batches of the Pine64 Pinebook Pro, except the latest batch (as of 2024)
whose hardware design was revised due to the component shortage, use a 1S
lithium battery whose nominal/design capacity is 10,000 mAh, according to the
battery datasheet. [1][2] Let's correct the design full-charge value in the
Pinebook Pro board dts, to improve the accuracy of the hardware description,
and to hopefully improve the accuracy of the fuel gauge a bit on all units
that don't belong to the latest batch.
The above-mentioned latest batch uses a different 1S lithium battery with
a slightly lower capacity, more precisely 9,600 mAh. To make the fuel gauge
work reliably on the latest batch, a sample battery would need to be sent to
CellWise, to obtain its proprietary battery profile, whose data goes into
"cellwise,battery-profile" in the Pinebook Pro board dts. Without that data,
the fuel gauge reportedly works unreliably, so changing the design capacity
won't have any negative effects on the already unreliable operation of the
fuel gauge in the Pinebook Pros that belong to the latest batch.
According to the battery datasheet, its voltage can go as low as 2.75 V while
discharging, but it's better to leave the current 3.0 V value in the dts file,
because of the associated Pinebook Pro's voltage regulation issues.
[1] https://wiki.pine64.org/index.php/Pinebook_Pro#Battery
[2] https://files.pine64.org/doc/datasheet/pinebook/40110175P%203.8V%2010000mAh…
Fixes: c7c4d698cd28 ("arm64: dts: rockchip: add fuel gauge to Pinebook Pro dts")
Cc: stable(a)vger.kernel.org
Cc: Marek Kraus <gamiee(a)pine64.org>
Signed-off-by: Dragan Simic <dsimic(a)manjaro.org>
---
arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts b/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts
index 294eb2de263d..c918968714e4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts
@@ -37,7 +37,7 @@ backlight: edp-backlight {
bat: battery {
compatible = "simple-battery";
- charge-full-design-microamp-hours = <9800000>;
+ charge-full-design-microamp-hours = <10000000>;
voltage-max-design-microvolt = <4350000>;
voltage-min-design-microvolt = <3000000>;
};
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 525bd65aa759ec320af1dc06e114ed69733e9e23
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072908-everglade-starved-66da@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
525bd65aa759 ("fuse: verify {g,u}id mount options correctly")
84c215075b57 ("fuse: name fs_context consistently")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 525bd65aa759ec320af1dc06e114ed69733e9e23 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Tue, 2 Jul 2024 17:22:41 -0500
Subject: [PATCH] fuse: verify {g,u}id mount options correctly
As was done in
0200679fc795 ("tmpfs: verify {g,u}id mount options correctly")
we need to validate that the requested uid and/or gid is representable in
the filesystem's idmapping.
Cribbing from the above commit log,
The contract for {g,u}id mount options and {g,u}id values in general set
from userspace has always been that they are translated according to the
caller's idmapping. In so far, fuse has been doing the correct thing.
But since fuse is mountable in unprivileged contexts it is also
necessary to verify that the resulting {k,g}uid is representable in the
namespace of the superblock.
Fixes: c30da2e981a7 ("fuse: convert to use the new mount API")
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Link: https://lore.kernel.org/r/8f07d45d-c806-484d-a2e3-7a2199df1cd2@redhat.com
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 99e44ea7d875..32fe6fa72f46 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -755,6 +755,8 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
struct fs_parse_result result;
struct fuse_fs_context *ctx = fsc->fs_private;
int opt;
+ kuid_t kuid;
+ kgid_t kgid;
if (fsc->purpose == FS_CONTEXT_FOR_RECONFIGURE) {
/*
@@ -799,16 +801,30 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
break;
case OPT_USER_ID:
- ctx->user_id = make_kuid(fsc->user_ns, result.uint_32);
- if (!uid_valid(ctx->user_id))
+ kuid = make_kuid(fsc->user_ns, result.uint_32);
+ if (!uid_valid(kuid))
return invalfc(fsc, "Invalid user_id");
+ /*
+ * The requested uid must be representable in the
+ * filesystem's idmapping.
+ */
+ if (!kuid_has_mapping(fsc->user_ns, kuid))
+ return invalfc(fsc, "Invalid user_id");
+ ctx->user_id = kuid;
ctx->user_id_present = true;
break;
case OPT_GROUP_ID:
- ctx->group_id = make_kgid(fsc->user_ns, result.uint_32);
- if (!gid_valid(ctx->group_id))
+ kgid = make_kgid(fsc->user_ns, result.uint_32);;
+ if (!gid_valid(kgid))
return invalfc(fsc, "Invalid group_id");
+ /*
+ * The requested gid must be representable in the
+ * filesystem's idmapping.
+ */
+ if (!kgid_has_mapping(fsc->user_ns, kgid))
+ return invalfc(fsc, "Invalid group_id");
+ ctx->group_id = kgid;
ctx->group_id_present = true;
break;
The ufshcd_add_delay_before_dme_cmd() always introduces a delay of
MIN_DELAY_BEFORE_DME_CMDS_US between DME commands even when it's not
required. The delay is added when the UFS host controller supplies the
quirk UFSHCD_QUIRK_DELAY_BEFORE_DME_CMDS.
Fix the logic to update hba->last_dme_cmd_tstamp to ensure subsequent
DME commands have the correct delay in the range of 0 to
MIN_DELAY_BEFORE_DME_CMDS_US.
Update the timestamp at the end of the function to ensure it captures
the latest time after any necessary delay has been applied.
Signed-off-by: Vamshi Gajjela <vamshigajjela(a)google.com>
Fixes: cad2e03d8607 ("ufs: add support to allow non standard behaviours (quirks)")
Cc: stable(a)vger.kernel.org
---
drivers/ufs/core/ufshcd.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index dc757ba47522..406bda1585f6 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -4090,11 +4090,16 @@ static inline void ufshcd_add_delay_before_dme_cmd(struct ufs_hba *hba)
min_sleep_time_us =
MIN_DELAY_BEFORE_DME_CMDS_US - delta;
else
- return; /* no more delay required */
+ min_sleep_time_us = 0; /* no more delay required */
}
- /* allow sleep for extra 50us if needed */
- usleep_range(min_sleep_time_us, min_sleep_time_us + 50);
+ if (min_sleep_time_us > 0) {
+ /* allow sleep for extra 50us if needed */
+ usleep_range(min_sleep_time_us, min_sleep_time_us + 50);
+ }
+
+ /* update the last_dme_cmd_tstamp */
+ hba->last_dme_cmd_tstamp = ktime_get();
}
/**
--
2.45.2.1089.g2a221341d9-goog
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 24dce1c538a7ceac43f2f97aae8dfd4bb93ea9b9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072931-sarcastic-coagulant-6821@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
24dce1c538a7 ("io_uring: fix lost getsockopt completions")
d10f19dff56e ("io_uring/uring_cmd: switch to always allocating async data")
a9165b83c193 ("io_uring/rw: always setup io_async_rw for read/write requests")
8e5b3b89ecaf ("io_uring: remove struct io_tw_state::locked")
92219afb980e ("io_uring: force tw ctx locking")
6e6b8c62120a ("io_uring/rw: avoid punting to io-wq directly")
e1eef2e56cb0 ("io_uring/cmd: fix tw <-> issue_flags conversion")
6edd953b6ec7 ("io_uring/cmd: kill one issue_flags to tw conversion")
da12d9ab5889 ("io_uring/cmd: move io_uring_try_cancel_uring_cmd()")
8d0c12a80cde ("io-uring: add napi busy poll support")
405b4dc14b10 ("io-uring: move io_wait_queue definition to header file")
da08d2edb020 ("io_uring: re-arrange struct io_ring_ctx to reduce padding")
42c0905f0cac ("io_uring: cleanup handle_tw_list() calling convention")
9fe3eaea4a35 ("io_uring: remove unconditional looping in local task_work handling")
670d9d3df880 ("io_uring: remove next io_kiocb fetch in task_work running")
170539bdf109 ("io_uring: handle traditional task_work in FIFO order")
592b4805432a ("io_uring: remove looping around handling traditional task_work")
95041b93e90a ("io_uring: add io_file_can_poll() helper")
521223d7c229 ("io_uring/cancel: don't default to setting req->work.cancel_seq")
4bcb982cce74 ("io_uring: expand main struct io_kiocb flags to 64-bits")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 24dce1c538a7ceac43f2f97aae8dfd4bb93ea9b9 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Tue, 16 Jul 2024 19:05:46 +0100
Subject: [PATCH] io_uring: fix lost getsockopt completions
There is a report that iowq executed getsockopt never completes. The
reason being that io_uring_cmd_sock() can return a positive result, and
io_uring_cmd() propagates it back to core io_uring, instead of IOU_OK.
In case of io_wq_submit_work(), the request will be dropped without
completing it.
The offending code was introduced by a hack in
a9c3eda7eada9 ("io_uring: fix submission-failure handling for uring-cmd"),
however it was fine until getsockopt was introduced and started
returning positive results.
The right solution is to always return IOU_OK, since
e0b23d9953b0c ("io_uring: optimise ltimeout for inline execution"),
we should be able to do it without problems, however for the sake of
backporting and minimising side effects, let's keep returning negative
return codes and otherwise do IOU_OK.
Link: https://github.com/axboe/liburing/issues/1181
Cc: stable(a)vger.kernel.org
Fixes: 8e9fad0e70b7b ("io_uring: Add io_uring command support for sockets")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Reviewed-by: Breno Leitao <leitao(a)debian.org>
Link: https://lore.kernel.org/r/ff349cf0654018189b6077e85feed935f0f8839e.17211498…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 21ac5fb2d5f0..a54163a83968 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -265,7 +265,7 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
req_set_fail(req);
io_req_uring_cleanup(req, issue_flags);
io_req_set_res(req, ret, 0);
- return ret;
+ return ret < 0 ? ret : IOU_OK;
}
int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f442fa6141379a20b48ae3efabee827a3d260787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024071541-observing-landline-c9ec@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
f442fa614137 ("mm: gup: stop abusing try_grab_folio")
01d89b93e176 ("mm/gup: fix hugepd handling in hugetlb rework")
9cbe4954c6d9 ("gup: use folios for gup_devmap")
53e45c4f6d4f ("mm: convert put_devmap_managed_page_refs() to put_devmap_managed_folio_refs()")
6785c54a1b43 ("mm: remove put_devmap_managed_page()")
25176ad09ca3 ("mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST")
23babe1934d7 ("mm/gup: consistently name GUP-fast functions")
a12083d721d7 ("mm/gup: handle hugepd for follow_page()")
4418c522f683 ("mm/gup: handle huge pmd for follow_pmd_mask()")
1b1676180246 ("mm/gup: handle huge pud for follow_pud_mask()")
caf8cab79857 ("mm/gup: cache *pudp in follow_pud_mask()")
878b0c451621 ("mm/gup: handle hugetlb for no_page_table()")
f3c94c625fe3 ("mm/gup: refactor record_subpages() to find 1st small page")
607c63195d63 ("mm/gup: drop gup_fast_folio_allowed() in hugepd processing")
f002882ca369 ("mm: merge folio_is_secretmem() and folio_fast_pin_allowed() into gup_fast_folio_allowed()")
1965e933ddeb ("mm/treewide: replace pXd_huge() with pXd_leaf()")
7db86dc389aa ("mm/gup: merge pXd huge mapping checks")
089f92141ed0 ("mm/gup: check p4d presence before going on")
e6fd5564c07c ("mm/gup: cache p4d in follow_p4d_mask()")
65291dcfcf89 ("mm/secretmem: fix GUP-fast succeeding on secretmem folios")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f442fa6141379a20b48ae3efabee827a3d260787 Mon Sep 17 00:00:00 2001
From: Yang Shi <yang(a)os.amperecomputing.com>
Date: Fri, 28 Jun 2024 12:14:58 -0700
Subject: [PATCH] mm: gup: stop abusing try_grab_folio
A kernel warning was reported when pinning folio in CMA memory when
launching SEV virtual machine. The splat looks like:
[ 464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
[ 464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
[ 464.325477] RIP: 0010:__get_user_pages+0x423/0x520
[ 464.325515] Call Trace:
[ 464.325520] <TASK>
[ 464.325523] ? __get_user_pages+0x423/0x520
[ 464.325528] ? __warn+0x81/0x130
[ 464.325536] ? __get_user_pages+0x423/0x520
[ 464.325541] ? report_bug+0x171/0x1a0
[ 464.325549] ? handle_bug+0x3c/0x70
[ 464.325554] ? exc_invalid_op+0x17/0x70
[ 464.325558] ? asm_exc_invalid_op+0x1a/0x20
[ 464.325567] ? __get_user_pages+0x423/0x520
[ 464.325575] __gup_longterm_locked+0x212/0x7a0
[ 464.325583] internal_get_user_pages_fast+0xfb/0x190
[ 464.325590] pin_user_pages_fast+0x47/0x60
[ 464.325598] sev_pin_memory+0xca/0x170 [kvm_amd]
[ 464.325616] sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
Per the analysis done by yangge, when starting the SEV virtual machine, it
will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory.
But the page is in CMA area, so fast GUP will fail then fallback to the
slow path due to the longterm pinnalbe check in try_grab_folio().
The slow path will try to pin the pages then migrate them out of CMA area.
But the slow path also uses try_grab_folio() to pin the page, it will
also fail due to the same check then the above warning is triggered.
In addition, the try_grab_folio() is supposed to be used in fast path and
it elevates folio refcount by using add ref unless zero. We are guaranteed
to have at least one stable reference in slow path, so the simple atomic add
could be used. The performance difference should be trivial, but the
misuse may be confusing and misleading.
Redefined try_grab_folio() to try_grab_folio_fast(), and try_grab_page()
to try_grab_folio(), and use them in the proper paths. This solves both
the abuse and the kernel warning.
The proper naming makes their usecase more clear and should prevent from
abusing in the future.
peterx said:
: The user will see the pin fails, for gpu-slow it further triggers the WARN
: right below that failure (as in the original report):
:
: folio = try_grab_folio(page, page_increm - 1,
: foll_flags);
: if (WARN_ON_ONCE(!folio)) { <------------------------ here
: /*
: * Release the 1st page ref if the
: * folio is problematic, fail hard.
: */
: gup_put_folio(page_folio(page), 1,
: foll_flags);
: ret = -EFAULT;
: goto out;
: }
[1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge11…
[shy828301(a)gmail.com: fix implicit declaration of function try_grab_folio_fast]
Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xj…
Link: https://lkml.kernel.org/r/20240628191458.2605553-1-yang@os.amperecomputing.…
Fixes: 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
Signed-off-by: Yang Shi <yang(a)os.amperecomputing.com>
Reported-by: yangge <yangge1116(a)126.com>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/gup.c b/mm/gup.c
index 469799f805f1..f1d6bc06eb52 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -97,95 +97,6 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
return folio;
}
-/**
- * try_grab_folio() - Attempt to get or pin a folio.
- * @page: pointer to page to be grabbed
- * @refs: the value to (effectively) add to the folio's refcount
- * @flags: gup flags: these are the FOLL_* flag values.
- *
- * "grab" names in this file mean, "look at flags to decide whether to use
- * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount.
- *
- * Either FOLL_PIN or FOLL_GET (or neither) must be set, but not both at the
- * same time. (That's true throughout the get_user_pages*() and
- * pin_user_pages*() APIs.) Cases:
- *
- * FOLL_GET: folio's refcount will be incremented by @refs.
- *
- * FOLL_PIN on large folios: folio's refcount will be incremented by
- * @refs, and its pincount will be incremented by @refs.
- *
- * FOLL_PIN on single-page folios: folio's refcount will be incremented by
- * @refs * GUP_PIN_COUNTING_BIAS.
- *
- * Return: The folio containing @page (with refcount appropriately
- * incremented) for success, or NULL upon failure. If neither FOLL_GET
- * nor FOLL_PIN was set, that's considered failure, and furthermore,
- * a likely bug in the caller, so a warning is also emitted.
- */
-struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
-{
- struct folio *folio;
-
- if (WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == 0))
- return NULL;
-
- if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
- return NULL;
-
- if (flags & FOLL_GET)
- return try_get_folio(page, refs);
-
- /* FOLL_PIN is set */
-
- /*
- * Don't take a pin on the zero page - it's not going anywhere
- * and it is used in a *lot* of places.
- */
- if (is_zero_page(page))
- return page_folio(page);
-
- folio = try_get_folio(page, refs);
- if (!folio)
- return NULL;
-
- /*
- * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
- * right zone, so fail and let the caller fall back to the slow
- * path.
- */
- if (unlikely((flags & FOLL_LONGTERM) &&
- !folio_is_longterm_pinnable(folio))) {
- if (!put_devmap_managed_folio_refs(folio, refs))
- folio_put_refs(folio, refs);
- return NULL;
- }
-
- /*
- * When pinning a large folio, use an exact count to track it.
- *
- * However, be sure to *also* increment the normal folio
- * refcount field at least once, so that the folio really
- * is pinned. That's why the refcount from the earlier
- * try_get_folio() is left intact.
- */
- if (folio_test_large(folio))
- atomic_add(refs, &folio->_pincount);
- else
- folio_ref_add(folio,
- refs * (GUP_PIN_COUNTING_BIAS - 1));
- /*
- * Adjust the pincount before re-checking the PTE for changes.
- * This is essentially a smp_mb() and is paired with a memory
- * barrier in folio_try_share_anon_rmap_*().
- */
- smp_mb__after_atomic();
-
- node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
-
- return folio;
-}
-
static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
{
if (flags & FOLL_PIN) {
@@ -203,58 +114,59 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
}
/**
- * try_grab_page() - elevate a page's refcount by a flag-dependent amount
- * @page: pointer to page to be grabbed
- * @flags: gup flags: these are the FOLL_* flag values.
+ * try_grab_folio() - add a folio's refcount by a flag-dependent amount
+ * @folio: pointer to folio to be grabbed
+ * @refs: the value to (effectively) add to the folio's refcount
+ * @flags: gup flags: these are the FOLL_* flag values
*
* This might not do anything at all, depending on the flags argument.
*
* "grab" names in this file mean, "look at flags to decide whether to use
- * FOLL_PIN or FOLL_GET behavior, when incrementing the page's refcount.
+ * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount.
*
* Either FOLL_PIN or FOLL_GET (or neither) may be set, but not both at the same
- * time. Cases: please see the try_grab_folio() documentation, with
- * "refs=1".
+ * time.
*
* Return: 0 for success, or if no action was required (if neither FOLL_PIN
* nor FOLL_GET was set, nothing is done). A negative error code for failure:
*
- * -ENOMEM FOLL_GET or FOLL_PIN was set, but the page could not
+ * -ENOMEM FOLL_GET or FOLL_PIN was set, but the folio could not
* be grabbed.
+ *
+ * It is called when we have a stable reference for the folio, typically in
+ * GUP slow path.
*/
-int __must_check try_grab_page(struct page *page, unsigned int flags)
+int __must_check try_grab_folio(struct folio *folio, int refs,
+ unsigned int flags)
{
- struct folio *folio = page_folio(page);
-
if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
return -ENOMEM;
- if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
+ if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(&folio->page)))
return -EREMOTEIO;
if (flags & FOLL_GET)
- folio_ref_inc(folio);
+ folio_ref_add(folio, refs);
else if (flags & FOLL_PIN) {
/*
* Don't take a pin on the zero page - it's not going anywhere
* and it is used in a *lot* of places.
*/
- if (is_zero_page(page))
+ if (is_zero_folio(folio))
return 0;
/*
- * Similar to try_grab_folio(): be sure to *also*
- * increment the normal page refcount field at least once,
+ * Increment the normal page refcount field at least once,
* so that the page really is pinned.
*/
if (folio_test_large(folio)) {
- folio_ref_add(folio, 1);
- atomic_add(1, &folio->_pincount);
+ folio_ref_add(folio, refs);
+ atomic_add(refs, &folio->_pincount);
} else {
- folio_ref_add(folio, GUP_PIN_COUNTING_BIAS);
+ folio_ref_add(folio, refs * GUP_PIN_COUNTING_BIAS);
}
- node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
+ node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
}
return 0;
@@ -515,6 +427,102 @@ static int record_subpages(struct page *page, unsigned long sz,
return nr;
}
+
+/**
+ * try_grab_folio_fast() - Attempt to get or pin a folio in fast path.
+ * @page: pointer to page to be grabbed
+ * @refs: the value to (effectively) add to the folio's refcount
+ * @flags: gup flags: these are the FOLL_* flag values.
+ *
+ * "grab" names in this file mean, "look at flags to decide whether to use
+ * FOLL_PIN or FOLL_GET behavior, when incrementing the folio's refcount.
+ *
+ * Either FOLL_PIN or FOLL_GET (or neither) must be set, but not both at the
+ * same time. (That's true throughout the get_user_pages*() and
+ * pin_user_pages*() APIs.) Cases:
+ *
+ * FOLL_GET: folio's refcount will be incremented by @refs.
+ *
+ * FOLL_PIN on large folios: folio's refcount will be incremented by
+ * @refs, and its pincount will be incremented by @refs.
+ *
+ * FOLL_PIN on single-page folios: folio's refcount will be incremented by
+ * @refs * GUP_PIN_COUNTING_BIAS.
+ *
+ * Return: The folio containing @page (with refcount appropriately
+ * incremented) for success, or NULL upon failure. If neither FOLL_GET
+ * nor FOLL_PIN was set, that's considered failure, and furthermore,
+ * a likely bug in the caller, so a warning is also emitted.
+ *
+ * It uses add ref unless zero to elevate the folio refcount and must be called
+ * in fast path only.
+ */
+static struct folio *try_grab_folio_fast(struct page *page, int refs,
+ unsigned int flags)
+{
+ struct folio *folio;
+
+ /* Raise warn if it is not called in fast GUP */
+ VM_WARN_ON_ONCE(!irqs_disabled());
+
+ if (WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == 0))
+ return NULL;
+
+ if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
+ return NULL;
+
+ if (flags & FOLL_GET)
+ return try_get_folio(page, refs);
+
+ /* FOLL_PIN is set */
+
+ /*
+ * Don't take a pin on the zero page - it's not going anywhere
+ * and it is used in a *lot* of places.
+ */
+ if (is_zero_page(page))
+ return page_folio(page);
+
+ folio = try_get_folio(page, refs);
+ if (!folio)
+ return NULL;
+
+ /*
+ * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
+ * right zone, so fail and let the caller fall back to the slow
+ * path.
+ */
+ if (unlikely((flags & FOLL_LONGTERM) &&
+ !folio_is_longterm_pinnable(folio))) {
+ if (!put_devmap_managed_folio_refs(folio, refs))
+ folio_put_refs(folio, refs);
+ return NULL;
+ }
+
+ /*
+ * When pinning a large folio, use an exact count to track it.
+ *
+ * However, be sure to *also* increment the normal folio
+ * refcount field at least once, so that the folio really
+ * is pinned. That's why the refcount from the earlier
+ * try_get_folio() is left intact.
+ */
+ if (folio_test_large(folio))
+ atomic_add(refs, &folio->_pincount);
+ else
+ folio_ref_add(folio,
+ refs * (GUP_PIN_COUNTING_BIAS - 1));
+ /*
+ * Adjust the pincount before re-checking the PTE for changes.
+ * This is essentially a smp_mb() and is paired with a memory
+ * barrier in folio_try_share_anon_rmap_*().
+ */
+ smp_mb__after_atomic();
+
+ node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
+
+ return folio;
+}
#endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_GUP_FAST */
#ifdef CONFIG_ARCH_HAS_HUGEPD
@@ -535,7 +543,7 @@ static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
*/
static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz,
unsigned long addr, unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+ struct page **pages, int *nr, bool fast)
{
unsigned long pte_end;
struct page *page;
@@ -558,9 +566,15 @@ static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz
page = pte_page(pte);
refs = record_subpages(page, sz, addr, end, pages + *nr);
- folio = try_grab_folio(page, refs, flags);
- if (!folio)
- return 0;
+ if (fast) {
+ folio = try_grab_folio_fast(page, refs, flags);
+ if (!folio)
+ return 0;
+ } else {
+ folio = page_folio(page);
+ if (try_grab_folio(folio, refs, flags))
+ return 0;
+ }
if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
gup_put_folio(folio, refs, flags);
@@ -588,7 +602,7 @@ static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz
static int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
unsigned long addr, unsigned int pdshift,
unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+ struct page **pages, int *nr, bool fast)
{
pte_t *ptep;
unsigned long sz = 1UL << hugepd_shift(hugepd);
@@ -598,7 +612,8 @@ static int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
ptep = hugepte_offset(hugepd, addr, pdshift);
do {
next = hugepte_addr_end(addr, end, sz);
- ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr);
+ ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr,
+ fast);
if (ret != 1)
return ret;
} while (ptep++, addr = next, addr != end);
@@ -625,7 +640,7 @@ static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
ptep = hugepte_offset(hugepd, addr, pdshift);
ptl = huge_pte_lock(h, vma->vm_mm, ptep);
ret = gup_hugepd(vma, hugepd, addr, pdshift, addr + PAGE_SIZE,
- flags, &page, &nr);
+ flags, &page, &nr, false);
spin_unlock(ptl);
if (ret == 1) {
@@ -642,7 +657,7 @@ static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
static inline int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
unsigned long addr, unsigned int pdshift,
unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+ struct page **pages, int *nr, bool fast)
{
return 0;
}
@@ -729,7 +744,7 @@ static struct page *follow_huge_pud(struct vm_area_struct *vma,
gup_must_unshare(vma, flags, page))
return ERR_PTR(-EMLINK);
- ret = try_grab_page(page, flags);
+ ret = try_grab_folio(page_folio(page), 1, flags);
if (ret)
page = ERR_PTR(ret);
else
@@ -806,7 +821,7 @@ static struct page *follow_huge_pmd(struct vm_area_struct *vma,
VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) &&
!PageAnonExclusive(page), page);
- ret = try_grab_page(page, flags);
+ ret = try_grab_folio(page_folio(page), 1, flags);
if (ret)
return ERR_PTR(ret);
@@ -968,8 +983,8 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
VM_BUG_ON_PAGE((flags & FOLL_PIN) && PageAnon(page) &&
!PageAnonExclusive(page), page);
- /* try_grab_page() does nothing unless FOLL_GET or FOLL_PIN is set. */
- ret = try_grab_page(page, flags);
+ /* try_grab_folio() does nothing unless FOLL_GET or FOLL_PIN is set. */
+ ret = try_grab_folio(page_folio(page), 1, flags);
if (unlikely(ret)) {
page = ERR_PTR(ret);
goto out;
@@ -1233,7 +1248,7 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
goto unmap;
*page = pte_page(entry);
}
- ret = try_grab_page(*page, gup_flags);
+ ret = try_grab_folio(page_folio(*page), 1, gup_flags);
if (unlikely(ret))
goto unmap;
out:
@@ -1636,20 +1651,19 @@ static long __get_user_pages(struct mm_struct *mm,
* pages.
*/
if (page_increm > 1) {
- struct folio *folio;
+ struct folio *folio = page_folio(page);
/*
* Since we already hold refcount on the
* large folio, this should never fail.
*/
- folio = try_grab_folio(page, page_increm - 1,
- foll_flags);
- if (WARN_ON_ONCE(!folio)) {
+ if (try_grab_folio(folio, page_increm - 1,
+ foll_flags)) {
/*
* Release the 1st page ref if the
* folio is problematic, fail hard.
*/
- gup_put_folio(page_folio(page), 1,
+ gup_put_folio(folio, 1,
foll_flags);
ret = -EFAULT;
goto out;
@@ -2797,7 +2811,6 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
* This code is based heavily on the PowerPC implementation by Nick Piggin.
*/
#ifdef CONFIG_HAVE_GUP_FAST
-
/*
* Used in the GUP-fast path to determine whether GUP is permitted to work on
* a specific folio.
@@ -2962,7 +2975,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
page = pte_page(pte);
- folio = try_grab_folio(page, 1, flags);
+ folio = try_grab_folio_fast(page, 1, flags);
if (!folio)
goto pte_unmap;
@@ -3049,7 +3062,7 @@ static int gup_fast_devmap_leaf(unsigned long pfn, unsigned long addr,
break;
}
- folio = try_grab_folio(page, 1, flags);
+ folio = try_grab_folio_fast(page, 1, flags);
if (!folio) {
gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages);
break;
@@ -3138,7 +3151,7 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
page = pmd_page(orig);
refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
- folio = try_grab_folio(page, refs, flags);
+ folio = try_grab_folio_fast(page, refs, flags);
if (!folio)
return 0;
@@ -3182,7 +3195,7 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
page = pud_page(orig);
refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr);
- folio = try_grab_folio(page, refs, flags);
+ folio = try_grab_folio_fast(page, refs, flags);
if (!folio)
return 0;
@@ -3222,7 +3235,7 @@ static int gup_fast_pgd_leaf(pgd_t orig, pgd_t *pgdp, unsigned long addr,
page = pgd_page(orig);
refs = record_subpages(page, PGDIR_SIZE, addr, end, pages + *nr);
- folio = try_grab_folio(page, refs, flags);
+ folio = try_grab_folio_fast(page, refs, flags);
if (!folio)
return 0;
@@ -3276,7 +3289,8 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
* pmd format and THP pmd format
*/
if (gup_hugepd(NULL, __hugepd(pmd_val(pmd)), addr,
- PMD_SHIFT, next, flags, pages, nr) != 1)
+ PMD_SHIFT, next, flags, pages, nr,
+ true) != 1)
return 0;
} else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags,
pages, nr))
@@ -3306,7 +3320,8 @@ static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr,
return 0;
} else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) {
if (gup_hugepd(NULL, __hugepd(pud_val(pud)), addr,
- PUD_SHIFT, next, flags, pages, nr) != 1)
+ PUD_SHIFT, next, flags, pages, nr,
+ true) != 1)
return 0;
} else if (!gup_fast_pmd_range(pudp, pud, addr, next, flags,
pages, nr))
@@ -3333,7 +3348,8 @@ static int gup_fast_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr,
BUILD_BUG_ON(p4d_leaf(p4d));
if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
if (gup_hugepd(NULL, __hugepd(p4d_val(p4d)), addr,
- P4D_SHIFT, next, flags, pages, nr) != 1)
+ P4D_SHIFT, next, flags, pages, nr,
+ true) != 1)
return 0;
} else if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags,
pages, nr))
@@ -3362,7 +3378,8 @@ static void gup_fast_pgd_range(unsigned long addr, unsigned long end,
return;
} else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) {
if (gup_hugepd(NULL, __hugepd(pgd_val(pgd)), addr,
- PGDIR_SHIFT, next, flags, pages, nr) != 1)
+ PGDIR_SHIFT, next, flags, pages, nr,
+ true) != 1)
return;
} else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags,
pages, nr))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index db7946a0a28c..2120f7478e55 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1331,7 +1331,7 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
if (!*pgmap)
return ERR_PTR(-EFAULT);
page = pfn_to_page(pfn);
- ret = try_grab_page(page, flags);
+ ret = try_grab_folio(page_folio(page), 1, flags);
if (ret)
page = ERR_PTR(ret);
diff --git a/mm/internal.h b/mm/internal.h
index 6902b7dd8509..cc2c5e07fad3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1182,8 +1182,8 @@ int migrate_device_coherent_page(struct page *page);
/*
* mm/gup.c
*/
-struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags);
-int __must_check try_grab_page(struct page *page, unsigned int flags);
+int __must_check try_grab_folio(struct folio *folio, int refs,
+ unsigned int flags);
/*
* mm/huge_memory.c
From: Max Kellermann <max.kellermann(a)ionos.com>
This fixes a NULL pointer dereference bug due to a data race which
looks like this:
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 33 PID: 16573 Comm: kworker/u97:799 Not tainted 6.8.7-cm4all1-hp+ #43
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018
Workqueue: events_unbound netfs_rreq_write_to_cache_work
RIP: 0010:cachefiles_prepare_write+0x30/0xa0
Code: 57 41 56 45 89 ce 41 55 49 89 cd 41 54 49 89 d4 55 53 48 89 fb 48 83 ec 08 48 8b 47 08 48 83 7f 10 00 48 89 34 24 48 8b 68 20 <48> 8b 45 08 4c 8b 38 74 45 49 8b 7f 50 e8 4e a9 b0 ff 48 8b 73 10
RSP: 0018:ffffb4e78113bde0 EFLAGS: 00010286
RAX: ffff976126be6d10 RBX: ffff97615cdb8438 RCX: 0000000000020000
RDX: ffff97605e6c4c68 RSI: ffff97605e6c4c60 RDI: ffff97615cdb8438
RBP: 0000000000000000 R08: 0000000000278333 R09: 0000000000000001
R10: ffff97605e6c4600 R11: 0000000000000001 R12: ffff97605e6c4c68
R13: 0000000000020000 R14: 0000000000000001 R15: ffff976064fe2c00
FS: 0000000000000000(0000) GS:ffff9776dfd40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000005942c002 CR4: 00000000001706f0
Call Trace:
<TASK>
? __die+0x1f/0x70
? page_fault_oops+0x15d/0x440
? search_module_extables+0xe/0x40
? fixup_exception+0x22/0x2f0
? exc_page_fault+0x5f/0x100
? asm_exc_page_fault+0x22/0x30
? cachefiles_prepare_write+0x30/0xa0
netfs_rreq_write_to_cache_work+0x135/0x2e0
process_one_work+0x137/0x2c0
worker_thread+0x2e9/0x400
? __pfx_worker_thread+0x10/0x10
kthread+0xcc/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x30/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
Modules linked in:
CR2: 0000000000000008
---[ end trace 0000000000000000 ]---
This happened because fscache_cookie_state_machine() was slow and was
still running while another process invoked fscache_unuse_cookie();
this led to a fscache_cookie_lru_do_one() call, setting the
FSCACHE_COOKIE_DO_LRU_DISCARD flag, which was picked up by
fscache_cookie_state_machine(), withdrawing the cookie via
cachefiles_withdraw_cookie(), clearing cookie->cache_priv.
At the same time, yet another process invoked
cachefiles_prepare_write(), which found a NULL pointer in this code
line:
struct cachefiles_object *object = cachefiles_cres_object(cres);
The next line crashes, obviously:
struct cachefiles_cache *cache = object->volume->cache;
During cachefiles_prepare_write(), the "n_accesses" counter is
non-zero (via fscache_begin_operation()). The cookie must not be
withdrawn until it drops to zero.
The counter is checked by fscache_cookie_state_machine() before
switching to FSCACHE_COOKIE_STATE_RELINQUISHING and
FSCACHE_COOKIE_STATE_WITHDRAWING (in "case
FSCACHE_COOKIE_STATE_FAILED"), but not for
FSCACHE_COOKIE_STATE_LRU_DISCARDING ("case
FSCACHE_COOKIE_STATE_ACTIVE").
This patch adds the missing check. With a non-zero access counter,
the function returns and the next fscache_end_cookie_access() call
will queue another fscache_cookie_state_machine() call to handle the
still-pending FSCACHE_COOKIE_DO_LRU_DISCARD.
Fixes: 12bb21a29c19 ("fscache: Implement cookie user counting and resource pinning")
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Jeff Layton <jlayton(a)kernel.org>
cc: netfs(a)lists.linux.dev
cc: linux-fsdevel(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
fs/netfs/fscache_cookie.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/netfs/fscache_cookie.c b/fs/netfs/fscache_cookie.c
index bce2492186d0..d4d4b3a8b106 100644
--- a/fs/netfs/fscache_cookie.c
+++ b/fs/netfs/fscache_cookie.c
@@ -741,6 +741,10 @@ static void fscache_cookie_state_machine(struct fscache_cookie *cookie)
spin_lock(&cookie->lock);
}
if (test_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags)) {
+ if (atomic_read(&cookie->n_accesses) != 0)
+ /* still being accessed: postpone it */
+ break;
+
__fscache_set_cookie_state(cookie,
FSCACHE_COOKIE_STATE_LRU_DISCARDING);
wake = true;
From: Andrey Konovalov <andreyknvl(a)gmail.com>
When collecting coverage from softirqs, KCOV uses in_serving_softirq() to
check whether the code is running in the softirq context. Unfortunately,
in_serving_softirq() is > 0 even when the code is running in the hardirq
or NMI context for hardirqs and NMIs that happened during a softirq.
As a result, if a softirq handler contains a remote coverage collection
section and a hardirq with another remote coverage collection section
happens during handling the softirq, KCOV incorrectly detects a nested
softirq coverate collection section and prints a WARNING, as reported
by syzbot.
This issue was exposed by commit a7f3813e589f ("usb: gadget: dummy_hcd:
Switch to hrtimer transfer scheduler"), which switched dummy_hcd to using
hrtimer and made the timer's callback be executed in the hardirq context.
Change the related checks in KCOV to account for this behavior of
in_serving_softirq() and make KCOV ignore remote coverage collection
sections in the hardirq and NMI contexts.
This prevents the WARNING printed by syzbot but does not fix the inability
of KCOV to collect coverage from the __usb_hcd_giveback_urb when dummy_hcd
is in use (caused by a7f3813e589f); a separate patch is required for that.
Reported-by: syzbot+2388cdaeb6b10f0c13ac(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrey Konovalov <andreyknvl(a)gmail.com>
---
kernel/kcov.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/kernel/kcov.c b/kernel/kcov.c
index f0a69d402066e..274b6b7c718de 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -161,6 +161,15 @@ static void kcov_remote_area_put(struct kcov_remote_area *area,
kmsan_unpoison_memory(&area->list, sizeof(area->list));
}
+/*
+ * Unlike in_serving_softirq(), this function returns false when called during
+ * a hardirq or an NMI that happened in the softirq context.
+ */
+static inline bool in_softirq_really(void)
+{
+ return in_serving_softirq() && !in_hardirq() && !in_nmi();
+}
+
static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
{
unsigned int mode;
@@ -170,7 +179,7 @@ static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_stru
* so we ignore code executed in interrupts, unless we are in a remote
* coverage collection section in a softirq.
*/
- if (!in_task() && !(in_serving_softirq() && t->kcov_softirq))
+ if (!in_task() && !(in_softirq_really() && t->kcov_softirq))
return false;
mode = READ_ONCE(t->kcov_mode);
/*
@@ -849,7 +858,7 @@ void kcov_remote_start(u64 handle)
if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
return;
- if (!in_task() && !in_serving_softirq())
+ if (!in_task() && !in_softirq_really())
return;
local_lock_irqsave(&kcov_percpu_data.lock, flags);
@@ -991,7 +1000,7 @@ void kcov_remote_stop(void)
int sequence;
unsigned long flags;
- if (!in_task() && !in_serving_softirq())
+ if (!in_task() && !in_softirq_really())
return;
local_lock_irqsave(&kcov_percpu_data.lock, flags);
--
2.25.1
This patch series makes it possible to use Rust together with the shadow
call stack sanitizer. The first patch is intended to be backported to
ensure that people don't try to use SCS with Rust on older kernel
versions. The second patch makes it possible to use Rust with the shadow
call stack sanitizer.
The second patch in this series doesn't make sense without [1], though
it doesn't break the build if [1] is missing.
Link: https://lore.kernel.org/rust-for-linux/20240701183625.665574-12-ojeda@kerne… [1]
Signed-off-by: Alice Ryhl <aliceryhl(a)google.com>
---
Changes in v3:
- Use -Zfixed-x18.
- Add logic to reject unsupported rustc versions.
- Also include a fix to be backported.
- Link to v2: https://lore.kernel.org/rust-for-linux/20240305-shadow-call-stack-v2-1-c7b4…
Changes in v2:
- Add -Cforce-unwind-tables flag.
- Link to v1: https://lore.kernel.org/rust-for-linux/20240304-shadow-call-stack-v1-1-f055…
---
Alice Ryhl (2):
rust: SHADOW_CALL_STACK is incompatible with Rust
rust: add flags for shadow call stack sanitizer
Makefile | 1 +
arch/Kconfig | 1 +
arch/arm64/Makefile | 3 +++
3 files changed, 5 insertions(+)
---
base-commit: 83b1e6e4170cf96b2a7c49070dd43749649f454e
change-id: 20240304-shadow-call-stack-9c197a4361d9
Best regards,
--
Alice Ryhl <aliceryhl(a)google.com>
> Thank you for your sending the revised patches, it looks better than the
> previous one. However, I have an additional request.
Allright, patch v3 it is.
> [1] https://git-scm.com/docs/git-revert
Should have known git has something like that, how handy!
> $ git revert -s b5b519965c4c
Yes, 5b5 can be removed via revert, but what is the difference in
effect? Just time saving?
> $ git revert -s 7ba5ca32fe6e
This one I'd like to ask you about:
The original inline comment in amdtp-stream.c
amdtp_domain_stream_pcm_pointer()
```
// This function is called in software IRQ context of
// period_work or process context.
//
// When the software IRQ context was scheduled by software IRQ
// context of IT contexts, queued packets were already handled.
// Therefore, no need to flush the queue in buffer furthermore.
//
// When the process context reach here, some packets will be
// already queued in the buffer. These packets should be handled
// immediately to keep better granularity of PCM pointer.
//
// Later, the process context will sometimes schedules software
// IRQ context of the period_work. Then, no need to flush the
// queue by the same reason as described in the above
```
(let's call the above v1) was replaced with
```
// In software IRQ context, the call causes dead-lock to disable the tasklet
// synchronously.
```
on occasion of 7ba5ca32fe6e (let's call this v2).
I sought to replace it with
```
// use wq to prevent deadlock between process context spin_lock
// of snd_pcm_stream_lock_irq() in snd_pcm_status64() and
// softIRQ context spin_lock of snd_pcm_stream_lock_irqsave()
// in snd_pcm_period_elapsed()
```
to prevent this issue from occurring again (let's call this v3).
Should I include v1, v3 or a combination of v1 and v3 in my next patch?
> Just for safe, it is preferable to execute 'scripts/checkpatch.pl' in
> kernel tree to check the patchset generated by send-email subcommand[3].
Absolutely should have done so, sorry.
Thank you for your patience and guidance,
Edmund Raile.
Issue #501 [1] showed that the Netlink PM currently doesn't correctly
support removal and re-add of signal endpoints.
Patches 1 and 2 address the issue: the first one in the userspace path-
manager, introduced in v5.19 ; and the second one in the in-kernel path-
manager, introduced in v5.7.
Patch 3 introduces a related selftest. There is no 'Fixes' tag, because
it might be hard to backport it automatically, as missing helpers in
Bash will not be caught when compiling the kernel or the selftests.
The last two patches address two small issues in the MPTCP selftests,
one introduced in v6.6., and the other one in v5.17.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/501 [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Liu Jing (1):
selftests: mptcp: always close input's FD if opened
Paolo Abeni (4):
mptcp: fix user-space PM announced address accounting
mptcp: fix NL PM announced address accounting
selftests: mptcp: add explicit test case for remove/readd
selftests: mptcp: fix error path
net/mptcp/pm_netlink.c | 27 ++++++++++++++------
tools/testing/selftests/net/mptcp/mptcp_connect.c | 8 +++---
tools/testing/selftests/net/mptcp/mptcp_join.sh | 31 ++++++++++++++++++++++-
3 files changed, 53 insertions(+), 13 deletions(-)
---
base-commit: 301927d2d2eb8e541357ba850bc7a1a74dbbd670
change-id: 20240726-upstream-net-20240726-mptcp-fix-signal-readd-f3c72bbcbceb
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072930-unbridle-negotiate-9977@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Thu, 25 Jul 2024 14:20:07 +0900
Subject: [PATCH] nilfs2: handle inconsistent state in
nilfs_btnode_create_block()
Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.
It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.
So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.
Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..c034080c334b 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node));
if (unlikely(!bh))
- return NULL;
+ return ERR_PTR(-ENOMEM);
if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) ||
buffer_dirty(bh))) {
- brelse(bh);
- BUG();
+ /*
+ * The block buffer at the specified new address was already
+ * in use. This can happen if it is a virtual block number
+ * and has been reallocated due to corruption of the bitmap
+ * used to manage its allocation state (if not, the buffer
+ * clearing of an abandoned b-tree node is missing somewhere).
+ */
+ nilfs_error(inode->i_sb,
+ "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)",
+ (unsigned long long)blocknr, inode->i_ino);
+ goto failed;
}
memset(bh->b_data, 0, i_blocksize(inode));
bh->b_bdev = inode->i_sb->s_bdev;
@@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
folio_unlock(bh->b_folio);
folio_put(bh->b_folio);
return bh;
+
+failed:
+ folio_unlock(bh->b_folio);
+ folio_put(bh->b_folio);
+ brelse(bh);
+ return ERR_PTR(-EIO);
}
int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
@@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc,
}
nbh = nilfs_btnode_create_block(btnc, newkey);
- if (!nbh)
- return -ENOMEM;
+ if (IS_ERR(nbh))
+ return PTR_ERR(nbh);
BUG_ON(nbh == obh);
ctxt->newbh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index a139970e4804..862bdf23120e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree,
struct buffer_head *bh;
bh = nilfs_btnode_create_block(btnc, ptr);
- if (!bh)
- return -ENOMEM;
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
set_buffer_nilfs_volatile(bh);
*bhp = bh;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072929-blip-squad-13da@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Thu, 25 Jul 2024 14:20:07 +0900
Subject: [PATCH] nilfs2: handle inconsistent state in
nilfs_btnode_create_block()
Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.
It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.
So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.
Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..c034080c334b 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node));
if (unlikely(!bh))
- return NULL;
+ return ERR_PTR(-ENOMEM);
if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) ||
buffer_dirty(bh))) {
- brelse(bh);
- BUG();
+ /*
+ * The block buffer at the specified new address was already
+ * in use. This can happen if it is a virtual block number
+ * and has been reallocated due to corruption of the bitmap
+ * used to manage its allocation state (if not, the buffer
+ * clearing of an abandoned b-tree node is missing somewhere).
+ */
+ nilfs_error(inode->i_sb,
+ "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)",
+ (unsigned long long)blocknr, inode->i_ino);
+ goto failed;
}
memset(bh->b_data, 0, i_blocksize(inode));
bh->b_bdev = inode->i_sb->s_bdev;
@@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
folio_unlock(bh->b_folio);
folio_put(bh->b_folio);
return bh;
+
+failed:
+ folio_unlock(bh->b_folio);
+ folio_put(bh->b_folio);
+ brelse(bh);
+ return ERR_PTR(-EIO);
}
int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
@@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc,
}
nbh = nilfs_btnode_create_block(btnc, newkey);
- if (!nbh)
- return -ENOMEM;
+ if (IS_ERR(nbh))
+ return PTR_ERR(nbh);
BUG_ON(nbh == obh);
ctxt->newbh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index a139970e4804..862bdf23120e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree,
struct buffer_head *bh;
bh = nilfs_btnode_create_block(btnc, ptr);
- if (!bh)
- return -ENOMEM;
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
set_buffer_nilfs_volatile(bh);
*bhp = bh;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072928-thermal-reactor-e976@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Thu, 25 Jul 2024 14:20:07 +0900
Subject: [PATCH] nilfs2: handle inconsistent state in
nilfs_btnode_create_block()
Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.
It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.
So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.
Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..c034080c334b 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node));
if (unlikely(!bh))
- return NULL;
+ return ERR_PTR(-ENOMEM);
if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) ||
buffer_dirty(bh))) {
- brelse(bh);
- BUG();
+ /*
+ * The block buffer at the specified new address was already
+ * in use. This can happen if it is a virtual block number
+ * and has been reallocated due to corruption of the bitmap
+ * used to manage its allocation state (if not, the buffer
+ * clearing of an abandoned b-tree node is missing somewhere).
+ */
+ nilfs_error(inode->i_sb,
+ "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)",
+ (unsigned long long)blocknr, inode->i_ino);
+ goto failed;
}
memset(bh->b_data, 0, i_blocksize(inode));
bh->b_bdev = inode->i_sb->s_bdev;
@@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
folio_unlock(bh->b_folio);
folio_put(bh->b_folio);
return bh;
+
+failed:
+ folio_unlock(bh->b_folio);
+ folio_put(bh->b_folio);
+ brelse(bh);
+ return ERR_PTR(-EIO);
}
int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
@@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc,
}
nbh = nilfs_btnode_create_block(btnc, newkey);
- if (!nbh)
- return -ENOMEM;
+ if (IS_ERR(nbh))
+ return PTR_ERR(nbh);
BUG_ON(nbh == obh);
ctxt->newbh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index a139970e4804..862bdf23120e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree,
struct buffer_head *bh;
bh = nilfs_btnode_create_block(btnc, ptr);
- if (!bh)
- return -ENOMEM;
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
set_buffer_nilfs_volatile(bh);
*bhp = bh;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072927-oozy-arousal-a5a8@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Thu, 25 Jul 2024 14:20:07 +0900
Subject: [PATCH] nilfs2: handle inconsistent state in
nilfs_btnode_create_block()
Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.
It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.
So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.
Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..c034080c334b 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node));
if (unlikely(!bh))
- return NULL;
+ return ERR_PTR(-ENOMEM);
if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) ||
buffer_dirty(bh))) {
- brelse(bh);
- BUG();
+ /*
+ * The block buffer at the specified new address was already
+ * in use. This can happen if it is a virtual block number
+ * and has been reallocated due to corruption of the bitmap
+ * used to manage its allocation state (if not, the buffer
+ * clearing of an abandoned b-tree node is missing somewhere).
+ */
+ nilfs_error(inode->i_sb,
+ "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)",
+ (unsigned long long)blocknr, inode->i_ino);
+ goto failed;
}
memset(bh->b_data, 0, i_blocksize(inode));
bh->b_bdev = inode->i_sb->s_bdev;
@@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
folio_unlock(bh->b_folio);
folio_put(bh->b_folio);
return bh;
+
+failed:
+ folio_unlock(bh->b_folio);
+ folio_put(bh->b_folio);
+ brelse(bh);
+ return ERR_PTR(-EIO);
}
int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
@@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc,
}
nbh = nilfs_btnode_create_block(btnc, newkey);
- if (!nbh)
- return -ENOMEM;
+ if (IS_ERR(nbh))
+ return PTR_ERR(nbh);
BUG_ON(nbh == obh);
ctxt->newbh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index a139970e4804..862bdf23120e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree,
struct buffer_head *bh;
bh = nilfs_btnode_create_block(btnc, ptr);
- if (!bh)
- return -ENOMEM;
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
set_buffer_nilfs_volatile(bh);
*bhp = bh;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072926-epiphany-basis-4f11@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Thu, 25 Jul 2024 14:20:07 +0900
Subject: [PATCH] nilfs2: handle inconsistent state in
nilfs_btnode_create_block()
Syzbot reported that a buffer state inconsistency was detected in
nilfs_btnode_create_block(), triggering a kernel bug.
It is not appropriate to treat this inconsistency as a bug; it can occur
if the argument block address (the buffer index of the newly created
block) is a virtual block number and has been reallocated due to
corruption of the bitmap used to manage its allocation state.
So, modify nilfs_btnode_create_block() and its callers to treat it as a
possible filesystem error, rather than triggering a kernel bug.
Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
Fixes: a60be987d45d ("nilfs2: B-tree node cache")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..c034080c334b 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node));
if (unlikely(!bh))
- return NULL;
+ return ERR_PTR(-ENOMEM);
if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) ||
buffer_dirty(bh))) {
- brelse(bh);
- BUG();
+ /*
+ * The block buffer at the specified new address was already
+ * in use. This can happen if it is a virtual block number
+ * and has been reallocated due to corruption of the bitmap
+ * used to manage its allocation state (if not, the buffer
+ * clearing of an abandoned b-tree node is missing somewhere).
+ */
+ nilfs_error(inode->i_sb,
+ "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)",
+ (unsigned long long)blocknr, inode->i_ino);
+ goto failed;
}
memset(bh->b_data, 0, i_blocksize(inode));
bh->b_bdev = inode->i_sb->s_bdev;
@@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
folio_unlock(bh->b_folio);
folio_put(bh->b_folio);
return bh;
+
+failed:
+ folio_unlock(bh->b_folio);
+ folio_put(bh->b_folio);
+ brelse(bh);
+ return ERR_PTR(-EIO);
}
int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
@@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc,
}
nbh = nilfs_btnode_create_block(btnc, newkey);
- if (!nbh)
- return -ENOMEM;
+ if (IS_ERR(nbh))
+ return PTR_ERR(nbh);
BUG_ON(nbh == obh);
ctxt->newbh = nbh;
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index a139970e4804..862bdf23120e 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree,
struct buffer_head *bh;
bh = nilfs_btnode_create_block(btnc, ptr);
- if (!bh)
- return -ENOMEM;
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
set_buffer_nilfs_volatile(bh);
*bhp = bh;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 8ddad558997002ce67980e30c9e8dfaa696e163b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072928-engaged-steerable-531f@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
8ddad5589970 ("dmaengine: fsl-edma: change the memory access from local into remote mode in i.MX 8QM")
77584368a0f3 ("dmaengine: fsl-edma: clean up unused "fsl,imx8qm-adma" compatible string")
d8d4355861d8 ("dmaengine: fsl-edma: add i.MX8ULP edma support")
4ee632c82d2d ("dmaengine: fsl-edma: fix DMA channel leak in eDMAv4")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8ddad558997002ce67980e30c9e8dfaa696e163b Mon Sep 17 00:00:00 2001
From: Joy Zou <joy.zou(a)nxp.com>
Date: Fri, 10 May 2024 11:09:34 +0800
Subject: [PATCH] dmaengine: fsl-edma: change the memory access from local into
remote mode in i.MX 8QM
Fix the issue where MEM_TO_MEM fail on i.MX8QM due to the requirement
that both source and destination addresses need pass through the IOMMU.
Typically, peripheral FIFO addresses bypass the IOMMU, necessitating
only one of the source or destination to go through it.
Set "is_remote" to true to ensure both source and destination
addresses pass through the IOMMU.
iMX8 Spec define "Local" and "Remote" bus as below.
Local bus: bypass IOMMU to directly access other peripheral register,
such as FIFO.
Remote bus: go through IOMMU to access system memory.
The test fail log as follow:
[ 66.268506] dmatest: dma0chan0-copy0: result #1: 'test timed out' with src_off=0x100 dst_off=0x80 len=0x3ec0 (0)
[ 66.278785] dmatest: dma0chan0-copy0: summary 1 tests, 1 failures 0.32 iops 4 KB/s (0)
Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support")
Signed-off-by: Joy Zou <joy.zou(a)nxp.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Link: https://lore.kernel.org/r/20240510030959.703663-1-joy.zou@nxp.com
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index 9d565ab85502..b7f15ab96855 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -755,6 +755,8 @@ struct dma_async_tx_descriptor *fsl_edma_prep_memcpy(struct dma_chan *chan,
fsl_desc->iscyclic = false;
fsl_chan->is_sw = true;
+ if (fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_MEM_REMOTE)
+ fsl_chan->is_remote = true;
/* To match with copy_align and max_seg_size so 1 tcd is enough */
fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[0].vtcd, dma_src, dma_dst,
@@ -848,6 +850,7 @@ void fsl_edma_free_chan_resources(struct dma_chan *chan)
fsl_chan->tcd_pool = NULL;
fsl_chan->is_sw = false;
fsl_chan->srcid = 0;
+ fsl_chan->is_remote = false;
if (fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_HAS_CHCLK)
clk_disable_unprepare(fsl_chan->clk);
}
diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-common.h
index 1c90b95f4ff8..ce37e1ee9c46 100644
--- a/drivers/dma/fsl-edma-common.h
+++ b/drivers/dma/fsl-edma-common.h
@@ -194,6 +194,7 @@ struct fsl_edma_desc {
#define FSL_EDMA_DRV_HAS_PD BIT(5)
#define FSL_EDMA_DRV_HAS_CHCLK BIT(6)
#define FSL_EDMA_DRV_HAS_CHMUX BIT(7)
+#define FSL_EDMA_DRV_MEM_REMOTE BIT(8)
/* control and status register is in tcd address space, edma3 reg layout */
#define FSL_EDMA_DRV_SPLIT_REG BIT(9)
#define FSL_EDMA_DRV_BUS_8BYTE BIT(10)
diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
index ec4f5baafad5..c66185c5a199 100644
--- a/drivers/dma/fsl-edma-main.c
+++ b/drivers/dma/fsl-edma-main.c
@@ -343,7 +343,7 @@ static struct fsl_edma_drvdata imx7ulp_data = {
};
static struct fsl_edma_drvdata imx8qm_data = {
- .flags = FSL_EDMA_DRV_HAS_PD | FSL_EDMA_DRV_EDMA3,
+ .flags = FSL_EDMA_DRV_HAS_PD | FSL_EDMA_DRV_EDMA3 | FSL_EDMA_DRV_MEM_REMOTE,
.chreg_space_sz = 0x10000,
.chreg_off = 0x10000,
.setup_irq = fsl_edma3_irq_init,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072915-exclusive-powdery-eeb5@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
8cb1f4080dd9 ("f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid")
21327a042dd9 ("f2fs: fix to avoid use SSR allocate when do defragment")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk(a)kernel.org>
Date: Tue, 18 Jun 2024 02:15:38 +0000
Subject: [PATCH] f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUG_ON.
kernel BUG at fs/f2fs/segment.c:3589!
Call Trace:
do_write_page+0x78/0x390 [f2fs]
f2fs_outplace_write_data+0x62/0xb0 [f2fs]
f2fs_do_write_data_page+0x275/0x740 [f2fs]
f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
f2fs_write_data_pages+0x2d8/0x330 [f2fs]
do_writepages+0xcf/0x270
__writeback_single_inode+0x44/0x350
writeback_sb_inodes+0x242/0x530
__writeback_inodes_wb+0x54/0xf0
wb_writeback+0x192/0x310
wb_workfn+0x30d/0x400
The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty().
Cc: stable(a)vger.kernel.org
Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration")
Reviewed-by: Chao Yu <chao(a)kernel.org>
Tested-by: Will McVicker <willmcvicker(a)google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 362cfb550408..4db1add43e36 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3505,6 +3505,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
if (fio->sbi->am.atgc_enabled &&
(fio->io_type == FS_DATA_IO) &&
(fio->sbi->gc_mode != GC_URGENT_HIGH) &&
+ __is_valid_data_blkaddr(fio->old_blkaddr) &&
!is_inode_flag_set(inode, FI_OPU_WRITE))
return CURSEG_ALL_DATA_ATGC;
else
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072914-caviar-emphasize-6bc8@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
8cb1f4080dd9 ("f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid")
21327a042dd9 ("f2fs: fix to avoid use SSR allocate when do defragment")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk(a)kernel.org>
Date: Tue, 18 Jun 2024 02:15:38 +0000
Subject: [PATCH] f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUG_ON.
kernel BUG at fs/f2fs/segment.c:3589!
Call Trace:
do_write_page+0x78/0x390 [f2fs]
f2fs_outplace_write_data+0x62/0xb0 [f2fs]
f2fs_do_write_data_page+0x275/0x740 [f2fs]
f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
f2fs_write_data_pages+0x2d8/0x330 [f2fs]
do_writepages+0xcf/0x270
__writeback_single_inode+0x44/0x350
writeback_sb_inodes+0x242/0x530
__writeback_inodes_wb+0x54/0xf0
wb_writeback+0x192/0x310
wb_workfn+0x30d/0x400
The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty().
Cc: stable(a)vger.kernel.org
Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration")
Reviewed-by: Chao Yu <chao(a)kernel.org>
Tested-by: Will McVicker <willmcvicker(a)google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 362cfb550408..4db1add43e36 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3505,6 +3505,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
if (fio->sbi->am.atgc_enabled &&
(fio->io_type == FS_DATA_IO) &&
(fio->sbi->gc_mode != GC_URGENT_HIGH) &&
+ __is_valid_data_blkaddr(fio->old_blkaddr) &&
!is_inode_flag_set(inode, FI_OPU_WRITE))
return CURSEG_ALL_DATA_ATGC;
else
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072913-acronym-duty-c413@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
8cb1f4080dd9 ("f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid")
21327a042dd9 ("f2fs: fix to avoid use SSR allocate when do defragment")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk(a)kernel.org>
Date: Tue, 18 Jun 2024 02:15:38 +0000
Subject: [PATCH] f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUG_ON.
kernel BUG at fs/f2fs/segment.c:3589!
Call Trace:
do_write_page+0x78/0x390 [f2fs]
f2fs_outplace_write_data+0x62/0xb0 [f2fs]
f2fs_do_write_data_page+0x275/0x740 [f2fs]
f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
f2fs_write_data_pages+0x2d8/0x330 [f2fs]
do_writepages+0xcf/0x270
__writeback_single_inode+0x44/0x350
writeback_sb_inodes+0x242/0x530
__writeback_inodes_wb+0x54/0xf0
wb_writeback+0x192/0x310
wb_workfn+0x30d/0x400
The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty().
Cc: stable(a)vger.kernel.org
Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration")
Reviewed-by: Chao Yu <chao(a)kernel.org>
Tested-by: Will McVicker <willmcvicker(a)google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 362cfb550408..4db1add43e36 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3505,6 +3505,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
if (fio->sbi->am.atgc_enabled &&
(fio->io_type == FS_DATA_IO) &&
(fio->sbi->gc_mode != GC_URGENT_HIGH) &&
+ __is_valid_data_blkaddr(fio->old_blkaddr) &&
!is_inode_flag_set(inode, FI_OPU_WRITE))
return CURSEG_ALL_DATA_ATGC;
else
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072912-upstream-tropics-2c8a@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
8cb1f4080dd9 ("f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid")
21327a042dd9 ("f2fs: fix to avoid use SSR allocate when do defragment")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8cb1f4080dd91c6e6b01dbea013a3f42341cb6a1 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk(a)kernel.org>
Date: Tue, 18 Jun 2024 02:15:38 +0000
Subject: [PATCH] f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUG_ON.
kernel BUG at fs/f2fs/segment.c:3589!
Call Trace:
do_write_page+0x78/0x390 [f2fs]
f2fs_outplace_write_data+0x62/0xb0 [f2fs]
f2fs_do_write_data_page+0x275/0x740 [f2fs]
f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
f2fs_write_data_pages+0x2d8/0x330 [f2fs]
do_writepages+0xcf/0x270
__writeback_single_inode+0x44/0x350
writeback_sb_inodes+0x242/0x530
__writeback_inodes_wb+0x54/0xf0
wb_writeback+0x192/0x310
wb_workfn+0x30d/0x400
The reason is we gave CURSEG_ALL_DATA_ATGC to COMPR_ADDR where the
page was set the gcing flag by set_cluster_dirty().
Cc: stable(a)vger.kernel.org
Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration")
Reviewed-by: Chao Yu <chao(a)kernel.org>
Tested-by: Will McVicker <willmcvicker(a)google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 362cfb550408..4db1add43e36 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3505,6 +3505,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
if (fio->sbi->am.atgc_enabled &&
(fio->io_type == FS_DATA_IO) &&
(fio->sbi->gc_mode != GC_URGENT_HIGH) &&
+ __is_valid_data_blkaddr(fio->old_blkaddr) &&
!is_inode_flag_set(inode, FI_OPU_WRITE))
return CURSEG_ALL_DATA_ATGC;
else
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f18d0076933689775fe7faeeb10ee93ff01be6ab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072902-overnight-carried-32ca@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
f18d00769336 ("f2fs: use meta inode for GC of COW file")
b40a2b003709 ("f2fs: use meta inode for GC of atomic file")
9f0c4a46be1f ("f2fs: fix to truncate meta inode pages forcely")
fd244524c2cf ("f2fs: compress: fix to cover normal cluster write with cp_rwsem")
55fdc1c24a1d ("f2fs: fix to wait on block writeback for post_read case")
4e4f1eb9949b ("f2fs: introduce f2fs_invalidate_internal_cache() for cleanup")
a46bebd502fe ("f2fs: synchronize atomic write aborts")
2eae077e6e46 ("f2fs: reduce stack memory cost by using bitfield in struct f2fs_io_info")
ae267fc1cfe9 ("f2fs: fix to abort atomic write only during do_exist()")
e7547daccd6a ("f2fs: refactor extent_cache to support for read and more")
749d543c0d45 ("f2fs: remove unnecessary __init_extent_tree")
3bac20a8f011 ("f2fs: move internal functions into extent_cache.c")
12607c1ba763 ("f2fs: specify extent cache for read explicitly")
ed8ac22b6b75 ("f2fs: introduce f2fs_is_readonly() for readability")
41e8f85a75fc ("f2fs: introduce F2FS_IOC_START_ATOMIC_REPLACE")
967eaad1fed5 ("f2fs: fix to set flush_merge opt and show noflush_merge")
4d8d45df2252 ("f2fs: correct i_size change for atomic writes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f18d0076933689775fe7faeeb10ee93ff01be6ab Mon Sep 17 00:00:00 2001
From: Sunmin Jeong <s_min.jeong(a)samsung.com>
Date: Wed, 10 Jul 2024 20:51:43 +0900
Subject: [PATCH] f2fs: use meta inode for GC of COW file
In case of the COW file, new updates and GC writes are already
separated to page caches of the atomic file and COW file. As some cases
that use the meta inode for GC, there are some race issues between a
foreground thread and GC thread.
To handle them, we need to take care when to invalidate and wait
writeback of GC pages in COW files as the case of using the meta inode.
Also, a pointer from the COW inode to the original inode is required to
check the state of original pages.
For the former, we can solve the problem by using the meta inode for GC
of COW files. Then let's get a page from the original inode in
move_data_block when GCing the COW file to avoid race condition.
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Cc: stable(a)vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil(a)samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong(a)samsung.com>
Reviewed-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 1e4937af50f3..6457e5bca9c9 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2608,7 +2608,7 @@ bool f2fs_should_update_outplace(struct inode *inode, struct f2fs_io_info *fio)
return true;
if (IS_NOQUOTA(inode))
return true;
- if (f2fs_is_atomic_file(inode))
+ if (f2fs_used_in_atomic_write(inode))
return true;
/* rewrite low ratio compress data w/ OPU mode to avoid fragmentation */
if (f2fs_compressed_file(inode) &&
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 796ae11c0fa3..4a8621e4a33a 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -843,7 +843,11 @@ struct f2fs_inode_info {
struct task_struct *atomic_write_task; /* store atomic write task */
struct extent_tree *extent_tree[NR_EXTENT_CACHES];
/* cached extent_tree entry */
- struct inode *cow_inode; /* copy-on-write inode for atomic write */
+ union {
+ struct inode *cow_inode; /* copy-on-write inode for atomic write */
+ struct inode *atomic_inode;
+ /* point to atomic_inode, available only for cow_inode */
+ };
/* avoid racing between foreground op and gc */
struct f2fs_rwsem i_gc_rwsem[2];
@@ -4263,9 +4267,14 @@ static inline bool f2fs_post_read_required(struct inode *inode)
f2fs_compressed_file(inode);
}
+static inline bool f2fs_used_in_atomic_write(struct inode *inode)
+{
+ return f2fs_is_atomic_file(inode) || f2fs_is_cow_file(inode);
+}
+
static inline bool f2fs_meta_inode_gc_required(struct inode *inode)
{
- return f2fs_post_read_required(inode) || f2fs_is_atomic_file(inode);
+ return f2fs_post_read_required(inode) || f2fs_used_in_atomic_write(inode);
}
/*
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index e4a7cff00796..547e7ec32b1f 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2183,6 +2183,9 @@ static int f2fs_ioc_start_atomic_write(struct file *filp, bool truncate)
set_inode_flag(fi->cow_inode, FI_COW_FILE);
clear_inode_flag(fi->cow_inode, FI_INLINE_DATA);
+
+ /* Set the COW inode's atomic_inode to the atomic inode */
+ F2FS_I(fi->cow_inode)->atomic_inode = inode;
} else {
/* Reuse the already created COW inode */
ret = f2fs_do_truncate_blocks(fi->cow_inode, 0, true);
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index cb3006551ab5..724bbcb447d3 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1171,7 +1171,8 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
static int ra_data_block(struct inode *inode, pgoff_t index)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
- struct address_space *mapping = inode->i_mapping;
+ struct address_space *mapping = f2fs_is_cow_file(inode) ?
+ F2FS_I(inode)->atomic_inode->i_mapping : inode->i_mapping;
struct dnode_of_data dn;
struct page *page;
struct f2fs_io_info fio = {
@@ -1260,6 +1261,8 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
static int move_data_block(struct inode *inode, block_t bidx,
int gc_type, unsigned int segno, int off)
{
+ struct address_space *mapping = f2fs_is_cow_file(inode) ?
+ F2FS_I(inode)->atomic_inode->i_mapping : inode->i_mapping;
struct f2fs_io_info fio = {
.sbi = F2FS_I_SB(inode),
.ino = inode->i_ino,
@@ -1282,7 +1285,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
CURSEG_ALL_DATA_ATGC : CURSEG_COLD_DATA;
/* do not read out */
- page = f2fs_grab_cache_page(inode->i_mapping, bidx, false);
+ page = f2fs_grab_cache_page(mapping, bidx, false);
if (!page)
return -ENOMEM;
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 1fba5728be70..cca7d448e55c 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -16,7 +16,7 @@
static bool support_inline_data(struct inode *inode)
{
- if (f2fs_is_atomic_file(inode))
+ if (f2fs_used_in_atomic_write(inode))
return false;
if (!S_ISREG(inode->i_mode) && !S_ISLNK(inode->i_mode))
return false;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 7a3e2458b2d9..18dea43e694b 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -804,8 +804,9 @@ void f2fs_evict_inode(struct inode *inode)
f2fs_abort_atomic_write(inode, true);
- if (fi->cow_inode) {
+ if (fi->cow_inode && f2fs_is_cow_file(fi->cow_inode)) {
clear_inode_flag(fi->cow_inode, FI_COW_FILE);
+ F2FS_I(fi->cow_inode)->atomic_inode = NULL;
iput(fi->cow_inode);
fi->cow_inode = NULL;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b40a2b00370931b0c50148681dd7364573e52e6b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072946-footpad-posh-e008@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
b40a2b003709 ("f2fs: use meta inode for GC of atomic file")
9f0c4a46be1f ("f2fs: fix to truncate meta inode pages forcely")
fd244524c2cf ("f2fs: compress: fix to cover normal cluster write with cp_rwsem")
55fdc1c24a1d ("f2fs: fix to wait on block writeback for post_read case")
4e4f1eb9949b ("f2fs: introduce f2fs_invalidate_internal_cache() for cleanup")
2eae077e6e46 ("f2fs: reduce stack memory cost by using bitfield in struct f2fs_io_info")
ed8ac22b6b75 ("f2fs: introduce f2fs_is_readonly() for readability")
967eaad1fed5 ("f2fs: fix to set flush_merge opt and show noflush_merge")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b40a2b00370931b0c50148681dd7364573e52e6b Mon Sep 17 00:00:00 2001
From: Sunmin Jeong <s_min.jeong(a)samsung.com>
Date: Wed, 10 Jul 2024 20:51:17 +0900
Subject: [PATCH] f2fs: use meta inode for GC of atomic file
The page cache of the atomic file keeps new data pages which will be
stored in the COW file. It can also keep old data pages when GCing the
atomic file. In this case, new data can be overwritten by old data if a
GC thread sets the old data page as dirty after new data page was
evicted.
Also, since all writes to the atomic file are redirected to COW inodes,
GC for the atomic file is not working well as below.
f2fs_gc(gc_type=FG_GC)
- select A as a victim segment
do_garbage_collect
- iget atomic file's inode for block B
move_data_page
f2fs_do_write_data_page
- use dn of cow inode
- set fio->old_blkaddr from cow inode
- seg_freed is 0 since block B is still valid
- goto gc_more and A is selected as victim again
To solve the problem, let's separate GC writes and updates in the atomic
file by using the meta inode for GC writes.
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Cc: stable(a)vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil(a)samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong(a)samsung.com>
Reviewed-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 1aa7eefa659c..1e4937af50f3 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2695,7 +2695,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
}
/* wait for GCed page writeback via META_MAPPING */
- if (fio->post_read)
+ if (fio->meta_gc)
f2fs_wait_on_block_writeback(inode, fio->old_blkaddr);
/*
@@ -2790,7 +2790,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
.submitted = 0,
.compr_blocks = compr_blocks,
.need_lock = compr_blocks ? LOCK_DONE : LOCK_RETRY,
- .post_read = f2fs_post_read_required(inode) ? 1 : 0,
+ .meta_gc = f2fs_meta_inode_gc_required(inode) ? 1 : 0,
.io_type = io_type,
.io_wbc = wbc,
.bio = bio,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index f7ee6c5e371e..796ae11c0fa3 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1211,7 +1211,7 @@ struct f2fs_io_info {
unsigned int in_list:1; /* indicate fio is in io_list */
unsigned int is_por:1; /* indicate IO is from recovery or not */
unsigned int encrypted:1; /* indicate file is encrypted */
- unsigned int post_read:1; /* require post read */
+ unsigned int meta_gc:1; /* require meta inode GC */
enum iostat_type io_type; /* io type */
struct writeback_control *io_wbc; /* writeback control */
struct bio **bio; /* bio for ipu */
@@ -4263,6 +4263,11 @@ static inline bool f2fs_post_read_required(struct inode *inode)
f2fs_compressed_file(inode);
}
+static inline bool f2fs_meta_inode_gc_required(struct inode *inode)
+{
+ return f2fs_post_read_required(inode) || f2fs_is_atomic_file(inode);
+}
+
/*
* compress.c
*/
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index ef667fec9a12..cb3006551ab5 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1589,7 +1589,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
start_bidx = f2fs_start_bidx_of_node(nofs, inode) +
ofs_in_node;
- if (f2fs_post_read_required(inode)) {
+ if (f2fs_meta_inode_gc_required(inode)) {
int err = ra_data_block(inode, start_bidx);
f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
@@ -1640,7 +1640,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
start_bidx = f2fs_start_bidx_of_node(nofs, inode)
+ ofs_in_node;
- if (f2fs_post_read_required(inode))
+ if (f2fs_meta_inode_gc_required(inode))
err = move_data_block(inode, start_bidx,
gc_type, segno, off);
else
@@ -1648,7 +1648,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
segno, off);
if (!err && (gc_type == FG_GC ||
- f2fs_post_read_required(inode)))
+ f2fs_meta_inode_gc_required(inode)))
submitted++;
if (locked) {
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 64b11d88a21c..78c3198a6308 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3859,7 +3859,7 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio)
goto drop_bio;
}
- if (fio->post_read)
+ if (fio->meta_gc)
f2fs_truncate_meta_inode_pages(sbi, fio->new_blkaddr, 1);
stat_inc_inplace_blocks(fio->sbi);
@@ -4029,7 +4029,7 @@ void f2fs_wait_on_block_writeback(struct inode *inode, block_t blkaddr)
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct page *cpage;
- if (!f2fs_post_read_required(inode))
+ if (!f2fs_meta_inode_gc_required(inode))
return;
if (!__is_valid_data_blkaddr(blkaddr))
@@ -4048,7 +4048,7 @@ void f2fs_wait_on_block_writeback_range(struct inode *inode, block_t blkaddr,
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
block_t i;
- if (!f2fs_post_read_required(inode))
+ if (!f2fs_meta_inode_gc_required(inode))
return;
for (i = 0; i < len; i++)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072938-tackling-banking-b3a2@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072937-blazer-sulfur-cc24@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072937-scandal-destiny-2a6b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072936-harvest-drown-bd07@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072935-october-footer-b2ed@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072934-durable-scuttle-9c45@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072933-unhappily-flashy-46e1@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 49ae997f8f0d5e268bbd271c5fd66166ce8287fe
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072932-sage-amazingly-1d3e@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
49ae997f8f0d ("nilfs2: add missing check for inode numbers on directory entries")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 49ae997f8f0d5e268bbd271c5fd66166ce8287fe Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:34 +0900
Subject: [PATCH] nilfs2: add missing check for inode numbers on directory
entries
Syzbot reported that mounting and unmounting a specific pattern of
corrupted nilfs2 filesystem images causes a use-after-free of metadata
file inodes, which triggers a kernel bug in lru_add_fn().
As Jan Kara pointed out, this is because the link count of a metadata file
gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
tries to delete that inode (ifile inode in this case).
The inconsistency occurs because directories containing the inode numbers
of these metadata files that should not be visible in the namespace are
read without checking.
Fix this issue by treating the inode numbers of these internal files as
errors in the sanity check helper when reading directory folios/pages.
Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
analysis.
Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+d79afb004be235636ee8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
Reported-by: Jan Kara <jack(a)suse.cz>
Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 52e50b1b7f22..dddfa604491a 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -135,6 +135,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto Enamelen;
if (((offs + rec_len - 1) ^ offs) & ~(chunk_size-1))
goto Espan;
+ if (unlikely(p->inode &&
+ NILFS_PRIVATE_INODE(le64_to_cpu(p->inode))))
+ goto Einumber;
}
if (offs != limit)
goto Eend;
@@ -160,6 +163,9 @@ static bool nilfs_check_folio(struct folio *folio, char *kaddr)
goto bad_entry;
Espan:
error = "directory entry across blocks";
+ goto bad_entry;
+Einumber:
+ error = "disallowed inode number";
bad_entry:
nilfs_error(sb,
"bad entry in directory #%lu: %s - offset=%lu, inode=%lu, rec_len=%zd, name_len=%d",
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 7e39e277c77f..4017f7856440 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -121,6 +121,11 @@ enum {
((ino) >= NILFS_FIRST_INO(sb) || \
((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
+#define NILFS_PRIVATE_INODE(ino) ({ \
+ ino_t __ino = (ino); \
+ ((__ino) < NILFS_USER_INO && (__ino) != NILFS_ROOT_INO && \
+ (__ino) != NILFS_SKETCH_INO); })
+
/**
* struct nilfs_transaction_info: context information for synchronization
* @ti_magic: Magic number
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072930-clover-wool-670c@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072929-dreamy-fable-db92@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072928-childless-tremble-2776@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072926-jawless-untracked-60f3@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072925-squatter-waffle-122d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072924-daycare-driveway-4395@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
1ab8425091db ("nilfs2: fix inode number range checks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1ab8425091dba7ce8bf59e09e183aaca6c2a12ac Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:33 +0900
Subject: [PATCH] nilfs2: fix inode number range checks
Patch series "nilfs2: fix potential issues related to reserved inodes".
This series fixes one use-after-free issue reported by syzbot, caused by
nilfs2's internal inode being exposed in the namespace on a corrupted
filesystem, and a couple of flaws that cause problems if the starting
number of non-reserved inodes written in the on-disk super block is
intentionally (or corruptly) changed from its default value.
This patch (of 3):
In the current implementation of nilfs2, "nilfs->ns_first_ino", which
gives the first non-reserved inode number, is read from the superblock,
but its lower limit is not checked.
As a result, if a number that overlaps with the inode number range of
reserved inodes such as the root directory or metadata files is set in the
super block parameter, the inode number test macros (NILFS_MDT_INODE and
NILFS_VALID_INODE) will not function properly.
In addition, these test macros use left bit-shift calculations using with
the inode number as the shift count via the BIT macro, but the result of a
shift calculation that exceeds the bit width of an integer is undefined in
the C specification, so if "ns_first_ino" is set to a large value other
than the default value NILFS_USER_INO (=11), the macros may potentially
malfunction depending on the environment.
Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
by preventing bit shifts equal to or greater than the NILFS_USER_INO
constant in the inode number test macros.
Also, change the type of "ns_first_ino" from signed integer to unsigned
integer to avoid the need for type casting in comparisons such as the
lower bound check introduced this time.
Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
index 728e90be3570..7e39e277c77f 100644
--- a/fs/nilfs2/nilfs.h
+++ b/fs/nilfs2/nilfs.h
@@ -116,9 +116,10 @@ enum {
#define NILFS_FIRST_INO(sb) (((struct the_nilfs *)sb->s_fs_info)->ns_first_ino)
#define NILFS_MDT_INODE(sb, ino) \
- ((ino) < NILFS_FIRST_INO(sb) && (NILFS_MDT_INO_BITS & BIT(ino)))
+ ((ino) < NILFS_USER_INO && (NILFS_MDT_INO_BITS & BIT(ino)))
#define NILFS_VALID_INODE(sb, ino) \
- ((ino) >= NILFS_FIRST_INO(sb) || (NILFS_SYS_INO_BITS & BIT(ino)))
+ ((ino) >= NILFS_FIRST_INO(sb) || \
+ ((ino) < NILFS_USER_INO && (NILFS_SYS_INO_BITS & BIT(ino))))
/**
* struct nilfs_transaction_info: context information for synchronization
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index f41d7b6d432c..e44dde57ab65 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -452,6 +452,12 @@ static int nilfs_store_disk_layout(struct the_nilfs *nilfs,
}
nilfs->ns_first_ino = le32_to_cpu(sbp->s_first_ino);
+ if (nilfs->ns_first_ino < NILFS_USER_INO) {
+ nilfs_err(nilfs->ns_sb,
+ "too small lower limit for non-reserved inode numbers: %u",
+ nilfs->ns_first_ino);
+ return -EINVAL;
+ }
nilfs->ns_blocks_per_segment = le32_to_cpu(sbp->s_blocks_per_segment);
if (nilfs->ns_blocks_per_segment < NILFS_SEG_MIN_BLOCKS) {
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index 85da0629415d..1e829ed7b0ef 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -182,7 +182,7 @@ struct the_nilfs {
unsigned long ns_nrsvsegs;
unsigned long ns_first_data_block;
int ns_inode_size;
- int ns_first_ino;
+ unsigned int ns_first_ino;
u32 ns_crc_seed;
/* /sys/fs/<nilfs>/<device> */
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072922-polyester-flatware-fa34@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072921-detective-rink-b08a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072920-limb-census-d305@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072919-voyage-outline-047b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-crumpet-velcro-eae5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072914-egotistic-crewmate-8f1b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
af6eae646851 ("nilfs2: convert persistent object allocator to use kmap_local")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x f41e355f8b48d894324a3fdc9727e08b1bce78e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072913-overcrowd-affair-324f@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
f41e355f8b48 ("nilfs2: fix incorrect inode allocation from reserved inodes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f41e355f8b48d894324a3fdc9727e08b1bce78e2 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Sun, 23 Jun 2024 14:11:35 +0900
Subject: [PATCH] nilfs2: fix incorrect inode allocation from reserved inodes
If the bitmap block that manages the inode allocation status is corrupted,
nilfs_ifile_create_inode() may allocate a new inode from the reserved
inode area where it should not be allocated.
Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of
struct nilfs_root"), fixed the problem that reserved inodes with inode
numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
bitmap corruption, but since the start number of non-reserved inodes is
read from the super block and may change, in which case inode allocation
may occur from the extended reserved inode area.
If that happens, access to that inode will cause an IO error, causing the
file system to degrade to an error state.
Fix this potential issue by adding a wraparound option to the common
metadata object allocation routine and by modifying
nilfs_ifile_create_inode() to disable the option so that it only allocates
inodes with inode numbers greater than or equal to the inode number read
in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
inodes.
Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
index 89caef7513db..ba50388ee4bf 100644
--- a/fs/nilfs2/alloc.c
+++ b/fs/nilfs2/alloc.c
@@ -377,11 +377,12 @@ void *nilfs_palloc_block_get_entry(const struct inode *inode, __u64 nr,
* @target: offset number of an entry in the group (start point)
* @bsize: size in bits
* @lock: spin lock protecting @bitmap
+ * @wrap: whether to wrap around
*/
static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
unsigned long target,
unsigned int bsize,
- spinlock_t *lock)
+ spinlock_t *lock, bool wrap)
{
int pos, end = bsize;
@@ -397,6 +398,8 @@ static int nilfs_palloc_find_available_slot(unsigned char *bitmap,
end = target;
}
+ if (!wrap)
+ return -ENOSPC;
/* wrap around */
for (pos = 0; pos < end; pos++) {
@@ -495,9 +498,10 @@ int nilfs_palloc_count_max_entries(struct inode *inode, u64 nused, u64 *nmaxp)
* nilfs_palloc_prepare_alloc_entry - prepare to allocate a persistent object
* @inode: inode of metadata file using this allocator
* @req: nilfs_palloc_req structure exchanged for the allocation
+ * @wrap: whether to wrap around
*/
int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
- struct nilfs_palloc_req *req)
+ struct nilfs_palloc_req *req, bool wrap)
{
struct buffer_head *desc_bh, *bitmap_bh;
struct nilfs_palloc_group_desc *desc;
@@ -516,7 +520,7 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
entries_per_group = nilfs_palloc_entries_per_group(inode);
for (i = 0; i < ngroups; i += n) {
- if (group >= ngroups) {
+ if (group >= ngroups && wrap) {
/* wrap around */
group = 0;
maxgroup = nilfs_palloc_group(inode, req->pr_entry_nr,
@@ -550,7 +554,14 @@ int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
bitmap_kaddr = kmap_local_page(bitmap_bh->b_page);
bitmap = bitmap_kaddr + bh_offset(bitmap_bh);
pos = nilfs_palloc_find_available_slot(
- bitmap, group_offset, entries_per_group, lock);
+ bitmap, group_offset, entries_per_group, lock,
+ wrap);
+ /*
+ * Since the search for a free slot in the second and
+ * subsequent bitmap blocks always starts from the
+ * beginning, the wrap flag only has an effect on the
+ * first search.
+ */
kunmap_local(bitmap_kaddr);
if (pos >= 0)
goto found;
diff --git a/fs/nilfs2/alloc.h b/fs/nilfs2/alloc.h
index b667e869ac07..d825a9faca6d 100644
--- a/fs/nilfs2/alloc.h
+++ b/fs/nilfs2/alloc.h
@@ -50,8 +50,8 @@ struct nilfs_palloc_req {
struct buffer_head *pr_entry_bh;
};
-int nilfs_palloc_prepare_alloc_entry(struct inode *,
- struct nilfs_palloc_req *);
+int nilfs_palloc_prepare_alloc_entry(struct inode *inode,
+ struct nilfs_palloc_req *req, bool wrap);
void nilfs_palloc_commit_alloc_entry(struct inode *,
struct nilfs_palloc_req *);
void nilfs_palloc_abort_alloc_entry(struct inode *, struct nilfs_palloc_req *);
diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 180fc8d36213..fc1caf63a42a 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -75,7 +75,7 @@ int nilfs_dat_prepare_alloc(struct inode *dat, struct nilfs_palloc_req *req)
{
int ret;
- ret = nilfs_palloc_prepare_alloc_entry(dat, req);
+ ret = nilfs_palloc_prepare_alloc_entry(dat, req, true);
if (ret < 0)
return ret;
diff --git a/fs/nilfs2/ifile.c b/fs/nilfs2/ifile.c
index 612e609158b5..1e86b9303b7c 100644
--- a/fs/nilfs2/ifile.c
+++ b/fs/nilfs2/ifile.c
@@ -56,13 +56,10 @@ int nilfs_ifile_create_inode(struct inode *ifile, ino_t *out_ino,
struct nilfs_palloc_req req;
int ret;
- req.pr_entry_nr = 0; /*
- * 0 says find free inode from beginning
- * of a group. dull code!!
- */
+ req.pr_entry_nr = NILFS_FIRST_INO(ifile->i_sb);
req.pr_entry_bh = NULL;
- ret = nilfs_palloc_prepare_alloc_entry(ifile, &req);
+ ret = nilfs_palloc_prepare_alloc_entry(ifile, &req, false);
if (!ret) {
ret = nilfs_palloc_get_entry_block(ifile, req.pr_entry_nr, 1,
&req.pr_entry_bh);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 66eca1021a42856d6af2a9802c99e160278aed91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072902-yippee-superbowl-f065@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
66eca1021a42 ("mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()")
55f77df7d715 ("mm: page_alloc: control latency caused by zone PCP draining")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 66eca1021a42856d6af2a9802c99e160278aed91 Mon Sep 17 00:00:00 2001
From: Li Zhijian <lizhijian(a)fujitsu.com>
Date: Tue, 23 Jul 2024 14:44:28 +0800
Subject: [PATCH] mm/page_alloc: fix pcp->count race between drain_pages_zone()
vs __rmqueue_pcplist()
It's expected that no page should be left in pcp_list after calling
zone_pcp_disable() in offline_pages(). Previously, it's observed that
offline_pages() gets stuck [1] due to some pages remaining in pcp_list.
Cause:
There is a race condition between drain_pages_zone() and __rmqueue_pcplist()
involving the pcp->count variable. See below scenario:
CPU0 CPU1
---------------- ---------------
spin_lock(&pcp->lock);
__rmqueue_pcplist() {
zone_pcp_disable() {
/* list is empty */
if (list_empty(list)) {
/* add pages to pcp_list */
alloced = rmqueue_bulk()
mutex_lock(&pcp_batch_high_lock)
...
__drain_all_pages() {
drain_pages_zone() {
/* read pcp->count, it's 0 here */
count = READ_ONCE(pcp->count)
/* 0 means nothing to drain */
/* update pcp->count */
pcp->count += alloced << order;
...
...
spin_unlock(&pcp->lock);
In this case, after calling zone_pcp_disable() though, there are still some
pages in pcp_list. And these pages in pcp_list are neither movable nor
isolated, offline_pages() gets stuck as a result.
Solution:
Expand the scope of the pcp->lock to also protect pcp->count in
drain_pages_zone(), to ensure no pages are left in the pcp list after
zone_pcp_disable()
[1] https://lore.kernel.org/linux-mm/6a07125f-e720-404c-b2f9-e55f3f166e85@fujit…
Link: https://lkml.kernel.org/r/20240723064428.1179519-1-lizhijian@fujitsu.com
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Reported-by: Yao Xingtao <yaoxt.fnst(a)fujitsu.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ac8d61148fe..28f80daf5c04 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2343,16 +2343,20 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
static void drain_pages_zone(unsigned int cpu, struct zone *zone)
{
struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
- int count = READ_ONCE(pcp->count);
-
- while (count) {
- int to_drain = min(count, pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
- count -= to_drain;
+ int count;
+ do {
spin_lock(&pcp->lock);
- free_pcppages_bulk(zone, to_drain, pcp, 0);
+ count = pcp->count;
+ if (count) {
+ int to_drain = min(count,
+ pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
+
+ free_pcppages_bulk(zone, to_drain, pcp, 0);
+ count -= to_drain;
+ }
spin_unlock(&pcp->lock);
- }
+ } while (count);
}
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 66eca1021a42856d6af2a9802c99e160278aed91
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072954-blaspheme-safeness-fcbc@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
66eca1021a42 ("mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()")
55f77df7d715 ("mm: page_alloc: control latency caused by zone PCP draining")
574907741599 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations")
c3e58a70425a ("mm/page_alloc: always remove pages from temporary list")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 66eca1021a42856d6af2a9802c99e160278aed91 Mon Sep 17 00:00:00 2001
From: Li Zhijian <lizhijian(a)fujitsu.com>
Date: Tue, 23 Jul 2024 14:44:28 +0800
Subject: [PATCH] mm/page_alloc: fix pcp->count race between drain_pages_zone()
vs __rmqueue_pcplist()
It's expected that no page should be left in pcp_list after calling
zone_pcp_disable() in offline_pages(). Previously, it's observed that
offline_pages() gets stuck [1] due to some pages remaining in pcp_list.
Cause:
There is a race condition between drain_pages_zone() and __rmqueue_pcplist()
involving the pcp->count variable. See below scenario:
CPU0 CPU1
---------------- ---------------
spin_lock(&pcp->lock);
__rmqueue_pcplist() {
zone_pcp_disable() {
/* list is empty */
if (list_empty(list)) {
/* add pages to pcp_list */
alloced = rmqueue_bulk()
mutex_lock(&pcp_batch_high_lock)
...
__drain_all_pages() {
drain_pages_zone() {
/* read pcp->count, it's 0 here */
count = READ_ONCE(pcp->count)
/* 0 means nothing to drain */
/* update pcp->count */
pcp->count += alloced << order;
...
...
spin_unlock(&pcp->lock);
In this case, after calling zone_pcp_disable() though, there are still some
pages in pcp_list. And these pages in pcp_list are neither movable nor
isolated, offline_pages() gets stuck as a result.
Solution:
Expand the scope of the pcp->lock to also protect pcp->count in
drain_pages_zone(), to ensure no pages are left in the pcp list after
zone_pcp_disable()
[1] https://lore.kernel.org/linux-mm/6a07125f-e720-404c-b2f9-e55f3f166e85@fujit…
Link: https://lkml.kernel.org/r/20240723064428.1179519-1-lizhijian@fujitsu.com
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Reported-by: Yao Xingtao <yaoxt.fnst(a)fujitsu.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ac8d61148fe..28f80daf5c04 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2343,16 +2343,20 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
static void drain_pages_zone(unsigned int cpu, struct zone *zone)
{
struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
- int count = READ_ONCE(pcp->count);
-
- while (count) {
- int to_drain = min(count, pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
- count -= to_drain;
+ int count;
+ do {
spin_lock(&pcp->lock);
- free_pcppages_bulk(zone, to_drain, pcp, 0);
+ count = pcp->count;
+ if (count) {
+ int to_drain = min(count,
+ pcp->batch << CONFIG_PCP_BATCH_SCALE_MAX);
+
+ free_pcppages_bulk(zone, to_drain, pcp, 0);
+ count -= to_drain;
+ }
spin_unlock(&pcp->lock);
- }
+ } while (count);
}
/*
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 11a1f4bc47362700fcbde717292158873fb847ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072919-usual-unstopped-30af@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
11a1f4bc4736 ("PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11a1f4bc47362700fcbde717292158873fb847ed Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 18 Jun 2024 12:54:55 +0200
Subject: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.…
Reported-by: Keith Busch <kbusch(a)kernel.org>
Tested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.10+
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..c40a6c11e959 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
*/
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
{
- struct pci_dev *child;
+ struct pci_dev *child __free(pci_dev_put) = NULL;
int delay;
if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
return 0;
}
- child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
- bus_list);
+ child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+ struct pci_dev, bus_list));
up_read(&pci_bus_sem);
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 11a1f4bc47362700fcbde717292158873fb847ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072918-headache-absently-933a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
11a1f4bc4736 ("PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11a1f4bc47362700fcbde717292158873fb847ed Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 18 Jun 2024 12:54:55 +0200
Subject: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.…
Reported-by: Keith Busch <kbusch(a)kernel.org>
Tested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.10+
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..c40a6c11e959 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
*/
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
{
- struct pci_dev *child;
+ struct pci_dev *child __free(pci_dev_put) = NULL;
int delay;
if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
return 0;
}
- child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
- bus_list);
+ child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+ struct pci_dev, bus_list));
up_read(&pci_bus_sem);
/*
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 11a1f4bc47362700fcbde717292158873fb847ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072921-crazed-babble-80d9@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
11a1f4bc4736 ("PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11a1f4bc47362700fcbde717292158873fb847ed Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 18 Jun 2024 12:54:55 +0200
Subject: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.…
Reported-by: Keith Busch <kbusch(a)kernel.org>
Tested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.10+
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..c40a6c11e959 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
*/
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
{
- struct pci_dev *child;
+ struct pci_dev *child __free(pci_dev_put) = NULL;
int delay;
if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
return 0;
}
- child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
- bus_list);
+ child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+ struct pci_dev, bus_list));
up_read(&pci_bus_sem);
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 11a1f4bc47362700fcbde717292158873fb847ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072919-booted-rise-cc5d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
11a1f4bc4736 ("PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11a1f4bc47362700fcbde717292158873fb847ed Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Tue, 18 Jun 2024 12:54:55 +0200
Subject: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.…
Reported-by: Keith Busch <kbusch(a)kernel.org>
Tested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.10+
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..c40a6c11e959 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
*/
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
{
- struct pci_dev *child;
+ struct pci_dev *child __free(pci_dev_put) = NULL;
int delay;
if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
return 0;
}
- child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
- bus_list);
+ child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+ struct pci_dev, bus_list));
up_read(&pci_bus_sem);
/*
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 840b7a5edf88fe678c60dee88a135647c0ea4375
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072924-yonder-factoid-6692@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
840b7a5edf88 ("PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting ep_gpio")
58adbfb3ebec ("PCI: rockchip: Make 'ep-gpios' DT property optional")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 840b7a5edf88fe678c60dee88a135647c0ea4375 Mon Sep 17 00:00:00 2001
From: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Date: Tue, 16 Apr 2024 11:12:35 +0530
Subject: [PATCH] PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting
ep_gpio
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Rockchip platforms use 'GPIO_ACTIVE_HIGH' flag in the devicetree definition
for ep_gpio. This means, whatever the logical value set by the driver for
the ep_gpio, physical line will output the same logic level.
For instance,
gpiod_set_value_cansleep(rockchip->ep_gpio, 0); --> Level low
gpiod_set_value_cansleep(rockchip->ep_gpio, 1); --> Level high
But while requesting the ep_gpio, GPIOD_OUT_HIGH flag is currently used.
Now, this also causes the physical line to output 'high' creating trouble
for endpoint devices during host reboot.
When host reboot happens, the ep_gpio will initially output 'low' due to
the GPIO getting reset to its POR value. Then during host controller probe,
it will output 'high' due to GPIOD_OUT_HIGH flag. Then during
rockchip_pcie_host_init_port(), it will first output 'low' and then 'high'
indicating the completion of controller initialization.
On the endpoint side, each output 'low' of ep_gpio is accounted for PERST#
assert and 'high' for PERST# deassert. With the above mentioned flow during
host reboot, endpoint will witness below state changes for PERST#:
(1) PERST# assert - GPIO POR state
(2) PERST# deassert - GPIOD_OUT_HIGH while requesting GPIO
(3) PERST# assert - rockchip_pcie_host_init_port()
(4) PERST# deassert - rockchip_pcie_host_init_port()
Now the time interval between (2) and (3) is very short as both happen
during the driver probe(), and this results in a race in the endpoint.
Because, before completing the PERST# deassertion in (2), endpoint got
another PERST# assert in (3).
A proper way to fix this issue is to change the GPIOD_OUT_HIGH flag in (2)
to GPIOD_OUT_LOW. Because the usual convention is to request the GPIO with
a state corresponding to its 'initial/default' value and let the driver
change the state of the GPIO when required.
As per that, the ep_gpio should be requested with GPIOD_OUT_LOW as it
corresponds to the POR value of '0' (PERST# assert in the endpoint). Then
the driver can change the state of the ep_gpio later in
rockchip_pcie_host_init_port() as per the initialization sequence.
This fixes the firmware crash issue in Qcom based modems connected to
Rockpro64 based board.
Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support")
Closes: https://lore.kernel.org/mhi/20240402045647.GG2933@thinkpad/
Link: https://lore.kernel.org/linux-pci/20240416-pci-rockchip-perst-fix-v1-1-4800…
Reported-by: Slark Xiao <slark_xiao(a)163.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.9
diff --git a/drivers/pci/controller/pcie-rockchip.c b/drivers/pci/controller/pcie-rockchip.c
index 0ef2e622d36e..c07d7129f1c7 100644
--- a/drivers/pci/controller/pcie-rockchip.c
+++ b/drivers/pci/controller/pcie-rockchip.c
@@ -121,7 +121,7 @@ int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
if (rockchip->is_rc) {
rockchip->ep_gpio = devm_gpiod_get_optional(dev, "ep",
- GPIOD_OUT_HIGH);
+ GPIOD_OUT_LOW);
if (IS_ERR(rockchip->ep_gpio))
return dev_err_probe(dev, PTR_ERR(rockchip->ep_gpio),
"failed to get ep GPIO\n");
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 840b7a5edf88fe678c60dee88a135647c0ea4375
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072923-grating-bobsled-30d0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
840b7a5edf88 ("PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting ep_gpio")
58adbfb3ebec ("PCI: rockchip: Make 'ep-gpios' DT property optional")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 840b7a5edf88fe678c60dee88a135647c0ea4375 Mon Sep 17 00:00:00 2001
From: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Date: Tue, 16 Apr 2024 11:12:35 +0530
Subject: [PATCH] PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting
ep_gpio
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Rockchip platforms use 'GPIO_ACTIVE_HIGH' flag in the devicetree definition
for ep_gpio. This means, whatever the logical value set by the driver for
the ep_gpio, physical line will output the same logic level.
For instance,
gpiod_set_value_cansleep(rockchip->ep_gpio, 0); --> Level low
gpiod_set_value_cansleep(rockchip->ep_gpio, 1); --> Level high
But while requesting the ep_gpio, GPIOD_OUT_HIGH flag is currently used.
Now, this also causes the physical line to output 'high' creating trouble
for endpoint devices during host reboot.
When host reboot happens, the ep_gpio will initially output 'low' due to
the GPIO getting reset to its POR value. Then during host controller probe,
it will output 'high' due to GPIOD_OUT_HIGH flag. Then during
rockchip_pcie_host_init_port(), it will first output 'low' and then 'high'
indicating the completion of controller initialization.
On the endpoint side, each output 'low' of ep_gpio is accounted for PERST#
assert and 'high' for PERST# deassert. With the above mentioned flow during
host reboot, endpoint will witness below state changes for PERST#:
(1) PERST# assert - GPIO POR state
(2) PERST# deassert - GPIOD_OUT_HIGH while requesting GPIO
(3) PERST# assert - rockchip_pcie_host_init_port()
(4) PERST# deassert - rockchip_pcie_host_init_port()
Now the time interval between (2) and (3) is very short as both happen
during the driver probe(), and this results in a race in the endpoint.
Because, before completing the PERST# deassertion in (2), endpoint got
another PERST# assert in (3).
A proper way to fix this issue is to change the GPIOD_OUT_HIGH flag in (2)
to GPIOD_OUT_LOW. Because the usual convention is to request the GPIO with
a state corresponding to its 'initial/default' value and let the driver
change the state of the GPIO when required.
As per that, the ep_gpio should be requested with GPIOD_OUT_LOW as it
corresponds to the POR value of '0' (PERST# assert in the endpoint). Then
the driver can change the state of the ep_gpio later in
rockchip_pcie_host_init_port() as per the initialization sequence.
This fixes the firmware crash issue in Qcom based modems connected to
Rockpro64 based board.
Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support")
Closes: https://lore.kernel.org/mhi/20240402045647.GG2933@thinkpad/
Link: https://lore.kernel.org/linux-pci/20240416-pci-rockchip-perst-fix-v1-1-4800…
Reported-by: Slark Xiao <slark_xiao(a)163.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.9
diff --git a/drivers/pci/controller/pcie-rockchip.c b/drivers/pci/controller/pcie-rockchip.c
index 0ef2e622d36e..c07d7129f1c7 100644
--- a/drivers/pci/controller/pcie-rockchip.c
+++ b/drivers/pci/controller/pcie-rockchip.c
@@ -121,7 +121,7 @@ int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
if (rockchip->is_rc) {
rockchip->ep_gpio = devm_gpiod_get_optional(dev, "ep",
- GPIOD_OUT_HIGH);
+ GPIOD_OUT_LOW);
if (IS_ERR(rockchip->ep_gpio))
return dev_err_probe(dev, PTR_ERR(rockchip->ep_gpio),
"failed to get ep GPIO\n");
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-platter-duke-71de@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
7470849745e6 ("video: Move HP PARISC STI core code to shared location")
93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c")
367221793d47 ("fbdev: Move option-string lookup into helper")
6d8ad3406a69 ("fbdev: Unexport fb_mode_option")
cedaf7cddd73 ("fbdev: Support NULL for name in option-string lookup")
73ce73c30ba9 ("fbdev: Transfer video= option strings to caller; clarify ownership")
678573b8eee2 ("fbdev/vesafb: Do not use struct fb_info.apertures")
4ef614be6557 ("fbdev/vesafb: Remove trailing whitespaces")
9a758d8756da ("drm: Move nomodeset kernel parameter to drivers/video")
7283f862bd99 ("drm: Implement DRM aperture helpers under video/")
efc8f3229f84 ("MAINTAINERS: Broaden scope of simpledrm entry")
fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers")
2518f226c60d ("Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072912-chafe-starship-0f16@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
7470849745e6 ("video: Move HP PARISC STI core code to shared location")
93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c")
367221793d47 ("fbdev: Move option-string lookup into helper")
6d8ad3406a69 ("fbdev: Unexport fb_mode_option")
cedaf7cddd73 ("fbdev: Support NULL for name in option-string lookup")
73ce73c30ba9 ("fbdev: Transfer video= option strings to caller; clarify ownership")
678573b8eee2 ("fbdev/vesafb: Do not use struct fb_info.apertures")
4ef614be6557 ("fbdev/vesafb: Remove trailing whitespaces")
9a758d8756da ("drm: Move nomodeset kernel parameter to drivers/video")
7283f862bd99 ("drm: Implement DRM aperture helpers under video/")
efc8f3229f84 ("MAINTAINERS: Broaden scope of simpledrm entry")
fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers")
2518f226c60d ("Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072902-arise-online-fd6e@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
7470849745e6 ("video: Move HP PARISC STI core code to shared location")
93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c")
367221793d47 ("fbdev: Move option-string lookup into helper")
6d8ad3406a69 ("fbdev: Unexport fb_mode_option")
cedaf7cddd73 ("fbdev: Support NULL for name in option-string lookup")
73ce73c30ba9 ("fbdev: Transfer video= option strings to caller; clarify ownership")
678573b8eee2 ("fbdev/vesafb: Do not use struct fb_info.apertures")
4ef614be6557 ("fbdev/vesafb: Remove trailing whitespaces")
9a758d8756da ("drm: Move nomodeset kernel parameter to drivers/video")
7283f862bd99 ("drm: Implement DRM aperture helpers under video/")
efc8f3229f84 ("MAINTAINERS: Broaden scope of simpledrm entry")
fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers")
2518f226c60d ("Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072956-rockfish-extradite-7f79@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
7470849745e6 ("video: Move HP PARISC STI core code to shared location")
93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c")
367221793d47 ("fbdev: Move option-string lookup into helper")
6d8ad3406a69 ("fbdev: Unexport fb_mode_option")
cedaf7cddd73 ("fbdev: Support NULL for name in option-string lookup")
73ce73c30ba9 ("fbdev: Transfer video= option strings to caller; clarify ownership")
678573b8eee2 ("fbdev/vesafb: Do not use struct fb_info.apertures")
4ef614be6557 ("fbdev/vesafb: Remove trailing whitespaces")
9a758d8756da ("drm: Move nomodeset kernel parameter to drivers/video")
7283f862bd99 ("drm: Implement DRM aperture helpers under video/")
efc8f3229f84 ("MAINTAINERS: Broaden scope of simpledrm entry")
fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers")
2518f226c60d ("Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072951-greeter-mummy-752b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
7470849745e6 ("video: Move HP PARISC STI core code to shared location")
93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c")
367221793d47 ("fbdev: Move option-string lookup into helper")
6d8ad3406a69 ("fbdev: Unexport fb_mode_option")
cedaf7cddd73 ("fbdev: Support NULL for name in option-string lookup")
73ce73c30ba9 ("fbdev: Transfer video= option strings to caller; clarify ownership")
678573b8eee2 ("fbdev/vesafb: Do not use struct fb_info.apertures")
4ef614be6557 ("fbdev/vesafb: Remove trailing whitespaces")
9a758d8756da ("drm: Move nomodeset kernel parameter to drivers/video")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c2bc958b2b03e361f14df99983bc64a39a7323a3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072945-reenter-waving-4e70@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
c2bc958b2b03 ("fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes")
75fa9b7e375e ("video: Add helpers for decoding screen_info")
3218286bbb78 ("fbdev/vesafb: Replace references to global screen_info by local pointer")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c2bc958b2b03e361f14df99983bc64a39a7323a3 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 17 Jun 2024 13:06:27 +0200
Subject: [PATCH] fbdev: vesafb: Detect VGA compatibility from screen info's
VESA attributes
Test the vesa_attributes field in struct screen_info for compatibility
with VGA hardware. Vesafb currently tests bit 1 in screen_info's
capabilities field which indicates a 64-bit lfb address and is
unrelated to VGA compatibility.
Section 4.4 of the Vesa VBE 2.0 specifications defines that bit 5 in
the mode's attributes field signals VGA compatibility. The mode is
compatible with VGA hardware if the bit is clear. In that case, the
driver can access VGA state of the VBE's underlying hardware. The
vesafb driver uses this feature to program the color LUT in palette
modes. Without, colors might be incorrect.
The problem got introduced in commit 89ec4c238e7a ("[PATCH] vesafb: Fix
incorrect logo colors in x86_64"). It incorrectly stores the mode
attributes in the screen_info's capabilities field and updates vesafb
accordingly. Later, commit 5e8ddcbe8692 ("Video mode probing support for
the new x86 setup code") fixed the screen_info, but did not update vesafb.
Color output still tends to work, because bit 1 in capabilities is
usually 0.
Besides fixing the bug in vesafb, this commit introduces a helper that
reads the correct bit from screen_info.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: 5e8ddcbe8692 ("Video mode probing support for the new x86 setup code")
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v2.6.23+
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
index 8ab64ae4cad3..5a161750a3ae 100644
--- a/drivers/video/fbdev/vesafb.c
+++ b/drivers/video/fbdev/vesafb.c
@@ -271,7 +271,7 @@ static int vesafb_probe(struct platform_device *dev)
if (si->orig_video_isVGA != VIDEO_TYPE_VLFB)
return -ENODEV;
- vga_compat = (si->capabilities & 2) ? 0 : 1;
+ vga_compat = !__screen_info_vbe_mode_nonvga(si);
vesafb_fix.smem_start = si->lfb_base;
vesafb_defined.bits_per_pixel = si->lfb_depth;
if (15 == vesafb_defined.bits_per_pixel)
diff --git a/include/linux/screen_info.h b/include/linux/screen_info.h
index 75303c126285..6a4a3cec4638 100644
--- a/include/linux/screen_info.h
+++ b/include/linux/screen_info.h
@@ -49,6 +49,16 @@ static inline u64 __screen_info_lfb_size(const struct screen_info *si, unsigned
return lfb_size;
}
+static inline bool __screen_info_vbe_mode_nonvga(const struct screen_info *si)
+{
+ /*
+ * VESA modes typically run on VGA hardware. Set bit 5 signals that this
+ * is not the case. Drivers can then not make use of VGA resources. See
+ * Sec 4.4 of the VBE 2.0 spec.
+ */
+ return si->vesa_attributes & BIT(5);
+}
+
static inline unsigned int __screen_info_video_type(unsigned int type)
{
switch (type) {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e6e18021ddd0dc5af487fb86b6d7c964e062d692
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072902-kooky-velocity-3391@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
e6e18021ddd0 ("ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models")
2be46155d792 ("ALSA: hda/realtek: Adjust G814JZR to use SPI init for amp")
0bfe105018bd ("ALSA: hda/realtek: cs35l41: Support ASUS ROG G634JYR")
24b6332c2d4f ("ALSA: hda: Add Lenovo Legion 7i gen7 sound quirk")
b5cb53fd3277 ("ALSA: hda/tas2781: add fixup for Lenovo 14ARB7")
ba7053b4b4a4 ("ALSA: hda: Add driver properties for cs35l41 for Lenovo Legion Slim 7 Gen 8 serie")
d110858a6925 ("ALSA: hda: cs35l41: Prevent firmware load if SPI speed too low")
ee694e7db47e ("ALSA: hda: cs35l41: Support additional Dell models without _DSD")
2b35b66d82dc ("ALSA: hda: cs35l41: Support additional ASUS Zenbook 2023 Models")
b257187bcff4 ("ALSA: hda: cs35l41: Support additional ASUS Zenbook 2022 Models")
b592ed2e1d78 ("ALSA: hda: cs35l41: Support additional ASUS ROG 2023 models")
8c4c216db8fb ("ALSA: hda: cs35l41: Add config table to support many laptops without _DSD")
f01b371b0794 ("ALSA: hda: cs35l41: Use reset label to get GPIO for HP Zbook Fury 17 G9")
447106e92a0c ("ALSA: hda: cs35l41: Support mute notifications for CS35L41 HDA")
93dc18e11b1a ("ALSA: hda/realtek: Add quirk for HP Victus 16-d1xxx to enable mute LED")
581523ee3652 ("ALSA: hda: cs35l41: Override the _DSD for HP Zbook Fury 17 G9 to correct boost type")
3babae915f4c ("ALSA: hda/tas2781: Add tas2781 HDA driver")
ef4ba63f12b0 ("ALSA: hda: cs35l41: Support systems with missing _DSD properties")
a32e0834df76 ("Merge tag 'asoc-v6.6-early' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6e18021ddd0dc5af487fb86b6d7c964e062d692 Mon Sep 17 00:00:00 2001
From: "Luke D. Jones" <luke(a)ljones.dev>
Date: Tue, 23 Jul 2024 13:12:24 +1200
Subject: [PATCH] ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models
Adjust quirks for 0x3a20, 0x3a30, 0x3a50 to match the 0x3a60. This
set has now been confirmed to work with this patch.
Signed-off-by: Luke D. Jones <luke(a)ljones.dev>
Fixes: 811dd426a9b1 ("ALSA: hda/realtek: Add quirks for Asus ROG 2024 laptops using CS35L41")
Cc: <stable(a)vger.kernel.org>
Link: https://patch.msgid.link/20240723011224.115579-1-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index c3a86a99f8c6..462194bf6954 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -10359,10 +10359,10 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1043, 0x1f62, "ASUS UX7602ZM", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1f92, "ASUS ROG Flow X16", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x3030, "ASUS ZN270IE", ALC256_FIXUP_ASUS_AIO_GPIO2),
- SND_PCI_QUIRK(0x1043, 0x3a20, "ASUS G614JZR", ALC245_FIXUP_CS35L41_SPI_2),
- SND_PCI_QUIRK(0x1043, 0x3a30, "ASUS G814JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x3a20, "ASUS G614JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
+ SND_PCI_QUIRK(0x1043, 0x3a30, "ASUS G814JVR/JIR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x3a40, "ASUS G814JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
- SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x3a60, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x831a, "ASUS P901", ALC269_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x1043, 0x834a, "ASUS S101", ALC269_FIXUP_STEREO_DMIC),
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x e6e18021ddd0dc5af487fb86b6d7c964e062d692
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072959-wilt-balsamic-2f7d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
e6e18021ddd0 ("ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models")
2be46155d792 ("ALSA: hda/realtek: Adjust G814JZR to use SPI init for amp")
0bfe105018bd ("ALSA: hda/realtek: cs35l41: Support ASUS ROG G634JYR")
24b6332c2d4f ("ALSA: hda: Add Lenovo Legion 7i gen7 sound quirk")
b5cb53fd3277 ("ALSA: hda/tas2781: add fixup for Lenovo 14ARB7")
ba7053b4b4a4 ("ALSA: hda: Add driver properties for cs35l41 for Lenovo Legion Slim 7 Gen 8 serie")
d110858a6925 ("ALSA: hda: cs35l41: Prevent firmware load if SPI speed too low")
ee694e7db47e ("ALSA: hda: cs35l41: Support additional Dell models without _DSD")
2b35b66d82dc ("ALSA: hda: cs35l41: Support additional ASUS Zenbook 2023 Models")
b257187bcff4 ("ALSA: hda: cs35l41: Support additional ASUS Zenbook 2022 Models")
b592ed2e1d78 ("ALSA: hda: cs35l41: Support additional ASUS ROG 2023 models")
8c4c216db8fb ("ALSA: hda: cs35l41: Add config table to support many laptops without _DSD")
f01b371b0794 ("ALSA: hda: cs35l41: Use reset label to get GPIO for HP Zbook Fury 17 G9")
447106e92a0c ("ALSA: hda: cs35l41: Support mute notifications for CS35L41 HDA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e6e18021ddd0dc5af487fb86b6d7c964e062d692 Mon Sep 17 00:00:00 2001
From: "Luke D. Jones" <luke(a)ljones.dev>
Date: Tue, 23 Jul 2024 13:12:24 +1200
Subject: [PATCH] ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models
Adjust quirks for 0x3a20, 0x3a30, 0x3a50 to match the 0x3a60. This
set has now been confirmed to work with this patch.
Signed-off-by: Luke D. Jones <luke(a)ljones.dev>
Fixes: 811dd426a9b1 ("ALSA: hda/realtek: Add quirks for Asus ROG 2024 laptops using CS35L41")
Cc: <stable(a)vger.kernel.org>
Link: https://patch.msgid.link/20240723011224.115579-1-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index c3a86a99f8c6..462194bf6954 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -10359,10 +10359,10 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1043, 0x1f62, "ASUS UX7602ZM", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1f92, "ASUS ROG Flow X16", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x3030, "ASUS ZN270IE", ALC256_FIXUP_ASUS_AIO_GPIO2),
- SND_PCI_QUIRK(0x1043, 0x3a20, "ASUS G614JZR", ALC245_FIXUP_CS35L41_SPI_2),
- SND_PCI_QUIRK(0x1043, 0x3a30, "ASUS G814JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x3a20, "ASUS G614JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
+ SND_PCI_QUIRK(0x1043, 0x3a30, "ASUS G814JVR/JIR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x3a40, "ASUS G814JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
- SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x3a50, "ASUS G834JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x3a60, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x831a, "ASUS P901", ALC269_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x1043, 0x834a, "ASUS S101", ALC269_FIXUP_STEREO_DMIC),
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 27ba5b67312a944576addc4df44ac3b709aabede
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-thirty-clay-7316@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
27ba5b67312a ("jbd2: avoid infinite transaction commit loop")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27ba5b67312a944576addc4df44ac3b709aabede Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Mon, 24 Jun 2024 19:01:19 +0200
Subject: [PATCH] jbd2: avoid infinite transaction commit loop
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into
t_outstanding_credits") started to account descriptor blocks into
transactions outstanding credits. However it didn't appropriately
decrease the maximum amount of credits available to userspace. Thus if
the filesystem requests a transaction smaller than
j_max_transaction_buffers but large enough that when descriptor blocks
are added the size exceeds j_max_transaction_buffers, we confuse
add_transaction_credits() into thinking previous handles have grown the
transaction too much and enter infinite journal commit loop in
start_this_handle() -> add_transaction_credits() trying to create
transaction with enough credits available.
Fix the problem by properly accounting for transaction space reserved
for descriptor blocks when verifying requested transaction handle size.
CC: stable(a)vger.kernel.org
Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits")
Reported-by: Alexander Coffin <alex.coffin(a)maticrobots.com>
Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6V…
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a095f1a3114b..66513c18ca29 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks)
wake_up(&journal->j_wait_reserved);
}
+/* Maximum number of blocks for user transaction payload */
+static int jbd2_max_user_trans_buffers(journal_t *journal)
+{
+ return journal->j_max_transaction_buffers -
+ journal->j_transaction_overhead_buffers;
+}
+
/*
* Wait until we can add credits for handle to the running transaction. Called
* with j_state_lock held for reading. Returns 0 if handle joined the running
@@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock)
* big to fit this handle? Wait until reserved credits are freed.
*/
if (atomic_read(&journal->j_reserved_credits) + total >
- journal->j_max_transaction_buffers) {
+ jbd2_max_user_trans_buffers(journal)) {
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + total <=
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock)
needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits);
/* We allow at most half of a transaction to be reserved */
- if (needed > journal->j_max_transaction_buffers / 2) {
+ if (needed > jbd2_max_user_trans_buffers(journal) / 2) {
sub_reserved_credits(journal, rsv_blocks);
atomic_sub(total, &t->t_outstanding_credits);
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + rsv_blocks
- <= journal->j_max_transaction_buffers / 2);
+ <= jbd2_max_user_trans_buffers(journal) / 2);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
* size and limit the number of total credits to not exceed maximum
* transaction size per operation.
*/
- if ((rsv_blocks > journal->j_max_transaction_buffers / 2) ||
- (rsv_blocks + blocks > journal->j_max_transaction_buffers)) {
+ if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 ||
+ rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) {
printk(KERN_ERR "JBD2: %s wants too many credits "
"credits:%d rsv_credits:%d max:%d\n",
current->comm, blocks, rsv_blocks,
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
WARN_ON(1);
return -ENOSPC;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 27ba5b67312a944576addc4df44ac3b709aabede
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072949-unframed-extradite-14cf@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
27ba5b67312a ("jbd2: avoid infinite transaction commit loop")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27ba5b67312a944576addc4df44ac3b709aabede Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Mon, 24 Jun 2024 19:01:19 +0200
Subject: [PATCH] jbd2: avoid infinite transaction commit loop
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into
t_outstanding_credits") started to account descriptor blocks into
transactions outstanding credits. However it didn't appropriately
decrease the maximum amount of credits available to userspace. Thus if
the filesystem requests a transaction smaller than
j_max_transaction_buffers but large enough that when descriptor blocks
are added the size exceeds j_max_transaction_buffers, we confuse
add_transaction_credits() into thinking previous handles have grown the
transaction too much and enter infinite journal commit loop in
start_this_handle() -> add_transaction_credits() trying to create
transaction with enough credits available.
Fix the problem by properly accounting for transaction space reserved
for descriptor blocks when verifying requested transaction handle size.
CC: stable(a)vger.kernel.org
Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits")
Reported-by: Alexander Coffin <alex.coffin(a)maticrobots.com>
Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6V…
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a095f1a3114b..66513c18ca29 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks)
wake_up(&journal->j_wait_reserved);
}
+/* Maximum number of blocks for user transaction payload */
+static int jbd2_max_user_trans_buffers(journal_t *journal)
+{
+ return journal->j_max_transaction_buffers -
+ journal->j_transaction_overhead_buffers;
+}
+
/*
* Wait until we can add credits for handle to the running transaction. Called
* with j_state_lock held for reading. Returns 0 if handle joined the running
@@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock)
* big to fit this handle? Wait until reserved credits are freed.
*/
if (atomic_read(&journal->j_reserved_credits) + total >
- journal->j_max_transaction_buffers) {
+ jbd2_max_user_trans_buffers(journal)) {
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + total <=
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock)
needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits);
/* We allow at most half of a transaction to be reserved */
- if (needed > journal->j_max_transaction_buffers / 2) {
+ if (needed > jbd2_max_user_trans_buffers(journal) / 2) {
sub_reserved_credits(journal, rsv_blocks);
atomic_sub(total, &t->t_outstanding_credits);
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + rsv_blocks
- <= journal->j_max_transaction_buffers / 2);
+ <= jbd2_max_user_trans_buffers(journal) / 2);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
* size and limit the number of total credits to not exceed maximum
* transaction size per operation.
*/
- if ((rsv_blocks > journal->j_max_transaction_buffers / 2) ||
- (rsv_blocks + blocks > journal->j_max_transaction_buffers)) {
+ if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 ||
+ rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) {
printk(KERN_ERR "JBD2: %s wants too many credits "
"credits:%d rsv_credits:%d max:%d\n",
current->comm, blocks, rsv_blocks,
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
WARN_ON(1);
return -ENOSPC;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 27c4fa42b11af780d49ce704f7fa67b3c2544df4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072918-undivided-veneering-9883@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
27c4fa42b11a ("KVM: nVMX: Check for pending posted interrupts when looking for nested events")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27c4fa42b11af780d49ce704f7fa67b3c2544df4 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 7 Jun 2024 10:26:07 -0700
Subject: [PATCH] KVM: nVMX: Check for pending posted interrupts when looking
for nested events
Check for pending (and notified!) posted interrupts when checking if L2
has a pending wake event, as fully posted/notified virtual interrupt is a
valid wake event for HLT.
Note that KVM must check vmx->nested.pi_pending to avoid prematurely
waking L2, e.g. even if KVM sees a non-zero PID.PIR and PID.0N=1, the
virtual interrupt won't actually be recognized until a notification IRQ is
received by the vCPU or the vCPU does (nested) VM-Enter.
Fixes: 26844fee6ade ("KVM: x86: never write to memory from kvm_vcpu_check_block()")
Cc: stable(a)vger.kernel.org
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Reported-by: Jim Mattson <jmattson(a)google.com>
Closes: https://lore.kernel.org/all/20231207010302.2240506-1-jmattson@google.com
Link: https://lore.kernel.org/r/20240607172609.3205077-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 411fe7aa0793..732340ff3300 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4034,8 +4034,40 @@ static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu)
static bool vmx_has_nested_events(struct kvm_vcpu *vcpu, bool for_injection)
{
- return nested_vmx_preemption_timer_pending(vcpu) ||
- to_vmx(vcpu)->nested.mtf_pending;
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ void *vapic = vmx->nested.virtual_apic_map.hva;
+ int max_irr, vppr;
+
+ if (nested_vmx_preemption_timer_pending(vcpu) ||
+ vmx->nested.mtf_pending)
+ return true;
+
+ /*
+ * Virtual Interrupt Delivery doesn't require manual injection. Either
+ * the interrupt is already in GUEST_RVI and will be recognized by CPU
+ * at VM-Entry, or there is a KVM_REQ_EVENT pending and KVM will move
+ * the interrupt from the PIR to RVI prior to entering the guest.
+ */
+ if (for_injection)
+ return false;
+
+ if (!nested_cpu_has_vid(get_vmcs12(vcpu)) ||
+ __vmx_interrupt_blocked(vcpu))
+ return false;
+
+ if (!vapic)
+ return false;
+
+ vppr = *((u32 *)(vapic + APIC_PROCPRI));
+
+ if (vmx->nested.pi_pending && vmx->nested.pi_desc &&
+ pi_test_on(vmx->nested.pi_desc)) {
+ max_irr = pi_find_highest_vector(vmx->nested.pi_desc);
+ if (max_irr > 0 && (max_irr & 0xf0) > (vppr & 0xf0))
+ return true;
+ }
+
+ return false;
}
/*
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 27c4fa42b11af780d49ce704f7fa67b3c2544df4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072918-outsource-hardwired-03be@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
27c4fa42b11a ("KVM: nVMX: Check for pending posted interrupts when looking for nested events")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27c4fa42b11af780d49ce704f7fa67b3c2544df4 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 7 Jun 2024 10:26:07 -0700
Subject: [PATCH] KVM: nVMX: Check for pending posted interrupts when looking
for nested events
Check for pending (and notified!) posted interrupts when checking if L2
has a pending wake event, as fully posted/notified virtual interrupt is a
valid wake event for HLT.
Note that KVM must check vmx->nested.pi_pending to avoid prematurely
waking L2, e.g. even if KVM sees a non-zero PID.PIR and PID.0N=1, the
virtual interrupt won't actually be recognized until a notification IRQ is
received by the vCPU or the vCPU does (nested) VM-Enter.
Fixes: 26844fee6ade ("KVM: x86: never write to memory from kvm_vcpu_check_block()")
Cc: stable(a)vger.kernel.org
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Reported-by: Jim Mattson <jmattson(a)google.com>
Closes: https://lore.kernel.org/all/20231207010302.2240506-1-jmattson@google.com
Link: https://lore.kernel.org/r/20240607172609.3205077-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 411fe7aa0793..732340ff3300 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4034,8 +4034,40 @@ static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu)
static bool vmx_has_nested_events(struct kvm_vcpu *vcpu, bool for_injection)
{
- return nested_vmx_preemption_timer_pending(vcpu) ||
- to_vmx(vcpu)->nested.mtf_pending;
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ void *vapic = vmx->nested.virtual_apic_map.hva;
+ int max_irr, vppr;
+
+ if (nested_vmx_preemption_timer_pending(vcpu) ||
+ vmx->nested.mtf_pending)
+ return true;
+
+ /*
+ * Virtual Interrupt Delivery doesn't require manual injection. Either
+ * the interrupt is already in GUEST_RVI and will be recognized by CPU
+ * at VM-Entry, or there is a KVM_REQ_EVENT pending and KVM will move
+ * the interrupt from the PIR to RVI prior to entering the guest.
+ */
+ if (for_injection)
+ return false;
+
+ if (!nested_cpu_has_vid(get_vmcs12(vcpu)) ||
+ __vmx_interrupt_blocked(vcpu))
+ return false;
+
+ if (!vapic)
+ return false;
+
+ vppr = *((u32 *)(vapic + APIC_PROCPRI));
+
+ if (vmx->nested.pi_pending && vmx->nested.pi_desc &&
+ pi_test_on(vmx->nested.pi_desc)) {
+ max_irr = pi_find_highest_vector(vmx->nested.pi_desc);
+ if (max_irr > 0 && (max_irr & 0xf0) > (vppr & 0xf0))
+ return true;
+ }
+
+ return false;
}
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x d83c36d822be44db4bad0c43bea99c8908f54117
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072948-cosponsor-jacket-16bb@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
d83c36d822be ("KVM: nVMX: Add a helper to get highest pending from Posted Interrupt vector")
699f67512f04 ("KVM: VMX: Move posted interrupt descriptor out of VMX code")
50a82b0eb88c ("KVM: VMX: Split off vmx_onhyperv.{ch} from hyperv.{ch}")
662f6815786e ("KVM: VMX: Rename XSAVES control to follow KVM's preferred "ENABLE_XYZ"")
1143c0b85c07 ("KVM: VMX: Recompute "XSAVES enabled" only after CPUID update")
fbc722aac1ce ("KVM: VMX: Rename "KVM is using eVMCS" static key to match its wrapper")
19f10315fd53 ("KVM: VMX: Stub out enable_evmcs static key for CONFIG_HYPERV=n")
68ac4221497b ("KVM: nVMX: Move EVMCS1_SUPPORT_* macros to hyperv.c")
93827a0a3639 ("KVM: VMX: Fix crash due to uninitialized current_vmcs")
11633f69506d ("KVM: VMX: Always inline eVMCS read/write helpers")
d83420c2d74e ("KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops)")
325fc9579c2e ("KVM: SVM: Check for SVM support in CPU compatibility checks")
8504ef2139e2 ("KVM: VMX: Shuffle support checks and hardware enabling code around")
d41931324975 ("KVM: x86: Do VMX/SVM support checks directly in vendor code")
462689b37f08 ("KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks")
8d20bd638167 ("KVM: x86: Unify pr_fmt to use module name for all KVM modules")
3045c483eeee ("KVM: x86: Do CPU compatibility checks in x86 code")
a578a0a9e352 ("KVM: Drop kvm_arch_{init,exit}() hooks")
20deee32f553 ("KVM: RISC-V: Do arch init directly in riscv_kvm_init()")
3fb8e89aa2a0 ("KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d83c36d822be44db4bad0c43bea99c8908f54117 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 7 Jun 2024 10:26:04 -0700
Subject: [PATCH] KVM: nVMX: Add a helper to get highest pending from Posted
Interrupt vector
Add a helper to retrieve the highest pending vector given a Posted
Interrupt descriptor. While the actual operation is straightforward, it's
surprisingly easy to mess up, e.g. if one tries to reuse lapic.c's
find_highest_vector(), which doesn't work with PID.PIR due to the APIC's
IRR and ISR component registers being physically discontiguous (they're
4-byte registers aligned at 16-byte intervals).
To make PIR handling more consistent with respect to IRR and ISR handling,
return -1 to indicate "no interrupt pending".
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240607172609.3205077-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 643935a0f70a..8f4db6e8f57c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -12,6 +12,7 @@
#include "mmu.h"
#include "nested.h"
#include "pmu.h"
+#include "posted_intr.h"
#include "sgx.h"
#include "trace.h"
#include "vmx.h"
@@ -3899,8 +3900,8 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
if (!pi_test_and_clear_on(vmx->nested.pi_desc))
return 0;
- max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
- if (max_irr != 256) {
+ max_irr = pi_find_highest_vector(vmx->nested.pi_desc);
+ if (max_irr > 0) {
vapic_page = vmx->nested.virtual_apic_map.hva;
if (!vapic_page)
goto mmio_needed;
diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h
index 6b2a0226257e..1715d2ab07be 100644
--- a/arch/x86/kvm/vmx/posted_intr.h
+++ b/arch/x86/kvm/vmx/posted_intr.h
@@ -1,6 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __KVM_X86_VMX_POSTED_INTR_H
#define __KVM_X86_VMX_POSTED_INTR_H
+
+#include <linux/find.h>
#include <asm/posted_intr.h>
void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu);
@@ -12,4 +14,12 @@ int vmx_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
void vmx_pi_start_assignment(struct kvm *kvm);
+static inline int pi_find_highest_vector(struct pi_desc *pi_desc)
+{
+ int vec;
+
+ vec = find_last_bit((unsigned long *)pi_desc->pir, 256);
+ return vec < 256 ? vec : -1;
+}
+
#endif /* __KVM_X86_VMX_POSTED_INTR_H */
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x d83c36d822be44db4bad0c43bea99c8908f54117
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072947-acutely-kindness-fc65@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
d83c36d822be ("KVM: nVMX: Add a helper to get highest pending from Posted Interrupt vector")
699f67512f04 ("KVM: VMX: Move posted interrupt descriptor out of VMX code")
50a82b0eb88c ("KVM: VMX: Split off vmx_onhyperv.{ch} from hyperv.{ch}")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d83c36d822be44db4bad0c43bea99c8908f54117 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 7 Jun 2024 10:26:04 -0700
Subject: [PATCH] KVM: nVMX: Add a helper to get highest pending from Posted
Interrupt vector
Add a helper to retrieve the highest pending vector given a Posted
Interrupt descriptor. While the actual operation is straightforward, it's
surprisingly easy to mess up, e.g. if one tries to reuse lapic.c's
find_highest_vector(), which doesn't work with PID.PIR due to the APIC's
IRR and ISR component registers being physically discontiguous (they're
4-byte registers aligned at 16-byte intervals).
To make PIR handling more consistent with respect to IRR and ISR handling,
return -1 to indicate "no interrupt pending".
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240607172609.3205077-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 643935a0f70a..8f4db6e8f57c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -12,6 +12,7 @@
#include "mmu.h"
#include "nested.h"
#include "pmu.h"
+#include "posted_intr.h"
#include "sgx.h"
#include "trace.h"
#include "vmx.h"
@@ -3899,8 +3900,8 @@ static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
if (!pi_test_and_clear_on(vmx->nested.pi_desc))
return 0;
- max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
- if (max_irr != 256) {
+ max_irr = pi_find_highest_vector(vmx->nested.pi_desc);
+ if (max_irr > 0) {
vapic_page = vmx->nested.virtual_apic_map.hva;
if (!vapic_page)
goto mmio_needed;
diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h
index 6b2a0226257e..1715d2ab07be 100644
--- a/arch/x86/kvm/vmx/posted_intr.h
+++ b/arch/x86/kvm/vmx/posted_intr.h
@@ -1,6 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __KVM_X86_VMX_POSTED_INTR_H
#define __KVM_X86_VMX_POSTED_INTR_H
+
+#include <linux/find.h>
#include <asm/posted_intr.h>
void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu);
@@ -12,4 +14,12 @@ int vmx_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
void vmx_pi_start_assignment(struct kvm *kvm);
+static inline int pi_find_highest_vector(struct pi_desc *pi_desc)
+{
+ int vec;
+
+ vec = find_last_bit((unsigned long *)pi_desc->pir, 256);
+ return vec < 256 ? vec : -1;
+}
+
#endif /* __KVM_X86_VMX_POSTED_INTR_H */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 27ba5b67312a944576addc4df44ac3b709aabede
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072940-corridor-private-1f67@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
27ba5b67312a ("jbd2: avoid infinite transaction commit loop")
b33d9f5909c8 ("jbd2: add sparse annotations for add_transaction_credits()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 27ba5b67312a944576addc4df44ac3b709aabede Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Mon, 24 Jun 2024 19:01:19 +0200
Subject: [PATCH] jbd2: avoid infinite transaction commit loop
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into
t_outstanding_credits") started to account descriptor blocks into
transactions outstanding credits. However it didn't appropriately
decrease the maximum amount of credits available to userspace. Thus if
the filesystem requests a transaction smaller than
j_max_transaction_buffers but large enough that when descriptor blocks
are added the size exceeds j_max_transaction_buffers, we confuse
add_transaction_credits() into thinking previous handles have grown the
transaction too much and enter infinite journal commit loop in
start_this_handle() -> add_transaction_credits() trying to create
transaction with enough credits available.
Fix the problem by properly accounting for transaction space reserved
for descriptor blocks when verifying requested transaction handle size.
CC: stable(a)vger.kernel.org
Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits")
Reported-by: Alexander Coffin <alex.coffin(a)maticrobots.com>
Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6V…
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Link: https://patch.msgid.link/20240624170127.3253-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a095f1a3114b..66513c18ca29 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks)
wake_up(&journal->j_wait_reserved);
}
+/* Maximum number of blocks for user transaction payload */
+static int jbd2_max_user_trans_buffers(journal_t *journal)
+{
+ return journal->j_max_transaction_buffers -
+ journal->j_transaction_overhead_buffers;
+}
+
/*
* Wait until we can add credits for handle to the running transaction. Called
* with j_state_lock held for reading. Returns 0 if handle joined the running
@@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock)
* big to fit this handle? Wait until reserved credits are freed.
*/
if (atomic_read(&journal->j_reserved_credits) + total >
- journal->j_max_transaction_buffers) {
+ jbd2_max_user_trans_buffers(journal)) {
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + total <=
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock)
needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits);
/* We allow at most half of a transaction to be reserved */
- if (needed > journal->j_max_transaction_buffers / 2) {
+ if (needed > jbd2_max_user_trans_buffers(journal) / 2) {
sub_reserved_credits(journal, rsv_blocks);
atomic_sub(total, &t->t_outstanding_credits);
read_unlock(&journal->j_state_lock);
jbd2_might_wait_for_commit(journal);
wait_event(journal->j_wait_reserved,
atomic_read(&journal->j_reserved_credits) + rsv_blocks
- <= journal->j_max_transaction_buffers / 2);
+ <= jbd2_max_user_trans_buffers(journal) / 2);
__acquire(&journal->j_state_lock); /* fake out sparse */
return 1;
}
@@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
* size and limit the number of total credits to not exceed maximum
* transaction size per operation.
*/
- if ((rsv_blocks > journal->j_max_transaction_buffers / 2) ||
- (rsv_blocks + blocks > journal->j_max_transaction_buffers)) {
+ if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 ||
+ rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) {
printk(KERN_ERR "JBD2: %s wants too many credits "
"credits:%d rsv_credits:%d max:%d\n",
current->comm, blocks, rsv_blocks,
- journal->j_max_transaction_buffers);
+ jbd2_max_user_trans_buffers(journal));
WARN_ON(1);
return -ENOSPC;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x ab477b766edd3bfb6321a6e3df4c790612613fae
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-pulsate-subduing-2b97@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
ab477b766edd ("leds: triggers: Flush pending brightness before activating trigger")
b1bbd20f35e1 ("leds: trigger: Call synchronize_rcu() before calling trig->activate()")
822c91e72eac ("leds: trigger: Store brightness set by led_trigger_event()")
c82a1662d454 ("leds: trigger: Remove unused function led_trigger_rename_static()")
2a5a8fa8b231 ("leds: trigger: use RCU to protect the led_cdevs list")
27af8e2c90fb ("leds: trigger: fix potential deadlock with libata")
93690cdf3060 ("leds: trigger: add support for LED-private device triggers")
ec28a8cfdce6 ("leds: core: Remove extern from header")
11f700022137 ("leds: remove PAGE_SIZE limit of /sys/class/leds/<led>/trigger")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab477b766edd3bfb6321a6e3df4c790612613fae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Thu, 13 Jun 2024 17:24:51 +0200
Subject: [PATCH] leds: triggers: Flush pending brightness before activating
trigger
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The race fixed in timer_trig_activate() between a blocking
set_brightness() call and trigger->activate() can affect any trigger.
So move the call to flush_work() into led_trigger_set() where it can
avoid the race for all triggers.
Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Fixes: 8c0f693c6eff ("leds: avoid flush_work in atomic context")
Cc: stable(a)vger.kernel.org
Tested-by: Dustin L. Howett <dustin(a)howett.net>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Link: https://lore.kernel.org/r/20240613-led-trigger-flush-v2-1-f4f970799d77@weis…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 59deadb86335..78eb20093b2c 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -201,6 +201,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
*/
synchronize_rcu();
+ /*
+ * If "set brightness to 0" is pending in workqueue,
+ * we don't want that to be reordered after ->activate()
+ */
+ flush_work(&led_cdev->set_brightness_work);
+
ret = 0;
if (trig->activate)
ret = trig->activate(led_cdev);
diff --git a/drivers/leds/trigger/ledtrig-timer.c b/drivers/leds/trigger/ledtrig-timer.c
index b4688d1d9d2b..1d213c999d40 100644
--- a/drivers/leds/trigger/ledtrig-timer.c
+++ b/drivers/leds/trigger/ledtrig-timer.c
@@ -110,11 +110,6 @@ static int timer_trig_activate(struct led_classdev *led_cdev)
led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
}
- /*
- * If "set brightness to 0" is pending in workqueue, we don't
- * want that to be reordered after blink_set()
- */
- flush_work(&led_cdev->set_brightness_work);
led_blink_set(led_cdev, &led_cdev->blink_delay_on,
&led_cdev->blink_delay_off);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ab477b766edd3bfb6321a6e3df4c790612613fae
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072916-emphasize-gluten-2f10@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
ab477b766edd ("leds: triggers: Flush pending brightness before activating trigger")
b1bbd20f35e1 ("leds: trigger: Call synchronize_rcu() before calling trig->activate()")
822c91e72eac ("leds: trigger: Store brightness set by led_trigger_event()")
c82a1662d454 ("leds: trigger: Remove unused function led_trigger_rename_static()")
2a5a8fa8b231 ("leds: trigger: use RCU to protect the led_cdevs list")
27af8e2c90fb ("leds: trigger: fix potential deadlock with libata")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab477b766edd3bfb6321a6e3df4c790612613fae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Thu, 13 Jun 2024 17:24:51 +0200
Subject: [PATCH] leds: triggers: Flush pending brightness before activating
trigger
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The race fixed in timer_trig_activate() between a blocking
set_brightness() call and trigger->activate() can affect any trigger.
So move the call to flush_work() into led_trigger_set() where it can
avoid the race for all triggers.
Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Fixes: 8c0f693c6eff ("leds: avoid flush_work in atomic context")
Cc: stable(a)vger.kernel.org
Tested-by: Dustin L. Howett <dustin(a)howett.net>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Link: https://lore.kernel.org/r/20240613-led-trigger-flush-v2-1-f4f970799d77@weis…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 59deadb86335..78eb20093b2c 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -201,6 +201,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
*/
synchronize_rcu();
+ /*
+ * If "set brightness to 0" is pending in workqueue,
+ * we don't want that to be reordered after ->activate()
+ */
+ flush_work(&led_cdev->set_brightness_work);
+
ret = 0;
if (trig->activate)
ret = trig->activate(led_cdev);
diff --git a/drivers/leds/trigger/ledtrig-timer.c b/drivers/leds/trigger/ledtrig-timer.c
index b4688d1d9d2b..1d213c999d40 100644
--- a/drivers/leds/trigger/ledtrig-timer.c
+++ b/drivers/leds/trigger/ledtrig-timer.c
@@ -110,11 +110,6 @@ static int timer_trig_activate(struct led_classdev *led_cdev)
led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
}
- /*
- * If "set brightness to 0" is pending in workqueue, we don't
- * want that to be reordered after blink_set()
- */
- flush_work(&led_cdev->set_brightness_work);
led_blink_set(led_cdev, &led_cdev->blink_delay_on,
&led_cdev->blink_delay_off);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ab477b766edd3bfb6321a6e3df4c790612613fae
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072915-ducktail-recapture-094b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
ab477b766edd ("leds: triggers: Flush pending brightness before activating trigger")
b1bbd20f35e1 ("leds: trigger: Call synchronize_rcu() before calling trig->activate()")
822c91e72eac ("leds: trigger: Store brightness set by led_trigger_event()")
c82a1662d454 ("leds: trigger: Remove unused function led_trigger_rename_static()")
2a5a8fa8b231 ("leds: trigger: use RCU to protect the led_cdevs list")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab477b766edd3bfb6321a6e3df4c790612613fae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Thu, 13 Jun 2024 17:24:51 +0200
Subject: [PATCH] leds: triggers: Flush pending brightness before activating
trigger
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The race fixed in timer_trig_activate() between a blocking
set_brightness() call and trigger->activate() can affect any trigger.
So move the call to flush_work() into led_trigger_set() where it can
avoid the race for all triggers.
Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Fixes: 8c0f693c6eff ("leds: avoid flush_work in atomic context")
Cc: stable(a)vger.kernel.org
Tested-by: Dustin L. Howett <dustin(a)howett.net>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Link: https://lore.kernel.org/r/20240613-led-trigger-flush-v2-1-f4f970799d77@weis…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 59deadb86335..78eb20093b2c 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -201,6 +201,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
*/
synchronize_rcu();
+ /*
+ * If "set brightness to 0" is pending in workqueue,
+ * we don't want that to be reordered after ->activate()
+ */
+ flush_work(&led_cdev->set_brightness_work);
+
ret = 0;
if (trig->activate)
ret = trig->activate(led_cdev);
diff --git a/drivers/leds/trigger/ledtrig-timer.c b/drivers/leds/trigger/ledtrig-timer.c
index b4688d1d9d2b..1d213c999d40 100644
--- a/drivers/leds/trigger/ledtrig-timer.c
+++ b/drivers/leds/trigger/ledtrig-timer.c
@@ -110,11 +110,6 @@ static int timer_trig_activate(struct led_classdev *led_cdev)
led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
}
- /*
- * If "set brightness to 0" is pending in workqueue, we don't
- * want that to be reordered after blink_set()
- */
- flush_work(&led_cdev->set_brightness_work);
led_blink_set(led_cdev, &led_cdev->blink_delay_on,
&led_cdev->blink_delay_off);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x ab477b766edd3bfb6321a6e3df4c790612613fae
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072915-library-pronounce-3052@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
ab477b766edd ("leds: triggers: Flush pending brightness before activating trigger")
b1bbd20f35e1 ("leds: trigger: Call synchronize_rcu() before calling trig->activate()")
822c91e72eac ("leds: trigger: Store brightness set by led_trigger_event()")
c82a1662d454 ("leds: trigger: Remove unused function led_trigger_rename_static()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab477b766edd3bfb6321a6e3df4c790612613fae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Thu, 13 Jun 2024 17:24:51 +0200
Subject: [PATCH] leds: triggers: Flush pending brightness before activating
trigger
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The race fixed in timer_trig_activate() between a blocking
set_brightness() call and trigger->activate() can affect any trigger.
So move the call to flush_work() into led_trigger_set() where it can
avoid the race for all triggers.
Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Fixes: 8c0f693c6eff ("leds: avoid flush_work in atomic context")
Cc: stable(a)vger.kernel.org
Tested-by: Dustin L. Howett <dustin(a)howett.net>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Link: https://lore.kernel.org/r/20240613-led-trigger-flush-v2-1-f4f970799d77@weis…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 59deadb86335..78eb20093b2c 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -201,6 +201,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
*/
synchronize_rcu();
+ /*
+ * If "set brightness to 0" is pending in workqueue,
+ * we don't want that to be reordered after ->activate()
+ */
+ flush_work(&led_cdev->set_brightness_work);
+
ret = 0;
if (trig->activate)
ret = trig->activate(led_cdev);
diff --git a/drivers/leds/trigger/ledtrig-timer.c b/drivers/leds/trigger/ledtrig-timer.c
index b4688d1d9d2b..1d213c999d40 100644
--- a/drivers/leds/trigger/ledtrig-timer.c
+++ b/drivers/leds/trigger/ledtrig-timer.c
@@ -110,11 +110,6 @@ static int timer_trig_activate(struct led_classdev *led_cdev)
led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
}
- /*
- * If "set brightness to 0" is pending in workqueue, we don't
- * want that to be reordered after blink_set()
- */
- flush_work(&led_cdev->set_brightness_work);
led_blink_set(led_cdev, &led_cdev->blink_delay_on,
&led_cdev->blink_delay_off);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x ab477b766edd3bfb6321a6e3df4c790612613fae
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072914-fiddle-little-c002@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
ab477b766edd ("leds: triggers: Flush pending brightness before activating trigger")
b1bbd20f35e1 ("leds: trigger: Call synchronize_rcu() before calling trig->activate()")
822c91e72eac ("leds: trigger: Store brightness set by led_trigger_event()")
c82a1662d454 ("leds: trigger: Remove unused function led_trigger_rename_static()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab477b766edd3bfb6321a6e3df4c790612613fae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Thu, 13 Jun 2024 17:24:51 +0200
Subject: [PATCH] leds: triggers: Flush pending brightness before activating
trigger
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The race fixed in timer_trig_activate() between a blocking
set_brightness() call and trigger->activate() can affect any trigger.
So move the call to flush_work() into led_trigger_set() where it can
avoid the race for all triggers.
Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Fixes: 8c0f693c6eff ("leds: avoid flush_work in atomic context")
Cc: stable(a)vger.kernel.org
Tested-by: Dustin L. Howett <dustin(a)howett.net>
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Link: https://lore.kernel.org/r/20240613-led-trigger-flush-v2-1-f4f970799d77@weis…
Signed-off-by: Lee Jones <lee(a)kernel.org>
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index 59deadb86335..78eb20093b2c 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -201,6 +201,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
*/
synchronize_rcu();
+ /*
+ * If "set brightness to 0" is pending in workqueue,
+ * we don't want that to be reordered after ->activate()
+ */
+ flush_work(&led_cdev->set_brightness_work);
+
ret = 0;
if (trig->activate)
ret = trig->activate(led_cdev);
diff --git a/drivers/leds/trigger/ledtrig-timer.c b/drivers/leds/trigger/ledtrig-timer.c
index b4688d1d9d2b..1d213c999d40 100644
--- a/drivers/leds/trigger/ledtrig-timer.c
+++ b/drivers/leds/trigger/ledtrig-timer.c
@@ -110,11 +110,6 @@ static int timer_trig_activate(struct led_classdev *led_cdev)
led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
}
- /*
- * If "set brightness to 0" is pending in workqueue, we don't
- * want that to be reordered after blink_set()
- */
- flush_work(&led_cdev->set_brightness_work);
led_blink_set(led_cdev, &led_cdev->blink_delay_on,
&led_cdev->blink_delay_off);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9b003e14801cf85a8cebeddc87bc9fc77100fdce
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072942-curse-primate-40bf@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
9b003e14801c ("drivers: soc: xilinx: check return status of get_api_version()")
7fd890b89dea ("soc: xilinx: move PM_INIT_FINALIZE to zynqmp_pm_domains driver")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b003e14801cf85a8cebeddc87bc9fc77100fdce Mon Sep 17 00:00:00 2001
From: Jay Buddhabhatti <jay.buddhabhatti(a)amd.com>
Date: Wed, 15 May 2024 04:23:45 -0700
Subject: [PATCH] drivers: soc: xilinx: check return status of
get_api_version()
Currently return status is not getting checked for get_api_version
and because of that for x86 arch we are getting below smatch error.
CC drivers/soc/xilinx/zynqmp_power.o
drivers/soc/xilinx/zynqmp_power.c: In function 'zynqmp_pm_probe':
drivers/soc/xilinx/zynqmp_power.c:295:12: warning: 'pm_api_version' is
used uninitialized [-Wuninitialized]
295 | if (pm_api_version < ZYNQMP_PM_VERSION)
| ^
CHECK drivers/soc/xilinx/zynqmp_power.c
drivers/soc/xilinx/zynqmp_power.c:295 zynqmp_pm_probe() error:
uninitialized symbol 'pm_api_version'.
So, check return status of pm_get_api_version and return error in case
of failure to avoid checking uninitialized pm_api_version variable.
Fixes: b9b3a8be28b3 ("firmware: xilinx: Remove eemi ops for get_api_version")
Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti(a)amd.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240515112345.24673-1-jay.buddhabhatti@amd.com
Signed-off-by: Michal Simek <michal.simek(a)amd.com>
diff --git a/drivers/soc/xilinx/zynqmp_power.c b/drivers/soc/xilinx/zynqmp_power.c
index fced6bedca43..411d33f2fb05 100644
--- a/drivers/soc/xilinx/zynqmp_power.c
+++ b/drivers/soc/xilinx/zynqmp_power.c
@@ -288,7 +288,9 @@ static int zynqmp_pm_probe(struct platform_device *pdev)
u32 pm_api_version, pm_family_code, pm_sub_family_code, node_id;
struct mbox_client *client;
- zynqmp_pm_get_api_version(&pm_api_version);
+ ret = zynqmp_pm_get_api_version(&pm_api_version);
+ if (ret)
+ return ret;
/* Check PM API version number */
if (pm_api_version < ZYNQMP_PM_VERSION)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9b003e14801cf85a8cebeddc87bc9fc77100fdce
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072936-ogle-royal-3339@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
9b003e14801c ("drivers: soc: xilinx: check return status of get_api_version()")
7fd890b89dea ("soc: xilinx: move PM_INIT_FINALIZE to zynqmp_pm_domains driver")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b003e14801cf85a8cebeddc87bc9fc77100fdce Mon Sep 17 00:00:00 2001
From: Jay Buddhabhatti <jay.buddhabhatti(a)amd.com>
Date: Wed, 15 May 2024 04:23:45 -0700
Subject: [PATCH] drivers: soc: xilinx: check return status of
get_api_version()
Currently return status is not getting checked for get_api_version
and because of that for x86 arch we are getting below smatch error.
CC drivers/soc/xilinx/zynqmp_power.o
drivers/soc/xilinx/zynqmp_power.c: In function 'zynqmp_pm_probe':
drivers/soc/xilinx/zynqmp_power.c:295:12: warning: 'pm_api_version' is
used uninitialized [-Wuninitialized]
295 | if (pm_api_version < ZYNQMP_PM_VERSION)
| ^
CHECK drivers/soc/xilinx/zynqmp_power.c
drivers/soc/xilinx/zynqmp_power.c:295 zynqmp_pm_probe() error:
uninitialized symbol 'pm_api_version'.
So, check return status of pm_get_api_version and return error in case
of failure to avoid checking uninitialized pm_api_version variable.
Fixes: b9b3a8be28b3 ("firmware: xilinx: Remove eemi ops for get_api_version")
Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti(a)amd.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240515112345.24673-1-jay.buddhabhatti@amd.com
Signed-off-by: Michal Simek <michal.simek(a)amd.com>
diff --git a/drivers/soc/xilinx/zynqmp_power.c b/drivers/soc/xilinx/zynqmp_power.c
index fced6bedca43..411d33f2fb05 100644
--- a/drivers/soc/xilinx/zynqmp_power.c
+++ b/drivers/soc/xilinx/zynqmp_power.c
@@ -288,7 +288,9 @@ static int zynqmp_pm_probe(struct platform_device *pdev)
u32 pm_api_version, pm_family_code, pm_sub_family_code, node_id;
struct mbox_client *client;
- zynqmp_pm_get_api_version(&pm_api_version);
+ ret = zynqmp_pm_get_api_version(&pm_api_version);
+ if (ret)
+ return ret;
/* Check PM API version number */
if (pm_api_version < ZYNQMP_PM_VERSION)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x d01c84b97f19f1137211e90b0a910289a560019e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072918-agile-embody-42e0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
d01c84b97f19 ("cpufreq: qcom-nvmem: fix memory leaks in probe error paths")
2a5d46c3ad6b ("cpufreq: qcom-nvmem: Simplify driver data allocation")
402732324b17 ("cpufreq: qcom-nvmem: Convert to platform remove callback returning void")
d78be404f97f ("cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper")
49cd000dc51b ("cpufreq: qcom-nvmem: Migrate to dev_pm_opp_set_config()")
2ff8fe13ac6d ("cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_*() accepts NULL argument")
a8811ec764f9 ("cpufreq: qcom: Add support for krait based socs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d01c84b97f19f1137211e90b0a910289a560019e Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 23 May 2024 23:24:59 +0200
Subject: [PATCH] cpufreq: qcom-nvmem: fix memory leaks in probe error paths
The code refactoring added new error paths between the np device node
allocation and the call to of_node_put(), which leads to memory leaks if
any of those errors occur.
Add the missing of_node_put() in the error paths that require it.
Cc: stable(a)vger.kernel.org
Fixes: 57f2f8b4aa0c ("cpufreq: qcom: Refactor the driver to make it easier to extend")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index ea05d9d67490..5004e1dbc752 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -480,23 +480,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
GFP_KERNEL);
- if (!drv)
+ if (!drv) {
+ of_node_put(np);
return -ENOMEM;
+ }
match = pdev->dev.platform_data;
drv->data = match->data;
- if (!drv->data)
+ if (!drv->data) {
+ of_node_put(np);
return -ENODEV;
+ }
if (drv->data->get_version) {
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
- if (IS_ERR(speedbin_nvmem))
+ if (IS_ERR(speedbin_nvmem)) {
+ of_node_put(np);
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
+ }
ret = drv->data->get_version(cpu_dev,
speedbin_nvmem, &pvs_name, drv);
if (ret) {
+ of_node_put(np);
nvmem_cell_put(speedbin_nvmem);
return ret;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d01c84b97f19f1137211e90b0a910289a560019e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-daily-trio-3d03@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
d01c84b97f19 ("cpufreq: qcom-nvmem: fix memory leaks in probe error paths")
2a5d46c3ad6b ("cpufreq: qcom-nvmem: Simplify driver data allocation")
402732324b17 ("cpufreq: qcom-nvmem: Convert to platform remove callback returning void")
d78be404f97f ("cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper")
49cd000dc51b ("cpufreq: qcom-nvmem: Migrate to dev_pm_opp_set_config()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d01c84b97f19f1137211e90b0a910289a560019e Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 23 May 2024 23:24:59 +0200
Subject: [PATCH] cpufreq: qcom-nvmem: fix memory leaks in probe error paths
The code refactoring added new error paths between the np device node
allocation and the call to of_node_put(), which leads to memory leaks if
any of those errors occur.
Add the missing of_node_put() in the error paths that require it.
Cc: stable(a)vger.kernel.org
Fixes: 57f2f8b4aa0c ("cpufreq: qcom: Refactor the driver to make it easier to extend")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index ea05d9d67490..5004e1dbc752 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -480,23 +480,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
GFP_KERNEL);
- if (!drv)
+ if (!drv) {
+ of_node_put(np);
return -ENOMEM;
+ }
match = pdev->dev.platform_data;
drv->data = match->data;
- if (!drv->data)
+ if (!drv->data) {
+ of_node_put(np);
return -ENODEV;
+ }
if (drv->data->get_version) {
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
- if (IS_ERR(speedbin_nvmem))
+ if (IS_ERR(speedbin_nvmem)) {
+ of_node_put(np);
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
+ }
ret = drv->data->get_version(cpu_dev,
speedbin_nvmem, &pvs_name, drv);
if (ret) {
+ of_node_put(np);
nvmem_cell_put(speedbin_nvmem);
return ret;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d01c84b97f19f1137211e90b0a910289a560019e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-chivalry-timid-afc3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
d01c84b97f19 ("cpufreq: qcom-nvmem: fix memory leaks in probe error paths")
2a5d46c3ad6b ("cpufreq: qcom-nvmem: Simplify driver data allocation")
402732324b17 ("cpufreq: qcom-nvmem: Convert to platform remove callback returning void")
d78be404f97f ("cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper")
49cd000dc51b ("cpufreq: qcom-nvmem: Migrate to dev_pm_opp_set_config()")
2ff8fe13ac6d ("cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_*() accepts NULL argument")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d01c84b97f19f1137211e90b0a910289a560019e Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 23 May 2024 23:24:59 +0200
Subject: [PATCH] cpufreq: qcom-nvmem: fix memory leaks in probe error paths
The code refactoring added new error paths between the np device node
allocation and the call to of_node_put(), which leads to memory leaks if
any of those errors occur.
Add the missing of_node_put() in the error paths that require it.
Cc: stable(a)vger.kernel.org
Fixes: 57f2f8b4aa0c ("cpufreq: qcom: Refactor the driver to make it easier to extend")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index ea05d9d67490..5004e1dbc752 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -480,23 +480,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
GFP_KERNEL);
- if (!drv)
+ if (!drv) {
+ of_node_put(np);
return -ENOMEM;
+ }
match = pdev->dev.platform_data;
drv->data = match->data;
- if (!drv->data)
+ if (!drv->data) {
+ of_node_put(np);
return -ENODEV;
+ }
if (drv->data->get_version) {
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
- if (IS_ERR(speedbin_nvmem))
+ if (IS_ERR(speedbin_nvmem)) {
+ of_node_put(np);
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
+ }
ret = drv->data->get_version(cpu_dev,
speedbin_nvmem, &pvs_name, drv);
if (ret) {
+ of_node_put(np);
nvmem_cell_put(speedbin_nvmem);
return ret;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x d01c84b97f19f1137211e90b0a910289a560019e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072916-lusty-flashy-062b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
d01c84b97f19 ("cpufreq: qcom-nvmem: fix memory leaks in probe error paths")
2a5d46c3ad6b ("cpufreq: qcom-nvmem: Simplify driver data allocation")
402732324b17 ("cpufreq: qcom-nvmem: Convert to platform remove callback returning void")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d01c84b97f19f1137211e90b0a910289a560019e Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 23 May 2024 23:24:59 +0200
Subject: [PATCH] cpufreq: qcom-nvmem: fix memory leaks in probe error paths
The code refactoring added new error paths between the np device node
allocation and the call to of_node_put(), which leads to memory leaks if
any of those errors occur.
Add the missing of_node_put() in the error paths that require it.
Cc: stable(a)vger.kernel.org
Fixes: 57f2f8b4aa0c ("cpufreq: qcom: Refactor the driver to make it easier to extend")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index ea05d9d67490..5004e1dbc752 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -480,23 +480,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
GFP_KERNEL);
- if (!drv)
+ if (!drv) {
+ of_node_put(np);
return -ENOMEM;
+ }
match = pdev->dev.platform_data;
drv->data = match->data;
- if (!drv->data)
+ if (!drv->data) {
+ of_node_put(np);
return -ENODEV;
+ }
if (drv->data->get_version) {
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
- if (IS_ERR(speedbin_nvmem))
+ if (IS_ERR(speedbin_nvmem)) {
+ of_node_put(np);
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
+ }
ret = drv->data->get_version(cpu_dev,
speedbin_nvmem, &pvs_name, drv);
if (ret) {
+ of_node_put(np);
nvmem_cell_put(speedbin_nvmem);
return ret;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x d01c84b97f19f1137211e90b0a910289a560019e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072915-unweave-tiling-74ec@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
d01c84b97f19 ("cpufreq: qcom-nvmem: fix memory leaks in probe error paths")
2a5d46c3ad6b ("cpufreq: qcom-nvmem: Simplify driver data allocation")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d01c84b97f19f1137211e90b0a910289a560019e Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Thu, 23 May 2024 23:24:59 +0200
Subject: [PATCH] cpufreq: qcom-nvmem: fix memory leaks in probe error paths
The code refactoring added new error paths between the np device node
allocation and the call to of_node_put(), which leads to memory leaks if
any of those errors occur.
Add the missing of_node_put() in the error paths that require it.
Cc: stable(a)vger.kernel.org
Fixes: 57f2f8b4aa0c ("cpufreq: qcom: Refactor the driver to make it easier to extend")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
diff --git a/drivers/cpufreq/qcom-cpufreq-nvmem.c b/drivers/cpufreq/qcom-cpufreq-nvmem.c
index ea05d9d67490..5004e1dbc752 100644
--- a/drivers/cpufreq/qcom-cpufreq-nvmem.c
+++ b/drivers/cpufreq/qcom-cpufreq-nvmem.c
@@ -480,23 +480,30 @@ static int qcom_cpufreq_probe(struct platform_device *pdev)
drv = devm_kzalloc(&pdev->dev, struct_size(drv, cpus, num_possible_cpus()),
GFP_KERNEL);
- if (!drv)
+ if (!drv) {
+ of_node_put(np);
return -ENOMEM;
+ }
match = pdev->dev.platform_data;
drv->data = match->data;
- if (!drv->data)
+ if (!drv->data) {
+ of_node_put(np);
return -ENODEV;
+ }
if (drv->data->get_version) {
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
- if (IS_ERR(speedbin_nvmem))
+ if (IS_ERR(speedbin_nvmem)) {
+ of_node_put(np);
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
+ }
ret = drv->data->get_version(cpu_dev,
speedbin_nvmem, &pvs_name, drv);
if (ret) {
+ of_node_put(np);
nvmem_cell_put(speedbin_nvmem);
return ret;
}
From: Johannes Berg <johannes.berg(a)intel.com>
commit ce04abc3fcc62cd5640af981ebfd7c4dc3bded28 upstream.
When userspace sets basic rates, it might send us some rates
list that's empty or consists of invalid values only. We're
currently ignoring invalid values and then may end up with a
rates bitmap that's empty, which later results in a warning.
Reject the call if there were no valid rates.
[ Conflict resolution involved adjusting the patch to accommodate
changes in the function signature of ieee80211_parse_bitrates,
specifically the updated first parameter ]
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Reported-by: syzbot+19013115c9786bfd0c4e(a)syzkaller.appspotmail.com
Tested-by: syzbot+19013115c9786bfd0c4e(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=19013115c9786bfd0c4e
Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela(a)gmail.com>
---
net/mac80211/cfg.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index f277ce839ddb..85abd3ff07b4 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -2339,6 +2339,17 @@ static int ieee80211_change_bss(struct wiphy *wiphy,
if (!sband)
return -EINVAL;
+ if (params->basic_rates) {
+ if (!ieee80211_parse_bitrates(&sdata->vif.bss_conf.chandef,
+ wiphy->bands[sband->band],
+ params->basic_rates,
+ params->basic_rates_len,
+ &sdata->vif.bss_conf.basic_rates))
+ return -EINVAL;
+ changed |= BSS_CHANGED_BASIC_RATES;
+ ieee80211_check_rate_mask(sdata);
+ }
+
if (params->use_cts_prot >= 0) {
sdata->vif.bss_conf.use_cts_prot = params->use_cts_prot;
changed |= BSS_CHANGED_ERP_CTS_PROT;
@@ -2362,16 +2373,6 @@ static int ieee80211_change_bss(struct wiphy *wiphy,
changed |= BSS_CHANGED_ERP_SLOT;
}
- if (params->basic_rates) {
- ieee80211_parse_bitrates(&sdata->vif.bss_conf.chandef,
- wiphy->bands[sband->band],
- params->basic_rates,
- params->basic_rates_len,
- &sdata->vif.bss_conf.basic_rates);
- changed |= BSS_CHANGED_BASIC_RATES;
- ieee80211_check_rate_mask(sdata);
- }
-
if (params->ap_isolate >= 0) {
if (params->ap_isolate)
sdata->flags |= IEEE80211_SDATA_DONT_BRIDGE_PACKETS;
--
2.43.0
Hi,
Please backport the following patch back to the stable series 6.6 and 6.1:
commit a8bca3e9371dc5e276af4168be099b2a05554c2a
Author: Johannes Berg <johannes.berg(a)intel.com>
Date: Wed Feb 28 12:01:57 2024 +0100
wifi: mac80211: track capability/opmode NSS separately
There is a small conflict in the copyright year update in
net/mac80211/sta_info.h, just take the 2024 version.
On kernel 6.1 please backport this patch first to make the other one apply:
commit 57b341e9ab13e5688491bfd54f8b5502416c8905
Author: Rameshkumar Sundaram <quic_ramess(a)quicinc.com>
Date: Tue Feb 7 17:11:46 2023 +0530
wifi: mac80211: Allow NSS change only up to capability
This commit fixes a throughput regression introduced into Linux stable
with the backport of following commit:
commit dd6c064cfc3fc18d871107c6f5db8837e88572e4
Author: Johannes Berg <johannes.berg(a)intel.com>
Date: Mon Jan 29 15:53:55 2024 +0100
wifi: mac80211: set station RX-NSS on reconfig
On older kernel versions it is harder to backport the fixes, maybe
revert the offending commit in Linux stable 5.15 and older.
This regression was found by multiple users of OpenWrt in Wifi AP mode.
See the discussion in this forum thread:
https://forum.openwrt.org/t/openwrt-23-05-4-service-release/204514/111
Thank you KONG from the OpenWrt forum for finding the commit which
reduced the Wifi throughput.
Hauke
From: "Joel Fernandes (Google)" <joel(a)joelfernandes.org>
Paths using put_prev_task_balance() need to do a pick shortly
after. Make sure they also clear the ->dl_server on prev as a
part of that.
Cc: stable(a)vger.kernel.org
Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
---
kernel/sched/core.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bcf2c4cc0522..08c409457152 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6006,6 +6006,14 @@ static void put_prev_task_balance(struct rq *rq, struct task_struct *prev,
#endif
put_prev_task(rq, prev);
+
+ /*
+ * We've updated @prev and no longer need the server link, clear it.
+ * Must be done before ->pick_next_task() because that can (re)set
+ * ->dl_server.
+ */
+ if (prev->dl_server)
+ prev->dl_server = NULL;
}
/*
@@ -6049,14 +6057,6 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
restart:
put_prev_task_balance(rq, prev, rf);
- /*
- * We've updated @prev and no longer need the server link, clear it.
- * Must be done before ->pick_next_task() because that can (re)set
- * ->dl_server.
- */
- if (prev->dl_server)
- prev->dl_server = NULL;
-
for_each_class(class) {
p = class->pick_next_task(rq);
if (p)
--
2.45.1
Add an explanation for the newly added variable.
Cc: stable(a)vger.kernel.org
Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
---
include/linux/sched.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 61591ac6eab6..abce356932cc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -637,6 +637,8 @@ struct sched_dl_entity {
*
* @dl_overrun tells if the task asked to be informed about runtime
* overruns.
+ *
+ * @dl_server tells if this is a server entity.
*/
unsigned int dl_throttled : 1;
unsigned int dl_yielded : 1;
--
2.45.1
From: Youssef Esmat <youssefesmat(a)google.com>
In case the previous pick was a DL server pick, ->dl_server might be
set. Clear it in the fast path as well.
Cc: stable(a)vger.kernel.org
Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
Signed-off-by: Youssef Esmat <youssefesmat(a)google.com>
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)kernel.org>
---
kernel/sched/core.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 08c409457152..6d01863f93ca 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6044,6 +6044,13 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
p = pick_next_task_idle(rq);
}
+ /*
+ * This is a normal CFS pick, but the previous could be a DL pick.
+ * Clear it as previous is no longer picked.
+ */
+ if (prev->dl_server)
+ prev->dl_server = NULL;
+
/*
* This is the fast path; it cannot be a DL server pick;
* therefore even if @p == @prev, ->dl_server must be NULL.
--
2.45.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072911-elsewhere-latter-afa3@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
6c120399cde6 ("ext4: make ext4_es_insert_extent() return void")
2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
bda3efaf774f ("ext4: use pre-allocated es in __es_remove_extent()")
95f0b320339a ("ext4: use pre-allocated es in __es_insert_extent()")
73a2f033656b ("ext4: factor out __es_alloc_extent() and __es_free_extent()")
9649eb18c628 ("ext4: add a new helper to check if es must be kept")
8016e29f4362 ("ext4: fast commit recovery path")
5b849b5f96b4 ("jbd2: fast commit recovery path")
aa75f4d3daae ("ext4: main fast-commit commit path")
ff780b91efe9 ("jbd2: add fast commit machinery")
6866d7b3f2bb ("ext4 / jbd2: add fast commit initialization")
995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options")
2d069c0889ef ("ext4: use common helpers in all places reading metadata buffers")
d9befedaafcf ("ext4: clear buffer verified flag if read meta block from disk")
15ed2851b0f4 ("ext4: remove unused argument from ext4_(inc|dec)_count")
3d392b2676bf ("ext4: add prefetch_block_bitmaps mount option")
ab74c7b23f37 ("ext4: indicate via a block bitmap read is prefetched via a tracepoint")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072909-shamrock-frail-43e9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
6c120399cde6 ("ext4: make ext4_es_insert_extent() return void")
2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
bda3efaf774f ("ext4: use pre-allocated es in __es_remove_extent()")
95f0b320339a ("ext4: use pre-allocated es in __es_insert_extent()")
73a2f033656b ("ext4: factor out __es_alloc_extent() and __es_free_extent()")
9649eb18c628 ("ext4: add a new helper to check if es must be kept")
8016e29f4362 ("ext4: fast commit recovery path")
5b849b5f96b4 ("jbd2: fast commit recovery path")
aa75f4d3daae ("ext4: main fast-commit commit path")
ff780b91efe9 ("jbd2: add fast commit machinery")
6866d7b3f2bb ("ext4 / jbd2: add fast commit initialization")
995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options")
2d069c0889ef ("ext4: use common helpers in all places reading metadata buffers")
d9befedaafcf ("ext4: clear buffer verified flag if read meta block from disk")
15ed2851b0f4 ("ext4: remove unused argument from ext4_(inc|dec)_count")
3d392b2676bf ("ext4: add prefetch_block_bitmaps mount option")
ab74c7b23f37 ("ext4: indicate via a block bitmap read is prefetched via a tracepoint")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072908-ibuprofen-destruct-80dc@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
6c120399cde6 ("ext4: make ext4_es_insert_extent() return void")
2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
bda3efaf774f ("ext4: use pre-allocated es in __es_remove_extent()")
95f0b320339a ("ext4: use pre-allocated es in __es_insert_extent()")
73a2f033656b ("ext4: factor out __es_alloc_extent() and __es_free_extent()")
9649eb18c628 ("ext4: add a new helper to check if es must be kept")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072907-animosity-ocelot-8f7c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
6c120399cde6 ("ext4: make ext4_es_insert_extent() return void")
2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
bda3efaf774f ("ext4: use pre-allocated es in __es_remove_extent()")
95f0b320339a ("ext4: use pre-allocated es in __es_insert_extent()")
73a2f033656b ("ext4: factor out __es_alloc_extent() and __es_free_extent()")
9649eb18c628 ("ext4: add a new helper to check if es must be kept")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072906-unshaven-whenever-406d@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
6c120399cde6 ("ext4: make ext4_es_insert_extent() return void")
2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
bda3efaf774f ("ext4: use pre-allocated es in __es_remove_extent()")
95f0b320339a ("ext4: use pre-allocated es in __es_insert_extent()")
73a2f033656b ("ext4: factor out __es_alloc_extent() and __es_free_extent()")
9649eb18c628 ("ext4: add a new helper to check if es must be kept")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072904-rule-emblem-471a@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
3fcc2b887a1b ("ext4: refactor ext4_da_map_blocks()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072903-oblong-old-80b6@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
0ea6560abb3b ("ext4: check the extent status again before inserting delalloc block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ea6560abb3bac1ffcfa4bf6b2c4d344fdc27b3c Mon Sep 17 00:00:00 2001
From: Zhang Yi <yi.zhang(a)huawei.com>
Date: Fri, 17 May 2024 20:39:57 +0800
Subject: [PATCH] ext4: check the extent status again before inserting delalloc
block
ext4_da_map_blocks looks up for any extent entry in the extent status
tree (w/o i_data_sem) and then the looks up for any ondisk extent
mapping (with i_data_sem in read mode).
If it finds a hole in the extent status tree or if it couldn't find any
entry at all, it then takes the i_data_sem in write mode to add a da
entry into the extent status tree. This can actually race with page
mkwrite & fallocate path.
Note that this is ok between
1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the
folio lock
2. ext4 buffered write path v/s ext4 fallocate because of the inode
lock.
But this can race between ext4_page_mkwrite() & ext4 fallocate path
ext4_page_mkwrite() ext4_fallocate()
block_page_mkwrite()
ext4_da_map_blocks()
//find hole in extent status tree
ext4_alloc_file_blocks()
ext4_map_blocks()
//allocate block and unwritten extent
ext4_insert_delayed_block()
ext4_da_reserve_space()
//reserve one more block
ext4_es_insert_delayed_block()
//drop unwritten extent and add delayed extent by mistake
Then, the delalloc extent is wrong until writeback and the extra
reserved block can't be released any more and it triggers below warning:
EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared!
Fix the problem by looking up extent status tree again while the
i_data_sem is held in write mode. If it still can't find any entry, then
we insert a new da entry into the extent status tree.
Cc: stable(a)vger.kernel.org
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 168819b4db01..4b0d64a76e88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1737,6 +1737,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
if (ext4_es_is_hole(&es))
goto add_delayed;
+found:
/*
* Delayed extent could be allocated by fallocate.
* So we need to check it.
@@ -1781,6 +1782,26 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
add_delayed:
down_write(&EXT4_I(inode)->i_data_sem);
+ /*
+ * Page fault path (ext4_page_mkwrite does not take i_rwsem)
+ * and fallocate path (no folio lock) can race. Make sure we
+ * lookup the extent status tree here again while i_data_sem
+ * is held in write mode, before inserting a new da entry in
+ * the extent status tree.
+ */
+ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+ if (!ext4_es_is_hole(&es)) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ goto found;
+ }
+ } else if (!ext4_has_inline_data(inode)) {
+ retval = ext4_map_query_blocks(NULL, inode, map);
+ if (retval) {
+ up_write(&EXT4_I(inode)->i_data_sem);
+ return retval;
+ }
+ }
+
retval = ext4_insert_delayed_block(inode, map->m_lblk);
up_write(&EXT4_I(inode)->i_data_sem);
if (retval)
The following commit has been merged into the perf/core branch of tip:
Commit-ID: d92792a4b26e50b96ab734cbe203d8a4c932a7a9
Gitweb: https://git.kernel.org/tip/d92792a4b26e50b96ab734cbe203d8a4c932a7a9
Author: Adrian Hunter <adrian.hunter(a)intel.com>
AuthorDate: Mon, 15 Jul 2024 19:07:00 +03:00
Committer: Peter Zijlstra <peterz(a)infradead.org>
CommitterDate: Mon, 29 Jul 2024 12:16:24 +02:00
perf/x86/intel/pt: Fix sampling synchronization
pt_event_snapshot_aux() uses pt->handle_nmi to determine if tracing
needs to be stopped, however tracing can still be going because
pt->handle_nmi is set to zero before tracing is stopped in pt_event_stop,
whereas pt_event_snapshot_aux() requires that tracing must be stopped in
order to copy a sample of trace from the buffer.
Instead call pt_config_stop() always, which anyway checks config for
RTIT_CTL_TRACEEN and does nothing if it is already clear.
Note pt_event_snapshot_aux() can continue to use pt->handle_nmi to
determine if the trace needs to be restarted afterwards.
Fixes: 25e8920b301c ("perf/x86/intel/pt: Add sampling support")
Signed-off-by: Adrian Hunter <adrian.hunter(a)intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20240715160712.127117-2-adrian.hunter@intel.com
---
arch/x86/events/intel/pt.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index b4aa8da..2959970 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1606,6 +1606,7 @@ static void pt_event_stop(struct perf_event *event, int mode)
* see comment in intel_pt_interrupt().
*/
WRITE_ONCE(pt->handle_nmi, 0);
+ barrier();
pt_config_stop(event);
@@ -1657,11 +1658,10 @@ static long pt_event_snapshot_aux(struct perf_event *event,
return 0;
/*
- * Here, handle_nmi tells us if the tracing is on
+ * There is no PT interrupt in this mode, so stop the trace and it will
+ * remain stopped while the buffer is copied.
*/
- if (READ_ONCE(pt->handle_nmi))
- pt_config_stop(event);
-
+ pt_config_stop(event);
pt_read_offset(buf);
pt_update_head(pt);
@@ -1673,11 +1673,10 @@ static long pt_event_snapshot_aux(struct perf_event *event,
ret = perf_output_copy_aux(&pt->handle, handle, from, to);
/*
- * If the tracing was on when we turned up, restart it.
- * Compiler barrier not needed as we couldn't have been
- * preempted by anything that touches pt->handle_nmi.
+ * Here, handle_nmi tells us if the tracing was on.
+ * If the tracing was on, restart it.
*/
- if (pt->handle_nmi)
+ if (READ_ONCE(pt->handle_nmi))
pt_config_start(event);
return ret;
On 7/12/24 11:47, Javier Carrasco wrote:
> On 12/07/2024 11:44, Vincenzo Mezzela wrote:
>> Device node `cpus` is allocated but never released using `of_node_put`.
>>
>> This patch introduces the __free attribute for `cpus` device_node that
>> automatically handle the cleanup of the resource by adding a call to
>> `of_node_put` at the end of the current scope. This enhancement aims to
>> mitigate memory management issues associated with forgetting to release
>> the resources.
>>
>> Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela(a)gmail.com>
>> ---
>> arch/arm/kernel/devtree.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c
>> index fdb74e64206a..223d66a5fff3 100644
>> --- a/arch/arm/kernel/devtree.c
>> +++ b/arch/arm/kernel/devtree.c
>> @@ -70,14 +70,14 @@ void __init arm_dt_init_cpu_maps(void)
>> * contain a list of MPIDR[23:0] values where MPIDR[31:24] must
>> * read as 0.
>> */
>> - struct device_node *cpu, *cpus;
>> int found_method = 0;
>> u32 i, j, cpuidx = 1;
>> u32 mpidr = is_smp() ? read_cpuid_mpidr() & MPIDR_HWID_BITMASK : 0;
>>
>> u32 tmp_map[NR_CPUS] = { [0 ... NR_CPUS-1] = MPIDR_INVALID };
>> bool bootcpu_valid = false;
>> - cpus = of_find_node_by_path("/cpus");
>> + struct device_node *cpu;
>> + struct device_node *cpus __free(device_node) = of_find_node_by_path("/cpus");
>>
>> if (!cpus)
>> return;
> Hello Vincenzo,
>
> If this is a fix, please provide the Fixes: tag as well as Cc: for
> stable if it applies.
>
> Best regards, Javier Carrasco
Sure, will do. :)
Best regards,
Vincenzo
Fixes: a0ae02405076a ("ARM: kernel: add device tree init map function")
Cc: stable(a)vger.kernel.org
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x a90d4471146de21745980cba51ce88e7926bcc4f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072927-stubbly-curler-09c4@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
a90d4471146d ("udf: Avoid using corrupted block bitmap buffer")
1e0d4adf17e7 ("udf: Check consistency of Space Bitmap Descriptor")
101ee137d32a ("udf: Drop VARCONV support")
a27b2923de7e ("udf: Move udf_expand_dir_adinicb() to its callsite")
57bda9fb169d ("udf: Convert udf_expand_dir_adinicb() to new directory iteration")
d16076d9b684 ("udf: New directory iteration code")
e4ae4735f7c2 ("udf: use sb_bdev_nr_blocks")
b64533344371 ("udf: Fix iocharset=utf8 mount option")
979a6e28dd96 ("udf: Get rid of 0-length arrays in struct fileIdentDesc")
fa236c2b2d44 ("udf: Fix NULL pointer dereference in udf_symlink function")
382a2287bf9c ("udf: Remove pointless union in udf_inode_info")
044e2e26f214 ("udf: Avoid accessing uninitialized data on failed inode read")
8b075e5ba459 ("udf: stop using ioctl_by_bdev")
4eb09e111218 ("fs-udf: Delete an unnecessary check before brelse()")
ab9a3a737284 ("udf: reduce leakage of blocks related to named streams")
a768a9abc625 ("udf: Explain handling of load_nls() failure")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a90d4471146de21745980cba51ce88e7926bcc4f Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Mon, 17 Jun 2024 17:41:52 +0200
Subject: [PATCH] udf: Avoid using corrupted block bitmap buffer
When the filesystem block bitmap is corrupted, we detect the corruption
while loading the bitmap and fail the allocation with error. However the
next allocation from the same bitmap will notice the bitmap buffer is
already loaded and tries to allocate from the bitmap with mixed results
(depending on the exact nature of the bitmap corruption). Fix the
problem by using BH_verified bit to indicate whether the bitmap is valid
or not.
Reported-by: syzbot+5f682cd029581f9edfd1(a)syzkaller.appspotmail.com
CC: stable(a)vger.kernel.org
Link: https://patch.msgid.link/20240617154201.29512-2-jack@suse.cz
Fixes: 1e0d4adf17e7 ("udf: Check consistency of Space Bitmap Descriptor")
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/udf/balloc.c b/fs/udf/balloc.c
index ab3ffc355949..558ad046972a 100644
--- a/fs/udf/balloc.c
+++ b/fs/udf/balloc.c
@@ -64,8 +64,12 @@ static int read_block_bitmap(struct super_block *sb,
}
for (i = 0; i < count; i++)
- if (udf_test_bit(i + off, bh->b_data))
+ if (udf_test_bit(i + off, bh->b_data)) {
+ bitmap->s_block_bitmap[bitmap_nr] =
+ ERR_PTR(-EFSCORRUPTED);
+ brelse(bh);
return -EFSCORRUPTED;
+ }
return 0;
}
@@ -81,8 +85,15 @@ static int __load_block_bitmap(struct super_block *sb,
block_group, nr_groups);
}
- if (bitmap->s_block_bitmap[block_group])
+ if (bitmap->s_block_bitmap[block_group]) {
+ /*
+ * The bitmap failed verification in the past. No point in
+ * trying again.
+ */
+ if (IS_ERR(bitmap->s_block_bitmap[block_group]))
+ return PTR_ERR(bitmap->s_block_bitmap[block_group]);
return block_group;
+ }
retval = read_block_bitmap(sb, bitmap, block_group, block_group);
if (retval < 0)
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 9381a66c6ce5..92d477053905 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -336,7 +336,8 @@ static void udf_sb_free_bitmap(struct udf_bitmap *bitmap)
int nr_groups = bitmap->s_nr_groups;
for (i = 0; i < nr_groups; i++)
- brelse(bitmap->s_block_bitmap[i]);
+ if (!IS_ERR_OR_NULL(bitmap->s_block_bitmap[i]))
+ brelse(bitmap->s_block_bitmap[i]);
kvfree(bitmap);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 2bc73505a5cd2a18a7a542022722f136c19e3b87
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072905-twirl-endless-6a91@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
2bc73505a5cd ("apparmor: use kvfree_sensitive to free data->data")
000518bc5aef ("apparmor: fix missing error check for rhashtable_insert_fast")
453431a54934 ("mm, treewide: rename kzfree() to kfree_sensitive()")
45365a06aa30 ("Merge tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2bc73505a5cd2a18a7a542022722f136c19e3b87 Mon Sep 17 00:00:00 2001
From: Fedor Pchelkin <pchelkin(a)ispras.ru>
Date: Thu, 1 Feb 2024 17:24:48 +0300
Subject: [PATCH] apparmor: use kvfree_sensitive to free data->data
Inside unpack_profile() data->data is allocated using kvmemdup() so it
should be freed with the corresponding kvfree_sensitive().
Also add missing data->data release for rhashtable insertion failure path
in unpack_profile().
Found by Linux Verification Center (linuxtesting.org).
Fixes: e025be0f26d5 ("apparmor: support querying extended trusted helper extra data")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
Signed-off-by: John Johansen <john.johansen(a)canonical.com>
diff --git a/security/apparmor/policy.c b/security/apparmor/policy.c
index 957654d253dd..14df15e35695 100644
--- a/security/apparmor/policy.c
+++ b/security/apparmor/policy.c
@@ -225,7 +225,7 @@ static void aa_free_data(void *ptr, void *arg)
{
struct aa_data *data = ptr;
- kfree_sensitive(data->data);
+ kvfree_sensitive(data->data, data->size);
kfree_sensitive(data->key);
kfree_sensitive(data);
}
diff --git a/security/apparmor/policy_unpack.c b/security/apparmor/policy_unpack.c
index 5e578ef0ddff..75452acd0e35 100644
--- a/security/apparmor/policy_unpack.c
+++ b/security/apparmor/policy_unpack.c
@@ -1071,6 +1071,7 @@ static struct aa_profile *unpack_profile(struct aa_ext *e, char **ns_name)
if (rhashtable_insert_fast(profile->data, &data->head,
profile->data->p)) {
+ kvfree_sensitive(data->data, data->size);
kfree_sensitive(data->key);
kfree_sensitive(data);
info = "failed to insert data to table";
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 2bc73505a5cd2a18a7a542022722f136c19e3b87
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072904-landmass-fragrant-ef08@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
2bc73505a5cd ("apparmor: use kvfree_sensitive to free data->data")
000518bc5aef ("apparmor: fix missing error check for rhashtable_insert_fast")
453431a54934 ("mm, treewide: rename kzfree() to kfree_sensitive()")
45365a06aa30 ("Merge tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2bc73505a5cd2a18a7a542022722f136c19e3b87 Mon Sep 17 00:00:00 2001
From: Fedor Pchelkin <pchelkin(a)ispras.ru>
Date: Thu, 1 Feb 2024 17:24:48 +0300
Subject: [PATCH] apparmor: use kvfree_sensitive to free data->data
Inside unpack_profile() data->data is allocated using kvmemdup() so it
should be freed with the corresponding kvfree_sensitive().
Also add missing data->data release for rhashtable insertion failure path
in unpack_profile().
Found by Linux Verification Center (linuxtesting.org).
Fixes: e025be0f26d5 ("apparmor: support querying extended trusted helper extra data")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
Signed-off-by: John Johansen <john.johansen(a)canonical.com>
diff --git a/security/apparmor/policy.c b/security/apparmor/policy.c
index 957654d253dd..14df15e35695 100644
--- a/security/apparmor/policy.c
+++ b/security/apparmor/policy.c
@@ -225,7 +225,7 @@ static void aa_free_data(void *ptr, void *arg)
{
struct aa_data *data = ptr;
- kfree_sensitive(data->data);
+ kvfree_sensitive(data->data, data->size);
kfree_sensitive(data->key);
kfree_sensitive(data);
}
diff --git a/security/apparmor/policy_unpack.c b/security/apparmor/policy_unpack.c
index 5e578ef0ddff..75452acd0e35 100644
--- a/security/apparmor/policy_unpack.c
+++ b/security/apparmor/policy_unpack.c
@@ -1071,6 +1071,7 @@ static struct aa_profile *unpack_profile(struct aa_ext *e, char **ns_name)
if (rhashtable_insert_fast(profile->data, &data->head,
profile->data->p)) {
+ kvfree_sensitive(data->data, data->size);
kfree_sensitive(data->key);
kfree_sensitive(data);
info = "failed to insert data to table";
Hello stable folk,
This patch was merged as:
commit 3af7524b1419 ("sched/fair: Use all little CPUs for CPU-bound workloads")
into 6.7, improving the following:
commit 0b0695f2b34a ("sched/fair: Rework load_balance()")
Would it be possible to port it to the 6.1 stable branch ?
The patch should apply cleanly by cherry-picking onto v6.1.94,
Regards,
Pierre
On 12/6/23 10:00, Pierre Gondois wrote:
> Running n CPU-bound tasks on an n CPUs platform:
> - with asymmetric CPU capacity
> - not being a DynamIq system (i.e. having a PKG level sched domain
> without the SD_SHARE_PKG_RESOURCES flag set)
> might result in a task placement where two tasks run on a big CPU
> and none on a little CPU. This placement could be more optimal by
> using all CPUs.
>
> Testing platform:
> Juno-r2:
> - 2 big CPUs (1-2), maximum capacity of 1024
> - 4 little CPUs (0,3-5), maximum capacity of 383
>
> Testing workload ([1]):
> Spawn 6 CPU-bound tasks. During the first 100ms (step 1), each tasks
> is affine to a CPU, except for:
> - one little CPU which is left idle.
> - one big CPU which has 2 tasks affine.
> After the 100ms (step 2), remove the cpumask affinity.
>
> Before patch:
> During step 2, the load balancer running from the idle CPU tags sched
> domains as:
> - little CPUs: 'group_has_spare'. Cf. group_has_capacity() and
> group_is_overloaded(), 3 CPU-bound tasks run on a 4 CPUs
> sched-domain, and the idle CPU provides enough spare capacity
> regarding the imbalance_pct
> - big CPUs: 'group_overloaded'. Indeed, 3 tasks run on a 2 CPUs
> sched-domain, so the following path is used:
> group_is_overloaded()
> \-if (sgs->sum_nr_running <= sgs->group_weight) return true;
>
> The following path which would change the migration type to
> 'migrate_task' is not taken:
> calculate_imbalance()
> \-if (env->idle != CPU_NOT_IDLE && env->imbalance == 0)
> as the local group has some spare capacity, so the imbalance
> is not 0.
>
> The migration type requested is 'migrate_util' and the busiest
> runqueue is the big CPU's runqueue having 2 tasks (each having a
> utilization of 512). The idle little CPU cannot pull one of these
> task as its capacity is too small for the task. The following path
> is used:
> detach_tasks()
> \-case migrate_util:
> \-if (util > env->imbalance) goto next;
>
> After patch:
> As the number of failed balancing attempts grows (with
> 'nr_balance_failed'), progressively make it easier to migrate
> a big task to the idling little CPU. A similar mechanism is
> used for the 'migrate_load' migration type.
>
> Improvement:
> Running the testing workload [1] with the step 2 representing
> a ~10s load for a big CPU:
> Before patch: ~19.3s
> After patch: ~18s (-6.7%)
>
> Similar issue reported at:
> https://lore.kernel.org/lkml/20230716014125.139577-1-qyousef@layalina.io/
>
> v1:
> https://lore.kernel.org/all/20231110125902.2152380-1-pierre.gondois@arm.com/
> v2:
> https://lore.kernel.org/all/20231124153323.3202444-1-pierre.gondois@arm.com/
>
> Suggested-by: Vincent Guittot <vincent.guittot(a)linaro.org>
> Signed-off-by: Pierre Gondois <pierre.gondois(a)arm.com>
> Reviewed-by: Vincent Guittot <vincent.guittot(a)linaro.org>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
> ---
>
> Notes:
> v2:
> - Used Vincent's approach.
> v3:
> - Updated commit message.
> - Added Reviewed-by tags
>
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d7a3c63a2171..9481b8cff31b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9060,7 +9060,7 @@ static int detach_tasks(struct lb_env *env)
> case migrate_util:
> util = task_util_est(p);
>
> - if (util > env->imbalance)
> + if (shr_bound(util, env->sd->nr_balance_failed) > env->imbalance)
> goto next;
>
> env->imbalance -= util;
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 080402007007ca1bed8bcb103625137a5c8446c6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072916-brewing-cavalier-a90a@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
080402007007 ("irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI")
d633da5d3ab1 ("irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs")
fa2dabe57220 ("irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 080402007007ca1bed8bcb103625137a5c8446c6 Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas(a)arm.com>
Date: Fri, 5 Jul 2024 12:54:29 +0100
Subject: [PATCH] irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP
and !ACPI
Compiling the GICv3 driver on arm32 with CONFIG_SMP disabled
(CONFIG_ACPI is not available) generates an unused variable warning for
'broken_rdists'. Add a __maybe_unused attribute to silence the compiler.
Fixes: d633da5d3ab1 ("irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs")
Cc: <stable(a)vger.kernel.org> # .x
Acked-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index cc81515c1413..18619d29b2ed 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -44,7 +44,7 @@
#define GIC_IRQ_TYPE_PARTITION (GIC_IRQ_TYPE_LPI + 1)
-static struct cpumask broken_rdists __read_mostly;
+static struct cpumask broken_rdists __read_mostly __maybe_unused;
struct redist_region {
void __iomem *redist_base;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072918-bauble-crock-c0f3@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072917-footsie-stopwatch-c5d7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072916-grill-deniable-ee28@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072916-area-amusing-82db@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072915-childless-spherical-32e3@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072914-strewn-upheld-6833@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 5302d1a06a2cd9855378122a07c9e0942f0f04a9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072908-unheard-ozone-f012@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
5302d1a06a2c ("drm/amd/display: Remove ASSERT if significance is zero in math_ceil2")
70839da63605 ("drm/amd/display: Add new DCN401 sources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5302d1a06a2cd9855378122a07c9e0942f0f04a9 Mon Sep 17 00:00:00 2001
From: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Date: Thu, 4 Jul 2024 11:54:34 -0600
Subject: [PATCH] drm/amd/display: Remove ASSERT if significance is zero in
math_ceil2
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira(a)amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
index defe13436a2c..e73579f1a88e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_standalone_libraries/lib_float_math.c
@@ -64,8 +64,6 @@ double math_ceil(const double arg)
double math_ceil2(const double arg, const double significance)
{
- ASSERT(significance != 0);
-
return ((int)(arg / significance + 0.99999)) * significance;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072942-sedan-dingo-ba07@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072941-bogged-stingray-2127@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072941-uncross-untrained-affb@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072940-cryptic-starter-05a2@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072939-deuce-disaster-61c1@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072939-creole-geologist-da49@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
7fcf755896a3 ("drm/i915/display: use intel_encoder_is/to_* functions")
0a099232d254 ("drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state()")
684a37a6ffa9 ("drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active()")
65ea19a698f2 ("drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin()")
6a8c66bf0e56 ("drm/i915: Don't explode when the dig port we don't have an AUX CH")
d5c7854b50e6 ("drm/i915/xe2lpd: Move D2D enable/disable")
2e4b90fbe755 ("drm/i915: Filter out glitches on HPD lines during hotplug detection")
31a5b6ed88c7 ("drm/i915/display: Unify VSC SPD preparation")
00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
e11300a1d8e3 ("drm/i915/display: Remove intel_crtc_state->psr_vsc")
561322c3bc14 ("drm/i915/display: Skip state verification with TBT-ALT mode")
7966a93a27cf ("drm/i915: Push audio enable/disable further out")
59be90248b42 ("drm/i915/mtl: C20 state verification")
3257e55d3ea7 ("drm/i915/panelreplay: enable/disable panel replay")
b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
dd8f2298e34b ("drm/i915/psr: Move psr specific dpcd init into own function")
36f579ffc692 ("drm/i915/dp_mst: Improve BW sharing between MST streams")
e37137380931 ("drm/i915/dp_mst: Force modeset CRTC if DSC toggling requires it")
b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x aaf9dc86bd806458f848c39057d59e5aa652a399
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072938-rectangle-freckles-e9c3@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
aaf9dc86bd80 ("drm/i915/display: For MTL+ platforms skip mg dp programming")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From aaf9dc86bd806458f848c39057d59e5aa652a399 Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak(a)intel.com>
Date: Tue, 25 Jun 2024 14:18:40 +0300
Subject: [PATCH] drm/i915/display: For MTL+ platforms skip mg dp programming
For MTL+ platforms we use PICA chips for Type-C support and
hence mg programming is not needed.
Fixes issue with drm warn of TC port not being in legacy mode.
Cc: stable(a)vger.kernel.org
Signed-off-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240625111840.597574-1-mika.…
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index bb13a3ca8c7c..6672fc162c4f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2096,6 +2096,9 @@ icl_program_mg_dp_mode(struct intel_digital_port *dig_port,
u32 ln0, ln1, pin_assignment;
u8 width;
+ if (DISPLAY_VER(dev_priv) >= 14)
+ return;
+
if (!intel_encoder_is_tc(&dig_port->base) ||
intel_tc_port_in_tbt_alt_mode(dig_port))
return;
On 16.07.24 19:33, Alex Deucher wrote:
> This reverts commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 and the
> register changes from commit 6d4279cb99ac4f51d10409501d29969f687ac8dc.
>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3478
> Cc: mikhail.v.gavrilov(a)gmail.com
> Cc: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
> [...]
Hi Greg, quick note, as I don't see the quoted patch in your 6.10-queue
yet: you might want to pick up e3615bd198289f ("drm/amd/display: fix
corruption with high refresh rates on DCN 3.0") [v6.11-rc1, contains
table tag] rather sooner than later for 6.10.y, as the problem
apparently hit a few people and even made the news:
https://lore.kernel.org/all/20240723042420.GA1455@sol.localdomain/https://www.reddit.com/r/archlinux/comments/1eaxjzo/linux_610_causes_screen…
Ciao, Thorsten
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 6807352353561187a718e87204458999dbcbba1b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072909-turbine-disparity-9044@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
680735235356 ("ipv4: fix source address selection with route leak")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6807352353561187a718e87204458999dbcbba1b Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Wed, 10 Jul 2024 10:14:27 +0200
Subject: [PATCH] ipv4: fix source address selection with route leak
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.
Let's add a check against the output interface and call the appropriate
function to select the source address.
CC: stable(a)vger.kernel.org
Fixes: 8cbb512c923d ("net: Add source address lookup op for VRF")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-2-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f669da98d11d..8956026bc0a2 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -2270,6 +2270,15 @@ void fib_select_path(struct net *net, struct fib_result *res,
fib_select_default(fl4, res);
check_saddr:
- if (!fl4->saddr)
- fl4->saddr = fib_result_prefsrc(net, res);
+ if (!fl4->saddr) {
+ struct net_device *l3mdev;
+
+ l3mdev = dev_get_by_index_rcu(net, fl4->flowi4_l3mdev);
+
+ if (!l3mdev ||
+ l3mdev_master_dev_rcu(FIB_RES_DEV(*res)) == l3mdev)
+ fl4->saddr = fib_result_prefsrc(net, res);
+ else
+ fl4->saddr = inet_select_addr(l3mdev, 0, RT_SCOPE_LINK);
+ }
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6807352353561187a718e87204458999dbcbba1b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072908-desolate-jeeringly-c274@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
680735235356 ("ipv4: fix source address selection with route leak")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6807352353561187a718e87204458999dbcbba1b Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Wed, 10 Jul 2024 10:14:27 +0200
Subject: [PATCH] ipv4: fix source address selection with route leak
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.
Let's add a check against the output interface and call the appropriate
function to select the source address.
CC: stable(a)vger.kernel.org
Fixes: 8cbb512c923d ("net: Add source address lookup op for VRF")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-2-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f669da98d11d..8956026bc0a2 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -2270,6 +2270,15 @@ void fib_select_path(struct net *net, struct fib_result *res,
fib_select_default(fl4, res);
check_saddr:
- if (!fl4->saddr)
- fl4->saddr = fib_result_prefsrc(net, res);
+ if (!fl4->saddr) {
+ struct net_device *l3mdev;
+
+ l3mdev = dev_get_by_index_rcu(net, fl4->flowi4_l3mdev);
+
+ if (!l3mdev ||
+ l3mdev_master_dev_rcu(FIB_RES_DEV(*res)) == l3mdev)
+ fl4->saddr = fib_result_prefsrc(net, res);
+ else
+ fl4->saddr = inet_select_addr(l3mdev, 0, RT_SCOPE_LINK);
+ }
}
Since the configuration of INTx (Legacy Interrupt) is not supported, set
the .map_irq and .swizzle_irq callbacks to NULL. This fixes the error:
of_irq_parse_pci: failed with rc=-22
when the pcieport driver attempts to setup INTx for the Host Bridge via
"pci_assign_irq()". The device-tree node of the Host Bridge doesn't
contain Legacy Interrupts. As a result, Legacy Interrupts are searched for
in the MSI Interrupt Parent of the Host Bridge with the help of
"of_irq_parse_raw()". Since the MSI Interrupt Parent of the Host Bridge
uses 3 interrupt cells while Legacy Interrupts only use 1, the search
for Legacy Interrupts is terminated prematurely by the "of_irq_parse_raw()"
function with the -EINVAL error code (rc=-22).
Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver")
Cc: stable(a)vger.kernel.org
Reported-by: Andrew Halaney <ahalaney(a)redhat.com>
Tested-by: Andrew Halaney <ahalaney(a)redhat.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
Hello,
This patch is based on commit
1722389b0d86 Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
of Mainline Linux.
v1:
https://lore.kernel.org/r/20240724065048.285838-1-s-vadapalli@ti.com
Changes since v1:
- Added "Cc: stable(a)vger.kernel.org" in the commit message.
- Based on Bjorn's feedback at:
https://lore.kernel.org/r/20240724162304.GA802428@bhelgaas/
the $subject and commit message have been updated. Additionally, the
comment in the driver has also been updated to specify "INTx" instead of
"Legacy Interrupts".
- Collected Tested-by tag from Andrew Halaney <ahalaney(a)redhat.com>:
https://lore.kernel.org/r/vj6vtjphpmqv6hcblaalr2m4bwjujjwvom6ca4bjdzcmgazya…
Patch has been tested on J721e-EVM and J784S4-EVM.
Regards,
Siddharth.
drivers/pci/controller/cadence/pci-j721e.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
index 85718246016b..eaa6cfeb03c7 100644
--- a/drivers/pci/controller/cadence/pci-j721e.c
+++ b/drivers/pci/controller/cadence/pci-j721e.c
@@ -417,6 +417,10 @@ static int j721e_pcie_probe(struct platform_device *pdev)
if (!bridge)
return -ENOMEM;
+ /* INTx is not supported */
+ bridge->map_irq = NULL;
+ bridge->swizzle_irq = NULL;
+
if (!data->byte_access_allowed)
bridge->ops = &cdns_ti_pcie_host_ops;
rc = pci_host_bridge_priv(bridge);
--
2.40.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072929-ocelot-buccaneer-e755@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
0889f44e2810 ("ipc: Check permissions for checkpoint_restart sysctls at open time")
1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
dc55e35f9e81 ("ipc: Store mqueue sysctls in the ipc namespace")
0e9beb8a96f2 ("ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL")
5563cabdde7e ("ipc: check checkpoint_restore_ns_capable() to modify C/R proc files")
32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
2374c09b1c8a ("sysctl: remove all extern declaration from sysctl.c")
26363af56434 ("mm: remove watermark_boost_factor_sysctl_handler")
6923aa0d8c62 ("mm/compaction: Disable compact_unevictable_allowed on RT")
964b692daf30 ("mm/compaction: really limit compact_unevictable_allowed to 0 and 1")
eaee41727e6d ("sysctl/sysrq: Remove __sysrq_enabled copy")
0a6cad5df541 ("Merge branch 'vmwgfx-coherent' of git://people.freedesktop.org/~thomash/linux into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072928-bunny-banana-9335@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
0889f44e2810 ("ipc: Check permissions for checkpoint_restart sysctls at open time")
1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
dc55e35f9e81 ("ipc: Store mqueue sysctls in the ipc namespace")
0e9beb8a96f2 ("ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL")
5563cabdde7e ("ipc: check checkpoint_restore_ns_capable() to modify C/R proc files")
32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
2374c09b1c8a ("sysctl: remove all extern declaration from sysctl.c")
26363af56434 ("mm: remove watermark_boost_factor_sysctl_handler")
6923aa0d8c62 ("mm/compaction: Disable compact_unevictable_allowed on RT")
964b692daf30 ("mm/compaction: really limit compact_unevictable_allowed to 0 and 1")
eaee41727e6d ("sysctl/sysrq: Remove __sysrq_enabled copy")
0a6cad5df541 ("Merge branch 'vmwgfx-coherent' of git://people.freedesktop.org/~thomash/linux into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072927-tricolor-backfield-1bfe@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
0889f44e2810 ("ipc: Check permissions for checkpoint_restart sysctls at open time")
1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
dc55e35f9e81 ("ipc: Store mqueue sysctls in the ipc namespace")
0e9beb8a96f2 ("ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL")
5563cabdde7e ("ipc: check checkpoint_restore_ns_capable() to modify C/R proc files")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072920-anytime-unfailing-5f2c@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072921-steadfast-bulk-c488@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
0889f44e2810 ("ipc: Check permissions for checkpoint_restart sysctls at open time")
1f5c135ee509 ("ipc: Store ipc sysctls in the ipc namespace")
dc55e35f9e81 ("ipc: Store mqueue sysctls in the ipc namespace")
0e9beb8a96f2 ("ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL")
5563cabdde7e ("ipc: check checkpoint_restore_ns_capable() to modify C/R proc files")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072920-unmarked-likewise-acb2@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
98ca62ba9e2b ("sysctl: always initialize i_uid/i_gid")
520713a93d55 ("sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)")
f9436a5d0497 ("sysctl: allow to change limits for posix messages queues")
50ec499b9a43 ("sysctl: allow change system v ipc sysctls inside ipc namespace")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 98ca62ba9e2be5863c7d069f84f7166b45a5b2f4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Wei=C3=9Fschuh?= <linux(a)weissschuh.net>
Date: Tue, 2 Apr 2024 23:10:34 +0200
Subject: [PATCH] sysctl: always initialize i_uid/i_gid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always initialize i_uid/i_gid inside the sysfs core so set_ownership()
can safely skip setting them.
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of
i_uid/i_gid on /proc/sys inodes.") added defaults for i_uid/i_gid when
set_ownership() was not implemented. It also missed adjusting
net_ctl_set_ownership() to use the same default values in case the
computation of a better value failed.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Signed-off-by: Joel Granados <j.granados(a)samsung.com>
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b1c2c0b82116..dd7b462387a0 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -476,12 +476,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 28ab9769117ca944cb6eb537af5599aa436287a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072908-kissing-cornbread-2dc7@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
f2b1e9c6f867 ("scsi: core: Introduce scsi_build_sense()")
8148dfba29e7 ("scsi: 3w-xxxx: Whitespace cleanup")
96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers")
da255e2e7cc8 ("scsi: lpfc: Convert SCSI path to use common I/O submission path")
47ff4c510f02 ("scsi: lpfc: Enable common send_io interface for SCSI and NVMe")
307e338097dc ("scsi: lpfc: Rework remote port ref counting and node freeing")
7af29d455362 ("scsi: lpfc: Fix-up around 120 documentation issues")
372c187b8a70 ("scsi: lpfc: Add an internal trace log buffer")
317aeb83c92b ("scsi: lpfc: Add blk_io_poll support for latency improvment")
f0020e428af7 ("scsi: lpfc: Add support to display if adapter dumps are available")
86ee57a97a17 ("scsi: lpfc: Fix kdump hang on PPC")
818dbde78e0f ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 28ab9769117ca944cb6eb537af5599aa436287a4 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:31 +0000
Subject: [PATCH] ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no
error
SAT-5 revision 8 specification removed the text about the ANSI INCITS
431-2007 compliance which was requiring SCSI/ATA Translation (SAT) to
return descriptor format sense data for the ATA PASS-THROUGH commands
regardless of the setting of the D_SENSE bit.
Let's honor the D_SENSE bit for ATA PASS-THROUGH commands while
generating the "ATA PASS-THROUGH INFORMATION AVAILABLE" sense data.
SAT-5 revision 7
================
12.2.2.8 Fixed format sense data
Table 212 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands. SATLs compliant with ANSI
INCITS 431-2007, SCSI/ATA Translation (SAT) return descriptor format
sense data for the ATA PASS-THROUGH commands regardless of the setting
of the D_SENSE bit.
SAT-5 revision 8
================
12.2.2.8 Fixed format sense data
Table 211 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands.
Cc: stable(a)vger.kernel.org # 4.19+
Reported-by: Niklas Cassel <cassel(a)kernel.org>
Closes: https://lore.kernel.org/linux-ide/Zn1WUhmLglM4iais@ryzen.lan
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Link: https://lore.kernel.org/r/20240702024735.1152293-4-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a9d2a1cf4c3d..fdc2a3b59fc1 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -941,11 +941,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
&sense_key, &asc, &ascq);
ata_scsi_set_sense(qc->dev, cmd, sense_key, asc, ascq);
} else {
- /*
- * ATA PASS-THROUGH INFORMATION AVAILABLE
- * Always in descriptor format sense.
- */
- scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
+ /* ATA PASS-THROUGH INFORMATION AVAILABLE */
+ ata_scsi_set_sense(qc->dev, cmd, RECOVERED_ERROR, 0, 0x1D);
}
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 28ab9769117ca944cb6eb537af5599aa436287a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072907-pronounce-drop-down-3db9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
f2b1e9c6f867 ("scsi: core: Introduce scsi_build_sense()")
8148dfba29e7 ("scsi: 3w-xxxx: Whitespace cleanup")
96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers")
da255e2e7cc8 ("scsi: lpfc: Convert SCSI path to use common I/O submission path")
47ff4c510f02 ("scsi: lpfc: Enable common send_io interface for SCSI and NVMe")
307e338097dc ("scsi: lpfc: Rework remote port ref counting and node freeing")
7af29d455362 ("scsi: lpfc: Fix-up around 120 documentation issues")
372c187b8a70 ("scsi: lpfc: Add an internal trace log buffer")
317aeb83c92b ("scsi: lpfc: Add blk_io_poll support for latency improvment")
f0020e428af7 ("scsi: lpfc: Add support to display if adapter dumps are available")
86ee57a97a17 ("scsi: lpfc: Fix kdump hang on PPC")
818dbde78e0f ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 28ab9769117ca944cb6eb537af5599aa436287a4 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:31 +0000
Subject: [PATCH] ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no
error
SAT-5 revision 8 specification removed the text about the ANSI INCITS
431-2007 compliance which was requiring SCSI/ATA Translation (SAT) to
return descriptor format sense data for the ATA PASS-THROUGH commands
regardless of the setting of the D_SENSE bit.
Let's honor the D_SENSE bit for ATA PASS-THROUGH commands while
generating the "ATA PASS-THROUGH INFORMATION AVAILABLE" sense data.
SAT-5 revision 7
================
12.2.2.8 Fixed format sense data
Table 212 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands. SATLs compliant with ANSI
INCITS 431-2007, SCSI/ATA Translation (SAT) return descriptor format
sense data for the ATA PASS-THROUGH commands regardless of the setting
of the D_SENSE bit.
SAT-5 revision 8
================
12.2.2.8 Fixed format sense data
Table 211 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands.
Cc: stable(a)vger.kernel.org # 4.19+
Reported-by: Niklas Cassel <cassel(a)kernel.org>
Closes: https://lore.kernel.org/linux-ide/Zn1WUhmLglM4iais@ryzen.lan
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Link: https://lore.kernel.org/r/20240702024735.1152293-4-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a9d2a1cf4c3d..fdc2a3b59fc1 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -941,11 +941,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
&sense_key, &asc, &ascq);
ata_scsi_set_sense(qc->dev, cmd, sense_key, asc, ascq);
} else {
- /*
- * ATA PASS-THROUGH INFORMATION AVAILABLE
- * Always in descriptor format sense.
- */
- scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
+ /* ATA PASS-THROUGH INFORMATION AVAILABLE */
+ ata_scsi_set_sense(qc->dev, cmd, RECOVERED_ERROR, 0, 0x1D);
}
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 28ab9769117ca944cb6eb537af5599aa436287a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072906-drapery-glimpse-fc09@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
f2b1e9c6f867 ("scsi: core: Introduce scsi_build_sense()")
8148dfba29e7 ("scsi: 3w-xxxx: Whitespace cleanup")
96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers")
da255e2e7cc8 ("scsi: lpfc: Convert SCSI path to use common I/O submission path")
47ff4c510f02 ("scsi: lpfc: Enable common send_io interface for SCSI and NVMe")
307e338097dc ("scsi: lpfc: Rework remote port ref counting and node freeing")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 28ab9769117ca944cb6eb537af5599aa436287a4 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:31 +0000
Subject: [PATCH] ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no
error
SAT-5 revision 8 specification removed the text about the ANSI INCITS
431-2007 compliance which was requiring SCSI/ATA Translation (SAT) to
return descriptor format sense data for the ATA PASS-THROUGH commands
regardless of the setting of the D_SENSE bit.
Let's honor the D_SENSE bit for ATA PASS-THROUGH commands while
generating the "ATA PASS-THROUGH INFORMATION AVAILABLE" sense data.
SAT-5 revision 7
================
12.2.2.8 Fixed format sense data
Table 212 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands. SATLs compliant with ANSI
INCITS 431-2007, SCSI/ATA Translation (SAT) return descriptor format
sense data for the ATA PASS-THROUGH commands regardless of the setting
of the D_SENSE bit.
SAT-5 revision 8
================
12.2.2.8 Fixed format sense data
Table 211 shows the fields returned in the fixed format sense data
(see SPC-5) for ATA PASS-THROUGH commands.
Cc: stable(a)vger.kernel.org # 4.19+
Reported-by: Niklas Cassel <cassel(a)kernel.org>
Closes: https://lore.kernel.org/linux-ide/Zn1WUhmLglM4iais@ryzen.lan
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Link: https://lore.kernel.org/r/20240702024735.1152293-4-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a9d2a1cf4c3d..fdc2a3b59fc1 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -941,11 +941,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
&sense_key, &asc, &ascq);
ata_scsi_set_sense(qc->dev, cmd, sense_key, asc, ascq);
} else {
- /*
- * ATA PASS-THROUGH INFORMATION AVAILABLE
- * Always in descriptor format sense.
- */
- scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
+ /* ATA PASS-THROUGH INFORMATION AVAILABLE */
+ ata_scsi_set_sense(qc->dev, cmd, RECOVERED_ERROR, 0, 0x1D);
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 97981926224afe17ba3e22e0c2b7dd8b516ee574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072946-resonate-sprung-9a3c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
97981926224a ("ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1")
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
7574a8377c7a ("ata: libata-scsi: do not overwrite SCSI ML and status bytes")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97981926224afe17ba3e22e0c2b7dd8b516ee574 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:30 +0000
Subject: [PATCH] ata: libata-scsi: Do not overwrite valid sense data when
CK_COND=1
Current ata_gen_passthru_sense() code performs two actions:
1. Generates sense data based on the ATA 'status' and ATA 'error' fields.
2. Populates "ATA Status Return sense data descriptor" / "Fixed format
sense data" with ATA taskfile fields.
The problem is that #1 generates sense data even when a valid sense data
is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into
a separate function allows us to generate sense data only when there is
no valid sense data (ATA_QCFLAG_SENSE_VALID is not set).
As a bonus, we can now delete a FIXME comment in atapi_qc_complete()
which states that we don't want to translate taskfile registers into
sense descriptors for ATAPI.
Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because
SAT specification mandates that SATL shall return CHECK CONDITION if
the CK_COND bit is set.
The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard
to read/understand. Improve the readability of the code by moving checks
into self-explanatory boolean variables.
Cc: stable(a)vger.kernel.org # 4.19+
Co-developed-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a0c68b0a00c4..a9d2a1cf4c3d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -230,6 +230,80 @@ void ata_scsi_set_sense_information(struct ata_device *dev,
SCSI_SENSE_BUFFERSIZE, information);
}
+/**
+ * ata_scsi_set_passthru_sense_fields - Set ATA fields in sense buffer
+ * @qc: ATA PASS-THROUGH command.
+ *
+ * Populates "ATA Status Return sense data descriptor" / "Fixed format
+ * sense data" with ATA taskfile fields.
+ *
+ * LOCKING:
+ * None.
+ */
+static void ata_scsi_set_passthru_sense_fields(struct ata_queued_cmd *qc)
+{
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ struct ata_taskfile *tf = &qc->result_tf;
+ unsigned char *sb = cmd->sense_buffer;
+
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
+ u8 len;
+
+ /* descriptor format */
+ len = sb[7];
+ desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
+ if (!desc) {
+ if (SCSI_SENSE_BUFFERSIZE < len + 14)
+ return;
+ sb[7] = len + 14;
+ desc = sb + 8 + len;
+ }
+ desc[0] = 9;
+ desc[1] = 12;
+ /*
+ * Copy registers into sense buffer.
+ */
+ desc[2] = 0x00;
+ desc[3] = tf->error;
+ desc[5] = tf->nsect;
+ desc[7] = tf->lbal;
+ desc[9] = tf->lbam;
+ desc[11] = tf->lbah;
+ desc[12] = tf->device;
+ desc[13] = tf->status;
+
+ /*
+ * Fill in Extend bit, and the high order bytes
+ * if applicable.
+ */
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ desc[2] |= 0x01;
+ desc[4] = tf->hob_nsect;
+ desc[6] = tf->hob_lbal;
+ desc[8] = tf->hob_lbam;
+ desc[10] = tf->hob_lbah;
+ }
+ } else {
+ /* Fixed sense format */
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ sb[8] |= 0x80;
+ if (tf->hob_nsect)
+ sb[8] |= 0x40;
+ if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
+ sb[8] |= 0x20;
+ }
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
+ }
+}
+
static void ata_scsi_set_invalid_field(struct ata_device *dev,
struct scsi_cmnd *cmd, u16 field, u8 bit)
{
@@ -837,10 +911,8 @@ static void ata_to_sense_error(unsigned id, u8 drv_stat, u8 drv_err, u8 *sk,
* ata_gen_passthru_sense - Generate check condition sense block.
* @qc: Command that completed.
*
- * This function is specific to the ATA descriptor format sense
- * block specified for the ATA pass through commands. Regardless
- * of whether the command errored or not, return a sense
- * block. Copy all controller registers into the sense
+ * This function is specific to the ATA pass through commands.
+ * Regardless of whether the command errored or not, return a sense
* block. If there was no error, we get the request from an ATA
* passthrough command, so we use the following sense data:
* sk = RECOVERED ERROR
@@ -875,63 +947,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
*/
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
-
- if ((sb[0] & 0x7f) >= 0x72) {
- unsigned char *desc;
- u8 len;
-
- /* descriptor format */
- len = sb[7];
- desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
- if (!desc) {
- if (SCSI_SENSE_BUFFERSIZE < len + 14)
- return;
- sb[7] = len + 14;
- desc = sb + 8 + len;
- }
- desc[0] = 9;
- desc[1] = 12;
- /*
- * Copy registers into sense buffer.
- */
- desc[2] = 0x00;
- desc[3] = tf->error;
- desc[5] = tf->nsect;
- desc[7] = tf->lbal;
- desc[9] = tf->lbam;
- desc[11] = tf->lbah;
- desc[12] = tf->device;
- desc[13] = tf->status;
-
- /*
- * Fill in Extend bit, and the high order bytes
- * if applicable.
- */
- if (tf->flags & ATA_TFLAG_LBA48) {
- desc[2] |= 0x01;
- desc[4] = tf->hob_nsect;
- desc[6] = tf->hob_lbal;
- desc[8] = tf->hob_lbam;
- desc[10] = tf->hob_lbah;
- }
- } else {
- /* Fixed sense format */
- sb[0] |= 0x80;
- sb[3] = tf->error;
- sb[4] = tf->status;
- sb[5] = tf->device;
- sb[6] = tf->nsect;
- if (tf->flags & ATA_TFLAG_LBA48) {
- sb[8] |= 0x80;
- if (tf->hob_nsect)
- sb[8] |= 0x40;
- if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- sb[8] |= 0x20;
- }
- sb[9] = tf->lbal;
- sb[10] = tf->lbam;
- sb[11] = tf->lbah;
- }
}
/**
@@ -1632,26 +1647,32 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
{
struct scsi_cmnd *cmd = qc->scsicmd;
u8 *cdb = cmd->cmnd;
- int need_sense = (qc->err_mask != 0) &&
- !(qc->flags & ATA_QCFLAG_SENSE_VALID);
+ bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID;
+ bool is_ata_passthru = cdb[0] == ATA_16 || cdb[0] == ATA_12;
+ bool is_ck_cond_request = cdb[2] & 0x20;
+ bool is_error = qc->err_mask != 0;
/* For ATA pass thru (SAT) commands, generate a sense block if
* user mandated it or if there's an error. Note that if we
- * generate because the user forced us to [CK_COND =1], a check
+ * generate because the user forced us to [CK_COND=1], a check
* condition is generated and the ATA register values are returned
* whether the command completed successfully or not. If there
- * was no error, we use the following sense data:
+ * was no error, and CK_COND=1, we use the following sense data:
* sk = RECOVERED ERROR
* asc,ascq = ATA PASS-THROUGH INFORMATION AVAILABLE
*/
- if (((cdb[0] == ATA_16) || (cdb[0] == ATA_12)) &&
- ((cdb[2] & 0x20) || need_sense))
- ata_gen_passthru_sense(qc);
- else if (need_sense)
+ if (is_ata_passthru && (is_ck_cond_request || is_error || have_sense)) {
+ if (!have_sense)
+ ata_gen_passthru_sense(qc);
+ ata_scsi_set_passthru_sense_fields(qc);
+ if (is_ck_cond_request)
+ set_status_byte(qc->scsicmd, SAM_STAT_CHECK_CONDITION);
+ } else if (is_error && !have_sense) {
ata_gen_ata_sense(qc);
- else
+ } else {
/* Keep the SCSI ML and status byte, clear host byte. */
cmd->result &= 0x0000ffff;
+ }
ata_qc_done(qc);
}
@@ -2590,14 +2611,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
/* handle completion from EH */
if (unlikely(err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- /* FIXME: not quite right; we don't want the
- * translation of taskfile registers into a
- * sense descriptors, since that's only
- * correct for ATA, not ATAPI
- */
+ if (!(qc->flags & ATA_QCFLAG_SENSE_VALID))
ata_gen_passthru_sense(qc);
- }
/* SCSI EH automatically locks door if sdev->locked is
* set. Sometimes door lock request continues to
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 97981926224afe17ba3e22e0c2b7dd8b516ee574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072945-gibberish-convene-2781@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
97981926224a ("ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1")
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
7574a8377c7a ("ata: libata-scsi: do not overwrite SCSI ML and status bytes")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97981926224afe17ba3e22e0c2b7dd8b516ee574 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:30 +0000
Subject: [PATCH] ata: libata-scsi: Do not overwrite valid sense data when
CK_COND=1
Current ata_gen_passthru_sense() code performs two actions:
1. Generates sense data based on the ATA 'status' and ATA 'error' fields.
2. Populates "ATA Status Return sense data descriptor" / "Fixed format
sense data" with ATA taskfile fields.
The problem is that #1 generates sense data even when a valid sense data
is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into
a separate function allows us to generate sense data only when there is
no valid sense data (ATA_QCFLAG_SENSE_VALID is not set).
As a bonus, we can now delete a FIXME comment in atapi_qc_complete()
which states that we don't want to translate taskfile registers into
sense descriptors for ATAPI.
Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because
SAT specification mandates that SATL shall return CHECK CONDITION if
the CK_COND bit is set.
The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard
to read/understand. Improve the readability of the code by moving checks
into self-explanatory boolean variables.
Cc: stable(a)vger.kernel.org # 4.19+
Co-developed-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a0c68b0a00c4..a9d2a1cf4c3d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -230,6 +230,80 @@ void ata_scsi_set_sense_information(struct ata_device *dev,
SCSI_SENSE_BUFFERSIZE, information);
}
+/**
+ * ata_scsi_set_passthru_sense_fields - Set ATA fields in sense buffer
+ * @qc: ATA PASS-THROUGH command.
+ *
+ * Populates "ATA Status Return sense data descriptor" / "Fixed format
+ * sense data" with ATA taskfile fields.
+ *
+ * LOCKING:
+ * None.
+ */
+static void ata_scsi_set_passthru_sense_fields(struct ata_queued_cmd *qc)
+{
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ struct ata_taskfile *tf = &qc->result_tf;
+ unsigned char *sb = cmd->sense_buffer;
+
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
+ u8 len;
+
+ /* descriptor format */
+ len = sb[7];
+ desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
+ if (!desc) {
+ if (SCSI_SENSE_BUFFERSIZE < len + 14)
+ return;
+ sb[7] = len + 14;
+ desc = sb + 8 + len;
+ }
+ desc[0] = 9;
+ desc[1] = 12;
+ /*
+ * Copy registers into sense buffer.
+ */
+ desc[2] = 0x00;
+ desc[3] = tf->error;
+ desc[5] = tf->nsect;
+ desc[7] = tf->lbal;
+ desc[9] = tf->lbam;
+ desc[11] = tf->lbah;
+ desc[12] = tf->device;
+ desc[13] = tf->status;
+
+ /*
+ * Fill in Extend bit, and the high order bytes
+ * if applicable.
+ */
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ desc[2] |= 0x01;
+ desc[4] = tf->hob_nsect;
+ desc[6] = tf->hob_lbal;
+ desc[8] = tf->hob_lbam;
+ desc[10] = tf->hob_lbah;
+ }
+ } else {
+ /* Fixed sense format */
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ sb[8] |= 0x80;
+ if (tf->hob_nsect)
+ sb[8] |= 0x40;
+ if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
+ sb[8] |= 0x20;
+ }
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
+ }
+}
+
static void ata_scsi_set_invalid_field(struct ata_device *dev,
struct scsi_cmnd *cmd, u16 field, u8 bit)
{
@@ -837,10 +911,8 @@ static void ata_to_sense_error(unsigned id, u8 drv_stat, u8 drv_err, u8 *sk,
* ata_gen_passthru_sense - Generate check condition sense block.
* @qc: Command that completed.
*
- * This function is specific to the ATA descriptor format sense
- * block specified for the ATA pass through commands. Regardless
- * of whether the command errored or not, return a sense
- * block. Copy all controller registers into the sense
+ * This function is specific to the ATA pass through commands.
+ * Regardless of whether the command errored or not, return a sense
* block. If there was no error, we get the request from an ATA
* passthrough command, so we use the following sense data:
* sk = RECOVERED ERROR
@@ -875,63 +947,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
*/
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
-
- if ((sb[0] & 0x7f) >= 0x72) {
- unsigned char *desc;
- u8 len;
-
- /* descriptor format */
- len = sb[7];
- desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
- if (!desc) {
- if (SCSI_SENSE_BUFFERSIZE < len + 14)
- return;
- sb[7] = len + 14;
- desc = sb + 8 + len;
- }
- desc[0] = 9;
- desc[1] = 12;
- /*
- * Copy registers into sense buffer.
- */
- desc[2] = 0x00;
- desc[3] = tf->error;
- desc[5] = tf->nsect;
- desc[7] = tf->lbal;
- desc[9] = tf->lbam;
- desc[11] = tf->lbah;
- desc[12] = tf->device;
- desc[13] = tf->status;
-
- /*
- * Fill in Extend bit, and the high order bytes
- * if applicable.
- */
- if (tf->flags & ATA_TFLAG_LBA48) {
- desc[2] |= 0x01;
- desc[4] = tf->hob_nsect;
- desc[6] = tf->hob_lbal;
- desc[8] = tf->hob_lbam;
- desc[10] = tf->hob_lbah;
- }
- } else {
- /* Fixed sense format */
- sb[0] |= 0x80;
- sb[3] = tf->error;
- sb[4] = tf->status;
- sb[5] = tf->device;
- sb[6] = tf->nsect;
- if (tf->flags & ATA_TFLAG_LBA48) {
- sb[8] |= 0x80;
- if (tf->hob_nsect)
- sb[8] |= 0x40;
- if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- sb[8] |= 0x20;
- }
- sb[9] = tf->lbal;
- sb[10] = tf->lbam;
- sb[11] = tf->lbah;
- }
}
/**
@@ -1632,26 +1647,32 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
{
struct scsi_cmnd *cmd = qc->scsicmd;
u8 *cdb = cmd->cmnd;
- int need_sense = (qc->err_mask != 0) &&
- !(qc->flags & ATA_QCFLAG_SENSE_VALID);
+ bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID;
+ bool is_ata_passthru = cdb[0] == ATA_16 || cdb[0] == ATA_12;
+ bool is_ck_cond_request = cdb[2] & 0x20;
+ bool is_error = qc->err_mask != 0;
/* For ATA pass thru (SAT) commands, generate a sense block if
* user mandated it or if there's an error. Note that if we
- * generate because the user forced us to [CK_COND =1], a check
+ * generate because the user forced us to [CK_COND=1], a check
* condition is generated and the ATA register values are returned
* whether the command completed successfully or not. If there
- * was no error, we use the following sense data:
+ * was no error, and CK_COND=1, we use the following sense data:
* sk = RECOVERED ERROR
* asc,ascq = ATA PASS-THROUGH INFORMATION AVAILABLE
*/
- if (((cdb[0] == ATA_16) || (cdb[0] == ATA_12)) &&
- ((cdb[2] & 0x20) || need_sense))
- ata_gen_passthru_sense(qc);
- else if (need_sense)
+ if (is_ata_passthru && (is_ck_cond_request || is_error || have_sense)) {
+ if (!have_sense)
+ ata_gen_passthru_sense(qc);
+ ata_scsi_set_passthru_sense_fields(qc);
+ if (is_ck_cond_request)
+ set_status_byte(qc->scsicmd, SAM_STAT_CHECK_CONDITION);
+ } else if (is_error && !have_sense) {
ata_gen_ata_sense(qc);
- else
+ } else {
/* Keep the SCSI ML and status byte, clear host byte. */
cmd->result &= 0x0000ffff;
+ }
ata_qc_done(qc);
}
@@ -2590,14 +2611,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
/* handle completion from EH */
if (unlikely(err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- /* FIXME: not quite right; we don't want the
- * translation of taskfile registers into a
- * sense descriptors, since that's only
- * correct for ATA, not ATAPI
- */
+ if (!(qc->flags & ATA_QCFLAG_SENSE_VALID))
ata_gen_passthru_sense(qc);
- }
/* SCSI EH automatically locks door if sdev->locked is
* set. Sometimes door lock request continues to
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 97981926224afe17ba3e22e0c2b7dd8b516ee574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072944-grating-modular-edd2@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
97981926224a ("ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1")
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
7574a8377c7a ("ata: libata-scsi: do not overwrite SCSI ML and status bytes")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97981926224afe17ba3e22e0c2b7dd8b516ee574 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:30 +0000
Subject: [PATCH] ata: libata-scsi: Do not overwrite valid sense data when
CK_COND=1
Current ata_gen_passthru_sense() code performs two actions:
1. Generates sense data based on the ATA 'status' and ATA 'error' fields.
2. Populates "ATA Status Return sense data descriptor" / "Fixed format
sense data" with ATA taskfile fields.
The problem is that #1 generates sense data even when a valid sense data
is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into
a separate function allows us to generate sense data only when there is
no valid sense data (ATA_QCFLAG_SENSE_VALID is not set).
As a bonus, we can now delete a FIXME comment in atapi_qc_complete()
which states that we don't want to translate taskfile registers into
sense descriptors for ATAPI.
Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because
SAT specification mandates that SATL shall return CHECK CONDITION if
the CK_COND bit is set.
The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard
to read/understand. Improve the readability of the code by moving checks
into self-explanatory boolean variables.
Cc: stable(a)vger.kernel.org # 4.19+
Co-developed-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a0c68b0a00c4..a9d2a1cf4c3d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -230,6 +230,80 @@ void ata_scsi_set_sense_information(struct ata_device *dev,
SCSI_SENSE_BUFFERSIZE, information);
}
+/**
+ * ata_scsi_set_passthru_sense_fields - Set ATA fields in sense buffer
+ * @qc: ATA PASS-THROUGH command.
+ *
+ * Populates "ATA Status Return sense data descriptor" / "Fixed format
+ * sense data" with ATA taskfile fields.
+ *
+ * LOCKING:
+ * None.
+ */
+static void ata_scsi_set_passthru_sense_fields(struct ata_queued_cmd *qc)
+{
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ struct ata_taskfile *tf = &qc->result_tf;
+ unsigned char *sb = cmd->sense_buffer;
+
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
+ u8 len;
+
+ /* descriptor format */
+ len = sb[7];
+ desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
+ if (!desc) {
+ if (SCSI_SENSE_BUFFERSIZE < len + 14)
+ return;
+ sb[7] = len + 14;
+ desc = sb + 8 + len;
+ }
+ desc[0] = 9;
+ desc[1] = 12;
+ /*
+ * Copy registers into sense buffer.
+ */
+ desc[2] = 0x00;
+ desc[3] = tf->error;
+ desc[5] = tf->nsect;
+ desc[7] = tf->lbal;
+ desc[9] = tf->lbam;
+ desc[11] = tf->lbah;
+ desc[12] = tf->device;
+ desc[13] = tf->status;
+
+ /*
+ * Fill in Extend bit, and the high order bytes
+ * if applicable.
+ */
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ desc[2] |= 0x01;
+ desc[4] = tf->hob_nsect;
+ desc[6] = tf->hob_lbal;
+ desc[8] = tf->hob_lbam;
+ desc[10] = tf->hob_lbah;
+ }
+ } else {
+ /* Fixed sense format */
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ sb[8] |= 0x80;
+ if (tf->hob_nsect)
+ sb[8] |= 0x40;
+ if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
+ sb[8] |= 0x20;
+ }
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
+ }
+}
+
static void ata_scsi_set_invalid_field(struct ata_device *dev,
struct scsi_cmnd *cmd, u16 field, u8 bit)
{
@@ -837,10 +911,8 @@ static void ata_to_sense_error(unsigned id, u8 drv_stat, u8 drv_err, u8 *sk,
* ata_gen_passthru_sense - Generate check condition sense block.
* @qc: Command that completed.
*
- * This function is specific to the ATA descriptor format sense
- * block specified for the ATA pass through commands. Regardless
- * of whether the command errored or not, return a sense
- * block. Copy all controller registers into the sense
+ * This function is specific to the ATA pass through commands.
+ * Regardless of whether the command errored or not, return a sense
* block. If there was no error, we get the request from an ATA
* passthrough command, so we use the following sense data:
* sk = RECOVERED ERROR
@@ -875,63 +947,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
*/
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
-
- if ((sb[0] & 0x7f) >= 0x72) {
- unsigned char *desc;
- u8 len;
-
- /* descriptor format */
- len = sb[7];
- desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
- if (!desc) {
- if (SCSI_SENSE_BUFFERSIZE < len + 14)
- return;
- sb[7] = len + 14;
- desc = sb + 8 + len;
- }
- desc[0] = 9;
- desc[1] = 12;
- /*
- * Copy registers into sense buffer.
- */
- desc[2] = 0x00;
- desc[3] = tf->error;
- desc[5] = tf->nsect;
- desc[7] = tf->lbal;
- desc[9] = tf->lbam;
- desc[11] = tf->lbah;
- desc[12] = tf->device;
- desc[13] = tf->status;
-
- /*
- * Fill in Extend bit, and the high order bytes
- * if applicable.
- */
- if (tf->flags & ATA_TFLAG_LBA48) {
- desc[2] |= 0x01;
- desc[4] = tf->hob_nsect;
- desc[6] = tf->hob_lbal;
- desc[8] = tf->hob_lbam;
- desc[10] = tf->hob_lbah;
- }
- } else {
- /* Fixed sense format */
- sb[0] |= 0x80;
- sb[3] = tf->error;
- sb[4] = tf->status;
- sb[5] = tf->device;
- sb[6] = tf->nsect;
- if (tf->flags & ATA_TFLAG_LBA48) {
- sb[8] |= 0x80;
- if (tf->hob_nsect)
- sb[8] |= 0x40;
- if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- sb[8] |= 0x20;
- }
- sb[9] = tf->lbal;
- sb[10] = tf->lbam;
- sb[11] = tf->lbah;
- }
}
/**
@@ -1632,26 +1647,32 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
{
struct scsi_cmnd *cmd = qc->scsicmd;
u8 *cdb = cmd->cmnd;
- int need_sense = (qc->err_mask != 0) &&
- !(qc->flags & ATA_QCFLAG_SENSE_VALID);
+ bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID;
+ bool is_ata_passthru = cdb[0] == ATA_16 || cdb[0] == ATA_12;
+ bool is_ck_cond_request = cdb[2] & 0x20;
+ bool is_error = qc->err_mask != 0;
/* For ATA pass thru (SAT) commands, generate a sense block if
* user mandated it or if there's an error. Note that if we
- * generate because the user forced us to [CK_COND =1], a check
+ * generate because the user forced us to [CK_COND=1], a check
* condition is generated and the ATA register values are returned
* whether the command completed successfully or not. If there
- * was no error, we use the following sense data:
+ * was no error, and CK_COND=1, we use the following sense data:
* sk = RECOVERED ERROR
* asc,ascq = ATA PASS-THROUGH INFORMATION AVAILABLE
*/
- if (((cdb[0] == ATA_16) || (cdb[0] == ATA_12)) &&
- ((cdb[2] & 0x20) || need_sense))
- ata_gen_passthru_sense(qc);
- else if (need_sense)
+ if (is_ata_passthru && (is_ck_cond_request || is_error || have_sense)) {
+ if (!have_sense)
+ ata_gen_passthru_sense(qc);
+ ata_scsi_set_passthru_sense_fields(qc);
+ if (is_ck_cond_request)
+ set_status_byte(qc->scsicmd, SAM_STAT_CHECK_CONDITION);
+ } else if (is_error && !have_sense) {
ata_gen_ata_sense(qc);
- else
+ } else {
/* Keep the SCSI ML and status byte, clear host byte. */
cmd->result &= 0x0000ffff;
+ }
ata_qc_done(qc);
}
@@ -2590,14 +2611,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
/* handle completion from EH */
if (unlikely(err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- /* FIXME: not quite right; we don't want the
- * translation of taskfile registers into a
- * sense descriptors, since that's only
- * correct for ATA, not ATAPI
- */
+ if (!(qc->flags & ATA_QCFLAG_SENSE_VALID))
ata_gen_passthru_sense(qc);
- }
/* SCSI EH automatically locks door if sdev->locked is
* set. Sometimes door lock request continues to
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 97981926224afe17ba3e22e0c2b7dd8b516ee574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072943-backup-snaking-6c85@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
97981926224a ("ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1")
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
7574a8377c7a ("ata: libata-scsi: do not overwrite SCSI ML and status bytes")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97981926224afe17ba3e22e0c2b7dd8b516ee574 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:30 +0000
Subject: [PATCH] ata: libata-scsi: Do not overwrite valid sense data when
CK_COND=1
Current ata_gen_passthru_sense() code performs two actions:
1. Generates sense data based on the ATA 'status' and ATA 'error' fields.
2. Populates "ATA Status Return sense data descriptor" / "Fixed format
sense data" with ATA taskfile fields.
The problem is that #1 generates sense data even when a valid sense data
is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into
a separate function allows us to generate sense data only when there is
no valid sense data (ATA_QCFLAG_SENSE_VALID is not set).
As a bonus, we can now delete a FIXME comment in atapi_qc_complete()
which states that we don't want to translate taskfile registers into
sense descriptors for ATAPI.
Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because
SAT specification mandates that SATL shall return CHECK CONDITION if
the CK_COND bit is set.
The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard
to read/understand. Improve the readability of the code by moving checks
into self-explanatory boolean variables.
Cc: stable(a)vger.kernel.org # 4.19+
Co-developed-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a0c68b0a00c4..a9d2a1cf4c3d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -230,6 +230,80 @@ void ata_scsi_set_sense_information(struct ata_device *dev,
SCSI_SENSE_BUFFERSIZE, information);
}
+/**
+ * ata_scsi_set_passthru_sense_fields - Set ATA fields in sense buffer
+ * @qc: ATA PASS-THROUGH command.
+ *
+ * Populates "ATA Status Return sense data descriptor" / "Fixed format
+ * sense data" with ATA taskfile fields.
+ *
+ * LOCKING:
+ * None.
+ */
+static void ata_scsi_set_passthru_sense_fields(struct ata_queued_cmd *qc)
+{
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ struct ata_taskfile *tf = &qc->result_tf;
+ unsigned char *sb = cmd->sense_buffer;
+
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
+ u8 len;
+
+ /* descriptor format */
+ len = sb[7];
+ desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
+ if (!desc) {
+ if (SCSI_SENSE_BUFFERSIZE < len + 14)
+ return;
+ sb[7] = len + 14;
+ desc = sb + 8 + len;
+ }
+ desc[0] = 9;
+ desc[1] = 12;
+ /*
+ * Copy registers into sense buffer.
+ */
+ desc[2] = 0x00;
+ desc[3] = tf->error;
+ desc[5] = tf->nsect;
+ desc[7] = tf->lbal;
+ desc[9] = tf->lbam;
+ desc[11] = tf->lbah;
+ desc[12] = tf->device;
+ desc[13] = tf->status;
+
+ /*
+ * Fill in Extend bit, and the high order bytes
+ * if applicable.
+ */
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ desc[2] |= 0x01;
+ desc[4] = tf->hob_nsect;
+ desc[6] = tf->hob_lbal;
+ desc[8] = tf->hob_lbam;
+ desc[10] = tf->hob_lbah;
+ }
+ } else {
+ /* Fixed sense format */
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ sb[8] |= 0x80;
+ if (tf->hob_nsect)
+ sb[8] |= 0x40;
+ if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
+ sb[8] |= 0x20;
+ }
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
+ }
+}
+
static void ata_scsi_set_invalid_field(struct ata_device *dev,
struct scsi_cmnd *cmd, u16 field, u8 bit)
{
@@ -837,10 +911,8 @@ static void ata_to_sense_error(unsigned id, u8 drv_stat, u8 drv_err, u8 *sk,
* ata_gen_passthru_sense - Generate check condition sense block.
* @qc: Command that completed.
*
- * This function is specific to the ATA descriptor format sense
- * block specified for the ATA pass through commands. Regardless
- * of whether the command errored or not, return a sense
- * block. Copy all controller registers into the sense
+ * This function is specific to the ATA pass through commands.
+ * Regardless of whether the command errored or not, return a sense
* block. If there was no error, we get the request from an ATA
* passthrough command, so we use the following sense data:
* sk = RECOVERED ERROR
@@ -875,63 +947,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
*/
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
-
- if ((sb[0] & 0x7f) >= 0x72) {
- unsigned char *desc;
- u8 len;
-
- /* descriptor format */
- len = sb[7];
- desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
- if (!desc) {
- if (SCSI_SENSE_BUFFERSIZE < len + 14)
- return;
- sb[7] = len + 14;
- desc = sb + 8 + len;
- }
- desc[0] = 9;
- desc[1] = 12;
- /*
- * Copy registers into sense buffer.
- */
- desc[2] = 0x00;
- desc[3] = tf->error;
- desc[5] = tf->nsect;
- desc[7] = tf->lbal;
- desc[9] = tf->lbam;
- desc[11] = tf->lbah;
- desc[12] = tf->device;
- desc[13] = tf->status;
-
- /*
- * Fill in Extend bit, and the high order bytes
- * if applicable.
- */
- if (tf->flags & ATA_TFLAG_LBA48) {
- desc[2] |= 0x01;
- desc[4] = tf->hob_nsect;
- desc[6] = tf->hob_lbal;
- desc[8] = tf->hob_lbam;
- desc[10] = tf->hob_lbah;
- }
- } else {
- /* Fixed sense format */
- sb[0] |= 0x80;
- sb[3] = tf->error;
- sb[4] = tf->status;
- sb[5] = tf->device;
- sb[6] = tf->nsect;
- if (tf->flags & ATA_TFLAG_LBA48) {
- sb[8] |= 0x80;
- if (tf->hob_nsect)
- sb[8] |= 0x40;
- if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- sb[8] |= 0x20;
- }
- sb[9] = tf->lbal;
- sb[10] = tf->lbam;
- sb[11] = tf->lbah;
- }
}
/**
@@ -1632,26 +1647,32 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
{
struct scsi_cmnd *cmd = qc->scsicmd;
u8 *cdb = cmd->cmnd;
- int need_sense = (qc->err_mask != 0) &&
- !(qc->flags & ATA_QCFLAG_SENSE_VALID);
+ bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID;
+ bool is_ata_passthru = cdb[0] == ATA_16 || cdb[0] == ATA_12;
+ bool is_ck_cond_request = cdb[2] & 0x20;
+ bool is_error = qc->err_mask != 0;
/* For ATA pass thru (SAT) commands, generate a sense block if
* user mandated it or if there's an error. Note that if we
- * generate because the user forced us to [CK_COND =1], a check
+ * generate because the user forced us to [CK_COND=1], a check
* condition is generated and the ATA register values are returned
* whether the command completed successfully or not. If there
- * was no error, we use the following sense data:
+ * was no error, and CK_COND=1, we use the following sense data:
* sk = RECOVERED ERROR
* asc,ascq = ATA PASS-THROUGH INFORMATION AVAILABLE
*/
- if (((cdb[0] == ATA_16) || (cdb[0] == ATA_12)) &&
- ((cdb[2] & 0x20) || need_sense))
- ata_gen_passthru_sense(qc);
- else if (need_sense)
+ if (is_ata_passthru && (is_ck_cond_request || is_error || have_sense)) {
+ if (!have_sense)
+ ata_gen_passthru_sense(qc);
+ ata_scsi_set_passthru_sense_fields(qc);
+ if (is_ck_cond_request)
+ set_status_byte(qc->scsicmd, SAM_STAT_CHECK_CONDITION);
+ } else if (is_error && !have_sense) {
ata_gen_ata_sense(qc);
- else
+ } else {
/* Keep the SCSI ML and status byte, clear host byte. */
cmd->result &= 0x0000ffff;
+ }
ata_qc_done(qc);
}
@@ -2590,14 +2611,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
/* handle completion from EH */
if (unlikely(err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- /* FIXME: not quite right; we don't want the
- * translation of taskfile registers into a
- * sense descriptors, since that's only
- * correct for ATA, not ATAPI
- */
+ if (!(qc->flags & ATA_QCFLAG_SENSE_VALID))
ata_gen_passthru_sense(qc);
- }
/* SCSI EH automatically locks door if sdev->locked is
* set. Sometimes door lock request continues to
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 97981926224afe17ba3e22e0c2b7dd8b516ee574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072942-ditzy-tropical-a9b7@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
97981926224a ("ata: libata-scsi: Do not overwrite valid sense data when CK_COND=1")
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
7574a8377c7a ("ata: libata-scsi: do not overwrite SCSI ML and status bytes")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97981926224afe17ba3e22e0c2b7dd8b516ee574 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:30 +0000
Subject: [PATCH] ata: libata-scsi: Do not overwrite valid sense data when
CK_COND=1
Current ata_gen_passthru_sense() code performs two actions:
1. Generates sense data based on the ATA 'status' and ATA 'error' fields.
2. Populates "ATA Status Return sense data descriptor" / "Fixed format
sense data" with ATA taskfile fields.
The problem is that #1 generates sense data even when a valid sense data
is already present (ATA_QCFLAG_SENSE_VALID is set). Factoring out #2 into
a separate function allows us to generate sense data only when there is
no valid sense data (ATA_QCFLAG_SENSE_VALID is not set).
As a bonus, we can now delete a FIXME comment in atapi_qc_complete()
which states that we don't want to translate taskfile registers into
sense descriptors for ATAPI.
Additionally, always set SAM_STAT_CHECK_CONDITION when CK_COND=1 because
SAT specification mandates that SATL shall return CHECK CONDITION if
the CK_COND bit is set.
The ATA PASS-THROUGH handling logic in ata_scsi_qc_complete() is hard
to read/understand. Improve the readability of the code by moving checks
into self-explanatory boolean variables.
Cc: stable(a)vger.kernel.org # 4.19+
Co-developed-by: Niklas Cassel <cassel(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-3-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a0c68b0a00c4..a9d2a1cf4c3d 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -230,6 +230,80 @@ void ata_scsi_set_sense_information(struct ata_device *dev,
SCSI_SENSE_BUFFERSIZE, information);
}
+/**
+ * ata_scsi_set_passthru_sense_fields - Set ATA fields in sense buffer
+ * @qc: ATA PASS-THROUGH command.
+ *
+ * Populates "ATA Status Return sense data descriptor" / "Fixed format
+ * sense data" with ATA taskfile fields.
+ *
+ * LOCKING:
+ * None.
+ */
+static void ata_scsi_set_passthru_sense_fields(struct ata_queued_cmd *qc)
+{
+ struct scsi_cmnd *cmd = qc->scsicmd;
+ struct ata_taskfile *tf = &qc->result_tf;
+ unsigned char *sb = cmd->sense_buffer;
+
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
+ u8 len;
+
+ /* descriptor format */
+ len = sb[7];
+ desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
+ if (!desc) {
+ if (SCSI_SENSE_BUFFERSIZE < len + 14)
+ return;
+ sb[7] = len + 14;
+ desc = sb + 8 + len;
+ }
+ desc[0] = 9;
+ desc[1] = 12;
+ /*
+ * Copy registers into sense buffer.
+ */
+ desc[2] = 0x00;
+ desc[3] = tf->error;
+ desc[5] = tf->nsect;
+ desc[7] = tf->lbal;
+ desc[9] = tf->lbam;
+ desc[11] = tf->lbah;
+ desc[12] = tf->device;
+ desc[13] = tf->status;
+
+ /*
+ * Fill in Extend bit, and the high order bytes
+ * if applicable.
+ */
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ desc[2] |= 0x01;
+ desc[4] = tf->hob_nsect;
+ desc[6] = tf->hob_lbal;
+ desc[8] = tf->hob_lbam;
+ desc[10] = tf->hob_lbah;
+ }
+ } else {
+ /* Fixed sense format */
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
+ if (tf->flags & ATA_TFLAG_LBA48) {
+ sb[8] |= 0x80;
+ if (tf->hob_nsect)
+ sb[8] |= 0x40;
+ if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
+ sb[8] |= 0x20;
+ }
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
+ }
+}
+
static void ata_scsi_set_invalid_field(struct ata_device *dev,
struct scsi_cmnd *cmd, u16 field, u8 bit)
{
@@ -837,10 +911,8 @@ static void ata_to_sense_error(unsigned id, u8 drv_stat, u8 drv_err, u8 *sk,
* ata_gen_passthru_sense - Generate check condition sense block.
* @qc: Command that completed.
*
- * This function is specific to the ATA descriptor format sense
- * block specified for the ATA pass through commands. Regardless
- * of whether the command errored or not, return a sense
- * block. Copy all controller registers into the sense
+ * This function is specific to the ATA pass through commands.
+ * Regardless of whether the command errored or not, return a sense
* block. If there was no error, we get the request from an ATA
* passthrough command, so we use the following sense data:
* sk = RECOVERED ERROR
@@ -875,63 +947,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
*/
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
-
- if ((sb[0] & 0x7f) >= 0x72) {
- unsigned char *desc;
- u8 len;
-
- /* descriptor format */
- len = sb[7];
- desc = (char *)scsi_sense_desc_find(sb, len + 8, 9);
- if (!desc) {
- if (SCSI_SENSE_BUFFERSIZE < len + 14)
- return;
- sb[7] = len + 14;
- desc = sb + 8 + len;
- }
- desc[0] = 9;
- desc[1] = 12;
- /*
- * Copy registers into sense buffer.
- */
- desc[2] = 0x00;
- desc[3] = tf->error;
- desc[5] = tf->nsect;
- desc[7] = tf->lbal;
- desc[9] = tf->lbam;
- desc[11] = tf->lbah;
- desc[12] = tf->device;
- desc[13] = tf->status;
-
- /*
- * Fill in Extend bit, and the high order bytes
- * if applicable.
- */
- if (tf->flags & ATA_TFLAG_LBA48) {
- desc[2] |= 0x01;
- desc[4] = tf->hob_nsect;
- desc[6] = tf->hob_lbal;
- desc[8] = tf->hob_lbam;
- desc[10] = tf->hob_lbah;
- }
- } else {
- /* Fixed sense format */
- sb[0] |= 0x80;
- sb[3] = tf->error;
- sb[4] = tf->status;
- sb[5] = tf->device;
- sb[6] = tf->nsect;
- if (tf->flags & ATA_TFLAG_LBA48) {
- sb[8] |= 0x80;
- if (tf->hob_nsect)
- sb[8] |= 0x40;
- if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- sb[8] |= 0x20;
- }
- sb[9] = tf->lbal;
- sb[10] = tf->lbam;
- sb[11] = tf->lbah;
- }
}
/**
@@ -1632,26 +1647,32 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
{
struct scsi_cmnd *cmd = qc->scsicmd;
u8 *cdb = cmd->cmnd;
- int need_sense = (qc->err_mask != 0) &&
- !(qc->flags & ATA_QCFLAG_SENSE_VALID);
+ bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID;
+ bool is_ata_passthru = cdb[0] == ATA_16 || cdb[0] == ATA_12;
+ bool is_ck_cond_request = cdb[2] & 0x20;
+ bool is_error = qc->err_mask != 0;
/* For ATA pass thru (SAT) commands, generate a sense block if
* user mandated it or if there's an error. Note that if we
- * generate because the user forced us to [CK_COND =1], a check
+ * generate because the user forced us to [CK_COND=1], a check
* condition is generated and the ATA register values are returned
* whether the command completed successfully or not. If there
- * was no error, we use the following sense data:
+ * was no error, and CK_COND=1, we use the following sense data:
* sk = RECOVERED ERROR
* asc,ascq = ATA PASS-THROUGH INFORMATION AVAILABLE
*/
- if (((cdb[0] == ATA_16) || (cdb[0] == ATA_12)) &&
- ((cdb[2] & 0x20) || need_sense))
- ata_gen_passthru_sense(qc);
- else if (need_sense)
+ if (is_ata_passthru && (is_ck_cond_request || is_error || have_sense)) {
+ if (!have_sense)
+ ata_gen_passthru_sense(qc);
+ ata_scsi_set_passthru_sense_fields(qc);
+ if (is_ck_cond_request)
+ set_status_byte(qc->scsicmd, SAM_STAT_CHECK_CONDITION);
+ } else if (is_error && !have_sense) {
ata_gen_ata_sense(qc);
- else
+ } else {
/* Keep the SCSI ML and status byte, clear host byte. */
cmd->result &= 0x0000ffff;
+ }
ata_qc_done(qc);
}
@@ -2590,14 +2611,8 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
/* handle completion from EH */
if (unlikely(err_mask || qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- if (!(qc->flags & ATA_QCFLAG_SENSE_VALID)) {
- /* FIXME: not quite right; we don't want the
- * translation of taskfile registers into a
- * sense descriptors, since that's only
- * correct for ATA, not ATAPI
- */
+ if (!(qc->flags & ATA_QCFLAG_SENSE_VALID))
ata_gen_passthru_sense(qc);
- }
/* SCSI EH automatically locks door if sdev->locked is
* set. Sometimes door lock request continues to
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 38dab832c3f4154968f95b267a3bb789e87554b0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072947-shelter-deftly-2238@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
fa82cabb8883 ("doc: admin-guide: Update libata kernel parameters")
2c33bbdac28c ("ata: libata-core: Allow forcing most horkage flags")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38dab832c3f4154968f95b267a3bb789e87554b0 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:29 +0000
Subject: [PATCH] ata: libata-scsi: Fix offsets for the fixed format sense data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Correct the ATA PASS-THROUGH fixed format sense data offsets to conform
to SPC-6 and SAT-5 specifications. Additionally, set the VALID bit to
indicate that the INFORMATION field contains valid information.
INFORMATION
===========
SAT-5 Table 212 — "Fixed format sense data INFORMATION field for the ATA
PASS-THROUGH commands" defines the following format:
+------+------------+
| Byte | Field |
+------+------------+
| 0 | ERROR |
| 1 | STATUS |
| 2 | DEVICE |
| 3 | COUNT(7:0) |
+------+------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that the INFORMATION
field starts at byte 3 in sense buffer resulting in the following offsets
for the ATA PASS-THROUGH commands:
+------------+-------------------------+
| Field | Offset in sense buffer |
+------------+-------------------------+
| ERROR | 3 |
| STATUS | 4 |
| DEVICE | 5 |
| COUNT(7:0) | 6 |
+------------+-------------------------+
COMMAND-SPECIFIC INFORMATION
============================
SAT-5 Table 213 - "Fixed format sense data COMMAND-SPECIFIC INFORMATION
field for ATA PASS-THROUGH" defines the following format:
+------+-------------------+
| Byte | Field |
+------+-------------------+
| 0 | FLAGS | LOG INDEX |
| 1 | LBA (7:0) |
| 2 | LBA (15:8) |
| 3 | LBA (23:16) |
+------+-------------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that
the COMMAND-SPECIFIC-INFORMATION field starts at byte 8
in sense buffer resulting in the following offsets for
the ATA PASS-THROUGH commands:
Offsets of these fields in the fixed sense format are as follows:
+-------------------+-------------------------+
| Field | Offset in sense buffer |
+-------------------+-------------------------+
| FLAGS | LOG INDEX | 8 |
| LBA (7:0) | 9 |
| LBA (15:8) | 10 |
| LBA (23:16) | 11 |
+-------------------+-------------------------+
Reported-by: Akshat Jain <akshatzen(a)google.com>
Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense")
Cc: stable(a)vger.kernel.org
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-2-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index cdf29b178ddc..a0c68b0a00c4 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -855,7 +855,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
struct scsi_cmnd *cmd = qc->scsicmd;
struct ata_taskfile *tf = &qc->result_tf;
unsigned char *sb = cmd->sense_buffer;
- unsigned char *desc = sb + 8;
u8 sense_key, asc, ascq;
memset(sb, 0, SCSI_SENSE_BUFFERSIZE);
@@ -877,7 +876,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
- if ((cmd->sense_buffer[0] & 0x7f) >= 0x72) {
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
u8 len;
/* descriptor format */
@@ -916,21 +916,21 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
}
} else {
/* Fixed sense format */
- desc[0] = tf->error;
- desc[1] = tf->status;
- desc[2] = tf->device;
- desc[3] = tf->nsect;
- desc[7] = 0;
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
if (tf->flags & ATA_TFLAG_LBA48) {
- desc[8] |= 0x80;
+ sb[8] |= 0x80;
if (tf->hob_nsect)
- desc[8] |= 0x40;
+ sb[8] |= 0x40;
if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- desc[8] |= 0x20;
+ sb[8] |= 0x20;
}
- desc[9] = tf->lbal;
- desc[10] = tf->lbam;
- desc[11] = tf->lbah;
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
}
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 38dab832c3f4154968f95b267a3bb789e87554b0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072945-crept-keg-f9ef@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
cb6e73aaadff ("ata: libata-eh: Remove the unneeded result variable")
066de3b9d93b ("ata: libata-core: Simplify ata_build_rw_tf()")
e00923c59e68 ("ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE")
fa82cabb8883 ("doc: admin-guide: Update libata kernel parameters")
2c33bbdac28c ("ata: libata-core: Allow forcing most horkage flags")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38dab832c3f4154968f95b267a3bb789e87554b0 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:29 +0000
Subject: [PATCH] ata: libata-scsi: Fix offsets for the fixed format sense data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Correct the ATA PASS-THROUGH fixed format sense data offsets to conform
to SPC-6 and SAT-5 specifications. Additionally, set the VALID bit to
indicate that the INFORMATION field contains valid information.
INFORMATION
===========
SAT-5 Table 212 — "Fixed format sense data INFORMATION field for the ATA
PASS-THROUGH commands" defines the following format:
+------+------------+
| Byte | Field |
+------+------------+
| 0 | ERROR |
| 1 | STATUS |
| 2 | DEVICE |
| 3 | COUNT(7:0) |
+------+------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that the INFORMATION
field starts at byte 3 in sense buffer resulting in the following offsets
for the ATA PASS-THROUGH commands:
+------------+-------------------------+
| Field | Offset in sense buffer |
+------------+-------------------------+
| ERROR | 3 |
| STATUS | 4 |
| DEVICE | 5 |
| COUNT(7:0) | 6 |
+------------+-------------------------+
COMMAND-SPECIFIC INFORMATION
============================
SAT-5 Table 213 - "Fixed format sense data COMMAND-SPECIFIC INFORMATION
field for ATA PASS-THROUGH" defines the following format:
+------+-------------------+
| Byte | Field |
+------+-------------------+
| 0 | FLAGS | LOG INDEX |
| 1 | LBA (7:0) |
| 2 | LBA (15:8) |
| 3 | LBA (23:16) |
+------+-------------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that
the COMMAND-SPECIFIC-INFORMATION field starts at byte 8
in sense buffer resulting in the following offsets for
the ATA PASS-THROUGH commands:
Offsets of these fields in the fixed sense format are as follows:
+-------------------+-------------------------+
| Field | Offset in sense buffer |
+-------------------+-------------------------+
| FLAGS | LOG INDEX | 8 |
| LBA (7:0) | 9 |
| LBA (15:8) | 10 |
| LBA (23:16) | 11 |
+-------------------+-------------------------+
Reported-by: Akshat Jain <akshatzen(a)google.com>
Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense")
Cc: stable(a)vger.kernel.org
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-2-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index cdf29b178ddc..a0c68b0a00c4 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -855,7 +855,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
struct scsi_cmnd *cmd = qc->scsicmd;
struct ata_taskfile *tf = &qc->result_tf;
unsigned char *sb = cmd->sense_buffer;
- unsigned char *desc = sb + 8;
u8 sense_key, asc, ascq;
memset(sb, 0, SCSI_SENSE_BUFFERSIZE);
@@ -877,7 +876,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
- if ((cmd->sense_buffer[0] & 0x7f) >= 0x72) {
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
u8 len;
/* descriptor format */
@@ -916,21 +916,21 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
}
} else {
/* Fixed sense format */
- desc[0] = tf->error;
- desc[1] = tf->status;
- desc[2] = tf->device;
- desc[3] = tf->nsect;
- desc[7] = 0;
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
if (tf->flags & ATA_TFLAG_LBA48) {
- desc[8] |= 0x80;
+ sb[8] |= 0x80;
if (tf->hob_nsect)
- desc[8] |= 0x40;
+ sb[8] |= 0x40;
if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- desc[8] |= 0x20;
+ sb[8] |= 0x20;
}
- desc[9] = tf->lbal;
- desc[10] = tf->lbam;
- desc[11] = tf->lbah;
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
}
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 38dab832c3f4154968f95b267a3bb789e87554b0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072944-swoosh-muster-ccde@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
38dab832c3f4 ("ata: libata-scsi: Fix offsets for the fixed format sense data")
ff8072d589dc ("ata: libata: remove references to non-existing error_handler()")
3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands")
18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
eafe804bda7b ("scsi: ata: libata: Set read/write commands CDL index")
df60f9c64576 ("scsi: ata: libata: Add ATA feature control sub-page translation")
673b2fe6ff1d ("scsi: ata: libata-scsi: Add support for CDL pages mode sense")
62e4a60e0cdb ("scsi: ata: libata: Detect support for command duration limits")
bc9af4909406 ("ata: libata: Fix FUA handling in ata_build_rw_tf()")
4d2e4980a528 ("ata: libata: cleanup fua support detection")
87aab3c4cd59 ("ata: libata: move NCQ related ATA_DFLAGs")
876293121f24 ("ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH")
b83ad9eec316 ("ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()")
3d8a3ae3d966 ("ata: libata: fix commands incorrectly not getting retried during NCQ error")
4cb7c6f1ef96 ("ata: make use of ata_port_is_frozen() helper")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38dab832c3f4154968f95b267a3bb789e87554b0 Mon Sep 17 00:00:00 2001
From: Igor Pylypiv <ipylypiv(a)google.com>
Date: Tue, 2 Jul 2024 02:47:29 +0000
Subject: [PATCH] ata: libata-scsi: Fix offsets for the fixed format sense data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Correct the ATA PASS-THROUGH fixed format sense data offsets to conform
to SPC-6 and SAT-5 specifications. Additionally, set the VALID bit to
indicate that the INFORMATION field contains valid information.
INFORMATION
===========
SAT-5 Table 212 — "Fixed format sense data INFORMATION field for the ATA
PASS-THROUGH commands" defines the following format:
+------+------------+
| Byte | Field |
+------+------------+
| 0 | ERROR |
| 1 | STATUS |
| 2 | DEVICE |
| 3 | COUNT(7:0) |
+------+------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that the INFORMATION
field starts at byte 3 in sense buffer resulting in the following offsets
for the ATA PASS-THROUGH commands:
+------------+-------------------------+
| Field | Offset in sense buffer |
+------------+-------------------------+
| ERROR | 3 |
| STATUS | 4 |
| DEVICE | 5 |
| COUNT(7:0) | 6 |
+------------+-------------------------+
COMMAND-SPECIFIC INFORMATION
============================
SAT-5 Table 213 - "Fixed format sense data COMMAND-SPECIFIC INFORMATION
field for ATA PASS-THROUGH" defines the following format:
+------+-------------------+
| Byte | Field |
+------+-------------------+
| 0 | FLAGS | LOG INDEX |
| 1 | LBA (7:0) |
| 2 | LBA (15:8) |
| 3 | LBA (23:16) |
+------+-------------------+
SPC-6 Table 48 - "Fixed format sense data" specifies that
the COMMAND-SPECIFIC-INFORMATION field starts at byte 8
in sense buffer resulting in the following offsets for
the ATA PASS-THROUGH commands:
Offsets of these fields in the fixed sense format are as follows:
+-------------------+-------------------------+
| Field | Offset in sense buffer |
+-------------------+-------------------------+
| FLAGS | LOG INDEX | 8 |
| LBA (7:0) | 9 |
| LBA (15:8) | 10 |
| LBA (23:16) | 11 |
+-------------------+-------------------------+
Reported-by: Akshat Jain <akshatzen(a)google.com>
Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense")
Cc: stable(a)vger.kernel.org
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv(a)google.com>
Link: https://lore.kernel.org/r/20240702024735.1152293-2-ipylypiv@google.com
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index cdf29b178ddc..a0c68b0a00c4 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -855,7 +855,6 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
struct scsi_cmnd *cmd = qc->scsicmd;
struct ata_taskfile *tf = &qc->result_tf;
unsigned char *sb = cmd->sense_buffer;
- unsigned char *desc = sb + 8;
u8 sense_key, asc, ascq;
memset(sb, 0, SCSI_SENSE_BUFFERSIZE);
@@ -877,7 +876,8 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
scsi_build_sense(cmd, 1, RECOVERED_ERROR, 0, 0x1D);
}
- if ((cmd->sense_buffer[0] & 0x7f) >= 0x72) {
+ if ((sb[0] & 0x7f) >= 0x72) {
+ unsigned char *desc;
u8 len;
/* descriptor format */
@@ -916,21 +916,21 @@ static void ata_gen_passthru_sense(struct ata_queued_cmd *qc)
}
} else {
/* Fixed sense format */
- desc[0] = tf->error;
- desc[1] = tf->status;
- desc[2] = tf->device;
- desc[3] = tf->nsect;
- desc[7] = 0;
+ sb[0] |= 0x80;
+ sb[3] = tf->error;
+ sb[4] = tf->status;
+ sb[5] = tf->device;
+ sb[6] = tf->nsect;
if (tf->flags & ATA_TFLAG_LBA48) {
- desc[8] |= 0x80;
+ sb[8] |= 0x80;
if (tf->hob_nsect)
- desc[8] |= 0x40;
+ sb[8] |= 0x40;
if (tf->hob_lbal || tf->hob_lbam || tf->hob_lbah)
- desc[8] |= 0x20;
+ sb[8] |= 0x20;
}
- desc[9] = tf->lbal;
- desc[10] = tf->lbam;
- desc[11] = tf->lbah;
+ sb[9] = tf->lbal;
+ sb[10] = tf->lbam;
+ sb[11] = tf->lbah;
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 6807352353561187a718e87204458999dbcbba1b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072923-veteran-backless-4326@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
680735235356 ("ipv4: fix source address selection with route leak")
eba618abacad ("ipv4: Add fib_nh_common to fib_result")
0af7e7c128eb ("ipv4: Update fib_table_lookup tracepoint to take common nexthop")
b75ed8b1aa9c ("ipv4: Rename fib_nh entries")
faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
e4516ef65490 ("ipv4: Create init helper for fib_nh")
331c7a402358 ("ipv4: Move IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN to helper")
8373c6c84e67 ("ipv4: Define fib_get_nhs when CONFIG_IP_ROUTE_MULTIPATH is disabled")
544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
724b509ca023 ("net/mlx5: Add multipath mode")
10a193ed78ad ("net/mlx5: Expose lag operations in header file")
bb19ad0d8d49 ("net/mlx5: Use unsigned int bit instead of bool as a struct member")
97417f6182f8 ("net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation")
d9ee0491c2ff ("net/mlx5e: Use dedicated uplink vport netdev representor")
025380b20dc2 ("net/mlx5e: Use single argument for the esw representor build params helper")
958246664043 ("net/mlx5: Handle LAG FW commands failure gracefully")
7c34ec19e10c ("net/mlx5: Make RoCE and SR-IOV LAG modes explicit")
292612d68c4e ("net/mlx5: Rename mlx5_lag_is_bonded() to __mlx5_lag_is_active()")
eff849b2c669 ("net/mlx5: Allow/disallow LAG according to pre-req only")
3b5ff59fd851 ("net/mlx5: Adjustments for the activate LAG logic to run under sriov")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6807352353561187a718e87204458999dbcbba1b Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Date: Wed, 10 Jul 2024 10:14:27 +0200
Subject: [PATCH] ipv4: fix source address selection with route leak
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.
Let's add a check against the output interface and call the appropriate
function to select the source address.
CC: stable(a)vger.kernel.org
Fixes: 8cbb512c923d ("net: Add source address lookup op for VRF")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-2-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f669da98d11d..8956026bc0a2 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -2270,6 +2270,15 @@ void fib_select_path(struct net *net, struct fib_result *res,
fib_select_default(fl4, res);
check_saddr:
- if (!fl4->saddr)
- fl4->saddr = fib_result_prefsrc(net, res);
+ if (!fl4->saddr) {
+ struct net_device *l3mdev;
+
+ l3mdev = dev_get_by_index_rcu(net, fl4->flowi4_l3mdev);
+
+ if (!l3mdev ||
+ l3mdev_master_dev_rcu(FIB_RES_DEV(*res)) == l3mdev)
+ fl4->saddr = fib_result_prefsrc(net, res);
+ else
+ fl4->saddr = inet_select_addr(l3mdev, 0, RT_SCOPE_LINK);
+ }
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 943ad0b62e3c21f324c4884caa6cb4a871bca05c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072941-nucleus-cannabis-e513@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
943ad0b62e3c ("kernel: rerun task_work while freezing in get_signal()")
f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
9963e444f71e ("sched: Widen TAKS_state literals")
f9fc8cad9728 ("sched: Add TASK_ANY for wait_task_inactive()")
9204a97f7ae8 ("sched: Change wait_task_inactive()s match_state")
1fbcaa923ce2 ("freezer,umh: Clean up freezer/initrd interaction")
5950e5d574c6 ("freezer: Have {,un}lock_system_sleep() save/restore flags")
0b9d46fc5ef7 ("sched: Rename task_running() to task_on_cpu()")
8386c414e27c ("PM: hibernate: defer device probing when resuming from hibernation")
57b6de08b5f6 ("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs")
7b0fe1367ef2 ("ptrace: Document that wait_task_inactive can't fail")
1930a6e739c4 ("Merge tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 943ad0b62e3c21f324c4884caa6cb4a871bca05c Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Wed, 10 Jul 2024 18:58:18 +0100
Subject: [PATCH] kernel: rerun task_work while freezing in get_signal()
io_uring can asynchronously add a task_work while the task is getting
freezed. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
do_freezer_trap(), and since the get_signal()'s relock loop doesn't
retry task_work, the task will spin there not being able to sleep
until the freezing is cancelled / the task is killed / etc.
Run task_works in the freezer path. Keep the patch small and simple
so it can be easily back ported, but we might need to do some cleaning
after and look if there are other places with similar problems.
Cc: stable(a)vger.kernel.org
Link: https://github.com/systemd/systemd/issues/33626
Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: Julian Orth <ju.orth(a)gmail.com>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/89ed3a52933370deaaf61a0a620a6ac91f1e754d.17206341…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..60c737e423a1 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2600,6 +2600,14 @@ static void do_freezer_trap(void)
spin_unlock_irq(¤t->sighand->siglock);
cgroup_enter_frozen();
schedule();
+
+ /*
+ * We could've been woken by task_work, run it to clear
+ * TIF_NOTIFY_SIGNAL. The caller will retry if necessary.
+ */
+ clear_notify_signal();
+ if (unlikely(task_work_pending(current)))
+ task_work_run();
}
static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x e90c369cc2ffcf7145a46448de101f715a1f5584
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072901-rare-engaging-abb3@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
e90c369cc2ff ("thermal/drivers/broadcom: Fix race between removal and clock disable")
f29ecd3748a2 ("thermal: bcm2835: Convert to platform remove callback returning void")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e90c369cc2ffcf7145a46448de101f715a1f5584 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Tue, 9 Jul 2024 14:59:31 +0200
Subject: [PATCH] thermal/drivers/broadcom: Fix race between removal and clock
disable
During the probe, driver enables clocks necessary to access registers
(in get_temp()) and then registers thermal zone with managed-resources
(devm) interface. Removal of device is not done in reversed order,
because:
1. Clock will be disabled in driver remove() callback - thermal zone is
still registered and accessible to users,
2. devm interface will unregister thermal zone.
This leaves short window between (1) and (2) for accessing the
get_temp() callback with disabled clock.
Fix this by enabling clock also via devm-interface, so entire cleanup
path will be in proper, reversed order.
Fixes: 8454c8c09c77 ("thermal/drivers/bcm2835: Remove buggy call to thermal_of_zone_unregister")
Cc: stable(a)vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-1-241644e2b6e0@linaro.o…
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
diff --git a/drivers/thermal/broadcom/bcm2835_thermal.c b/drivers/thermal/broadcom/bcm2835_thermal.c
index 5c1cebe07580..3b1030fc4fbf 100644
--- a/drivers/thermal/broadcom/bcm2835_thermal.c
+++ b/drivers/thermal/broadcom/bcm2835_thermal.c
@@ -185,7 +185,7 @@ static int bcm2835_thermal_probe(struct platform_device *pdev)
return err;
}
- data->clk = devm_clk_get(&pdev->dev, NULL);
+ data->clk = devm_clk_get_enabled(&pdev->dev, NULL);
if (IS_ERR(data->clk)) {
err = PTR_ERR(data->clk);
if (err != -EPROBE_DEFER)
@@ -193,10 +193,6 @@ static int bcm2835_thermal_probe(struct platform_device *pdev)
return err;
}
- err = clk_prepare_enable(data->clk);
- if (err)
- return err;
-
rate = clk_get_rate(data->clk);
if ((rate < 1920000) || (rate > 5000000))
dev_warn(&pdev->dev,
@@ -211,7 +207,7 @@ static int bcm2835_thermal_probe(struct platform_device *pdev)
dev_err(&pdev->dev,
"Failed to register the thermal device: %d\n",
err);
- goto err_clk;
+ return err;
}
/*
@@ -236,7 +232,7 @@ static int bcm2835_thermal_probe(struct platform_device *pdev)
dev_err(&pdev->dev,
"Not able to read trip_temp: %d\n",
err);
- goto err_tz;
+ return err;
}
/* set bandgap reference voltage and enable voltage regulator */
@@ -269,17 +265,11 @@ static int bcm2835_thermal_probe(struct platform_device *pdev)
*/
err = thermal_add_hwmon_sysfs(tz);
if (err)
- goto err_tz;
+ return err;
bcm2835_thermal_debugfs(pdev);
return 0;
-err_tz:
- devm_thermal_of_zone_unregister(&pdev->dev, tz);
-err_clk:
- clk_disable_unprepare(data->clk);
-
- return err;
}
static void bcm2835_thermal_remove(struct platform_device *pdev)
@@ -287,7 +277,6 @@ static void bcm2835_thermal_remove(struct platform_device *pdev)
struct bcm2835_thermal_data *data = platform_get_drvdata(pdev);
debugfs_remove_recursive(data->debugfsdir);
- clk_disable_unprepare(data->clk);
}
static struct platform_driver bcm2835_thermal_driver = {
Please consider the following patches (in this order) for backporting
to v6.6 and newer.
fb318ca0a522295edd6d796fb987e99ec41f0ee5
ae835a96d72cd025421910edb0e8faf706998727
The second patch addresses a regression on older Dell hardware. The
first one is a prerequisite for the second one, and a minor bugfix
itself.
Thanks,
Ard.
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 30d77b7eef019fa4422980806e8b7cdc8674493e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072911-marshland-grab-ced7@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
30d77b7eef01 ("mm/mglru: fix ineffective protection calculation")
3f74e6bd3b84 ("mm/mglru: fix overshooting shrinker memory")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 30d77b7eef019fa4422980806e8b7cdc8674493e Mon Sep 17 00:00:00 2001
From: Yu Zhao <yuzhao(a)google.com>
Date: Fri, 12 Jul 2024 17:29:56 -0600
Subject: [PATCH] mm/mglru: fix ineffective protection calculation
mem_cgroup_calculate_protection() is not stateless and should only be used
as part of a top-down tree traversal. shrink_one() traverses the per-node
memcg LRU instead of the root_mem_cgroup tree, and therefore it should not
call mem_cgroup_calculate_protection().
The existing misuse in shrink_one() can cause ineffective protection of
sub-trees that are grandchildren of root_mem_cgroup. Fix it by reusing
lru_gen_age_node(), which already traverses the root_mem_cgroup tree, to
calculate the protection.
Previously lru_gen_age_node() opportunistically skips the first pass,
i.e., when scan_control->priority is DEF_PRIORITY. On the second pass,
lruvec_is_sizable() uses appropriate scan_control->priority, set by
set_initial_priority() from lru_gen_shrink_node(), to decide whether a
memcg is too small to reclaim from.
Now lru_gen_age_node() unconditionally traverses the root_mem_cgroup tree.
So it should call set_initial_priority() upfront, to make sure
lruvec_is_sizable() uses appropriate scan_control->priority on the first
pass. Otherwise, lruvec_is_reclaimable() can return false negatives and
result in premature OOM kills when min_ttl_ms is used.
Link: https://lkml.kernel.org/r/20240712232956.1427127-1-yuzhao@google.com
Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Reported-by: T.J. Mercier <tjmercier(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6216d79edb7f..525d3ffa8451 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3915,6 +3915,32 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long seq,
* working set protection
******************************************************************************/
+static void set_initial_priority(struct pglist_data *pgdat, struct scan_control *sc)
+{
+ int priority;
+ unsigned long reclaimable;
+
+ if (sc->priority != DEF_PRIORITY || sc->nr_to_reclaim < MIN_LRU_BATCH)
+ return;
+ /*
+ * Determine the initial priority based on
+ * (total >> priority) * reclaimed_to_scanned_ratio = nr_to_reclaim,
+ * where reclaimed_to_scanned_ratio = inactive / total.
+ */
+ reclaimable = node_page_state(pgdat, NR_INACTIVE_FILE);
+ if (can_reclaim_anon_pages(NULL, pgdat->node_id, sc))
+ reclaimable += node_page_state(pgdat, NR_INACTIVE_ANON);
+
+ /* round down reclaimable and round up sc->nr_to_reclaim */
+ priority = fls_long(reclaimable) - 1 - fls_long(sc->nr_to_reclaim - 1);
+
+ /*
+ * The estimation is based on LRU pages only, so cap it to prevent
+ * overshoots of shrinker objects by large margins.
+ */
+ sc->priority = clamp(priority, DEF_PRIORITY / 2, DEF_PRIORITY);
+}
+
static bool lruvec_is_sizable(struct lruvec *lruvec, struct scan_control *sc)
{
int gen, type, zone;
@@ -3948,19 +3974,17 @@ static bool lruvec_is_reclaimable(struct lruvec *lruvec, struct scan_control *sc
struct mem_cgroup *memcg = lruvec_memcg(lruvec);
DEFINE_MIN_SEQ(lruvec);
- /* see the comment on lru_gen_folio */
- gen = lru_gen_from_seq(min_seq[LRU_GEN_FILE]);
- birth = READ_ONCE(lruvec->lrugen.timestamps[gen]);
-
- if (time_is_after_jiffies(birth + min_ttl))
+ if (mem_cgroup_below_min(NULL, memcg))
return false;
if (!lruvec_is_sizable(lruvec, sc))
return false;
- mem_cgroup_calculate_protection(NULL, memcg);
+ /* see the comment on lru_gen_folio */
+ gen = lru_gen_from_seq(min_seq[LRU_GEN_FILE]);
+ birth = READ_ONCE(lruvec->lrugen.timestamps[gen]);
- return !mem_cgroup_below_min(NULL, memcg);
+ return time_is_before_jiffies(birth + min_ttl);
}
/* to protect the working set of the last N jiffies */
@@ -3970,23 +3994,20 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
{
struct mem_cgroup *memcg;
unsigned long min_ttl = READ_ONCE(lru_gen_min_ttl);
+ bool reclaimable = !min_ttl;
VM_WARN_ON_ONCE(!current_is_kswapd());
- /* check the order to exclude compaction-induced reclaim */
- if (!min_ttl || sc->order || sc->priority == DEF_PRIORITY)
- return;
+ set_initial_priority(pgdat, sc);
memcg = mem_cgroup_iter(NULL, NULL, NULL);
do {
struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
- if (lruvec_is_reclaimable(lruvec, sc, min_ttl)) {
- mem_cgroup_iter_break(NULL, memcg);
- return;
- }
+ mem_cgroup_calculate_protection(NULL, memcg);
- cond_resched();
+ if (!reclaimable)
+ reclaimable = lruvec_is_reclaimable(lruvec, sc, min_ttl);
} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
/*
@@ -3994,7 +4015,7 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
* younger than min_ttl. However, another possibility is all memcgs are
* either too small or below min.
*/
- if (mutex_trylock(&oom_lock)) {
+ if (!reclaimable && mutex_trylock(&oom_lock)) {
struct oom_control oc = {
.gfp_mask = sc->gfp_mask,
};
@@ -4786,8 +4807,7 @@ static int shrink_one(struct lruvec *lruvec, struct scan_control *sc)
struct mem_cgroup *memcg = lruvec_memcg(lruvec);
struct pglist_data *pgdat = lruvec_pgdat(lruvec);
- mem_cgroup_calculate_protection(NULL, memcg);
-
+ /* lru_gen_age_node() called mem_cgroup_calculate_protection() */
if (mem_cgroup_below_min(NULL, memcg))
return MEMCG_LRU_YOUNG;
@@ -4911,32 +4931,6 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc
blk_finish_plug(&plug);
}
-static void set_initial_priority(struct pglist_data *pgdat, struct scan_control *sc)
-{
- int priority;
- unsigned long reclaimable;
-
- if (sc->priority != DEF_PRIORITY || sc->nr_to_reclaim < MIN_LRU_BATCH)
- return;
- /*
- * Determine the initial priority based on
- * (total >> priority) * reclaimed_to_scanned_ratio = nr_to_reclaim,
- * where reclaimed_to_scanned_ratio = inactive / total.
- */
- reclaimable = node_page_state(pgdat, NR_INACTIVE_FILE);
- if (can_reclaim_anon_pages(NULL, pgdat->node_id, sc))
- reclaimable += node_page_state(pgdat, NR_INACTIVE_ANON);
-
- /* round down reclaimable and round up sc->nr_to_reclaim */
- priority = fls_long(reclaimable) - 1 - fls_long(sc->nr_to_reclaim - 1);
-
- /*
- * The estimation is based on LRU pages only, so cap it to prevent
- * overshoots of shrinker objects by large margins.
- */
- sc->priority = clamp(priority, DEF_PRIORITY / 2, DEF_PRIORITY);
-}
-
static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *sc)
{
struct blk_plug plug;
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6e49019db5f7a09a9c0e8ac4d108e656c3f8e583
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072937-aloe-fog-7fc4@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
6e49019db5f7 ("mm/migrate: putback split folios when numa hint migration fails")
ee86814b0562 ("mm/migrate: move NUMA hinting fault folio isolation + checks under PTL")
4b88c23ab8c9 ("mm/migrate: make migrate_misplaced_folio() return 0 on success")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6e49019db5f7a09a9c0e8ac4d108e656c3f8e583 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Mon, 8 Jul 2024 17:55:37 -0400
Subject: [PATCH] mm/migrate: putback split folios when numa hint migration
fails
This issue is not from any report yet, but by code observation only.
This is yet another fix besides Hugh's patch [1] but on relevant code
path, where eager split of folio can happen if the folio is already on
deferred list during a folio migration.
Here the issue is NUMA path (migrate_misplaced_folio()) may start to
encounter such folio split now even with MR_NUMA_MISPLACED hint applied.
Then when migrate_pages() didn't migrate all the folios, it's possible the
split small folios be put onto the list instead of the original folio.
Then putting back only the head page won't be enough.
Fix it by putting back all the folios on the list.
[1] https://lore.kernel.org/all/46c948b4-4dd8-6e03-4c7b-ce4e81cfa536@google.com/
[akpm(a)linux-foundation.org: remove now unused local `nr_pages']
Link: https://lkml.kernel.org/r/20240708215537.2630610-1-peterx@redhat.com
Fixes: 7262f208ca68 ("mm/migrate: split source folio if it is on deferred split list")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/migrate.c b/mm/migrate.c
index 6eb9df239230..bdbb5bb04c91 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2621,20 +2621,13 @@ int migrate_misplaced_folio(struct folio *folio, struct vm_area_struct *vma,
int nr_remaining;
unsigned int nr_succeeded;
LIST_HEAD(migratepages);
- int nr_pages = folio_nr_pages(folio);
list_add(&folio->lru, &migratepages);
nr_remaining = migrate_pages(&migratepages, alloc_misplaced_dst_folio,
NULL, node, MIGRATE_ASYNC,
MR_NUMA_MISPLACED, &nr_succeeded);
- if (nr_remaining) {
- if (!list_empty(&migratepages)) {
- list_del(&folio->lru);
- node_stat_mod_folio(folio, NR_ISOLATED_ANON +
- folio_is_file_lru(folio), -nr_pages);
- folio_putback_lru(folio);
- }
- }
+ if (nr_remaining && !list_empty(&migratepages))
+ putback_movable_pages(&migratepages);
if (nr_succeeded) {
count_vm_numa_events(NUMA_PAGE_MIGRATE, nr_succeeded);
if (!node_is_toptier(folio_nid(folio)) && node_is_toptier(node))
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 68ed2a394a0190433ba982b353579075a29099bd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072925-darling-chaplain-8e34@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
68ed2a394a01 ("mm: avoid overflows in dirty throttling logic")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 68ed2a394a0190433ba982b353579075a29099bd Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Fri, 21 Jun 2024 16:42:38 +0200
Subject: [PATCH] mm: avoid overflows in dirty throttling logic
The dirty throttling logic is interspersed with assumptions that dirty
limits in PAGE_SIZE units fit into 32-bit (so that various multiplications
fit into 64-bits). If limits end up being larger, we will hit overflows,
possible divisions by 0 etc. Fix these problems by never allowing so
large dirty limits as they have dubious practical value anyway. For
dirty_bytes / dirty_background_bytes interfaces we can just refuse to set
so large limits. For dirty_ratio / dirty_background_ratio it isn't so
simple as the dirty limit is computed from the amount of available memory
which can change due to memory hotplug etc. So when converting dirty
limits from ratios to numbers of pages, we just don't allow the result to
exceed UINT_MAX.
This is root-only triggerable problem which occurs when the operator
sets dirty limits to >16 TB.
Link: https://lkml.kernel.org/r/20240621144246.11148-2-jack@suse.cz
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Reviewed-By: Zach O'Keefe <zokeefe(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c4aa6e84c20a..acff24e9fae4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -417,13 +417,20 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
else
bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
- if (bg_thresh >= thresh)
- bg_thresh = thresh / 2;
tsk = current;
if (rt_task(tsk)) {
bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
thresh += thresh / 4 + global_wb_domain.dirty_limit / 32;
}
+ /*
+ * Dirty throttling logic assumes the limits in page units fit into
+ * 32-bits. This gives 16TB dirty limits max which is hopefully enough.
+ */
+ if (thresh > UINT_MAX)
+ thresh = UINT_MAX;
+ /* This makes sure bg_thresh is within 32-bits as well */
+ if (bg_thresh >= thresh)
+ bg_thresh = thresh / 2;
dtc->thresh = thresh;
dtc->bg_thresh = bg_thresh;
@@ -473,7 +480,11 @@ static unsigned long node_dirty_limit(struct pglist_data *pgdat)
if (rt_task(tsk))
dirty += dirty / 4;
- return dirty;
+ /*
+ * Dirty throttling logic assumes the limits in page units fit into
+ * 32-bits. This gives 16TB dirty limits max which is hopefully enough.
+ */
+ return min_t(unsigned long, dirty, UINT_MAX);
}
/**
@@ -510,10 +521,17 @@ static int dirty_background_bytes_handler(struct ctl_table *table, int write,
void *buffer, size_t *lenp, loff_t *ppos)
{
int ret;
+ unsigned long old_bytes = dirty_background_bytes;
ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
- if (ret == 0 && write)
+ if (ret == 0 && write) {
+ if (DIV_ROUND_UP(dirty_background_bytes, PAGE_SIZE) >
+ UINT_MAX) {
+ dirty_background_bytes = old_bytes;
+ return -ERANGE;
+ }
dirty_background_ratio = 0;
+ }
return ret;
}
@@ -539,6 +557,10 @@ static int dirty_bytes_handler(struct ctl_table *table, int write,
ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
if (ret == 0 && write && vm_dirty_bytes != old_bytes) {
+ if (DIV_ROUND_UP(vm_dirty_bytes, PAGE_SIZE) > UINT_MAX) {
+ vm_dirty_bytes = old_bytes;
+ return -ERANGE;
+ }
writeback_set_ratelimit();
vm_dirty_ratio = 0;
}
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 00f58104202c472e487f0866fbd38832523fd4f9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072942-compare-unworried-8aec@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
00f58104202c ("mm: fix khugepaged activation policy")
7f83bf14603e ("mm/huge_memory: mark racy access onhuge_anon_orders_always")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 00f58104202c472e487f0866fbd38832523fd4f9 Mon Sep 17 00:00:00 2001
From: Ryan Roberts <ryan.roberts(a)arm.com>
Date: Thu, 4 Jul 2024 10:10:50 +0100
Subject: [PATCH] mm: fix khugepaged activation policy
Since the introduction of mTHP, the docuementation has stated that
khugepaged would be enabled when any mTHP size is enabled, and disabled
when all mTHP sizes are disabled. There are 2 problems with this; 1.
this is not what was implemented by the code and 2. this is not the
desirable behavior.
Desirable behavior is for khugepaged to be enabled when any PMD-sized THP
is enabled, anon or file. (Note that file THP is still controlled by the
top-level control so we must always consider that, as well as the PMD-size
mTHP control for anon). khugepaged only supports collapsing to PMD-sized
THP so there is no value in enabling it when PMD-sized THP is disabled.
So let's change the code and documentation to reflect this policy.
Further, per-size enabled control modification events were not previously
forwarded to khugepaged to give it an opportunity to start or stop.
Consequently the following was resulting in khugepaged eroneously not
being activated:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
[ryan.roberts(a)arm.com: v3]
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240704091051.2411934-1-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redha…
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Lance Yang <ioworker0(a)gmail.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index a1bc9b24e29a..fe237825b95c 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -202,12 +202,11 @@ PMD-mappable transparent hugepage::
cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
-khugepaged will be automatically started when one or more hugepage
-sizes are enabled (either by directly setting "always" or "madvise",
-or by setting "inherit" while the top-level enabled is set to "always"
-or "madvise"), and it'll be automatically shutdown when the last
-hugepage size is disabled (either by directly setting "never", or by
-setting "inherit" while the top-level enabled is set to "never").
+khugepaged will be automatically started when PMD-sized THP is enabled
+(either of the per-size anon control or the top-level control are set
+to "always" or "madvise"), and it'll be automatically shutdown when
+PMD-sized THP is disabled (when both the per-size anon control and the
+top-level control are "never")
Khugepaged controls
-------------------
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index cee3c5da8f0e..acb6ac24a07e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -128,18 +128,6 @@ static inline bool hugepage_global_always(void)
(1<<TRANSPARENT_HUGEPAGE_FLAG);
}
-static inline bool hugepage_flags_enabled(void)
-{
- /*
- * We cover both the anon and the file-backed case here; we must return
- * true if globally enabled, even when all anon sizes are set to never.
- * So we don't need to look at huge_anon_orders_inherit.
- */
- return hugepage_global_enabled() ||
- READ_ONCE(huge_anon_orders_always) ||
- READ_ONCE(huge_anon_orders_madvise);
-}
-
static inline int highest_order(unsigned long orders)
{
return fls_long(orders) - 1;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 17fb072a0ca1..f8b5cbd4dd71 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -502,6 +502,13 @@ static ssize_t thpsize_enabled_store(struct kobject *kobj,
} else
ret = -EINVAL;
+ if (ret > 0) {
+ int err;
+
+ err = start_stop_khugepaged();
+ if (err)
+ ret = err;
+ }
return ret;
}
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 409f67a817f1..a5ec03ef8722 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -413,6 +413,26 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
test_bit(MMF_DISABLE_THP, &mm->flags);
}
+static bool hugepage_pmd_enabled(void)
+{
+ /*
+ * We cover both the anon and the file-backed case here; file-backed
+ * hugepages, when configured in, are determined by the global control.
+ * Anon pmd-sized hugepages are determined by the pmd-size control.
+ */
+ if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+ hugepage_global_enabled())
+ return true;
+ if (test_bit(PMD_ORDER, &huge_anon_orders_always))
+ return true;
+ if (test_bit(PMD_ORDER, &huge_anon_orders_madvise))
+ return true;
+ if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
+ hugepage_global_enabled())
+ return true;
+ return false;
+}
+
void __khugepaged_enter(struct mm_struct *mm)
{
struct khugepaged_mm_slot *mm_slot;
@@ -449,7 +469,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
unsigned long vm_flags)
{
if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
- hugepage_flags_enabled()) {
+ hugepage_pmd_enabled()) {
if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
PMD_ORDER))
__khugepaged_enter(vma->vm_mm);
@@ -2462,8 +2482,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
static int khugepaged_has_work(void)
{
- return !list_empty(&khugepaged_scan.mm_head) &&
- hugepage_flags_enabled();
+ return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled();
}
static int khugepaged_wait_event(void)
@@ -2536,7 +2555,7 @@ static void khugepaged_wait_work(void)
return;
}
- if (hugepage_flags_enabled())
+ if (hugepage_pmd_enabled())
wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
}
@@ -2567,7 +2586,7 @@ static void set_recommended_min_free_kbytes(void)
int nr_zones = 0;
unsigned long recommended_min;
- if (!hugepage_flags_enabled()) {
+ if (!hugepage_pmd_enabled()) {
calculate_min_free_kbytes();
goto update_wmarks;
}
@@ -2617,7 +2636,7 @@ int start_stop_khugepaged(void)
int err = 0;
mutex_lock(&khugepaged_mutex);
- if (hugepage_flags_enabled()) {
+ if (hugepage_pmd_enabled()) {
if (!khugepaged_thread)
khugepaged_thread = kthread_run(khugepaged, NULL,
"khugepaged");
@@ -2643,7 +2662,7 @@ int start_stop_khugepaged(void)
void khugepaged_min_free_kbytes_update(void)
{
mutex_lock(&khugepaged_mutex);
- if (hugepage_flags_enabled() && khugepaged_thread)
+ if (hugepage_pmd_enabled() && khugepaged_thread)
set_recommended_min_free_kbytes();
mutex_unlock(&khugepaged_mutex);
}
I'm announcing the release of the 6.9.12 kernel.
All users of the 6.9 kernel series must upgrade.
The updated 6.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/arm64/boot/dts/qcom/ipq6018.dtsi | 1
arch/arm64/boot/dts/qcom/ipq8074.dtsi | 2 +
arch/arm64/boot/dts/qcom/msm8996.dtsi | 1
arch/arm64/boot/dts/qcom/msm8998.dtsi | 1
arch/arm64/boot/dts/qcom/qrb2210-rb1.dts | 13 +++++++-
arch/arm64/boot/dts/qcom/qrb4210-rb2.dts | 13 +++++++-
arch/arm64/boot/dts/qcom/sc7180.dtsi | 1
arch/arm64/boot/dts/qcom/sc7280.dtsi | 1
arch/arm64/boot/dts/qcom/sdm630.dtsi | 1
arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +
arch/arm64/boot/dts/qcom/sm6115.dtsi | 1
arch/arm64/boot/dts/qcom/sm6350.dtsi | 1
arch/arm64/boot/dts/qcom/x1e80100-crd.dts | 17 ++++++++---
arch/arm64/boot/dts/qcom/x1e80100-qcp.dts | 17 ++++++++---
arch/s390/mm/fault.c | 3 +
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 -
drivers/net/tap.c | 5 +++
drivers/net/tun.c | 3 +
drivers/usb/gadget/function/f_midi2.c | 19 +++++++-----
fs/jfs/xattr.c | 23 ++++++++++++---
fs/locks.c | 9 ++---
fs/ntfs3/fslog.c | 44 ++++++++++++++++++++++++----
fs/ocfs2/dir.c | 46 ++++++++++++++++++------------
sound/core/pcm_dmaengine.c | 6 +++
sound/core/seq/seq_ump_client.c | 16 ++++++++++
sound/pci/hda/patch_realtek.c | 3 +
27 files changed, 198 insertions(+), 55 deletions(-)
Abel Vesa (4):
arm64: dts: qcom: x1e80100-qcp: Fix USB PHYs regulators
arm64: dts: qcom: x1e80100-crd: Fix the PHY regulator for PCIe 6a
arm64: dts: qcom: x1e80100-qcp: Fix the PHY regulator for PCIe 6a
arm64: dts: qcom: x1e80100-crd: Fix USB PHYs regulators
Dan Carpenter (1):
drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()
Dmitry Baryshkov (2):
arm64: dts: qcom: qrb2210-rb1: switch I2C2 to i2c-gpio
arm64: dts: qcom: qrb4210-rb2: switch I2C2 to i2c-gpio
Dongli Zhang (1):
tun: add missing verification for short frame
Edson Juliano Drosdeck (1):
ALSA: hda/realtek: Enable headset mic on Positivo SU C1400
Gerald Schaefer (1):
s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception()
Greg Kroah-Hartman (1):
Linux 6.9.12
Jann Horn (1):
filelock: Fix fcntl/close race recovery compat path
Konstantin Komarov (1):
fs/ntfs3: Add a check for attr_names and oatbl
Krishna Kurapati (10):
arm64: dts: qcom: sc7180: Disable SuperSpeed instances in park mode
arm64: dts: qcom: sc7280: Disable SuperSpeed instances in park mode
arm64: dts: qcom: msm8996: Disable SS instance in Parkmode for USB
arm64: dts: qcom: sm6350: Disable SS instance in Parkmode for USB
arm64: dts: qcom: msm8998: Disable SS instance in Parkmode for USB
arm64: dts: qcom: ipq6018: Disable SS instance in Parkmode for USB
arm64: dts: qcom: sdm630: Disable SS instance in Parkmode for USB
arm64: dts: qcom: ipq8074: Disable SS instance in Parkmode for USB
arm64: dts: qcom: sdm845: Disable SS instance in Parkmode for USB
arm64: dts: qcom: sm6115: Disable SS instance in Parkmode for USB
Seunghun Han (1):
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360
Shenghao Ding (1):
ALSA: hda/tas2781: Add new quirk for Lenovo Hera2 Laptop
Shengjiu Wang (1):
ALSA: pcm_dmaengine: Don't synchronize DMA channel when DMA is paused
Si-Wei Liu (1):
tap: add missing verification for short frame
Takashi Iwai (2):
usb: gadget: midi2: Fix incorrect default MIDI2 protocol setup
ALSA: seq: ump: Skip useless ports for static blocks
lei lu (3):
ocfs2: add bounds checking to ocfs2_check_dir_entry()
jfs: don't walk off the end of ealist
fs/ntfs3: Validate ff offset
We recently made GUP's common page table walking code to also walk
hugetlb VMAs without most hugetlb special-casing, preparing for the
future of having less hugetlb-specific page table walking code in the
codebase. Turns out that we missed one page table locking detail: page
table locking for hugetlb folios that are not mapped using a single
PMD/PUD.
Assume we have hugetlb folio that spans multiple PTEs (e.g., 64 KiB
hugetlb folios on arm64 with 4 KiB base page size). GUP, as it walks the
page tables, will perform a pte_offset_map_lock() to grab the PTE table
lock.
However, hugetlb that concurrently modifies these page tables would
actually grab the mm->page_table_lock: with USE_SPLIT_PTE_PTLOCKS, the
locks would differ. Something similar can happen right now with hugetlb
folios that span multiple PMDs when USE_SPLIT_PMD_PTLOCKS.
Let's make huge_pte_lockptr() effectively uses the same PT locks as any
core-mm page table walker would.
There is one ugly case: powerpc 8xx, whereby we have an 8 MiB hugetlb
folio being mapped using two PTE page tables. While hugetlb wants to take
the PMD table lock, core-mm would grab the PTE table lock of one of both
PTE page tables. In such corner cases, we have to make sure that both
locks match, which is (fortunately!) currently guaranteed for 8xx as it
does not support SMP.
Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
include/linux/hugetlb.h | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c9bf68c239a01..da800e56fe590 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -944,10 +944,29 @@ static inline bool htlb_allow_alloc_fallback(int reason)
static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
struct mm_struct *mm, pte_t *pte)
{
- if (huge_page_size(h) == PMD_SIZE)
+ VM_WARN_ON(huge_page_size(h) == PAGE_SIZE);
+ VM_WARN_ON(huge_page_size(h) >= P4D_SIZE);
+
+ /*
+ * hugetlb must use the exact same PT locks as core-mm page table
+ * walkers would. When modifying a PTE table, hugetlb must take the
+ * PTE PT lock, when modifying a PMD table, hugetlb must take the PMD
+ * PT lock etc.
+ *
+ * The expectation is that any hugetlb folio smaller than a PMD is
+ * always mapped into a single PTE table and that any hugetlb folio
+ * smaller than a PUD (but at least as big as a PMD) is always mapped
+ * into a single PMD table.
+ *
+ * If that does not hold for an architecture, then that architecture
+ * must disable split PT locks such that all *_lockptr() functions
+ * will give us the same result: the per-MM PT lock.
+ */
+ if (huge_page_size(h) < PMD_SIZE)
+ return pte_lockptr(mm, pte);
+ else if (huge_page_size(h) < PUD_SIZE)
return pmd_lockptr(mm, (pmd_t *) pte);
- VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);
- return &mm->page_table_lock;
+ return pud_lockptr(mm, (pud_t *) pte);
}
#ifndef hugepages_supported
--
2.45.2
From: Volodymyr Babchuk <Volodymyr_Babchuk(a)epam.com>
Linux does not write into cmd-db region. This region of memory is write
protected by XPU. XPU may sometime falsely detect clean cache eviction
as "write" into the write protected region leading to secure interrupt
which causes an endless loop somewhere in Trust Zone.
The only reason it is working right now is because Qualcomm Hypervisor
maps the same region as Non-Cacheable memory in Stage 2 translation
tables. The issue manifests if we want to use another hypervisor (like
Xen or KVM), which does not know anything about those specific mappings.
Changing the mapping of cmd-db memory from MEMREMAP_WB to MEMREMAP_WT/WC
removes dependency on correct mappings in Stage 2 tables. This patch
fixes the issue by updating the mapping to MEMREMAP_WC.
I tested this on SA8155P with Xen.
Fixes: 312416d9171a ("drivers: qcom: add command DB driver")
Cc: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk(a)epam.com>
Tested-by: Nikita Travkin <nikita(a)trvn.ru> # sc7180 WoA in EL2
Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com>
---
Changes in v2:
- Use MEMREMAP_WC instead of MEMREMAP_WT
- Update commit message from comments in v1
- Add Fixes tag and Cc to stable
- Link to v1: https://lore.kernel.org/lkml/20240327200917.2576034-1-volodymyr_babchuk@epa…
---
drivers/soc/qcom/cmd-db.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/cmd-db.c b/drivers/soc/qcom/cmd-db.c
index d84572662017..ae66c2623d25 100644
--- a/drivers/soc/qcom/cmd-db.c
+++ b/drivers/soc/qcom/cmd-db.c
@@ -349,7 +349,7 @@ static int cmd_db_dev_probe(struct platform_device *pdev)
return -EINVAL;
}
- cmd_db_header = memremap(rmem->base, rmem->size, MEMREMAP_WB);
+ cmd_db_header = memremap(rmem->base, rmem->size, MEMREMAP_WC);
if (!cmd_db_header) {
ret = -ENOMEM;
cmd_db_header = NULL;
---
base-commit: 797012914d2d031430268fe512af0ccd7d8e46ef
change-id: 20240718-cmd_db_uncached-e896da5c5296
Best regards,
--
Maulik Shah <quic_mkshah(a)quicinc.com>
With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
can happen during scenarios such as system suspend and breaks the resume
of PCIe controllers from suspend.
So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
during gdsc_disable() and allow the hardware to transition the GDSCs to
retention when the parent domain enters low power state during system
suspend.
Cc: stable(a)vger.kernel.org # 5.17
Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/clk/qcom/gcc-sm8450.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/gcc-sm8450.c b/drivers/clk/qcom/gcc-sm8450.c
index 639a9a955914..c445c271678a 100644
--- a/drivers/clk/qcom/gcc-sm8450.c
+++ b/drivers/clk/qcom/gcc-sm8450.c
@@ -2974,7 +2974,7 @@ static struct gdsc pcie_0_gdsc = {
.pd = {
.name = "pcie_0_gdsc",
},
- .pwrsts = PWRSTS_OFF_ON,
+ .pwrsts = PWRSTS_RET_ON,
};
static struct gdsc pcie_1_gdsc = {
@@ -2982,7 +2982,7 @@ static struct gdsc pcie_1_gdsc = {
.pd = {
.name = "pcie_1_gdsc",
},
- .pwrsts = PWRSTS_OFF_ON,
+ .pwrsts = PWRSTS_RET_ON,
};
static struct gdsc ufs_phy_gdsc = {
--
2.25.1
This should fix the filesystem corruption during RAID resync.
Checking this condition in raid1_check_read_range is not ideal, but this
is only a debug patch.
Link: https://lore.kernel.org/lkml/20240724141906.10b4fc4e@peluse-desk5/T/#m671d6…
Signed-off-by: Mateusz Jończyk <mat.jonczyk(a)o2.pl>
Cc: Yu Kuai <yukuai3(a)huawei.com>
Cc: Song Liu <song(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/md/raid1-10.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
index 2ea1710a3b70..4ab896e8cb12 100644
--- a/drivers/md/raid1-10.c
+++ b/drivers/md/raid1-10.c
@@ -252,6 +252,10 @@ static inline int raid1_check_read_range(struct md_rdev *rdev,
sector_t first_bad;
int bad_sectors;
+ if (!test_bit(In_sync, &rdev->flags) &&
+ rdev->recovery_offset < this_sector + *len)
+ return 0;
+
/* no bad block overlap */
if (!is_badblock(rdev, this_sector, *len, &first_bad, &bad_sectors))
return *len;
--
2.25.1
This patchset serves to prevent a deadlock between
process context and softIRQ context:
A. In the process context
* (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) Then attempt to enter tasklet
B. In the softIRQ context
* (lock B) Enter tasklet
* (lock A) Attempt to acquire spin_lock by snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed()
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
The issue has been reported as a regression of kernel 5.14:
Link: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg7…
("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock with
Fireface 800")
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
Commit b5b519965c4c ("ALSA: firewire-lib: obsolete workqueue for period
update") belongs to the same patch series and removed
the now-unused workqueue entirely.
Though being observed on RME Fireface 800, this issue would affect all
Firewire audio interfaces using ohci amdtp + pcm streaming.
ALSA streaming, especially under intensive CPU load will reveal this issue
the soonest due to issuing more hardIRQs, with time to occurrence ranging
from 2 secons to 30 minutes after starting playback.
to reproduce the issue:
direct ALSA playback to the device:
mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac
Time to occurrence: 2s to 30m
Likelihood increased by:
- high CPU load
stress --cpu $(nproc)
- switching between applications via workspaces
tested with i915 in Xfce
PulsaAudio / PipeWire conceal the issue as they run PCM substream
without period wakeup mode, issuing less hardIRQs.
Backport note:
Also applies to and fixes on (tested):
6.10.2, 6.9.12, 6.6.43, 6.1.102, 5.15.164
Edmund Raile (2):
ALSA: firewire-lib: restore workqueue for process context
ALSA: firewire-lib: prevent deadlock between process and softIRQ
context
sound/firewire/amdtp-stream.c | 36 ++++++++++++++++++++++-------------
sound/firewire/amdtp-stream.h | 1 +
2 files changed, 24 insertions(+), 13 deletions(-)
--
2.45.2
After a recent change in clang to stop consuming all instances of '-S'
and '-c' [1], the stack protector scripts break due to the kernel's use
of -Werror=unused-command-line-argument to catch cases where flags are
not being properly consumed by the compiler driver:
$ echo | clang -o - -x c - -S -c -Werror=unused-command-line-argument
clang: error: argument unused during compilation: '-c' [-Werror,-Wunused-command-line-argument]
This results in CONFIG_STACKPROTECTOR getting disabled because
CONFIG_CC_HAS_SANE_STACKPROTECTOR is no longer set.
'-c' and '-S' both instruct the compiler to stop at different stages of
the pipeline ('-S' after compiling, '-c' after assembling), so having
them present together in the same command makes little sense. In this
case, the test wants to stop before assembling because it is looking at
the textual assembly output of the compiler for either '%fs' or '%gs',
so remove '-c' from the list of arguments to resolve the error.
All versions of GCC continue to work after this change, along with
versions of clang that do or do not contain the change mentioned above.
Cc: stable(a)vger.kernel.org
Fixes: 4f7fd4d7a791 ("[PATCH] Add the -fstack-protector option to the CFLAGS")
Fixes: 60a5317ff0f4 ("x86: implement x86_32 stack protector")
Link: https://github.com/llvm/llvm-project/commit/6461e537815f7fa68cef06842505353… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
I think this could go via either -tip or Kbuild?
Perhaps this is an issue in the clang commit mentioned in the message
above since it deviates from GCC (Fangrui is on CC here) but I think the
combination of these options is a little dubious to begin with, hence
this change.
---
scripts/gcc-x86_32-has-stack-protector.sh | 2 +-
scripts/gcc-x86_64-has-stack-protector.sh | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/gcc-x86_32-has-stack-protector.sh b/scripts/gcc-x86_32-has-stack-protector.sh
index 825c75c5b715..9459ca4f0f11 100755
--- a/scripts/gcc-x86_32-has-stack-protector.sh
+++ b/scripts/gcc-x86_32-has-stack-protector.sh
@@ -5,4 +5,4 @@
# -mstack-protector-guard-reg, added by
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
+echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
diff --git a/scripts/gcc-x86_64-has-stack-protector.sh b/scripts/gcc-x86_64-has-stack-protector.sh
index 75e4e22b986a..f680bb01aeeb 100755
--- a/scripts/gcc-x86_64-has-stack-protector.sh
+++ b/scripts/gcc-x86_64-has-stack-protector.sh
@@ -1,4 +1,4 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
+echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -m64 -O0 -mcmodel=kernel -fno-PIE -fstack-protector - -o - 2> /dev/null | grep -q "%gs"
---
base-commit: 1722389b0d863056d78287a120a1d6cadb8d4f7b
change-id: 20240726-fix-x86-stack-protector-tests-b542b1b9416b
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index bb51820890965..a3a4c4724b6c9 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4822,6 +4822,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
{ 0 }
};
--
2.43.0
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 3bc7058404156..aef73bc36ee98 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4986,6 +4986,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
/* Amazon Annapurna Labs */
{ PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
--
2.43.0
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 56dce858a6934..e02a1e513f989 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4997,6 +4997,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
/* Amazon Annapurna Labs */
{ PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
--
2.43.0
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ec4277d7835b2..1bf1a83dabb93 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5092,6 +5092,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
/* Amazon Annapurna Labs */
{ PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
--
2.43.0
From: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
[ Upstream commit 524e057b2d66b61f9b63b6db30467ab7b0bb4796 ]
The Broadcom BCM5760X NIC may be a multi-function device.
While it does not advertise an ACS capability, peer-to-peer transactions
are not possible between the individual functions. So it is ok to treat
them as fully isolated.
Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240510204228.73435-1-ajit.khaparde@broa…
Signed-off-by: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Andy Gospodarek <gospo(a)broadcom.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 568410e64ce64..a2ce4e08edf5a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5099,6 +5099,10 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_BROADCOM, 0x1750, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1751, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0x1752, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1760, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1761, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1762, pci_quirk_mf_endpoint_acs },
+ { PCI_VENDOR_ID_BROADCOM, 0x1763, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_BROADCOM, 0xD714, pci_quirk_brcm_acs },
/* Amazon Annapurna Labs */
{ PCI_VENDOR_ID_AMAZON_ANNAPURNA_LABS, 0x0031, pci_quirk_al_acs },
--
2.43.0
From: Ricardo Ribalda <ribalda(a)chromium.org>
[ Upstream commit 5cd7c25f6f0576073b3d03bc4cfb1e8ca63a1195 ]
Some SunplusIT cameras took a borderline interpretation of the UVC 1.5
standard, and fill the PTS and SCR fields with invalid data if the
package does not contain data.
"STC must be captured when the first video data of a video frame is put
on the USB bus."
Some SunplusIT devices send, e.g.,
buffer: 0xa7755c00 len 000012 header:0x8c stc 00000000 sof 0000 pts 00000000
buffer: 0xa7755c00 len 000012 header:0x8c stc 00000000 sof 0000 pts 00000000
buffer: 0xa7755c00 len 000668 header:0x8c stc 73779dba sof 070c pts 7376d37a
While the UVC specification meant that the first two packets shouldn't
have had the SCR bit set in the header.
This borderline/buggy interpretation has been implemented in a variety
of devices, from directly SunplusIT and from other OEMs that rebrand
SunplusIT products. So quirking based on VID:PID will be problematic.
All the affected modules have the following extension unit:
VideoControl Interface Descriptor:
guidExtensionCode {82066163-7050-ab49-b8cc-b3855e8d221d}
But the vendor plans to use that GUID in the future and fix the bug,
this means that we should use heuristic to figure out the broken
packets.
This patch takes care of this.
lsusb of one of the affected cameras:
Bus 001 Device 003: ID 1bcf:2a01 Sunplus Innovation Technology Inc.
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.01
bDeviceClass 239 Miscellaneous Device
bDeviceSubClass 2 ?
bDeviceProtocol 1 Interface Association
bMaxPacketSize0 64
idVendor 0x1bcf Sunplus Innovation Technology Inc.
idProduct 0x2a01
bcdDevice 0.02
iManufacturer 1 SunplusIT Inc
iProduct 2 HanChen Wise Camera
iSerial 3 01.00.00
bNumConfigurations 1
Tested-by: HungNien Chen <hn.chen(a)sunplusit.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
Reviewed-by: Tomasz Figa <tfiga(a)chromium.org>
Link: https://lore.kernel.org/r/20240323-resend-hwtimestamp-v10-2-b08e590d97c7@ch…
Signed-off-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/media/usb/uvc/uvc_video.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index c57bc62251bb8..e2c1b98fb4a25 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -473,6 +473,7 @@ uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
ktime_t time;
u16 host_sof;
u16 dev_sof;
+ u32 dev_stc;
switch (data[1] & (UVC_STREAM_PTS | UVC_STREAM_SCR)) {
case UVC_STREAM_PTS | UVC_STREAM_SCR:
@@ -517,6 +518,34 @@ uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
if (dev_sof == stream->clock.last_sof)
return;
+ dev_stc = get_unaligned_le32(&data[header_size - 6]);
+
+ /*
+ * STC (Source Time Clock) is the clock used by the camera. The UVC 1.5
+ * standard states that it "must be captured when the first video data
+ * of a video frame is put on the USB bus". This is generally understood
+ * as requiring devices to clear the payload header's SCR bit before
+ * the first packet containing video data.
+ *
+ * Most vendors follow that interpretation, but some (namely SunplusIT
+ * on some devices) always set the `UVC_STREAM_SCR` bit, fill the SCR
+ * field with 0's,and expect that the driver only processes the SCR if
+ * there is data in the packet.
+ *
+ * Ignore all the hardware timestamp information if we haven't received
+ * any data for this frame yet, the packet contains no data, and both
+ * STC and SOF are zero. This heuristics should be safe on compliant
+ * devices. This should be safe with compliant devices, as in the very
+ * unlikely case where a UVC 1.1 device would send timing information
+ * only before the first packet containing data, and both STC and SOF
+ * happen to be zero for a particular frame, we would only miss one
+ * clock sample from many and the clock recovery algorithm wouldn't
+ * suffer from this condition.
+ */
+ if (buf && buf->bytesused == 0 && len == header_size &&
+ dev_stc == 0 && dev_sof == 0)
+ return;
+
stream->clock.last_sof = dev_sof;
host_sof = usb_get_current_frame_number(stream->dev->udev);
@@ -554,7 +583,7 @@ uvc_video_clock_decode(struct uvc_streaming *stream, struct uvc_buffer *buf,
spin_lock_irqsave(&stream->clock.lock, flags);
sample = &stream->clock.samples[stream->clock.head];
- sample->dev_stc = get_unaligned_le32(&data[header_size - 6]);
+ sample->dev_stc = dev_stc;
sample->dev_sof = dev_sof;
sample->host_sof = host_sof;
sample->host_time = time;
--
2.43.0
From: Matthew Auld <matthew.auld(a)intel.com>
[ Upstream commit 3cd1585e57908b6efcd967465ef7685f40b2a294 ]
It is really easy to introduce subtle deadlocks in
preempt_fence_work_func() since we operate on single global ordered-wq
for signalling our preempt fences behind the scenes, so even though we
signal a particular fence, everything in the callback should be in the
fence critical section, since blocking in the callback will prevent
other published fences from signalling. If we enlarge the fence critical
section to cover the entire callback, then lockdep should be able to
understand this better, and complain if we grab a sensitive lock like
vm->lock, which is also held when waiting on preempt fences.
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Reviewed-by: Matthew Brost <matthew.brost(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418144630.299531-2-matth…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/gpu/drm/xe/xe_preempt_fence.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
index 7d50c6e89d8e7..5b243b7feb59d 100644
--- a/drivers/gpu/drm/xe/xe_preempt_fence.c
+++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
@@ -23,11 +23,19 @@ static void preempt_fence_work_func(struct work_struct *w)
q->ops->suspend_wait(q);
dma_fence_signal(&pfence->base);
- dma_fence_end_signalling(cookie);
-
+ /*
+ * Opt for keep everything in the fence critical section. This looks really strange since we
+ * have just signalled the fence, however the preempt fences are all signalled via single
+ * global ordered-wq, therefore anything that happens in this callback can easily block
+ * progress on the entire wq, which itself may prevent other published preempt fences from
+ * ever signalling. Therefore try to keep everything here in the callback in the fence
+ * critical section. For example if something below grabs a scary lock like vm->lock,
+ * lockdep should complain since we also hold that lock whilst waiting on preempt fences to
+ * complete.
+ */
xe_vm_queue_rebind_worker(q->vm);
-
xe_exec_queue_put(q);
+ dma_fence_end_signalling(cookie);
}
static const char *
--
2.43.0
c1839501fe3e ("ALSA: firewire-lib: fix wrong value as length of header for
CIP_NO_HEADER case") indeed fixes the issue.
Thank you very much!
I just hope non-CIP_NO_HEADER cases work, but then again, looking at the
original struct
```
struct {
struct fw_iso_packet params;
__be32 header[CIP_HEADER_QUADLETS];
} template = { {0}, {0} };
```
the fw_iso_packet wasn't at the end of the array, thus
```
struct fw_iso_packet {
u16 payload_length; /* Length of indirect payload */
u32 interrupt:1; /* Generate interrupt on this packet */
u32 skip:1; /* tx: Set to not send packet at all */
/* rx: Sync bit, wait for matching sy */
u32 tag:2; /* tx: Tag in packet header */
u32 sy:4; /* tx: Sy in packet header */
u32 header_length:8; /* Size of immediate header */
u32 header[]; /* tx: Top of 1394 isoch. data_block */
};
```
its header array wouldn't ever end up being initialized here with > 0 elements.
Maybe that wasn't ever required.
Tested-by: Edmund Raile <edmund.raile(a)protonmail.com>
Link: https://lore.kernel.org/r/20240725155640.128442-1-o-takashi@sakamocchi.jp
This patchset serves to prevent a deadlock between
process context and softIRQ context:
A. In the process context
* (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) Then attempt to enter tasklet
B. In the softIRQ context
* (lock B) Enter tasklet
* (lock A) Attempt to acquire spin_lock by snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed()
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
The issue has been reported as a regression of kernel 5.14:
Link: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg7…
("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock with
Fireface 800")
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
Commit b5b519965c4c ("ALSA: firewire-lib: obsolete workqueue for period
update") belongs to the same patch series and removed
the now-unused workqueue entirely.
Though being observed on RME Fireface 800, this issue would affect all
Firewire audio interfaces using ohci amdtp + pcm streaming.
ALSA streaming, especially under intensive CPU load will reveal this issue
the soonest due to issuing more hardIRQs, with time to occurrence ranging
from 2 secons to 30 minutes after starting playback.
to reproduce the issue:
direct ALSA playback to the device:
mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac
Time to occurrence: 2s to 30m
Likelihood increased by:
- high CPU load
stress --cpu $(nproc)
- switching between applications via workspaces
tested with i915 in Xfce
PulsaAudio / PipeWire conceal the issue as they run PCM substream
without period wakeup mode, issuing less hardIRQs.
Backport note:
Also applies to and fixes on (tested):
6.10.2, 6.9.12, 6.6.43, 6.1.102, 5.15.164
Edmund Raile (2):
ALSA: firewire-lib: restore workqueue for process context
ALSA: firewire-lib: prevent deadlock between process and softIRQ
context
sound/firewire/amdtp-stream.c | 36 ++++++++++++++++++++++-------------
sound/firewire/amdtp-stream.h | 1 +
2 files changed, 24 insertions(+), 13 deletions(-)
--
2.45.2
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
With RME Fireface 800, this lead to a regression since Kernels 5.14.0,
causing a deadlock with eventual system freeze under ALSA operation:
A. In the process context
* (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) Then attempt to enter tasklet
B. In the softIRQ context
* (lock B) Enter tasklet
* (lock A) Attempt to acquire spin_lock by
snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed()
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
Restore the process context work queue to prevent deadlock
between ALSA substream process context spin_lock of
snd_pcm_stream_lock_irq() in snd_pcm_status64()
and OHCI 1394 IT softIRQ context spin_lock of
snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed().
Cc: stable(a)vger.kernel.org
Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context")
Link: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simn…
Reported-by: edmund.raile <edmund.raile(a)proton.me>
Signed-off-by: Edmund Raile <edmund.raile(a)protonmail.com>
---
sound/firewire/amdtp-stream.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c
index 31201d506a21..c037d7de1fdc 100644
--- a/sound/firewire/amdtp-stream.c
+++ b/sound/firewire/amdtp-stream.c
@@ -615,16 +615,9 @@ static void update_pcm_pointers(struct amdtp_stream *s,
// The program in user process should periodically check the status of intermediate
// buffer associated to PCM substream to process PCM frames in the buffer, instead
// of receiving notification of period elapsed by poll wait.
- if (!pcm->runtime->no_period_wakeup) {
- if (in_softirq()) {
- // In software IRQ context for 1394 OHCI.
- snd_pcm_period_elapsed(pcm);
- } else {
- // In process context of ALSA PCM application under acquired lock of
- // PCM substream.
- snd_pcm_period_elapsed_under_stream_lock(pcm);
- }
- }
+ if (!pcm->runtime->no_period_wakeup)
+ // wq used with amdtp_domain_stream_pcm_pointer() to prevent deadlock
+ queue_work(system_highpri_wq, &s->period_work);
}
}
@@ -1866,9 +1859,11 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d,
// Process isochronous packets queued till recent isochronous cycle to handle PCM frames.
if (irq_target && amdtp_stream_running(irq_target)) {
- // In software IRQ context, the call causes dead-lock to disable the tasklet
- // synchronously.
- if (!in_softirq())
+ // use wq to prevent deadlock between process context spin_lock
+ // of snd_pcm_stream_lock_irq() in snd_pcm_status64() and
+ // softIRQ context spin_lock of snd_pcm_stream_lock_irqsave()
+ // in snd_pcm_period_elapsed()
+ if ((!in_softirq()) && (current_work() != &s->period_work))
fw_iso_context_flush_completions(irq_target->context);
}
--
2.45.2
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.
With RME Fireface 800, this lead to a regression since Kernels 5.14.0,
causing a deadlock with eventual system freeze under ALSA operation:
A. In the process context
* (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) Then attempt to enter tasklet
B. In the softIRQ context
* (lock B) Enter tasklet
* (lock A) Attempt to acquire spin_lock by
snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed()
? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm
? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci
Restore the process context work queue to prevent deadlock
between ALSA substream process context spin_lock of
snd_pcm_stream_lock_irq() in snd_pcm_status64()
and OHCI 1394 IT softIRQ context spin_lock of
snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed().
Cc: stable(a)vger.kernel.org
Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context")
Link: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simn…
Reported-by: edmund.raile <edmund.raile(a)proton.me>
Signed-off-by: Edmund Raile <edmund.raile(a)protonmail.com>
---
sound/firewire/amdtp-stream.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c
index 31201d506a21..c037d7de1fdc 100644
--- a/sound/firewire/amdtp-stream.c
+++ b/sound/firewire/amdtp-stream.c
@@ -615,16 +615,9 @@ static void update_pcm_pointers(struct amdtp_stream *s,
// The program in user process should periodically check the status of intermediate
// buffer associated to PCM substream to process PCM frames in the buffer, instead
// of receiving notification of period elapsed by poll wait.
- if (!pcm->runtime->no_period_wakeup) {
- if (in_softirq()) {
- // In software IRQ context for 1394 OHCI.
- snd_pcm_period_elapsed(pcm);
- } else {
- // In process context of ALSA PCM application under acquired lock of
- // PCM substream.
- snd_pcm_period_elapsed_under_stream_lock(pcm);
- }
- }
+ if (!pcm->runtime->no_period_wakeup)
+ // wq used with amdtp_domain_stream_pcm_pointer() to prevent deadlock
+ queue_work(system_highpri_wq, &s->period_work);
}
}
@@ -1866,9 +1859,11 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d,
// Process isochronous packets queued till recent isochronous cycle to handle PCM frames.
if (irq_target && amdtp_stream_running(irq_target)) {
- // In software IRQ context, the call causes dead-lock to disable the tasklet
- // synchronously.
- if (!in_softirq())
+ // use wq to prevent deadlock between process context spin_lock
+ // of snd_pcm_stream_lock_irq() in snd_pcm_status64() and
+ // softIRQ context spin_lock of snd_pcm_stream_lock_irqsave()
+ // in snd_pcm_period_elapsed()
+ if ((!in_softirq()) && (current_work() != &s->period_work))
fw_iso_context_flush_completions(irq_target->context);
}
--
2.45.2
From: Heiner Kallweit <hkallweit1(a)gmail.com>
[ Upstream commit 982300c115d229565d7af8e8b38aa1ee7bb1f5bd ]
This early RTL8168b version was the first PCIe chip version, and it's
quite quirky. Last sign of life is from more than 15 yrs ago.
Let's remove detection of this chip version, we'll see whether anybody
complains. If not, support for this chip version can be removed a few
kernel versions later.
Signed-off-by: Heiner Kallweit <hkallweit1(a)gmail.com>
Link: https://lore.kernel.org/r/875cdcf4-843c-420a-ad5d-417447b68572@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/realtek/r8169_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 319f8d7a502da..32ea1d902a173 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2185,7 +2185,9 @@ static void rtl8169_get_mac_version(struct rtl8169_private *tp)
/* 8168B family. */
{ 0x7cf, 0x380, RTL_GIGA_MAC_VER_12 },
{ 0x7c8, 0x380, RTL_GIGA_MAC_VER_17 },
- { 0x7c8, 0x300, RTL_GIGA_MAC_VER_11 },
+ /* This one is very old and rare, let's see if anybody complains.
+ * { 0x7c8, 0x300, RTL_GIGA_MAC_VER_11 },
+ */
/* 8101 family. */
{ 0x7c8, 0x448, RTL_GIGA_MAC_VER_39 },
--
2.43.0
From: Heiner Kallweit <hkallweit1(a)gmail.com>
[ Upstream commit 982300c115d229565d7af8e8b38aa1ee7bb1f5bd ]
This early RTL8168b version was the first PCIe chip version, and it's
quite quirky. Last sign of life is from more than 15 yrs ago.
Let's remove detection of this chip version, we'll see whether anybody
complains. If not, support for this chip version can be removed a few
kernel versions later.
Signed-off-by: Heiner Kallweit <hkallweit1(a)gmail.com>
Link: https://lore.kernel.org/r/875cdcf4-843c-420a-ad5d-417447b68572@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/ethernet/realtek/r8169_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index d24eb5ee152a5..7d3443ad8e797 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2018,7 +2018,9 @@ static enum mac_version rtl8169_get_mac_version(u16 xid, bool gmii)
/* 8168B family. */
{ 0x7cf, 0x380, RTL_GIGA_MAC_VER_12 },
{ 0x7c8, 0x380, RTL_GIGA_MAC_VER_17 },
- { 0x7c8, 0x300, RTL_GIGA_MAC_VER_11 },
+ /* This one is very old and rare, let's see if anybody complains.
+ * { 0x7c8, 0x300, RTL_GIGA_MAC_VER_11 },
+ */
/* 8101 family. */
{ 0x7c8, 0x448, RTL_GIGA_MAC_VER_39 },
--
2.43.0