The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 644649553508b9bacf0fc7a5bdc4f9e0165576a5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024012908-footing-happily-57fa@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
644649553508 ("clocksource: Skip watchdog check for large watchdog intervals")
c37e85c135ce ("clocksource: Loosen clocksource watchdog constraints")
fc153c1c58cb ("clocksource: Add a Kconfig option for WATCHDOG_MAX_SKEW")
c86ff8c55b8a ("clocksource: Avoid accidental unstable marking of clocksources")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 644649553508b9bacf0fc7a5bdc4f9e0165576a5 Mon Sep 17 00:00:00 2001
From: Jiri Wiesner <jwiesner(a)suse.de>
Date: Mon, 22 Jan 2024 18:23:50 +0100
Subject: [PATCH] clocksource: Skip watchdog check for large watchdog intervals
There have been reports of the watchdog marking clocksources unstable on
machines with 8 NUMA nodes:
clocksource: timekeeping watchdog on CPU373:
Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource: 'hpet' wd_nsec: 14523447520
clocksource: 'tsc' cs_nsec: 14524115132
The measured clocksource skew - the absolute difference between cs_nsec
and wd_nsec - was 668 microseconds:
cs_nsec - wd_nsec = 14524115132 - 14523447520 = 667612
The kernel used 200 microseconds for the uncertainty_margin of both the
clocksource and watchdog, resulting in a threshold of 400 microseconds (the
md variable). Both the cs_nsec and the wd_nsec value indicate that the
readout interval was circa 14.5 seconds. The observed behaviour is that
watchdog checks failed for large readout intervals on 8 NUMA node
machines. This indicates that the size of the skew was directly proportinal
to the length of the readout interval on those machines. The measured
clocksource skew, 668 microseconds, was evaluated against a threshold (the
md variable) that is suited for readout intervals of roughly
WATCHDOG_INTERVAL, i.e. HZ >> 1, which is 0.5 second.
The intention of 2e27e793e280 ("clocksource: Reduce clocksource-skew
threshold") was to tighten the threshold for evaluating skew and set the
lower bound for the uncertainty_margin of clocksources to twice
WATCHDOG_MAX_SKEW. Later in c37e85c135ce ("clocksource: Loosen clocksource
watchdog constraints"), the WATCHDOG_MAX_SKEW constant was increased to
125 microseconds to fit the limit of NTP, which is able to use a
clocksource that suffers from up to 500 microseconds of skew per second.
Both the TSC and the HPET use default uncertainty_margin. When the
readout interval gets stretched the default uncertainty_margin is no
longer a suitable lower bound for evaluating skew - it imposes a limit
that is far stricter than the skew with which NTP can deal.
The root causes of the skew being directly proportinal to the length of
the readout interval are:
* the inaccuracy of the shift/mult pairs of clocksources and the watchdog
* the conversion to nanoseconds is imprecise for large readout intervals
Prevent this by skipping the current watchdog check if the readout
interval exceeds 2 * WATCHDOG_INTERVAL. Considering the maximum readout
interval of 2 * WATCHDOG_INTERVAL, the current default uncertainty margin
(of the TSC and HPET) corresponds to a limit on clocksource skew of 250
ppm (microseconds of skew per second). To keep the limit imposed by NTP
(500 microseconds of skew per second) for all possible readout intervals,
the margins would have to be scaled so that the threshold value is
proportional to the length of the actual readout interval.
As for why the readout interval may get stretched: Since the watchdog is
executed in softirq context the expiration of the watchdog timer can get
severely delayed on account of a ksoftirqd thread not getting to run in a
timely manner. Surely, a system with such belated softirq execution is not
working well and the scheduling issue should be looked into but the
clocksource watchdog should be able to deal with it accordingly.
Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold")
Suggested-by: Feng Tang <feng.tang(a)intel.com>
Signed-off-by: Jiri Wiesner <jwiesner(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Paul E. McKenney <paulmck(a)kernel.org>
Reviewed-by: Feng Tang <feng.tang(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240122172350.GA740@incl
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c108ed8a9804..3052b1f1168e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -99,6 +99,7 @@ static u64 suspend_start;
* Interval: 0.5sec.
*/
#define WATCHDOG_INTERVAL (HZ >> 1)
+#define WATCHDOG_INTERVAL_MAX_NS ((2 * WATCHDOG_INTERVAL) * (NSEC_PER_SEC / HZ))
/*
* Threshold: 0.0312s, when doubled: 0.0625s.
@@ -134,6 +135,7 @@ static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
static DEFINE_SPINLOCK(watchdog_lock);
static int watchdog_running;
static atomic_t watchdog_reset_pending;
+static int64_t watchdog_max_interval;
static inline void clocksource_watchdog_lock(unsigned long *flags)
{
@@ -399,8 +401,8 @@ static inline void clocksource_reset_watchdog(void)
static void clocksource_watchdog(struct timer_list *unused)
{
u64 csnow, wdnow, cslast, wdlast, delta;
+ int64_t wd_nsec, cs_nsec, interval;
int next_cpu, reset_pending;
- int64_t wd_nsec, cs_nsec;
struct clocksource *cs;
enum wd_read_status read_ret;
unsigned long extra_wait = 0;
@@ -470,6 +472,27 @@ static void clocksource_watchdog(struct timer_list *unused)
if (atomic_read(&watchdog_reset_pending))
continue;
+ /*
+ * The processing of timer softirqs can get delayed (usually
+ * on account of ksoftirqd not getting to run in a timely
+ * manner), which causes the watchdog interval to stretch.
+ * Skew detection may fail for longer watchdog intervals
+ * on account of fixed margins being used.
+ * Some clocksources, e.g. acpi_pm, cannot tolerate
+ * watchdog intervals longer than a few seconds.
+ */
+ interval = max(cs_nsec, wd_nsec);
+ if (unlikely(interval > WATCHDOG_INTERVAL_MAX_NS)) {
+ if (system_state > SYSTEM_SCHEDULING &&
+ interval > 2 * watchdog_max_interval) {
+ watchdog_max_interval = interval;
+ pr_warn("Long readout interval, skipping watchdog check: cs_nsec: %lld wd_nsec: %lld\n",
+ cs_nsec, wd_nsec);
+ }
+ watchdog_timer.expires = jiffies;
+ continue;
+ }
+
/* Check the deviation from the watchdog clocksource. */
md = cs->uncertainty_margin + watchdog->uncertainty_margin;
if (abs(cs_nsec - wd_nsec) > md) {
This patch adds support for OPP to vote for the performance state of RPMH
power domain based upon GEN speed it PCIe got enumerated.
QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
maintains hardware state of a regulator by performing max aggregation of
the requests made by all of the processors.
PCIe controller can operate on different RPMh performance state of power
domain based up on the speed of the link. And this performance state varies
from target to target.
It is manadate to scale the performance state based up on the PCIe speed
link operates so that SoC can run under optimum power conditions.
Add Operating Performance Points(OPP) support to vote for RPMh state based
upon GEN speed link is operating.
Before link up PCIe driver will vote for the maximum performance state.
As now we are adding ICC BW vote in OPP, the ICC BW voting depends both
GEN speed and link width using opp-level to indicate the opp entry table
will be difficult.
In PCIe certain gen speeds like GEN1x2 & GEN2X1 or GEN3x2 & GEN4x1 use
same icc bw if we use freq in the opp table to represent the PCIe Gen
speed number of PCIe entries can reduced.
So going back to use freq in the opp table instead of level.
Signed-off-by: Krishna chaitanya chundru <quic_krichai(a)quicinc.com>
---
Changes frm v5:
- Add ICC BW voting as part of OPP, rebase the latest kernel, and only
- either OPP or ICC BW voting will supported we removed the patch to
- return eror for icc opp update patch.
- As we added the icc bw voting in opp table I am not including reviewed
- by tags given in previous patch.
- Use opp freq to find opp entries as now we need to include pcie link
- also in to considerations.
- Add CPU-PCIe BW voting which is not present till now.
- Drop PCI: qcom: Return error from 'qcom_pcie_icc_update' as either opp or icc bw
- only one executes and there is no need to fail if opp or icc update fails.
- Link for v5: https://lore.kernel.org/linux-arm-msm/20231101063323.GH2897@thinkpad/T/
Changes from v4:
- Added a separate patch for returning error from the qcom_pcie_upadate
and moved opp update logic to icc_update and used a bool variable to
update the opp.
- Addressed comments made by pavan.
changes from v3:
- Removing the opp vote on suspend when the link is not up and link is not
up and add debug prints as suggested by pavan.
- Added dev_pm_opp_find_level_floor API to find the highest opp to vote.
changes from v2:
- Instead of using the freq based opp search use level based as suggested
by Dmitry Baryshkov.
Changes from v1:
- Addressed comments from Krzysztof Kozlowski.
- Added the rpmhpd_opp_xxx phandle as suggested by pavan.
- Added dev_pm_opp_set_opp API call which was missed on previous patch.
---
Krishna chaitanya chundru (6):
dt-bindings: PCI: qcom: Add interconnects path as required property
arm64: dts: qcom: sm8450: Add interconnect path to PCIe node
PCI: qcom: Add missing icc bandwidth vote for cpu to PCIe path
dt-bindings: pci: qcom: Add opp table
arm64: dts: qcom: sm8450: Add opp table support to PCIe
PCI: qcom: Add OPP support to scale performance state of power domain
.../devicetree/bindings/pci/qcom,pcie.yaml | 6 ++
arch/arm64/boot/dts/qcom/sm8450.dtsi | 82 +++++++++++++++
drivers/pci/controller/dwc/pcie-qcom.c | 114 ++++++++++++++++++---
3 files changed, 187 insertions(+), 15 deletions(-)
---
base-commit: 70d201a40823acba23899342d62bc2644051ad2e
change-id: 20240112-opp_support-3a1839c6a171
Best regards,
--
Krishna chaitanya chundru <quic_krichai(a)quicinc.com>
From: Filipe Manana <fdmanana(a)suse.com>
Here follows the backport of some directory related fixes for the stable
5.15 tree. I tested these on top of 5.15.147.
These were recently requested at:
https://lore.kernel.org/linux-btrfs/20240124225522.GA2614102@lxhi-087/
Filipe Manana (4):
btrfs: fix infinite directory reads
btrfs: set last dir index to the current last index when opening dir
btrfs: refresh dir last index during a rewinddir(3) call
btrfs: fix race between reading a directory and adding entries to it
fs/btrfs/ctree.h | 1 +
fs/btrfs/delayed-inode.c | 5 +-
fs/btrfs/delayed-inode.h | 1 +
fs/btrfs/inode.c | 150 +++++++++++++++++++++++++--------------
4 files changed, 102 insertions(+), 55 deletions(-)
--
2.40.1
From: "Guilherme G. Piccoli" <gpiccoli(a)igalia.com>
[ Upstream commit e585a37e5061f6d5060517aed1ca4ccb2e56a34c ]
By running a Van Gogh device (Steam Deck), the following message
was noticed in the kernel log:
pci 0000:04:00.3: PCI class overridden (0x0c03fe -> 0x0c03fe) so dwc3 driver can claim this instead of xhci
Effectively this means the quirk executed but changed nothing, since the
class of this device was already the proper one (likely adjusted by newer
firmware versions).
Check and perform the override only if necessary.
Link: https://lore.kernel.org/r/20231120160531.361552-1-gpiccoli@igalia.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Huang Rui <ray.huang(a)amd.com>
Cc: Vicki Pfau <vi(a)endrift.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/pci/quirks.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e0081914052f..76beed053bf7 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -702,10 +702,13 @@ static void quirk_amd_dwc_class(struct pci_dev *pdev)
{
u32 class = pdev->class;
- /* Use "USB Device (not host controller)" class */
- pdev->class = PCI_CLASS_SERIAL_USB_DEVICE;
- pci_info(pdev, "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n",
- class, pdev->class);
+ if (class != PCI_CLASS_SERIAL_USB_DEVICE) {
+ /* Use "USB Device (not host controller)" class */
+ pdev->class = PCI_CLASS_SERIAL_USB_DEVICE;
+ pci_info(pdev,
+ "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n",
+ class, pdev->class);
+ }
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_NL_USB,
quirk_amd_dwc_class);
--
2.43.0
rx_data_reassembly skb is stored during NCI data exchange for processing
fragmented packets. It is dropped only when the last fragment is processed
or when an NTF packet with NCI_OP_RF_DEACTIVATE_NTF opcode is received.
However, the NCI device may be deallocated before that which leads to skb
leak.
As by design the rx_data_reassembly skb is bound to the NCI device and
nothing prevents the device to be freed before the skb is processed in
some way and cleaned, free it on the NCI device cleanup.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+6b7c68d9c21e4ee4251b(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/000000000000f43987060043da7b@google.com/
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
net/nfc/nci/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
index 97348cedb16b..cdad47b140fa 100644
--- a/net/nfc/nci/core.c
+++ b/net/nfc/nci/core.c
@@ -1208,6 +1208,10 @@ void nci_free_device(struct nci_dev *ndev)
{
nfc_free_device(ndev->nfc_dev);
nci_hci_deallocate(ndev);
+
+ /* drop partial rx data packet if present */
+ if (ndev->rx_data_reassembly)
+ kfree_skb(ndev->rx_data_reassembly);
kfree(ndev);
}
EXPORT_SYMBOL(nci_free_device);
--
2.43.0
From: Konrad Dybcio <konrad.dybcio(a)linaro.org>
[ Upstream commit 6ab502bc1cf3147ea1d8540d04b83a7a4cb6d1f1 ]
Some devices power the DSI PHY/PLL through a power rail that we model
as a GENPD. Enable runtime PM to make it suspendable.
Signed-off-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/543352/
Link: https://lore.kernel.org/r/20230620-topic-dsiphy_rpm-v2-2-a11a751f34f0@linar…
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Stable-dep-of: 3d07a411b4fa ("drm/msm/dsi: Use pm_runtime_resume_and_get to prevent refcnt leaks")
Signed-off-by: Amit Pundir <amit.pundir(a)linaro.org>
---
Rebased to v5.4.y: s/devm_pm_runtime_enable/pm_runtime_enable
Alternate solution maybe to revert the problematic commit 31b169a8bed7
("drm/msm/dsi: Use pm_runtime_resume_and_get to prevent refcnt leaks")
from v5.4.y which broke display on DB845c.
drivers/gpu/drm/msm/dsi/phy/dsi_phy.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
index 1582386fe162..30f07b8a5421 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
@@ -606,6 +606,8 @@ static int dsi_phy_driver_probe(struct platform_device *pdev)
goto fail;
}
+ pm_runtime_enable(&pdev->dev);
+
/* PLL init will call into clk_register which requires
* register access, so we need to enable power and ahb clock.
*/
--
2.25.1
On 2024-01-26 at 20:15:46 +0100, Hans de Goede wrote:
> Hi,
>
> On 1/25/24 09:22, Ashok Raj wrote:
> > From: Jithu Joseph <jithu.joseph(a)intel.com>
> >
> > Missing release_firmware() due to error handling blocked any future image
> > loading.
> >
> > Fix the return code and release_fiwmare() to release the bad image.
> >
> > Fixes: 25a76dbb36dd ("platform/x86/intel/ifs: Validate image size")
> > Reported-by: Pengfei Xu <pengfei.xu(a)intel.com>
> > Signed-off-by: Jithu Joseph <jithu.joseph(a)intel.com>
> > Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
> > Tested-by: Pengfei Xu <pengfei.xu(a)intel.com>
> > Reviewed-by: Tony Luck <tony.luck(a)intel.com>
>
> Thank you for your patch/series, I've applied this patch
> (series) to my review-hans branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.…
>
> Note it will show up in the pdx86 review-hans branch once I've
> pushed my local branch there, which might take a while.
>
> I will include this patch in my next fixes pull-req to Linus
> for the current kernel development cycle.
>
FYR.
[ CC stable(a)vger.kernel.org ]
Missed CC: Stable Tag.
This (follow-up) patch is now upstream into v6.7-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Looks like linux-6.7.y needs the above fixed patch too.
Best Regards,
Thanks!
> Regards,
>
> Hans
>
>
>
>
> > ---
> > drivers/platform/x86/intel/ifs/load.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/platform/x86/intel/ifs/load.c b/drivers/platform/x86/intel/ifs/load.c
> > index a1ee1a74fc3c..2cf3b4a8813f 100644
> > --- a/drivers/platform/x86/intel/ifs/load.c
> > +++ b/drivers/platform/x86/intel/ifs/load.c
> > @@ -399,7 +399,8 @@ int ifs_load_firmware(struct device *dev)
> > if (fw->size != expected_size) {
> > dev_err(dev, "File size mismatch (expected %u, actual %zu). Corrupted IFS image.\n",
> > expected_size, fw->size);
> > - return -EINVAL;
> > + ret = -EINVAL;
> > + goto release;
> > }
> >
> > ret = image_sanity_check(dev, (struct microcode_header_intel *)fw->data);
>
The patch titled
Subject: mm: zswap: fix missing folio cleanup in writeback race path
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-zswap-fix-missing-folio-cleanup-in-writeback-race-path.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yosry Ahmed <yosryahmed(a)google.com>
Subject: mm: zswap: fix missing folio cleanup in writeback race path
Date: Thu, 25 Jan 2024 08:51:27 +0000
In zswap_writeback_entry(), after we get a folio from
__read_swap_cache_async(), we grab the tree lock again to check that the
swap entry was not invalidated and recycled. If it was, we delete the
folio we just added to the swap cache and exit.
However, __read_swap_cache_async() returns the folio locked when it is
newly allocated, which is always true for this path, and the folio is
ref'd. Make sure to unlock and put the folio before returning.
This was discovered by code inspection, probably because this path handles
a race condition that should not happen often, and the bug would not crash
the system, it will only strand the folio indefinitely.
Link: https://lkml.kernel.org/r/20240125085127.1327013-1-yosryahmed@google.com
Fixes: 04fc7816089c ("mm: fix zswap writeback race condition")
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Reviewed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/zswap.c~mm-zswap-fix-missing-folio-cleanup-in-writeback-race-path
+++ a/mm/zswap.c
@@ -1442,6 +1442,8 @@ static int zswap_writeback_entry(struct
if (zswap_rb_search(&tree->rbroot, swp_offset(entry->swpentry)) != entry) {
spin_unlock(&tree->lock);
delete_from_swap_cache(folio);
+ folio_unlock(folio);
+ folio_put(folio);
return -ENOMEM;
}
spin_unlock(&tree->lock);
_
Patches currently in -mm which might be from yosryahmed(a)google.com are
mm-memcg-optimize-parent-iteration-in-memcg_rstat_updated.patch
mm-zswap-fix-missing-folio-cleanup-in-writeback-race-path.patch
mm-swap-enforce-updating-inuse_pages-at-the-end-of-swap_range_free.patch
mm-zswap-remove-unnecessary-trees-cleanups-in-zswap_swapoff.patch
mm-zswap-remove-unused-tree-argument-in-zswap_entry_put.patch
x86-mm-delete-unused-cpu-argument-to-leave_mm.patch
x86-mm-clarify-prev-usage-in-switch_mm_irqs_off.patch