The patch titled
Subject: mmap: fix error paths with dup_anon_vma()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mmap-fix-error-paths-with-dup_anon_vma.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: mmap: fix error paths with dup_anon_vma()
Date: Wed, 27 Sep 2023 12:07:45 -0400
When the calling function fails after dup_anon_vma(), the duplication
of the anon_vma is not undone. Add the necessary unlink_anon_vmas()
calls to the error paths that are missing them.
This issue showed up during inspection of the error path in vma_merge()
for an unrelated vma iterator issue.
Users may experience increased memory usage, which may be problematic as
the failure would likely be caused by a low memory situation.
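The shape of the fix, condensed from the vma_expand() hunk below (a
sketch only; unrelated details abridged):

    struct vm_area_struct *anon_dup = NULL;
    int ret;

    /* dup_anon_vma() now records in *dup which VMA received the
     * cloned anon_vma chain, so a later failure can undo it. */
    ret = dup_anon_vma(vma, next, &anon_dup);
    if (ret)
        return ret;

    if (vma_iter_prealloc(vmi, vma))
        goto nomem;
    ...
    return 0;

nomem:
    if (anon_dup)
        unlink_anon_vmas(anon_dup);    /* undo the duplication */
    return -ENOMEM;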
Link: https://lkml.kernel.org/r/20230927160746.1928098-3-Liam.Howlett@oracle.com
Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Cc: Jann Horn <jannh(a)google.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
--- a/mm/mmap.c~mmap-fix-error-paths-with-dup_anon_vma
+++ a/mm/mmap.c
@@ -587,7 +587,7 @@ again:
* Returns: 0 on success.
*/
static inline int dup_anon_vma(struct vm_area_struct *dst,
- struct vm_area_struct *src)
+ struct vm_area_struct *src, struct vm_area_struct **dup)
{
/*
* Easily overlooked: when mprotect shifts the boundary, make sure the
@@ -597,6 +597,7 @@ static inline int dup_anon_vma(struct vm
if (src->anon_vma && !dst->anon_vma) {
vma_assert_write_locked(dst);
dst->anon_vma = src->anon_vma;
+ *dup = dst;
return anon_vma_clone(dst, src);
}
@@ -624,6 +625,7 @@ int vma_expand(struct vma_iterator *vmi,
unsigned long start, unsigned long end, pgoff_t pgoff,
struct vm_area_struct *next)
{
+ struct vm_area_struct *anon_dup = NULL;
bool remove_next = false;
struct vma_prepare vp;
@@ -633,7 +635,7 @@ int vma_expand(struct vma_iterator *vmi,
remove_next = true;
vma_start_write(next);
- ret = dup_anon_vma(vma, next);
+ ret = dup_anon_vma(vma, next, &anon_dup);
if (ret)
return ret;
}
@@ -661,6 +663,8 @@ int vma_expand(struct vma_iterator *vmi,
return 0;
nomem:
+ if (anon_dup)
+ unlink_anon_vmas(anon_dup);
return -ENOMEM;
}
@@ -860,6 +864,7 @@ struct vm_area_struct *vma_merge(struct
{
struct vm_area_struct *curr, *next, *res;
struct vm_area_struct *vma, *adjust, *remove, *remove2;
+ struct vm_area_struct *anon_dup = NULL;
struct vma_prepare vp;
pgoff_t vma_pgoff;
int err = 0;
@@ -927,18 +932,18 @@ struct vm_area_struct *vma_merge(struct
vma_start_write(next);
remove = next; /* case 1 */
vma_end = next->vm_end;
- err = dup_anon_vma(prev, next);
+ err = dup_anon_vma(prev, next, &anon_dup);
if (curr) { /* case 6 */
vma_start_write(curr);
remove = curr;
remove2 = next;
if (!next->anon_vma)
- err = dup_anon_vma(prev, curr);
+ err = dup_anon_vma(prev, curr, &anon_dup);
}
} else if (merge_prev) { /* case 2 */
if (curr) {
vma_start_write(curr);
- err = dup_anon_vma(prev, curr);
+ err = dup_anon_vma(prev, curr, &anon_dup);
if (end == curr->vm_end) { /* case 7 */
remove = curr;
} else { /* case 5 */
@@ -954,7 +959,7 @@ struct vm_area_struct *vma_merge(struct
vma_end = addr;
adjust = next;
adj_start = -(prev->vm_end - addr);
- err = dup_anon_vma(next, prev);
+ err = dup_anon_vma(next, prev, &anon_dup);
} else {
/*
* Note that cases 3 and 8 are the ONLY ones where prev
@@ -1018,6 +1023,9 @@ struct vm_area_struct *vma_merge(struct
return res;
prealloc_fail:
+ if (anon_dup)
+ unlink_anon_vmas(anon_dup);
+
anon_vma_fail:
if (merge_prev)
vma_next(vmi);
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
maple_tree-add-mas_active-to-detect-in-tree-walks.patch
maple_tree-add-mas_underflow-and-mas_overflow-states.patch
mmap-fix-vma_iterator-in-error-path-of-vma_merge.patch
mmap-fix-error-paths-with-dup_anon_vma.patch
mmap-add-clarifying-comment-to-vma_merge-code.patch
The patch titled
Subject: mmap: fix vma_iterator in error path of vma_merge()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mmap-fix-vma_iterator-in-error-path-of-vma_merge.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: mmap: fix vma_iterator in error path of vma_merge()
Date: Wed, 27 Sep 2023 12:07:44 -0400
When merging of the previous VMA fails after the vma iterator has been
moved to the previous entry, the vma iterator must be advanced to ensure
the caller takes the correct action on the next vma iterator event. Fix
this by adding a vma_next() call to the error path.
Users may experience higher CPU usage, most likely in very low memory
situations.
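The shape of the fix, condensed from the diff below (a sketch; the
surrounding merge logic is abridged): both failure points now funnel
through labels that re-advance the iterator:

    /* Error in anon_vma clone. */
    if (err)
        goto anon_vma_fail;

    if (vma_iter_prealloc(vmi, vma))
        goto prealloc_fail;
    ...
prealloc_fail:
anon_vma_fail:
    /* In the merge_prev case the iterator was moved back onto the
     * previous VMA; step it forward again so the caller sees the
     * expected position. */
    if (merge_prev)
        vma_next(vmi);
    return NULL;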
Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Link: https://lkml.kernel.org/r/20230927160746.1928098-2-Liam.Howlett@oracle.com
Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/mm/mmap.c~mmap-fix-vma_iterator-in-error-path-of-vma_merge
+++ a/mm/mmap.c
@@ -968,14 +968,14 @@ struct vm_area_struct *vma_merge(struct
vma_pgoff = curr->vm_pgoff;
vma_start_write(curr);
remove = curr;
- err = dup_anon_vma(next, curr);
+ err = dup_anon_vma(next, curr, &anon_dup);
}
}
}
/* Error in anon_vma clone. */
if (err)
- return NULL;
+ goto anon_vma_fail;
if (vma_start < vma->vm_start || vma_end > vma->vm_end)
vma_expanded = true;
@@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct
}
if (vma_iter_prealloc(vmi, vma))
- return NULL;
+ goto prealloc_fail;
init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
@@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct
vma_complete(&vp, vmi, mm);
khugepaged_enter_vma(res, vm_flags);
return res;
+
+prealloc_fail:
+anon_vma_fail:
+ if (merge_prev)
+ vma_next(vmi);
+ return NULL;
}
/*
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
maple_tree-add-mas_active-to-detect-in-tree-walks.patch
maple_tree-add-mas_underflow-and-mas_overflow-states.patch
mmap-fix-vma_iterator-in-error-path-of-vma_merge.patch
mmap-fix-error-paths-with-dup_anon_vma.patch
mmap-add-clarifying-comment-to-vma_merge-code.patch
Upstream commit 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with
a soft reset on Renesas SoCs") fixes an issue with reception on Renesas'
own SJA1000 CAN controller: the Rx buffer is only 5 messages long, so
when the bus is loaded (e.g. a message every 50us), overruns may easily
happen. Upon an overrun, due to a possible internal crosstalk situation,
the controller enters a frozen state which (experimentally) can only be
unlocked with a soft reset. The solution was to offload a call to
sja1000_start() to a threaded handler; this needs to happen in process
context, as the operation requires sleeping. sja1000_start() basically
enters "reset mode", performs a proper software reset and returns to
"normal mode".
Since this fix was introduced, we no longer observe any stalls in
reception. However, it was sporadically observed that the transmit path
would now freeze. Further investigation blamed the fix mentioned above,
and especially the reset operation. Reproducing the reset in a loop
helped identify what could possibly go wrong. The sja1000 is a single
Tx queue device which leverages the netdev helpers to process one Tx
message at a time. The logic is: the queue is stopped, the message is
sent to the transceiver, and once it has been properly transmitted the
controller sets a status bit which triggers an interrupt; in the
interrupt handler the transmission status is checked and the queue is
woken up. Unfortunately, if an overrun happens, we might perform the
soft reset precisely between the transmission of the buffer to the
transceiver and the advent of the transmission status bit. We would
then stop the transmission operation without re-enabling the queue,
leading to all further transmissions being ignored.
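A simplified timeline of the race (illustrative only):

    CPU A: sja1000_start_xmit()        CPU B: threaded reset handler
    ---------------------------        -----------------------------
    netif_stop_queue(dev)
    frame handed to the controller
                                       soft reset via sja1000_start()
    (TX-done interrupt never fires,
     so the queue is never woken up)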
The reset interrupt can only happen while the device is "open", and
after a reset we want to resume normal operations anyway, no matter
whether a packet to transmit got dropped in the process, so we shall
wake up the queue. Restarting the device and waking up the queue is
exactly what sja1000_set_mode(CAN_MODE_START) does. In order to keep
the queue state consistent, we must acquire a lock in both the reset
handler and the transmit path to ensure the two operations are
serialized. As the reset handler might still be called after the
transmission of a frame to the transceiver but before it actually gets
transmitted, we must ensure we don't leak the skb, so we free it (the
behavior is consistent, no matter whether there was an skb on the stack
or not).
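The shape of the serialization, condensed from the diff below (a
sketch; register accesses and error handling are abridged, and helper
signatures follow the older stable kernel this patch targets):

    static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
                                          struct net_device *dev)
    {
        struct sja1000_priv *priv = netdev_priv(dev);

        spin_lock(&priv->tx_lock);
        netif_stop_queue(dev);
        /* ... write the frame to the controller ... */
        spin_unlock(&priv->tx_lock);
        return NETDEV_TX_OK;
    }

    static irqreturn_t sja1000_reset_interrupt(int irq, void *dev_id)
    {
        struct net_device *dev = dev_id;
        struct sja1000_priv *priv = netdev_priv(dev);

        spin_lock(&priv->tx_lock);
        can_free_echo_skb(dev, 0);              /* don't leak a pending skb */
        sja1000_set_mode(dev, CAN_MODE_START);  /* restart + wake the queue */
        spin_unlock(&priv->tx_lock);
        return IRQ_HANDLED;
    }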
Fixes: 717c6ec241b5 ("can: sja1000: Prevent overrun stalls with a soft reset on Renesas SoCs")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
This patch was written and tested on a slightly older stable kernel as
this is the kernel that runs on the boards which showed the problem, but
there should be no difference with upstream kernels.
drivers/net/can/sja1000/sja1000.c | 13 ++++++++++++-
drivers/net/can/sja1000/sja1000.h | 1 +
drivers/net/can/sja1000/sja1000_platform.c | 2 ++
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
index ae47fc72aa96..fe1d818f5d51 100644
--- a/drivers/net/can/sja1000/sja1000.c
+++ b/drivers/net/can/sja1000/sja1000.c
@@ -297,6 +297,8 @@ static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
if (can_dropped_invalid_skb(dev, skb))
return NETDEV_TX_OK;
+ spin_lock(&priv->tx_lock);
+
netif_stop_queue(dev);
fi = dlc = cf->can_dlc;
@@ -335,6 +337,8 @@ static netdev_tx_t sja1000_start_xmit(struct sk_buff *skb,
sja1000_write_cmdreg(priv, cmd_reg_val);
+ spin_unlock(&priv->tx_lock);
+
return NETDEV_TX_OK;
}
@@ -394,9 +398,16 @@ static void sja1000_rx(struct net_device *dev)
static irqreturn_t sja1000_reset_interrupt(int irq, void *dev_id)
{
struct net_device *dev = (struct net_device *)dev_id;
+ struct sja1000_priv *priv = netdev_priv(dev);
netdev_dbg(dev, "performing a soft reset upon overrun\n");
- sja1000_start(dev);
+
+ spin_lock(&priv->tx_lock);
+
+ can_free_echo_skb(dev, 0);
+ sja1000_set_mode(dev, CAN_MODE_START);
+
+ spin_unlock(&priv->tx_lock);
return IRQ_HANDLED;
}
diff --git a/drivers/net/can/sja1000/sja1000.h b/drivers/net/can/sja1000/sja1000.h
index 9f041d027dcc..85def6329edb 100644
--- a/drivers/net/can/sja1000/sja1000.h
+++ b/drivers/net/can/sja1000/sja1000.h
@@ -166,6 +166,7 @@ struct sja1000_priv {
void __iomem *reg_base; /* ioremap'ed address to registers */
unsigned long irq_flags; /* for request_irq() */
spinlock_t cmdreg_lock; /* lock for concurrent cmd register writes */
+ spinlock_t tx_lock; /* lock for serializing transmissions and soft resets */
u16 flags; /* custom mode flags */
u8 ocr; /* output control register */
diff --git a/drivers/net/can/sja1000/sja1000_platform.c b/drivers/net/can/sja1000/sja1000_platform.c
index f33bad164813..ca3bcab461d8 100644
--- a/drivers/net/can/sja1000/sja1000_platform.c
+++ b/drivers/net/can/sja1000/sja1000_platform.c
@@ -305,6 +305,8 @@ static int sp_probe(struct platform_device *pdev)
priv->can.clock.freq = clk_get_rate(clk) / 2;
}
+ spin_lock_init(&priv->tx_lock);
+
platform_set_drvdata(pdev, dev);
SET_NETDEV_DEV(dev, &pdev->dev);
--
2.34.1
The patch titled
Subject: mm/page_alloc: correct start page when guard page debug is enabled
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-correct-start-page-when-guard-page-debug-is-enabled.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kemeng Shi <shikemeng(a)huaweicloud.com>
Subject: mm/page_alloc: correct start page when guard page debug is enabled
Date: Wed, 27 Sep 2023 17:44:01 +0800
When guard page debug is enabled and set_page_guard() returns success, we
fail to advance page to the start of the next split range, so we end up
splitting a page range that does not contain the target page. Move the
start page update before set_page_guard() to fix this.
Because we split the wrong range, the split pages are not able to merge
back to the original order when the target page is put back, and all
split pages except the target page are unusable. To be specific:
Consider the case where the target page is the third page in a buddy
page of order 2:
| buddy-2 | Page | Target | Page |
After breaking down to the target page, because of the bug we will only
set the first page as a Guard:
| Guard | Page | Target | Page |
When we then call put_page_back_buddy() with the target page, the buddy
of the target page is neither a guard nor a buddy page, so the original
order-2 page cannot be reconstructed:
| Guard | Page | buddy-0 | Page |
All pages except the target page are not on a free list and are
unusable.
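A paraphrased sketch of the splitting loop in break_down_buddy_pages()
after the fix (abridged; names as in the diff below):

    while (high > low) {
        high--;
        size >>= 1;

        if (target >= &page[size]) {    /* target in the upper half */
            next_page = page + size;
            current_buddy = page;
        } else {                        /* target in the lower half */
            next_page = page;
            current_buddy = page + size;
        }
        /* Advance unconditionally, before set_page_guard() can
         * 'continue', so the next iteration splits the half that
         * actually contains the target page. */
        page = next_page;

        if (set_page_guard(zone, current_buddy, high, migratetype))
            continue;

        if (current_buddy != target) {
            add_to_free_list(current_buddy, zone, high, migratetype);
            set_buddy_order(current_buddy, high);
        }
    }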
Link: https://lkml.kernel.org/r/20230927094401.68205-1-shikemeng@huaweicloud.com
Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-correct-start-page-when-guard-page-debug-is-enabled
+++ a/mm/page_alloc.c
@@ -6475,6 +6475,7 @@ static void break_down_buddy_pages(struc
next_page = page;
current_buddy = page + size;
}
+ page = next_page;
if (set_page_guard(zone, current_buddy, high, migratetype))
continue;
@@ -6482,7 +6483,6 @@ static void break_down_buddy_pages(struc
if (current_buddy != target) {
add_to_free_list(current_buddy, zone, high, migratetype);
set_buddy_order(current_buddy, high);
- page = next_page;
}
}
}
_
Patches currently in -mm which might be from shikemeng(a)huaweicloud.com are
mm-page_alloc-correct-start-page-when-guard-page-debug-is-enabled.patch
mm-compaction-use-correct-list-in-move_freelist_head-tail.patch
mm-compaction-call-list_is_first-last-more-intuitively-in-move_freelist_head-tail.patch
mm-compaction-correctly-return-failure-with-bogus-compound_order-in-strict-mode.patch
mm-compaction-remove-repeat-compact_blockskip_flush-check-in-reset_isolation_suitable.patch
mm-compaction-improve-comment-of-is_via_compact_memory.patch
mm-compaction-factor-out-code-to-test-if-we-should-run-compaction-for-target-order.patch
The following commit has been merged into the timers/core branch of tip:
Commit-ID: 1a6a464774947920dcedcf7409be62495c7cedd0
Gitweb: https://git.kernel.org/tip/1a6a464774947920dcedcf7409be62495c7cedd0
Author: Frederic Weisbecker <frederic(a)kernel.org>
AuthorDate: Tue, 12 Sep 2023 12:44:06 +02:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 27 Sep 2023 16:54:03 +02:00
timers: Tag (hr)timer softirq as hotplug safe
Specific stress involving frequent CPU-hotplug operations, such as
running rcutorture, may trigger the following message:
  NOHZ tick-stop error: local softirq work is pending, handler #02!!!
This happens in the CPU-down hotplug process, after
CPUHP_AP_SMPBOOT_THREADS whose teardown callback parks ksoftirqd, and
before the target CPU shuts down through CPUHP_AP_IDLE_DEAD. In this
fragile intermediate state, softirqs waiting for threaded handling may be
forever ignored and eventually reported by the idle task as in the above
example.
However, some vectors are known to be safe as long as the corresponding
subsystems have teardown callbacks handling the migration of their
events. The above error message reports a pending timers softirq,
although this vector can be considered hotplug safe, because the
CPUHP_TIMERS_PREPARE teardown callback performs the necessary migration
of timers after the death of the CPU. Hrtimers have similar hotplug
handling.
Therefore this error message, as far as (hr-)timers are concerned, can
be considered spurious and the relevant softirq vectors can be marked as
hotplug safe.
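For reference, this is roughly where the mask is consulted, paraphrased
from the NOHZ idle path in kernel/time/tick-sched.c (a sketch; details
vary between kernel versions):

    if (unlikely(local_softirq_pending())) {
        static int ratelimit;

        /* Warn only about vectors that are NOT hotplug safe. */
        if (ratelimit < 10 && !local_bh_blocked() &&
            (local_softirq_pending() & ~SOFTIRQ_HOTPLUG_SAFE_MASK)) {
            pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
                    (unsigned int)local_softirq_pending());
            ratelimit++;
        }
        return false;    /* the tick cannot be stopped */
    }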
Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230912104406.312185-6-frederic@kernel.org
---
include/linux/interrupt.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce4..4a1dc88 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -569,8 +569,12 @@ enum
* 2) rcu_report_dead() reports the final quiescent states.
*
* _ IRQ_POLL: irq_poll_cpu_dead() migrates the queue
+ *
+ * _ (HR)TIMER_SOFTIRQ: (hr)timers_dead_cpu() migrates the queue
*/
-#define SOFTIRQ_HOTPLUG_SAFE_MASK (BIT(RCU_SOFTIRQ) | BIT(IRQ_POLL_SOFTIRQ))
+#define SOFTIRQ_HOTPLUG_SAFE_MASK (BIT(TIMER_SOFTIRQ) | BIT(IRQ_POLL_SOFTIRQ) |\
+ BIT(HRTIMER_SOFTIRQ) | BIT(RCU_SOFTIRQ))
+
/* map softirq index to softirq name. update 'softirq_to_name' in
* kernel/softirq.c when adding a new softirq.