Fix the discrepancy between the two places we check for the CP0 counter
erratum, along with the incorrect comparison of the R4400 revision
number against 0x30, which matches none of them, and consistently
consider all R4000 and R4400 processors affected, as documented in the
processor errata publications[1][2][3], following this mapping between
CP0 PRId register values and processor models:
PRId     | Processor Model
---------+--------------------
00000422 | R4000 Revision 2.2
00000430 | R4000 Revision 3.0
00000440 | R4400 Revision 1.0
00000450 | R4400 Revision 2.0
00000460 | R4400 Revision 3.0
No other revision of either processor has ever been spotted.
Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume
R4000/R4400 newer than 3.0 don't have the mfc0 count bug"), marking the
CP0 counter as buggy does not preclude it from being used as either a
clock event or a clock source device. It just cannot be used as both at
the same time, because in that case clock event interrupts will
occasionally be lost, and the use as a clock event device takes
precedence.
Compare against 0x4ff in `can_use_mips_counter' so that a single machine
instruction is produced.
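For illustration, the boundary arithmetic involved, assuming the usual
Linux/MIPS constant definitions from arch/mips/include/asm/cpu.h (a
sketch only, not part of the patch):

	#define PRID_IMP_R4000			0x0400
	#define PRID_REV_ENCODE_44(ver, rev)	((ver) << 4 | (rev))

	/* 0x0400 | 0xff == 0x4ff is the highest PRId value an R4000
	 * or R4400 can possibly report, so `prid > 0x4ff' passes all
	 * later processors and none of the affected ones.
	 */
	_Static_assert((PRID_IMP_R4000 | PRID_REV_ENCODE_44(15, 15)) == 0x4ff,
		       "0x4ff covers all R4000/R4400 revisions");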
References:
[1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
Technologies Inc., May 10, 1994, Erratum 53, p.13
[2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
Inc., February 9, 1994, Erratum 21, p.4
[3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS
Technologies Inc., January 24, 1995, Erratum 14, p.3
Signed-off-by: Maciej W. Rozycki <macro(a)orcam.me.uk>
Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
Cc: stable(a)vger.kernel.org # v2.6.24+
---
Thomas,
Please review the requirements for SNI platforms. In the case of an
erratic CP0 timer we give precedence to its use as a clock event rather
than a clock source device; see `time_init' in arch/mips/kernel/time.c.
Therefore if SNI systems have no alternative timer interrupt source,
the CP0 timer is still supposed to suffice, regardless of the erratum.
Conversely, a system can do without a high-precision clock source, in
which case jiffies will be used. Of course such a system will suffer if
used for precision timekeeping, but such is the price for broken hardware.
Don't SNI systems have any alternative timer available, not even the
venerable 8254?
Long-term I think this code should be factored out and rewritten so that
it lives in one place and can take advantage of compile-time constants,
which will be the case for the majority of platforms. For the time
being, however, fix the immediate breakage.
With the considerations above in mind, please apply.
Maciej
---
arch/mips/include/asm/timex.h | 8 ++++----
arch/mips/kernel/time.c | 11 +++--------
2 files changed, 7 insertions(+), 12 deletions(-)
linux-mips-r4k-count-erratum-prid.diff
Index: linux-macro/arch/mips/include/asm/timex.h
===================================================================
--- linux-macro.orig/arch/mips/include/asm/timex.h
+++ linux-macro/arch/mips/include/asm/timex.h
@@ -40,9 +40,9 @@
typedef unsigned int cycles_t;
/*
- * On R4000/R4400 before version 5.0 an erratum exists such that if the
- * cycle counter is read in the exact moment that it is matching the
- * compare register, no interrupt will be generated.
+ * On R4000/R4400 an erratum exists such that if the cycle counter is
+ * read in the exact moment that it is matching the compare register,
+ * no interrupt will be generated.
*
* There is a suggested workaround and also the erratum can't strike if
* the compare interrupt isn't being used as the clock source device.
@@ -63,7 +63,7 @@ static inline int can_use_mips_counter(u
if (!__builtin_constant_p(cpu_has_counter))
asm volatile("" : "=m" (cpu_data[0].options));
if (likely(cpu_has_counter &&
- prid >= (PRID_IMP_R4000 | PRID_REV_ENCODE_44(5, 0))))
+ prid > (PRID_IMP_R4000 | PRID_REV_ENCODE_44(15, 15))))
return 1;
else
return 0;
Index: linux-macro/arch/mips/kernel/time.c
===================================================================
--- linux-macro.orig/arch/mips/kernel/time.c
+++ linux-macro/arch/mips/kernel/time.c
@@ -141,15 +141,10 @@ static __init int cpu_has_mfc0_count_bug
case CPU_R4400MC:
/*
* The published errata for the R4400 up to 3.0 say the CPU
- * has the mfc0 from count bug.
- */
- if ((current_cpu_data.processor_id & 0xff) <= 0x30)
- return 1;
-
- /*
- * we assume newer revisions are ok
+ * has the mfc0 from count bug. This seems to be the last
+ * version produced.
*/
- return 0;
+ return 1;
}
return 0;
We always call hold_lkb(lkb) when we increment lkb->lkb_wait_count. As
a consequence, we always need to call unhold_lkb(lkb) when we decrement
lkb->lkb_wait_count. This patch adds the missing unhold_lkb(lkb) where
we decrement lkb->lkb_wait_count. In the case of setting
lkb->lkb_wait_count to zero we need to count down until reaching zero
and call unhold_lkb(lkb) for every decrement.
The unhold_lkb(lkb) for the waiters list can be removed because it is
now done by the last lkb_wait_count decrement iteration, just as in
_remove_from_waiters().
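Schematically, the invariant enforced here is that every decrement
drops the reference taken by hold_lkb() at increment time; a
hypothetical helper, for illustration only (this helper does not exist
in the code):

	/* Illustration: every lkb_wait_count decrement must pair with
	 * the hold_lkb() taken at the matching increment.
	 */
	static void lkb_wait_count_dec(struct dlm_lkb *lkb)
	{
		lkb->lkb_wait_count--;
		unhold_lkb(lkb);	/* pairs with hold_lkb() */
	}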
This issue was discovered by a dlm gfs2 test case which uses the
dlm_unlock(LKF_CANCEL) feature excessively. The lkb->lkb_wait_count
value probably never gets above 1 if this feature isn't used, which is
why the issue was not discovered before.
The test case ended with an rsb on the rsb keep data structure with a
refcount of 1 but no lkb associated with it anymore, which is itself
invalid behaviour. As a side effect, the dlm lockspace was observed to
end up in a kind of deadlock state, sending remove messages in a loop.
With this patch I never observed similar behaviour again.
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
---
fs/dlm/lock.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 1899bb266e2e..d19cfa03c9bb 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -1578,6 +1578,7 @@ static int _remove_from_waiters(struct dlm_lkb *lkb, int mstype,
lkb->lkb_wait_type = 0;
lkb->lkb_flags &= ~DLM_IFL_OVERLAP_CANCEL;
lkb->lkb_wait_count--;
+ unhold_lkb(lkb);
goto out_del;
}
@@ -1604,6 +1605,7 @@ static int _remove_from_waiters(struct dlm_lkb *lkb, int mstype,
log_error(ls, "remwait error %x reply %d wait_type %d overlap",
lkb->lkb_id, mstype, lkb->lkb_wait_type);
lkb->lkb_wait_count--;
+ unhold_lkb(lkb);
lkb->lkb_wait_type = 0;
}
@@ -5364,11 +5366,16 @@ int dlm_recover_waiters_post(struct dlm_ls *ls)
lkb->lkb_flags &= ~DLM_IFL_OVERLAP_UNLOCK;
lkb->lkb_flags &= ~DLM_IFL_OVERLAP_CANCEL;
lkb->lkb_wait_type = 0;
- lkb->lkb_wait_count = 0;
+ /* drop all wait_count references; we still
+ * hold a reference for this iteration.
+ */
+ while (lkb->lkb_wait_count) {
+ lkb->lkb_wait_count--;
+ unhold_lkb(lkb);
+ }
mutex_lock(&ls->ls_waiters_mutex);
list_del_init(&lkb->lkb_wait_reply);
mutex_unlock(&ls->ls_waiters_mutex);
- unhold_lkb(lkb); /* for waiters list */
if (oc || ou) {
/* do an unlock or cancel instead of resending */
--
2.31.1
From: Andreas Larsson <andreas(a)gaisler.com>
The previous code split the budget between TX and RX. This could make
grcan_poll() return without having used the entire budget, while at the
same time not having called napi_complete(). This sometimes led to the
poll function not being called again while TX and RX interrupts stayed
disabled, resulting in the driver getting stuck.
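For context, the NAPI contract is that a poll function which consumes
less than its budget must itself complete NAPI and re-enable the device
interrupts; roughly (a generic sketch with invented helper names, not
the grcan code):

	static int example_poll(struct napi_struct *napi, int budget)
	{
		/* work_done must never exceed budget */
		int work_done = example_rx(napi, budget);

		if (work_done < budget) {
			napi_complete(napi);
			example_enable_interrupts(napi);	/* hypothetical */
		}
		return work_done;
	}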
Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores")
Link: https://lore.kernel.org/all/20220429084656.29788-4-andreas@gaisler.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Andreas Larsson <andreas(a)gaisler.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/grcan.c | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
diff --git a/drivers/net/can/grcan.c b/drivers/net/can/grcan.c
index 4ca3da56d3aa..5215bd9b2c80 100644
--- a/drivers/net/can/grcan.c
+++ b/drivers/net/can/grcan.c
@@ -1125,7 +1125,7 @@ static int grcan_close(struct net_device *dev)
return 0;
}
-static int grcan_transmit_catch_up(struct net_device *dev, int budget)
+static void grcan_transmit_catch_up(struct net_device *dev)
{
struct grcan_priv *priv = netdev_priv(dev);
unsigned long flags;
@@ -1133,7 +1133,7 @@ static int grcan_transmit_catch_up(struct net_device *dev, int budget)
spin_lock_irqsave(&priv->lock, flags);
- work_done = catch_up_echo_skb(dev, budget, true);
+ work_done = catch_up_echo_skb(dev, -1, true);
if (work_done) {
if (!priv->resetting && !priv->closing &&
!(priv->can.ctrlmode & CAN_CTRLMODE_LISTENONLY))
@@ -1147,8 +1147,6 @@ static int grcan_transmit_catch_up(struct net_device *dev, int budget)
}
spin_unlock_irqrestore(&priv->lock, flags);
-
- return work_done;
}
static int grcan_receive(struct net_device *dev, int budget)
@@ -1230,19 +1228,13 @@ static int grcan_poll(struct napi_struct *napi, int budget)
struct net_device *dev = priv->dev;
struct grcan_registers __iomem *regs = priv->regs;
unsigned long flags;
- int tx_work_done, rx_work_done;
- int rx_budget = budget / 2;
- int tx_budget = budget - rx_budget;
+ int work_done;
- /* Half of the budget for receiving messages */
- rx_work_done = grcan_receive(dev, rx_budget);
+ work_done = grcan_receive(dev, budget);
- /* Half of the budget for transmitting messages as that can trigger echo
- * frames being received
- */
- tx_work_done = grcan_transmit_catch_up(dev, tx_budget);
+ grcan_transmit_catch_up(dev);
- if (rx_work_done < rx_budget && tx_work_done < tx_budget) {
+ if (work_done < budget) {
napi_complete(napi);
/* Guarantee no interference with a running reset that otherwise
@@ -1259,7 +1251,7 @@ static int grcan_poll(struct napi_struct *napi, int budget)
spin_unlock_irqrestore(&priv->lock, flags);
}
- return rx_work_done + tx_work_done;
+ return work_done;
}
/* Work tx bug by waiting while for the risky situation to clear. If that fails,
--
2.35.1
From: Duoming Zhou <duoming(a)zju.edu.cn>
There are deadlocks caused by del_timer_sync(&priv->hang_timer) and
del_timer_sync(&priv->rr_timer) in grcan_close(); one of the deadlocks
is shown below:
(Thread 1)                  | (Thread 2)
                            | grcan_reset_timer()
grcan_close()               |   mod_timer()
  spin_lock_irqsave() //(1) |   (wait a time)
  ...                       | grcan_initiate_running_reset()
  del_timer_sync()          |   spin_lock_irqsave() //(2)
  (wait timer to stop)      |   ...
We hold priv->lock in position (1) of thread 1 and use del_timer_sync()
to wait for the timer to stop, but the timer handler also needs
priv->lock in position (2) of thread 2. As a result, grcan_close() will
block forever.
This patch moves the del_timer_sync() calls out of the protection of
spin_lock_irqsave(), which lets the timer handler obtain the needed
lock.
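Restated as the general pattern the two added lines follow (a sketch;
never call del_timer_sync() while holding a lock the timer handler
itself takes):

	spin_lock_irqsave(&priv->lock, flags);
	...
	/* The timer handlers take priv->lock themselves, so drop the
	 * lock across the synchronous teardown to let them finish.
	 */
	spin_unlock_irqrestore(&priv->lock, flags);
	del_timer_sync(&priv->hang_timer);
	del_timer_sync(&priv->rr_timer);
	spin_lock_irqsave(&priv->lock, flags);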
Link: https://lore.kernel.org/all/20220425042400.66517-1-duoming@zju.edu.cn
Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores")
Cc: stable(a)vger.kernel.org
Signed-off-by: Duoming Zhou <duoming(a)zju.edu.cn>
Reviewed-by: Andreas Larsson <andreas(a)gaisler.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/grcan.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/can/grcan.c b/drivers/net/can/grcan.c
index d0c5a7a60daf..1189057b5d68 100644
--- a/drivers/net/can/grcan.c
+++ b/drivers/net/can/grcan.c
@@ -1102,8 +1102,10 @@ static int grcan_close(struct net_device *dev)
priv->closing = true;
if (priv->need_txbug_workaround) {
+ spin_unlock_irqrestore(&priv->lock, flags);
del_timer_sync(&priv->hang_timer);
del_timer_sync(&priv->rr_timer);
+ spin_lock_irqsave(&priv->lock, flags);
}
netif_stop_queue(dev);
grcan_stop_hardware(dev);
--
2.35.1
People occasionally report that the warning in bfqq_request_over_limit()
triggers, reporting that BFQ's idea of the cgroup hierarchy (and its
depth) does not match what the generic blkcg code thinks. This can
actually happen when a bfqq gets moved between BFQ groups while
bfqq_request_over_limit() is running. Make sure the code is safe against
a BFQ queue being moved to a different BFQ group.
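The fix follows the usual check-under-lock, allocate-outside-lock,
retry pattern, as the hierarchy depth may change while the lock is
dropped; schematically (a simplified sketch of the change below, with
hierarchy_depth() standing in for the actual depth computation):

	retry:
		spin_lock_irq(&bfqd->lock);
		depth = hierarchy_depth(bfqq);	/* may change while unlocked */
		if (depth > alloc_depth) {
			/* Allocation may sleep, so it must not happen
			 * under the lock; recheck the depth afterwards.
			 */
			spin_unlock_irq(&bfqd->lock);
			entities = kmalloc_array(depth, sizeof(*entities),
						 GFP_NOIO);
			alloc_depth = depth;
			goto retry;
		}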
Fixes: 76f1df88bbc2 ("bfq: Limit number of requests consumed by each cgroup")
CC: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/CAJCQCtTw_2C7ZSz7as5Gvq=OmnDiio=HRkQekqWpKot84s…
Reported-by: Chris Murphy <lists(a)colorremedies.com>
Reported-by: "yukuai (C)" <yukuai3(a)huawei.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
block/bfq-iosched.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index e47c75f1fa0f..272d48d8f326 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -569,7 +569,7 @@ static bool bfqq_request_over_limit(struct bfq_queue *bfqq, int limit)
struct bfq_entity *entity = &bfqq->entity;
struct bfq_entity *inline_entities[BFQ_LIMIT_INLINE_DEPTH];
struct bfq_entity **entities = inline_entities;
- int depth, level;
+ int depth, level, alloc_depth = BFQ_LIMIT_INLINE_DEPTH;
int class_idx = bfqq->ioprio_class - 1;
struct bfq_sched_data *sched_data;
unsigned long wsum;
@@ -578,15 +578,21 @@ static bool bfqq_request_over_limit(struct bfq_queue *bfqq, int limit)
if (!entity->on_st_or_in_serv)
return false;
+retry:
+ spin_lock_irq(&bfqd->lock);
/* +1 for bfqq entity, root cgroup not included */
depth = bfqg_to_blkg(bfqq_group(bfqq))->blkcg->css.cgroup->level + 1;
- if (depth > BFQ_LIMIT_INLINE_DEPTH) {
+ if (depth > alloc_depth) {
+ spin_unlock_irq(&bfqd->lock);
+ if (entities != inline_entities)
+ kfree(entities);
entities = kmalloc_array(depth, sizeof(*entities), GFP_NOIO);
if (!entities)
return false;
+ alloc_depth = depth;
+ goto retry;
}
- spin_lock_irq(&bfqd->lock);
sched_data = entity->sched_data;
/* Gather our ancestors as we need to traverse them in reverse order */
level = 0;
--
2.34.1
As discussed with Willem here:
https://lore.kernel.org/netdev/CA+FuTSdQ57O6RWj_Lenmu_Vd3NEX9xMzMYkB0C3rKMz…
the kernel silently doesn't act upon the SOF_TIMESTAMPING_OPT_ID socket
option in several cases on older kernels, yet user space has no way to
find out about this, practically resulting in broken functionality.
This patch set backports the support towards linux-4.14.y and linux-4.19.y,
which fixes the issue described above by simply making the kernel act
upon SOF_TIMESTAMPING_OPT_ID as expected.
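For reference, user space requests the option together with the
timestamp generation flags (a minimal sketch; error handling trimmed):

	#include <linux/net_tstamp.h>
	#include <sys/socket.h>

	unsigned int val = SOF_TIMESTAMPING_TX_SOFTWARE |
			   SOF_TIMESTAMPING_SOFTWARE |
			   SOF_TIMESTAMPING_OPT_ID;

	/* On an unpatched kernel this succeeds, but OPT_ID may be
	 * silently ignored for some socket types, which is what this
	 * series fixes.
	 */
	setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));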
Testing was done with the most recent (not the vintage-correct one)
kselftest script at:
tools/testing/selftests/networking/timestamping/txtimestamp.sh
with the message "OK. All tests passed".
I tried to backport the changes to linux-4.9.y as well, but something
apparently unrelated is broken there, and I didn't investigate further
so I'm not targeting that stable branch:
./txtimestamp.sh
protocol: TCP
payload: 10
server port: 9000
family: INET
test SND
USR: 1581090724 s 856088 us (seq=0, len=0)
./txtimestamp: poll
Willem de Bruijn (2):
ipv6: add missing tx timestamping on IPPROTO_RAW
net: add missing SOF_TIMESTAMPING_OPT_ID support
include/net/sock.h | 25 +++++++++++++++++++++----
net/can/raw.c | 2 +-
net/ipv4/raw.c | 2 +-
net/ipv6/raw.c | 7 +++++--
net/packet/af_packet.c | 6 +++---
5 files changed, 31 insertions(+), 11 deletions(-)
--
2.25.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8f0b36497303487d5a32c75789c77859cc2ee895 Mon Sep 17 00:00:00 2001
From: Muchun Song <songmuchun(a)bytedance.com>
Date: Fri, 1 Apr 2022 11:28:36 -0700
Subject: [PATCH] mm: kfence: fix objcgs vector allocation
If a kfence object is allocated to be used for an objcgs vector, then
this slot of the pool ends up being occupied permanently, since the
vector is never freed. The solutions could be (1) freeing the vector
when the kfence object is freed or (2) allocating all the vectors
statically. Since the memory consumption of the object vectors is low,
it is better to choose (2) to fix the issue, and doing so also reduces
the overhead of allocating the vectors in the future.
Link: https://lkml.kernel.org/r/20220328132843.16624-1-songmuchun@bytedance.com
Fixes: d3fb45f370d9 ("mm, kfence: insert KFENCE hooks for SLAB")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Reviewed-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Xiongchun Duan <duanxiongchun(a)bytedance.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 2f9fdfde1941..a203747ad2c0 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -566,6 +566,8 @@ static unsigned long kfence_init_pool(void)
* enters __slab_free() slow-path.
*/
for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
+ struct slab *slab = page_slab(&pages[i]);
+
if (!i || (i % 2))
continue;
@@ -573,7 +575,11 @@ static unsigned long kfence_init_pool(void)
if (WARN_ON(compound_head(&pages[i]) != &pages[i]))
return addr;
- __SetPageSlab(&pages[i]);
+ __folio_set_slab(slab_folio(slab));
+#ifdef CONFIG_MEMCG
+ slab->memcg_data = (unsigned long)&kfence_metadata[i / 2 - 1].objcg |
+ MEMCG_DATA_OBJCGS;
+#endif
}
/*
@@ -1033,6 +1039,9 @@ void __kfence_free(void *addr)
{
struct kfence_metadata *meta = addr_to_metadata((unsigned long)addr);
+#ifdef CONFIG_MEMCG
+ KFENCE_WARN_ON(meta->objcg);
+#endif
/*
* If the objects of the cache are SLAB_TYPESAFE_BY_RCU, defer freeing
* the object, as the object page may be recycled for other-typed
diff --git a/mm/kfence/kfence.h b/mm/kfence/kfence.h
index 2a2d5de9d379..9a6c4b1b12a8 100644
--- a/mm/kfence/kfence.h
+++ b/mm/kfence/kfence.h
@@ -89,6 +89,9 @@ struct kfence_metadata {
struct kfence_track free_track;
/* For updating alloc_covered on frees. */
u32 alloc_stack_hash;
+#ifdef CONFIG_MEMCG
+ struct obj_cgroup *objcg;
+#endif
};
extern struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS];