The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e5d54f1935722f83df7619f3978f774c2b802cd8 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Thu, 12 Jul 2018 13:02:53 -0400
Subject: [PATCH] drm/nouveau/drm/nouveau: Fix runtime PM leak in
nv50_disp_atomic_commit()
A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 9382e99a0bc7..31b12b4f321a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
nv50_disp_atomic_commit_tail(state);
drm_for_each_crtc(crtc, dev) {
- if (crtc->state->enable) {
+ if (crtc->state->active) {
if (!drm->have_disp_power_ref) {
drm->have_disp_power_ref = true;
return 0;
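For context, the one-liner hinges on DRM atomic state semantics:
crtc_state->enable tracks whether a mode is configured on the CRTC,
while crtc_state->active tracks whether scanout is actually running
(it goes false under e.g. DPMS off). A minimal sketch of the check the
fix wants, assuming only those two fields (sketch, not driver code):

  #include <drm/drm_crtc.h>

  /* Sketch only: a CRTC should pin the GPU's runtime PM reference
   * while it is actively scanning out, not merely configured. */
  static bool crtc_needs_power(const struct drm_crtc_state *state)
  {
  	/* enable == true with active == false is a valid state
  	 * (mode set, display off), so testing "enable" here would
  	 * keep the runtime PM reference and leak it. */
  	return state->active;
  }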
Hi Greg,
Please consider this patchset, which includes block/scsi multiqueue
performance enhancements and bugfixes.
We've run multiple benchmarks and different tests for over one week,
and the results look good.
These patches are also included in Oracle UEK5.
They are almost all simple cherry-picks; only 2 patches needed minor
adjustment. They apply cleanly on 4.14.57.
Jens Axboe (3):
Revert "blk-mq: don't handle TAG_SHARED in restart"
blk-mq: fix issue with shared tag queue re-running
blk-mq: only run the hardware queue if IO is pending
Jianchao Wang (1):
blk-mq: put the driver tag of nxt rq before first one is requeued
Ming Lei (19):
blk-mq-sched: move actual dispatching into one helper
blk-mq: introduce .get_budget and .put_budget in blk_mq_ops
sbitmap: introduce __sbitmap_for_each_set()
blk-mq-sched: improve dispatching from sw queue
scsi: allow passing in null rq to scsi_prep_state_check()
scsi: implement .get_budget and .put_budget for blk-mq
SCSI: don't get target/host busy_count in scsi_mq_get_budget()
blk-mq: don't handle TAG_SHARED in restart
blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE
blk-mq: don't handle failure in .get_budget
blk-flush: don't run queue for requests bypassing flush
block: pass 'run_queue' to blk_mq_request_bypass_insert
blk-flush: use blk_mq_request_bypass_insert()
blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ
blk-mq: move blk_mq_put_driver_tag*() into blk-mq.h
blk-mq: don't allocate driver tag upfront for flush rq
blk-mq: put driver tag if dispatch budget can't be got
blk-mq: quiesce queue during switching io sched and updating
nr_requests
scsi: core: run queue if SCSI device queue isn't ready and queue is
idle
block/blk-core.c | 2 +-
block/blk-flush.c | 37 +++++--
block/blk-mq-debugfs.c | 1 -
block/blk-mq-sched.c | 203 ++++++++++++++++++++++-------------
block/blk-mq.c | 278 +++++++++++++++++++++++++++---------------------
block/blk-mq.h | 58 +++++++++-
block/elevator.c | 2 +
drivers/scsi/scsi_lib.c | 53 ++++++---
include/linux/blk-mq.h | 20 +++-
include/linux/sbitmap.h | 64 ++++++++---
10 files changed, 475 insertions(+), 243 deletions(-)
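For readers unfamiliar with the budget concept this series introduces,
here is a deliberately simplified, hypothetical sketch (the names and
signatures are illustrative, not the real blk_mq_ops interface): the
dispatcher reserves per-device capacity before pulling a request off a
software queue, and returns it whenever dispatch does not happen, so
requests are no longer dequeued just to be requeued.

  #include <stdbool.h>

  struct budget_dev {
  	int inflight;	/* requests currently issued to the driver */
  	int depth;	/* e.g. the SCSI device queue depth */
  };

  /* Hypothetical: reserve capacity before dequeuing a request. */
  static bool example_get_budget(struct budget_dev *dev)
  {
  	if (dev->inflight >= dev->depth)
  		return false;	/* no capacity: leave request queued */
  	dev->inflight++;
  	return true;
  }

  /* Hypothetical: return unused capacity if dispatch didn't happen. */
  static void example_put_budget(struct budget_dev *dev)
  {
  	dev->inflight--;
  }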
--
2.7.4
Function atomic_inc_unless_negative() returns a bool to indicate
success/failure. However, cxl_adapter_context_get() wrongly compares
the return value against '>=0', which will always be true. The patch
fixes this comparison to '==0', thereby also fixing this compile-time
warning:
drivers/misc/cxl/main.c:290 cxl_adapter_context_get()
warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned
Cc: stable(a)vger.kernel.org
Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
drivers/misc/cxl/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index c1ba0d42cbc8..e0f29b8a872d 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -287,7 +287,7 @@ int cxl_adapter_context_get(struct cxl *adapter)
int rc;
rc = atomic_inc_unless_negative(&adapter->contexts_num);
- return rc >= 0 ? 0 : -EBUSY;
+ return rc ? 0 : -EBUSY;
}
void cxl_adapter_context_put(struct cxl *adapter)
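The bug class is easy to demonstrate in plain C; a standalone sketch of
why 'rc >= 0' can never fail for a bool-typed result (hypothetical
helper, not the cxl code):

  #include <stdbool.h>
  #include <stdio.h>

  /* Mimics atomic_inc_unless_negative(): returns true on success,
   * false if the counter is negative. */
  static bool inc_unless_negative(int *v)
  {
  	if (*v < 0)
  		return false;
  	(*v)++;
  	return true;
  }

  int main(void)
  {
  	int contexts_num = -1;	/* adapter busy: increment must fail */
  	bool rc = inc_unless_negative(&contexts_num);

  	printf("buggy: %d\n", rc >= 0 ? 0 : -16);	/* always 0 */
  	printf("fixed: %d\n", rc ? 0 : -16);	/* -16, i.e. -EBUSY */
  	return 0;
  }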
--
2.17.1
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce a performance regression because multiple faster
writeback threads of the idle bcache devices will compete for the btree
level locks with the bcache device that has I/O requests coming in.
This patch fixes the above issue by only permitting fast writeback when
all bcache devices attached to the cache set are idle. If one of the
bcache devices gets a new I/O request, the writeback throughput of all
devices is minimized immediately, and the PI controller
__update_writeback_rate() decides the upcoming writeback rate for each
bcache device.
Also, when all bcache devices are idle, limiting the writeback rate to a
small number is a waste of throughput, especially when the backing
devices are slower non-rotational devices (e.g. SATA SSDs). This patch
sets a max writeback rate for each backing device if the whole cache set
is idle. A faster writeback rate in idle time means new I/Os may have
more available space for dirty data, and people may observe better
write performance as a result.
Please note that bcache may change its cache mode at runtime, and this
patch still works if the cache mode is switched from writeback mode
while there is still dirty data on the cache.
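In outline, the policy added below can be read as the following
condensed, hypothetical helper (not the patch code verbatim): every
update_writeback_rate() tick increments a cache-set-wide idle counter,
any front-end I/O resets it, and only a sustained run of idle ticks
across all dirty devices switches the set to the maximum rate:

  /* Sketch, assuming roughly 10 rate-update rounds per dirty device
   * with no front-end I/O observed marks the cache set as idle. */
  static bool cache_set_idle_enough(struct cache_set *c, int dirty_dc_nr)
  {
  	return atomic_inc_return(&c->idle_counter) >= dirty_dc_nr * 10;
  }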
Fixes: b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Michael Lyle <mlyle(a)lyle.org>
---
drivers/md/bcache/bcache.h | 9 +---
drivers/md/bcache/request.c | 42 ++++++++++++++-
drivers/md/bcache/sysfs.c | 14 +++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 98 +++++++++++++++++++++++++----------
6 files changed, 126 insertions(+), 41 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..f7451e8be03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..fe45f561a054 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1104,6 +1104,34 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
/* Cached devices - read & write stuff */
+static void quit_max_writeback_rate(struct cache_set *c)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ mutex_lock(&bch_register_lock);
+
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() decide the
+ * upcoming rate.
+ */
+ atomic64_set(&dc->writeback_rate.rate, 1);
+ }
+
+ mutex_unlock(&bch_register_lock);
+}
+
static blk_qc_t cached_dev_make_request(struct request_queue *q,
struct bio *bio)
{
@@ -1119,7 +1147,19 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (d->c) {
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of cache set is true and new I/O
+ * comes, quit max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c);
+ }
+ }
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..d719021bff81 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic64_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic64_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..84f90c3d996d 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic64_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..7e17f32ab563 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic64_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..72059f910230 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -49,6 +49,63 @@ static uint64_t __calc_target_rate(struct cached_dev *dc)
return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ int i, dirty_dc_nr = 0;
+ struct bcache_device *d;
+
+ mutex_lock(&bch_register_lock);
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ if (atomic_read(&dc->has_dirty))
+ dirty_dc_nr++;
+ }
+ mutex_unlock(&bch_register_lock);
+
+ /*
+ * idle_counter is increased every time update_writeback_rate()
+ * is rescheduled. If all backing devices attached to the same
+ * cache set have the same dc->writeback_rate_update_seconds value,
+ * then after about 10 rounds of update_writeback_rate() on each
+ * backing device the code falls through, sets
+ * c->at_max_writeback_rate to 1, and sets a max writeback rate on
+ * each dc->writeback_rate.rate. This is not very accurate but works
+ * well enough to make sure the whole cache set has no new I/O
+ * coming before the writeback rate is set to a max number.
+ */
+ if (atomic_inc_return(&c->idle_counter) < dirty_dc_nr * 10)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+ atomic64_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate again in case
+ * new I/O arrives before set_at_max_writeback_rate() returns.
+ * Then the writeback rate is set to 1, and its new value should be
+ * decided via __update_writeback_rate().
+ */
+ if (atomic_read(&c->idle_counter) < dirty_dc_nr * 10 ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void __update_writeback_rate(struct cached_dev *dc)
{
/*
@@ -104,8 +161,9 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic64_read(&dc->writeback_rate.rate);
+ atomic64_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
@@ -138,9 +196,16 @@ static void update_writeback_rate(struct work_struct *work)
down_read(&dc->writeback_lock);
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set the writeback rate to a max number. Then it is
+ * unnecessary to update the writeback rate for an idle cache
+ * set that is already at the maximum writeback rate.
+ */
+ if (!set_at_max_writeback_rate(c, dc))
+ __update_writeback_rate(dc);
+ }
up_read(&dc->writeback_lock);
@@ -422,27 +487,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +759,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic64_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.17.1
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
There are several issues with the suspend/resume handling code of the
driver:
- The device is attached and detached in the runtime_suspend() and
runtime_resume() callbacks if the interface is running. However,
during xcan_chip_start() the interface is considered running,
causing the resume handler to incorrectly call netif_start_queue()
at the beginning of xcan_chip_start(); and if xcan_chip_start() returns
an error, the suspend handler detaches the device, leaving the user
unable to bring up the device anymore.
- The device is not brought properly up on system resume. A reset is
done and the code tries to determine the bus state after that.
However, after reset the device is always in Configuration mode
(down), so the state checking code does not make sense and
communication will also not work.
- The suspend callback tries to set the device to sleep mode (low-power
mode which monitors the bus and brings the device back to normal mode
on activity), but then immediately disables the clocks (possibly
before the device reaches the sleep mode), which does not make sense
to me. If a clean shutdown is wanted before disabling clocks, we can
just bring the device down completely instead of only putting it to
sleep.
Reorganize the PM code so that only the clock logic remains in the
runtime PM callbacks and the system PM callbacks contain the device
bring-up/down logic. This makes calling the runtime PM callbacks during
e.g. xcan_chip_start() safe.
The system PM callbacks now simply call common code to start/stop the
HW if the interface was running, replacing the broken code from before.
xcan_chip_stop() is updated to use the common reset code so that it will
wait for the reset to complete. Reset also disables all interrupts so do
not do that separately.
Also, the device_may_wakeup() checks are removed as the driver does not
have wakeup support.
Tested on Zynq-7000 integrated CAN.
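The resulting split can be summarized by the dev_pm_ops wiring (this
mirrors the driver's existing xcan_dev_pm_ops; the callback bodies are
in the diff below): system sleep callbacks own the netdev
bring-up/down, runtime callbacks own only the clocks.

  static const struct dev_pm_ops xcan_dev_pm_ops = {
  	/* system sleep: stop/start the interface and the HW */
  	SET_SYSTEM_SLEEP_PM_OPS(xcan_suspend, xcan_resume)
  	/* runtime PM: clock gating only */
  	SET_RUNTIME_PM_OPS(xcan_runtime_suspend, xcan_runtime_resume, NULL)
  };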
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 69 +++++++++++++++---------------------
1 file changed, 28 insertions(+), 41 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index cb80a9aa7281..5a24039733ef 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -984,13 +984,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
static void xcan_chip_stop(struct net_device *ndev)
{
struct xcan_priv *priv = netdev_priv(ndev);
- u32 ier;
/* Disable interrupts and leave the can in configuration mode */
- ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~XCAN_INTR_ALL;
- priv->write_reg(priv, XCAN_IER_OFFSET, ier);
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
+ set_reset_mode(ndev);
priv->can.state = CAN_STATE_STOPPED;
}
@@ -1123,10 +1119,15 @@ static const struct net_device_ops xcan_netdev_ops = {
*/
static int __maybe_unused xcan_suspend(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_suspend(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
- return 0;
+ if (netif_running(ndev)) {
+ netif_stop_queue(ndev);
+ netif_device_detach(ndev);
+ xcan_chip_stop(ndev);
+ }
+
+ return pm_runtime_force_suspend(dev);
}
/**
@@ -1138,11 +1139,27 @@ static int __maybe_unused xcan_suspend(struct device *dev)
*/
static int __maybe_unused xcan_resume(struct device *dev)
{
- if (!device_may_wakeup(dev))
- return pm_runtime_force_resume(dev);
+ struct net_device *ndev = dev_get_drvdata(dev);
+ int ret;
- return 0;
+ ret = pm_runtime_force_resume(dev);
+ if (ret) {
+ dev_err(dev, "pm_runtime_force_resume failed on resume\n");
+ return ret;
+ }
+
+ if (netif_running(ndev)) {
+ ret = xcan_chip_start(ndev);
+ if (ret) {
+ dev_err(dev, "xcan_chip_start failed on resume\n");
+ return ret;
+ }
+ netif_device_attach(ndev);
+ netif_start_queue(ndev);
+ }
+
+ return 0;
}
/**
@@ -1157,14 +1174,6 @@ static int __maybe_unused xcan_runtime_suspend(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
- if (netif_running(ndev)) {
- netif_stop_queue(ndev);
- netif_device_detach(ndev);
- }
-
- priv->write_reg(priv, XCAN_MSR_OFFSET, XCAN_MSR_SLEEP_MASK);
- priv->can.state = CAN_STATE_SLEEPING;
-
clk_disable_unprepare(priv->bus_clk);
clk_disable_unprepare(priv->can_clk);
@@ -1183,7 +1192,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
struct net_device *ndev = dev_get_drvdata(dev);
struct xcan_priv *priv = netdev_priv(ndev);
int ret;
- u32 isr, status;
ret = clk_prepare_enable(priv->bus_clk);
if (ret) {
@@ -1197,27 +1205,6 @@ static int __maybe_unused xcan_runtime_resume(struct device *dev)
return ret;
}
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
- status = priv->read_reg(priv, XCAN_SR_OFFSET);
-
- if (netif_running(ndev)) {
- if (isr & XCAN_IXR_BSOFF_MASK) {
- priv->can.state = CAN_STATE_BUS_OFF;
- priv->write_reg(priv, XCAN_SRR_OFFSET,
- XCAN_SRR_RESET_MASK);
- } else if ((status & XCAN_SR_ESTAT_MASK) ==
- XCAN_SR_ESTAT_MASK) {
- priv->can.state = CAN_STATE_ERROR_PASSIVE;
- } else if (status & XCAN_SR_ERRWRN_MASK) {
- priv->can.state = CAN_STATE_ERROR_WARNING;
- } else {
- priv->can.state = CAN_STATE_ERROR_ACTIVE;
- }
- netif_device_attach(ndev);
- netif_start_queue(ndev);
- }
-
return 0;
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
them is asserted. This does not take into account that some of them
could have been asserted between interrupt status read and interrupt
clear, therefore clearing them without handling them.
Fix the code to only clear those interrupts that it knows are asserted
and therefore going to be processed in xcan_err_interrupt().
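The fix boils down to snapshotting the asserted error bits once and
clearing exactly that snapshot (abridged from the diff below):

  u32 isr_errors = isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
  			XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK);
  if (isr_errors) {
  	/* Clear only what was seen; bits asserted after the ISR
  	 * read survive and raise the next interrupt. */
  	priv->write_reg(priv, XCAN_ICR_OFFSET, isr_errors);
  	xcan_err_interrupt(ndev, isr);
  }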
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: Michal Simek <michal.simek(a)xilinx.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index ea9f9d1a5ba7..cb80a9aa7281 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -938,6 +938,7 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
struct net_device *ndev = (struct net_device *)dev_id;
struct xcan_priv *priv = netdev_priv(ndev);
u32 isr, ier;
+ u32 isr_errors;
/* Get the interrupt status from Xilinx CAN */
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
@@ -956,11 +957,10 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
xcan_tx_interrupt(ndev, isr);
/* Check for the type of error interrupt and Processing it */
- if (isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
- XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK)) {
- priv->write_reg(priv, XCAN_ICR_OFFSET, (XCAN_IXR_ERROR_MASK |
- XCAN_IXR_RXOFLW_MASK | XCAN_IXR_BSOFF_MASK |
- XCAN_IXR_ARBLST_MASK));
+ isr_errors = isr & (XCAN_IXR_ERROR_MASK | XCAN_IXR_RXOFLW_MASK |
+ XCAN_IXR_BSOFF_MASK | XCAN_IXR_ARBLST_MASK);
+ if (isr_errors) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, isr_errors);
xcan_err_interrupt(ndev, isr);
}
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.
However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.
Similarly, testing seems to also suggest that it is immediately cleared
regardless of the number of frames having been sent. Performing some
heavy TX load and then going back to idle leaves tx_head drifting
further away from tx_tail over time, steadily reducing the number of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.
The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
BUG!, TX FIFO full when queue awake!
messages to be output.
There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.
The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.
Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.
Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.
An alternative way to solve this would be to drop local echo support but
keep using the full TX FIFO.
v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.
v3: Keep local echo support and reduce the amount of frames in FIFO
instead as suggested by Marc Kleine-Budde.
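The completion accounting the patch arrives at can be sketched as
follows (illustrative helper; the real logic, including the TXOK
re-read loop, is in the diff below): TXOK alone only proves that at
least one frame left the FIFO, and with at most two frames
outstanding, TXFEMP observed after clearing TXOK distinguishes "both
sent" from "one still in flight".

  /* Sketch: decide how many frames completed for this TXOK. */
  static int frames_completed(u32 isr, unsigned int frames_in_fifo)
  {
  	int sent = 1;			/* TXOK => at least one sent */

  	if (frames_in_fifo > 1 && (isr & XCAN_IXR_TXFEMP_MASK))
  		sent = frames_in_fifo;	/* FIFO drained: all sent */

  	return sent;
  }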
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 139 +++++++++++++++++++++++++++++++----
1 file changed, 123 insertions(+), 16 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 763408a3eafb..dcbdc3cd651c 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -26,8 +26,10 @@
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/of.h>
+#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/skbuff.h>
+#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/can/dev.h>
@@ -119,6 +121,7 @@ enum xcan_reg {
/**
* struct xcan_priv - This definition define CAN driver instance
* @can: CAN private data structure.
+ * @tx_lock: Lock for synchronizing TX interrupt handling
* @tx_head: Tx CAN packets ready to send on the queue
* @tx_tail: Tx CAN packets successfully sended on the queue
* @tx_max: Maximum number packets the driver can send
@@ -133,6 +136,7 @@ enum xcan_reg {
*/
struct xcan_priv {
struct can_priv can;
+ spinlock_t tx_lock;
unsigned int tx_head;
unsigned int tx_tail;
unsigned int tx_max;
@@ -160,6 +164,11 @@ static const struct can_bittiming_const xcan_bittiming_const = {
.brp_inc = 1,
};
+#define XCAN_CAP_WATERMARK 0x0001
+struct xcan_devtype_data {
+ unsigned int caps;
+};
+
/**
* xcan_write_reg_le - Write a value to the device register little endian
* @priv: Driver private data structure
@@ -239,6 +248,10 @@ static int set_reset_mode(struct net_device *ndev)
usleep_range(500, 10000);
}
+ /* reset clears FIFOs */
+ priv->tx_head = 0;
+ priv->tx_tail = 0;
+
return 0;
}
@@ -393,6 +406,7 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
struct net_device_stats *stats = &ndev->stats;
struct can_frame *cf = (struct can_frame *)skb->data;
u32 id, dlc, data[2] = {0, 0};
+ unsigned long flags;
if (can_dropped_invalid_skb(ndev, skb))
return NETDEV_TX_OK;
@@ -440,6 +454,9 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
data[1] = be32_to_cpup((__be32 *)(cf->data + 4));
can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max);
+
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
priv->tx_head++;
/* Write the Frame to Xilinx CAN TX FIFO */
@@ -455,10 +472,16 @@ static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
stats->tx_bytes += cf->can_dlc;
}
+ /* Clear TX-FIFO-empty interrupt for xcan_tx_interrupt() */
+ if (priv->tx_max > 1)
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXFEMP_MASK);
+
/* Check if the TX buffer is full */
if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
netif_stop_queue(ndev);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
return NETDEV_TX_OK;
}
@@ -832,19 +855,71 @@ static void xcan_tx_interrupt(struct net_device *ndev, u32 isr)
{
struct xcan_priv *priv = netdev_priv(ndev);
struct net_device_stats *stats = &ndev->stats;
+ unsigned int frames_in_fifo;
+ int frames_sent = 1; /* TXOK => at least 1 frame was sent */
+ unsigned long flags;
+ int retries = 0;
+
+ /* Synchronize with xmit as we need to know the exact number
+ * of frames in the FIFO to stay in sync due to the TXFEMP
+ * handling.
+ * This also prevents a race between netif_wake_queue() and
+ * netif_stop_queue().
+ */
+ spin_lock_irqsave(&priv->tx_lock, flags);
+
+ frames_in_fifo = priv->tx_head - priv->tx_tail;
- while ((priv->tx_head - priv->tx_tail > 0) &&
- (isr & XCAN_IXR_TXOK_MASK)) {
+ if (WARN_ON_ONCE(frames_in_fifo == 0)) {
+ /* clear TXOK anyway to avoid getting back here */
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+ return;
+ }
+
+ /* Check if 2 frames were sent (TXOK only means that at least 1
+ * frame was sent).
+ */
+ if (frames_in_fifo > 1) {
+ WARN_ON(frames_in_fifo > priv->tx_max);
+
+ /* Synchronize TXOK and isr so that after the loop:
+ * (1) isr variable is up-to-date at least up to TXOK clear
+ * time. This avoids us clearing a TXOK of a second frame
+ * but not noticing that the FIFO is now empty and thus
+ * marking only a single frame as sent.
+ * (2) No TXOK is left. Having one could mean leaving a
+ * stray TXOK as we might process the associated frame
+ * via TXFEMP handling as we read TXFEMP *after* TXOK
+ * clear to satisfy (1).
+ */
+ while ((isr & XCAN_IXR_TXOK_MASK) && !WARN_ON(++retries == 100)) {
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
+ }
+
+ if (isr & XCAN_IXR_TXFEMP_MASK) {
+ /* nothing in FIFO anymore */
+ frames_sent = frames_in_fifo;
+ }
+ } else {
+ /* single frame in fifo, just clear TXOK */
+ priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_TXOK_MASK);
+ }
+
+ while (frames_sent--) {
can_get_echo_skb(ndev, priv->tx_tail %
priv->tx_max);
priv->tx_tail++;
stats->tx_packets++;
- isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
+
+ netif_wake_queue(ndev);
+
+ spin_unlock_irqrestore(&priv->tx_lock, flags);
+
can_led_event(ndev, CAN_LED_EVENT_TX);
xcan_update_error_state_after_rxtx(ndev);
- netif_wake_queue(ndev);
}
/**
@@ -1151,6 +1226,18 @@ static const struct dev_pm_ops xcan_dev_pm_ops = {
SET_RUNTIME_PM_OPS(xcan_runtime_suspend, xcan_runtime_resume, NULL)
};
+static const struct xcan_devtype_data xcan_zynq_data = {
+ .caps = XCAN_CAP_WATERMARK,
+};
+
+/* Match table for OF platform binding */
+static const struct of_device_id xcan_of_match[] = {
+ { .compatible = "xlnx,zynq-can-1.0", .data = &xcan_zynq_data },
+ { .compatible = "xlnx,axi-can-1.00.a", },
+ { /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, xcan_of_match);
+
/**
* xcan_probe - Platform registration call
* @pdev: Handle to the platform device structure
@@ -1165,8 +1252,10 @@ static int xcan_probe(struct platform_device *pdev)
struct resource *res; /* IO mem resources */
struct net_device *ndev;
struct xcan_priv *priv;
+ const struct of_device_id *of_id;
+ int caps = 0;
void __iomem *addr;
- int ret, rx_max, tx_max;
+ int ret, rx_max, tx_max, tx_fifo_depth;
/* Get the virtual base address for the device */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1176,7 +1265,8 @@ static int xcan_probe(struct platform_device *pdev)
goto err;
}
- ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth", &tx_max);
+ ret = of_property_read_u32(pdev->dev.of_node, "tx-fifo-depth",
+ &tx_fifo_depth);
if (ret < 0)
goto err;
@@ -1184,6 +1274,30 @@ static int xcan_probe(struct platform_device *pdev)
if (ret < 0)
goto err;
+ of_id = of_match_device(xcan_of_match, &pdev->dev);
+ if (of_id) {
+ const struct xcan_devtype_data *devtype_data = of_id->data;
+
+ if (devtype_data)
+ caps = devtype_data->caps;
+ }
+
+ /* There is no way to directly figure out how many frames have been
+ * sent when the TXOK interrupt is processed. If watermark programming
+ * is supported, we can have 2 frames in the FIFO and use TXFEMP
+ * to determine if 1 or 2 frames have been sent.
+ * Theoretically we should be able to use TXFWMEMP to determine up
+ * to 3 frames, but it seems that after putting a second frame in the
+ * FIFO, with watermark at 2 frames, it can happen that TXFWMEMP (less
+ * than 2 frames in FIFO) is set anyway with no TXOK (a frame was
+ * sent), which is not a sensible state - possibly TXFWMEMP is not
+ * completely synchronized with the rest of the bits?
+ */
+ if (caps & XCAN_CAP_WATERMARK)
+ tx_max = min(tx_fifo_depth, 2);
+ else
+ tx_max = 1;
+
/* Create a CAN device instance */
ndev = alloc_candev(sizeof(struct xcan_priv), tx_max);
if (!ndev)
@@ -1198,6 +1312,7 @@ static int xcan_probe(struct platform_device *pdev)
CAN_CTRLMODE_BERR_REPORTING;
priv->reg_base = addr;
priv->tx_max = tx_max;
+ spin_lock_init(&priv->tx_lock);
/* Get IRQ for the device */
ndev->irq = platform_get_irq(pdev, 0);
@@ -1262,9 +1377,9 @@ static int xcan_probe(struct platform_device *pdev)
pm_runtime_put(&pdev->dev);
- netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth:%d\n",
+ netdev_dbg(ndev, "reg_base=0x%p irq=%d clock=%d, tx fifo depth: actual %d, using %d\n",
priv->reg_base, ndev->irq, priv->can.clock.freq,
- priv->tx_max);
+ tx_fifo_depth, priv->tx_max);
return 0;
@@ -1298,14 +1413,6 @@ static int xcan_remove(struct platform_device *pdev)
return 0;
}
-/* Match table for OF platform binding */
-static const struct of_device_id xcan_of_match[] = {
- { .compatible = "xlnx,zynq-can-1.0", },
- { .compatible = "xlnx,axi-can-1.00.a", },
- { /* end of list */ },
-};
-MODULE_DEVICE_TABLE(of, xcan_of_match);
-
static struct platform_driver xcan_driver = {
.probe = xcan_probe,
.remove = xcan_remove,
--
2.18.0
From: Anssi Hannula <anssi.hannula(a)bitwise.fi>
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
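The simplified poll loop (abridged from the diff below) drives
everything off RXNEMP alone, clearing it only after a frame has
actually been read out:

  isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
  while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
  	work_done += xcan_rx(ndev);	/* read one frame from the FIFO */
  	priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
  	isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
  }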
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/xilinx_can.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 389a9603db8c..1bda47aa62f5 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -101,7 +101,7 @@ enum xcan_reg {
#define XCAN_INTR_ALL (XCAN_IXR_TXOK_MASK | XCAN_IXR_BSOFF_MASK |\
XCAN_IXR_WKUP_MASK | XCAN_IXR_SLP_MASK | \
XCAN_IXR_RXNEMP_MASK | XCAN_IXR_ERROR_MASK | \
- XCAN_IXR_ARBLST_MASK | XCAN_IXR_RXOK_MASK)
+ XCAN_IXR_ARBLST_MASK)
/* CAN register bit shift - XCAN_<REG>_<BIT>_SHIFT */
#define XCAN_BTR_SJW_SHIFT 7 /* Synchronous jump width */
@@ -708,15 +708,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
- if (isr & XCAN_IXR_RXOK_MASK) {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXOK_MASK);
- work_done += xcan_rx(ndev);
- } else {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXNEMP_MASK);
- break;
- }
+ work_done += xcan_rx(ndev);
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
@@ -727,7 +719,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
if (work_done < quota) {
napi_complete_done(napi, work_done);
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier |= (XCAN_IXR_RXOK_MASK | XCAN_IXR_RXNEMP_MASK);
+ ier |= XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
}
return work_done;
@@ -799,9 +791,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
}
/* Check for the type of receive interrupt and Processing it */
- if (isr & (XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK)) {
+ if (isr & XCAN_IXR_RXNEMP_MASK) {
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~(XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK);
+ ier &= ~XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
napi_schedule(&priv->napi);
}
--
2.18.0