From: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Commit 372b2b0399fc ("media: v4l: vsp1: Release buffers in
start_streaming error path") introduced a helper to clean up buffers on
error paths, but inadvertently changed the code such that only the
output WPF buffers were cleaned, rather than the video node being
operated on.
Since then vsp1_video_cleanup_pipeline() has grown to perform both video
node cleanup, as well as pipeline cleanup. Split the implementation into
two distinct functions that perform the required work, so that each
video node can release it's buffers correctly on streamoff. The pipe
cleanup that was performed in the vsp1_video_stop_streaming() (releasing
the pipe->dl) is moved to the function for clarity.
Fixes: 372b2b0399fc ("media: v4l: vsp1: Release buffers in start_streaming error path")
Cc: stable(a)vger.kernel.org # v4.13+
Signed-off-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
---
drivers/media/platform/vsp1/vsp1_video.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index c8c12223a267..ba89dd176a13 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -842,9 +842,8 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
return 0;
}
-static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+static void vsp1_video_release_buffers(struct vsp1_video *video)
{
- struct vsp1_video *video = pipe->output->video;
struct vsp1_vb2_buffer *buffer;
unsigned long flags;
@@ -854,12 +853,18 @@ static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
vb2_buffer_done(&buffer->buf.vb2_buf, VB2_BUF_STATE_ERROR);
INIT_LIST_HEAD(&video->irqqueue);
spin_unlock_irqrestore(&video->irqlock, flags);
+}
+
+static void vsp1_video_cleanup_pipeline(struct vsp1_pipeline *pipe)
+{
+ lockdep_assert_held(&pipe->lock);
/* Release our partition table allocation */
- mutex_lock(&pipe->lock);
kfree(pipe->part_table);
pipe->part_table = NULL;
- mutex_unlock(&pipe->lock);
+
+ vsp1_dl_list_put(pipe->dl);
+ pipe->dl = NULL;
}
static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
@@ -874,8 +879,9 @@ static int vsp1_video_start_streaming(struct vb2_queue *vq, unsigned int count)
if (pipe->stream_count == pipe->num_inputs) {
ret = vsp1_video_setup_pipeline(pipe);
if (ret < 0) {
- mutex_unlock(&pipe->lock);
+ vsp1_video_release_buffers(video);
vsp1_video_cleanup_pipeline(pipe);
+ mutex_unlock(&pipe->lock);
return ret;
}
@@ -925,13 +931,12 @@ static void vsp1_video_stop_streaming(struct vb2_queue *vq)
if (ret == -ETIMEDOUT)
dev_err(video->vsp1->dev, "pipeline stop timeout\n");
- vsp1_dl_list_put(pipe->dl);
- pipe->dl = NULL;
+ vsp1_video_cleanup_pipeline(pipe);
}
mutex_unlock(&pipe->lock);
media_pipeline_stop(&video->video.entity);
- vsp1_video_cleanup_pipeline(pipe);
+ vsp1_video_release_buffers(video);
vsp1_video_pipeline_put(pipe);
}
--
git-series 0.9.1
This is the start of the stable review cycle for the 4.9.101 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun May 20 08:15:20 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.101-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.101-rc1
Willy Tarreau <w(a)1wt.eu>
proc: do not access cmdline nor environ from file-backed areas
Jakub Kicinski <jakub.kicinski(a)netronome.com>
nfp: TX time stamp packets before HW doorbell is rung
James Chapman <jchapman(a)katalix.com>
l2tp: revert "l2tp: fix missing print session offset info"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap"
Vasily Averin <vvs(a)virtuozzo.com>
lockd: lost rollback of set_grace_period() in lockd_down_net()
Antony Antony <antony(a)phenome.org>
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
Jiri Slaby <jslaby(a)suse.cz>
futex: Remove duplicated code and fix undefined behaviour
Alexey Khoroshilov <khoroshilov(a)ispras.ru>
serial: sccnxp: Fix error handling in sccnxp_probe()
Xin Long <lucien.xin(a)gmail.com>
sctp: delay the authentication for the duplicated cookie-echo chunk
Xin Long <lucien.xin(a)gmail.com>
sctp: fix the issue that the cookie-ack with auth can't get processed
Yuchung Cheng <ycheng(a)google.com>
tcp: ignore Fast Open on repair mode
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: send learning packets for vlans on slave
Talat Batheesh <talatb(a)mellanox.com>
net/mlx5: Avoid cleaning flow steering table twice during error flow
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: do not allow rlb updates to invalid mac
Michael Chan <michael.chan(a)broadcom.com>
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Xin Long <lucien.xin(a)gmail.com>
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
Xin Long <lucien.xin(a)gmail.com>
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
Xin Long <lucien.xin(a)gmail.com>
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix powering up RTL8168h
Bjørn Mork <bjorn(a)mork.no>
qmi_wwan: do not steal interfaces from class drivers
Stefano Brivio <sbrivio(a)redhat.com>
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
Lance Richardson <lance.richardson.net(a)gmail.com>
net: support compat 64-bit time in {s,g}etsockopt
Eric Dumazet <edumazet(a)google.com>
net_sched: fq: take care of throttled flows before reuse
Adi Nissim <adin(a)mellanox.com>
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
Moshe Shemesh <moshe(a)mellanox.com>
net/mlx4_en: Verify coalescing parameters are in range
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
Rob Taglang <rob(a)taglang.io>
net: ethernet: sun: niu set correct packet size in skb
Eric Dumazet <edumazet(a)google.com>
llc: better deal with too small mtu
Andrey Ignatov <rdna(a)fb.com>
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Eric Dumazet <edumazet(a)google.com>
dccp: fix tasklet usage
Hangbin Liu <liuhangbin(a)gmail.com>
bridge: check iface upper dev when setting master via ioctl
Ingo Molnar <mingo(a)elte.hu>
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
-------------
Diffstat:
Makefile | 4 +-
arch/alpha/include/asm/futex.h | 26 ++-----
arch/arc/include/asm/futex.h | 40 ++--------
arch/arm/boot/dts/imx6qdl-wandboard.dtsi | 1 -
arch/arm/include/asm/futex.h | 26 +------
arch/arm64/include/asm/futex.h | 27 +------
arch/frv/include/asm/futex.h | 3 +-
arch/frv/kernel/futex.c | 27 +------
arch/hexagon/include/asm/futex.h | 38 +--------
arch/ia64/include/asm/futex.h | 25 +-----
arch/microblaze/include/asm/futex.h | 38 +--------
arch/mips/include/asm/futex.h | 25 +-----
arch/parisc/include/asm/futex.h | 26 +------
arch/powerpc/include/asm/futex.h | 26 ++-----
arch/s390/include/asm/futex.h | 23 ++----
arch/sh/include/asm/futex.h | 26 +------
arch/sparc/include/asm/futex_64.h | 26 ++-----
arch/tile/include/asm/futex.h | 40 ++--------
arch/x86/include/asm/futex.h | 40 ++--------
arch/xtensa/include/asm/futex.h | 27 ++-----
drivers/net/bonding/bond_alb.c | 15 ++--
drivers/net/bonding/bond_main.c | 2 +
drivers/net/ethernet/broadcom/tg3.c | 9 ++-
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 ++++
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 11 ++-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 21 +++--
.../net/ethernet/netronome/nfp/nfp_net_common.c | 4 +-
drivers/net/ethernet/realtek/8139too.c | 2 +-
drivers/net/ethernet/realtek/r8169.c | 3 +
drivers/net/ethernet/sun/niu.c | 5 +-
drivers/net/ethernet/ti/cpsw.c | 2 +
drivers/net/usb/qmi_wwan.c | 12 +++
drivers/tty/serial/sccnxp.c | 13 +++-
fs/lockd/svc.c | 2 +
fs/proc/base.c | 10 +--
include/asm-generic/futex.h | 50 +++---------
include/linux/mm.h | 1 +
include/net/bonding.h | 1 +
kernel/futex.c | 39 ++++++++++
mm/gup.c | 3 +
net/bridge/br_if.c | 4 +-
net/compat.c | 6 +-
net/dccp/ccids/ccid2.c | 14 +++-
net/dccp/timer.c | 2 +-
net/ipv4/ping.c | 7 +-
net/ipv4/tcp.c | 2 +-
net/ipv4/tcp_bbr.c | 4 +-
net/ipv4/udp.c | 7 +-
net/l2tp/l2tp_netlink.c | 2 -
net/llc/af_llc.c | 3 +
net/openvswitch/flow_netlink.c | 9 +--
net/sched/sch_fq.c | 37 ++++++---
net/sctp/associola.c | 30 +++++++-
net/sctp/inqueue.c | 2 +-
net/sctp/ipv6.c | 3 +
net/sctp/sm_statefuns.c | 89 ++++++++++++----------
net/sctp/ulpevent.c | 1 -
net/xfrm/xfrm_state.c | 1 +
59 files changed, 380 insertions(+), 585 deletions(-)
This is the start of the stable review cycle for the 4.14.42 release.
There are 45 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun May 20 08:15:14 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.42-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.42-rc1
Willy Tarreau <w(a)1wt.eu>
proc: do not access cmdline nor environ from file-backed areas
James Chapman <jchapman(a)katalix.com>
l2tp: revert "l2tp: fix missing print session offset info"
Antony Antony <antony(a)phenome.org>
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
ethanwu <ethanwu(a)synology.com>
btrfs: Take trans lock before access running trans in check_delayed_ref
Herbert Xu <herbert(a)gondor.apana.org.au>
xfrm: Use __skb_queue_tail in xfrm_trans_queue
Dave Carroll <david.carroll(a)microsemi.com>
scsi: aacraid: Correct hba_send to include iu_type
Paolo Abeni <pabeni(a)redhat.com>
udp: fix SO_BINDTODEVICE
Eric Dumazet <edumazet(a)google.com>
nsh: fix infinite loop
Jianbo Liu <jianbol(a)mellanox.com>
net/mlx5e: Allow offloading ipv4 header re-write for icmp
Eric Dumazet <edumazet(a)google.com>
ipv6: fix uninit-value in ip6_multipath_l3_keys()
Stephen Hemminger <stephen(a)networkplumber.org>
hv_netvsc: set master device
Talat Batheesh <talatb(a)mellanox.com>
net/mlx5: Avoid cleaning flow steering table twice during error flow
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TX, Use correct counter in dma_map error flow
Jiri Pirko <jiri(a)mellanox.com>
net: sched: fix error path in tcf_proto_create() when modules are not configured
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: send learning packets for vlans on slave
Debabrata Banerjee <dbanerje(a)akamai.com>
bonding: do not allow rlb updates to invalid mac
Michael Chan <michael.chan(a)broadcom.com>
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
Yuchung Cheng <ycheng(a)google.com>
tcp: ignore Fast Open on repair mode
Neal Cardwell <ncardwell(a)google.com>
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
Xin Long <lucien.xin(a)gmail.com>
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
Xin Long <lucien.xin(a)gmail.com>
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
Xin Long <lucien.xin(a)gmail.com>
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
Xin Long <lucien.xin(a)gmail.com>
sctp: fix the issue that the cookie-ack with auth can't get processed
Xin Long <lucien.xin(a)gmail.com>
sctp: delay the authentication for the duplicated cookie-echo chunk
Eric Dumazet <edumazet(a)google.com>
rds: do not leak kernel memory to user land
Heiner Kallweit <hkallweit1(a)gmail.com>
r8169: fix powering up RTL8168h
Bjørn Mork <bjorn(a)mork.no>
qmi_wwan: do not steal interfaces from class drivers
Stefano Brivio <sbrivio(a)redhat.com>
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
Andre Tomt <andre(a)tomt.net>
net/tls: Fix connection stall on partial tls record
Dave Watson <davejwatson(a)fb.com>
net/tls: Don't recursively call push_record during tls_write_space callbacks
Lance Richardson <lance.richardson.net(a)gmail.com>
net: support compat 64-bit time in {s,g}etsockopt
Eric Dumazet <edumazet(a)google.com>
net_sched: fq: take care of throttled flows before reuse
Roman Mashak <mrv(a)mojatatu.com>
net sched actions: fix refcnt leak in skbmod
Adi Nissim <adin(a)mellanox.com>
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
Roi Dayan <roid(a)mellanox.com>
net/mlx5e: Err if asked to offload TC match on frag being first
Moshe Shemesh <moshe(a)mellanox.com>
net/mlx4_en: Verify coalescing parameters are in range
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
Rob Taglang <rob(a)taglang.io>
net: ethernet: sun: niu set correct packet size in skb
Eric Dumazet <edumazet(a)google.com>
llc: better deal with too small mtu
Andrey Ignatov <rdna(a)fb.com>
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Julian Anastasov <ja(a)ssi.bg>
ipv4: fix fnhe usage by non-cached routes
Eric Dumazet <edumazet(a)google.com>
dccp: fix tasklet usage
Hangbin Liu <liuhangbin(a)gmail.com>
bridge: check iface upper dev when setting master via ioctl
Ingo Molnar <mingo(a)elte.hu>
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
-------------
Diffstat:
Makefile | 4 +-
drivers/net/bonding/bond_alb.c | 15 +--
drivers/net/bonding/bond_main.c | 2 +
drivers/net/ethernet/broadcom/tg3.c | 9 +-
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 16 +++
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 8 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 20 ++--
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 11 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 23 +++--
drivers/net/ethernet/realtek/8139too.c | 2 +-
drivers/net/ethernet/realtek/r8169.c | 3 +
drivers/net/ethernet/sun/niu.c | 5 +-
drivers/net/ethernet/ti/cpsw.c | 2 +
drivers/net/hyperv/netvsc_drv.c | 3 +-
drivers/net/usb/qmi_wwan.c | 12 +++
drivers/scsi/aacraid/commsup.c | 8 +-
fs/btrfs/extent-tree.c | 7 ++
fs/proc/base.c | 8 +-
include/linux/mm.h | 1 +
include/net/bonding.h | 1 +
include/net/tls.h | 1 +
mm/gup.c | 3 +
net/bridge/br_if.c | 4 +-
net/compat.c | 6 +-
net/dccp/ccids/ccid2.c | 14 ++-
net/dccp/timer.c | 2 +-
net/ipv4/ping.c | 7 +-
net/ipv4/route.c | 118 ++++++++++------------
net/ipv4/tcp.c | 3 +-
net/ipv4/tcp_bbr.c | 4 +-
net/ipv4/udp.c | 11 +-
net/ipv6/route.c | 7 +-
net/ipv6/udp.c | 4 +-
net/l2tp/l2tp_netlink.c | 2 -
net/llc/af_llc.c | 3 +
net/nsh/nsh.c | 2 +
net/openvswitch/flow_netlink.c | 9 +-
net/rds/recv.c | 1 +
net/sched/act_skbmod.c | 5 +-
net/sched/cls_api.c | 2 +-
net/sched/sch_fq.c | 37 ++++---
net/sctp/associola.c | 30 +++++-
net/sctp/inqueue.c | 2 +-
net/sctp/ipv6.c | 3 +
net/sctp/sm_statefuns.c | 88 ++++++++--------
net/sctp/ulpevent.c | 1 -
net/tls/tls_main.c | 8 ++
net/xfrm/xfrm_input.c | 2 +-
net/xfrm/xfrm_state.c | 1 +
51 files changed, 349 insertions(+), 205 deletions(-)
Entry corresponding to 220 us setup time was missing. I am not aware of
any specific bug this fixes, but this could potentially result in enabling
PSR on a panel with a higher setup time requirement than supported by the
hardware.
I verified the value is present in eDP spec versions 1.3, 1.4 and 1.4a.
Fixes: 6608804b3d7f ("drm/dp: Add drm_dp_psr_setup_time()")
Cc: stable(a)vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Jose Roberto de Souza <jose.souza(a)intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan(a)intel.com>
---
drivers/gpu/drm/drm_dp_helper.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 36c7609a4bd5..a7ba602a43a8 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1159,6 +1159,7 @@ int drm_dp_psr_setup_time(const u8 psr_cap[EDP_PSR_RECEIVER_CAP_SIZE])
static const u16 psr_setup_time_us[] = {
PSR_SETUP_TIME(330),
PSR_SETUP_TIME(275),
+ PSR_SETUP_TIME(220),
PSR_SETUP_TIME(165),
PSR_SETUP_TIME(110),
PSR_SETUP_TIME(55),
--
2.14.1
In 08810a4119aaebf6318f209ec5dd9828e969cba4 setting
dev->power.direct_complete was made conditional on
pm_runtime_suspended().
The justification was:
While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.
However, this breaks resuming from suspend on those newer HP laptops
if the amdgpu driver is used (due to hybrid intel+radeon graphics). Given
the justification for the change, undoing it seems best as it
appears to have unintended side effects.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199693
References: https://bugs.freedesktop.org/show_bug.cgi?id=106447
Signed-off-by: Thomas Martitz <kugel(a)rockbox.org>
Cc: Pavel Machek <pavel(a)ucw.cz>
Cc: Len Brown <len.brown(a)intel.com>
Cc: <linux-pm(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org> [4.15+]
Signed-off-by: Thomas Martitz <kugel(a)rockbox.org>
---
drivers/base/power/main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 02a497e7c785..b2fb0974f832 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1960,8 +1960,7 @@ static int device_prepare(struct device *dev, pm_message_t state)
*/
spin_lock_irq(&dev->power.lock);
dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
- pm_runtime_suspended(dev) && ret > 0 &&
- !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
+ ret > 0 && !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
spin_unlock_irq(&dev->power.lock);
return 0;
}
--
2.17.0
Hi,
Commit 820da5357572 ("l2tp: fix missing print session offset info") has
been backported to several -stable trees (AFAICS 3.18, 4.4, 4.9, 4.14
and 4.15). This patch has been reverted upstream as the L2TP offset
option was dropped. Therefore it doesn't make sense to start exporting
this data in stable releases.
Can you guys revert the corresponding commits from your trees, or queue
up de3b58bc359a ("l2tp: revert "l2tp: fix missing print session offset info"")?
If some of you have 820da5357572 queued up for other trees, then please
drop it.
Guillaume
Since Linux v4.10 release (commit 1d9174fbc55e "PM / Runtime: Defer
resuming of the device in pm_runtime_force_resume()"),
pm_runtime_force_resume() function doesn't runtime resume device if it was
not runtime active before system suspend. Thus, driver should not do any
register access after pm_runtime_force_resume() without checking the
runtime status of the device. To fix this issue, simply move
s3c64xx_spi_hwinit() call to s3c64xx_spi_runtime_resume() to ensure that
hardware is always properly initialized. This fixes Synchronous external
abort issue on system suspend/resume cycle on newer Exynos SoCs.
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
CC: <stable(a)vger.kernel.org> # 4.10.x: 1c75862d8e5a spi: spi-s3c64xx: Remove unused s3c64xx_spi_hwinit()
CC: <stable(a)vger.kernel.org> # 4.10.x
Reviewed-by: Krzysztof Kozlowski <krzk(a)kernel.org>
Acked-by: Andi Shyti <andi(a)etezian.org>
---
Resend reason: added cc: stable, reviewed and acked tags
---
drivers/spi/spi-s3c64xx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
index f55dc78957ad..7b7151ec14c8 100644
--- a/drivers/spi/spi-s3c64xx.c
+++ b/drivers/spi/spi-s3c64xx.c
@@ -1292,8 +1292,6 @@ static int s3c64xx_spi_resume(struct device *dev)
if (ret < 0)
return ret;
- s3c64xx_spi_hwinit(sdd);
-
return spi_master_resume(master);
}
#endif /* CONFIG_PM_SLEEP */
@@ -1331,6 +1329,8 @@ static int s3c64xx_spi_runtime_resume(struct device *dev)
if (ret != 0)
goto err_disable_src_clk;
+ s3c64xx_spi_hwinit(sdd);
+
return 0;
err_disable_src_clk:
--
2.17.0
The audit_filter_rules() function in auditsc.c used the in_[e]group_p()
functions to check GID/EGID match, but these functions use the current
task's credentials, while the comparison should use the credentials of
the task given to audit_filter_rules() as a parameter (tsk).
Note that we can use group_search(cred->group_info, ...) as a
replacement for both in_group_p and in_egroup_p as these functions only
compare the parameter to cred->fsgid/egid and then call group_search.
In fact, the usage of in_group_p was incorrect also because it compared
to cred->fsgid and not cred->gid.
GitHub issue:
https://github.com/linux-audit/audit-kernel/issues/82
Fixes: 37eebe39c973 ("audit: improve GID/EGID comparation logic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace(a)redhat.com>
---
kernel/auditsc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index cbab0da86d15..ec38e4d97c23 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -490,20 +490,20 @@ static int audit_filter_rules(struct task_struct *tsk,
result = audit_gid_comparator(cred->gid, f->op, f->gid);
if (f->op == Audit_equal) {
if (!result)
- result = in_group_p(f->gid);
+ result = groups_search(cred->group_info, f->gid);
} else if (f->op == Audit_not_equal) {
if (result)
- result = !in_group_p(f->gid);
+ result = !groups_search(cred->group_info, f->gid);
}
break;
case AUDIT_EGID:
result = audit_gid_comparator(cred->egid, f->op, f->gid);
if (f->op == Audit_equal) {
if (!result)
- result = in_egroup_p(f->gid);
+ result = groups_search(cred->group_info, f->gid);
} else if (f->op == Audit_not_equal) {
if (result)
- result = !in_egroup_p(f->gid);
+ result = !groups_search(cred->group_info, f->gid);
}
break;
case AUDIT_SGID:
--
2.17.0
The ONFI spec clearly says that FAIL bit is only valid for PROGRAM,
ERASE and READ-with-on-die-ECC operations, and should be ignored
otherwise.
It seems that checking it after sending a SET_FEATURES is a bad idea
because a previous READ, PROGRAM or ERASE op may have failed, and
depending on the implementation, the FAIL bit is not cleared until a
new READ, PROGRAM or ERASE is started.
This leads to ->set_features() returning -EIO while it actually worked,
which can sometimes stop a batch of READ/PROGRAM ops.
Note that we only fix the ->exec_op() path here, because some drivers
are abusing the NAND_STATUS_FAIL flag in their ->waitfunc()
implementation to propagate other kind of errors, like
wait-ready-timeout or controller-related errors. Let's not try to fix
those drivers since they worked fine so far.
Fixes: 8878b126df76 ("mtd: nand: add ->exec_op() implementation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
---
This patch is fixing a problem we had with on-die ECC on Micron
NANDs [1].
On these chips, when you have an ECC failure, the FAIL bit is set and
it's not cleared until the next READ operation, which led the following
SET_FEATURES (used to re-enable on-die ECC) to fail with -EIO and
stopped the batch of page reads started by UBIFS, which in turn led to
unmountable FS.
[1]http://patchwork.ozlabs.org/patch/907874/
Changes in v2:
- Fix the subject prefix
---
drivers/mtd/nand/raw/nand_base.c | 27 +++++++++------------------
1 file changed, 9 insertions(+), 18 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index f28c3a555861..ee29f34562ab 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2174,7 +2174,6 @@ static int nand_set_features_op(struct nand_chip *chip, u8 feature,
struct mtd_info *mtd = nand_to_mtd(chip);
const u8 *params = data;
int i, ret;
- u8 status;
if (chip->exec_op) {
const struct nand_sdr_timings *sdr =
@@ -2188,26 +2187,18 @@ static int nand_set_features_op(struct nand_chip *chip, u8 feature,
};
struct nand_operation op = NAND_OPERATION(instrs);
- ret = nand_exec_op(chip, &op);
- if (ret)
- return ret;
-
- ret = nand_status_op(chip, &status);
- if (ret)
- return ret;
- } else {
- chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, feature, -1);
- for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
- chip->write_byte(mtd, params[i]);
+ return nand_exec_op(chip, &op);
+ }
- ret = chip->waitfunc(mtd, chip);
- if (ret < 0)
- return ret;
+ chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, feature, -1);
+ for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
+ chip->write_byte(mtd, params[i]);
- status = ret;
- }
+ ret = chip->waitfunc(mtd, chip);
+ if (ret < 0)
+ return ret;
- if (status & NAND_STATUS_FAIL)
+ if (ret & NAND_STATUS_FAIL)
return -EIO;
return 0;
--
2.14.1
From: Vaibhav Jain <vaibhav(a)linux.ibm.com>
On Power-8 the AFU attr prefault_mode tried to improve storage fault
performance by prefaulting process segments. However Power-9 radix
mode doesn't have Storage-Segments and prefaulting Pages is too fine
grained.
So this patch updates prefault_mode_store() to not allow any other
value apart from CXL_PREFAULT_NONE when radix mode is enabled.
Cc: <stable(a)vger.kernel.org>
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
Documentation/ABI/testing/sysfs-class-cxl | 4 +++-
drivers/misc/cxl/sysfs.c | 16 ++++++++++++----
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
index 640f65e79ef1..267920a1874b 100644
--- a/Documentation/ABI/testing/sysfs-class-cxl
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -69,7 +69,9 @@ Date: September 2014
Contact: linuxppc-dev(a)lists.ozlabs.org
Description: read/write
Set the mode for prefaulting in segments into the segment table
- when performing the START_WORK ioctl. Possible values:
+ when performing the START_WORK ioctl. Only applicable when
+ running under hashed page table mmu.
+ Possible values:
none: No prefaulting (default)
work_element_descriptor: Treat the work element
descriptor as an effective address and
diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index 4b5a4c5d3c01..629e2e156412 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -353,12 +353,20 @@ static ssize_t prefault_mode_store(struct device *device,
struct cxl_afu *afu = to_cxl_afu(device);
enum prefault_modes mode = -1;
- if (!strncmp(buf, "work_element_descriptor", 23))
- mode = CXL_PREFAULT_WED;
- if (!strncmp(buf, "all", 3))
- mode = CXL_PREFAULT_ALL;
if (!strncmp(buf, "none", 4))
mode = CXL_PREFAULT_NONE;
+ else {
+ if (!radix_enabled()) {
+
+ /* only allowed when not in radix mode */
+ if (!strncmp(buf, "work_element_descriptor", 23))
+ mode = CXL_PREFAULT_WED;
+ if (!strncmp(buf, "all", 3))
+ mode = CXL_PREFAULT_ALL;
+ } else {
+ dev_err(device, "Cannot prefault with radix enabled\n");
+ }
+ }
if (mode == -1)
return -EINVAL;
--
2.17.0
Oops sorry, I failed to write the subject. It should been something like the subject of this e-mail.
> -----Original Message-----
> From: stable-owner(a)vger.kernel.org [mailto:stable-owner@vger.kernel.org] On
> Behalf Of Daniel Sangorrin
> Sent: Friday, May 18, 2018 9:59 AM
> To: stable(a)vger.kernel.org
> Cc: mtk.manpages(a)gmail.com; viro(a)zeniv.linux.org.uk
> Subject:
>
> Hello Greg,
>
> After running LTP with Fuego on the LTS kernel 4.4.y, there were
> a few test cases failing that I thought needed some investigation.
>
> I reviewed the first one (fcntl35 and fcntl35_64) so far. According to the
> comments on LTP's fcntl35.c file (by Xiao Yang <yangx.jy(a)cn.fujitsu.com>)
> the bug tested by this test case was fixed by:
> pipe: cap initial pipe capacity according to pipe-max-size
> commit 086e774a57fba4695f14383c0818994c0b31da7c
> Author: Michael Kerrisk (man-pages) <mtk.manpages(a)gmail.com>
> Date: Tue Oct 11 13:53:43 2016 -0700
>
> I backported that patch (see next e-mail), tested again and confirmed that
> the patch fixed the bug (or at least the error message in LTP's test).
>
> Before:
> fcntl35.c:98: FAIL: an unprivileged user init the capacity of a pipe to 65536
> unexpectedly, expected 4096
> After:
> fcntl35.c:101: PASS: an unprivileged user init the capacity of a pipe to 4096
> successfully
>
> Thanks,
> Daniel Sangorrin
>
Hello Greg,
After running LTP with Fuego on the LTS kernel 4.4.y, there were
a few test cases failing that I thought needed some investigation.
I reviewed the first one (fcntl35 and fcntl35_64) so far. According to the
comments on LTP's fcntl35.c file (by Xiao Yang <yangx.jy(a)cn.fujitsu.com>)
the bug tested by this test case was fixed by:
pipe: cap initial pipe capacity according to pipe-max-size
commit 086e774a57fba4695f14383c0818994c0b31da7c
Author: Michael Kerrisk (man-pages) <mtk.manpages(a)gmail.com>
Date: Tue Oct 11 13:53:43 2016 -0700
I backported that patch (see next e-mail), tested again and confirmed that
the patch fixed the bug (or at least the error message in LTP's test).
Before:
fcntl35.c:98: FAIL: an unprivileged user init the capacity of a pipe to 65536 unexpectedly, expected 4096
After:
fcntl35.c:101: PASS: an unprivileged user init the capacity of a pipe to 4096 successfully
Thanks,
Daniel Sangorrin
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin(a)oracle.com>
Tested-by: John Sobecki <john.sobecki(a)oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: stable(a)vger.kernel.org
---
drivers/xen/swiotlb-xen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
* physical address */
phys = xen_bus_to_phys(dev_addr);
- if (((dev_addr + size - 1 > dma_mask)) ||
+ if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
--
2.14.3 (Apple Git-98)
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin(a)oracle.com>
Tested-by: John Sobecki <john.sobecki(a)oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: stable(a)vger.kernel.org
---
drivers/xen/swiotlb-xen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
* physical address */
phys = xen_bus_to_phys(dev_addr);
- if (((dev_addr + size - 1 > dma_mask)) ||
+ if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
--
2.14.3 (Apple Git-98)
v4.4.y:
drivers/net/ethernet/ti/cpsw.c: In function 'cpsw_add_dual_emac_def_ale_entries':
drivers/net/ethernet/ti/cpsw.c:1112:23: error: 'cpsw' undeclared
Guenter
On Thu, 2017-10-19 at 16:21 -0400, Josef Bacik wrote:
> + blk_mq_start_request(req);
> if (unlikely(nsock->pending && nsock->pending != req)) {
> blk_mq_requeue_request(req, true);
> ret = 0;
(replying to an e-mail from seven months ago)
Hello Josef,
Are you aware that the nbd driver is one of the very few block drivers that
calls blk_mq_requeue_request() after a request has been started? I think that
can lead to the block layer core to undesired behavior, e.g. that the timeout
handler fires concurrently with a request being reinstered. Can you or a
colleague have a look at this? I would like to add the following code to the
block layer core and I think that the nbd driver would trigger this warning:
void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
{
+ WARN_ON_ONCE(old_state != MQ_RQ_COMPLETE);
+
__blk_mq_requeue_request(rq);
Thanks,
Bart.
For problem determination we always want to see when we were invoked
on the terminate_rport_io callback whether we perform something or not.
Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:
loose remote port
t workqueue
[s] zfcp_q_<dev> IRQ zfcperp<dev>
=== ================== =================== ============================
0 recv RSCN
q p.test_link_work
block rport
start fast_io_fail_tmo
send ADISC ELS
4 recv ADISC fail
block zfcp_port
port forced reopen
send open port
12 recv open port fail
q p.gid_pn_work
zfcp_erp_wakeup
(zfcp_erp_wait would return)
GID_PN fail
Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
e.g. with the typical 5 sec setting.
port.status |= ERP_FAILED
If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.
workqueue
fc_dl_<host>
==================
27 fc_timeout_fail_rport_io
fc_terminate_rport_io
zfcp_scsi_terminate_rport_io
zfcp_erp_port_forced_reopen
_zfcp_erp_port_forced_reopen
if (port.status & ERP_FAILED)
return;
Therefore, write a trace before above early return.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : sctrpi1 SCSI terminate rport I/O
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn>
D_ID : 0x<n_port_id>
Adapter status : 0x...
Port status : 0x...
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_erp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 3489b1bc9121..5c368cdfc455 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -42,9 +42,13 @@ enum zfcp_erp_steps {
* @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
* @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
* @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
- * either of the other enum values.
+ * either of the first four enum values.
* Used to indicate that an ERP action could not be
* set up despite a detected need for some recovery.
+ * @ZFCP_ERP_ACTION_FAILED: Eyecatcher pseudo flag to bitwise or-combine with
+ * either of the first four enum values.
+ * Used to indicate that ERP not needed because
+ * the object has ZFCP_STATUS_COMMON_ERP_FAILED.
*/
enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_LUN = 1,
@@ -52,6 +56,7 @@ enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
ZFCP_ERP_ACTION_NONE = 0xc0,
+ ZFCP_ERP_ACTION_FAILED = 0xe0,
};
enum zfcp_erp_act_state {
@@ -379,8 +384,12 @@ static void _zfcp_erp_port_forced_reopen(struct zfcp_port *port, int clear,
zfcp_erp_port_block(port, clear);
zfcp_scsi_schedule_rport_block(port);
- if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED)
+ if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED) {
+ zfcp_dbf_rec_trig(id, port->adapter, port, NULL,
+ ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
+ ZFCP_ERP_ACTION_FAILED);
return;
+ }
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
port->adapter, port, NULL, id, 0);
--
2.16.3
If a SCSI device is deleted during scsi_eh host reset, we cannot get a
reference to the SCSI device anymore since scsi_device_get returns !=0
by design. Assuming the recovery of adapter and port(s) was successful,
zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for
the half-gone SCSI device. Unfortunately, it causes the following confusing
trace record which states that zfcp will do a LUN recovery as "ERP need" is
ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want".
Old example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000 ZFCP_STATUS_COMMON_RUNNING
but not ZFCP_STATUS_COMMON_UNBLOCKED as it
was closed on close part of adapter reopen
ERP want : 0x01
ERP need : 0x01 misleading
However, zfcp_erp_setup_act() returns NULL as it cannot get the reference.
Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery
actually happens.
We always do want the recovery trigger trace record even if no erp_action
could be enqueued as in this case. For other cases where we did not enqueue
an erp_action, 'need' has always been zero to indicate this. In order to
indicate above goto out, introduce an eyecatcher "flag" to mark the
"ERP need" as 'not needed' but still keep the information which erp_action
type, that zfcp_erp_required_act() had decided upon, is needed.
0xc_ is chosen to be visibly different from 0x0_ in "ERP want".
New example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000
ERP want : 0x01
ERP need : 0xc1 would need LUN ERP, but no action set up
^
Before v2.6.38 commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.") we could detect this case because the
"erp_action" field in the trace was NULL. The rework removed erp_action
as argument and field from the trace.
This patch here is for tracing. A fix to allow LUN recovery in the case at
hand is a topic for a separate patch.
See also commit fdbd1c5e27da ("[SCSI] zfcp: Allow running unit/LUN shutdown
without acquiring reference") for a similar case and background info.
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Fixes: ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
Cc: <stable(a)vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_erp.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 1d91a32db08e..d9cd25b56cfa 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -35,11 +35,23 @@ enum zfcp_erp_steps {
ZFCP_ERP_STEP_LUN_OPENING = 0x2000,
};
+/**
+ * enum zfcp_erp_act_type - Type of ERP action object.
+ * @ZFCP_ERP_ACTION_REOPEN_LUN: LUN recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT: Port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
+ * @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
+ * either of the other enum values.
+ * Used to indicate that an ERP action could not be
+ * set up despite a detected need for some recovery.
+ */
enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_LUN = 1,
ZFCP_ERP_ACTION_REOPEN_PORT = 2,
ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
+ ZFCP_ERP_ACTION_NONE = 0xc0,
};
enum zfcp_erp_act_state {
@@ -257,8 +269,10 @@ static int zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
goto out;
act = zfcp_erp_setup_act(need, act_status, adapter, port, sdev);
- if (!act)
+ if (!act) {
+ need |= ZFCP_ERP_ACTION_NONE; /* marker for trace */
goto out;
+ }
atomic_or(ZFCP_STATUS_ADAPTER_ERP_PENDING, &adapter->status);
++adapter->erp_total_count;
list_add_tail(&act->list, &adapter->erp_ready_head);
--
2.16.3
We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to
zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the
start of our eh callback and an actual send/recv of an abort / TMF request.
In order to see the temporal sequence including any abort / TMF send
retries, add a trace before the above two blocking functions.
This supports problem determination with scsi_eh and parallel zfcp ERP.
No need to explicitly trace the beginning of our eh callback, since we
typically can send an abort / TMF and see its HBA response (in the worst
case, it's a pseudo response on dismiss all of adapter recovery, e.g. due
to an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send,
we now get a trace record for the first "abrt_wt" or "[lt]r_wait" which
denotes almost the beginning of the callback.
No need to explicitly trace the wakeup after the above two blocking
functions because the next retry loop causes another trace in any case
and that is sufficient.
Example trace records formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : abrt_wt abort, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x<scsi_result_of_cmd_to_be_aborted>
SCSI retries : 0x<retries_of_cmd_to_be_aborted>
SCSI allowed : 0x<allowed_retries_of_cmd_to_be_aborted>
SCSI scribble : 0x<req_id_of_cmd_to_be_aborted>
SCSI opcode : <CDB_of_cmd_to_be_aborted>
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : lr_wait LUN reset, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x... unrelated
SCSI retries : 0x.. unrelated
SCSI allowed : 0x.. unrelated
SCSI scribble : 0x... unrelated
SCSI opcode : ... unrelated
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
Fixes: af4de36d911a ("[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED")
Cc: <stable(a)vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_scsi.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index a62357f5e8b4..4fdb1665b0e6 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -181,6 +181,7 @@ static int zfcp_scsi_eh_abort_handler(struct scsi_cmnd *scpnt)
if (abrt_req)
break;
+ zfcp_dbf_scsi_abort("abrt_wt", scpnt, NULL);
zfcp_erp_wait(adapter);
ret = fc_block_scsi_eh(scpnt);
if (ret) {
@@ -277,6 +278,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
if (fsf_req)
break;
+ zfcp_dbf_scsi_devreset("wait", scpnt, tm_flags, NULL);
zfcp_erp_wait(adapter);
ret = fc_block_scsi_eh(scpnt);
if (ret) {
--
2.16.3