Hello Greg,
After running LTP with Fuego on the LTS kernel 4.4.y, there were
a few test cases failing that I thought needed some investigation.
I reviewed the first one (fcntl35 and fcntl35_64) so far. According to the
comments on LTP's fcntl35.c file (by Xiao Yang <yangx.jy(a)cn.fujitsu.com>)
the bug tested by this test case was fixed by:
pipe: cap initial pipe capacity according to pipe-max-size
commit 086e774a57fba4695f14383c0818994c0b31da7c
Author: Michael Kerrisk (man-pages) <mtk.manpages(a)gmail.com>
Date: Tue Oct 11 13:53:43 2016 -0700
I backported that patch (see next e-mail), tested again and confirmed that
the patch fixed the bug (or at least the error message in LTP's test).
Before:
fcntl35.c:98: FAIL: an unprivileged user init the capacity of a pipe to 65536 unexpectedly, expected 4096
After:
fcntl35.c:101: PASS: an unprivileged user init the capacity of a pipe to 4096 successfully
Thanks,
Daniel Sangorrin
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin(a)oracle.com>
Tested-by: John Sobecki <john.sobecki(a)oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: stable(a)vger.kernel.org
---
drivers/xen/swiotlb-xen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
* physical address */
phys = xen_bus_to_phys(dev_addr);
- if (((dev_addr + size - 1 > dma_mask)) ||
+ if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
--
2.14.3 (Apple Git-98)
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue introduced by commit 6810df88dcfc2 "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin(a)oracle.com>
Tested-by: John Sobecki <john.sobecki(a)oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: stable(a)vger.kernel.org
---
drivers/xen/swiotlb-xen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e1c60899fdbc..a6f9ba85dc4b 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -351,7 +351,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
* physical address */
phys = xen_bus_to_phys(dev_addr);
- if (((dev_addr + size - 1 > dma_mask)) ||
+ if (((dev_addr + size - 1 <= dma_mask)) ||
range_straddles_page_boundary(phys, size))
xen_destroy_contiguous_region(phys, order);
--
2.14.3 (Apple Git-98)
v4.4.y:
drivers/net/ethernet/ti/cpsw.c: In function 'cpsw_add_dual_emac_def_ale_entries':
drivers/net/ethernet/ti/cpsw.c:1112:23: error: 'cpsw' undeclared
Guenter
On Thu, 2017-10-19 at 16:21 -0400, Josef Bacik wrote:
> + blk_mq_start_request(req);
> if (unlikely(nsock->pending && nsock->pending != req)) {
> blk_mq_requeue_request(req, true);
> ret = 0;
(replying to an e-mail from seven months ago)
Hello Josef,
Are you aware that the nbd driver is one of the very few block drivers that
calls blk_mq_requeue_request() after a request has been started? I think that
can lead to the block layer core to undesired behavior, e.g. that the timeout
handler fires concurrently with a request being reinstered. Can you or a
colleague have a look at this? I would like to add the following code to the
block layer core and I think that the nbd driver would trigger this warning:
void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
{
+ WARN_ON_ONCE(old_state != MQ_RQ_COMPLETE);
+
__blk_mq_requeue_request(rq);
Thanks,
Bart.
For problem determination we always want to see when we were invoked
on the terminate_rport_io callback whether we perform something or not.
Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:
loose remote port
t workqueue
[s] zfcp_q_<dev> IRQ zfcperp<dev>
=== ================== =================== ============================
0 recv RSCN
q p.test_link_work
block rport
start fast_io_fail_tmo
send ADISC ELS
4 recv ADISC fail
block zfcp_port
port forced reopen
send open port
12 recv open port fail
q p.gid_pn_work
zfcp_erp_wakeup
(zfcp_erp_wait would return)
GID_PN fail
Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
e.g. with the typical 5 sec setting.
port.status |= ERP_FAILED
If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.
workqueue
fc_dl_<host>
==================
27 fc_timeout_fail_rport_io
fc_terminate_rport_io
zfcp_scsi_terminate_rport_io
zfcp_erp_port_forced_reopen
_zfcp_erp_port_forced_reopen
if (port.status & ERP_FAILED)
return;
Therefore, write a trace before above early return.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : sctrpi1 SCSI terminate rport I/O
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn>
D_ID : 0x<n_port_id>
Adapter status : 0x...
Port status : 0x...
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_erp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 3489b1bc9121..5c368cdfc455 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -42,9 +42,13 @@ enum zfcp_erp_steps {
* @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
* @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
* @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
- * either of the other enum values.
+ * either of the first four enum values.
* Used to indicate that an ERP action could not be
* set up despite a detected need for some recovery.
+ * @ZFCP_ERP_ACTION_FAILED: Eyecatcher pseudo flag to bitwise or-combine with
+ * either of the first four enum values.
+ * Used to indicate that ERP not needed because
+ * the object has ZFCP_STATUS_COMMON_ERP_FAILED.
*/
enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_LUN = 1,
@@ -52,6 +56,7 @@ enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
ZFCP_ERP_ACTION_NONE = 0xc0,
+ ZFCP_ERP_ACTION_FAILED = 0xe0,
};
enum zfcp_erp_act_state {
@@ -379,8 +384,12 @@ static void _zfcp_erp_port_forced_reopen(struct zfcp_port *port, int clear,
zfcp_erp_port_block(port, clear);
zfcp_scsi_schedule_rport_block(port);
- if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED)
+ if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED) {
+ zfcp_dbf_rec_trig(id, port->adapter, port, NULL,
+ ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
+ ZFCP_ERP_ACTION_FAILED);
return;
+ }
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
port->adapter, port, NULL, id, 0);
--
2.16.3