On Thu, Dec 19, 2019 at 02:58:43PM -0800, John Hubbard wrote:
On 12/19/19 1:07 PM, Jason Gunthorpe wrote: ...
- It would be nice if I could reproduce this. I have a two-node mlx5
  InfiniBand test setup, but I have done only the tiniest bit of user
  space IB coding, so if you have any test programs that aren't too hard
  to deal with that could possibly hit this, or be tweaked to hit it,
  I'd be grateful. Keeping in mind that I'm not an advanced IB
  programmer. At all. :)
Clone this:
https://github.com/linux-rdma/rdma-core.git
Install all the required deps to build it (notably cython), see the README.md
$ ./build.sh
$ build/bin/run_tests.py
If you get things that far I think Leon can get a reproduction for you
Cool, it's up and running (1 failure, 3 skipped, out of 67 tests).
This is a great test suite to have running, I'll add it to my scripts. Here's the full output in case the failure or skip cases are a problem:
$ sudo ./build/bin/run_tests.py --verbose
test_create_ah (tests.test_addr.AHTest) ... ok
test_create_ah_roce (tests.test_addr.AHTest) ... skipped "Can't run RoCE tests on IB link layer"
test_destroy_ah (tests.test_addr.AHTest) ... ok
test_create_comp_channel (tests.test_cq.CCTest) ... ok
test_destroy_comp_channel (tests.test_cq.CCTest) ... ok
test_create_cq_ex (tests.test_cq.CQEXTest) ... ok
test_create_cq_ex_bad_flow (tests.test_cq.CQEXTest) ... ok
test_destroy_cq_ex (tests.test_cq.CQEXTest) ... ok
test_create_cq (tests.test_cq.CQTest) ... ok
test_create_cq_bad_flow (tests.test_cq.CQTest) ... ok
test_destroy_cq (tests.test_cq.CQTest) ... ok
test_rc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_ud_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_xrc_traffic_cq_ex (tests.test_cqex.CqExTestCase) ... ok
test_create_dm (tests.test_device.DMTest) ... ok
test_create_dm_bad_flow (tests.test_device.DMTest) ... ok
test_destroy_dm (tests.test_device.DMTest) ... ok
test_destroy_dm_bad_flow (tests.test_device.DMTest) ... ok
test_dm_read (tests.test_device.DMTest) ... ok
test_dm_write (tests.test_device.DMTest) ... ok
test_dm_write_bad_flow (tests.test_device.DMTest) ... ok
test_dev_list (tests.test_device.DeviceTest) ... ok
test_open_dev (tests.test_device.DeviceTest) ... ok
test_query_device (tests.test_device.DeviceTest) ... ok
test_query_device_ex (tests.test_device.DeviceTest) ... ok
test_query_gid (tests.test_device.DeviceTest) ... ok
test_query_port (tests.test_device.DeviceTest) ... FAIL
test_query_port_bad_flow (tests.test_device.DeviceTest) ... ok
test_create_dm_mr (tests.test_mr.DMMRTest) ... ok
test_destroy_dm_mr (tests.test_mr.DMMRTest) ... ok
test_buffer (tests.test_mr.MRTest) ... ok
test_dereg_mr (tests.test_mr.MRTest) ... ok
test_dereg_mr_twice (tests.test_mr.MRTest) ... ok
test_lkey (tests.test_mr.MRTest) ... ok
test_read (tests.test_mr.MRTest) ... ok
test_reg_mr (tests.test_mr.MRTest) ... ok
test_reg_mr_bad_flags (tests.test_mr.MRTest) ... ok
test_reg_mr_bad_flow (tests.test_mr.MRTest) ... ok
test_rkey (tests.test_mr.MRTest) ... ok
test_write (tests.test_mr.MRTest) ... ok
test_dereg_mw_type1 (tests.test_mr.MWTest) ... ok
test_dereg_mw_type2 (tests.test_mr.MWTest) ... ok
test_reg_mw_type1 (tests.test_mr.MWTest) ... ok
test_reg_mw_type2 (tests.test_mr.MWTest) ... ok
test_reg_mw_wrong_type (tests.test_mr.MWTest) ... ok
test_odp_rc_traffic (tests.test_odp.OdpTestCase) ... ok
test_odp_ud_traffic (tests.test_odp.OdpTestCase) ... skipped 'ODP is not supported - ODP recv not supported'
test_odp_xrc_traffic (tests.test_odp.OdpTestCase) ... ok
test_default_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_mem_align_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_without_allocators (tests.test_parent_domain.ParentDomainTestCase) ... ok
test_alloc_pd (tests.test_pd.PDTest) ... ok
test_create_pd_none_ctx (tests.test_pd.PDTest) ... ok
test_dealloc_pd (tests.test_pd.PDTest) ... ok
test_destroy_pd_twice (tests.test_pd.PDTest) ... ok
test_multiple_pd_creation (tests.test_pd.PDTest) ... ok
test_create_qp_ex_no_attr (tests.test_qp.QPTest) ... ok
test_create_qp_ex_no_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_ex_with_attr (tests.test_qp.QPTest) ... ok
test_create_qp_ex_with_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_no_attr (tests.test_qp.QPTest) ... ok
test_create_qp_no_attr_connected (tests.test_qp.QPTest) ... ok
test_create_qp_with_attr (tests.test_qp.QPTest) ... ok
test_create_qp_with_attr_connected (tests.test_qp.QPTest) ... ok
test_modify_qp (tests.test_qp.QPTest) ... ok
test_query_qp (tests.test_qp.QPTest) ... ok
test_rdmacm_sync_traffic (tests.test_rdmacm.CMTestCase) ... skipped 'No devices with net interface'
======================================================================
FAIL: test_query_port (tests.test_device.DeviceTest)
Traceback (most recent call last):
  File "/kernel_work/rdma-core/tests/test_device.py", line 129, in test_query_port
    self.verify_port_attr(port_attr)
  File "/kernel_work/rdma-core/tests/test_device.py", line 113, in verify_port_attr
    assert 'Invalid' not in d.speed_to_str(attr.active_speed)
AssertionError
I'm very curious how you got this assert; "d.speed_to_str" covers all known speeds according to the IBTA.
Thanks
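For anyone following along, here is a minimal sketch of how an assertion of this shape can fire. It is not the pyverbs implementation: the table, the fallback string, and the check_port_speed() helper are all assumptions made for illustration. The speed codes are the ones commonly printed by ibv_devinfo-style tools. The idea is simply that any active_speed value missing from the lookup table falls through to an 'Invalid' string, and that putting the raw value in the assertion message makes such a failure easier to diagnose.

```python
# Hypothetical sketch, NOT the actual pyverbs code.
# The table below is illustrative; the real mapping lives in rdma-core.
KNOWN_SPEEDS = {
    1: '2.5 Gbps',
    2: '5.0 Gbps',
    4: '10.0 Gbps',
    8: '10.0 Gbps',
    16: '14.0 Gbps',
    32: '25.0 Gbps',
    64: '50.0 Gbps',
}


def speed_to_str(speed):
    """Return a readable name for an active_speed code.

    Any code that is not in KNOWN_SPEEDS yields an 'Invalid speed' string.
    """
    try:
        return '{} ({})'.format(KNOWN_SPEEDS[speed], speed)
    except KeyError:
        return 'Invalid speed ({})'.format(speed)


def check_port_speed(active_speed):
    """Same shape as the failing assertion, but reports the raw value."""
    text = speed_to_str(active_speed)
    assert 'Invalid' not in text, \
        'unrecognized active_speed: {}'.format(active_speed)
    return text


if __name__ == '__main__':
    print(check_port_speed(32))
    try:
        check_port_speed(3)   # 3 is not a key in the illustrative table
    except AssertionError as err:
        print('assertion fired:', err)
```

Printing attr.active_speed just before the assert in verify_port_attr would show which value the port actually reported.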
Ran 67 tests in 10.058s
FAILED (failures=1, skipped=3)
thanks,
John Hubbard NVIDIA