Changes since v1* [1]:
- Rearrange setters to be next to getters (Jonathan)
- Fix endian bug in nsl_set_slot() (kbuild robot)
- Return NULL instead of !name (Jonathan)
- Use {import,export}_uuid() where UUIDs are used in external interface
structures (Andy)
- Fix uuid_to_nvdimm_class() to be static (kbuild robot)
- Fixup changelog to note uuid copying fixups (Jonathan)
- Fix the broken nlabel/nrange confusion for CXL labels (Jonathan)
- Add a dedicated nlabel validation helper
- Add nrange helpers for CXL
- Introduce __mock to fix unnecessary global symbols (kbuild robot)
- Include core.h to fix some missing prototype warnings (kbuild robot)
- Fix excessive stack usage from devm_cxl_add_decoder() (kbuild robot)
- Add spec reference for namespace label fields (Jonathan)
- Fix uninitialized variable use in cxl_nvdimm_probe() (kbuild robot)
- Move cxl region definition to its own patch for readability (Jonathan)
- Move exclusive command validation to cxl_validate_cmd_from_user() (Ben)
- Fix exclusive command locking (Ben)
- Fold in Alison's acpi_pci_find_root() fix and rebase (Alison)
- Rebase on 0day-induced fixups of the baseline
[1]: https://lore.kernel.org/r/162854806653.1980150.3354618413963083778.stgit@dw…
Note that there were some one-off direct replies marked v2, but now this
set supersedes those.
---
Changed or new(*) patches since v1 are:
[ PATCH v3 03/28] libnvdimm/labels: Introduce label setter helpers
[ PATCH v3 09/28] libnvdimm/labels: Add address-abstraction uuid definitions
[ PATCH v3 10/28] libnvdimm/labels: Add uuid helpers
[*PATCH v3 11/28] libnvdimm/label: Add a helper for nlabel validation
[*PATCH v3 12/28] libnvdimm/labels: Introduce the concept of multi-range namespace labels
[*PATCH v3 13/28] libnvdimm/label: Define CXL region labels
[ PATCH v3 14/28] libnvdimm/labels: Introduce CXL labels
[ PATCH v3 17/28] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
[ PATCH v3 20/28] cxl/mbox: Add exclusive kernel command support
[ PATCH v3 21/28] cxl/pmem: Translate NVDIMM label commands to CXL label commands
[ PATCH v3 22/28] cxl/pmem: Add support for multiple nvdimm-bridge objects
[*PATCH v3 23/28] cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
[ PATCH v3 24/28] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
[ PATCH v3 27/28] tools/testing/cxl: Introduce a mock memory device + driver
[*PATCH v3 28/28] cxl/core: Split decoder setup into alloc + add
---
As mentioned in patch 24 of this series, the response of the upstream QEMU
community to CXL device emulation has been underwhelming to date. Even
if that picked up, it would still result in a situation where new driver
features and new test capabilities for those features are split across
multiple repositories.
The "nfit_test" approach of mocking up platform resources via an
external test module continues to yield positive results catching
regressions early and often. So this attempts to repeat that success
with a "cxl_test" module to inject custom crafted topologies and command
responses into the CXL subsystem's sysfs and ioctl UAPIs.
The first target for cxl_test to verify is the integration of CXL with
LIBNVDIMM and the new support for the CXL namespace label + region-label
format. The first 14 patches introduce support for the new label format.
The next 9 patches rework the CXL PCI driver to move more common
infrastructure into the core for the unit test environment to reuse. The
largest change here is disconnecting the mailbox command processing
infrastructure from the PCI specific transport. The unit test
environment replaces the PCI transport with a custom backend with mocked
responses to command requests.
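To make the transport split concrete, here is a minimal userspace sketch of
the idea: core code sends commands through a per-device hook, and a test
backend supplies a mocked hook instead of the PCI one. Every name below
(sketch_mem, sketch_mbox_cmd, sketch_mock_mbox_send, the opcode constant and
the payload layout) is hypothetical and chosen for illustration only; it is
not the code from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified command descriptor; not the driver's layout. */
struct sketch_mbox_cmd {
	uint16_t opcode;
	void *payload_out;
	size_t size_out;
	int return_code;
};

/* Device state carries a transport hook instead of calling PCI code. */
struct sketch_mem {
	int (*mbox_send)(struct sketch_mem *mem, struct sketch_mbox_cmd *cmd);
	size_t lsa_size; /* label storage area size reported by the mock */
};

#define SKETCH_OP_IDENTIFY 0x4000 /* illustrative opcode value only */

/* Mock transport: answers from memory, no hardware involved. */
static int sketch_mock_mbox_send(struct sketch_mem *mem,
				 struct sketch_mbox_cmd *cmd)
{
	if (cmd->opcode != SKETCH_OP_IDENTIFY) {
		cmd->return_code = -1; /* command not supported by this mock */
		return 0;              /* the transport itself succeeded */
	}
	if (cmd->size_out < sizeof(mem->lsa_size))
		return -1;             /* caller's buffer is too small */
	memcpy(cmd->payload_out, &mem->lsa_size, sizeof(mem->lsa_size));
	cmd->return_code = 0;
	return 0;
}

/* Core code only ever calls through the hook. */
static int sketch_send_cmd(struct sketch_mem *mem, struct sketch_mbox_cmd *cmd)
{
	assert(mem->mbox_send != NULL);
	return mem->mbox_send(mem, cmd);
}
```

A caller would set .mbox_send to sketch_mock_mbox_send when building the test
device and leave sketch_send_cmd() untouched, which is the property that lets
the same core code run against either transport.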
Patch 24 introduces just enough mocked functionality for the cxl_acpi
driver to load against cxl_test resources. Patch 25 fixes the first bug
discovered by this framework, namely that HDM decoder target list maps
were not being filled out.
Finally patches 26 and 27 introduce a cxl_test representation of memory
expander devices. In this initial implementation these memory expander
targets implement just enough command support to pass the basic driver
init sequence and enable label command passthrough to LIBNVDIMM.
The topology of cxl_test includes:
- (4) platform fixed memory windows. One each of a x1-volatile,
x4-volatile, x1-persistent, and x4-persistent.
- (4) Host bridges each with (2) root ports
- (8) CXL memory expanders, one for each root port
- Each memory expander device supports the GET_SUPPORTED_LOGS, GET_LOG,
IDENTIFY, GET_LSA, and SET_LSA commands.
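The counts in the list above can be written down as a small table, which
also shows how the (8) memory expanders follow from (4) host bridges with
(2) root ports each. This is only a sketch: the flat index scheme, the
struct and the sketch_* names are hypothetical, and reading "x1"/"x4" as
interleave ways is an assumption.

```c
#include <assert.h>

/* Counts taken from the topology list above. */
enum {
	SKETCH_NR_WINDOWS = 4,
	SKETCH_NR_HOST_BRIDGES = 4,
	SKETCH_PORTS_PER_BRIDGE = 2,
	SKETCH_NR_MEMDEVS = SKETCH_NR_HOST_BRIDGES * SKETCH_PORTS_PER_BRIDGE,
};

/* Assumption: "x1"/"x4" denote interleave ways of a fixed memory window. */
struct sketch_window {
	int interleave_ways;
	int persistent;
};

static const struct sketch_window sketch_windows[SKETCH_NR_WINDOWS] = {
	{ 1, 0 }, /* x1-volatile */
	{ 4, 0 }, /* x4-volatile */
	{ 1, 1 }, /* x1-persistent */
	{ 4, 1 }, /* x4-persistent */
};

/* Hypothetical flat numbering: root port N hangs off host bridge N / 2. */
static int sketch_port_to_bridge(int root_port)
{
	assert(root_port >= 0 && root_port < SKETCH_NR_MEMDEVS);
	return root_port / SKETCH_PORTS_PER_BRIDGE;
}

/* One memory expander per root port, so the two can share an index. */
static int sketch_memdev_to_port(int memdev)
{
	assert(memdev >= 0 && memdev < SKETCH_NR_MEMDEVS);
	return memdev;
}
```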
Going forward the expectation is that where possible new UAPI visible
subsystem functionality comes with cxl_test emulation of the same.
The build process for cxl_test is:
make M=tools/testing/cxl
make M=tools/testing/cxl modules_install
The implementation methodology of the test module is the same as
nfit_test, where the bulk of the emulation comes from replacing symbols
that cxl_acpi and cxl_core import with mocked implementations of
those symbols. See the "--wrap=" lines in tools/testing/cxl/Kbuild. Some
symbols need to be replaced but are local to their modules, like
match_add_root_ports(). In those cases the local symbol is marked __weak
(via __mock) with a strong implementation coming from
tools/testing/cxl/. The goal is to be minimally invasive to
production code paths.
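As a rough illustration of the __mock idea, the sketch below keeps a helper
file-local by default and only makes it overridable when the build defines
__mock differently. The assumption here is that a test build would pass
something like -D__mock='__attribute__((weak))' and link a strong
replacement from a separate object; the sketch_* names are hypothetical and
this is not the series' actual definition.

```c
#include <assert.h>

/*
 * Assumption: production builds leave __mock undefined, so the helper stays
 * static. A test build would define __mock as a weak attribute and provide
 * a strong definition of the same symbol from another object file.
 */
#ifndef __mock
#define __mock static
#endif

/* Stand-in for a module-local helper such as match_add_root_ports(). */
__mock int sketch_match_add_root_ports(int nr_found)
{
	assert(nr_found >= 0);
	return nr_found; /* production behaviour: report what was found */
}

/* Callers are unchanged regardless of which definition gets linked. */
int sketch_enumerate(void)
{
	return sketch_match_add_root_ports(2);
}
```

With the default definition nothing extra becomes a global symbol, which is
the point of routing the annotation through a macro rather than marking the
helper weak unconditionally.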
---
Alison Schofield (1):
cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports
Dan Williams (27):
libnvdimm/labels: Introduce getters for namespace label fields
libnvdimm/labels: Add isetcookie validation helper
libnvdimm/labels: Introduce label setter helpers
libnvdimm/labels: Add a checksum calculation helper
libnvdimm/labels: Add blk isetcookie set / validation helpers
libnvdimm/labels: Add blk special cases for nlabel and position helpers
libnvdimm/labels: Add type-guid helpers
libnvdimm/labels: Add claim class helpers
libnvdimm/labels: Add address-abstraction uuid definitions
libnvdimm/labels: Add uuid helpers
libnvdimm/label: Add a helper for nlabel validation
libnvdimm/labels: Introduce the concept of multi-range namespace labels
libnvdimm/label: Define CXL region labels
libnvdimm/labels: Introduce CXL labels
cxl/pci: Make 'struct cxl_mem' device type generic
cxl/mbox: Introduce the mbox_send operation
cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
cxl/pci: Use module_pci_driver
cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP
cxl/mbox: Add exclusive kernel command support
cxl/pmem: Translate NVDIMM label commands to CXL label commands
cxl/pmem: Add support for multiple nvdimm-bridge objects
tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
cxl/bus: Populate the target list at decoder create
cxl/mbox: Move command definitions to common location
tools/testing/cxl: Introduce a mock memory device + driver
cxl/core: Split decoder setup into alloc + add
Documentation/driver-api/cxl/memory-devices.rst | 3
drivers/cxl/acpi.c | 143 ++-
drivers/cxl/core/Makefile | 1
drivers/cxl/core/bus.c | 87 +-
drivers/cxl/core/core.h | 8
drivers/cxl/core/mbox.c | 798 +++++++++++++++++
drivers/cxl/core/memdev.c | 115 ++-
drivers/cxl/core/pmem.c | 32 +
drivers/cxl/cxl.h | 45 +
drivers/cxl/cxlmem.h | 188 ++++
drivers/cxl/pci.c | 1051 +----------------------
drivers/cxl/pmem.c | 160 +++-
drivers/nvdimm/btt.c | 11
drivers/nvdimm/btt_devs.c | 14
drivers/nvdimm/core.c | 40 -
drivers/nvdimm/label.c | 361 +++++---
drivers/nvdimm/label.h | 121 ++-
drivers/nvdimm/namespace_devs.c | 204 ++--
drivers/nvdimm/nd-core.h | 5
drivers/nvdimm/nd.h | 289 ++++++
drivers/nvdimm/pfn_devs.c | 2
include/linux/nd.h | 4
tools/testing/cxl/Kbuild | 38 +
tools/testing/cxl/config_check.c | 13
tools/testing/cxl/mock_acpi.c | 109 ++
tools/testing/cxl/mock_pmem.c | 24 +
tools/testing/cxl/test/Kbuild | 10
tools/testing/cxl/test/cxl.c | 587 +++++++++++++
tools/testing/cxl/test/mem.c | 255 ++++++
tools/testing/cxl/test/mock.c | 171 ++++
tools/testing/cxl/test/mock.h | 27 +
31 files changed, 3422 insertions(+), 1494 deletions(-)
create mode 100644 drivers/cxl/core/mbox.c
create mode 100644 tools/testing/cxl/Kbuild
create mode 100644 tools/testing/cxl/config_check.c
create mode 100644 tools/testing/cxl/mock_acpi.c
create mode 100644 tools/testing/cxl/mock_pmem.c
create mode 100644 tools/testing/cxl/test/Kbuild
create mode 100644 tools/testing/cxl/test/cxl.c
create mode 100644 tools/testing/cxl/test/mem.c
create mode 100644 tools/testing/cxl/test/mock.c
create mode 100644 tools/testing/cxl/test/mock.h
base-commit: ceeb0da0a0322bcba4c50ab3cf97fe9a7aa8a2e4
This is the start of the stable review cycle for the 4.9.282 release.
There are 16 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Fri, 03 Sep 2021 12:22:41 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.282-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.282-rc1
Denis Efremov <efremov(a)linux.com>
Revert "floppy: reintroduce O_NDELAY fix"
Sean Christopherson <seanjc(a)google.com>
KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs
George Kennedy <george.kennedy(a)oracle.com>
fbmem: add margin check to fb_check_caps()
Linus Torvalds <torvalds(a)linux-foundation.org>
vt_kdsetmode: extend console locking
Gerd Rausch <gerd.rausch(a)oracle.com>
net/rds: dma_map_sg is entitled to merge entries
Neeraj Upadhyay <neeraju(a)codeaurora.org>
vringh: Use wiov->used to check for read/write desc order
Parav Pandit <parav(a)nvidia.com>
virtio: Improve vq->broken access to avoid any compiler optimization
Maxim Kiselev <bigunclemax(a)gmail.com>
net: marvell: fix MVNETA_TX_IN_PRGRS bit number
Shreyansh Chouhan <chouhan.shreyansh630(a)gmail.com>
ip_gre: add validation for csum_start
Sasha Neftin <sasha.neftin(a)intel.com>
e1000e: Fix the max snoop/no-snoop latency for 10M
Tuo Li <islituo(a)gmail.com>
IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
Zhengjun Zhang <zhangzhengjun(a)aicrobo.com>
USB: serial: option: add new VID/PID to support Fibocom FG150
Johan Hovold <johan(a)kernel.org>
Revert "USB: serial: ch341: fix character loss at high transfer rates"
Stefan Mätje <stefan.maetje(a)esd.eu>
can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters
Guenter Roeck <linux(a)roeck-us.net>
ARC: Fix CONFIG_STACKDEPOT
-------------
Diffstat:
Makefile | 4 ++--
arch/arc/kernel/vmlinux.lds.S | 2 ++
arch/x86/kvm/mmu.c | 11 ++++++++++-
drivers/block/floppy.c | 27 +++++++++++++--------------
drivers/infiniband/hw/hfi1/sdma.c | 9 ++++-----
drivers/net/can/usb/esd_usb2.c | 4 ++--
drivers/net/ethernet/intel/e1000e/ich8lan.c | 14 +++++++++++++-
drivers/net/ethernet/intel/e1000e/ich8lan.h | 3 +++
drivers/net/ethernet/marvell/mvneta.c | 2 +-
drivers/tty/vt/vt_ioctl.c | 11 +++++++----
drivers/usb/dwc3/gadget.c | 16 ++++++++--------
drivers/usb/serial/ch341.c | 1 -
drivers/usb/serial/option.c | 2 ++
drivers/vhost/vringh.c | 2 +-
drivers/video/fbdev/core/fbmem.c | 4 ++++
drivers/virtio/virtio_ring.c | 6 ++++--
net/ipv4/ip_gre.c | 2 ++
net/rds/ib_frmr.c | 4 ++--
18 files changed, 80 insertions(+), 44 deletions(-)
After 3 days of running 5.4.143 with this patch applied without issues
on a production workload (host + VMs) of a busy web server and MySQL
database, I request queueing this for a future 5.4 stable release, like
the 5.10 one requested by Borislav. I am copying his mail text in the
hope that this is the best form.
Please queue for 5.4 stable.
See https://bugzilla.kernel.org/show_bug.cgi?id=214159 for more info.
---
Commit 3a7956e25e1d7b3c148569e78895e1f3178122a9 upstream.
The kthread_is_per_cpu() construct relies on only being called on
PF_KTHREAD tasks (per the WARN in to_kthread). This gives rise to the
following usage pattern:
if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
However, as reported by syzkaller, this is broken. The scenario is:
CPU0 CPU1 (running p)
(p->flags & PF_KTHREAD) // true
begin_new_exec()
me->flags &= ~(PF_KTHREAD|...);
kthread_is_per_cpu(p)
to_kthread(p)
WARN(!(p->flags & PF_KTHREAD) <-- *SPLAT*
Introduce __to_kthread() that omits the WARN and is sure to check both
values.
Use this to remove the problematic pattern for kthread_is_per_cpu()
and fix a number of other kthread_*() functions that have similar
issues but are currently not used in ways that would expose the
problem.
Notably kthread_func() is only ever called on 'current', while
kthread_probe_data() is only used for PF_WQ_WORKER, which implies the
task is from kthread_create*().
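The shape of the fix can be sketched in userspace as follows. This is a
single-threaded model that does not reproduce the race; it only shows that a
lookup helper which checks both the flag and the pointer lets callers drop
the separate flag test. The sketch_* types, the flag value and the per-cpu
bit are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PF_KTHREAD 0x1u   /* illustrative flag bit */
#define SKETCH_IS_PER_CPU 0x1ul  /* illustrative per-cpu bit */

struct sketch_kthread {
	unsigned long flags;
};

struct sketch_task {
	unsigned int flags;
	void *set_child_tid; /* doubles as the kthread pointer, as in the patch */
};

/* Same shape as __to_kthread(): NULL unless both conditions hold. */
static struct sketch_kthread *sketch_to_kthread(struct sketch_task *p)
{
	void *kthread;

	assert(p != NULL);
	kthread = p->set_child_tid;
	if (kthread && !(p->flags & SKETCH_PF_KTHREAD))
		kthread = NULL;
	return kthread;
}

/* Callers no longer test the flag themselves before calling this. */
static int sketch_is_per_cpu(struct sketch_task *p)
{
	struct sketch_kthread *kthread = sketch_to_kthread(p);

	if (!kthread)
		return 0;
	return (kthread->flags & SKETCH_IS_PER_CPU) != 0;
}
```

If the flag is cleared between a caller's decision and the lookup, the
helper returns NULL and the query reports false instead of warning.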
Fixes: ac687e6e8c26 ("kthread: Extract KTHREAD_IS_PER_CPU")
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Patrick Schaaf <bof(a)bof.de>
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b2bac5d929d2..22750a8af83e 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -76,6 +76,25 @@ static inline struct kthread *to_kthread(struct task_struct *k)
return (__force void *)k->set_child_tid;
}
+/*
+ * Variant of to_kthread() that doesn't assume @p is a kthread.
+ *
+ * Per construction; when:
+ *
+ * (p->flags & PF_KTHREAD) && p->set_child_tid
+ *
+ * the task is both a kthread and struct kthread is persistent. However
+ * PF_KTHREAD on it's own is not, kernel_thread() can exec() (See umh.c and
+ * begin_new_exec()).
+ */
+static inline struct kthread *__to_kthread(struct task_struct *p)
+{
+ void *kthread = (__force void *)p->set_child_tid;
+ if (kthread && !(p->flags & PF_KTHREAD))
+ kthread = NULL;
+ return kthread;
+}
+
void free_kthread_struct(struct task_struct *k)
{
struct kthread *kthread;
@@ -176,10 +195,11 @@ void *kthread_data(struct task_struct *task)
*/
void *kthread_probe_data(struct task_struct *task)
{
- struct kthread *kthread = to_kthread(task);
+ struct kthread *kthread = __to_kthread(task);
void *data = NULL;
- probe_kernel_read(&data, &kthread->data, sizeof(data));
+ if (kthread)
+ probe_kernel_read(&data, &kthread->data, sizeof(data));
return data;
}
@@ -490,9 +510,9 @@ void kthread_set_per_cpu(struct task_struct *k, int cpu)
set_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
}
-bool kthread_is_per_cpu(struct task_struct *k)
+bool kthread_is_per_cpu(struct task_struct *p)
{
- struct kthread *kthread = to_kthread(k);
+ struct kthread *kthread = __to_kthread(p);
if (!kthread)
return false;
@@ -1272,11 +1292,9 @@ EXPORT_SYMBOL(kthread_destroy_worker);
*/
void kthread_associate_blkcg(struct cgroup_subsys_state *css)
{
- struct kthread *kthread;
+ struct kthread *kthread = __to_kthread(current);
+
- if (!(current->flags & PF_KTHREAD))
- return;
- kthread = to_kthread(current);
if (!kthread)
return;
@@ -1298,13 +1316,10 @@ EXPORT_SYMBOL(kthread_associate_blkcg);
*/
struct cgroup_subsys_state *kthread_blkcg(void)
{
- struct kthread *kthread;
+ struct kthread *kthread = __to_kthread(current);
- if (current->flags & PF_KTHREAD) {
- kthread = to_kthread(current);
- if (kthread)
- return kthread->blkcg_css;
- }
+ if (kthread)
+ return kthread->blkcg_css;
return NULL;
}
EXPORT_SYMBOL(kthread_blkcg);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 74cb20f32f72..87d9fad9d01d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7301,7 +7301,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
return 0;
/* Disregard pcpu kthreads; they are where they need to be. */
- if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
+ if (kthread_is_per_cpu(p))
return 0;
if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {