Because an address range scrub (ARS) can take tens to hundreds of seconds, it is not feasible to wait for ARS completion before publishing persistent memory namespaces. Instead, convert the ARS implementation to perform a short ARS for critical errors, the ones that caused a previous system reset, before registering namespaces. Finally, arrange for all long ARS operations to run in the background and populate the badblock lists at run time.
While developing this rework a handful of cleanups and fixes also fell out.
---
Dan Williams (6):
      nfit: fix region registration vs block-data-window ranges
      nfit, address-range-scrub: fix scrub in-progress reporting
      libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
      nfit, address-range-scrub: introduce nfit_spa->ars_state
      nfit, address-range-scrub: rework and simplify ARS state machine
      nfit, address-range-scrub: add module option to skip initial ars

 drivers/acpi/nfit/core.c     |  443 ++++++++++++++++++------------------------
 drivers/acpi/nfit/nfit.h     |   13 +
 drivers/nvdimm/nd.h          |    1
 drivers/nvdimm/region_devs.c |    8 +
 include/linux/libnvdimm.h    |    1
 5 files changed, 213 insertions(+), 253 deletions(-)
Commit 1cf03c00e7c1 "nfit: scrub and register regions in a workqueue" mistakenly attempts to register a region per BLK aperture. There is nothing to register for individual apertures, as they belong as a set to a BLK aperture group that is registered with a corresponding DIMM-control-region. Filter them out of registration to prevent some needless devm_kzalloc() allocations.
Cc: stable@vger.kernel.org
Fixes: 1cf03c00e7c1 ("nfit: scrub and register regions in a workqueue")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |   22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 12fb414fa678..ea9f3e727fef 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -3018,15 +3018,21 @@ static void acpi_nfit_scrub(struct work_struct *work)
 static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
 {
 	struct nfit_spa *nfit_spa;
-	int rc;

-	list_for_each_entry(nfit_spa, &acpi_desc->spas, list)
-		if (nfit_spa_type(nfit_spa->spa) == NFIT_SPA_DCR) {
-			/* BLK regions don't need to wait for ars results */
-			rc = acpi_nfit_register_region(acpi_desc, nfit_spa);
-			if (rc)
-				return rc;
-		}
+	list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
+		int rc, type = nfit_spa_type(nfit_spa->spa);
+
+		/* PMEM and VMEM will be registered by the ARS workqueue */
+		if (type == NFIT_SPA_PM || type == NFIT_SPA_VOLATILE)
+			continue;
+		/* BLK apertures belong to BLK region registration below */
+		if (type == NFIT_SPA_BDW)
+			continue;
+		/* BLK regions don't need to wait for ARS results */
+		rc = acpi_nfit_register_region(acpi_desc, nfit_spa);
+		if (rc)
+			return rc;
+	}

 	acpi_desc->ars_start_flags = 0;
 	if (!acpi_desc->cancel)
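For readers following along outside the kernel tree, the filtering rule in the hunk above can be modeled in isolation. This is only a userspace sketch: the enum values and the helpers registers_without_ars() and count_immediate() are hypothetical stand-ins, not the kernel's NFIT_SPA_* definitions or any nfit function.

```c
#include <stdbool.h>

/* Stand-in range types for illustration; not the kernel's NFIT_SPA_* enum. */
enum spa_type { SPA_PM, SPA_VOLATILE, SPA_BDW, SPA_DCR };

/*
 * Mirror the decision made in the loop: PMEM and volatile ranges are left
 * to the ARS workqueue, BLK apertures (BDW) are skipped because they are
 * covered by their control region, and everything else is registered
 * right away.
 */
static bool registers_without_ars(enum spa_type type)
{
	if (type == SPA_PM || type == SPA_VOLATILE)
		return false;
	if (type == SPA_BDW)
		return false;
	return true;
}

/* Count how many entries of a range list would be registered immediately. */
static int count_immediate(const enum spa_type *types, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (registers_without_ars(types[i]))
			count++;
	return count;
}
```

With a list holding one PMEM range, two BLK apertures, one control region and one volatile range, only the control region counts as an immediate registration, which is the behavior the patch is after.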
On 4/2/2018 9:46 PM, Dan Williams wrote:
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Hi,
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag, fixing commit: 1cf03c00e7c1 nfit: scrub and register regions in a workqueue.
The bot has also determined it's probably a bug fixing patch. (score: 80.4079)
The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92.
v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!

--
Thanks,
Sasha
There is a small window whereby ARS scan requests can schedule work that userspace will miss when polling scrub_show. Hold the init_mutex lock over calls to report the status to close this potential escape. Also, make sure that requests to cancel the ARS workqueue are treated as an idle event.
Cc: stable@vger.kernel.org
Cc: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 37b137ff8c83 ("nfit, libnvdimm: allow an ARS scrub...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index ea9f3e727fef..2a1fc3817a81 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1249,8 +1249,11 @@ static ssize_t scrub_show(struct device *dev,
 	if (nd_desc) {
 		struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);

+		mutex_lock(&acpi_desc->init_mutex);
 		rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
-				(work_busy(&acpi_desc->work)) ? "+\n" : "\n");
+				work_busy(&acpi_desc->work)
+				&& !acpi_desc->cancel ? "+\n" : "\n");
+		mutex_unlock(&acpi_desc->init_mutex);
 	}
 	device_unlock(dev);
 	return rc;
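Since the point of this fix is what a polling userspace sees, it may help to spell out how the "%d%s" output of scrub_show() would be consumed. The following is a minimal userspace sketch, not code from ndctl or the kernel: struct scrub_status and parse_scrub_status() are hypothetical names, and the only thing taken from the patch is the format, a decimal count followed by "+\n" while a scrub is in progress or "\n" when idle.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace view of one read of the 'scrub' attribute. */
struct scrub_status {
	int count;	/* the number printed by "%d" (scrub_count) */
	bool busy;	/* true when the trailing '+' is present */
};

/*
 * Parse a buffer in the format emitted by scrub_show(): "<count>+\n" while
 * a scrub is in progress, "<count>\n" when idle. Returns 0 on success and
 * -1 if the buffer does not match; *st is only written on success.
 */
static int parse_scrub_status(const char *buf, struct scrub_status *st)
{
	char *end;
	long val;
	bool busy;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno != 0 || val < INT_MIN || val > INT_MAX)
		return -1;	/* no digits, or value out of int range */

	busy = (*end == '+');
	if (busy)
		end++;
	if (*end != '\n' && *end != '\0')
		return -1;	/* unexpected trailing characters */

	st->count = (int)val;
	st->busy = busy;
	return 0;
}
```

A poller that waits for the count to advance and the busy flag to clear relies on the two values being reported consistently, which is why the status is now sampled under init_mutex and a cancelled workqueue is reported as idle.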
On 4/2/2018 9:46 PM, Dan Williams wrote:
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Hi,
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag, fixing commit: 37b137ff8c83 nfit, libnvdimm: allow an ARS scrub to be triggered on demand.
The bot has also determined it's probably a bug fixing patch. (score: 97.4657)
The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92.
v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!

--
Thanks,
Sasha
linux-stable-mirror@lists.linaro.org