Linux-stable-mirror April 2019

linux-stable-mirror@lists.linaro.org

318 participants
849 discussions

Re: [PATCH 3.16 310/366] vmxnet3: fix checks for dma mapping errors

by Ben Hutchings

On Fri, 2019-03-29 at 17:01 +0100, Thomas Weißschuh wrote: > Hi Ben, > > I forgot you to Cc: on the following mail: > > https://lore.kernel.org/stable/20190329154756.GA14540@fralrnd0033.fra.amade… > > Thomas > > On Fri, Mar 29, 2019 at 04:47:58PM +0100, Thomas Weißschuh wrote: > > > 3.16.60-rc1 review patch. If anyone has any objections, please let me know. > > > > Sorry for the late response, this just hit the kernel in Debian Jessie > > (oldstable) a few days ago. > > > > > ------------------ > > > > > > From: Alexey Khoroshilov <khoroshilov(a)ispras.ru> > > > > > > commit 5738a09d58d5ad2871f1f9a42bf6a3aa9ece5b3c upstream. > > > > > > vmxnet3_drv does not check dma_addr with dma_mapping_error() > > > after mapping dma memory. The patch adds the checks and > > > tries to handle failures. > > > > We are seeing kernel panics/machine freezes/BUGs with the new 3.16.64 from Debian. > > I bisected it with the vanilla stable kernel and it boiled down to this commit. > > VMs of multiple nodes of our vmware cluster are affected. > > The bug can be triggered in multiple ways, I have seen it when an external > > network request is served, when installing packages over the network and > > performing a git clone. [...] I missed the upstream follow-up to this, which was commit 58caf637365f "Driver: Vmxnet3: Fix regression caused by 5738a09". I'm attaching a backport of that. I don't have any VMware installation to test on; could you do that? Ben. -- Ben Hutchings It is easier to change the specification to fit the program than vice versa.

6 years, 8 months

[PATCH] mm: Fix modifying of page protection by insert_pfn_pmd()

by Aneesh Kumar K.V

With some architectures like ppc64, set_pmd_at() cannot cope with a situation where there is already some (different) valid entry present. Use pmdp_set_access_flags() instead to modify the pfn which is built to deal with modifying existing PMD entries. This is similar to commit cae85cb8add3 ("mm/memory.c: fix modifying of page protection by insert_pfn()") We also do similar update w.r.t insert_pfn_pud eventhough ppc64 don't support pud pfn entries now. CC: stable(a)vger.kernel.org Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> --- mm/huge_memory.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 404acdcd0455..f7dca413c4b2 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -755,6 +755,20 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, spinlock_t *ptl; ptl = pmd_lock(mm, pmd); + if (!pmd_none(*pmd)) { + if (write) { + if (pmd_pfn(*pmd) != pfn_t_to_pfn(pfn)) { + WARN_ON_ONCE(!is_huge_zero_pmd(*pmd)); + goto out_unlock; + } + entry = pmd_mkyoung(*pmd); + entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + if (pmdp_set_access_flags(vma, addr, pmd, entry, 1)) + update_mmu_cache_pmd(vma, addr, pmd); + } + goto out_unlock; + } + entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); @@ -770,6 +784,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, set_pmd_at(mm, addr, pmd, entry); update_mmu_cache_pmd(vma, addr, pmd); +out_unlock: spin_unlock(ptl); } @@ -821,6 +836,20 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, spinlock_t *ptl; ptl = pud_lock(mm, pud); + if (!pud_none(*pud)) { + if (write) { + if (pud_pfn(*pud) != pfn_t_to_pfn(pfn)) { + WARN_ON_ONCE(!is_huge_zero_pud(*pud)); + goto out_unlock; + } + entry = pud_mkyoung(*pud); + entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma); + if (pudp_set_access_flags(vma, addr, pud, entry, 1)) + update_mmu_cache_pud(vma, addr, pud); + } + goto out_unlock; + } + entry = pud_mkhuge(pfn_t_pud(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pud_mkdevmap(entry); @@ -830,6 +859,8 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, } set_pud_at(mm, addr, pud, entry); update_mmu_cache_pud(vma, addr, pud); + +out_unlock: spin_unlock(ptl); } -- 2.20.1

6 years, 8 months

[RFC/RFT PATCH 06/18] crypto: chacha20poly1305 - set cra_name correctly

by Eric Biggers

From: Eric Biggers <ebiggers(a)google.com> If the rfc7539 template is instantiated with specific implementations, e.g. "rfc7539(chacha20-generic,poly1305-generic)" rather than "rfc7539(chacha20,poly1305)", then the implementation names end up included in the instance's cra_name. This is incorrect because it then prevents all users from allocating "rfc7539(chacha20,poly1305)", if the highest priority implementations of chacha20 and poly1305 were selected. Also, the self-tests aren't run on an instance allocated in this way. Fix it by setting the instance's cra_name from the underlying algorithms' actual cra_names, rather than from the requested names. This matches what other templates do. Fixes: 71ebc4d1b27d ("crypto: chacha20poly1305 - Add a ChaCha20-Poly1305 AEAD construction, RFC7539") Cc: <stable(a)vger.kernel.org> # v4.2+ Cc: Martin Willi <martin(a)strongswan.org> Signed-off-by: Eric Biggers <ebiggers(a)google.com> --- crypto/chacha20poly1305.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c index ed2e12e26dd80..279d816ab51dd 100644 --- a/crypto/chacha20poly1305.c +++ b/crypto/chacha20poly1305.c @@ -645,8 +645,8 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb, err = -ENAMETOOLONG; if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME, - "%s(%s,%s)", name, chacha_name, - poly_name) >= CRYPTO_MAX_ALG_NAME) + "%s(%s,%s)", name, chacha->base.cra_name, + poly->cra_name) >= CRYPTO_MAX_ALG_NAME) goto out_drop_chacha; if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME, "%s(%s,%s)", name, chacha->base.cra_driver_name, -- 2.21.0

6 years, 8 months

[RFC/RFT PATCH 01/18] crypto: x86/poly1305 - fix overflow during partial reduction

by Eric Biggers

From: Eric Biggers <ebiggers(a)google.com> The x86_64 implementation of Poly1305 produces the wrong result on some inputs because poly1305_4block_avx2() incorrectly assumes that when partially reducing the accumulator, the bits carried from limb 'd4' to limb 'h0' fit in a 32-bit integer. This is true for poly1305-generic which processes only one block at a time. However, it's not true for the AVX2 implementation, which processes 4 blocks at a time and therefore can produce intermediate limbs about 4x larger. Fix it by making the relevant calculations use 64-bit arithmetic rather than 32-bit. Note that most of the carries already used 64-bit arithmetic, but the d4 -> h0 carry was different for some reason. To be safe I also made the same change to the corresponding SSE2 code, though that only operates on 1 or 2 blocks at a time. I don't think it's really needed for poly1305_block_sse2(), but it doesn't hurt because it's already x86_64 code. It *might* be needed for poly1305_2block_sse2(), but overflows aren't easy to reproduce there. This bug was originally detected by my patches that improve testmgr to fuzz algorithms against their generic implementation. But also add a test vector which reproduces it directly (in the AVX2 case). Fixes: b1ccc8f4b631 ("crypto: poly1305 - Add a four block AVX2 variant for x86_64") Fixes: c70f4abef07a ("crypto: poly1305 - Add a SSE2 SIMD variant for x86_64") Cc: <stable(a)vger.kernel.org> # v4.3+ Cc: Martin Willi <martin(a)strongswan.org> Cc: Jason A. Donenfeld <Jason(a)zx2c4.com> Signed-off-by: Eric Biggers <ebiggers(a)google.com> --- arch/x86/crypto/poly1305-avx2-x86_64.S | 14 +++++--- arch/x86/crypto/poly1305-sse2-x86_64.S | 22 ++++++++----- crypto/testmgr.h | 44 +++++++++++++++++++++++++- 3 files changed, 67 insertions(+), 13 deletions(-) diff --git a/arch/x86/crypto/poly1305-avx2-x86_64.S b/arch/x86/crypto/poly1305-avx2-x86_64.S index 3b6e70d085da8..8457cdd47f751 100644 --- a/arch/x86/crypto/poly1305-avx2-x86_64.S +++ b/arch/x86/crypto/poly1305-avx2-x86_64.S @@ -323,6 +323,12 @@ ENTRY(poly1305_4block_avx2) vpaddq t2,t1,t1 vmovq t1x,d4 + # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> + # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small + # amount. Careful: we must not assume the carry bits 'd0 >> 26', + # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit + # integers. It's true in a single-block implementation, but not here. + # d1 += d0 >> 26 mov d0,%rax shr $26,%rax @@ -361,16 +367,16 @@ ENTRY(poly1305_4block_avx2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4 # h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86/crypto/poly1305-sse2-x86_64.S index e6add74d78a59..6f0be7a869641 100644 --- a/arch/x86/crypto/poly1305-sse2-x86_64.S +++ b/arch/x86/crypto/poly1305-sse2-x86_64.S @@ -253,16 +253,16 @@ ENTRY(poly1305_block_sse2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4 # h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx @@ -524,6 +524,12 @@ ENTRY(poly1305_2block_sse2) paddq t2,t1 movq t1,d4 + # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> + # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small + # amount. Careful: we must not assume the carry bits 'd0 >> 26', + # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit + # integers. It's true in a single-block implementation, but not here. + # d1 += d0 >> 26 mov d0,%rax shr $26,%rax @@ -562,16 +568,16 @@ ENTRY(poly1305_2block_sse2) # h0 += (d4 >> 26) * 5 mov d4,%rax shr $26,%rax - lea (%eax,%eax,4),%eax - add %eax,%ebx + lea (%rax,%rax,4),%rax + add %rax,%rbx # h4 = d4 & 0x3ffffff mov d4,%rax and $0x3ffffff,%eax mov %eax,h4 # h1 += h0 >> 26 - mov %ebx,%eax - shr $26,%eax + mov %rbx,%rax + shr $26,%rax add %eax,h1 # h0 = h0 & 0x3ffffff andl $0x3ffffff,%ebx diff --git a/crypto/testmgr.h b/crypto/testmgr.h index f267633cf13ac..d18a37629f053 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -5634,7 +5634,49 @@ static const struct hash_testvec poly1305_tv_template[] = { .psize = 80, .digest = "\x13\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00", - }, + }, { /* Regression test for overflow in AVX2 implementation */ + .plaintext = "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff\xff\xff\xff\xff" + "\xff\xff\xff\xff", + .psize = 300, + .digest = "\xfb\x5e\x96\xd8\x61\xd5\xc7\xc8" + "\x78\xe5\x87\xcc\x2d\x5a\x22\xe1", + } }; /* NHPoly1305 test vectors from https://github.com/google/adiantum */ -- 2.21.0

6 years, 8 months

[PATCH hulk-4.19-next 08/10] DTS:DTS2018120709658 Description:scsi: sd: Fix a race between closing an sd device and sd I/O

by chenxiang

From: c00284940 <c00284940(a)huawei.com> plinth inclusion category: bugfix bugzilla: NA DTS: DTS2018120709658 CVE: NA The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and hence needs the disk->private_data pointer. Avoid that that pointer is cleared before all affected I/O requests have finished. This patch avoids that the following crash occurs: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: scsi_mq_uninit_cmd+0x1c/0x30 scsi_end_request+0x7c/0x1b8 scsi_io_completion+0x464/0x668 scsi_finish_command+0xbc/0x160 scsi_eh_flush_done_q+0x10c/0x170 sas_scsi_recover_host+0x84c/0xa98 [libsas] scsi_error_handler+0x140/0x5b0 kthread+0x100/0x12c ret_from_fork+0x10/0x18 Cc: Christoph Hellwig <hch(a)lst.de> Cc: Ming Lei <ming.lei(a)redhat.com> Cc: Hannes Reinecke <hare(a)suse.com> Cc: Johannes Thumshirn <jthumshirn(a)suse.de> Cc: Jason Yan <yanaijie(a)huawei.com> Cc: <stable(a)vger.kernel.org> Reported-by: Jason Yan <yanaijie(a)huawei.com> Signed-off-by: Bart Van Assche <bvanassche(a)acm.org> Change-Id: Ib761a1144492a507c7c3d4a09b817bb4f1285835 Signed-off-by: c00284940 <c00284940(a)huawei.com> Reviewed-on: http://10.90.31.173:8080/5601 Tested-by: public TuringEE <turingee(a)huawei.com> Reviewed-by: huangdaode 00314581 <huangdaode(a)hisilicon.com> Reviewed-by: public TuringEE <turingee(a)huawei.com> --- drivers/scsi/sd.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index a02196b..eb37f7d 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1413,11 +1413,6 @@ static void sd_release(struct gendisk *disk, fmode_t mode) scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW); } - /* - * XXX and what if there are packets in flight and this close() - * XXX is followed by a "rmmod sd_mod"? - */ - scsi_disk_put(sdkp); } @@ -3514,9 +3509,21 @@ static void scsi_disk_release(struct device *dev) { struct scsi_disk *sdkp = to_scsi_disk(dev); struct gendisk *disk = sdkp->disk; - + struct request_queue *q = disk->queue; + ida_free(&sd_index_ida, sdkp->index); + /* + * Wait until all requests that are in progress have completed. + * This is necessary to avoid that e.g. scsi_end_request() crashes + * due to clearing the disk->private_data pointer. Wait from inside + * scsi_disk_release() instead of from sd_release() to avoid that + * freezing and unfreezing the request queue affects user space I/O + * in case multiple processes open a /dev/sd... node concurrently. + */ + blk_mq_freeze_queue(q); + blk_mq_unfreeze_queue(q); + disk->private_data = NULL; put_disk(disk); put_device(&sdkp->device->sdev_gendev); -- 2.8.1

6 years, 8 months

Build failure in v3.18.y.queue

by Guenter Roeck

Building arm:mxs_defconfig ... failed drivers/tty/serial/mxs-auart.c: In function 'mxs_auart_probe': drivers/tty/serial/mxs-auart.c:1065:3: error: label 'out_disable_clks' used but not defined Guenter

6 years, 8 months

[PATCH hulk-4.19-next 03/10] DTS:DTS2019030103192 Description:{fromtree} scsi:hisi_sas: fix calls to dma_set_mask_and_coherent()

by chenxiang

From: c00284940 <c00284940(a)huawei.com> plinth inclusion category: bugfix bugzilla: NA DTS: DTS2019030103192 CVE: NA The change to use dma_set_mask_and_coherent() incorrectly made a second call with the 32 bit DMA mask value when the call with the 64 bit DMA mask value succeeded. This resulted in FC connections failing due to corrupted data buffers, and various other SCSI/FCP I/O errors. Fixes: e4db40e ("scsi: hisi_sas: use dma_set_mask_and_coherent") Cc: <stable(a)vger.kernel.org> Suggested-by: Ewan D. Milne <emilne(a)redhat.com> Signed-off-by: Hannes Reinecke <hare(a)suse.com> Change-Id: I2d8b409b5b9675a8d7a17730cde342b2f7579c38 Signed-off-by: c00284940 <c00284940(a)huawei.com> Reviewed-on: http://10.90.31.173:8080/4799 Tested-by: public TuringEE <turingee(a)huawei.com> Reviewed-by: huangdaode 00314581 <huangdaode(a)hisilicon.com> Reviewed-by: public TuringEE <turingee(a)huawei.com> --- drivers/scsi/hisi_sas/hisi_sas_main.c | 8 ++++++-- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 15 +++++++-------- 2 files changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index ff32799..a2a4c74 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -2556,6 +2556,7 @@ static struct Scsi_Host *hisi_sas_shost_alloc(struct platform_device *pdev, struct Scsi_Host *shost; struct hisi_hba *hisi_hba; struct device *dev = &pdev->dev; + int error; shost = scsi_host_alloc(hw->sht, sizeof(*hisi_hba)); if (!shost) { @@ -2576,8 +2577,11 @@ static struct Scsi_Host *hisi_sas_shost_alloc(struct platform_device *pdev, if (hisi_sas_get_fw_info(hisi_hba) < 0) goto err_out; - if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)) && - dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32))) { + error = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)); + if (error) + error = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); + + if (error) { dev_err(dev, "No usable DMA addressing method\n"); goto err_out; } diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index b4bde32..24db55f 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -2903,14 +2903,13 @@ hisi_sas_v3_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) goto err_out_disable_device; - if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(64)) != 0) || - (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64)) != 0)) { - if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)) != 0) || - (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)) != 0)) { - dev_err(dev, "No usable DMA addressing method\n"); - rc = -EIO; - goto err_out_regions; - } + rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (rc) + rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); + if (rc) { + dev_err(dev, "No usable DMA addressing method\n"); + rc = -ENODEV; + goto err_out_regions; } shost = hisi_sas_shost_alloc_pci(pdev); -- 2.8.1

6 years, 8 months

Re: [PATCH] staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir()

by Gao Xiang

Hi, On 2019/3/30 21:45, Sasha Levin wrote: > Hi, > > [This is an automated email] > > This commit has been processed because it contains a "Fixes:" tag, > fixing commit: 3aa8ec716e52 staging: erofs: add directory operations. > > The bot has tested the following trees: v5.0.5, v4.19.32. > > v5.0.5: Build OK! > v4.19.32: Failed to apply! Possible dependencies: > 6e78901a9f23 ("staging: erofs: separate erofs_get_meta_page") > 7dd68b147d60 ("staging: erofs: use explicit unsigned int type") > 8be31270362b ("staging: erofs: introduce erofs_grab_bio") > ab47dd2b0819 ("staging: erofs: cleanup z_erofs_vle_work_{lookup, register}") > > > How should we proceed with this patch? I have made a 4.19 patch for this: https://lore.kernel.org/lkml/20190401065309.68109-2-gaoxiang25@huawei.com/ Thanks, Gao Xiang > > -- > Thanks, > Sasha >

6 years, 8 months

[PATCH V2 1/3] blk-mq: free hw queue's resource in hctx's release handler

by Ming Lei

Once blk_cleanup_queue() returns, tags shouldn't be used any more, because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2 ("blk-mq: Fix a use-after-free") fixes this issue exactly. However, that commit introduces another issue. Before 45a9c9d909b2, we are allowed to run queue during cleaning up queue if the queue's kobj refcount is held. After that commit, queue can't be run during queue cleaning up, otherwise oops can be triggered easily because some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue(). We have invented ways for addressing this kind of issue before, such as: 8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done") c2856ae2f315 ("blk-mq: quiesce queue before freeing queue") But still can't cover all cases, recently James reports another such kind of issue: https://marc.info/?l=linux-scsi&m=155389088124782&w=2 This issue can be quite hard to address by previous way, given scsi_run_queue() may run requeues for other LUNs. Fixes the above issue by freeing hctx's resources in its release handler, and this way is safe becasue tags isn't needed for freeing such hctx resource. This approach follows typical design pattern wrt. kobject's release handler. Cc: Dongli Zhang <dongli.zhang(a)oracle.com> Cc: James Smart <james.smart(a)broadcom.com> Cc: Bart Van Assche <bart.vanassche(a)wdc.com> Cc: linux-scsi(a)vger.kernel.org, Cc: Martin K . Petersen <martin.petersen(a)oracle.com>, Cc: Christoph Hellwig <hch(a)lst.de>, Cc: James E . J . Bottomley <jejb(a)linux.vnet.ibm.com>, Cc: jianchao wang <jianchao.w.wang(a)oracle.com> Reported-by: James Smart <james.smart(a)broadcom.com> Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free") Cc: stable(a)vger.kernel.org Signed-off-by: Ming Lei <ming.lei(a)redhat.com> --- block/blk-core.c | 2 +- block/blk-mq-sysfs.c | 6 ++++++ block/blk-mq.c | 8 ++------ block/blk-mq.h | 2 +- 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 4673ebe42255..b3bbf8a5110d 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -375,7 +375,7 @@ void blk_cleanup_queue(struct request_queue *q) blk_exit_queue(q); if (queue_is_mq(q)) - blk_mq_free_queue(q); + blk_mq_exit_queue(q); percpu_ref_exit(&q->q_usage_counter); diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 3f9c3f4ac44c..4040e62c3737 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -10,6 +10,7 @@ #include <linux/smp.h> #include <linux/blk-mq.h> +#include "blk.h" #include "blk-mq.h" #include "blk-mq-tag.h" @@ -33,6 +34,11 @@ static void blk_mq_hw_sysfs_release(struct kobject *kobj) { struct blk_mq_hw_ctx *hctx = container_of(kobj, struct blk_mq_hw_ctx, kobj); + + if (hctx->flags & BLK_MQ_F_BLOCKING) + cleanup_srcu_struct(hctx->srcu); + blk_free_flush_queue(hctx->fq); + sbitmap_free(&hctx->ctx_map); free_cpumask_var(hctx->cpumask); kfree(hctx->ctxs); kfree(hctx); diff --git a/block/blk-mq.c b/block/blk-mq.c index 70b210a308c4..05d11149e75a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2243,12 +2243,7 @@ static void blk_mq_exit_hctx(struct request_queue *q, if (set->ops->exit_hctx) set->ops->exit_hctx(hctx, hctx_idx); - if (hctx->flags & BLK_MQ_F_BLOCKING) - cleanup_srcu_struct(hctx->srcu); - blk_mq_remove_cpuhp(hctx); - blk_free_flush_queue(hctx->fq); - sbitmap_free(&hctx->ctx_map); } static void blk_mq_exit_hw_queues(struct request_queue *q, @@ -2881,7 +2876,8 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set, } EXPORT_SYMBOL(blk_mq_init_allocated_queue); -void blk_mq_free_queue(struct request_queue *q) +/* tags can _not_ be used after returning from blk_mq_exit_queue */ +void blk_mq_exit_queue(struct request_queue *q) { struct blk_mq_tag_set *set = q->tag_set; diff --git a/block/blk-mq.h b/block/blk-mq.h index 0ed8e5a8729f..0a5d0893dae4 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -37,7 +37,7 @@ struct blk_mq_ctx { struct kobject kobj; } ____cacheline_aligned_in_smp; -void blk_mq_free_queue(struct request_queue *q); +void blk_mq_exit_queue(struct request_queue *q); int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr); void blk_mq_wake_waiters(struct request_queue *q); bool blk_mq_dispatch_rq_list(struct request_queue *, struct list_head *, bool); -- 2.9.5

6 years, 8 months

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror April 2019