The comment in kvm_get_shadow_phys_bits refers to MKTME, but the same is actually
true of SME and SEV. Just use CPUID[0x8000_0008].EAX[7:0] unconditionally, it is
simplest and works even if memory is not encrypted.
Cc: stable(a)vger.kernel.org
Reported-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
arch/x86/kvm/mmu/mmu.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6f92b40d798c..8b8edfbdbaef 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -538,15 +538,11 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
static u8 kvm_get_shadow_phys_bits(void)
{
/*
- * boot_cpu_data.x86_phys_bits is reduced when MKTME is detected
- * in CPU detection code, but MKTME treats those reduced bits as
- * 'keyID' thus they are not reserved bits. Therefore for MKTME
- * we should still return physical address bits reported by CPUID.
+ * boot_cpu_data.x86_phys_bits is reduced when MKTME or SME are detected
+ * in CPU detection code, but the processor treats those reduced bits as
+ * 'keyID' thus they are not reserved bits. Therefore KVM needs to look at
+ * the physical address bits reported by CPUID.
*/
- if (!boot_cpu_has(X86_FEATURE_TME) ||
- WARN_ON_ONCE(boot_cpu_data.extended_cpuid_level < 0x80000008))
- return boot_cpu_data.x86_phys_bits;
-
return cpuid_eax(0x80000008) & 0xff;
}
--
1.8.3.1
If max_pfn is not aligned to a section boundary, we can easily run into
BUGs. This can e.g., be triggered on x86-64 under QEMU by specifying a
memory size that is not a multiple of 128MB (e.g., 4097MB, but also
4160MB). I was told that on real HW, we can easily have this scenario
(esp., one of the main reasons sub-section hotadd of devmem was added).
The issue is, that we have a valid memmap (pfn_valid()) for the
whole section, and the whole section will be marked "online".
pfn_to_online_page() will succeed, but the memmap contains garbage.
E.g., doing a "./page-types -r -a 0x144001" when QEMU was started with
"-m 4160M" - (see tools/vm/page-types.c):
[ 200.476376] BUG: unable to handle page fault for address: fffffffffffffffe
[ 200.477500] #PF: supervisor read access in kernel mode
[ 200.478334] #PF: error_code(0x0000) - not-present page
[ 200.479076] PGD 59614067 P4D 59614067 PUD 59616067 PMD 0
[ 200.479557] Oops: 0000 [#4] SMP NOPTI
[ 200.479875] CPU: 0 PID: 603 Comm: page-types Tainted: G D W 5.5.0-rc1-next-20191209 #93
[ 200.480646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
[ 200.481648] RIP: 0010:stable_page_flags+0x4d/0x410
[ 200.482061] Code: f3 ff 41 89 c0 48 b8 00 00 00 00 01 00 00 00 45 84 c0 0f 85 cd 02 00 00 48 8b 53 08 48 8b 2b 48f
[ 200.483644] RSP: 0018:ffffb139401cbe60 EFLAGS: 00010202
[ 200.484091] RAX: fffffffffffffffe RBX: fffffbeec5100040 RCX: 0000000000000000
[ 200.484697] RDX: 0000000000000001 RSI: ffffffff9535c7cd RDI: 0000000000000246
[ 200.485313] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000
[ 200.485917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000144001
[ 200.486523] R13: 00007ffd6ba55f48 R14: 00007ffd6ba55f40 R15: ffffb139401cbf08
[ 200.487130] FS: 00007f68df717580(0000) GS:ffff9ec77fa00000(0000) knlGS:0000000000000000
[ 200.487804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 200.488295] CR2: fffffffffffffffe CR3: 0000000135d48000 CR4: 00000000000006f0
[ 200.488897] Call Trace:
[ 200.489115] kpageflags_read+0xe9/0x140
[ 200.489447] proc_reg_read+0x3c/0x60
[ 200.489755] vfs_read+0xc2/0x170
[ 200.490037] ksys_pread64+0x65/0xa0
[ 200.490352] do_syscall_64+0x5c/0xa0
[ 200.490665] entry_SYSCALL_64_after_hwframe+0x49/0xbe
But it can be triggered much easier via "cat /proc/kpageflags > /dev/null"
after cold/hot plugging a DIMM to such a system:
[root@localhost ~]# cat /proc/kpageflags > /dev/null
[ 111.517275] BUG: unable to handle page fault for address: fffffffffffffffe
[ 111.517907] #PF: supervisor read access in kernel mode
[ 111.518333] #PF: error_code(0x0000) - not-present page
[ 111.518771] PGD a240e067 P4D a240e067 PUD a2410067 PMD 0
This patch fixes that by at least zero-ing out that memmap (so e.g.,
page_to_pfn() will not crash). Commit 907ec5fca3dc ("mm: zero remaining
unavailable struct pages") tried to fix a similar issue, but forgot to
consider this special case.
After this patch, there are still problems to solve. E.g., not all of these
pages falling into a memory hole will actually get initialized later
and set PageReserved - they are only zeroed out - but at least the
immediate crashes are gone. A follow-up patch will take care of this.
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Tested-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: <stable(a)vger.kernel.org> # v4.15+
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Steven Sistare <steven.sistare(a)oracle.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Bob Picco <bob.picco(a)oracle.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
mm/page_alloc.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 62dcd6b76c80..1eb2ce7c79e4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6932,7 +6932,8 @@ static u64 zero_pfn_range(unsigned long spfn, unsigned long epfn)
* This function also addresses a similar issue where struct pages are left
* uninitialized because the physical address range is not covered by
* memblock.memory or memblock.reserved. That could happen when memblock
- * layout is manually configured via memmap=.
+ * layout is manually configured via memmap=, or when the highest physical
+ * address (max_pfn) does not end on a section boundary.
*/
void __init zero_resv_unavail(void)
{
@@ -6950,7 +6951,16 @@ void __init zero_resv_unavail(void)
pgcnt += zero_pfn_range(PFN_DOWN(next), PFN_UP(start));
next = end;
}
- pgcnt += zero_pfn_range(PFN_DOWN(next), max_pfn);
+
+ /*
+ * Early sections always have a fully populated memmap for the whole
+ * section - see pfn_valid(). If the last section has holes at the
+ * end and that section is marked "online", the memmap will be
+ * considered initialized. Make sure that memmap has a well defined
+ * state.
+ */
+ pgcnt += zero_pfn_range(PFN_DOWN(next),
+ round_up(max_pfn, PAGES_PER_SECTION));
/*
* Struct pages that do not have backing memory. This could be because
--
2.23.0
From: Sreekanth Reddy <sreekanth.reddy(a)broadcom.com>
[ Upstream commit 782b281883caf70289ba6a186af29441a117d23e ]
When user issues diag register command from application with required size,
and if driver unable to allocate the memory, then it will fail the register
command. While failing the register command, driver is not currently
clearing MPT3_CMD_PENDING bit in ctl_cmds.status variable which was set
before trying to allocate the memory. As this bit is set, subsequent
register command will be failed with BUSY status even when user wants to
register the trace buffer will less memory.
Clear MPT3_CMD_PENDING bit in ctl_cmds.status before returning the diag
register command with no memory status.
Link: https://lore.kernel.org/r/1568379890-18347-4-git-send-email-sreekanth.reddy…
Signed-off-by: Sreekanth Reddy <sreekanth.reddy(a)broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/scsi/mpt3sas/mpt3sas_ctl.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
index 26cdc127ac89c..90a87e59ff602 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -1465,7 +1465,8 @@ _ctl_diag_register_2(struct MPT3SAS_ADAPTER *ioc,
" for diag buffers, requested size(%d)\n",
ioc->name, __func__, request_data_sz);
mpt3sas_base_free_smid(ioc, smid);
- return -ENOMEM;
+ rc = -ENOMEM;
+ goto out;
}
ioc->diag_buffer[buffer_type] = request_data;
ioc->diag_buffer_sz[buffer_type] = request_data_sz;
--
2.20.1
From: Chen-Yu Tsai <wens(a)csie.org>
max_pfn, as set in arch/arm/mm/init.c:
static void __init find_limits(unsigned long *min,
unsigned long *max_low,
unsigned long *max_high)
{
*max_low = PFN_DOWN(memblock_get_current_limit());
*min = PFN_UP(memblock_start_of_DRAM());
*max_high = PFN_DOWN(memblock_end_of_DRAM());
}
with memblock_end_of_DRAM() pointing to the next byte after DRAM. As
such, max_pfn points to the PFN after the end of DRAM.
Thus when using max_pfn to check DMA masks, we should subtract one
when checking DMA ranges against it.
Commit 8bf1268f48ad ("ARM: dma-api: fix off-by-one error in
__dma_supported()") fixed the same issue, but missed this spot.
This issue was found while working on the sun4i-csi v4l2 driver on the
Allwinner R40 SoC. On Allwinner SoCs, DRAM is offset at 0x40000000,
and we are starting to use of_dma_configure() with the "dma-ranges"
property in the device tree to have the DMA API handle the offset.
In this particular instance, dma-ranges was set to the same range as
the actual available (2 GiB) DRAM. The following error appeared when
the driver attempted to allocate a buffer:
sun4i-csi 1c09000.csi: Coherent DMA mask 0x7fffffff (pfn 0x40000-0xc0000)
covers a smaller range of system memory than the DMA zone pfn 0x0-0xc0001
sun4i-csi 1c09000.csi: dma_alloc_coherent of size 307200 failed
Fixing the off-by-one error makes things work.
Fixes: 11a5aa32562e ("ARM: dma-mapping: check DMA mask against available memory")
Fixes: 9f28cde0bc64 ("ARM: another fix for the DMA mapping checks")
Fixes: ab746573c405 ("ARM: dma-mapping: allow larger DMA mask than supported")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wens(a)csie.org>
---
arch/arm/mm/dma-mapping.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index e822af0d9219..f4daafdbac56 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -227,12 +227,12 @@ static int __dma_supported(struct device *dev, u64 mask, bool warn)
* Translate the device's DMA mask to a PFN limit. This
* PFN number includes the page which we can DMA to.
*/
- if (dma_to_pfn(dev, mask) < max_dma_pfn) {
+ if (dma_to_pfn(dev, mask) < max_dma_pfn - 1) {
if (warn)
dev_warn(dev, "Coherent DMA mask %#llx (pfn %#lx-%#lx) covers a smaller range of system memory than the DMA zone pfn 0x0-%#lx\n",
mask,
dma_to_pfn(dev, 0), dma_to_pfn(dev, mask) + 1,
- max_dma_pfn + 1);
+ max_dma_pfn);
return 0;
}
--
2.24.0
From: James Smart <jsmart2021(a)gmail.com>
[ Upstream commit 3f97aed6117c7677eb16756c4ec8b86000fd5822 ]
An issue was seen discovering all SCSI Luns when a target device undergoes
link bounce.
The driver currently does not qualify the FC4 support on the target.
Therefore it will send a SCSI PRLI and an NVMe PRLI. The expectation is
that the target will reject the PRLI if it is not supported. If a PRLI
times out, the driver will retry. The driver will not proceed with the
device until both SCSI and NVMe PRLIs are resolved. In the failure case,
the device is FCP only and does not respond to the NVMe PRLI, thus
initiating the wait/retry loop in the driver. During that time, a RSCN is
received (device bounced) causing the driver to issue a GID_FT. The GID_FT
response comes back before the PRLI mess is resolved and it prematurely
cancels the PRLI retry logic and leaves the device in a STE_PRLI_ISSUE
state. Discovery with the target never completes or resets.
Fix by resetting the node state back to STE_NPR_NODE when GID_FT completes,
thereby restarting the discovery process for the node.
Link: https://lore.kernel.org/r/20190922035906.10977-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy(a)broadcom.com>
Signed-off-by: James Smart <jsmart2021(a)gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/scsi/lpfc/lpfc_hbadisc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index b36b3da323a0a..5d657178c2b98 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -5231,9 +5231,14 @@ lpfc_setup_disc_node(struct lpfc_vport *vport, uint32_t did)
/* If we've already received a PLOGI from this NPort
* we don't need to try to discover it again.
*/
- if (ndlp->nlp_flag & NLP_RCV_PLOGI)
+ if (ndlp->nlp_flag & NLP_RCV_PLOGI &&
+ !(ndlp->nlp_type &
+ (NLP_FCP_TARGET | NLP_NVME_TARGET)))
return NULL;
+ ndlp->nlp_prev_state = ndlp->nlp_state;
+ lpfc_nlp_set_state(vport, ndlp, NLP_STE_NPR_NODE);
+
spin_lock_irq(shost->host_lock);
ndlp->nlp_flag |= NLP_NPR_2B_DISC;
spin_unlock_irq(shost->host_lock);
--
2.20.1
The patch below was submitted to be applied to the 5.4-stable tree.
I fail to see how this patch meets the stable kernel rules as found at
Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to
<stable(a)vger.kernel.org> and let me know why this patch should be
applied. Otherwise, it is now dropped from my patch queues, never to be
seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9b8d7072d6552ee5c57e5765f211f267041f9557 Mon Sep 17 00:00:00 2001
From: "H. Nikolaus Schaller" <hns(a)goldelico.com>
Date: Thu, 7 Nov 2019 11:30:35 +0100
Subject: [PATCH] net: wireless: ti: wl1251 add device tree support
We will have the wl1251 defined as a child node of the mmc interface
and can read setup for gpios, interrupts and the ti,use-eeprom
property from there instead of pdata to be provided by pdata-quirks.
Fixes: 81eef6ca9201 ("mmc: omap_hsmmc: Use dma_request_chan() for requesting DMA channel")
Signed-off-by: H. Nikolaus Schaller <hns(a)goldelico.com>
Acked-by: Kalle Valo <kvalo(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org> # v4.7+
[Ulf: Fixed up some complaints from checkpatch]
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/net/wireless/ti/wl1251/sdio.c b/drivers/net/wireless/ti/wl1251/sdio.c
index 677f1146ccf0..f1224b948f83 100644
--- a/drivers/net/wireless/ti/wl1251/sdio.c
+++ b/drivers/net/wireless/ti/wl1251/sdio.c
@@ -16,6 +16,9 @@
#include <linux/irq.h>
#include <linux/pm_runtime.h>
#include <linux/gpio.h>
+#include <linux/of.h>
+#include <linux/of_gpio.h>
+#include <linux/of_irq.h>
#include "wl1251.h"
@@ -217,6 +220,7 @@ static int wl1251_sdio_probe(struct sdio_func *func,
struct ieee80211_hw *hw;
struct wl1251_sdio *wl_sdio;
const struct wl1251_platform_data *wl1251_board_data;
+ struct device_node *np = func->dev.of_node;
hw = wl1251_alloc_hw();
if (IS_ERR(hw))
@@ -248,6 +252,17 @@ static int wl1251_sdio_probe(struct sdio_func *func,
wl->power_gpio = wl1251_board_data->power_gpio;
wl->irq = wl1251_board_data->irq;
wl->use_eeprom = wl1251_board_data->use_eeprom;
+ } else if (np) {
+ wl->use_eeprom = of_property_read_bool(np,
+ "ti,wl1251-has-eeprom");
+ wl->power_gpio = of_get_named_gpio(np, "ti,power-gpio", 0);
+ wl->irq = of_irq_get(np, 0);
+
+ if (wl->power_gpio == -EPROBE_DEFER ||
+ wl->irq == -EPROBE_DEFER) {
+ ret = -EPROBE_DEFER;
+ goto disable;
+ }
}
if (gpio_is_valid(wl->power_gpio)) {