We should compile this driver only if we enable PHY_INGENIC_USB.
Fixes: 31de313dfdcf ("PHY: Ingenic: Add USB PHY driver using generic PHY
framework.")
Signed-off-by: Qiujun Huang <hqjagain(a)gmail.com>
---
v3:
There is no need to submit this patch to -stable tree, as the driver was
not merged to 5.10.
v2:
Add a Fixes: tag and Cc linux-stable
---
drivers/phy/ingenic/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/ingenic/Makefile b/drivers/phy/ingenic/Makefile
index 65d5ea00fc9d..a00306651423 100644
--- a/drivers/phy/ingenic/Makefile
+++ b/drivers/phy/ingenic/Makefile
@@ -1,2 +1,2 @@
# SPDX-License-Identifier: GPL-2.0
-obj-y += phy-ingenic-usb.o
+obj-$(CONFIG_PHY_INGENIC_USB) += phy-ingenic-usb.o
--
2.25.1
The most trivial example of a race condition can be demonstrated by this
sequence where mm_list contains just one entry:
CPU A                           CPU B
-> sgx_release()
                                -> sgx_mmu_notifier_release()
                                -> list_del_rcu()
                                <- list_del_rcu()
-> kref_put()
-> sgx_encl_release()
                                -> synchronize_srcu()
-> cleanup_srcu_struct()
A sequence similar to this has also been spotted in tests under high
stress:
[ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
Albeit not spotted in the tests, it's also entirely possible that the
following scenario could happen:
CPU A                           CPU B
-> sgx_release()
                                -> sgx_mmu_notifier_release()
                                -> list_del_rcu()
-> kref_put()
-> sgx_encl_release()
-> cleanup_srcu_struct()
<- cleanup_srcu_struct()
                                -> synchronize_srcu()
This scenario would lead to a use-after-free in cleanup_srcu_struct().
Fix this by taking a reference to the enclave in
sgx_mmu_notifier_release().
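
As a rough illustration of why pinning the enclave closes the window, the
second interleaving above can be replayed in a single-threaded userspace
sketch. Everything prefixed with toy_ is hypothetical and only stands in for
the kernel objects; it does not use the real kref or SRCU APIs, and it models
the ordering only, not real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the enclave; not the kernel's struct sgx_encl. */
struct toy_encl {
	int refcount;      /* plays the role of the kref */
	bool srcu_alive;   /* false once "cleanup_srcu_struct()" has run */
	bool srcu_misused; /* set if "synchronize_srcu()" ran after cleanup */
};

/* Models sgx_encl_release() -> cleanup_srcu_struct(). */
static void toy_encl_release(struct toy_encl *encl)
{
	encl->srcu_alive = false;
}

static void toy_get(struct toy_encl *encl)
{
	encl->refcount++;
}

/* Dropping the last reference tears the object down. */
static void toy_put(struct toy_encl *encl)
{
	if (--encl->refcount == 0)
		toy_encl_release(encl);
}

/* Models synchronize_srcu(); running it after teardown is the bug. */
static void toy_synchronize(struct toy_encl *encl)
{
	if (!encl->srcu_alive)
		encl->srcu_misused = true;
}

/*
 * Replays the ordering: CPU B unlinks the entry, CPU A drops the file's
 * reference, then CPU B synchronizes. With pin == true, CPU B holds its
 * own reference across that window. Returns true if the bug was hit.
 */
static bool toy_replay(bool pin)
{
	struct toy_encl encl = { .refcount = 1, .srcu_alive = true };

	if (pin)
		toy_get(&encl);  /* CPU B: reference taken before unlinking */
	toy_put(&encl);          /* CPU A: sgx_release() -> kref_put() */
	toy_synchronize(&encl);  /* CPU B: synchronize_srcu() */
	if (pin)
		toy_put(&encl);  /* CPU B: reference dropped afterwards */

	return encl.srcu_misused;
}
```

Without the pin, the release runs before the synchronize step; with the pin,
the release is deferred until CPU B drops its reference.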
Cc: stable(a)vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Suggested-by: Sean Christopherson <seanjc(a)google.com>
Reported-by: Haitao Huang <haitao.huang(a)linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v5:
- To make sure that the instance does not get deleted, use kref_get() and
kref_put(). This also removes the need for an additional
synchronize_srcu().
v4:
- Rewrite the commit message.
- Just change the call order. *_expedited() is out of scope for this
bug fix.
v3: Fine-tuned tags, and added missing change log for v2.
v2: Switch to synchronize_srcu_expedited().
arch/x86/kernel/cpu/sgx/encl.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ee50a5010277..5ecbcf94ec2a 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -465,6 +465,7 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn,
spin_lock(&encl_mm->encl->mm_lock);
list_for_each_entry(tmp, &encl_mm->encl->mm_list, list) {
if (tmp == encl_mm) {
+ kref_get(&encl_mm->encl->refcount);
list_del_rcu(&encl_mm->list);
break;
}
@@ -474,6 +475,7 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn,
if (tmp == encl_mm) {
synchronize_srcu(&encl_mm->encl->srcu);
mmu_notifier_put(mn);
+ kref_put(&encl_mm->encl->refcount, sgx_encl_release);
}
}
--
2.30.0
Architectures that describe the CPU topology in devicetree and that do
not have an identity mapping between physical and logical CPU ids need
to override the default implementation of arch_match_cpu_phys_id().
Failing to do so breaks CPU devicetree-node lookups using
of_get_cpu_node() and of_cpu_device_node_get() which several drivers
rely on. It also causes the CPU struct devices exported through sysfs to
point to the wrong devicetree nodes.
On x86, CPUs are described in devicetree using their APIC ids and those
do not generally coincide with the logical ids, even if CPU0 typically
uses APIC id 0. Add the missing implementation of
arch_match_cpu_phys_id() so that CPU-node lookups work also with SMP.
Apart from fixing the broken sysfs devicetree-node links, this likely does
not affect users of mainline kernels, as the above-mentioned drivers are
currently not used on x86 as far as I know.
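
To make the lookup failure concrete, here is a small userspace sketch that
contrasts identity matching with table-based matching. The toy_ names and the
APIC id values are made up for illustration; only the shape of the comparison
follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical logical-to-physical table; the ids are illustrative only. */
static const int toy_cpuid_to_apicid[TOY_NR_CPUS] = { 0, 2, 4, 6 };

/* Identity matching: treats the logical id as the physical id. */
static bool toy_match_identity(int cpu, uint64_t phys_id)
{
	return phys_id == (uint64_t)cpu;
}

/* Table-based matching: compares against the recorded physical id. */
static bool toy_match_table(int cpu, uint64_t phys_id)
{
	return phys_id == (uint64_t)toy_cpuid_to_apicid[cpu];
}

/* Returns the logical CPU whose id matches phys_id, or -1 if none does. */
static int toy_find_cpu(bool (*match)(int, uint64_t), uint64_t phys_id)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (match(cpu, phys_id))
			return cpu;
	return -1;
}
```

With this table, physical id 2 belongs to logical CPU 1, but identity
matching resolves it to logical CPU 2, and physical id 4 is not found at all.
CPU0 still matches in both cases, which is consistent with the breakage only
showing up with SMP.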
Fixes: 4e07db9c8db8 ("x86/devicetree: Use CPU description from Device Tree")
Cc: stable <stable(a)vger.kernel.org> # 4.17
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
Thomas,
Hope this looks better to you.
My use case for this is still out-of-tree, but since CPU-node lookup is
generic functionality and with observable impact also for mainline users
(sysfs) I added a stable tag.
Johan
Changes in v2
- rewrite commit message
- add Fixes tag
- add stable tag for the benefit of out-of-tree users
arch/x86/kernel/apic/apic.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b3eef1d5c903..19c0119892dd 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2311,6 +2311,11 @@ static int cpuid_to_apicid[] = {
[0 ... NR_CPUS - 1] = -1,
};
+bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
+{
+ return phys_id == cpuid_to_apicid[cpu];
+}
+
#ifdef CONFIG_SMP
/**
* apic_id_is_primary_thread - Check whether APIC ID belongs to a primary thread
--
2.26.2
On Fri, 2021-01-29 at 08:58 -0500, Mimi Zohar wrote:
> On Fri, 2021-01-29 at 01:56 +0200, jarkko(a)kernel.org wrote:
> > From: Jarkko Sakkinen <jarkko(a)kernel.org>
> >
> > When TPM 2.0 trusted keys code was moved to the trusted keys subsystem,
> > the operations were unwrapped from tpm_try_get_ops() and tpm_put_ops(),
> > which are used to take temporarily the ownership of the TPM chip. The
> > ownership is only taken inside tpm_send(), but this is not sufficient,
> > as in the key load TPM2_CC_LOAD, TPM2_CC_UNSEAL and TPM2_FLUSH_CONTEXT
> > need to be done as a one single atom.
> >
> > Take the TPM chip ownership before sending anything with
> > tpm_try_get_ops() and tpm_put_ops(), and use tpm_transmit_cmd() to send
> > TPM commands instead of tpm_send(), reverting back to the old behaviour.
> >
> > Fixes: 2e19e10131a0 ("KEYS: trusted: Move TPM2 trusted keys code")
> > Reported-by: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
> > Cc: stable(a)vger.kernel.org
> > Cc: David Howells <dhowells(a)redhat.com>
> > Cc: Mimi Zohar <zohar(a)linux.ibm.com>
> > Cc: Sumit Garg <sumit.garg(a)linaro.org>
> > Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
>
> Tested-by: Mimi Zohar <zohar(a)linux.ibm.com> (on TPM 1.2 & PTT, discrete
> TPM 2.0)
Thanks, is it OK to apply the whole series?
/Jarkko
Some Kingston A2000 NVMe SSDs sooner or later get confused and stop
working when they use the deepest APST sleep while running Linux. The
system then crashes and one has to cold boot it to get the SSD working
again.
Kingston seems to have known about this since at least mid-September 2020:
https://bbs.archlinux.org/viewtopic.php?pid=1926994#p1926994
Someone working for a German company representing Kingston to the German
press confirmed to me Kingston engineering is aware of the issue and
investigating; the person stated that to their current knowledge only
the deepest APST sleep state causes trouble. Therefore, make Linux avoid
it for now by applying the NVME_QUIRK_NO_DEEPEST_PS to this SSD.
I have two such SSDs, but it seems the problem doesn't occur with them.
I hence couldn't verify if this patch really fixes the problem, but all
the data in front of me suggests it should.
This patch can easily be reverted or improved upon if a better solution
surfaces.
FWIW, there are many reports about the issue scattered around the web;
most of the users disabled APST completely to make things work, some
just made Linux avoid the deepest sleep state:
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c73
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c74
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c78
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c79
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c80
https://askubuntu.com/questions/1222049/nvmekingston-a2000-sometimes-stops-…
https://community.acer.com/en/discussion/604326/m-2-nvme-ssd-aspire-517-51g…
For the record, some data from 'nvme id-ctrl /dev/nvme0'
NVME Identify Controller:
vid : 0x2646
ssvid : 0x2646
mn : KINGSTON SA2000M81000G
fr : S5Z42105
[...]
ps 0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0040W non-operational enlat:15000 exlat:15000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
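
For readers unfamiliar with the quirk, the intended effect on the table above
can be sketched in userspace as follows. This is not the driver's actual APST
code; the toy_ names are hypothetical, and the sketch only assumes that the
quirk excludes the last (deepest) power state from consideration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified record built from the id-ctrl output above. */
struct toy_ps {
	double max_power_w; /* the "mp" column, in watts */
	bool operational;   /* operational vs. non-operational */
};

static const struct toy_ps toy_states[] = {
	{ 9.00,   true  }, /* ps 0 */
	{ 4.60,   true  }, /* ps 1 */
	{ 3.80,   true  }, /* ps 2 */
	{ 0.0450, false }, /* ps 3 */
	{ 0.0040, false }, /* ps 4 */
};

#define TOY_NPS ((int)(sizeof(toy_states) / sizeof(toy_states[0])))

/*
 * Returns the index of the deepest non-operational state that may be used,
 * or -1 if there is none. With no_deepest_ps set, the last state is skipped.
 */
static int toy_deepest_allowed(bool no_deepest_ps)
{
	for (int i = TOY_NPS - 1; i >= 0; i--) {
		if (no_deepest_ps && i == TOY_NPS - 1)
			continue;
		if (!toy_states[i].operational)
			return i;
	}
	return -1;
}
```

Under that assumption the SSD would still be allowed to idle in ps 3
(0.0450W) but would no longer enter ps 4 (0.0040W).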
Cc: stable(a)vger.kernel.org # 4.14+
Signed-off-by: Thorsten Leemhuis <linux(a)leemhuis.info>
---
Once this is out I will post a link to it in
https://bugzilla.kernel.org/show_bug.cgi?id=195039, maybe someone there
might be able to confirm that this fixes the issue.
---
drivers/nvme/host/pci.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 856aa31931c1..421735e16870 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3257,6 +3257,8 @@ static const struct pci_device_id nvme_id_table[] = {
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x2646, 0x2263), /* KINGSTON A2000 NVMe SSD */
+ .driver_data = NVME_QUIRK_NO_DEEPEST_PS, },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
--
2.29.2
From: Mike Rapoport <rppt(a)linux.ibm.com>
Hi,
Commit 73a6e474cb37 ("mm: memmap_init: iterate over
memblock regions rather that check each PFN") exposed several issues with
the memory map initialization and these patches fix those issues.
Initially there were crashes during compaction that Qian Cai reported back
in April [1]. It seemed back then that the problem was fixed, but a few
weeks ago Andrea Arcangeli hit the same bug [2] and there was an additional
discussion at [3].
I didn't appreciate the variety of ways BIOSes can report memory in the first
megabyte, so v3 of this set caused boot failures on several x86 systems.
Hopefully this time I covered all the bases.
The first patch here complements commit bde9cfa3afe4 ("x86/setup: don't
remove E820_TYPE_RAM for pfn 0") for the cases when BIOS reports the first
page as absent or reserved.
The second patch is a more robust version of d3921cb8be29 ("mm: fix
initialization of struct page for holes in memory layout") that can now
handle the above cases as well.
v4:
* make sure pages in the range 0 - start_pfn_of_lowest_zone are initialized
even if an architecture hides them from the generic mm
* finally make pfn 0 on x86 a part of the memory visible to the generic
mm as reserved memory.
v3: https://lore.kernel.org/lkml/20210111194017.22696-1-rppt@kernel.org
* use architectural zone constraints to set zone links for struct pages
corresponding to the holes
* drop implicit update of memblock.memory
* add a patch that sets pfn 0 to E820_TYPE_RAM on x86
v2: https://lore.kernel.org/lkml/20201209214304.6812-1-rppt@kernel.org/
* added patch that adds all regions in memblock.reserved that do not
overlap with memblock.memory to memblock.memory in the beginning of
free_area_init()
[1] https://lore.kernel.org/lkml/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw
[2] https://lore.kernel.org/lkml/20201121194506.13464-1-aarcange@redhat.com
[3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%akpm@linux-foun…
Mike Rapoport (2):
x86/setup: always add the beginning of RAM as memblock.memory
mm: fix initialization of struct page for holes in memory layout
arch/x86/kernel/setup.c | 8 ++++
mm/page_alloc.c | 85 ++++++++++++++++++++++++-----------------
2 files changed, 59 insertions(+), 34 deletions(-)
--
2.28.0
The recent rework of probe_kernel_read() and its conversion to
get_kernel_nofault() inadvertently broke is_prefetch(). We were using
probe_kernel_read() as a sloppy "read user or kernel memory" helper, but it
doesn't do that any more. The new get_kernel_nofault() reads *kernel*
memory only, which completely broke is_prefetch() for user access.
Adjust the code to use the correct accessor based on access mode. As a
bonus, the open-coded address bounds check is no longer necessary, since
the accessor helpers (get_user() / get_kernel_nofault()) do the right
thing all by themselves.
While we're at it, disable the workaround on all CPUs except AMD Family
0xF. By my reading of the Revision Guide for AMD Athlon™ 64 and AMD
Opteron™ Processors, only family 0xF is affected.
Fixes: eab0c6089b68 ("maccess: unify the probe kernel arch hooks")
Cc: stable(a)vger.kernel.org
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
arch/x86/mm/fault.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 106b22d1d189..50dfdc71761e 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -54,7 +54,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr)
* 32-bit mode:
*
* Sometimes AMD Athlon/Opteron CPUs report invalid exceptions on prefetch.
- * Check that here and ignore it.
+ * Check that here and ignore it. This is AMD erratum #91.
*
* 64-bit mode:
*
@@ -83,11 +83,7 @@ check_prefetch_opcode(struct pt_regs *regs, unsigned char *instr,
#ifdef CONFIG_X86_64
case 0x40:
/*
- * In AMD64 long mode 0x40..0x4F are valid REX prefixes
- * Need to figure out under what instruction mode the
- * instruction was issued. Could check the LDT for lm,
- * but for now it's good enough to assume that long
- * mode only uses well known segments or kernel.
+ * In 64-bit mode 0x40..0x4F are valid REX prefixes
*/
return (!user_mode(regs) || user_64bit_mode(regs));
#endif
@@ -124,23 +120,38 @@ is_prefetch(struct pt_regs *regs, unsigned long error_code, unsigned long addr)
if (error_code & X86_PF_INSTR)
return 0;
+ if (likely(boot_cpu_data.x86_vendor != X86_VENDOR_AMD
+ || boot_cpu_data.x86 != 0xf))
+ return 0;
+
instr = (void *)convert_ip_to_linear(current, regs);
max_instr = instr + 15;
- if (user_mode(regs) && instr >= (unsigned char *)TASK_SIZE_MAX)
- return 0;
+ /*
+ * This code has historically always bailed out if IP points to a
+ * not-present page (e.g. due to a race). No one has ever
+ * complained about this.
+ */
+ pagefault_disable();
while (instr < max_instr) {
unsigned char opcode;
- if (get_kernel_nofault(opcode, instr))
- break;
+ if (user_mode(regs)) {
+ if (get_user(opcode, instr))
+ break;
+ } else {
+ if (get_kernel_nofault(opcode, instr))
+ break;
+ }
instr++;
if (!check_prefetch_opcode(regs, instr, opcode, &prefetch))
break;
}
+
+ pagefault_enable();
return prefetch;
}
--
2.29.2
On Fri, Jan 29, 2021, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 76bce832cade..15733013b266 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1401,7 +1401,7 @@ static u64 kvm_get_arch_capabilities(void)
> * This lets the guest use VERW to clear CPU buffers.
This comment should be updated to call out the new TSX_CTRL behavior.
/*
* On TAA affected systems:
* - nothing to do if TSX is disabled on the host.
* - we emulate TSX_CTRL if present on the host.
* This lets the guest use VERW to clear CPU buffers.
*/
> */
> if (!boot_cpu_has(X86_FEATURE_RTM))
> - data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
> + data &= ~ARCH_CAP_TAA_NO;
Hmm, simply clearing TSX_CTRL will only preserve the host value. Since
ARCH_CAPABILITIES is unconditionally emulated by KVM, wouldn't it make sense to
unconditionally expose TSX_CTRL as well, as opposed to exposing it only if it's
supported in the host? I.e. allow migrating a TSX-disabled guest to a host
without TSX. Or am I misunderstanding how TSX_CTRL is checked/used?
> else if (!boot_cpu_has_bug(X86_BUG_TAA))
> data |= ARCH_CAP_TAA_NO;
>
> --
> 2.26.2
>