The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1bd5cba3306691c771d558e94baa73e8b0b96b7 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Thu, 3 Jun 2021 13:24:55 +0800
Subject: [PATCH] KVM: X86: MMU: Use the correct inherited permissions to get
shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
pgd[] pud[] pmd[] virtual address pointers
/->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
/->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
pgd-| (shared pmd[] as above)
\->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
\->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 points to the same page.
- ptr2 and ptr4 points to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 first and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- At last, the guest writes to ptr4. This causes no vmexit or pagefault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.
In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit. So there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..20d85daed395 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 70b7e44e3035..823a5919f9fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1bd5cba3306691c771d558e94baa73e8b0b96b7 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Thu, 3 Jun 2021 13:24:55 +0800
Subject: [PATCH] KVM: X86: MMU: Use the correct inherited permissions to get
shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
pgd[] pud[] pmd[] virtual address pointers
/->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
/->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
pgd-| (shared pmd[] as above)
\->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
\->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 points to the same page.
- ptr2 and ptr4 points to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 first and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- At last, the guest writes to ptr4. This causes no vmexit or pagefault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.
In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit. So there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..20d85daed395 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 70b7e44e3035..823a5919f9fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1bd5cba3306691c771d558e94baa73e8b0b96b7 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Thu, 3 Jun 2021 13:24:55 +0800
Subject: [PATCH] KVM: X86: MMU: Use the correct inherited permissions to get
shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
pgd[] pud[] pmd[] virtual address pointers
/->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
/->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
pgd-| (shared pmd[] as above)
\->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
\->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 points to the same page.
- ptr2 and ptr4 points to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 first and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- At last, the guest writes to ptr4. This causes no vmexit or pagefault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.
In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit. So there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..20d85daed395 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 70b7e44e3035..823a5919f9fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1bd5cba3306691c771d558e94baa73e8b0b96b7 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Thu, 3 Jun 2021 13:24:55 +0800
Subject: [PATCH] KVM: X86: MMU: Use the correct inherited permissions to get
shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
pgd[] pud[] pmd[] virtual address pointers
/->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
/->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
pgd-| (shared pmd[] as above)
\->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
\->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 points to the same page.
- ptr2 and ptr4 points to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 first and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- At last, the guest writes to ptr4. This causes no vmexit or pagefault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.
In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit. So there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..20d85daed395 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 70b7e44e3035..823a5919f9fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b1bd5cba3306691c771d558e94baa73e8b0b96b7 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Thu, 3 Jun 2021 13:24:55 +0800
Subject: [PATCH] KVM: X86: MMU: Use the correct inherited permissions to get
shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
pgd[] pud[] pmd[] virtual address pointers
/->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
/->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
pgd-| (shared pmd[] as above)
\->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
\->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 points to the same page.
- ptr2 and ptr4 points to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 first and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- At last, the guest writes to ptr4. This causes no vmexit or pagefault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.
In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before the commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit. So there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..20d85daed395 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 70b7e44e3035..823a5919f9fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b53e84eed08b88fd3ff59e5c2a7f1a69d4004e32 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Tue, 1 Jun 2021 01:22:56 +0800
Subject: [PATCH] KVM: x86: Unload MMU on guest TLB flush if TDP disabled to
force MMU sync
When using shadow paging, unload the guest MMU when emulating a guest TLB
flush to ensure all roots are synchronized. From the guest's perspective,
flushing the TLB ensures any and all modifications to its PTEs will be
recognized by the CPU.
Note, unloading the MMU is overkill, but is done to mirror KVM's existing
handling of INVPCID(all) and ensure the bug is squashed. Future cleanup
can be done to more precisely synchronize roots when servicing a guest
TLB flush.
If TDP is enabled, synchronizing the MMU is unnecessary even if nested
TDP is in play, as a "legacy" TLB flush from L1 does not invalidate L1's
TDP mappings. For EPT, an explicit INVEPT is required to invalidate
guest-physical mappings; for NPT, guest mappings are always tagged with
an ASID and thus can only be invalidated via the VMCB's ASID control.
This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB.
It was only recently exposed after Linux guests stopped flushing the
local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16,
"x86/mm/tlb: Flush remote and local TLBs concurrently"), but is also
visible in Windows 10 guests.
Tested-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns")
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
[sean: massaged comment and changelog]
Message-Id: <20210531172256.2908-1-jiangshanlai(a)gmail.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e2144eedaf79..9dd23bdfc6cc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3072,6 +3072,19 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
{
++vcpu->stat.tlb_flush;
+
+ if (!tdp_enabled) {
+ /*
+ * A TLB flush on behalf of the guest is equivalent to
+ * INVPCID(all), toggling CR4.PGE, etc., which requires
+ * a forced sync of the shadow page tables. Unload the
+ * entire MMU here and the subsequent load will sync the
+ * shadow page tables, and also flush the TLB.
+ */
+ kvm_mmu_unload(vcpu);
+ return;
+ }
+
static_call(kvm_x86_tlb_flush_guest)(vcpu);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b53e84eed08b88fd3ff59e5c2a7f1a69d4004e32 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Tue, 1 Jun 2021 01:22:56 +0800
Subject: [PATCH] KVM: x86: Unload MMU on guest TLB flush if TDP disabled to
force MMU sync
When using shadow paging, unload the guest MMU when emulating a guest TLB
flush to ensure all roots are synchronized. From the guest's perspective,
flushing the TLB ensures any and all modifications to its PTEs will be
recognized by the CPU.
Note, unloading the MMU is overkill, but is done to mirror KVM's existing
handling of INVPCID(all) and ensure the bug is squashed. Future cleanup
can be done to more precisely synchronize roots when servicing a guest
TLB flush.
If TDP is enabled, synchronizing the MMU is unnecessary even if nested
TDP is in play, as a "legacy" TLB flush from L1 does not invalidate L1's
TDP mappings. For EPT, an explicit INVEPT is required to invalidate
guest-physical mappings; for NPT, guest mappings are always tagged with
an ASID and thus can only be invalidated via the VMCB's ASID control.
This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB.
It was only recently exposed after Linux guests stopped flushing the
local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16,
"x86/mm/tlb: Flush remote and local TLBs concurrently"), but is also
visible in Windows 10 guests.
Tested-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns")
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
[sean: massaged comment and changelog]
Message-Id: <20210531172256.2908-1-jiangshanlai(a)gmail.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e2144eedaf79..9dd23bdfc6cc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3072,6 +3072,19 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
{
++vcpu->stat.tlb_flush;
+
+ if (!tdp_enabled) {
+ /*
+ * A TLB flush on behalf of the guest is equivalent to
+ * INVPCID(all), toggling CR4.PGE, etc., which requires
+ * a forced sync of the shadow page tables. Unload the
+ * entire MMU here and the subsequent load will sync the
+ * shadow page tables, and also flush the TLB.
+ */
+ kvm_mmu_unload(vcpu);
+ return;
+ }
+
static_call(kvm_x86_tlb_flush_guest)(vcpu);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b53e84eed08b88fd3ff59e5c2a7f1a69d4004e32 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
Date: Tue, 1 Jun 2021 01:22:56 +0800
Subject: [PATCH] KVM: x86: Unload MMU on guest TLB flush if TDP disabled to
force MMU sync
When using shadow paging, unload the guest MMU when emulating a guest TLB
flush to ensure all roots are synchronized. From the guest's perspective,
flushing the TLB ensures any and all modifications to its PTEs will be
recognized by the CPU.
Note, unloading the MMU is overkill, but is done to mirror KVM's existing
handling of INVPCID(all) and ensure the bug is squashed. Future cleanup
can be done to more precisely synchronize roots when servicing a guest
TLB flush.
If TDP is enabled, synchronizing the MMU is unnecessary even if nested
TDP is in play, as a "legacy" TLB flush from L1 does not invalidate L1's
TDP mappings. For EPT, an explicit INVEPT is required to invalidate
guest-physical mappings; for NPT, guest mappings are always tagged with
an ASID and thus can only be invalidated via the VMCB's ASID control.
This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB.
It was only recently exposed after Linux guests stopped flushing the
local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac16,
"x86/mm/tlb: Flush remote and local TLBs concurrently"), but is also
visible in Windows 10 guests.
Tested-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Fixes: f38a7b75267f ("KVM: X86: support paravirtualized help for TLB shootdowns")
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
[sean: massaged comment and changelog]
Message-Id: <20210531172256.2908-1-jiangshanlai(a)gmail.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e2144eedaf79..9dd23bdfc6cc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3072,6 +3072,19 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
{
++vcpu->stat.tlb_flush;
+
+ if (!tdp_enabled) {
+ /*
+ * A TLB flush on behalf of the guest is equivalent to
+ * INVPCID(all), toggling CR4.PGE, etc., which requires
+ * a forced sync of the shadow page tables. Unload the
+ * entire MMU here and the subsequent load will sync the
+ * shadow page tables, and also flush the TLB.
+ */
+ kvm_mmu_unload(vcpu);
+ return;
+ }
+
static_call(kvm_x86_tlb_flush_guest)(vcpu);
}
This is a note to let you know that I've just added the patch titled
nvmem: core: add a missing of_node_put
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 63879e2964bceee2aa5bbe8b99ea58bba28bb64f Mon Sep 17 00:00:00 2001
From: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Date: Fri, 11 Jun 2021 11:23:21 +0100
Subject: nvmem: core: add a missing of_node_put
'for_each_child_of_node' performs an of_node_get on each iteration, so a
return from the middle of the loop requires an of_node_put.
Fixes: e888d445ac33 ("nvmem: resolve cells from DT at registration time")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Link: https://lore.kernel.org/r/20210611102321.11509-1-srinivas.kandagatla@linaro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvmem/core.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 4d1c4f83b22f..20799e622b5b 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -690,15 +690,17 @@ static int nvmem_add_cells_from_of(struct nvmem_device *nvmem)
continue;
if (len < 2 * sizeof(u32)) {
dev_err(dev, "nvmem: invalid reg on %pOF\n", child);
+ of_node_put(child);
return -EINVAL;
}
cell = kzalloc(sizeof(*cell), GFP_KERNEL);
- if (!cell)
+ if (!cell) {
+ of_node_put(child);
return -ENOMEM;
+ }
cell->nvmem = nvmem;
- cell->np = of_node_get(child);
cell->offset = be32_to_cpup(addr++);
cell->bytes = be32_to_cpup(addr);
cell->name = kasprintf(GFP_KERNEL, "%pOFn", child);
@@ -719,11 +721,12 @@ static int nvmem_add_cells_from_of(struct nvmem_device *nvmem)
cell->name, nvmem->stride);
/* Cells already added will be freed later. */
kfree_const(cell->name);
- of_node_put(cell->np);
kfree(cell);
+ of_node_put(child);
return -EINVAL;
}
+ cell->np = of_node_get(child);
nvmem_cell_add(cell);
}
--
2.32.0
The fsl_mxc_udc driver was originally developed as a part of a DENX
project, adding Marek to CC to have them check internally what their
preferences and requirements might be.
Thanks
Guennadi
On Fri, 11 Jun 2021, Leo Li wrote:
> > -----Original Message-----
> > From: Arnd Bergmann <arnd(a)arndb.de>
> > Sent: Friday, June 11, 2021 10:43 AM
> > To: Fabio Estevam <festevam(a)gmail.com>
> > Cc: gregkh <gregkh(a)linuxfoundation.org>; Felipe Balbi <balbi(a)kernel.org>;
> > Joel Stanley <joel(a)jms.id.au>; Leo Li <leoyang.li(a)nxp.com>; kbuild test
> > robot <lkp(a)intel.com>; Peter Chen <peter.chen(a)nxp.com>; Ran Wang
> > <ran.wang_1(a)nxp.com>; Stephen Rothwell <sfr(a)canb.auug.org.au>;
> > Shawn Guo <shawnguo(a)kernel.org>; # 3.4.x <stable(a)vger.kernel.org>;
> > Guennadi Liakhovetski <g.liakhovetski(a)gmx.de>
> > Subject: Re: patch "Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs""
> > added to usb-linus
> >
> > On Fri, Jun 11, 2021 at 4:51 PM Fabio Estevam <festevam(a)gmail.com> wrote:
> > > On Fri, Jun 11, 2021 at 5:30 AM Arnd Bergmann <arnd(a)arndb.de> wrote:
> > >
> > > > Adding Fabio and Guennadi to Cc.
> > > >
> > > > I can see that the missing symbols were in a driver that got removed
> > > > in commit a390bef7db1f ("usb: gadget: fsl_mxc_udc: Remove the driver").
> > > >
> > > > If CONFIG_ARCH_MXC is disabled, these are stubbed out in the header
> > file.
> > > > These were added a long time ago by Guennadi Liakhovetski
> > > > 54e4026b64a9
> > > > ("USB: gadget: Add i.MX3x support to the fsl_usb2_udc driver"). I
> > > > also see that this patch added a few #ifdef CONFIG_ARCH_MXC checks
> > > > to the driver that still remain today. This is clearly broken as it
> > > > must be possible to use the same driver module on both SOC_LS1021A
> > > > and i.MX using a runtime check.
> > > >
> > > > I also don't see any i.MX variant actually using this driver, but
> > > > instead see the dts files declaring fsl,imx27-usb devices, which
> > > > bind to the drivers/usb/chipidea/ci_hdrc_imx.c driver. Is this one
> > > > of those cases where we have two separate drivers for the same
> > > > hardware, or is this for a different device?
> > >
> > > Exactly. The USB IP on several i.MX devices comes from ChipIdea.
> > >
> > > Prior to using devicetree, we had the fsl_mxc_udc driver to handle the
> > > gadget side.
> > >
> > > Since i.MX has been converted to a DT-only platform, we no longer need
> > > fsl_mxc_udc, as drivers/usb/chipidea is used nowadays.
> >
> > Ok, good, so the simples solution I suppose is to remove the remaining bits
> > that Guennadi added when he wrote the removed driver in 54e4026b64a9,
> > and then re-apply Joel's patch.
>
> I can provide a patch for this.
>
> >
> > Alternatively, it might be possible change the chipidea driver to work on
> > ls1021a and ls1012a, assuming that they use the same hardware block as i.MX.
>
> It is also used on many legacy FSL PowerPC SoCs. I agree with the direction, but it does require some effort to make sure it works on all these legacy platforms. I think Ran Wang had tried to do that, but not completed due to bandwidth issue.
>
> >
> > Either way, it would be good to test the changes on at least one of these two
> > platforms.
> >
> > Arnd
>