Changes since v5 [1]:
* Move the percpu-ref kill function to be passed in via @pgmap (Christoph)
* Added Christoph's ack for patches 2 and 4
* Added Jérôme's Reviewed-by for patches 2-6
* Fix MEMORY_DEVICE_PRIVATE support (Jérôme)
[1]: https://lkml.org/lkml/2018/9/13/104
---
Hi Andrew,
Jérôme has reviewed the cleanups, thanks Jérôme. We still disagree on
the EXPORT_SYMBOL_GPL status of the core HMM implementation, but Logan,
Christoph and I continue to support marking all devm_memremap_pages()
derivatives EXPORT_SYMBOL_GPL.
HMM has been upstream for over a year, with no in-tree users it is clear
it was designed first and foremost for out of tree drivers. It takes
advantage of a facility Christoph and I spearheaded to support
persistent memory. It continues to see expanding use cases with no clear
end date when it will stop attracting features / revisions. It is not
suitable to export devm_memremap_pages() as a stable 3rd party driver
api.
devm_memremap_pages() is a facility that can create struct page entries
for any arbitrary range and give out-of-tree drivers the ability to
subvert core aspects of page management. It, and anything derived from
it (e.g. hmm, pcip2p, etc...), is a deep integration point into the core
kernel, and an EXPORT_SYMBOL_GPL() interface.
Commit 31c5bda3a656 "mm: fix exports that inadvertently make put_page()
EXPORT_SYMBOL_GPL" was merged ahead of this series to relieve some of
the pressure from innocent consumers of put_page(), but now we need this
series to address *producers* of device pages.
More details and justification in the changelogs. The 0day
infrastructure has reported success across 152 configs and this survives
the libnvdimm unit test suite. Aside from the controversial bits the
diffstat is compelling at:
7 files changed, 127 insertions(+), 321 deletions(-)
Note that the series has some minor collisions with Alex's recent series
to improve devm_memremap_pages() scalability [2]. So, whichever you take
first the other will need a minor rebase.
[2]: https://www.lkml.org/lkml/2018/9/11/10
---
Dan Williams (7):
mm, devm_memremap_pages: Mark devm_memremap_pages() EXPORT_SYMBOL_GPL
mm, devm_memremap_pages: Kill mapping "System RAM" support
mm, devm_memremap_pages: Fix shutdown handling
mm, devm_memremap_pages: Add MEMORY_DEVICE_PRIVATE support
mm, hmm: Use devm semantics for hmm_devmem_{add,remove}
mm, hmm: Replace hmm_devmem_pages_create() with devm_memremap_pages()
mm, hmm: Mark hmm_devmem_{add,add_resource} EXPORT_SYMBOL_GPL
drivers/dax/pmem.c | 14 --
drivers/nvdimm/pmem.c | 13 +-
include/linux/hmm.h | 4
include/linux/memremap.h | 2
kernel/memremap.c | 95 +++++++-----
mm/hmm.c | 303 +++++--------------------------------
tools/testing/nvdimm/test/iomap.c | 17 ++
7 files changed, 127 insertions(+), 321 deletions(-)
PageTransCompoundMap() returns true for hugetlbfs and THP
hugepages. This behaviour incorrectly leads to stage 2 faults for
unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
treated as THP faults.
Tighten the check to filter out hugetlbfs pages. This also leads to
consistently mapping all unsupported hugepage sizes as PTE level
entries at stage 2.
Signed-off-by: Punit Agrawal <punit.agrawal(a)arm.com>
Cc: Christoffer Dall <christoffer.dall(a)arm.com>
Cc: Marc Zyngier <marc.zyngier(a)arm.com>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: stable(a)vger.kernel.org # v4.13+
---
virt/kvm/arm/mmu.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7e477b3cae5b..c23a1b323aad 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
{
kvm_pfn_t pfn = *pfnp;
gfn_t gfn = *ipap >> PAGE_SHIFT;
+ struct page *page = pfn_to_page(pfn);
- if (PageTransCompoundMap(pfn_to_page(pfn))) {
+ /*
+ * PageTransCompoungMap() returns true for THP and
+ * hugetlbfs. Make sure the adjustment is done only for THP
+ * pages.
+ */
+ if (!PageHuge(page) && PageTransCompoundMap(page)) {
unsigned long mask;
/*
* The address we faulted on is backed by a transparent huge
--
2.18.0
Hi James,
Here's a pair of fixes that need to go upstream asap, please:
(1) Revert an incorrect fix to the keyrings UAPI for a C++ reserved word
used as a struct member name. This change being reverted breaks
existing userspace code and is thus incorrect.
Further, *neither* name is consistent with the one in the keyutils
package public header.
(2) Fix the problem by using a union to make the name from keyutils
available in parallel and make the 'private' name unavailable in C++
with cpp-conditionals.
David
---
David Howells (1):
keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
Lubomir Rintel (1):
Revert "uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name"
include/uapi/linux/keyctl.h | 7 ++++++-
security/keys/dh.c | 2 +-
2 files changed, 7 insertions(+), 2 deletions(-)
From: Paul Mackerras <paulus(a)ozlabs.org>
[ Upstream commit 46dec40fb741f00f1864580130779aeeaf24fb3d ]
This fixes a bug which causes guest virtual addresses to get translated
to guest real addresses incorrectly when the guest is using the HPT MMU
and has more than 256GB of RAM, or more specifically has a HPT larger
than 2GB. This has showed up in testing as a failure of the host to
emulate doorbell instructions correctly on POWER9 for HPT guests with
more than 256GB of RAM.
The bug is that the HPTE index in kvmppc_mmu_book3s_64_hv_xlate()
is stored as an int, and in forming the HPTE address, the index gets
shifted left 4 bits as an int before being signed-extended to 64 bits.
The simple fix is to make the variable a long int, matching the
return type of kvmppc_hv_find_lock_hpte(), which is what calculates
the index.
Fixes: 697d3899dcb4 ("KVM: PPC: Implement MMIO emulation support for Book3S HV guests")
Signed-off-by: Paul Mackerras <paulus(a)ozlabs.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index d40770248b6a..191cc3eea0bf 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -449,7 +449,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
unsigned long pp, key;
unsigned long v, gr;
__be64 *hptep;
- int index;
+ long int index;
int virtmode = vcpu->arch.shregs.msr & (data ? MSR_DR : MSR_IR);
/* Get SLB entry */
--
2.17.1
From: Paul Mackerras <paulus(a)ozlabs.org>
[ Upstream commit 46dec40fb741f00f1864580130779aeeaf24fb3d ]
This fixes a bug which causes guest virtual addresses to get translated
to guest real addresses incorrectly when the guest is using the HPT MMU
and has more than 256GB of RAM, or more specifically has a HPT larger
than 2GB. This has showed up in testing as a failure of the host to
emulate doorbell instructions correctly on POWER9 for HPT guests with
more than 256GB of RAM.
The bug is that the HPTE index in kvmppc_mmu_book3s_64_hv_xlate()
is stored as an int, and in forming the HPTE address, the index gets
shifted left 4 bits as an int before being signed-extended to 64 bits.
The simple fix is to make the variable a long int, matching the
return type of kvmppc_hv_find_lock_hpte(), which is what calculates
the index.
Fixes: 697d3899dcb4 ("KVM: PPC: Implement MMIO emulation support for Book3S HV guests")
Signed-off-by: Paul Mackerras <paulus(a)ozlabs.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fb37290a57b4..366965ae37bd 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -314,7 +314,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
unsigned long pp, key;
unsigned long v, gr;
__be64 *hptep;
- int index;
+ long int index;
int virtmode = vcpu->arch.shregs.msr & (data ? MSR_DR : MSR_IR);
/* Get SLB entry */
--
2.17.1
From: Toke Høiland-Jørgensen <toke(a)toke.dk>
[ Upstream commit 77cfaf52eca5cac30ed029507e0cab065f888995 ]
The TXQ teardown code can reference the vif data structures that are
stored in the netdev private memory area if there are still packets on
the queue when it is being freed. Since the TXQ teardown code is run
after the netdevs are freed, this can lead to a use-after-free. Fix this
by moving the TXQ teardown code to earlier in ieee80211_unregister_hw().
Reported-by: Ben Greear <greearb(a)candelatech.com>
Tested-by: Ben Greear <greearb(a)candelatech.com>
Signed-off-by: Toke Høiland-Jørgensen <toke(a)toke.dk>
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
net/mac80211/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 2bb6899854d4..6389ce868668 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -1164,6 +1164,7 @@ void ieee80211_unregister_hw(struct ieee80211_hw *hw)
#if IS_ENABLED(CONFIG_IPV6)
unregister_inet6addr_notifier(&local->ifa6_notifier);
#endif
+ ieee80211_txq_teardown_flows(local);
rtnl_lock();
@@ -1191,7 +1192,6 @@ void ieee80211_unregister_hw(struct ieee80211_hw *hw)
skb_queue_purge(&local->skb_queue);
skb_queue_purge(&local->skb_queue_unreliable);
skb_queue_purge(&local->skb_queue_tdls_chsw);
- ieee80211_txq_teardown_flows(local);
destroy_workqueue(local->workqueue);
wiphy_unregister(local->hw.wiphy);
--
2.17.1
From: Toke Høiland-Jørgensen <toke(a)toke.dk>
[ Upstream commit 77cfaf52eca5cac30ed029507e0cab065f888995 ]
The TXQ teardown code can reference the vif data structures that are
stored in the netdev private memory area if there are still packets on
the queue when it is being freed. Since the TXQ teardown code is run
after the netdevs are freed, this can lead to a use-after-free. Fix this
by moving the TXQ teardown code to earlier in ieee80211_unregister_hw().
Reported-by: Ben Greear <greearb(a)candelatech.com>
Tested-by: Ben Greear <greearb(a)candelatech.com>
Signed-off-by: Toke Høiland-Jørgensen <toke(a)toke.dk>
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
net/mac80211/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 8aa1f5b6a051..cb5b22b61388 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -1171,6 +1171,7 @@ void ieee80211_unregister_hw(struct ieee80211_hw *hw)
#if IS_ENABLED(CONFIG_IPV6)
unregister_inet6addr_notifier(&local->ifa6_notifier);
#endif
+ ieee80211_txq_teardown_flows(local);
rtnl_lock();
@@ -1199,7 +1200,6 @@ void ieee80211_unregister_hw(struct ieee80211_hw *hw)
skb_queue_purge(&local->skb_queue);
skb_queue_purge(&local->skb_queue_unreliable);
skb_queue_purge(&local->skb_queue_tdls_chsw);
- ieee80211_txq_teardown_flows(local);
destroy_workqueue(local->workqueue);
wiphy_unregister(local->hw.wiphy);
--
2.17.1