There is an issue with ACPI overlay table removal specifically related
to I2C multiplexers.
Consider an ACPI SSDT Overlay that defines a PCA9548 I2C mux on an
existing I2C bus. When this table is loaded we see the creation of a
device for the overall PCA9548 chip and 8 further devices - one
i2c_adapter each for the mux channels. These are all bound to their
ACPI equivalents via an eventual invocation of acpi_bind_one().
When we unload the SSDT overlay we run into the problem. The ACPI
devices are deleted as normal via acpi_device_del_work_fn() and the
acpi_device_del_list.
However, the following warning and stack trace is output as the
deletion does not go smoothly:
------------[ cut here ]------------
kernfs: can not remove 'physical_node', no directory
WARNING: CPU: 1 PID: 11 at fs/kernfs/dir.c:1674 kernfs_remove_by_name_ns+0xb9/0xc0
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u128:0 Not tainted 6.8.0-rc6+ #1
Hardware name: congatec AG conga-B7E3/conga-B7E3, BIOS 5.13 05/16/2023
Workqueue: kacpi_hotplug acpi_device_del_work_fn
RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0
Code: e4 00 48 89 ef e8 07 71 db ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 a7 55 e4 00 0f 0b eb a6 48 c7 c7 f0 38 0d 9d e8 97 0a d5 ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffff9f864008fb28 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8ef90a8d4940 RCX: 0000000000000000
RDX: ffff8f000e267d10 RSI: ffff8f000e25c780 RDI: ffff8f000e25c780
RBP: ffff8ef9186f9870 R08: 0000000000013ffb R09: 00000000ffffbfff
R10: 00000000ffffbfff R11: ffff8f000e0a0000 R12: ffff9f864008fb50
R13: ffff8ef90c93dd60 R14: ffff8ef9010d0958 R15: ffff8ef9186f98c8
FS: 0000000000000000(0000) GS:ffff8f000e240000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f48f5253a08 CR3: 00000003cb82e000 CR4: 00000000003506f0
Call Trace:
<TASK>
? kernfs_remove_by_name_ns+0xb9/0xc0
? __warn+0x7c/0x130
? kernfs_remove_by_name_ns+0xb9/0xc0
? report_bug+0x171/0x1a0
? handle_bug+0x3c/0x70
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? kernfs_remove_by_name_ns+0xb9/0xc0
? kernfs_remove_by_name_ns+0xb9/0xc0
acpi_unbind_one+0x108/0x180
device_del+0x18b/0x490
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
device_unregister+0xd/0x30
i2c_del_adapter.part.0+0x1bf/0x250
i2c_mux_del_adapters+0xa1/0xe0
i2c_device_remove+0x1e/0x80
device_release_driver_internal+0x19a/0x200
bus_remove_device+0xbf/0x100
device_del+0x157/0x490
? __pfx_device_match_fwnode+0x10/0x10
? srso_return_thunk+0x5/0x5f
device_unregister+0xd/0x30
i2c_acpi_notify+0x10f/0x140
notifier_call_chain+0x58/0xd0
blocking_notifier_call_chain+0x3a/0x60
acpi_device_del_work_fn+0x85/0x1d0
process_one_work+0x134/0x2f0
worker_thread+0x2f0/0x410
? __pfx_worker_thread+0x10/0x10
kthread+0xe3/0x110
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
---[ end trace 0000000000000000 ]---
...
repeated 7 more times, 1 for each channel of the mux
...
The issue is that the binding of the ACPI devices to their peer I2C
adapters is not correctly cleaned up. Digging deeper into the issue we
see that the deletion order is such that the ACPI devices matching the
mux channel i2c adapters are deleted first during the SSDT overlay
removal. For each of the channels we see a call to i2c_acpi_notify()
with ACPI_RECONFIG_DEVICE_REMOVE but, because these devices are not
actually i2c_clients, nothing is done for them.
Later on, after each of the mux channels has been dealt with, we come
to delete the i2c_client representing the PCA9548 device. This is the
call stack we see above, whereby the kernel cleans up the i2c_client
including destruction of the mux and its channel adapters. At this
point we do attempt to unbind from the ACPI peers but those peers no
longer exist and so we hit the kernfs errors.
The fix is to augment i2c_acpi_notify() to handle i2c_adapters. But,
given that the life cycle of the adapters is linked to the i2c_client,
instead of deleting the i2c_adapters during the i2c_acpi_notify(), we
just trigger unbinding of the ACPI device from the adapter device, and
allow the clean up of the adapter to continue in the way it always has.
Signed-off-by: Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)kernel.org>
Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications")
Cc: <stable(a)vger.kernel.org> # v4.8+
---
Notes:
v4:
Resolve Build failure noted by:
Linux Kernel Functional Testing <lkft(a)linaro.org>, and
kernel test robot <lkp(a)intel.com>
These failures led to revert of the v3 version of this patch that had been accepted earlier.
v3:
Add reviewed by tags (Mika Westerberg and Andi Shyti) and Fixes tag.
v2:
Moved long problem description from cover letter to commit description at Mika's suggestion
drivers/i2c/i2c-core-acpi.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c
index d6037a328669..14ae0cfc325e 100644
--- a/drivers/i2c/i2c-core-acpi.c
+++ b/drivers/i2c/i2c-core-acpi.c
@@ -445,6 +445,11 @@ static struct i2c_client *i2c_acpi_find_client_by_adev(struct acpi_device *adev)
return i2c_find_device_by_fwnode(acpi_fwnode_handle(adev));
}
+static struct i2c_adapter *i2c_acpi_find_adapter_by_adev(struct acpi_device *adev)
+{
+ return i2c_find_adapter_by_fwnode(acpi_fwnode_handle(adev));
+}
+
static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value,
void *arg)
{
@@ -471,11 +476,17 @@ static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value,
break;
client = i2c_acpi_find_client_by_adev(adev);
- if (!client)
- break;
+ if (client) {
+ i2c_unregister_device(client);
+ put_device(&client->dev);
+ }
+
+ adapter = i2c_acpi_find_adapter_by_adev(adev);
+ if (adapter) {
+ acpi_unbind_one(&adapter->dev);
+ put_device(&adapter->dev);
+ }
- i2c_unregister_device(client);
- put_device(&client->dev);
break;
}
--
2.43.2
Commit 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
added defaults for i_uid/i_gid when set_ownership() is not implemented.
It missed to also adjust net_ctl_set_ownership() to use the same default
values in case the computation of a better value fails.
Instead always initialize i_uid/i_gid inside the sysfs core so
set_ownership() can safely skip setting them.
Fixes: 5ec27ec735ba ("fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v3:
- Rebase onto v6.9-rc1
- Reword commit message and mention correct fixed commit
- Link to v2: https://lore.kernel.org/r/20240322-sysctl-net-ownership-v2-1-a8b4a3306542@w…
Changes in v2:
- Move the fallback logic to the sysctl core
- Link to v1: https://lore.kernel.org/r/20240315-sysctl-net-ownership-v1-1-2b465555a292@w…
---
fs/proc/proc_sysctl.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 37cde0efee57..9e34ab9c21e4 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -479,12 +479,10 @@ static struct inode *proc_sys_make_inode(struct super_block *sb,
make_empty_dir_inode(inode);
}
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
if (root->set_ownership)
root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
- else {
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- }
return inode;
}
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240315-sysctl-net-ownership-bc4e17eaeea6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
From: Vitor Soares <vitor.soares(a)toradex.com>
When the mcp251xfd_start_xmit() function fails, the driver stops
processing messages, and the interrupt routine does not return,
running indefinitely even after killing the running application.
Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.
The issue can be triggered when multiple devices share the same
SPI interface. And there is concurrent access to the bus.
The problem occurs because tx_ring->head increments even if
mcp251xfd_start_xmit() fails. Consequently, the driver skips one
TX package while still expecting a response in
mcp251xfd_handle_tefif_one().
This patch resolves the issue by decreasing tx_ring->head if
mcp251xfd_start_xmit() fails. With the fix, if we trigger the issue and
the err = -EBUSY, the driver returns NETDEV_TX_BUSY. The network stack
retries to transmit the message.
Otherwise, it prints an error and discards the message.
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vitor Soares <vitor.soares(a)toradex.com>
---
V2->V3:
- Add tx_dropped stats.
- netdev_sent_queue() only if can_put_echo_skb() succeed.
V1->V2:
- Return NETDEV_TX_BUSY if mcp251xfd_tx_obj_write() == -EBUSY.
- Rework the commit message to address the change above.
- Change can_put_echo_skb() to be called after mcp251xfd_tx_obj_write() succeed.
Otherwise, we get Kernel NULL pointer dereference error.
drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c | 34 ++++++++++++--------
1 file changed, 21 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
index 160528d3cc26..146c44e47c60 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
@@ -166,6 +166,7 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
struct net_device *ndev)
{
struct mcp251xfd_priv *priv = netdev_priv(ndev);
+ struct net_device_stats *stats = &ndev->stats;
struct mcp251xfd_tx_ring *tx_ring = priv->tx;
struct mcp251xfd_tx_obj *tx_obj;
unsigned int frame_len;
@@ -181,25 +182,32 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
tx_obj = mcp251xfd_get_tx_obj_next(tx_ring);
mcp251xfd_tx_obj_from_skb(priv, tx_obj, skb, tx_ring->head);
- /* Stop queue if we occupy the complete TX FIFO */
tx_head = mcp251xfd_get_tx_head(tx_ring);
- tx_ring->head++;
- if (mcp251xfd_get_tx_free(tx_ring) == 0)
- netif_stop_queue(ndev);
-
frame_len = can_skb_get_frame_len(skb);
- err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
- if (!err)
- netdev_sent_queue(priv->ndev, frame_len);
+
+ tx_ring->head++;
err = mcp251xfd_tx_obj_write(priv, tx_obj);
- if (err)
- goto out_err;
+ if (err) {
+ tx_ring->head--;
- return NETDEV_TX_OK;
+ if (err == -EBUSY)
+ return NETDEV_TX_BUSY;
- out_err:
- netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+ stats->tx_dropped++;
+
+ if (net_ratelimit())
+ netdev_err(priv->ndev,
+ "ERROR in %s: %d\n", __func__, err);
+ } else {
+ err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
+ if (!err)
+ netdev_sent_queue(priv->ndev, frame_len);
+
+ /* Stop queue if we occupy the complete TX FIFO */
+ if (mcp251xfd_get_tx_free(tx_ring) == 0)
+ netif_stop_queue(ndev);
+ }
return NETDEV_TX_OK;
}
--
2.34.1
From: Bjorn Helgaas <bhelgaas(a)google.com>
Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by
07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the
Lenovo Legion 9i laptop, that commit removes the area containing ECAM from
E820, which means the early E820 validation started failing, which meant we
didn't enable ECAM in the "early MCFG" path
The lack of ECAM caused many ACPI methods to fail, resulting in the
embedded controller, PS/2, audio, trackpad, and battery devices not being
detected. The _OSC method also failed, so Linux could not take control of
the PCIe hotplug, PME, and AER features:
# pci_mmcfg_early_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved)
ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300)
ACPI: Interpreter enabled
ACPI: Ignoring error and continuing table load
ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010)
ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
...
ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND)
# pci_mmcfg_late_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources
PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid
PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS
Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02
resource, but it need not be mentioned in E820, so we shouldn't look at
E820 to validate the ECAM space described by MCFG.
946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820
reserved area") added a sanity check of E820 to work around buggy MCFG
tables, but that over-aggressive validation causes failures like this one.
Keep the E820 validation check only for older BIOSes (pre-2016) so the
buggy 2006-era machines don't break. Skip the early E820 check for 2016
and newer BIOSes.
Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map")
Reported-by: Mateusz Kaduk <mateusz.kaduk(a)gmail.com>
Reported-by: Arul <...>
Reported-by: Imcarneiro91 <...>
Reported-by: Aman <...>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Tested-by: Mateusz Kaduk <mateusz.kaduk(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/pci/mmconfig-shared.c | 35 +++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 0cc9520666ef..53c7afa606c3 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -518,7 +518,34 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
{
struct resource *conflict;
- if (!early && !acpi_disabled) {
+ if (early) {
+
+ /*
+ * Don't try to do this check unless configuration type 1
+ * is available. How about type 2?
+ */
+
+ /*
+ * 946f2ee5c731 ("Check that MCFG points to an e820
+ * reserved area") added this E820 check in 2006 to work
+ * around BIOS defects.
+ *
+ * Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be
+ * reserved by a PNP0C02 resource, but it need not be
+ * mentioned in E820. Before the ACPI interpreter is
+ * available, we can't check for PNP0C02 resources, so
+ * there's no reliable way to verify the region in this
+ * early check. Keep it only for the old machines that
+ * motivated 946f2ee5c731.
+ */
+ if (dmi_get_bios_year() < 2016 && raw_pci_ops)
+ return is_mmconf_reserved(e820__mapped_all, cfg, dev,
+ "E820 entry");
+
+ return true;
+ }
+
+ if (!acpi_disabled) {
if (is_mmconf_reserved(is_acpi_reserved, cfg, dev,
"ACPI motherboard resource"))
return true;
@@ -554,12 +581,6 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
if (pci_mmcfg_running_state)
return true;
- /* Don't try to do this check unless configuration
- type 1 is available. how about type 2 ?*/
- if (raw_pci_ops)
- return is_mmconf_reserved(e820__mapped_all, cfg, dev,
- "E820 entry");
-
return false;
}
--
2.34.1
__split_huge_pmd_locked() can be called for a present THP, devmap or
(non-present) migration entry. It calls pmdp_invalidate()
unconditionally on the pmdp and only determines if it is present or not
based on the returned old pmd.
But arm64's pmd_mkinvalid(), called by pmdp_invalidate(),
unconditionally sets the PMD_PRESENT_INVALID flag, which causes future
pmd_present() calls to return true - even for a swap pmd. Therefore any
lockless pgtable walker could see the migration entry pmd in this state
and start interpretting the fields (e.g. pmd_pfn()) as if it were
present, leading to BadThings (TM). GUP-fast appears to be one such
lockless pgtable walker.
While the obvious fix is for core-mm to avoid such calls for non-present
pmds (pmdp_invalidate() will also issue TLBI which is not necessary for
this case either), all other arches that implement pmd_mkinvalid() do it
in such a way that it is robust to being called with a non-present pmd.
So it is simpler and safer to make arm64 robust too. This approach means
we can even add tests to debug_vm_pgtable.c to validate the required
behaviour.
This is a theoretical bug found during code review. I don't have any
test case to trigger it in practice.
Cc: stable(a)vger.kernel.org
Fixes: 53fa117bb33c ("arm64/mm: Enable THP migration")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Hi all,
v1 of this fix [1] took the approach of fixing core-mm to never call
pmdp_invalidate() on a non-present pmd. But Zi Yan highlighted that only arm64
suffers this problem; all other arches are robust. So his suggestion was to
instead make arm64 robust in the same way and add tests to validate it. Despite
my stated reservations in the context of the v1 discussion, having thought on it
for a bit, I now agree with Zi Yan. Hence this post.
Andrew has v1 in mm-unstable at the moment, so probably the best thing to do is
remove it from there and have this go in through the arm64 tree? Assuming there
is agreement that this approach is right one.
This applies on top of v6.9-rc5. Passes all the mm selftests on arm64.
[1] https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@arm.…
Thanks,
Ryan
arch/arm64/include/asm/pgtable.h | 12 +++++--
mm/debug_vm_pgtable.c | 61 ++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index afdd56d26ad7..7d580271a46d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -511,8 +511,16 @@ static inline int pmd_trans_huge(pmd_t pmd)
static inline pmd_t pmd_mkinvalid(pmd_t pmd)
{
- pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
- pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
+ /*
+ * If not valid then either we are already present-invalid or we are
+ * not-present (i.e. none or swap entry). We must not convert
+ * not-present to present-invalid. Unbelievably, the core-mm may call
+ * pmd_mkinvalid() for a swap entry and all other arches can handle it.
+ */
+ if (pmd_valid(pmd)) {
+ pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
+ pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
+ }
return pmd;
}
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 65c19025da3d..7e9c387d06b0 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -956,6 +956,65 @@ static void __init hugetlb_basic_tests(struct pgtable_debug_args *args) { }
#endif /* CONFIG_HUGETLB_PAGE */
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if !defined(__HAVE_ARCH_PMDP_INVALIDATE) && defined(CONFIG_ARCH_ENABLE_THP_MIGRATION)
+static void __init swp_pmd_mkinvalid_tests(struct pgtable_debug_args *args)
+{
+ unsigned long max_swap_offset;
+ swp_entry_t swp_set, swp_clear, swp_convert;
+ pmd_t pmd_set, pmd_clear;
+
+ /*
+ * See generic_max_swapfile_size(): probe the maximum offset, then
+ * create swap entry will all possible bits set and a swap entry will
+ * all bits clear.
+ */
+ max_swap_offset = swp_offset(pmd_to_swp_entry(swp_entry_to_pmd(swp_entry(0, ~0UL))));
+ swp_set = swp_entry((1 << MAX_SWAPFILES_SHIFT) - 1, max_swap_offset);
+ swp_clear = swp_entry(0, 0);
+
+ /* Convert to pmd. */
+ pmd_set = swp_entry_to_pmd(swp_set);
+ pmd_clear = swp_entry_to_pmd(swp_clear);
+
+ /*
+ * Sanity check that the pmds are not-present, not-huge and swap entry
+ * is recoverable without corruption.
+ */
+ WARN_ON(pmd_present(pmd_set));
+ WARN_ON(pmd_trans_huge(pmd_set));
+ swp_convert = pmd_to_swp_entry(pmd_set);
+ WARN_ON(swp_type(swp_set) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_set) != swp_offset(swp_convert));
+ WARN_ON(pmd_present(pmd_clear));
+ WARN_ON(pmd_trans_huge(pmd_clear));
+ swp_convert = pmd_to_swp_entry(pmd_clear);
+ WARN_ON(swp_type(swp_clear) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_clear) != swp_offset(swp_convert));
+
+ /* Now invalidate the pmd. */
+ pmd_set = pmd_mkinvalid(pmd_set);
+ pmd_clear = pmd_mkinvalid(pmd_clear);
+
+ /*
+ * Since its a swap pmd, invalidation should effectively be a noop and
+ * the checks we already did should give the same answer. Check the
+ * invalidation didn't corrupt any fields.
+ */
+ WARN_ON(pmd_present(pmd_set));
+ WARN_ON(pmd_trans_huge(pmd_set));
+ swp_convert = pmd_to_swp_entry(pmd_set);
+ WARN_ON(swp_type(swp_set) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_set) != swp_offset(swp_convert));
+ WARN_ON(pmd_present(pmd_clear));
+ WARN_ON(pmd_trans_huge(pmd_clear));
+ swp_convert = pmd_to_swp_entry(pmd_clear);
+ WARN_ON(swp_type(swp_clear) != swp_type(swp_convert));
+ WARN_ON(swp_offset(swp_clear) != swp_offset(swp_convert));
+}
+#else
+static void __init swp_pmd_mkinvalid_tests(struct pgtable_debug_args *args) { }
+#endif /* !__HAVE_ARCH_PMDP_INVALIDATE && CONFIG_ARCH_ENABLE_THP_MIGRATION */
+
static void __init pmd_thp_tests(struct pgtable_debug_args *args)
{
pmd_t pmd;
@@ -982,6 +1041,8 @@ static void __init pmd_thp_tests(struct pgtable_debug_args *args)
WARN_ON(!pmd_trans_huge(pmd_mkinvalid(pmd_mkhuge(pmd))));
WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))));
#endif /* __HAVE_ARCH_PMDP_INVALIDATE */
+
+ swp_pmd_mkinvalid_tests(args);
}
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
--
2.25.1