I've recently discovered that doing infinite loop of
systemctl start <ext4_on_lvm>.mount, and
systemctl stop <ext4_on_lvm>.mount
linearly increases percpu allocator memory consumption.
In several hours, it might lead to system instability by
consuming most of the memory.
Bug is not reproducible when the ext4 filesystem is on
physical partition, but it is persistent when ext4 is on
logical volume.
During debugging it was found that most of active percpu
allocations are from /system.slice/<ext4_on_lvm>.mount
memory cgroups (created by systemd for each mount). All
of these cgroups are in dying state with refcount equal
to 2. And most interesting that each mount/umount itera-
tion creates exactly one dying memory cgroup.
Tracking down the remaining refcounts showed that it was
charged from ext4_fill_super(). And the page is always
0 index in the page cache mapping.
The issue was hidden behind initial super block read using
logical blocksize from bdev and adjusting blocksize later
after reading actual block size from superblock.
If blocksizes differ, sb_set_blocksize() will kill current
buffers and page cache by using kill_bdev(). And then super
block will be reread again but using correct blocksize this
time. sb_set_blocksize() didn't fully free superblock page
and buffers as buffer pointed by bh variable remained busy.
So buffer and its page remains in the memory (leak). Super
block reread logic does not happen when ext4 filesystem is
on physical partition as blocksize is correct for initial
superblock read.
brelse(bh), where bh is a buffer head of superblock page,
must be called and bh references must be released before
kill_bdev(). kill_bdev() subfunctions (see callstack below)
won't be able to free not released buffer (even if it's
clean) and superblock page won't be freed as well.
callstack:
kill_bdev()
->truncate_inode_pages()
->truncate_inode_pages_range()
->truncate_cleanup_page()
->do_invalidatepage
->block_invalidatepage()
->try_to_release_page() == fail to release
->try_to_free_buffers() == fail to free
->drop_buffers()
->buffer_busy() == yes
Incorrect order of brelse() and kill_bdev() in ext4_fill_super()
was introduced by commit ce40733ce93d ("ext4: Check for return
value from sb_set_blocksize") 13 years ago! Thanks to memory
hungry percpu, it was easy to detect this issue now.
Fix this by moving the brelse() before sb_set_blocksize() and
add a comment about the dependency.
In addition, fix similar issue under failed_mount: label (in
the same function) about incorrect order of ext4_blkdev_remove()
vs brelse() introduced by commit ac27a0ec112a ("ext4: initial
copy of files from ext3")
Signed-off-by: Alexey Makhalov <amakhalov(a)vmware.com>
Cc: stable(a)vger.kernel.org
Fixes: ce40733ce93d ("ext4: Check for return value from sb_set_blocksize")
Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3")
Cc: "Theodore Ts'o" <tytso(a)mit.edu>
Cc: Andreas Dilger <adilger.kernel(a)dilger.ca>
---
fs/ext4/super.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b9693680463a..6c8f68309834 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4451,14 +4451,20 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
}
if (sb->s_blocksize != blocksize) {
+ /*
+ * bh must be released before kill_bdev(), otherwise
+ * it won't be freed and its page also. kill_bdev()
+ * is called by sb_set_blocksize().
+ */
+ brelse(bh);
/* Validate the filesystem blocksize */
if (!sb_set_blocksize(sb, blocksize)) {
ext4_msg(sb, KERN_ERR, "bad block size %d",
blocksize);
+ bh = NULL;
goto failed_mount;
}
- brelse(bh);
logical_sb_block = sb_block * EXT4_MIN_BLOCK_SIZE;
offset = do_div(logical_sb_block, blocksize);
bh = ext4_sb_bread_unmovable(sb, logical_sb_block);
@@ -5178,8 +5184,9 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
kfree(get_qf_name(sb, sbi, i));
#endif
fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
- ext4_blkdev_remove(sbi);
+ /* ext4_blkdev_remove() calls kill_bdev(), release bh before it. */
brelse(bh);
+ ext4_blkdev_remove(sbi);
out_fail:
sb->s_fs_info = NULL;
kfree(sbi->s_blockgroup_lock);
--
2.14.2
When using a 24-bit panel on a 8-bit serial bus, the pixel clock
requested by the panel has to be multiplied by 3, since the subpixels
are shifted sequentially.
The code (in ingenic_drm_encoder_atomic_check) already computed
crtc_state->adjusted_mode->crtc_clock accordingly, but clk_set_rate()
used crtc_state->adjusted_mode->clock instead.
Fixes: 28ab7d35b6e0 ("drm/ingenic: Properly compute timings when using a 3x8-bit panel")
Cc: stable(a)vger.kernel.org # v5.10
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
---
drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index d60e1eefc9d1..cba68bf52ec5 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -342,7 +342,7 @@ static void ingenic_drm_crtc_atomic_flush(struct drm_crtc *crtc,
if (priv->update_clk_rate) {
mutex_lock(&priv->clk_mutex);
clk_set_rate(priv->pix_clk,
- crtc_state->adjusted_mode.clock * 1000);
+ crtc_state->adjusted_mode.crtc_clock * 1000);
priv->update_clk_rate = false;
mutex_unlock(&priv->clk_mutex);
}
--
2.30.2
From: Joerg Roedel <jroedel(a)suse.de>
Annotate the firmware files CCP might need using MODULE_FIRMWARE().
This will get them included into an initrd when CCP is also included
there. Otherwise the CCP module will not find its firmware when loaded
before the root-fs is mounted.
This can cause problems when the pre-loaded SEV firmware is too old to
support current SEV and SEV-ES virtualization features.
Fixes: e93720606efd ("crypto: ccp - Allow SEV firmware to be chosen based on Family and Model")
Cc: stable(a)vger.kernel.org # v4.20+
Acked-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
drivers/crypto/ccp/sev-dev.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index cb9b4c4e371e..675ff925a59d 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -42,6 +42,10 @@ static int psp_probe_timeout = 5;
module_param(psp_probe_timeout, int, 0644);
MODULE_PARM_DESC(psp_probe_timeout, " default timeout value, in seconds, during PSP device probe");
+MODULE_FIRMWARE("amd/amd_sev_fam17h_model0xh.sbin"); /* 1st gen EPYC */
+MODULE_FIRMWARE("amd/amd_sev_fam17h_model3xh.sbin"); /* 2nd gen EPYC */
+MODULE_FIRMWARE("amd/amd_sev_fam19h_model0xh.sbin"); /* 3rd gen EPYC */
+
static bool psp_dead;
static int psp_timeout;
--
2.31.1
Per schematic, both PU and SOC regulator are supplied from LTC3676 SW1
via VDDSOC_IN rail, add the PU input. Both VDD1P1, VDD2P5 are supplied
from LTC3676 SW2 via VDDHIGH_IN rail, add both inputs.
While no instability or problems are currently observed, the regulators
should be fully described in DT and that description should fully match
the hardware, else this might lead to unforseen issues later. Fix this.
Fixes: 52c7a088badd ("ARM: dts: imx6q: Add support for the DHCOM iMX6 SoM and PDK2")
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Marek Vasut <marex(a)denx.de>
Cc: Christoph Niedermaier <cniedermaier(a)dh-electronics.com>
Cc: Fabio Estevam <festevam(a)gmail.com>
Cc: Ludwig Zenz <lzenz(a)dh-electronics.com>
Cc: NXP Linux Team <linux-imx(a)nxp.com>
Cc: Shawn Guo <shawnguo(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
V2: Amend commit message
V3: Reinstate the missing SoB line, add RB
---
arch/arm/boot/dts/imx6q-dhcom-som.dtsi | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/arm/boot/dts/imx6q-dhcom-som.dtsi b/arch/arm/boot/dts/imx6q-dhcom-som.dtsi
index 236fc205c389..d0768ae429fa 100644
--- a/arch/arm/boot/dts/imx6q-dhcom-som.dtsi
+++ b/arch/arm/boot/dts/imx6q-dhcom-som.dtsi
@@ -406,6 +406,18 @@ ®_soc {
vin-supply = <&sw1_reg>;
};
+®_pu {
+ vin-supply = <&sw1_reg>;
+};
+
+®_vdd1p1 {
+ vin-supply = <&sw2_reg>;
+};
+
+®_vdd2p5 {
+ vin-supply = <&sw2_reg>;
+};
+
&uart1 {
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_uart1>;
--
2.30.2
The FEC does not have a PHY so it should not have a phy-handle. It is
connected to the switch at RGMII level so we need a fixed-link sub-node
on both ends.
This was not a problem until the qca8k.c driver was converted to PHYLINK
by commit b3591c2a3661 ("net: dsa: qca8k: Switch to PHYLINK instead of
PHYLIB"). That commit revealed the FEC configuration was not correct.
Fixes: 87489ec3a77f ("ARM: dts: imx: Add Y Soft IOTA Draco, Hydra and Ursa boards")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michal Vokáč <michal.vokac(a)ysoft.com>
---
arch/arm/boot/dts/imx6dl-yapp4-common.dtsi | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi b/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi
index 111d4d331f98..686dab57a1e4 100644
--- a/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi
+++ b/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi
@@ -121,9 +121,13 @@
phy-reset-gpios = <&gpio1 25 GPIO_ACTIVE_LOW>;
phy-reset-duration = <20>;
phy-supply = <&sw2_reg>;
- phy-handle = <ðphy0>;
status = "okay";
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+
mdio {
#address-cells = <1>;
#size-cells = <0>;
--
2.1.4
After commit 81ad4276b505 ("Thermal: Ignore invalid trip points") all
user_space governor notifications via RW trip point is broken in intel
thermal drivers. This commits marks trip_points with value of 0 during
call to thermal_zone_device_register() as invalid. RW trip points can be
0 as user space will set the correct trip temperature later.
During driver init, x86_package_temp and all int340x drivers sets RW trip
temperature as 0. This results in all these trips marked as invalid by
the thermal core.
To fix this initialize RW trips to THERMAL_TEMP_INVALID instead of 0.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
---
drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c | 4 ++++
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
index d1248ba943a4..62c0aa5d0783 100644
--- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
@@ -237,6 +237,8 @@ struct int34x_thermal_zone *int340x_thermal_zone_add(struct acpi_device *adev,
if (ACPI_FAILURE(status))
trip_cnt = 0;
else {
+ int i;
+
int34x_thermal_zone->aux_trips =
kcalloc(trip_cnt,
sizeof(*int34x_thermal_zone->aux_trips),
@@ -247,6 +249,8 @@ struct int34x_thermal_zone *int340x_thermal_zone_add(struct acpi_device *adev,
}
trip_mask = BIT(trip_cnt) - 1;
int34x_thermal_zone->aux_trip_nr = trip_cnt;
+ for (i = 0; i < trip_cnt; ++i)
+ int34x_thermal_zone->aux_trips[i] = THERMAL_TEMP_INVALID;
}
trip_cnt = int340x_thermal_read_trips(int34x_thermal_zone);
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 295742e83960..4d8edc61a78b 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -166,7 +166,7 @@ static int sys_get_trip_temp(struct thermal_zone_device *tzd,
if (thres_reg_value)
*temp = zonedev->tj_max - thres_reg_value * 1000;
else
- *temp = 0;
+ *temp = THERMAL_TEMP_INVALID;
pr_debug("sys_get_trip_temp %d\n", *temp);
return 0;
--
2.25.1
Hi,
Since v5.4.102 I observe a regression on stm32mp1 platform: "no-map"
reserved-memory regions are no more "reserved" and make part of the
kernel System RAM. This causes allocation failure for devices which try
to take a reserved-memory region.
It has been introduced by the following path:
"fdt: Properly handle "no-map" field in the memory region
[ Upstream commit 86588296acbfb1591e92ba60221e95677ecadb43 ]"
which replace memblock_remove by memblock_mark_nomap in no-map case.
Reverting this patch it's fine.
I add part of my DT (something is maybe wrong inside):
memory@c0000000 {
reg = <0xc0000000 0x20000000>;
};
reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
ranges;
gpu_reserved: gpu@d4000000 {
reg = <0xd4000000 0x4000000>;
no-map;
};
};
Sorry if this issue has already been raised and discussed.
Thanks
alex