October 2024 - Linux-stable-mirror

[PATCH 1/2] mm/damon/core: handle zero {aggregation,ops_update} intervals

by SeongJae Park

DAMON's logic for determining whether it is time to do aggregation and
ops update assumes that next_{aggregation,ops_update}_sis are always set
larger than the current passed_sample_intervals. It therefore further
assumes that continuously incrementing passed_sample_intervals every
sampling interval will make it reach next_{aggregation,ops_update}_sis
in the future. The logic therefore takes the action and updates
next_{aggregation,ops_update}_sis only if passed_sample_intervals is
equal to the counts, respectively.

If the Aggregation interval or the Ops update interval is zero, however,
next_aggregation_sis or next_ops_update_sis is set equal to the current
passed_sample_intervals, respectively. And passed_sample_intervals is
incremented before the next_{aggregation,ops_update}_sis check. Hence,
passed_sample_intervals becomes larger than
next_{aggregation,ops_update}_sis, and the logic says it is not the time
to do the action and update next_{aggregation,ops_update}_sis forever,
until an overflow happens. In other words, DAMON effectively stops doing
aggregations or ops updates forever, and users cannot get monitoring
results.

Based on the documents and common sense, a reasonable behavior for such
inputs is doing an aggregation and an ops update for every sampling
interval. Handle the case by removing the assumption.

Note that this could cause a real issue for DAMON sysfs interface users
in the case of a zero Aggregation interval. When a user starts DAMON
with a zero Aggregation interval and asks for online DAMON parameter
tuning via the DAMON sysfs interface, the request is handled by the
aggregation callback. Until the callback finishes the work, the user who
requested the online tuning just waits. Hence, the user will be stuck
until passed_sample_intervals overflows.

Fixes: 4472edf63d66 ("mm/damon/core: use number of passed access sampling as a timer")
Cc: <stable(a)vger.kernel.org> # 6.7.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
 mm/damon/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index 27745dcf855f..931526fb2d2e 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -2014,7 +2014,7 @@ static int kdamond_fn(void *data)
 		if (ctx->ops.check_accesses)
 			max_nr_accesses = ctx->ops.check_accesses(ctx);

-		if (ctx->passed_sample_intervals == next_aggregation_sis) {
+		if (ctx->passed_sample_intervals >= next_aggregation_sis) {
 			kdamond_merge_regions(ctx,
 					max_nr_accesses / 10,
 					sz_limit);
@@ -2032,7 +2032,7 @@ static int kdamond_fn(void *data)

 		sample_interval = ctx->attrs.sample_interval ?
 			ctx->attrs.sample_interval : 1;
-		if (ctx->passed_sample_intervals == next_aggregation_sis) {
+		if (ctx->passed_sample_intervals >= next_aggregation_sis) {
 			ctx->next_aggregation_sis = next_aggregation_sis +
 				ctx->attrs.aggr_interval / sample_interval;

@@ -2042,7 +2042,7 @@ static int kdamond_fn(void *data)
 			ctx->ops.reset_aggregated(ctx);
 		}

-		if (ctx->passed_sample_intervals == next_ops_update_sis) {
+		if (ctx->passed_sample_intervals >= next_ops_update_sis) {
 			ctx->next_ops_update_sis = next_ops_update_sis +
 				ctx->attrs.ops_update_interval /
 				sample_interval;
-- 
2.39.5
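The failure mode described in this commit message can be reproduced in a small userspace model. The following is only a sketch under stated assumptions, not the kernel code: `struct toy_ctx` and `count_aggregations()` are hypothetical names, and only the aggregation timer is modeled. The `strict` flag selects the old `==` comparison; otherwise the patched `>=` comparison is used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the timer fields of the DAMON context. */
struct toy_ctx {
	unsigned long passed_sample_intervals;
	unsigned long next_aggregation_sis;
	unsigned long aggr_interval;	/* may be zero */
	unsigned long sample_interval;	/* may be zero */
};

/*
 * Run `iterations` sampling steps and return how many aggregations fired.
 * strict == true models the old `==` check, false models the patched `>=`.
 */
static unsigned long count_aggregations(struct toy_ctx *ctx,
					unsigned long iterations, bool strict)
{
	unsigned long done = 0;

	assert(ctx);
	for (unsigned long i = 0; i < iterations; i++) {
		unsigned long next = ctx->next_aggregation_sis;
		unsigned long sample_interval =
			ctx->sample_interval ? ctx->sample_interval : 1;
		bool due;

		/* The counter is incremented before the check is made. */
		ctx->passed_sample_intervals++;

		due = strict ? ctx->passed_sample_intervals == next
			     : ctx->passed_sample_intervals >= next;
		if (due) {
			done++;
			ctx->next_aggregation_sis =
				next + ctx->aggr_interval / sample_interval;
		}
	}
	return done;
}
```

With a zero aggregation interval and `next_aggregation_sis` equal to the starting `passed_sample_intervals`, the strict variant never fires in this model, while the `>=` variant fires on every sampling step. With a non-zero interval both variants behave the same.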

[PATCH 2/3] gpiolib: fix debugfs dangling chip separator

by Johan Hovold

Add the missing newline after entries for recently removed gpio chips so
that the chip sections are separated by a newline as intended.

Fixes: e348544f7994 ("gpio: protect the list of GPIO devices with SRCU")
Cc: stable(a)vger.kernel.org # 6.9
Cc: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 drivers/gpio/gpiolib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index e27488a90bc9..2b02655abb56 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -4971,7 +4971,7 @@ static int gpiolib_seq_show(struct seq_file *s, void *v)

 	gc = srcu_dereference(gdev->chip, &gdev->srcu);
 	if (!gc) {
-		seq_printf(s, "%s%s: (dangling chip)",
+		seq_printf(s, "%s%s: (dangling chip)\n",
 			   priv->newline ? "\n" : "",
 			   dev_name(&gdev->dev));
 		return 0;
-- 
2.45.2

[PATCH] net: ethernet: broadcom: Fix uninitialized local variable

by George Rurikov

I can't find any reason why it won't happen. In SERDES_TG3_SGMII_MODE,
when current_link_up == true and current_duplex == DUPLEX_FULL, program
execution will be transferred using the goto fiber_setup_done, where the
uninitialized remote_adv variable is passed as the second parameter to
the tg3_setup_flow_control function.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 85730a631f0c ("tg3: Add SGMII phy support for 5719/5718 serdes")
Cc: stable(a)vger.kernel.org
Signed-off-by: George Rurikov <grurikov(a)gmail.com>
---
 drivers/net/ethernet/broadcom/tg3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 378815917741..b1c60851c841 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -5802,7 +5802,8 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, bool force_reset)
 	u32 current_speed = SPEED_UNKNOWN;
 	u8 current_duplex = DUPLEX_UNKNOWN;
 	bool current_link_up = false;
-	u32 local_adv, remote_adv, sgsr;
+	u32 local_adv, sgsr;
+	u32 remote_adv = 0;

 	if ((tg3_asic_rev(tp) == ASIC_REV_5719 ||
 	     tg3_asic_rev(tp) == ASIC_REV_5720) &&
-- 
2.34.1

[PATCH] ucounts: fix counter leak in inc_rlimit_get_ucounts()

by Andrei Vagin

The inc_rlimit_get_ucounts() increments the specified rlimit counter and
then checks its limit. If the value exceeds the limit, the function
returns an error without decrementing the counter.

Fixes: 15bc01effefe ("ucounts: Fix signal ucount refcounting")
Tested-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Co-debugged-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: Kees Cook <kees(a)kernel.org>
Cc: Andrei Vagin <avagin(a)google.com>
Cc: "Eric W. Biederman" <ebiederm(a)xmission.com>
Cc: Alexey Gladkov <legion(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrei Vagin <avagin(a)google.com>
---
 kernel/ucount.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 8c07714ff27d..16c0ea1cb432 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -328,13 +328,12 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
 		if (new != 1)
 			continue;
 		if (!get_ucounts(iter))
-			goto dec_unwind;
+			goto unwind;
 	}
 	return ret;
-dec_unwind:
+unwind:
 	dec = atomic_long_sub_return(1, &iter->rlimit[type]);
 	WARN_ON_ONCE(dec < 0);
-unwind:
 	do_dec_rlimit_put_ucounts(ucounts, iter, type);
 	return 0;
 }
-- 
2.47.0.163.g1226f6d8fa-goog
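The increment-then-check pattern and its unwind path can be illustrated with a toy, non-atomic model. This is a sketch only: `struct toy_counter`, `toy_unwind_below()` and `toy_inc_all()` are hypothetical names, plain `long` fields stand in for the kernel's atomic counters, and reference counting is left out. The point it shows is that the failing level must be decremented as well as the levels below it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-level counter with a limit and a link to its parent. */
struct toy_counter {
	long value;
	long max;
	struct toy_counter *parent;
};

/* Decrement every level from `start` up to, but not including, `last`. */
static void toy_unwind_below(struct toy_counter *start,
			     struct toy_counter *last)
{
	for (struct toy_counter *it = start; it != last; it = it->parent)
		it->value--;
}

/*
 * Increment the counter at every level. Returns the new value at the
 * lowest level, or 0 if any level ends up over its limit.
 */
static long toy_inc_all(struct toy_counter *start)
{
	struct toy_counter *it;
	long ret = 0;

	assert(start);
	for (it = start; it; it = it->parent) {
		long new = ++it->value;

		if (new > it->max)
			goto unwind;
		if (it == start)
			ret = new;
	}
	return ret;
unwind:
	it->value--;			/* undo the increment at the failing level */
	toy_unwind_below(start, it);	/* undo the increments below it */
	return 0;
}
```

If the `it->value--` line under `unwind:` were skipped, the failing level would keep its extra count after the error return, which is the kind of leak the commit message describes.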

[PATCH v2] serial: 8250: omap: Move pm_runtime_get_sync

by Judith Mendez

From: Bin Liu <b-liu(a)ti.com>

Currently in omap_8250_shutdown, the dma->rx_running flag is set to zero
in omap_8250_rx_dma_flush. Next pm_runtime_get_sync is called, which is
a runtime resume call stack which can re-set the flag. When the call
omap_8250_shutdown returns, the flag is expected to be UN-SET, but this
is not the case. This is causing issues the next time UART is re-opened
and omap_8250_rx_dma is called. Fix by moving pm_runtime_get_sync before
the omap_8250_rx_dma_flush.

cc: stable(a)vger.kernel.org
Fixes: 0e31c8d173ab ("tty: serial: 8250_omap: add custom DMA-RX callback")
Signed-off-by: Bin Liu <b-liu(a)ti.com>
[Judith: Add commit message]
Signed-off-by: Judith Mendez <jm(a)ti.com>
Reviewed-by: Kevin Hilman <khilman(a)baylibre.com>
Tested-by: Kevin Hilman <khilman(a)baylibre.com>
---
Issue seen on am335x devices so far [0]. The patch has been tested with
sanity boot test on am335x EVM, am335x-boneblack and am57xx-beagle-x15.

Changes since v1 RESEND:
- Fix email header and commit description length
- Add fixes tag, add link [0], cc stable, add kevin's reviewed-by/tested-by's
- Separate patch from patch series [1]

[0] https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1…
Link to v1 RESEND:
[1] https://lore.kernel.org/linux-omap/20241011173356.870883-1-jm@ti.com/
---
 drivers/tty/serial/8250/8250_omap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
index 88b58f44e4e97..0dd68bdbfbcf7 100644
--- a/drivers/tty/serial/8250/8250_omap.c
+++ b/drivers/tty/serial/8250/8250_omap.c
@@ -776,12 +776,12 @@ static void omap_8250_shutdown(struct uart_port *port)
 	struct uart_8250_port *up = up_to_u8250p(port);
 	struct omap8250_priv *priv = port->private_data;

+	pm_runtime_get_sync(port->dev);
+
 	flush_work(&priv->qos_work);
 	if (up->dma)
 		omap_8250_rx_dma_flush(up);

-	pm_runtime_get_sync(port->dev);
-
 	serial_out(up, UART_OMAP_WER, 0);
 	if (priv->habit & UART_HAS_EFR2)
 		serial_out(up, UART_OMAP_EFR2, 0x0);
-- 
2.47.0

[PATCH 0/2] leds: max5970: fix unreleased fwnode_handle in probe function

by Javier Carrasco

This series fixes the wrong management of the 'led_node' fwnode_handle,
which is not released after it is no longer required. This affects both
the normal path of execution and the existing error paths (currently
two) in max5970_led_probe().

First, the missing calls to fwnode_handle_put() in the different code
paths are added, to make the patch available for stable kernels. Then,
the code gets updated to a more robust approach by means of the __free()
macro to automatically release the node when it goes out of scope,
removing the need for explicit calls to fwnode_handle_put().

Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Javier Carrasco (2):
      leds: max5970: fix unreleased fwnode_handle in probe function
      leds: max5970: use cleanup facility for fwnode_handle led_node

 drivers/leds/leds-max5970.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
---
base-commit: f2493655d2d3d5c6958ed996b043c821c23ae8d3
change-id: 20241019-max5970-of_node_put-939b004f57d2

Best regards,
-- 
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
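The scope-based release that the second patch of the series relies on can be sketched in userspace with the GCC/Clang `cleanup` variable attribute, which is the compiler feature the kernel's cleanup helpers build on. Everything below is a hypothetical illustration rather than the driver code: `struct toy_handle`, `toy_handle_get()`, `toy_handle_put()`, `TOY_AUTO_PUT` and `toy_probe()` are made-up names, and a global counter stands in for the reference count.

```c
#include <assert.h>
#include <stdlib.h>

static int toy_live_handles;	/* handles acquired but not yet released */

struct toy_handle {
	int id;
};

static struct toy_handle *toy_handle_get(int id)
{
	struct toy_handle *h = malloc(sizeof(*h));

	if (h) {
		h->id = id;
		toy_live_handles++;
	}
	return h;
}

static void toy_handle_put(struct toy_handle *h)
{
	if (h) {
		toy_live_handles--;
		free(h);
	}
}

/* A cleanup callback receives a pointer to the variable leaving scope. */
static void toy_handle_cleanup(struct toy_handle **hp)
{
	assert(hp);
	toy_handle_put(*hp);
}

#define TOY_AUTO_PUT __attribute__((cleanup(toy_handle_cleanup)))

/* Every return path releases the handle without an explicit put call. */
static int toy_probe(int id, int fail_early)
{
	struct toy_handle *h TOY_AUTO_PUT = toy_handle_get(id);

	if (!h)
		return -1;
	if (fail_early)
		return -2;	/* error path: released automatically */
	return h->id;		/* normal path: released automatically */
}
```

The design trade-off is the one the cover letter names: explicit put calls are easy to backport but must be repeated on every exit path, while the scope-based form cannot miss a path that is added later.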

[PATCH] mm/damon/core: avoid overflow in damon_feed_loop_next_input()

by SeongJae Park

damon_feed_loop_next_input() is inefficient and fragile to overflows.
Specifically, the 'score_goal_diff_bp' calculation can overflow when
'score' is high. The calculation is actually unnecessary at all because
'goal' is a constant of value 10,000. The calculation of 'compensation'
is again fragile to overflow. The final calculation of the return value
for the under-achieving case is again fragile to overflow when the
current score is under-achieving the target.

Add two corner case handlings at the beginning of the function to make
the body easier to read, and rewrite the body of the function to avoid
overflows and the unnecessary bp value calculation.

Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Closes: https://lore.kernel.org/944f3d5b-9177-48e7-8ec9-7f1331a3fea3@roeck-us.net
Fixes: 9294a037c015 ("mm/damon/core: implement goal-oriented feedback-driven quota auto-tuning")
Cc: <stable(a)vger.kernel.org> # 6.8.x
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
---
Changes from RFC
(https://lore.kernel.org/20240905172405.46995-1-sj@kernel.org)
- Rebase on latest mm-unstable and cleanup code

 mm/damon/core.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index a83f3b736d51..27745dcf855f 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1456,17 +1456,31 @@ static unsigned long damon_feed_loop_next_input(unsigned long last_input,
 		unsigned long score)
 {
 	const unsigned long goal = 10000;
-	unsigned long score_goal_diff = max(goal, score) - min(goal, score);
-	unsigned long score_goal_diff_bp = score_goal_diff * 10000 / goal;
-	unsigned long compensation = last_input * score_goal_diff_bp / 10000;
 	/* Set minimum input as 10000 to avoid compensation be zero */
 	const unsigned long min_input = 10000;
+	unsigned long score_goal_diff, compensation;
+	bool over_achieving = score > goal;

-	if (goal > score)
+	if (score == goal)
+		return last_input;
+	if (score >= goal * 2)
+		return min_input;
+
+	if (over_achieving)
+		score_goal_diff = score - goal;
+	else
+		score_goal_diff = goal - score;
+
+	if (last_input < ULONG_MAX / score_goal_diff)
+		compensation = last_input * score_goal_diff / goal;
+	else
+		compensation = last_input / goal * score_goal_diff;
+
+	if (over_achieving)
+		return max(last_input - compensation, min_input);
+	if (last_input < ULONG_MAX - compensation)
 		return last_input + compensation;
-	if (last_input > compensation + min_input)
-		return last_input - compensation;
-	return min_input;
+	return ULONG_MAX;
 }

 #ifdef CONFIG_PSI
-- 
2.39.5
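To make the overflow handling easier to experiment with, the rewritten function body from the diff above can be transcribed into a userspace sketch. The names `toy_max()` and `toy_feed_loop_next_input()` are hypothetical; `toy_max()` stands in for the kernel's `max()` macro, and the `assert()` only documents that the two early returns rule out a zero divisor.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static unsigned long toy_max(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Userspace transcription of the patched logic, for experimentation only. */
static unsigned long toy_feed_loop_next_input(unsigned long last_input,
					      unsigned long score)
{
	const unsigned long goal = 10000;
	const unsigned long min_input = 10000;
	unsigned long score_goal_diff, compensation;
	bool over_achieving = score > goal;

	/* Corner cases first: on target, or at least twice the goal. */
	if (score == goal)
		return last_input;
	if (score >= goal * 2)
		return min_input;

	if (over_achieving)
		score_goal_diff = score - goal;
	else
		score_goal_diff = goal - score;
	assert(score_goal_diff != 0);

	/* Multiply first when that cannot overflow, otherwise divide first. */
	if (last_input < ULONG_MAX / score_goal_diff)
		compensation = last_input * score_goal_diff / goal;
	else
		compensation = last_input / goal * score_goal_diff;

	if (over_achieving)
		return toy_max(last_input - compensation, min_input);
	/* Saturate instead of wrapping around. */
	if (last_input < ULONG_MAX - compensation)
		return last_input + compensation;
	return ULONG_MAX;
}
```

For example, an input of 20000 with a score of 5000 (half of the goal) grows by half to 30000, while an input of `ULONG_MAX` with a score of 0 saturates at `ULONG_MAX` rather than wrapping.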

[PATCH] tpm: set TPM_CHIP_FLAG_SUSPENDED early

by Jarkko Sakkinen

Setting TPM_CHIP_FLAG_SUSPENDED at the end of tpm_pm_suspend() can be
racy according to the bug report, as this leaves a window for
tpm_hwrng_read() to be called while the operation is in progress. Move
setting of the flag to the beginning.

Cc: stable(a)vger.kernel.org # v6.4+
Fixes: 99d464506255 ("tpm: Prevent hwrng from activating during resume")
Reported-by: Mike Seo <mikeseohyungjin(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219383
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
 drivers/char/tpm/tpm-interface.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 8134f002b121..3f96bc8b95df 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -370,6 +370,8 @@ int tpm_pm_suspend(struct device *dev)
 	if (!chip)
 		return -ENODEV;

+	chip->flags |= TPM_CHIP_FLAG_SUSPENDED;
+
 	if (chip->flags & TPM_CHIP_FLAG_ALWAYS_POWERED)
 		goto suspended;

@@ -390,8 +392,6 @@ int tpm_pm_suspend(struct device *dev)
 	}

 suspended:
-	chip->flags |= TPM_CHIP_FLAG_SUSPENDED;
-
 	if (rc)
 		dev_err(dev, "Ignoring error %d while suspending\n", rc);
 	return 0;
-- 
2.47.0

[PATCH V2] wifi: rtlwifi: Drastically reduce the attempts to read efuse in case of failures

by Guilherme G. Piccoli

Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc5) we face another type of
hung task that comes from the same reproducer [1]. By investigating
that, we could narrow it to the following path:

(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.

(b) During the probe of rtl8192cu, the driver ends up performing an
efuse read procedure (which is related to EEPROM load IIUC), and here
lies the issue: the function read_efuse() calls read_efuse_byte() many
times, as loop iterations depending on the efuse size (in our example,
512 in total).

This procedure for reading efuse bytes relies on a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds each. As a
consequence, we have the driver stuck in its probe routine for a long
time, exposing a stack trace like below if we attempt to reboot the
system, for example:

task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
 __schedule+0xe22/0xeb6
 schedule_timeout+0xe7/0x132
 __wait_for_common+0xb5/0x12e
 usb_start_wait_urb+0xc5/0x1ef
 ? usb_alloc_urb+0x95/0xa4
 usb_control_msg+0xff/0x184
 _usbctrl_vendorreq_sync+0xa0/0x161
 _usb_read_sync+0xb3/0xc5
 read_efuse_byte+0x13c/0x146
 read_efuse+0x351/0x5f0
 efuse_read_all_map+0x42/0x52
 rtl_efuse_shadow_map_update+0x60/0xef
 rtl_get_hwinfo+0x5d/0x1c2
 rtl92cu_read_eeprom_info+0x10a/0x8d5
 ? rtl92c_read_chip_version+0x14f/0x17e
 rtl_usb_probe+0x323/0x851
 usb_probe_interface+0x278/0x34b
 really_probe+0x202/0x4a4
 __driver_probe_device+0x166/0x1b2
 driver_probe_device+0x2f/0xd8
 [...]

We propose hereby to drastically reduce the attempts of doing the I/O
reads in case of failures, restricted to USB devices (given that they're
inherently slower than PCIe ones). By retrying up to 10 times (instead
of 10000), we got responsiveness in the reproducer, while it seems
reasonable to believe that there's no sane USB device implementation in
the field requiring this amount of retries (at every I/O read) in order
to properly work. Based on that assumption, it'd be good to have it
backported to stable but maybe not since driver implementation (the 10k
number comes from day 0), perhaps up to 6.x series makes sense.

[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race").

[1] A note about that: this syzkaller report presents multiple
reproducers that differ by the type of emulated USB device. For this
specific case, check the entry from 2024/08/08 06:23 in the list of
crashes; the C repro is available at
https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.

Cc: stable(a)vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5(a)syzkaller.appspotmail.com
Tested-by: Bitterblue Smith <rtl8821cerfe2(a)gmail.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
V2:
- Restrict the change to USB device only (thanks Ping-Ke Shih).
- Tested in 2 USB devices by Bitterblue Smith - thanks a lot!

V1: https://lore.kernel.org/lkml/20241025150226.896613-1-gpiccoli@igalia.com/

 drivers/net/wireless/realtek/rtlwifi/efuse.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c b/drivers/net/wireless/realtek/rtlwifi/efuse.c
index 82cf5fb5175f..f741066c06de 100644
--- a/drivers/net/wireless/realtek/rtlwifi/efuse.c
+++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c
@@ -164,7 +164,17 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
 	u32 value32;
 	u8 readbyte;
-	u16 retry;
+	u16 retry, max_attempts;
+
+	/*
+	 * In case of USB devices, transfer speeds are limited, hence
+	 * efuse I/O reads could be (way) slower. So, decrease (a lot)
+	 * the read attempts in case of failures.
+	 */
+	if (rtlpriv->rtlhal.interface == INTF_PCI)
+		max_attempts = 10000;
+	else
+		max_attempts = 10;

 	rtl_write_byte(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL] + 1,
 		       (_offset & 0xff));
@@ -178,7 +188,7 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)

 	retry = 0;
 	value32 = rtl_read_dword(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL]);
-	while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < max_attempts)) {
+	while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10000)) {
 		value32 = rtl_read_dword(rtlpriv,
 					 rtlpriv->cfg->maps[EFUSE_CTRL]);
 		retry++;
-- 
2.46.2
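The bounded polling loop that this patch adjusts can be modeled outside the kernel to see how the retry budget limits the number of reads against a device that never reports ready. This is a sketch under stated assumptions: `enum toy_intf`, `toy_read_fn`, `toy_max_attempts()`, `toy_poll_ready()` and the fake `struct toy_dead_dev` are hypothetical names, and the callback merely stands in for the driver's register read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_intf { TOY_INTF_PCI, TOY_INTF_USB };

/* Hypothetical register-read callback returning the 32-bit control word. */
typedef uint32_t (*toy_read_fn)(void *dev);

/* Retry budget per interface type, using the numbers from the patch. */
static uint16_t toy_max_attempts(enum toy_intf intf)
{
	return intf == TOY_INTF_PCI ? 10000 : 10;
}

/*
 * Poll until the top bit of the control word is set or the budget is used
 * up. Stores the number of retries in *retries_out, returns true if ready.
 */
static bool toy_poll_ready(toy_read_fn read, void *dev, enum toy_intf intf,
			   uint16_t *retries_out)
{
	uint16_t max_attempts = toy_max_attempts(intf);
	uint16_t retry = 0;
	uint32_t value32;

	assert(read);
	value32 = read(dev);
	while (!(((value32 >> 24) & 0xff) & 0x80) && retry < max_attempts) {
		value32 = read(dev);
		retry++;
	}
	if (retries_out)
		*retries_out = retry;
	return (((value32 >> 24) & 0xff) & 0x80) != 0;
}

/* Fake device that never becomes ready and counts how often it is read. */
struct toy_dead_dev {
	unsigned long reads;
};

static uint32_t toy_dead_read(void *dev)
{
	((struct toy_dead_dev *)dev)->reads++;
	return 0;
}
```

Against the fake device, the USB budget stops after 10 retries (11 reads including the initial one), while the PCI budget keeps going for 10000 retries. Multiplied by the 512 byte reads mentioned in the commit message, that difference is what turns a stuck probe into a quick failure.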

[PATCH v2] leds: lp55xx: Remove redundant test for invalid channel number

by Michal Vokáč

Since commit 92a81562e695 ("leds: lp55xx: Add multicolor framework
support to lp55xx") there are two subsequent tests of whether chan_nr
(the reg property) is in the valid range. One is in the
lp55xx_init_led() function and one is in the lp55xx_parse_common_child()
function that was added with the mentioned commit.

There are two issues with that.

The first is in the lp55xx_parse_common_child() function, where the reg
property is tested right after it is read from the device tree. The test
for the upper range is not correct though. Valid reg values are 0 to
(max_channel - 1) so it should be >=.

The second issue is that in case the parsed value is out of range the
probe just fails and no error message is shown, as the code never
reaches the second test that prints an error message.

Remove the test from the lp55xx_parse_common_child() function completely
and keep the one in the lp55xx_init_led() function to deal with it.

Fixes: 92a81562e695 ("leds: lp55xx: Add multicolor framework support to lp55xx")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Michal Vokáč <michal.vokac(a)ysoft.com>
---
v2:
- Complete change of the approach to the problem. In v1 I removed the
  test from lp55xx_init_led() but I failed to test that solution
  properly. It could not work. In v2 I removed the test for chan_nr
  being out of range from the lp55xx_parse_common_child() function.
- Re-worded the subject and commit message to fit the changes. It was:
  "leds: lp55xx: Fix check for invalid channel number"

 drivers/leds/leds-lp55xx-common.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/leds/leds-lp55xx-common.c b/drivers/leds/leds-lp55xx-common.c
index 5a2e259679cf..e71456a56ab8 100644
--- a/drivers/leds/leds-lp55xx-common.c
+++ b/drivers/leds/leds-lp55xx-common.c
@@ -1132,9 +1132,6 @@ static int lp55xx_parse_common_child(struct device_node *np,
 	if (ret)
 		return ret;

-	if (*chan_nr < 0 || *chan_nr > cfg->max_channel)
-		return -EINVAL;
-
 	return 0;
 }

-- 
2.1.4
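The off-by-one in the removed range test is easy to see in isolation. The two helpers below are hypothetical (`toy_chan_invalid_old()` mirrors the removed `>` comparison, `toy_chan_invalid_fixed()` uses the `>=` that the commit message says would be correct); they are a sketch for illustration and are not claimed to match the check that remains in lp55xx_init_led().

```c
#include <assert.h>
#include <stdbool.h>

/* Valid channel numbers are 0 .. max_channel - 1. */

/* Mirrors the removed test: wrongly lets chan_nr == max_channel through. */
static bool toy_chan_invalid_old(int chan_nr, int max_channel)
{
	assert(max_channel > 0);
	return chan_nr < 0 || chan_nr > max_channel;
}

/* Upper bound written with >=, as the commit message suggests. */
static bool toy_chan_invalid_fixed(int chan_nr, int max_channel)
{
	assert(max_channel > 0);
	return chan_nr < 0 || chan_nr >= max_channel;
}
```

With a hypothetical `max_channel` of 9, channel 9 is accepted by the old test but rejected by the corrected one, while channel 8 is accepted by both.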
