December 2023 - Linux-stable-mirror

FAILED: patch "[PATCH] spi: atmel: Fix clock issue when using devices with different" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023123041-atlas-poet-e2f6@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: fc70d643a2f6 ("spi: atmel: Fix clock issue when using devices with different polarities") 69e1818ad27b ("spi: atmel: Fix CS and initialization bug") 5fa5e6dec762 ("spi: atmel: Switch to transfer_one transfer method") ca4196aa1008 ("Merge branch 'spi-5.5' into spi-next") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc Mon Sep 17 00:00:00 2001 From: Louis Chauvet <louis.chauvet(a)bootlin.com> Date: Mon, 4 Dec 2023 16:49:03 +0100 Subject: [PATCH] spi: atmel: Fix clock issue when using devices with different polarities The current Atmel SPI controller driver (v2) behaves incorrectly when using two SPI devices with different clock polarities and GPIO CS. When switching from one device to another, the controller driver first enables the CS and then applies whatever configuration suits the targeted device (typically, the polarities). The side effect of such order is the apparition of a spurious clock edge after enabling the CS when the clock polarity needs to be inverted wrt. the previous configuration of the controller. This parasitic clock edge is problematic when the SPI device uses that edge for internal processing, which is perfectly legitimate given that its CS was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in the kernel are shift registers and will process this first clock edge to perform a first register shift. In this case, the first bit gets lost and the whole data block that will later be read by the kernel is all shifted by one. Current behavior: The actual switching of the clock polarity only occurs after the CS when the controller sends the first message: CLK ------------\ /-\ /-\ | | | | | . . . \---/ \-/ \ CS -----\ | \------------------ ^ ^ ^ | | | | | Actual clock of the message sent | | | Change of clock polarity, which occurs with the first | write to the bus. This edge occurs when the CS is | already asserted, and can be interpreted as | the first clock edge by the receiver. | GPIO CS toggle This issue is specific to this controller because while the SPI core performs the operations in the right order, the controller however does not. In practice, the controller only applies the clock configuration right before the first transmission. So this is not a problem when using the controller's dedicated CS, as the controller does things correctly, but it becomes a problem when you need to change the clock polarity and use an external GPIO for the CS. One possible approach to solve this problem is to send a dummy message before actually activating the CS, so that the controller applies the clock polarity beforehand. New behavior: CLK ------\ /-\ /-\ /-\ /-\ | | | ... | | | | ... | | \------/ \- -/ \------/ \- -/ \------ CS -\/-----------------------\ || | \/ \--------------------- ^ ^ ^ ^ ^ | | | | | | | | | Expected clock cycles when | | | | sending the message | | | | | | | Actual GPIO CS activation, occurs inside | | | the driver | | | | | Dummy message, to trigger clock polarity | | reconfiguration. This message is not received and | | processed by the device because CS is low. | | | Change of clock polarity, forced by the dummy message. This | time, the edge is not detected by the receiver. | This small spike in CS activation is due to the fact that the spi-core activates the CS gpio before calling the driver's set_cs callback, which deactivates this gpio again until the clock polarity is correct. To avoid having to systematically send a dummy packet, the driver keeps track of the clock's current polarity. In this way, it only sends the dummy packet when necessary, ensuring that the clock will have the correct polarity when the CS is toggled. There could be two hardware problems with this patch: 1- Maybe the small CS activation peak can confuse SPI devices 2- If on a design, a single wire is used to select two devices depending on its state, the dummy message may disturb them. Fixes: 5ee36c989831 ("spi: atmel_spi update chipselect handling") Cc: <stable(a)vger.kernel.org> Signed-off-by: Louis Chauvet <louis.chauvet(a)bootlin.com> Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com Signed-off-by: Mark Brown <broonie(a)kernel.org> diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c index 54277de30161..bad34998454a 100644 --- a/drivers/spi/spi-atmel.c +++ b/drivers/spi/spi-atmel.c @@ -22,6 +22,7 @@ #include <linux/gpio/consumer.h> #include <linux/pinctrl/consumer.h> #include <linux/pm_runtime.h> +#include <linux/iopoll.h> #include <trace/events/spi.h> /* SPI register offsets */ @@ -276,6 +277,7 @@ struct atmel_spi { bool keep_cs; u32 fifo_size; + bool last_polarity; u8 native_cs_free; u8 native_cs_for_gpio; }; @@ -288,6 +290,22 @@ struct atmel_spi_device { #define SPI_MAX_DMA_XFER 65535 /* true for both PDC and DMA */ #define INVALID_DMA_ADDRESS 0xffffffff +/* + * This frequency can be anything supported by the controller, but to avoid + * unnecessary delay, the highest possible frequency is chosen. + * + * This frequency is the highest possible which is not interfering with other + * chip select registers (see Note for Serial Clock Bit Rate configuration in + * Atmel-11121F-ATARM-SAMA5D3-Series-Datasheet_02-Feb-16, page 1283) + */ +#define DUMMY_MSG_FREQUENCY 0x02 +/* + * 8 bits is the minimum data the controller is capable of sending. + * + * This message can be anything as it should not be treated by any SPI device. + */ +#define DUMMY_MSG 0xAA + /* * Version 2 of the SPI controller has * - CR.LASTXFER @@ -301,6 +319,43 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) return as->caps.is_spi2; } +/* + * Send a dummy message. + * + * This is sometimes needed when using a CS GPIO to force clock transition when + * switching between devices with different polarities. + */ +static void atmel_spi_send_dummy(struct atmel_spi *as, struct spi_device *spi, int chip_select) +{ + u32 status; + u32 csr; + + /* + * Set a clock frequency to allow sending message on SPI bus. + * The frequency here can be anything, but is needed for + * the controller to send the data. + */ + csr = spi_readl(as, CSR0 + 4 * chip_select); + csr = SPI_BFINS(SCBR, DUMMY_MSG_FREQUENCY, csr); + spi_writel(as, CSR0 + 4 * chip_select, csr); + + /* + * Read all data coming from SPI bus, needed to be able to send + * the message. + */ + spi_readl(as, RDR); + while (spi_readl(as, SR) & SPI_BIT(RDRF)) { + spi_readl(as, RDR); + cpu_relax(); + } + + spi_writel(as, TDR, DUMMY_MSG); + + readl_poll_timeout_atomic(as->regs + SPI_SR, status, + (status & SPI_BIT(TXEMPTY)), 1, 1000); +} + + /* * Earlier SPI controllers (e.g. on at91rm9200) have a design bug whereby * they assume that spi slave device state will not change on deselect, so @@ -317,11 +372,17 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) * Master on Chip Select 0.") No workaround exists for that ... so for * nCS0 on that chip, we (a) don't use the GPIO, (b) can't support CS_HIGH, * and (c) will trigger that first erratum in some cases. + * + * When changing the clock polarity, the SPI controller waits for the next + * transmission to enforce the default clock state. This may be an issue when + * using a GPIO as Chip Select: the clock level is applied only when the first + * packet is sent, once the CS has already been asserted. The workaround is to + * avoid this by sending a first (dummy) message before toggling the CS state. */ - static void cs_activate(struct atmel_spi *as, struct spi_device *spi) { struct atmel_spi_device *asd = spi->controller_state; + bool new_polarity; int chip_select; u32 mr; @@ -350,6 +411,25 @@ static void cs_activate(struct atmel_spi *as, struct spi_device *spi) } mr = spi_readl(as, MR); + + /* + * Ensures the clock polarity is valid before we actually + * assert the CS to avoid spurious clock edges to be + * processed by the spi devices. + */ + if (spi_get_csgpiod(spi, 0)) { + new_polarity = (asd->csr & SPI_BIT(CPOL)) != 0; + if (new_polarity != as->last_polarity) { + /* + * Need to disable the GPIO before sending the dummy + * message because it is already set by the spi core. + */ + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 0); + atmel_spi_send_dummy(as, spi, chip_select); + as->last_polarity = new_polarity; + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 1); + } + } } else { u32 cpol = (spi->mode & SPI_CPOL) ? SPI_BIT(CPOL) : 0; int i;

1 year, 6 months

1
0
0 0

FAILED: patch "[PATCH] spi: atmel: Fix clock issue when using devices with different" failed to apply to 5.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y git checkout FETCH_HEAD git cherry-pick -x fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023123037-kudos-frayed-b7a9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^.. Possible dependencies: fc70d643a2f6 ("spi: atmel: Fix clock issue when using devices with different polarities") 69e1818ad27b ("spi: atmel: Fix CS and initialization bug") 5fa5e6dec762 ("spi: atmel: Switch to transfer_one transfer method") ca4196aa1008 ("Merge branch 'spi-5.5' into spi-next") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc Mon Sep 17 00:00:00 2001 From: Louis Chauvet <louis.chauvet(a)bootlin.com> Date: Mon, 4 Dec 2023 16:49:03 +0100 Subject: [PATCH] spi: atmel: Fix clock issue when using devices with different polarities The current Atmel SPI controller driver (v2) behaves incorrectly when using two SPI devices with different clock polarities and GPIO CS. When switching from one device to another, the controller driver first enables the CS and then applies whatever configuration suits the targeted device (typically, the polarities). The side effect of such order is the apparition of a spurious clock edge after enabling the CS when the clock polarity needs to be inverted wrt. the previous configuration of the controller. This parasitic clock edge is problematic when the SPI device uses that edge for internal processing, which is perfectly legitimate given that its CS was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in the kernel are shift registers and will process this first clock edge to perform a first register shift. In this case, the first bit gets lost and the whole data block that will later be read by the kernel is all shifted by one. Current behavior: The actual switching of the clock polarity only occurs after the CS when the controller sends the first message: CLK ------------\ /-\ /-\ | | | | | . . . \---/ \-/ \ CS -----\ | \------------------ ^ ^ ^ | | | | | Actual clock of the message sent | | | Change of clock polarity, which occurs with the first | write to the bus. This edge occurs when the CS is | already asserted, and can be interpreted as | the first clock edge by the receiver. | GPIO CS toggle This issue is specific to this controller because while the SPI core performs the operations in the right order, the controller however does not. In practice, the controller only applies the clock configuration right before the first transmission. So this is not a problem when using the controller's dedicated CS, as the controller does things correctly, but it becomes a problem when you need to change the clock polarity and use an external GPIO for the CS. One possible approach to solve this problem is to send a dummy message before actually activating the CS, so that the controller applies the clock polarity beforehand. New behavior: CLK ------\ /-\ /-\ /-\ /-\ | | | ... | | | | ... | | \------/ \- -/ \------/ \- -/ \------ CS -\/-----------------------\ || | \/ \--------------------- ^ ^ ^ ^ ^ | | | | | | | | | Expected clock cycles when | | | | sending the message | | | | | | | Actual GPIO CS activation, occurs inside | | | the driver | | | | | Dummy message, to trigger clock polarity | | reconfiguration. This message is not received and | | processed by the device because CS is low. | | | Change of clock polarity, forced by the dummy message. This | time, the edge is not detected by the receiver. | This small spike in CS activation is due to the fact that the spi-core activates the CS gpio before calling the driver's set_cs callback, which deactivates this gpio again until the clock polarity is correct. To avoid having to systematically send a dummy packet, the driver keeps track of the clock's current polarity. In this way, it only sends the dummy packet when necessary, ensuring that the clock will have the correct polarity when the CS is toggled. There could be two hardware problems with this patch: 1- Maybe the small CS activation peak can confuse SPI devices 2- If on a design, a single wire is used to select two devices depending on its state, the dummy message may disturb them. Fixes: 5ee36c989831 ("spi: atmel_spi update chipselect handling") Cc: <stable(a)vger.kernel.org> Signed-off-by: Louis Chauvet <louis.chauvet(a)bootlin.com> Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com Signed-off-by: Mark Brown <broonie(a)kernel.org> diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c index 54277de30161..bad34998454a 100644 --- a/drivers/spi/spi-atmel.c +++ b/drivers/spi/spi-atmel.c @@ -22,6 +22,7 @@ #include <linux/gpio/consumer.h> #include <linux/pinctrl/consumer.h> #include <linux/pm_runtime.h> +#include <linux/iopoll.h> #include <trace/events/spi.h> /* SPI register offsets */ @@ -276,6 +277,7 @@ struct atmel_spi { bool keep_cs; u32 fifo_size; + bool last_polarity; u8 native_cs_free; u8 native_cs_for_gpio; }; @@ -288,6 +290,22 @@ struct atmel_spi_device { #define SPI_MAX_DMA_XFER 65535 /* true for both PDC and DMA */ #define INVALID_DMA_ADDRESS 0xffffffff +/* + * This frequency can be anything supported by the controller, but to avoid + * unnecessary delay, the highest possible frequency is chosen. + * + * This frequency is the highest possible which is not interfering with other + * chip select registers (see Note for Serial Clock Bit Rate configuration in + * Atmel-11121F-ATARM-SAMA5D3-Series-Datasheet_02-Feb-16, page 1283) + */ +#define DUMMY_MSG_FREQUENCY 0x02 +/* + * 8 bits is the minimum data the controller is capable of sending. + * + * This message can be anything as it should not be treated by any SPI device. + */ +#define DUMMY_MSG 0xAA + /* * Version 2 of the SPI controller has * - CR.LASTXFER @@ -301,6 +319,43 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) return as->caps.is_spi2; } +/* + * Send a dummy message. + * + * This is sometimes needed when using a CS GPIO to force clock transition when + * switching between devices with different polarities. + */ +static void atmel_spi_send_dummy(struct atmel_spi *as, struct spi_device *spi, int chip_select) +{ + u32 status; + u32 csr; + + /* + * Set a clock frequency to allow sending message on SPI bus. + * The frequency here can be anything, but is needed for + * the controller to send the data. + */ + csr = spi_readl(as, CSR0 + 4 * chip_select); + csr = SPI_BFINS(SCBR, DUMMY_MSG_FREQUENCY, csr); + spi_writel(as, CSR0 + 4 * chip_select, csr); + + /* + * Read all data coming from SPI bus, needed to be able to send + * the message. + */ + spi_readl(as, RDR); + while (spi_readl(as, SR) & SPI_BIT(RDRF)) { + spi_readl(as, RDR); + cpu_relax(); + } + + spi_writel(as, TDR, DUMMY_MSG); + + readl_poll_timeout_atomic(as->regs + SPI_SR, status, + (status & SPI_BIT(TXEMPTY)), 1, 1000); +} + + /* * Earlier SPI controllers (e.g. on at91rm9200) have a design bug whereby * they assume that spi slave device state will not change on deselect, so @@ -317,11 +372,17 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) * Master on Chip Select 0.") No workaround exists for that ... so for * nCS0 on that chip, we (a) don't use the GPIO, (b) can't support CS_HIGH, * and (c) will trigger that first erratum in some cases. + * + * When changing the clock polarity, the SPI controller waits for the next + * transmission to enforce the default clock state. This may be an issue when + * using a GPIO as Chip Select: the clock level is applied only when the first + * packet is sent, once the CS has already been asserted. The workaround is to + * avoid this by sending a first (dummy) message before toggling the CS state. */ - static void cs_activate(struct atmel_spi *as, struct spi_device *spi) { struct atmel_spi_device *asd = spi->controller_state; + bool new_polarity; int chip_select; u32 mr; @@ -350,6 +411,25 @@ static void cs_activate(struct atmel_spi *as, struct spi_device *spi) } mr = spi_readl(as, MR); + + /* + * Ensures the clock polarity is valid before we actually + * assert the CS to avoid spurious clock edges to be + * processed by the spi devices. + */ + if (spi_get_csgpiod(spi, 0)) { + new_polarity = (asd->csr & SPI_BIT(CPOL)) != 0; + if (new_polarity != as->last_polarity) { + /* + * Need to disable the GPIO before sending the dummy + * message because it is already set by the spi core. + */ + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 0); + atmel_spi_send_dummy(as, spi, chip_select); + as->last_polarity = new_polarity; + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 1); + } + } } else { u32 cpol = (spi->mode & SPI_CPOL) ? SPI_BIT(CPOL) : 0; int i;

1 year, 6 months

1
0
0 0

FAILED: patch "[PATCH] spi: atmel: Fix clock issue when using devices with different" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023123034-sauciness-sarcasm-914b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: fc70d643a2f6 ("spi: atmel: Fix clock issue when using devices with different polarities") 69e1818ad27b ("spi: atmel: Fix CS and initialization bug") 5fa5e6dec762 ("spi: atmel: Switch to transfer_one transfer method") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From fc70d643a2f6678cbe0f5c86433c1aeb4d613fcc Mon Sep 17 00:00:00 2001 From: Louis Chauvet <louis.chauvet(a)bootlin.com> Date: Mon, 4 Dec 2023 16:49:03 +0100 Subject: [PATCH] spi: atmel: Fix clock issue when using devices with different polarities The current Atmel SPI controller driver (v2) behaves incorrectly when using two SPI devices with different clock polarities and GPIO CS. When switching from one device to another, the controller driver first enables the CS and then applies whatever configuration suits the targeted device (typically, the polarities). The side effect of such order is the apparition of a spurious clock edge after enabling the CS when the clock polarity needs to be inverted wrt. the previous configuration of the controller. This parasitic clock edge is problematic when the SPI device uses that edge for internal processing, which is perfectly legitimate given that its CS was asserted. Indeed, devices such as HVS8080 driven by driver gpio-sr in the kernel are shift registers and will process this first clock edge to perform a first register shift. In this case, the first bit gets lost and the whole data block that will later be read by the kernel is all shifted by one. Current behavior: The actual switching of the clock polarity only occurs after the CS when the controller sends the first message: CLK ------------\ /-\ /-\ | | | | | . . . \---/ \-/ \ CS -----\ | \------------------ ^ ^ ^ | | | | | Actual clock of the message sent | | | Change of clock polarity, which occurs with the first | write to the bus. This edge occurs when the CS is | already asserted, and can be interpreted as | the first clock edge by the receiver. | GPIO CS toggle This issue is specific to this controller because while the SPI core performs the operations in the right order, the controller however does not. In practice, the controller only applies the clock configuration right before the first transmission. So this is not a problem when using the controller's dedicated CS, as the controller does things correctly, but it becomes a problem when you need to change the clock polarity and use an external GPIO for the CS. One possible approach to solve this problem is to send a dummy message before actually activating the CS, so that the controller applies the clock polarity beforehand. New behavior: CLK ------\ /-\ /-\ /-\ /-\ | | | ... | | | | ... | | \------/ \- -/ \------/ \- -/ \------ CS -\/-----------------------\ || | \/ \--------------------- ^ ^ ^ ^ ^ | | | | | | | | | Expected clock cycles when | | | | sending the message | | | | | | | Actual GPIO CS activation, occurs inside | | | the driver | | | | | Dummy message, to trigger clock polarity | | reconfiguration. This message is not received and | | processed by the device because CS is low. | | | Change of clock polarity, forced by the dummy message. This | time, the edge is not detected by the receiver. | This small spike in CS activation is due to the fact that the spi-core activates the CS gpio before calling the driver's set_cs callback, which deactivates this gpio again until the clock polarity is correct. To avoid having to systematically send a dummy packet, the driver keeps track of the clock's current polarity. In this way, it only sends the dummy packet when necessary, ensuring that the clock will have the correct polarity when the CS is toggled. There could be two hardware problems with this patch: 1- Maybe the small CS activation peak can confuse SPI devices 2- If on a design, a single wire is used to select two devices depending on its state, the dummy message may disturb them. Fixes: 5ee36c989831 ("spi: atmel_spi update chipselect handling") Cc: <stable(a)vger.kernel.org> Signed-off-by: Louis Chauvet <louis.chauvet(a)bootlin.com> Link: https://msgid.link/r/20231204154903.11607-1-louis.chauvet@bootlin.com Signed-off-by: Mark Brown <broonie(a)kernel.org> diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c index 54277de30161..bad34998454a 100644 --- a/drivers/spi/spi-atmel.c +++ b/drivers/spi/spi-atmel.c @@ -22,6 +22,7 @@ #include <linux/gpio/consumer.h> #include <linux/pinctrl/consumer.h> #include <linux/pm_runtime.h> +#include <linux/iopoll.h> #include <trace/events/spi.h> /* SPI register offsets */ @@ -276,6 +277,7 @@ struct atmel_spi { bool keep_cs; u32 fifo_size; + bool last_polarity; u8 native_cs_free; u8 native_cs_for_gpio; }; @@ -288,6 +290,22 @@ struct atmel_spi_device { #define SPI_MAX_DMA_XFER 65535 /* true for both PDC and DMA */ #define INVALID_DMA_ADDRESS 0xffffffff +/* + * This frequency can be anything supported by the controller, but to avoid + * unnecessary delay, the highest possible frequency is chosen. + * + * This frequency is the highest possible which is not interfering with other + * chip select registers (see Note for Serial Clock Bit Rate configuration in + * Atmel-11121F-ATARM-SAMA5D3-Series-Datasheet_02-Feb-16, page 1283) + */ +#define DUMMY_MSG_FREQUENCY 0x02 +/* + * 8 bits is the minimum data the controller is capable of sending. + * + * This message can be anything as it should not be treated by any SPI device. + */ +#define DUMMY_MSG 0xAA + /* * Version 2 of the SPI controller has * - CR.LASTXFER @@ -301,6 +319,43 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) return as->caps.is_spi2; } +/* + * Send a dummy message. + * + * This is sometimes needed when using a CS GPIO to force clock transition when + * switching between devices with different polarities. + */ +static void atmel_spi_send_dummy(struct atmel_spi *as, struct spi_device *spi, int chip_select) +{ + u32 status; + u32 csr; + + /* + * Set a clock frequency to allow sending message on SPI bus. + * The frequency here can be anything, but is needed for + * the controller to send the data. + */ + csr = spi_readl(as, CSR0 + 4 * chip_select); + csr = SPI_BFINS(SCBR, DUMMY_MSG_FREQUENCY, csr); + spi_writel(as, CSR0 + 4 * chip_select, csr); + + /* + * Read all data coming from SPI bus, needed to be able to send + * the message. + */ + spi_readl(as, RDR); + while (spi_readl(as, SR) & SPI_BIT(RDRF)) { + spi_readl(as, RDR); + cpu_relax(); + } + + spi_writel(as, TDR, DUMMY_MSG); + + readl_poll_timeout_atomic(as->regs + SPI_SR, status, + (status & SPI_BIT(TXEMPTY)), 1, 1000); +} + + /* * Earlier SPI controllers (e.g. on at91rm9200) have a design bug whereby * they assume that spi slave device state will not change on deselect, so @@ -317,11 +372,17 @@ static bool atmel_spi_is_v2(struct atmel_spi *as) * Master on Chip Select 0.") No workaround exists for that ... so for * nCS0 on that chip, we (a) don't use the GPIO, (b) can't support CS_HIGH, * and (c) will trigger that first erratum in some cases. + * + * When changing the clock polarity, the SPI controller waits for the next + * transmission to enforce the default clock state. This may be an issue when + * using a GPIO as Chip Select: the clock level is applied only when the first + * packet is sent, once the CS has already been asserted. The workaround is to + * avoid this by sending a first (dummy) message before toggling the CS state. */ - static void cs_activate(struct atmel_spi *as, struct spi_device *spi) { struct atmel_spi_device *asd = spi->controller_state; + bool new_polarity; int chip_select; u32 mr; @@ -350,6 +411,25 @@ static void cs_activate(struct atmel_spi *as, struct spi_device *spi) } mr = spi_readl(as, MR); + + /* + * Ensures the clock polarity is valid before we actually + * assert the CS to avoid spurious clock edges to be + * processed by the spi devices. + */ + if (spi_get_csgpiod(spi, 0)) { + new_polarity = (asd->csr & SPI_BIT(CPOL)) != 0; + if (new_polarity != as->last_polarity) { + /* + * Need to disable the GPIO before sending the dummy + * message because it is already set by the spi core. + */ + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 0); + atmel_spi_send_dummy(as, spi, chip_select); + as->last_polarity = new_polarity; + gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 1); + } + } } else { u32 cpol = (spi->mode & SPI_CPOL) ? SPI_BIT(CPOL) : 0; int i;

1 year, 6 months

1
0
0 0

[for-linus][PATCH 3/3] ftrace: Fix modification of direct_function hash while in use

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org> Masami Hiramatsu reported a memory leak in register_ftrace_direct() where if the number of new entries are added is large enough to cause two allocations in the loop: for (i = 0; i < size; i++) { hlist_for_each_entry(entry, &hash->buckets[i], hlist) { new = ftrace_add_rec_direct(entry->ip, addr, &free_hash); if (!new) goto out_remove; entry->direct = addr; } } Where ftrace_add_rec_direct() has: if (ftrace_hash_empty(direct_functions) || direct_functions->count > 2 * (1 << direct_functions->size_bits)) { struct ftrace_hash *new_hash; int size = ftrace_hash_empty(direct_functions) ? 0 : direct_functions->count + 1; if (size < 32) size = 32; new_hash = dup_hash(direct_functions, size); if (!new_hash) return NULL; *free_hash = direct_functions; direct_functions = new_hash; } The "*free_hash = direct_functions;" can happen twice, losing the previous allocation of direct_functions. But this also exposed a more serious bug. The modification of direct_functions above is not safe. As direct_functions can be referenced at any time to find what direct caller it should call, the time between: new_hash = dup_hash(direct_functions, size); and direct_functions = new_hash; can have a race with another CPU (or even this one if it gets interrupted), and the entries being moved to the new hash are not referenced. That's because the "dup_hash()" is really misnamed and is really a "move_hash()". It moves the entries from the old hash to the new one. Now even if that was changed, this code is not proper as direct_functions should not be updated until the end. That is the best way to handle function reference changes, and is the way other parts of ftrace handles this. The following is done: 1. Change add_hash_entry() to return the entry it created and inserted into the hash, and not just return success or not. 2. Replace ftrace_add_rec_direct() with add_hash_entry(), and remove the former. 3. Allocate a "new_hash" at the start that is made for holding both the new hash entries as well as the existing entries in direct_functions. 4. Copy (not move) the direct_function entries over to the new_hash. 5. Copy the entries of the added hash to the new_hash. 6. If everything succeeds, then use rcu_pointer_assign() to update the direct_functions with the new_hash. This simplifies the code and fixes both the memory leak as well as the race condition mentioned above. Link: https://lore.kernel.org/all/170368070504.42064.8960569647118388081.stgit@de… Link: https://lore.kernel.org/linux-trace-kernel/20231229115134.08dd5174@gandalf.… Cc: stable(a)vger.kernel.org Cc: Masami Hiramatsu <mhiramat(a)kernel.org> Cc: Mark Rutland <mark.rutland(a)arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> Cc: Jiri Olsa <jolsa(a)kernel.org> Cc: Alexei Starovoitov <ast(a)kernel.org> Cc: Daniel Borkmann <daniel(a)iogearbox.net> Fixes: 763e34e74bb7d ("ftrace: Add register_ftrace_direct()") Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> --- kernel/trace/ftrace.c | 100 ++++++++++++++++++++---------------------- 1 file changed, 47 insertions(+), 53 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 8de8bec5f366..b01ae7d36021 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1183,18 +1183,19 @@ static void __add_hash_entry(struct ftrace_hash *hash, hash->count++; } -static int add_hash_entry(struct ftrace_hash *hash, unsigned long ip) +static struct ftrace_func_entry * +add_hash_entry(struct ftrace_hash *hash, unsigned long ip) { struct ftrace_func_entry *entry; entry = kmalloc(sizeof(*entry), GFP_KERNEL); if (!entry) - return -ENOMEM; + return NULL; entry->ip = ip; __add_hash_entry(hash, entry); - return 0; + return entry; } static void @@ -1349,7 +1350,6 @@ alloc_and_copy_ftrace_hash(int size_bits, struct ftrace_hash *hash) struct ftrace_func_entry *entry; struct ftrace_hash *new_hash; int size; - int ret; int i; new_hash = alloc_ftrace_hash(size_bits); @@ -1366,8 +1366,7 @@ alloc_and_copy_ftrace_hash(int size_bits, struct ftrace_hash *hash) size = 1 << hash->size_bits; for (i = 0; i < size; i++) { hlist_for_each_entry(entry, &hash->buckets[i], hlist) { - ret = add_hash_entry(new_hash, entry->ip); - if (ret < 0) + if (add_hash_entry(new_hash, entry->ip) == NULL) goto free_hash; } } @@ -2536,7 +2535,7 @@ ftrace_find_unique_ops(struct dyn_ftrace *rec) #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS /* Protected by rcu_tasks for reading, and direct_mutex for writing */ -static struct ftrace_hash *direct_functions = EMPTY_HASH; +static struct ftrace_hash __rcu *direct_functions = EMPTY_HASH; static DEFINE_MUTEX(direct_mutex); int ftrace_direct_func_count; @@ -2555,39 +2554,6 @@ unsigned long ftrace_find_rec_direct(unsigned long ip) return entry->direct; } -static struct ftrace_func_entry* -ftrace_add_rec_direct(unsigned long ip, unsigned long addr, - struct ftrace_hash **free_hash) -{ - struct ftrace_func_entry *entry; - - if (ftrace_hash_empty(direct_functions) || - direct_functions->count > 2 * (1 << direct_functions->size_bits)) { - struct ftrace_hash *new_hash; - int size = ftrace_hash_empty(direct_functions) ? 0 : - direct_functions->count + 1; - - if (size < 32) - size = 32; - - new_hash = dup_hash(direct_functions, size); - if (!new_hash) - return NULL; - - *free_hash = direct_functions; - direct_functions = new_hash; - } - - entry = kmalloc(sizeof(*entry), GFP_KERNEL); - if (!entry) - return NULL; - - entry->ip = ip; - entry->direct = addr; - __add_hash_entry(direct_functions, entry); - return entry; -} - static void call_direct_funcs(unsigned long ip, unsigned long pip, struct ftrace_ops *ops, struct ftrace_regs *fregs) { @@ -4223,8 +4189,8 @@ enter_record(struct ftrace_hash *hash, struct dyn_ftrace *rec, int clear_filter) /* Do nothing if it exists */ if (entry) return 0; - - ret = add_hash_entry(hash, rec->ip); + if (add_hash_entry(hash, rec->ip) == NULL) + ret = -ENOMEM; } return ret; } @@ -5266,7 +5232,8 @@ __ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove) return 0; } - return add_hash_entry(hash, ip); + entry = add_hash_entry(hash, ip); + return entry ? 0 : -ENOMEM; } static int @@ -5410,7 +5377,7 @@ static void remove_direct_functions_hash(struct ftrace_hash *hash, unsigned long */ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) { - struct ftrace_hash *hash, *free_hash = NULL; + struct ftrace_hash *hash, *new_hash = NULL, *free_hash = NULL; struct ftrace_func_entry *entry, *new; int err = -EBUSY, size, i; @@ -5436,17 +5403,44 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) } } - /* ... and insert them to direct_functions hash. */ err = -ENOMEM; + + /* Make a copy hash to place the new and the old entries in */ + size = hash->count + direct_functions->count; + if (size > 32) + size = 32; + new_hash = alloc_ftrace_hash(fls(size)); + if (!new_hash) + goto out_unlock; + + /* Now copy over the existing direct entries */ + size = 1 << direct_functions->size_bits; + for (i = 0; i < size; i++) { + hlist_for_each_entry(entry, &direct_functions->buckets[i], hlist) { + new = add_hash_entry(new_hash, entry->ip); + if (!new) + goto out_unlock; + new->direct = entry->direct; + } + } + + /* ... and add the new entries */ + size = 1 << hash->size_bits; for (i = 0; i < size; i++) { hlist_for_each_entry(entry, &hash->buckets[i], hlist) { - new = ftrace_add_rec_direct(entry->ip, addr, &free_hash); + new = add_hash_entry(new_hash, entry->ip); if (!new) - goto out_remove; + goto out_unlock; + /* Update both the copy and the hash entry */ + new->direct = addr; entry->direct = addr; } } + free_hash = direct_functions; + rcu_assign_pointer(direct_functions, new_hash); + new_hash = NULL; + ops->func = call_direct_funcs; ops->flags = MULTI_FLAGS; ops->trampoline = FTRACE_REGS_ADDR; @@ -5454,17 +5448,17 @@ int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr) err = register_ftrace_function_nolock(ops); - out_remove: - if (err) - remove_direct_functions_hash(hash, addr); - out_unlock: mutex_unlock(&direct_mutex); - if (free_hash) { + if (free_hash && free_hash != EMPTY_HASH) { synchronize_rcu_tasks(); free_ftrace_hash(free_hash); } + + if (new_hash) + free_ftrace_hash(new_hash); + return err; } EXPORT_SYMBOL_GPL(register_ftrace_direct); @@ -6309,7 +6303,7 @@ ftrace_graph_set_hash(struct ftrace_hash *hash, char *buffer) if (entry) continue; - if (add_hash_entry(hash, rec->ip) < 0) + if (add_hash_entry(hash, rec->ip) == NULL) goto out; } else { if (entry) { -- 2.42.0

1 year, 6 months

1
0
0 0

[for-linus][PATCH 2/3] tracing: Fix blocked reader of snapshot buffer

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org> If an application blocks on the snapshot or snapshot_raw files, expecting to be woken up when a snapshot occurs, it will not happen. Or it may happen with an unexpected result. That result is that the application will be reading the main buffer instead of the snapshot buffer. That is because when the snapshot occurs, the main and snapshot buffers are swapped. But the reader has a descriptor still pointing to the buffer that it originally connected to. This is fine for the main buffer readers, as they may be blocked waiting for a watermark to be hit, and when a snapshot occurs, the data that the main readers want is now on the snapshot buffer. But for waiters of the snapshot buffer, they are waiting for an event to occur that will trigger the snapshot and they can then consume it quickly to save the snapshot before the next snapshot occurs. But to do this, they need to read the new snapshot buffer, not the old one that is now receiving new data. Also, it does not make sense to have a watermark "buffer_percent" on the snapshot buffer, as the snapshot buffer is static and does not receive new data except all at once. Link: https://lore.kernel.org/linux-trace-kernel/20231228095149.77f5b45d@gandalf.… Cc: stable(a)vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> Cc: Mark Rutland <mark.rutland(a)arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org> Fixes: debdd57f5145f ("tracing: Make a snapshot feature available from userspace") Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> --- kernel/trace/ring_buffer.c | 3 ++- kernel/trace/trace.c | 20 +++++++++++++++++--- 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 32c0dd2fd1c3..9286f88fcd32 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -949,7 +949,8 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu) /* make sure the waiters see the new index */ smp_wmb(); - rb_wake_up_waiters(&rbwork->work); + /* This can be called in any context */ + irq_work_queue(&rbwork->work); } /** diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 199df497db07..a0defe156b57 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1894,6 +1894,9 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu, __update_max_tr(tr, tsk, cpu); arch_spin_unlock(&tr->max_lock); + + /* Any waiters on the old snapshot buffer need to wake up */ + ring_buffer_wake_waiters(tr->array_buffer.buffer, RING_BUFFER_ALL_CPUS); } /** @@ -1945,12 +1948,23 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu) static int wait_on_pipe(struct trace_iterator *iter, int full) { + int ret; + /* Iterators are static, they should be filled or empty */ if (trace_buffer_iter(iter, iter->cpu_file)) return 0; - return ring_buffer_wait(iter->array_buffer->buffer, iter->cpu_file, - full); + ret = ring_buffer_wait(iter->array_buffer->buffer, iter->cpu_file, full); + +#ifdef CONFIG_TRACER_MAX_TRACE + /* + * Make sure this is still the snapshot buffer, as if a snapshot were + * to happen, this would now be the main buffer. + */ + if (iter->snapshot) + iter->array_buffer = &iter->tr->max_buffer; +#endif + return ret; } #ifdef CONFIG_FTRACE_STARTUP_TEST @@ -8517,7 +8531,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, wait_index = READ_ONCE(iter->wait_index); - ret = wait_on_pipe(iter, iter->tr->buffer_percent); + ret = wait_on_pipe(iter, iter->snapshot ? 0 : iter->tr->buffer_percent); if (ret) goto out; -- 2.42.0

1 year, 6 months

1
0
0 0

[for-linus][PATCH 1/3] ring-buffer: Fix wake ups when buffer_percent is set to 100

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org> The tracefs file "buffer_percent" is to allow user space to set a water-mark on how much of the tracing ring buffer needs to be filled in order to wake up a blocked reader. 0 - is to wait until any data is in the buffer 1 - is to wait for 1% of the sub buffers to be filled 50 - would be half of the sub buffers are filled with data 100 - is not to wake the waiter until the ring buffer is completely full Unfortunately the test for being full was: dirty = ring_buffer_nr_dirty_pages(buffer, cpu); return (dirty * 100) > (full * nr_pages); Where "full" is the value for "buffer_percent". There is two issues with the above when full == 100. 1. dirty * 100 > 100 * nr_pages will never be true That is, the above is basically saying that if the user sets buffer_percent to 100, more pages need to be dirty than exist in the ring buffer! 2. The page that the writer is on is never considered dirty, as dirty pages are only those that are full. When the writer goes to a new sub-buffer, it clears the contents of that sub-buffer. That is, even if the check was ">=" it would still not be equal as the most pages that can be considered "dirty" is nr_pages - 1. To fix this, add one to dirty and use ">=" in the compare. Link: https://lore.kernel.org/linux-trace-kernel/20231226125902.4a057f1d@gandalf.… Cc: stable(a)vger.kernel.org Cc: Mark Rutland <mark.rutland(a)arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org> Fixes: 03329f9939781 ("tracing: Add tracefs file buffer_percentage") Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> --- kernel/trace/ring_buffer.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 83eab547f1d1..32c0dd2fd1c3 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -881,9 +881,14 @@ static __always_inline bool full_hit(struct trace_buffer *buffer, int cpu, int f if (!nr_pages || !full) return true; - dirty = ring_buffer_nr_dirty_pages(buffer, cpu); + /* + * Add one as dirty will never equal nr_pages, as the sub-buffer + * that the writer is on is not counted as dirty. + * This is needed if "buffer_percent" is set to 100. + */ + dirty = ring_buffer_nr_dirty_pages(buffer, cpu) + 1; - return (dirty * 100) > (full * nr_pages); + return (dirty * 100) >= (full * nr_pages); } /* -- 2.42.0

1 year, 6 months

1
0
0 0

[merged mm-stable] mm-sparsemem-fix-race-in-accessing-memory_section-usage.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/sparsemem: fix race in accessing memory_section->usage has been removed from the -mm tree. Its filename was mm-sparsemem-fix-race-in-accessing-memory_section-usage.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Charan Teja Kalla <quic_charante(a)quicinc.com> Subject: mm/sparsemem: fix race in accessing memory_section->usage Date: Fri, 13 Oct 2023 18:34:27 +0530 The below race is observed on a PFN which falls into the device memory region with the system memory configuration where PFN's are such that [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Since normal zone start and end pfn contains the device memory PFN's as well, the compaction triggered will try on the device memory PFN's too though they end up in NOP(because pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). When from other core, the section mappings are being removed for the ZONE_DEVICE region, that the PFN in question belongs to, on which compaction is currently being operated is resulting into the kernel crash with CONFIG_SPASEMEM_VMEMAP enabled. The crash logs can be seen at [1]. compact_zone() memunmap_pages ------------- --------------- __pageblock_pfn_to_page ...... (a)pfn_valid(): valid_section()//return true (b)__remove_pages()-> sparse_remove_section()-> section_deactivate(): [Free the array ms->usage and set ms->usage = NULL] pfn_section_valid() [Access ms->usage which is NULL] NOTE: From the above it can be said that the race is reduced to between the pfn_valid()/pfn_section_valid() and the section deactivate with SPASEMEM_VMEMAP enabled. The commit b943f045a9af("mm/sparse: fix kernel crash with pfn_section_valid check") tried to address the same problem by clearing the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns false thus ms->usage is not accessed. Fix this issue by the below steps: a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage. b) RCU protected read side critical section will either return NULL when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage. c) Free the ->usage with kfree_rcu() and set ms->usage = NULL. No attempt will be made to access ->usage after this as the SECTION_HAS_MEM_MAP is cleared thus valid_section() return false. Thanks to David/Pavan for their inputs on this patch. [1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quici… On Snapdragon SoC, with the mentioned memory configuration of PFN's as [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see bunch of issues daily while testing on a device farm. For this particular issue below is the log. Though the below log is not directly pointing to the pfn_section_valid(){ ms->usage;}, when we loaded this dump on T32 lauterbach tool, it is pointing. [ 540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 540.578068] Mem abort info: [ 540.578070] ESR = 0x0000000096000005 [ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits [ 540.578077] SET = 0, FnV = 0 [ 540.578080] EA = 0, S1PTW = 0 [ 540.578082] FSC = 0x05: level 1 translation fault [ 540.578085] Data abort info: [ 540.578086] ISV = 0, ISS = 0x00000005 [ 540.578088] CM = 0, WnR = 0 [ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--) [ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c [ 540.579454] lr : compact_zone+0x994/0x1058 [ 540.579460] sp : ffffffc03579b510 [ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c [ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640 [ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000 [ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140 [ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff [ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001 [ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440 [ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4 [ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000001 [ 540.579518] x2 : ffffffdebf7e3940 x1 : 0000000000235c00 x0 :0000000000235800 [ 540.579524] Call trace: [ 540.579527] __pageblock_pfn_to_page+0x6c/0x14c [ 540.579533] compact_zone+0x994/0x1058 [ 540.579536] try_to_compact_pages+0x128/0x378 [ 540.579540] __alloc_pages_direct_compact+0x80/0x2b0 [ 540.579544] __alloc_pages_slowpath+0x5c0/0xe10 [ 540.579547] __alloc_pages+0x250/0x2d0 [ 540.579550] __iommu_dma_alloc_noncontiguous+0x13c/0x3fc [ 540.579561] iommu_dma_alloc+0xa0/0x320 [ 540.579565] dma_alloc_attrs+0xd4/0x108 [quic_charante(a)quicinc.com: use kfree_rcu() in place of synchronize_rcu(), per David] Link: https://lkml.kernel.org/r/1698403778-20938-1-git-send-email-quic_charante@q… Link: https://lkml.kernel.org/r/1697202267-23600-1-git-send-email-quic_charante@q… Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot") Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com> Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> Cc: Dan Williams <dan.j.williams(a)intel.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Oscar Salvador <osalvador(a)suse.de> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/mmzone.h | 14 +++++++++++--- mm/sparse.c | 17 +++++++++-------- 2 files changed, 20 insertions(+), 11 deletions(-) --- a/include/linux/mmzone.h~mm-sparsemem-fix-race-in-accessing-memory_section-usage +++ a/include/linux/mmzone.h @@ -1799,6 +1799,7 @@ static inline unsigned long section_nr_t #define SUBSECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUBSECTION_MASK) struct mem_section_usage { + struct rcu_head rcu; #ifdef CONFIG_SPARSEMEM_VMEMMAP DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION); #endif @@ -1992,7 +1993,7 @@ static inline int pfn_section_valid(stru { int idx = subsection_map_index(pfn); - return test_bit(idx, ms->usage->subsection_map); + return test_bit(idx, READ_ONCE(ms->usage)->subsection_map); } #else static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) @@ -2016,6 +2017,7 @@ static inline int pfn_section_valid(stru static inline int pfn_valid(unsigned long pfn) { struct mem_section *ms; + int ret; /* * Ensure the upper PAGE_SHIFT bits are clear in the @@ -2029,13 +2031,19 @@ static inline int pfn_valid(unsigned lon if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) return 0; ms = __pfn_to_section(pfn); - if (!valid_section(ms)) + rcu_read_lock(); + if (!valid_section(ms)) { + rcu_read_unlock(); return 0; + } /* * Traditionally early sections always returned pfn_valid() for * the entire section-sized span. */ - return early_section(ms) || pfn_section_valid(ms, pfn); + ret = early_section(ms) || pfn_section_valid(ms, pfn); + rcu_read_unlock(); + + return ret; } #endif --- a/mm/sparse.c~mm-sparsemem-fix-race-in-accessing-memory_section-usage +++ a/mm/sparse.c @@ -792,6 +792,13 @@ static void section_deactivate(unsigned unsigned long section_nr = pfn_to_section_nr(pfn); /* + * Mark the section invalid so that valid_section() + * return false. This prevents code from dereferencing + * ms->usage array. + */ + ms->section_mem_map &= ~SECTION_HAS_MEM_MAP; + + /* * When removing an early section, the usage map is kept (as the * usage maps of other sections fall into the same page). It * will be re-used when re-adding the section - which is then no @@ -799,16 +806,10 @@ static void section_deactivate(unsigned * was allocated during boot. */ if (!PageReserved(virt_to_page(ms->usage))) { - kfree(ms->usage); - ms->usage = NULL; + kfree_rcu(ms->usage, rcu); + WRITE_ONCE(ms->usage, NULL); } memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); - /* - * Mark the section invalid so that valid_section() - * return false. This prevents code from dereferencing - * ms->usage array. - */ - ms->section_mem_map &= ~SECTION_HAS_MEM_MAP; } /* _ Patches currently in -mm which might be from quic_charante(a)quicinc.com are

1 year, 6 months

1
0
0 0

[merged mm-stable] mm-migrate-fix-getting-incorrect-page-mapping-during-page-migration.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: migrate: fix getting incorrect page mapping during page migration has been removed from the -mm tree. Its filename was mm-migrate-fix-getting-incorrect-page-mapping-during-page-migration.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Baolin Wang <baolin.wang(a)linux.alibaba.com> Subject: mm: migrate: fix getting incorrect page mapping during page migration Date: Fri, 15 Dec 2023 20:07:52 +0800 When running stress-ng testing, we found below kernel crash after a few hours: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 pc : dentry_name+0xd8/0x224 lr : pointer+0x22c/0x370 sp : ffff800025f134c0 ...... Call trace: dentry_name+0xd8/0x224 pointer+0x22c/0x370 vsnprintf+0x1ec/0x730 vscnprintf+0x2c/0x60 vprintk_store+0x70/0x234 vprintk_emit+0xe0/0x24c vprintk_default+0x3c/0x44 vprintk_func+0x84/0x2d0 printk+0x64/0x88 __dump_page+0x52c/0x530 dump_page+0x14/0x20 set_migratetype_isolate+0x110/0x224 start_isolate_page_range+0xc4/0x20c offline_pages+0x124/0x474 memory_block_offline+0x44/0xf4 memory_subsys_offline+0x3c/0x70 device_offline+0xf0/0x120 ...... After analyzing the vmcore, I found this issue is caused by page migration. The scenario is that, one thread is doing page migration, and we will use the target page's ->mapping field to save 'anon_vma' pointer between page unmap and page move, and now the target page is locked and refcount is 1. Currently, there is another stress-ng thread performing memory hotplug, attempting to offline the target page that is being migrated. It discovers that the refcount of this target page is 1, preventing the offline operation, thus proceeding to dump the page. However, page_mapping() of the target page may return an incorrect file mapping to crash the system in dump_mapping(), since the target page->mapping only saves 'anon_vma' pointer without setting PAGE_MAPPING_ANON flag. There are seveval ways to fix this issue: (1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving 'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target page has not built mappings yet. (2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing the system, however, there are still some PFN walkers that call page_mapping() without holding the page lock, such as compaction. (3) Using target page->private field to save the 'anon_vma' pointer and 2 bits page state, just as page->mapping records an anonymous page, which can remove the page_mapping() impact for PFN walkers and also seems a simple way. So I choose option 3 to fix this issue, and this can also fix other potential issues for PFN walkers, such as compaction. Link: https://lkml.kernel.org/r/e60b17a88afc38cb32f84c3e30837ec70b343d2b.17026417… Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()") Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com> Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com> Cc: Matthew Wilcox <willy(a)infradead.org> Cc: David Hildenbrand <david(a)redhat.com> Cc: Xu Yu <xuyu(a)linux.alibaba.com> Cc: Zi Yan <ziy(a)nvidia.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/migrate.c | 27 ++++++++++----------------- 1 file changed, 10 insertions(+), 17 deletions(-) --- a/mm/migrate.c~mm-migrate-fix-getting-incorrect-page-mapping-during-page-migration +++ a/mm/migrate.c @@ -1025,38 +1025,31 @@ out: } /* - * To record some information during migration, we use some unused - * fields (mapping and private) of struct folio of the newly allocated - * destination folio. This is safe because nobody is using them - * except us. + * To record some information during migration, we use unused private + * field of struct folio of the newly allocated destination folio. + * This is safe because nobody is using it except us. */ -union migration_ptr { - struct anon_vma *anon_vma; - struct address_space *mapping; -}; - enum { PAGE_WAS_MAPPED = BIT(0), PAGE_WAS_MLOCKED = BIT(1), + PAGE_OLD_STATES = PAGE_WAS_MAPPED | PAGE_WAS_MLOCKED, }; static void __migrate_folio_record(struct folio *dst, - unsigned long old_page_state, + int old_page_state, struct anon_vma *anon_vma) { - union migration_ptr ptr = { .anon_vma = anon_vma }; - dst->mapping = ptr.mapping; - dst->private = (void *)old_page_state; + dst->private = (void *)anon_vma + old_page_state; } static void __migrate_folio_extract(struct folio *dst, int *old_page_state, struct anon_vma **anon_vmap) { - union migration_ptr ptr = { .mapping = dst->mapping }; - *anon_vmap = ptr.anon_vma; - *old_page_state = (unsigned long)dst->private; - dst->mapping = NULL; + unsigned long private = (unsigned long)dst->private; + + *anon_vmap = (struct anon_vma *)(private & ~PAGE_OLD_STATES); + *old_page_state = private & PAGE_OLD_STATES; dst->private = NULL; } _ Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are

1 year, 6 months

1
0
0 0

[folded-merged] mm-sparsemem-fix-race-in-accessing-memory_section-usage-v2.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/sparsemem: fix race in accessing memory_section->usage has been removed from the -mm tree. Its filename was mm-sparsemem-fix-race-in-accessing-memory_section-usage-v2.patch This patch was dropped because it was folded into mm-sparsemem-fix-race-in-accessing-memory_section-usage.patch ------------------------------------------------------ From: Charan Teja Kalla <quic_charante(a)quicinc.com> Subject: mm/sparsemem: fix race in accessing memory_section->usage Date: Fri, 27 Oct 2023 16:19:38 +0530 use kfree_rcu() in place of synchronize_rcu(), per David Link: https://lkml.kernel.org/r/1698403778-20938-1-git-send-email-quic_charante@q… Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot") Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com> Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> Cc: Dan Williams <dan.j.williams(a)intel.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Oscar Salvador <osalvador(a)suse.de> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/mmzone.h | 3 ++- mm/sparse.c | 5 ++--- 2 files changed, 4 insertions(+), 4 deletions(-) --- a/include/linux/mmzone.h~mm-sparsemem-fix-race-in-accessing-memory_section-usage-v2 +++ a/include/linux/mmzone.h @@ -1799,6 +1799,7 @@ static inline unsigned long section_nr_t #define SUBSECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUBSECTION_MASK) struct mem_section_usage { + struct rcu_head rcu; #ifdef CONFIG_SPARSEMEM_VMEMMAP DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION); #endif @@ -1992,7 +1993,7 @@ static inline int pfn_section_valid(stru { int idx = subsection_map_index(pfn); - return test_bit(idx, ms->usage->subsection_map); + return test_bit(idx, READ_ONCE(ms->usage)->subsection_map); } #else static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) --- a/mm/sparse.c~mm-sparsemem-fix-race-in-accessing-memory_section-usage-v2 +++ a/mm/sparse.c @@ -806,9 +806,8 @@ static void section_deactivate(unsigned * was allocated during boot. */ if (!PageReserved(virt_to_page(ms->usage))) { - synchronize_rcu(); - kfree(ms->usage); - ms->usage = NULL; + kfree_rcu(ms->usage, rcu); + WRITE_ONCE(ms->usage, NULL); } memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); } _ Patches currently in -mm which might be from quic_charante(a)quicinc.com are mm-sparsemem-fix-race-in-accessing-memory_section-usage.patch

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] mm-mglru-skip-special-vmas-in-lru_gen_look_around.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/mglru: skip special VMAs in lru_gen_look_around() has been removed from the -mm tree. Its filename was mm-mglru-skip-special-vmas-in-lru_gen_look_around.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Yu Zhao <yuzhao(a)google.com> Subject: mm/mglru: skip special VMAs in lru_gen_look_around() Date: Fri, 22 Dec 2023 21:56:47 -0700 Special VMAs like VM_PFNMAP can contain anon pages from COW. There isn't much profit in doing lookaround on them. Besides, they can trigger the pte_special() warning in get_pte_pfn(). Skip them in lru_gen_look_around(). Link: https://lkml.kernel.org/r/20231223045647.1566043-1-yuzhao@google.com Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap") Signed-off-by: Yu Zhao <yuzhao(a)google.com> Reported-by: syzbot+03fd9b3f71641f0ebf2d(a)syzkaller.appspotmail.com Closes: https://lore.kernel.org/000000000000f9ff00060d14c256@google.com/ Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/vmscan.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) --- a/mm/vmscan.c~mm-mglru-skip-special-vmas-in-lru_gen_look_around +++ a/mm/vmscan.c @@ -3955,6 +3955,7 @@ void lru_gen_look_around(struct page_vma int young = 0; pte_t *pte = pvmw->pte; unsigned long addr = pvmw->address; + struct vm_area_struct *vma = pvmw->vma; struct folio *folio = pfn_folio(pvmw->pfn); bool can_swap = !folio_is_file_lru(folio); struct mem_cgroup *memcg = folio_memcg(folio); @@ -3969,11 +3970,15 @@ void lru_gen_look_around(struct page_vma if (spin_is_contended(pvmw->ptl)) return; + /* exclude special VMAs containing anon pages from COW */ + if (vma->vm_flags & VM_SPECIAL) + return; + /* avoid taking the LRU lock under the PTL when possible */ walk = current->reclaim_state ? current->reclaim_state->mm_walk : NULL; - start = max(addr & PMD_MASK, pvmw->vma->vm_start); - end = min(addr | ~PMD_MASK, pvmw->vma->vm_end - 1) + 1; + start = max(addr & PMD_MASK, vma->vm_start); + end = min(addr | ~PMD_MASK, vma->vm_end - 1) + 1; if (end - start > MIN_LRU_BATCH * PAGE_SIZE) { if (addr - start < MIN_LRU_BATCH * PAGE_SIZE / 2) @@ -3998,7 +4003,7 @@ void lru_gen_look_around(struct page_vma unsigned long pfn; pte_t ptent = ptep_get(pte + i); - pfn = get_pte_pfn(ptent, pvmw->vma, addr); + pfn = get_pte_pfn(ptent, vma, addr); if (pfn == -1) continue; @@ -4009,7 +4014,7 @@ void lru_gen_look_around(struct page_vma if (!folio) continue; - if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i)) + if (!ptep_test_and_clear_young(vma, addr, pte + i)) VM_WARN_ON_ONCE(true); young++; _ Patches currently in -mm which might be from yuzhao(a)google.com are

1 year, 6 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror December 2023