For some 32-bit architectures calculation of DTO and STO timeout can be incorrect due to multiply overflow. Lets prevent this by casting multiply and result to u64.
Suggested by Jisheng Zhang. Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
--- Changes since v2: -add fix for cto_ms
Evgeniy Didin (2): mmc: dw_mmc: Fix the DTO timeout overflow calculation for 32-bit systems mmc: dw_mmc: Fix the CTO overflow calculation for 32-bit systems
drivers/mmc/host/dw_mmc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
In commit 9d9491a7da2a ("mmc: dw_mmc: Fix the DTO timeout calculation") have been made changes which cause multiply overflow for 32-bit systems. The broken timeout calculations caused false interrupt latency warnings and stacktrace splat (such as below) when accessing the SD Card.
| Running : 4M-check-reassembly-tcp-cmykw2-rotatew2.out -v0 -w1 | - Info: Finished target initialization. | mmcblk0: error -110 transferring data, sector 320544, nr 2048, cmd response | 0x900, card status 0x0 | mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 400000Hz, actual | 396825HZ div = 63) | mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 25000000Hz, actual | 25000000HZ div = 1) | ------------[ cut here ]------------ | softirq: huh, entered softirq 6 TASKLET 6f6a9412 with preempt_count 00000101, | exited with 00000100? | WARNING: CPU: 2 PID: 0 at ../lib/scatterlist.c:652 sg_miter_next+0x28/0x20c | Modules linked in: | CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.15.0 #57 | | Stack Trace: | arc_unwind_core.constprop.1+0xd0/0xf4 | dump_stack+0x68/0x80 | warn_slowpath_null+0x4e/0xec | sg_miter_next+0x28/0x20c | dw_mci_read_data_pio+0x44/0x190 | dw_mmc f000a000.mmc: Unexpected interrupt latency | dw_mci_interrupt+0x3ee/0x530 | __handle_irq_event_percpu+0x56/0x150 | handle_irq_event+0x34/0x78 | handle_level_irq+0x8e/0x120 | generic_handle_irq+0x1c/0x2c | idu_cascade_isr+0x30/0x6c | __handle_domain_irq+0x72/0xc8 | ret_from_exception+0x0/0x8 |---[ end trace 2a58c9af6c25fe51 ]---
Lets cast this multiply to u64 type which prevents overflow.
Tested-by: Vineet Gupta Vineet.Gupta1@synopsys.com Fixes: ARC STAR 9001306872 HSDK, sdio: board crashes when copying big files
Signed-off-by: Evgeniy Didin Evgeniy.Didin@synopsys.com
CC: Alexey Brodkin abrodkin@synopsys.com CC: Eugeniy Paltsev paltsev@synopsys.com CC: Douglas Anderson dianders@chromium.org CC: Ulf Hansson ulf.hansson@linaro.org CC: linux-kernel@vger.kernel.org CC: linux-snps-arc@lists.infradead.org Cc: stable@vger.kernel.org # 9d9491a7da2a mmc: dw_mmc: Fix the DTO timeout calculation --- Nothing changed since v2. drivers/mmc/host/dw_mmc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 0aa39975f33b..194159219b32 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -1944,7 +1944,8 @@ static void dw_mci_set_drto(struct dw_mci *host) drto_div = (mci_readl(host, CLKDIV) & 0xff) * 2; if (drto_div == 0) drto_div = 1; - drto_ms = DIV_ROUND_UP(MSEC_PER_SEC * drto_clks * drto_div, + + drto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * drto_clks * drto_div, host->bus_hz);
/* add a bit spare time */
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 9d9491a7da2a ("mmc: dw_mmc: Fix the DTO timeout calculation") have been made changes which cause multiply overflow for 32-bit systems. The broken timeout calculations caused false interrupt latency warnings and stacktrace splat (such as below) when accessing the SD Card.
drto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * drto_clks * drto_div, host->bus_hz);
Same as per patch 2.
Tested-by: Vineet Gupta Vineet.Gupta1@synopsys.com Fixes: ARC STAR 9001306872 HSDK, sdio: board crashes when copying big files
Signed-off-by: Evgeniy Didin Evgeniy.Didin@synopsys.com
CC: Alexey Brodkin abrodkin@synopsys.com CC: Eugeniy Paltsev paltsev@synopsys.com CC: Douglas Anderson dianders@chromium.org CC: Ulf Hansson ulf.hansson@linaro.org CC: linux-kernel@vger.kernel.org CC: linux-snps-arc@lists.infradead.org Cc: stable@vger.kernel.org # 9d9491a7da2a mmc: dw_mmc: Fix the DTO timeout calculation
As I said, the correct tag may be:
Reported-by: Vineet Gupta Vineet.Gupta1@synopsys.com # ARC STAR 9001306872 HSDK, sdio: board crashes when copying big files Tested-by: Vineet Gupta Vineet.Gupta1@synopsys.com Fixes: 9d9491a7da2a ("mmc: dw_mmc: Fix the DTO timeout calculation") Cc: stable@vger.kernel.org Signed-off-by: Evgeniy Didin Evgeniy.Didin@synopsys.com ... ...
Nothing changed since v2. drivers/mmc/host/dw_mmc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
- drto_ms = DIV_ROUND_UP(MSEC_PER_SEC * drto_clks * drto_div,
- drto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * drto_clks * drto_div, host->bus_hz);
Hmm?
#define DIV_ROUND_DOWN_ULL(ll, d) \ ({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; })
#define DIV_ROUND_UP_ULL(ll, d) DIV_ROUND_DOWN_ULL((ll) + (d) - 1, (d))
It uses intermediate unsigned long long _tmp for your "multiply", namely MSEC_PER_SEC * drto_clks * drto_div, which could solves the problem. So I don't see why DIV_ROUND_UP_ULL can't work for you?
/* add a bit spare time */
Hi Evgeniy,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master] [also build test ERROR on v4.16-rc3 next-20180228] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Evgeniy-Didin/mmc-dw_mmc-Fix-the-DT... config: arm-exynos_defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm
All errors (new ones prefixed by >>):
drivers/mmc/host/dw_mmc.o: In function `dw_mci_set_drto':
dw_mmc.c:(.text+0x1644): undefined reference to `__aeabi_uldivmod'
--- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow.
Signed-off-by: Evgeniy Didin Evgeniy.Didin@synopsys.com
CC: Alexey Brodkin abrodkin@synopsys.com CC: Eugeniy Paltsev paltsev@synopsys.com CC: Douglas Anderson dianders@chromium.org CC: Ulf Hansson ulf.hansson@linaro.org CC: linux-kernel@vger.kernel.org CC: linux-snps-arc@lists.infradead.org Cc: stable@vger.kernel.org # 4c2357f57dd5 mmc: dw_mmc: Fix the CTO timeout calculation --- drivers/mmc/host/dw_mmc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 194159219b32..775fb3ae1443 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -409,7 +409,8 @@ static inline void dw_mci_set_cto(struct dw_mci *host) cto_div = (mci_readl(host, CLKDIV) & 0xff) * 2; if (cto_div == 0) cto_div = 1; - cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz); + + cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
/* add a bit spare time */ cto_ms += 10;
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow.
cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow. - cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
+ cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
-- Best regards, Evgeniy Didin
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow.
cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
Did you try to compile your code for 32-bit target?
On Mon, 2018-02-26 at 18:53 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow. - cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
+ cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
Did you try to compile your code for 32-bit target?
Yes, we have compiled code for 32-bit system. I am wondering why are you asking that?
-- Best regards, Evgeniy Didin
On Mon, Feb 26, 2018 at 7:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 18:53 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow.
cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
Did you try to compile your code for 32-bit target?
Yes, we have compiled code for 32-bit system.
I am wondering why are you asking that?
Because I simple didn't believe you.
ERROR: "__udivdi3" [drivers/mmc/host/dw_mmc.ko] undefined! ...scripts/Makefile.modpost:92: recipe for target '__modpost' failed make[2]: *** [__modpost] Error 1
+++ b/drivers/mmc/host/dw_mmc.c @@ -409,7 +409,7 @@ static inline void dw_mci_set_cto(struct dw_mci *host)
- cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz); + cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
Hi Andy,
On Mon, 2018-02-26 at 20:30 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 7:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 18:53 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") have been made changes which can cause multiply overflow for 32-bit systems. The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. Lets cast this multiply to u64 type which prevents overflow.
cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
Did you try to compile your code for 32-bit target?
Yes, we have compiled code for 32-bit system. I am wondering why are you asking that?
Because I simple didn't believe you.
Well world around us is much more complicated than it sometimes looks like :)
ERROR: "__udivdi3" [drivers/mmc/host/dw_mmc.ko] undefined! ...scripts/Makefile.modpost:92: recipe for target '__modpost' failed make[2]: *** [__modpost] Error 1
That's right __udivdi3() is not defined for some architectures but it is defined for some others including ARC, Xtensa etc, see https://elixir.bootlin.com/linux/latest/ident/__udivdi3
What happens __udivdi3() is implemented in libgcc in one form or another form be it pure assembly or generic implementation written in C.
So maybe we need to add export of __udivdi3() for other arches, what do you think?
-Alexey
On Mon, 26 Feb 2018 20:27:22 +0000 Alexey Brodkin wrote:
Hi Andy,
On Mon, 2018-02-26 at 20:30 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 7:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 18:53 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 16:39 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 4:34 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote: > In commit 4c2357f57dd5 ("mmc: dw_mmc: Fix the CTO timeout calculation") > have been made changes which can cause multiply overflow for 32-bit systems. > The value of cto_ms is lower the drto_ms, but nevertheless overflow can occur. > Lets cast this multiply to u64 type which prevents overflow. > - cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz); > + > + cto_ms = DIV_ROUND_UP((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
IIRC, someone commented on this or similar, i.e.
DIV_ROUND_UP_ULL() ?
Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
Did you try to compile your code for 32-bit target?
Yes, we have compiled code for 32-bit system. I am wondering why are you asking that?
Because I simple didn't believe you.
Well world around us is much more complicated than it sometimes looks like :)
ERROR: "__udivdi3" [drivers/mmc/host/dw_mmc.ko] undefined! ...scripts/Makefile.modpost:92: recipe for target '__modpost' failed make[2]: *** [__modpost] Error 1
That's right __udivdi3() is not defined for some architectures but it is defined for some others including ARC, Xtensa etc, see https://elixir.bootlin.com/linux/latest/ident/__udivdi3
What happens __udivdi3() is implemented in libgcc in one form or another form be it pure assembly or generic implementation written in C.
So maybe we need to add export of __udivdi3() for other arches, what do you think?
Per my understanding, Linux kernel prefer to make use of do_div or implementations in <linux/math64.h> for 64bit divide
Thanks
On Tue, Feb 27, 2018 at 5:52 AM, Jisheng Zhang Jisheng.Zhang@synaptics.com wrote:
On Mon, 26 Feb 2018 20:27:22 +0000 Alexey Brodkin wrote:
On Mon, 2018-02-26 at 20:30 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 7:14 PM, Evgeniy Didin Evgeniy.Didin@synopsys.com wrote:
On Mon, 2018-02-26 at 18:53 +0200, Andy Shevchenko wrote:
On Mon, Feb 26, 2018 at 5:14 PM, Evgeniy Didin
> IIRC, someone commented on this or similar, i.e. > DIV_ROUND_UP_ULL() ?
^^^^^^^^^^^^
So maybe we need to add export of __udivdi3() for other arches, what do you think?
Per my understanding, Linux kernel prefer to make use of do_div or implementations in <linux/math64.h> for 64bit divide
To everyone in this thread. See just above. That's solution which will work. Other than that, drop COMPILE_TEST from the Kconfig and put a comment that driver will not compile on architectures w/o division library.
On Mon, 26 Feb 2018 17:34:11 +0300 Evgeniy Didin wrote:
For some 32-bit architectures calculation of DTO and STO timeout can be incorrect due to multiply overflow. Lets prevent this by casting multiply and result to u64.
Suggested by Jisheng Zhang. Switch DIV_ROUND_UP macro to DIV_ROUND_UP_ULL is not reasonable because overflow happens on multiply and DIV_ROUND_UP_ULL helps with sum overflow.
hmmm, I mean something as below:
-cto_ms = DIV_ROUND_UP(MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz); +cto_ms = DIV_ROUND_UP_ULL((u64)MSEC_PER_SEC * cto_clks * cto_div, host->bus_hz);
This could avoid build error in arm case.
Changes since v2: -add fix for cto_ms
Evgeniy Didin (2): mmc: dw_mmc: Fix the DTO timeout overflow calculation for 32-bit systems mmc: dw_mmc: Fix the CTO overflow calculation for 32-bit systems
These two patches could be folded into one patch?
Thanks
linux-stable-mirror@lists.linaro.org