Re: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()


Viresh Kumar

8 Oct 2014

7:04 a.m.

On 25 September 2014 11:37, Robert Schöne robert.schoene@tu-dresden.de wrote:

...

We had some iterations of patches, but the only solution that works for me is the patch with the coarse-grained lock that I sent on Mon, 08 Sep 2014 10:16:48 CEST [1]. Viresh has been pretty occupied lately, but he told me that he might run the tests himself once the current busy period is over, as he has been supplied with a test system. I'm not sure about his current status (busy or testing).

Hi Robert/Prarit,

The last state of my branch cpufreq/governor-fixes that you tested had a few bugs in it, so you weren't even able to test things properly.

I couldn't manage to test my patches on a multi-cluster system (couldn't get it up yet :( ), but I was able to do that on a dual-core ARM Cortex-A15 board, and could easily find the bugs there.

I have updated my branch with the changes now, and it would be great if you could confirm whether they fix your issues.

git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/governor-fixes

-- viresh


Prarit Bhargava

8 Oct 2014

12:46 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/08/2014 03:04 AM, Viresh Kumar wrote:

...

On 25 September 2014 11:37, Robert Schöne robert.schoene@tu-dresden.de wrote:

...
We had some iterations of patches, but the only solution that works for me is the patch with the coarse-grained lock that I sent on Mon, 08 Sep 2014 10:16:48 CEST [1]. Viresh has been pretty occupied lately, but he told me that he might run the tests himself once the current busy period is over, as he has been supplied with a test system. I'm not sure about his current status (busy or testing).

Hi Robert/Prarit,

The last state of my branch cpufreq/governor-fixes that you tested had a few bugs in it, so you weren't even able to test things properly.

I couldn't manage to test my patches on a multi-cluster system (couldn't get it up yet :( ), but I was able to do that on a dual-core ARM Cortex-A15 board, and could easily find the bugs there.

I have updated my branch with the changes now, and it would be great if you could confirm whether they fix your issues.

git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/governor-fixes

Hey Viresh, this is on my plate for today. It does look like the panic I sent you yesterday in email does occur when your patches are put into the latest upstream kernel :(.

I'm going to debug shortly ... for anyone interested the panic is:

[ 30.402052] Modules linked in: rfkill nfsd auth_rpcgss nfs_acl lockd sunrpc e1000e x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul iTCO_wdt iTCO_vendor_support i2c_i801 ptp crc32_pclmul crc32c_intel ghash_clmulni_intel sb_edac pps_core aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr lpc_ich edac_core shpchp mfd_core wmi ipmi_si ipmi_msghandler acpi_pad acpi_cpufreq xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm ahci libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 30.464642] CPU: 106 PID: 2074 Comm: cpupower Not tainted 3.17.0+ #2
[ 30.471743] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0049.R00.1403081207 03/08/2014
[ 30.483308] task: ffff88104fafec80 ti: ffff88104f914000 task.ti: ffff88104f914000
[ 30.491669] RIP: 0010:[<ffffffff810a8a05>] [<ffffffff810a8a05>] update_blocked_averages+0x205/0x740
[ 30.501897] RSP: 0018:ffff88205f203df8 EFLAGS: 00010002
[ 30.507831] RAX: 000000000000006a RBX: ffff882050181e00 RCX: 2030203020302030
[ 30.515803] RDX: 2030203020302030 RSI: 0000000000000000 RDI: 0000000000000000
[ 30.523777] RBP: ffff88205f203e60 R08: ffffffffffffffff R09: ffff88205f214800
[ 30.531750] R10: 0000000000000000 R11: 000000000000b4d1 R12: ffff88205078fc00
[ 30.539721] R13: ffff882043e07c00 R14: ffff88205f214780 R15: ffff88205f215028
[ 30.547694] FS: 00007f1bf54a4740(0000) GS:ffff88205f200000(0000) knlGS:0000000000000000
[ 30.556733] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 30.563154] CR2: 00007f1bf4a01900 CR3: 0000002048160000 CR4: 00000000001407e0
[ 30.571127] Stack:
[ 30.573372] 00002362000f4240 000000000000013b 0000000000000066 ffff882043e07c00
[ 30.581677] ffff88205f214800 0000000000000246 ffff88205078fcc0 0000000011e9d93e
[ 30.589988] 00000000fffcc75f ffff88205f214780 0000000000014780 0000000000000001
[ 30.598294] Call Trace:
[ 30.601025] <IRQ>
[ 30.603173] [<ffffffff810af9b4>] rebalance_domains+0x54/0x290
[ 30.609916] [<ffffffff810afc34>] run_rebalance_domains+0x44/0x1d0
[ 30.616827] [<ffffffff810797a5>] __do_softirq+0xf5/0x2e0
[ 30.622861] [<ffffffff81079c6d>] irq_exit+0x10d/0x120
[ 30.628608] [<ffffffff81656155>] smp_apic_timer_interrupt+0x45/0x60
[ 30.635710] [<ffffffff8165425d>] apic_timer_interrupt+0x6d/0x80
[ 30.642418] <EOI>
[ 30.644566] [<ffffffff813043e2>] ? number.isra.2+0x62/0x360
[ 30.651121] [<ffffffff813046a3>] ? number.isra.2+0x323/0x360
[ 30.657545] [<ffffffff81306755>] vsnprintf+0x3e5/0x5c0
[ 30.663385] [<ffffffff81306ab6>] sprintf+0x56/0x80
[ 30.668841] [<ffffffff814e42be>] show_available_freqs.isra.1+0xae/0xc0
[ 30.676235] [<ffffffff814e42e7>] scaling_available_frequencies_show+0x17/0x20
[ 30.684307] [<ffffffff814e04ac>] show+0x5c/0x90
[ 30.689472] [<ffffffff8125df6c>] sysfs_kf_seq_show+0xcc/0x1e0
[ 30.695992] [<ffffffff8125c663>] kernfs_seq_show+0x23/0x30
[ 30.702224] [<ffffffff8120970a>] seq_read+0xfa/0x3a0
[ 30.707870] [<ffffffff8125ced5>] kernfs_fop_read+0xf5/0x160
[ 30.714198] [<ffffffff811e5b28>] vfs_read+0x98/0x170
[ 30.719844] [<ffffffff811e6805>] SyS_read+0x55/0xd0
[ 30.725394] [<ffffffff81653369>] system_call_fastpath+0x16/0x1b
[ 30.732104] Code: c7 4c 8d a0 40 ff ff ff 0f 84 c0 00 00 00 49 8b 94 24 d0 00 00 00 49 63 86 70 09 00 00 48 8b 8a a8 00 00 00 48 8b 92 b0 00 00 00 <48> 8b 1c c1 4c 8b 2c c2 0f 1f 44 00 00 be 01 00 00 00 4c 89 ef
[ 30.753924] RIP [<ffffffff810a8a05>] update_blocked_averages+0x205/0x740
[ 30.761523] RSP <ffff88205f203df8>
[ 30.765421] ---[ end trace c3a68cab33090779 ]---
[ 30.770579] Kernel panic - not syncing: Fatal exception in interrupt
[ 30.773853] general protection fault: 0000 [#2] SMP
[ 30.773900] Modules linked in: rfkill nfsd auth_rpcgss nfs_acl lockd sunrpc e1000e x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul iTCO_wdt iTCO_vendor_support i2c_i801 ptp crc32_pclmul crc32c_intel ghash_clmulni_intel sb_edac pps_core aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr lpc_ich edac_core shpchp mfd_core wmi ipmi_si ipmi_msghandler acpi_pad acpi_cpufreq xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm ahci libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 30.773905] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G D 3.17.0+ #2
[ 30.773907] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0049.R00.1403081207 03/08/2014
[ 30.773909] task: ffff882053e93640 ti: ffff880853610000 task.ti: ffff880853610000
[ 30.773923] RIP: 0010:[<ffffffff810a8a05>] [<ffffffff810a8a05>] update_blocked_averages+0x205/0x740
[ 30.773925] RSP: 0018:ffff88185f843df8 EFLAGS: 00010002
[ 30.773926] RAX: 0000000000000020 RBX: ffff88184d6a4a80 RCX: 2030203020302030
[ 30.773928] RDX: 2030203020302030 RSI: 0000000000000000 RDI: ffff88185173f4c0
[ 30.773929] RBP: ffff88185f843e60 R08: ffff88185173f4c0 R09: ffff88185f854800
[ 30.773930] R10: 0000000000000000 R11: 000000000000be09 R12: ffff88185081f400
[ 30.773931] R13: ffff88185173f400 R14: ffff88185f854780 R15: ffff88185f855028
[ 30.773934] FS: 0000000000000000(0000) GS:ffff88185f840000(0000) knlGS:0000000000000000
[ 30.773935] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 30.773937] CR2: 00007f5090003308 CR3: 000000000197c000 CR4: 00000000001407e0
[ 30.773938] Stack:
[ 30.773942] 0000000154104f38 0000000000000389 0000000000000000 ffff88185173f400
[ 30.773945] ffff88185f854800 0000000000000246 ffff88185081f4c0 a8dd852b1b50c0c9
[ 30.773947] 00000000fffcc8d8 ffff88185f854780 0000000000014780 0000000000000000
[ 30.773948] Call Trace:
[ 30.773952] <IRQ>
[ 30.773957] [<ffffffff810af9b4>] rebalance_domains+0x54/0x290
[ 30.773967] [<ffffffff810d7066>] ? call_timer_fn+0x36/0x100
[ 30.773971] [<ffffffff810afc34>] run_rebalance_domains+0x44/0x1d0
[ 30.773979] [<ffffffff810797a5>] __do_softirq+0xf5/0x2e0
[ 30.773982] [<ffffffff81079c6d>] irq_exit+0x10d/0x120
[ 30.773991] [<ffffffff81656155>] smp_apic_timer_interrupt+0x45/0x60
[ 30.773994] [<ffffffff8165425d>] apic_timer_interrupt+0x6d/0x80
[ 30.773996] <EOI>
[ 30.774005] [<ffffffff814e8ac0>] ? cpuidle_enter_state+0x70/0x170
[ 30.774008] [<ffffffff814e8c77>] cpuidle_enter+0x17/0x20
[ 30.774014] [<ffffffff810b5d5d>] cpu_startup_entry+0x37d/0x3a0
[ 30.774021] [<ffffffff81048550>] start_secondary+0x210/0x2d0
[ 30.774045] Code: c7 4c 8d a0 40 ff ff ff 0f 84 c0 00 00 00 49 8b 94 24 d0 00 00 00 49 63 86 70 09 00 00 48 8b 8a a8 00 00 00 48 8b 92 b0 00 00 00 <48> 8b 1c c1 4c 8b 2c c2 0f 1f 44 00 00 be 01 00 00 00 4c 89 ef
[ 30.774049] RIP [<ffffffff810a8a05>] update_blocked_averages+0x205/0x740
[ 30.774050] RSP <ffff88185f843df8>
[ 30.774054] ---[ end trace c3a68cab3309077a ]---
[ 32.189638] Shutting down cpus with NMI
[ 32.193941] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[ 32.205301] drm_kms_helper: panic occurred, switching back to text console
[ 32.213005] ---[ end Kernel panic - not syncing: Fatal exception in interrup

...

-- viresh

Viresh Kumar

10 Oct 2014

9:04 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 8 October 2014 18:16, Prarit Bhargava prarit@redhat.com wrote:

...

On 10/08/2014 03:04 AM, Viresh Kumar wrote:

...

...
The last state of my branch cpufreq/governor-fixes that you tested had a few bugs in it, so you weren't even able to test things properly.

I couldn't manage to test my patches on a multi-cluster system (couldn't get it up yet :( ), but I was able to do that on a dual-core ARM Cortex-A15 board, and could easily find the bugs there.

I have updated my branch with the changes now, and it would be great if you could confirm whether they fix your issues.

git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/governor-fixes

Robert/Prarit,

I thought you would test this very quickly, as it has been pending for a long time. What happened?

...

Hey Viresh, this is on my plate for today. It does look like the panic I sent you yesterday in email does occur when your patches are put into the latest upstream kernel :(.

I have tested my patches on mainline only, i.e. v3.17.

Even the branch I mentioned above is based on that.

Robert Schöne

10:41 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

...

Robert/Prarit,

I thought you would test this very quickly, as it has been pending for a long time. What happened?

...
Hey Viresh, this is on my plate for today. It does look like the panic I sent you yesterday in email does occur when your patches are put into the latest upstream kernel :(.

I have tested my patches on mainline only, i.e. v3.17.

Even the branch I mentioned above is based on that.

Hi,

This patch makes it worse. Even when changing the governors sequentially for all CPUs, it fails. Here is what happens:

1. I boot the system. (performance)
2. An Ubuntu service enables ondemand sequentially for all CPUs. (ondemand)
3. I enable performance sequentially. (performance)
4. I enable ondemand sequentially. (broken)

In the last step, ondemand can only be enabled on a single CPU. All others return -EBUSY.

More about step 3/4:

$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
$ echo ondemand | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
$ echo ondemand | sudo tee /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
ondemand
tee: /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor: Device or resource busy
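
For reference, the sequential switch from the steps above could be scripted roughly as follows. This is only a sketch: the function name set_governor_sequentially and its optional second argument (an alternative root directory, so the loop can be tried on a scratch tree instead of the real sysfs) are assumptions made for illustration and are not part of the setup described in this mail.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): write one governor to every
# CPU's scaling_governor file, one CPU at a time, and count rejected writes.
# The second argument is an assumed override of the sysfs CPU directory.
set_governor_sequentially() {
    local gov="$1"
    local root="${2:-/sys/devices/system/cpu}"
    local failed=0
    local f

    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when no CPU directory matches.
        [ -e "$f" ] || continue
        # A rejected write (for example -EBUSY) makes echo return non-zero.
        if ! echo "$gov" 2>/dev/null > "$f"; then
            echo "failed: $f" >&2
            failed=$((failed + 1))
        fi
    done

    # Print the number of failed writes; return non-zero if there were any.
    echo "$failed"
    return $(( failed > 0 ))
}
```

Calling set_governor_sequentially ondemand as root prints the number of CPUs that refused the new governor and lists the affected files on stderr, which makes the "only one CPU accepts ondemand" symptom easy to spot.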

Robert

Viresh Kumar

11:14 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 16:11, Robert Schöne robert.schoene@tu-dresden.de wrote:

...

This patch makes it worse. Even when changing the governors sequentially for all CPUs, it fails. Here is what happens:

I boot the system. (performance)

An Ubuntu service enables ondemand sequentially for all cpus (ondemand)

I enable performance sequentially (performance)

I enable ondemand sequentially (broken)

In the last step, ondemand can only be enabled at a single CPU. All others return -EBUSY.

More about step 3/4:

$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
$ echo ondemand | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
$ echo ondemand | sudo tee /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
ondemand
tee: /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor: Device or resource busy

Getting -EBUSY here isn't a problem if the governor is currently changing. But I see that I have over-engineered some part of my patches.

Can you please try cpufreq/governor-fixes-v2 instead and let me know how that behaves?

Sorry for the trouble.

Prarit Bhargava

11:21 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/10/2014 05:04 AM, Viresh Kumar wrote:

...

On 8 October 2014 18:16, Prarit Bhargava prarit@redhat.com wrote:

...
On 10/08/2014 03:04 AM, Viresh Kumar wrote:

...
...
The last state of my branch cpufreq/governor-fixes that you tested had a few bugs in it, so you weren't even able to test things properly.

I couldn't manage to test my patches on a multi-cluster system (couldn't get it up yet :( ), but I was able to do that on a dual-core ARM Cortex-A15 board, and could easily find the bugs there.

I have updated my branch with the changes now, and it would be great if you could confirm whether they fix your issues.

git://git.linaro.org/people/viresh.kumar/linux.git cpufreq/governor-fixes

Robert/Prarit,

I thought you would test this very quickly, as it has been pending for a long time. What happened?

...
Hey Viresh, this is on my plate for today. It does look like the panic I sent you yesterday in email does occur when your patches are put into the latest upstream kernel :(.

I have tested my patches on mainline only, i.e. v3.17.

Even the branch I mentioned above is based on that.

Yep, I get that panic doing a very simple

#!/bin/bash

i=0
while [ True ]; do
    i=$((i+1))
    echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "ondemand" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    if [ $((i % 100)) = 0 ]; then
        echo $i
    fi
done

The blocking issue that I have (soon to be resolved I hope) is

http://marc.info/?l=linux-kernel&m=141286895623716&w=2

which is preventing me from doing any LOCKDEP analysis on this system.

I'm working on all of the above right now ...

Viresh Kumar

11:30 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 16:51, Prarit Bhargava prarit@redhat.com wrote:

...

Yep, I get that panic doing a very simple

#!/bin/bash

i=0
while [ True ]; do
    i=$((i+1))
    echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "ondemand" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    if [ $((i % 100)) = 0 ]; then
        echo $i
    fi
done

The blocking issue that I have (soon to be resolved I hope) is

http://marc.info/?l=linux-kernel&m=141286895623716&w=2

which is preventing me from doing any LOCKDEP analysis on this system.

I'm working on all of the above right now ...

I have updated my patchset a bit now, the new branch is: cpufreq/governor-fixes-v2. But have you tried this on 3.17 without my patches?

-- viresh

Prarit Bhargava

11:38 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/10/2014 07:30 AM, Viresh Kumar wrote:

...

On 10 October 2014 16:51, Prarit Bhargava prarit@redhat.com wrote:

...
Yep, I get that panic doing a very simple

#!/bin/bash

i=0
while [ True ]; do
    i=$((i+1))
    echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "ondemand" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor &
    echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor &
    if [ $((i % 100)) = 0 ]; then
        echo $i
    fi
done

The blocking issue that I have (soon to be resolved I hope) is

http://marc.info/?l=linux-kernel&m=141286895623716&w=2

which is preventing me from doing any LOCKDEP analysis on this system.

I'm working on all of the above right now ...

I have updated my patchset a bit now, the new branch is: cpufreq/governor-fixes-v2. But have you tried this on 3.17 without my patches?

Yes, I unfortunately have a different set of issues with vanilla 3.17 (previously mentioned locking issue). I've done a quick and dirty hack to get around that, and everything seems okay.

I apply your patches and I get a panic the first time I read sysfs

...

-- viresh

Viresh Kumar

11:46 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 17:08, Prarit Bhargava prarit@redhat.com wrote:

...

Yes, I unfortunately have a different set of issues with vanilla 3.17 (previously mentioned locking issue). I've done a quick and dirty hack to get around that, and everything seems okay.

I apply your patches and I get a panic the first time I read sysfs

All changes in my patches touch these routines:
- cpufreq_set_policy()
- __cpufreq_governor()

These two don't get called in the read path at all, so I am not sure how they can make things worse for you.

I had doubts about just one change. Can you please check the v2 branch to see whether it gives the same problem?

Prarit Bhargava

11:48 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/10/2014 07:46 AM, Viresh Kumar wrote:

...

On 10 October 2014 17:08, Prarit Bhargava prarit@redhat.com wrote:

...
Yes, I unfortunately have a different set of issues with vanilla 3.17 (previously mentioned locking issue). I've done a quick and dirty hack to get around that, and everything seems okay.

I apply your patches and I get a panic the first time I read sysfs

All changes in my patches are touching these routines:

cpufreq_set_policy()

__cpufreq_governor()

These two don't get called in the read path at all, so I am not sure how they can make things worse for you.

I had doubts about just one change. Can you please check the v2 branch to see whether it gives the same problem?

Yep, trying it now ...

Robert Schöne

12:01 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

With v2, my system still crashes when setting the governors concurrently. I wasn't able to get a stack trace.

Robert

On Friday, 10.10.2014, at 07:48 -0400, Prarit Bhargava wrote:

...

On 10/10/2014 07:46 AM, Viresh Kumar wrote:

...
On 10 October 2014 17:08, Prarit Bhargava prarit@redhat.com wrote:

...
Yes, I unfortunately have a different set of issues with vanilla 3.17 (previously mentioned locking issue). I've done a quick and dirty hack to get around that, and everything seems okay.

I apply your patches and I get a panic the first time I read sysfs

All changes in my patches are touching these routines:

cpufreq_set_policy()

__cpufreq_governor()

These two don't get called in the read path at all, so I am not sure how they can make things worse for you.

I had doubts about just one change. Can you please check the v2 branch to see whether it gives the same problem?

Yep, trying it now ...

P.

--
Dipl.-Inf. Robert Schoene
Computer Scientist - R&D Energy Efficient Computing
Technische Universitaet Dresden
Center for Information Services and High Performance Computing
Distributed and Data Intensive Computing
01062 Dresden
Tel.: +49 (351) 463-42483
Fax : +49 (351) 463-37773
E-Mail: Robert.Schoene@tu-dresden.de

Viresh Kumar

12:39 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 17:31, Robert Schöne robert.schoene@tu-dresden.de wrote:

...

With v2, my system still crashes when setting the governors concurrently. I wasn't able to get a stack trace.

Are you sure it's the same crash, or has some other bug crept in? Even a .jpg of the crash would be helpful.

Robert Schöne

1:04 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

The crash results in a lot of output. My remote keyboard-video-mouse (KVM) equipment is not fast enough to capture it, as I keep getting numerous follow-up errors at regular intervals. However, you can get the same error if you run my test script on a multi-core system.

Robert

On Friday, 10.10.2014, at 18:09 +0530, Viresh Kumar wrote:

...

On 10 October 2014 17:31, Robert Schöne robert.schoene@tu-dresden.de wrote:

...
With v2, my system still crashes when setting the governors concurrently. I wasn't able to get a stack trace.

Are you sure it's the same crash, or has some other bug crept in? Even a .jpg of the crash would be helpful.

Prarit Bhargava

1:18 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

Robert Schöne

1:23 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

I finally got a stack:

Oct 10 15:19:44 basti kernel: [ 395.641363] BUG: unable to handle kernel paging request at ffff8800b2783b10
Oct 10 15:19:44 basti kernel: [ 395.641412] IP: [<ffff8800b2783b10>] 0xffff8800b2783b10
Oct 10 15:19:44 basti kernel: [ 395.641449] PGD 1fc9067 PUD 1fce067 PMD b2737063 PTE 80000000b2783163
Oct 10 15:19:44 basti kernel: [ 395.641503] Oops: 0011 [#1] SMP
Oct 10 15:19:44 basti kernel: [ 395.641533] Modules linked in: sep3_15(OE) pax(OE) nfsv3(E) rfcomm(E) bnep(E) bluetooth(E) nfsd(E) auth_rpcgss(E) binfmt_misc(E) nfs_acl(E) nfs(E) lockd(E) sunrpc(E) fscache(E) snd_hda_codec_hdmi(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) x86_adapt_driver(OE) snd_hda_intel(E) snd_hda_controller(E) i915(E) kvm_intel(E) snd_hda_codec(E) snd_hwdep(E) kvm(E) snd_pcm(E) crct10dif_pclmul(E) snd_timer(E) mei_me(E) i2c_algo_bit(E) video(E) drm_kms_helper(E) mei(E) drm(E) crc32_pclmul(E) ghash_clmulni_intel(E) ppdev(E) aesni_intel(E) lp(E) parport_pc(E) parport(E) snd(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) gpio_ich(E) cryptd(E) soundcore(E) mac_hid(E) lpc_ich(E) serio_raw(E) tpm_infineon(E) psmouse(E) ahci(E) libahci(E) e1000e(E) ptp(E) pps_core(E)
Oct 10 15:19:44 basti kernel: [ 395.642169] CPU: 6 PID: 3079 Comm: tee Tainted: G OE 3.17.0+ #3
Oct 10 15:19:44 basti kernel: [ 395.642209] Hardware name: FUJITSU ESPRIMO P700/D3061-A1, BIOS V4.6.4.0 R1.12.0 for D3061-A1x 07/04/2011
Oct 10 15:19:44 basti kernel: [ 395.642262] task: ffff88022bb96400 ti: ffff880227ea4000 task.ti: ffff880227ea4000
Oct 10 15:19:44 basti kernel: [ 395.642303] RIP: 0010:[<ffff8800b2783b10>] [<ffff8800b2783b10>] 0xffff8800b2783b10
Oct 10 15:19:44 basti kernel: [ 395.642352] RSP: 0018:ffff880227ea7b78 EFLAGS: 00010293
Oct 10 15:19:44 basti kernel: [ 395.642383] RAX: ffff88022ba5d340 RBX: 0000000000000000 RCX: 0000000000000000
Oct 10 15:19:44 basti kernel: [ 395.642423] RDX: ffff8802314f0a08 RSI: 0000000000000100 RDI: 0000000000000000
Oct 10 15:19:44 basti kernel: [ 395.642463] RBP: ffff880227ea7ba8 R08: ffff8802314f0a00 R09: 0000000000000004
Oct 10 15:19:44 basti kernel: [ 395.642503] R10: ffffffff81d1a660 R11: 0000000000000246 R12: 0000000000000000
Oct 10 15:19:44 basti kernel: [ 395.642543] R13: ffff8802314f0a00 R14: ffff88022ba5d240 R15: 0000000000000002
Oct 10 15:19:44 basti kernel: [ 395.642583] FS: 00002b1d0969ab80(0000) GS:ffff88023e380000(0000) knlGS:0000000000000000
Oct 10 15:19:44 basti kernel: [ 395.642629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 10 15:19:44 basti kernel: [ 395.642662] CR2: ffff8800b2783b10 CR3: 00000000b4e89000 CR4: 00000000000407e0
Oct 10 15:19:44 basti kernel: [ 395.642717] ffffffff815e1918 ffff88022ba5d240 ffff8802314f0a00 0000000000000003
Oct 10 15:19:44 basti kernel: [ 395.642773] 0000000000000002 ffff88023e210ea0 ffff880227ea7c28 ffffffff815e1c06
Oct 10 15:19:44 basti kernel: [ 395.642828] 0000000000010aa8 ffff88022ba5d258 ffffffff81cd9ab0 ffff88023e210ea0
Oct 10 15:19:44 basti kernel: [ 395.642885] Call Trace:
Oct 10 15:19:44 basti kernel: [ 395.642906] [<ffffffff815e1918>] ? gov_queue_work+0x68/0xd0
Oct 10 15:19:44 basti kernel: [ 395.642941] [<ffffffff815e1c06>] cpufreq_governor_dbs+0x286/0x740
Oct 10 15:19:44 basti kernel: [ 395.642980] [<ffffffff815dfd87>] od_cpufreq_governor_dbs+0x17/0x20
Oct 10 15:19:44 basti kernel: [ 395.643017] [<ffffffff815dc03f>] __cpufreq_governor+0xdf/0x270
Oct 10 15:19:44 basti kernel: [ 395.643089] [<ffffffff815dcd56>] store_scaling_governor+0x96/0xf0
Oct 10 15:19:44 basti kernel: [ 395.643166] [<ffffffff815db989>] store+0x79/0xc0
Oct 10 15:19:44 basti kernel: [ 395.643232] [<ffffffff81247c90>] kernfs_fop_write+0xe0/0x160
Oct 10 15:19:44 basti kernel: [ 395.643300] [<ffffffff811d2bd6>] SyS_write+0x46/0xb0
Oct 10 15:19:44 basti kernel: [ 395.643363] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 de 84 2b 02 88 ff ff 80 6a c4 81 ff ff ff ff 40 b5 9c 2b 02 88 ff ff <d0> 35 78 b2 00 88 ff ff 80 be 08 81 ff ff ff ff 00 00 00 00 00
Oct 10 15:19:44 basti kernel: [ 395.643772] RSP <ffff880227ea7b78>
Oct 10 15:19:44 basti kernel: [ 395.643817] ---[ end trace c4d3fedcdd4b353b ]---

On Friday, 10.10.2014, at 18:09 +0530, Viresh Kumar wrote:

...

On 10 October 2014 17:31, Robert Schöne robert.schoene@tu-dresden.de wrote:

...
With v2, my system still crashes when setting the governors concurrently. I wasn't able to get a stack trace.

Are you sure it's the same crash, or has some other bug crept in? Even a .jpg of the crash would be helpful.

Viresh Kumar

1:52 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 18:53, Robert Schöne robert.schoene@tu-dresden.de wrote:

...

I finally got a stack:

Thanks a lot..

...

Oct 10 15:19:44 basti kernel: [ 395.641363] BUG: unable to handle kernel paging request at ffff8800b2783b10 Oct 10 15:19:44 basti kernel: [ 395.641412] IP: [<ffff8800b2783b10>] 0xffff8800b2783b10 Oct 10 15:19:44 basti kernel: [ 395.641449] PGD 1fc9067 PUD 1fce067 PMD b2737063 PTE 80000000b2783163 Oct 10 15:19:44 basti kernel: [ 395.641503] Oops: 0011 [#1] SMP Oct 10 15:19:44 basti kernel: [ 395.641533] Modules linked in: sep3_15(OE) pax(OE) nfsv3(E) rfcomm(E) bnep(E) bluetooth(E) nfsd(E) auth_rpcgss(E) binfmt_misc(E) nfs_acl(E) nfs(E) lockd(E) sunrpc(E) fscache(E) snd_hda_codec_hdmi(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) x86_adapt_driver(OE) snd_hda_intel(E) snd_hda_controller(E) i915(E) kvm_intel(E) snd_hda_codec(E) snd_hwdep(E) kvm(E) snd_pcm(E) crct10dif_pclmul(E) snd_timer(E) mei_me(E) i2c_algo_bit(E) video(E) drm_kms_helper(E) mei(E) drm(E) crc32_pclmul(E) ghash_clmulni_intel(E) ppdev(E) aesni_intel(E) lp(E) parport_pc(E) parport(E) snd(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) gpio_ich(E) cryptd(E) soundcore(E) mac_hid(E) lpc_ich(E) serio_raw(E) tpm_infineon(E) psmouse(E) ahci(E) libahci(E) e1000e(E) ptp(E) pps_core(E) Oct 10 15:19:44 basti kernel: [ 395.642169] CPU: 6 PID: 3079 Comm: tee Tainted: G OE 3.17.0+ #3 Oct 10 15:19:44 basti kernel: [ 395.642209] Hardware name: FUJITSU ESPRIMO P700/D3061-A1, BIOS V4.6.4.0 R1.12.0 for D3061-A1x 07/04/2011 Oct 10 15:19:44 basti kernel: [ 395.642262] task: ffff88022bb96400 ti: ffff880227ea4000 task.ti: ffff880227ea4000 Oct 10 15:19:44 basti kernel: [ 395.642303] RIP: 0010:[<ffff8800b2783b10>] [<ffff8800b2783b10>] 0xffff8800b2783b10 Oct 10 15:19:44 basti kernel: [ 395.642352] RSP: 0018:ffff880227ea7b78 EFLAGS: 00010293 Oct 10 15:19:44 basti kernel: [ 395.642383] RAX: ffff88022ba5d340 RBX: 0000000000000000 RCX: 0000000000000000 Oct 10 15:19:44 basti kernel: [ 395.642423] RDX: ffff8802314f0a08 RSI: 0000000000000100 
RDI: 0000000000000000 Oct 10 15:19:44 basti kernel: [ 395.642463] RBP: ffff880227ea7ba8 R08: ffff8802314f0a00 R09: 0000000000000004 Oct 10 15:19:44 basti kernel: [ 395.642503] R10: ffffffff81d1a660 R11: 0000000000000246 R12: 0000000000000000 Oct 10 15:19:44 basti kernel: [ 395.642543] R13: ffff8802314f0a00 R14: ffff88022ba5d240 R15: 0000000000000002 Oct 10 15:19:44 basti kernel: [ 395.642583] FS: 00002b1d0969ab80(0000) GS:ffff88023e380000(0000) knlGS:0000000000000000 Oct 10 15:19:44 basti kernel: [ 395.642629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Oct 10 15:19:44 basti kernel: [ 395.642662] CR2: ffff8800b2783b10 CR3: 00000000b4e89000 CR4: 00000000000407e0 Oct 10 15:19:44 basti kernel: [ 395.642717] ffffffff815e1918 ffff88022ba5d240 ffff8802314f0a00 0000000000000003 Oct 10 15:19:44 basti kernel: [ 395.642773] 0000000000000002 ffff88023e210ea0 ffff880227ea7c28 ffffffff815e1c06 Oct 10 15:19:44 basti kernel: [ 395.642828] 0000000000010aa8 ffff88022ba5d258 ffffffff81cd9ab0 ffff88023e210ea0 Oct 10 15:19:44 basti kernel: [ 395.642885] Call Trace: Oct 10 15:19:44 basti kernel: [ 395.642906] [<ffffffff815e1918>] ? 
gov_queue_work+0x68/0xd0 Oct 10 15:19:44 basti kernel: [ 395.642941] [<ffffffff815e1c06>] cpufreq_governor_dbs+0x286/0x740 Oct 10 15:19:44 basti kernel: [ 395.642980] [<ffffffff815dfd87>] od_cpufreq_governor_dbs+0x17/0x20 Oct 10 15:19:44 basti kernel: [ 395.643017] [<ffffffff815dc03f>] __cpufreq_governor+0xdf/0x270 Oct 10 15:19:44 basti kernel: [ 395.643089] [<ffffffff815dcd56>] store_scaling_governor+0x96/0xf0 Oct 10 15:19:44 basti kernel: [ 395.643166] [<ffffffff815db989>] store+0x79/0xc0 Oct 10 15:19:44 basti kernel: [ 395.643232] [<ffffffff81247c90>] kernfs_fop_write+0xe0/0x160 Oct 10 15:19:44 basti kernel: [ 395.643300] [<ffffffff811d2bd6>] SyS_write+0x46/0xb0 Oct 10 15:19:44 basti kernel: [ 395.643363] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 de 84 2b 02 88 ff ff 80 6a c4 81 ff ff ff ff 40 b5 9c 2b 02 88 ff ff <d0> 35 78 b2 00 88 ff ff 80 be 08 81 ff ff ff ff 00 00 00 00 00 Oct 10 15:19:44 basti kernel: [ 395.643772] RSP <ffff880227ea7b78> Oct 10 15:19:44 basti kernel: [ 395.643817] ---[ end trace c4d3fedcdd4b353b ]---

Okay, this is something new..

Would it be possible for you to find the line in the cpufreq_governor.c file that caused this? You can use objdump for that.

This is how I do it on ARM:

arm-linux-gnueabihf-objdump -r -S -l --disassemble cpufreq_governor.o | less

If you are on x86, simply use objdump; otherwise your toolchain will have an equivalent command.

Then search for gov_queue_work in the output, confirm that the length of the routine is 0xd0 (this comes from gov_queue_work+0x68/0xd0), and tell us what is at offset 0x68.
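
The lookup above boils down to a little hex arithmetic, which could be sketched as a small shell function. The helper name resolve_oops_offset is hypothetical, and the start address given to it is whatever objdump prints next to the symbol in your own listing; the 0x260 used in the usage line below is only an example value.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): map a trace entry of the form
# "symbol+0xOFFSET/0xLENGTH" to an address in the objdump listing, given the
# address at which the symbol starts in the object file.
resolve_oops_offset() {
    local sym_start="$1"
    local entry="$2"
    local sym="${entry%%+*}"     # e.g. gov_queue_work
    local rest="${entry#*+}"     # e.g. 0x68/0xd0
    local off="${rest%%/*}"      # e.g. 0x68
    local len="${rest#*/}"       # e.g. 0xd0

    # Sanity check: the offset must lie inside the routine.
    if [ "$(( $off >= $len ))" -eq 1 ]; then
        echo "offset $off is outside $sym (length $len)" >&2
        return 1
    fi

    # Address to search for in the disassembly.
    printf '0x%x\n' "$(( $sym_start + $off ))"
}
```

For example, resolve_oops_offset 0x260 'gov_queue_work+0x68/0xd0' prints 0x2c8, which is the address to search for in the disassembly.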

Sorry if you already knew all this; I am sharing it just for completeness :)

If you run into any difficulty with this, just attach the file and send it to me, and let me know which compiler and system architecture you used.

I tried running your script on a single-cluster system (sorry, I couldn't get the other board up yet :( ) and couldn't hit this issue.

Once again, thanks for testing.

-- viresh

Robert Schöne

2:05 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

@all: I have to leave now and will not be available for a week.

@Viresh: The line you are looking for is 2c8 (260h+68h, length check passed). Here it is with the surrounding instructions:

static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data, unsigned int delay)
{
	struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 2c0:   49 8b 06                mov    (%r14),%rax
 2c3:   89 df                   mov    %ebx,%edi
 2c5:   ff 50 20                callq  *0x20(%rax)
/fastfs/rschoene/linux-git/drivers/cpufreq/cpufreq_governor.c:168

	mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
 2c8:   48 8b 35 00 00 00 00    mov    0x0(%rip),%rsi        # 2cf <gov_queue_work+0x6f>
                        2cb: R_X86_64_PC32      system_wq-0x4
 2cf:   48 8d 50 30             lea    0x30(%rax),%rdx
 2d3:   4c 89 f9                mov    %r15,%rcx
 2d6:   89 df                   mov    %ebx,%edi
 2d8:   e8 00 00 00 00          callq  2dd <gov_queue_work+0x7d>
                        2d9: R_X86_64_PC32      mod_delayed_work_on-0x4
cpumask_next():
/fastfs/rschoene/linux-git/include/linux/cpumask.h:182 (discriminator 1)
 2dd:   41 83 c4 01             add    $0x1,%r12d
 2e1:   be 00 01 00 00          mov    $0x100,%esi
 2e6:   4c 89 ef                mov    %r13,%rdi
 2e9:   49 63 d4                movslq %r12d,%rdx
 2ec:   e8 00 00 00 00          callq  2f1 <gov_queue_work+0x91>
                        2ed: R_X86_64_PC32      find_next_bit-0x4
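The arithmetic behind "2c8 (260h+68h, length check passed)" can be sketched as a small helper. This is only an illustration: the function names parse_frame and locate are made up for this example, and the symbol start (0x260) and size (0xd0) are the values reported in this thread, which you would normally read from the objdump output yourself.

```python
import re


def parse_frame(frame: str):
    """Split a trace entry like 'symbol+0xOFF/0xLEN' into (symbol, offset, length)."""
    m = re.fullmatch(r"([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)", frame.strip())
    if m is None:
        raise ValueError(f"not a symbol+offset/length entry: {frame!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def locate(frame: str, symbol_start: int, symbol_size: int) -> int:
    """Return the address inside the object file that the trace entry points at.

    symbol_start and symbol_size come from the disassembly of the object file;
    the size is compared with the length in the trace as a sanity check that
    the object file matches the running kernel.
    """
    _symbol, offset, length = parse_frame(frame)
    if symbol_size != length:
        raise ValueError("routine length differs from the trace; wrong object file?")
    if offset >= length:
        raise ValueError("offset lies outside the routine")
    return symbol_start + offset


# Values from this thread: gov_queue_work starts at 0x260 and is 0xd0 bytes long.
addr = locate("gov_queue_work+0x68/0xd0", 0x260, 0xd0)
print(hex(addr))  # 0x2c8
```

The length check is what makes the lookup trustworthy: if the routine in the object file is not 0xd0 bytes long, the file was built from different sources or with a different configuration than the kernel that produced the trace.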


Viresh Kumar

14 Oct 14 Oct

6:58 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10 October 2014 19:35, Robert Schöne robert.schoene@tu-dresden.de wrote:

...

@all: I have to leave now and will not be available for a week.

@Viresh: The line you are looking for is 2c8 (260h+68h, length check passed). Here it is with the surrounding instructions:

Thanks..

I now understand most of the races you and Prarit have reported. Finally I was able to get my multi-cluster board up and could test this myself :)

Please try my cpufreq/governor-fixes-v4 branch and confirm whether it fixes your issues.

@Prarit: As Robert probably isn't around this week, would it be possible for you to test this?

I will send this as a patchset so that people can review and comment.

-- viresh

Prarit Bhargava

11:42 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/14/2014 02:58 AM, Viresh Kumar wrote:

...

On 10 October 2014 19:35, Robert Schöne robert.schoene@tu-dresden.de wrote:

...
@all: I have to leave now and will not be available for a week.

@Viresh: The line you are looking for is 2c8 (260h+68h, length check passed). Here it is with the surrounding instructions:

Thanks..

I now understand most of the races you and Prarit have reported. Finally I was able to get my multi-cluster board up and could test this myself :)

So you need to try my cpufreq/governor-fixes-v4 branch to confirm if this fixes your issues or not.

@Prarit: As Robert probably isn't around this week, would it be possible for you to test this stuff ?

Hi Viresh,

I've been running both my test and Robert's test for about 5 mins. In Robert's case I don't see any problems ... in my case I do occasionally get a system panic because of the sysfs access race I described in the other thread (cpu 1 holds a sysfs file open, while cpu 2 changes the governor ...)

I do have some concerns about the nature of this patchset; I feel it is more of a band-aid approach to the whole cpufreq mechanism. Having said that, I haven't offered an alternative yet so I can't really object too loudly :)

I'll do a more formal review when you post to the list.

...

I will send this as a patchset so that people can review and comment.

-- viresh

Prarit Bhargava

5:12 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/14/2014 07:42 AM, Prarit Bhargava wrote:

...

On 10/14/2014 02:58 AM, Viresh Kumar wrote:

...
On 10 October 2014 19:35, Robert Schöne robert.schoene@tu-dresden.de wrote:

...
@all: I have to leave now and will not be available for a week.

@Viresh: The line you are looking for is 2c8 (260h+68h, length check passed). Here it is with the surrounding instructions:

Thanks..

I now understand most of the races you and Prarit have reported. Finally I was able to get my multi-cluster board up and could test this myself :)

So you need to try my cpufreq/governor-fixes-v4 branch to confirm if this fixes your issues or not.

@Prarit: As Robert probably isn't around this week, would it be possible for you to test this stuff ?

Hi Viresh,

I've been running both my test and Robert's test for about 5 mins. In Robert's case I don't see any problems ... in my case I do occasionally get a system panic because of the sysfs access race I described in the other thread (cpu 1 holds a sysfs file open, while cpu 2 changes the governor ...)

I do have some concerns about the nature of this patchset; I feel it is more of a band-aid approach to the whole cpufreq mechanism. Having said that, I haven't offered an alternative yet so I can't really object too loudly :)

I'll do a more formal review when you post to the list.

I spoke too soon :( On a larger system (128 processors, 64 cores, two threads each) the system locks up in about 1 minute using Robert's test. The

[ 2484.634827] NMI watchdog: BUG: soft lockup - CPU#31 stuck for 22s! [tee:34538]
[ 2484.634827] Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul iTCO_wdt ioatdma ptp glue_helper sb_edac iTCO_vendor_support ablk_helper pps_core lpc_ich edac_core dca cryptd mfd_core shpchp pcspkr i2c_i801 ipmi_si ipmi_msghandler wmi nfsd acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 2484.634850] CPU: 31 PID: 34538 Comm: tee Tainted: G L 3.17.0+ #10
[ 2484.634851] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
[ 2484.634851] task: ffff881010376c80 ti: ffff880804938000 task.ti: ffff880804938000
[ 2484.634852] RIP: 0010:[<ffffffff814e65dc>] [<ffffffff814e65dc>] __cpufreq_governor+0x6c/0x2c0
[ 2484.634855] RSP: 0018:ffff88080493bc68 EFLAGS: 00000246
[ 2484.634856] RAX: 0000000000000001 RBX: ffffffff8165a622 RCX: 0000000000262988
[ 2484.634857] RDX: 0000000000000000 RSI: ffffffff81a72960 RDI: ffff88100db9b400
[ 2484.634857] RBP: ffff88080493bc90 R08: 0000000000000000 R09: 0000000000124f80
[ 2484.634858] R10: 0000000000262988 R11: 0000000000000246 R12: ffff88080493bcd8
[ 2484.634858] R13: ffffffff813a0c22 R14: ffff88080493bbe0 R15: ffff88080490f518
[ 2484.634859] FS: 00007f8045e7f740(0000) GS:ffff88081f060000(0000) knlGS:0000000000000000
[ 2484.634860] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2484.634860] CR2: 000000000080b108 CR3: 000000080e86f000 CR4: 00000000001407e0
[ 2484.634861] Stack:
[ 2484.634861] ffff88080493bcd8 ffff88100db9b400 0000000000000000 ffffffff81a72960
[ 2484.634862] ffff88100db9b400 ffff88080493bcc8 ffffffff814e6a33 ffff88100db9b400
[ 2484.634863] ffff88080d0c5430 0000000000000009 0000000000000009 ffff88100db9b400
[ 2484.634865] Call Trace:
[ 2484.634865] [<ffffffff814e6a33>] cpufreq_set_policy+0x203/0x310
[ 2484.634867] [<ffffffff814e6e1d>] store_scaling_governor+0xad/0xf0
[ 2484.634869] [<ffffffff814e6d30>] ? cpufreq_update_policy+0x1f0/0x1f0
[ 2484.634872] [<ffffffff810b5500>] ? add_wait_queue_exclusive+0x20/0x50
[ 2484.634873] [<ffffffff814e5899>] store+0x79/0xc0
[ 2484.634875] [<ffffffff8126197d>] sysfs_kf_write+0x3d/0x50
[ 2484.634876] [<ffffffff81260ec0>] kernfs_fop_write+0xe0/0x160
[ 2484.634878] [<ffffffff811e9a67>] vfs_write+0xb7/0x1f0
[ 2484.634879] [<ffffffff811ea685>] SyS_write+0x55/0xd0
[ 2484.634881] [<ffffffff8165c8e9>] system_call_fastpath+0x16/0x1b
[ 2484.634883] Code: 05 3b 87 5c 00 04 0f 85 50 02 00 00 0f 1f 00 48 8b 05 71 35 a2 00 0f b6 50 10 83 e2 08 eb 08 0f b6 43 64 84 c0 74 10 84 d2 75 f4 <48> 8b 43 50 0f b6 40 50 84 c0 75 f0 48 c7 c7 60 27 a7 81 e8 1c

Viresh Kumar

16 Oct 16 Oct

10:58 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 14 October 2014 22:42, Prarit Bhargava prarit@redhat.com wrote:

...

I spoke too soon :( On a larger system (128 processors, 64 cores, two threads each) the system locks up in about 1 minute using Robert's test. The

...

Not sure what's going on here. It would be better if you could decode entries like this one when reporting bugs:

__cpufreq_governor+0x6c/0x2c0

so that we know which part of the code went wrong.

Prarit Bhargava

17 Oct 17 Oct

12:12 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/16/2014 06:58 AM, Viresh Kumar wrote:

...

On 14 October 2014 22:42, Prarit Bhargava prarit@redhat.com wrote:

...
I spoke too soon :( On a larger system (128 processors, 64 cores, two threads each) the system locks up in about 1 minute using Robert's test. The

:(

...
Not sure what's going on here.. Better would be if you can decode things like this while reporting bugs:

__cpufreq_governor+0x6c/0x2c0

/home/linux/drivers/cpufreq/cpufreq.c: 119
0xffffffff814e65dc <__cpufreq_governor+0x6c>: mov 0x50(%rbx),%rax
0xffffffff814e65e0 <__cpufreq_governor+0x70>: movzbl 0x50(%rax),%eax

bool is_governor_busy(struct cpufreq_policy *policy)
{
	if (have_governor_per_policy())
		return policy->governor_busy;
	else
		return policy->governor->governor_busy; <<< this line?
}
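One way to cross-check a decode like this is against the "Code:" line of the same report, where the byte wrapped in <..> is the first byte at the reported RIP. The sketch below is a userspace illustration only; the helper name bytes_at_rip is invented for this example, and the input is the Code: line from the soft lockup report in this thread.

```python
def bytes_at_rip(code_line: str) -> bytes:
    """Return the instruction bytes starting at the <xx> marker of a 'Code:' line."""
    tokens = code_line.split()
    if tokens and tokens[0] == "Code:":
        tokens = tokens[1:]
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            # Strip the marker and keep everything from this byte onwards.
            tokens[i] = tok[1:-1]
            return bytes(int(t, 16) for t in tokens[i:])
    raise ValueError("no <xx> marker found in Code: line")


code = (
    "Code: 05 3b 87 5c 00 04 0f 85 50 02 00 00 0f 1f 00 48 8b 05 71 35 a2 00 "
    "0f b6 50 10 83 e2 08 eb 08 0f b6 43 64 84 c0 74 10 84 d2 75 f4 "
    "<48> 8b 43 50 0f b6 40 50 84 c0 75 f0 48 c7 c7 60 27 a7 81 e8 1c"
)
at_rip = bytes_at_rip(code)
print(at_rip[:8].hex(" "))  # 48 8b 43 50 0f b6 40 50
```

Here 48 8b 43 50 is the encoding of mov 0x50(%rbx),%rax and 0f b6 40 50 is movzbl 0x50(%rax),%eax, which agrees with the two instructions listed at __cpufreq_governor+0x6c and +0x70.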

...

So that we know what part of code screwed it up..

Viresh Kumar

16 Oct 16 Oct

10:57 a.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 14 October 2014 17:12, Prarit Bhargava prarit@redhat.com wrote:

...

I've been running both my test and Robert's test for about 5 mins. In Robert's case I don't see any problems ... in my case I do occasionally get a system panic because of the sysfs access race I described in the other thread (cpu 1 holds a sysfs file open, while cpu 2 changes the governor ...)

Can you give me the exact script? I wasn't able to reproduce it.

...

I do have some concerns about the nature of this patchset; I feel it is more of a band-aid approach to the whole cpufreq mechanism. Having said that, I haven't offered an alternative yet so I can't really object too loudly :)

Yes and no. Some part of it is indeed required. For example, checking for a valid operation must be performed in __cpufreq_governor(). It's not just called from cpufreq_set_policy() but from other places as well, so accepting STOP from one thread and maybe START from another will always be a problem.

About serializing calls to __cpufreq_governor(): yes, a lock would be a better fix. But we have a long-standing issue with that, and I am not able to generate the lockdep report for some reason now.
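To make the two ideas concrete (validating the requested operation, and serializing callers with a lock), here is a toy userspace model. It is not the kernel code and not the code in the governor-fixes branch: the class GovernorModel, its request() method and the reduction to just START and STOP are assumptions made for illustration.

```python
import threading


class GovernorModel:
    """Toy model of a governor that accepts only valid START/STOP requests."""

    def __init__(self):
        self._lock = threading.Lock()  # serializes all callers
        self.started = False

    def request(self, event: str) -> bool:
        """Apply START or STOP; return False if it is invalid in the current state."""
        with self._lock:
            if event == "START" and not self.started:
                self.started = True
                return True
            if event == "STOP" and self.started:
                self.started = False
                return True
            # A second START, or a STOP on a stopped governor, is rejected.
            return False


gov = GovernorModel()
results = [gov.request(e) for e in ("START", "START", "STOP", "STOP")]
print(results)  # [True, False, True, False]
```

The state check rejects an out-of-order request no matter which caller sends it, while the lock makes sure the check and the state change happen as one step when two threads race.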

-- viresh

Prarit Bhargava

17 Oct 17 Oct

12:09 p.m.

New subject: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

On 10/16/2014 06:57 AM, Viresh Kumar wrote:

...

On 14 October 2014 17:12, Prarit Bhargava prarit@redhat.com wrote:

...
I've been running both my test and Robert's test for about 5 mins. In Robert's case I don't see any problems ... in my case I do occasionally get a system panic because of the sysfs access race I described in the other thread (cpu 1 holds a sysfs file open, while cpu 2 changes the governor ...)

Can you give me the exact script? I wasn't able to reproduce it.

#!/bin/bash
