Re: [PATCH 1/2] cpufreq: serialize calls to __cpufreq_governor()

16 Oct 2014


On 14 October 2014 22:42, Prarit Bhargava <prarit@redhat.com> wrote:
...
I spoke too soon :(  On a larger system (128 processors, 64 cores, two threads
each), the system locks up in about 1 minute using Robert's test.  The
:(
...
[ 2484.634827] NMI watchdog: BUG: soft lockup - CPU#31 stuck for 22s! [tee:34538]
[ 2484.634827] Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver
nfs fscache cfg80211 rfkill x86_pkg_temp_thermal intel_powerclamp coretemp
kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
aesni_intel lrw igb gf128mul iTCO_wdt ioatdma ptp glue_helper sb_edac
iTCO_vendor_support ablk_helper pps_core lpc_ich edac_core dca cryptd mfd_core
shpchp pcspkr i2c_i801 ipmi_si ipmi_msghandler wmi nfsd acpi_cpufreq auth_rpcgss
nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif
crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit
drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata
i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 2484.634850] CPU: 31 PID: 34538 Comm: tee Tainted: G             L 3.17.0+ #10
[ 2484.634851] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS
RMLSDP.86I.00.29.D696.1311111329 11/11/2013
[ 2484.634851] task: ffff881010376c80 ti: ffff880804938000 task.ti:
ffff880804938000
[ 2484.634852] RIP: 0010:[<ffffffff814e65dc>]  [<ffffffff814e65dc>]
__cpufreq_governor+0x6c/0x2c0
[ 2484.634855] RSP: 0018:ffff88080493bc68  EFLAGS: 00000246
[ 2484.634856] RAX: 0000000000000001 RBX: ffffffff8165a622 RCX: 0000000000262988
[ 2484.634857] RDX: 0000000000000000 RSI: ffffffff81a72960 RDI: ffff88100db9b400
[ 2484.634857] RBP: ffff88080493bc90 R08: 0000000000000000 R09: 0000000000124f80
[ 2484.634858] R10: 0000000000262988 R11: 0000000000000246 R12: ffff88080493bcd8
[ 2484.634858] R13: ffffffff813a0c22 R14: ffff88080493bbe0 R15: ffff88080490f518
[ 2484.634859] FS:  00007f8045e7f740(0000) GS:ffff88081f060000(0000)
knlGS:0000000000000000
[ 2484.634860] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2484.634860] CR2: 000000000080b108 CR3: 000000080e86f000 CR4: 00000000001407e0
[ 2484.634861] Stack:
[ 2484.634861]  ffff88080493bcd8 ffff88100db9b400 0000000000000000
ffffffff81a72960
[ 2484.634862]  ffff88100db9b400 ffff88080493bcc8 ffffffff814e6a33
ffff88100db9b400
[ 2484.634863]  ffff88080d0c5430 0000000000000009 0000000000000009
ffff88100db9b400
[ 2484.634865] Call Trace:
[ 2484.634865]  [<ffffffff814e6a33>] cpufreq_set_policy+0x203/0x310
[ 2484.634867]  [<ffffffff814e6e1d>] store_scaling_governor+0xad/0xf0
[ 2484.634869]  [<ffffffff814e6d30>] ? cpufreq_update_policy+0x1f0/0x1f0
[ 2484.634872]  [<ffffffff810b5500>] ? add_wait_queue_exclusive+0x20/0x50
[ 2484.634873]  [<ffffffff814e5899>] store+0x79/0xc0
[ 2484.634875]  [<ffffffff8126197d>] sysfs_kf_write+0x3d/0x50
[ 2484.634876]  [<ffffffff81260ec0>] kernfs_fop_write+0xe0/0x160
[ 2484.634878]  [<ffffffff811e9a67>] vfs_write+0xb7/0x1f0
[ 2484.634879]  [<ffffffff811ea685>] SyS_write+0x55/0xd0
[ 2484.634881]  [<ffffffff8165c8e9>] system_call_fastpath+0x16/0x1b
[ 2484.634883] Code: 05 3b 87 5c 00 04 0f 85 50 02 00 00 0f 1f 00 48 8b 05 71 35
a2 00 0f b6 50 10 83 e2 08 eb 08 0f b6 43 64 84 c0 74 10 84 d2 75 f4 <48> 8b 43
50 0f b6 40 50 84 c0 75 f0 48 c7 c7 60 27 a7 81 e8 1c
I'm not sure what's going on here. It would be better if you could decode
offsets like this one when reporting bugs:
__cpufreq_governor+0x6c/0x2c0
so that we know which part of the code is at fault.
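
As a rough illustration of the kind of decoding meant here, the sketch below
parses a "symbol+0xoffset/0xsize" trace entry, derives the function's start
address from the RIP value in the log, and builds a gdb "list" command that
could be run against a vmlinux carrying debug info for the same build. The
helper names (parse_trace_entry, function_start, gdb_list_command) are
hypothetical, and the availability of a matching vmlinux is an assumption;
nothing here was run against the reporter's kernel.

```python
import re

# Matches the "symbol+0xoffset/0xsize" form printed in kernel oops traces.
TRACE_RE = re.compile(
    r"(?P<symbol>[A-Za-z_.][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_trace_entry(text):
    """Return (symbol, offset, size) for the first trace entry found in text."""
    match = TRACE_RE.search(text)
    if match is None:
        raise ValueError("no symbol+offset/size entry in: %r" % text)
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


def function_start(rip, offset):
    """The function begins 'offset' bytes before the faulting instruction."""
    return rip - offset


def gdb_list_command(symbol, offset):
    """Build a gdb command that lists the source around symbol+offset.

    Assumes gdb is started on a vmlinux with debug info for this exact build.
    """
    return "list *({}+{:#x})".format(symbol, offset)


if __name__ == "__main__":
    # Values copied from the RIP line of the soft lockup report above.
    symbol, offset, size = parse_trace_entry("__cpufreq_governor+0x6c/0x2c0")
    print(gdb_list_command(symbol, offset))
    print(hex(function_start(0xFFFFFFFF814E65DC, offset)))
```

Pasting the generated "list" line into gdb should point at the source line
being executed when the watchdog fired, which is the information a bug report
like this one is missing.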
