Hi Guys,
We are discussing a bug reported by Duncan: his cpu/cpu*/cpufreq directories are getting corrupted with 3.9-rc3, whereas they worked well with 3.8.
https://bugzilla.kernel.org/show_bug.cgi?id=55411
On his AMD bulldozer tri-cluster/6-core system he doesn't see affected and related cpus set correctly after off-lining 1-5 and bringing them back with:
for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done
Before running above two, cpufreq-info gave: https://bugzilla.kernel.org/attachment.cgi?id=95701
And after running above it gave: https://bugzilla.kernel.org/attachment.cgi?id=95711
Clearly it got corrupted. Somehow cpu 3 showed up in related cpus field of cpu 5.
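One way to make the before/after comparison repeatable is to snapshot the sysfs view instead of reading cpufreq-info attachments by eye. The following is a minimal sketch, assuming the standard related_cpus and affected_cpus attributes under each cpuN/cpufreq directory; the function name cpufreq_snapshot and its optional root argument are hypothetical and exist only so the function can be pointed at a scratch tree.

```shell
#!/usr/bin/env bash
# Sketch: print one line per CPU with its cpufreq related_cpus and affected_cpus.
# The optional first argument is an assumption for experimentation; on a real
# system the default /sys/devices/system/cpu is what you want.

cpufreq_snapshot() {
    local root="${1:-/sys/devices/system/cpu}"
    local dir cpu related affected
    for dir in "$root"/cpu[0-9]*; do
        # -d follows symlinks, so CPUs whose cpufreq entry is a symlink are included.
        [ -d "$dir/cpufreq" ] || continue
        cpu="${dir##*/}"
        related="$(cat "$dir/cpufreq/related_cpus" 2>/dev/null)"
        affected="$(cat "$dir/cpufreq/affected_cpus" 2>/dev/null)"
        printf '%s related=[%s] affected=[%s]\n' "$cpu" "$related" "$affected"
    done
}

# Possible use around the offline/online cycle:
#   cpufreq_snapshot > before.txt
#   ... take CPUs 1-5 offline and bring them back ...
#   cpufreq_snapshot > after.txt
#   diff before.txt after.txt
```

A CPU whose cpufreq directory disappeared after the cycle simply drops out of the second snapshot, so the diff shows both missing entries and changed masks.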
I suspect the following patches are behind this:
commit fcf8058296edbc3de43adf095824fc32b067b9f8
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date: Tue Jan 29 14:39:08 2013 +0000
cpufreq: Simplify cpufreq_add_dev()
Currently cpufreq_add_dev() first allocates the policy, calls driver->init() and then checks whether this CPU is already managed. If it is already managed, its policy is freed.
We can save all this if we somehow know that CPU is managed or not in advance. policy->related_cpus contains the list of all valid sibling CPUs of policy->cpu. We can check this to see if the current CPU is already managed.
From now on, platforms don't really need to set related_cpus from their init() routines, as the same work is done by core too.
If a platform driver needs to set the related_cpus mask with some additional CPUs, other than CPUs present in policy->cpus, they are free to do it, though, as we don't override anything.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
AND
commit 643ae6e81dd65b333a13259852405fc9f764ac76
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date: Sat Jan 12 05:14:38 2013 +0000
cpufreq: Manage only online cpus
cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors.
We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
And this is the latest piece of documentation available:
SMP systems normally have same clock source for a group of cpus. For these the .init() would be called only once for the first online cpu. Here the .init() routine must initialize policy->cpus with mask of all possible cpus (Online + Offline) that share the clock. Then the core would copy this mask onto policy->related_cpus and will reset policy->cpus to carry only online cpus.
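The contract in that documentation paragraph can be illustrated with a toy model. This is not kernel code: it is a sketch that uses plain integer bitmasks (bit N stands for CPU N), and the function names core_fixup_masks and mask_to_list are hypothetical. It only mirrors the two steps the text describes: the core copies the mask set by the driver into related_cpus, then trims policy->cpus down to the CPUs that are online.

```shell
#!/usr/bin/env bash
# Toy model of the documented contract, using integer bitmasks (bit N = CPU N).
# Names are hypothetical; the real logic lives in the cpufreq core in C.

# Args: mask the driver's init() put in policy->cpus, mask of online CPUs.
# Prints "related_cpus cpus" as decimal masks.
core_fixup_masks() {
    local driver_cpus="$1" online="$2"
    local related=$(( driver_cpus ))          # core copies policy->cpus to related_cpus
    local cpus=$(( driver_cpus & online ))    # policy->cpus keeps only online CPUs
    printf '%d %d\n' "$related" "$cpus"
}

# Render a mask as a space-separated CPU list.
mask_to_list() {
    local mask="$1" cpu=0 out=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    printf '%s\n' "$out"
}

# Example: CPUs 2 and 3 share a clock (mask 12), only CPUs 0 and 2 are online (mask 5):
#   core_fixup_masks 12 5
#   mask_to_list 12
```

In this model a driver that reports only the CPU being initialized (for example mask 4 for CPU 2) ends up with a related_cpus mask that never contains its sibling, which is the kind of mismatch the thread is about.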
I looked at the acpi-cpufreq driver's driver->init() code and found that it is not yet aligned with this scheme, which is probably what is causing these failures.
I don't have enough knowledge about this driver and how it is used across x86 systems, so I want somebody else (who has some prior experience with it) to check how policy->cpus and policy->related_cpus must be set from driver->init().
-- viresh
---------- Forwarded message ----------
From: bugzilla-daemon@bugzilla.kernel.org
Date: 19 March 2013 13:19
Subject: [Bug 55411] sysfs per-cpu cpufreq subdirs/symlinks screwed up after s2ram
To: viresh.kumar@linaro.org
https://bugzilla.kernel.org/show_bug.cgi?id=55411
--- Comment #9 from Duncan 1i5t5.duncan@cox.net 2013-03-19 07:49:53 --- (In reply to comment #8)
(In reply to comment #0)
After a s2ram/resume cycle (now bad):
/sys/devices/system/cpu/cpu0/cpufreq/
/sys/devices/system/cpu/cpu1/cpufreq -> ../cpu0/cpufreq/
/sys/devices/system/cpu/cpu3/cpufreq/
/sys/devices/system/cpu/cpu5/cpufreq/
Can you try this rather than s2r:
for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done
and check the status if things are still corrupted for you?
The above doesn't corrupt anything for me, at least.
That's a nice easy test; no rebuild and reboot needed. =:^)
Tho I had to change the > to >| as I have bash noclobber set and the files obviously already exist...
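For readers who also run with noclobber set, the reproducer can be written so it works either way. This is a sketch only; the function name set_cpus_online and its root argument are hypothetical, and the root is a parameter purely so the function can be tried on a scratch directory first.

```shell
#!/usr/bin/env bash
# Sketch: toggle CPUs online/offline in a way that also works with "set -o noclobber".
# ">|" overrides noclobber; a plain ">" is refused for existing regular files,
# which is what the sysfs "online" attributes are.

set_cpus_online() {
    local value="$1" cpu_root="$2"
    shift 2
    local i
    for i in "$@"; do
        echo "$value" >| "$cpu_root/cpu$i/online"
    done
}

# Real use (as root):
#   set_cpus_online 0 /sys/devices/system/cpu 1 2 3 4 5
#   set_cpus_online 1 /sys/devices/system/cpu 1 2 3 4 5
```

Without noclobber, ">|" behaves like ">", so the same lines are safe in both configurations.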
Uncorrupted before the test, corrupted after. So just cycling the cpus off and then back online *DOES* corrupt cpufreq, thus a much simpler reproducer! =:^) Exact same ls results as the above.
And my system doesn't have S2R support for now.
My old system didn't support s2ram reliably; it would work occasionally but mostly it didn't. But s2disk was workable for awhile, until the fact that I was running mdraid and the disks didn't always return in the same sdX slots due to hardware wakeup issues complicated things, so eventually I didn't use that much either. The new system's great with s2ram, sans this bug of course; s2disk didn't work at all at first, but last time I tried it /almost/ worked so there has been improvement. But I don't like to take unnecessary chances with filesystem log replay and thankfully wall power's good enough around here that I can s2ram for a day and come back and wiggle the mouse and all's fine (with a couple pre-suspend syncs thrown into my script just in case), so I tend to use it a LOT, even more than I'd use s2disk due to the speed. =:^)
But I'd love to have s2both working reliably; for all I know it's actually working now; it was pretty close. But I prefer not to test the reiserfs log replay (even with pre-suspend syncs I worry, tho as I said reiserfs has actually been very good to me even thru faulty ram, a power supply blowing up on me, a mobo dying, etc, since 2.6.16 or whenever it was that it got ordered journaling by default) when it doesn't work, so knowing s2disk didn't work well when I tested it and with s2ram working SO well, I don't tend to test s2disk/s2both too often.
Meanwhile, thanks for the cpuinfo_cur_freq explanation. If that actually real-time touches the hardware to get the data as you say, that does explain the root privs. Maybe that bit of extra info could be added to the documentation? I could propose some new wording and open a new bug on cpu-freq/user-guide.txt for it if appropriate.
[Adding Boris and Thomas to the CC.]
On Tuesday, March 19, 2013 02:20:06 PM Viresh Kumar wrote:
I don't have enough knowledge about this driver and how is it used for all x86 systems and so want somebody else (who has some prior experience with it) to check how policy->cpus and policy->related_cpus must be set from driver->init().
OK, so what exactly do you need to now?
This has to be addressed before final 3.9 one way or another - and the sooner the better.
Thanks, Rafael
On 22 March 2013 17:47, Rafael J. Wysocki rjw@sisk.pl wrote:
OK, so what exactly do you need to now?
s/now/know ??
We just need to set policy->cpus to all cpus (online or offline) that share a clock line with the cpu for which init() is called.
The cpufreq core will then copy all those cpus into related_cpus and keep only the online cpus in policy->cpus.
Sorry if i haven't answered it well :(
On Friday, March 22, 2013 05:45:29 PM Viresh Kumar wrote:
On 22 March 2013 17:47, Rafael J. Wysocki rjw@sisk.pl wrote:
OK, so what exactly do you need to now?
s/now/know ??
Yes, sorry.
On Fri, Mar 22, 2013 at 02:12:50PM +0100, Rafael J. Wysocki wrote:
On Friday, March 22, 2013 05:45:29 PM Viresh Kumar wrote:
On 22 March 2013 17:47, Rafael J. Wysocki rjw@sisk.pl wrote:
OK, so what exactly do you need to now?
s/now/know ??
Yes, sorry.
Right, I heard about this breakage on Bulldozer and I can repro it here too :(.
Which is not good and it needs to be fixed ASAP. From looking at the patches in question, this should be reproducible on everything, though, since it is generic cpufreq code. And if so, this makes the urgency of a fix much more dire.
Thanks.
On Friday, March 22, 2013 01:17:18 PM Rafael J. Wysocki wrote:
[Adding Boris and Thomas to the CC.]
...
I don't have enough knowledge about this driver and how is it used for all x86 systems and so want somebody else (who has some prior experience with it) to check how policy->cpus and policy->related_cpus must be set from driver->init().
OK, so what exactly do you need to now?
This has to be addressed before final 3.9 this way or another - and the sooner the better.
Is this all to try to fix "cpufreq driver gets loaded while some cores were set offline before"?
I wonder how you run into "cpufreq is initialized with offlined cpus" case. I remember that there were problems, but it's nearly impossible to run into this if the cpufreq driver is loaded early at boot.
cpufreq_add_dev() and friends are complicated. Those init functions got split some time ago, and a bug slipped in even though it was a simple splitting of functions.
I do not have time to look at fcf8058296edbc3de43adf095824fc32b067b9f8 right now. I don't know how much other stuff depends on it or how severe it is on ARM that cpufreq does not correctly initialize with offlined cpus, but I would revert this patch.
There are other cpu related drivers (at least on x86) which cannot initialize correctly if the CPUs got offlined before driver init. Obviously this is not a clever thing to do.
Thomas
Hi guys,
I will answer both of your questions in this mail.
On 22 March 2013 18:23, Thomas Renninger trenn@suse.de wrote:
Is this all to try to fix "cpufreq driver gets loaded while some cores were set offline before"?
Not really. There are problems with acpi-cpufreq without that case too.
I wonder how you run into "cpufreq is initialized with offlined cpus" case. I remember that there were problems, but it's nearly impossible to run into this if the cpufreq driver is loaded early at boot.
I always thought there is a way to avoid booting all cpus by passing something on the kernel command line, which would make this an easy case to reproduce.
cpufreq_add_dev() and friends are complicated.
Not anymore, they are hugely simplified now.
Those init functions got split some time ago and there also slipped in a bug even it was simple splitting of functions.
I do not have time to look at fcf8058296edbc3de43adf095824fc32b067b9f8 right now. Don't know how much other stuff depends on it and how sever it is on ARM that cpufreq does not correctly initialize with offlined cpus..., I would revert this patch.
Let me make this clear to everybody. There isn't/shouldn't be a bug in the cpufreq core; it's just that the acpi-cpufreq driver hasn't been adapted to the changes related to affected_cpus and related_cpus.
I have never gone into coding for any non-ARM platform and am really not aware of acpi-cpufreq driver and its users. That's why i told everybody where the issue is, and they just need to fix acpi-cpufreq driver with right values of policy->cpus (affected_cpus) and everything else would work after that.
-- viresh
On Fri, Mar 22, 2013 at 07:13:33PM +0530, Viresh Kumar wrote:
I have never gone into coding for any non-ARM platform and am really not aware of acpi-cpufreq driver and its users. That's why i told everybody where the issue is, and they just need to fix acpi-cpufreq driver with right values of policy->cpus (affected_cpus) and everything else would work after that.
No, you're breaking existing drivers with your changes - nobody will fix stuff *you're* breaking.
You need to get the hardware and test your changes on *that* hardware too. Otherwise, the offending patches should be reverted until you can provide a patchset which works on everything.
Geez, the fact that I even need to explain this...
On 22 March 2013 19:24, Borislav Petkov bp@alien8.de wrote:
No, you're breaking existing drivers with your changes - nobody will fix stuff *you're* breaking.
You need to get the hardware and test your changes on *that* hardware too. Otherwise, the offending patches should be reverted until you can provide a patchset which works on everything.
Geez, the fact that I even need to explain this...
You are right in every sense and i can't defend myself on that, no excuse.
When I wrote those patches, I fixed many drivers and wasn't aware that acpi-cpufreq got broken. I only realized it with the mail from Duncan.
Then I tried to go through it and fix it, and found there is some stuff in init() which is sort of use-case dependent. I wasn't sure how exactly to fix it and so shouted for help.
But yes, I will give it another try, go through the driver in detail and see if I can fix it.
-- viresh
On Fri, Mar 22, 2013 at 07:35:19PM +0530, Viresh Kumar wrote:
But yes i will give it another try to go through the driver in details and see if i can fix it.
Right, and make sure it is a minimal fix which can go in now. If it appears to get more involved, we'd probably need to revert now and you can retry next merge window.
Thanks.
On Friday, March 22, 2013 07:13:33 PM Viresh Kumar wrote:
Hi guys,
I will answer to questions of both of you in this mail.
On 22 March 2013 18:23, Thomas Renninger trenn@suse.de wrote:
Is this all to try to fix "cpufreq driver gets loaded while some cores were set offline before"?
Not really. There are problems with acpi-cpufreq without that case too.
I wonder how you run into "cpufreq is initialized with offlined cpus" case. I remember that there were problems, but it's nearly impossible to run into this if the cpufreq driver is loaded early at boot.
I always thought there is a way not to boot all cpus by passing stuff in command line and so this is a easy case to reproduce.
I am pretty sure cpuidle states won't initialize and in best case you never get them working on the offlined cpus. Local APICs won't be set up, ...
Such a parameter will never exist for x86.
cpufreq_add_dev() and friends are complicated.
Not anymore, they are hugely simplified now.
They were hugely simplified and things are not working anymore and you do not know why...
Those init functions got split some time ago and there also slipped in a bug even it was simple splitting of functions.
I do not have time to look at fcf8058296edbc3de43adf095824fc32b067b9f8 right now. Don't know how much other stuff depends on it and how sever it is on ARM that cpufreq does not correctly initialize with offlined cpus..., I would revert this patch.
Let me clear it to everybody. There isn't/shouldn't be a bug in cpufreq core, its just that acpi-cpufreq driver isn't adapted well with the changes related to affected_cpus and related_cpus.
And powernow-k8 driver is broken. The others are not tested that often, I expect they broke as well, right?
I have never gone into coding for any non-ARM platform and am really not aware of acpi-cpufreq driver and its users. That's why i told everybody where the issue is, and they just need to fix acpi-cpufreq driver with right values of policy->cpus (affected_cpus) and everything else would work after that.
Sorry, I cannot look into this due to lack of time, but I remember that there were reasons why cpufreq_add_dev() was complicated. Or that it's really easy to mess it up and it's not easy to fix it again.
Thomas
On 22 March 2013 19:34, Thomas Renninger trenn@suse.de wrote:
I am pretty sure cpuidle states won't initialize and in best case you never get them working on the offlined cpus. Local APICs won't be set up, ...
Such a parameter will never exist for x86.
I will see if i can find what i was referring to here.
They were hugely simplified and things are not working anymore and you do not know why...
I know why, but don't know (for now) how to fix it for acpi-cpufreq.
And powernow-k8 driver is broken. The others are not tested that often, I expect they broke as well, right?
acpi-cpufreq is broken, and so are all others that use it. Sorry for asking a stupid question now, but what is the hierarchy of cpufreq drivers for Intel (I will try to go through it now)? Do some drivers use the acpi-cpufreq driver?
Sorry, I cannot look into this due to lack of time, but I remember that there were reasons why cpufreq_add_dev() was complicated. Or that it's really easy to mess it up and it's not easy to fix it again.
It's not cpufreq_add_dev() that is broken but some changes that were part of the same patch, i.e. the part that tried to sort out affected and related cpus.
On Fri, Mar 22, 2013 at 07:40:45PM +0530, Viresh Kumar wrote:
acpi-cpufreq is broken and so all others who are using it. Sorry for asking the stupid question now but what's the hierarchy of cpufreq drivers for intel (I will try to go through it now), some drivers use acpi-cpufreq driver?
All relevant x86 out there uses acpi-cpufreq. Which means that breaking it, breaks cpufreq on x86. Which is absolutely a no-no.
On Friday, March 22, 2013 03:04:03 PM Thomas Renninger wrote:
On Friday, March 22, 2013 07:13:33 PM Viresh Kumar wrote:
Hi guys,
I will answer to questions of both of you in this mail.
On 22 March 2013 18:23, Thomas Renninger trenn@suse.de wrote:
Is this all to try to fix "cpufreq driver gets loaded while some cores were set offline before"?
Not really. There are problems with acpi-cpufreq without that case too.
I wonder how you run into "cpufreq is initialized with offlined cpus" case. I remember that there were problems, but it's nearly impossible to run into this if the cpufreq driver is loaded early at boot.
I always thought there is a way not to boot all cpus by passing stuff in command line and so this is a easy case to reproduce.
I am pretty sure cpuidle states won't initialize and in best case you never get them working on the offlined cpus. Local APICs won't be set up, ...
Such a parameter will never exist for x86.
I take that back. CPU hot-add (even if CPU is not present at boot time) works. I looked at C-states for that some time ago and it should only work if the hot-add event came in via ACPI events for CPUs which were not initialized at boot time. Better would be to initialize the first time it gets switched online. Anyway, making such stuff (cpufreq/cpuidle/...) more robust, is certainly a good idea.
...
And powernow-k8 driver is broken. The others are not tested that often, I expect they broke as well, right?
And powernow-k8 does not exist anymore..., fortunately I didn't have to look at this stuff for some time.
...
Sorry, I cannot look into this due to lack of time,...
Thomas
On 24 March 2013 14:35, Thomas Renninger trenn@suse.de wrote:
I take that back. CPU hot-add (even if CPU is not present at boot time) works. I looked at C-states for that some time ago and it should only work if the hot-add event came in via ACPI events for CPUs which were not initialized at boot time. Better would be to initialize the first time it gets switched online.
I removed it from my TODO list, thanks :)
Anyway, making such stuff (cpufreq/cpuidle/...) more robust, is certainly a good idea.
Yes, that was the basic intent of my earlier patches that went into 3.9. It was all about simplifying it and making it robust.
And powernow-k8 does not exist anymore..., fortunately I didn't have to look at this stuff for some time.
Sure? Driver is still present in mainline.
BTW, i have given an initial fix for acpi-cpufreq (which should work) and waiting for Duncan to reply back. All other drivers don't set affected/related cpus directly, so they should be alright.
-- viresh
Hi Duncan,
Please reply to this mail rather than using bugzilla, as others might not be following the bugzilla report.
--- Comment #31 from Duncan 1i5t5.duncan@cox.net 2013-03-24 09:48:41 --- (In reply to comment #28)
My weekend is already spoiled (due to my bugs :) ) and i don't want to spoil yours.
Don't worry about it. They're all days in the week, to me. I don't really have a defined "weekend". In fact, I had been (somewhat impatiently) grumbling to myself Friday that I was likely to have to wait until Monday for something concrete to test, so I'm happy to be demonstrated wrong! =:^)
:)
[PATCH] cpufreq: acpi-cpufreq: Set policy->cpus correctly from .init()
You appear to be on the right path as all the dirs and symlinks are there now, but it looks like you'll need a v2 as the order/pairing is now very strange, both as booted and after a s2ram/resume (which changes the order but it's still strange):
:(
Testing against 3.9-rc4 now.
Patch applied pre-suspend ls -dl as above (original pairing is 0/1, 2/3, 4/5, with the first always a dir and the second always a symlink to the first, as seen in comment #0):
/sys/devices/system/cpu/cpu0/cpufreq/
/sys/devices/system/cpu/cpu1/cpufreq -> ../cpu0/cpufreq/
/sys/devices/system/cpu/cpu2/cpufreq/
/sys/devices/system/cpu/cpu3/cpufreq/
/sys/devices/system/cpu/cpu4/cpufreq -> ../cpu3/cpufreq/
/sys/devices/system/cpu/cpu5/cpufreq -> ../cpu2/cpufreq/
The way it works (in cpufreq core), the first cpu of group registered for cpufreq would get a directory and second one would get a symlink.
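The directory-versus-symlink layout described here can be listed directly, which makes the grouping easier to compare across boots and resumes than raw ls -dl output. A minimal sketch, assuming the usual sysfs paths; the function name cpufreq_layout and its optional root argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report, per CPU, whether its cpufreq entry is a real directory
# (the CPU that owns the policy) or a symlink to another CPU's directory.

cpufreq_layout() {
    local root="${1:-/sys/devices/system/cpu}"
    local dir path
    for dir in "$root"/cpu[0-9]*; do
        path="$dir/cpufreq"
        if [ -L "$path" ]; then
            # Check -L first: -d would also be true for a symlink to a directory.
            printf '%s symlink -> %s\n' "${dir##*/}" "$(readlink "$path")"
        elif [ -d "$path" ]; then
            printf '%s directory\n' "${dir##*/}"
        else
            printf '%s missing\n' "${dir##*/}"
        fi
    done
}

# Possible use:
#   cpufreq_layout
```

A "missing" line corresponds to the original symptom in this bug, where some cpuN/cpufreq entries were gone after resume.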
And the ones above also don't look good. 2/5 and 3/4 shouldn't be the groups.
acpi-cpufreq driver is getting this information from perf->shared_cpu_map and that seems to be wrong to me now.
Post s2ram/resume cycle:
/sys/devices/system/cpu/cpu0/cpufreq/
/sys/devices/system/cpu/cpu1/cpufreq -> ../cpu0/cpufreq/
/sys/devices/system/cpu/cpu2/cpufreq -> ../cpu4/cpufreq/
/sys/devices/system/cpu/cpu3/cpufreq -> ../cpu5/cpufreq/
/sys/devices/system/cpu/cpu4/cpufreq/
/sys/devices/system/cpu/cpu5/cpufreq/
2/4 and 3/5 are also wrong groups.
But I don't know whether it's CPU ordering that's weird, or the cpufreq ordering. IOW, I don't know whether it /should/ be 0/1, 2/3, 4/5 or not, because I don't know whether those numbers actually reflect the hardware so it's the ordering above that's bad, or if the cpu numbers are just arbitrarily assigned, and the ordering above reflects the actual hardware, and just looks strange due to the arbitrary cpu numbering.
One thing i am sure about is cpu order is fixed at boot and after boot whatever you do can't change that order... So order should always be 0/1, 2/3, 4/5
Either way, there's the correct number of dirs and links now, but the ordering is extremely confusing and looks wrong, regardless of whether it actually corresponds to the hardware or not, something I don't have the ability to judge.
Hmm.. Can you try one thing? Run 3.8 over your machine and give output of cpufreq-info and ls -ld after boot and resume..
I would like to see what's the original behavior.
-- viresh
On Sun, 24 Mar 2013 15:32:39 +0530 Viresh Kumar viresh.kumar@linaro.org wrote:
Hi Duncan,
Please reply to this mail rather than using bugzilla, as others might not be following the bugzilla report.
Thanks. I'm used to it being the other way 'round.
[PATCH] cpufreq: acpi-cpufreq: Set policy->cpus correctly from .init()
You appear to be on the right path as all the dirs and symlinks are there now, but it looks like you'll need a v2 as the order/pairing is now very strange, both as booted and after a s2ram/resume (which changes the order but it's still strange):
Hmm.. Can you try one thing? Run 3.8 over your machine and give output of cpufreq-info and ls -ld after boot and resume..
I would like to see what's the original behavior.
Good idea! =:^) It now appears that your bug simply cascaded on a previously unreported bug in earlier kernels.
The 3.8 ls -dl isn't too interesting as they were all dirs not symlinks then, and if they hadn't stuck around after a suspend/resume I'd have definitely noticed and reported the bug back then, but it's still useful to confirm.
The 3.8 pre-suspend and post resume ls -dl are identical -- no missing dirs (and no symlinks):
/sys/devices/system/cpu/cpu0/cpufreq/
/sys/devices/system/cpu/cpu1/cpufreq/
/sys/devices/system/cpu/cpu2/cpufreq/
/sys/devices/system/cpu/cpu3/cpufreq/
/sys/devices/system/cpu/cpu4/cpufreq/
/sys/devices/system/cpu/cpu5/cpufreq/
The interesting results are the 3.8 cpufreq-info. I'll attach the full output, but here's the interesting bit:
3.8 pre-suspend cpufreq-info excerpts (nicely paired, as are the pre-patch pre-suspend results for 3.9-rc):
analyzing CPU 0:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
analyzing CPU 1:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 1
analyzing CPU 2:
  CPUs which run at the same hardware frequency: 2 3
  CPUs which need to have their frequency coordinated by software: 2
analyzing CPU 3:
  CPUs which run at the same hardware frequency: 2 3
  CPUs which need to have their frequency coordinated by software: 3
analyzing CPU 4:
  CPUs which run at the same hardware frequency: 4 5
  CPUs which need to have their frequency coordinated by software: 4
analyzing CPU 5:
  CPUs which run at the same hardware frequency: 4 5
  CPUs which need to have their frequency coordinated by software: 5
3.8 post-resume (screwed up pairing, so that bit's not a 3.9 thing, I just noticed it in 3.9 due to the now missing symlinks/dirs):
analyzing CPU 0:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
analyzing CPU 1:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 1
analyzing CPU 2:
  CPUs which run at the same hardware frequency: 2
  CPUs which need to have their frequency coordinated by software: 2
analyzing CPU 3:
  CPUs which run at the same hardware frequency: 3
  CPUs which need to have their frequency coordinated by software: 3
analyzing CPU 4:
  CPUs which run at the same hardware frequency: 2 4
  CPUs which need to have their frequency coordinated by software: 4
analyzing CPU 5:
  CPUs which run at the same hardware frequency: 3 5
  CPUs which need to have their frequency coordinated by software: 5
On 24 March 2013 17:19, Duncan 1i5t5.duncan@cox.net wrote:
On Sun, 24 Mar 2013 15:32:39 +0530 Viresh Kumar viresh.kumar@linaro.org wrote:
Hmm.. Can you try one thing? Run 3.8 over your machine and give output of cpufreq-info and ls -ld after boot and resume..
I would like to see what's the original behavior.
Good idea! =:^) It now appears that your bug simply cascaded on a previously unreported bug in earlier kernels.
That made me happy, i am not the only culprit :)
The 3.8 pre-suspend and post resume ls -dl are identical -- no missing dirs (and no symlinks):
/sys/devices/system/cpu/cpu0/cpufreq/
/sys/devices/system/cpu/cpu1/cpufreq/
/sys/devices/system/cpu/cpu2/cpufreq/
/sys/devices/system/cpu/cpu3/cpufreq/
/sys/devices/system/cpu/cpu4/cpufreq/
/sys/devices/system/cpu/cpu5/cpufreq/
They were all separate directories (instead of symlinks) earlier because this only depended on policy->cpus earlier. And none of the cpus are shared in policy->cpus, i.e. policy->cpus was always policy->cpu.
3.8 pre-suspend cpufreq-info excerpts (nicely paired, as are the pre-patch pre-suspend results for 3.9-rc):
No, they are still not paired well. This is how we should read your analysis:
related-cpus: "same hardware freq"
affected-cpus or policy->cpus: "frequency coordinated by software"
analyzing CPU 0: CPUs which run at the same hardware frequency: 0 1
related cpus have correct pairs
CPUs which need to have their frequency coordinated by software: 0
but affected cpus doesn't
3.8 post-resume (screwed up pairing, so that bit's not a 3.9 thing
I told you earlier, this made me happy :)
analyzing CPU 0:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 0
analyzing CPU 1:
  CPUs which run at the same hardware frequency: 0 1
  CPUs which need to have their frequency coordinated by software: 1
These stayed as is as cpu 0 is non removable cpu and so doesn't get unregistered from cpufreq at all.
analyzing CPU 2:
  CPUs which run at the same hardware frequency: 2
  CPUs which need to have their frequency coordinated by software: 2
analyzing CPU 3:
  CPUs which run at the same hardware frequency: 3
  CPUs which need to have their frequency coordinated by software: 3
analyzing CPU 4:
  CPUs which run at the same hardware frequency: 2 4
related cpus got corrupted here.
  CPUs which need to have their frequency coordinated by software: 4
analyzing CPU 5:
  CPUs which run at the same hardware frequency: 3 5
  CPUs which need to have their frequency coordinated by software: 5
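The corruption in the post-resume output shows up as an asymmetry: CPU 4 lists CPU 2 as related, but CPU 2 lists only itself. That property can be checked mechanically. The sketch below assumes the standard related_cpus sysfs attribute; the function name check_related_symmetry and its optional root argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag every pair where cpuN lists cpuM in related_cpus but cpuM does
# not list cpuN. Returns non-zero if any such pair is found.

check_related_symmetry() {
    local root="${1:-/sys/devices/system/cpu}"
    local dir n m peers status=0
    for dir in "$root"/cpu[0-9]*; do
        [ -r "$dir/cpufreq/related_cpus" ] || continue
        n="${dir##*/cpu}"
        for m in $(cat "$dir/cpufreq/related_cpus"); do
            # Pad with spaces so that " 1 " cannot match inside " 11 ".
            peers=" $(cat "$root/cpu$m/cpufreq/related_cpus" 2>/dev/null | tr '\n' ' ') "
            case "$peers" in
                *" $n "*) ;;
                *) printf 'cpu%s lists cpu%s, but cpu%s does not list cpu%s\n' "$n" "$m" "$m" "$n"
                   status=1 ;;
            esac
        done
    done
    return "$status"
}

# Possible use after a suspend/resume or offline/online cycle:
#   check_related_symmetry || echo "related_cpus is inconsistent"
```

Symmetry is a necessary condition only: masks can be symmetric and still group the wrong CPUs, as with the 2/5 and 3/4 pairing reported earlier in the thread.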
Now back to the real issues:
@Rafael/Borislav/Thomas/Andre/Darrick:
"What do we mean by software AND hardware coordination for x86 ?"
Following are the sha-id's which had something to do with above statement.
3b2d99429e3386b6e2ac949fc72486509c8bbe36
46f18e3a28295a9e11a6ffa4478241c19bc93735
acd316248205d553594296f1895ba5196b89ffcc
e8628dd06d66f2e3965ec9742029b401d63434f1
8adcc0c674004c0f9467031a93dc639c2b01411f
On the platform i work (ARM) there are only two cases, cpus share clock line or they don't. So, they share policy struct or they don't.
Fixing Duncan's issues shouldn't be a very big deal now as i was thinking too much about what was broken without my patches too. And now that part is pretty clear.
-- viresh
On 24 March 2013 17:46, Viresh Kumar viresh.kumar@linaro.org wrote:
Fixing Duncan's issues shouldn't be a very big deal now as i was thinking too much about what was broken without my patches too. And now that part is pretty clear.
Hi Duncan,
Try attached patch and this should take back your system to where it was.
NOTE: With this patch, related_cpus won't show any groups, and related cpus must be the same as affected cpus. All cpu*/cpufreq entries must now be directories, not symlinks.
-- viresh
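The directory-versus-symlink condition in the note can be checked from userspace. The following is a minimal sketch, not part of the patch: the function names are made up for illustration, and it assumes the usual /sys/devices/system/cpu layout with one cpuN directory per CPU.

```python
import os
import re


def cpufreq_entry_kind(cpu_dir):
    """Classify <cpu_dir>/cpufreq as 'symlink', 'directory' or 'missing'."""
    path = os.path.join(cpu_dir, "cpufreq")
    # islink() must be tested first: isdir() follows symlinks.
    if os.path.islink(path):
        return "symlink"
    if os.path.isdir(path):
        return "directory"
    return "missing"


def cpufreq_entry_kinds(root="/sys/devices/system/cpu"):
    """Map CPU number -> kind of its cpufreq entry (hypothetical helper)."""
    kinds = {}
    if not os.path.isdir(root):
        return kinds
    for name in os.listdir(root):
        match = re.fullmatch(r"cpu(\d+)", name)  # skips cpuidle, cpufreq, ...
        if match:
            kinds[int(match.group(1))] = cpufreq_entry_kind(os.path.join(root, name))
    return kinds


if __name__ == "__main__":
    for cpu, kind in sorted(cpufreq_entry_kinds().items()):
        print(f"cpu{cpu}/cpufreq: {kind}")
```

With the patch applied, every online CPU would be expected to report "directory".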
On Sun, 24 Mar 2013 17:53:41 +0530 Viresh Kumar viresh.kumar@linaro.org wrote:
On 24 March 2013 17:46, Viresh Kumar viresh.kumar@linaro.org wrote:
Fixing Duncan's issues shouldn't be a very big deal now as i was thinking too much about what was broken without my patches too. And now that part is pretty clear.
Hi Duncan,
Try attached patch and this should take back your system to where it was.
NOTE: With this patch your related_cpus wouldn't show any groups and related cpus must be same as affected cpus. All cpu*/cpufreq must be directories now and no symlinks.
FWIW, with this patch, pre-s2ram and post-resume are indeed consistent, but it's not back to where it was.
With this patch, each core is a cpufreq law unto itself. Maybe that's what you meant with the note, maybe not (I know the mapping of sysfs files to cpufreq-info output was stated up-thread, but I lost track, and how affected vs related maps to hardware vs software coordinated and what it all actually means other than what I'm seeing isn't ideal, is apparently a bit more than I'm able to keep in my head ATM), but it's what I get. The relevant bits of cpufreq-info output:
analyzing CPU 0:
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
analyzing CPU 1:
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
analyzing CPU 2:
  CPUs which run at the same hardware frequency: 2
  CPUs which need to have their frequency coordinated by software: 2
analyzing CPU 3:
  CPUs which run at the same hardware frequency: 3
  CPUs which need to have their frequency coordinated by software: 3
analyzing CPU 4:
  CPUs which run at the same hardware frequency: 4
  CPUs which need to have their frequency coordinated by software: 4
analyzing CPU 5:
  CPUs which run at the same hardware frequency: 5
  CPUs which need to have their frequency coordinated by software: 5
But at least it's consistent: The same results both before and after a suspend/resume cycle.
And given that 3.8 wasn't ideal either, maybe that's good enough for this cycle, and a real fix will have to wait until the next commit window and stable-tree. That'd give us more leeway to fix it right, as well as a full cycle for testing anything else the "correct" fix might dredge up.
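The before/after comparison described above can be automated. This is a rough sketch under stated assumptions: it only looks at the two labelled cpufreq-info lines quoted in this thread, and parse_cpufreq_info() and diff_snapshots() are illustrative names, not part of cpufrequtils.

```python
import re

# One block per "analyzing CPU N:"; the tempered dots stop a match from
# running into the next CPU's block if a labelled line is missing.
_BLOCK = re.compile(
    r"analyzing CPU (\d+):"
    r"(?:(?!analyzing CPU).)*?"
    r"CPUs which run at the same hardware frequency:([\d\s]+)"
    r"(?:(?!analyzing CPU).)*?"
    r"CPUs which need to have their frequency coordinated by software:([\d\s]+)",
    re.DOTALL,
)


def parse_cpufreq_info(text):
    """Return {cpu: {'related': (...), 'affected': (...)}} from cpufreq-info text."""
    result = {}
    for cpu, related, affected in _BLOCK.findall(text):
        result[int(cpu)] = {
            "related": tuple(int(n) for n in related.split()),
            "affected": tuple(int(n) for n in affected.split()),
        }
    return result


def diff_snapshots(before_text, after_text):
    """Return {cpu: (before_entry, after_entry)} for CPUs whose masks changed."""
    before = parse_cpufreq_info(before_text)
    after = parse_cpufreq_info(after_text)
    changed = {}
    for cpu in sorted(set(before) | set(after)):
        if before.get(cpu) != after.get(cpu):
            changed[cpu] = (before.get(cpu), after.get(cpu))
    return changed
```

Saving the cpufreq-info output to a file before suspend and again after resume, then passing both texts to diff_snapshots(), gives an empty dict when the two runs agree.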
On 25 March 2013 16:45, Duncan 1i5t5.duncan@cox.net wrote:
FWIW, with this patch, pre-s2ram and post-resume are indeed consistent, but it's not back to where it was.
With this patch, each core is a cpufreq law unto itself. Maybe that's what you meant with the note, maybe not (I know the mapping of sysfs files to cpufreq-info output was stated up-thread, but I lost track, and how affected vs related maps to hardware vs software coordinated and what it all actually means other than what I'm seeing isn't ideal, is apparently a bit more than I'm able to keep in my head ATM), but it's what I get. The relevant bits of cpufreq-info output:
analyzing CPU 0:
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
analyzing CPU 1:
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
analyzing CPU 2:
  CPUs which run at the same hardware frequency: 2
  CPUs which need to have their frequency coordinated by software: 2
analyzing CPU 3:
  CPUs which run at the same hardware frequency: 3
  CPUs which need to have their frequency coordinated by software: 3
analyzing CPU 4:
  CPUs which run at the same hardware frequency: 4
  CPUs which need to have their frequency coordinated by software: 4
analyzing CPU 5:
  CPUs which run at the same hardware frequency: 5
  CPUs which need to have their frequency coordinated by software: 5
But at least it's consistent: The same results both before and after a suspend/resume cycle.
And given that 3.8 wasn't ideal either, maybe that's good enough for this cycle, and a real fix will have to wait until the next commit window and stable-tree. That'd give us more leeway to fix it right, as well as a full cycle for testing anything else the "correct" fix might dredge up.
This is exactly what I expected and what I wrote in the note. Because the cpufreq core now does a lot of work based on related_cpus, it's better that we don't set it blindly for x86.
The following code was the only user of related_cpus in 3.8:
	/* Set governor before ->init, so that driver could check it */
#ifdef CONFIG_HOTPLUG_CPU
	for_each_online_cpu(sibling) {
		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
		if (cp && cp->governor &&
		    (cpumask_test_cpu(cpu, cp->related_cpus))) {
			policy->governor = cp->governor;
			found = 1;
			break;
		}
	}
#endif
There was no other user of related_cpus in the cpufreq core before, which is why I wonder why it was added in the first place.
But a grep for related_cpus in 3.8 showed some interesting users in: tools/power/cpupower/
I will try to check what they are doing with it, but for the kernel it was almost unused. And now it is very much used :)
But for 3.9 I believe this patch should be good enough.
Thanks for testing it.
-- viresh
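To make the role of related_cpus in that 3.8 loop concrete, here is a small userspace model of it. It is only a sketch: the dict-based policy representation and the function name are assumptions for illustration, not kernel structures.

```python
def inherit_governor(cpu, online_cpus, policies):
    """Model of the 3.8 loop: a CPU being added takes the governor of the
    first online sibling whose policy lists it in related_cpus.

    policies maps cpu -> {"governor": str or None, "related_cpus": set of int}.
    Returns the inherited governor name, or None if no sibling matches
    (the caller then has to pick a default itself).
    """
    for sibling in online_cpus:
        cp = policies.get(sibling)
        if cp and cp.get("governor") and cpu in cp["related_cpus"]:
            return cp["governor"]
    return None


if __name__ == "__main__":
    # CPU 0's policy lists CPU 1 as related, so CPU 1 inherits its governor.
    grouped = {0: {"governor": "ondemand", "related_cpus": {0, 1}}}
    print(inherit_governor(1, [0], grouped))

    # If related_cpus equals affected_cpus (one CPU per policy), nothing matches.
    per_cpu = {0: {"governor": "ondemand", "related_cpus": {0}}}
    print(inherit_governor(1, [0], per_cpu))
```

The second case shows why the loop was the only place where the contents of related_cpus mattered in 3.8: when each policy lists only its own CPU, no governor is carried over.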
On Mon, Mar 25, 2013 at 04:53:39PM +0530, Viresh Kumar wrote:
But for 3.9 i believe this patch should be good enough.
Seems so here too.
Tested-by: Borislav Petkov bp@suse.de
On 25 March 2013 16:53, Viresh Kumar viresh.kumar@linaro.org wrote:
There is no other user of related_cpus earlier in cpufreq core and that's why i wonder why it was added earlier.
But a grep of related_cpus for 3.8 showed some interesting users in: tools/power/cpupower/
I will try to check what they are doing with it, but for the kernel it was almost unused.
I checked tools and they aren't doing anything tricky with it. Just reading groups of cpus from sysfs.
Now i believe even tools should be patched a bit to give correct meanings of affected or related cpus. Let me try it.
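For reference, reading the CPU groups from sysfs amounts to roughly the following. This is a hedged sketch rather than what cpupower actually does: the helper names are invented, and it assumes the related_cpus and affected_cpus files hold space-separated CPU numbers. The printed labels follow the mapping given earlier in the thread (related = same hardware frequency, affected = coordinated by software).

```python
import os
import re


def parse_cpu_list(text):
    """Turn the contents of a cpufreq mask file, e.g. '0 1\\n', into [0, 1]."""
    return [int(token) for token in text.split()]


def read_cpufreq_masks(root="/sys/devices/system/cpu"):
    """Return {cpu: {'related_cpus': [...] or None, 'affected_cpus': [...] or None}}."""
    masks = {}
    if not os.path.isdir(root):
        return masks
    for name in os.listdir(root):
        match = re.fullmatch(r"cpu(\d+)", name)
        if not match:
            continue
        entry = {}
        for attr in ("related_cpus", "affected_cpus"):
            path = os.path.join(root, name, "cpufreq", attr)
            try:
                with open(path) as handle:
                    entry[attr] = parse_cpu_list(handle.read())
            except OSError:
                # Offline CPU or CPU not handled by any cpufreq driver.
                entry[attr] = None
        masks[int(match.group(1))] = entry
    return masks


if __name__ == "__main__":
    for cpu, entry in sorted(read_cpufreq_masks().items()):
        print(f"analyzing CPU {cpu}:")
        print(f"  same hardware frequency (related_cpus): {entry['related_cpus']}")
        print(f"  coordinated by software (affected_cpus): {entry['affected_cpus']}")
```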
On Sun, Mar 24, 2013 at 02:40:53PM +0530, Viresh Kumar wrote:
And powernow-k8 does not exist anymore..., fortunately I didn't have to look at this stuff for some time.
Sure? Driver is still present in mainline.
powernow-k8 is still there for, well, K8 only. Newer machines are handled by acpi-cpufreq. Thus acpi-cpufreq's health is of major importance, especially now. :)
Btw,
while we're at it, I just discovered this in dmesg. Plain -rc3.
[34173.893305] ------------[ cut here ]------------
[34173.893321] WARNING: at kernel/mutex.c:199 mutex_lock_nested+0x39c/0x3b0()
[34173.893328] Hardware name: To be filled by O.E.M.
[34173.893333] Modules linked in: nls_iso8859_15 nls_cp437 fuse tun cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative dm_crypt dm_mod ipv6 vfat fat acpi_cpufreq mperf kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd xts lrw gf128mul microcode radeon amd64_edac_mod edac_core k10temp fam15h_power drm_kms_helper ttm cfbfillrect cfbimgblt r8169 cfbcopyarea
[34173.893395] Pid: 15316, comm: kworker/0:0 Not tainted 3.9.0-rc3 #1
[34173.893402] Call Trace:
[34173.893409] [<ffffffff8103b33f>] warn_slowpath_common+0x7f/0xc0
[34173.893417] [<ffffffff8103b39a>] warn_slowpath_null+0x1a/0x20
[34173.893424] [<ffffffff8159654c>] mutex_lock_nested+0x39c/0x3b0
[34173.893432] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
[34173.893441] [<ffffffff8106bded>] ? __blocking_notifier_call_chain+0x7d/0xd0
[34173.893449] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
[34173.893457] [<ffffffff81074ce1>] ? get_parent_ip+0x11/0x50
[34173.893464] [<ffffffff81074d99>] ? sub_preempt_count+0x79/0xd0
[34173.893472] [<ffffffff8144b94d>] cpufreq_governor_dbs+0x3bd/0x560
[34173.893480] [<ffffffff8144b24a>] od_cpufreq_governor_dbs+0x1a/0x20
[34173.893487] [<ffffffff81448f13>] __cpufreq_governor+0x53/0xf0
[34173.893494] [<ffffffff814494a5>] __cpufreq_set_policy+0x155/0x180
[34173.893502] [<ffffffff8144a483>] cpufreq_update_policy+0xf3/0x130
[34173.893510] [<ffffffff8144a4c0>] ? cpufreq_update_policy+0x130/0x130
[34173.893519] [<ffffffff8144a4d1>] handle_update+0x11/0x20
[34173.893526] [<ffffffff8105f3f7>] process_one_work+0x1f7/0x670
[34173.893533] [<ffffffff8105f38c>] ? process_one_work+0x18c/0x670
[34173.893541] [<ffffffff8105fbfe>] worker_thread+0x10e/0x370
[34173.893548] [<ffffffff8105faf0>] ? rescuer_thread+0x240/0x240
[34173.893556] [<ffffffff810654fb>] kthread+0xdb/0xe0
[34173.893563] [<ffffffff81071ba5>] ? finish_task_switch+0x85/0x110
[34173.893572] [<ffffffff81065420>] ? __init_kthread_worker+0x70/0x70
[34173.893579] [<ffffffff8159a71c>] ret_from_fork+0x7c/0xb0
[34173.893587] [<ffffffff81065420>] ? __init_kthread_worker+0x70/0x70
[34173.893594] ---[ end trace 66f5addf492b41b2 ]---
On Fri, Mar 22, 2013 at 04:12:28PM +0100, Borislav Petkov wrote:
Btw,
while we're at it, I just discovered this in dmesg. Plain -rc3.
Forgot to say, this happened once during resume.
On 22 March 2013 20:42, Borislav Petkov bp@alien8.de wrote:
Btw,
while we're at it, I just discovered this in dmesg. Plain -rc3.
I have been looking at it for some time, but need a bit of help.
[34173.893305] ------------[ cut here ]------------
[34173.893321] WARNING: at kernel/mutex.c:199 mutex_lock_nested+0x39c/0x3b0()
[34173.893328] Hardware name: To be filled by O.E.M.
[34173.893333] Modules linked in: nls_iso8859_15 nls_cp437 fuse tun cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative dm_crypt dm_mod ipv6 vfat fat acpi_cpufreq mperf kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd xts lrw gf128mul microcode radeon amd64_edac_mod edac_core k10temp fam15h_power drm_kms_helper ttm cfbfillrect cfbimgblt r8169 cfbcopyarea
[34173.893395] Pid: 15316, comm: kworker/0:0 Not tainted 3.9.0-rc3 #1
[34173.893402] Call Trace:
[34173.893409] [<ffffffff8103b33f>] warn_slowpath_common+0x7f/0xc0
[34173.893417] [<ffffffff8103b39a>] warn_slowpath_null+0x1a/0x20
[34173.893424] [<ffffffff8159654c>] mutex_lock_nested+0x39c/0x3b0
[34173.893432] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
[34173.893441] [<ffffffff8106bded>] ? __blocking_notifier_call_chain+0x7d/0xd0
[34173.893449] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
[34173.893457] [<ffffffff81074ce1>] ? get_parent_ip+0x11/0x50
[34173.893464] [<ffffffff81074d99>] ? sub_preempt_count+0x79/0xd0
What does "?" mean before a function's name in a trace? I am not able to figure out the exact sequence, as I don't see why sub_preempt_count() and get_parent_ip() got called. And then there is this notifier call.
On 22 March 2013 21:52, Viresh Kumar viresh.kumar@linaro.org wrote:
On 22 March 2013 20:42, Borislav Petkov bp@alien8.de wrote:
Btw,
while we're at it, I just discovered this in dmesg. Plain -rc3.
I am looking at it since some time, but need a bit of help.
Can you also give us what prints you got before this happened? That would be helpful..
On Fri, Mar 22, 2013 at 09:52:29PM +0530, Viresh Kumar wrote:
What does "?" mean before a functions name in trace?
http://stackoverflow.com/questions/13113384/the-meaning-of-in-linux-kernel-p...
Here's what's in dmesg before that (basically the machine is hibernating):
Mar 20 10:14:40 pd vmunix: [34172.600018] PM: Syncing filesystems ... done.
Mar 20 10:14:40 pd vmunix: [34172.606180] Freezing user space processes ... (elapsed 0.01 seconds) done.
Mar 20 10:14:40 pd vmunix: [34172.620559] PM: Preallocating image memory... done (allocated 573862 pages)
Mar 20 10:14:40 pd vmunix: [34173.370592] PM: Allocated 2295448 kbytes in 0.75 seconds (3060.59 MB/s)
Mar 20 10:14:40 pd vmunix: [34173.370615] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Mar 20 10:14:40 pd vmunix: [34173.388082] serial 00:09: disabled
Mar 20 10:14:40 pd vmunix: [34173.388117] serial 00:09: System wakeup disabled by ACPI
Mar 20 10:14:40 pd vmunix: [34173.695941] PM: freeze of devices complete after 311.852 msecs
Mar 20 10:14:40 pd vmunix: [34173.696673] PM: late freeze of devices complete after 0.722 msecs
Mar 20 10:14:40 pd vmunix: [34173.698914] PM: noirq freeze of devices complete after 2.234 msecs
Mar 20 10:14:40 pd vmunix: [34173.698932] Disabling non-boot CPUs ...
Mar 20 10:14:40 pd vmunix: [34173.703351] smpboot: CPU 1 is now offline
Mar 20 10:14:40 pd vmunix: [34173.708298] smpboot: CPU 2 is now offline
Mar 20 10:14:40 pd vmunix: [34173.712056] smpboot: CPU 3 is now offline
Mar 20 10:14:40 pd vmunix: [34173.717512] smpboot: CPU 4 is now offline
Mar 20 10:14:40 pd vmunix: [34173.722044] smpboot: CPU 5 is now offline
Mar 20 10:14:40 pd vmunix: [34173.728418] smpboot: CPU 6 is now offline
Mar 20 10:14:40 pd vmunix: [34173.731577] smpboot: CPU 7 is now offline
Mar 20 10:14:40 pd vmunix: [34173.732828] PM: Creating hibernation image:
Mar 20 10:14:40 pd vmunix: [34174.371451] PM: Need to copy 596216 pages
Mar 20 10:14:40 pd vmunix: [34173.879106] Enabling non-boot CPUs ...
Mar 20 10:14:40 pd vmunix: [34173.879171] SMP alternatives: lockdep: fixing up alternatives
Mar 20 10:14:40 pd vmunix: [34173.879182] smpboot: Booting Node 0 Processor 1 APIC 0x11
Mar 20 10:14:40 pd vmunix: [34173.890627] LVT offset 0 assigned for vector 0x400
Mar 20 10:14:40 pd vmunix: [34173.893305] ------------[ cut here ]------------
Mar 20 10:14:40 pd vmunix: [34173.893321] WARNING: at kernel/mutex.c:199 mutex_lock_nested+0x39c/0x3b0()
Mar 20 10:14:40 pd vmunix: [34173.893328] Hardware name: To be filled by O.E.M.
Mar 20 10:14:40 pd vmunix: [34173.893333] Modules linked in: nls_iso8859_15 nls_cp437 fuse tun cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative dm_crypt dm_mod ipv6 vfat fat acpi_cpufreq mperf kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd xts lrw gf128mul microcode radeon amd64_edac_mod edac_core k10temp fam15h_power drm_kms_helper ttm cfbfillrect cfbimgblt r8169 cfbcopyarea
Mar 20 10:14:40 pd vmunix: [34173.893395] Pid: 15316, comm: kworker/0:0 Not tainted 3.9.0-rc3 #1
Mar 20 10:14:40 pd vmunix: [34173.893402] Call Trace:
Mar 20 10:14:40 pd vmunix: [34173.893409] [<ffffffff8103b33f>] warn_slowpath_common+0x7f/0xc0
Mar 20 10:14:40 pd vmunix: [34173.893417] [<ffffffff8103b39a>] warn_slowpath_null+0x1a/0x20
Mar 20 10:14:40 pd vmunix: [34173.893424] [<ffffffff8159654c>] mutex_lock_nested+0x39c/0x3b0
Mar 20 10:14:40 pd vmunix: [34173.893432] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
Mar 20 10:14:40 pd vmunix: [34173.893441] [<ffffffff8106bded>] ? __blocking_notifier_call_chain+0x7d/0xd0
Mar 20 10:14:40 pd vmunix: [34173.893449] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
Mar 20 10:14:40 pd vmunix: [34173.893457] [<ffffffff81074ce1>] ? get_parent_ip+0x11/0x50
Mar 20 10:14:40 pd vmunix: [34173.893464] [<ffffffff81074d99>] ? sub_preempt_count+0x79/0xd0
Mar 20 10:14:40 pd vmunix: [34173.893472] [<ffffffff8144b94d>] cpufreq_governor_dbs+0x3bd/0x560
Mar 20 10:14:40 pd vmunix: [34173.893480] [<ffffffff8144b24a>] od_cpufreq_governor_dbs+0x1a/0x20
Mar 20 10:14:40 pd vmunix: [34173.893487] [<ffffffff81448f13>] __cpufreq_governor+0x53/0xf0
Mar 20 10:14:40 pd vmunix: [34173.893494] [<ffffffff814494a5>] __cpufreq_set_policy+0x155/0x180
Mar 20 10:14:40 pd vmunix: [34173.893502] [<ffffffff8144a483>] cpufreq_update_policy+0xf3/0x130
Mar 20 10:14:40 pd vmunix: [34173.893510] [<ffffffff8144a4c0>] ? cpufreq_update_policy+0x130/0x130
Mar 20 10:14:40 pd vmunix: [34173.893519] [<ffffffff8144a4d1>] handle_update+0x11/0x20
Mar 20 10:14:40 pd vmunix: [34173.893526] [<ffffffff8105f3f7>] process_one_work+0x1f7/0x670
Mar 20 10:14:40 pd vmunix: [34173.893533] [<ffffffff8105f38c>] ? process_one_work+0x18c/0x670
Mar 20 10:14:40 pd vmunix: [34173.893541] [<ffffffff8105fbfe>] worker_thread+0x10e/0x370
Mar 20 10:14:40 pd vmunix: [34173.893548] [<ffffffff8105faf0>] ? rescuer_thread+0x240/0x240
Mar 20 10:14:40 pd vmunix: [34173.893556] [<ffffffff810654fb>] kthread+0xdb/0xe0
Mar 20 10:14:40 pd vmunix: [34173.893563] [<ffffffff81071ba5>] ? finish_task_switch+0x85/0x110
Mar 20 10:14:40 pd vmunix: [34173.893572] [<ffffffff81065420>] ? __init_kthread_worker+0x70/0x70
Mar 20 10:14:40 pd vmunix: [34173.893579] [<ffffffff8159a71c>] ret_from_fork+0x7c/0xb0
Mar 20 10:14:40 pd vmunix: [34173.893587] [<ffffffff81065420>] ? __init_kthread_worker+0x70/0x70
Mar 20 10:14:40 pd vmunix: [34173.893594] ---[ end trace 66f5addf492b41b2 ]---
HTH.
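On the "?" question: as far as I understand the x86 stack dumper, a "?" marks an address that was found on the stack but that the unwinder could not confirm as part of the current call chain, so it may be a stale leftover from an earlier call. A small sketch for separating the two kinds of entries follows; the regular expression and function names are my own and only target the "[<addr>] symbol+0xoff/0xsize" format shown in this trace.

```python
import re

# Matches "[<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560";
# the "? " part is optional and marks an unreliable entry.
_FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(trace_text):
    """Return a list of (symbol, reliable) tuples in trace order."""
    return [(symbol, not marker) for marker, symbol in _FRAME.findall(trace_text)]


def reliable_chain(trace_text):
    """Return only the symbols that are not prefixed with '?'."""
    return [symbol for symbol, reliable in parse_frames(trace_text) if reliable]


if __name__ == "__main__":
    sample = (
        "[<ffffffff8159654c>] mutex_lock_nested+0x39c/0x3b0 "
        "[<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560 "
        "[<ffffffff81074ce1>] ? get_parent_ip+0x11/0x50 "
        "[<ffffffff8144b94d>] cpufreq_governor_dbs+0x3bd/0x560 "
        "[<ffffffff8144b24a>] od_cpufreq_governor_dbs+0x1a/0x20"
    )
    # Leaves mutex_lock_nested, cpufreq_governor_dbs, od_cpufreq_governor_dbs.
    print(reliable_chain(sample))
```

Read this way, get_parent_ip() and sub_preempt_count() do not have to have been called from cpufreq_governor_dbs() at all.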
On Saturday, March 23, 2013 02:45:42 PM Borislav Petkov wrote:
On Fri, Mar 22, 2013 at 09:52:29PM +0530, Viresh Kumar wrote:
What does "?" mean before a functions name in trace?
http://stackoverflow.com/questions/13113384/the-meaning-of-in-linux-kernel-p...
Here's what's in dmesg before that (basically the machine is hibernating):
This looks like the CPU offline path to me (i.e. the CPU hotplug notifier does something fishy).
Thanks, Rafael
On 23 March 2013 19:57, Rafael J. Wysocki rjw@sisk.pl wrote:
Mar 20 10:14:40 pd vmunix: [34173.698932] Disabling non-boot CPUs ...
Mar 20 10:14:40 pd vmunix: [34173.703351] smpboot: CPU 1 is now offline
Mar 20 10:14:40 pd vmunix: [34173.708298] smpboot: CPU 2 is now offline
Mar 20 10:14:40 pd vmunix: [34173.712056] smpboot: CPU 3 is now offline
Mar 20 10:14:40 pd vmunix: [34173.717512] smpboot: CPU 4 is now offline
Mar 20 10:14:40 pd vmunix: [34173.722044] smpboot: CPU 5 is now offline
Mar 20 10:14:40 pd vmunix: [34173.728418] smpboot: CPU 6 is now offline
Mar 20 10:14:40 pd vmunix: [34173.731577] smpboot: CPU 7 is now offline
Mar 20 10:14:40 pd vmunix: [34173.732828] PM: Creating hibernation image:
Mar 20 10:14:40 pd vmunix: [34174.371451] PM: Need to copy 596216 pages
Mar 20 10:14:40 pd vmunix: [34173.879106] Enabling non-boot CPUs ...
Mar 20 10:14:40 pd vmunix: [34173.879171] SMP alternatives: lockdep: fixing up alternatives
Mar 20 10:14:40 pd vmunix: [34173.879182] smpboot: Booting Node 0 Processor 1 APIC 0x11
Mar 20 10:14:40 pd vmunix: [34173.890627] LVT offset 0 assigned for vector 0x400
Mar 20 10:14:40 pd vmunix: [34173.893305] ------------[ cut here ]------------
Mar 20 10:14:40 pd vmunix: [34173.893321] WARNING: at kernel/mutex.c:199 mutex_lock_nested+0x39c/0x3b0()
This looks like the CPU offline path to me (i.e. the CPU hotplug notifier does something fishy).
I think otherwise. It's the CPU online path, but this didn't happen on first boot (probably).
Enabling non-boot CPUs ...
-- viresh
On 23 March 2013 20:04, Viresh Kumar viresh.kumar@linaro.org wrote:
I think otherwise, Its the cpu online path but this didn't happened for first boot (probably).
I tried on my 4-CPU laptop and, unfortunately, couldn't reproduce the issues reported by Borislav and Duncan :(
Hibernation logs (Borislav's bug): https://pastebin.linaro.org/2019/
cpufreq-info after hibernation (same happens with suspend) (Duncan's bug): https://pastebin.linaro.org/2020/
The main difference between our systems is number of cpu groups that share clock line. On setup of both Duncan and Borislav, they had total of 8 cpus and four groups 0-1, 2-3, 4-5, 6-7. And thus have four policy structures. And i have only one group 0-1-2-3 and thus only one policy struct.
For Duncan: The first policy structure (that has boot cpu) never gets corrupted and probably that's why i am not able to reproduce it again.
@Borislav: BTW, can you try reproducing your issue again? If that is reproducible?
I am asking for this because your logs looked confusing to me.
Mar 20 10:14:40 pd vmunix: [34173.893409] [<ffffffff8103b33f>] warn_slowpath_common+0x7f/0xc0
Mar 20 10:14:40 pd vmunix: [34173.893417] [<ffffffff8103b39a>] warn_slowpath_null+0x1a/0x20
Mar 20 10:14:40 pd vmunix: [34173.893424] [<ffffffff8159654c>] mutex_lock_nested+0x39c/0x3b0
Mar 20 10:14:40 pd vmunix: [34173.893432] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
Mar 20 10:14:40 pd vmunix: [34173.893441] [<ffffffff8106bded>] ? __blocking_notifier_call_chain+0x7d/0xd0
Mar 20 10:14:40 pd vmunix: [34173.893449] [<ffffffff8144b94d>] ? cpufreq_governor_dbs+0x3bd/0x560
Mar 20 10:14:40 pd vmunix: [34173.893457] [<ffffffff81074ce1>] ? get_parent_ip+0x11/0x50
Mar 20 10:14:40 pd vmunix: [34173.893464] [<ffffffff81074d99>] ? sub_preempt_count+0x79/0xd0
Mar 20 10:14:40 pd vmunix: [34173.893472] [<ffffffff8144b94d>] cpufreq_governor_dbs+0x3bd/0x560
I don't see (logically) how sub_preempt_count() can be called from cpufreq_governor_dbs()? As it is mostly called from kernel/sched/ part only.
So maybe two logs are mixed here and the warning was due to sub_preempt_count()->get_parent_ip() and not cpufreq. :)
If you still get it, try disabling cpufreq completely and see if it is gone or not.
Thanks in Advance.
-- viresh
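The difference in group counts described above, and the kind of corruption seen in the post-resume output earlier in the thread, can both be expressed over a plain mapping of CPU number to its related_cpus list. This is an illustrative sketch only; the function names are invented.

```python
def policy_groups(related):
    """Return the distinct related_cpus groups as a sorted list of tuples.

    related maps cpu -> list of CPUs reported as related to it.
    """
    return sorted({tuple(sorted(cpus)) for cpus in related.values()})


def inconsistent_cpus(related):
    """Return CPUs whose group is not mirrored by every member of that group."""
    bad = []
    for cpu, cpus in sorted(related.items()):
        group = set(cpus)
        if any(set(related.get(member, ())) != group for member in group):
            bad.append(cpu)
    return bad


if __name__ == "__main__":
    # Four groups of two, as described for the 8-CPU setups.
    pairs = {c: [c - c % 2, c - c % 2 + 1] for c in range(8)}
    print(len(policy_groups(pairs)))

    # One group of four, as on the 4-CPU laptop.
    laptop = {c: [0, 1, 2, 3] for c in range(4)}
    print(len(policy_groups(laptop)))

    # 3.8 post-resume values quoted earlier in the thread.
    post_resume = {0: [0, 1], 1: [0, 1], 2: [2], 3: [3], 4: [2, 4], 5: [3, 5]}
    print(inconsistent_cpus(post_resume))
```

For the post-resume data, CPUs 4 and 5 are flagged because they name CPUs 2 and 3 as related while those CPUs report only themselves.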
On Sat, Mar 23, 2013 at 08:46:00PM +0530, Viresh Kumar wrote:
On 23 March 2013 20:04, Viresh Kumar viresh.kumar@linaro.org wrote:
I think otherwise, Its the cpu online path but this didn't happened for first boot (probably).
I tried on my 4 cpu laptop and my bad, couldn't reproduce the issue reported by both Borislav and Duncan :(
Hibernation logs (Borislav's bug): https://pastebin.linaro.org/2019/
cpufreq-info after hibernation (same happens with suspend) (Duncan's bug): https://pastebin.linaro.org/2020/
Those pastebin things want a login. Use a free one.
The main difference between our systems is number of cpu groups that share clock line. On setup of both Duncan and Borislav, they had total of 8 cpus and four groups 0-1, 2-3, 4-5, 6-7. And thus have four policy structures. And i have only one group 0-1-2-3 and thus only one policy struct.
So this should give you a clue - you need to repro it on a similar machine and your laptop is obviously not similar.
@Borislav: BTW, can you try reproducing your issue again? If that is reproducible?
I've seen it only once so far and I've suspended the machine a bunch of times already. So I don't think it is that easy to reproduce.
I don't see (logically) how sub_preempt_count() can be called from cpufreq_governor_dbs()? As it is mostly called from kernel/sched/ part only.
As Rafael said, there's a notifier running which can, AFAICT, disable preemption on another CPU in parallel, for example.
If you still get it, try disabling cpufreq completely and see if it is gone or not.
Unfortunately this is my desktop machine and I don't want to test stuff on it because I need it to work. And I've already downgraded to 3.8.3 because of the other cpufreq breakage which kept a subset of the cores at max freq because acpi-cpufreq wasn't loading on them.
On 23 March 2013 21:36, Borislav Petkov bp@alien8.de wrote:
On Sat, Mar 23, 2013 at 08:46:00PM +0530, Viresh Kumar wrote:
Those pastebin things want a login. Use a free one.
Attached now.
So this should give you a clue - you need to repro it on a similar machine and your laptop is obviously not similar.
I know that. I will try on Monday on my office desktop.
Unfortunately this is my desktop machine and I don't want to test stuff on it because I need it to work. And I've already downgraded to 3.8.3 because of the other cpufreq breakage which kept a subset of the cores at max freq because acpi-cpufreq wasn't loading on them.
And that is already caused recently?
-- viresh
On Sat, Mar 23, 2013 at 10:34:28PM +0530, Viresh Kumar wrote:
And that is already caused recently?
3.8.3 is fine and there are no cpufreq patches in v3.8..v3.8.3. Which would mean, plain v3.8 is also fine. Which means, the breakage must've come in during the merge window.
[Forgot to add all in previous mail, reply-all now]
On 24 March 2013 00:20, Borislav Petkov bp@alien8.de wrote:
On Sat, Mar 23, 2013 at 10:34:28PM +0530, Viresh Kumar wrote:
And that is already caused recently?
Ahh!! s/already/also
3.8.3 is fine and there are no cpufreq patches in v3.8..v3.8.3. Which would mean, plain v3.8 is also fine. Which means, the breakage must've come in during the merge window.
I didn't get a complete picture of your problem, can you elaborate a bit with all your observations?
It might be related to the problem reported by Duncan and might be solved by the patch i sent yesterday.
On Sun, Mar 24, 2013 at 02:35:41PM +0530, Viresh Kumar wrote:
I didn't get a complete picture of your problem, can you elaborate a bit with all your observations?
It is basically the same observation. Some of the even cores are not handled by any cpufreq driver anymore, i.e. in the cpufreq-info output. And thus they remain in P0 which is max freq. And this happens because I'm suspending my workstation at night which takes the cores offline, so it is the same trigger as Duncan's.
linaro-kernel@lists.linaro.org