Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Regards, Vincent
On 9 May 2012 14:43, Vincent Guittot vincent.guittot@linaro.org wrote:
Hi Aneesh,
Sorry for the late reply.
I'm working with the Linaro 12.04 developer image, which is quite close to the Linaro Ubuntu one but without a GUI. The manifest gives me the following information about the kernel: linux-image-3.3.1-38-linaro-lt-omap=3.3.1-38.38~lt~ci~00000000000001+1336186099~4fa4ad78, which should be available here: git://git.linaro.org/landing-teams/leb/ti/kernel.git
This issue occurs each time I run some sysbench tests on my panda board (revA1). I run several tests, each 20 seconds long, and the issue occurs after a few minutes.
I've checked with my finger, and the package is quite hot when the issue occurs.
Regards, Vincent
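For anyone trying to reproduce the load described above, here is a minimal sketch of building a 20-second sysbench run that can be repeated back to back. The message does not say which sysbench tests were used, so the choice of the `cpu` test, two threads (one per Cortex-A9 core), and the sysbench 0.4-style options are all assumptions; `sysbench_cpu_command` and `run_rounds` are hypothetical helper names.

```python
import subprocess


def sysbench_cpu_command(seconds=20, threads=2):
    """Build a sysbench 0.4-style command line for a time-limited CPU test.

    Assumption: the 'cpu' test stands in for the unspecified tests in the report.
    """
    return [
        "sysbench",
        "--test=cpu",
        "--num-threads=%d" % threads,
        "--max-time=%d" % seconds,
        "--max-requests=0",  # no request cap, so the time limit ends the run
        "run",
    ]


def run_rounds(rounds, seconds=20, threads=2):
    """Run the test repeatedly; return how many rounds exited cleanly."""
    cmd = sysbench_cpu_command(seconds, threads)
    completed = 0
    for _ in range(rounds):
        if subprocess.call(cmd) != 0:
            break
        completed += 1
    return completed


# Only print the command here; call run_rounds(...) explicitly on the board.
print(" ".join(sysbench_cpu_command()))
```

Calling `run_rounds(15)` would give roughly five minutes of load, in line with the "few minutes" mentioned above, but the number of rounds needed to hit the problem is not known from the report.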
On 7 May 2012 19:43, Aneesh V aneesh@ti.com wrote:
On 05/07/2012 10:38 AM, Mike Turquette wrote:
On Thu, May 3, 2012 at 7:47 AM, Vincent Guittot vincent.guittot@linaro.org wrote:
Hi Amit and Mike,
While stressing the dual Cortex-A9 of my panda board (omap4430) with the latest Linaro Ubuntu developer environment, I get the following messages:
[ 824.996978] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 837.243957] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 839.045318] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 845.168518] emif emif.1: temperature alert before registers are calculated, not de-rating timings
[ 901.361663] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 902.082672] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 914.329620] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 925.496124] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 930.537292] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 938.823333] emif emif.1: temperature alert before registers are calculated, not de-rating timings
[ 947.828857] emif emif.1: temperature alert before registers are calculated, not de-rating timings
[ 954.312072] emif emif.1: temperature alert before registers are calculated, not de-rating timings
[ 982.047271] emif emif.2: SDRAM temperature exceeds operating limit.. Needs shut down!!!
[ 982.072784] Power down.
Is this something you have already faced? Could something be missing in the kernel?
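The alert lines in a log like the one above can be tallied per EMIF instance to see how many warnings precede the shutdown. This is only a sketch: the regular expression is written against the message format quoted in this thread, and `summarize_emif` is a hypothetical helper, not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 824.996978] emif emif.2: temperature alert before registers are calculated, ...
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+emif\s+(?P<dev>emif\.\d+):\s+(?P<msg>.*)"
)


def summarize_emif(log_text):
    """Count temperature alerts per EMIF device and find the shutdown timestamp."""
    alerts = Counter()
    shutdown_at = None
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # not an emif message (e.g. the final "Power down." line)
        if "temperature alert" in m.group("msg"):
            alerts[m.group("dev")] += 1
        elif "exceeds operating limit" in m.group("msg"):
            shutdown_at = float(m.group("ts"))
    return alerts, shutdown_at


sample = """\
[ 824.996978] emif emif.2: temperature alert before registers are calculated, not de-rating timings
[ 845.168518] emif emif.1: temperature alert before registers are calculated, not de-rating timings
[ 982.047271] emif emif.2: SDRAM temperature exceeds operating limit.. Needs shut down!!!
[ 982.072784] Power down.
"""
alerts, shutdown_at = summarize_emif(sample)
print(dict(alerts), shutdown_at)
```

Feeding it the full LAVA log instead of `sample` would show whether the alerts cluster on one EMIF instance before the shutdown.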
That should not happen normally. If all is well, it should happen only if the temperature alert comes during a small window at bootup. Is that the case for you? Which tree and branch are you using? Please send me the details and I will take a look.
br, Aneesh
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we have disabled 1.2GHz on tilt-3.3; we think we have a fix plus a workaround, and I'll update with it tomorrow.
-Andy
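One way to confirm on a running board whether the 1.2GHz operating point is still offered is to look at the cpufreq sysfs frequency list. The sketch below assumes the cpufreq driver in use exposes `scaling_available_frequencies` (values in kHz), which is not stated in this thread; `parse_frequencies_khz` and `has_frequency` are hypothetical helper names.

```python
from pathlib import Path

# Assumption: the driver exposes this standard cpufreq sysfs file.
FREQ_FILE = Path(
    "/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies"
)


def parse_frequencies_khz(text):
    """Parse the space-separated kHz values found in the sysfs file."""
    return sorted(int(token) for token in text.split())


def has_frequency(text, mhz):
    """Return True if the operating point `mhz` (in MHz) is listed."""
    return mhz * 1000 in parse_frequencies_khz(text)


if FREQ_FILE.exists():
    print("1.2 GHz available:", has_frequency(FREQ_FILE.read_text(), 1200))
else:
    print("cpufreq frequency list not found on this machine")
```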
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
-Andy
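To check whether a given kernel build actually carries the CPU_IDLE workaround, the kernel config can be inspected for `CONFIG_CPU_IDLE`. This is a sketch under assumptions: it expects the config to be installed as `/boot/config-<release>`, as Ubuntu kernel packages usually do, and `config_option_state` is a hypothetical helper.

```python
import platform
import re
from pathlib import Path


def config_option_state(config_text, option):
    """Return the option's value (e.g. 'y' or 'm'), or None if unset or absent."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    for line in config_text.splitlines():
        line = line.strip()
        m = set_re.match(line)
        if m:
            return m.group(1)
        if line == "# %s is not set" % option:
            return None
    return None


# Assumption: the running kernel's config lives at /boot/config-<release>.
config_path = Path("/boot/config-" + platform.release())
if config_path.exists():
    state = config_option_state(
        config_path.read_text(errors="replace"), "CONFIG_CPU_IDLE"
    )
    print("CONFIG_CPU_IDLE:", state if state is not None else "not set")
else:
    print("no kernel config found at", config_path)
```

A result of "not set" would indicate the workaround is in place for that build.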
On Mon, May 14, 2012 at 5:12 PM, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Andy,
Disabling cpuidle is a bit of an extreme workaround if one works on power management. :)
Can you point to any discussions about the problem?
/Amit
On 14/05/12 22:23, Somebody in the thread at some point said:
On Mon, May 14, 2012 at 5:12 PM, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Andy,
Disabling cpuidle is a bit of extreme workaround if one works on power management. :)
Can you point to any discussions about the problem?
They're on a private list and being dug into atm.
Actually, it's not so extreme a workaround under the circumstances: the issue is that cpuidle disrupts the Vcore set by smartreflex, leading to crashes or excess power consumption and heat. It's better to have the voltage part of DVFS working well (and not crashing) than to have cpuidle, until we figure out what broke.
-Andy
On 14 May 2012 16:12, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Hi Andy,
Is there a daily build hwpack available with this version of tilt-3.3, so that it will be easier for me to test it?
Vincent
-Andy
-- Andy Green | TI Landing Team Leader Linaro.org │ Open source software for ARM SoCs | Follow Linaro http://facebook.com/pages/Linaro/155974581091106 - http://twitter.com/#%21/linaroorg - http://linaro.org/linaro-blog
On 16/05/12 19:49, Somebody in the thread at some point said:
On 14 May 2012 16:12, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Hi Andy,
Is there any daily build hwpack that is available with this version on tilt-3.3 so I will be easier for me to test it?
The Linaro Panda hwpack is based on tilt-3.3, but I don't know if it has taken in the latest patches yet (it should).
Ricardo (cc-d) will know.
-Andy
On Wed, May 16, 2012 at 8:51 AM, Andy Green andy.green@linaro.org wrote:
On 16/05/12 19:49, Somebody in the thread at some point said:
On 14 May 2012 16:12, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Hi Andy,
Is there any daily build hwpack that is available with this version on tilt-3.3 so I will be easier for me to test it?
The Linaro Panda hwpack is based on tilt-3.3, but I don't know if it has taken in the latest patches yet (it should).
The latest kernel packages are always published at our Kernel PPA, which is manually copied to the Overlay at least once a week.
Check https://launchpad.net/~linaro-maintainers/+archive/kernel/+packages?field.na... for the latest lt-omap kernel, based on the latest changes from Andy.
Unfortunately the config changes are not propagated automatically yet, so I had to update the package config by hand to properly disable CPU_IDLE. The new build should be available at the PPA in a few hours, and you can follow the progress at https://ci.linaro.org/jenkins/view/Ubuntu%20Packaged%20Kernels/job/create-pa... and https://code.launchpad.net/~jcrigby/+recipe/linux-linaro-3.3-lt-omap-daily.
Please check the package changelog to confirm the hash from the LT's tree.
Cheers,
Hi,
I have successfully tested one of the latest daily builds. The thermal issue doesn't occur anymore.
Thanks, Vincent
On 17 May 2012 08:57, Ricardo Salveti ricardo.salveti@linaro.org wrote:
On Wed, May 16, 2012 at 8:51 AM, Andy Green andy.green@linaro.org wrote:
On 16/05/12 19:49, Somebody in the thread at some point said:
On 14 May 2012 16:12, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Hi Andy,
Is there any daily build hwpack that is available with this version on tilt-3.3 so I will be easier for me to test it?
The Linaro Panda hwpack is based on tilt-3.3, but I don't know if it has taken in the latest patches yet (it should).
The latest kernel packages are always published at our Kernel PPA, that is manually copied to the Overlay at least once a week.
Check https://launchpad.net/~linaro-maintainers/+archive/kernel/+packages?field.na... for the latest lt-omap kernel, based on the latest changes from Andy.
Unfortunately the config changes are not propagated automatically yet, so I had to update the package config by hand to properly disable it. The new build should be available at the PPA in a few hours, and you can follow the progress at https://ci.linaro.org/jenkins/view/Ubuntu%20Packaged%20Kernels/job/create-pa... and https://code.launchpad.net/~jcrigby/+recipe/linux-linaro-3.3-lt-omap-daily.
Please check the package changelog to confirm the hash from the LT's tree.
Cheers,
Ricardo Salveti de Araujo
On 23/05/12 18:00, Somebody in the thread at some point said:
Hi,
I have successfully tested one of the latest daily build. The thermal issue doesn't occur anymore
Great - thanks a lot for the report and retest.
-Andy
Thanks, Vincent
On 17 May 2012 08:57, Ricardo Salveti ricardo.salveti@linaro.org wrote:
On Wed, May 16, 2012 at 8:51 AM, Andy Green andy.green@linaro.org wrote:
On 16/05/12 19:49, Somebody in the thread at some point said:
On 14 May 2012 16:12, Andy Green andy.green@linaro.org wrote:
On 14/05/12 20:53, Somebody in the thread at some point said:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
At the moment we disabled 1.2GHz on tilt-3.3, we think we have a fix + workaround and I'll update with it tomorrow.
tilt-3.3 is updated with the fixes and workaround of disabling CPU_IDLE, please give that a try.
Hi Andy,
Is there any daily build hwpack that is available with this version on tilt-3.3 so I will be easier for me to test it?
The Linaro Panda hwpack is based on tilt-3.3, but I don't know if it has taken in the latest patches yet (it should).
The latest kernel packages are always published at our Kernel PPA, that is manually copied to the Overlay at least once a week.
Check https://launchpad.net/~linaro-maintainers/+archive/kernel/+packages?field.na... for the latest lt-omap kernel, based on the latest changes from Andy.
Unfortunately the config changes are not propagated automatically yet, so I had to update the package config by hand to properly disable it. The new build should be available at the PPA in a few hours, and you can follow the progress at https://ci.linaro.org/jenkins/view/Ubuntu%20Packaged%20Kernels/job/create-pa... and https://code.launchpad.net/~jcrigby/+recipe/linux-linaro-3.3-lt-omap-daily.
Please check the package changelog to confirm the hash from the LT's tree.
Cheers,
Ricardo Salveti de Araujo
Andy Green andy.green@linaro.org writes:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
I may be lacking context here, but: we don't have any 4460s in the lab currently, so any problems seen there cannot be 4460 specific (it _is_ possible that the lab pandas are less than perfectly ventilated of course).
Cheers, mwh
On 15/05/12 07:17, Somebody in the thread at some point said:
Andy Green andy.green@linaro.org writes:
On 14/05/12 20:45, Somebody in the thread at some point said:
Hi Aneesh,
Adding linaro-dev in the loop as someone else could be also interested
I have reproduced my thermal error with a lava test so you can have a complete log available here: http://validation.linaro.org/lava-server/scheduler/job/18564/log_file
As a summary of the problem, the panda board turns off during some sysbench tests because the SDRAM has exceeded its temperature limit
Recently Sebastien Jan at TI found that on tilt-3.3, cpu_idle is interacting badly with smartreflex and the wrong Vcore can be selected for 4460.
I may be lacking context here, but: we don't have any 4460s in the lab currently, so any problems seen there cannot be 4460 specific (it _is_ possible that the lab pandas are less than perfectly ventilated of course).
I don't think the cpu_idle vs smartreflex problems were specific to 4460. Since it's PoP (package-on-package), if the CPU die is running hot it will directly impact the SDRAM.
-Andy