Nico, thanks a lot for the useful info; I will try the repo you suggested. Please see my comments below.
I used the modeldebugger to dig into the core power-off code, and found some issues in it.
The ARM spec recommends the following flow for suspending a core:
- clear SCTLR.C bit;
- flush l1 cache;
- clear ACTLR.SMP bit;
- dsb;
- wfi;
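
For reference, the five steps above can be sketched in C roughly as follows. This is only a minimal sketch under stated assumptions, not the kernel's code: core_suspend_sketch(), flush_l1_dcache_only() and the PM_SKETCH_PRIVILEGED macro are hypothetical names, the SMP bit position (bit 6) is assumed for Cortex-A15/A7 and should be checked against the core's TRM, and the isb after each register write is an addition that the list does not mention. Without PM_SKETCH_PRIVILEGED it builds as a host-side model that only records the order of the steps.

```c
#include <stdint.h>

#define SCTLR_C   (1u << 2)  /* SCTLR.C: data cache enable */
#define ACTLR_SMP (1u << 6)  /* ACTLR.SMP on Cortex-A15/A7 (assumed; check the TRM) */

#if defined(__arm__) && defined(PM_SKETCH_PRIVILEGED)
/* Real CP15 accessors: only legal in a privileged mode on ARMv7-A. */
static inline uint32_t read_sctlr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}
static inline void write_sctlr(uint32_t v)
{
    /* isb is an addition (not in the list) so the write takes effect first */
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" : : "r"(v) : "memory");
}
static inline uint32_t read_actlr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
    return v;
}
static inline void write_actlr(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 1\n\tisb" : : "r"(v) : "memory");
}
static inline void dsb(void) { __asm__ volatile("dsb" : : : "memory"); }
static inline void wfi(void) { __asm__ volatile("wfi" : : : "memory"); }
/* Hypothetical: clean+invalidate L1 D-cache by set/way, no I-cache op. */
void flush_l1_dcache_only(void);
#else
/* Host model: registers are plain variables, each step is logged. */
static uint32_t model_sctlr = SCTLR_C;
static uint32_t model_actlr = ACTLR_SMP;
static char step_trace[8];   /* zero-initialised, so always NUL-terminated */
static int step_count;

static void log_step(char c) { if (step_count < 7) step_trace[step_count++] = c; }
static uint32_t read_sctlr(void) { return model_sctlr; }
static void write_sctlr(uint32_t v) { model_sctlr = v; log_step('C'); }
static uint32_t read_actlr(void) { return model_actlr; }
static void write_actlr(uint32_t v) { model_actlr = v; log_step('S'); }
static void flush_l1_dcache_only(void) { log_step('F'); }
static void dsb(void) { log_step('D'); }
static void wfi(void) { log_step('W'); }
#endif

/* The five steps, in the order the list gives them. */
void core_suspend_sketch(void)
{
    write_sctlr(read_sctlr() & ~SCTLR_C);   /* 1. clear SCTLR.C   */
    flush_l1_dcache_only();                 /* 2. flush L1 cache  */
    write_actlr(read_actlr() & ~ACTLR_SMP); /* 3. clear ACTLR.SMP */
    dsb();                                  /* 4. dsb             */
    wfi();                                  /* 5. wfi             */
}
```

On a host build, calling core_suspend_sketch() once leaves "CFSDW" in step_trace, i.e. the cache is disabled before the flush and the core leaves coherency only after the flush has been issued.
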
But in dcscb.c and tc2_pm.c, when the core wants to enter LPM, we use the function *flush_cache_louis()* to flush the L1 cache. This function does NOT only flush the L1 cache by set/way; it also invalidates the I-cache with the command below: mcr p15, 0, r0, c7, c1, 0; From my experiment results, I believe this instruction introduces unexpected behavior, so that later instructions cannot execute properly. So I manually rewrote the L1 cache flush for the core's power off (almost the same as *flush_cache_louis()*, except that the I-cache invalidate instruction is removed), and then I saw it is much more stable.
Could you describe what you mean by "more stable" ?
Sorry, I made some mistakes that caused the confusion. In short, the test results with your code are good so far, and it does NOT introduce the hang issue.
Here is more detail on the experiments. Since I want to prototype the core power-down flow for big.LITTLE, I launch a thread (not the idle thread) and call the function *bL_cpu_power_down()* to power off the core. For the core's exit from coherency, the experiments were as follows:
Test 1: With your code, it works well. The steps are: flush_cache_louis(); -> cpu_proc_fin(); -> flush_cache_louis(); -> clear SMP bit -> wfi();
Test 2: I modified the code to: clear SCTLR.C bit -> flush_cache_louis(); -> clear SMP bit -> wfi(); and with this it hangs easily. So, as I asked in the previous email, we need to remove the I$ invalidation instruction; after doing that, this test also passes.
BTW, I just wonder whether we need to refine the flow to: clear SCTLR.C bit -> call a dedicated L1 cache flush function instead of flush_cache_louis() -> clear SMP bit -> wfi(); This flow would exactly match ARM's recommendation. What do you think?
Thx, Leo Yan