Re: [Linaro-acpi] [PATCH v6 3/3] ARM: Check if a CPU has gone offline

24 Apr 2014

      On Thu, Apr 24, 2014 at 9:33 AM, Ashwin Chaugule
ashwin.chaugule@linaro.org wrote:
...
On 18 April 2014 11:21, Ashwin Chaugule ashwin.chaugule@linaro.org wrote:
...
Hi Mark,
On 17 April 2014 15:50, Mark Rutland mark.rutland@arm.com wrote:
...
On Thu, Apr 17, 2014 at 08:15:46PM +0100, Ashwin Chaugule wrote:
...
PSCIv0.2 adds a new function called AFFINITY_INFO, which
can be used to query if a specified CPU has actually gone
offline. Calling this function via cpu_kill ensures that
a CPU has quiesced after a call to cpu_die.
[...]
...
...
...
We can race with the dying CPU here -- if we call AFFINITY_INFO before
the dying cpu is sufficiently far through its CPU_OFF call it won't
register as OFF.
Could we poll here instead (with a reasonable limit on the number of
iterations)? That would enable us to not spuriously declare a CPU to be
dead when it happened to take slightly longer than we expect to turn
off.
True. How about something like this?
int __ref psci_cpu_kill(unsigned int cpu)
 {

  int err;

  int err, retries;

  if (!psci_ops.affinity_info)
          return 1;

  /*

   * cpu_kill could race with cpu_die and we can

   * potentially end up declaring this cpu undead

   * while it is dying. So retry a couple of times.

   */

+retry:
        err = psci_ops.affinity_info(cpu_logical_map(cpu), 0);
    if (err != PSCI_AFFINITY_INFO_RET_OFF) {

          if (++retries < 3) {

                  pr_info("Retrying check for CPU kill: %d\n", retries);

                  goto retry;

          }
          pr_err("psci: Cannot kill CPU:%d, psci ret val: %d\n",
                          cpu, err);
          /* Make platform_cpu_kill() fail. */

Hi Rob, I've already got your Reviewed-by on this patch without this
"retry" thing. Are you okay with this as well? I can then roll it up
in one patch.
Yes. My only comment is I would perhaps add a sleep (or delay if this
context cannot sleep) on the retry. I'm not sure what I reasonable
time would be, but at least then you are waiting a defined amount of
time versus how long it takes this code to execute.
Rob

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

Re: [Linaro-acpi] [PATCH v6 3/3] ARM: Check if a CPU has gone offline