Hi Dmitry,
Here is the log of our boot-up with the two-frequency patch.
Thanks, Kim
-----Original Message-----
From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Sent: Monday, March 6, 2023 4:55 AM
To: Kim Steiner <kim.steiner@sightlineapplications.com>; Jordan Holt <jordan.holt@sightlineapplications.com>
Cc: Paul Neuhardt <paul.neuhardt@linaro.org>; sightlineapplications@lists.linaro.org
Subject: Re: Low temperature GPU hang testing
Hi Kim,
On 03/03/2023 16:08, Kim Steiner wrote:
Hi Dmitry,
Are you having any success with setting up a freezer to reproduce the GPU hang bug?
I had partial success: I was able to reproduce an issue once with my setup. I pushed several patches fixing small GPU issues, fixing the devfreq/scaling frequency, and changing the way the GPU handles the MX power rail. I also pushed several patches that trigger the GPU to use higher frequencies/higher power consumption when exposed to lower temperatures. I'm currently evaluating whether it is possible to slightly increase the voltage under low-temperature conditions.
The changes have now propagated to the old Git server, so you can pull them from the old location.
Also, are you able to get us a patch for a single GPU frequency in the latest code base?
I have attached a patch that limits the frequencies to 560 MHz and 624 MHz, as those two frequencies are supported by most msm8996 SoCs. If your device supports 624 MHz for the GPU, you can also disable the 560 MHz opp entry. Note that this patch overrides some of the voltage controls and thus can occasionally lead to crashes.
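One way to confirm that the patch took effect after boot is to check which frequencies devfreq advertises for the GPU. The following is only a minimal sketch, not part of the patch: it assumes the GPU is exposed under /sys/class/devfreq with the standard available_frequencies attribute and that the values are reported in Hz. The function names and the devfreq_dir argument are hypothetical, and the actual device directory name depends on the board and kernel.

```python
from pathlib import Path

# Frequencies the patch is described as keeping, in Hz (assumed unit).
EXPECTED_HZ = {560_000_000, 624_000_000}


def parse_available_frequencies(text):
    """Parse a whitespace-separated list of frequencies into a set of ints."""
    return {int(token) for token in text.split()}


def unexpected_frequencies(text, expected=EXPECTED_HZ):
    """Return advertised frequencies that are outside the expected set."""
    return parse_available_frequencies(text) - set(expected)


def check_gpu_devfreq(devfreq_dir):
    """Read available_frequencies from a devfreq device directory.

    devfreq_dir is a placeholder such as /sys/class/devfreq/<gpu-device>;
    the real name must be looked up on the target.
    """
    text = (Path(devfreq_dir) / "available_frequencies").read_text()
    return unexpected_frequencies(text)
```

An empty result from check_gpu_devfreq() would suggest that only the 560 MHz and/or 624 MHz entries remain; any other value points at an opp entry that is still enabled.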
Could you please perform:
- a performance test of the new kernel
- a performance test of the new kernel with the patch applied?
Thanks,
Kim
From: Jordan Holt <jordan.holt@sightlineapplications.com>
Sent: Friday, February 24, 2023 1:15 PM
To: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>; Kim Steiner <kim.steiner@sightlineapplications.com>
Cc: Paul Neuhardt <paul.neuhardt@linaro.org>; sightlineapplications@lists.linaro.org
Subject: Low temperature GPU hang testing
Hi Dmitry,
I am running the 4000 as follows to cause the GPU hang bug:
- Board in a freezer – ours isn’t super controllable, but a thermometer says it is at -18C.
- Set up so that you can cycle the board power – needed when you hit the crash
- I have a small fan in the freezer pointed at the board with a power cord coming out so that I can turn it on and off.
- Default parameters on the 4000 except
- Cam 0 is set up with TestPattern Multi Car, 1080x1920, color
- Stream network video to your PC
- Parameters -> Save To Board
- View Performance Graphs, check Enable System Status and look at the Temperature value (see below)
- Try to get the temperature into the high 50s to low 60s °F (roughly 14 to 17 °C; sorry it’s F instead of C!) and you will see *Warning: GPU hang detected …* messages and jumpy video.
- Turn on the fan if the temperature is too high (usually not for very long)
- You can try turning on other processing temporarily if the temperature is too low, then turn these off once it warms up.
  i. Enhance -> Contrast Mode = CLAHE
  ii. Track -> Detect -> Mode = Aerial
- Occasionally the whole system will crash and require a power cycle.
- Let’s get on a call if you are struggling to get it into the zone.
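Since the Performance Graphs report the temperature in Fahrenheit, the target zone and the fan/load decision from the steps above can be sketched as follows. This is only an illustration: the numeric bounds of 57 °F and 63 °F are an assumed reading of "high 50s to low 60s", and the function names are hypothetical.

```python
# Assumed numeric bounds for "high 50s to low 60s" in degrees Fahrenheit.
ZONE_F = (57.0, 63.0)


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def next_action(temp_f, zone=ZONE_F):
    """Suggest what to do for a given board temperature reading in Fahrenheit."""
    low, high = zone
    if temp_f > high:
        return "turn fan on"        # too warm: cool the board down
    if temp_f < low:
        return "enable extra processing"  # too cold: add load to warm it up
    return "in zone"                # hang messages are expected here
```

For example, a reading of 59 °F converts to 15 °C and falls inside the assumed zone.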
Thanks,
Jordan
Jordan Holt | CTO | SightLine Applications Inc. | Onboard Video Processing | he/him | 503 724-9727
--
With best wishes
Dmitry