Hi Mike,
On 2020-06-03 19:21, Mike Leach wrote:
Hi,
On Wed, 3 Jun 2020 at 14:34, Robin Murphy robin.murphy@arm.com wrote:
On 2020-06-03 14:22, Mike Leach wrote:
Hi Sai,
On Wed, 3 Jun 2020 at 13:14, Sai Prakash Ranjan saiprakash.ranjan@codeaurora.org wrote:
Hi Mike,
On 2020-06-03 16:57, Mike Leach wrote:
Hi,
On Wed, 3 Jun 2020 at 11:24, Sai Prakash Ranjan saiprakash.ranjan@codeaurora.org wrote:
Hi Mike,
Thanks again for looking at this.
On 2020-06-03 03:42, Mike Leach wrote: [...]
>> >> SMMU/IOMMU won't be able to do much here as it is the client's >> responsiblity to >> properly shutdown and SMMU device link just makes sure that >> SMMU(supplier) shutdown is >> called only after its consumers shutdown callbacks are called. > > I think this use case can be handled slightly differently than the > general requirements for modular CoreSight drivers. > > What is needed here is a way of stopping the underlying ETR hardware > from issuing data to the SMMU, until the entire device has been shut > down, in a way that does not remove the driver, breaking existing > references and causing a system crash. > > We could introduce a new mode to the ETR driver - e.g. > CS_MODE_SHUTDOWN. > > At the end of the block tmc_shutdown(struct amba_device *adev), set > drvdata->mode to CS_MODE_SHUTDOWN & remove the coresight_unregister(). > This new mode can be used to prevent the underlying hardware from > being able to restart until the device is re-powered. > > This mode can be detected in the code that enables / disables the ETR > and handled appropriately (updates to tmc_enable_etr_sink and > tmc_disable_etr_sink). > This mode will persist until the device is re-started - but because we > are on the device shutdown path this is not an issue. > > This should leave the CoreSight infrastructure stable until the > drivers are shut down normally as part of the device power down > process. >
Sounds good to me, but if the coresight_unregister() is the trouble point causing these crashes, then can't we just remove that from tmc_shutdown() callback? This would be like maintaining the same behaviour as now where on reboot/shutdown we basically don't do anything except for disabling ETR.
No - the new mode prevents race conditions where the thread shutting down the SMMU does the ETR shutdown, but then another thread happens to be trying to start trace and restarts the ETR. It also prevents the condition Mathieu discussed where a thread might be attempting to shutdown trace - this could try to disable the hardware again re-releasing resources/ re-flushing and waiting for stop.
I do not think there will a race between SMMU shutdown and ETR shutdown. Driver core takes care of calling SMMU shutdown after its consumer shutdown callbacks via device link, otherwise there would already be bugs in all other client drivers.
I am not saying there could be a race between tmc_shutdowm and Smmu_shutdown - there may be a case if the coresight_disable_path sequence is running and gets to the point of disabling the ETR after the SMMU callback has disabled it.
I'm confused now - there is no "SMMU callback", we're talking about the system-wide cleanup from kernel_shutdown_prepare() or kernel_restart_prepare(). As far as I'm aware userspace should be long gone by that point, so although trace may have been left running ||
((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7)),
the
chance of racing against other driver operations seems pretty unlikely.
Sorry - bad choice of terminology. I was referring to the SMMU ensuring that it had all its clients shut-down before if shut down. To quote Sai...
SMMU device link just makes sure that >> SMMU(supplier) shutdown is >> called only after its consumers shutdown callbacks are called.
I agree it is unlikely, but if removing the device from the CoreSight infrastructure via coresight_unregister() is a potential source of a crash, it would seem that there is a potential path where some CoreSight driver side work might be possible. therefore a mode to prevent this crash, and ensure that the device hardware remains off and not sending trace to SMMU until such time as shutdown / reboot restart occurs, seemed prudent.
Actually I did not see any crash with coresight_unregister() during reboot/shutdown as I mentioned previously to Mathieu's query on this being similar to remove callback. I think the crash with coresight_unregister() is only seen when we have coresight as module and the userspace is pretty much there to enable/disable trace when we try to bind/unbind. But here we only consider the system reboot/shutdown where pretty much everything is down by this point.
Thanks, Sai