On Mon, Aug 19, 2024 at 04:44:42PM +0100, Mark Brown wrote:
On Mon, Aug 19, 2024 at 12:46:13PM +0100, Catalin Marinas wrote:
On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:
- /*
* Ensure that GCS changes are observable by/from other PEs in
* case of migration.
*/
- if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
gcsb_dsync();
[...]
What's the GCSB DSYNC supposed to do here? The Arm ARM talks about ordering between GCS memory effects and other memory effects. I haven't looked at the memory model in detail yet (D11.9.1) but AFAICT it has nothing to do with the system registers. We'll need this barrier when ordering is needed between explicit or implicit (e.g. BL) GCS accesses and the explicit classic memory accesses. Paging comes to mind, so maybe flush_dcache_page() would need this barrier. ptrace() is another case if the memory accessed is a GCS page. I can see you added it in other places, I'll have a look as I go through the rest. But I don't think one is needed here.
It's not particularly for the system registers, it's there so that anything else that looks at the task's GCS sees the current state.
Ah, so that's to ensure that any writes on the CPU to the GCS stack would be observable if the task appears on a different CPU (together with the additional classic ordering/spinlocks used for the run queues). Maybe update the comment to say "GCS memory effects" instead of "GCS changes". I read the latter as GCS sysreg changes. Something like below would make it clearer:
/*
 * Ensure that GCS memory effects of the 'prev' thread are
 * ordered before other memory accesses with release semantics
 * (or preceded by a DMB) on the current PE. In addition, any
 * memory accesses with acquire semantics (or succeeded by a
 * DMB) are ordered before GCS memory effects of the 'next'
 * thread. This will ensure that the GCS memory effects are
 * visible to other PEs in case of migration.
 */
Feel free to rephrase as you see fit.
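
For illustration only, here is a rough userspace model of where the suggested comment would sit relative to the check quoted from the patch. The struct, the barrier counter, the stub bodies and the wrapper name gcs_switch_barrier() are all assumptions made so the snippet compiles outside the kernel; only task_gcs_el0_enabled() and gcsb_dsync() are names taken from the quoted hunk, and 'prev' stands in for 'current' there. It models the control flow, not the actual memory ordering.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel task structure; only the GCS flag is modelled. */
struct task {
	bool gcs_el0_enabled;
};

/* Counts issued barriers so the model can be inspected; not part of the patch. */
static unsigned int gcsb_dsync_count;

/* Stub: on arm64 the real helper would emit the GCSB DSYNC barrier instruction. */
static void gcsb_dsync(void)
{
	gcsb_dsync_count++;
}

/* Stub with an assumed body: reports whether EL0 GCS is enabled for the task. */
static bool task_gcs_el0_enabled(const struct task *t)
{
	return t->gcs_el0_enabled;
}

/* Hypothetical wrapper; 'prev' plays the role of 'current' in the quoted hunk. */
static void gcs_switch_barrier(const struct task *prev, const struct task *next)
{
	/*
	 * Ensure that GCS memory effects of the 'prev' thread are
	 * ordered before other memory accesses with release semantics
	 * (or preceded by a DMB) on the current PE. In addition, any
	 * memory accesses with acquire semantics (or succeeded by a
	 * DMB) are ordered before GCS memory effects of the 'next'
	 * thread. This will ensure that the GCS memory effects are
	 * visible to other PEs in case of migration.
	 */
	if (task_gcs_el0_enabled(prev) || task_gcs_el0_enabled(next))
		gcsb_dsync();
}
```

In this model the barrier is skipped only when neither the outgoing nor the incoming task has GCS enabled, which matches the condition in the quoted hunk.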
I'm pretty confident this is excessive, the goal was to err on the side of correctness and then relax later.
I think we are missing some. Paging should be ok as we have a pte change and TLBI and IIRC the same rules as for standard memory accesses apply. ptrace() memory accesses may need something, though I'm fine with considering this a best effort (we can't guarantee anything anyway if any accesses are on different CPUs). I haven't got to the signal handling patch yet.