On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:
On Tue, Oct 10, 2023 at 10:19:38AM +0200, Borislav Petkov wrote:
On Tue, Oct 10, 2023 at 08:37:16AM +0300, Kirill A. Shutemov wrote:
On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57) gets patched. The patched sites include KASAN code, where KASAN_SHADOW_START depends on __VIRTUAL_MASK_SHIFT, which is defined in terms of cpu_feature_enabled().
So use boot_cpu_has(X86_FEATURE_LA57).
It seems that KASAN gets confused when apply_alternatives() patches the KASAN_SHADOW_START users.
It seems?
A test patch that makes KASAN_SHADOW_START static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.
During text_poke_early() in apply_alternatives(), KASAN should be disabled. KASAN is already disabled in non-_early() text_poke().
It is unclear why the issue was not reported earlier. Bisecting does not help. Older kernels trigger the issue less frequently, but it still occurs. In the absence of any other clear offenders, the initial dynamic 5-level paging support is to blame.
This whole thing sounds like it is still not really clear what is actually happening...
Somewhere along the line __asan_loadN() gets tripped; this then ends up in kasan_check_range() -> check_region_inline() -> addr_has_metadata().
This latter compares the address against kasan_shadow_to_mem(KASAN_SHADOW_START), and KASAN_SHADOW_START includes, as Kirill says, __VIRTUAL_MASK_SHIFT.
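To make the dependency concrete, here is a rough userspace sketch of that check. The constants and the KASAN_SHADOW_START formula are assumptions meant to mirror the generic-KASAN x86-64 definitions (shadow offset 0xdffffc0000000000, scale shift 3, __VIRTUAL_MASK_SHIFT of 47 or 56); the shift is passed as a plain parameter here, whereas the kernel gets it from a patched cpu_feature_enabled() site.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the x86-64 generic KASAN configuration. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Model of KASAN_SHADOW_START for a given __VIRTUAL_MASK_SHIFT (47 or 56). */
static inline uint64_t kasan_shadow_start(unsigned int virtual_mask_shift)
{
	return KASAN_SHADOW_OFFSET +
	       ((~0ULL << virtual_mask_shift) >> KASAN_SHADOW_SCALE_SHIFT);
}

/* Map a shadow address back to the memory address it describes. */
static inline uint64_t kasan_shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Model of addr_has_metadata(): is addr at or above the lowest shadowed address? */
static inline bool addr_has_metadata(uint64_t addr, unsigned int virtual_mask_shift)
{
	return addr >= kasan_shadow_to_mem(kasan_shadow_start(virtual_mask_shift));
}
```

With these assumptions, an address such as 0xff11000000000000 has metadata when the shift is 56 but not when it is 47, so a caller that sees the unpatched value while others already see the patched one will disagree about which accesses are valid.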
Now, obviously you really don't want boot_cpu_has() in __VIRTUAL_MASK_SHIFT, that would be really bad (Linus recently complained about how horrible the code-gen is around this already, must not make it far worse).
Anyway, while half-way through patching X86_FEATURE_LA57, things *are* inconsistent, and I really can't blame things for going sideways.
That said, I don't particularly like the patch. I think it should, at the very least, cover all of apply_alternatives(), not just text_poke_early().
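A toy userspace model of the scoping argument, purely as a sketch: the depth counter stands in for current->kasan_depth, and every function below is a hypothetical stand-in, not the kernel implementation. It assumes instrumented accesses can also happen between individual pokes, while the LA57 sites are only partly patched.

```c
#include <stdbool.h>

static int kasan_depth;            /* models current->kasan_depth */
static int reports;                /* model "KASAN reports" seen so far */
static bool patching_in_progress;  /* true while sites are half patched */

/* Stand-ins for the kernel helpers of the same name. */
static inline void kasan_disable_current(void) { kasan_depth++; }
static inline void kasan_enable_current(void)  { kasan_depth--; }

/* An instrumented access trips only if state is inconsistent and KASAN is on. */
static inline void instrumented_access(void)
{
	if (patching_in_progress && kasan_depth == 0)
		reports++;
}

/* Model of text_poke_early(), optionally guarding only the poke itself. */
static inline void text_poke_early_model(bool guard_here)
{
	if (guard_here)
		kasan_disable_current();
	instrumented_access();         /* the copy into kernel text */
	if (guard_here)
		kasan_enable_current();
}

/* Model of apply_alternatives(); returns the number of reports raised. */
static inline int apply_alternatives_model(int nr_sites, bool guard_all)
{
	reports = 0;
	if (guard_all)
		kasan_disable_current();
	patching_in_progress = true;
	for (int i = 0; i < nr_sites; i++) {
		instrumented_access();     /* work done between pokes */
		text_poke_early_model(!guard_all);
	}
	patching_in_progress = false;
	if (guard_all)
		kasan_enable_current();
	return reports;
}
```

In this model, guarding only the poke still lets the accesses between pokes trip, while guarding the whole loop does not; whether the real apply_alternatives() has such accesses is exactly what the wider scope is meant to stop depending on.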
I can do this, if it is the only stopper.
Do you want it disabled on caller side or inside apply_alternatives()?