"Theodore Ts'o" tytso@mit.edu writes:
Now, I'm not sure how important it is to bring back the reverted patch. Yes, I know it's claimed that it fixes a "security issue", but in my opinion, it's a pretty bullshit worry. First, almost no one uses the case folded feature other than Android, and second, do you *really* think someone will be trying to run git under Termux on their Pixel 9 Pro Fold? I mean.... I guess; I do have Termux installed on my P9PF, but even I'm not crazy enough to try to install git, emacs, gcc, etc., on an Android phone and expect to get anything useful done. Using ssh, or mosh, with Termux, sure. But git? Not convinced....
Anyway, if we *do* want to bring back the reverted patch, it would need to be reworked so that there is a bit in the encoding flags which indicates how we are treating Unicode "ignorable" characters, so that e2fsprogs and f2fs-tools can do the right thing. Once the kernel can handle things with and without ignorable characters, on a switchable basis based on a bit in the superblock, then we wouldn't need to use the linear fallback hack, with the attendant performance penalty.
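To make the switchable idea concrete, here is a rough userspace sketch in Python, not kernel code. The flag name FLAG_FOLD_IGNORABLES and the short IGNORABLES list are made up for illustration; the real encoding flags and the kernel's utf8 tables look nothing like this, and Python's str.casefold() only stands in for the kernel's casefold.

```python
import unicodedata

# Hypothetical flag bit, invented for this sketch; it is not an
# existing ext4 or f2fs encoding flag.
FLAG_FOLD_IGNORABLES = 0x1

# Small illustrative subset of Unicode default-ignorable code points:
# soft hyphen, ZWSP, ZWNJ, ZWJ, and the BOM / zero width no-break space.
IGNORABLES = {"\u00ad", "\u200b", "\u200c", "\u200d", "\ufeff"}


def fold_name(name: str, flags: int) -> str:
    """Return the comparison key for a file name under the given flags."""
    folded = unicodedata.normalize("NFD", name).casefold()
    if flags & FLAG_FOLD_IGNORABLES:
        # With the bit set, ignorable code points fold to nothing.
        folded = "".join(ch for ch in folded if ch not in IGNORABLES)
    return folded


def same_name(a: str, b: str, flags: int) -> bool:
    """Compare two names the way a casefolded lookup would."""
    return fold_name(a, flags) == fold_name(b, flags)


# Identical bytes on disk, different answers depending on the bit.
print(same_name("Makefile", "make\u200bfile", FLAG_FOLD_IGNORABLES))
print(same_name("Makefile", "make\u200bfile", 0))
```

The point is only that the comparison key, and therefore any hash derived from it, depends on the bit, which is why userspace tools would need to see the same bit as the kernel.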
But honestly, I'm not sure it's worth it. But if someone sends me a patch which handles the switchable unicode casefold, I'm willing to spend time to get this integrated into e2fsprogs.
What I think would be a correct approach for commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points") is to fold *some* code points: zero-width characters like ZWSP are folded away as they should be, but we restrict the list so that characters that carry some meaning, like the Variation Selectors, are not normalized away. This would be similar to what APFS seems to do. This would be complex, but the user-visible semantics would be slightly more sane. It should be done with caution, with a bit marking this change and preserving the current unicode database, to prevent further breakage. But given the damage this apparently simple patch has caused already, I myself won't pursue that without a real security motivation.
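As a rough illustration of the selective behaviour described above, here is a Python sketch, not a proposal for the actual tables. The ZERO_WIDTH set is an arbitrary example list, which code points belong in it is exactly the policy question, and str.casefold() again only stands in for the kernel's casefold.

```python
# Example set of zero-width code points to fold away: ZWSP, word
# joiner, and the BOM. The choice of members is an assumption made
# for this sketch.
ZERO_WIDTH = {"\u200b", "\u2060", "\ufeff"}


def is_variation_selector(ch: str) -> bool:
    """True for VS1-VS16 and the supplementary VS17-VS256."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def is_ignorable(ch: str) -> bool:
    """Code points the blanket approach would have dropped (subset)."""
    return ch in ZERO_WIDTH or is_variation_selector(ch)


def selective_fold(name: str) -> str:
    """Casefold, dropping ignorables except variation selectors."""
    return "".join(
        ch
        for ch in name.casefold()
        if not is_ignorable(ch) or is_variation_selector(ch)
    )


# ZWSP disappears, so these two names collide as intended.
print(selective_fold("Foo\u200bBar") == selective_fold("foobar"))
# U+2764 with and without VS16 stay distinct names.
print(selective_fold("\u2764\ufe0f") == selective_fold("\u2764"))
```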
Thanks for the linear search patches. Not great, but it solves the current situation. For your ext4 patch:
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>