Hi all,
A recent LLVM change [1] introduces a call to wcslen() in fs/smb/client/smb2pdu.c through UniStrcat() via alloc_path_with_tree_prefix(). Similar to the bcmp() and stpcpy() additions that happened in 5f074f3e192f and 1e1b6d63d634, add wcslen() to fix the linkage failure.
[1]: https://github.com/llvm/llvm-project/commit/9694844d7e36fd5e01011ab56b64f27b...
--- Changes in v3: - During comment shuffle in patch 1, move to standard multi-line comment kernel style (Andy). Carry forward Andy's review. - Move nls_types.h include in string.c in patch 2 to a better place alphabetically (Andy). - Drop 'extern' from wcslen() declaration in string.h, as external linkage is the default for functions and the coding style explicitly forbids it (Andy). - Link to v2: https://lore.kernel.org/r/20250326-string-add-wcslen-for-llvm-opt-v2-0-d864a...
Changes in v2: - Refactor typedefs from nls.h into nls_types.h to make it safe to include in string.h, which may be included in many places throughout the kernel that may not like the other stuff nls.h brings in: https://lore.kernel.org/202503260611.MDurOUhF-lkp@intel.com/ - Drop libstub change due to the above change, as it is no longer necessary. - Move prototype shuffle of patch 2 into the patch that adds wcslen() (Andy) - Use new nls_types.h in string.{c,h} - Link to v1: https://lore.kernel.org/r/20250325-string-add-wcslen-for-llvm-opt-v1-0-b8f1e...
--- Nathan Chancellor (2): include: Move typedefs in nls.h to their own header lib/string.c: Add wcslen()
include/linux/nls.h | 19 +------------------ include/linux/nls_types.h | 26 ++++++++++++++++++++++++++ include/linux/string.h | 2 ++ lib/string.c | 11 +++++++++++ 4 files changed, 40 insertions(+), 18 deletions(-) --- base-commit: 78ab93c78fb31c5dfe207318aa2b7bd4e41f8dba change-id: 20250324-string-add-wcslen-for-llvm-opt-705791db92c0
Best regards,
In order to allow commonly included headers such as string.h to access typedefs such as wchar_t without running into issues with the rest of the NLS library, refactor the typedefs out into their own header that can be included in a much safer manner.
Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko andy@kernel.org Signed-off-by: Nathan Chancellor nathan@kernel.org --- include/linux/nls.h | 19 +------------------ include/linux/nls_types.h | 26 ++++++++++++++++++++++++++ 2 files changed, 27 insertions(+), 18 deletions(-)
diff --git a/include/linux/nls.h b/include/linux/nls.h index e0bf8367b274..3d416d1f60b6 100644 --- a/include/linux/nls.h +++ b/include/linux/nls.h @@ -3,24 +3,7 @@ #define _LINUX_NLS_H
#include <linux/init.h> - -/* Unicode has changed over the years. Unicode code points no longer - * fit into 16 bits; as of Unicode 5 valid code points range from 0 - * to 0x10ffff (17 planes, where each plane holds 65536 code points). - * - * The original decision to represent Unicode characters as 16-bit - * wchar_t values is now outdated. But plane 0 still includes the - * most commonly used characters, so we will retain it. The newer - * 32-bit unicode_t type can be used when it is necessary to - * represent the full Unicode character set. - */ - -/* Plane-0 Unicode character */ -typedef u16 wchar_t; -#define MAX_WCHAR_T 0xffff - -/* Arbitrary Unicode character */ -typedef u32 unicode_t; +#include <linux/nls_types.h>
struct nls_table { const char *charset; diff --git a/include/linux/nls_types.h b/include/linux/nls_types.h new file mode 100644 index 000000000000..9479df1016da --- /dev/null +++ b/include/linux/nls_types.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_NLS_TYPES_H +#define _LINUX_NLS_TYPES_H + +#include <linux/types.h> + +/* + * Unicode has changed over the years. Unicode code points no longer + * fit into 16 bits; as of Unicode 5 valid code points range from 0 + * to 0x10ffff (17 planes, where each plane holds 65536 code points). + * + * The original decision to represent Unicode characters as 16-bit + * wchar_t values is now outdated. But plane 0 still includes the + * most commonly used characters, so we will retain it. The newer + * 32-bit unicode_t type can be used when it is necessary to + * represent the full Unicode character set. + */ + +/* Plane-0 Unicode character */ +typedef u16 wchar_t; +#define MAX_WCHAR_T 0xffff + +/* Arbitrary Unicode character */ +typedef u32 unicode_t; + +#endif /* _LINUX_NLS_TYPES_H */
A recent optimization change in LLVM [1] aims to transform certain loop idioms into calls to strlen() or wcslen(). This change transforms the first while loop in UniStrcat() into a call to wcslen(), breaking the build when UniStrcat() gets inlined into alloc_path_with_tree_prefix():
ld.lld: error: undefined symbol: wcslen
referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54) vmlinux.o:(alloc_path_with_tree_prefix) referenced by nls_ucs2_utils.h:54 (fs/smb/client/../../nls/nls_ucs2_utils.h:54) vmlinux.o:(alloc_path_with_tree_prefix)
The kernel does not build with '-ffreestanding' (which would avoid this transformation) because it does want libcall optimizations in general and turning on '-ffreestanding' disables the majority of them. While '-fno-builtin-wcslen' would be more targeted at the problem, it does not work with LTO.
Add a basic wcslen() to avoid this linkage failure. While no architecture or FORTIFY_SOURCE overrides this, add it to string.c instead of string_helpers.c so that it is built with '-ffreestanding', otherwise the compiler might transform it into a call to itself.
Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/commit/9694844d7e36fd5e01011ab56b64f27b... [1] Signed-off-by: Nathan Chancellor nathan@kernel.org --- include/linux/string.h | 2 ++ lib/string.c | 11 +++++++++++ 2 files changed, 13 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index 0403a4ca4c11..b000f445a2c7 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -10,6 +10,7 @@ #include <linux/stddef.h> /* for NULL */ #include <linux/err.h> /* for ERR_PTR() */ #include <linux/errno.h> /* for E2BIG */ +#include <linux/nls_types.h> /* for wchar_t */ #include <linux/overflow.h> /* for check_mul_overflow() */ #include <linux/stdarg.h> #include <uapi/linux/string.h> @@ -203,6 +204,7 @@ extern __kernel_size_t strlen(const char *); #ifndef __HAVE_ARCH_STRNLEN extern __kernel_size_t strnlen(const char *,__kernel_size_t); #endif +__kernel_size_t wcslen(const wchar_t *s); #ifndef __HAVE_ARCH_STRPBRK extern char * strpbrk(const char *,const char *); #endif diff --git a/lib/string.c b/lib/string.c index eb4486ed40d2..2c6f8c8f4159 100644 --- a/lib/string.c +++ b/lib/string.c @@ -21,6 +21,7 @@ #include <linux/errno.h> #include <linux/limits.h> #include <linux/linkage.h> +#include <linux/nls_types.h> #include <linux/stddef.h> #include <linux/string.h> #include <linux/types.h> @@ -429,6 +430,16 @@ size_t strnlen(const char *s, size_t count) EXPORT_SYMBOL(strnlen); #endif
+size_t wcslen(const wchar_t *s) +{ + const wchar_t *sc; + + for (sc = s; *sc != '\0'; ++sc) + /* nothing */; + return sc - s; +} +EXPORT_SYMBOL(wcslen); + #ifndef __HAVE_ARCH_STRSPN /** * strspn - Calculate the length of the initial substring of @s which only contain letters in @accept
On Fri, 28 Mar 2025 12:26:30 -0700, Nathan Chancellor wrote:
A recent LLVM change [1] introduces a call to wcslen() in fs/smb/client/smb2pdu.c through UniStrcat() via alloc_path_with_tree_prefix(). Similar to the bcmp() and stpcpy() additions that happened in 5f074f3e192f and 1e1b6d63d634, add wcslen() to fix the linkage failure.
[...]
Applied to for-next/hardening, thanks!
[1/2] include: Move typedefs in nls.h to their own header https://git.kernel.org/kees/c/21592017d384 [2/2] lib/string.c: Add wcslen() https://git.kernel.org/kees/c/61df817d1d1b
Take care,
linux-stable-mirror@lists.linaro.org