On 5/1/23 9:30 AM, Espen Grindhaug wrote:
On Mon, May 01, 2023 at 08:23:35AM -0700, Yonghong Song wrote:
On 5/1/23 6:00 AM, Espen Grindhaug wrote:
On Thu, Apr 27, 2023 at 06:19:29PM -0700, Yonghong Song wrote:
On 4/27/23 12:19 PM, Espen Grindhaug wrote:
On Wed, Apr 26, 2023 at 02:47:27PM -0700, Yonghong Song wrote:
On 4/23/23 11:55 AM, Espen Grindhaug wrote:
> This change fixes the handling of versions in elf_find_func_offset.
> In the previous implementation, we incorrectly assumed that the

Could you explain in the commit message, ideally with an example, what 'incorrectly' means here? In which situations is the current libbpf implementation not correct?
How about something like this?
libbpf: Improve version handling when attaching uprobe
This change fixes the handling of versions in elf_find_func_offset.
For example, let's assume we are trying to attach an uprobe to pthread_create in glibc. Prior to this commit, it would fail with an error message saying 'elf: ambiguous match [...]'. This is because there are two entries in the symbol table with that name.

$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
0000000000094cc0 T pthread_create@GLIBC_2.2.5
0000000000094cc0 T pthread_create@@GLIBC_2.34
So we go ahead and modify our code to attach to 'pthread_create@@GLIBC_2.34', and this also fails, but this time with the error 'elf: failed to find symbol [...]'. This fails because we incorrectly assumed that the version information would be present in the string found in the string table, but there is only the string 'pthread_create'.
I tried one example with my centos8 libpthread library.
$ llvm-readelf -s /lib64/libc-2.28.so | grep pthread_cond_signal
    39: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
    40: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
  3160: 0000000000096250    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal_2_0
  3589: 0000000000095f70    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal
  5522: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
  5545: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
$ nm -D /lib64/libc-2.28.so | grep pthread_cond_signal
0000000000095f70 T pthread_cond_signal@@GLIBC_2.3.2
0000000000096250 T pthread_cond_signal@GLIBC_2.2.5
$

Note that the two pthread_cond_signal functions have different addresses, which is expected as they are implemented for different versions.
But in your case,
$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
0000000000094cc0 T pthread_create@GLIBC_2.2.5
0000000000094cc0 T pthread_create@@GLIBC_2.34

The two functions have the same address, which is very weird. I suspect there is an issue here that at least needs some investigation.
I am no expert on this, but as far as I can tell, this is normal, although much more common on my Ubuntu machine than my Fedora machine.
Script to find duplicates:
nm -D /usr/lib64/libc-2.33.so | awk '
{
    addr = $1;
    symbol = $3;
    sub(/[@].*$/, "", symbol);

    if (addr == prev_addr && symbol == prev_symbol) {
        if (prev_symbol_printed == 0) {
            print prev_line;
            prev_symbol_printed = 1;
        }
        print;
    } else {
        prev_symbol_printed = 0;
    }
    prev_addr = addr;
    prev_symbol = symbol;
    prev_line = $0;
}'
Second, for the symbol table, the following is the ELF encoding,

typedef struct {
        Elf64_Word      st_name;
        unsigned char   st_info;
        unsigned char   st_other;
        Elf64_Half      st_shndx;
        Elf64_Addr      st_value;
        Elf64_Xword     st_size;
} Elf64_Sym;

where st_name

  An index into the object file's symbol string table, which holds the
  character representations of the symbol names. If the value is nonzero,
  the value represents a string table index that gives the symbol name.
  Otherwise, the symbol table entry has no name.
So, the function name (including @..., @@...) should be in the string table, which is the same for the above two pthread_cond_signal symbols.
I think it is worthwhile to debug why in your situation pthread_create@GLIBC_2.2.5 and pthread_create@@GLIBC_2.34 do not have them in the string table.
I think you are mistaken here; the strings in the string table don't contain the version. Take a look at this partial dump of the string table.
$ readelf -W -p .dynstr /usr/lib64/libc-2.33.so
String dump of section '.dynstr':
  [     1]  xdrmem_create
  [     f]  __wctomb_chk
  [    1c]  getmntent
  [    26]  __freelocale
  [    33]  __rawmemchr
  [    3f]  _IO_vsprintf
  [    4c]  getutent
  [    55]  __file_change_detection_for_path
  (...)
  [  350e]  memrchr
  [  3516]  pthread_cond_signal
  [  352a]  __close
  (...)
  [  61b6]  GLIBC_2.2.5
  [  61c2]  GLIBC_2.2.6
  [  61ce]  GLIBC_2.3
  [  61d8]  GLIBC_2.3.2
  [  61e4]  GLIBC_2.3.3
As you can see, the strings have no versions, and the version strings themselves are also in this table as entries at the end of the table.
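To make the point about the string table concrete, here is a minimal sketch of how an st_name offset resolves to a name. The table contents, the offsets, and the helper name strtab_get are hypothetical and only mimic the layout of .dynstr (NUL-separated strings, empty string at offset 0); this is not libbpf code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature string table laid out like .dynstr:
 * offset 0 is the empty string, the rest are NUL-separated entries.
 * Offsets: 1 = "pthread_cond_signal", 21 = "GLIBC_2.2.5", 33 = "GLIBC_2.3.2". */
static const char demo_strtab[] =
	"\0pthread_cond_signal\0GLIBC_2.2.5\0GLIBC_2.3.2";

/* Return the string stored at byte offset `off` (what st_name holds),
 * or NULL if the offset lies outside the table. */
static const char *strtab_get(const char *tab, size_t tab_size, size_t off)
{
	if (off >= tab_size)
		return NULL;
	return tab + off;
}
```

In this layout both versioned symbols would carry the same st_name (offset 1 here), and the version strings are separate entries that are only tied to a symbol through the version sections.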
I see you search the .dynstr section. Do you think we should search .strtab instead, since it contains versioned symbols?
I searched .dynstr since my libc files only have that section, but I do see your point. If const char *binary_path points to an executable and not an .so file, then we would find some versioned symbols in the .strtab section. However, since libbpf supports using the .so as binary_path, would we not need the functionality to build the complete name regardless?
Okay, so you do not have a .strtab section; it was probably removed with `llvm-strip --strip-all <binary>`. In this particular case, I think your approach of searching SHT_GNU_versym and SHT_GNU_verdef for versioned symbols is probably the right choice. Please do add this information to the commit message.
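For readers unfamiliar with the SHT_GNU_versym section, the sketch below shows how one of its 16-bit entries is usually decoded. The struct and function names are hypothetical, and the masks are defined locally rather than taken from a header, so treat it as an illustration of the encoding and not as the patch's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of a 16-bit .gnu.version entry. Defined locally under
 * demo names because not every <elf.h> provides equivalents. */
#define DEMO_VERSYM_HIDDEN  0x8000u  /* set for a non-default version */
#define DEMO_VERSYM_VERSION 0x7fffu  /* low 15 bits: version index */

struct demo_versym {
	uint16_t index;  /* 0 = local, 1 = global (unversioned), >= 2 = verdef index */
	bool hidden;     /* true: tools print name@VER; false: name@@VER */
};

/* Split a raw versym value into its version index and hidden flag. */
static struct demo_versym demo_decode_versym(uint16_t raw)
{
	struct demo_versym v = {
		.index = (uint16_t)(raw & DEMO_VERSYM_VERSION),
		.hidden = (raw & DEMO_VERSYM_HIDDEN) != 0,
	};
	return v;
}
```

The index would then be matched against the vd_ndx field of the entries in SHT_GNU_verdef to find the version name, which itself lives in the string table.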
Adding a check to not build the full name if it already contains an '@' is probably a good idea, though.
If you search strtab, you will find names with '@', but this won't be the case if you are using SHT_GNU_versym/SHT_GNU_verdef. Since both dynstr and strtab are searched, I guess adding this check is a good idea if the version in the strtab case is not NULL.
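A rough sketch of the comparison being discussed, including the '@' check, might look like the following. The function name, its parameters, and the fixed buffer size are assumptions made for illustration; the actual patch may structure this differently.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: does `requested` (as typed by the user) name the
 * symbol whose string-table name is `sym_name` and whose version string,
 * resolved separately from the version sections, is `version`?
 * `hidden` selects the "@" (non-default) or "@@" (default) separator. */
static bool demo_symbol_matches(const char *requested, const char *sym_name,
				const char *version, bool hidden)
{
	char full[256]; /* assumed upper bound, for the sketch only */
	int n;

	/* Compare directly when the request is unqualified, when no version
	 * is known, or when the name already carries one (.strtab case). */
	if (!strchr(requested, '@') || !version || strchr(sym_name, '@'))
		return strcmp(requested, sym_name) == 0;

	/* Otherwise build name@VER or name@@VER, as nm prints it. */
	n = snprintf(full, sizeof(full), "%s%s%s", sym_name,
		     hidden ? "@" : "@@", version);
	if (n < 0 || (size_t)n >= sizeof(full))
		return false;
	return strcmp(requested, full) == 0;
}
```

With this shape, a request for 'pthread_create@@GLIBC_2.34' matches only the default-version entry, while an unqualified 'pthread_create' still matches both entries and would remain ambiguous.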
This patch reworks how we compare the symbol name provided by the user if it is qualified with a version (using @ or @@). We now look up the correct version string in the version symbol table before constructing the full name, as also done above by nm, before comparing.
> version information would be present in the string found in the
> string table.
>
> We now look up the correct version string in the version symbol
> table before constructing the full name and then comparing.
>
> This patch adds support for both name@version and name@@version to
> match output of the various elf parsers.
>
> Signed-off-by: Espen Grindhaug espen.grindhaug@gmail.com
[...]