Hi there. There's a problem with the thread-local storage register on the Tegra 2 that is exposed by Ubuntu switching to GCC 4.5. The fault is quite serious and causes many applications to segfault. There are more details in LP: #739374.
The problem is Tegra 2 specific and is caused by bit 20 of the CP15 thread pointer register always reading as zero. With GCC 4.4, access to the thread pointer always goes through the GLIBC helper function '__aeabi_read_tp', which calls into the kernel, which in turn reads and returns CP15. Either GLIBC or the kernel[1] itself swaps bit 20 and bit 0, which works around the problem. This doesn't help with GCC 4.5, as it reads CP15 directly and so bypasses the swap.
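To make the swap concrete, here is a minimal sketch in C. The function names are made up for illustration, and 'tegra2_tp_readback' is only a software model of the fault, not real register access. I'm also assuming the trick relies on the thread pointer being at least 2-byte aligned, so that bit 0 is otherwise always zero and free to carry the value of bit 20. This is not taken from the GLIBC or kernel source.

```c
#include <stdint.h>

/* Hypothetical helper: exchange bit 20 and bit 0 of a 32-bit value. */
static inline uint32_t swap_bit20_bit0(uint32_t v)
{
    uint32_t diff = ((v >> 20) ^ v) & 1u;  /* 1 if the two bits differ */
    return v ^ (diff << 20) ^ diff;        /* flip both bits only if they differ */
}

/* Model of the fault described above: bit 20 always reads back as zero. */
static inline uint32_t tegra2_tp_readback(uint32_t stored)
{
    return stored & ~(UINT32_C(1) << 20);
}
```

Under those assumptions, swapping before the write and again after the read round-trips the pointer: the real bit 20 travels in bit 0, and the zero from bit 0 sits in the broken bit 20 where losing it doesn't matter.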
There are a few solutions:
1. Change GCC to swap bits 20 and 0 as well. This is a hack and requires rebuilding the archive. The performance cost should be small.
2. Change GCC to always call the helper function. The helper function can detect the processor and call into the kernel on Tegra devices, or return CP15 directly on others. IFUNC could be used to reduce the overhead. The archive would have to be rebuilt. Performance would be worse than with the option above, but still better than with GCC 4.4.
3. Change GLIBC to allocate thread local storage on a 2 MB boundary, so that bit 20 of the thread pointer is always zero. The thread pointer is a base address, so the thread could still have more than 2 MB of thread local data. GLIBC would have to be rebuilt, and this limits the maximum number of threads. There is no runtime performance hit.
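For the GLIBC option, the arithmetic can be sketched as below. This is only an illustration of why a 2 MB aligned base has bit 20 clear, not a patch against the GLIBC allocator. 'TLS_ALIGN', 'align_up_2m' and 'bit20_is_clear' are hypothetical names, and the rounding wraps around for addresses within 2 MB of the top of the 32-bit address space.

```c
#include <stdint.h>

/* 2 MB = 1 << 21, so an aligned address has bits 0..20 all zero. */
#define TLS_ALIGN (UINT32_C(1) << 21)

/* Hypothetical helper: round an address up to the next 2 MB boundary. */
static inline uint32_t align_up_2m(uint32_t addr)
{
    return (addr + TLS_ALIGN - 1) & ~(TLS_ALIGN - 1);
}

/* Returns non-zero if bit 20 of the address is clear. */
static inline int bit20_is_clear(uint32_t addr)
{
    return (addr & (UINT32_C(1) << 20)) == 0;
}
```

For example, 0x40123456 rounds up to 0x40200000, which has bit 20 clear. Data at an offset from that base can still land at addresses with bit 20 set, since only the base is stored in CP15.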
I prefer changing GLIBC.
Any thoughts? Linaro doesn't support this chip. Who should do the work? Linaro? Ubuntu? The EGLIBC community?
-- Michael
[1] Dave and I disagree on where it is, but neither of us has tracked it down.