On http://b/122082295 we had this abort:
12-27 15:29:31.237 10222 10814 10848 F libc : invalid pthread_t 0xb1907960 passed to libc
This wasn't super helpful. We can do better. Now you get something like
this instead:
03-27 02:34:58.754 25329 25329 W libc : invalid pthread_t (0) passed to pthread_join
Test: adb shell crasher
Bug: http://b/123255692
Change-Id: I1d545665a233308480cc3747ec3120e2b6de0453
Initialize a thread's DTV to an empty zeroed DTV. Allocate the DTV and
any ELF module's TLS segment on-demand in __tls_get_addr. Use a generation
counter, incremented in the linker, to signal when threads should
update/reallocate their DTV objects.
A generation count of 0 always indicates the constant zero DTV.
Once a DTV is allocated, it isn't freed until the thread exits, because
a signal handler could interrupt the fast path of __tls_get_addr between
accessing the DTV slot and reading a field of the DTV. Bionic keeps a
linked list of DTV objects so it can free them at thread-exit.
Dynamic TLS memory is allocated using a BionicAllocator instance in
libc_shared_globals. For async-signal safety, access to the
linker/libc-shared state is protected by first blocking signals, then by
acquiring the reader-writer lock, TlsModules::rwlock. A write lock is
needed to allocate or free memory.
In pthread_exit, unconditionally block signals before freeing dynamic
TLS memory or freeing the shadow call stack.
ndk_cruft.cpp: Avoid including pthread_internal.h inside an extern "C".
(The header now includes a C++ template that doesn't compile inside
extern "C".)
Bug: http://b/78026329
Bug: http://b/123094171
Test: bionic unit tests
Change-Id: I3c9b12921c9e68b33dcc1d1dd276bff364eff5d7
For ELF TLS "local-exec" accesses, the static linker assumes that an
executable's TLS segment is located at a statically-known offset from the
thread pointer (i.e. "variant 1" for ARM and "variant 2" for x86).
Because these layouts are incompatible, Bionic generally needs to allocate
its TLS slots differently between different architectures.
To allow per-architecture TLS slots:
- Replace the TLS_SLOT_xxx enumerators with macros. New ARM slots are
generally negative, while new x86 slots are generally positive.
- Define a bionic_tcb struct that provides two things:
- a void* raw_slots_storage[BIONIC_TLS_SLOTS] field
- an inline accessor function: void*& tls_slot(size_t tpindex);
For ELF TLS, it's necessary to allocate a temporary TCB (i.e. TLS slots),
because the runtime linker doesn't know how large the static TLS area is
until after it has loaded all of the initial solibs.
To accommodate Golang, it's necessary to allocate the pthread keys at a
fixed, small, positive offset from the thread pointer.
This CL moves the pthread keys into bionic_tls, then allocates a single
mapping per thread that looks like so:
- stack guard
- stack [omitted for main thread and with pthread_attr_setstack]
- static TLS:
- bionic_tcb [exec TLS will either precede or succeed the TCB]
- bionic_tls [prefixed by the pthread keys]
- [solib TLS segments will be placed here]
- guard page
As before, if the new mapping includes a stack, the pthread_internal_t
is allocated on it.
At startup, Bionic allocates a temporary bionic_tcb object on the stack,
then allocates a temporary bionic_tls object using mmap. This mmap is
delayed because the linker can't currently call async_safe_fatal() before
relocating itself.
Later, Bionic allocates a stack-less thread mapping for the main thread,
and copies slots from the temporary TCB to the new TCB.
(See *::copy_from_bootstrap methods.)
Bug: http://b/78026329
Test: bionic unit tests
Test: verify that a Golang app still works
Test: verify that a Golang app crashes if bionic_{tls,tcb} are swapped
Merged-In: I6543063752f4ec8ef6dc9c7f2a06ce2a18fc5af3
Change-Id: I6543063752f4ec8ef6dc9c7f2a06ce2a18fc5af3
(cherry picked from commit 1e660b70da)
This change is intended to allow native-bridge to use independent
TLS memory for host and guest environments, while still sharing a
thread-local errno between the two.
Bug: http://b/78026329
Test: bionic unit tests
Change-Id: I838cd321e159add60760bc12a8aa7e9ddc960c33
Merged-In: I838cd321e159add60760bc12a8aa7e9ddc960c33
(cherry picked from commit a9c7c55462)
Split __libc_init_main_thread into __libc_init_main_thread_early and
__libc_init_main_thread_late. The early function is called very early in
the startup of the dynamic linker and static executables. It initializes
the global auxv pointer and enough TLS memory to do system calls, access
errno, and run -fstack-protector code (but with a zero cookie because the
code for generating a cookie is complex).
After the linker is relocated, __libc_init_main_thread_late finishes
thread initialization.
Bug: none
Test: bionic unit tests
Change-Id: I6fcd8d7587a380f8bd649c817b40a3a6cc1d2ee0
Merged-In: I6fcd8d7587a380f8bd649c817b40a3a6cc1d2ee0
(cherry picked from commit 39bc44bb0e)
This lets us do two things:
1) Make setjmp and longjmp compatible with shadow call stack.
To avoid leaking the shadow call stack address into memory, only the
lower log2(SCS_SIZE) bits of x18 are stored to jmp_buf. This requires
allocating an additional guard page so that we're guaranteed to be
able to allocate a sufficiently aligned SCS.
2) SCS overflow detection. Overflows now result in a SIGSEGV instead
of corrupting the allocation that comes after it.
Change-Id: I04d6634f96162bf625684672a87fba8b402b7fd1
Test: bionic-unit-tests
Instead of allocating the stack within a 16MB guard region as we
were doing before, just allocate the stack on its own. This isn't
as secure as with the guard region (since it means that an attacker
who can read the pthread_internal_t can determine the address of the
SCS), but it will at least allow us to discover more blockers until
a solution to b/118642754 is decided on.
Bug: 112907825
Bug: 118642754
Change-Id: Ibe5dffbad1b4700eaa0e24177eea792e7c329a61
Landing this change separately to the change that implements SCS
because it needs to land at the same time as an internal change. This
will simplify the situation in case SCS needs to be reverted again.
Change-Id: Ibe18750829665b6dcf6e36628a5e5bbdd1a0dd4f
Merged-In: Ibe18750829665b6dcf6e36628a5e5bbdd1a0dd4f
The signal stack is sufficiently large for a single invocation of the
signal handler, but in cases where the signal handler needs to recurse,
(e.g. if our address space is limited by RLIMIT_AS), it's too small for
us to get to the part where we recognize that we've recursed and bail
out.
Bug: http://b/118772392
Test: /data/nativetest64/debuggerd_test/debuggerd_test64 --gtest_filter=CrasherTest.seccomp_crash_oom
Change-Id: Ic7a2cf8b01b3f7ea7f4a2318a3ec22a0c3649da6
The golang runtime currently expects to find the pthread key data after
the tls slots.
Bug: http://b/78026329
Bug: http://b/118381796
Test: run a golang-based app, bionic unit tests
Change-Id: Idc777d809b803093e1c81d9a2ce4eafcc7d61f8d
Merged-In: Idc777d809b803093e1c81d9a2ce4eafcc7d61f8d
(cherry picked from commit a2c30723da)
This reverts commit da1bc79f93.
Reason for revert: Caused OOM in media process
Bug: 112907825
Bug: 118593766
Change-Id: I545663871d75889b209b9fd2131cdaa97166478f
With ELF TLS, the static linker assumes that an executable's TLS segment
is at a known offset from the thread pointer (i.e. __get_tls()). The
segment can be located prior to the TP (variant 2, x86[_64], sparc) or
after it (variant 1, arm{32,64}, ppc, mips, ia64, riscv).
We can't make our pthread_internal_t exactly follow the ordinary arm64
ABI (at least) because TP[5] is used for clang's -fstack-protector on
Android. Instead, reserve extra space after the TP (up to 16 words), which
will be followed by the executable's TLS segment.
Bug: http://b/78026329
Test: boot device, bionic unit tests
Change-Id: I0f3b270b793f9872ba0effeac03f4dec364438d6
Merged-In: I0f3b270b793f9872ba0effeac03f4dec364438d6
(cherry picked from commit f397317e96)
Make sure that TLS_SLOT_TSAN is always available and correctly set up in
HWASan-instrumented functions by setting up the tls register and running hwasan
initialization (__hwasan_init in the main thread and __hwasan_thread_enter in
secondary) early enough.
This is needed to accomodate a change in HWASan: https://reviews.llvm.org/D52249
Bug: 112438058
Test: boot with SANITIZE_TARGET=hwaddress, run bionic-tests
Change-Id: Icd909a4ea0da6c6c1095522bcc28debef5f2c63d
* Allow sanitization of libc (excluding existing global sanitizers)
and disallow sanitization of linker. The latter has not been
necessary before because HWASan is the first sanitizer to support
static binaries (with the exception of CFI, which is not used
globally).
* Static binary startup: initialize HWASan shadow very early so that
almost entire libc can be sanitized. The rest of initialization is
done in a global constructor; until that is done sanitized code can
run but can't report errors (will simply crash with SIGTRAP).
* Switch malloc_common from je_* to __sanitizer_*.
* Call hwasan functions when entering and leaving threads. We can not
intercept pthread_create when libc depends on libclang_rt.hwasan.
An alternative to this would be a callback interface like requested
here:
https://sourceware.org/glibc/wiki/ThreadPropertiesAPI
All of the above is behind a compile-time check
__has_feature(hwaddress_sanitizer). This means that HWASan actually
requires libc to be instrumented, and would not work otherwise. It's
an implementation choice that greatly reduces complexity of the tool.
Instrumented libc also guarantees that hwasan is present and
initialized in every process, which allows piecemeal sanitization
(i.e. library w/o main executable, or even individual static
libraries), unlike ASan.
Change-Id: If44c46b79b15049d1745ba46ec910ae4f355d19c
We've been using #pragma once for new internal files, but let's be more bold.
Bug: N/A
Test: builds
Change-Id: I7e2ee2730043bd884f9571cdbd8b524043030c07
At the cost of two flag bits for what POSIX thinks should be a boolean
choice, plus somewhat confusing behavior from pthread_attr_getinheritsched
depending on when you call it/what specific scheduler attributes you've
set in the pthread_attr_t, we can emulate the old behavior exactly and
prevent annoying SELinux denial spam caused by calls to sched_setscheduler.
Bug: http://b/68391226
Test: adb logcat on boot contains no sys_nice avc denials
Change-Id: I4f759c2c4fd1d80cceb0912d7da09d35902e2e5e
Historically, Android defaulted to EXPLICIT but with a special case
because SCHED_NORMAL/priority 0 was awkward. Because the code couldn't
actually tell whether SCHED_NORMAL/priority 0 was a genuine attempt to
explicitly set those attributes (because the parent thread is SCHED_FIFO,
say) or just because the pthread_attr_t was left at its defaults.
Now we support INHERIT, we could call sched_getscheduler to see whether
we actually need to call sched_setscheduler, but since the major cost
is the fixed syscall overhead, we may as well just conservatively
call sched_setscheduler and let the kernel decide whether it's a
no-op. (Especially because we'd then have to add both sched_getscheduler
and sched_setscheduler to any seccomp filter.)
Platform code (or app code that only needs to support >= P) can actually
add a call to pthread_attr_setinheritsched to say that they just want
to inherit (if they know that none of their threads actually mess with
scheduler attributes at all), which will save them a sched_setscheduler
call except in the doubly-special case of SCHED_RESET_ON_FORK (which we
do handle).
An alternative would be "make pthread_attr_setschedparams and
pthread_attr_setschedprio set EXPLICIT and change the platform default
to INHERIT", but even though I can only think of weird pathological
examples where anyone would notice that change, that behavior -- of
pthread_attr_setschedparams/pthread_attr_setschedprio overriding an
earlier call to pthread_attr_setinheritsched -- isn't allowed by POSIX
(whereas defaulting to EXPLICIT is).
If we have a lot of trouble with this change in the app compatibility
testing phase, though, we'll want to reconsider this decision!
-*-
This change also removes a comment about setting the scheduler attributes
in main_thread because we'd have to actually keep them up to date,
and it's not clear that doing so would be worth the trouble.
Also make async_safe_format_log preserve errno so we don't have to be
so careful around it.
Bug: http://b/67471710
Test: ran tests
Change-Id: Idd026c4ce78a536656adcb57aa2e7b2c616eeddf
Return EAGAIN rather than aborting if we fail to set up the TLS for a new
thread.
Add a test that uses all the VMAs so we can properly test these edge cases.
Add an explicit test for pthread_attr_setdetachstate, which we use in the
previous test, but other than that has no tests.
Remove support for ro.logd.timestamp/persist.logd.timestamp, which doesn't
seem to be used, and which prevents us from logging failures in cases where
mmap fails (because we need to mmap in the system property implementation).
Bug: http://b/65608572
Test: ran tests
Change-Id: I9009f06546e1c2cc55eff996d08b55eff3482343
This also fixes a long-standing bug where the guard region would be taken
out of the stack itself, rather than being -- as POSIX demands -- additional
space after the stack. Historically a 128KiB stack with a 256KiB guard would
have given you an immediate crash.
Bug: http://b/38413813
Test: builds, boots
Change-Id: Idd12a3899be1d92fea3d3e0fa6882ca2216bd79c
snprintf to a buffer of length PATH_MAX consumes about 7kB of stack.
Bug: http://b/35858739
Test: bionic-unit-tests --gtest_filter="*big_enough*"
Change-Id: I34a7f42c1fd2582ca0d0a9b7e7a5290bc1cc19b1
Thread local buffers were using pthread_setspecific for storage with
lazy initialization. pthread_setspecific shares TLS slots between the
linker and libc.so, so thread local buffers being initialized in a
different order between libc.so and the linker meant that bad things
would happen (manifesting as snprintf not working because the
locale was mangled)
Bug: http://b/20464031
Test: /data/nativetest64/bionic-unit-tests/bionic-unit-tests
everything passes
Test: /data/nativetest/bionic-unit-tests/bionic-unit-tests
thread_local tests are failing both before and after (KUSER_HELPERS?)
Test: /data/nativetest64/bionic-unit-tests-static/bionic-unit-tests-static
no additional failures
Change-Id: I9f445a77c6e86979f3fa49c4a5feecf6ec2b0c3f
Another release, another attempt to fix this bug.
This change affects pthread_detach, pthread_getcpuclockid,
pthread_getschedparam/pthread_setschedparam, pthread_join, and pthread_kill:
instead of returning ESRCH when passed an invalid pthread_t, they'll now SEGV.
Note that this doesn't change behavior as much as you might think: the old
lookup only held the global thread list lock for the duration of the lookup,
so there was still a race between that and the dereference in the caller,
given that callers actually need the tid to pass to some syscall or other,
and sometimes update fields in the pthread_internal_t struct too.
We can't check thread->tid against 0 to see whether a pthread_t is still
valid because a dead thread gets its thread struct unmapped along with its
stack, so the dereference isn't safe.
Taking the affected functions one by one:
* pthread_getcpuclockid and pthread_getschedparam/pthread_setschedparam
should be fine. Unsafe calls to those seem highly unlikely.
* Unsafe pthread_detach callers probably want to switch to
pthread_attr_setdetachstate instead, or using pthread_detach(pthread_self())
from the new thread's start routine rather than doing the detach in the
parent.
* pthread_join calls should be safe anyway, because a joinable thread won't
actually exit and unmap until it's joined. If you're joining an
unjoinable thread, the fix is to stop marking it detached. If you're
joining an already-joined thread, you need to rethink your design.
* Unsafe pthread_kill calls aren't portably fixable. (And are obviously
inherently non-portable as-is.) The best alternative on Android is to
use pthread_gettid_np at some point that you know the thread to be alive,
and then call kill/tgkill directly. That's still not completely safe
because if you're too late, the tid may have been reused, but then your
code is inherently unsafe anyway.
If we find too much code is still broken, we can come back and disable
the global thread list lookups for anything targeting >= O and then have
another go at really removing this in P...
Bug: http://b/19636317
Test: N6P boots, bionic tests pass
Change-Id: Ia92641212f509344b99ee2a9bfab5383147fcba6
Before, dynamic executables would initialize the global stack protector
twice, once for the linker, and once for the executable. This worked
because the result was the same for both initializations, because it
used getauxval(AT_RANDOM), which won't be the case once arc4random gets
used for it.
Bug: http://b/29622562
Change-Id: I7718b1ba8ee8fac7127ab2360cb1088e510fef5c
Test: ran the stack protector tests on angler (32/64bit, static/dynamic)
The code to calculate thread stack and signal stack looks weird:
the thread stack size and signal stack size are related with
each other on 32-bit mode, but not on 64-bit mode. So change the
code to make the logic more resonable. This doesn't change anything
as we have defined SIGSTKSZ to 16K on arm64.
Bug: 28005110
Change-Id: I04d2488cfb96ee7e2d894d062c66cef950fec418
Exactly which functions get a stack protector is up to the compiler, so
let's separate the code that sets up the environment stack protection
requires and explicitly build it with -fno-stack-protector.
Bug: http://b/26276517
Change-Id: I8719e23ead1f1e81715c32c1335da868f68369b5
Currently we use __thread variable to store thread_local_dtors,
which makes tsan test fork_atexit.cc hang. The problem is as below:
The main thread creates a worker thread, the worker thread calls
pthread_exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init) -> emutls_init().
Then the main thread calls fork(), the child process cals
exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init).
So the child process is waiting for pthread_once(emutls_init)
to finish which will never occur.
It might be the test's fault because POSIX standard says if a
multi-threaded process calls fork(), the new process may only
execute async-signal-safe operations until exec functions are
called. And exit() is not async-signal-safe. But we can make
bionic more reliable by not using __thread in
__cxa_thread_finalize().
Bug: 25392375
Change-Id: Ife403dd7379dad8ddf1859c348c1c0adea07afb3
It removes calling to pthread_mutex_lock() at the beginning of new
thread, which helps to support thread sanitizer.
Change-Id: Ia3601c476de7976a9177b792bd74bb200cee0e13
As glibc/netbsd don't protect access to thread struct members by a global
lock, we don't want to do it either. This change reduces the
responsibility of g_thread_list_lock to only protect g_thread_list.
Bug: 19636317
Change-Id: I897890710653dac165d8fa4452c7ecf74abdbf2b
aligned attribute can only control compiler's behavior, but we
are manually allocating pthread_internal_t. So we need to make
sure of alignment manually.
Change-Id: Iea4c46eadf10dfd15dc955c5f41cf6063cfd8536
The errors are introduced in "Make pthread join_state not protected by g_thread_list_lock".
Bug: 19636317
Change-Id: I58ae9711da94bfbac809abfd81311eeb70301a4b
1. Move the representation of thread join_state from pthread.attr.flag
to pthread.join_state. This clarifies thread state change.
2. Use atomic operations for pthread.join_state. So we don't need to
protect it by g_thread_list_lock. g_thread_list_lock will be reduced
to only protect g_thread_list or even removed in further changes.
Bug: 19636317
Change-Id: I31fb143a7c69508c7287307dd3b0776993ec0f43
Make this change because I think it is more reasonable to check stack info
in pthread_getattr_np. I believe pthread_attr_t is not tied with any thread,
and can't have a flag saying who using it is the main thread.
This change also helps refactor of g_thread_list_lock.
Bug: 19636317
Change-Id: Iedbb85a391ac3e1849dd036d01445dac4bc63db9
If pthread_detach() is called while the thread is in pthread_exit(),
it takes the risk that no one can free the pthread_internal_t.
So I add PTHREAD_ATTR_FLAG_ZOMBIE to detect this, maybe very rare, but
both glibc and netbsd libpthread have similar function.
Change-Id: Iaa15f651903b8ca07aaa7bd4de46ff14a2f93835