Return EAGAIN rather than aborting if we fail to set up the TLS for a new
thread.
Add a test that uses all the VMAs so we can properly test these edge cases.
Add an explicit test for pthread_attr_setdetachstate, which we use in the
previous test, but other than that has no tests.
Remove support for ro.logd.timestamp/persist.logd.timestamp, which doesn't
seem to be used, and which prevents us from logging failures in cases where
mmap fails (because we need to mmap in the system property implementation).
Bug: http://b/65608572
Test: ran tests
Change-Id: I9009f06546e1c2cc55eff996d08b55eff3482343
This also fixes a long-standing bug where the guard region would be taken
out of the stack itself, rather than being -- as POSIX demands -- additional
space after the stack. Historically a 128KiB stack with a 256KiB guard would
have given you an immediate crash.
Bug: http://b/38413813
Test: builds, boots
Change-Id: Idd12a3899be1d92fea3d3e0fa6882ca2216bd79c
snprintf to a buffer of length PATH_MAX consumes about 7kB of stack.
Bug: http://b/35858739
Test: bionic-unit-tests --gtest_filter="*big_enough*"
Change-Id: I34a7f42c1fd2582ca0d0a9b7e7a5290bc1cc19b1
Thread local buffers were using pthread_setspecific for storage with
lazy initialization. pthread_setspecific shares TLS slots between the
linker and libc.so, so thread local buffers being initialized in a
different order between libc.so and the linker meant that bad things
would happen (manifesting as snprintf not working because the
locale was mangled)
Bug: http://b/20464031
Test: /data/nativetest64/bionic-unit-tests/bionic-unit-tests
everything passes
Test: /data/nativetest/bionic-unit-tests/bionic-unit-tests
thread_local tests are failing both before and after (KUSER_HELPERS?)
Test: /data/nativetest64/bionic-unit-tests-static/bionic-unit-tests-static
no additional failures
Change-Id: I9f445a77c6e86979f3fa49c4a5feecf6ec2b0c3f
Another release, another attempt to fix this bug.
This change affects pthread_detach, pthread_getcpuclockid,
pthread_getschedparam/pthread_setschedparam, pthread_join, and pthread_kill:
instead of returning ESRCH when passed an invalid pthread_t, they'll now SEGV.
Note that this doesn't change behavior as much as you might think: the old
lookup only held the global thread list lock for the duration of the lookup,
so there was still a race between that and the dereference in the caller,
given that callers actually need the tid to pass to some syscall or other,
and sometimes update fields in the pthread_internal_t struct too.
We can't check thread->tid against 0 to see whether a pthread_t is still
valid because a dead thread gets its thread struct unmapped along with its
stack, so the dereference isn't safe.
Taking the affected functions one by one:
* pthread_getcpuclockid and pthread_getschedparam/pthread_setschedparam
should be fine. Unsafe calls to those seem highly unlikely.
* Unsafe pthread_detach callers probably want to switch to
pthread_attr_setdetachstate instead, or using pthread_detach(pthread_self())
from the new thread's start routine rather than doing the detach in the
parent.
* pthread_join calls should be safe anyway, because a joinable thread won't
actually exit and unmap until it's joined. If you're joining an
unjoinable thread, the fix is to stop marking it detached. If you're
joining an already-joined thread, you need to rethink your design.
* Unsafe pthread_kill calls aren't portably fixable. (And are obviously
inherently non-portable as-is.) The best alternative on Android is to
use pthread_gettid_np at some point that you know the thread to be alive,
and then call kill/tgkill directly. That's still not completely safe
because if you're too late, the tid may have been reused, but then your
code is inherently unsafe anyway.
If we find too much code is still broken, we can come back and disable
the global thread list lookups for anything targeting >= O and then have
another go at really removing this in P...
Bug: http://b/19636317
Test: N6P boots, bionic tests pass
Change-Id: Ia92641212f509344b99ee2a9bfab5383147fcba6
Before, dynamic executables would initialize the global stack protector
twice, once for the linker, and once for the executable. This worked
because the result was the same for both initializations, because it
used getauxval(AT_RANDOM), which won't be the case once arc4random gets
used for it.
Bug: http://b/29622562
Change-Id: I7718b1ba8ee8fac7127ab2360cb1088e510fef5c
Test: ran the stack protector tests on angler (32/64bit, static/dynamic)
The code to calculate thread stack and signal stack looks weird:
the thread stack size and signal stack size are related with
each other on 32-bit mode, but not on 64-bit mode. So change the
code to make the logic more resonable. This doesn't change anything
as we have defined SIGSTKSZ to 16K on arm64.
Bug: 28005110
Change-Id: I04d2488cfb96ee7e2d894d062c66cef950fec418
Exactly which functions get a stack protector is up to the compiler, so
let's separate the code that sets up the environment stack protection
requires and explicitly build it with -fno-stack-protector.
Bug: http://b/26276517
Change-Id: I8719e23ead1f1e81715c32c1335da868f68369b5
Currently we use __thread variable to store thread_local_dtors,
which makes tsan test fork_atexit.cc hang. The problem is as below:
The main thread creates a worker thread, the worker thread calls
pthread_exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init) -> emutls_init().
Then the main thread calls fork(), the child process cals
exit() -> __cxa_thread_finalize() -> __emutls_get_address()
-> pthread_once(emutls_init).
So the child process is waiting for pthread_once(emutls_init)
to finish which will never occur.
It might be the test's fault because POSIX standard says if a
multi-threaded process calls fork(), the new process may only
execute async-signal-safe operations until exec functions are
called. And exit() is not async-signal-safe. But we can make
bionic more reliable by not using __thread in
__cxa_thread_finalize().
Bug: 25392375
Change-Id: Ife403dd7379dad8ddf1859c348c1c0adea07afb3
It removes calling to pthread_mutex_lock() at the beginning of new
thread, which helps to support thread sanitizer.
Change-Id: Ia3601c476de7976a9177b792bd74bb200cee0e13
As glibc/netbsd don't protect access to thread struct members by a global
lock, we don't want to do it either. This change reduces the
responsibility of g_thread_list_lock to only protect g_thread_list.
Bug: 19636317
Change-Id: I897890710653dac165d8fa4452c7ecf74abdbf2b
aligned attribute can only control compiler's behavior, but we
are manually allocating pthread_internal_t. So we need to make
sure of alignment manually.
Change-Id: Iea4c46eadf10dfd15dc955c5f41cf6063cfd8536
The errors are introduced in "Make pthread join_state not protected by g_thread_list_lock".
Bug: 19636317
Change-Id: I58ae9711da94bfbac809abfd81311eeb70301a4b
1. Move the representation of thread join_state from pthread.attr.flag
to pthread.join_state. This clarifies thread state change.
2. Use atomic operations for pthread.join_state. So we don't need to
protect it by g_thread_list_lock. g_thread_list_lock will be reduced
to only protect g_thread_list or even removed in further changes.
Bug: 19636317
Change-Id: I31fb143a7c69508c7287307dd3b0776993ec0f43
Make this change because I think it is more reasonable to check stack info
in pthread_getattr_np. I believe pthread_attr_t is not tied with any thread,
and can't have a flag saying who using it is the main thread.
This change also helps refactor of g_thread_list_lock.
Bug: 19636317
Change-Id: Iedbb85a391ac3e1849dd036d01445dac4bc63db9
If pthread_detach() is called while the thread is in pthread_exit(),
it takes the risk that no one can free the pthread_internal_t.
So I add PTHREAD_ATTR_FLAG_ZOMBIE to detect this, maybe very rare, but
both glibc and netbsd libpthread have similar function.
Change-Id: Iaa15f651903b8ca07aaa7bd4de46ff14a2f93835
Unfortunately, this change provokes random crashes for ART, and
I have seen libc crashes on the device that might be related to it.
Reverting it fixes the ART crashes. there is unfortunately no
stack trace for the crashes, but just a "Segmentation fault" message.
This reverts commit cc5f6543e3.
Change-Id: I68dca8e1e9b9edcce7eb84596e8db619e40e8052
A lot of third-party code calls the private __get_thread symbol,
often as part of a backport of bionic's pthread_rwlock implementation.
Hopefully this will go away for LP64 (since you're guaranteed the
real implementation there), but there are still APIs that take a tid
and no way to convert between a pthread_t and a tid. pthread_gettid_np
is a public API for that. To aid the transition, make __get_thread
available again for LP32.
(cherry-pick of 27efc48814b8153c55cbcd0af5d9add824816e69.)
Bug: 14079438
Change-Id: I43fabc7f1918250d31d4665ffa4ca352d0dbeac1
In practice, with this implementation we never need to make a system call.
We get the main thread's tid (which is the same as our pid) back from
the set_tid_address system call we have to make during initialization.
A new pthread will have the same pid as its parent, and a fork child's
main (and only) thread will have a pid equal to its tid, which we get for
free from the kernel before clone returns.
The only time we'd actually have to make a getpid system call now is if
we take a signal during fork and the signal handler calls getpid. (That,
or we call getpid in the dynamic linker while it's still dealing with its
own relocations and hasn't even set up the main thread yet.)
Bug: 15387103
Change-Id: I6d4718ed0a5c912fc75b5f738c49a023dbed5189
The problem with the original patch was that using syscall(3) means that
errno can be set, but pthread_create(3) was abusing the TLS errno slot as
a pthread_mutex_t for the thread startup handshake.
There was also a mistake in the check for syscall failures --- it should
have checked against -1 instead of 0 (not just because that's the default
idiom, but also here because futex(2) can legitimately return values > 0).
This patch stops abusing the TLS errno slot and adds a pthread_mutex_t to
pthread_internal_t instead. (Note that for LP64 sizeof(pthread_mutex_t) >
sizeof(uintptr_t), so we could potentially clobber other TLS slots too.)
I've also rewritten the LP32 compatibility stubs to directly reuse the
code from the .h file.
This reverts commit 75c55ff84e.
Bug: 15195455
Change-Id: I6ffb13e5cf6a35d8f59f692d94192aae9ab4593d
This is a much simpler implementation that lets the kernel
do as much as possible.
Co-authored-by: Jörgen Strand <jorgen.strand@sonymobile.com>
Co-authored-by: Snild Dolkow <snild.dolkow@sonymobile.com>
Change-Id: Iad19f155de977667aea09410266d54e63e8a26bf
This replaces the non-standard pthread_mutex_lock_timeout_np, which we have
to keep around on LP32 for binary compatibility.
Change-Id: I098dc7cd38369f0c1bec1fac35687fbd27392e00
The kernel now maintains the pthread_internal_t::tid field for us,
and __clone was only used in one place so let's inline it so we don't
have to leave such a dangerous function lying around. Also rename
files to match their content and remove some useless #includes.
Change-Id: I24299fb4a940e394de75f864ee36fdabbd9438f9
Let the kernel keep pthread_internal_t::tid updated, including
across forks and for the main thread. This then lets us fix
pthread_join to only return after the thread has really exited.
Also fix the thread attributes of the main thread so we don't
unmap the main thread's stack (which is really owned by the
dynamic linker and contains things like environment variables),
which fixes crashes when joining with an exited main thread
and also fixes problems reported publicly with accessing environment
variables after the main thread exits (for which I've added a new
unit test).
In passing I also fixed a bug where if the clone(2) inside
pthread_create(3) fails, we'd unmap the child's stack and TLS (which
contains the mutex) and then try to unlock the mutex. Boom! It wasn't
until after I'd uploaded the fix for this that I came across a new
public bug reporting this exact failure.
Bug: 8206355
Bug: 11693195
Bug: https://code.google.com/p/android/issues/detail?id=57421
Bug: https://code.google.com/p/android/issues/detail?id=62392
Change-Id: I2af9cf6e8ae510a67256ad93cad891794ed0580b
<pthread.h> was missing nonnull attributes, noreturn on pthread_exit,
and had incorrect cv qualifiers for several standard functions.
I've also marked the non-standard stuff (where I count glibc rather
than POSIX as "standard") so we can revisit this cruft for LP64 and
try to ensure we're compatible with glibc.
I've also broken out the pthread_cond* functions into a new file.
I've made the remaining pthread files (plus ptrace) part of the bionic code
and fixed all the warnings.
I've added a few more smoke tests for chunks of untested pthread functionality.
We no longer need the libc_static_common_src_files hack for any of the
pthread implementation because we long since stripped out the rest of
the armv5 support, and this hack was just to ensure that __get_tls in libc.a
went via the kernel if necessary.
This patch also finishes the job of breaking up the pthread.c monolith, and
adds a handful of new tests.
Change-Id: Idc0ae7f5d8aa65989598acd4c01a874fe21582c7
Also remove the SIGSEGV special case, which was probably because
hand-written __exit_with_stack_teardown stubs used to try to cause
SIGSEGV if the exit system call returned (which it never does, so
that dead code disappeared).
Also move the sigprocmask into the only case where it's necessary ---
the one where we unmap the stack that would be used by a signal
handler.
Change-Id: Ie40d20c1ae2f5e7125131b6b492cba7a2c6d08e9
The x86_64 build was failing because clone.S had a call to __thread_entry which
was being added to a different intermediate .a on the way to making libc.so,
and the linker couldn't guarantee statically that such a relocation would be
possible.
ld: error: out/target/product/generic_x86_64/obj/STATIC_LIBRARIES/libc_common_intermediates/libc_common.a(clone.o): requires dynamic R_X86_64_PC32 reloc against '__thread_entry' which may overflow at runtime; recompile with -fPIC
This patch addresses that by ensuring that the caller and callee end up in the
same intermediate .a. While I'm here, I've tried to clean up some of the mess
that led to this situation too. In particular, this removes libc/private/ from
the default include path (except for the DNS code), and splits out the DNS
code into its own library (since it's a weird special case of upstream NetBSD
code that's diverged so heavily it's unlikely ever to get back in sync).
There's more cleanup of the DNS situation possible, but this is definitely a
step in the right direction, and it's more than enough to get x86_64 building
cleanly.
Change-Id: I00425a7245b7a2573df16cc38798187d0729e7c4
This reverts commits eb1b07469f and
d14dc3b87f, and fixes the bug where
we were calling mmap (which might cause errno to be set) before
__set_tls (which is required to implement errno).
Bug: 8557703
Change-Id: I2c36d00240c56e156e1bb430d8c22a73a068b70c