This was originally motivated by noticing that we were setting the
wrong bits for the well-known tls entries. That was a harmless bug
because none of the well-known tls entries has a destructor, but
it's best not to leave land mines lying around.
Also add some missing POSIX constants, a new test, and fix
pthread_key_create's return value when we hit the limit.
Change-Id: Ife26ea2f4b40865308e8410ec803b20bcc3e0ed1
We had two copies of the backtrace code, and two copies of the
libcorkscrew /proc/pid/maps code. This patch gets us down to one.
We also had hacks so we could log in the malloc debugging code.
This patch pulls the non-allocating "printf" code out of the
dynamic linker so everyone can share.
This patch also makes the leak diagnostics easier to read, and
makes it possible to paste them directly into the 'stack' tool (by
using relative PCs).
This patch also fixes the stdio standard stream leak that was
causing a leak warning every time tf_daemon ran.
Bug: 7291287
Change-Id: I66e4083ac2c5606c8d2737cb45c8ac8a32c7cfe8
...and don't pass a non-heap pointer to free(3), either.
This patch replaces the "node** prev" with the clearer "node* prev"
style and fixes the null pointer dereference in the old code. That's
not sufficient to fix the reporter's bug, though. The pthread_internal_t*
for the main thread isn't heap-allocated --- __libc_init_tls causes a
pointer to a statically-allocated pthread_internal_t to be added to
the thread list.
Bug: http://code.google.com/p/android/issues/detail?id=37410
Change-Id: I112b7f22782fc789d58f9c783f7b323bda8fb8b7
pthread_no_op_detach_after_join test from bionic-unit-tests hangs
on x86 emulator. There is a race in the pthread_join, pthread_exit,
pthread_detach functions:
- pthread_join waits for the non-detached thread
- pthread_detach sets the detached flag on that thread
- the thread executes pthread_exit which just kills the now-detached
thread, without sending the join notification.
This patch improves the test so it fails on ARM too, and modifies
pthread_detach to behave more like glibc, not setting the detach state if
called on a thread that's already being joined (but not returning an error).
Change-Id: I87dc688221ce979ef5178753dd63d01ac0b108e6
Signed-off-by: Sergey Melnikov <sergey.melnikov@intel.com>
The first NULL pointer check against `attr' suggests that `attr' can
be NULL. Then later `attr' is directly dereferenced, suggesting the
opposite.
if (attr == NULL) {
...
} else {
...
}
...
if (attr->stack_base == ...) { ... }
The public API pthread_create(3) allows NULL, and interprets it as "default".
Our implementation actually swaps in a pointer to the global default
pthread_attr_t, so we don't need any NULL checks in _init_thread. (The other
internal caller passes its own pthread_attr_t.)
Change-Id: I0a4e79b83f5989249556a07eed1f2887e96c915e
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Based on a pair of patches from Intel:
https://android-review.googlesource.com/#/c/43909/https://android-review.googlesource.com/#/c/44903/
For x86, this patch supports _both_ the global that ARM/MIPS use
and the per-thread TLS entry (%gs:20) that GCC uses by default. This
lets us support binaries built with any x86 toolchain (right now,
the NDK is emitting x86 code that uses the global).
I've also extended the original tests to cover ARM/MIPS too, and
be a little more thorough for x86.
Change-Id: I02f279a80c6b626aecad449771dec91df235ad01
I gave up trying to use the usual thread-local buffer idiom; calls to
calloc(3) and free(3) from any of the "dl" functions -- which live in
the dynamic linker -- end up resolving to the dynamic linker's stubs.
I tried to work around that, but was just making things more complicated.
This alternative costs us a well-known TLS slot (instead of the
dynamically-allocated TLS slot we'd have used otherwise, so no difference
there), plus an extra buffer inside every pthread_internal_t.
Bug: 5404023
Change-Id: Ie9614edd05b6d1eeaf7bf9172792d616c6361767
Several previous changes conspired to make a mess of the thread list
in static binaries. This was most obvious when trying to call
pthread_key_delete(3) on the main thread.
Bug: http://code.google.com/p/android/issues/detail?id=36893
Change-Id: I2a2f553114d8fb40533c481252b410c10656da2e
Save thread id to *thread_out before new
thread is allowed to run else there's a
risk that the thread has finished and
been deleted when *thread_out is assigned.
Change-Id: I6b84c61a8df06840877d4ab036f26feace3192d8
A call to pthread_key_delete() after pthread_exit() have unmapped the stack of a thread
but before the ongoing pthread_join() have finished executing will result in an access
to unmapped memory.
Avoid this by invalidating the stack_base and tls pointers during pthread_exit().
This is based on the investigation and proprosed solution by
Srinavasa Nagaraju <srinavasa.x.nagaraju@sonyericsson.com>
Change-Id: I145fb5d57930e91b00f1609d7b2cd16a55d5b3a9
The creation of a thread succeeds even if the requested scheduling
parameters can not be set. This is not POSIX compliant, and even
worse, it leads to a wrong behavior. Let pthread_create() fail in this
case.
Change-Id: Ice66e2a720975c6bde9fe86c2cf8f649533a169c
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
Since e19d702b8e, dlsym and friends use recursive mutexes that
require the current thread id, which is not available before the libc
constructor. This prevents us from using dlsym() in .preinit_array.
This change moves TLS initialization from libc constructor to the earliest
possible point - immediately after linker itself is relocated. As a result,
pthread_internal_t for the initial thread is available from the start.
As a bonus, values stored in TLS in .preinit_array are not lost when libc is
initialized.
Change-Id: Iee5a710ee000173bff63e924adeb4a4c600c1e2d
First commit:
Revert "Revert "am be741d47: am 2f460fbe: am 73b5cad9: Merge "bionic: Fix wrong kernel_id in pthread descriptor after fork()"""
This reverts commit 06823da2f0.
Second commit:
bionic: fix atfork hanlder_mutex deadlock
This cherry-picks commit 34e89c232d
After applying the kernel_id fix, the system refused to boot up and we
got following crash log:
I/DEBUG ( 113): pid: 618, tid: 618 >>> org.simalliance.openmobileapi.service:remote <<<
I/DEBUG ( 113): signal 16 (SIGSTKFLT), code -6 (?), fault addr --------
I/DEBUG ( 113): eax fffffe00 ebx b77de994 ecx 00000080 edx 00724002
I/DEBUG ( 113): esi 00000000 edi 00004000
I/DEBUG ( 113): xcs 00000073 xds 0000007b xes 0000007b xfs 00000000 xss 0000007b
I/DEBUG ( 113): eip b7761351 ebp bfdf3de8 esp bfdf3dc4 flags 00000202
I/DEBUG ( 113): #00 eip: 00015351 /system/lib/libc.so
I/DEBUG ( 113): #01 eip: 0000d13c /system/lib/libc.so (pthread_mutex_lock)
I/DEBUG ( 113): #02 eip: 00077b48 /system/lib/libc.so (__bionic_atfork_run_prepare)
I/DEBUG ( 113): #03 eip: 00052cdb /system/lib/libc.so (fork)
I/DEBUG ( 113): #04 eip: 0009ae91 /system/lib/libdvm.so (_Z18dvmOptimizeDexFileillPKcjjb)
I/DEBUG ( 113): #05 eip: 000819d6 /system/lib/libdvm.so (_Z14dvmJarFileOpenPKcS0_PP7JarFileb)
I/DEBUG ( 113): #06 eip: 000b175e /system/lib/libdvm.so (_ZL40Dalvik_dalvik_system_DexFile_openDexFilePKjP6JValue)
I/DEBUG ( 113): #07 eip: 0011fb94 /system/lib/libdvm.so
Root cause:
The atfork uses the mutex handler_mutex to protect the atfork_head. The
parent will call __bionic_atfork_run_prepare() to lock the handler_mutex,
and need both the parent and child to unlock their own copy of handler_mutex
after fork. At that time, the owner of hanlder_mutex is set as the parent.
If we apply the kernel_id fix, then the child's kernel_id will be set as
child's tid.
The handler_mutex is a recursive lock, and pthread_mutex_unlock(&hander_mutex)
will fail because the mutex owner is the parent, while the current tid
(__get_thread()->kernel_id) is child, not matched with the mutex owner.
At that time, the handler_mutex is left in lock state.If the child wants to
fork other process after than, then it will try to lock handler_mutex, and
then be deadlocked.
Fix:
Since the child has its own copy of vm space from the the parent, the
child space's handler_mutex should be reset to the initialized state.
Change-Id: I3907dd9a153418fb78862f2aa6d0302c375d9e27
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Chenyang Du <chenyang.du@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
Change-Id: Ic8072f366a877443a60fe215f3c00b3df5a259c8
After forking, the kernel_id field in the phtread_internal_t returned by pthread_self()
is incorrect --- it's the tid from the parent, not the new tid of the
child.
The root cause is that: currently the kernel_id is set by
_init_thread(), which is called in 2 cases:
(1) called by __libc_init_common(). That happens when the execv( ) is
called after fork( ). But when the zygote tries to fork the android
application, the child application doesn't call execv( ), instread, it
tries to call the Java main method directly.
(2) called by pthread_create(). That happens when a new thread is
created.
For the lead thread which is the thread created by fork(), it should
call execv() but it doesn't, as described in (1) above. So its kernel_id
will inherit the parent's kernel_id.
Fixed it in this patch.
Change-Id: I63513e82af40ec5fe51fbb69456b1843e4bc0fc7
Signed-off-by: Chenyang Du <chenyang.du@intel.com>
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
This optimization improves the performance of recursive locks
drastically. When running the thread_stress program on a Xoom,
the total time to perform all operations goes from 1500 ms to
500 ms on average after this change is pushed to the device.
Change-Id: I5d9407a9191bdefdaccff7e7edefc096ebba9a9d
This fixes a bug that was introduced in the latest pthread optimization.
It happens when a recursive lock is contented by several threads. The main
issue was that the atomic counter increment in _recursive_increment() could
be annihilated by a non-conditional write in pthread_mutex_lock() used to
update the value's lower bits to indicate contention.
This patch re-introduces the use of the global recursive lock in
_recursive_increment(). This will hit performance, but a future patch
will be provided to remove it from the source code.
Change-Id: Ie22069d376cebf2e7d613ba00b6871567f333544
this works by building a directed graph of acquired
pthread mutexes and making sure there are no loops in
that graph.
this feature is enabled with:
setprop debug.libc.pthread 1
when a potential deadlock is detected, a large warning is
output to the log with appropriate back traces.
currently disabled at compile-time. set PTHREAD_DEBUG_ENABLED=1
to enable.
Change-Id: I916eed2319599e8aaf8f229d3f18a8ddbec3aa8a
This patch provides several small optimizations to the
implementation of mutex locking and unlocking. Note that
a following patch will get rid of the global recursion
lock, and provide a few more aggressive changes, I
though it'd be simpler to split this change in two parts.
+ New behaviour: pthread_mutex_lock et al now detect
recursive mutex overflows and will return EAGAIN in
this case, as suggested by POSIX. Before, the counter
would just wrap to 0.
- Remove un-necessary reloads of the mutex value from memory
by storing it in a local variable (mvalue)
- Remove un-necessary reload of the mutex value by passing
the 'shared' local variable to _normal_lock / _normal_unlock
- Remove un-necessary reload of the mutex value by using a
new macro (MUTEX_VALUE_OWNER()) to compare the thread id
for recursive/errorcheck mutexes
- Use a common inlined function to increment the counter
of a recursive mutex. Also do not use the global
recursion lock in this case to speed it up.
Change-Id: I106934ec3a8718f8f852ef547f3f0e9d9435c816
This patch changes the implementation of pthread_once()
to avoid the use of a single global recursive mutex. This
should also slightly speed up the non-common case where
we have to call the init function, or wait for another
thread to finish the call.
Change-Id: I8a93f4386c56fb89b5d0eb716689c2ce43bdcad9
Pass kernel space sigset_t size to __rt_sigprocmask to workaround
the miss-match of NSIG/sigset_t definition between kernel and bionic.
Note: Patch originally from Google...
Change-Id: I4840fdc56d0b90d7ce2334250f04a84caffcba2a
Signed-off-by: Chenyang Du <chenyang.du@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
(1) in pthread_create:
If the one signal is received before esp is subtracted by 16 and
__thread_entry( ) is called, the stack will be cleared by kernel
when it tries to contruct the signal stack frame. That will cause
that __thread_entry will get a wrong tls pointer from the stack
which leads to the segment fault when trying to access tls content.
(2) in pthread_exit
After pthread_exit called system call unmap(), its stack will be
freed. If one signal is received at that time, there is no stack
available for it.
Fixed by subtracting the child's esp by 16 before the clone system
call and by blocking signal handling before pthread_exit is started.
Author: Jack Ren <jack.ren@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
Use tgkill instead of tkill to implement pthread_kill.
This is safer in the event that the thread has already terminated
and its id has been reused by a different process.
Change-Id: Ied715e11d7eadeceead79f33db5e2b5722954ac9
Allow the kernel to choose a memory location to put the
thread stack, rather than hard coding 0x10000000
Change-Id: Ib1f37cf0273d4977e8d274fbdab9431ec1b7cb4f
We're going to modify the __atomic_xxx implementation to provide
full memory barriers, to avoid problems for NDK machine code that
link to these functions.
First step is to remove their usage from our platform code.
We now use inlined versions of the same functions for a slight
performance boost.
+ remove obsolete atomics_x86.c (was never compiled)
NOTE: This improvement was benchmarked on various devices.
Comparing a pthread mutex lock + atomic increment + unlock
we get:
- ARMv7 emulator, running on a 2.4 GHz Xeon:
before: 396 ns after: 288 ns
- x86 emulator in KVM mode on same machine:
before: 27 ns after: 27 ns
- Google Nexus S, in ARMv7 mode (single-core):
before: 82 ns after: 76 ns
- Motorola Xoom, in ARMv7 mode (multi-core):
before: 121 ns after: 120 ns
The code has also been rebuilt in ARMv5TE mode for correctness.
Change-Id: Ic1dc72b173d59b2e7af901dd70d6a72fb2f64b17
The old code didn't work because the kernel expects a 64-bit sigset_t
while the one provided by our ABI is only 32-bit. This is originally
due to the fact that the kernel headers themselves define sigset_t
as a 32-bit type when __KERNEL__ is not defined (apparently to cater
to libc5 or some similarly old C library).
We can't modify the size of sigset_t without breaking the NDK ABI,
so instead perform runtime translation during the call.
Change-Id: Ibfdc3cbceaff864af7a05ca193aa050047b4773f
After this change, SIGRTMAX will be set to 64 (instead of 32 currently).
Note that this doesn't change the fact that our sigset_t is still defined
as a 32-bit unsigned integer, so most functions that deal with this type
won't support real-time signals though.
Change-Id: Ie1e2f97d646f1664f05a0ac9cac4a43278c3cfa8
The implementation was using a double-checked locking approach that
could break on SMP.
In addition to the barriers I also switched to a volatile pointer. I
don't think this will matter unless gcc can conclude that _normal_lock
can't affect *once_control, but I figured it was better to be safe.
(It seems to have no impact whatsoever on the generated code.)
Bug 3022795.
Change-Id: Ib91da25d57ff5bee4288526e39d457153ef6aacd
This adds an explicit memory barrier to condition variable signaling.
It's a little murky as to whether it's strictly required, but it seems
like a wise thing to do.
Change-Id: Id0faa542d61e4b8ffa775e4adf68e4d7471f4fb7
Also add missing declarations to misc. functions.
Fix clearerr() implementation (previous was broken).
Handle feature test macros like _POSIX_C_SOURCE properly.
Change-Id: Icdc973a6b9d550a166fc2545f727ea837fe800c4
We simply copy the stuff we need from cutils headers.
A future patch will change cutils to include the private <bionic_atomic_inline.h>
Change-Id: Ib6fd9a03bc9e337ce867bd606dc94c2b4438480a