When I added %m to async_safe_* too, we never followed up and cleaned up
callers.
Test: treehugger
Change-Id: If81943c4c45de49f0fb4bc29cfbd3fc53d4a47fe
Failure to mark shadow stack page as writable will result in a SEGV
fault later when a function tries to save return addresses to shadow
stack. The engineer looking at the crash report would be very confused
because the program crashes at very beginning of an innocent looking
function. For ease of debugging, check for shadow stack errors early.
Test: th
Bug: 279808236
Bug: 253652966
Change-Id: Id2da68fa984b5dfb1846ed14aa7ededee7f2508f
To enable experiments with non-4KiB page sizes, introduce
an inline page_size() function that will either return the runtime
page size (if PAGE_SIZE is not 4096) or a constant 4096 (elsewhere).
This should ensure that there are no changes to the generated code on
unaffected platforms.
Test: source build/envsetup.sh
lunch aosp_cf_arm64_16k_phone-userdebug
m -j32 installclean
m -j32
Test: launch_cvd \
-kernel_path /path/to/out/android14-5.15/dist/Image \
-initramfs_path /path/to/out/android14-5.15/dist/initramfs.img \
-userdata_format=ext4
Bug: 277272383
Bug: 230790254
Change-Id: Ic0ed98b67f7c6b845804b90a4e16649f2fc94028
Neither is great, but "gp" seems actively misleading (and setjmp.S
says x3 every time, so we should be consistent if nothing else).
Bug: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/379
Test: treehugger
Change-Id: Ibccda74d4794caa770b82e7ba2e31ce7b645b83f
We want to give back a useful callee-saved general purpose
register (x18) that was only "chosen" because it was what llvm
allowed for historical reasons. gp is a better choice because it's
effectively unused otherwise anyway.
Unfortunately, that means we need extra space in jmp_buf (which I've
reserved in an earlier change, e7b3b8b467),
so let's rearrange the entries in jmp_buf to match their order in the
register file.
Bug: https://github.com/google/android-riscv64/issues/72
Bug: http://b/277909695
Test: treehugger
Change-Id: Ia629409a894c1a83d2052885702bbdd895c758e1
Map all stacks (primary, thread, and sigaltstack) as PROT_MTE when the
binary requests it through the ELF note.
For the reference, the note is produced by the following toolchain changes:
https://reviews.llvm.org/D118948https://reviews.llvm.org/D119384https://reviews.llvm.org/D119381
Bug: b/174878242
Test: fvp_mini with ToT LLVM (more tests in a separate change)
Change-Id: I04a4e21c966e7309b47b1f549a2919958d93a872
Revert submission 1954983-master-I3030c47be9d02a27505bd4775c1982a20755758c
Reason for revert: PAC has shipped with S, and we're going with app compat outreach rather than regressing security.
Reverted Changes:
I3030c47be:Disable pointer authentication in app processes.
I3030c47be:Disable pointer authentication in app processes.
Change-Id: I8761f08ddbd9077ff98b1a9a0c323de968792778
Unfortunately we have discovered that some applications in the wild
are using PAC instructions incorrectly. To keep those applications
working on PAC enabled devices, disable PAC in application processes
for now.
Bug: 212660282
Change-Id: I3030c47be9d02a27505bd4775c1982a20755758c
This function doesn't return, but it does appear in stack traces. Avoid
using return PAC in this function because we may end up resetting IA,
which may confuse unwinders due to mismatching keys.
Bug: 189808795
Change-Id: I953da9078acd1d43eb7a47fb11f75caa0099fa12
Processes loaded from vendor partitions may have their own sandboxes
that would reject the prctl. Because no devices launched with PAC
enabled before S, we can avoid issues on upgrading devices by checking
for PAC support before issuing the prctl.
Bug: 186117046
Change-Id: I9905b963df01c9007d9fb4527273062ea87a5075
Resetting PAC keys on fork appears to lead to a number of problems. One
problem is that we are constrained in where we can run C++ code after
forking, and with ART those places are implementation-defined. For
example, in app zygotes, ART turns out to insert "interpreter frames"
in the stack trace. Returning into these interpreter frames may lead
to crashes due to failing the ROP protection check on return.
It seems better to reset keys on thread creation instead. We only need
to reset IA because only this key needs to be reset for reverse-edge
PAC, and resetting the other keys may be incompatible with future ABIs.
Chrome (and potentially other applications) has a sandbox that prevents
the use of the prctl, so we restrict its use to applications targeting
S and above.
Bug: 183024045
Change-Id: I1e6502a7d7df319d424e2b0f653aad9a343ae71b
Introduce an android_mallopt(M_DISABLE_MEMORY_MITIGATIONS) API call
that may be used to disable zero- or pattern-init on non-MTE hardware,
or memory tagging on MTE hardware. The intent is that this function
may be called at any time, including when there are multiple threads
running.
Disabling zero- or pattern-init is quite trivial, we just need to set
a global variable to 0 via a Scudo API call (although there will be
some separate work required on the Scudo side to make this operation
thread-safe).
It is a bit more tricky to disable MTE across a process, because
the kernel does not provide an API for disabling tag checking in all
threads in a process, only per-thread. We need to send a signal to each
of the process's threads with a handler that issues the required prctl
call, and lock thread creation for the duration of the API call to
avoid races between thread enumeration and calls to pthread_create().
Bug: 135772972
Change-Id: I81ece86ace916eb6b435ab516cd431ec4b48a3bf
This function will be used by Scudo and GWP-ASan to efficiently collect
stack traces for frames built with frame pointers.
Bug: 135634846
Bug: 135772972
Change-Id: Ic63efdbafe11dfbb1226b5b4b403d53c4dbf28f3
Merged-In: Ic63efdbafe11dfbb1226b5b4b403d53c4dbf28f3
Using ifuncs allows the linker to select faster versions of libc functions
like strcmp, making linking faster.
The linker continues to first initialize TLS, then call the ifunc
resolvers. There are small amounts of code in Bionic that need to avoid
calling functions selected using ifuncs (generally string.h APIs). I've
tried to compile those pieces with -ffreestanding. Maybe it's unnecessary,
but maybe it could help avoid compiler-inserted memset calls, and maybe
it will be useful later on.
The ifuncs are called in a special early pass using special
__rel[a]_iplt_start / __rel[a]_iplt_end symbols. The linker will encounter
the ifuncs again as R_*_IRELATIVE dynamic relocations, so they're skipped
on the second pass.
Break linker_main.cpp into its own liblinker_main library so it can be
compiled with -ffreestanding.
On walleye, this change fixes a recent 2.3% linker64 start-up time
regression (156.6ms -> 160.2ms), but it also helps the 32-bit time by
about 1.9% on the same benchmark. I'm measuring the run-time using a
synthetic benchmark based on loading libandroid_servers.so.
Test: bionic unit tests, manual benchmarking
Bug: none
Merged-In: Ieb9446c2df13a66fc0d377596756becad0af6995
Change-Id: Ieb9446c2df13a66fc0d377596756becad0af6995
(cherry picked from commit 772bcbb0c2)
HWASan-instrumented code needs TLS_SLOT_SANITIZER set up to run, and
that is not done until the new thread calls __hwasan_thread_enter. Block
all signals until that time to prevent hwasan-instrumented signal
handlers running (and crashing) on the new thread.
Bug: 141893397
Test: seq 0 10000000 | xargs -n 1 -P 200 adb shell am instrument \
-w -r -e command grant-all \
com.android.permissionutils/.PermissionInstrumentation
(cherry picked from commit d181585dd5)
Change-Id: Id65fae836edcacdf057327ccf16cf0b5e0f9474a
Bionic creates a single thread mapping to hold a thread's stack and static
TLS memory. Use PR_SET_VMA_ANON_NAME to name this region
"stack_and_tls:tid". dumpsys meminfo can report this region as "Stack"
memory.
The main thread's memory is instead named stack_and_tls:main, and the VMA
is renamed to stack_and_tls:main in a post-fork child.
For the main thread, and threads using pthread_attr_setstack, Bionic still
creates the stack_and_tls mapping, but it only has TLS memory in it.
Bug: http://b/134795155
Test: run "dumpsys meminfo" and verify that this CL increases the reported
stack memory usage from about 4MB to 21MB.
Change-Id: Id1f39ff083329e83426130b4ef94222ffacb90ae
Merged-In: Id1f39ff083329e83426130b4ef94222ffacb90ae
Initialize a thread's DTV to an empty zeroed DTV. Allocate the DTV and
any ELF module's TLS segment on-demand in __tls_get_addr. Use a generation
counter, incremented in the linker, to signal when threads should
update/reallocate their DTV objects.
A generation count of 0 always indicates the constant zero DTV.
Once a DTV is allocated, it isn't freed until the thread exits, because
a signal handler could interrupt the fast path of __tls_get_addr between
accessing the DTV slot and reading a field of the DTV. Bionic keeps a
linked list of DTV objects so it can free them at thread-exit.
Dynamic TLS memory is allocated using a BionicAllocator instance in
libc_shared_globals. For async-signal safety, access to the
linker/libc-shared state is protected by first blocking signals, then by
acquiring the reader-writer lock, TlsModules::rwlock. A write lock is
needed to allocate or free memory.
In pthread_exit, unconditionally block signals before freeing dynamic
TLS memory or freeing the shadow call stack.
ndk_cruft.cpp: Avoid including pthread_internal.h inside an extern "C".
(The header now includes a C++ template that doesn't compile inside
extern "C".)
Bug: http://b/78026329
Bug: http://b/123094171
Test: bionic unit tests
Change-Id: I3c9b12921c9e68b33dcc1d1dd276bff364eff5d7
This implementation simply iterates over each static TLS module and
copies its initialization image into a new thread's static TLS block.
Bug: http://b/78026329
Test: bionic unit tests
Change-Id: Ib7edb665271a07010bc68e306feb5df422f2f9e6
For ELF TLS "local-exec" accesses, the static linker assumes that an
executable's TLS segment is located at a statically-known offset from the
thread pointer (i.e. "variant 1" for ARM and "variant 2" for x86).
Because these layouts are incompatible, Bionic generally needs to allocate
its TLS slots differently between different architectures.
To allow per-architecture TLS slots:
- Replace the TLS_SLOT_xxx enumerators with macros. New ARM slots are
generally negative, while new x86 slots are generally positive.
- Define a bionic_tcb struct that provides two things:
- a void* raw_slots_storage[BIONIC_TLS_SLOTS] field
- an inline accessor function: void*& tls_slot(size_t tpindex);
For ELF TLS, it's necessary to allocate a temporary TCB (i.e. TLS slots),
because the runtime linker doesn't know how large the static TLS area is
until after it has loaded all of the initial solibs.
To accommodate Golang, it's necessary to allocate the pthread keys at a
fixed, small, positive offset from the thread pointer.
This CL moves the pthread keys into bionic_tls, then allocates a single
mapping per thread that looks like so:
- stack guard
- stack [omitted for main thread and with pthread_attr_setstack]
- static TLS:
- bionic_tcb [exec TLS will either precede or succeed the TCB]
- bionic_tls [prefixed by the pthread keys]
- [solib TLS segments will be placed here]
- guard page
As before, if the new mapping includes a stack, the pthread_internal_t
is allocated on it.
At startup, Bionic allocates a temporary bionic_tcb object on the stack,
then allocates a temporary bionic_tls object using mmap. This mmap is
delayed because the linker can't currently call async_safe_fatal() before
relocating itself.
Later, Bionic allocates a stack-less thread mapping for the main thread,
and copies slots from the temporary TCB to the new TCB.
(See *::copy_from_bootstrap methods.)
Bug: http://b/78026329
Test: bionic unit tests
Test: verify that a Golang app still works
Test: verify that a Golang app crashes if bionic_{tls,tcb} are swapped
Merged-In: I6543063752f4ec8ef6dc9c7f2a06ce2a18fc5af3
Change-Id: I6543063752f4ec8ef6dc9c7f2a06ce2a18fc5af3
(cherry picked from commit 1e660b70da)
PR_SET_VMA takes mmap_sem, which can cause contention and reduce
performance any time many threads are created at the same time,
like app startup.
Test: camera launch performance
Bug: 122471935
Change-Id: If7fa7ad99654c01d503f694976fd92bfd30d2afd
Split __libc_init_main_thread into __libc_init_main_thread_early and
__libc_init_main_thread_late. The early function is called very early in
the startup of the dynamic linker and static executables. It initializes
the global auxv pointer and enough TLS memory to do system calls, access
errno, and run -fstack-protector code (but with a zero cookie because the
code for generating a cookie is complex).
After the linker is relocated, __libc_init_main_thread_late finishes
thread initialization.
Bug: none
Test: bionic unit tests
Change-Id: I6fcd8d7587a380f8bd649c817b40a3a6cc1d2ee0
Merged-In: I6fcd8d7587a380f8bd649c817b40a3a6cc1d2ee0
(cherry picked from commit 39bc44bb0e)
This lets us do two things:
1) Make setjmp and longjmp compatible with shadow call stack.
To avoid leaking the shadow call stack address into memory, only the
lower log2(SCS_SIZE) bits of x18 are stored to jmp_buf. This requires
allocating an additional guard page so that we're guaranteed to be
able to allocate a sufficiently aligned SCS.
2) SCS overflow detection. Overflows now result in a SIGSEGV instead
of corrupting the allocation that comes after it.
Change-Id: I04d6634f96162bf625684672a87fba8b402b7fd1
Test: bionic-unit-tests
Instead of allocating the stack within a 16MB guard region as we
were doing before, just allocate the stack on its own. This isn't
as secure as with the guard region (since it means that an attacker
who can read the pthread_internal_t can determine the address of the
SCS), but it will at least allow us to discover more blockers until
a solution to b/118642754 is decided on.
Bug: 112907825
Bug: 118642754
Change-Id: Ibe5dffbad1b4700eaa0e24177eea792e7c329a61
This reverts commit da1bc79f93.
Reason for revert: Caused OOM in media process
Bug: 112907825
Bug: 118593766
Change-Id: I545663871d75889b209b9fd2131cdaa97166478f
Make sure that TLS_SLOT_TSAN is always available and correctly set up in
HWASan-instrumented functions by setting up the tls register and running hwasan
initialization (__hwasan_init in the main thread and __hwasan_thread_enter in
secondary) early enough.
This is needed to accomodate a change in HWASan: https://reviews.llvm.org/D52249
Bug: 112438058
Test: boot with SANITIZE_TARGET=hwaddress, run bionic-tests
Change-Id: Icd909a4ea0da6c6c1095522bcc28debef5f2c63d
We've copied & pasted these to too many places. And if we're going to
have another go at upstreaming these, that's probably yet another reason
to have the *values* in just one place. (Even if upstream wants different
names, we'll likely keep the legacy names around for a while for source
compatibility.)
Bug: http://b/111903542
Test: ran tests
Change-Id: I8ccc557453d69530e5b74f865cbe0b458c84e3ba
* Allow sanitization of libc (excluding existing global sanitizers)
and disallow sanitization of linker. The latter has not been
necessary before because HWASan is the first sanitizer to support
static binaries (with the exception of CFI, which is not used
globally).
* Static binary startup: initialize HWASan shadow very early so that
almost entire libc can be sanitized. The rest of initialization is
done in a global constructor; until that is done sanitized code can
run but can't report errors (will simply crash with SIGTRAP).
* Switch malloc_common from je_* to __sanitizer_*.
* Call hwasan functions when entering and leaving threads. We can not
intercept pthread_create when libc depends on libclang_rt.hwasan.
An alternative to this would be a callback interface like requested
here:
https://sourceware.org/glibc/wiki/ThreadPropertiesAPI
All of the above is behind a compile-time check
__has_feature(hwaddress_sanitizer). This means that HWASan actually
requires libc to be instrumented, and would not work otherwise. It's
an implementation choice that greatly reduces complexity of the tool.
Instrumented libc also guarantees that hwasan is present and
initialized in every process, which allows piecemeal sanitization
(i.e. library w/o main executable, or even individual static
libraries), unlike ASan.
Change-Id: If44c46b79b15049d1745ba46ec910ae4f355d19c
At the cost of two flag bits for what POSIX thinks should be a boolean
choice, plus somewhat confusing behavior from pthread_attr_getinheritsched
depending on when you call it/what specific scheduler attributes you've
set in the pthread_attr_t, we can emulate the old behavior exactly and
prevent annoying SELinux denial spam caused by calls to sched_setscheduler.
Bug: http://b/68391226
Test: adb logcat on boot contains no sys_nice avc denials
Change-Id: I4f759c2c4fd1d80cceb0912d7da09d35902e2e5e
To make it easier for Native Bridge implementations
to override these symbols.
Bug: http://b/67993967
Test: make
Change-Id: I4c53e53af494bca365dd2b3305ab0ccc2b23ba44
Historically, Android defaulted to EXPLICIT but with a special case
because SCHED_NORMAL/priority 0 was awkward. Because the code couldn't
actually tell whether SCHED_NORMAL/priority 0 was a genuine attempt to
explicitly set those attributes (because the parent thread is SCHED_FIFO,
say) or just because the pthread_attr_t was left at its defaults.
Now we support INHERIT, we could call sched_getscheduler to see whether
we actually need to call sched_setscheduler, but since the major cost
is the fixed syscall overhead, we may as well just conservatively
call sched_setscheduler and let the kernel decide whether it's a
no-op. (Especially because we'd then have to add both sched_getscheduler
and sched_setscheduler to any seccomp filter.)
Platform code (or app code that only needs to support >= P) can actually
add a call to pthread_attr_setinheritsched to say that they just want
to inherit (if they know that none of their threads actually mess with
scheduler attributes at all), which will save them a sched_setscheduler
call except in the doubly-special case of SCHED_RESET_ON_FORK (which we
do handle).
An alternative would be "make pthread_attr_setschedparams and
pthread_attr_setschedprio set EXPLICIT and change the platform default
to INHERIT", but even though I can only think of weird pathological
examples where anyone would notice that change, that behavior -- of
pthread_attr_setschedparams/pthread_attr_setschedprio overriding an
earlier call to pthread_attr_setinheritsched -- isn't allowed by POSIX
(whereas defaulting to EXPLICIT is).
If we have a lot of trouble with this change in the app compatibility
testing phase, though, we'll want to reconsider this decision!
-*-
This change also removes a comment about setting the scheduler attributes
in main_thread because we'd have to actually keep them up to date,
and it's not clear that doing so would be worth the trouble.
Also make async_safe_format_log preserve errno so we don't have to be
so careful around it.
Bug: http://b/67471710
Test: ran tests
Change-Id: Idd026c4ce78a536656adcb57aa2e7b2c616eeddf
While this was never an inline, this function alone has caused most of
the bug reports related to _FILE_OFFSET_BITS=64. Providing an inline
for it should allow a lot more code to build with _FILE_OFFSET_BITS=64
when targeting pre-L.
Test: make checkbuild
Test: built trivial cc_binary for LP32 against API 14 with
_FILE_OFFSET_BITS=64 set
Bug: lots
Change-Id: I8479d34af4da358c11423bee43d45b59e9d4143e
Return EAGAIN rather than aborting if we fail to set up the TLS for a new
thread.
Add a test that uses all the VMAs so we can properly test these edge cases.
Add an explicit test for pthread_attr_setdetachstate, which we use in the
previous test, but other than that has no tests.
Remove support for ro.logd.timestamp/persist.logd.timestamp, which doesn't
seem to be used, and which prevents us from logging failures in cases where
mmap fails (because we need to mmap in the system property implementation).
Bug: http://b/65608572
Test: ran tests
Change-Id: I9009f06546e1c2cc55eff996d08b55eff3482343
This also fixes a long-standing bug where the guard region would be taken
out of the stack itself, rather than being -- as POSIX demands -- additional
space after the stack. Historically a 128KiB stack with a 256KiB guard would
have given you an immediate crash.
Bug: http://b/38413813
Test: builds, boots
Change-Id: Idd12a3899be1d92fea3d3e0fa6882ca2216bd79c
(Where errno is relevant.)
Also consistently use -1 as the fd for anonymous mmaps. (It doesn't matter,
but it's more common, and potentially more intention-revealing.)
Bug: http://b/65608572
Test: ran tests
Change-Id: Ie9a207632d8242f42086ba3ca862519014c3c102
__isthreaded is annoying for ARC++ and useless for everyone. Just hard-code
the value in ndk_cruft for LP32 and be done with it.
Bug: N/A
Test: builds
Change-Id: I08f11a404bbec55ed57cb1e18b5116163c7d7d13
This library is used by a number of different libraries in the system.
Make it easy for platform libraries to use this library and create
an actual exported include file.
Change the names of the functions to reflect the new name of the library.
Run clang_format on the async_safe_log.cpp file since the formatting is
all over the place.
Bug: 31919199
Test: Compiled for angler/bullhead, and booted.
Test: Ran bionic unit tests.
Test: Ran the malloc debug tests.
Change-Id: I8071bf690c17b0ea3bc8dc5749cdd5b6ad58478a
__pthread_internal_free doesn't happen on threads that are detached,
causing the bionic TLS allocation (and guard pages) to be leaked.
Fix the leak, and name the allocations to make things apparent if this
ever happens again.
Bug: http://b/36045112
Test: manually ran a program that detached empty threads
Change-Id: Id1c7852b7384474244f7bf5a0f7da54ff962e0a1
Thread local buffers were using pthread_setspecific for storage with
lazy initialization. pthread_setspecific shares TLS slots between the
linker and libc.so, so thread local buffers being initialized in a
different order between libc.so and the linker meant that bad things
would happen (manifesting as snprintf not working because the
locale was mangled)
Bug: http://b/20464031
Test: /data/nativetest64/bionic-unit-tests/bionic-unit-tests
everything passes
Test: /data/nativetest/bionic-unit-tests/bionic-unit-tests
thread_local tests are failing both before and after (KUSER_HELPERS?)
Test: /data/nativetest64/bionic-unit-tests-static/bionic-unit-tests-static
no additional failures
Change-Id: I9f445a77c6e86979f3fa49c4a5feecf6ec2b0c3f
Another release, another attempt to remove the global thread list.
But this time, let's admit that it's not going away. We can switch to using
a read/write lock for the global thread list, and to aborting rather than
quietly returning ESRCH if we're given an invalid pthread_t.
This change affects pthread_detach, pthread_getcpuclockid,
pthread_getschedparam/pthread_setschedparam, pthread_join, and pthread_kill:
instead of returning ESRCH when passed an invalid pthread_t, if you're
targeting O or above, they'll abort with the message "attempt to use
invalid pthread_t".
Note that this doesn't change behavior as much as you might think: the old
lookup only held the global thread list lock for the duration of the lookup,
so there was still a race between that and the dereference in the caller,
given that callers actually need the tid to pass to some syscall or other,
and sometimes update fields in the pthread_internal_t struct too.
(This patch replaces such users with calls to pthread_gettid_np, which
at least makes the TOCTOU window smaller.)
We can't check thread->tid against 0 to see whether a pthread_t is still
valid because a dead thread gets its thread struct unmapped along with its
stack, so the dereference isn't safe.
Taking the affected functions one by one:
* pthread_getcpuclockid and pthread_getschedparam/pthread_setschedparam
should be fine. Unsafe calls to those seem highly unlikely.
* Unsafe pthread_detach callers probably want to switch to
pthread_attr_setdetachstate instead, or using
pthread_detach(pthread_self()) from the new thread's start routine
rather than doing the detach in the parent.
* pthread_join calls should be safe anyway, because a joinable thread
won't actually exit and unmap until it's joined. If you're joining an
unjoinable thread, the fix is to stop marking it detached. If you're
joining an already-joined thread, you need to rethink your design.
* Unsafe pthread_kill calls aren't portably fixable. (And are obviously
inherently non-portable as-is.) The best alternative on Android is to
use pthread_gettid_np at some point that you know the thread to be
alive, and then call kill/tgkill directly.
That's still not completely safe because if you're too late, the tid
may have been reused, but then your code is inherently unsafe anyway.
Bug: http://b/19636317
Test: ran tests
Change-Id: I0372c4428e8a7f1c3af5c9334f5d9c25f2c73f21