A change was made so that pthread_create is calling
prctl(PR_PAC_RESET_KEYS, ...) on aarch64. It's possible that other
seccomp policies might need to change to allow this.
Test: CrasherTest.seccomp_backtrace passes on aarch64.
Change-Id: I9c4d1b3dca5f19a6285bf904bb942f1f52e42bd0
Talk of "gdb" when we currently mean "gdb or lldb" and will soon mean
"lldb" is starting to confuse people. Let's use the more neutral
"debugger" in places where it really doesn't matter.
The switch from gdbclient.py to lldbclient.py is a change for another
day...
Test: treehugger
Change-Id: If39ca7e1cdf4c8bb9475f1791cdaf201fbea50e0
Bug: http://b/181927912
Clang already has -Wfree-nonheap-object but it became a default warning
with clang-r416183
Test: compile crasher.cpp
Change-Id: Ice532e9f373a628e07acd08a4fc7bfa7cf5d4e08
Proto tombstones were missing tagged fault addresses, tagged_addr_ctrl,
tags in memory dumps and Scudo and GWP-ASan error reports. Since text
tombstones now go via protos, all of these features broke when we
switched to text tombstones generated from protos by default. Fix
the features by adding support for them to the proto format,
tombstone_proto and tombstone_proto_to_text.
Bug: 135772972
Bug: 182489365
Change-Id: I3ca854546c38755b1f6410a1f6198a44d25ed1c5
Looks like we unintentionally had a breakage after aosp/1595302, where
both GWP-ASan and MTE tests started failing because the extra
information wasn't plumbed through the tombstones. MTE has end-to-end
tests but aren't run continuously, and GWP-ASan was missing the e2e
tests.
Also remove some unique wording for GWP-ASan, a UaF on the free'd
pointer is now "0 bytes into a 16-byte allocation" instead of "on a
16-byte allocation". The former is more descriptive and is more
ubiquitously used in our tooling.
This patch adds the E2E tests, but the underlying problem needs to be
fixed as well, before this patch can land.
Bug: 182489365
Test: atest debuggerd_test
Change-Id: I0fe8aba7ea443b3071724987f46b19a6525cda3c
In order to test the platform in emulators that are orders of magnitude
slower than real hardware we need to be able to avoid hitting timeouts
that prevent it from coming up properly. For this purpose introduce
a system property, ro.hw_timeout_multiplier, which may be set to
an integer value that acts as a multiplier for various timeouts on
the system.
Bug: 178231152
Change-Id: I6d7710beed0c4c5b1720e74e7abe3a586778c678
Merged-In: I6d7710beed0c4c5b1720e74e7abe3a586778c678
Application developers would like to know how long their process has
been alive for to distinguish between crashes that happen immediately
upon startup and crashes in regular operation.
Test: manual
Change-Id: Ia31eeadfcced358b478c7a7c7bb2e8a0252e30f4
We're running into timeouts from death tests because we're ~doubling the
cost of crash dumping by doing it twice.
Bug: http://b/180605583
Test: treehugger
Change-Id: If5b40434171323a09960b70af0124ec08bd3fbe8
On cuttlefish, the number of tombstones allowed is much larger
than 50, so change the algorithm to search for any tombstone
file.
Test: Ran unit tests on cuttlefish with > 50 tombstones.
Test: Ran unit tests on device.
Change-Id: Ia1d885fe19a7f7751fe3386d40b48750d1e21bd5
With this change we can report memory errors involving secondary
allocations. Update the existing crasher tests to also test
UAF/overflow/underflow on allocations with sizes sufficient to trigger
the secondary allocator.
Bug: 135772972
Change-Id: Ic8925c1f18621a8f272e26d5630e5d11d6d34d38
We were already doing this for the text tombstones but not for protos,
which meant that we stopped producing protos once we hit the limit
on the number of tombstones. Move the code for the text tombstones
into a common location and call it for both types.
Change-Id: I4951150da51a32d50821d147458fc5c18200c9d4
Otherwise we can fail to find map entries for tagged addresses,
such as those of heap objects.
Bug: 135772972
Change-Id: Ia626b0587c8461eb575b2de5c08562c73ba4a66e
Now that we default to sync MTE in tests, the default tagged_addr_ctrl
in this test needs to be updated.
Bug: 135772972
Change-Id: I9bf6fb29df9799d1ed8c0d8b66f4d2891f487d80
There's no way to atomically unlink a specific file for which we have an fd from
a path, which means that we can't safely delete a tombstone without coordination
with tombstoned, which is risky. For example, if we use flock on the directory,
and system_server crashes while holding the lock, we risk deadlock.
We do the next best thing, and keep a file descriptor around for every
tombstone, and truncate it, which requires system_server to be able to
write to tombstones (which are owned by the system group).
Test: treehugger
Change-Id: I6ba7f1fe87ee1a4b57bdb3741e8ec9fbc80788c9
Respect ro.timeout_multiplier property. Some of these are required for
tombstone writing to work on MTE QEMU, the rest are done speculatively.
Test: add crashing code to system_server, observe the tombstone
Bug: 178231152
Change-Id: Ic86e494af571301df7af07d13a6c046a0da6bda7
libbase logging uses getprogname() to get the default tag, which breaks
for the fallback handler which is statically linked into the dynamic
linker. Switch to libasync_safe for logging.
Test: atest -c CtsSeccompHostTestCases:android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls
Change-Id: Ieeaf33fb26cff4ba7e1589d1d883ac2fcc74cf47
Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: I8a77f6b9e1b42902ef7ee250cc3f1fd341ea0e2b
Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: Ib2403c1b61f6cf0513b76361440fbc5909d7554a
Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: I0c4f3a17e8b06d6c65255388c571ebf11d371dbb
Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: Ia0a1ee57e7630e01c495dc166218f665340aad7f
This reverts commit 675cb30f05.
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Change-Id: I82d228f2bc3e6b426d4703732e1c8766815ccc97
* changes:
libdebuggerd: add protobuf implementation.
tombstoned: support for protobuf fds.
tombstoned: make it easier to add more types of outputs.
tombstoned: switch from goto to RAII.
Currently, all MTE failures end up displaying 'Fault address falls at
0x<addr> after any mapped regions'. Clearly when scanning, we should use
the untagged address to figure out which ranges it's in.
I've taken the liberty of removing all si_addr parsing and moving it
into the common ProcessInfo, as well as making it really explicit
whether you want the (possibly tagged) original si_addr, or whether you
want the untagged variant (for scanning /proc/maps or whatever).
This is not particularly easily testable, as ReadCrashInfo isn't easily
injectable and `dump_all_maps` should already be passed the untagged
pointer to scan for. I've tested this locally on FVP under SYNC MTE with
a simple UaF binary and noted the problem is fixed. Given that this is
making the code more clear, I'm hoping the owners see no need for a
regression test :).
Bug: 135772972
Test: On FVP, run 'adb shell MEMTAG_OPTIONS=sync sanitizer-status' and
check that the use-after-free test ends up with the /proc/maps
desription in the right place.
Change-Id: I220e4200c75a72474a95a67e5bbc36173a438dd2
This commit implements protobuf output for tombstones, along with a
translator that should emit bytewise identical output to the existing
tombstone dumping code, except for ancillary data from GWP-ASan and
Scudo, which haven't been implemented yet.
Test: setprop debug.debuggerd.translate.translate_proto_to_text 1 &&
/data/nativetest64/debuggerd_test/debuggerd_test
Test: for TOMBSTONE in /data/tombstones/tombstone_??; do
pbtombstone $TOMBSTONE.pb | diff $TOMBSTONE -
done
Change-Id: Ieeece6e6d1c26eb608b00ec24e2e725e161c8c92
Sadly, it looks like we do still really use libcutils for some of the
socket functions.
Test: treehugger
Change-Id: Ic71f97507c89b10d2f3b7a2971064a9e6b1d349d
Now that the feature guarded by this flag has landed in Linux 5.10
we no longer need the flag, so we can remove it.
Bug: 135772972
Change-Id: I02fa50848cbd0486c23c8a229bb8f1ab5dd5a56f
- Make it apply to every thread, and thus remove the restriction
that it must be called while the program is single threaded.
- Make it change TCF0 itself (on all threads), instead of requiring
callers to do it themselves, which can be error prone.
And update all of the call sites.
Change the implementation of
android_mallopt(M_DISABLE_MEMORY_MITIGATIONS) to call
android_mallopt(M_SET_HEAP_TAGGING_LEVEL) internally. This avoids
crashes during startup that were observed when the two mallopts
updated TCF0 unaware of each other.
I wouldn't expect there to be any out-of-tree callers at this point,
but it's worth noting that the new interface is backwards compatible
with the old one because it strictly expands the set of situations in
which the API can be used (i.e. situations where there are multiple
threads running or where TCF0 hadn't been updated beforehand).
Bug: 135772972
Change-Id: I7746707898ff31ef2e0af01c4f55ba90b72bef51
The discussion on LKML is converging on v16 of the fault address tag
bits patch [1]. In this version of the patch the presence of the tag
bits in si_addr is controlled by a sa_flags bit, and a protocol is
introduced to allow userspace to detect kernel support for sa_flags
bits. Update the tombstone signal handler to use this API to read
the tag bits, update the interceptors in libsigchain to implement
the flag support detection protocol and hide the tag bits in si_addr
from chained signal handlers that did not request them to match the
kernel behavior.
[1] https://lore.kernel.org/linux-arm-kernel/cover.1605235762.git.pcc@google.com/
Change-Id: I57f24c07c01ceb3e5b81cfc15edf559ef7dfc740
It turns out that I had originally written the test with a local
patch applied that forces TCF0 to SYNC, so it was testing for the
wrong tagged_addr_ctrl value. Fix it.
Bug: 135772972
Change-Id: Ibb9b25e5f5635372ad5de7825c31d7264ff02590
This simplifies some of the logic and removes the need to pass an
Arch value to functions that should already know about the arch
it is operating on.
Includes fixes for debuggerd/libbacktrace.
Added new unit tests to cover new cases.
Test: All unit tests pass.
Test: Faked unwinder failing to verify debuggerd error messages display
Test: properly in backtrace and tombstone.
Change-Id: I439fcae0695befcfb1cb4c0a786cc74949d33425
This value indicates whether memory tagging is enabled on a thread,
the mode (sync or async) and the set of excluded tags. This information
can sometimes be important for understanding an MTE related crash,
so include it in the per-thread tombstone output.
Bug: 135772972
Change-Id: I25a16e10ac7fbb2b1ab2a961a5279f787039000b
Until 77fdb22cf6, logd started as
AID_ROOT and then dropped its privileges. Since then, there's been no
reason to use string comparisons rather than checking the uid.
Test: pkill -SEGV logd
Test: treehugger
Change-Id: Ia709f8f59cb0ab9abac7df84c96c701b5d0a83ea
An OEM asks for sub-second granularity, and that's most easily done if
we only have one timestamp generator. I'm not convinced sub-second
granularity is particularly useful myself, and I definitely don't think
that nanosecond resolution is meaningful but I do like this cleanup, and
if I'm going to use sub-second precision I may as well use the maximum
precision available to me.
Also reduce some duplication of code reading cmdline/comm.
Bug: https://issuetracker.google.com/161860597
Test: head /data/tombstones/*
Change-Id: I035ecfd4a3338ccd84dae0ef973a998a7c7c5056
Tags appear in the addresses printed in the memory dump, which seems
like a reasonable place to put them because tagged addresses will
also appear in other places in the tombstone, such as registers and
the fault address.
Bug: 135772972
Change-Id: I52da338347ff6b7503cf5ac80763c540695dc061
Previously, we would do a simple bounds check before deciding
whether to dump the memory around a register. On 64-bit platforms,
the register's value was required to be less than (4 << 60). However,
after stripping tags on AArch64 as part of r.android.com/1365229, all
pointer values became less than (4 << 60), so the check became useless
for filtering out invalid pointers. As a result, we would attempt to
dump memory for all registers, which for a register not containing
a valid pointer would typically consist of 16 lines of dashes.
One possible fix may be to replace the constant (4 << 60) with the
process's actual address space limit (known as TASK_SIZE inside the
kernel; typically 39 bits on AArch64 and 48 bits on x86_64), but the
kernel provides no API for retrieving a process's TASK_SIZE value. We
could guess it by looking at for example the highest bit set in the
value of getauxval(AT_EXECFN), which points to an address on the stack
which typically is mapped at the end of the address space on program
startup, but at least on AArch64 it is possible to dynamically extend
TASK_SIZE at runtime by providing a hint to mmap(), so this is not
always sufficient.
Instead, it seems best to remove most of the early bounds check, and
simply issue ptrace() calls for each register value, bailing out of
the entire output if none of the calls ended up succeeding. This also
has the nice side effect of avoiding 16 lines of noise per register
whose value looks like a pointer but actually points to unmapped
memory. We still retain part of the bounds check in order to avoid
integer overflow during the dump (including overflows into the tag
part of the address on architectures that support tagging).
Bug: 154272452
Change-Id: I94e4b7124b7735b92fd83a49c80ebded3483cd4e
We do not install a 32-bit version of libminijail to 64/32 devices,
which means that "atest -a debuggerd_test" always fails on 32-bit.
Fix it by statically linking libminijail.
Change-Id: I1e5610d1353b4f5b718c1259825421c0c07d7c24
After r.android.com/1288984 we started failing to dump memory contents
for heap addresses because the tag started causing any addresses to
fail this bounds check. Add an untag_address() call to the bounds check
so that the tag is ignored.
Bug: 154272452
Change-Id: I3a6d1a078b21871bd93164150a123549f83289f6
It's impractical to test the contents of the stack trace, but we
should at least test that *a* stack trace is present, which would
have caught the bug fixed by r.android.com/1306754 .
Bug: 135772972
Change-Id: Ic5e0b997caa53c7eeec4e5185df5c043c9d4fe3d
Teach debuggerd to use the new scudo APIs proposed in
https://reviews.llvm.org/D77283 for extracing MTE error reports from crashed
processes, and include those reports in tombstones if possible.
Bug: 135772972
Change-Id: I082dfd0ac9d781cfed2b8c34cc73562614bb0dbb
log/log.h primarily concerns itself with writing logs. The few users
who read logs should directly include log/log_read.h.
Bug: 78370064
Test: build
Change-Id: Ie95c55ea2ffc76fc95768323d445ada6ad4f2520
If crash_dump dies before it gets a chance to write to the pipe we use
to let the debugged-process know that it successfully started, we
weren't cleaning up the child we fork to start it, leaving a zombie
child.
Bug: http://b/152119184
Test: debuggerd_test
Change-Id: Id01cc05f693995e9998941774f74ab8e3d8b4d8a
On aarch64, the top 8 bits of the address (i.e. the tag bits) of
the fault address in si_addr are always clear. This isn't ideal for
MTE which will require these bits in order to correctly diagnose
tag mismatches.
A proposed kernel patch [1] exposes the full fault address including
the tag bits as part of the ucontext. Change debuggerd to read this
fault address if available.
[1] https://patchwork.kernel.org/patch/11435077/
Bug: 135772972
Change-Id: Ia05be574113860f4e9ecc36a310c4b740e0c4afb
GWP-ASan uses frame-pointer based unwinding internally on
allocation/deallocation to collect stack traces that are used when
crashes are reported.
This should be generic, so pull it out into libunwindstack so it can be
used by MTE as well.
Bug: 152412331
Test: atest debuggerd_test
Change-Id: I27b32263aac63446f5fe398af108676b70cd3971
Similar to r.android.com/1247247 I'll be adding more of them for MTE.
Also, change the protocol between the crasher and crash_dump to make
it easier to add new fields and change the referenced data structures
without needing to worry about versioning. The version number for
static executables is now always 1 (where the protocol will never
change), while the version number for dynamic executables is always
4 (where the protocol can change, because the linker and crash_dump
are version locked).
Bug: 135772972
Change-Id: Ib4696d0544d7c87cb429aaaa15f18c3640059e16
We're now using it in contexts that don't have all of the registers available,
such as GWP-ASan and soon MTE, so it doesn't make sense to have it be a
member function of Regs.
Bug: 135772972
Change-Id: I18b104ea0adb78588d7e475d0624cefc701ba52c
- Create a static library libunwindstack_no_dex without DEX support.
- Use it in libdebuggerd_handler_fallback, whose only use is in the
linker, which shouldn't need that support.
- Use it in init_first_stage, which doesn't need DEX support either.
- Also need a libbacktrace_no_dex since it's in the dependency chain
from init_first_stage to libunwindstack_no_dex.
Also restrict the *_no_dex libs and libdebuggerd_handler_fallback as
much as possible to avoid inadvertent use of these reduced
functionality libs.
Test: m init_first_stage on Cuttlefish
where BOARD_BUILD_SYSTEM_ROOT_IMAGE=false
Test: m system_image com.android.runtime
Test: Build & boot
Test: atest linker-unit-tests libunwindstack_unit_test debuggerd_test
Bug: 142944931
Bug: 151466650
Change-Id: Iaacb29bfe602f3ca12a00a712e2a64c45ff0118b
A future change will introduce a version lock between linker and
crash_dump. Move crash_dump into the runtime APEX alongside linker in order to
ensure that they will be the same version even if the runtime APEX is updated.
Bug: 135772972
Change-Id: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
Merged-In: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
We're now passing around a couple of addresses for GWP-ASan in addition
to abort_msg_address and fdsan_table_address, and I'm going to need to add
more of them for MTE. Move them into a data structure in order to simplify
various function signatures.
Bug: 135772972
Change-Id: Ie01e1bd93a9ab64f21865f56574696825a6a125f
On userdebug/eng devices, check a system property to see whether we
should create tombstones or not. OEMs that would rather have core dumps
can set this property and configure /proc/sys/kernel/core_pattern
appropriately.
Bug: https://issuetracker.google.com/149663286
Test: set the property, cause a crash
Change-Id: If894b4582a1820b64bdae819cec593b7710cb6e3