Rather than have to create a number of #if defines for the memory
dumping parts of the tombstone, create a single function to generate
these strings for the memory tests.
Make CrasherTest.smoke use a regex that passes on 32 bit and 64 bit.
Make the tests page size agnostic.
Bug: 339017792
Test: Treehugger.
Test: Ran 32 bit and 64 bit versions of tests on a real device.
Test: Ran on the aosp_cf_x86_64_phone_pgagnostic-trunk_staging-userdebug
Change-Id: If9365061b85de23b00a1bf947d85923cde06c068
This is done so that we could depend on it elsewhere without needing all the unrelated methods.
Needed for ag/24553347
Bug: 296207744
Test: refactoring build
Change-Id: I7c6733208f3ae63ba9559753a24cffcb8e1b9d1e
r.android.com/2108505 was intended to fix a crash in Scudo in
the case where the stack depot, region info or ring buffer were
unreadable. However, it also ended up introducing a number of bugs into
the code. It failed to call __scudo_get_error_info if the page at the
fault address was unreadable. This can happen in legitimate crash cases
if a primary allocation was close to the boundary of a mapped region,
or if the allocation was a secondary allocation with guard pages. It
also used long as the type for tags, whereas Scudo expects it to be
char. In combination this ended up causing most of the MTE tests to
fail. Therefore, mostly revert that change.
Fix the original crash by null checking the pointers returned by
AllocAndReadFully before proceeding with the rest of the function.
Bug: 233720136
Change-Id: I04d70d2abffaa35fe315d15d9224f9b412a9825d
The code doesn't properly check if data is not read properly, so
make it fail if reads fail. Also, change the algorithm so that
first try and read the faulting page then 16 pages before and 16
pages after. Rather than trying to read every one of these pages,
stop as soon as one is unreadable. This means that the total memory
passed to the scudo error function is all valid data, rather than
potentially being some uninitialized memory.
Added new unit tests to cover scudo address processing.
Bug: 233720136
Test: All unit tests pass.
Test: atest CtsIncidentHostTestCases
Change-Id: I18a97bdee9a0c44075c1c31ccd1b546d10895be9
Hard to get otherwise if you're trying to debug PAC issues.
Bug: http://b/214314197
Test: treehugger
Change-Id: I2e5502809f84579bf287364e59d6e7ff67770919
It is expensive to keep the non-protobuf path around and it hasn't
been used for an entire release without anyone noticing, so remove it.
Create new end-to-end unit tests that cover tests of the non-proto
code paths that are being deleted.
Bug: 197981919
Test: Unit tests pass.
Change-Id: Ia1c45572300bd63e5f196ad61e5e5386830c8ece
strerror is nice, but usually I don't care about the text, I care about
the uppercase enum
Bug: N/A
Test: N/A
Change-Id: I8ea86220cb04cbded701379c47b8aba8ea8864b8
Proto tombstones were missing tagged fault addresses, tagged_addr_ctrl,
tags in memory dumps and Scudo and GWP-ASan error reports. Since text
tombstones now go via protos, all of these features broke when we
switched to text tombstones generated from protos by default. Fix
the features by adding support for them to the proto format,
tombstone_proto and tombstone_proto_to_text.
Bug: 135772972
Bug: 182489365
Change-Id: I3ca854546c38755b1f6410a1f6198a44d25ed1c5
Looks like we unintentionally had a breakage after aosp/1595302, where
both GWP-ASan and MTE tests started failing because the extra
information wasn't plumbed through the tombstones. MTE has end-to-end
tests but aren't run continuously, and GWP-ASan was missing the e2e
tests.
Also remove some unique wording for GWP-ASan, a UaF on the free'd
pointer is now "0 bytes into a 16-byte allocation" instead of "on a
16-byte allocation". The former is more descriptive and is more
ubiquitously used in our tooling.
This patch adds the E2E tests, but the underlying problem needs to be
fixed as well, before this patch can land.
Bug: 182489365
Test: atest debuggerd_test
Change-Id: I0fe8aba7ea443b3071724987f46b19a6525cda3c
An OEM asks for sub-second granularity, and that's most easily done if
we only have one timestamp generator. I'm not convinced sub-second
granularity is particularly useful myself, and I definitely don't think
that nanosecond resolution is meaningful but I do like this cleanup, and
if I'm going to use sub-second precision I may as well use the maximum
precision available to me.
Also reduce some duplication of code reading cmdline/comm.
Bug: https://issuetracker.google.com/161860597
Test: head /data/tombstones/*
Change-Id: I035ecfd4a3338ccd84dae0ef973a998a7c7c5056
Tags appear in the addresses printed in the memory dump, which seems
like a reasonable place to put them because tagged addresses will
also appear in other places in the tombstone, such as registers and
the fault address.
Bug: 135772972
Change-Id: I52da338347ff6b7503cf5ac80763c540695dc061
Previously, we would do a simple bounds check before deciding
whether to dump the memory around a register. On 64-bit platforms,
the register's value was required to be less than (4 << 60). However,
after stripping tags on AArch64 as part of r.android.com/1365229, all
pointer values became less than (4 << 60), so the check became useless
for filtering out invalid pointers. As a result, we would attempt to
dump memory for all registers, which for a register not containing
a valid pointer would typically consist of 16 lines of dashes.
One possible fix may be to replace the constant (4 << 60) with the
process's actual address space limit (known as TASK_SIZE inside the
kernel; typically 39 bits on AArch64 and 48 bits on x86_64), but the
kernel provides no API for retrieving a process's TASK_SIZE value. We
could guess it by looking at for example the highest bit set in the
value of getauxval(AT_EXECFN), which points to an address on the stack
which typically is mapped at the end of the address space on program
startup, but at least on AArch64 it is possible to dynamically extend
TASK_SIZE at runtime by providing a hint to mmap(), so this is not
always sufficient.
Instead, it seems best to remove most of the early bounds check, and
simply issue ptrace() calls for each register value, bailing out of
the entire output if none of the calls ended up succeeding. This also
has the nice side effect of avoiding 16 lines of noise per register
whose value looks like a pointer but actually points to unmapped
memory. We still retain part of the bounds check in order to avoid
integer overflow during the dump (including overflows into the tag
part of the address on architectures that support tagging).
Bug: 154272452
Change-Id: I94e4b7124b7735b92fd83a49c80ebded3483cd4e
We're now passing around a couple of addresses for GWP-ASan in addition
to abort_msg_address and fdsan_table_address, and I'm going to need to add
more of them for MTE. Move them into a data structure in order to simplify
various function signatures.
Bug: 135772972
Change-Id: Ie01e1bd93a9ab64f21865f56574696825a6a125f
GWP-ASan can provide information about a crash that it caused. Grab the
GWP-ASan regions from the globals shared by the linker for crash-handler
purpopses, pull the information from GWP-ASan, and display it.
This adds two regions:
1. Causality tracking by GWP-ASan. We now print a cause header about
the crash, like `Cause: [GWP-ASan]: Use After Free on a 1-byte
allocation at 0x7365bb3ff8`
2. Allocation and deallocation stack traces.
Bug: 135634846
Test: atest debuggerd_test
Change-Id: Id28d5400c9a9a053fcde83a4788f971e677d4643
This takes a lot of space, isn't convincingly useful, and makes it
likely that the far more valuable stuff that comes after it gets
truncated. So let's just drop it.
Bug: http://b/139860930
Test: manual crasher, presubmit
Change-Id: Ie417ffc07e3cb17e95fdb3d183f8c87de0f34b89
C++20 wants members to be ordered unlike C99.
Bug: 139945549
Test: mm
Change-Id: I3cbca589511c1e0bbc10c691949e18de77e16031
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
This is for Android Telemetry to be able to categorise the processes
that produce tombstones.
Test: atest debugerd_test:TombstoneTest
Change-Id: Ie635347c9839eb58bfd27739050bd68cbdbf98da
Somehow the code was still including this include from libbacktrace.
I think the libbacktrace include directory was coming from some
transitive includes. I verified that nothing in debuggerd is using
the libbacktace.so shared library.
Bug: 120606663
Test: Builds, unit tests pass.
Change-Id: I85c2837c5a539ccefc5a7140949988058d21697a
Small modifications to the dump_stack method and added unit tests to
verify the output.
Bug: 120606663
Test: Unit tests pass, debuggerd run on processes on target.
Change-Id: Id385a915b751abda3dd6baebed6c3ce498c3bf6e
This commit only prints the raw value of the owner tag, pretty-printing
will come in a follow-up commit.
Test: debuggerd `pidof adbd`
Test: static_crasher fdsan_file + manual inspection of tombstone
Change-Id: Idb7375a12e410d5b51e6fcb6885d4beb20bccd0e
In order to support the offline unwinding properly, get rid of the
usage of non-fixed type uintptr_t from all API calls.
In addition, completely remove the old local and remote unwinding code
that used libunwind.
The next step will be to move the offline unwinding to the new unwinder.
Bug: 65682279
Test: Ran unit tests for libbacktrace/debuggerd.
Test: Ran debuggerd -b on a few arm and arm64 processes.
Test: Ran crasher and crasher64 and verified tombstones look correct.
Change-Id: Ib0c6cee3ad6785a102b74908a3d8e5e93e5c6b33
Reduce the amount of time that a process remains paused by pausing its
threads, fetching their registers, and then performing unwinding on a
copy of its address space. This also works around a kernel change
that's in 4.9 that prevents ptrace from reading memory of processes
that we don't have immediate permissions to ptrace (even if we
previously ptraced them).
Bug: http://b/62112103
Bug: http://b/63989615
Test: treehugger
Change-Id: I7b9cc5dd8f54a354bc61f1bda0d2b7a8a55733c4
Nobody is looking at the mismatches, and it can cause problems
with tombstone parsers.
Also, fix the dump_header_info test and remove unused properties_fake.cpp.
Test: Ran unit tests, verified tombstones still work.
Change-Id: I4261646016b4e84b26a5aee72f3227f1ce48ec9a
Update the tests to match new output (and stop pluralizing '1 entries').
Test: `debuggerd_test{32,64} --gtest_filter="TombstoneTest.*" on hikey960
Change-Id: I16b0335715303252fad3a35d6a053a50fefdac30
Move libdebuggerd headers into their own directory for namespacing,
move some includes to the top of their implementing files, delete some
dead code.
Test: mma, treehugger
Change-Id: Ie4c44e32e2ab3bc678092899d257fd4ed634aa34
- Change the field name load_base to load_bias (which is what it really is).
- Add a rel_pc field so that callers do not need to compute it themselves.
- Remove the BacktraceMap::GetRelativePc() since nobody should need to
compute this themselves.
Bug: 23762183
Test: Compiles and unit tests pass (debuggerd, libbacktrace).
Change-Id: I2cb579767120adf08c407a58f3c487ee3f2b45fc
There was at least one failure due to si_code being unitialized
and then examined.
Test: Run the 32 bit and 64 bit version of the unit tests on angler.
Change-Id: I5455a2cd29afafcd26a49f696e61141bb48478dc
Remove debuggerd in favor of a helper process that gets execed by
crashing processes.
Bug: http://b/30705528
Test: debuggerd_test
Change-Id: I9906c69473989cbf7fe5ea6cccf9a9c563d75906