110 commits

Author SHA1 Message Date
Mattias Simonsson
38ab045342 debuggerd_test: Scale timeouts by HwTimeoutMultiplier
Timeouts in tombstoned.cpp and intercept_manager.cpp are scaled
by HwTimeoutMultiplier, but the timeouts in debuggerd_test.cpp
are not, which means the CrasherTest#intercept_timeout test will
fail for any platform that has a high enough HwTimeoutMultiplier.

Bug: 309532789
Test: debuggerd_test.CrasherTest#intercept_timeout
Change-Id: I83cd01e87644c011efa155a32fd5d92cc8a43a95
2023-11-08 14:56:48 -08:00
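The scaling described in 38ab045342 can be sketched as follows. This is a minimal illustration, not the code from the change: `ScaleTimeout` is a hypothetical helper, and the multiplier is passed in as a plain parameter so the snippet builds outside the Android tree (on a device the value would come from the platform's HwTimeoutMultiplier setting).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: scales a base timeout by a hardware timeout multiplier.
// The multiplier is a parameter so the sketch has no Android dependencies.
std::chrono::milliseconds ScaleTimeout(std::chrono::milliseconds base, int multiplier) {
  assert(multiplier >= 1);  // a multiplier below 1 would shorten the timeout
  return base * multiplier;
}
```

A test that waits on an intercept would then compare against `ScaleTimeout(base, multiplier)` instead of the unscaled `base`, so that it agrees with the scaled timeouts on the tombstoned side.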
Christopher Ferris
b92b52c071 Add ability to handle multiple intercepts per pid.
While doing this, refactor the intercept code to be easier to understand.

The primary use case for this is to perform a parallel stack dump (both Java and native) for specific ANRs.

Add tests for all of the different intercept conditions.

Modify the tests to display the error message from the intercept
response if there is an error.

Bug: 254634348
Test: All unit tests pass.
Test: Ran debuggerd on native and java processes.
Test: Created a bugreport without error.
Change-Id: Ic531ccee05b9a470748b815cf109e0076150a0b6
2023-10-19 15:13:59 +00:00
Christopher Ferris
3a0833c9cd Fix potential miscellaneous debuggerd issues.
Check for the log opening failing.

Add the ability to put error messages in the log and tombstone so
that it's clear if the log reading failed in some way.

Adjust test so that if there is a log or if no log exists, the test
will still pass.

Print an <unknown> if the command line is unreadable instead of nothing.

Test: Ran unit tests.
Test: Induced error and verified error message is saved in tombstone.
Change-Id: I2fce8078573b40b9fed3cd453235f3824cadb5e3
2023-08-09 17:31:55 -07:00
Kelvin Zhang
786dac3d50 Update some fs_mgr/debuggerd to use getpagesize() instead of PAGE_SIZE
Test: th
Bug: 279808236
Change-Id: I9d30cfe19d2b1a7d624cc5425e4315dc6e3b2ad2
2023-06-27 10:50:07 -07:00
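As an illustration of the kind of change 786dac3d50 describes, the sketch below rounds a size up to the page size queried at run time rather than a compile-time PAGE_SIZE constant. `RoundUpToPage` is a hypothetical name chosen for this example and does not come from the commit.

```cpp
#include <unistd.h>

#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds `size` up to a multiple of the runtime page size.
size_t RoundUpToPage(size_t size) {
  const size_t page = static_cast<size_t>(getpagesize());
  assert(page != 0 && (page & (page - 1)) == 0);  // page sizes are powers of two
  return (size + page - 1) & ~(page - 1);
}
```

Querying the page size at run time keeps the same binary correct on kernels configured with different page sizes.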
Christopher Ferris
98d6242dc7 Limit the number of log messages in a tombstone.
Some testing environments can have a test that is sending many
thousands of messages to the log. When this type of process crashes
all of these log messages are captured and can cause OOM errors
while creating the tombstone.

Added a test to verify the log messages are truncated. Leaving this
test disabled for now since it is inherently flaky due to having to
assume that 500 messages are in the log.

Added a test for a newline in a log message since it's somewhat
related to this change.

NOTE: The total number of messages is capped at 500, but if a message
contains multiple newlines, the total messages will exceed 500.
Counting messages this way seems to be in the spirit of the cap,
that a process logging a large message with multiple newlines does
not completely fill the tombstone log data.

Bug: 269182937
Bug: 282661754

Test: All unit tests pass.
Test: The disabled max_log_messages test passes.
Change-Id: If18e62b29f899c2c4670101b402e37762bffbec6
2023-05-24 20:10:55 +00:00
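The NOTE in 98d6242dc7 can be made concrete with a small sketch: the cap applies to messages, and each kept message is then split on newlines, so the number of output lines may exceed the cap. `CapLogMessages` is a hypothetical helper, and keeping the most recent messages is an assumption of this sketch; the commit message does not say which messages are kept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: keeps at most `max_messages` messages (assumed here to be
// the most recent ones), then emits one line per newline-separated piece.
std::vector<std::string> CapLogMessages(const std::vector<std::string>& messages,
                                        size_t max_messages) {
  assert(max_messages > 0);
  const size_t first = messages.size() > max_messages ? messages.size() - max_messages : 0;
  std::vector<std::string> lines;
  for (size_t i = first; i < messages.size(); ++i) {
    const std::string& msg = messages[i];
    size_t start = 0;
    while (true) {
      const size_t nl = msg.find('\n', start);
      if (nl == std::string::npos) {
        lines.push_back(msg.substr(start));
        break;
      }
      lines.push_back(msg.substr(start, nl - start));
      start = nl + 1;
    }
  }
  return lines;
}
```

With a cap of 2, the messages `"a\nb"` and `"c"` produce three lines, which mirrors the behavior the NOTE describes for the cap of 500.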
Christopher Ferris
bda1064160 Re-add code to skip getting logs on logd crashes.
Also add new unit tests to verify this behavior.

Bug: 276934420

Test: New unit tests pass.
Test: Ran new unit tests without pthread_setname_np call and verified
Test: the tests fail.
Test: Force crash logd and verify log messages are not gathered.
Test: Force crash a logd thread and verify log messages are not gathered.
Change-Id: If8effef68f629432923cdc89e57d28ef5b8b4ce2
2023-04-24 18:31:29 -07:00
Elliott Hughes
857e29c356 riscv64: fix debuggerd_test build.
This adds the missing assembler for riscv64, even though I don't have a
working tombstoned yet to test it with. There's a distinct possibility
we'll be back to fix the test (because although "register 1" is harmless
for the other architectures, it's the ra register on riscv64; the default
link register), but at least this lets us build the test.

I've also simplified all the assembly to be the simplest sequence I
know that writes 0 to address 0 (because if there was a reason to use
so many instructions before, I want to know what it is so I can write
the missing comment!).

Test: treehugger
Change-Id: I10d117eaedf361d9759a450e0973d07c4f97090e
2023-03-20 17:48:53 -07:00
Treehugger Robot
b3bf57dbe9 Merge "Accept SEGV_MTESERR in CrasherTest.mte_async test." 2023-03-09 00:34:39 +00:00
Peter Collingbourne
91e816aa0e Accept SEGV_MTESERR in CrasherTest.mte_async test.
This is possible when upgrading to sync mode.

Change-Id: I71c213cb9ddda87765a0dc2ff5455f0eb7b484fe
2023-03-07 21:27:09 -08:00
Mitch Phillips
70aa219034 [gwp-asan] fix tests under clang coverage, and extend invariants
1. Fixes this test under clang coverage, which is run under presubmit
   for TEST_MAPPING files. When we spawn under a minijail, and the
   process exited normally (which is the case for recoverable), clang
   coverage would use atexit handlers to dump some stuff using banned
   prctl's and other syscalls. Instead of allow-listing them all which
   sounds like a huge pain, call _exit() which skips those handlers.

2. Extends the invariant testing to make sure that recoverable GWP-ASan
   recovers both the first time, and a second time in a different slot.

Bug: N/A
Test: CLANG_COVERAGE=true NATIVE_COVERAGE_PATHS="*" atest debuggerd_test
Change-Id: I6059e21db4c2898b1c9777a00d2a54497d80ef79
2023-02-22 12:27:37 -08:00
Mitch Phillips
6e0eb996b3 Merge "Add recoverable GWP-ASan." 2023-02-03 18:35:08 +00:00
Mitch Phillips
18ce54241c Add recoverable GWP-ASan.
Recoverable GWP-ASan is a mode landed upstream in
https://reviews.llvm.org/D140173. For more information about why/what it
is, see
https://android-review.git.corp.google.com/c/platform/bionic/+/2394588.

This patch makes debuggerd call the required libc callbacks for GWP-ASan
to recover from the memory corruption. It also adds the functionality
that libart/sigchain eventually ends up calling, which dumps a GWP-ASan
report for the first error encountered.

Test: Build the platform, run sanitizer-status in recoverable mode,
asserting that it doesn't crash but we get a debuggerd report.
Bug: 247012630

Change-Id: I27212f7250844c20a8fd1e961417cdb4e5bd3626
2023-02-01 15:25:29 -08:00
Christopher Ferris
22035ccb01 Display offset in backtraces if necessary.
When moving to a proto tombstone, backtraces no longer contain
an offset when a frame is in a shared library from an apk.
Add the offset display again if needed, and add a test to
verify this behavior.

Bug: 267341682

Test: All unit tests pass.
Test: Dumped a process running through an apk to verify the offset
Test: is present.
Change-Id: Ib720ccb5bfcc8531d1e407f3d01817e8a0b9128c
2023-01-31 17:53:45 -08:00
haocheng.zy@linux.alibaba.com
3f4d036cb6 Add riscv support for heap_addr_in_register
Change-Id: I42a93a96c8c9c7a32d32674535ff466380e3c2fa
Signed-off-by: haocheng.zy <haocheng.zy@linux.alibaba.com>
2022-10-29 14:57:23 +00:00
Evgenii Stepanov
361455eb37 Harden CrasherTest::Trap under sanitizers.
The use of __builtin_abort in CrasherTest::Trap breaks with
-ftrap-function=abort, because then the argument of Trap is no longer in
the first argument register at the time of crash.

This flag is added when *any* sanitizer is enabled on the target, even harmless
ones like memtag-heap. See sanitize.go:769.

Fix CrasherTest::Trap to be a little more reliable.

Test: debuggerd_test with SANITIZE_TARGET=memtag_heap
Change-Id: I150f1c0355bd6f2bfabfa5a7bba125acdde1120e
2022-10-13 16:40:05 -07:00
Elliott Hughes
b795d6fa4b Fix the build with a newer LLVM.
Unify all our "noinline" variants to the current most common one, not
least because the new [[noinline]] syntax is fussier about where it goes.

Test: treehugger
Change-Id: Icfcb75c9d687f0f05c19f66ee778fd8962519436
2022-09-14 20:16:25 +00:00
Christopher Ferris
7c2e7e31f6 Fix fallback paths for dumping threads.
In the fallback path, if the non-main thread is the target
to be dumped, then no other threads are dumped when creating
a tombstone. Fix this and add unit tests to verify that
all threads, including the main thread, are dumped.

Bug: 234058038

Test: All unit tests pass.
Test: debuggerd -b media.swcodec process
Test: debuggerd media.swcodec process
Change-Id: Ibb75264f7b3847acdbab939a66902d986c0d0e5c
2022-05-27 13:05:56 -07:00
Christopher Ferris
303c6bef77 Fix check for thread unwind.
If a process requires executing the fallback unwinder and the thread
crashing is not the main thread, the wrong unwinder is used.
Fix this case, and add a new unit test that causes an abort in
a non-main thread.

Bug: 233721755

Test: New unit test passes with fix and fails without.
Test: Ran debuggerd on swcodec process and it still dumps all threads.
Change-Id: I70fffc5d680256ce867e7a1d427593b584259160
Merged-In: I70fffc5d680256ce867e7a1d427593b584259160
(cherry picked from commit 2d5d46ca85)
2022-05-25 13:07:07 -07:00
Mitch Phillips
5411905232 Merge "[GWP-ASan] Enable debuggerd to pull more allocation metadata." 2022-04-21 18:12:43 +00:00
Florian Mayer
b4979293b3 Skip debuggerd tests that do not apply to HWASan.
Change-Id: Ieab61dc61e11c3e55f116a45c37ceb805a6212e0
2022-04-15 15:41:59 -07:00
Mitch Phillips
1e0969997a [GWP-ASan] Enable debuggerd to pull more allocation metadata.
With the addition of runtime-configurable GWP-ASan, there might be many,
many more than 1,000 allocations. Have support for them, but keep a
hopefully-won't-crash-the-device limit.

Bug: 219651032
Test: atest bionic-unit-tests

Change-Id: I7b8e2bf5ab7c723ab6c61365f0dc610e400dbbce
2022-04-14 11:30:05 -07:00
Christopher Ferris
c95047dd20 Update for accurate unreadable elf files.
The functionality moved from the Unwinder object to the MapInfo
object and means that the individual unreadable files can be
displayed now.

Also added the unreadable elfs per thread to the protobuf.

Updated the unwinder test.

Test: All unit tests pass.
Change-Id: I7140bde16938736da005f926e10bbdb3dbc0f6f5
2022-03-15 09:50:48 -07:00
Christopher Ferris
b999b82eb7 Dump threads in tombstone fallback path.
When dumping a tombstone using the fallback path, only the main
thread was showing up. Modify the code to dump the threads using
a slightly different path for the tombstone generation code.

In addition, while looking at this code, two MTE variables were
not set in the tombstone fallback code. Added those variables
so MTE devices will work properly in this fallback path.

Modified the tombstone unit tests for seccomp to have
multiple threads and verify those threads show up in the tombstone.

Bug: 208933016

Test: Ran unit tests.
Test: Ran debuggerd <PID> on a privileged process and verified
Test: all threads dumped. Also verified that the tagged_addr_ctrl
Test: variable is present on the raven device.
Change-Id: I16eadb0cc2c37a7dbc5cac16af9b5051008b5127
2022-02-16 15:02:38 -08:00
Christopher Ferris
16a7bc2355 Fix typo.
Change use of new_ to old_ to save the old sigaction data. This hasn't
caused any issues, but it's obviously wrong.

Test: Ran unit tests on coral.
Change-Id: I96be5b0980c323c3aeafb422fbc06202577604a2
2022-01-31 13:08:54 -08:00
Elliott Hughes
d13ea523e1 debuggerd: add the PAC keys to the tombstones.
Hard to get otherwise if you're trying to debug PAC issues.

Bug: http://b/214314197
Test: treehugger
Change-Id: I2e5502809f84579bf287364e59d6e7ff67770919
2022-01-13 15:03:19 -08:00
Christopher Ferris
bdea3bb56b Remove non-protobuf path.
It is expensive to keep the non-protobuf path around and it hasn't
been used for an entire release without anyone noticing, so remove it.

Create new end-to-end unit tests that cover tests of the non-proto
code paths that are being deleted.

Bug: 197981919

Test: Unit tests pass.
Change-Id: Ia1c45572300bd63e5f196ad61e5e5386830c8ece
2021-11-19 02:07:30 +00:00
Treehugger Robot
a44f269eba Merge "Improvements to tombstone output." 2021-11-12 00:17:12 +00:00
Peter Collingbourne
773acaa18e Improvements to tombstone output.
- Use "likelihood" instead of "probability" since that has connotations
  of being less precise, and our probability ordering isn't very precise
  anyway.

- Hide the fault address with SEGV_MTEAERR because it is not available.

- Pad the fault address with leading zeroes to make it clearer which
  bits of the top byte (and any following bytes such as PAC signature
  bits) are set.

Bug: 206015287
Change-Id: I5e1e99b7f3e967c44781d8550bbd7158eb421b64
2021-11-11 15:05:47 -08:00
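The zero-padding described in 773acaa18e can be sketched as below. This assumes a 64-bit address printed as 16 hex digits; `FormatFaultAddress` is a hypothetical helper, not the function used in the tombstone code.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: prints a 64-bit address with leading zeroes so that all
// 16 hex digits, including the top byte, are always visible.
std::string FormatFaultAddress(uint64_t addr) {
  char buf[2 + 16 + 1];  // "0x" + 16 hex digits + terminating NUL
  const int n = snprintf(buf, sizeof(buf), "0x%016" PRIx64, addr);
  assert(n == 18);
  (void)n;  // silence unused-variable warnings when asserts are compiled out
  return buf;
}
```

With padding, an address such as 0x60000704414d340 is shown as 0x060000704414d340, which makes it clear that the top byte is 0x06 rather than 0x60.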
Peter Collingbourne
57e19ac46e Merge "Add a human readable description of the tagged_addr_ctrl value to tombstones." 2021-11-10 18:56:59 +00:00
Peter Collingbourne
47d784e9f2 Add a human readable description of the tagged_addr_ctrl value to tombstones.
Change-Id: Ib9860b282cf749891e0f6ef7697669b94235c236
2021-11-05 18:59:26 -07:00
Christopher Ferris
2038cc7633 Add a test to verify the dex_pc is correct.
The libunwindstack code will attempt to dlopen the libdexfile.so
when a dex pc is found. Unfortunately, this failed since that
library was not properly listed as a runtime library. To make
sure this doesn't happen again, add an end to end test that
will create a dex pc frame, and will verify the correct
dex function name is in that frame.

Bug: 199043576

Test: Unit test passes on arm/aarch64/x86/x86_64.
Test: Removed the runtime_libs of libdexfile from libunwindstack
Test: and verified the new test fails.
Change-Id: I3a11f9ee44e06e37a547d193b04f7fbb90ccfe0a
2021-09-15 22:14:28 +00:00
Christopher Ferris
7e4c2a8ccc Add fault address marker in proto to tombstone.
When the switch was made to dump the tombstone from the protobuf,
the fault address marker in the maps section went missing. Re-add
that logic and add new unit tests to verify all of the different
behaviors.

Bug: 193935960

Test: All unit tests pass.
Test: All unit tests pass when setprop debug.debuggerd.translate_proto_to_text 0
Test: The above on cuttlefish, 32 bit and 64 bit.
Test: The above on a flame, 32 bit and 64 bit.
Change-Id: I098bb6ab4bacacae2ca0fc5ec9a73549ed0b9489
2021-08-23 16:25:13 -07:00
Christopher Ferris
e8891458e5 Remove trailing newlines from abort message.
The tombstone will add a newline after the abort message, so remove
any trailing newlines before saving/printing.

Bug: 196414062

Test: Unit tests pass.
Test: Set system property debug.debuggerd.translate_proto_to_text to 0
Test: and unit tests still pass.
Change-Id: I0d3dc215eb5d8be93d99e5b9d4f0a14b1d61396d
2021-08-18 14:13:02 -07:00
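A minimal sketch of the trimming described in e8891458e5, assuming the abort message is held in a `std::string`; `StripTrailingNewlines` is a hypothetical name for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: removes every trailing '\n' and leaves interior newlines alone.
std::string StripTrailingNewlines(std::string msg) {
  const size_t end = msg.find_last_not_of('\n');
  // If the message is empty or consists only of newlines, erase everything.
  msg.erase(end == std::string::npos ? 0 : end + 1);
  assert(msg.empty() || msg.back() != '\n');
  return msg;
}
```

Since the tombstone adds its own newline after the abort message, trimming first avoids blank lines after the message.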
Christopher Ferris
a3e9a0b2e1 Always use main thread pid for manual dumping.
When running debuggerd from the command line, it's possible that
the signal will happen on a side thread. The original intercept
in tombstoned is set to only handle crashes from the main thread
pid, so in this case, the intercept doesn't occur. To fix this,
modify the code so that running debuggerd always sends the signal
to the main pid. In addition, modify the signal handler so that if it is
entered due to the BIONIC_SIGNAL_DEBUGGER signal, the crashing tid is
set to the main thread pid instead of the current thread.

Add unit test to cover this case.

Bug: 194346289

Test: All unit tests pass.
Test: Verify the new unit test is getting the signal on the non-main
Test: thread and still properly handling the intercept.
Test: Modify the debuggerd code to send the signal to the non main pid
Test: and verify the dump still occurs correctly.

Change-Id: I2dd1bd11fc8ef4a6fe87f05ecc67ae349a101c82
2021-07-30 14:08:03 -07:00
Mitch Phillips
5ddcea2924 [MTE] Add a HWASan-style tag dump to tombstones.
We already dump the tags in the register dump section by appending the
tag to the memory address. You only get 2 granules before each register
and 13 after.

The HWASan-style tag dump is extremely useful for debugging, as it gives
a pretty comprehensive overview of the memory subsystem. It also
provides enough context bytes (256) to give you a reasonable intuition
about a particular bug.

The tag dump shows up only if PTRACE_PEEKTAGS returns at least one value
in the 256 requested. If the start or end of the region is untagged,
it's omitted. The tag dump looks like this:

Change-Id: Icc33fb97542d9b1fa3ae9e58aba34d524c6ba7b5

---
Memory tags around the fault address (0x60000704414d340), one tag per 16 bytes:
      0x704414d000: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d100: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d200: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    =>0x704414d300: 0  0  0  0 [2] 2  0  0  0  0  0  0  0  0  0  0
      0x704414d400: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d500: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d600: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d700: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d800: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414d900: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x704414da00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
---

Bug: 183992164
Test: atest debuggerd_test on MTE+QEMU and sunfish.

Change-Id: I8d5842e4803ca30b407e866c99eef56f2cb36600
2021-06-28 15:53:10 -07:00
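The layout of the tag dump in 5ddcea2924 can be worked through with a short sketch. Taking the numbers from the sample output (one tag per 16 bytes, 16 tags per row), it computes which row and column hold the tag for a fault address. The names below are hypothetical, and stripping the whole top byte to untag the address is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kGranuleSize = 16;  // one tag covers 16 bytes
constexpr uint64_t kTagsPerRow = 16;   // each row shows 16 tags, i.e. 256 bytes

struct TagDumpPosition {
  uint64_t row_address;  // untagged address printed at the start of the row
  unsigned column;       // index of the tag within that row
};

// Strips the top byte of a 64-bit address, where the pointer tag is stored.
uint64_t Untag(uint64_t addr) { return addr & ((1ULL << 56) - 1); }

// Hypothetical helper: finds the row and column of the tag for `fault_addr`.
TagDumpPosition LocateInTagDump(uint64_t fault_addr) {
  const uint64_t row_bytes = kGranuleSize * kTagsPerRow;
  const uint64_t untagged = Untag(fault_addr);
  TagDumpPosition pos;
  pos.row_address = untagged & ~(row_bytes - 1);
  pos.column = static_cast<unsigned>((untagged % row_bytes) / kGranuleSize);
  assert(pos.column < kTagsPerRow);
  return pos;
}
```

For the fault address 0x60000704414d340 in the sample, this gives row 0x704414d300 and column 4, which is where the bracketed `[2]` appears in the dump.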
Mitch Phillips
78f0670dda [MTE] Print cause and alloc/dealloc traces to logcat.
This information clearly meets the bar for being dumped to logcat. If we
omit the info, we may confuse the user into thinking that it's not
available at all, especially if it's their first time seeing an MTE
report.

This also adds some functionality to the integration testing library to
pull logcat messages and scan them to make sure the contents are in both
places.

Bug: 187881237
Test: atest debuggerd_test # on QEMU w/ MTE.
Change-Id: Icc17ea45bda7628331cc4812eaad3bc5c949b7a7
2021-06-01 18:12:22 -07:00
Peter Collingbourne
93406da189 Merge "Run MTE tests on zero-sized allocations." 2021-05-14 03:58:13 +00:00
Peter Collingbourne
aa544796ae Run MTE tests on zero-sized allocations.
Bug: 187879470
Change-Id: I9547f3032af9d1a921f8597a53389d25af33b369
2021-05-13 14:08:41 -07:00
Peter Collingbourne
fc7852b741 Merge "Test that out-of-bounds UAF is not detected with MTE." 2021-05-13 02:53:23 +00:00
Peter Collingbourne
dc47634ec4 Test that out-of-bounds UAF is not detected with MTE.
This type of error is unlikely and attempting to detect it with MTE
is likely to produce false positive reports. Make sure that this type
of error is not detected by the allocator.

Change-Id: I90676d1a031411d6b725890311317802bc24b459
2021-05-12 15:56:43 -07:00
Christopher Ferris
fe751c5a61 Re-add backtrace note about unreadable elf.
When moving to the proto-ized tombstones, the note about unreadable
elf files in a backtrace got lost. This re-adds it and adds a test
to verify that the note properly shows up.

Bug: 185428454

Test: Ran unit tests.
Change-Id: I1150cc737772e1b79fd73ec5c782caadc4629421
2021-05-03 15:21:11 -07:00
Christopher Ferris
6702256e0c Allow another prctl call.
A change was made so that pthread_create is calling
prctl(PR_PAC_RESET_KEYS, ...) on aarch64. It's possible that other
seccomp policies might need to change to allow this.

Test: CrasherTest.seccomp_backtrace passes on aarch64.
Change-Id: I9c4d1b3dca5f19a6285bf904bb942f1f52e42bd0
2021-04-16 13:35:16 -07:00
Peter Collingbourne
f4a40c0edd Merge "Support MTE and GWP-ASan features in proto tombstones." 2021-03-19 23:42:23 +00:00
Peter Collingbourne
d0f5eb5716 Merge "[GWP-ASan] Add debuggerd end-to-end tests and remove unique wording." 2021-03-19 23:42:23 +00:00
Elliott Hughes
e4781d54a5 debuggerd: prepare to abandon ship^Wgdb.
Talk of "gdb" when we currently mean "gdb or lldb" and will soon mean
"lldb" is starting to confuse people. Let's use the more neutral
"debugger" in places where it really doesn't matter.

The switch from gdbclient.py to lldbclient.py is a change for another
day...

Test: treehugger
Change-Id: If39ca7e1cdf4c8bb9475f1791cdaf201fbea50e0
2021-03-17 10:03:25 -07:00
Peter Collingbourne
1a1f7d79a4 Support MTE and GWP-ASan features in proto tombstones.
Proto tombstones were missing tagged fault addresses, tagged_addr_ctrl,
tags in memory dumps and Scudo and GWP-ASan error reports. Since text
tombstones now go via protos, all of these features broke when we
switched to text tombstones generated from protos by default. Fix
the features by adding support for them to the proto format,
tombstone_proto and tombstone_proto_to_text.

Bug: 135772972
Bug: 182489365
Change-Id: I3ca854546c38755b1f6410a1f6198a44d25ed1c5
2021-03-16 10:59:39 -07:00
Mitch Phillips
7168a217b9 [GWP-ASan] Add debuggerd end-to-end tests and remove unique wording.
Looks like we unintentionally had a breakage after aosp/1595302, where
both GWP-ASan and MTE tests started failing because the extra
information wasn't plumbed through the tombstones. MTE has end-to-end
tests, but they aren't run continuously, and GWP-ASan was missing the e2e
tests.

Also remove some unique wording for GWP-ASan, a UaF on the free'd
pointer is now "0 bytes into a 16-byte allocation" instead of "on a
16-byte allocation". The former is more descriptive and is more
ubiquitously used in our tooling.

This patch adds the E2E tests, but the underlying problem needs to be
fixed as well, before this patch can land.

Bug: 182489365
Test: atest debuggerd_test
Change-Id: I0fe8aba7ea443b3071724987f46b19a6525cda3c
2021-03-11 15:56:35 -08:00
Peter Collingbourne
90947d442c Merge "Teach debuggerd to pass the secondary ring buffer to __scudo_get_error_info()." 2021-03-11 01:15:49 +00:00
Treehugger Robot
3f24fefe29 Merge "Untag addresses in registers before looking up the mapping." 2021-03-06 02:41:05 +00:00
Christopher Ferris
35da288199 Don't hard-code number of tombstones.
On cuttlefish, the number of tombstones allowed is much larger
than 50, so change the algorithm to search for any tombstone
file.

Test: Ran unit tests on cuttlefish with > 50 tombstones.
Test: Ran unit tests on device.
Change-Id: Ia1d885fe19a7f7751fe3386d40b48750d1e21bd5
2021-02-18 15:29:13 -08:00
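The approach in 35da288199, searching for any tombstone file instead of probing a fixed range of 50 indices, might look like the sketch below. It filters a list of directory entry names by prefix; the `tombstone_` prefix and the helper name `FilterTombstoneNames` are assumptions made for this example, since the commit message does not spell out the naming scheme.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed naming convention for this sketch only.
const std::string kTombstonePrefix = "tombstone_";

// Hypothetical helper: returns every entry that looks like a tombstone file,
// regardless of how many tombstones the device is configured to keep.
std::vector<std::string> FilterTombstoneNames(const std::vector<std::string>& dir_entries) {
  std::vector<std::string> found;
  for (const std::string& name : dir_entries) {
    const bool has_prefix =
        name.compare(0, kTombstonePrefix.size(), kTombstonePrefix) == 0;
    // Require at least one character after the prefix (the tombstone index).
    if (has_prefix && name.size() > kTombstonePrefix.size()) found.push_back(name);
  }
  assert(found.size() <= dir_entries.size());
  return found;
}
```

Because nothing here depends on an index range, an entry such as `tombstone_73` is found just as readily as `tombstone_00`.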