73 commits

Author SHA1 Message Date
Elliott Hughes
a029d98ad0 crash_dump: avoid misleading error messages.
I'm guessing that the original

  F crash_dump64: crash_dump.cpp:460] failed to attach to thread 1671, already traced by 0 ()

was probably a race, where there _was_ a tracer but they disappeared?
Whatever, it doesn't seem helpful to show "already traced by nobody",
and we also don't want to clobber errno in the fallthrough case
(previously just where get_tracer() failed, but now also where
get_tracer() returns "nobody").

Bug: http://b/188668580
Test: treehugger
Change-Id: I3fa3b4f7e32531d48dfbb0ef946ff351ed5d9171
2021-06-21 12:39:40 -07:00
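The message change described in a029d98ad0 can be illustrated with a minimal sketch. `DescribeAttachFailure` and its parameters are hypothetical names chosen for this example, not the actual crash_dump code; the sketch assumes a tracer pid of 0 (or a negative value) means no tracer is known.

```cpp
#include <string>
#include <sys/types.h>

// Hypothetical helper (not from crash_dump.cpp): builds the attach-failure
// text. A tracer pid <= 0 means "no known tracer", so the
// "already traced by" clause is left out instead of naming nobody.
std::string DescribeAttachFailure(pid_t tid, pid_t tracer_pid,
                                  const std::string& tracer_name) {
  std::string msg = "failed to attach to thread " + std::to_string(tid);
  if (tracer_pid > 0) {
    msg += ", already traced by " + std::to_string(tracer_pid) + " (" +
           tracer_name + ")";
  }
  return msg;
}
```

With a tracer pid of 0 the result is just "failed to attach to thread 1671", which avoids the "already traced by 0 ()" output quoted in the commit message.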
Josh Gao
31348a74e0 debuggerd: store commandline instead of process name.
Bug: http://b/180605583
Test: debuggerd_test
Change-Id: I018d399a5460f357766dc1b429f645f78fe88565
2021-03-30 12:15:56 -07:00
Treehugger Robot
15ab143bea Merge "debuggerd: prepare to abandon ship^Wgdb."
2021-03-18 13:35:50 +00:00
Elliott Hughes
e4781d54a5 debuggerd: prepare to abandon ship^Wgdb.
Talk of "gdb" when we currently mean "gdb or lldb" and will soon mean
"lldb" is starting to confuse people. Let's use the more neutral
"debugger" in places where it really doesn't matter.

The switch from gdbclient.py to lldbclient.py is a change for another
day...

Test: treehugger
Change-Id: If39ca7e1cdf4c8bb9475f1791cdaf201fbea50e0
2021-03-17 10:03:25 -07:00
Peter Collingbourne
fb5eac9445 Add support for a hw_timeout_multiplier system property.
In order to test the platform in emulators that are orders of magnitude
slower than real hardware we need to be able to avoid hitting timeouts
that prevent it from coming up properly. For this purpose introduce
a system property, ro.hw_timeout_multiplier, which may be set to
an integer value that acts as a multiplier for various timeouts on
the system.

Bug: 178231152
Change-Id: I6d7710beed0c4c5b1720e74e7abe3a586778c678
Merged-In: I6d7710beed0c4c5b1720e74e7abe3a586778c678
2021-03-11 14:04:18 -08:00
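A rough sketch of how an integer multiplier such as the one described in fb5eac9445 might be applied to a timeout. Reading the property itself is left out; `ParseTimeoutMultiplier` and `ScaleTimeout` are hypothetical helpers, and the fallback to 1 for missing or malformed values is an assumption of this example rather than documented platform behavior.

```cpp
#include <chrono>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses the property's string value. Missing,
// malformed, or out-of-range values fall back to 1 (an assumption made
// for this sketch).
int ParseTimeoutMultiplier(const std::string& value) {
  char* end = nullptr;
  long parsed = std::strtol(value.c_str(), &end, 10);
  if (end == value.c_str() || *end != '\0' || parsed < 1 || parsed > 1000000) {
    return 1;
  }
  return static_cast<int>(parsed);
}

// Scales a base timeout by the multiplier.
std::chrono::milliseconds ScaleTimeout(std::chrono::milliseconds base,
                                       int multiplier) {
  return base * multiplier;
}
```

A caller would read the property value as a string, parse it once, and pass the result to `ScaleTimeout` wherever a fixed timeout was used before.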
Peter Collingbourne
bb4b49c63c Teach debuggerd to pass the secondary ring buffer to __scudo_get_error_info().
With this change we can report memory errors involving secondary
allocations. Update the existing crasher tests to also test
UAF/overflow/underflow on allocations with sizes sufficient to trigger
the secondary allocator.

Bug: 135772972
Change-Id: Ic8925c1f18621a8f272e26d5630e5d11d6d34d38
2021-02-12 12:30:52 -08:00
Evgenii Stepanov
2a55e1adbe Scale timeouts in debuggerd and llkd.
Respect ro.timeout_multiplier property. Some of these are required for
tombstone writing to work on MTE QEMU, the rest are done speculatively.

Test: add crashing code to system_server, observe the tombstone
Bug: 178231152
Change-Id: Ic86e494af571301df7af07d13a6c046a0da6bda7
2021-02-01 20:00:53 +00:00
Josh Gao
76e1e30f16 Reland protobuf tombstones.
This reverts the following commits:
    e156ede145.
    eda96eddcb.
    5ec54d1e84.
    1e45d3f223.
    a50f61f8fa.

Test: treehugger
Test: atest -c CtsSeccompHostTestCases:android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls
Change-Id: Ic2b1f489ac9f1fec7d7a33c845c29891f4306bbd
2021-01-26 17:55:17 -08:00
Jerome Gaillard
1e45d3f223 Revert "libdebuggerd: add protobuf implementation."
Revert "Let crash_dump read /proc/$PID."

Revert submission 1556807-tombstone_proto

Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug

Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.

Change-Id: Ia0a1ee57e7630e01c495dc166218f665340aad7f
2021-01-26 12:41:20 +00:00
Josh Gao
6bf6a9fc61 Merge changes from topic "tombstone_proto"
* changes:
  libdebuggerd: add protobuf implementation.
  tombstoned: support for protobuf fds.
  tombstoned: make it easier to add more types of outputs.
  tombstoned: switch from goto to RAII.
2021-01-25 22:18:48 +00:00
Mitch Phillips
e4adff0721 [MTE] Cleanup tagged si_addr refs to fix mappings OOB bug.
Currently, all MTE failures end up displaying 'Fault address falls at
0x<addr> after any mapped regions'. Clearly when scanning, we should use
the untagged address to figure out which ranges it's in.

I've taken the liberty of removing all si_addr parsing and moving it
into the common ProcessInfo, as well as making it really explicit
whether you want the (possibly tagged) original si_addr, or whether you
want the untagged variant (for scanning /proc/maps or whatever).

This is not particularly easily testable, as ReadCrashInfo isn't easily
injectable and `dump_all_maps` should already be passed the untagged
pointer to scan for. I've tested this locally on FVP under SYNC MTE with
a simple UaF binary and noted the problem is fixed. Given that this is
making the code more clear, I'm hoping the owners see no need for a
regression test :).

Bug: 135772972
Test: On FVP, run 'adb shell MEMTAG_OPTIONS=sync sanitizer-status' and
      check that the use-after-free test ends up with the /proc/maps
      description in the right place.
Change-Id: I220e4200c75a72474a95a67e5bbc36173a438dd2
2021-01-21 20:49:06 -08:00
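The idea in e4adff0721 of scanning mappings with the untagged address can be sketched as follows. This is an illustration only: `MapRange`, `UntagAddress`, and `FindMapping` are hypothetical names, and the sketch assumes the tag occupies the top 8 bits of a 64-bit address, as described for aarch64 in f03af8844a further down this log.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one mapped region, covering [start, end).
struct MapRange {
  uint64_t start;
  uint64_t end;
};

// Clears the top 8 bits, which this sketch assumes hold the tag.
constexpr uint64_t UntagAddress(uint64_t addr) {
  return addr & ((uint64_t{1} << 56) - 1);
}

// Returns the index of the mapping containing the untagged address,
// or -1 if the address falls outside every mapping.
int FindMapping(const std::vector<MapRange>& maps, uint64_t maybe_tagged_addr) {
  const uint64_t addr = UntagAddress(maybe_tagged_addr);
  for (size_t i = 0; i < maps.size(); ++i) {
    if (addr >= maps[i].start && addr < maps[i].end) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

Comparing the tagged value directly against the ranges would place it above every mapping, which matches the "after any mapped regions" symptom the commit describes.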
Josh Gao
92317d82c9 libdebuggerd: add protobuf implementation.
This commit implements protobuf output for tombstones, along with a
translator that should emit bytewise identical output to the existing
tombstone dumping code, except for ancillary data from GWP-ASan and
Scudo, which haven't been implemented yet.

Test: setprop debug.debuggerd.translate.translate_proto_to_text 1 &&
        /data/nativetest64/debuggerd_test/debuggerd_test
Test: for TOMBSTONE in /data/tombstones/tombstone_??; do
        pbtombstone $TOMBSTONE.pb | diff $TOMBSTONE -
      done
Change-Id: Ieeece6e6d1c26eb608b00ec24e2e725e161c8c92
2021-01-21 15:40:23 -08:00
Peter Collingbourne
cd27807bfe Remove ANDROID_EXPERIMENTAL_MTE.
Now that the feature guarded by this flag has landed in Linux 5.10
we no longer need the flag, so we can remove it.

Bug: 135772972
Change-Id: I02fa50848cbd0486c23c8a229bb8f1ab5dd5a56f
2021-01-11 10:55:51 -08:00
Peter Collingbourne
ebc78cc852 Switch to the new kernel API for obtaining fault address tag bits.
The discussion on LKML is converging on v16 of the fault address tag
bits patch [1]. In this version of the patch the presence of the tag
bits in si_addr is controlled by a sa_flags bit, and a protocol is
introduced to allow userspace to detect kernel support for sa_flags
bits. Update the tombstone signal handler to use this API to read
the tag bits, update the interceptors in libsigchain to implement
the flag support detection protocol and hide the tag bits in si_addr
from chained signal handlers that did not request them to match the
kernel behavior.

[1] https://lore.kernel.org/linux-arm-kernel/cover.1605235762.git.pcc@google.com/

Change-Id: I57f24c07c01ceb3e5b81cfc15edf559ef7dfc740
2020-11-13 16:08:27 -08:00
Christopher Ferris
b05c472421 Add arch member into Unwinder object.
This simplifies some of the logic and removes the need to pass an
Arch value to functions that should already know about the arch
it is operating on.

Includes fixes for debuggerd/libbacktrace.

Added new unit tests to cover new cases.

Test: All unit tests pass.
Test: Faked unwinder failing to verify debuggerd error messages display
Test: properly in backtrace and tombstone.
Change-Id: I439fcae0695befcfb1cb4c0a786cc74949d33425
2020-09-24 18:46:23 -07:00
Peter Collingbourne
864f15dd6d Dump the per-thread TAGGED_ADDR_CTRL value if available.
This value indicates whether memory tagging is enabled on a thread,
the mode (sync or async) and the set of excluded tags. This information
can sometimes be important for understanding an MTE related crash,
so include it in the per-thread tombstone output.

Bug: 135772972
Change-Id: I25a16e10ac7fbb2b1ab2a961a5279f787039000b
2020-09-15 21:32:36 -07:00
Peter Collingbourne
f86225206d Add support for MTE error reports in tombstones.
Teach debuggerd to use the new scudo APIs proposed in
https://reviews.llvm.org/D77283 for extracting MTE error reports from crashed
processes, and include those reports in tombstones if possible.

Bug: 135772972
Change-Id: I082dfd0ac9d781cfed2b8c34cc73562614bb0dbb
2020-04-27 13:15:49 -07:00
Peter Collingbourne
f03af8844a Read fault address on arm64 using proposed kernel API.
On aarch64, the top 8 bits of the address (i.e. the tag bits) of
the fault address in si_addr are always clear. This isn't ideal for
MTE which will require these bits in order to correctly diagnose
tag mismatches.

A proposed kernel patch [1] exposes the full fault address including
the tag bits as part of the ucontext. Change debuggerd to read this
fault address if available.

[1] https://patchwork.kernel.org/patch/11435077/

Bug: 135772972
Change-Id: Ia05be574113860f4e9ecc36a310c4b740e0c4afb
2020-03-27 20:00:06 -07:00
Peter Collingbourne
f3d542fe9f Create a debugger_process_info data structure with the process info pointers.
Similar to r.android.com/1247247 I'll be adding more of them for MTE.

Also, change the protocol between the crasher and crash_dump to make
it easier to add new fields and change the referenced data structures
without needing to worry about versioning. The version number for
static executables is now always 1 (where the protocol will never
change), while the version number for dynamic executables is always
4 (where the protocol can change, because the linker and crash_dump
are version locked).

Bug: 135772972
Change-Id: Ib4696d0544d7c87cb429aaaa15f18c3640059e16
2020-03-24 17:23:15 -07:00
Peter Collingbourne
843f7e645d Create a ProcessInfo structure with the process-wide information from the crasher.
We're now passing around a couple of addresses for GWP-ASan in addition
to abort_msg_address and fdsan_table_address, and I'm going to need to add
more of them for MTE. Move them into a data structure in order to simplify
various function signatures.

Bug: 135772972
Change-Id: Ie01e1bd93a9ab64f21865f56574696825a6a125f
2020-02-28 19:12:19 -08:00
Mitch Phillips
e0b4bb1b2e [GWP-ASan] Add GWP-ASan information to tombstones.
GWP-ASan can provide information about a crash that it caused. Grab the
GWP-ASan regions from the globals shared by the linker for crash-handler
purposes, pull the information from GWP-ASan, and display it.

This adds two regions:
 1. Causality tracking by GWP-ASan. We now print a cause header about
 the crash, like `Cause: [GWP-ASan]: Use After Free on a 1-byte
 allocation at 0x7365bb3ff8`
 2. Allocation and deallocation stack traces.

Bug: 135634846
Test: atest debuggerd_test

Change-Id: Id28d5400c9a9a053fcde83a4788f971e677d4643
2020-02-18 16:49:50 -08:00
Josh Gao
a48b41bcb8 debuggerd: switch to using platform headers for DEBUGGER_SIGNAL.
Test: treehugger
Change-Id: Ie9736c4a077dba1029d2352bd94d47ce07323aec
2019-12-17 16:36:05 -08:00
Josh Gao
5df504c5f8 crash_dump: populate uid field.
Bug: http://b/132359035
Test: manual
Change-Id: I99d8446024fc2d9395132dea45f03317976a9b62
2019-05-09 12:49:57 -07:00
Josh Gao
18cb681247 debuggerd: call setsid in our children.
There appears to be a kernel bug that causes SIGHUP and SIGCONT to be
sent to the parent process group we spawn from if the process group
contains stopped jobs (e.g. the parent itself, because of wait_for_gdb).

Call setsid in all of our children to prevent this from happening.

Bug: http://b/31124563
Test: adb shell 'setprop debug.debuggerd.wait_for_gdb 1; killall -ABRT surfaceflinger'
Change-Id: I1a48d70886880a5bfbe2deb80d48deece55faf09
2019-04-16 13:17:08 -07:00
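As a minimal sketch of the approach in 18cb681247, a forked child can detach itself from the parent's session and process group by calling setsid(). The function name `ChildCanStartNewSession` is hypothetical, and this is not the debuggerd code; it only demonstrates that a freshly forked child (which is never a process group leader) can start its own session.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that calls setsid() and reports through its exit status
// whether the call succeeded. Returns true if the child got a new session.
bool ChildCanStartNewSession() {
  pid_t pid = fork();
  if (pid == -1) return false;
  if (pid == 0) {
    // Child: leave the parent's session and process group.
    _exit(setsid() == -1 ? 1 : 0);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In a real helper the child would go on to do its work after setsid() rather than exiting immediately.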
David Srbecky
b9cc4fbb26 Revert "Check for data races when reading JIT/DEX entries."
This reverts commit 85b5fecec9.

Reason for revert: Breaks ART tests, reverting to investigate.

Change-Id: I1bb905407e87cbd4f832646651133a9caf6fcfc8
2019-04-05 18:23:32 +00:00
David Srbecky
85b5fecec9 Check for data races when reading JIT/DEX entries.
Update the entries only when the list is modified by the runtime.

Check that the list wasn't concurrently modified when being read.

Bug: 124287208
Test: libunwindstack_test
Test: art/test.py -b --host -r -t 137-cfi
Change-Id: I87ba70322053a01b3d5be1fdf6310e1dc21bb084
2019-03-29 14:01:32 +00:00
Christopher Ferris
60eb19795b Replace libbacktrace with libunwindstack directly.
Small modifications to the dump_stack method and added unit tests to
verify the output.

Bug: 120606663

Test: Unit tests pass, debuggerd run on processes on target.
Change-Id: Id385a915b751abda3dd6baebed6c3ce498c3bf6e
2019-01-29 17:57:14 -08:00
Jinguang Dong
8ac2f27cc2 tombstoned: fixed tombstones failed issue
In some scenarios tombstoned fails to generate a tombstone file
because of a socket communication error.

Steps to reproduce:
step 1: reboot device
step 2: ps -ef |grep zygote , get the pid of zygote64
(Note: zygote64 must not have been killed or restarted since boot;
otherwise the tombstone file is generated.)
step 3: kill -5 pid of zygote64
step 4: cd data/tombstones/, and could not find the tombstone
file of zygote64.

[Cause Analysis]
1. logcat shows the following logs:
11-19 15:38:43.789   569   569 F libc : Fatal signal 5 (SIGTRAP),
code 0 (SI_USER) in tid 569 (main), pid 569 (main)
11-19 15:38:43.829  6115  6115 I crash_dump64: obtaining output
fd from tombstoned, type: kDebuggerdTombstone
11-19 15:38:43.830   569  5836 I Zygote  : Process 6114 exited
cleanly (0)
11-19 15:38:43.830   777   777 I /system/bin/tombstoned: received
crash request for pid 569
11-19 15:38:43.831  6115  6115 I crash_dump64: performing dump of
process 569 (target tid = 569)
...
11-19 15:38:43.937   777   777 W /system/bin/tombstoned: crash
socket received short read of length 0 (expected 12)
2. The last log line is printed by crash_request_cb in
tombstoned.cpp. The relevant code is:
  rc = TEMP_FAILURE_RETRY(read(sockfd, &request, sizeof(request)));
  if (rc == -1) {
    PLOG(WARNING) << "failed to read from crash socket";
    goto fail;
  } else if (rc != sizeof(request)) {
    LOG(WARNING) << "crash socket received short read of length " << rc << " (expected "
                 << sizeof(request) << ")";
    goto fail;
  }

tombstoned reads the message from the socket, and here the message
length is zero, so a socket communication error occurred at that point.
This CL makes crash_dump resend the socket message when the
communication fails.

Test: 1 reboot device
      2 ps -ef |grep zygote , get the pid of zygote64
       (Note: zygote64 must not have been killed or restarted since
       boot; otherwise the tombstone file is generated.)
      3 kill -5 pid of zygote64
      4 cd data/tombstones/, and could find the tombstone file of
       zygote64.

Change-Id: Ic152b081024d6c12f757927079fd221b63445b18
2018-11-28 14:00:27 +08:00
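The short-read check quoted in 8ac2f27cc2 distinguishes three outcomes: an error, a read that returns fewer bytes than expected, and a complete request. The sketch below shows that classification in a standalone form. `ReadResult` and `ReadExactly` are hypothetical names, the loop assumes byte-stream semantics (the real crash socket may behave differently), and it does not reproduce the resend logic that the CL adds to crash_dump.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

enum class ReadResult { kComplete, kShort, kError };

// Reads exactly len bytes unless EOF or an error intervenes.
// kShort corresponds to the "short read" warning: the peer closed
// before sending the whole request (possibly after sending nothing).
ReadResult ReadExactly(int fd, void* buf, size_t len) {
  char* p = static_cast<char*>(buf);
  size_t done = 0;
  while (done < len) {
    ssize_t rc = read(fd, p + done, len - done);
    if (rc == -1) {
      if (errno == EINTR) continue;  // interrupted, retry
      return ReadResult::kError;
    }
    if (rc == 0) return ReadResult::kShort;  // EOF before len bytes arrived
    done += static_cast<size_t>(rc);
  }
  return ReadResult::kComplete;
}
```

A sender that sees its peer report `kShort` is the case where retrying the request, as the CL does, becomes useful.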
Josh Gao
8d44b14543 crash_dump: annotate intended fallthrough.
Bug: http://b/116020901
Test: treehugger
Change-Id: I5d059d051fb257efe7f7e1790fd0bc2abd364167
2018-09-18 13:22:22 -07:00
Josh Gao
ce841d91fb libdebuggerd: extract and print the fdsan table.
This commit only prints the raw value of the owner tag, pretty-printing
will come in a follow-up commit.

Test: debuggerd `pidof adbd`
Test: static_crasher fdsan_file + manual inspection of tombstone
Change-Id: Idb7375a12e410d5b51e6fcb6885d4beb20bccd0e
2018-08-06 18:50:10 -07:00
Josh Gao
9da1f51c10 crash_dump: pass the address of the fdsan table.
Pass the address of the fdsan table down to crash_dump so that we can
dump the fdsan table along with the open file descriptor list.

Test: debuggerd_test
Test: manually ran an old static_crasher
Change-Id: Icbac5487109f2db1e1061c4d46de11b016b299e3
2018-08-06 18:50:10 -07:00
Josh Gao
38ac45df17 crash_dump: defuse our signal handlers earlier.
We have a LOG(FATAL) that can potentially happen before we turn off
SIGABRT. Move the signal handler defusing to the very start of main.

Bug: http://b/77920633
Test: treehugger
Change-Id: I7a2f2a0f2bed16e54467388044eca254102aa6a0
2018-04-27 13:31:47 -07:00
Josh Gao
2b2ae0c88e crash_dump: fork a copy of the target's address space.
Reduce the amount of time that a process remains paused by pausing its
threads, fetching their registers, and then performing unwinding on a
copy of its address space. This also works around a kernel change
that's in 4.9 that prevents ptrace from reading memory of processes
that we don't have immediate permissions to ptrace (even if we
previously ptraced them).

Bug: http://b/62112103
Bug: http://b/63989615
Test: treehugger
Change-Id: I7b9cc5dd8f54a354bc61f1bda0d2b7a8a55733c4
2017-12-15 14:11:12 -08:00
Christopher Ferris
ab9cf8b4cc Only call one unwinder.
Nobody is looking at the mismatches, and it can cause problems
with tombstone parsers.

Also, fix the dump_header_info test and remove unused properties_fake.cpp.

Test: Ran unit tests, verified tombstones still work.
Change-Id: I4261646016b4e84b26a5aee72f3227f1ce48ec9a
2017-10-27 15:18:27 -07:00
dimitry
6429e20494 Recommend using pid instead of tid for gdbclient.py
Using the pid allows examining other threads after gdb
is attached to a crashing process.

Test: make
Change-Id: Ie4bab0925d7abde7f114791848fa5563db245c8e
2017-09-12 10:47:50 +02:00
Josh Gao
c3706668c6 libdebuggerd: cleanup.
Move libdebuggerd headers into their own directory for namespacing,
move some includes to the top of their implementing files, delete some
dead code.

Test: mma, treehugger
Change-Id: Ie4c44e32e2ab3bc678092899d257fd4ed634aa34
2017-08-29 15:18:46 -07:00
Treehugger Robot
e67c7b94c2 Merge "crash_dump: print the identity of tracers."
2017-08-19 01:20:24 +00:00
Josh Gao
fd13bf0dcd crash_dump: print the identity of tracers.
Instead of printing a useless "ptrace attach failed: strerror(EPERM)"
message, print the name and pid of a competing tracer when we fail to
attach because a process is already being ptraced.

Bug: http://b/31531918
Test: debuggerd_test32, debuggerd_test64 on aosp_angler
Test: strace -p `pidof surfaceflinger`; debuggerd -b surfaceflinger
Change-Id: Ifd3f80fe03de30ff38c0e0068560a7b12875f29d
2017-08-18 16:16:58 -07:00
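One way to find a competing tracer, as fd13bf0dcd describes reporting, is the `TracerPid:` field of /proc/<pid>/status on Linux. Whether crash_dump obtains it this way is an assumption of this sketch; `ParseTracerPid` is a hypothetical helper that works on the file's text so it can be exercised without a live process.

```cpp
#include <sstream>
#include <string>

// Extracts the TracerPid value from the text of /proc/<pid>/status.
// Returns -1 if the field is absent or malformed; 0 means "not traced".
int ParseTracerPid(const std::string& status_text) {
  std::istringstream lines(status_text);
  std::string line;
  const std::string key = "TracerPid:";
  while (std::getline(lines, line)) {
    if (line.compare(0, key.size(), key) != 0) continue;
    std::istringstream value(line.substr(key.size()));
    int pid = -1;
    if (value >> pid) return pid;
    return -1;
  }
  return -1;
}
```

The tracer's name could then be looked up from the tracer's own /proc entry, which this sketch leaves out.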
Christopher Ferris
9a8c855780 Compare new unwinder to old unwinder in debuggerd.
In debuggerd, when dumping a tombstone, run the new unwinder and verify
the old and new unwinder are the same. If not, dump enough information
in the tombstones to figure out how to duplicate the failure.

Bug: 23762183

Test: Builds, ran and forced a mismatch and verified output.
Change-Id: Ia178bde64d67e623d4f35086ebda68aebbff0c3c
2017-08-11 16:37:59 -07:00
Andreas Gampe
b02851a984 Debuggerd: Extend crash_dump timeout to 5 seconds
Some processes have lots of threads and minidebug-info. Unwinding
these can take more than the original two seconds.

Bug: 62828735
Test: m
Test: debuggerd_test
Test: adb shell kill -s 6 `pid system_server`
Change-Id: I0041bd01753135ef9d86783a3c6a5cbca1c5bbad
2017-06-22 20:19:11 -07:00
Josh Gao
3407d7c80f Revert "crash_dump: defer pausing threads until we're ready."
This reverts commit 8a2a2d182a.

Bug: http://b/62572585
Change-Id: Ia4278bca52178eb7b7b28b30d0930b292d97f353
2017-06-13 17:21:12 +00:00
Josh Gao
8a2a2d182a crash_dump: defer pausing threads until we're ready.
Don't pause the threads we're going to dump until after we're about to
fetch their backtraces.

Bug: http://b/62112103
Test: debuggerd_test
Change-Id: Id7ab0464842b35f98f3b3ebc42fb76161d8afbd2
2017-06-07 14:11:28 -07:00
Josh Gao
8bb039073f crash_dump: add tracing.
Add some tracing to figure out where time is going during a dump.

Bug: http://b/62112103
Test: systrace.py sched freq idle bionic
Change-Id: Ic2a212beeb0bb0350b4d9c2cd7a4e70adc97752d
2017-06-07 14:11:06 -07:00
Josh Gao
b0e51e388b crash_dump: don't notify ActivityManager if it crashed.
Bug: http://b/38427757
Test: killall -ABRT system_server, plus added logging
Change-Id: Ic15e0b0870b1ec08a2f165ad0e5356afed02eece
2017-06-01 12:42:33 -07:00
Josh Gao
e740250b9d crash_dump: clear the signal mask.
crash_dump inherits its signal mask from the thread that forked it,
which always has all of its signals blocked, now that sigchain respects
sa_mask.

Manually clear the signal mask, and reduce the timeout to a
still-generous 2 seconds.

Bug: http://b/38427757
Test: manually inserted sleep in crash_dump
Change-Id: If1c9adb68777b71fb19d9b0f47d6998733ed8f52
2017-06-01 11:55:25 -07:00
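Clearing an inherited signal mask, as e740250b9d describes, can be done with the POSIX sigprocmask() call and an empty set. The helper names below are hypothetical and the sketch is not the crash_dump code itself.

```cpp
#include <signal.h>

// Unblocks every signal for the calling thread. Returns true on success.
bool ClearSignalMask() {
  sigset_t empty;
  if (sigemptyset(&empty) != 0) return false;
  return sigprocmask(SIG_SETMASK, &empty, nullptr) == 0;
}

// Reports whether signo is currently blocked (queries without modifying).
bool IsBlocked(int signo) {
  sigset_t current;
  if (sigprocmask(SIG_SETMASK, nullptr, &current) != 0) return false;
  return sigismember(&current, signo) == 1;
}
```

Calling `ClearSignalMask()` at the start of a process that was forked from a thread with all signals blocked lets timers and fatal signals be delivered again.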
Narayan Kamath
a73df601b7 tombstoned: allow intercepts for java traces.
All intercept requests and crash dump requests must now specify a
dump_type, which can be one of kDebuggerdNativeBacktrace,
kDebuggerdTombstone or kDebuggerdJavaBacktrace. Each process can have
only one outstanding intercept registered at a time.

There's only one non-trivial change in this changeset; and that is
to crash_dump. We now pass the type of dump via a command line
argument instead of inferring it from the (resent) signal, this allows
us to connect to tombstoned before we wait for the signal as the
protocol requires.

Test: debuggerd_test

Change-Id: I189b215acfecd08ac52ab29117e3465da00e3a37
2017-05-31 10:35:32 +01:00
Narayan Kamath
2d377cd688 tombstoned: Add a shared library version of libtombstoned_client...
.. for ART and the frameworks to link against. In the new stack dumping
scheme (see related bug), the Java runtime will communicate with
tombstoned in order to obtain a FD to which it can write its traces.

Also move things around to separate headers that are private
implementation details from headers that constitute the public debuggerd
API. There are currently only three such headers :

- tombstoned/tombstoned.h
- debuggerd/client.h
- debuggerd/handler.h

Bug: 32064548
Test: make

Change-Id: If1b8578550e373d84828b180bbe585f1088d1aa3
2017-05-22 16:55:21 +01:00
Chenjie Luo
68c24eff77 Remove not-used dependency in crash_dump
Test: Build crash_dump.
Change-Id: I053cf53196b3e438545138ca8401a0ad01006a8c
2017-05-08 15:18:40 -07:00
Josh Gao
57f58f8e4a crash_dump: fetch process/thread names before dropping privileges.
Processes that don't have dumpable set to 1 cannot have their
process/thread names read by processes that don't have all of their
capabilities. Fetch these names in crash_dump before dropping
privileges.

Bug: http://b/36237221
Test: debuggerd_test
Test: debuggerd -b `pidof android.hardware.bluetooth@1.0-service`
Change-Id: I174769e7b3c1ea9f11f9c8cbdff83028a4225783
2017-03-15 23:30:14 -07:00
Josh Gao
c7fe0600cc crash_dump: fix warnings, turn on -Werror.
Test: mma
Change-Id: I0722fef7b513be976cbbe89f73e8bb7138a80442
2017-03-13 14:13:29 -07:00