Commit graph

71 commits

Author SHA1 Message Date
Elliott Hughes
b20a3aaa1c debuggerd: unify licenses.
All but three files are Apache-2.0 already.

Bug: http://b/191499510
Test: /google/src/files/head/depot/google3/wireless/android/busytown/ayeaye/analyzers/copyright/tools/scan_android_project.sh ~/aosp/system/core/debuggerd/ | grep -v APACHE
Change-Id: I430c3382dd160e398f02470d7053ecea39c98f41
2021-06-18 15:49:30 -07:00
Christopher Ferris
0c787f0d6b Avoid thread cache in unwinder.
The code in the fallback path calls pthread_key_create when using the
normal thread cache. However, this code is executed out of the linker,
which means that the call doesn't see keys created by the libc version
of pthread_key_create. As of now, simply avoid using the thread cache
to avoid this problem.

Bug: 189803009

Test: debuggerd -b on a media process on a 32 bit Android Go device
Test: and observe no crash.
Test: debuggerd unit tests pass.
Change-Id: I9ca1a55e44d3bb69d49450826d7d64d7a64145c3
(cherry picked from commit 49e5a76544)
2021-06-14 19:57:33 +00:00
Elliott Hughes
e4781d54a5 debuggerd: prepare to abandon ship^Wgdb.
Talk of "gdb" when we currently mean "gdb or lldb" and will soon mean
"lldb" is starting to confuse people. Let's use the more neutral
"debugger" in places where it really doesn't matter.

The switch from gdbclient.py to lldbclient.py is a change for another
day...

Test: treehugger
Change-Id: If39ca7e1cdf4c8bb9475f1791cdaf201fbea50e0
2021-03-17 10:03:25 -07:00
Josh Gao
76e1e30f16 Reland protobuf tombstones.
This reverts the following commits:
    e156ede145.
    eda96eddcb.
    5ec54d1e84.
    1e45d3f223.
    a50f61f8fa.

Test: treehugger
Test: atest -c CtsSeccompHostTestCases:android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls
Change-Id: Ic2b1f489ac9f1fec7d7a33c845c29891f4306bbd
2021-01-26 17:55:17 -08:00
Jerome Gaillard
1e45d3f223 Revert "libdebuggerd: add protobuf implementation."
Revert "Let crash_dump read /proc/$PID."

Revert submission 1556807-tombstone_proto

Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug

Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.

Change-Id: Ia0a1ee57e7630e01c495dc166218f665340aad7f
2021-01-26 12:41:20 +00:00
Josh Gao
6bf6a9fc61 Merge changes from topic "tombstone_proto"
* changes:
  libdebuggerd: add protobuf implementation.
  tombstoned: support for protobuf fds.
  tombstoned: make it easier to add more types of outputs.
  tombstoned: switch from goto to RAII.
2021-01-25 22:18:48 +00:00
Josh Gao
92317d82c9 libdebuggerd: add protobuf implementation.
This commit implements protobuf output for tombstones, along with a
translator that should emit bytewise identical output to the existing
tombstone dumping code, except for ancillary data from GWP-ASan and
Scudo, which haven't been implemented yet.

Test: setprop debug.debuggerd.translate.translate_proto_to_text 1 &&
        /data/nativetest64/debuggerd_test/debuggerd_test
Test: for TOMBSTONE in /data/tombstones/tombstone_??; do
        pbtombstone $TOMBSTONE.pb | diff $TOMBSTONE -
      done
Change-Id: Ieeece6e6d1c26eb608b00ec24e2e725e161c8c92
2021-01-21 15:40:23 -08:00
Elliott Hughes
d8af5b5e4f Remove unnecessary #includes.
Sadly, it looks like we do still really use libcutils for some of the
socket functions.

Test: treehugger
Change-Id: Ic71f97507c89b10d2f3b7a2971064a9e6b1d349d
2021-01-19 09:21:52 -08:00
Peter Collingbourne
ebc78cc852 Switch to the new kernel API for obtaining fault address tag bits.
The discussion on LKML is converging on v16 of the fault address tag
bits patch [1]. In this version of the patch the presence of the tag
bits in si_addr is controlled by a sa_flags bit, and a protocol is
introduced to allow userspace to detect kernel support for sa_flags
bits. Update the tombstone signal handler to use this API to read
the tag bits, update the interceptors in libsigchain to implement
the flag support detection protocol and hide the tag bits in si_addr
from chained signal handlers that did not request them to match the
kernel behavior.

[1] https://lore.kernel.org/linux-arm-kernel/cover.1605235762.git.pcc@google.com/

Change-Id: I57f24c07c01ceb3e5b81cfc15edf559ef7dfc740
2020-11-13 16:08:27 -08:00
Treehugger Robot
d0642a373d Merge "Improve error message in debuggerd fallback handler." 2020-10-01 21:00:16 +00:00
Josh Gao
68083003b8 Improve error message in debuggerd fallback handler.
Bug: http://b/164014625
Test: none
Change-Id: I4f1e61be93c511676e66b909a15735bba963eff0
2020-09-25 13:51:02 -07:00
Christopher Ferris
b05c472421 Add arch member into Unwinder object.
This simplifies some of the logic and removes the need to pass an
Arch value to functions that should already know about the arch
it is operating on.

Includes fixes for debuggerd/libbacktrace.

Added new unit tests to cover new cases.

Test: All unit tests pass.
Test: Faked unwinder failing to verify debuggerd error messages display
Test: properly in backtrace and tombstone.
Change-Id: I439fcae0695befcfb1cb4c0a786cc74949d33425
2020-09-24 18:46:23 -07:00
Josh Gao
c40a7515eb debuggerd: don't leave a zombie child if crash_dump is killed.
If crash_dump dies before it gets a chance to write to the pipe we use
to let the debugged-process know that it successfully started, we
weren't cleaning up the child we fork to start it, leaving a zombie
child.

Bug: http://b/152119184
Test: debuggerd_test
Change-Id: Id01cc05f693995e9998941774f74ab8e3d8b4d8a
2020-04-10 10:09:39 -07:00
Peter Collingbourne
cd63cae6b2 Merge "Read fault address on arm64 using proposed kernel API." 2020-03-30 21:40:58 +00:00
Peter Collingbourne
5677803cb7 Merge "Create a debugger_process_info data structure with the process info pointers." 2020-03-30 21:36:41 +00:00
Peter Collingbourne
f03af8844a Read fault address on arm64 using proposed kernel API.
On aarch64, the top 8 bits of the address (i.e. the tag bits) of
the fault address in si_addr are always clear. This isn't ideal for
MTE which will require these bits in order to correctly diagnose
tag mismatches.

A proposed kernel patch [1] exposes the full fault address including
the tag bits as part of the ucontext. Change debuggerd to read this
fault address if available.

[1] https://patchwork.kernel.org/patch/11435077/

Bug: 135772972
Change-Id: Ia05be574113860f4e9ecc36a310c4b740e0c4afb
2020-03-27 20:00:06 -07:00
Peter Collingbourne
f3d542fe9f Create a debugger_process_info data structure with the process info pointers.
Similar to r.android.com/1247247 I'll be adding more of them for MTE.

Also, change the protocol between the crasher and crash_dump to make
it easier to add new fields and change the referenced data structures
without needing to worry about versioning. The version number for
static executables is now always 1 (where the protocol will never
change), while the version number for dynamic executables is always
4 (where the protocol can change, because the linker and crash_dump
are version locked).

Bug: 135772972
Change-Id: Ib4696d0544d7c87cb429aaaa15f18c3640059e16
2020-03-24 17:23:15 -07:00
Peter Collingbourne
b72e74810c Move crash_dump into the runtime APEX.
A future change will introduce a version lock between linker and
crash_dump. Move crash_dump into the runtime APEX alongside linker in order to
ensure that they will be the same version even if the runtime APEX is updated.

Bug: 135772972
Change-Id: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
Merged-In: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
2020-03-18 10:38:04 -07:00
Mitch Phillips
e0b4bb1b2e [GWP-ASan] Add GWP-ASan information to tombstones.
GWP-ASan can provide information about a crash that it caused. Grab the
GWP-ASan regions from the globals shared by the linker for crash-handler
purpopses, pull the information from GWP-ASan, and display it.

This adds two regions:
 1. Causality tracking by GWP-ASan. We now print a cause header about
 the crash, like `Cause: [GWP-ASan]: Use After Free on a 1-byte
 allocation at 0x7365bb3ff8`
 2. Allocation and deallocation stack traces.

Bug: 135634846
Test: atest debuggerd_test

Change-Id: Id28d5400c9a9a053fcde83a4788f971e677d4643
2020-02-18 16:49:50 -08:00
Josh Gao
55c7ed4e2e debuggerd_handler: increase thread stack size.
1 page isn't enough to log on AArch64, and clean pages are free, so
increase the stack size to 8 pages.

Bug: http://b/144887737
Test: treehugger
Change-Id: I731b3bc27ab37f4b830a9478a04cd34d4f7648d3
2020-01-17 17:25:30 -08:00
Josh Gao
a48b41bcb8 debuggerd: switch to using platform headers for DEBUGGER_SIGNAL.
Test: treehugger
Change-Id: Ie9736c4a077dba1029d2352bd94d47ce07323aec
2019-12-17 16:36:05 -08:00
Nick Desaulniers
67d52aa0f6 [debuggerd] fix -Wreorder-init-list
C++20 wants members to be ordered unlike C99.

Bug: 139945549
Test: mm
Change-Id: I3cbca589511c1e0bbc10c691949e18de77e16031
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
2019-10-10 14:54:35 -07:00
Josh Gao
18cb681247 debuggerd: call setsid in our children.
There appears to be a kernel bug that causes SIGHUP and SIGCONT to be
sent to the parent process group we spawn from if the process group
contains stopped jobs (e.g. the parent itself, because of wait_for_gdb).

Call setsid in all of our children to prevent this from happening.

Bug: http://b/31124563
Test: adb shell 'setprop debug.debuggerd.wait_for_gdb 1; killall -ABRT surfaceflinger'
Change-Id: I1a48d70886880a5bfbe2deb80d48deece55faf09
2019-04-16 13:17:08 -07:00
David Srbecky
b9cc4fbb26 Revert "Check for data races when reading JIT/DEX entries."
This reverts commit 85b5fecec9.

Reason for revert: Breaks ART tests, reverting to investigate.

Change-Id: I1bb905407e87cbd4f832646651133a9caf6fcfc8
2019-04-05 18:23:32 +00:00
David Srbecky
85b5fecec9 Check for data races when reading JIT/DEX entries.
Update the entries only when the list is modified by the runtime.

Check that the list wasn't concurrently modified when being read.

Bug: 124287208
Test: libunwindstack_test
Test: art/test.py -b --host -r -t 137-cfi
Change-Id: I87ba70322053a01b3d5be1fdf6310e1dc21bb084
2019-03-29 14:01:32 +00:00
Josh Gao
5e8d68c2b2 debuggerd_handler: demote abort on exec failure to log.
If a process is ptraced already, we might not be able to exec crash_dump
due to selinux. Since we can be called for non-fatal events, we
shouldn't abort in that case.

Bug: http://b/128054996
Test: treehugger
Change-Id: I1442041caa7af908df2ab87b9e010c44082e7587
2019-03-18 14:39:47 -07:00
Christopher Ferris
60eb19795b Replace libbacktrace with libunwindstack directly.
Small modifications to the dump_stack method and added unit tests to
verify the output.

Bug: 120606663

Test: Unit tests pass, debuggerd run on processes on target.
Change-Id: Id385a915b751abda3dd6baebed6c3ce498c3bf6e
2019-01-29 17:57:14 -08:00
Josh Gao
08163cb032 debuggerd_fallback: fix fd leak.
Previously, when we received simultaneous dump requests, we were CASing
a file descriptor value into a variable, and then failing to close it
if the CAS failed.

Bug: http://b/118412443
Test: debuggerd_test
Change-Id: I075c35a239426002eb9416da3d268c3d1a18e9d2
2018-10-30 15:33:58 -07:00
Josh Gao
6f9eeecd2b Fix multithreaded backtraces for seccomp processes.
Add threads to the existing seccomp backtrace test to prevent
regressing this.

Bug: http://b/114139908
Bug: http://b/115349586
Test: debuggerd_test32
Test: debuggerd_test64
Change-Id: I07fbe1619b60f0008deb045a249f9045404478c2
2018-09-12 18:12:13 -07:00
Josh Gao
4843c18634 debuggerd_handler: receive abort messages via sigqueue(DEBUGGER_SIGNAL).
Make it possible for code such as fdsan that generates debugging
tombstones via raise(DEBUGGER_SIGNAL) to pass an abort message as well.

Bug: http://b/112770187
Test: debuggerd_test
Change-Id: Idc34263241c18033573e466da3a45aa6f716ddb3
2018-08-27 16:55:07 -07:00
Josh Gao
9da1f51c10 crash_dump: pass the address of the fdsan table.
Pass the address of the fdsan table down to crash_dump so that we can
dump the fdsan table along with the open file descriptor list.

Test: debuggerd_test
Test: manually ran an old static_crasher
Change-Id: Icbac5487109f2db1e1061c4d46de11b016b299e3
2018-08-06 18:50:10 -07:00
Josh Gao
c954ec09c5 debuggerd_handler: use syscall(__NR_close) instead of close.
Avoid bionic's file descriptor ownership checks by calling the close
syscall manually.

Test: debuggerd_test
Change-Id: I10af6aca0e66fe030fd7a53506ae61c87695641d
2018-07-18 18:11:46 -07:00
Josh Gao
d2b15dd674 debuggerd: fix CrasherTest.seccomp_crash_oom.
Switch from _exit to raising SIGABRT when we recurse in the fallback
handler, so that waiters see an abort instead of a regular exit.

Bug: http://b/79717060
Test: debuggerd_test32
Test: debuggerd_test64
Change-Id: Iddee1cb1b759690adf07bbb8cd0fda2faac87571
2018-05-16 00:16:09 -07:00
Elliott Hughes
70d8f28945 Show signal sender for SI_FROMUSER signals.
Suicide doesn't change:

  signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------

But homicide now looks like this (this is `sleep 666` killed by
`kill -SEGV` as root:

  signal 11 (SIGSEGV), code 0 (SI_USER from pid 4446, uid 0), fault addr --------

Bug: http://b/78594105
Test: manual
Change-Id: I8c2feafba8cc5a3db85e8250004d428a464c5d9e
2018-04-26 08:19:17 -07:00
Josh Gao
70adac6a8a debuggerd_fallback: don't recursively abort.
Calls to abort() will always result in our signal handler being called,
because abort will manually unblock SIGABRT before raising it. This
can lead to deadlock when handling address space exhaustion in the
fallback handler. To fix this, switch our mutex to a recursive mutex,
and manually keep track of our lock count.

Bug: http://b/72929749
Test: debuggerd_test --gtest_filter="CrasherTest.seccomp_crash_oom"
Change-Id: I609f263ce93550350b17757189326b627129d4a7
2018-02-22 16:31:38 -08:00
Josh Gao
c531ed6648 debuggerd_fallback: fix race.
A race condition occurs when one thread takes more than a second to get
scheduled to handle the signal we send to ask it to dump its stack.
When this happens, the main thread will continue on, close the fd, and
then ask the next thread to dump, but the slow thread will then wake up
and try to write to the new thread's fd, or trigger an assertion in
__linker_enable_fallback_allocator.

Do a few things to make this less bad:
  - encode both target tid and fd in the shared atomic, so that we know
    who each fd is for
  - switch __linker_enable_fallback_allocator to return success instead
    of aborting, and bail out if it's already in use
  - write to the output fd right when we get to it, instead of doing it
    whenever the dumping code decides to, to reduce the likelihood that
    the timeout expires

Test: debuggerd_test
Change-Id: Ife0f6dae388b601e7f991605f14d7a0274013f6b
2018-02-09 15:35:40 -08:00
Luis Hector Chavez
4841e744c2 debuggerd_handler: set PR_SET_PTRACER before running crash_dump.
Set and restore PR_SET_PTRACER when performing a dump, so that when
Android is running on a kernel that has the Yama LSM enabled (and the
value of ptrace_scope is > 0), crash_dump can attach to processes and
print nice, symbolized stack traces.

Bug: 70992745
Test: kill -6 `pidof surfaceflinger` && logcat -d -b crash
      # in both sailfish and Chrome OS

Change-Id: If4646442c6000fdcc69cf4ab95fdc71ae74baaaf
2017-12-27 13:19:31 -08:00
Josh Gao
7302097e77 debuggerd: wait for dump completion on crashes.
When a process crashes, both ActivityManager and init will try to kill
its process group when they notice. The recent change to minimize the
amount of time a process is paused results in crash dumps being killed
before they finish as a result of this. Since anything that needs to be
low-latency is probably not going to be too happy if it crashes, just
wait for completion whenever we're processing a real crash.

Bug: http://b/70343110
Test: debuggerd_test
Change-Id: I894bb06efd264b1ba005df06f7326a72f4b767bb
2017-12-22 14:20:12 -08:00
Josh Gao
2b2ae0c88e crash_dump: fork a copy of the target's address space.
Reduce the amount of time that a process remains paused by pausing its
threads, fetching their registers, and then performing unwinding on a
copy of its address space. This also works around a kernel change
that's in 4.9 that prevents ptrace from reading memory of processes
that we don't have immediate permissions to ptrace (even if we
previously ptraced them).

Bug: http://b/62112103
Bug: http://b/63989615
Test: treehugger
Change-Id: I7b9cc5dd8f54a354bc61f1bda0d2b7a8a55733c4
2017-12-15 14:11:12 -08:00
Christopher Ferris
664d2a9093 Force call the fallback handler.
Always check to see if the fallback handler has been called and is
not trying to dump a specific thread.

Bug: 69110957

Test: Verified on a system where the prctl value changes, that before the
Test: change it dumps multiple tombstones, and after the change it
Test: works as expected.
Test: Ran debuggerd unit tests.
Test: Dumped process using debuggerd -b <PID> and debuggerd <PID>.
Change-Id: Id98bbe96cced9335f7c3e17088bb4ab2ad2e7a64
2017-11-16 20:07:13 -08:00
Josh Gao
cdea750576 crash_dump: don't inherit environment from parent.
Bug: http://b/68381717
Test: debuggerd_test
Change-Id: Ie1b342bc9901cb9ae9b79147899928a19052cbad
2017-11-03 16:57:56 -07:00
Josh Gao
c3706668c6 libdebuggerd: cleanup.
Move libdebuggerd headers into their own directory for namespacing,
move some includes to the top of their implementing files, delete some
dead code.

Test: mma, treehugger
Change-Id: Ie4c44e32e2ab3bc678092899d257fd4ed634aa34
2017-08-29 15:18:46 -07:00
Josh Gao
fdf832dfd3 base: add Pipe and Socketpair wrappers.
Also, switch debuggerd_handler over to using android::base::unique_fd.

Test: treehugger
Change-Id: I97b2ce22f1795ce1c4370f95d00d769846cc54b8
2017-08-28 14:51:07 -07:00
Josh Gao
81e6c0b613 debuggerd_handler: print pid and process name.
Bug: http://b/64483618
Test: manual
Change-Id: Ie772324895a8ffcd41d919a4a6113862a6468d12
2017-08-11 15:38:51 -07:00
Narayan Kamath
a73df601b7 tombstoned: allow intercepts for java traces.
All intercept requests and crash dump requests must now specify a
dump_type, which can be one of kDebuggerdNativeBacktrace,
kDebuggerdTombstone or kDebuggerdJavaBacktrace. Each process can have
only one outstanding intercept registered at a time.

There's only one non-trivial change in this changeset; and that is
to crash_dump. We now pass the type of dump via a command line
argument instead of inferring it from the (resent) signal, this allows
us to connect to tombstoned before we wait for the signal as the
protocol requires.

Test: debuggerd_test

Change-Id: I189b215acfecd08ac52ab29117e3465da00e3a37
2017-05-31 10:35:32 +01:00
Narayan Kamath
2d377cd688 tombstoned: Add a shared library version of libtombstoned_client...
.. for ART and the frameworks to link against. In the new stack dumping
scheme (see related bug), the Java runtime will communicate with
tombstoned in order to obtain a FD to which it can write its traces.

Also move things around to separate headers that are private
implementation details from headers that constitute the public debuggerd
API. There are currently only three such headers :

- tombstoned/tombstoned.h
- debuggerd/client.h
- debuggerd/handler.h

Bug: 32064548
Test: make

Change-Id: If1b8578550e373d84828b180bbe585f1088d1aa3
2017-05-22 16:55:21 +01:00
Josh Gao
2e7b8e2d1a debuggerd_handler: use syscall(__NR_get[pt]id) instead of get[pt]id.
bionic's cached values for getpid/gettid can be invalid if the crashing
process manually invoked clone to create a thread or process, which
will lead the crash_dump refusing to do anything, because it sees the
actual values.

Use the getpid/gettid syscalls directly to ensure correct values on
this end.

Bug: http://b/37769298
Test: debuggerd_test
Change-Id: I0b1e652beb1a66e564a48b88ed7fa971d61c6ff9
2017-05-05 14:58:12 -07:00
Christopher Ferris
ac225780dd Move libc_logging to libasync_safe.
Move the name of the "private/libc_logging.h" header to <async_safe/log.h>.

For use of libc_malloc_debug_backtrace, remove the libc_logging library.
The library now includes the async safe log functions.

Remove the references to libc_logging.cpp in liblog, it isn't needed because
the code is already protected by a check of the __ANDROID__ define.

Test: Compiled and boot bullhead device.
Test: Run debuggerd unit tests.
Test: Run liblog unit tests on target and host.
Test: Run libmemunreachable unit tests (these tests are flaky though).
Change-Id: Ie79d7274febc31f210b610a2c4da958b5304e402
2017-05-02 18:38:46 -07:00
Josh Gao
e06f2a4886 debuggerd_handler: don't assume that abort message implies fatal.
Applications can set abort messages via android_set_abort_message
without actually aborting. This leads to following non-fatal dumps
printing their output to logcat in the same format as a regular crash.

Bug: http://b/37754992
Test: debuggerd_test
Change-Id: I9c5e942984dfda36448860202b0ff1c2950bdd07
2017-04-27 17:28:05 -07:00
Elliott Hughes
561da6aa82 "Requested dump for tid XXX" message shouldn't be fatal.
This just means we were asked to dump, not that something necessarily went
wrong.

Bug: http://b/36191903
Test: builds
Change-Id: I5638b38f3a13081b1e971512f43238010febb59c
2017-03-23 23:04:27 -07:00