53 commits

Author SHA1 Message Date
Christopher Ferris
b92b52c071 Add ability to handle multiple intercepts per pid.
While doing this, refactor the intercept code to be easier to understand.

The primary use case for this is to perform a parallel stack dump (both Java and native) for specific ANRs.

Add tests for all of the different intercept conditions.

Modify the tests to display the error message from the intercept
response if there is an error.

Bug: 254634348
Test: All unit tests pass.
Test: Ran debuggerd on native and java processes.
Test: Created a bugreport without error.
Change-Id: Ic531ccee05b9a470748b815cf109e0076150a0b6
2023-10-19 15:13:59 +00:00
Nikita Ioffe
75be784fba Switch to tombstoned.microdroid
The long-term plan is to completely remove tombstoned from microdroid (b/243494912); however, it might take some time to implement.

In the meantime, we've recently removed cgroups support from the microdroid kernel. This means that starting a tombstoned results in a bunch of non-fatal errors in the logs that are related to the fact that tombstoned service specifies task_profiles.

To get rid of these error messages, we temporarily add a microdroid variant of tombstoned (tombstoned.microdroid) that doesn't specify task_profiles.

Bug: 239367015
Test: microdroid presubmit
Change-Id: Ia7d37ede2276790008702e48fdfaf37f4c1fd251
2022-10-24 15:56:33 +00:00
Christopher Ferris
ab9f0cd759 Remove double check of fd value.
The output.text.fd value is only ever -1 when there is a failure.
There is no need to check both < 0 or -1, so only check for -1.

Test: Unit tests pass.
Test: Verified the message is seen on intercept and not on
Test: regular crashes.
Change-Id: I1eddcd5d2342b268ceb261b246c98b10cee85bb4
2021-09-01 13:36:03 -07:00
Christopher Ferris
64a92413b6 Modify missing output fd message.
The "missing output fd" message can seem like an error, so modify
the message to indicate what is really happening. This message
will occur normally when running the debuggerd command, or when
a bugreport is generated, or when an ANR occurs. In all of those
cases, this is not an error, but an expected action.

Bug: 196189981

Test: Ran debuggerd -b and debuggerd and verified this message is seen.
Test: Ran unit tests.
Change-Id: I6e3d5a76d92b972c77fca301ea7147745bc67c37
2021-08-18 17:01:13 -07:00
Suren Baghdasaryan
2079c5f0c9 Replace writepid with task_profiles command for cgroup migration
writepid command usage to join a cgroup has been deprecated in favor
of a more flexible approach using task_profiles. This way cgroup path
is not hardcoded and cgroup changes can be easily made. Replace
writepid with task_profiles command to migrate between cgroups.

Bug: 191283136
Test: build and boot
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I945c634dfa7621437d8ea3981bce370d680b7371
2021-06-24 17:24:20 +00:00
Josh Gao
931274862f tombstoned: fix file creation for ANRs.
Bug: http://b/188315622
Test: manual
Change-Id: I2948b929beb3093b59c8d37b706c857e7422a3cc
2021-05-18 16:20:00 -07:00
Peter Collingbourne
fb5eac9445 Add support for a hw_timeout_multiplier system property.
In order to test the platform in emulators that are orders of magnitude
slower than real hardware we need to be able to avoid hitting timeouts
that prevent it from coming up properly. For this purpose introduce
a system property, ro.hw_timeout_multiplier, which may be set to
an integer value that acts as a multiplier for various timeouts on
the system.
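
A minimal sketch of how a timeout could be scaled by such a multiplier. The helper names ParseTimeoutMultiplier and ScaleTimeout, the fallback to 1, and the upper bound of 10000 are assumptions made for illustration; the property lookup itself is left out, so the value is passed in as a string.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <string>

// Hypothetical helper: turns the property's string value into a multiplier.
// Falls back to 1 when the value is empty, non-numeric, or out of range.
int ParseTimeoutMultiplier(const std::string& value) {
  if (value.empty()) return 1;
  char* end = nullptr;
  long parsed = std::strtol(value.c_str(), &end, 10);
  if (end == value.c_str() || *end != '\0') return 1;
  if (parsed < 1 || parsed > 10000) return 1;  // assumed sanity bounds
  return static_cast<int>(parsed);
}

// Hypothetical helper: multiplies a base timeout by the multiplier.
std::chrono::milliseconds ScaleTimeout(std::chrono::milliseconds base,
                                       int multiplier) {
  assert(multiplier >= 1);
  return base * multiplier;
}
```

With a multiplier of 10, ScaleTimeout(std::chrono::milliseconds(500), 10) yields a 5000 ms timeout, while an unset property leaves the base timeout unchanged.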

Bug: 178231152
Change-Id: I6d7710beed0c4c5b1720e74e7abe3a586778c678
Merged-In: I6d7710beed0c4c5b1720e74e7abe3a586778c678
2021-03-11 14:04:18 -08:00
Treehugger Robot
4abe7c4165 Merge "Unlink the tombstone proto file before linking the temporary fd." 2021-02-08 19:23:44 +00:00
Peter Collingbourne
1e1d920785 Unlink the tombstone proto file before linking the temporary fd.
We were already doing this for the text tombstones but not for protos,
which meant that we stopped producing protos once we hit the limit
on the number of tombstones. Move the code for the text tombstones
into a common location and call it for both types.

Change-Id: I4951150da51a32d50821d147458fc5c18200c9d4
2021-02-05 16:41:48 -08:00
Josh Gao
3a2f885ec6 Merge "Let system_server truncate tombstones." 2021-02-05 20:35:19 +00:00
Josh Gao
88846a2ccf Let system_server truncate tombstones.
There's no way to atomically unlink a specific file for which we have an fd from
a path, which means that we can't safely delete a tombstone without coordination
with tombstoned, which is risky. For example, if we use flock on the directory,
and system_server crashes while holding the lock, we risk deadlock.

We do the next best thing, and keep a file descriptor around for every
tombstone, and truncate it, which requires system_server to be able to
write to tombstones (which are owned by the system group).
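
The idea of truncating through a retained file descriptor can be sketched as follows. OpenRetainedFd and TruncateByFd are hypothetical names, and the 0640 mode is borrowed from another commit in this log; this is an illustration of the technique, not the system_server code.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: opens (or creates) a file and keeps the descriptor
// so the file can be emptied later without going through its path again.
int OpenRetainedFd(const char* path) {
  return open(path, O_RDWR | O_CREAT | O_CLOEXEC, 0640);
}

// Hypothetical helper: empties the file behind an already-open descriptor.
// The path is never consulted, so a concurrent rename or unlink cannot
// redirect the operation to a different file.
bool TruncateByFd(int fd) {
  assert(fd >= 0);
  return ftruncate(fd, 0) == 0;
}
```

Because only the descriptor is used, the truncation still hits the intended file even if its name has since been unlinked or reused.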

Test: treehugger
Change-Id: I6ba7f1fe87ee1a4b57bdb3741e8ec9fbc80788c9
2021-02-01 17:48:58 -08:00
Evgenii Stepanov
2a55e1adbe Scale timeouts in debuggerd and llkd.
Respect ro.timeout_multiplier property. Some of these are required for
tombstone writing to work on MTE QEMU, the rest are done speculatively.

Test: add crashing code to system_server, observe the tombstone
Bug: 178231152
Change-Id: Ic86e494af571301df7af07d13a6c046a0da6bda7
2021-02-01 20:00:53 +00:00
Josh Gao
76e1e30f16 Reland protobuf tombstones.
This reverts the following commits:
    e156ede145.
    eda96eddcb.
    5ec54d1e84.
    1e45d3f223.
    a50f61f8fa.

Test: treehugger
Test: atest -c CtsSeccompHostTestCases:android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls
Change-Id: Ic2b1f489ac9f1fec7d7a33c845c29891f4306bbd
2021-01-26 17:55:17 -08:00
Jerome Gaillard
e156ede145 Revert "tombstoned: switch from goto to RAII."
Revert "Let crash_dump read /proc/$PID."

Revert submission 1556807-tombstone_proto

Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug

Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.

Change-Id: I8a77f6b9e1b42902ef7ee250cc3f1fd341ea0e2b
2021-01-26 12:42:09 +00:00
Jerome Gaillard
eda96eddcb Revert "tombstoned: make it easier to add more types of outputs."
Revert "Let crash_dump read /proc/$PID."

Revert submission 1556807-tombstone_proto

Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug

Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.

Change-Id: Ib2403c1b61f6cf0513b76361440fbc5909d7554a
2021-01-26 12:42:03 +00:00
Jerome Gaillard
5ec54d1e84 Revert "tombstoned: support for protobuf fds."
Revert "Let crash_dump read /proc/$PID."

Revert submission 1556807-tombstone_proto

Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug

Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.

Change-Id: I0c4f3a17e8b06d6c65255388c571ebf11d371dbb
2021-01-26 12:41:52 +00:00
Josh Gao
6bf6a9fc61 Merge changes from topic "tombstone_proto"
* changes:
  libdebuggerd: add protobuf implementation.
  tombstoned: support for protobuf fds.
  tombstoned: make it easier to add more types of outputs.
  tombstoned: switch from goto to RAII.
2021-01-25 22:18:48 +00:00
Elliott Hughes
d8af5b5e4f Remove unnecessary #includes.
Sadly, it looks like we do still really use libcutils for some of the
socket functions.

Test: treehugger
Change-Id: Ic71f97507c89b10d2f3b7a2971064a9e6b1d349d
2021-01-19 09:21:52 -08:00
Josh Gao
1091d24c16 tombstoned: support for protobuf fds.
Test: debuggerd_test
Change-Id: Id0f0fa2856e4b8e57d7dc0e1495134e943b289da
2021-01-13 13:35:45 -08:00
Josh Gao
e2aa621c83 tombstoned: make it easier to add more types of outputs.
While we're at it, switch to unlinkat.

Test: debuggerd_test
Change-Id: I8d285c4b4e94effa1acb8f69ac3af4ff8c37defb
2021-01-13 13:35:45 -08:00
Josh Gao
9a61f685d8 tombstoned: switch from goto to RAII.
Test: debuggerd_test
Change-Id: Ide6811297bf59776619aac6ed96653ae5cc84040
2021-01-13 13:35:45 -08:00
Josh Gao
81c94cdce6 Start tombstoned early in post-fs-data.
Bug: http://b/169659307
Test: manual
Change-Id: Ie19de31e7e2b6cd43402cfd3a2e9274728e9e6b4
2020-10-01 14:25:36 -07:00
Elliott Hughes
ec220cd877 debuggerd: increase the default limit on tombstones to 32.
We're missing useful crashes, especially on hwasan builds.

Bug: http://b/140580637
Test: run crasher
Change-Id: Ib5d8d3bd3fc4d7fec77d0b10302e5595f97a3515
2019-09-26 14:36:01 -07:00
Josh Gao
8ad965ae5e tombstoned: start immediately after /data is mounted.
Catch as many early-boot crashes as we can by starting tombstoned
immediately after /data is mounted.

Bug: http://b/139864948
Test: adb shell su 0 dmesg | grep "starting service"
Change-Id: I7f8821102191a445e87020f3efa59a2e0620d9db
2019-08-22 15:19:44 -07:00
Josh Gao
5f87bbdb0a debuggerd: switch to base::{Send,Receive}FileDescriptors.
Bug: http://b/12204763
Test: debuggerd_test
Change-Id: I0be40916214de51ab36fd6bd6d44090a84312e51
2019-02-13 13:21:54 -08:00
Josh Gao
2b22ae132f tombstoned: don't generate tombstones for native backtraces.
Previously, if an intercept ends before we ask for a file descriptor
when doing a backtrace, we'll create a tombstone file instead.

Bug: http://b/114139908
Bug: http://b/115349586
Test: debuggerd_test32
Change-Id: I23c7bb8ae5a982a4374a862d0a4f17bee03eb1d9
2018-09-14 14:06:47 -07:00
Josh Gao
f5974aedc4 tombstoned: make missing O_TMPFILE workaround actually work around.
We can't actually link an unlinked file back onto disk if it wasn't
opened with O_TMPFILE. Switch to using a temporary filename instead.
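
A rough sketch of the temporary-filename approach: write under a temporary name and move the file into place only when the dump succeeded. PublishIfSuccessful, the ".temporary_" prefix, and the use of rename are assumptions for illustration and are not taken from the tombstoned source.

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdio.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: writes `contents` under a temporary name in `dir`,
// then renames it to `final_name` only if `dump_succeeded` is true.
// On any failure the temporary file is removed, so no empty or partial
// file is left behind under the final name.
bool PublishIfSuccessful(const std::string& dir, const std::string& final_name,
                         const std::string& contents, bool dump_succeeded) {
  assert(!dir.empty() && !final_name.empty());
  const std::string temp_path = dir + "/.temporary_" + final_name;
  const std::string final_path = dir + "/" + final_name;

  int fd = open(temp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0640);
  if (fd < 0) return false;
  ssize_t written = write(fd, contents.data(), contents.size());
  close(fd);

  bool ok = dump_succeeded && written == static_cast<ssize_t>(contents.size());
  if (ok && rename(temp_path.c_str(), final_path.c_str()) == 0) return true;

  unlink(temp_path.c_str());  // discard the failed or unpublished dump
  return false;
}
```

Unlike an O_TMPFILE file, the temporary file is visible in the directory while it is being written, which is why the failure path has to unlink it explicitly.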

Bug: http://b/77729983
Test: agampe
Change-Id: I1970497114f0056065a1ba65f6358f08b51ec551
2018-05-03 16:05:32 -07:00
Josh Gao
28f8cf0f97 tombstoned: don't bail out if we fail to unlink a file that isn't there.
Test: crasher with no tombstones
Change-Id: I20e0537a347dd1f83877150ab13f53441dd65d95
2018-05-03 14:31:08 -07:00
Josh Gao
48383c806a tombstoned: don't create tombstones for failed dumps.
Instead of creating tombstone FDs in place and passing them out to
crash_dump directly, create them as O_TMPFILEs and link them into place
when crash_dump reports success, to avoid creating empty tombstones
in cases like an aborting thread racing with another thread that
manages to cleanly exit_group before the dump finishes.

Bug: http://b/77729983
Test: debuggerd_test
Test: adb shell 'for x in `seq 0 50`; do crasher; done'
Change-Id: I31ce4fd4a524abf8bde57152450209483d9d0ba9
2018-04-19 14:33:18 -07:00
Josh Gao
ce9cc4e428 tombstoned: fix another call to evconnlistener_new.
Apply the same fix from c2e98f63 to intercept_manager.cpp.

Bug: http://b/64543673
Test: debuggerd_test
Change-Id: Ibfb919e059fa62f8336cfc1426d03ef015590136
2017-09-22 18:00:35 -07:00
Narayan Kamath
c2e98f6340 tombstoned: Fix calls to evconnlistener_new.
The order of arguments is wrong - we're passing flags=static_cast<unsigned>(-1)
and backlog=LEV_OPT_CLOSE_ON_FREE (which is 2).

On versions of libevent prior to 2.1.8, this ends up accidentally setting
OPT_LEAVE_SOCKETS_BLOCKING, OPT_CLOSE_ON_EXEC, OPT_REUSABLE and OPT_THREADSAFE
and limiting our backlog to two. These unintentional changes are relatively
benign; we never make our sockets block, we never exec, we never reuse
sockets and the additional locking overhead should be negligible. The
backlog of two might be a problem in theory, but there haven't been any
reports of issues caused by it.

Things get worse on 2.1.8 - that version introduces several new flags,
one of which is OPT_DISABLED. This disables the new listener by default,
which means that our event loop returns early because it has no active listeners
for any of its events.
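
The effect of the swapped arguments can be illustrated without libevent itself. Everything below is a local stand-in: kCloseOnFree uses the value 2 stated above, the bit position of kDisabled is an assumption, and MakeListenerArgs is a hypothetical function that only mimics a "flags, then backlog" parameter order.

```cpp
#include <cassert>

// Local stand-ins, not libevent's definitions.
constexpr unsigned kCloseOnFree = 1u << 1;  // value 2, as stated in the message
constexpr unsigned kDisabled = 1u << 5;     // assumed bit position

struct ListenerArgs {
  unsigned flags;
  int backlog;
};

// Hypothetical stand-in that takes flags first and backlog second.
ListenerArgs MakeListenerArgs(unsigned flags, int backlog) {
  return {flags, backlog};
}

// Returns true when `bit` is set in `flags`.
bool HasFlag(unsigned flags, unsigned bit) {
  assert(bit != 0);
  return (flags & bit) != 0;
}
```

Calling MakeListenerArgs(static_cast<unsigned>(-1), kCloseOnFree) reproduces the bug: every flag bit is set, including the disabled one, and the backlog is 2. The intended call is MakeListenerArgs(kCloseOnFree, -1).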

Bug: 64543673
Test: Manual.

Change-Id: I9954bc7fe1af761de1a950d935dd2e6ce7e2c5f5
2017-09-13 14:15:57 +01:00
Elliott Hughes
3e8d923276 Merge "Allow configuration of the number of tombstones." 2017-06-27 20:57:08 +00:00
Elliott Hughes
35bb6d2a89 Allow configuration of the number of tombstones.
Bug: http://b/62810514
Test: altered the property, got more tombstones
Change-Id: Iba8089915fa715658d2dfecb076c6a61321243bd
2017-06-26 14:00:00 -07:00
Narayan Kamath
111f351762 Merge "tombstoned: Improve message on java trace completion." 2017-06-23 08:12:51 +00:00
Narayan Kamath
79dd143e5f tombstoned: Improve message on java trace completion.
For java traces, log the kind of dump as well as the PID of the
completed dump. This makes it easier to correlate dump requests with the
actual file they're written to.

Sample log statement:
E /system/bin/tombstoned: Traces for pid 4737 written to: /data/anr/trace_00

The message for native traces / tombstones remains unchanged because
several tools parse it.

Test: manual
Bug: 32064548

Change-Id: I7b3792dd5ae312ee0bc055c22ec3f7c747152072
2017-06-22 11:04:33 +01:00
Narayan Kamath
b123220dd6 tombstoned: change path for traces from "anr_" to "trace_"
The only case where tombstoned creates files for java traces is
when the process is signalled "by hand" using "shell kill -3", or
by the program itself. Such traces do not correspond to an ANR, so
name those files "trace_XX".

When dumpstate / system_server want to dump java traces, they set up
a tombstoned intercept and manage the lifetime of any associated files
themselves.

Bug: 32064548
Test: manual, debuggerd_test
Change-Id: I97006ec7c0cd35de4b9564f535e77af846cc3891
2017-06-21 18:00:09 +01:00
Treehugger Robot
87f5432f52 Merge "tombstoned: log where we're writing the tombstone." 2017-06-13 02:47:34 +00:00
Josh Gao
cb68a0317d tombstoned: log where we're writing the tombstone.
Make it easy to find out where a specific crash's tombstone was written
to by adding a log.

Bug: http://b/62268830
Test: crasher
Change-Id: I1961dfb19f76a42a8448ebafd4be153b73cb6800
2017-06-12 21:00:59 +00:00
Narayan Kamath
ca5e908dd6 tombstoned: turn on java trace support + unit tests.
The SELinux changes that this depends on have now landed.

This change also adds a few lower level unit tests of intercept
functionality.

Test: make; debuggerd_test
Change-Id: I0be5e85e7097e26b71db269c9ed92d9b438bfb28
2017-06-07 18:57:54 +01:00
Narayan Kamath
a73df601b7 tombstoned: allow intercepts for java traces.
All intercept requests and crash dump requests must now specify a
dump_type, which can be one of kDebuggerdNativeBacktrace,
kDebuggerdTombstone or kDebuggerdJavaBacktrace. Each process can have
only one outstanding intercept registered at a time.
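
A minimal sketch of the "one outstanding intercept per process, matched by dump type" rule. The three enumerator names come from the text above; their order, the Intercept struct, and the InterceptRegistry class are hypothetical and only illustrate the bookkeeping.

```cpp
#include <cassert>
#include <map>
#include <sys/types.h>

// Enumerator names from the commit message; the ordering is an assumption.
enum DebuggerdDumpType {
  kDebuggerdNativeBacktrace,
  kDebuggerdTombstone,
  kDebuggerdJavaBacktrace,
};

struct Intercept {
  DebuggerdDumpType dump_type;
  int output_fd;
};

// Hypothetical registry keyed by pid, so each process has at most one
// outstanding intercept.
class InterceptRegistry {
 public:
  // Returns false if the pid already has an intercept registered.
  bool Register(pid_t pid, DebuggerdDumpType type, int output_fd) {
    assert(pid > 0);
    return intercepts_.emplace(pid, Intercept{type, output_fd}).second;
  }

  // Consumes the intercept for `pid` if its dump type matches and returns
  // its output fd; returns -1 when nothing matching is registered.
  int Take(pid_t pid, DebuggerdDumpType type) {
    auto it = intercepts_.find(pid);
    if (it == intercepts_.end() || it->second.dump_type != type) return -1;
    int fd = it->second.output_fd;
    intercepts_.erase(it);
    return fd;
  }

 private:
  std::map<pid_t, Intercept> intercepts_;
};
```

A crash dump request whose type does not match the registered intercept gets -1 here and would fall back to the regular output path.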

There's only one non-trivial change in this changeset; and that is
to crash_dump. We now pass the type of dump via a command line
argument instead of inferring it from the (resent) signal, this allows
us to connect to tombstoned before we wait for the signal as the
protocol requires.

Test: debuggerd_test

Change-Id: I189b215acfecd08ac52ab29117e3465da00e3a37
2017-05-31 10:35:32 +01:00
Narayan Kamath
2d377cd688 tombstoned: Add a shared library version of libtombstoned_client...
.. for ART and the frameworks to link against. In the new stack dumping
scheme (see related bug), the Java runtime will communicate with
tombstoned in order to obtain a FD to which it can write its traces.

Also move things around to separate headers that are private
implementation details from headers that constitute the public debuggerd
API. There are currently only three such headers :

- tombstoned/tombstoned.h
- debuggerd/client.h
- debuggerd/handler.h

Bug: 32064548
Test: make

Change-Id: If1b8578550e373d84828b180bbe585f1088d1aa3
2017-05-22 16:55:21 +01:00
Narayan Kamath
922f6b22fc tombstoned: Support java trace dumps.
The changes here involve :
- Creating and opening a new socket to receive trace dump requests on. Having
  different sockets allows us to install different sets of access control rules.

- A minor refactor to allow us to share common pieces of implementation
  between the java and native dumping code. This will also allow us to
  add a unit test for all file / directory related logic.

There are two java trace specific additions here :
- We use SO_PEERCRED instead of trusting the PID written to the socket
  because requests come in from untrusted processes.
- Java trace dumps are not interceptible.
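
The SO_PEERCRED point can be sketched as below. This is Linux-specific, and PeerPid is a hypothetical helper name; it shows how the kernel-reported peer PID can be read from a connected Unix domain socket instead of trusting a PID sent in the request payload.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns the PID that the kernel recorded for the
// peer of a connected Unix domain socket, or -1 on failure.
pid_t PeerPid(int socket_fd) {
  assert(socket_fd >= 0);
  struct ucred cred = {};
  socklen_t len = sizeof(cred);
  if (getsockopt(socket_fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) != 0) {
    return -1;
  }
  if (len != sizeof(cred)) return -1;
  return cred.pid;
}
```

The credentials are filled in by the kernel when the connection is made, so an untrusted client cannot claim to be another process by writing a different PID into its request.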

kJavaTraceDumpsEnabled is set to false for now but the value of the flag
will be flipped in a future change.

Bug: 32064548
Test: Manual; Currently working on a unit_test for CrashType.

Change-Id: I1d62cc7a7035fd500c3e2b831704a2934d725e35
2017-05-18 12:01:14 +00:00
Josh Gao
460b336d6a tombstoned: fix a race between intercept and crash_dump.
Previously, there was no way to detect when tombstoned processed an
intercept request packet, making it possible for an intercept request
followed by a crash_dump to be processed in the wrong order.

Add a response to intercept registration, to eliminate this race.

Test: debuggerd_test
Change-Id: If38c6d14081ebc86ff1ed0edd7afaeafc40a8381
2017-03-30 16:49:02 -07:00
Josh Gao
807a45807b tombstoned: refactor request dequeuing a bit.
Also make it loop, so that upon failing to start a dequeued crash
request, we continue to the next one.

Bug: http://b/36685795
Test: debuggerd_test
Change-Id: I94889125f16f4681c6fa0fa9cac456302602ce01
2017-03-30 16:19:53 -07:00
Josh Gao
13078245a0 tombstoned: don't increment num_concurrent_dumps until success.
Previously, we would increment num_concurrent_dumps and fail to
decrement it if we failed to start the request. Change this to
only increment after we've successfully started the dump.
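
A small sketch of the counting rule described here, combined with the looping dequeue from the neighbouring refactor commit in this log. CrashRequest, CrashQueue, and the start callback are hypothetical; only the name num_concurrent_dumps comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>

struct CrashRequest {
  int pid;
};

// Hypothetical queue that limits how many dumps run at once.
class CrashQueue {
 public:
  explicit CrashQueue(std::size_t max_concurrent)
      : max_concurrent_(max_concurrent) {
    assert(max_concurrent > 0);
  }

  void Enqueue(CrashRequest request) { queued_.push_back(request); }

  // Starts queued requests until the limit is reached or the queue is empty.
  // A request that fails to start is dropped and the loop moves on to the
  // next one; the counter is only incremented after a successful start.
  void MaybeDequeue(const std::function<bool(const CrashRequest&)>& start) {
    while (num_concurrent_dumps_ < max_concurrent_ && !queued_.empty()) {
      CrashRequest next = queued_.front();
      queued_.pop_front();
      if (!start(next)) continue;
      ++num_concurrent_dumps_;
    }
  }

  // Called when a running dump completes, freeing one slot.
  void OnDumpFinished() {
    assert(num_concurrent_dumps_ > 0);
    --num_concurrent_dumps_;
  }

  std::size_t num_concurrent_dumps() const { return num_concurrent_dumps_; }
  std::size_t queued() const { return queued_.size(); }

 private:
  std::size_t max_concurrent_;
  std::size_t num_concurrent_dumps_ = 0;
  std::deque<CrashRequest> queued_;
};
```

If the counter were incremented before the start attempt and never decremented on failure, each failed start would permanently consume a slot, which is the leak this change removes.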

Bug: http://b/36685795
Test: debuggerd_test
Change-Id: I66169ed56ed44271e1d8fe1298d95260be7a32a3
2017-03-30 14:51:38 -07:00
Josh Gao
55f79a5953 tombstoned: turn off signal handlers.
Don't try to connect to ourselves in a signal handler (e.g. if someone
does `killall -ABRT tombstoned`).

Test: killall -ABRT tombstoned
Change-Id: Ib69a206f741acb523c9f2883d474c940b6ebfab2
2017-03-06 12:30:25 -08:00
Josh Gao
8830c95def tombstoned: create tombstones with 0640 permissions.
Make tombstones group readable to allow them to be picked up by the
dropbox service.

Bug: http://b/35979630
Test: killall -ABRT rild; dumpsys dropbox
Change-Id: If57cc17563c80d5b5c4887b0937905bffef6b231
2017-03-06 12:30:25 -08:00
Josh Gao
8498016b81 tombstoned: silence spurious error messages.
Bug: none
Test: booted after deleting /data/tombstones/*
Test: crasher creates a tombstone
Change-Id: I8b3e8a3b521952412ebc955b2437bf8150220c16
2017-01-23 16:01:14 -08:00
Josh Gao
0ad5107e51 Actually don't start tombstoned until /data is mounted.
Bug: http://b/34461270
Test: boot is actually faster
Test: tombstoned still started by init
Change-Id: I4976abef108bbb6fad264f9b68cbc1fba711085b
2017-01-23 16:01:14 -08:00
Treehugger Robot
b479a5002e Merge "init: don't start tombstoned until /data is mounted." 2017-01-20 22:13:38 +00:00