Commit graph

200 commits

Author SHA1 Message Date
Dimitry Ivanov
9fbd7e1026 Merge changes from topic "revert-3062926-CJGHTRPCBP" into main
* changes:
  Revert "[Berberis][CrashReporting] Extend ThreadInfo to have gue..."
  Revert "[Berberis][CrashReporting] Dump guest thread info to tom..."
2024-05-10 16:14:06 +00:00
Dimitry Ivanov
cdf499f9cd Revert "[Berberis][CrashReporting] Extend ThreadInfo to have gue..."
Revert submission 3062926

Reason for revert: We want guest state to be present in all threads - revert to be able to fix the proto field type.

Reverted changes: /q/submissionid:3062926

Change-Id: I32b745cca95a619b78bdce0a7d948ac479d42f21
2024-05-10 10:02:07 +00:00
Dimitry Ivanov
899c1bdfa1 Revert "[Berberis][CrashReporting] Dump guest thread info to tom..."
Revert submission 3062926

Reason for revert: We want guest state to be present in all threads - revert to be able to fix the proto field type.

Reverted changes: /q/submissionid:3062926

Change-Id: I87b282a0d9caebe4eae2e7d8eca8ec8ebaa3eca6
2024-05-10 10:02:07 +00:00
Sijie Chen
a6e1ac8efe Merge changes from topic "berberis-crash-reporting-guest_regs" into main
* changes:
  [Berberis][CrashReporting] Dump guest thread info to tombstone file
  [Berberis][CrashReporting] Extend ThreadInfo to have guest registers
2024-05-09 22:16:44 +00:00
Sijie Chen
4c3a9dfd2f [Berberis][CrashReporting] Dump guest thread info to tombstone file
As title.

Bug: b/321799516
Test: riscv64, checked that the tombstone file has the expected block.
https://paste.googleplex.com/5958508322750464
Added arm64 support and tested arm64 unwinding in internal repo.
https://paste.googleplex.com/6545612887818240

Change-Id: I4e8a3414d0198de88a577ef4d5672a9ad0286fc5
2024-05-09 20:21:16 +00:00
Sijie Chen
3ff250f6d7 [Berberis][CrashReporting] Extend ThreadInfo to have guest registers
This CL retrieves guest register information.

Bug: b/321799516
Test: m
Testing for TLS slot:
Manual testing: 1. Crashed the JNI tests to produce a tombstone file.
2. Read the signature field of the guest state header. 3. Verified it has
the same value as NATIVE_BRIDGE_GUEST_STATE_SIGNATURE.

Manual test on arm64: 1. Flashed the build to a Pixel phone and verified
that the tid field retrieved from TLS_SLOT_THREAD_ID matches the current
thread id.

Testing for register values:
Printed the register values for riscv64; they look sensible, including
the slots that hold null/zero values.

Change-Id: Iff44ac5c2b202e44f3fb4e6909fbea141e54ae6b
2024-05-09 18:28:30 +00:00
Christopher Ferris
0455ca3e09 Merge "Clean up usage of 32 bit/64 bit checks." into main
2024-05-08 20:06:18 +00:00
Christopher Ferris
2f77c2a516 Clean up usage of 32 bit/64 bit checks.
Rather than have to create a number of #if defines for the memory
dumping parts of the tombstone, create a single function to generate
these strings for the memory tests.

Make CrasherTest.smoke use a regex that passes on 32 bit and 64 bit.

Make the tests page size agnostic.

Bug: 339017792

Test: Treehugger.
Test: Ran 32 bit and 64 bit versions of tests on a real device.
Test: Ran on the aosp_cf_x86_64_phone_pgagnostic-trunk_staging-userdebug
Change-Id: If9365061b85de23b00a1bf947d85923cde06c068
2024-05-07 15:30:47 -07:00
Devin Moore
87ff7115ef Merge "Add page size info to tombstone" into main
2024-05-07 19:55:51 +00:00
Devin Moore
4647b6b305 Add page size info to tombstone
Now that Android devices can use 16k page size, it's important that we
know what configuration the device is in when we see issues.

1) If the device is in 4k mode, we see nothing new.

2) If the device is in 16k mode, we see this line in the tombstones:

Page size: 16

3) If the device is in 4k mode, but was previously in 16k mode, we see:

Has been in 16kb mode: yes

Test: atest debuggerd_test
Test: atest debuggerd_test with ro.misctrl.16kb_before="1"
Test: adb shell cat /data/tombstones/tombstone_00
Bug: 335247092
Change-Id: If7ca3b0954a01070ff413758296460ca1d023ca5
2024-05-06 22:20:04 +00:00
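As a rough illustration of the three cases described in 4647b6b305, the selection logic could look like the sketch below. This is not the actual debuggerd code: `PageSizeLines` and its parameters are hypothetical names, the page size is assumed to be passed in bytes, and the sketch assumes the "has been in 16kb mode" line is only emitted when the device is currently in 4k mode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the extra tombstone lines for a given page
// size (in bytes) and a flag saying whether the device was previously
// booted in 16k mode (e.g. derived from ro.misctrl.16kb_before).
std::vector<std::string> PageSizeLines(size_t page_size_bytes, bool was_16kb_before) {
  std::vector<std::string> lines;
  if (page_size_bytes != 4096) {
    // Case 2: non-4k mode, report the page size in KiB.
    lines.push_back("Page size: " + std::to_string(page_size_bytes / 1024));
  } else if (was_16kb_before) {
    // Case 3: currently 4k, but the device has been in 16k mode before.
    lines.push_back("Has been in 16kb mode: yes");
  }
  // Case 1: plain 4k mode, nothing new is printed.
  return lines;
}
```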
Christopher Ferris
94c9cb6447 Merge "Replace malloc_not_svelte with malloc_low_memory." into main
2024-05-06 19:45:05 +00:00
Xiaohui Niu
7bfbe41714 Fix fallback signal issue.
Add signo for target thread in fallback path;
Update test for seccomp tombstone thread abort.

Bug: 336946834

Test: debuggerd_test
Test: Send fatal signal to process with NO_NEW_PRIVS
Change-Id: Ie9d77a93da9cd89ab7093b8949f311e03d96ec50
2024-04-30 21:22:31 +08:00
Christopher Ferris
1c46a00865 Replace malloc_not_svelte with malloc_low_memory.
The malloc_not_svelte variable name is confusing and makes the
low memory config the default. Change this so that the default is
the regular allocator, and malloc_low_memory is used to enable
the low memory allocator.

Update blueprint rules so that scudo is the default action.

Test: Verified scudo config is used by default.
Test: Verified Android GO config uses the jemalloc low memory config.
Change-Id: Ie7b4b005a6377e2a031bbae979d66b50c8b3bcdb
2024-04-26 13:33:26 -07:00
Florian Mayer
2d45331a9e Avoid confusing main_thread name
It is not in fact the main thread of the process, but the thread that crashed

Change-Id: I3af6d0ffc6c0617526a9cbeb36b2a3286aaeb6f6
2024-04-22 23:43:59 +00:00
Treehugger Robot
acafa40d82 Merge changes from topics "crashapi2", "crashapi3" into main
* changes:
  use new location of crash_detail API
  Add tests for android_replace_crash_detail_[name|data]
2024-02-21 07:42:42 +00:00
Mitch Phillips
acd092ad4e Merge "Update debuggerd for stack MTE." into main
2024-02-16 09:01:34 +00:00
Florian Mayer
920d95b1c8 use new location of crash_detail API
Bug: 155462331
Change-Id: I862f91368d421363adbbf002fe3c7d446c437b03
2024-02-14 12:58:18 -08:00
Florian Mayer
5fa6663458 Read data set by android_add_crash_detail into tombstone.
Bug: 155462331
Bug: 309446525
Change-Id: I6d01aafca48e0e5e8cbd5ae87add6aec0c429503
2024-02-13 18:13:22 -08:00
Mitch Phillips
bf2d6dd7d4 Update debuggerd for stack MTE.
Two things need changing for debuggerd_test to pass.

 1. The seccomp policy needs to allow for PROT_MTE (0x20) in both
    mprotect() and mmap(). Stack MTE processes do a mprotect()/mmap() of
    the stack when launching a process.
 2. The fault address and stack pointer need to be untagged when trying
    to figure out the stack overflow cause.

Bug: 320448268
Bug: 292478827
Test: atest debuggerd_test --iterations=10
Change-Id: I56471c32ca40edffbb61b7547bdf2b85a6eb1ff7
2024-02-06 15:18:04 +01:00
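Point 2 of bf2d6dd7d4 (untagging before the stack overflow check) can be sketched as follows. This is a simplified illustration, not the debuggerd implementation: `Untag` and `FaultInGuardRegion` are hypothetical names, and the sketch assumes arm64, where the tag lives in the top byte of the pointer (bits 56-63).

```cpp
#include <cstdint>

// Assumption: arm64 top-byte tagging, so the tag occupies bits 56-63.
constexpr uint64_t kTagMask = 0xffULL << 56;

// Strips the tag so the value can be compared with untagged map ranges.
constexpr uint64_t Untag(uint64_t addr) {
  return addr & ~kTagMask;
}

// Hypothetical check: does the (possibly tagged) fault address fall inside
// the untagged range [guard_start, guard_end) below the stack?
constexpr bool FaultInGuardRegion(uint64_t fault_addr, uint64_t guard_start,
                                  uint64_t guard_end) {
  const uint64_t addr = Untag(fault_addr);
  return addr >= guard_start && addr < guard_end;
}
```

Without the `Untag` step, a tagged fault address compares as larger than any untagged mapping boundary, so the range check would never match.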
Florian Mayer
cdf55585a7 Use correct stack depot size in __scudo_get_error
This is a no-op, but it will be used by upcoming scudo changes that allow
changing the depot size at process startup time, at which point we will no
longer be able to call __scudo_get_stack_depot_size in debuggerd.

Bug: 309446692
Change-Id: Ib64b9d042b2a2088484ec5e61944c089a1d85314
2023-12-13 22:21:19 +00:00
Treehugger Robot
1f5b0f9fee Merge "Handle scudo_stack_depot_size = 0" into main
2023-12-12 01:24:03 +00:00
Florian Mayer
f9566853bd Merge "Use scudo_stack_depot_size from process_info" into main
2023-12-11 23:38:13 +00:00
Siim Sammul
73ade16187 Merge "Move tombstone_proto_to_text out of libdebuggerd." into main
2023-12-06 10:13:18 +00:00
Treehugger Robot
1772cd427c Merge "Match upstream API change" into main
2023-12-06 01:28:10 +00:00
Florian Mayer
6757ecd2a3 Match upstream API change
Change was done in
e68c265543

Change-Id: Id1a288dfdb5edb7cb7d639ec4548926cc4085d8c
2023-12-06 00:16:43 +00:00
Siim Sammul
c08a34e3dc Move tombstone_proto_to_text out of libdebuggerd.
This is done so that we could depend on it elsewhere without needing all the unrelated methods.
Needed for ag/24553347

Bug: 296207744
Test: refactoring build
Change-Id: I7c6733208f3ae63ba9559753a24cffcb8e1b9d1e
2023-12-05 10:14:27 +00:00
Florian Mayer
4841207b53 Handle scudo_stack_depot_size = 0
Bug: 309446692
Change-Id: Ic55294316137847041f1e829cb0243aae8926379
2023-12-04 17:29:23 -08:00
Florian Mayer
e8fcfee409 Use scudo_stack_depot_size from process_info
This is a no-op, but it will be used by upcoming scudo changes that allow
changing the depot size at process startup time, at which point we will no
longer be able to call __scudo_get_stack_depot_size in debuggerd.

We already did the equivalent change for the ring buffer size in
https://r.android.com/q/topic:%22scudo_ring_buffer_size%22

Bug: 309446692
Change-Id: I761a7602c54a1f8f2d0575c5e011820d8dbaab63
2023-12-04 16:48:45 -08:00
Christopher Ferris
c7cc571fa1 Avoid crashing on bad architecture value.
The only way to get a bad architecture value in the protobuf is if
the data was corrupted or an unsupported architecture was added without
the register support.

If the protobuf is corrupted, this is strictly better since it
still produces a tombstone with the data present.

If there is an unsupported architecture, it will still result in a tombstone,
only the registers would not be present. It would also be very obviously
a problem that needs to be fixed. Again, this is strictly better since
the crash in generation is not necessarily visible unless you look at
the log. Here, the data is in the log and in the tombstone.

This also removes the only dependency in this file on the async_safe
library.

Test: Ran unit tests.
Test: Forced an invalid architecture and verified tombstone is present
Test: with error message, and error message printed in the log.
Change-Id: I8e4a2e3f778fafb5b7241c2f23d5f867f1341ed8
2023-11-17 22:12:14 +00:00
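The approach described in c7cc571fa1 (emit an error and keep going rather than abort on an unknown architecture) can be sketched as below. The `Arch` enum, `RegisterSectionFor`, and the message text are all hypothetical stand-ins, not the real protobuf enum or debuggerd functions.

```cpp
#include <string>

// Hypothetical stand-in for the architecture enum stored in the protobuf.
enum class Arch { kArm32 = 0, kArm64 = 1, kX86 = 2, kX86_64 = 3, kRiscv64 = 4 };

// Returns the register section header for a known architecture. For an
// unknown value it returns an error line instead of aborting, so the rest
// of the tombstone can still be written.
std::string RegisterSectionFor(int arch_value) {
  switch (static_cast<Arch>(arch_value)) {
    case Arch::kArm32:
      return "registers (arm)";
    case Arch::kArm64:
      return "registers (arm64)";
    case Arch::kX86:
      return "registers (x86)";
    case Arch::kX86_64:
      return "registers (x86_64)";
    case Arch::kRiscv64:
      return "registers (riscv64)";
    default:
      // Corrupted data or an architecture without register support.
      return "Unknown architecture " + std::to_string(arch_value) +
             ", registers omitted";
  }
}
```

A caller would write the returned error line to both the log and the tombstone, which matches the commit's point that the problem stays visible in both places.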
Christopher Ferris
6aa72490dc Add new segv type SEGV_CPERR.
The new 6.6 kernel headers added a new segv type, SEGV_CPERR. Add this
to the switch statement.

Test: Unit tests pass.
Change-Id: I77eb4748e51c7e7d7291bfd2180b0ccb3b5a6ded
2023-10-31 14:01:09 -07:00
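A minimal sketch of the kind of switch statement 6aa72490dc refers to, assuming a helper that maps `si_code` values for SIGSEGV to names. `SegvCodeName` is a hypothetical name, only a few codes are shown, and the `SEGV_CPERR` case is guarded because older headers do not define it.

```cpp
#include <csignal>
#include <string>

// Hypothetical helper: maps a SIGSEGV si_code to a printable name.
std::string SegvCodeName(int code) {
  switch (code) {
    case SEGV_MAPERR:
      return "SEGV_MAPERR";
    case SEGV_ACCERR:
      return "SEGV_ACCERR";
#if defined(SEGV_CPERR)
    // Only available with headers that already know about this code.
    case SEGV_CPERR:
      return "SEGV_CPERR";
#endif
    default:
      return "?";
  }
}
```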
Christopher Ferris
3a0833c9cd Fix potential miscellaneous debuggerd issues.
Check for the log opening failing.

Add the ability to put error messages in the log and tombstone so
that it's clear if the log reading failed in some way.

Adjust test so that if there is a log or if no log exists, the test
will still pass.

Print an <unknown> if the command line is unreadable instead of nothing.

Test: Ran unit tests.
Test: Induced error and verified error message is saved in the tombstone.
Change-Id: I2fce8078573b40b9fed3cd453235f3824cadb5e3
2023-08-09 17:31:55 -07:00
Kelvin Zhang
786dac3d50 Update some fs_mgr/debuggerd to use getpagesize() instead of PAGE_SIZE
Test: th
Bug: 279808236
Change-Id: I9d30cfe19d2b1a7d624cc5425e4315dc6e3b2ad2
2023-06-27 10:50:07 -07:00
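The point of 786dac3d50 is that the page size becomes a runtime value rather than a compile-time constant. A minimal sketch of page alignment done that way, where `PageStart` is a hypothetical helper and not a function taken from the change:

```cpp
#include <unistd.h>

#include <cstdint>

// Rounds an address down to the start of its page, using the page size
// reported at runtime instead of the compile-time PAGE_SIZE macro.
uintptr_t PageStart(uintptr_t addr) {
  const uintptr_t page_size = static_cast<uintptr_t>(getpagesize());
  // Page sizes are powers of two, so masking the low bits is enough.
  return addr & ~(page_size - 1);
}
```

The same binary then behaves correctly on both 4k and 16k devices.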
Christopher Ferris
98d6242dc7 Limit the number of log messages in a tombstone.
Some testing environments can have a test that is sending many
thousands of messages to the log. When this type of process crashes
all of these log messages are captured and can cause OOM errors
while creating the tombstone.

Added a test to verify the log messages are truncated. Leaving this
test disabled for now since it is inherently flaky due to having to
assume that 500 messages are in the log.

Added a test for a newline in a log message since it's somewhat
related to this change.

NOTE: The total number of messages is capped at 500, but if a message
contains multiple newlines, the total messages will exceed 500.
Counting messages this way seems to be in the spirit of the cap,
that a process logging a large message with multiple newlines does
not completely fill the tombstone log data.

Bug: 269182937
Bug: 282661754

Test: All unit tests pass.
Test: The disabled max_log_messages test passes.
Change-Id: If18e62b29f899c2c4670101b402e37762bffbec6
2023-05-24 20:10:55 +00:00
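The NOTE in 98d6242dc7 distinguishes log messages from output lines. The sketch below illustrates that distinction under stated assumptions: `CollectLogLines` is a hypothetical function, the cap is passed in as a parameter, and keeping the most recent messages is an assumption, since the commit message does not say which end is dropped.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Keeps at most max_messages log messages (assumed: the most recent ones)
// and splits each on newlines. The cap applies to messages, so the number
// of returned lines can exceed max_messages.
std::vector<std::string> CollectLogLines(const std::vector<std::string>& messages,
                                         size_t max_messages) {
  std::vector<std::string> lines;
  const size_t start =
      messages.size() > max_messages ? messages.size() - max_messages : 0;
  for (size_t i = start; i < messages.size(); ++i) {
    std::istringstream stream(messages[i]);
    std::string line;
    while (std::getline(stream, line)) {
      lines.push_back(line);
    }
  }
  return lines;
}
```

With three messages, one of which contains a newline, and a cap of two, the result has three lines even though only two messages were kept.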
Christopher Ferris
bda1064160 Re-add code to skip gettings logs on logd crashes.
Also add new unit tests to verify this behavior.

Bug: 276934420

Test: New unit tests pass.
Test: Ran new unit tests without pthread_setname_np call and verified
Test: the tests fail.
Test: Force crash logd and verify log messages are not gathered.
Test: Force crash a logd thread and verify log messages are not gathered.
Change-Id: If8effef68f629432923cdc89e57d28ef5b8b4ce2
2023-04-24 18:31:29 -07:00
Florian Mayer
152de539df Merge "Print number of frames"
2023-04-10 20:59:18 +00:00
Florian Mayer
59e632a292 Print number of frames
liblog can drop data when debuggerd is overloaded, which leads to
truncated tombstones. By adding the count separately, automation can
easily see whether it is dealing with a truncated tombstone or not.

Bug: 269537146
Change-Id: Ia991537efc0d6b57cbff23ee45af6521467aa20d
2023-04-06 23:38:40 +00:00
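On the consumer side, the check that 59e632a292 enables could be as small as the sketch below. `BacktraceIsTruncated` is a hypothetical helper for tooling that has already parsed the reported frame count and the frame lines from a tombstone; how those are parsed is outside this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check: the tombstone is considered truncated when fewer
// frame lines were received than the separately reported frame count.
bool BacktraceIsTruncated(size_t reported_frame_count,
                          const std::vector<std::string>& parsed_frames) {
  return parsed_frames.size() < reported_frame_count;
}
```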
Elliott Hughes
f9cd73f851 Remove floating point register cruft.
We stopped showing floating point registers years ago, but some cruft
remains.

Test: treehugger
Change-Id: Ib89032db90a31a49d090bc5d99f9c401af734e7a
2023-03-17 00:38:26 +00:00
Christopher Ferris
22035ccb01 Display offset in backtraces if necessary.
When moving to a proto tombstone, backtraces no longer contain
an offset when a frame is in a shared library from an apk.
Add the offset display again if needed, and add a test to
verify this behavior.

Bug: 267341682

Test: All unit tests pass.
Test: Dumped a process running through an apk to verify the offset
Test: is present.
Change-Id: Ib720ccb5bfcc8531d1e407f3d01817e8a0b9128c
2023-01-31 17:53:45 -08:00
Florian Mayer
1d79a07586 [MTE] add link to SAC docs to tombstones
Test: m, flash, look at tombstone
Change-Id: I091d3dc9207d0ba7e692dcc28adc04aec33cf336
2023-01-26 02:09:57 +00:00
Florian Mayer
8b91862b8f [Refactor] move memory map printing to helper
An early return out of this function makes it harder to add new prints
after the memory maps.

Test: m, flash, look at tombstone
Change-Id: Id06e432918d69ac3307761b244473b6b7ab769e8
2023-01-26 01:39:15 +00:00
Florian Mayer
3d11890797 Merge "[MTE] warn about async crashes being imprecise"
2023-01-20 02:12:42 +00:00
Florian Mayer
5fcdfd2504 [MTE] warn about async crashes being imprecise
Bug: 175335730
Change-Id: If666c98b53dee1c63c48887f4448bc54f78a0a9f
2023-01-20 00:33:29 +00:00
Treehugger Robot
a812f45678 Merge "Pass fault address to GWP-ASan's changed API."
2023-01-17 20:29:46 +00:00
Florian Mayer
30a25286c4 Handle scudo_ring_buffer_size = 0
Bug: 263287052
Change-Id: I0bec3a817d7a16c72d5dfeddd0dcc86830f5a311
2023-01-12 16:06:10 -08:00
Mitch Phillips
8a34b179ad Pass fault address to GWP-ASan's changed API.
GWP-ASan changed one of the APIs upstream to now take the fault address
as well. This is to support the recoverable mode.

Add the fault address as well.

Test: gwp_asan_unittest
Bug: N/A
Change-Id: I8a4edd3fad159d91cc036050d330bbb8f9c8d435
2023-01-12 09:48:11 -08:00
Florian Mayer
bd49c387f0 Use scudo_ring_buffer_size from process_info
This is a no-op, but it will be used by upcoming scudo changes that allow
changing the buffer size at process startup time, at which point we will no
longer be able to call __scudo_get_ring_buffer_size in debuggerd.

Bug: 263287052
Change-Id: I350421d1fcdf22ce3b8b73780b88c1e10fa8a074
2023-01-05 15:14:56 -08:00
Christopher Ferris
fac411d97c Remove unnecessary logging.
Test: Extra logging no longer happens.
Change-Id: Ia179ebe5d16e0bde7d6ec66e39d4484ff18f2b1e
2022-10-27 17:56:27 -07:00
Liu Cunyuan
8c0101b971 Add tomstone proto support for riscv64
Signed-off-by: Liu Cunyuan <liucunyuan.lcy@linux.alibaba.com>
Signed-off-by: Mao Han <han_mao@linux.alibaba.com>
Change-Id: Ie22c2895fc30fab68eddc18713c80e403f44b203
2022-10-12 22:31:45 +00:00
Peter Collingbourne
7827991d7f Fix scudo MTE tests.
r.android.com/2108505 was intended to fix a crash in Scudo in
the case where the stack depot, region info or ring buffer were
unreadable. However, it also ended up introducing a number of bugs into
the code. It failed to call __scudo_get_error_info if the page at the
fault address was unreadable. This can happen in legitimate crash cases
if a primary allocation was close to the boundary of a mapped region,
or if the allocation was a secondary allocation with guard pages. It
also used long as the type for tags, whereas Scudo expects it to be
char. In combination this ended up causing most of the MTE tests to
fail. Therefore, mostly revert that change.

Fix the original crash by null checking the pointers returned by
AllocAndReadFully before proceeding with the rest of the function.

Bug: 233720136
Change-Id: I04d70d2abffaa35fe315d15d9224f9b412a9825d
2022-06-30 18:54:19 -07:00
Christopher Ferris
d17cefe7e4 Merge "Fix scudo fault address processing."
2022-06-01 20:20:09 +00:00