Rather than only updating the maps when a pc can't be found, always update
the maps before using them. This avoids issues where the maps change
and it could cause a crash reading from a map that has been modified.
This assumes that executed code never gets unloaded, or that the
code is unloaded so infrequently that it doesn't matter. This happens
because the pcs for the backtraces are gathered as the program runs
and those pcs are symbolized and made into relative pcs at a later time.
Also, add safe reading of the elf data when necessary to avoid any
crashes if maps are changing while this is running.
Since the MapEntry objects can be deleted, copy the values for
the current map in the backtrace code to detect when in our own code
instead of keeping a pointer.
Bug: 340988785
Test: malloc_debug unit/system tests pass.
Test: libmemunreachable tests pass.
Change-Id: Ica2ba50a5bcf9e19c7e4033e29a5a67a1847d1a6
While studying the implementation of POSIX pthread_rwlock* functions, I noticed
that two functions were marked __always_inline twice. "They must really mean it
this time."
Also add back `inline` keyword to one other usage of __always_inline to be
consistent with other uses of __always_inline throughout the codebase.
Change-Id: Ibf9eaed5fc9fd03afcdd969cff82dec71a8ce30f
This CL is created as a best effort to migrate test targets to the new Android ownership model.
It is based on historical data from repository history and insights from git blame.
Given the nature of this effort, there may be instances of incorrect attribution. If you find incorrect or unnecessary
attribution in this CL, please create a new CL to fix that.
For detailed guidelines and further information on the migration please refer to the link below,
go/new-android-ownership-model
Bug: 304529413
Test: N/A
Change-Id: Ie36b2a3245d9901323affcc5e51dafbb87af9248
Adding the new option log_allocator_stats_on_signal that will call
mallopt(M_LOG_STATS, 0), when a signal is sent to a process.
This call does not happen in the signal handler, but a variable
is set to trigger the mallopt call. The next time any allocation
call occurs, the mallopt call will be made.
Also, includes new unit tests for the new config option.
Test: All unit tests pass.
Test: Enabling the new feature and stop/start and sending the signal to
Test: a process and observing the log contains the memory stats.
Test: Did the above for a scudo and jemalloc based device.
Change-Id: Idcf6fca74036efb065b9c4cc037aaf9bcf0f139e
This will, hopefully, reduce the number of flaky runs of this test.
Add skipping xml files for the notice file parser.
Bug: 280572235
Test: atest malloc_debug_system_tests
Change-Id: I6fb76287f55d0cff5b695dce09cc2b7a69b62874
Since it doesn't matter if these calls take a little longer than
before, use the more thorough but slightly longer purge mechanism.
Test: Unit tests pass.
Change-Id: Ifab7166a9682a13231746b78717d52673d13be1b
Some operations wind up allocating then freeing a significant
amount of memory. So after those operations, do a purge so that
the RSS of the process is not artificially inflated.
Bug: 262321164
Test: Ran unit tests.
Test: Verified after this change, the RSS does not go up after running
Test: am dumpheap -n <PID>.
Change-Id: I08477f8ce12c06fd2a068f536a81f4a577d619e2
Using the long option names might not fit in the malloc debug option
property since properties have a 92 character limit.
This patch creates new aliases for the original options.
Bug: 264504531
Test: set new options pass
Test: Config unit tests pass.
Change-Id: Id985720f36a2bf0da7b35ff444c2c80eb1fb4363
The clang-analyzer-unix.Malloc and other warnings in these
unit tests are either false positive or in
negative tests that can be ignored.
Bug: 259995529
Test: presubmits; make tidy-bionic-libc_subset
Change-Id: Iddabe613d21d3717ba34f9e4d5bb97436279649f
We swap the 2nd and 3rd arguments to the CallocEntry constructor
to match the order in the cpp file, and match the C calloc convention.
We also fix an invocation of this constructor.
Change-Id: Iebe16d82a74459e5e957c1d9e2cc1aebb15150d0
Test: TreeHugger
With timestamps, we are able to tell the details of allocator
performance such as the average time for malloc() in different size
class, the potential contention time by examing the overlap between
operations, .etc.
Not all malloc et al. operations are recorded with timestamp. Only
operations relates to memory usage change will have them.
Test: All unit tests pass.
Change-Id: I5c2016246a6f10b221387001bb44778969bb26ae
It's hard to find the start of the previous instruction for riscv64
(given the C extension), and discussion with the ART folks cast doubt
on the comment's claim that we do this correctly for arm32 anyway.
So, rather than add complexity for riscv64, let's simplify this routine
for everyone. I suspect we could probably get away with just `--ip` for
all architectures, but since it's trivial to at least maintain plausible
alignment, I've stuck with the correct "at least" byte counts instead.
(See the discussion on
https://lists.riscv.org/g/sig-android/topic/detecting_16_bit_vs_32_bit/94813787
for more about riscv64 specifically.)
Test: treehugger
Change-Id: Ie43451d329470b3ece8779d11eb705d24d01c3d7
There are many cases where there are no more allocations
when a dump of the record allocs might occur. Move the entries
being written to file in the interrupt handler.
Update the unit tests for this new functionality and add a new
unit test that verifies no allocations occur while the entries
are being written.
Fix the unit tests so that the record allocs files get deleted
properly.
Test: All unit tests pass.
Test: Ran 1000 iterations of unit tests to verify no flakes.
Change-Id: I20941596c1dda5a761522050dc155b06f3f3e735
Add backtrace_size for only backtracing a single size.
Add backtrace_min_size to set the minimum value of size to backtrace.
Add backtrace_max_size to set the maximum value of size to backtrace.
Documented the new options.
Test: New unit test pass.
Change-Id: I1a773737910cd4bc2af9546547b3a2740bbcb22b
Modify the tests that require a single filename, to use a filename
that has the pid as part of the name. This allows multiple different
versions of the test to run on the same machine, and allows
each test to be run at the same time.
Test: Ran unit tests on device.
Test: Ran the unit tests 100 times, no failures.
Change-Id: Ia38483049e7b66bd3da824bcd484c03e46f85280
The new object incorporates all Android specific knowledge into
a single place and makes everything simpler.
Fixed a bug where if backtrace_full was enabled, the AddBacktrace
function would always set the size to the maximum number
of frames instead of the actual number of frames.
Added a new smoke system tests for backtrace_full.
Modified the smoke test to do a malloc/free, so it's really
a smoke test.
Bug: 232575330
Test: Unit tests pass on device.
Test: Verify the full backtrace actually produces valid backtraces.
Test: Run bionic-unit-tests with backtrace_full enabled.
Test: Run bionic-benchmarks --benchmark_filter=stdlib_malloc_free_decay1/512
Change-Id: I23128a73a8691007e1c7f69e0c99bb4dcd713db8
The new option is named check_unreachable_on_signal. It is meant
to duplicate dumpsys meminfo --unreachable <PID> for non-java
processes. When enabled, a user can send a signal to a process
which will trigger the unreachable check on the next allocation
call.
Added new unit tests.
Test: New unit tests pass.
Test: Enabled for the entire system, then dumped on the netd
Test: process and also system_server.
Change-Id: I73561b408a947a11ce21a211b065d59fcc39097b
The libmemunreachable library looks through memory to determine
if pointers are leaked. Unfortunately, the malloc debug code
stores the original pointer in data structures, so it looks like
pointers are still in use. The fix is to mangle the pointers
stored in memory so that it doesn't trick the library into thinking
they are live.
Test: All unit/system tests pass.
Test: Ran libmemunreachable and verified leaks show up.
Change-Id: Ic40a0a5ae73857cde936fd76895d88829686a643
Modify libfdtrack to use the normal Unwinder object. In addition,
update the libfdtrack so that it doesn't record frames in
libfdtrack.so rather than skipping frames it thinks will be in
the library.
Modify the malloc debug code to use the normal Unwinder object.
Bug: 120606663
Test: All unit tests pass.
Change-Id: I3c9612dd10e62389e6219e68045ee87f7b2625f5
The wrap.<APP> property was broken in Android 12, so provide documentation
about how to workaround it.
Test: NA
Change-Id: I98fdc5801997492442802e1295fb6969f9190e1c
strerror is nice, but usually I don't care about the text, I care about
the uppercase enum
Bug: N/A
Test: ./tests/run-on-host.sh glibc (existing failures -> b/201305529)
Test: atest bionic-unit-tests-static
Test: atest malloc_debug_unit_tests
Change-Id: I407bd9f4dfa918fff66a0da7df8d7239f789c7b8
I accidentally made the tests run MAX_RETRIES times instead of
running once when passing, and at most MAX_RETRIES when the
test fails. Also, add a bit of randomness to the usleep to try and
avoid tests syncing up on failures.
Bug: 193898572
Test: Ran unit tests and verified that a pass doesn't result in another run.
Test: Ran three copies of the unit tests at the same time to verify that
Test: there isn't a flaky test failure.
Change-Id: I8b8d3cd05ca7d1e87ce34bf10aeef84f6989fdab
Refactor the code a bit to allow retrying if a log message is missing.
This is because there is a possibility of a log message getting dropped.
If that happens, merely rerun up to three times and pass if the missing
message is present.
Also fix a race condition that can occur if the LogReader threads are
being terminated but happen to be allocating memory while they are
in the signal handler. This situation causes aborts in the memory
allocator or a deadlock. Before this change, the verify_leak*
tests would fail in less than twenty iterations. After, I could run
for hundreds of iterations.
Bug: 193898572
Test: Ran unit tests in a loop.
Change-Id: I0fb5beab6041dcf3e3fd19f6748bb20bb81a2506
These tests can be flaky if something is spamming the log and
expiring messages. Previously, the log wasn't read until the
forked process was complete. Now, two threads are spawned right
after the process forks to read the log while running. This
should avoid any problems if the log is being spammed while the
test runs.
In addition, fixed some potential flakiness in the test that might
occur if the test incorrectly gets stale log data.
Bug: 193898572
Test: All unit tests pass on cuttlefish (both 32 bit and 64 bit).
Test: Forced an abort and verified that the crash log reader sees
Test: the message.
Test: Verified that the log reader for the main and crash log properly
Test: read data.
Test: Verified the test passes while swamping the log by running
Test: the log unit tests while running two copies of the system
Test: tests.
Change-Id: I68bd92a8c483eac0146ada87dd4201dda0902c49
This is a follow-on to commit 5358cc40ab.
I forgot that there were two places the undefined behavior could occur.
Make the literal unsigned in the other place.
Test: TreeHugger
Bug: 189541929
Change-Id: Iaef81507ca61e802d277246bf5995157c93d86ce
Technically, shifting a signed value beyond its maximum range is
undefined behavior in C++. A bare integer literal is signed, and
defaults to 'int'. On platforms where 'int' is 32-bits, we
shift outside this range with 1 << 31.
We make our literal an unsigned integer to avoid this.
Test: TreeHugger
Bug: 189541929
Change-Id: Iade1bcd3f86d025dd6e10c26622d10c26e2c8295
When the main thread is exiting, the code deleted the g_debug global
pointer and destroys the disable pthread key. Unfortunately, if
malloc debug was enabled in a way that requires a header for the pointer,
any frees that occur after the main thread is torn down result in calls
to the underlying allocator with bad pointers.
To avoid this, don't delete the g_debug pointer and don't destroy the
disable pthread key.
Added a new system test that allocates a lot of pointers and frees them
after letting the main thread finish.
Also, fix one test that can fail sporadically due to a lack of unwinding
information on arm32.
Bug: 189541929
Test: Passes new system tests.
Change-Id: I1cfe868987a8f0dc880a5b65de6709f44a5f1988
malloc_debug can use libunwind and libunwindstck to unwind backtrace,
if libc.debug.malloc.options contains the string of "backtrace_full",
malloc_debug will use libunwindstck, and if libc.debug.malloc.options
contains the string of "backtrace=*", malloc_debug will use libunwind.
The result of libunwindstck is normal, but the result of libuniwnd
is abnormal, there is a offset between the rel_cp and the correct value,
so addr2line can't decode the right line number.
Libunwind and libunbiwndpack calculate load_bias is different, so malloc_debug
get load_bias alignment with libunwindstack.
Bug: 169539402
Change-Id: I640fb5db39af622a0bb52abf2c107984065a89d5
When malloc debug is enabled, using libbacktrace to unwind can
result in a deadlock. This happens when an unwind of a thread
is occuring which triggers a signal to be sent to that thread. If
that thread is interrupted while a malloc debug function is
executing and owns a lock, that thread is then stuck in the signal
handler. Then the original unwinding thread attempts to do an
allocation and gets stuck waiting for the same malloc debug lock.
This is not a complete deadlock since the unwinder has timeouts,
but it results in truncated unwinds that take at least five
seconds to complete.
Only the backtrace signals needs to be blocked because it is the only
known signal that will result in a thread being paused in a signal
handler.
Also, added a named signal in the reserved signal list for the
special bionic backtrace signal.
Bug: 150833265
Test: New unit tests pass with fix, fail without fix.
Change-Id: If3e41f092ebd40ce62a59ef51d636a91bc31ed80