Switch to a reader-writer lock for the Element List lock. Also set up
for a reader-writer lock for the Times list, but continue to use a
mutex where rdlock() and wrlock() are the same implementation for now.
This should improve general reader performance and prevent blocking of
other reader operations or exit by a single hung logd.reader.per
thread. For example, a full length logcat of an empty buffer (e.g.
crash log buffer) will hold a lock while the iterator scans the entire
list.
Test: gTest liblog-unit-tests, logd-unit-tests, logcat-unit-tests
Bug: 37378309
Bug: 37483775
Change-Id: If5723ff4a978e17d828a75321e8f0ba91d4a09e0
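The reader/writer split described above can be sketched as follows. This
is a minimal illustration only: ElementList, append() and countAtLeast()
are hypothetical names, and std::shared_mutex (C++17) stands in for
whatever lock primitive logd actually uses behind rdlock() and wrlock().

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the log element list; not the logd class.
class ElementList {
  public:
    // Writers take the lock exclusively.
    void append(int value) {
        std::unique_lock<std::shared_mutex> lock(mLock);
        mElements.push_back(value);
    }

    // Readers share the lock, so one long scan does not block other readers.
    std::size_t countAtLeast(int threshold) const {
        std::shared_lock<std::shared_mutex> lock(mLock);
        std::size_t count = 0;
        for (int value : mElements) {
            if (value >= threshold) ++count;
        }
        return count;
    }

  private:
    mutable std::shared_mutex mLock;
    std::list<int> mElements;
};
```

Several threads may sit inside countAtLeast() at once; only append()
excludes everyone else.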
Logspan down to the millisecond. Show a percentage if trimmed by
Chatty messages, a subspan from the newest to the newest chatty in
the log buffer. Sniff stats.add(elem), stats.subtract(elem) and
stats.dropped(elem) to generate the logspan data.
Test: gTest liblog-unit-tests, logd-unit-tests and logcat-unit-tests
Test: manual check Logspan statistics for being in range, added
temporary internal instrumentation to confirm expectations.
Bug: 37254265
Change-Id: I09c0d9375d5580315543c747b37976f9eeb9e408
Replace stats.add(elem) + stats.subtract(elem) with a new more
efficient method stats.addTotal(elem).
Test: gTest liblog-unit-tests, logd-unit-tests and logcat-unit-tests
Bug: 37254265
Change-Id: I2b3c2ac44209772b38f383ae46fe6c4422b542cf
Helpful instrumentation to determine who is waiting for logger data.
Test: manual
Bug: 37274132
Bug: 37378309
Change-Id: I14fb1d9d15ae413930121048b770852359f06682
Add checking for impossible(tm) scenarios within LogBuffer::flushTo:
1) When iterating through the log entries, check if the iterator
returns two identical element references and break out of the loop.
2) Cap the maximum number of log entries we will skip while holding
the iterator lock at 4194304, break out of the loop.
We print a message to the kernel logs if we hit these cases.
ToDo: Remove this paranoia at some future date.
Test: gTest liblog-unit-tests logcat-unit-tests and logd-unit-tests
Bug: 37378309
Change-Id: I789594649db14093238828b9f6d1daeca8b780c2
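The two guards can be sketched as below. Element, flushSketch() and the
aborted flag are hypothetical; only the cap of 4194304 comes from the
message above. The real code also logs to the kernel log, which this
sketch omits.

```cpp
#include <cstddef>
#include <list>

// Hypothetical element type; the real list holds log buffer elements.
struct Element {
    bool skip;  // true when the reader should not receive this entry
};

constexpr std::size_t kMaxSkip = 4194304;  // cap quoted in the commit message

// Returns the number of entries delivered; sets *aborted if a guard fired.
std::size_t flushSketch(const std::list<const Element*>& elements, bool* aborted) {
    const Element* lastElement = nullptr;
    std::size_t skipped = 0;
    std::size_t delivered = 0;
    *aborted = false;
    for (const Element* element : elements) {
        // Guard 1: the iterator handed back the same element twice.
        if (element == lastElement) {
            *aborted = true;
            break;
        }
        lastElement = element;
        if (element->skip) {
            // Guard 2: too many entries skipped while holding the lock.
            if (++skipped >= kMaxSkip) {
                *aborted = true;
                break;
            }
            continue;
        }
        ++delivered;
    }
    return delivered;
}
```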
Deal with a regression introduced in commit
5a34d6ea43 (logd: drop mSequence from
LogBufferElement) where log_time was compared against nsec() time
miscalculating the watermark boundary. When dealing with logcat
-t/-T, or any tail reading, add a margin to prune to back off by a
period of 3 seconds (pruneMargin).
Test: gTest liblog-unit-tests logcat-unit-tests and logd-unit-tests
Bug: 37378309
Change-Id: I72ea858e4e7b5fa91741ea84c40d2e7c3c4aa031
- moved __android_log_is_debuggable to a new public header
(log_properties.h)
- vendor version of sched_policy uses ALOG* instead of SLOG*
Test: (sanity) liblog-unit-tests
Test: (sanity) libcutils_test (noting b/32972117, two tests continue
to fail)
Test: system/core as a whole makes with BOARD_VNDK_VERSION := current
now with no problems.
Test: boots/works on internal marlin
Bug: 33241851
(cherry picked from commit 1f83aa424f)
Merged-In: I5bc1f348dc0f0c8814bec5b5c3d2c52c825ab640
Change-Id: I5bc1f348dc0f0c8814bec5b5c3d2c52c825ab640
* changes:
liblog: allow event tags to include some punctuations
liblog: logprint supports number of seconds time event field
logcat: test: standardize rest() to let logs land when injecting
Add s to report time in seconds. The time could be a period, duration
or monotonic, expanded to seconds, minutes, hours and days. gTest has
to acquire a dynamic tag allocation as there are no users of this
feature yet.
Looking to the future, audio media logging has binary content similar
to the binary events structures Android logging uses and they have
a definition of a duration field in their internal binary logging, so
may be of use when we unify the logs.
Test: gTest logcat-unit-tests --gtest_filter=*.descriptive
Bug: 31456426
Change-Id: I262c03775983b3bc7b1b00227ce2bb2b0f357bec
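As an illustration of expanding a seconds count into days, hours, minutes
and seconds, a minimal sketch follows. formatSeconds() and the
"1d 2h 3m 4s" layout are assumptions made for this example; the output
format logprint really emits is not stated in the message above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: expand a seconds count into larger units.
std::string formatSeconds(std::uint64_t total) {
    const std::uint64_t days = total / 86400;
    const std::uint64_t hours = (total % 86400) / 3600;
    const std::uint64_t minutes = (total % 3600) / 60;
    const std::uint64_t seconds = total % 60;
    std::string out;
    // Only print a unit once a larger or equal unit is non-zero.
    if (days) out += std::to_string(days) + "d ";
    if (days || hours) out += std::to_string(hours) + "h ";
    if (days || hours || minutes) out += std::to_string(minutes) + "m ";
    out += std::to_string(seconds) + "s";
    return out;
}
```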
Prefix long truncated names with an ellipsis (...). Shift left as
much as possible when doing so, but keep spaces between command
name and other tabular fields.
Test: manual/visual
Bug: 37254265
Change-Id: I185b1e121ba911a9410a8b6624e013d5a531962b
(cherry picked from commit c27f12a3d396f113c5ae09d2f2c8ff7de3f8b551)
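A minimal sketch of the truncation rule, assuming the tail of the name is
the part worth keeping. truncateName() and its width parameter are
hypothetical and do not reproduce the statistics table layout code.

```cpp
#include <cstddef>
#include <string>

// Keep the tail of a long name and mark the cut with a leading "...".
std::string truncateName(const std::string& name, std::size_t width) {
    static const std::string kEllipsis = "...";
    if (name.size() <= width) return name;  // fits, nothing to do
    // Too narrow to show the ellipsis and any of the name: keep the tail only.
    if (width <= kEllipsis.size()) return name.substr(name.size() - width);
    return kEllipsis + name.substr(name.size() - (width - kEllipsis.size()));
}
```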
logd assumes that it is running in UTC time zone.
However, if persist.sys.timezone is set at some point later,
that affects and confuses logd behavior.
To avoid such a case, this CL sets TZ to UTC, which overrides
the property's behavior.
Test: Ran CtsOsTestCases.
Test: gTest liblog-unit-tests, logd-unit-tests and logcat-unit-tests
Bug: 33566779
Change-Id: Ib9edd4cb06f019a33aaf8d77d33bd82fdbbda480
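Pinning the process to UTC can be done with the POSIX setenv() and tzset()
calls, roughly as below. forceUtc() is a hypothetical wrapper; where logd
makes this call is not shown here.

```cpp
#include <stdlib.h>
#include <time.h>

// Force the process time zone to UTC so later property changes have no effect.
bool forceUtc() {
    if (setenv("TZ", "UTC", 1) != 0) return false;  // 1: overwrite any existing value
    tzset();  // re-read TZ so localtime() and friends pick it up
    return true;
}
```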
If setcon fails, try alternate setcon, and then if it still
fails call getcon to confirm if it is an OK sepolicy context
anyways.
Test: gTest logd-unit-tests --gtest_filter=logd.sepolicy*
Change-Id: Iaf20b8a1a4a7312247288e1879884a54893c15ae
Reduce the period we are willing to look back at for out-of-order
entries. Cap the number of iterations we are willing to look back
for out-of-order entries to 300.
Test: gTest liblog-unit-tests, logd-unit-tests and logcat-unit-tests
Bug: 36875387
Bug: 36874561
Bug: 36861142
Change-Id: Icee289dfc0a37ccab9912dc8ab40a10ef3967b7a
Move lastTid array from local in LogBuffer::flushTo to per-reader
context in LogTimes::mLastTid and pass into LogBuffer::flushTo.
Replace NULL with nullptr in touched files.
Simplify LogTimeEntry::cleanSkip_Locked initialization of skipAhead
to memset, to match mLastTid memset initialization.
Test: gTest liblog-unit-tests, logd-unit-tests & logcat-unit-tests
Test: adb logcat -b all | grep chatty | grep -v identical
Bug: 36488201
Change-Id: I0c3887f220a57f80c0490be4b182657b9563aa3f
Failed to acquire BM_log_print_overhead as it was renamed from
BM_log_overhead in commit 8f2492f582
('liblog: benchmark: Use local LOGGER_NULL frontend').
The test report would not clearly identify which entry was missing or
unparsed, so unroll the loop and incorporate the indexes by name
so that the gTest failure report offers a much better clue to the problem.
Test: gTest logd-unit-tests --gtest_filter=logd.benchmark
Bug: 36683634
Bug: 27405083
Change-Id: Ic590c230569871651fb716054ecf635385d0f8a2
If the last buffer has zero length, strip it out of the iovec
issued to SocketClient::sendDatav().
Test: gTest liblog-unit-tests, logd-unit-tests, logcat-unit-tests
Bug: 36497967
Change-Id: I8fc585bbec63402d0e818ff4c620fdd7edcc38dc
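The trimming step amounts to adjusting the iovec count before the send.
A minimal sketch, with trimEmptyTail() as a hypothetical helper name:

```cpp
#include <sys/uio.h>

// Returns the iovec count to pass on, dropping a zero-length last buffer.
int trimEmptyTail(const struct iovec* iov, int count) {
    if (count > 0 && iov[count - 1].iov_len == 0) --count;
    return count;
}
```

The caller would pass the returned count, rather than the original one,
to the send routine.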
last should start with mLogElements.end() and be updated as
we iterate to find a matching time entry in the list. Since
it is impossible(sic) for a newer start time to be supplied
than the list, the incorrect iterator initialization should
be inconsequential, but if it ever happens this change will
behave correctly and dump nothing.
Test: gTest liblog-unit-tests, logd-unit-tests and logcat-unit-tests
Bug: 36536248
Bug: 36608728
Change-Id: I96998c4b713258f29d5db2e24a83ae562ddf3420
A mixture of fixes and cleanup for LogKlog.cpp and friends.
- sscanf calls strlen. Check if the string is missing a nul
terminator; if it is, do not call sscanf.
- replace NULL with nullptr for stronger typechecking.
- pass by reference for simpler code.
- Use ssize_t where possible to check for negative values.
- fix FastCmp to add some validity checking since ASAN reports that
callers are not making sure pre-conditions are met.
- add fasticmp templates for completeness.
- if the buffer is too small to contain a meaningful time, do not
call down to log_time::strptime() because it does not limit its
accesses to the buffer boundaries, instead stopping at a
terminating nul or invalid match.
- move strnstr to LogUtils.h, drop size checking of needle and
clearly report the list of needles used with android::strnstr
- replace 'sizeof(static const char[]) - 1' with strlen.
Test: gTest liblog-unit-tests, logd-unit-tests & logcat-unit-tests
Bug: 30792935
Bug: 36536248
Bug: 35468874
Bug: 34949125
Bug: 34606909
Bug: 36075298
Bug: 36608728
Change-Id: I161bf03ba029050e809b31cceef03f729d318866
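The nul terminator check from the first item can be sketched with
memchr(). parseStamp() and the "<seconds>.<microseconds>" format are
assumptions chosen for illustration, not the LogKlog parsing code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Parse a leading "<seconds>.<microseconds>" only if buf holds a NUL within len bytes.
bool parseStamp(const char* buf, std::size_t len, unsigned long* sec, unsigned long* usec) {
    // sscanf would run strlen() past the end of an unterminated buffer.
    if (std::memchr(buf, '\0', len) == nullptr) return false;
    return std::sscanf(buf, "%lu.%lu", sec, usec) == 2;
}
```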
The --wrap flag in logcat translates directly to mTimeout inside logd;
the value set is ANDROID_LOG_WRAP_DEFAULT_TIMEOUT, defined in
<log/log_read.h> as 7200 seconds, or 2 hours. For a non-blocking read
with a selected timeout, the logger waits until either the log buffer is
about to 'wrap' and prune the log entry, or the specified timeout expires.
Non-blocking in the logger context means that when there are no more
log entries, the socket is closed.
clock_gettime(CLOCK_REALTIME) is UTC 1970 epoch *NIX time. It is
affected only by time updates, not by timezone or daylight saving time.
If there is a large user initiated time change, both the log entries
and the timeout mentioned above really get called into question, so we
trigger a release of the logs for clarity. This is so that the log
reader can handle the disruptively updated time, and can immediately
check the local time if necessary.
The logger has a 5 second window for entries to land in time sorted
order into the logging list. This should offer the log reader some
differentiation between logging order sequence for monotonically
increasing time, and sequence order in the face of user initiated time
adjustments that break monotonicity.
This change is about major time adjustments that can cause Fear,
Uncertainty or Doubt about log entries. By returning, immediate action
can be taken, rather than having to comb through the logs with less
details about the time disruptions in hand. The least it can do is
record what we have, and restart the call with a new tail time and
timeout.
Test: gTest liblog-unit-tests logcat-unit-tests logd-unit-tests
Bug: 35373582
Change-Id: I92cac83be99d68634ffd4ebd2f3a3067cfd0e942
Add some deterministic behavior should the user change the hour
backwards when altering the device time: prevent sort-in-place
and cause the logger to land the new entries at the end.
Do not limit how far kernel logs can be sorted.
Test: gTest liblog-unit-tests logd-unit-tests logcat-unit-tests
Bug: 35373582
Change-Id: Ie897c40b97adf1e3996687a0e28c1199c41e0d0c
Regression from commit 8e8e8db549
For liblogcat reader -t or -T <timestamp> tail requests, continue
search for pertinent out-of-order entries for an additional 30 seconds
back into logging history to find a more inclusive starting point.
For example, if you have an out of order landing like
[..., 3, 6, 1, 8, 2, 5] and ask for 3 you used to get only 5, and now
you get 3, 6, 8, 5 as 'expected'
Test: gTest liblog-unit-tests logd-unit-tests logcat-unit-tests
Bug: 35373582
Change-Id: I2a0732933fa371aed383d49c8d48d01f33db2a79
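The example above can be reproduced with a small sketch. tailFrom() is
hypothetical, and its lookback parameter counts entries, whereas the
change itself looks back by a period of time. A lookback of zero models
the old behavior of stopping at the first older entry.

```cpp
#include <cstddef>
#include <vector>

// times: entries in landing order. start: requested timestamp.
// lookback: further entries to examine after the first entry older than start.
std::vector<int> tailFrom(const std::vector<int>& times, int start, std::size_t lookback) {
    std::size_t first = times.size();  // index of the earliest qualifying entry found
    std::size_t budget = lookback;
    bool pastBoundary = false;
    // Scan backwards from the newest landing position.
    for (std::size_t i = times.size(); i-- > 0;) {
        if (times[i] >= start) {
            first = i;
        } else {
            pastBoundary = true;  // hit an entry older than the request
        }
        if (pastBoundary) {
            if (budget == 0) break;
            --budget;
        }
    }
    // Emit from the starting point, dropping entries older than the request.
    std::vector<int> out;
    for (std::size_t i = first; i < times.size(); ++i) {
        if (times[i] >= start) out.push_back(times[i]);
    }
    return out;
}
```

With the landing order 3, 6, 1, 8, 2, 5 and a request for 3, a lookback
of 0 yields only 5, while a lookback of 4 yields 3, 6, 8, 5.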
Use getRealTime() instead and leverage private liblog log_time
comparison and math functions. This saves 8 bytes off each
element in the logging database.
Test: gTest liblog-unit-tests logd-unit-tests logcat-unit-tests
Bug: 35373582
Change-Id: Ia55ef8b95cbb2a841ccb1dae9a24f314735b076a
- Improves accuracy of -t/-T '<timestamp>' behavior when out of order
arrival of entries messes with mSequence as the list will now have
monotonic sequence numbers enforced.
- Out of order time entries still remain because of reader requiring
the ability to receive newly arrived old entries.
- -t/-T '<timestamp>' can still quit backward search prematurely
because an old entry lands later in the list.
- Adjust insert in place algorithm from two loops of scan placement
and then limit against watermark, into one that does all of that
plus iteratively swap update the sequence numbers to set
monotonicity. Side effect will be that the read lock (which is
actually the LogTimes lock) will be held longer while we search
for a placement above the youngest LogTimes watermark. We need
to hold the read (LogTimes) lock because we may be altering the
sequence numbers affecting -t/-T '<timestamp>' search.
Test: gTest logd-unit-tests liblog-unit-tests logcat-unit-tests
Bug: 35373582
Change-Id: I79a385fc149bac2179128b53d4c8f71e429181ae
Better keep the right order, or ASAN will complain when you read
out of bounds.
Bug: 36234128
Test: m
Test: m && m SANITIZE_TARGET=address
Test: Sanitized device boots without ASAN crashes
Change-Id: Ifc09cb0ece6835d2b636a3ad2128e09ca9aa45c9
Bug: 34949125
Bug: 34606909
Test: Make sure Android boots when built with SANITIZE_TARGET='address'
Change-Id: I9c004e806f2025098aa72228284b05affd2c2802
Switch the coding style of _all_ files to match, to ease all future changes.
SideEffects: None
Test: compile
Bug: 35373582
Change-Id: I470cb17f64fa48f14aafc02f574e296bffe3a3f3
This is the precursor for "Plan B" recovery when access to
/dev/event-log-tags is blocked to untrusted zones. Also
deals with mitigating issues with long-lived mappings that
do not update /dev/event-log-tags when dynamically changed.
Test: gTest logd-unit-tests --gtest_filter=logd.getEventTag_42
Bug: 31456426
Bug: 35326290
Change-Id: I3db2e73763603727a369da3952c5ab4cf709f901
android_lookupEventTagNum added. Adds support for creating a new
log tag at runtime, registered to the logd service.
Tested on Hikey, all services stopped, shell only access, CPUs not
locked (there is enough repeatability on this platform).
$ /data/nativetest64/liblog-benchmarks/liblog-benchmarks BM_lookupEventTagNum
iterations ns/op
Precharge: start
Precharge: stop 231
NB: only Tag matching, linear lookup (as reference, before unordered_map)
BM_lookupEventTagNum 1000000 1017
NB: unordered_map with full Tag & Format lookup, but with Tag hashing
BM_lookupEventTagNum 2000000 683
NB: with full Tag & Format hash and lookup for matching
BM_lookupEventTagNum 2000000 814
NB: only Tag matching (Hail Mary path)
BM_lookupEventTagNum 5000000 471
Because the database can now be dynamic, we added reader/writer locks
which adds a 65ns (uncontended) premium on lookups, and switch to
check for an allocation adds 25ns (either open code, or using
string_view, no difference) which means our overall speed takes 90%
as long as the requests did before we switched to unordered_map.
Faster than before where we originally utilized binary lookup on
static content, but not by much. Dynamic updates that are not cached
locally take the following times to acquire long path to logd to
generate.
BM_lookupEventTag 20000000 139
BM_lookupEventTag_NOT 20000000 87
BM_lookupEventFormat 20000000 139
BM_lookupEventTagNum_logd_new 5000 335936
BM_lookupEventTagNum_logd_existing 10000 249226
The long path pickups are mitigated by the built-in caching, and
the public mapping in /dev/event-log-tags.
SideEffects: Event tags and signal handlers do not mix
Test: liblog benchmarks
Bug: 31456426
Change-Id: I69e6489d899cf35cdccffcee0d8d7cad469ada0a
Will register a new event tag by name and format, and return an
event-log-tags format response with the newly allocated tag.
If format is not specified, then nothing will be recorded, but
a pre-existing named entry will be listed. If name and format are
not specified, list all dynamic entries. If name=* list all
event log tag entries.
Stickiness through logd crash will be managed with the tmpfs file
/dev/event-log-tags and through a reboot with add_tag entries in
the pmsg last logcat event log. On debug builds we retain a
/data/misc/logd/event-log-tags file that aids stickiness and that
can be picked up by the bugreport.
If we detect truncation damage to /dev/event-log-tags, or to
/data/misc/logd/event-log-tags, rebuild file with a new first line
signature incorporating the time so mmap'd readers of the file can
detect the possible change in shape and order.
Manual testing:
Make sure nc (netcat) is built for the target platform on the host:
$ m nc
Then the following can be used to issue a request on the platform:
$ echo -n 'getEventTag name=<name> format="<format>"\0EXIT\0' |
> nc -U /dev/socket/logd
Test: gTest logd-unit-tests --gtest_filter=getEventTag*
Bug: 31456426
Change-Id: I5dacc5f84a24d52dae09cca5ee1a3a9f9207f06d
Report multiple identical chatty messages differently than for
regular expire chatty messages. Multiple identical will
report identical count, while spam filter will report
expire count.
This should reduce the expected flood of people confused
by chatty messages in continuous logcat output.
Test: gTest logd-unit-tests --gtest_filter=logd.multiple*
Change-Id: Iad93d3efc6a3938a4b87ccadddbd86626a015d44
Resolve issues seen on the continuous testing framework:
- statistics test, info instead of fail on missing radio log data.
- sepolicy switch from /data/misc/logd/ to /data/backup/ as the
directory we access(2) to inject sepolicy violations. The key here
is we are still root, but we are in u:r:shell:s0, and the directory
does not provide us DAC access (0700 system system) so we trigger
the pair dac_override and dac_read_search on every try to get past
the message de-duper. /data/misc/logd is not always there, until
logpersist is enabled, but /data/backup is always there.
- a stricter signature of '): avc: denied'
- put in a looser threshold for sepolicy_rate_limiter_spam test.
Test: gTest logd-unit-tests --gtest_filter=logd.sepolicy*
Bug: 34454758
Change-Id: I28ce4fdb51dc4869944e3253b593ce222d16ec98
Processing overhead for selinux violation messages is costly. We want
to deal with bursts of violations, but we have no intent of allowing
that sustained burst to go unabated as there is a cost of processing
and battery usage.
Tunables in libaudit.h are:
AUDIT_RATE_LIMIT_DEFAULT 20 /* acceptable burst rate */
AUDIT_RATE_LIMIT_BURST_DURATION 10 /* number of seconds of burst */
AUDIT_RATE_LIMIT_MAX 5 /* acceptable sustained rate */
Since we can only asymptotically handle DEFAULT rate, we set an upper
threshold of half way between the MAX and DEFAULT rate.
Default kernel audit subsystem message rate is set to 20 a second.
If sepolicy exceeds 125 violation messages over up to ten seconds
(>=~12/s), tell kernel audit subsystem to drop the rate to 5 messages
a second. If rate drops below 50 messages over the past ten seconds
(<5/s), tell kernel it is ok to increase the burst rate back to 20
messages a second.
Test: gTest logd-unit-tests --gtest_filter=logd.sepolicy_rate_limiter_*
Bug: 27878170
Change-Id: I843f8dcfbb3ecfbbe94a4865ea332c858e3be7f2
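The threshold arithmetic above works out as follows. The three tunable
values are those listed from libaudit.h; the constant names and
nextRate() are hypothetical, and the sketch leaves out how messages are
counted over the window.

```cpp
constexpr int kRateDefault = 20;    // AUDIT_RATE_LIMIT_DEFAULT, messages per second
constexpr int kBurstDuration = 10;  // AUDIT_RATE_LIMIT_BURST_DURATION, seconds
constexpr int kRateMax = 5;         // AUDIT_RATE_LIMIT_MAX, messages per second

// Halfway between MAX and DEFAULT, over the burst window: (20 + 5) / 2 * 10 = 125.
constexpr int kUpperThreshold = (kRateDefault + kRateMax) * kBurstDuration / 2;
// Sustained rate over the burst window: 5 * 10 = 50.
constexpr int kLowerThreshold = kRateMax * kBurstDuration;

static_assert(kUpperThreshold == 125, "upper threshold from the commit message");
static_assert(kLowerThreshold == 50, "lower threshold from the commit message");

// Returns the rate to request, given the current rate and messages seen in the window.
int nextRate(int currentRate, int messagesInWindow) {
    if (messagesInWindow > kUpperThreshold) return kRateMax;      // throttle
    if (messagesInWindow < kLowerThreshold) return kRateDefault;  // restore
    return currentRate;                                           // hysteresis band
}
```

Between 50 and 125 messages per window the rate is left alone, which keeps
the limiter from flapping.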
Some kernels have a bug which causes a newline to show up in audit
messages. The embedded newlines cause one message to look like two due
to prefix controls.
Replace any newlines with spaces. Duplicate spaces are further
consolidated in code immediately after this newly added code.
Test: create an audit message with a newline, and watch it be cleaned up.
Bug: 27878170
Change-Id: Id90c29ab9e10d3be96f51403b0293622d782422a
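The cleanup can be sketched in two passes, one replacing newlines and one
collapsing repeated spaces. cleanAuditMessage() is a hypothetical name
operating on std::string; the logd code works on its own buffers.

```cpp
#include <string>

// Replace newlines with spaces, then collapse runs of spaces into one.
std::string cleanAuditMessage(std::string text) {
    for (char& c : text) {
        if (c == '\n') c = ' ';
    }
    std::string out;
    for (char c : text) {
        // Skip a space that directly follows another space.
        if (c == ' ' && !out.empty() && out.back() == ' ') continue;
        out.push_back(c);
    }
    return out;
}
```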
log selinux audit messages boolean (true or false, default true)
selection for logging destinations:
ro.logd.auditd - turn on logd.auditd to pick up violations.
ro.logd.auditd.dmesg - to the kernel log.
ro.logd.auditd.main - to the "main" log buffer.
ro.logd.auditd.events - to the "events" log buffer.
We used to also read logd.auditd.dmesg and persist.logd.auditd.dmesg
which do not get refreshed when /data mounts internally. This is a
confusing state as these properties will be read after a logd crash
and restart, adjusting the behavior of the logger. Same can be said
for logd.auditd as well. Drop reading these other parameters.
Test: manual set r/o parameters, stop/start logd to confirm behavior
Bug: 33969000
Bug: 27878170
Change-Id: I1a6bb4a903074c9aa7b227cf583a0094d49cbefd
Until the socket ages out, it sticks around and gets reused in
subsequent tests affecting the outcome of those tests. We opt
to run logd.timeout in a forked and isolated process to keep
these conditions from interfering.
Adjusted benchmark execute to only run the tests we are
interested in to improve the time it takes to run.
Commented some areas of code to make them easier to maintain.
Test: gTest logd-unit-tests success
Bug: 33962045
Change-Id: Ic1b98bc4a2d7e8927f1a87628e3bcc368c9cf8ce
Caused +/- field data to land under the Pruned column
This reverts commit 0adcc3e3e8.
Test: manual
Bug: 30118730
Change-Id: Ic75ce3a90baded19f3efc0cc77474fe5d9a8accd
As an extension to the duplicate multiple message filtering, special
case liblog tagged event messages to be summed. This solves the
inefficient and confusing duplicate message report from the DOS attack
detection such as:
liblog: 2
liblog: 2
liblog: 2
liblog: 2
liblog: 3
which would result in:
liblog: 2
chatty: ... expire 2 lines
liblog: 2
liblog: 3
And instead sums them and turns them all into:
liblog: 11
liblog messages should never be subject to chatty conversion.
Test: liblog-benchmarks manually check for coalesced liblog messages
and make sure they do not turn into chatty messages.
Instrumented code to capture sum intermediates to be sure.
Bug: 33535908
Change-Id: I3bf03c4bfa36071b578bcd6f62234b409a91184b
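The summing in the example above can be sketched as a merge of adjacent
liblog entries. Entry and coalesceLiblog() are hypothetical; the real
filter works inside the duplicate message state machine rather than on a
finished list.

```cpp
#include <string>
#include <vector>

// Hypothetical flattened view of a log entry: a tag and a numeric payload.
struct Entry {
    std::string tag;
    int count;
};

// Merge runs of adjacent liblog entries by summing their counts.
std::vector<Entry> coalesceLiblog(const std::vector<Entry>& in) {
    std::vector<Entry> out;
    for (const Entry& e : in) {
        if (e.tag == "liblog" && !out.empty() && out.back().tag == "liblog") {
            out.back().count += e.count;  // fold into the previous liblog entry
        } else {
            out.push_back(e);
        }
    }
    return out;
}
```

Feeding in liblog entries of 2, 2, 2, 2 and 3 gives a single liblog entry
of 11.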
Inspection turned up that for the case of three identical messages,
the result would be a stutter of the first message only. Added
comments to describe the state machine, incoming variables, outgoing
and false condition outputs, for proper maintenance in the future.
Test: gTest liblog-benchmarks BM_log_maximum* and manually check
for correct midstream chatty messages.
Bug: 33535908
Change-Id: I852260d18a484e6207b80063159f1a74eaa83b55