Do not assume that if worstPid is set, the log buffer id cannot be
LOG_ID_EVENTS or LOG_ID_SECURITY. Add comments to prevent future
over-optimization based on this assumption.
Make sure we reset mLast[id] = begin() when we mark it unset, but
tell the optimizer this is an _impossible_ path.
SideEffects: drops two branches in all erase calls, gains an unordered
find() on an empty list for the events and security buffers.
Test: gTest logd-unit-tests, liblog-unit-test & logcat-unit-tests
Bug: 32247044
Change-Id: Ic156ca2253c050c28021cedf48bedaf7bd692c09
mLastWorstPidOfSystem is filled with iterator references that are not
from AID_SYSTEM, to aid performance. But we only clear entries from
the list during erase if they are from AID_SYSTEM. Remove the filter
check in erase so that the stale references get removed.
The conditions that caused this failure are difficult to
reproduce and are rare.
Test: gTests logd-unit-tests, liblog-unit-tests and logcat-unit-tests
Bug: 32247044
Bug: 31237377
Change-Id: Ie405dd643203b816cac15eef5c97600551cee450
Allows us to mitigate the impact of MAP_PRIVATE and copy-on-write by
calling android_lookupEventTag_len instead of android_lookupEventTag,
delaying the copy-on-write impact until later. The new
android_lookupEventTag_len(const EventTagMap* map, size_t* len,
int tag) returns the string pointer and places the string length in
the supplied location. The string is not guaranteed to be nul
terminated. Since even a single call to android_lookupEventTag() can
trigger the memory impact, we will mark it as deprecated, but we
currently have no timeframe for removal since this is a very old
interface.
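A minimal sketch of the new lookup (header path, EVENT_TAG_MAP_FILE
use and error handling are assumptions for illustration, not part of
this change):

  #include <log/event_tag_map.h>
  #include <stdio.h>

  void dump_tag_name(int tag) {
      EventTagMap* map = android_openEventTagMap(EVENT_TAG_MAP_FILE);
      if (!map) return;
      size_t len;
      const char* name = android_lookupEventTag_len(map, &len, tag);
      /* not guaranteed to be nul terminated, so bound the output by len */
      if (name) printf("%d %.*s\n", tag, (int)len, name);
      android_closeEventTagMap(map);
  }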
Add an API for __android_log_is_loggable_len() that accepts non nul
terminated content, and fix up callers that benefit because the length
is already known, at compile time or at runtime, prior to the call.
Also tackle the transition to android_lookupEventTag_len() and fix up
its callers.
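An illustrative caller of the new API (function name and priority
choices here are for the example only; the header location is
assumed):

  #include <log/log.h>
  #include <stddef.h>

  int should_log(const char* tag, size_t tag_len) {
      /* the length is already known, so liblog need not strlen() a nul
         terminated copy of the tag */
      return __android_log_is_loggable_len(ANDROID_LOG_INFO, tag, tag_len,
                                           ANDROID_LOG_VERBOSE /* default */);
  }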
On any application that performs logging (e.g. com.android.phone)
/proc/<pid>/smaps before:
xxxxxxxxxx-xxxxxxxxxx rw-p 00000000 fd:00 463 /system/etc/event-log-tags
Size: 20 kB
Rss: 20 kB
Pss: 1 kB
Shared_Clean: 0 kB
Shared_Dirty: 20 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 20 kB
AnonHugePages: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac
/proc/<pid>/smaps after:
xxxxxxxxxx-xxxxxxxxxx rw-p 00000000 fd:00 1773 /system/etc/event-log-tags
Size: 20 kB
Rss: 20 kB
Pss: 1 kB
Shared_Clean: 20 kB (was 0kB)
Shared_Dirty: 0 kB (was 20kB)
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 20 kB (was 0kB)
Anonymous: 0 kB (was 20kB)
AnonHugePages: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me ac
Added liblog-unit-tests --gtest_filter=liblog.event_log_tags to
check that Shared_Clean: is not 0 and Anonymous: is 0 for all
processes referencing event-log-tags. That can include multiple
references to /system/etc/event-log-tags and possible future refs to
/data/misc/logd/event-log-tags and /dev/event-log-tags. We want the
failure messages to help point to errant code using the deprecated
interface.
This change saves 1/4MB of memory or more on a typical system.
Test: gTest liblog-unit-tests
Bug: 31456426
Change-Id: I9e08e44d9092bd96fe704b5709242e7195281d33
pruneRows is not necessarily ULONG_MAX when the uid is not from a
system source; allow a speedup of the status response if pruneRows is
exhausted.
Change-Id: I38c76bb20215e3d96513a575e2e3bc85a5e5b41c
mLastWorstPidOfSystem is supposed to be indexed by element->getPid()
Bug: 31237377
Bug: 30797725
Bug: 30688716
Change-Id: I81a55e92f175ded1c571a0aa8836736d86b36b1d
- Add drop logistics to TagTable
- replace uid references with a key reference, since the key is a UID
  for most buffers but a TAG for the events and security buffers
- template the find-worst-entry mechanics into a LogFindWorst class
  (see the sketch below)
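An illustrative sketch of the templating idea (class shape, container
choice and the getSizes() accessor are assumptions; this is not the
actual logd code):

  #include <cstddef>
  #include <unordered_map>

  template <typename TKey, typename TEntry>
  class LogFindWorst {
      const std::unordered_map<TKey, TEntry>& mTable;

  public:
      explicit LogFindWorst(const std::unordered_map<TKey, TEntry>& table)
          : mTable(table) {}

      // TKey is a UID for most buffers, a TAG for the events/security buffers
      void findWorst(TKey& worst, size_t& worst_sizes,
                     size_t& second_worst_sizes) const {
          for (const auto& it : mTable) {
              size_t sizes = it.second.getSizes();
              if (sizes > worst_sizes) {
                  second_worst_sizes = worst_sizes;
                  worst_sizes = sizes;
                  worst = it.first;
              } else if (sizes > second_worst_sizes) {
                  second_worst_sizes = sizes;
              }
          }
      }
  };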
Bug: 30118730
Change-Id: Ibea4be2c50d6ff4b39039e371365fed2453f17a2
- limit reading of security buffer messages to AID_SYSTEM uid or gid
- adjust liblog tests to reflect the reality of this restriction
To fully test all security buffer paths and modes
$ su 0,0,0 /data/nativetest/liblog-unit-tests/liblog-unit-tests --gtest_filter=liblog.__security*
$ su 1000,1000,1000 /data/nativetest/liblog-unit-tests/liblog-unit-tests --gtest_filter=liblog.__security*
$ su 2000,2000,2000 /data/nativetest/liblog-unit-tests/liblog-unit-tests --gtest_filter=liblog.__security*
ToDo: Integrate the above individually into the gTest Q/A testing
Bug: 26029733
Change-Id: Idcf5492db78fa6934ef6fb43f3ef861052675651
Without this change LogBuffer::prune and LogBuffer::erase contribute
16.7% and 1.79% respectively. With this change, they contribute 3.06%
and 2.33% respectively. Pruning is performed roughly once for every
255 log entries, so this tames a periodic latency spike.
Bug: 23685592
Change-Id: I6ae1cf9f3559bca4cf448efe8bcb2b96a1914c54
Callers will not guarantee that they can or will ratelimit; as a
result, we need to retain the ability to blacklist snet_event_log.
This reverts commit 6aa21b225d.
Bug: 26178938
Change-Id: Ibf47d2e23a84c56f5f72d02312c698df7ff2b601
Bad comment advice in LogBuffer.cpp results in partners failing to
consider using ro.logd.size to set the platform buffer size default.
NB: It is not good practice to increase the log buffer size to deal
with logspam, as larger buffers push logd scaling issues closer to
the background cgroup cpu cap. Once we hit that cap, logd spirals,
pruning old entries slower than new log entries arrive, and
logd.writer will take 100% cpu.
Change-Id: If4a7a74f300d078eeaed0ffd3eb3fd77d1f9fe90
Whitelisting is a dangerous bridge to cross: who is special, and who
is not? The rationale is that these events are used to catch exploits
on the platform. As it stands, no one should be allowed to block any
messages in the security context, not even for development purposes.
Bug: 26178938
Change-Id: Ibdc76bc0fe29ba05be168b623af1c9f41d7edbd2
- Add a new statistic that reports per pid and log_id for AID_SYSTEM
- Add a new boolean pruning filter, ~1000/! (usage example below)
- Use this new statistic to prune the worst pid within AID_SYSTEM
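For reference, the filter can be exercised from the shell via
logcat's existing prune-list options:

  $ logcat -P '~1000/!'   # ~ marks a blacklist (prune first) entry;
                          # 1000/! is the worst pid within uid 1000 (system)
  $ logcat -p             # print the prune white and ~black list in effect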
Bug: 26029733
Bug: 21615139
Bug: 22855208
Change-Id: Iab5dd28f807dcf03d276372853883f3b6afa8294
- enhance property_get_bool, drop property_get_bool_svelte
- enhance base properties with ro and persist variants (lookup-order
  sketch below)
- update and fortify README.property
- primarily move auditd and kernel logger into a realm where
they can be controlled by build properties.
- Move logd.klogd to logd.kernel, and add ro.logd.kernel
and persist.logd.kernel.
- Add ro.logd.auditd and persist.logd.auditd.
- Document persist.logd.security
- Document log.tag and persist.logd.tag properties.
- Document ro.logd.size, persist.logd.size and logd.size
properties.
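A minimal sketch of the persist/ro lookup order, assuming the
libcutils property_get() API; the helper name and exact precedence
details are illustrative, not the logd implementation:

  #include <cutils/properties.h>
  #include <stdio.h>
  #include <strings.h>

  static bool get_bool_layered(const char* base /* e.g. "logd.kernel" */,
                               bool def) {
      char key[PROPERTY_KEY_MAX];
      char value[PROPERTY_VALUE_MAX];

      /* a persist.* override wins over the ro.* build default ... */
      snprintf(key, sizeof(key), "persist.%s", base);
      if (property_get(key, value, "") > 0) return !strcasecmp(value, "true");

      /* ... which wins over the compiled-in default */
      snprintf(key, sizeof(key), "ro.%s", base);
      if (property_get(key, value, "") > 0) return !strcasecmp(value, "true");

      return def;
  }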
Bug: 26178938
Bug: 26029733
Bug: 17760225
Change-Id: Ibc1a497e45d69db5cf52133937f7ba6fb1d6cd21
Primarily gives access to the chattiest TIDs and TAGs associated
with a pid.
A secondary effect is that we can pull out the command line, comm
and, in some cases, the associated PACKAGE for a specific pid while
the logs are still present, even if the executable is gone.
Bug: 26029733
Bug: 21615139
Change-Id: I1ea63165a680a9318360579b70b1512078ed5682
Adds the uid field to outgoing content for readlog applications.
AID_LOG, AID_ROOT and AID_SYSTEM gain access to the information.
Bug: 25996918
Change-Id: I0254303c19d174cbf5e722c38844be5c54410c85
If a timeout is specified for the reader, then go to sleep with the
socket open. If the start time is about to get pruned in the
specified log buffers, then wake up and dump the logs; or wake up on
timeout, whichever comes first.
Bug: 25929746
Change-Id: I7d2421c2c5083b33747b84f74d9a560d3ba645df
android_log_timestamp returns the property's leading letter; it is
better to return a clockid_t with android_log_clockid().
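Sketch of the intended use (the header providing android_log_clockid()
is an assumption here):

  #include <log/log.h>
  #include <time.h>

  void stamp_now(struct timespec* ts) {
      clockid_t id = android_log_clockid(); /* CLOCK_REALTIME or CLOCK_MONOTONIC */
      clock_gettime(id, ts);
  }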
Bug: 23668800
Change-Id: I38dee773bf3844177826b03a26b03215c79a5359
android_log_timestamp returns the property's leading letter; it is
better to return a clockid_t with android_log_clockid().
Bug: 23668800
Change-Id: I3c4e3e6b87f6676950797f1f0e203b44c542ed43
Although the problem has always been present, the regression
increased with commit b6bee33182 (liblog: logd:
support logd.timestamp = monotonic).
A signal handler can interrupt while we hold a lock; if a log is
written in the signal handler, we deadlock. Block signals while we
are locked. Separate the timestamp lock from the is-loggable lock to
reduce contention. Provide a best-guess response if taking the lock
would fail in the timestamp path.
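A minimal sketch of the block-signals-while-locked idea (names are
illustrative, not the liblog internals):

  #include <pthread.h>
  #include <signal.h>

  static pthread_mutex_t lock_loggable = PTHREAD_MUTEX_INITIALIZER;

  static void lock_blocking_signals(sigset_t* old) {
      sigset_t all;
      sigfillset(&all);
      pthread_sigmask(SIG_BLOCK, &all, old); /* handlers can no longer run on
                                                this thread, so a logging signal
                                                handler cannot re-enter while we
                                                hold the lock */
      pthread_mutex_lock(&lock_loggable);
  }

  static void unlock_restoring_signals(const sigset_t* old) {
      pthread_mutex_unlock(&lock_loggable);
      pthread_sigmask(SIG_SETMASK, old, NULL);
  }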
Bug: 25563384
Change-Id: I6dccd6b99ebace1c473c03a785a35c63ed5c6a8a
If ro.logd.timestamp or persist.logd.timestamp is set to the value
monotonic, then the liblog writer, liblog printing and logd all
switch to recording/printing monotonic time rather than realtime. If
reinit detects a change to persist.logd.timestamp, correct the older
entry timestamps in place.
ToDo: In a corner case, new log entries in monotonic time may occur
before logd reinit detects persist.logd.timestamp; there will be a
few out-of-order entries, but with accurate timestamps. This problem
does not happen for ro.logd.timestamp, as it is set before logd
starts.
NB: This offers nanosecond time accuracy on all log entries, which
may be more suitable for merging with other system activities, such
as systrace, that also use monotonic time. This feature is for
debugging.
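To exercise this on a device where the property can be set:

  $ adb shell setprop persist.logd.timestamp monotonic  # picked up when
                                                         # logd reinit detects
                                                         # the change
  # any other value (or leaving it unset) keeps the default realtime clock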
Bug: 23668800
Change-Id: Iee6dab7140061b1a6627254921411f61b01aa5c2
Chatty logs would distort the average log size by elevating the
element count, but not the size. Collect a statistic for the number
of elements that report chatty, and subtract it from the element
count to improve the pruning estimate. With this more accurate
element count, pick minElements as 1% rather than 10% of the total,
with a minimum of 4.
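Illustrative arithmetic only (names are not from the change):

  #include <cstddef>

  size_t prune_min_elements(size_t elements, size_t chatty_elements) {
      /* chatty entries inflate the element count but not the size total */
      size_t real_elements = elements - chatty_elements;
      size_t min_elements = real_elements / 100; /* 1% of the corrected total */
      return min_elements < 4 ? 4 : min_elements; /* minimum of 4 */
  }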
Bug: 24511000
Change-Id: I3f36558138aa0b2a50e4fac6440c3a8505d95276
- switch to coalesce instead of merge in the naming of functions and
  variables; merge was confusing since we also do merge sorts and
  other such activities in the logger.
- define maxPrune rather than hard-coding a number in the code path.
Bug: 24511000
- If doing a clear, skip accounting
- For busy checking (behind a region lock, for instance), ensure we
  only break out if there was something to do. Basically move the
  filter actions first, and defer checking the region lock to the
  ends of the loops.
Bug: 23711431
Change-Id: Icc984f406880633516fb17dda84188a30d092e01
- Propagate clear errors to the caller, e.g. busy because blocked by
  a reader.
- For clear, perform retries within logd with a one second lull each,
  telling readers to skip; on the final retry, kill all readers if
  the problem still persists due to a blocked reader (or a
  high-volume logspammer).
Bug: 23711431
Change-Id: Ie4c46bc9480a7f49b96a81fae25a95c603270c33
Fix a regression that resulted in increased memory consumption for
some logging patterns, because we rarely did merge, leading or
age-out checks. On the last prune cycle, reset for a full scan.
Add some comments describing the pruning process.
Bug: 23327476
Bug: 23681639
Bug: 23685592
Change-Id: I22b0f339c9269b006831fda9cefe295a263ebb92
With part deux we caused an apparent regression by not checking for
stale recorded iterators. This check was bypassed on purpose when
leading prune entries were to be deleted without touching the
statistics engine due to an in-place merge.
Part deux left untouched the iterators we were not focused on; and
because they were left behind, they had a much higher likelihood of
being deleted without touching the statistics engine.
Perform the check on every delete.
Bug: 23789348
Change-Id: Idc6cc23d1f9e3b6cd9a083139a0de59479fbfe08
Only record the watermark if it is not already known, or if it
represents the worst UID currently under focus. This has halved the
average prune time in the face of heavy spam because we get fewer
processing spikes.
Bug: 23327476
Change-Id: I19f297042b9fc2c98d902695c1c36df1bf5cd6f6
Hold on to the last worst-uid watermark and bypass a spike to
O(n*n*x) (n = samples, x = number of spammers) wrt chatty trimming.
Bug: 23327476
Change-Id: I9f21ce95e969b67e576417a760f75c4d86acf364
- Default level when not specified is ANDROID_LOG_VERBOSE, which is
  inert (example below).
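For illustration (the tag name is hypothetical; log.tag.<tag> is the
existing per-tag threshold property):

  $ adb shell setprop log.tag.MyTag I  # raise the threshold for "MyTag" to Info
  # with no such property set, the level defaults to ANDROID_LOG_VERBOSE and
  # nothing is filtered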
Bug: 20416721
Bug: 19544788
Bug: 17760225
Change-Id: Icc098e53dc47ceaaeb24ec42eb6f61d6430ec2f6
- Cleanup resulting from experience and feedback
- When filtering inside logd, drop any leading expire messages; they
  clutter up the leading edge of tombstones (which filter by pid)
- Introduce EXPIRE_RATELIMIT, increased from 1 to 10 seconds
- Increase EXPIRE_THRESHOLD from a count of 4 to 10
- Improve the expire messages from:
    logd : uid=1000(system) too chatty comm=com.google.android.phone,
    expire 2800 lines
  changing the tag to be more descriptive and reducing the accusatory
  tone to:
    chatty : uid=1000(system) com.google.android.phone expire 2800
    lines
- if the UID name forms a prefix of the comm name, then drop the UID
  name
Change-Id: Ied7cc04c0ab3ae02167649a0b97378e44ef7b588
BasicHashtable is relatively untested; move over to a C++ template
library that has had more bake time.
Bug: 20419786
Bug: 21590652
Bug: 20500228
Change-Id: I926aaecdc8345eca75c08fdd561b0473504c5d95
Code in 833a9b1e38 was broken; a simpler approach is to simply check
the last entry (to save a syscall) minus EXPIRE_HOUR_THRESHOLD. This
does make longer logs less likely to invoke the spam detection than
the algorithm being replaced, but sadly we ended up with a log entry
from the future at the beginning of the logs, confounding the
previous algorithm.
Bug: 21555259
Change-Id: I04fad67e95c8496521dbabb73b5f32c19d6a16c2