Commit graph

168 commits

Author SHA1 Message Date
Mark Salyzyn
ddda212faa logd: optimize code hotspots
Discovered that we had a few libc hotspots. Adjust code to generally
reduce or nullify the number of calls to malloc, free, strlen,
strcmp, strncmp, memcmp & strncasecmp. Total gain looks to be about
3% of logd's processing time. malloc still contributes to 3%, but all
others are now total 0.5%.

Bug: 23685592
Change-Id: Ife721121667969260cdb8b055524ae90f5911278
2015-10-02 16:45:22 -07:00
Mark Salyzyn
5ac5c6b193 logd: Add LogUtils.h
Move prototypes from LogBufferElement.h to LogUtils.h

Change-Id: I55e42e17e6c997e35b2b78b87fd2f84f8f393282
2015-10-02 16:36:41 -07:00
Mark Salyzyn
151beac76d logd: klogd deal with nuls in dmesg
Switch to using string and length in all transactions, treating
trailing nuls the same as spaces.

ToDo: change dumpstate (bugreport) to use logcat -b printable _regardless_

Bug: 23517551
Change-Id: I42162365e6bf8ed79d356e7b689a673902116fdb
2015-09-29 15:51:38 -07:00
Mark Salyzyn
ea1a241107 logd: log_strtok_r deal with nuls
Rename to log_strntok_r and change from dealing with strings
to dealing with a string and an associated length.

Bug: 23517551
Change-Id: Ia72f1305a53f55eeef9861ac378fb8205fd2378e
2015-09-29 15:43:34 -07:00
Mark Salyzyn
2d159bf3b5 logd: klogd: sniff for time correction on Mediatek
Need some more flexibility when parsing kernel messages
cluttered with extra fluff. This is the minimal relaxation
we can do to the rules to ensure that we work on all
possible devices and kernels when sniffing for time
correction information.

We want to minimize any future maintenance, keep in mind
klogd is a "userdebug" or "eng" feature and is disabled
in "user" builds. Manage expectations.

Bug: 23517551
Change-Id: I026d074e14fb2550e728683e85a973bd87e78a9c
2015-09-29 15:43:07 -07:00
Mark Salyzyn
47dba71f24 logd: klogd and Mediatek part deux
- switch to an open coded strnrchr
- validity checking on n, taglen and b values.

Bug: 23517551
Change-Id: I568e25c5aa6d8474835454a0e83b19c2921b7986
2015-09-29 15:40:08 -07:00
Mark Salyzyn
39944c89a9 logd: increase dgram_max_qlen to 600
Seeing liblog messages on system_server runtime restart
(too much system_server spam, 566 messages in 72ms)

Bug: 23788621
Change-Id: I5171f2c19a3538da190fc6c2b40e978d89bf0e20
2015-09-25 14:15:53 +00:00
Mark Salyzyn
8c943b6dc8 logpersist: bundle with logcatd service
Bug: 23186545
Change-Id: I130d7c7e120acb372e58aec028f39e161d53628e
2015-09-21 13:53:01 -07:00
Mark Salyzyn
1b9456a1a5 logpersist: Additional barrier
If this shows up on "user" builds, block execution

Change-Id: I2e137d1ff7583ac000b81dee7390b582dfd02095
2015-09-21 13:52:45 -07:00
Mark Salyzyn
5bb2972dce logd: worst uid record watermark part five
A regression that resulted in increased memory consumption for some
logging patterns because we rarely did merge or leading checks, and
age-out checking. On the last prune cycle, we reset for a full scan.

Add some comments describing the pruning processes.

Bug: 23327476
Bug: 23681639
Bug: 23685592
Change-Id: I22b0f339c9269b006831fda9cefe295a263ebb92
2015-09-10 08:43:03 -07:00
Mark Salyzyn
831aa29730 logd: worst uid record watermark part four
With part deux we caused an apparent regression by not checking for
stale recorded iterators. This checking was on-purpose bypassesed
when leading prune entries were to be deleted without touching the
statistics engine due to an in-place merge.

Part deux had us leaving iterators we were not focussed on untouched
which in turn because they were left behind, had a much higher
likelihood of being deleted without touching the statistics engine.

Perform the check every delete.

Bug: 23789348
Change-Id: Idc6cc23d1f9e3b6cd9a083139a0de59479fbfe08
2015-09-03 17:13:45 -07:00
Mark Salyzyn
ccfe8446a1 logd: worst uid record watermark part three
Regression that cause records to be preserved for more than a day.

Bug: 23681639
Bug: 23685592
Change-Id: I5e4393c8e3ed935790994c77ec51dc6512a6daa6
2015-08-31 20:51:42 +00:00
Mark Salyzyn
46d159d462 logd: klogd and Mediatek
- sniff for PID in kernel log messages if available
- properly deal with klogd watermark in face of modified output
- deal more stringently with priority tag, must have [ following
- suppress process-name stutter in tag that can happen
- do not use : to demark tag if within [ ]

Mediatek-special change that adds <printk_state>(<cpu>)[<pid>:<comm>]
as a prefix to the printk messages. Along the lines of (simplified
for entertainment purposes, YMMV):

    char tbuf[50]; /* printk prefix */
    int this_cpu = smp_processor_id();
    char state = __raw_get_cpu_var(printk_state);
    unsigned tlen = snprintf(tbuf, sizeof(tbuf), "%c(%x)[%d:%s]",
               state, this_cpu, current->pid, current->comm);

Bug: 23517551
Change-Id: I568e25c5aa6d8474835454a0e83b19c2921b7985
2015-08-26 14:57:16 -07:00
Mark Salyzyn
49afe0d00f logd: worst uid record watermark part deux
Only record watermark if not known, or represents the worst UID
currently under focus. This has resulted in a halving of the average
prune time in the face of heavy spam because we get less processing
spikes.

Bug: 23327476
Change-Id: I19f297042b9fc2c98d902695c1c36df1bf5cd6f6
2015-08-24 14:04:45 -07:00
Tom Cherry
ff5be396d7 Merge changes from topic 'init-rc-breakup'
* changes:
  init: Queue Triggers instead of Actions
  bundle init.rc contents with its service
2015-08-21 17:42:29 +00:00
Tom Cherry
20391b1de5 bundle init.rc contents with its service
Bug: 23186545
Change-Id: I52616b8ab1165fdef716f9b8f958665f2308c12e
2015-08-21 10:14:43 -07:00
Mark Salyzyn
73160acc5c logd: switch asprintf to std::string
Bug: 23350706
Change-Id: I715cdd4563a09de3680081947a3439f0cac623be
2015-08-20 10:32:16 -07:00
Mark Salyzyn
decbcd9c41 logd: statistics switch to std::string
Bug: 23350706
Change-Id: I5564898c4f67b8fcc43cee64604855f789409482
2015-08-20 10:25:57 -07:00
Mark Salyzyn
b332f1c427 logd: white and black switch to std::string
Bug: 23350706
Change-Id: I92f21aee0a9702f63e8465851d0f35007b0469a7
2015-08-20 10:25:57 -07:00
Mark Salyzyn
c892ea3fa8 logd: worst uid record watermark
Hold on to last worst uid watermark and bypass a spike to O(n*n*x)
(n=samples, x=number of spammers) wrt chatty trimming.

Bug: 23327476
Change-Id: I9f21ce95e969b67e576417a760f75c4d86acf364
2015-08-20 10:25:57 -07:00
Mark Salyzyn
94a89c42fe logd: log buffer switch to std::list
Bug: 23350706
Bug: 23327476
Change-Id: Ib0530b9dd5842c6d05c84d7a66f2531c97461067
2015-08-20 08:36:13 -07:00
Mark Salyzyn
e0ed16c6db logd: white and black switch to std::list
Bug: 23350706
Change-Id: I981d9fa63a0d87eb309485cca93cfc44fc0506cc
2015-08-20 08:36:08 -07:00
Mark Salyzyn
98dca2d0b1 logd: logtimes switch to std::list
Bug: 23350706
Change-Id: Icc60dd06119ea20a22610644ff880d5135363aba
2015-08-20 08:36:03 -07:00
Mark Salyzyn
b39ed0c992 logd: prune 10% or 256 entries max
Bug: 22351810
Bug: 23327476
Change-Id: I902ba6b431d8b7cee2d65ee2f76e9f7c4f30b152
2015-08-20 08:35:45 -07:00
Mark Salyzyn
95e7cb5b8e Merge "healthd: logd: add timestamp to kernel logged battery messages" 2015-08-12 14:17:38 +00:00
Mark Salyzyn
62ab0fd4ef logd: sizes > 1M prune in smaller batches
Switch to 1% batch sizes from 10% when individual buffer size > 1M

Bug: 22351810
Change-Id: Ifee570a54643ceb0ba767e1787e937f70cc90b72
2015-08-11 16:20:54 -07:00
Mark Salyzyn
acb1ddf56c healthd: logd: add timestamp to kernel logged battery messages
Aid monotonic to realtime logging synchronization correction in
the Android ecosystem by providing a periodic notification.  We
now have the following messages in the kernel logs:

- PM: suspend entry %Y-%m-%d %H:%M:%S.%09q UTC
- PM: suspend exit %Y-%m-%d %H:%M:%S.%09q UTC
- Suspended for %s.%03q seconds
- healthd: battery l=100 ... %Y-%m-%d %H:%M:%S.%09q UTC

Alter klogd to resynchronize on healthd messages as well.

NB: Time using strftime format, %q is a reference to fractional
second as introduced into log_time strptime method.

Bug: 21868540
Change-Id: I854afc0a07dff9c7f26d2b2f68990e52bf90e300
2015-07-28 16:52:58 -07:00
Mark Salyzyn
cb41e3673b logd: deprecate TARGET_USES_LOGD
This is not the kernel logger you are looking for

Bug: 22787659
Change-Id: Idbf8fae908287ae9a96007b6353ec56d3c49f5bf
2015-07-28 09:37:07 -07:00
Andreas Gampe
d75564f9b8 Logd: Handle unused variable and fields
For build-system CFLAGS clean-up.

Bug: 18632512
Change-Id: If81d6705b44e9a29f64c44c56ea633c031e831b7
2015-07-27 14:17:33 -07:00
Mark Salyzyn
618d0dec50 logd: refine is_prio
- Heuristics associated with translation of kernel messages to
  Android user space logs.
- Limit is_prio to 4 characters, we got false positives on hex
  values like <register contents> with no alpha chars.
- x11 and other register definitions are not valid tags, en0 is
- fix some Android coding standard issues

Change-Id: Idc3dcc53a2cb75ac38628c8ef7a5d5b53f12587a
2015-07-22 07:36:03 -07:00
Mark Salyzyn
ed777e9eec logd: serialize accesses to stats helpers
Quick low-risk to resolve possible hash table corruption.
Resolved an unlikely path memory leak.

ToDo: replace lock with nested lock so no lock
      helpers are required.

Bug: 22068332
Change-Id: I303ab06608502c7d61d42f111a9c43366f184d0c
2015-06-25 07:39:24 -07:00
Mark Salyzyn
ee49c6a670 logd: missing klogd content
- regression in log_strtok_r (part deux) In commit
      'logd: fix kernel logline stutter'
  2c3b300fd8 we introduced log_strtok_r.
  as a replacement for strtok_r that dealt with a problem with
  some kernel log messages. Fix is to refine definition of
  is_timestamp to not match on patterns like [0], requiring
  a single period. Another fix is to refine definition of
  is_prio to properly escape non-digit content.
- Missing content because SYSLOG_ACTION_SIZE_BUFFER with added logging
  is too short for full read of SYSLOG_ACTION_READ_ALL dropping
  initial content. Add a margin for additional 1024 bytes.
- Absolute _first_ log entry has sequence number of 1, which is
  specifically dropped, start sequence count at 1 rather than 0.
- Remove trailing space for efficiency.
- If tag exists but no content, trick into kernel logging.

Bug: 21851884
Change-Id: I0867a555a3bca09bbf18d18e75e41dffffe57a23
2015-06-15 21:19:10 +00:00
Mark Salyzyn
e59c469fa8 logd: filter on __android_log_is_loggable
- Default level when not specified is ANDROID_LOG_VERBOSE
  which is inert.

Bug: 20416721
Bug: 19544788
Bug: 17760225
Change-Id: Icc098e53dc47ceaaeb24ec42eb6f61d6430ec2f6
2015-06-12 10:35:09 -07:00
Riley Andrews
aede9897df Lower the priority of the threads in logd/logcat.
(cherry pick from commit d98f4e8af5)

sched_batch implies only a penalty to latency in scheduling, but
does not imply that the process will be given less cpu time. Increase
the nice level to 10 to prioritize it below ui threads.

Bug: 21696721
Change-Id: I075af059dc755402f7df9b0d7a66cca921ff04b2
2015-06-09 12:40:20 -07:00
Mark Salyzyn
3e21de2915 logd: build breakage
OPEN_BRACKET_SPACE comparison always false

Change-Id: I1ff4288b4b79a49702727d3a8b8c8f179f500951
2015-06-08 14:52:14 -07:00
Mark Salyzyn
2c3b300fd8 logd: fix kernel logline stutter
- look for cases where one log line contains two without a newline.
- rare condition, occurs when a printk does not have
  a terminating newline under certain race conditions.
- the newline may be performed broken up as a second call
- the timestamps can be reversed (showing the race effects).
- driver(s) should really have the newline in there log messages.

Change-Id: Ibfb56b32047da3d6513db059ca6edad0f0105168
2015-06-08 13:10:31 -07:00
Mark Salyzyn
047cc0729f logd: filters remove leading expire messages and rate
- Cleanup resulting from experience and feedback
- When filtering inside logd, drop any leading expire messages, they
  are cluttering up leading edge of tombstones (which filter by pid)
- Increase and introduce EXPIRE_RATELIMIT from 1 to 10 seconds
- Increase EXPIRE_THRESHOLD from 4 to 10 count
- Improve the expire messages from:
   logd : uid=1000(system) too chatty comm=com.google.android.phone,
                                                   expire 2800 lines
  change tag to be more descriptive, and reduce accusatory tone to:
   chatty : uid=1000(system) com.google.android.phone expire 2800
                                                               lines
- if the UID name forms a prefix for comm name, then drop UID name

Change-Id: Ied7cc04c0ab3ae02167649a0b97378e44ef7b588
2015-06-05 08:05:05 -07:00
Mark Salyzyn
511338dd57 logd: switch to unordered_map from BasicHashtable
BasicHashtable is relatively untested, move over to
a C++ template library that has more bake time.

Bug: 20419786
Bug: 21590652
Bug: 20500228
Change-Id: I926aaecdc8345eca75c08fdd561b0473504c5d95
2015-06-03 13:03:07 -07:00
Mark Salyzyn
100658c303 init.rc: logd: Add logpersistd (nee logcatd)
- logpersistd is defined as a thread or process in the context of the
  logd domain. Here we define logpersistd as logcat -f in logd domain
  and call it logcatd to represent its service mechanics.
- Use logcatd to manage content in /data/misc/logd/ directory.
- Only turn on for persist.logd.logpersistd = logcatd.
- Add logpersist.start, logpersist.stop and logpersist.cat debug
  class executables, thus only in the eng and userdebug builds.

ToDo: Wish to add Developer Options menu to turn this feature on or
off, complicated by the fact that user builds have no tools with
access rights to /data/misc/logd.

Bug: 19608716
Change-Id: I57ad757f121c473d04f9fabe9d4820a0eca06f31
2015-06-02 15:17:59 -07:00
Mark Salyzyn
a8b6131ddf Merge "logd: test modernization" 2015-06-01 21:16:58 +00:00
Mark Salyzyn
10a124d342 Merge "logd: whitelist should not preserve expire messages" 2015-06-01 21:16:57 +00:00
Mark Salyzyn
5921276a16 logd: KISS & fix preserve a day
Code in 833a9b1e38 was borken,
simpler approach is to simply check last entry (to save a
syscall) minus EXPIRE_HOUR_THRESHOLD. This does make longer logs
less likely to call upon the spam detection than the algorithm
being replaced, but sadly we ended up with a log entry in the
future at the beginning of the logs confounding the previous
algorithm.

Bug: 21555259
Change-Id: I04fad67e95c8496521dbabb73b5f32c19d6a16c2
2015-06-01 13:06:35 -07:00
Mark Salyzyn
5392aac95d logd: deal with sloppy leading expire messages
The odds of chatty content also leading the logs is pretty high eg:

 1799 12017 I logd: uid=10007 chatty comm=Binder_B, expire 4 lines
 1799  1829 I logd: uid=10007 chatty comm=Binder_2, expire 4 lines
 1919 20637 I logd: uid=10007 chatty comm=m.sersistent, expire 1 line
 1919 20638 I logd: uid=10007 chatty comm=s.persistent, expire 1 line
 1919  2316 I logd: uid=10007 chatty comm=UlrDispatch, expire 4 lines
19379 20634 I logd: uid=10045 chatty, expire 14 lines
19379 19388 I logd: uid=10045 chatty comm=lizerDaemon, expire 4 lines
  591  4396 I logd: uid=1000 chatty comm=Thread-220, expire 5 lines
  591  1377 I logd: uid=1000 chatty comm=Thread-92, expire 4 lines
 1919  2267 I logd: uid=10007 chatty comm=WifiScanner, expire 4 lines
  591  4397 I logd: uid=1000 chatty comm=DhcpClient, expire 4 lines
  591  4398 I logd: uid=1000 chatty comm=Thread-222, expire 4 lines
  226   580 D CommandListener: Setting iface cfg

Change-Id: I5ab24bc7bf5d2690bae7e789831b07f23ff8bcc6
2015-06-01 13:04:09 -07:00
Mark Salyzyn
62d6a2a921 logd: test modernization
Bug: 19603976
Change-Id: Ie920c128e7e6a436fea7a96c7d68bc39e13a2ad4
2015-05-26 12:24:51 -07:00
Mark Salyzyn
c5bf3b8304 logd: whitelist should not preserve expire messages
Change-Id: I56275c73191b96aa21e7b4049d401e1f44211f9b
2015-05-21 13:01:56 -07:00
Mark Salyzyn
833a9b1e38 logd: worst-UID only to preserve a day
Do not invoke worst-UID pruning in the face of other
UIDs logs that are more than a day old, switch to
pruning oldest only.

Change-Id: Icf988b8d5458400a660d0f8e9d2df3f9d9a4c2d9
2015-05-20 09:47:54 -07:00
Mark Salyzyn
d1371a8b8b Merge "logd: Add TID statistics" 2015-05-13 16:17:37 +00:00
Mark Salyzyn
7718778793 logd: Cleanup
- Android Coding Standard for Constructors
- Side effects NONE

Change-Id: I2cda9dd73f3ac3ab58f394015cb810820093d47b
2015-05-12 15:51:46 -07:00
Mark Salyzyn
ae4d928d81 logd: Add klogd
- Add a klogd to collect the kernel logs and place them into a
  new kernel log buffer
- Parse priority, tag and message from the kernel log messages.
- Turn off pruning for worst UID for the kernel log buffer
- Sniff for 'PM: suspend exit', 'PM: suspend enter' and
  'Suspended for' messages and correct the internal definition
  time correction against monotonic dynamically.
- Discern if we have monotonic or real time (delineation 1980) in
  audit messages.
- perform appropriate math to correct the timestamp to be real time
- filter out any external sources of kernel logging

Change-Id: I8d4c7c5ac19f1f3218079ee3a05a50e2ca55f60d
2015-05-12 15:51:46 -07:00
Mark Salyzyn
17ed6797df logd: Add TID statistics
Bug: 19608965
Change-Id: Ifbf0b00c48ef12b5970b9f9f217bd1dd8f587f2c
2015-05-12 12:57:25 -07:00