llkd: add bit_wait_io to stack monitoring

Monitoring bit_wait_io lets llkd discover when I/O is starved.

Add the ability to search for " <symbol>.cfi+0x".

Cleaned up README.md to reflect current defaults.

Test: none
Bug: 113648929
Change-Id: I990a54f99de536406fd752a490e60f962380d71a
Mark Salyzyn 2018-10-18 14:39:27 -07:00
parent 312339167e
commit bb1256a728
3 changed files with 14 additions and 9 deletions

@@ -44,7 +44,8 @@ then have a confirmed live-lock condition and need to panic. There is no
 ABA detection since forward scheduling progress is allowed, thus the condition
 for the symbols are:
-- Check is looking for " " + __symbol__+ "0x" in /proc/<pid>/stack.
+- Check is looking for " __symbol__+0x" or " __symbol__.cfi+0x" in
+/proc/__pid__/stack.
 - The __symbol__ should be rare and short lived enough that on a typical
 system the function is seen at most only once in a sample over the timeout
 period of ro.llk.stack.timeout_ms, samples occur every ro.llk.check_ms. This
@@ -121,14 +122,14 @@ default ro.llk.timeout_ms, Z maximum timelimit.
 #### ro.llk.stack.timeout_ms
 default ro.llk.timeout_ms,
 checking for persistent stack symbols maximum timelimit.
-Only active on userdebug and eng builds.
+Only active on userdebug or eng builds.
 #### ro.llk.check_ms
 default 2 minutes samples of threads for D or Z.
 #### ro.llk.stack
-default cma_alloc,__get_user_pages, comma separated list of kernel symbols.
-The string "*false*" is the equivalent to an *empty* list.
+default cma_alloc,__get_user_pages,bit_wait_io comma separated list of kernel
+symbols. The string "*false*" is the equivalent to an *empty* list.
 Look for kernel stack symbols that if ever persistently present can
 indicate a subsystem is locked up.
 Beware, check does not on purpose do forward scheduling ABA except by polling
@@ -136,11 +137,14 @@ every ro.llk_check_ms over the period ro.llk.stack.timeout_ms, so stack symbol
 should be exceptionally rare and fleeting.
 One must be convinced that it is virtually *impossible* for symbol to show up
 persistently in all samples of the stack.
-Only active on userdebug and eng builds.
+Again, looks for a match for either " **symbol**+0x" or " **symbol**.cfi+0x"
+in stack expansion.
+Only available on userdebug or eng builds, limited privileges due to security
+concerns on user builds prevents this checking.
 #### ro.llk.blacklist.process
 default 0,1,2 (kernel, init and [kthreadd]) plus process names
-init,[kthreadd],[khungtaskd],lmkd,lmkd.llkd,llkd,watchdogd,
+init,[kthreadd],[khungtaskd],lmkd,llkd,watchdogd,
 [watchdogd],[watchdogd/0],...,[watchdogd/***get_nprocs**-1*].
 The string "*false*" is the equivalent to an *empty* list.
 Do not watch these processes. A process can be comm, cmdline or pid reference.
@@ -160,7 +164,7 @@ The string "*false*" is the equivalent to an *empty* list.
 Do not watch processes that match this uid.
 #### ro.llk.blacklist.process.stack
-default process names init,lmkd,lmkd.llkd,llkd,keystore,logd.
+default process names init,lmkd.llkd,llkd,keystore,ueventd,apexd,logd.
 The string "*false*" is the equivalent to an *empty* list.
 This subset of processes are not monitored for live lock stack signatures.
 Also prevents the sepolicy violation associated with processes that block

@@ -48,7 +48,7 @@ unsigned llkCheckMilliseconds(void);
 /* LLK_CHECK_MS_DEFAULT = actual timeout_ms / LLK_CHECKS_PER_TIMEOUT_DEFAULT */
 #define LLK_CHECKS_PER_TIMEOUT_DEFAULT 5
 #define LLK_CHECK_STACK_PROPERTY "ro.llk.stack"
-#define LLK_CHECK_STACK_DEFAULT "cma_alloc,__get_user_pages"
+#define LLK_CHECK_STACK_DEFAULT "cma_alloc,__get_user_pages,bit_wait_io"
 #define LLK_BLACKLIST_PROCESS_PROPERTY "ro.llk.blacklist.process"
 #define LLK_BLACKLIST_PROCESS_DEFAULT \
     "0,1,2,init,[kthreadd],[khungtaskd],lmkd,llkd,watchdogd,[watchdogd],[watchdogd/0]"

@@ -726,7 +726,8 @@ bool llkCheckStack(proc* procp, const std::string& piddir) {
     char match = -1;
     for (const auto& stack : llkCheckStackSymbols) {
         if (++idx < 0) break;
-        if (kernel_stack.find(" "s + stack + "+0x") != std::string::npos) {
+        if ((kernel_stack.find(" "s + stack + "+0x") != std::string::npos) ||
+            (kernel_stack.find(" "s + stack + ".cfi+0x") != std::string::npos)) {
             match = idx;
             break;
         }
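
The matching rule in the hunk above can be tried outside llkd with a small standalone sketch. It is not the llkd implementation: `splitSymbols` and `findStackSymbol` are hypothetical helper names introduced here for illustration, and the sample stack lines used with it are assumed to follow the usual /proc/__pid__/stack layout of `[<0>] symbol+0xoffset/0xsize`. The sketch splits a comma separated list such as the ro.llk.stack default, treats the string "false" as an empty list as the README describes, and reports the first symbol found as either " symbol+0x" or " symbol.cfi+0x".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of llkd): split a comma separated property
// value into symbols. The literal string "false" yields an empty list, as the
// README describes for ro.llk.stack.
std::vector<std::string> splitSymbols(const std::string& value) {
    std::vector<std::string> out;
    if (value == "false") return out;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(',', start);
        if (end == std::string::npos) end = value.size();
        // Skip empty entries such as a trailing comma.
        if (end > start) out.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

// Hypothetical helper (not part of llkd): return the index of the first symbol
// that appears in kernel_stack as " symbol+0x" or " symbol.cfi+0x", or -1 if
// none match. The leading space keeps longer symbols that merely end in the
// same name from matching.
int findStackSymbol(const std::string& kernel_stack,
                    const std::vector<std::string>& symbols) {
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        const std::string plain = " " + symbols[i] + "+0x";
        const std::string cfi = " " + symbols[i] + ".cfi+0x";
        if (kernel_stack.find(plain) != std::string::npos ||
            kernel_stack.find(cfi) != std::string::npos) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the new default list, a stack line containing ` bit_wait_io+0x` or ` bit_wait_io.cfi+0x` matches the third entry, while a line for a longer symbol such as `out_of_line_bit_wait_io` does not match, because the character before `bit_wait_io` is not a space.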