llkd: Add cma_alloc stack symbol checking

Add ro.llk.stack to list a set of kernel symbols that should rarely
appear on a stack but that, if present across multiple checks, indicate
a livelock condition.  At ro.llk.stack.timeout_ms the process is sent a
kill; if it remains, the kernel is panicked.

There is no ABA detection in the paths, so a suitable symbol must be
one that is rarely caught on the stack in any single instantaneous
sample.  If a livelock occurs in the path of the symbol, it is possible
that more than one path is stuck in that state, but the best candidate
symbols are found underneath a lock, so that only one process is the
culprit and the best target.  Some processes may give a false
appearance of persistence; if so, the symbol is not a candidate for
checking.
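
The persistence rule described above can be sketched as a small state
machine.  This is a simplified model and not llkd's implementation: the
names StackPersistence, Sample and Action are hypothetical, the timeout
is expressed as a count of consecutive checks rather than milliseconds,
and the panic is modeled as firing on the first check after the kill in
which the symbol is still present.  Like the text says, nothing here
detects a thread that left the symbol and re-entered it between samples.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical outcome of one periodic check for one thread.
enum class Action { kNone, kKill, kPanic };

// Simplified model (an assumption, not llkd code): counts consecutive
// samples in which a watched symbol was seen on a thread's kernel stack.
class StackPersistence {
  public:
    explicit StackPersistence(unsigned checks_per_timeout)
        : checks_per_timeout_(checks_per_timeout) {
        assert(checks_per_timeout > 0);
    }

    // Call once per check interval with the sampling result for `tid`.
    Action Sample(int tid, bool symbol_on_stack) {
        if (!symbol_on_stack) {
            state_.erase(tid);  // symbol gone: forget this thread
            return Action::kNone;
        }
        State& s = state_[tid];
        if (s.killed) return Action::kPanic;  // survived the kill, still stuck
        if (++s.hits < checks_per_timeout_) return Action::kNone;
        s.killed = true;  // persisted for a full timeout: send the kill
        return Action::kKill;
    }

  private:
    struct State {
        unsigned hits = 0;
        bool killed = false;
    };
    unsigned checks_per_timeout_;
    std::unordered_map<int, State> state_;
};
```

With five checks per timeout, four consecutive hits return kNone, the
fifth returns kKill, and a sixth hit on the same thread returns kPanic.
A single miss in between resets the count, which is why a symbol that is
commonly seen in healthy operation makes a poor candidate.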

Add cma_alloc to the default list.  It is not behind a lock, so
multiple references can happen.  The hope is that the first one to spin
or wait gets the kill, but it is possible that both will.  It is unknown
at this time whether this will escalate to a kernel panic.  It is also
suspected that an RT task could cause this by starving the background
worker, and llkd could suffer a similar fate because it runs under the
SCHED_BATCH policy.

Test: compile
Bug: 33808187
Bug: 111910505
Bug: 80502612
Change-Id: I49c9f0646d627869144c5c1ca32272515ed60f7b
Mark Salyzyn 2018-08-07 10:16:47 -07:00
parent a9afe5933d
commit 00b2ce7005
2 changed files with 2 additions and 2 deletions


@@ -127,7 +127,7 @@ Only active on userdebug and eng builds.
default 2 minutes samples of threads for D or Z.
#### ro.llk.stack
-default __get_user_pages, comma separated list of kernel symbols.
+default cma_alloc,__get_user_pages, comma separated list of kernel symbols.
The string "*false*" is the equivalent to an *empty* list.
Look for kernel stack symbols that if ever persistently present can
indicate a subsystem is locked up.
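
As an illustration of the property format documented above, here is a
minimal sketch of parsing the list and testing a stack dump against it.
ParseStackSymbols and StackHasSymbol are hypothetical helpers, not llkd
functions, and the " symbol+0x" frame format is an assumption about how
a kernel stack dump such as /proc/<tid>/stack is written.

```cpp
#include <string>
#include <unordered_set>

// Split a comma separated property value into a set of symbols.
// The literal string "false" is treated as an empty list.
std::unordered_set<std::string> ParseStackSymbols(const std::string& value) {
    std::unordered_set<std::string> symbols;
    if (value == "false") return symbols;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(',', start);
        if (end == std::string::npos) end = value.size();
        if (end > start) symbols.insert(value.substr(start, end - start));
        start = end + 1;  // skip past the comma (or past the end to stop)
    }
    return symbols;
}

// Assumption: frames in `stack` look like " symbol+0xoffset/0xsize".
// Matching on " symbol+0x" avoids hits on longer names that merely
// contain the symbol as a substring.
bool StackHasSymbol(const std::string& stack,
                    const std::unordered_set<std::string>& symbols) {
    for (const auto& sym : symbols) {
        if (stack.find(" " + sym + "+0x") != std::string::npos) return true;
    }
    return false;
}
```

Parsing "cma_alloc,__get_user_pages" yields two symbols, while "false"
and the empty string both yield none, in which case no stack would ever
match.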


@@ -48,7 +48,7 @@ unsigned llkCheckMilliseconds(void);
/* LLK_CHECK_MS_DEFAULT = actual timeout_ms / LLK_CHECKS_PER_TIMEOUT_DEFAULT */
#define LLK_CHECKS_PER_TIMEOUT_DEFAULT 5
#define LLK_CHECK_STACK_PROPERTY "ro.llk.stack"
-#define LLK_CHECK_STACK_DEFAULT "__get_user_pages"
+#define LLK_CHECK_STACK_DEFAULT "cma_alloc,__get_user_pages"
#define LLK_BLACKLIST_PROCESS_PROPERTY "ro.llk.blacklist.process"
#define LLK_BLACKLIST_PROCESS_DEFAULT \
"0,1,2,init,[kthreadd],[khungtaskd],lmkd,lmkd.llkd,llkd,watchdogd,[watchdogd],[watchdogd/0]"