Mark Salyzyn f089e1403b llkd: add live-lock daemon
Introduce a standalone live-lock daemon (llkd), to catch kernel
or native user space deadlocks and take mitigating actions.  It will
also configure [khungtaskd] to fortify the actions.

If a thread is in D or Z state with no forward progress for longer
than ro.llk.timeout_ms, or ro.llk.[D|Z].timeout_ms, kill the process
or parent process respectively.  If another scan shows that the same
process continues to exist, the live-lock condition is confirmed and
the kernel must panic.  Panic the kernel in a manner that provides the
greatest bug reporting detail as to the condition.  Add an alarm self
watchdog, set to double the expected time to flow through the
mainloop, in case llkd itself ever gets locked up.  Sampling is every
ro.llk.check_ms.
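
The escalation described above (stall, kill, then panic if the process
survives) can be sketched as a small per-thread state machine.  This is
an illustration only, not llkd's code: the Action, Sample and Detector
names are hypothetical, a single timeout stands in for the separate D
and Z timeouts, and "forward progress" is assumed to be decided by the
caller.

```cpp
#include <chrono>
#include <map>

using std::chrono::milliseconds;

// Hypothetical action names; not taken from llkd's sources.
enum class Action { kNone, kKillProcess, kKillParent, kPanic };

// One observation of a thread taken during a scan (illustrative fields).
struct Sample {
    int tid;
    char state;       // 'D', 'Z', 'R', 'S', ...
    bool progressed;  // true if the thread moved forward since the last scan
};

class Detector {
  public:
    Detector(milliseconds timeout, milliseconds check)
        : timeout_(timeout), check_(check) {}

    // Feed one sample per thread per scan; returns the action to take.
    Action Observe(const Sample& s) {
        const bool stuck = (s.state == 'D' || s.state == 'Z') && !s.progressed;
        if (!stuck) {
            threads_.erase(s.tid);  // progress or a healthy state resets the clock
            return Action::kNone;
        }
        std::map<int, Tracked>::iterator it = threads_.find(s.tid);
        if (it == threads_.end()) {
            threads_[s.tid] = Tracked();  // first sighting starts the clock at zero
            return Action::kNone;
        }
        Tracked& t = it->second;
        if (t.killed) return Action::kPanic;  // survived the kill: confirmed live-lock
        t.stalled += check_;
        if (t.stalled <= timeout_) return Action::kNone;
        t.killed = true;
        // D state targets the process itself, Z state targets its parent.
        return s.state == 'D' ? Action::kKillProcess : Action::kKillParent;
    }

  private:
    struct Tracked {
        milliseconds stalled{0};  // time spent stuck without forward progress
        bool killed = false;      // a kill was already attempted
    };
    milliseconds timeout_;
    milliseconds check_;
    std::map<int, Tracked> threads_;
};
```

A thread that disappears after the kill simply stops being reported, so
it never reaches the panic branch in this sketch.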

By default, llkd will not monitor init, or [kthreadd] and all that
[kthreadd] spawns.  This reduces the effectiveness of llkd by limiting
its coverage.  If, in the future, there is value in covering kthreadd
spawned threads, the requirement will be to code drivers so that they
do not remain in a persistent 'D' state, or so that they have
mechanisms to recover the thread should it be killed externally.  The
blacklists can then be adjusted accordingly once these conditions are
met.
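
As a rough illustration of how a comma separated blacklist could be
matched against a process, the sketch below splits a property value
and checks the pid, /comm name and /cmdline name against it.  The
SplitList and IsBlacklisted helpers are hypothetical and are not taken
from llkd; whitespace trimming and uid handling are left out.

```cpp
#include <set>
#include <sstream>
#include <string>

// Split a comma separated property value into a set, skipping empty entries.
std::set<std::string> SplitList(const std::string& value) {
    std::set<std::string> out;
    std::istringstream in(value);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (!item.empty()) out.insert(item);
    }
    return out;
}

// True if the process matches the list by pid, /comm name or /cmdline name.
// Kernel thread names are expected in their bracketed form, e.g. "[kthreadd]".
bool IsBlacklisted(const std::set<std::string>& list, int pid,
                   const std::string& comm, const std::string& cmdline) {
    return list.count(std::to_string(pid)) > 0 || list.count(comm) > 0 ||
           list.count(cmdline) > 0;
}
```

With a list built from "0,1,2,init,[kthreadd]", pid 1 and any process
whose /comm is "init" would both be skipped.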

An accompanying gTest set has been added.  It sets up a persistent
D or Z process, with and without forward progress, but not in a
live-lock state because that would require a buggy kernel, or a module
or kernel modification to stimulate.
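
One way a test or scanner could confirm that a process is in D or Z
state is to read the state field from a /proc/<pid>/stat line.  The
sketch below only parses such a line.  Whether llkd or its tests do it
this way is an assumption, and the StateFromStat and IsStuckState
helpers are hypothetical.

```cpp
#include <string>

// Return the one-character state from a /proc/<pid>/stat line,
// or '\0' if the line is malformed.
char StateFromStat(const std::string& stat) {
    // The line starts "pid (comm) state ...".  comm may itself contain
    // spaces or ')', so locate the last ')' rather than the first.
    const std::string::size_type close = stat.rfind(')');
    if (close == std::string::npos || close + 2 >= stat.size()) return '\0';
    if (stat[close + 1] != ' ') return '\0';
    return stat[close + 2];
}

// D is uninterruptible sleep, Z is zombie.
bool IsStuckState(char state) {
    return state == 'D' || state == 'Z';
}
```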

Android properties llkd responds to (*_ms parameters are in milliseconds):
- ro.config.low_ram default false, if true do not sysrq t (dump
  all threads).
- ro.llk.enable default false, allow live-lock daemon to be enabled.
- ro.khungtask.enable default false, allow [khungtaskd] to be enabled.
- ro.llk.mlockall default true, allow mlock'd live-lock daemon.
- ro.khungtask.timeout default 12 minutes.
- ro.llk.timeout_ms default 10 minutes, D or Z maximum timelimit;
  double this value sets the alarm watchdog for llkd.
- ro.llk.D.timeout_ms default ro.llk.timeout_ms, D maximum timelimit.
- ro.llk.Z.timeout_ms default ro.llk.timeout_ms, Z maximum timelimit.
- ro.llk.check_ms default 2 minutes (ro.llk.timeout_ms / 5), sampling
  interval for threads in D or Z state.
- ro.llk.blacklist.process default 0,1,2 (kernel, init and
  [kthreadd]), and process names (/comm or /cmdline) init,[kthreadd],
  lmkd,lmkd.llkd,llkd,[khungtaskd],watchdogd,[watchdogd],
  [watchdogd/0] ...
- ro.llk.blacklist.parent default 0,2 (kernel and [kthreadd]) and
  "[kthreadd]".  A comma separated list of process ids, /comm names
  or /cmdline names.
- ro.llk.blacklist.uid default <empty>, comma separated list of
  uid numbers or names from getpwuid/getpwnam.
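
The timing defaults listed above depend on each other, which the
following sketch makes explicit: the D and Z limits fall back to
ro.llk.timeout_ms, the sampling interval is one fifth of it, and the
alarm watchdog is double it.  The Timing struct, DeriveTiming function
and kUnset sentinel are hypothetical names, and reading the actual
properties is left out.

```cpp
#include <chrono>

using std::chrono::milliseconds;

struct Timing {
    milliseconds timeout;    // ro.llk.timeout_ms
    milliseconds d_timeout;  // ro.llk.D.timeout_ms
    milliseconds z_timeout;  // ro.llk.Z.timeout_ms
    milliseconds check;      // ro.llk.check_ms
    milliseconds watchdog;   // alarm watchdog for llkd itself
};

// Sentinel meaning "property not set" (an assumption of this sketch).
constexpr milliseconds kUnset{-1};

// Fill in every value that was not set from ro.llk.timeout_ms.
Timing DeriveTiming(milliseconds timeout = std::chrono::minutes(10),
                    milliseconds d = kUnset, milliseconds z = kUnset,
                    milliseconds check = kUnset) {
    Timing t;
    t.timeout = timeout;
    t.d_timeout = (d == kUnset) ? timeout : d;
    t.z_timeout = (z == kUnset) ? timeout : z;
    t.check = (check == kUnset) ? timeout / 5 : check;
    t.watchdog = timeout * 2;
    return t;
}
```

With no overrides this gives a 10 minute limit for both states, a
2 minute sampling interval and a 20 minute watchdog.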

Test: llkd_unit_test
Bug: 33808187
Bug: 72838192
Change-Id: I32e8aa78aef10834e093265d0f3ed5b4199807c6
2018-04-18 14:01:56 -07:00
adb Remove out of date comment. 2018-04-16 15:22:33 -07:00
adf Merge "Add OWNERS." 2017-12-07 23:21:26 +00:00
base Add SIZEOF_MEMBER. 2018-04-11 12:29:50 -07:00
bootstat Make bootstat container-friendly 2018-04-16 11:17:42 -07:00
cpio Possible null pointer miss on realloc 2017-03-23 22:41:14 +01:00
debuggerd debuggerd: remove maximum abort message length. 2018-04-13 17:34:20 -07:00
demangle bpfmt. 2018-02-16 17:58:14 -08:00
diagnose_usb Copy adb/OWNERS to diagnose_usb/OWNERS. 2018-03-05 15:47:43 -08:00
fastboot Remove header version check for command "fastboot boot boot.img" 2018-04-16 16:55:58 -07:00
fs_mgr make_f2fs: specify sector size for target image size and missing options 2018-04-04 09:24:24 -07:00
gatekeeperd resolve merge conflicts of 0dd4b6aa3 to stage-aosp-master 2017-09-15 16:50:34 +09:00
healthd Remove obsolete BRILLO variable 2018-03-10 15:41:37 -08:00
include Move android_filesystem_config.h => fs_config.h 2017-08-03 17:20:27 +00:00
init Merge "Add /mnt/vendor rw mount point for vendor partitions." 2018-04-18 19:32:32 +00:00
libappfuse bpfmt. 2018-02-16 17:58:14 -08:00
libasyncio Merge "Make libasyncio headers usable from C" 2018-03-06 19:35:21 +00:00
libbacktrace Add a MemoryOfflineBuffer object. 2018-04-03 18:37:52 -07:00
libbinderwrapper Make libbinderwrapper available in /vendor partition 2018-04-06 08:41:21 +09:00
libcrypto_utils Mark the modules as VNDK in Android.bp 2017-09-14 08:35:16 +00:00
libcutils llkd: add live-lock daemon 2018-04-18 14:01:56 -07:00
libdiskconfig Rename target.linux[_x86[_64]] to target.linux_glibc[_x86[_64]] 2017-10-02 10:44:29 -07:00
libgrallocusage Use -Werror in system/core 2017-11-01 11:32:55 -07:00
libion libion: cleanup logging 2018-03-07 10:56:06 -08:00
libkeyutils Add libkeyutils. 2017-05-10 10:40:11 -07:00
liblog Add missing @addtogroup tags. 2018-04-13 14:49:41 -07:00
libmemtrack Add OWNERS. 2017-12-07 13:30:03 -08:00
libmemunreachable Use ld when lld fails 2018-04-16 16:00:15 -07:00
libmetricslogger Add OWNERS. 2017-12-07 13:30:03 -08:00
libnativebridge bpfmt. 2018-02-16 17:58:14 -08:00
libnativeloader bpfmt. 2018-02-16 17:58:14 -08:00
libnetutils Add OWNERS. 2017-12-07 13:30:03 -08:00
libpackagelistparser bpfmt. 2018-02-16 17:58:14 -08:00
libpixelflinger MIPS[64]: codeflinger: Fix build due to unused variable warnings 2017-11-06 16:38:49 +01:00
libprocessgroup libprocessgroup: remove legacy C string handling and build for host 2018-02-27 14:12:19 -08:00
libprocinfo bpfmt. 2018-02-16 17:58:14 -08:00
libqtaguid Redirect qtaguid native call to netd fwmark client 2017-11-09 18:02:22 -08:00
libsparse Merge "<stdbool.h> not necessary in C++." 2017-10-17 19:26:53 +00:00
libsuspend Add force_suspend function 2018-01-19 12:30:39 -08:00
libsync Add missing @addtogroup tags. 2018-04-13 14:49:41 -07:00
libsystem bpfmt. 2018-02-16 17:58:14 -08:00
libsysutils Include iface index in the netlink event 2018-03-07 11:39:52 +09:00
libunwindstack Add a MemoryOfflineBuffer object. 2018-04-03 18:37:52 -07:00
libusbhost Remove urb request size maximum. 2018-02-07 16:12:14 -08:00
libutils Remove more semicolons at the end of namespaces 2018-04-11 23:14:13 -07:00
libvndksupport bpfmt. 2018-02-16 17:58:14 -08:00
libziparchive Remove empty zip warning on host builds 2018-03-01 21:33:49 +00:00
llkd llkd: add live-lock daemon 2018-04-18 14:01:56 -07:00
lmkd lmkd: limit capability set to minimum 2018-04-16 14:51:56 -07:00
logcat bpfmt. 2018-02-16 17:58:14 -08:00
logd logd: identical check access message data out of range 2018-03-13 12:16:39 -07:00
logwrapper Build /vendor/bin/logwrapper too. 2018-04-11 08:28:37 -07:00
mkbootimg Add fastboot --os-version and --os-patch-level. 2018-04-09 18:37:39 +00:00
property_service Verify the SELabels used in property_contexts 2018-03-26 09:22:55 -07:00
qemu_pipe Add OWNERS. 2017-12-07 13:30:03 -08:00
reboot reboot: only pause indefinitely for non-shutdown operations 2017-09-29 16:29:52 +00:00
rootdir Merge "Add /mnt/vendor rw mount point for vendor partitions." 2018-04-18 19:32:32 +00:00
run-as run-as: Keep supplementary groups. 2017-09-29 15:34:23 -04:00
sdcard Remove FUSE logic; it's only a sdcardfs wrapper. 2018-01-12 15:41:55 -07:00
shell_and_utilities Build /vendor/bin/logwrapper too. 2018-04-11 08:28:37 -07:00
storaged Merge "storaged: lower capabilities in init" 2018-04-12 15:55:26 +00:00
toolbox Build toolbox with _FILE_OFFSET_BITS=64. 2018-01-22 16:15:55 -08:00
trusty bpfmt. 2018-02-16 17:58:14 -08:00
usbd bpfmt. 2018-02-16 17:58:14 -08:00
.clang-format Add a 2 width option of clang format. 2017-03-10 13:01:39 -08:00
.clang-format-2 Only allow short functions in class definitions. 2017-03-28 12:31:37 -07:00
.clang-format-4 Only allow short functions in class definitions. 2017-03-28 12:31:37 -07:00
.gitignore Ignore adb/*.pyc files 2015-08-11 12:59:58 -07:00
Android.bp Export android_filesystem_config.h as a filegroup 2017-01-17 18:20:28 -08:00
Android.mk Remove the simulator target from all makefiles. 2011-07-11 22:12:32 -07:00
CleanSpec.mk Add VNDK version for namespace configuration files 2017-12-13 10:31:04 +09:00
MODULE_LICENSE_APACHE2 auto import from //depot/cupcake/@135843 2013-07-30 13:56:49 -07:00
NOTICE Fix omission in NOTICE file. 2013-07-30 13:56:55 -07:00
OWNERS Add ek and lorenzo to OWNERS for system/core netlink code. 2018-03-05 19:18:02 +09:00
platform_tools_tool_version.mk Fix warning on the build servers 2017-05-25 12:35:40 -07:00
PREUPLOAD.cfg Add a PREUPLOAD.cfg file to run git-clang-format on every commit 2017-03-08 16:51:26 +08:00