Commit graph

11 commits

Author SHA1 Message Date
Suren Baghdasaryan
2b92541e7f llkd: Disable in userdebug builds by default
While llkd helps in discovering issues in apps which leave zombies, it
creates issues for dogfooders when apps are killed. Disable it by
default.

Bug: 202411543
Test: boot and check llkd not running
Test: `setprop ro.llk.enable true` enables llkd
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: If93bf9e981eaa3921a9da5f3160db26c4fe17e66
2021-11-04 16:21:46 -07:00
Bob Badour
d69ad69a93 [LSC] Add LOCAL_LICENSE_KINDS to system/core
Added SPDX-license-identifier-Apache-2.0 to:
  bootstat/Android.bp
  cli-test/Android.bp
  code_coverage/Android.bp
  cpio/Android.bp
  debuggerd/crasher/Android.bp
  debuggerd/proto/Android.bp
  diagnose_usb/Android.bp
  fs_mgr/libdm/Android.bp
  fs_mgr/libfiemap/Android.bp
  fs_mgr/liblp/Android.bp
  fs_mgr/libsnapshot/Android.bp
  fs_mgr/libstorage_literals/Android.bp
  fs_mgr/libvbmeta/Android.bp
  fs_mgr/tests/Android.bp
  fs_mgr/tools/Android.bp
  gatekeeperd/Android.bp
  healthd/Android.bp
  healthd/testdata/Android.bp
  init/Android.bp
  init/Android.mk
  init/sysprop/Android.bp
  init/test_kill_services/Android.bp
  init/test_service/Android.bp
  libappfuse/Android.bp
  libasyncio/Android.bp
  libbinderwrapper/Android.bp
  libcrypto_utils/Android.bp
  libcrypto_utils/tests/Android.bp
  libdiskconfig/Android.bp
  libgrallocusage/Android.bp
  libkeyutils/mini_keyctl/Android.bp
  libmodprobe/Android.bp
  libnetutils/Android.bp
  libpackagelistparser/Android.bp
  libprocessgroup/Android.bp
  libprocessgroup/cgrouprc/Android.bp
  libprocessgroup/cgrouprc_format/Android.bp
  libprocessgroup/profiles/Android.bp
  libprocessgroup/setup/Android.bp
  libqtaguid/Android.bp
  libsparse/Android.bp
  libstats/push_compat/Android.bp
  libsuspend/Android.bp
  libsync/Android.bp
  libsystem/Android.bp
  libsysutils/Android.bp
  libusbhost/Android.bp
  libutils/Android.bp
  libvndksupport/Android.bp
  libvndksupport/tests/Android.bp
  llkd/Android.bp
  llkd/tests/Android.bp
  property_service/libpropertyinfoparser/Android.bp
  property_service/libpropertyinfoserializer/Android.bp
  property_service/property_info_checker/Android.bp
  qemu_pipe/Android.bp
  reboot/Android.bp
  rootdir/Android.bp
  rootdir/Android.mk
  rootdir/avb/Android.bp
  rootdir/avb/Android.mk
  run-as/Android.bp
  sdcard/Android.bp
  set-verity-state/Android.bp
  shell_and_utilities/Android.bp
  storaged/Android.bp
  toolbox/Android.bp
  trusty/apploader/Android.bp
  trusty/confirmationui/Android.bp
  trusty/confirmationui/fuzz/Android.bp
  trusty/coverage/Android.bp
  trusty/fuzz/Android.bp
  trusty/fuzz/test/Android.bp
  trusty/gatekeeper/Android.bp
  trusty/gatekeeper/fuzz/Android.bp
  trusty/keymaster/Android.bp
  trusty/keymaster/fuzz/Android.bp
  trusty/libtrusty/Android.bp
  trusty/libtrusty/tipc-test/Android.bp
  trusty/secure_dpu/Android.bp
  trusty/storage/interface/Android.bp
  trusty/storage/lib/Android.bp
  trusty/storage/proxy/Android.bp
  trusty/storage/tests/Android.bp
  trusty/utils/spiproxyd/Android.bp
  trusty/utils/trusty-ut-ctrl/Android.bp
  usbd/Android.bp
  watchdogd/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-BSD to:
  debuggerd/Android.bp
  fastboot/Android.bp
  libkeyutils/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-BSD
    SPDX-license-identifier-MIT
to:
  libcutils/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-MIT
to:
  fs_mgr/Android.bp
  fs_mgr/libfs_avb/Android.bp
  trusty/Android.bp
  trusty/utils/rpmb_dev/Android.bp

Added SPDX-license-identifier-BSD
to:
  fastboot/fuzzy_fastboot/Android.bp

Bug: 68860345
Bug: 151177513
Bug: 151953481

Test: m all

Exempt-From-Owner-Approval: janitorial work
Change-Id: Id740a7d2884556081fdb68876584b25eb95e1bef
2021-02-19 12:59:05 -08:00
Elliott Hughes
c3a206ccda Revert "[LSC] Add LOCAL_LICENSE_KINDS to system/core"
This reverts commit 187b7d1950.

Reason for revert: system/core is multiple projects, not one.

Change-Id: I790ea41741f8cd9b8b6db2f59a49e71fb0958fd6
2021-02-16 20:01:20 +00:00
Bob Badour
187b7d1950 [LSC] Add LOCAL_LICENSE_KINDS to system/core
Added SPDX-license-identifier-Apache-2.0 to:
  bootstat/Android.bp
  cli-test/Android.bp
  code_coverage/Android.bp
  cpio/Android.bp
  debuggerd/crasher/Android.bp
  debuggerd/proto/Android.bp
  diagnose_usb/Android.bp
  fs_mgr/libdm/Android.bp
  fs_mgr/libfiemap/Android.bp
  fs_mgr/liblp/Android.bp
  fs_mgr/libsnapshot/Android.bp
  fs_mgr/libstorage_literals/Android.bp
  fs_mgr/libvbmeta/Android.bp
  fs_mgr/tests/Android.bp
  fs_mgr/tools/Android.bp
  gatekeeperd/Android.bp
  healthd/Android.bp
  healthd/testdata/Android.bp
  init/Android.bp
  init/Android.mk
  init/sysprop/Android.bp
  init/test_kill_services/Android.bp
  init/test_service/Android.bp
  libappfuse/Android.bp
  libasyncio/Android.bp
  libbinderwrapper/Android.bp
  libcrypto_utils/Android.bp
  libcrypto_utils/tests/Android.bp
  libdiskconfig/Android.bp
  libgrallocusage/Android.bp
  libkeyutils/mini_keyctl/Android.bp
  libmodprobe/Android.bp
  libnetutils/Android.bp
  libpackagelistparser/Android.bp
  libprocessgroup/Android.bp
  libprocessgroup/cgrouprc/Android.bp
  libprocessgroup/cgrouprc_format/Android.bp
  libprocessgroup/profiles/Android.bp
  libprocessgroup/setup/Android.bp
  libqtaguid/Android.bp
  libsparse/Android.bp
  libstats/push_compat/Android.bp
  libsuspend/Android.bp
  libsync/Android.bp
  libsystem/Android.bp
  libsysutils/Android.bp
  libusbhost/Android.bp
  libutils/Android.bp
  libvndksupport/Android.bp
  libvndksupport/tests/Android.bp
  llkd/Android.bp
  llkd/tests/Android.bp
  property_service/libpropertyinfoparser/Android.bp
  property_service/libpropertyinfoserializer/Android.bp
  property_service/property_info_checker/Android.bp
  qemu_pipe/Android.bp
  reboot/Android.bp
  rootdir/Android.bp
  rootdir/Android.mk
  rootdir/avb/Android.bp
  rootdir/avb/Android.mk
  run-as/Android.bp
  sdcard/Android.bp
  set-verity-state/Android.bp
  shell_and_utilities/Android.bp
  storaged/Android.bp
  toolbox/Android.bp
  trusty/apploader/Android.bp
  trusty/confirmationui/Android.bp
  trusty/confirmationui/fuzz/Android.bp
  trusty/coverage/Android.bp
  trusty/fuzz/Android.bp
  trusty/fuzz/test/Android.bp
  trusty/gatekeeper/Android.bp
  trusty/gatekeeper/fuzz/Android.bp
  trusty/keymaster/Android.bp
  trusty/keymaster/fuzz/Android.bp
  trusty/libtrusty/Android.bp
  trusty/libtrusty/tipc-test/Android.bp
  trusty/secure_dpu/Android.bp
  trusty/storage/interface/Android.bp
  trusty/storage/lib/Android.bp
  trusty/storage/proxy/Android.bp
  trusty/storage/tests/Android.bp
  trusty/utils/spiproxyd/Android.bp
  trusty/utils/trusty-ut-ctrl/Android.bp
  usbd/Android.bp
  watchdogd/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-BSD to:
  debuggerd/Android.bp
  fastboot/Android.bp
  libkeyutils/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-BSD
    SPDX-license-identifier-MIT
to:
  Android.bp
  libcutils/Android.bp

Added SPDX-license-identifier-Apache-2.0 SPDX-license-identifier-MIT
to:
  fs_mgr/Android.bp
  fs_mgr/libfs_avb/Android.bp
  trusty/utils/rpmb_dev/Android.bp

Added SPDX-license-identifier-BSD
to:
  fastboot/fuzzy_fastboot/Android.bp

Bug: 68860345
Bug: 151177513
Bug: 151953481

Test: m all

Exempt-From-Owner-Approval: janitorial work
Change-Id: I5bd81adb5cdcf2b4dd4141b204eb430ff526af8f
2021-02-16 04:10:03 -08:00
Mark Salyzyn
92f7bbfbe5 llkd: test: llkd.sleep also check for __arm64_sys_openat
4.19 kernel reported __arm64_sys_openat instead of SyS_openat, so
the test for llkd.sleep also needs to check for that as well.

Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 147486902
Test: llkd_unit_test
Change-Id: Ibb4b4ca45391e35fd03fcb8e7ccea01f547b76e1
2020-01-15 09:08:48 -08:00
Mark Salyzyn
da2aeb0c42 llkd: handle 'adbd shell setsid' to preserve adbd
A zombie setsid process occurs when adb shell setsid <command> is
issued, however llkd can only detect if it is a result of a kernel
livelock by killing the associated parent, which would be adbd;
resulting in the adb connection(s) being terminated.  Will special
case this condition in order to preserve adbd for debugging purposes.

We parse <parent>&<child> in ro.llk.blacklist.parent as this
association, thus adbd&[setsid] covers this special case.
Ampersand was selected because it is never part of a process name,
however a setprop in the shell requires it to be escaped or quoted;
init rc file where this is normally specified does not have issue.

getComm() is effectively pure, so hold on to the return value for
sake of efficiency.

This also reverts commit 599958d114
which granted adbd blanket parent immunity from monitoring on
userdebug builds.  The new logic is a more refined means of
preserving the live lock checking associated with adbd and allows
the operation to be performed on user builds.

POC: date ; adb shell setsid sleep 900 ; date
Positive for bug, reports less than 15 minutes, otherwise solved.

Test: llkd_unit_test
Bug: 120983740
Change-Id: I6442463a48499d925a3a074423a24a1622905559
2019-01-04 13:49:48 -08:00
Mark Salyzyn
8a5f081763 llkd: enhance list properties
Because of the limited length of properties, and to ease the
complexity of product and vendor adjustments, the comma separated
list properties will use a leading comma to preserve the defaults
and add or subtract entries with + and - prefixes respectively.

Without the leading comma, the list is explicitly specified as before.

Cleanup:
- use empty() instead of space() == 0 (or converse if != 0)
- if (unlikely) pprocp can not be allocated, to a to_string(ppid) check

For testing, observe before and after llkd_unit_test below to
confirm leading comma effects for example:

livelock: ro.llk.stack=wait_on_page_bit_killable,bit_wait_io,\
                       __get_user_pages,cma_alloc
livelock: ro.llk.stack=...,SyS_openat,...

Test: llkd_unit_test
Bug: 120983740
Change-Id: Ia3d164c2fdac5295a474c6c1294a34e4ae9d0b61
2019-01-04 11:43:15 -08:00
Mark Salyzyn
96505fad80 llkd: Add stack symbol checking
Feature outlined here is only available on userdebug or eng builds.
Blocked for security reasons because requires ptrace capabilities.

Add ro.llk.stack to list a set of symbols that should rarely happen
but if persistent in multiple checks, indicates a live lock condition.
At ro.llk.stack.timeout_ms the process is sent a kill, if it remains,
then panic the kernel.

There is no ABA detection in the paths, the condition for the
stack symbol being present instantaneously must be its rarity of
being caught.  If a livelock occurs in the path of the symbol, then
it is possible more than one path could be stuck in the state, but
the best candidate symbols are found underneath a lock resulting in
only one process being the culprit, and the best aim.  There may be
processes that induce a look of persistence, if so the symbol is not
a candidate for checking.

Add ro.llk.blacklist.process.stack to list process names we want
to skip checking.  This configuration parameter is also used to
prevent sepolicy noise when trying to acquire stacks from non
ptrace'able services.

Test: gTest llkd_unit_tests
Bug: 33808187
Bug: 111910505
Bug: 80502612
Change-Id: Ie71221e371b189bbdda2a1155d47826997842dcc
2018-08-30 13:53:19 -07:00
Mark Salyzyn
afd66f2fd3 llkd: bootstat: propagate detailed livelock canonical boot reason
Report kernel_panic,sysrq,livelock,<state> reboot reason via last
dmesg (pstore console).  Add ro.llk.killtest property, which will
allow reliable ABA platforms to drop kill test and go directly
to kernel panic.  This should also allow some manual unit testing
of the canonical boot reason report.

New canonical boot reasons from llkd are:
- kernel_panic,sysrq,livelock,alarm llkd itself locked up (Hail Mary)
- kernel_panic,sysrq,livelock,driver uninterrruptible D state
- kernel_panic,sysrq,livelock,zombie uninterrruptible Z state

Manual test assumptions:
- llkd is built by the platform and landed on system partition
- unit test is built and landed in /data/nativetest (could
  land in /data/nativetest64, adjust test correspondingly)
- llkd not enabled, ro.llk.enable and ro.llk.killtest
  are not set by platform allowing test to adjust all the
  configuration properties and start llkd.
- or, llkd is enabled, ro.llk.enable is true, and killtest is
  disabled, ro.llk.killtest is false, setup by the platform.
  This breaks the go/apct generic operations of the unit test
  for llk.zombie and llk.driver as kernel panic results
  requiring manual intervention otherwise.  If test moves to
  go/apct, then we will be forced to bypass these tests under
  this condition (but allow them to run if ro.llk.killtest
  is "off" so specific testing above/below can be run).

for i in driver zombie; do
        adb shell su root setprop ro.llk.killtest off
        adb shell /data/nativetest/llkd_unit_test/llkd_unit_test --gtest_filter=llkd.${i}
        adb wait-for-device
        adb shell su root setprop ro.llk.killtest off
        sleep 60
        adb shell getprop sys.boot.reason
        adb shell /data/nativetest/llkd_unit_test/llkd_unit_test --gtest_filter=llkd.${i}
done

Test: llkd_unit_test (see test assumptions)
Bug: 33808187
Bug: 72838192
Change-Id: I2b24875376ddfdbc282ba3da5c5b3567de85dbc0
2018-04-18 14:02:16 -07:00
Mark Salyzyn
d035dbbecf llkd: default enabled for userdebug
If LLK_ENABLE_DEFAULT is false, then check "ro.llk.enable" for "eng",
also the default value if not set, and then check if userdebug build
to establish a default of true for enable.  Same for
ro.khungtask.enable.

Test: llkd_unit_test report eng status on "userdebug" or "user" builds
Bug: 33808187
Bug: 72838192
Change-Id: I2adb23c7629dccaa2856c50bccbf4e363703c82c
2018-04-18 14:02:05 -07:00
Mark Salyzyn
f089e1403b llkd: add live-lock daemon
Introduce a standalone live-lock daemon (llkd), to catch kernel
or native user space deadlocks and take mitigating actions.  Will
also configure [khungtaskd] to fortify the actions.

If a thread is in D or Z state with no forward progress for longer
than ro.llk.timeout_ms, or ro.llk.[D|Z].timeout_ms, kill the process
or parent process respectively.  If another scan shows the same
process continues to exist, then have a confirmed live-lock condition
and need to panic.  Panic the kernel in a manner to provide the
greatest bugreporting details as to the condition.  Add a alarm self
watchdog should llkd ever get locked up that is double the expected
time to flow through the mainloop.  Sampling is every
ro.llk_sample_ms.

Default will not monitor init, or [kthreadd] and all that [kthreadd]
spawns.  This reduces the effectiveness of llkd by limiting its
coverage.  If in the future, if value in covering kthreadd spawned
threads, the requirement will be to code drivers so that they do not
remain in a persistent 'D' state, or that they have mechanisms to
recover the thread should it be killed externally.  Then the
blacklists can be adjusted accordingly if these conditions are met.

An accompanying gTest set have been added, and will setup a persistent
D or Z process, with and without forward progress, but not in a
live-lock state because that would require a buggy kernel, or a module
or kernel modification to stimulate.

Android Properties llkd respond to (*_ms parms are in milliseconds):
- ro.config.low_ram default false, if true do not sysrq t (dump
  all threads).
- ro.llk.enable default false, allow live-lock daemon to be enabled.
- ro.khungtask.enable default false, allow [khungtaskd] to be enabled.
- ro.llk.mlockall default true, allow mlock'd live-lock daemon.
- ro.khungtask.timeout default 12 minutes.
- ro.llk.timeout_ms default 10 minutes, D or Z maximum timelimit,
  double this value and it sets the alarm watchdog for llkd.
- ro.llk.D.timeout_ms default ro.llk.timeout_ms, D maximum timelimit.
- ro.llk.Z.timeout_ms default ro.llk.timeout_ms, Z maximum timelimit.
- ro.llk.check_ms default 2 minutes sampling interval
  (ro.llk.timeout_ms / 5) for threads in D or Z state.
- ro.llk.blacklist.process default 0,1,2 (kernel, init and
  [kthreadd]), and process names (/comm or /cmdline) init,[kthreadd],
  lmkd,lmkd.llkd,llkd,[khungtaskd],watchdogd,[watchdogd],
  [watchdogd/0] ...
- ro.llk.blacklist.parent default 0,2 (kernel and [kthreadd]) and
  "[kthreadd]".  A comma separated lists of process ids, /comm names
  or /cmdline names.
- ro.llk.blacklist.uid default <empty>, comma separated list of
  uid numbers or names from getpwuid/getpwnam.

Test: llkd_unit_test
Bug: 33808187
Bug: 72838192
Change-Id: I32e8aa78aef10834e093265d0f3ed5b4199807c6
2018-04-18 14:01:56 -07:00