Introduce a standalone live-lock daemon (llkd), to catch kernel
or native user space deadlocks and take mitigating actions. Will
also configure [khungtaskd] to fortify the actions.
If a thread is in D or Z state with no forward progress for longer
than ro.llk.timeout_ms, or ro.llk.[D|Z].timeout_ms, kill the process
or parent process respectively. If another scan shows the same
process continues to exist, then have a confirmed live-lock condition
and need to panic. Panic the kernel in a manner to provide the
greatest bugreporting details as to the condition. Add a alarm self
watchdog should llkd ever get locked up that is double the expected
time to flow through the mainloop. Sampling is every
ro.llk_sample_ms.
Default will not monitor init, or [kthreadd] and all that [kthreadd]
spawns. This reduces the effectiveness of llkd by limiting its
coverage. If in the future, if value in covering kthreadd spawned
threads, the requirement will be to code drivers so that they do not
remain in a persistent 'D' state, or that they have mechanisms to
recover the thread should it be killed externally. Then the
blacklists can be adjusted accordingly if these conditions are met.
An accompanying gTest set have been added, and will setup a persistent
D or Z process, with and without forward progress, but not in a
live-lock state because that would require a buggy kernel, or a module
or kernel modification to stimulate.
Android Properties llkd respond to (*_ms parms are in milliseconds):
- ro.config.low_ram default false, if true do not sysrq t (dump
all threads).
- ro.llk.enable default false, allow live-lock daemon to be enabled.
- ro.khungtask.enable default false, allow [khungtaskd] to be enabled.
- ro.llk.mlockall default true, allow mlock'd live-lock daemon.
- ro.khungtask.timeout default 12 minutes.
- ro.llk.timeout_ms default 10 minutes, D or Z maximum timelimit,
double this value and it sets the alarm watchdog for llkd.
- ro.llk.D.timeout_ms default ro.llk.timeout_ms, D maximum timelimit.
- ro.llk.Z.timeout_ms default ro.llk.timeout_ms, Z maximum timelimit.
- ro.llk.check_ms default 2 minutes sampling interval
(ro.llk.timeout_ms / 5) for threads in D or Z state.
- ro.llk.blacklist.process default 0,1,2 (kernel, init and
[kthreadd]), and process names (/comm or /cmdline) init,[kthreadd],
lmkd,lmkd.llkd,llkd,[khungtaskd],watchdogd,[watchdogd],
[watchdogd/0] ...
- ro.llk.blacklist.parent default 0,2 (kernel and [kthreadd]) and
"[kthreadd]". A comma separated lists of process ids, /comm names
or /cmdline names.
- ro.llk.blacklist.uid default <empty>, comma separated list of
uid numbers or names from getpwuid/getpwnam.
Test: llkd_unit_test
Bug: 33808187
Bug: 72838192
Change-Id: I32e8aa78aef10834e093265d0f3ed5b4199807c6
Set F() capability set and 'drop' lmkd from AID_ROOT to AID_LMKD uid
and from AID_ROOT to AID_LMKD and AID_SYSTEM gid.
/dev/memcg/memory.pressure defaults to root.root mode 0000, set it up
as root.system mode 0040 to allow lmkd read access.
Instrument failure to set SCHED_FIFO.
Annotate access points that require elevated capabilities.
Test: check /proc/`pidof lmkd`/status for capability set
Test: lmkd_unit_test
Bug: 77650566
Change-Id: I986081a0434cf6e842b63a55726380205b30a3ea
So we can auto-generate tracing code for AIDL interfaces.
Bug: 74416314
Test: inspect atrace output
Change-Id: I91b14b3b16d8d7a29f531101b14ddf10dbc61a5a
Merged-In: I91b14b3b16d8d7a29f531101b14ddf10dbc61a5a
Now all somewhat time-consuming methods of the VibratorService
are surrounded by traceBegin/traceEnd blocks.
The vibration itself is surrounded with asyncTrace block.
Test: Run "systrace vibrator" and see the time consumption report.
Bug: 73000045
Merged-In: I94172e379354ec3418321b8151e6182cec2e886c
Change-Id: I94172e379354ec3418321b8151e6182cec2e886c
Multiple LTP tests require a "daemon" or "bin" user. These user ids
have been defined since UNIX incept, and even up to the '80s remained
in many of the tools as hard coded values. Add these two ids with
a cautionary note.
Test: compile
Bug: 31152327
Bug: 31226046
Bug: 32385889
Change-Id: Ida2fb6d817b8ada0624870439fcf848667b31fb3
Filesystems allow the setting of the "resgid" parameter to designate
a GID that is allowed to use the "reserved" disk space (in addition
to UID 0). We'll be granting this GID to critical system processes,
so that the system is usable enough for the user to free up disk
space used by abusive apps.
Test: builds, boots
Bug: 62024591
Change-Id: I2d166f3b730f0a3e7279fb40f12db7413c1dadad
Just the minimial changes to get this to actually build, because otherwise
we always bog down trying to rewrite everything (when the real answer
is usually "stop using libcutils, it's awful").
This doesn't move a handful of files: two are basically just BSD libc
source, a couple have outstanding code reviews, and one can be deleted
(but I'll do that in a separate change).
I'm also skipping the presubmit hooks because otherwise clang-format
wants to reformat everything. I'll follow up with that...
Bug: N/A
Test: builds
Change-Id: I06403f465b67c8e493bad466dd76b1151eed5993
In order to replace qtaguid module with new eBPF network monitoring
module. We firstly move the current qtaguid userspace implementation
into netd and hide the detail from other processes. The current API will
talk to netd fwmark client to pass down the qtaguid related request from
high level framework and netd will use the proper method to complete the
request.
Test: Current TrafficStats CTS tests should not fail.
Bug: 30950746
Change-Id: Ie90c28f3594ab2877746b2372a1b6944768bfb18
The qtaguid kernel module will be deprecated on devices running 4.9
kernel or above and we need to support both old and new module in
userspace. Netd is responsible for choosing which kernel module to use
and all the current qtaguid native implementation need to be hided
behind it. So the current qtaguid native API implementation will be
moved to a isolate library under system/core and only netd can access to
it. The libcutils qtaguid API will become a wrapper to send request to
netd module. This modification will make sure the apps that currently
using this native API will not be broken.
Bug: 30950746
Test: All cts and vts test related should not fail.
Change-Id: I9de98a25ed5dc71bbf520ee0aadd16d59025699a
This change adds user namespace-awareness to uevent_kernel_* in
libcutils. Instead of assuming that root is always uid 0, it detects
whether the uid 0 is mapped in the current user namespace and returns
the appropriately mapped uid (or the kernel's "overflowuid" in case it
is not mapped).
In older kernels, or those where user namespaces are not enabled, this
still uses uid 0 for root.
Bug: 62378620
Test: bullhead networking still works
Test: Android in Chrome OS can now receive netlink-related messages
Change-Id: I7ea3454e8f38b9c70c65294d6b2a99e5a88f9d70
Switch from /data/misc/reboot/last_reboot_reason to persistent
Android property persist.sys.boot.reason for indicating why the
device is rebooted or shutdown.
persist.sys.boot.reason has a standard as outlined in b/63736262 and
the associated investigation. Made adjustments to the values so that
we did not create a problem even before we started. Compliance is
part of the tests in boot_reason_test.sh.
Test: system/core/bootstat/boot_reason_test.sh
Bug: 64687998
Change-Id: I812c55a12faf7cb7ff92101009be058ad9958d07
The non AID_ things in android_filesystem_config.h are moved
to fs_config.h. For libcutils.vendor and libcutils_headers.vendor,
fs_config.h is not exported.
An empty system/core/include/private/fs_config.h is placed to
appease the dependency from certain modules (logd, etc.)
that includes system/core/include/private/android_filesystem_config.h
directly.
Test: m -j
Test: BOARD_VNDK_VERSION=current m -j
Bug: 63135587
Change-Id: I95dfb874a426941022b100c0ca26a0576b0f4aa3
Merged-In: I95dfb874a426941022b100c0ca26a0576b0f4aa3
This lets us redeclare property_get with diagnose_if tagged on it,
so we no longer need to deal with overloads.
Bug: 12231437
Test: m checkbuild on bullhead aosp-master.
Change-Id: Ic55dcfeaa314f83d3713aabac7852cb766330fc8
Add NOLINT comment to work around clang-tidy
error in checking macro arguments used in
type expressions.
Bug: 28705665
Test: make with WITH_TIDY=1 WITH_TIDY_CHECKS=-*,misc-macro-* \
WITH_TIDY_FLAGS=-header-filter=system/core/.*
Change-Id: I7619978c1804e151a11a8b0477e80076bcf21cab
Commit 795267d4c7 ("Removed cpusets/schedboost build time dependency.")
turned the cpusets and schedtune options into runtime
decisions.
However the kernel option which is mentioned in the
code comment (CONFIG_SCHEDTUNE) is very misleading
as it doesn't exist (CONFIG_SCHED_TUNE does exist)
and it doesn't describe the real functionality of
the method. schedboost_enabled() will still return
false if CONFIG_SCHED_TUNE is set in the kernel but
CONFIG_CGROUP_SCHEDTUNE is not.
So to clarify this, we need to change the comment
to reflect that CONFIG_CGROUP_SCHEDTUNE, which depends
on CONFIG_SCHED_TUNE, is required.
Signed-off-by: Alex Naidis <alex.naidis@linux.com>
The EVS HAL and related software stack will use this id when running
daemons that monitor car state, capture and display video, and interact
with users.
Test: visual inspection
Change-Id: I53404c624933b7f55f1292c041c6c712522ab13b
Add an SP_RT_APP group which will be used to provide minimum
capacity guarantees to RT tasks sensitive to frequency drops
such as synthesizer workloads.
Bug: 33085313
Change-Id: I07cca79e52661d1325a1db9ef3b61eb0f8d20989
Signed-off-by: Joel Fernandes <joelaf@google.com>
This CL is in support of another CL c/2048848, topic
'Refactor hid command in /frameworks/base/cmds' in
internal master. Adding the permissions for
shell here to access uhid_node as part of the
new 'uhid' group.
Bug: 34052337
Test: Tested on angler, bluetooth mouse works OK.
Change-Id: If9e100aa1262d689fb8adc5c0ce93f157c96399e
The fallback android_filesystem_capability.h doesn't play nicely with
other kernel headers, since it #undef's __user. If we're building with
bionic (either for device or host), we use the same kernel headers, so
just use those.
Bug: 38056396
Test: build with Host_bionic:true
Change-Id: Idc61b6d96d86891164abe71604924638d67aefe2
We can't reuse the GID range for internal cache files, otherwise
we don't have a way to tease apart the difference when deciding if
it's safe to move apps.
Test: builds, boots
Bug: 37193650
Change-Id: I22c4e575cd557636e74c5c73035adb1d4dcbb7f7
Misconfigured systems can have localhost pointing to an address that
isn't 127.0.0.1 or ::1.
adb is the only caller of the libcutils socket_loopback functions, so
move them into adb and switch the implementations over to using
INADDR_LOOPBACK and in6addr_loopback, instead of resolving 'localhost'
when connecting.
Bug: http://b/37282612
Test: `killall adb; adb shell`
Test: `killall adb; ip addr del 127.0.0.1/8 dev lo; adb shell`
Change-Id: I01c1885f1d9757ad0f7b353dd04b4d1f057741c8
private/fs_config.h is required in order to build an independent
test that requires internal binary knowledge of the
etc/fs_config_(files|dirs) files.
Test: compile
Bug: 36071012
Change-Id: I268bcfdbb6d45b7bf6040cbf307a4e34812f5fef
This reverts commit 82f8bb785e.
Sadly, we'd have to extend CAP_SYS_RESOURCE to a bunch of execution
domains to make this work, which isn't feasible.
Bug: 36450358
Change-Id: Iffe88e45d538c044382eb0d0ac24ff11a10d73c3
Filesystems like ext4 allow the setting of the "resgid" parameter
to designate a GID that is allowed to use the "reserved" disk space
(in addition to UID 0). We'll be granting this GID to critical
system processes, so that the system is usable enough for the user
to free up disk space used by apps.
Test: builds
Bug: 36450358
Change-Id: I224bd1e597130edb411a1528872faff1ada02a89
- Emergency shutdown just marks the fs as clean while leaving fs
in the middle of any state. Do not use it anymore.
- Changed android_reboot to set sys.powerctl property so that
all shutdown can be done by init.
- Normal reboot sequence changed to
1. Terminate processes (give time to clean up). And wait for
completion based on ro.build.shutdown_timeout.
Default value (when not set) is changed to 3 secs. If it is 0, do not
terminate processes.
2. Kill all remaining services except critical services for shutdown.
3. Shutdown vold using "vdc volume shutdown"
4. umount all emulated partitions. If it fails, just detach.
Wait in step 5 can handle it.
5. Try umounting R/W block devices for up to max timeout.
If it fails, try DETACH.
If umount fails to complete before reboot, it can be detected when
system reboots.
6. Reboot
- Log shutdown time and umount stat to log so that it can be collected after reboot
- To umount emulated partitions, all pending writes inside kernel should
be completed.
- To umount /data partition, all emulated partitions on top of /data should
be umounted and all pending writes should be completed.
- umount retry will only wait up to timeout. If there are too many pending
writes, reboot will discard them and e2fsck after reboot will fix any file system
issues.
bug: 36004738
bug: 32246772
Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and
fs_stat after reboot.
Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
Bug: 33241851
Test: No changes needed for modules not using VNDK.
For VNDK, enable BOARD_VNDK_VERSION in BoardConfig.mk
and add libcutils to modules that need these headers.
Change-Id: I6102778aab35ed26a5ddde11230502dcd4edc852
Move last reboot reason file to new directory data/misc/reboot/ to
require only SELinux permissions specific to this new file.
Bug: 30994946
Test: manual: reboot command, setprop sys.powerctl
Change-Id: I1e067235aa4b06391cff8ab0741a9d317ba5b7da
Save a string identifying the reason for last Android reboot or power
off in file /data/misc/recovery/last_reboot_reason . This file may
be used for informing users of reboot or shutdown reasons at next
boot, and for other diagnostic purposes.
Bug: 30994946
Test: Manual: reboot, setprop sys.powerctl
Change-Id: I01e44473fdd21b33e9e4dced77aba9a66b6d3755
The qtaguid_tagSocket() function tags a network socket by passing a
reference to the given socket to the qtaguid kernel module. The module
will keep the socket alive even if the process calls close() on said
socket. In this scenario, the socket object would not be destroyed
even if all the file descriptor.
While this is at least a memory leak, it plays bad with epoll(7)
if you also didn't remove the socket from the epoll fd before closing
since epoll will not notice that the socket was closed and there is no
way to remove the socket from epoll after it was closed.
This patch updates the documentation to explicitly mention that the
socket must be untag before closing or bad things happen.
Bug: 36264049
Test: None.
Change-Id: I564a9b6d11d22b43a6c12312524386c0338b42ed