Global UID level cgroup removal was eliminated because of a race
between app launch and app killing using the same directory name. [1]
However, isolated app UIDs are assigned sequentially and are
basically never reused until we wrap around the large range of
isolated UIDs. This leaves thousands of isolated cgroup directories
unused, which consumes kernel memory and increases memory reclaim
overhead. Remove this subset of UID level cgroup directories when
killing process groups.
[1] d0464b0c01
Test: 50 cycle ACT leaves 1000 fewer empty isolated cgroups
Bug: 290953668
Change-Id: If7d2a7b8eec14561a72208049b74ff785ca961bd
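
A minimal sketch of how the isolated subset of UIDs could be recognized when deciding whether to remove a UID-level cgroup directory. The constant values are assumptions modeled on the AID_* definitions in Android's android_filesystem_config.h, and IsIsolatedUid is a hypothetical helper name, not necessarily what the change uses.

```cpp
#include <sys/types.h>

// Assumed values, modeled on AID_USER_OFFSET, AID_ISOLATED_START and
// AID_ISOLATED_END; verify them against your own tree.
constexpr uid_t kUserOffset = 100000;    // UID stride between Android users
constexpr uid_t kIsolatedStart = 90000;  // first isolated app id
constexpr uid_t kIsolatedEnd = 99999;    // last isolated app id

// Returns true if the app-id part of `uid` lies in the isolated range, i.e.
// its UID-level cgroup directory is a candidate for removal after a kill.
bool IsIsolatedUid(uid_t uid) {
    const uid_t app_id = uid % kUserOffset;
    return app_id >= kIsolatedStart && app_id <= kIsolatedEnd;
}
```

Restricting removal to this range keeps the directories of regular app UIDs in place, which is where the launch/kill race described above applies.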
Provide profile validity check functions for cases where a user wants to
check whether a profile can be applied successfully before actually
applying it. Add test cases to cover the new APIs.
Also add a wrapper function so that framework code can call them.
Bug: 277233783
Test: atest task_profiles_test
Test: manually verify freezer with outdated cgroup configuration
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Li Li <dualli@google.com>
Change-Id: Iefb321dead27adbe67721972f164efea213c06cb
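
The idea of a validity check can be illustrated with a small sketch: inspect every file a profile would write to without writing anything. WriteFileAction and IsProfileValid are hypothetical names for illustration only; the real checks in libprocessgroup may look at different conditions.

```cpp
#include <unistd.h>

#include <string>
#include <vector>

// Hypothetical stand-in for one profile action: write `value` to `path`.
struct WriteFileAction {
    std::string path;
    std::string value;
};

// Returns true only if every target file is currently writable. Nothing is
// written, so the check has no side effects on the cgroup hierarchy.
bool IsProfileValid(const std::vector<WriteFileAction>& actions) {
    for (const auto& action : actions) {
        if (access(action.path.c_str(), W_OK) != 0) return false;
    }
    return true;
}
```

A caller could use such a check to detect an outdated cgroup configuration up front instead of discovering it through a failed write.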
This variable is no longer used.
Fixes: d0464b0c01 ("libprocessgroup: Do not remove uid cgroups directory")
Change-Id: I2b606d953722cf38cc865d91ea00a3b08236675b
GKE provides an unusual environment: the cgroupv2 filesystem is mounted
read-only. Skip the task_profiles_test on the host if the cgroup2
filesystem is mounted read-only so that the test does not fail as
follows:
Failed to write '-1' to /sys/fs/cgroup/cgroup.procs: Read-only file system.
Bug: 278899193
Change-Id: I8c5a0c0848a47a395ae87f2fc31ba0ccda7d7f31
Signed-off-by: Bart Van Assche <bvanassche@google.com>
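
A sketch of how a read-only cgroup2 mount could be detected from text in /proc/mounts format. It parses the text directly rather than going through getmntent(), so it can be exercised without touching the real mount table; IsCgroup2MountedReadOnly is a hypothetical helper name.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns true if `mounts` (text in /proc/mounts format) lists a cgroup2
// filesystem whose mount options include "ro".
bool IsCgroup2MountedReadOnly(std::istream& mounts) {
    std::string line;
    while (std::getline(mounts, line)) {
        std::istringstream fields(line);
        std::string fsname, dir, type, opts;
        if (!(fields >> fsname >> dir >> type >> opts)) continue;
        // Match on the filesystem type field, not on the device name.
        if (type != "cgroup2") continue;
        std::istringstream opt_stream(opts);
        std::string opt;
        while (std::getline(opt_stream, opt, ',')) {
            if (opt == "ro") return true;
        }
    }
    return false;
}
```

A test fixture could open /proc/mounts with std::ifstream, pass it to this function, and skip the test when it returns true.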
This reverts commit aee11b0a3d.
This change was originally reverted because its only user was reverted
under b/243096961 at ag/19679188. We bring it back now with a fixed user.
Bug: 236708592
Bug: 148425913
Ignore-AOSP-First: Topic with AMS changes which is developed on git_master
Change-Id: I2a8ae0d9faabe7950b758a09870d128889be4d0a
Merged-In: I2a8ae0d9faabe7950b758a09870d128889be4d0a
Add a function which sends signals to all members of a process group,
but does not wait for the processes to exit, or for the associated
cgroup to be removed.
Bug: 274646058
Ignore-AOSP-First: Dependency of ActivityManager change which developed on internal git_master
Test: Force-stop of chrome with 15 tabs completes ~500ms faster
Test: Full Play store update causes no ANR
(cherry picked from https://googleplex-android-review.googlesource.com/q/commit:d87b6018d25cbbd33b345dc58c634718bf5d0def)
Merged-In: I37dbdecb3394101abbee8495e71f6912b3c031f5
Change-Id: I37dbdecb3394101abbee8495e71f6912b3c031f5
NOTE FOR REVIEWERS - original patch and result patch are not identical.
PLEASE REVIEW CAREFULLY.
Diffs between the patches:
37,6 +537,15 @@
     return KillProcessGroup(uid, initialPid, signal, 0 /*retries*/, max_processes);
 }
+int sendSignalToProcessGroup(uid_t uid, int initialPid, int signal) {
+    std::string hierarchy_root_path;
+    if (CgroupsAvailable()) {
+        CgroupGetControllerPath(CGROUPV2_CONTROLLER_NAME, &hierarchy_root_path);
+    }
+    const char* cgroup = hierarchy_root_path.c_str();
+    return DoKillProcessGroupOnce(cgroup, uid, initialPid, signal);
+}
+
 static int createProcessGroupInternal(uid_t uid, int initialPid, std::string cgroup,
                                       bool activate_controllers) {
     auto uid_path = ConvertUidToPath(cgroup.c_str(), uid);
Original patch:
From d87b6018d2 Mon Sep 17 00:00:00 2001
From: T.J. Mercier <tjmercier@google.com>
Date: Tue, 04 Apr 2023 18:41:13 +0000
Subject: [PATCH] libprocessgroup: Add sendSignalToProcessGroup
Add a function which sends signals to all members of a process group,
but does not wait for the processes to exit, or for the associated
cgroup to be removed.
Bug: 274646058
Ignore-AOSP-First: Dependency of ActivityManager change which developed on internal git_master
Test: Force-stop of chrome with 15 tabs completes ~500ms faster
Test: Full Play store update causes no ANR
Change-Id: I37dbdecb3394101abbee8495e71f6912b3c031f5
---
diff --git a/libprocessgroup/include/processgroup/processgroup.h b/libprocessgroup/include/processgroup/processgroup.h
index 8fa9fd5..48bc0b7 100644
--- a/libprocessgroup/include/processgroup/processgroup.h
+++ b/libprocessgroup/include/processgroup/processgroup.h
@@ -76,6 +76,11 @@
// that it only returns 0 in the case that the cgroup exists and it contains no processes.
int killProcessGroupOnce(uid_t uid, int initialPid, int signal, int* max_processes = nullptr);
+// Sends the provided signal to all members of a process group, but does not wait for processes to
+// exit, or for the cgroup to be removed. Callers should also ensure that killProcessGroup is called
+// later to ensure the cgroup is fully removed, otherwise system resources may leak.
+int sendSignalToProcessGroup(uid_t uid, int initialPid, int signal);
+
int createProcessGroup(uid_t uid, int initialPid, bool memControl = false);
// Set various properties of a process group. For these functions to work, the process group must
Change-Id: Ie479348dee8e8092b1959927a1143009632d3914
Use correct attribute of the mntent to check for cgroup v2 entry.
Bug: 277233783
Change-Id: Ie34b89b610117b8ce043f2f18947273d75618fef
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
A user ID (uid) must be greater than or equal to zero to be valid. Only
strictly positive process IDs are valid. Add argument checks in
libprocessgroup of uid and pid arguments to make it easier to determine
the origin of invalid arguments.
Change-Id: I8a6d96ca4576bc9c329498c6a804dd05a02afca5
Signed-off-by: Bart Van Assche <bvanassche@google.com>
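
The two rules stated above can be written as a small predicate. This is a sketch only: the function name is hypothetical, and the arguments are taken as signed values so that a negative number passed by a caller can be detected.

```cpp
// Returns true if both arguments could identify a real process group.
// Signed types are used on purpose: an unsigned uid would hide negative input.
bool AreProcessGroupArgsValid(long uid, long pid) {
    if (uid < 0) return false;   // UIDs start at 0 (root)
    if (pid <= 0) return false;  // only strictly positive PIDs are valid
    return true;
}
```

Failing at the API boundary points the error at the caller that supplied the bad value rather than at a later file or signal operation.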
There are multiple use cases in Android for which background writes need
to be controlled via the cgroup mechanism. The cgroup mechanism can only
control background writes if both the blkio and memcg controllers are
mounted in the v2 cgroup hierarchy. Hence this patch that migrates the
blkio controller from the v1 to the v2 cgroup hierarchy.
The blkio controller has been marked as optional since not all Android
kernels enable this controller (CONFIG_BLK_CGROUP).
This patch increases the TOTAL_BOOT_TIME for devices with a 4.19 kernel
(redfin) from 18.9 s to 20 s. This patch does not affect the boot time
for devices with a 5.10 or 5.15 kernel.
This patch increases the time spent in CgroupMap::ActivateControllers()
by 25 microseconds in Cuttlefish on an x86-64 CPU.
CgroupMap::ActivateControllers() is called by Service::Start().
Bug: 213617178
Test: Cuttlefish and various phones
Change-Id: I3c07c1be84c3feb277b7d7003652d5d3b57c6541
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Apparently there is Java code that calls KillProcessGroup() with an
invalid initialPid argument. Hence this CL that makes KillProcessGroup()
fail early if one of its arguments is invalid.
Change-Id: I42f98eed139d9d0950428d04180e4613ba74b4e6
Signed-off-by: Bart Van Assche <bvanassche@google.com>
The way processes are accounted in DoKillProcessGroupOnce has been
changed recently, which affects retries in KillProcessGroup. More
specifically, initialPid was not counted before and would not
cause a retry with 5ms sleep.
Restore previous behavior to avoid boot time regressions.
Bug: 271198843
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ibc1bdd855898688a4a03806671e6ac31570aedf9
PLOG reports the value of errno. These four PLOG statements are
after functions that are not syscalls, leading to confusing logs
such as "Failed to apply Foo task profile: Success".
Bug: 271196526
Test: N/A
Change-Id: Iede5274d1ceebabec8432527112291ba63dca090
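
To illustrate why an errno-reporting log is misleading after a non-syscall failure, the sketch below mimics the two formatting styles with plain functions. FormatWithErrno and FormatPlain are hypothetical helpers, not the android-base logging macros.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Mimics a PLOG-style message: appends the text for the current errno.
std::string FormatWithErrno(const std::string& message) {
    const int saved_errno = errno;  // capture before anything can change it
    return message + ": " + std::strerror(saved_errno);
}

// Mimics a LOG-style message: the text is reported as-is.
std::string FormatPlain(const std::string& message) {
    return message;
}
```

If the failing function never set errno, the first variant appends whatever text corresponds to the stale or zero value (glibc reports "Success" for 0), which is how the confusing message quoted above comes about.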
So the child processes in the process group won't be orphaned
when we decide to kill the process group of a given process but
find it's already dead.
Bug: 266633286
Test: atest MicrodroidDemoApp
Change-Id: Ib6f45b992566f0ab5cf152463c95294a306dd736
Not all Android kernels support all the cgroup controllers mentioned in
task_profiles.json and/or cgroups.json. Support such kernels by ignoring
certain cgroup activation failures.
Bug: 213617178
Change-Id: I90c0bd959f8a6484c4f2fbc895845e073527271e
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Improve the readability of this function by splitting it.
This CL includes the following behavior changes:
- If changing the directory owner and/or mode fails for /sys/fs/cgroup,
this is considered as a fatal error instead of something that should
only fail if "Optional" has not been set.
- If mounting the v2 cgroup controller fails, this is considered as an
error.
- Activating/mounting a cgroup controller only fails if the controller
has not been marked as optional.
Bug: 213617178
Change-Id: If6908dfdbcb2e1c9637ab4ac8a7625f0a17dc9e0
Signed-off-by: Bart Van Assche <bvanassche@google.com>
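
The third behavior change can be summarized as a small decision function. The flag value and the names below are hypothetical; they only illustrate the rule that a failed activation is tolerated when the controller is marked optional.

```cpp
#include <cstdint>

// Hypothetical flag bit marking a controller as optional.
constexpr uint32_t kControllerFlagOptional = 0x4;

// Decides whether setup may continue after an activation attempt.
// `activated` is the outcome of the attempt; `flags` describes the controller.
bool MayContinueAfterActivation(bool activated, uint32_t flags) {
    if (activated) return true;
    // A failure is only tolerated for controllers marked as optional.
    return (flags & kControllerFlagOptional) != 0;
}
```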
This CL prepares for introducing an additional flag test.
Bug: 213617178
Change-Id: Ia74c1990792b5839f76498de2cac0008ed92040f
Signed-off-by: Bart Van Assche <bvanassche@google.com>
There are multiple use cases in Android for which background writes need
to be controlled via the cgroup mechanism. The cgroup mechanism can only
control background writes if both the blkio and memcg controllers are
mounted in the v2 cgroup hierarchy. Hence this patch that migrates the
blkio controller from the v1 to the v2 cgroup hierarchy.
This patch increases the TOTAL_BOOT_TIME for devices with a 4.19 kernel
(redfin) from 18.9 s to 20 s. This patch does not affect the boot time
for devices with a 5.10 or 5.15 kernel.
This patch increases the time spent in CgroupMap::ActivateControllers()
by 25 microseconds in Cuttlefish on an x86-64 CPU.
CgroupMap::ActivateControllers() is called by Service::Start().
Bug: 213617178
Test: Cuttlefish and various phones
Change-Id: I490740e1c9ee4f7bb5bb7afba721a083f952c8f2
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Adjust the attributes that correspond to the `blkio` (v1) / `io` (v2) controller. Migrating the `blkio` v1 controller to v2 requires renaming it to `io`, so update the `File` field to point to the `blkio` file and the `FileV2` field to point to the `io` file.
Test: Verified with cuttlefish that this works with the `io` controller migration by cherry-picking aosp/2218645
Bug: 263269364
Bug: 213617178
Change-Id: I0aacfc6d74e3eec61ebb2ce443b04c792392aa9e
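
A sketch of how a consumer of the two fields might pick the file name, assuming the rule "use `FileV2` when the controller lives in the v2 hierarchy and the field is set, otherwise fall back to `File`". The struct, the function name, and the file names in the usage note are illustrative assumptions.

```cpp
#include <string>

// Hypothetical in-memory form of one attribute from task_profiles.json.
struct Attribute {
    std::string file_v1;  // value of the "File" field
    std::string file_v2;  // value of the "FileV2" field, may be empty
};

// Picks the file name that matches the hierarchy the controller is mounted in.
std::string AttributeFileName(const Attribute& attr, bool controller_is_v2) {
    if (controller_is_v2 && !attr.file_v2.empty()) return attr.file_v2;
    return attr.file_v1;
}
```

With an attribute such as `{"blkio.bfq.weight", "io.bfq.weight"}`, the same profile definition works both before and after the controller migration.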
It makes no sense, because there is no cgroup procs file.
Bug: 257264124
Test: atest MicrodroidBenchmarkApp
Change-Id: I4e3a118d2237afc46aa8fbcbad055afb7d56f464
process_cgroup_empty_ indicates whether a service has already been
killed. If cgroup support is missing, services cannot be killed
because process_cgroup_empty_ is always true.
This change fixes that by no longer setting process_cgroup_empty_ to true.
Instead, make KillProcessGroup send signals even when cgroups are
disabled. Also update DoKillProcessGroupOnce() so that it returns the
number of killed processes, excluding already dead processes. This
behavior agrees with its name (DoKillProcessGroupOnce), and it prevents a
regression when cgroups are missing: kill(-pgid) always "succeeds", so
KillProcessGroup would otherwise keep looping even when all processes are
already dead.
Bug: 257264124
Test: boot microdroid, see services are terminated
Change-Id: I19abf19ff1b70c666cd6f12d0a12956765174aaa
We are planning to remove cgroups from the Microdroid kernel, since the
entire VM belongs exclusively to a single owner and is under the control
of the cgroups on the host side.
This patch exposes the CgroupAvailable API from libprocessgroup and changes
init to query the CgroupAvailable API before doing any
cgroups/task_profiles related work.
Bug: 239367015
Test: run MicrodroidDemoApp
Test: atest --test-mapping packages/modules/Virtualization:avf-presubmit
Change-Id: I82787141cd2a7f9309a4e9b24acbd92ca21c145b
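
One plausible way to implement such an availability query is to test for the presence of /proc/cgroups, which the kernel only provides when cgroup support is built in. This is a sketch under that assumption; the function name and the path parameter (added so the check can be tested) are hypothetical.

```cpp
#include <unistd.h>

// Returns true if the given path exists. With the default argument this
// asks whether the running kernel exposes /proc/cgroups at all.
bool CgroupsAvailableAt(const char* proc_cgroups_path = "/proc/cgroups") {
    return access(proc_cgroups_path, F_OK) == 0;
}
```

A caller such as init would then wrap its setup in `if (CgroupsAvailableAt()) { ... }` and skip all cgroup and task profile work otherwise.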
Change two PLOG() statements into LOG() statements since PLOG() should
only be used if errno has been set. Make it easier to find the code that
logs an error message.
Bug: 213617178
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Change-Id: I73443f3adb7d7ba3fc0d39a55777f0b132529fbd
Currently, tagging a symbol with #apex (# systemapi or # llndk) is not
required when the symbol is in a non-NDK library. However, this is
considered dangerous because such a symbol will automatically be
promoted to NDK APIs when the library is promoted to an NDK library.
When that happens, the native API council won't be able to notice the
promotion because promoting a non-NDK library into an NDK library
doesn't require an update of the map.txt file, but Android.bp only.
To prevent that, we should mandate those tags for Mainline APIs
regardless of whether the library the API belongs to is an NDK library
or not.
Upcoming changes in build/soong will enforce this. This change is to
prepare for the enforcement.
Note that this is a build-time only change. There's no behavior change
at runtime.
Bug: 184712170
Test: m
Change-Id: I769c5318e0cfd092f2f2b368f1a860065c79818f
The differences between the v1 and v2 hierarchies are as follows:
* Different mountpoints. In Android the blkio v1 hierarchy has
/dev/blkio as top directory while the v2 hierarchy has /sys/fs/cgroup
as top directory.
* Different directory structure. In Android there are two directories in
the v1 blkio hierarchy (. and background) while in the v2 hierarchy
there is one subdirectory per process and per task.
* Different controller names. The name of the blkio controller in the v1
hierarchy is "blkio" while it is "io" in the v2 hierarchy.
* In the v1 hierarchy the NormalIoPriority policy is applied at process
creation time but that policy is not applied at process creation time
if the blkio controller exists in the v2 hierarchy.
Prepare for migration of the blkio controller to the v2 hierarchy by
adding the blkio v2 attributes in task_profiles.json. All these
attributes have been marked as optional because:
* The "io" controller does not exist in the v1 hierarchy.
* Which attributes can be applied depends on the I/O scheduler that has
been selected (CFQ, BFQ, ...).
This patch causes the following warnings to appear in the logs of
devices that mount the blkio controller in the v1 hierarchy:
W libprocessgroup: Controller io is not found
W libprocessgroup: Controller io is not found
W libprocessgroup: Controller io is not found
W libprocessgroup: SetAttribute: unknown attribute: CfqGroupIdle
W libprocessgroup: SetAttribute: unknown attribute: CfqWeight
W libprocessgroup: SetAttribute: unknown attribute: BfqWeight
This patch restores a subset of aosp/1962326 and prepares for the
migration of the blkcg controller to the cgroup v2 hierarchy.
Bug: 213617178
Change-Id: Ia7b117bc777239b416e2ac268308e634b018144d
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Provide alternative versions that do not force callers to create
std::string objects. This patch has the intended side-effect that all
callers that pass a {string} initializer list to the 'profiles' argument
now call an std::initializer_list<> overload instead of the const
std::vector<std::string>& overload.
Additionally, add std::function<> arguments instead of calling
ExecuteForProcess() or ExecuteForTask() directly to make it easier to
write unit tests for SetTaskProfiles() and SetProcessProfiles().
Bug: 213617178
Change-Id: Ica61e944a66a17178ee43a113b8ca082f7eb834b
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Do not force callers to create an std::string object. This patch
implements the following advice from go/totw/1: "Google’s preferred
option for accepting such string parameters is through a string_view."
Use std::less<> as the comparison type so that std::string_view
objects do not have to be converted into std::string objects for
lookups in std::map<>.
Bug: 213617178
Change-Id: I08125a02220a8c003d9202a7e177be776c3b9829
Signed-off-by: Bart Van Assche <bvanassche@google.com>
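
The effect of std::less<> can be shown with a short sketch. ProfileMap and FindProfile are hypothetical names and the mapped type is a plain int for brevity; the point is that find() accepts a std::string_view directly once the comparator is transparent.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, which enables heterogeneous lookup.
using ProfileMap = std::map<std::string, int, std::less<>>;

// Looks up `name` without constructing a temporary std::string.
// Returns nullptr if no profile with that name exists.
const int* FindProfile(const ProfileMap& profiles, std::string_view name) {
    auto it = profiles.find(name);
    return it == profiles.end() ? nullptr : &it->second;
}
```

With the default comparator, std::less<std::string>, the same find() call would not compile for a std::string_view argument, so callers would have to build a std::string first.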
This change enables headers like <span>. Inside the <span> header
file the following guard makes its functionality unavailable when
building with -std=gnu++17:
#if _LIBCPP_STD_VER > 17
[ ... ]
#endif
Bug: 213617178
Change-Id: I5c40708ea196ab112990b5ca6fae9370b75f8752
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Fix the function name in a log message inside CgroupSetup().
Bug: 213617178
Change-Id: I897c831f5e53093df2664e0e8ceefadf9a89369c
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Cgroup removal fails with EBUSY if there are active processes or threads
still alive in the cgroup. Occasionally a thread or a process might be
stuck in an interruptible sleep and take some time during exit. In such
cases attempts to remove the cgroup it belongs to will fail. This
results in occasional leftover cgroups. These empty unused cgroups
consume memory.
Ensure RemoveProcessGroup always retries and increase the retries to
keep trying for 2 seconds before giving up. In the majority of cases only
a few retries are needed, but in rare cases a thread can be blocked for a
longer time, so the number of retries is set large enough to cover them.
Bug: 233319780
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I2e4bb1f7b7e19c904c85faea7bbabbfdef9c8125
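
The retry policy described above can be sketched as a bounded loop that only retries on EBUSY. The helper name and the injected remove operation are assumptions made for testability; the retry count and delay are parameters, so a caller could pick for example 400 retries of 5 ms to cover roughly 2 seconds.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Calls `remove` until it succeeds, fails with an error other than EBUSY, or
// the retry budget is used up. `remove` must follow the rmdir(2) convention:
// return 0 on success, or -1 with errno set on failure.
bool RemoveWithRetries(const std::function<int()>& remove, int retries,
                       std::chrono::milliseconds delay) {
    for (int attempt = 0; attempt <= retries; ++attempt) {
        if (remove() == 0) return true;
        if (errno != EBUSY) return false;  // retrying would not help
        if (attempt < retries) std::this_thread::sleep_for(delay);
    }
    return false;
}
```

For a real cgroup directory the operation would be something like `[&] { return rmdir(path.c_str()); }`.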
There isn't any reason to keep VMCompilationPerformance special, so
rename it to a more generic, reusable name.
One day we may support VMs for any purpose using other generic
profiles.
Bug: 231437770
Test: TH
Change-Id: Id7e78ba4d6ea0dc415ed0bc1f4bdf051f7e7fe05
The VMCompilationPerformance profile is used to run Isolated Compilation
in a Protected VM, normally while the device is idle, when relevant
APEXes are staged.
The original VMCompilationPerformance introduced in aosp/2060891 does
not have any specific definition and requires vendors to customize it.
This change re-defines it as an aggregated profile with a default set of
existing profiles, so that performance can be reasonable by default.
This profile may be renamed to a more generic name later, e.g.
"SCHED_SP_COMPUTE".
Bug: 231437770
Test: Run `composd_cmd test-compile` on a local device.
Before: 1m50s +/- 10s (with whatever that's default)
After: 1m25s +/- 5s
Change-Id: Ib8cd65782c818474fb129efbd9ef9a3e23ad1eb3
When system_server and zygote crash or get killed, all apps also get
killed but their process groups are left empty. Provide a function to
remove all empty process groups so that init can purge them when this
event happens.
Bug: 228160715
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ife38ca021e80cd38106f218ae13183e8c2631bf0
This CL fixes a bug where empty names were passed when profiles were set
via android_set_process_profiles. The `profiles_` vector was constructed
with a size equal to the number of task profiles, and the actual task
profile names were then appended to it. As a result, when {"a", "b"}
was given, the vector ended up holding {"", "", "a", "b"}. Fix this by
using reserve() instead.
Bug: N/A
Test: m
Change-Id: I28d6c2e891b01a2d3a8a88d9d0652fe0dbffac96
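
The difference between the sized constructor and reserve() can be reproduced in a few lines. Both helper names are hypothetical; the first shows the pattern behind the bug, the second the corrected pattern.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Buggy pattern: the sized constructor creates `n` empty strings, and
// push_back() then appends the real names after them.
std::vector<std::string> CopyNamesBuggy(const char* const* names, size_t n) {
    std::vector<std::string> out(n);
    for (size_t i = 0; i < n; ++i) out.push_back(names[i]);
    return out;
}

// Fixed pattern: reserve() only allocates capacity; the size stays 0 until
// elements are appended.
std::vector<std::string> CopyNamesFixed(const char* const* names, size_t n) {
    std::vector<std::string> out;
    out.reserve(n);
    for (size_t i = 0; i < n; ++i) out.emplace_back(names[i]);
    return out;
}
```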
The wrapper is to call SetProcessFiles (C++ API) from crosvm via FFI.
Bug: 223790172
Bug: 216788146
Test: m
Change-Id: If342ca0d19deb1cb7ee581bba2cc543385199cbe
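
A sketch of what a C-callable wrapper around a C++ profile API can look like. Everything here is hypothetical: SetProfilesCpp is a stand-in that merely records its input, and example_set_process_profiles is an illustrative name, not the exported symbol.

```cpp
#include <sys/types.h>

#include <cstddef>
#include <string>
#include <vector>

// Records the last call so the stand-in below has an observable effect.
static std::vector<std::string> g_last_profiles;

// Stand-in for the C++ API that takes a vector of profile names.
bool SetProfilesCpp(uid_t uid, pid_t pid, const std::vector<std::string>& profiles) {
    (void)uid;
    (void)pid;
    g_last_profiles = profiles;
    return true;
}

// C linkage and plain C types only, so the function can be declared and
// called from another language through FFI.
extern "C" bool example_set_process_profiles(uid_t uid, pid_t pid, size_t num_profiles,
                                             const char* profiles[]) {
    std::vector<std::string> names;
    names.reserve(num_profiles);
    for (size_t i = 0; i < num_profiles; ++i) names.emplace_back(profiles[i]);
    return SetProfilesCpp(uid, pid, names);
}
```

Keeping std::vector and std::string out of the exported signature is what makes the symbol usable from Rust code such as crosvm.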