This revert was created by Android Culprit Assistant. The culprit was identified in the following culprit search session (http://go/aca-get/91da3c52-9b76-498b-bdbd-a9de7d7ff53b).
Change-Id: I996c595bee9acc15aedaf0a912f67fa027f33cd0
This revert was created by Android Culprit Assistant. The culprit was identified in the following culprit search session (http://go/aca-get/91da3c52-9b76-498b-bdbd-a9de7d7ff53b).
Change-Id: I459265b9c9117d6006c1223947a202505d24c08f
By using cgroup.kill we don't need to read cgroup.procs at all for
SIGKILLs, which is more efficient and should help reduce CPU contention
and cgroup lock contention. Fallback to cgroup.procs if we encounter an
error trying to use cgroup.kill, but if cgroup.kill fails it's likely
that cgroup.procs will too.
Bug: 239829790
Change-Id: I44706faccfb7c4611b512a3642b913f06d30c1dc
In killProcessGroup we currently read cgroup.procs to find processes to
kill, send them kill signals until cgroup.procs is empty, then remove
the cgroup directory. The cgroup cannot be removed until all processes
are dead, otherwise we'll get an EBUSY error from the kernel.
There is a race in the kernel where cgroup.procs can read empty even
though the cgroup is pinned by processes which are still exiting, and
can't be removed yet. [1]
Let's use the populated field of cgroup.events instead of an empty
cgroup.procs file to determine when the cgroup is removable. In
addition to functioning like we expect, this is more efficient because
we can poll on cgroup.events instead of retrying kills and rereading
cgroup.procs every 5ms which should help reduce CPU contention and
cgroup lock contention.
It's still possible that it takes longer for a cgroup to become
unpopulated than our timeout allows, in which case we will fail to
remove the cgroup and leak kernel memory. But this change should help
reduce the probability of that happening.
[1] https://lore.kernel.org/all/CABdmKX3SOXpcK85a7cx3iXrwUj=i1yXqEz9i9zNkx8mB=ZXQ8A@mail.gmail.com/
Bug: 301871933
Change-Id: If7dcfb331f47e06994c9ac85ed08bbcce18cdad7
Provide an additional way to request the block layer to prioritize
foreground I/O over background I/O. This patch prepares for tests with
the mq-deadline I/O scheduler. While CFQ and BFQ support a "weight"
cgroup attribute, the mq-deadline scheduler does not. The mq-deadline
I/O scheduler only supports the I/O priority class for differentiating
I/O priorities.
The io.prio.class attribute is declared optional since this attribute
only exists if CONFIG_BLK_CGROUP_IOPRIO != n in the kernel
configuration.
Bug: 186902601
Change-Id: Iee1004cd0996e32245aff2b51772ef40178e024f
Signed-off-by: Bart Van Assche <bvanassche@google.com>
CL 2828279 doesn't do what it's description says that it does. Making
Service::Stop() work for processes that have been migrated to another v2
cgroup requires changing DoKillProcessGroupOnce(). Hence this CL that
removes the early return statements from DoKillProcessGroupOnce().
Bug: 308900853
Change-Id: Ib798555feeb95a786a619c3d7013c7d5829b01ad
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Revert this CL because it tests whether or not cgroup.procs files are
empty with the stat() system call and because the cgroup filesystem
always reports st_size == 0. Rename RemoveUidCgroups() into
RemoveEmptyUidCgroups().
Change-Id: I4de6f16c814c4b47d8d74c8045f0c1ee71975ac0
Signed-off-by: Bart Van Assche <bvanassche@google.com>
There are multiple use cases in Android for which background writes need
to be controlled via the cgroup mechanism. The cgroup mechanism can only
control background writes if both the blkio and memcg controllers are
mounted in the v2 cgroup hierarchy. Hence this patch that migrates the
blkio controller from the v1 to the v2 cgroup hierarchy.
The changes compared to the previous version of this CL are as follows:
- The JoinCgroup actions for the "io" controller have been left out
since these caused processes to be migrated to the v2 root cgroup.
- The BfqWeight / CfqGroupIdle / CfqWeight settings have been included
in this CL instead of applying these settings as a separate CL.
Change-Id: I67e06ce3462bb1c1345dba78f8d3d655b6519c74
Signed-off-by: Bart Van Assche <bvanassche@google.com>
A JoinCgroup action for a v2 cgroup controller migrates a process or task
from the uid_%d/pid_%d cgroup into another cgroup, e.g. the root cgroup.
This may make services unkillable because Service::Stop() only stops a
service if the uid_%d/pid_%d cgroup still exists when Service::Stop() is
called.
Bug: 309674654
Change-Id: I20b797afdf596125ff5a6ed41cb33fe59b84ac88
Signed-off-by: Bart Van Assche <bvanassche@google.com>
For log messages like the following it is not possible to derive why
this message has been logged:
E libprocessgroup: AddTidToCgroup failed to write '3949'; fd=55: Operation not supported on transport endpoint
Hence include the cgroup path and the tid type in the log message.
Change-Id: I057711fe576b82f6454456b7284186ddeece33c3
Signed-off-by: Bart Van Assche <bvanassche@google.com>
When attempting to remove a cgroup, a ENOENT means this cgroup is
already removed. Treat such errno as success for idempotency.
Test: th
Bug: 308900853
Change-Id: I6ef3c25f03d185194205b3845784d284fdc4d444
PLOG depends on errno being set to a useful value, otherwise it will
print a meaningless error string. A few PLOG call sites occur where
either errno is not set at all, or it is set only some of the time where
there are already PLOG calls closer to where the error occurs. Convert
these PLOG calls to LOG.
Bug: 301871933
Change-Id: Ifa6bd2401f9dd9b84b2506e886336e89bac81bb1
The max_processes calculation is incorrect for KillProcessGroup because
the set of processes in cgroup.procs can differ between the multiple
reads in the implementation. Luckily the exact value isn't very
important because it's just logged. Remove max_processes from the API
and remove the warning about the new behavior in Android 11.
Note that we still always LOG(INFO) that any cgroup is being killed.
Bug: 301871933
Change-Id: I8e449f5089d4a48dbc1797b6d979539e87026f43
Currently we sleep for 5ms before decrementing retries for the last
time. This is a waste of time, so bail out of the loop if the last
rmdir attempt fails.
Change-Id: Ia20840d27592b4eb3d9762647b19c111ff94209f
Current implementation of SetAttributeAction::ExecuteForProcess reuses
SetAttributeAction::ExecuteForTask while not utilizing available uid/pid
information. This results in a call to GetPathForTask() which is an
expensive function due to it reading and parsing /proc/$pid/cgroups.
This can be avoided if we utilize available uid/pid info and the fact
that cgroup v2 attributes share the cgroup v2 hierarchy as process
groups, which use a known path template.
Bug: 292636609
Change-Id: I02e3046bd85d0dfebc68ab444f1796bb54cc69c7
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
ProfileAttribute::Reset does not reset file_v2_name, fix that. Also
provide ProfileAttribute::file_name() to consolidate the code.
Bug: 292636609
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I5b33ca47b4fa5cabf582c8804bd13f72f6e58411
We missed two incorrect specifiers in the previous commit with this same
title.
We use the %d format specificier for uid_t, which maps to
__kernel_uid32_t, which is unsigned. [1] This is undefined behavior
which can lead to paths with negative UIDs when erroneously large
values are passed for uid:
E libprocessgroup: No such cgroup attribute: /sys/fs/cgroup/uid_-89846/cgroup.freeze
Fix it with %u.
[1] https://cs.android.com/search?q=typedef.*__kernel_uid32_t&ss=android%2Fplatform%2Fsuperproject%2Fmain
Change-Id: Ica04b03526bd2e156f026a2797fe9912b259cd9f
We use the %d format specificier for uid_t, which maps to
__kernel_uid32_t, which is unsigned. [1] This is undefined behavior
which can lead to paths with negative UIDs when erroneously large
values are passed for uid:
E libprocessgroup: No such cgroup attribute: /sys/fs/cgroup/uid_-89846/cgroup.freeze
Fix it with %u.
[1] https://cs.android.com/search?q=typedef.*__kernel_uid32_t&ss=android%2Fplatform%2Fsuperproject%2Fmain
Change-Id: Ibb52ba2503e30e2f20770b7d23629167e38d076a
Global UID level cgroup removal was eliminated because of a race
between app launch and app killing using the same directory name. [1]
However isolated app UIDs are assigned sequentially, and are
basically never reused until we wrap around the large range of
isolated UIDs. This leaves thousands of isolated cgroup directories
unused, which consumes kernel memory and increases memory reclaim
overhead. Remove this subset of UID level cgroup directories when
killing process groups.
[1] d0464b0c01
Test: 50 cycle ACT leaves 1000 fewer empty isolated cgroups
Bug: 290953668
Change-Id: If7d2a7b8eec14561a72208049b74ff785ca961bd
This reverts commit 9c0fcbb0f7.
Bug: 261857030
Ignore-AOSP-First: this change is for the U branch only.
Change-Id: Id4bd3494b33dd6bc0644406d599c9bfd597c7435
Signed-off-by: Bart Van Assche <bvanassche@google.com>
This reverts commit 92153fb955.
Bug: 261857030
Ignore-AOSP-First: this change is for the U branch only.
Change-Id: I99417707f0d0b8c7dca3927b6ce9d52436621f4e
Signed-off-by: Bart Van Assche <bvanassche@google.com>
This reverts commit 6ad747ac2d.
Bug: 261857030
Ignore-AOSP-First: this change is for the U branch only.
Change-Id: I93447b71146e6e9297ef49026d90dc4c35b91244
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Provide profile validity check functions for cases when user wants to
check whether a profile can be successfully applied before actually
applying it. Add test cases to cover new APIs.
Also add a wrapper function for framework code to call it.
Bug: 277233783
Test: atest task_profiles_test
Test: manually verify freezer with outdated cgroup configuration
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Li Li <dualli@google.com>
Change-Id: Iefb321dead27adbe67721972f164efea213c06cb
Merged-In: Iefb321dead27adbe67721972f164efea213c06cb
Provide profile validity check functions for cases when user wants to
check whether a profile can be successfully applied before actually
applying it. Add test cases to cover new APIs.
Also add a wrapper function for framework code to call it.
Bug: 277233783
Test: atest task_profiles_test
Test: manually verify freezer with outdated cgroup configuration
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Li Li <dualli@google.com>
Change-Id: Iefb321dead27adbe67721972f164efea213c06cb
This variable is no longer used.
Fixes: d0464b0c01 ("libprocessgroup: Do not remove uid cgroups directory")
Change-Id: I2b606d953722cf38cc865d91ea00a3b08236675b
GKE provides an unusual environment: the cgroupv2 filesystem is mounted
read-only. Skip the task_profiles_test on the host if the cgroup2
filesystem is mounted read-only to prevent that a test fails as
follows:
Failed to write '-1' to /sys/fs/cgroup/cgroup.procs: Read-only file system.
Bug: 278899193
Change-Id: I8c5a0c0848a47a395ae87f2fc31ba0ccda7d7f31
Signed-off-by: Bart Van Assche <bvanassche@google.com>
This reverts commit aee11b0a3d.
This change was originally reverted because its only user was reverted
under b/243096961 at ag/19679188. We bring it back now with a fixed user.
Bug: 236708592
Bug: 148425913
Ignore-AOSP-First: Topic with AMS changes which is developed on git_master
Change-Id: I2a8ae0d9faabe7950b758a09870d128889be4d0a
Merged-In: I2a8ae0d9faabe7950b758a09870d128889be4d0a
Add a function which sends signals to all members of a process group,
but does not wait for the processes to exit, or for the associated
cgroup to be removed.
Bug: 274646058
Ignore-AOSP-First: Dependency of ActivityManager change which developed on interal git_master
Test: Force-stop of chrome with 15 tabs completes ~500ms faster
Test: Full Play store update causes no ANR
(cherry picked from https://googleplex-android-review.googlesource.com/q/commit:d87b6018d25cbbd33b345dc58c634718bf5d0def)
Merged-In: I37dbdecb3394101abbee8495e71f6912b3c031f5
Change-Id: I37dbdecb3394101abbee8495e71f6912b3c031f5
NOTE FOR REVIEWERS - original patch and result patch are not identical.
PLEASE REVIEW CAREFULLY.
Diffs between the patches:
37,6 +537,15 @@
return KillProcessGroup(uid, initialPid, signal, 0 /*retries*/, max_processes);
}
+int sendSignalToProcessGroup(uid_t uid, int initialPid, int signal) {
+ std::string hierarchy_root_path;
+ if (CgroupsAvailable()) {
+ CgroupGetControllerPath(CGROUPV2_CONTROLLER_NAME, &hierarchy_root_path);
+ }
+ const char* cgroup = hierarchy_root_path.c_str();
+ return DoKillProcessGroupOnce(cgroup, uid, initialPid, signal);
+}
+
static int createProcessGroupInternal(uid_t uid, int initialPid, std::string cgroup,
bool activate_controllers) {
auto uid_path = ConvertUidToPath(cgroup.c_str(), uid);
Original patch:
From d87b6018d2 Mon Sep 17 00:00:00 2001
From: T.J. Mercier <tjmercier@google.com>
Date: Tue, 04 Apr 2023 18:41:13 +0000
Subject: [PATCH] libprocessgroup: Add sendSignalToProcessGroup
Add a function which sends signals to all members of a process group,
but does not wait for the processes to exit, or for the associated
cgroup to be removed.
Bug: 274646058
Ignore-AOSP-First: Dependency of ActivityManager change which developed on interal git_master
Test: Force-stop of chrome with 15 tabs completes ~500ms faster
Test: Full Play store update causes no ANR
Change-Id: I37dbdecb3394101abbee8495e71f6912b3c031f5
---
diff --git a/libprocessgroup/include/processgroup/processgroup.h b/libprocessgroup/include/processgroup/processgroup.h
index 8fa9fd5..48bc0b7 100644
--- a/libprocessgroup/include/processgroup/processgroup.h
+++ b/libprocessgroup/include/processgroup/processgroup.h
@@ -76,6 +76,11 @@
// that it only returns 0 in the case that the cgroup exists and it contains no processes.
int killProcessGroupOnce(uid_t uid, int initialPid, int signal, int* max_processes = nullptr);
+// Sends the provided signal to all members of a process group, but does not wait for processes to
+// exit, or for the cgroup to be removed. Callers should also ensure that killProcessGroup is called
+// later to ensure the cgroup is fully removed, otherwise system resources may leak.
+int sendSignalToProcessGroup(uid_t uid, int initialPid, int signal);
+
int createProcessGroup(uid_t uid, int initialPid, bool memControl = false);
// Set various properties of a process group. For these functions to work, the process group must
Change-Id: Ie479348dee8e8092b1959927a1143009632d3914
Use correct attribute of the mntent to check for cgroup v2 entry.
Bug: 277233783
Change-Id: Ie34b89b610117b8ce043f2f18947273d75618fef
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Add a function which sends signals to all members of a process group,
but does not wait for the processes to exit, or for the associated
cgroup to be removed.
Bug: 274646058
Ignore-AOSP-First: Dependency of ActivityManager change which developed on interal git_master
Test: Force-stop of chrome with 15 tabs completes ~500ms faster
Test: Full Play store update causes no ANR
(cherry picked from https://googleplex-android-review.googlesource.com/q/commit:d87b6018d25cbbd33b345dc58c634718bf5d0def)
Merged-In: I37dbdecb3394101abbee8495e71f6912b3c031f5
Change-Id: I37dbdecb3394101abbee8495e71f6912b3c031f5
A user ID (uid) must be greater than or equal to zero to be valid. Only
strictly positive process IDs are valid. Add argument checks in
libprocessgroup of uid and pid arguments to make it easier to determine
the origin of invalid arguments.
Change-Id: I8a6d96ca4576bc9c329498c6a804dd05a02afca5
Signed-off-by: Bart Van Assche <bvanassche@google.com>
There are multiple use cases in Android for which background writes need
to be controlled via the cgroup mechanism. The cgroup mechanism can only
control background writes if both the blkio and memcg controllers are
mounted in the v2 cgroup hierarchy. Hence this patch that migrates the
blkio controller from the v1 to the v2 cgroup hierarchy.
The blkio controller has been marked as optional since not all Android
kernels enable this controller (CONFIG_BLK_CGROUP).
This patch increases the TOTAL_BOOT_TIME for devices with a 4.19 kernel
(redfin) from 18.9 s to 20 s. This patch does not affect the boot time
for devices with a 5.10 or 5.15 kernel.
This patch increases the time spent in CgroupMap::ActivateControllers()
by 25 microseconds in Cuttlefish on an x86-64 CPU.
CgroupMap::ActivateControllers() is called by Service::Start().
Bug: 213617178
Test: Cuttlefish and various phones
Change-Id: I3c07c1be84c3feb277b7d7003652d5d3b57c6541
Signed-off-by: Bart Van Assche <bvanassche@google.com>