platform_system_core/init/service.h

219 lines
7.9 KiB
C
Raw Normal View History

/*
* Copyright (C) 2015 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#include <signal.h>
#include <sys/types.h>
#include <chrono>
#include <memory>
#include <optional>
#include <set>
#include <string>
#include <vector>
#include <android-base/chrono_utils.h>
#include <cutils/iosched_policy.h>
#include "action.h"
init: Add support for ambient capabilities. Ambient capabilities are inherited in a straightforward way across execve(2): " If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. " This will allow us to get rid of the special meaning for AID_NET_ADMIN and AID_NET_RAW, and if desired, to reduce the use of file capabilities (which grant capabilities to any process that can execute the file). An additional benefit of the latter is that a single .rc file can specify all properties for a service, without having to rely on a separate file for file capabilities. Ambient capabilities are supported starting with kernel 4.3 and have been backported to all Android common kernels back to 3.10. I chose to not use Minijail here (though I'm still using libcap) for two reasons: 1-The Minijail code is designed to work in situations where the process is holding any set of capabilities, so it's more complex. The situation when forking from init allows for simpler code. 2-The way Minijail is structured right now, we would not be able to make the required SELinux calls between UID/GID dropping and other priv dropping code. In the future, it will make sense to add some sort of "hook" to Minijail so that it can be used in situations where we want to do other operations between some of the privilege-dropping operations carried out by Minijail. Bug: 32438163 Test: Use sample service. Change-Id: I3226cc95769d1beacbae619cb6c6e6a5425890fb
2016-10-27 16:33:03 +02:00
#include "capabilities.h"
#include "keyword_map.h"
#include "parser.h"
#include "service_utils.h"
#include "subcontext.h"
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_DISABLED 0x001 // do not autostart with class
#define SVC_ONESHOT 0x002 // do not restart on exit
#define SVC_RUNNING 0x004 // currently active
#define SVC_RESTARTING 0x008 // waiting to restart
#define SVC_CONSOLE 0x010 // requires console
#define SVC_CRITICAL 0x020 // will reboot into bootloader if keeps crashing
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_RESET 0x040 // Use when stopping a process,
// but not disabling so it can be restarted with its class.
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_RC_DISABLED 0x080 // Remember if the disabled flag was set in the rc script.
#define SVC_RESTART 0x100 // Use to safely restart (stop, wait, start) a service.
#define SVC_DISABLED_START 0x200 // A start was requested but it was disabled at the time.
#define SVC_EXEC 0x400 // This service was started by either 'exec' or 'exec_start' and stops
// init from processing more commands until it completes
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_SHUTDOWN_CRITICAL 0x800 // This service is critical for shutdown and
// should not be killed during shutdown
#define SVC_TEMPORARY 0x1000 // This service was started by 'exec' and should be removed from the
// service list once it is reaped.
#define NR_SVC_SUPP_GIDS 12 // twelve supplementary groups
namespace android {
namespace init {
class Service {
friend class ServiceParser;
public:
Service(const std::string& name, Subcontext* subcontext_for_restart_commands,
const std::vector<std::string>& args, bool from_apex = false);
Service(const std::string& name, unsigned flags, uid_t uid, gid_t gid,
const std::vector<gid_t>& supp_gids, int namespace_flags, const std::string& seclabel,
Subcontext* subcontext_for_restart_commands, const std::vector<std::string>& args,
bool from_apex = false);
static Result<std::unique_ptr<Service>> MakeTemporaryOneshotService(
const std::vector<std::string>& args);
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
bool IsRunning() { return (flags_ & SVC_RUNNING) != 0; }
bool IsEnabled() { return (flags_ & SVC_DISABLED) == 0; }
Result<void> ExecStart();
Result<void> Start();
Result<void> StartIfNotDisabled();
Result<void> StartIfPostData();
Result<void> Enable();
void Reset();
Support for stopping/starting post-data-mount class subsets. On devices that use FDE and APEX at the same time, we need to bring up a minimal framework to be able to mount the /data partition. During this period, a tmpfs /data filesystem is created, which doesn't contain any of the updated APEXEs. As a consequence, all those processes will be using the APEXes from the /system partition. This is obviously not desired, as APEXes in /system may be old and/or contain security issues. Additionally, it would create a difference between FBE and FDE devices at runtime. Ideally, we restart all processes that have started after we created the tmpfs /data. We can't (re)start based on class names alone, because some classes (eg 'hal') contain services that are required to start apexd itself and that shouldn't be killed (eg the graphics HAL). To address this, keep track of which processes are started after /data is mounted, with a new 'mark_post_data' keyword. Additionally, create 'class_reset_post_data', which resets all services in the class that were created after the initial /data mount, and 'class_start_post_data', which starts all services in the class that were started after /data was mounted. On a device with FBE, these keywords wouldn't be used; on a device with FDE, we'd use them to bring down the right processes after the user has entered the correct secret, and restart them. Bug: 118485723 Test: manually verified process list Change-Id: I16adb776dacf1dd1feeaff9e60639b99899905eb
2019-04-23 16:26:01 +02:00
void ResetIfPostData();
void Stop();
void Terminate();
void Timeout();
void Restart();
void Reap(const siginfo_t& siginfo);
void DumpState() const;
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
void SetShutdownCritical() { flags_ |= SVC_SHUTDOWN_CRITICAL; }
bool IsShutdownCritical() const { return (flags_ & SVC_SHUTDOWN_CRITICAL) != 0; }
void UnSetExec() {
is_exec_service_running_ = false;
flags_ &= ~SVC_EXEC;
}
void AddReapCallback(std::function<void(const siginfo_t& siginfo)> callback) {
reap_callbacks_.emplace_back(std::move(callback));
}
size_t CheckAllCommands() const { return onrestart_.CheckAllCommands(); }
static bool is_exec_service_running() { return is_exec_service_running_; }
const std::string& name() const { return name_; }
const std::set<std::string>& classnames() const { return classnames_; }
unsigned flags() const { return flags_; }
pid_t pid() const { return pid_; }
android::base::boot_clock::time_point time_started() const { return time_started_; }
int crash_count() const { return crash_count_; }
uid_t uid() const { return proc_attr_.uid; }
gid_t gid() const { return proc_attr_.gid; }
int namespace_flags() const { return namespaces_.flags; }
const std::vector<gid_t>& supp_gids() const { return proc_attr_.supp_gids; }
const std::string& seclabel() const { return seclabel_; }
const std::vector<int>& keycodes() const { return keycodes_; }
IoSchedClass ioprio_class() const { return proc_attr_.ioprio_class; }
int ioprio_pri() const { return proc_attr_.ioprio_pri; }
const std::set<std::string>& interfaces() const { return interfaces_; }
int priority() const { return proc_attr_.priority; }
int oom_score_adjust() const { return oom_score_adjust_; }
bool is_override() const { return override_; }
bool process_cgroup_empty() const { return process_cgroup_empty_; }
unsigned long start_order() const { return start_order_; }
void set_sigstop(bool value) { sigstop_ = value; }
std::chrono::seconds restart_period() const { return restart_period_; }
std::optional<std::chrono::seconds> timeout_period() const { return timeout_period_; }
const std::vector<std::string>& args() const { return args_; }
bool is_updatable() const { return updatable_; }
Support for stopping/starting post-data-mount class subsets. On devices that use FDE and APEX at the same time, we need to bring up a minimal framework to be able to mount the /data partition. During this period, a tmpfs /data filesystem is created, which doesn't contain any of the updated APEXEs. As a consequence, all those processes will be using the APEXes from the /system partition. This is obviously not desired, as APEXes in /system may be old and/or contain security issues. Additionally, it would create a difference between FBE and FDE devices at runtime. Ideally, we restart all processes that have started after we created the tmpfs /data. We can't (re)start based on class names alone, because some classes (eg 'hal') contain services that are required to start apexd itself and that shouldn't be killed (eg the graphics HAL). To address this, keep track of which processes are started after /data is mounted, with a new 'mark_post_data' keyword. Additionally, create 'class_reset_post_data', which resets all services in the class that were created after the initial /data mount, and 'class_start_post_data', which starts all services in the class that were started after /data was mounted. On a device with FBE, these keywords wouldn't be used; on a device with FDE, we'd use them to bring down the right processes after the user has entered the correct secret, and restart them. Bug: 118485723 Test: manually verified process list Change-Id: I16adb776dacf1dd1feeaff9e60639b99899905eb
2019-04-23 16:26:01 +02:00
bool is_post_data() const { return post_data_; }
bool is_from_apex() const { return from_apex_; }
void set_oneshot(bool value) {
if (value) {
flags_ |= SVC_ONESHOT;
} else {
flags_ &= ~SVC_ONESHOT;
}
}
private:
void NotifyStateChange(const std::string& new_state) const;
void StopOrReset(int how);
void KillProcessGroup(int signal, bool report_oneshot = false);
void SetProcessAttributesAndCaps();
static unsigned long next_start_order_;
static bool is_exec_service_running_;
std::string name_;
std::set<std::string> classnames_;
unsigned flags_;
pid_t pid_;
android::base::boot_clock::time_point time_started_; // time of last start
android::base::boot_clock::time_point time_crashed_; // first crash within inspection window
int crash_count_; // number of times crashed within window
std::optional<CapSet> capabilities_;
ProcessAttributes proc_attr_;
NamespaceInfo namespaces_;
std::string seclabel_;
std::vector<SocketDescriptor> sockets_;
std::vector<FileDescriptor> files_;
std::vector<std::pair<std::string, std::string>> environment_vars_;
Action onrestart_; // Commands to execute on restart.
std::vector<std::string> writepid_files_;
std::vector<std::string> task_profiles_;
std::set<std::string> interfaces_; // e.g. some.package.foo@1.0::IBaz/instance-name
// keycodes for triggering this service via /dev/input/input*
std::vector<int> keycodes_;
int oom_score_adjust_;
int swappiness_ = -1;
int soft_limit_in_bytes_ = -1;
int limit_in_bytes_ = -1;
int limit_percent_ = -1;
std::string limit_property_;
bool process_cgroup_empty_ = false;
bool override_ = false;
unsigned long start_order_;
bool sigstop_ = false;
std::chrono::seconds restart_period_ = 5s;
std::optional<std::chrono::seconds> timeout_period_;
bool updatable_ = false;
std::vector<std::string> args_;
std::vector<std::function<void(const siginfo_t& siginfo)>> reap_callbacks_;
Proper mount namespace configuration for bionic This CL fixes the design problem of the previous mechanism for providing the bootstrap bionic and the runtime bionic to the same path. Previously, bootstrap bionic was self-bind-mounted; i.e. /system/bin/libc.so is bind-mounted to itself. And the runtime bionic was bind-mounted on top of the bootstrap bionic. This has not only caused problems like `adb sync` not working(b/122737045), but also is quite difficult to understand due to the double-and-self mounting. This is the new design: Most importantly, these four are all distinct: 1) bootstrap bionic (/system/lib/bootstrap/libc.so) 2) runtime bionic (/apex/com.android.runtime/lib/bionic/libc.so) 3) mount point for 1) and 2) (/bionic/lib/libc.so) 4) symlink for 3) (/system/lib/libc.so -> /bionic/lib/libc.so) Inside the mount namespace of the pre-apexd processes, 1) is bind-mounted to 3). Likewise, inside the mount namespace of the post-apexd processes, 2) is bind-mounted to 3). In other words, there is no self-mount, and no double-mount. Another change is that mount points are under /bionic and the legacy paths become symlinks to the mount points. This is to make sure that there is no bind mounts under /system, which is breaking some apps. Finally, code for creating mount namespaces, mounting bionic, etc are refactored to mount_namespace.cpp Bug: 120266448 Bug: 123275379 Test: m, device boots, adb sync/push/pull works, especially with following paths: /bionic/lib64/libc.so /bionic/bin/linker64 /system/lib64/bootstrap/libc.so /system/bin/bootstrap/linker64 Change-Id: Icdfbdcc1efca540ac854d4df79e07ee61fca559f
2019-01-16 15:00:59 +01:00
bool pre_apexd_ = false;
Support for stopping/starting post-data-mount class subsets. On devices that use FDE and APEX at the same time, we need to bring up a minimal framework to be able to mount the /data partition. During this period, a tmpfs /data filesystem is created, which doesn't contain any of the updated APEXEs. As a consequence, all those processes will be using the APEXes from the /system partition. This is obviously not desired, as APEXes in /system may be old and/or contain security issues. Additionally, it would create a difference between FBE and FDE devices at runtime. Ideally, we restart all processes that have started after we created the tmpfs /data. We can't (re)start based on class names alone, because some classes (eg 'hal') contain services that are required to start apexd itself and that shouldn't be killed (eg the graphics HAL). To address this, keep track of which processes are started after /data is mounted, with a new 'mark_post_data' keyword. Additionally, create 'class_reset_post_data', which resets all services in the class that were created after the initial /data mount, and 'class_start_post_data', which starts all services in the class that were started after /data was mounted. On a device with FBE, these keywords wouldn't be used; on a device with FDE, we'd use them to bring down the right processes after the user has entered the correct secret, and restart them. Bug: 118485723 Test: manually verified process list Change-Id: I16adb776dacf1dd1feeaff9e60639b99899905eb
2019-04-23 16:26:01 +02:00
bool post_data_ = false;
bool running_at_post_data_reset_ = false;
std::optional<std::string> on_failure_reboot_target_;
bool from_apex_ = false;
};
} // namespace init
} // namespace android