platform_system_core/init/service.h

261 lines
10 KiB
C
Raw Normal View History

/*
* Copyright (C) 2015 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef _INIT_SERVICE_H
#define _INIT_SERVICE_H
#include <sys/types.h>
#include <memory>
#include <set>
#include <string>
#include <vector>
#include <android-base/chrono_utils.h>
#include <cutils/iosched_policy.h>
#include "action.h"
init: Add support for ambient capabilities. Ambient capabilities are inherited in a straightforward way across execve(2): " If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. " This will allow us to get rid of the special meaning for AID_NET_ADMIN and AID_NET_RAW, and if desired, to reduce the use of file capabilities (which grant capabilities to any process that can execute the file). An additional benefit of the latter is that a single .rc file can specify all properties for a service, without having to rely on a separate file for file capabilities. Ambient capabilities are supported starting with kernel 4.3 and have been backported to all Android common kernels back to 3.10. I chose to not use Minijail here (though I'm still using libcap) for two reasons: 1-The Minijail code is designed to work in situations where the process is holding any set of capabilities, so it's more complex. The situation when forking from init allows for simpler code. 2-The way Minijail is structured right now, we would not be able to make the required SELinux calls between UID/GID dropping and other priv dropping code. In the future, it will make sense to add some sort of "hook" to Minijail so that it can be used in situations where we want to do other operations between some of the privilege-dropping operations carried out by Minijail. Bug: 32438163 Test: Use sample service. Change-Id: I3226cc95769d1beacbae619cb6c6e6a5425890fb
2016-10-27 16:33:03 +02:00
#include "capabilities.h"
#include "descriptors.h"
#include "keyword_map.h"
#include "parser.h"
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_DISABLED 0x001 // do not autostart with class
#define SVC_ONESHOT 0x002 // do not restart on exit
#define SVC_RUNNING 0x004 // currently active
#define SVC_RESTARTING 0x008 // waiting to restart
#define SVC_CONSOLE 0x010 // requires console
#define SVC_CRITICAL 0x020 // will reboot into recovery if keeps crashing
#define SVC_RESET 0x040 // Use when stopping a process,
// but not disabling so it can be restarted with its class.
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_RC_DISABLED 0x080 // Remember if the disabled flag was set in the rc script.
#define SVC_RESTART 0x100 // Use to safely restart (stop, wait, start) a service.
#define SVC_DISABLED_START 0x200 // A start was requested but it was disabled at the time.
#define SVC_EXEC 0x400 // This service was started by either 'exec' or 'exec_start' and stops
// init from processing more commands until it completes
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
#define SVC_SHUTDOWN_CRITICAL 0x800 // This service is critical for shutdown and
// should not be killed during shutdown
#define SVC_TEMPORARY 0x1000 // This service was started by 'exec' and should be removed from the
// service list once it is reaped.
#define NR_SVC_SUPP_GIDS 12 // twelve supplementary groups
namespace android {
namespace init {
struct ServiceEnvironmentInfo {
ServiceEnvironmentInfo();
ServiceEnvironmentInfo(const std::string& name, const std::string& value);
std::string name;
std::string value;
};
class Service {
public:
Service(const std::string& name, const std::vector<std::string>& args);
Service(const std::string& name, unsigned flags, uid_t uid, gid_t gid,
init: Add support for ambient capabilities. Ambient capabilities are inherited in a straightforward way across execve(2): " If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. " This will allow us to get rid of the special meaning for AID_NET_ADMIN and AID_NET_RAW, and if desired, to reduce the use of file capabilities (which grant capabilities to any process that can execute the file). An additional benefit of the latter is that a single .rc file can specify all properties for a service, without having to rely on a separate file for file capabilities. Ambient capabilities are supported starting with kernel 4.3 and have been backported to all Android common kernels back to 3.10. I chose to not use Minijail here (though I'm still using libcap) for two reasons: 1-The Minijail code is designed to work in situations where the process is holding any set of capabilities, so it's more complex. The situation when forking from init allows for simpler code. 2-The way Minijail is structured right now, we would not be able to make the required SELinux calls between UID/GID dropping and other priv dropping code. In the future, it will make sense to add some sort of "hook" to Minijail so that it can be used in situations where we want to do other operations between some of the privilege-dropping operations carried out by Minijail. Bug: 32438163 Test: Use sample service. Change-Id: I3226cc95769d1beacbae619cb6c6e6a5425890fb
2016-10-27 16:33:03 +02:00
const std::vector<gid_t>& supp_gids, const CapSet& capabilities,
unsigned namespace_flags, const std::string& seclabel,
const std::vector<std::string>& args);
static std::unique_ptr<Service> MakeTemporaryOneshotService(const std::vector<std::string>& args);
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
bool IsRunning() { return (flags_ & SVC_RUNNING) != 0; }
bool ParseLine(const std::vector<std::string>& args, std::string* err);
bool ExecStart();
bool Start();
bool StartIfNotDisabled();
bool Enable();
void Reset();
void Stop();
void Terminate();
void Restart();
void Reap();
void DumpState() const;
remove emergency shutdown and improve init's reboot logic - Emergency shutdown just marks the fs as clean while leaving fs in the middle of any state. Do not use it anymore. - Changed android_reboot to set sys.powerctl property so that all shutdown can be done by init. - Normal reboot sequence changed to 1. Terminate processes (give time to clean up). And wait for completion based on ro.build.shutdown_timeout. Default value (when not set) is changed to 3 secs. If it is 0, do not terminate processes. 2. Kill all remaining services except critical services for shutdown. 3. Shutdown vold using "vdc volume shutdown" 4. umount all emulated partitions. If it fails, just detach. Wait in step 5 can handle it. 5. Try umounting R/W block devices for up to max timeout. If it fails, try DETACH. If umount fails to complete before reboot, it can be detected when system reboots. 6. Reboot - Log shutdown time and umount stat to log so that it can be collected after reboot - To umount emulated partitions, all pending writes inside kernel should be completed. - To umount /data partition, all emulated partitions on top of /data should be umounted and all pending writes should be completed. - umount retry will only wait up to timeout. If there are too many pending writes, reboot will discard them and e2fsck after reboot will fix any file system issues. bug: 36004738 bug: 32246772 Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and fs_stat after reboot. Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-13 19:54:47 +01:00
void SetShutdownCritical() { flags_ |= SVC_SHUTDOWN_CRITICAL; }
bool IsShutdownCritical() const { return (flags_ & SVC_SHUTDOWN_CRITICAL) != 0; }
void UnSetExec() {
is_exec_service_running_ = false;
flags_ &= ~SVC_EXEC;
}
static bool is_exec_service_running() { return is_exec_service_running_; }
const std::string& name() const { return name_; }
const std::set<std::string>& classnames() const { return classnames_; }
unsigned flags() const { return flags_; }
pid_t pid() const { return pid_; }
android::base::boot_clock::time_point time_started() const { return time_started_; }
int crash_count() const { return crash_count_; }
uid_t uid() const { return uid_; }
gid_t gid() const { return gid_; }
unsigned namespace_flags() const { return namespace_flags_; }
const std::vector<gid_t>& supp_gids() const { return supp_gids_; }
const std::string& seclabel() const { return seclabel_; }
const std::vector<int>& keycodes() const { return keycodes_; }
int keychord_id() const { return keychord_id_; }
void set_keychord_id(int keychord_id) { keychord_id_ = keychord_id; }
IoSchedClass ioprio_class() const { return ioprio_class_; }
int ioprio_pri() const { return ioprio_pri_; }
int priority() const { return priority_; }
int oom_score_adjust() const { return oom_score_adjust_; }
bool process_cgroup_empty() const { return process_cgroup_empty_; }
unsigned long start_order() const { return start_order_; }
const std::vector<std::string>& args() const { return args_; }
private:
using OptionParser = bool (Service::*) (const std::vector<std::string>& args,
std::string* err);
class OptionParserMap;
void NotifyStateChange(const std::string& new_state) const;
void StopOrReset(int how);
void ZapStdio() const;
void OpenConsole() const;
void KillProcessGroup(int signal);
void SetProcessAttributes();
init: Add support for ambient capabilities. Ambient capabilities are inherited in a straightforward way across execve(2): " If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. " This will allow us to get rid of the special meaning for AID_NET_ADMIN and AID_NET_RAW, and if desired, to reduce the use of file capabilities (which grant capabilities to any process that can execute the file). An additional benefit of the latter is that a single .rc file can specify all properties for a service, without having to rely on a separate file for file capabilities. Ambient capabilities are supported starting with kernel 4.3 and have been backported to all Android common kernels back to 3.10. I chose to not use Minijail here (though I'm still using libcap) for two reasons: 1-The Minijail code is designed to work in situations where the process is holding any set of capabilities, so it's more complex. The situation when forking from init allows for simpler code. 2-The way Minijail is structured right now, we would not be able to make the required SELinux calls between UID/GID dropping and other priv dropping code. In the future, it will make sense to add some sort of "hook" to Minijail so that it can be used in situations where we want to do other operations between some of the privilege-dropping operations carried out by Minijail. Bug: 32438163 Test: Use sample service. Change-Id: I3226cc95769d1beacbae619cb6c6e6a5425890fb
2016-10-27 16:33:03 +02:00
bool ParseCapabilities(const std::vector<std::string>& args, std::string *err);
bool ParseClass(const std::vector<std::string>& args, std::string* err);
bool ParseConsole(const std::vector<std::string>& args, std::string* err);
bool ParseCritical(const std::vector<std::string>& args, std::string* err);
bool ParseDisabled(const std::vector<std::string>& args, std::string* err);
bool ParseGroup(const std::vector<std::string>& args, std::string* err);
bool ParsePriority(const std::vector<std::string>& args, std::string* err);
bool ParseIoprio(const std::vector<std::string>& args, std::string* err);
bool ParseKeycodes(const std::vector<std::string>& args, std::string* err);
bool ParseOneshot(const std::vector<std::string>& args, std::string* err);
bool ParseOnrestart(const std::vector<std::string>& args, std::string* err);
bool ParseOomScoreAdjust(const std::vector<std::string>& args, std::string* err);
bool ParseMemcgLimitInBytes(const std::vector<std::string>& args, std::string* err);
bool ParseMemcgSoftLimitInBytes(const std::vector<std::string>& args, std::string* err);
bool ParseMemcgSwappiness(const std::vector<std::string>& args, std::string* err);
bool ParseNamespace(const std::vector<std::string>& args, std::string* err);
bool ParseSeclabel(const std::vector<std::string>& args, std::string* err);
bool ParseSetenv(const std::vector<std::string>& args, std::string* err);
bool ParseShutdown(const std::vector<std::string>& args, std::string* err);
bool ParseSocket(const std::vector<std::string>& args, std::string* err);
bool ParseFile(const std::vector<std::string>& args, std::string* err);
bool ParseUser(const std::vector<std::string>& args, std::string* err);
bool ParseWritepid(const std::vector<std::string>& args, std::string* err);
template <typename T>
bool AddDescriptor(const std::vector<std::string>& args, std::string* err);
static unsigned long next_start_order_;
static bool is_exec_service_running_;
std::string name_;
std::set<std::string> classnames_;
std::string console_;
unsigned flags_;
pid_t pid_;
android::base::boot_clock::time_point time_started_; // time of last start
android::base::boot_clock::time_point time_crashed_; // first crash within inspection window
int crash_count_; // number of times crashed within window
uid_t uid_;
gid_t gid_;
std::vector<gid_t> supp_gids_;
init: Add support for ambient capabilities. Ambient capabilities are inherited in a straightforward way across execve(2): " If you are nonroot but you have a capability, you can add it to pA. If you do so, your children get that capability in pA, pP, and pE. For example, you can set pA = CAP_NET_BIND_SERVICE, and your children can automatically bind low-numbered ports. " This will allow us to get rid of the special meaning for AID_NET_ADMIN and AID_NET_RAW, and if desired, to reduce the use of file capabilities (which grant capabilities to any process that can execute the file). An additional benefit of the latter is that a single .rc file can specify all properties for a service, without having to rely on a separate file for file capabilities. Ambient capabilities are supported starting with kernel 4.3 and have been backported to all Android common kernels back to 3.10. I chose to not use Minijail here (though I'm still using libcap) for two reasons: 1-The Minijail code is designed to work in situations where the process is holding any set of capabilities, so it's more complex. The situation when forking from init allows for simpler code. 2-The way Minijail is structured right now, we would not be able to make the required SELinux calls between UID/GID dropping and other priv dropping code. In the future, it will make sense to add some sort of "hook" to Minijail so that it can be used in situations where we want to do other operations between some of the privilege-dropping operations carried out by Minijail. Bug: 32438163 Test: Use sample service. Change-Id: I3226cc95769d1beacbae619cb6c6e6a5425890fb
2016-10-27 16:33:03 +02:00
CapSet capabilities_;
unsigned namespace_flags_;
std::string seclabel_;
std::vector<std::unique_ptr<DescriptorInfo>> descriptors_;
std::vector<ServiceEnvironmentInfo> envvars_;
Action onrestart_; // Commands to execute on restart.
std::vector<std::string> writepid_files_;
// keycodes for triggering this service via /dev/keychord
std::vector<int> keycodes_;
int keychord_id_;
IoSchedClass ioprio_class_;
int ioprio_pri_;
int priority_;
int oom_score_adjust_;
int swappiness_;
int soft_limit_in_bytes_;
int limit_in_bytes_;
bool process_cgroup_empty_ = false;
unsigned long start_order_;
std::vector<std::string> args_;
};
class ServiceList {
public:
static ServiceList& GetInstance();
// Exposed for testing
ServiceList();
void AddService(std::unique_ptr<Service> service);
void RemoveService(const Service& svc);
template <typename T, typename F = decltype(&Service::name)>
Service* FindService(T value, F function = &Service::name) const {
auto svc = std::find_if(services_.begin(), services_.end(),
[&function, &value](const std::unique_ptr<Service>& s) {
return std::invoke(function, s) == value;
});
if (svc != services_.end()) {
return svc->get();
}
return nullptr;
}
void DumpState() const;
auto begin() const { return services_.begin(); }
auto end() const { return services_.end(); }
const std::vector<std::unique_ptr<Service>>& services() const { return services_; }
const std::vector<Service*> services_in_shutdown_order() const;
private:
std::vector<std::unique_ptr<Service>> services_;
};
class ServiceParser : public SectionParser {
init: Stop combining actions In the past, I had thought it didn't make sense to have multiple Action classes with identical triggers within ActionManager::actions_, and opted to instead combine these into a single action. In theory, it should reduce memory overhead as only one copy of the triggers needs to be stored. In practice, this ends up not being a good idea. Most importantly, given a file with the below three sections in this same order: on boot setprop a b on boot && property:true=true setprop c d on boot setprop e f Assuming that property 'true' == 'true', when the `boot` event happens, the order of the setprop commands will actually be: setprop a b setprop e f setprop c d instead of the more intuitive order of: setprop a b setprop c d setprop e f This is a mistake and this CL fixes it. It also documents this order. Secondly, with a given 'Action' now spanning multiple files, in order to keep track of which file a command is run from, the 'Command' itself needs to store this. Ironically to the original intention, this increases total ram usage. This change now only stores the file name in each 'Action' instead of each 'Command'. All in all this is a negligible trade off of ram usage. Thirdly, this requires a bunch of extra code and assumptions that don't help anything else. In particular it forces to keep property triggers sorted for easy comparison, which I'm using an std::map for currently, but that is not the best data structure to contain them. Lastly, I added the filename and line number to the 'processing action' LOG(INFO) message. Test: Boot bullhead, observe above changes Test: Boot sailfish, observe no change in boot time Change-Id: I3fbcac4ee677351314e33012c758145be82346e9
2017-04-18 22:21:54 +02:00
public:
ServiceParser(ServiceList* service_list) : service_list_(service_list), service_(nullptr) {}
bool ParseSection(std::vector<std::string>&& args, const std::string& filename, int line,
std::string* err) override;
bool ParseLineSection(std::vector<std::string>&& args, int line, std::string* err) override;
void EndSection() override;
init: Stop combining actions In the past, I had thought it didn't make sense to have multiple Action classes with identical triggers within ActionManager::actions_, and opted to instead combine these into a single action. In theory, it should reduce memory overhead as only one copy of the triggers needs to be stored. In practice, this ends up not being a good idea. Most importantly, given a file with the below three sections in this same order: on boot setprop a b on boot && property:true=true setprop c d on boot setprop e f Assuming that property 'true' == 'true', when the `boot` event happens, the order of the setprop commands will actually be: setprop a b setprop e f setprop c d instead of the more intuitive order of: setprop a b setprop c d setprop e f This is a mistake and this CL fixes it. It also documents this order. Secondly, with a given 'Action' now spanning multiple files, in order to keep track of which file a command is run from, the 'Command' itself needs to store this. Ironically to the original intention, this increases total ram usage. This change now only stores the file name in each 'Action' instead of each 'Command'. All in all this is a negligible trade off of ram usage. Thirdly, this requires a bunch of extra code and assumptions that don't help anything else. In particular it forces to keep property triggers sorted for easy comparison, which I'm using an std::map for currently, but that is not the best data structure to contain them. Lastly, I added the filename and line number to the 'processing action' LOG(INFO) message. Test: Boot bullhead, observe above changes Test: Boot sailfish, observe no change in boot time Change-Id: I3fbcac4ee677351314e33012c758145be82346e9
2017-04-18 22:21:54 +02:00
private:
bool IsValidName(const std::string& name) const;
ServiceList* service_list_;
std::unique_ptr<Service> service_;
};
} // namespace init
} // namespace android
#endif