On devices that use FDE and APEX at the same time, we need to bring up a
minimal framework to be able to mount the /data partition. During this
period, a tmpfs /data filesystem is created, which doesn't contain any
of the updated APEXEs. As a consequence, all those processes will be
using the APEXes from the /system partition.
This is obviously not desired, as APEXes in /system may be old and/or
contain security issues. Additionally, it would create a difference
between FBE and FDE devices at runtime.
Ideally, we restart all processes that have started after we created the
tmpfs /data. We can't (re)start based on class names alone, because some
classes (eg 'hal') contain services that are required to start apexd
itself and that shouldn't be killed (eg the graphics HAL).
To address this, keep track of which processes are started after /data
is mounted, with a new 'mark_post_data' keyword. Additionally, create
'class_reset_post_data', which resets all services in the class that
were created after the initial /data mount, and 'class_start_post_data',
which starts all services in the class that were started after /data was
mounted.
On a device with FBE, these keywords wouldn't be used; on a device with
FDE, we'd use them to bring down the right processes after the user has
entered the correct secret, and restart them.
Bug: 118485723
Test: manually verified process list
Change-Id: I16adb776dacf1dd1feeaff9e60639b99899905eb
In particular, this allows services running as the root user to have
capabilities removed instead of always having full capabilities.
Test: boot device with a root service with an empty capabilities
option in init showing no capabilities in /proc/<pid>/status
Change-Id: I569a5573ed4bc5fab0eb37ce9224ab708e980451
This CL fixes the design problem of the previous mechanism for providing
the bootstrap bionic and the runtime bionic to the same path.
Previously, bootstrap bionic was self-bind-mounted; i.e.
/system/bin/libc.so is bind-mounted to itself. And the runtime bionic
was bind-mounted on top of the bootstrap bionic. This has not only caused
problems like `adb sync` not working(b/122737045), but also is quite
difficult to understand due to the double-and-self mounting.
This is the new design:
Most importantly, these four are all distinct:
1) bootstrap bionic (/system/lib/bootstrap/libc.so)
2) runtime bionic (/apex/com.android.runtime/lib/bionic/libc.so)
3) mount point for 1) and 2) (/bionic/lib/libc.so)
4) symlink for 3) (/system/lib/libc.so -> /bionic/lib/libc.so)
Inside the mount namespace of the pre-apexd processes, 1) is
bind-mounted to 3). Likewise, inside the mount namespace of the
post-apexd processes, 2) is bind-mounted to 3). In other words, there is
no self-mount, and no double-mount.
Another change is that mount points are under /bionic and the legacy
paths become symlinks to the mount points. This is to make sure that
there is no bind mounts under /system, which is breaking some apps.
Finally, code for creating mount namespaces, mounting bionic, etc are
refactored to mount_namespace.cpp
Bug: 120266448
Bug: 123275379
Test: m, device boots, adb sync/push/pull works,
especially with following paths:
/bionic/lib64/libc.so
/bionic/bin/linker64
/system/lib64/bootstrap/libc.so
/system/bin/bootstrap/linker64
Change-Id: Icdfbdcc1efca540ac854d4df79e07ee61fca559f
This reverts commit 2599088ff6.
Reason: Breaks some 3p apps.
Bug: 122920047
Test: run the app, login.
Change-Id: Idea332b1f91e9d2ac6ebd3879da7820c8ba2284f
This change makes the bionic libs and the dynamic linker from the
runtime APEX (com.android.runtime) available to all processes started
after apexd finishes activating APEXes.
Specifically, the device has two sets of bionic libs and the dynamic
linker: one in the system partition for pre-apexd processes and another
in the runtime APEX for post-apexd processes. The former is referred as
the 'bootstrap' bionic and are located at
/system/lib/{libc|libdl|libm}.so and /system/bin/linker. The latter is
referred as the 'runtime' bionic and are located at
/apex/com.android.runtime/lib/bionic/{libc|libdl|libm}.so and
/apex/com.android.runtime/bin/linker.
Although the two sets are located in different directories, at runtime,
they are accessed via the same path: /system/lib/* and
/system/bin/linker ... for both pre/post-apexd processes. This is done
by bind-mounting the bootstrap or the runtime bionic to the same path.
Keeping the same path is necessary because there are many modules and
apps that explicitly or implicitly depend on the fact that bionic libs
are located in /system/lib and are loaded into the default linker
namespace (which has /system/lib in its search paths).
Before the apexd is started, init executes a built-in action
'prepare_bootstrap_bionic' that bind-mounts the bootstrap bionic to the
mount points. Processes started during this time are provided with the
bootstrap bionic. Then after the apexd is finished, init executes
another built-in action 'setup_runtime_bionic' which again mounts the
runtime bionic to the same mount points, thus hiding the previous mounts
that target the bootstrap bionic. The mounting of the runtime bionic
(which is only for post-apexd processes) is hidden from pre-apexd
processes by changing propagation type of the mount points to 'private'
and execute the pre-apexd processes with a new mount namespace using
unshare(2). If a pre-apexd process crashes and re-launched after the
apexd is on, the process still gets the bootstrap bionic by unmounting
the runtime bionic which effectively un-hides the previous bind-mounts
targeting the bootstrap bionic.
Bug: 120266448
Test: device boots
Test: cat /proc/`pidof zygote`/mountinfo shows that
/system/lib/{libc|libdl|libm}.so and /system/bin/linker are from the
runtime APEX
Test: cat /proc/'pidof vold`/mountinfo shows that the same mount points
are from system partition.
Change-Id: I7ca67755dc0656c0f0c834ba94bf23ba9b1aca68
A service with 'updatable' option can be overriden by the same service
definition in APEXes.
/system/etc/init/foo.rc:
service foo /system/bin/foo
updatable
/apex/myapex/etc/init.rc:
service foo /apex/myapex/bin/foo
override
Overriding a non-updatable (i.e. without updatable option) service
from APEXes is prohibited.
When an updatable service is started before APEXes are all activated,
the execution is delayed until when the APEXes are all activated.
Bug: 117403679
Test: m apex.test; adb push <built_apex> /data/apex; adb reboot
adb shell, then lsof -p $(pidof surfaceflinger) shows that
the process is executing
/apex/com.android.example.apex@1/bin/surfaceflinger instead of
/system/bin/surfaceflinger
Change-Id: I8a57b8e7f6da81b4d2843e261a9a935dd279067c
The memcg.limit_percent option can be used to limit the cgroup's
max RSS to the given value as a percentage of the device's physical
memory. The memcg.limit_property option specifies the name of a
property that can be used to control the cgroup's max RSS. These
new options correspond to the arguments to the limitProcessMemory
function in frameworks/av/media/libmedia/MediaUtils.cpp; this will
allow us to add these options to the rc files for the programs that
call this function and then remove the callers in a later change.
There is also a change in semantics: the memcg.* options now have
an effect on all devices which support memory cgroups, not just
those with ro.config.low_ram or ro.config.per_app_memcg set to true.
This change also brings the semantics in line with the documentation,
so it looks like the previous semantics were unintentional.
Change-Id: I9495826de6e477b952e23866743b5fa600adcacb
Bug: 118642754
ParseLineSection() provides 'args' as an rvalue reference, so its
callers can and should use it as such. This saves some copying
overhead and cleans up the code a bit.
Test: boot
Change-Id: Ib906318583dc81de9ea585f5f09fdff35403be1b
'Critical' services have rebooted into bootloader, like all other
catastrophic init crashes, for years now. Update the text to match.
Test: n/a
Change-Id: Icfc41bf3e383958f14ecfaab9ca187e2c3dc7fd9
Allow services to specify a custom restart period via the
restart_period service option. This will allow services to be run
periodically, such as a service that needs to run every hour.
Allow services to specify a timeout period via the timeout_period
service option. This will allow services to be killed after the
timeout expires if they are still running. This can be combined with
restart_period for creating period services.
Test: test app restarts every minute
Change-Id: Iad017820f9a602f9826104fb8cafc91bfb4b28d6
Drop all references to keychord_id and id and instead use keycodes_
as the id. The keycodes are a std::vector<int> with an unique
sorted-order emplacement method added in the parser. Solves the
academic issue with duplicate keychords and trigger all services
that match rather than first match only.
Test: init_tests
Bug: 64114943
Change-Id: I5582779d81458fda393004c551c0d3c03d9471e0
Add the ability to enter a network namespace when launching a service.
Typical usage of this would be something similar to the below:
on fs
exec ip netns add namespace_name
service vendor_something /vendor/...
capabilities <lower than root>
user not_root
enter_namespace net /mnt/.../namespace_name
Note changes to the `ip` tool are needed to create the namespace in
the correct directory.
Bug: 73334854
Test: not yet
Change-Id: Ifa91c873d36d69db399bb9c04ff2362518a0b07d
FindService can't be used w/ interfaces due
to the fact that multiple interfaces can be
added to any given interface.
Bug: 79418581
Test: boot device, manually use ctl commands
Change-Id: I7c152630462c9b7509473bc190f5b30460fcc2bc
An earlier such change was reverted in commit e242a97db5.
Bug: 70487538
Test: ensure that angler can boot
Merged-In: Id5f57fce1c9b817a2650e0c848143d8a0d286bf0
Change-Id: Id5f57fce1c9b817a2650e0c848143d8a0d286bf0
For instance, on vendor.img:
service foo /vendor/bin/nfc
...
And then on odm.img:
service foo /odm/bin/super-nfc
override
Allows a service on ODM to override a HAL on vendor.
Bug: 69050941
Test: boot, init_tests
Change-Id: I4e908fb66e89fc6e021799fe1fa6603d3072d62a
Allow it to fail. When there is an error for a section ending,
print the error pointing to the line where the section starts.
Bug: 69050941
Test: boot, init_tests
Change-Id: I1d8ed25f4b74cc9ac24d38b8075751c7d606aea8
This associates every service with a list of HIDL services
it provides. If these are disabled, hwservicemanager will
request for the service to startup.
Bug: 64678982
Test: manual with the light service
Change-Id: Ibf8a6f1cd38312c91c798b74574fa792f23c2df4
One of the major aspects of treble is the compartmentalization of system
and vendor components, however init leaves a huge gap here, as vendor
init scripts run in the same context as system init scripts and thus can
access and modify the same properties, files, etc as the system can.
This change is meant to close that gap. It forks a separate 'subcontext'
init that runs in a different SELinux context with permissions that match
what vendors should have access to. Commands get sent over a socket to
this 'subcontext' init that then runs them in this SELinux context and
returns the result.
Note that not all commands run in the subcontext; some commands such as
those dealing with services only make sense in the context of the main
init process.
Bug: 62875318
Test: init unit tests, boot bullhead, boot sailfish
Change-Id: Idf4a4ebf98842d27b8627f901f961ab9eb412aee
Add a new service option, `rlimit` that allows a given rlimit to be
set for a specific service instead of globally.
Use the same parsing, now allowing text such as 'cpu' or 'rtprio'
instead of relying on the enum value for the `setrlimit` builtin
command as well.
Bug: 63882119
Bug: 64894637
Test: boot bullhead, run a test app that attempts to set its rtprio to
95, see that the priority set fails normally but passes when
`rlimit rtprio 99 99` is used as its service option.
See that this fails when `rlimit rtprio 50 50` is used as well.
Test: new unit tests
Change-Id: I4a13ca20e8529937d8b4bc11718ffaaf77523a52
Log Service failures via Result<T> such that their context can be
captured when interacting with services through builtin functions.
Test: boot bullhead
Change-Id: I4d99744d64008d4a06a404e3c9817182c6e177bc
Init keep its own copy of the environment that it uses for execve when
starting services. This is unnecessary however as libc already has
functions that mutate the environment and the environment that init
uses is clean for starting services. This change removes init's copy
of the environment and uses the libc functions instead.
This also makes small clean-up to the way the Service class stores
service specific environment variables.
Test: boot bullhead
Change-Id: I7c98a0b7aac9fa8f195ae33bd6a7515bb56faf78
ServiceManager is essentially just a list now that the rest of its
functionality has been moved elsewhere, so the class is renamed
appropriately.
The ServiceList::Find* functions have been cleaned up into a single
smaller interface.
The ServiceList::ForEach functions have been removed in favor of
ServiceList itself being directly iterable.
Test: boot bullhead
Change-Id: Ibd57c103338f03b83d81e8b48ea0e46cd48fd8f0
signal_handler.cpp itself needs to be cleaned up, but this is a step
to clean up ServiceManager.
Test: boot bullhead
Change-Id: I81f1e8ac4d09692cfb364bc702cbd3deb61aa55a
These can be implemented without ServiceManager, so we remove them and
make ServiceManager slightly less of a God class.
Test: boot bullhead
Test: init unit tests
Change-Id: Ia6e546fe5292255412245256f7d230af4ece135f
The time data types associated with restarting processes halfway moved
to std::chrono and halfway didn't. In this intermediate state, the
times would get converted from nanoseconds to seconds then to
milliseconds. The precision lost when converting to seconds would
cause the main loop of init to spin whenever a process was within a
second of being restarted.
This patch cleans up this logic and uses nanoseconds and milliseconds
explicitly, with a ceiling to milliseconds to prevent unneeded
spinning.
Test: boot bullhead, kill processes, see that they restart sanely.
Change-Id: I0b017ba0e50c09704b0c5cdfcde1dba461804593
* Remove the Parser singleton (Hooray!)
* Rename parser.* to tokenizer.* as this is actually a tokenizer
* Rename init_parser.* to parser.* as this is a generic parser
* Move contents of init_parser_test.cpp to service_test.cpp as this
actually is a test of the parsing in MakeExecOneshotService() and
nothing related to (init_)parser.cpp
Test: boot bullhead
Test: bool sailfish
Test: init unit tests
Change-Id: I4fe39e6483f58ebd3ce5ee715a45dbba0acf5d91
Currently, the order that we kill to services during shutdown is the
order of services_ in ServiceManager and that is defacto the order in
which they were parsed, which is not a very useful ordering.
Related to this, we have seen a few issues during shutdown that may be
related to services with dependencies on other services, where the
dependency is killed first and the dependent service then misbehaves.
This change allows services to keep track of the order in which they
were started and shutdown then uses that information to kill running
services in the opposite order that they were started.
Bug: 64067984
Test: Boot and reboot bullhead
Change-Id: I6b4cacb03aed2a72ae98a346bce41ed5434a09c2
Allow configuring memory.swappiness, memory.soft_limit_in_bytes
and memory.limit_in_bytes by init; by doing so there is better
control of memory consumption per native app.
Test: tested on gobo branch.
bug: 63765067
Change-Id: I8906f3ff5ef77f75a0f4cdfbf9d424a579ed52bb
- "shutdown critical" prevents killing the service during
shutdown. And the service will be started if not running.
- Without it, services will be killed by SIGTERM / SIGKILL during shutdown.
- Even services with "shutdown critical" will be killed if shutdown
times out.
- Removes ueventd and vold from hard coded list. Each service's rc will
be updated to add "shutdown critical". watchdogd is still kept in the list.
bug: 37626581
Test: reboot and check last kmsg
Change-Id: Ie8cc699d1efbc59b9a2561bdd40fec64aed5a4bb
We have been seeing panics and errors during shutdown sequence in
some vendor's platform, and it is required to disable error handling
during shutdown.
This CL separates the shutdown request to execute another "shutdown"
trigger at the beginning of shutdown stage. And vendor can use this
trigger to add custom commands needed for shutting down gracefully.
Bug: 38203024
Bug: 62084631
Test: device reboot/shutdown
Change-Id: I3fac4ed59f06667d86e477ee55ed391cf113717f
First kill the process group before killing the cgroup to catch
the hopeful case that killing the cgroup becomes a no-op as all of its
processes have already been killed.
Do not report an error if kill fails due to ESRCH, as this happens
often when reaping processes due to the order in which we call
waitpid() and kill().
Do not call killProcessGroup in libprocessgroup if we have already
successfully killed and removed a process group.
Bug: 36661364
Bug: 36701253
Bug: 37540956
Test: Reboot bullhead
Test: Start and stop services
Test: Init unit tests
Change-Id: I172acf0f8e00189f910f865f4635a7b1782fc7e3
Add unit test to ensure all POD types of Service are initialized.
Bug: 37855222
Test: Ensure bugreport is triggered via keychord properly.
Test: New unit tests
Change-Id: If2cfea15a74ab417a7b909a60c264cb8eb990de7
Remove the dependency on Action and Service from what should be a
generic Parser class.
Make ActionParser, ImportParser, and ServiceParser take a pointer to
their associated classes instead of accessing them through a
singleton.
Misc fixes to SectionParser Interface:
1) Make SectionParser::ParseLineSection() non-const as it always should
have been.
2) Use Rvalue references where appropriate
3) Remove extra std::string& filename in SectionParser::EndFile()
4) Only have SectionParser::ParseSection() as pure virtual
Document SectionParser.
Make ImportParser report the filename and line number of failed imports.
Make ServiceParser report the filename and line number of duplicated services.
Test: Boot bullhead
Change-Id: I86568a5b375fb4f27f4cb235ed1e37635f01d630
In the past, I had thought it didn't make sense to have multiple
Action classes with identical triggers within ActionManager::actions_,
and opted to instead combine these into a single action. In theory,
it should reduce memory overhead as only one copy of the triggers
needs to be stored.
In practice, this ends up not being a good idea.
Most importantly, given a file with the below three sections in this
same order:
on boot
setprop a b
on boot && property:true=true
setprop c d
on boot
setprop e f
Assuming that property 'true' == 'true', when the `boot` event
happens, the order of the setprop commands will actually be:
setprop a b
setprop e f
setprop c d
instead of the more intuitive order of:
setprop a b
setprop c d
setprop e f
This is a mistake and this CL fixes it. It also documents this order.
Secondly, with a given 'Action' now spanning multiple files, in order
to keep track of which file a command is run from, the 'Command'
itself needs to store this. Ironically to the original intention,
this increases total ram usage. This change now only stores the file
name in each 'Action' instead of each 'Command'. All in all this is a
negligible trade off of ram usage.
Thirdly, this requires a bunch of extra code and assumptions that
don't help anything else. In particular it forces to keep property triggers
sorted for easy comparison, which I'm using an std::map for currently,
but that is not the best data structure to contain them.
Lastly, I added the filename and line number to the 'processing
action' LOG(INFO) message.
Test: Boot bullhead, observe above changes
Test: Boot sailfish, observe no change in boot time
Change-Id: I3fbcac4ee677351314e33012c758145be82346e9
Remove includes of "log.h" that really want <android-base/logging.h>
Fix header include order
Remove headers included in .cpp files that their associated .h already includes
Remove some unused headers
Test: boot bullhead
Change-Id: I2b415adfe86a5c8bbe4fb1ebc53c7b0ee2253824
Use this for bootstat and init. This replaces the custom uptime parser in
bootstat.
This is a reland of aosp/338325 with a stubbed implementation for Darwin.
This change also has clang_format fixes (automatic).
Bug: 34352037
Test: chrono_utils_test
Change-Id: I72a62a3ca1ccfc0a4ccc6294ff1776c263144686
Exec services may also want to set other service flags such as
priority. Instead of expanding the exec syntax to handle this, create
a new command, exec_start, that will treat an existing service
definition as an exec service. The new exec_start command will start
the service then halt init from executing further commands until the
service has exited.
This change additionally encapsulates the waiting_for_exec logic into
ServiceManager and removes the ambiguous 'bool' return value from
Reap() which previously indicated if a Reaped service was an exec
service or not.
Bug: 36511808
Bug: 36102163
Test: Bullhead boots, services run with exec_start as they do exec.
Change-Id: I44f775cf1c1dd81d5c715f44fdc150c651a2c80a
Add support of multiple class names in service, so that related services
can be grouped together. By doing this, we can start/stop some services
for special purpose. For example, early zygote, early boot animation
and etc.
Bug: 36535312
Test: marlin boots with defined classes
Change-Id: Ifeaaf034fd836816e24f3775bece53ea83faada6
- Emergency shutdown just marks the fs as clean while leaving fs
in the middle of any state. Do not use it anymore.
- Changed android_reboot to set sys.powerctl property so that
all shutdown can be done by init.
- Normal reboot sequence changed to
1. Terminate processes (give time to clean up). And wait for
completion based on ro.build.shutdown_timeout.
Default value (when not set) is changed to 3 secs. If it is 0, do not
terminate processes.
2. Kill all remaining services except critical services for shutdown.
3. Shutdown vold using "vdc volume shutdown"
4. umount all emulated partitions. If it fails, just detach.
Wait in step 5 can handle it.
5. Try umounting R/W block devices for up to max timeout.
If it fails, try DETACH.
If umount fails to complete before reboot, it can be detected when
system reboots.
6. Reboot
- Log shutdown time and umount stat to log so that it can be collected after reboot
- To umount emulated partitions, all pending writes inside kernel should
be completed.
- To umount /data partition, all emulated partitions on top of /data should
be umounted and all pending writes should be completed.
- umount retry will only wait up to timeout. If there are too many pending
writes, reboot will discard them and e2fsck after reboot will fix any file system
issues.
bug: 36004738
bug: 32246772
Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and
fs_stat after reboot.
Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2