If there are snapshot metadata persisting in /metadata/ota/snapshots,
remove them before applying a new update. Make sure that
the snapshots are indeed invalid before removing them.
On a sidenote, add a comment in init.cpp related to
b/223076262.
Bug: 228250473
Test: 1: Apply OTA in recovery through adb sideload
2: Reboot
3: Apply OTA OTA again through update_device.py
4: Re-run Full OTA updates just from update_device.py
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I116bbafae09042b9c391ccd58c102704571c214e
* changes:
DO NOT MERGE: Revert "init: Add more diagnostics for signalfd hangs."
DO NOT MERGE: Revert "init: Add diagnostics for signalfd epoll failures."
This adds two new diagnostics. First, signalfd reads are now non-blocking. If the read takes more than 10 seconds, we log an error.
Second, init now wakes up from epoll() every 10 seconds. If it waits on an "exec" command for more than 10 seconds, it logs an error.
This change will be reverted as soon as we get feedback.
Bug: 223076262
Test: device boots
Change-Id: I7ee98d159599217a641b3de2564a92c2435f57ef
This reverts commit 471643a909.
Reason for revert: Given https://r.android.com/1960063, it is safe to revert this diagnostics patch
Change-Id: Ib3600c1982ee10a0204ac0fdbc3e160c2833ed07
This patch attempts to diagnose snapuserd hangs by performing reads
immediately after entering second-stage init. This is done by spawning
two threads: one to perform the reads, and another to wait for the read
thread to finish. If any aspect of the read fails, or the read thread
does not complete in 10 seconds, then a list of snapuserd's open file
descriptors are logged.
Bug: 207298357
Test: apply working OTA, check logcat for success
apply broken OTA, check logcat for fd map
Change-Id: I549e07b7d576fcdaca9b2d6ff33e0924c3812c07
This property will hold the major.minor part of the kernel version (e.g. "5.4"), allowing init scripts to act depending on that version, enabling and disabling certain features.
Bug: 194156700
Change-Id: Icec640b8a7150b344d9aa3bc0bdbcdae050c7c45
Test: manual on a Pixel device
Signed-off-by: Alexander Potapenko <glider@google.com>
Currently there is no socket for daemon instances launched during the
selinux phase of init. We don't create any sockets due to the complexity
of the required sepolicy.
This workaround will allow us to create the socket with very minimal
sepolicy changes. init will launch a one-off instance of snapuserd in
"proxy" mode, and then the following steps will occur:
1. The proxy daemon will be given two sockets, the "normal" socket that
snapuserd clients would connect to, and a "proxy" socket.
2. The proxy daemon will listen on the proxy socket.
3. The first-stage daemon will wake up and connect to the proxy daemon
as a client.
4. The proxy will send the normal socket via SCM_RIGHTS, then exit.
5. The first-stage daemon can now listen and accept on the normal
socket.
Ordering of these events is achieved through a snapuserd.proxy_ready
property.
Some special-casing was needed in init to make this work. The snapuserd
socket owned by snapuserd_proxy is placed into a "persist" mode so it
doesn't get deleted when snapuserd_proxy exits. There's also a special
case method to create a Service object around a previously existing pid.
Finally, first-stage init is technically on a different updateable
partition than snapuserd. Thus, we add a way to query snapuserd to see
if it supports socket handoff. If it does, we communicate this
information through an environment variable to second-stage init.
Bug: 193833730
Test: manual test
Change-Id: I1950b31028980f0138bc03578cd455eb60ea4a58
restrictions
Use the property ro.product.enforce_debugfs_restrictions to enable
debugfs restrictions instead of checking the launch API level. Vendors
can enable build-time as well as run-time debugfs restrictions by
setting the build flag PRODUCT_SET_DEBUGFS_RESTRICTIONS true which in
turn sets ro.product.enforce_debugfs_restrictions true as well enables
sepolicy neverallow restrictions that prevent debugfs access. The
intention of the build flag is to prevent debugfs dependencies from
creeping in during development on userdebug/eng builds.
Test: build and boot
Bug: 184381659
Change-Id: If555037f973e6e4f35eb7312637f58e8360c3013
This check in export_oem_lock_status happens after PropertyInit() so
all of the ro.boot.* properties will be set. There is no need to import
the kernel cmdline again.
Test: build and boot cuttlefish
Bug: 173815685
Change-Id: I5df7c0105566d4617442dbb8e77eb26e465775f1
Add a property ro.boottime.init.modules to provide kernel modules
loading time in milliseconds. Also add corresponding log to show in init
log along with loaded module count.
Test: boot test
Bug: 178143513
Change-Id: I77e3939c2a271da6841350a8c2a34ad32f637377
When there is a transition of daemon from selinux stage, we observe
intermittent hangs during OTA. This is a workaround wherein
we don't do the transition and allow the daemon to continue which
was spawned during selinux stage.
Bug: 179331261
Test: Incremental OTA, full OTA on pixel
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I622a0ed8afcd404bac4919b1de00728de2c12eaf
This has been something the kernel does automatically since 2014, so
there's no obvious reason to add extra work during boot to duplicate
that effort.
Bug: http://b/179086242
Test: treehugger
Change-Id: I44cce99a892e4f2a6a303c2126bd29f955f5fb23
This change will help non-user builds with keeping debugfs
disabled during run time. Instead, debugfs will be mounted by init
to enable boot time initializations to set up vendor debug data
collection and unmounted after boot. It will be also be mounted by
dumpstate for bug report generation and unmounted after.
This change is only intended to help vendors (who depend on debugfs to
collect debug information from userdebug/eng builds) keep debugfs
disabled during runtime. Platform code must not depend on debugfs at all.
Test: manual
Bug: 176936478
Change-Id: I2e89d5b9540e3de094976563682d4b8c5c125876
With compressed VAB updates, it is not possible to mount /system without
first running snapuserd, which is the userspace component to the dm-user
kernel module. This poses a problem because as soon as selinux
enforcement is enabled, snapuserd (running in a kernel context) does not
have access to read and decompress the underlying system partition.
To account for this, we split SelinuxInitialize into multiple steps:
First, sepolicy is read into an in-memory string.
Second, the device-mapper tables for all snapshots are rebuilt. This
flushes any pending reads and creates new dm-user devices. The original
kernel-privileged snapuserd is then killed.
Third, sepolicy is loaded from the in-memory string.
Fourth, we re-launch snapuserd and connect it to the newly created
dm-user devices. As part of this step we restorecon device-mapper
devices and /dev/block/by-name/super, since the new snapuserd is in a
limited context.
Finally, we set enforcing mode.
This sequence ensures that snapuserd has appropriate privileges with a
minimal number of permissive audits.
Bug: 173476209
Test: full OTA with VABC applies and boots
Change-Id: Ie4e0f5166b01c31a6f337afc26fc58b96217604e
This hasn't helped investigating the issue, and the issue itself isn't
a problem anymore, so we remove these logs.
Bug: 155203339
Test: reboot
Change-Id: I20e51d8fcad5572906a8d556bec8a8dee4522834
Running snapuserd before early-init means ueventd is missing, which
means we can't use WaitForFile() when dm-user misc devices are created.
Fix this by starting the transition after early-init.
Bug: 173476209
Test: full OTA with VABC applies and boots
Change-Id: Ice594cceb44981ae38deb82289d313c14726c36b
snapuserd is used as a user-space block device implementation during
Virtual A/B Compression-enabled updates. It has to be started in
first-stage init, so that updated partitions can be mounted.
Once init reaches second-stage, and sepolicy is loaded, we want to
re-launch snapuserd at the correct privilege level. We accomplish this
by rebuilding the device-mapper tables of each block device, which
allows us to re-bind the kernel driver to a new instance of snapuserd.
After this, the old daemon can be shut down.
Ideally this transition happens as soon as possible, before any .rc
scripts are run. This minimizes the amount of time the original
snapuserd is running, as well as any ambiguity about which instance of
snapuserd is the correct one.
The original daemon is sent a SIGTERM signal once the transition is
complete. The pid is stored in an environment variable to make this
possible (these details are implemented in libsnapshot).
Bug: 168259959
Test: manual test
Change-Id: Ife9518e502ce02f11ec54e7f3e6adc6f04d94133
This change does the following:
- Create /second_stage_resources empty dir at root.
- At runtime:
- At first stage init:
- mount tmpfs to /second_stage_resources.
- Copy /system/etc/ramdisk/build.prop to
/second_stage_resources/system/etc/ramdisk/build.prop
- At second stage init:
- Load prop from the above path
- umount /second_stage_resources
Test: getprop -Z
Test: getprop
Bug: 169169031
Change-Id: I18b16aa5fd42fa44686c858982a17791b2d43489
To support "override" services, we need to scan partitions from least
speicific to most specific.
Bug: 163021585
Test: m
Change-Id: I26a6de4f7fb571c60038e803137a4b1c237792fd
If the system partition has been updated and _PATH_DEFPATH has a new
value, then we must set $PATH in second stage init to take on the new
value, as well as having set it in first stage init.
Bug: 160210288
Test: build
Change-Id: I18765709dc9bff9379b0ae39272199cf74a79d2f
Now we have more things e.g. loading kernel modules, initialize
selinux. And before all these, system cannot make further progress. It
should be beneficial to boost this critical peroid in init.
Benchmark on a Pixel device shows this saves 100+ms on early boot
ToT release + This CL (P15538587)
D/BaseBootTest: init_stage_second_START_TIME_avg : 1563.4
D/BaseBootTest: ueventd_Coldboot_avg : 219.0
D/BaseBootTest: action_init_/system/etc/init/hw/init.rc:114_START_TIME_avg : 2103.7
ToT release (6654154)
D/BaseBootTest: init_stage_second_START_TIME_avg : 1639.0
D/BaseBootTest: ueventd_Coldboot_avg : 238.2
D/BaseBootTest: action_init_/system/etc/init/hw/init.rc:114_START_TIME_avg : 2226.0
Bug: 143857500
Bug: 147997403
Bug: 160271169
Bug: 160696502
Test: Boottime
Signed-off-by: Wei Wang <wvw@google.com>
Change-Id: I21c9e051f4ae3e953d991c031f151b2779702548
With GKI we find in certain situations the timing of the drivers
loading is delayed as compared to a monolithic kernel. This
introduces a race where during second stage init, the attributes
inside /sys/class/udc/ might not be set by the time
SetUsbController() is called.
To address this, we also call SetUsbController() until the property
sys.usb.controller is set at the bottom of the event loop.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Bug: 151950334
Test: make sure user space fastbootd comes up reliably for a GKI kernel
Change-Id: Iececd8ffa3e6641554d215d622d8dab72d85d34d
There are devices stuck waiting for vendor_init to finish a command,
without giving much more information. Instead of setting aside the
last run command, it's more valuable to store and dump the last 30
logs seen.
Bug: 155203339
Test: these logs appear during hung reboots
Test: normal reboots have no difference.
Change-Id: I99cae248eb81eaf34ef66b910fa653a9fa135f68
When the subcontext code was redone to allow only one subcontext
(vendor_init), the code for restarting it and for terminating it
during shutdown was not updated, resulting in it not working.
Bug: 155203339
Test: kill subcontext init and notice it restart
Test: subcontext init stops during shutdown
Change-Id: Ib77f59d1e7be0ffcfd3f31c8450dc022c20bb322
Bug: 150863651
Test: add delays during critical parts of shutdown and see the
expected debug information
Change-Id: Ida586903fd3eefc32ca9ee34ea2db037896ed9f4
We already stop queue'ing new control messages, but we forgot to stop
handling those control messages that are already queued. This CL
fixes that.
Bug: 150863651
Test: CF reboots appropriately
Change-Id: Ifea07a30b868de23eb735db10d8bae410e1b98bb
Particularly in the case of the device failing to reboot. Some test
devices are showing that they've received the reboot message but
without rebooting.
Bug: 150863651
Test: prevent init from handling reboot and see a stacktrace
Test: reboot works normally
Change-Id: Ide001dadbb9e9cd235ea509066e6ae6664bb429b
Some services are lazy HALs on some platforms and not lazy HALs on
others; this is known at runtime by hwservicemanager, so this change
adds these properties to allow hwservicemanager to turn one oneshot
(for lazy HALs). It may also be required to make a lazy HAL not lazy
anymore, and oneshot_off is provided for this.
Bug: 147841742
Test: new unit test that turn on and off oneshot on a service (bootanim)
and observes that it follows the expected behavior
Change-Id: I79524e2c9a5008f90c8d3bc40920fde00602a439
We want to ignore SIGPIPE within init, but if we use SIG_IGN, that
would be inherited by child processes through exec(), which we do not
want to have happen. We instead set up a real signal handler with a
no-op handler function, that will ignore SIGPIPE within init, but will
not be inherited across exec().
This fixes c29c2baa69 ("init: Add support for native service
registration with lmkd"), when SIG_IGN was introduced.
Note that we caught this issue before shipping a release with that
change, so the major motivation here is to not cause a behavior change
in init.
Bug: 151581751
Test: children of init that don't explicitly block SIGPIPE exit when
sent SIGPIPE
Test: children of init that do explicitly block SIGPIPE do not exit
when sent SIGPIPE
Test: init does not exit when sent SIGPIPE
Test: init exits when sent SIGABRT
Change-Id: Ieda8555fd03836bcd672a422fe673a8369ad9beb
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible, the
only difference is the socket's buffer must be filled before init deadlocks.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
A previous version of this change was reverted and added locks around
all service operations and allowed the property thread to spawn
services directly. This was complex due to the fact that this code
was not designed to be multi-threaded. It was reverted due to
apparent issues during reboot. This change keeps a queue of processes
pending control messages, which it will then handle in the future. It
is less flexible but safer.
Bug: 146877356
Bug: 148236233
Bug: 150863651
Bug: 151251827
Test: multiple reboot tests, safely restarting hwservicemanager
Change-Id: Ice773436e85d3bf636bb0a892f3f6002bdf996b6
This is apparently causing problems with reboot.
This reverts commit 7205c62933.
Bug: 150863651
Test: build
Change-Id: Ib8a4835cdc8358a54c7acdebc5c95038963a0419
The new exported DSU status removes the need to make blocking binder
calls out of system server during device boot.
Bug: 149790245
Bug: 149716497
Test: adb shell am start-activity \
-n com.android.dynsystem/com.android.dynsystem.VerificationActivity \
-a android.os.image.action.START_INSTALL \
-d file:///storage/emulated/0/Download/system.raw.gz \
--el KEY_SYSTEM_SIZE $(du -b system.raw|cut -f1) \
--el KEY_USERDATA_SIZE 8589934592
Change-Id: I27fae316214498407a73474ca8b93aec3518e4b5
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible, the
only difference is the socket's buffer must be filled before init deadlocks.
There are possible partial solutions here: the socket's buffer may be
increased or property_service may only send messages for the
properties that init will take action on, however all of these
solutions still lead to eventual deadlock. The only complete solution
is to handle these messages asynchronously.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
4) A lock for any actions with ServiceList or any Services, enforced
through thread annotations, particularly since this code was not
designed with the intention of being multi-threaded.
Bug: 146877356
Bug: 148236233
Test: boot
Test: kill hwservicemanager without deadlock
Change-Id: I84108e54217866205a48c45e8b59355012c32ea8
We currently do not handle process actions (restarting services or
exiting timedout services) when we are waiting for an exec service,
but this seems to be the wrong behavior. Particularly, an exec
service may depend on a previously started service and if that service
crashes, we will deadlock unless init restarts it.
Bug: 146920034
Test: build, boot
Change-Id: Id2fc936b8a7b989862ba4c32c398a544941e0e76
Historically, the syscall was controlled by a system-wide
perf_event_paranoid sysctl, which is not flexible enough to allow only
specific processes to use the syscall. However, SELinux support for the
syscall has been upstreamed recently[1] (and is being backported to
Android R release common kernels).
[1] da97e18458
As the presence of these hooks is not guaranteed on all Android R
platforms (since we support upgrades while keeping an older kernel), we
need to test for the feature dynamically. The LSM hooks themselves have
no way of being detected directly, so we instead test for their effects,
so we perform several syscalls, and look for a specific success/failure
combination, corresponding to the platform's SELinux policy.
If hooks are detected, perf_event_paranoid is set to -1 (unrestricted),
as the SELinux policy is then sufficient to control access.
This is done within init for several reasons:
* CAP_SYS_ADMIN side-steps perf_event_paranoid, so the tests can be done
if non-root users aren't allowed to use the syscall (the default).
* init is already the setter of the paranoid value (see init.rc), which
is also a privileged operation.
* the test itself is simple (couple of syscalls), so having a dedicated
test binary/domain felt excessive.
I decided to go through a new sysprop (set by a builtin test in
second-stage init), and keeping the actuation in init.rc. We can change
it to an immediate write to the paranoid value if a use-case comes up
that requires the decision to be made earlier in the init sequence.
Bug: 137092007
Change-Id: Ib13a31fee896f17a28910d993df57168a83a4b3d
Currently linker config locates under /dev, but this makes some problem
in case of using two system partitions with chroot. To match system
image and configuration, linker config better stays under /linkerconfig
Bug: 144966380
Test: m -j passed && tested from cuttelfish
Change-Id: Iaae5af65721eee8106311c1efb4760a9db13564a
Such services will be re-parsed and added back to the service list
during post-fs-data stage.
Test: adb reboot userspace
Test: atest CtsInitTestCases
Bug: 145669993
Bug: 135984674
Change-Id: Ibb393dfe0f101c4ebe37bc763733fd5d981d3691
Init is no longer a special case and talks to property service just
like every other client, therefore move it away from property_set()
and to android::base::SetProperty().
In doing so, this change moves the initial property set up from the
kernel command line and property files directly into PropertyInit().
This makes the responsibilities between init and property services
more clear.
Test: boot, unit test cases
Change-Id: I36b8c83e845d887f1b203355c2391ec123c3d05f
Previously, we assumed that TriggerShutdown() should never be called
from vendor_init and used property service as a back up in case it
ever did. We have since then found out that vendor_init may indeed
call TriggerShutdown() and we want to make it just as strict as it is
in init, wherein it will immediately start the shutdown sequence
without executing any further commands.
Test: init unit tests, trigger shuttdown from init and vendor_init
Change-Id: I1f44dae801a28269eb8127879a8b7d6adff6f353