Commit graph

9 commits

Author SHA1 Message Date
Akilesh Kailash
5140f3ad47 init: Wait for daemon to fully spin up all threads
During selinux transition, daemon will notify `init` process
by writing to file "/metadata/ota/daemon-alive-indicator".

Init will wait until daemon notifies it. Furthermore, daemon
will only write to that file once all threads are spin up
and attached to dm-user misc devices.

Once snapshot-merge is completed, this file will be removed.
Additionally, during boot, init will also ensure that
there are no stale files and will try to remove just
before selinux transition.

Bug: 262407519
Test: OTA on Pixel - Verify new file exits and init waits until daemon
is fully up.
Change-Id: Iabef58ad282d80a7afa493e9df9468ae41a13e44
Signed-off-by: Akilesh Kailash <akailash@google.com>
2023-01-11 19:24:56 +00:00
Akilesh Kailash
035e557fd3 init: Detach daemon only after sepolicy is loaded
The new sequence of operation would be:

1: Load sepolicy - Daemon will continue to be alive and serve any I/O request

2: After sepolicy loading is complete - Switch the device-mapper tables.

3: Kill the block device daemon launched in the first-stage init.

4: Re-launch the daemon with the correct selinux labels set.

5: Enforce the sepolicy

Bug: 240321741
Test: Full OTA on pixel
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Idd392f0f0aae7d93e546c0ec0762e6c07b6263e4
2022-10-10 21:58:52 +00:00
Akilesh Kailash
4ffe8a3109 init: Set oom_score_adj to snapuserd process
When a process is started as a native service,
oom_score_adj is set to -1000 so that processes
are unkillable by lmkd.

During boot, snapuserd process is not started as a service;
hence, we need to set the oom_score_adj explicitly else in
the event of low memory situation, lmkd can kill the
process thereby device can never boot.

Bug: 234691483
Test: th and OTA on Pixel
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ic2c85aa470522b4bc847a16b4f5cebfc528ed3cf
2022-06-02 20:49:03 +00:00
Akilesh Kailash
fd5562b0a5 init: Wait for snapuserd before starting second stage
This is a race between init process and bionic libc initialization of
snapuserd.

init->fork() ----------------> SecondStageMain() -> PropertyInit()
       |
       |
       v
      execveat ---> __libc_init_common() -> __system_properties_init()
     (snapuserd)

When init process calls PropertyInit(), /dev/__properties__ directory
is created. When bionic libc of snapuserd daemon invokes __system_properties_init
_after_ init process PropertyInit() function is invoked, libc will
try to initialize the property by reading
/system/etc/selinux/plat_property_contexts. Since any reads on /system
has to be served by snapuserd, this specific read from libc cannot be
serviced leading to deadlock.

Reproduce the race by inducing a sleep of 1500ms just before execveat()
so that init process calls PropertyInit() before bionic libc
initialization. This leads to deadlock
immediately and with additional kernel instrumentation with debug
logs confirms the failure:

======================================================
init: Relaunched snapuserd with pid: 428
ext4_file_open: SNAPUSERD: path /system/etc/selinux/plat_property_contexts - Pid: 428 comm 8
ext4_file_read_iter: SNAPUSERD for path: /system/etc/selinux/plat_property_contexts pid: 428 comm 8

[   25.418043][  T428]  ext4_file_read_iter+0x3dc/0x3e0
[   25.423000][  T428]  vfs_read+0x2e0/0x354
[   25.426986][  T428]  ksys_read+0x7c/0xec
[   25.430894][  T428]  __arm64_sys_read+0x20/0x30
[   25.435419][  T428]  el0_svc_common.llvm.17612735770287389485+0xd0/0x1e0
[   25.442095][  T428]  do_el0_svc+0x28/0xa0
[   25.446100][  T428]  el0_svc+0x14/0x24
[   25.449825][  T428]  el0_sync_handler+0x88/0xec
[   25.454343][  T428]  el0_sync+0x1c0/0x200

=====================================================

Fix:

Before starting init second stage, we will wait
for snapuserd daemon to be up and running. We do a simple probe by
reading system partition. This read will eventually be serviced by
daemon confirming that daemon is up and running. Furthermore,
we are still in the kernel domain and sepolicy has not been enforced yet.
Thus, access to these device mapper block devices are ok even though
we may see audit logs.

Note that daemon will re-initialize the __system_property_init()
as part of WaitForSocket() call. This is subtle but important; since
bionic libc initialized had failed silently, it is important
that this re-initialization is done.

Bug: 207298357
Test: Induce the failure by explicitly delaying the call of execveat().
      With fix, no issues observed.
      Tested incremental OTA on pixel ~15 times.
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I86c2de977de052bfe9dcdc002dcbd9026601d0f3
2022-01-25 08:30:08 +00:00
Akilesh Kailash
3b874456fc libsnapshot: Integrate userspace snapshots APIs
dm-user block device will be the snapshot device; thus, no
more explicit call to MapSnapshot(). Additionally, block device
name for dm-user will be the snapshot name so that mount works
seamlessly.

API's to query the snapshot status, merge progress has been
integrated. Since daemon requires base device for merge, we pass
additional parameter during initialization.

Add a new virtual a/b property flag to enable/disable
user-snapshots feature. Propagate this flag to init layer
for first stage mount during boot process.

Some minor cleanup and renaming of variables.

Bug: 193863443
Test: 1: Full OTA on CF and pixel and verify the merge completion.
Tested merge-resume path by rebooting device during merge.
2: Incremental OTA on CF and pixel

Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I5088f40a55807946cd044b3987678ead3696d996
2021-11-19 23:45:43 +00:00
David Anderson
0e5ad5a093 snapuserd: Allow connecting to the first-stage daemon.
Currently there is no socket for daemon instances launched during the
selinux phase of init. We don't create any sockets due to the complexity
of the required sepolicy.

This workaround will allow us to create the socket with very minimal
sepolicy changes. init will launch a one-off instance of snapuserd in
"proxy" mode, and then the following steps will occur:

1. The proxy daemon will be given two sockets, the "normal" socket that
snapuserd clients would connect to, and a "proxy" socket.
2. The proxy daemon will listen on the proxy socket.
3. The first-stage daemon will wake up and connect to the proxy daemon
as a client.
4. The proxy will send the normal socket via SCM_RIGHTS, then exit.
5. The first-stage daemon can now listen and accept on the normal
socket.

Ordering of these events is achieved through a snapuserd.proxy_ready
property.

Some special-casing was needed in init to make this work. The snapuserd
socket owned by snapuserd_proxy is placed into a "persist" mode so it
doesn't get deleted when snapuserd_proxy exits. There's also a special
case method to create a Service object around a previously existing pid.

Finally, first-stage init is technically on a different updateable
partition than snapuserd. Thus, we add a way to query snapuserd to see
if it supports socket handoff. If it does, we communicate this
information through an environment variable to second-stage init.

Bug: 193833730
Test: manual test
Change-Id: I1950b31028980f0138bc03578cd455eb60ea4a58
2021-07-27 19:35:29 -07:00
Akilesh Kailash
36aeeb3f56 snapuserd: Refactor code to a seperate directory
Move all the code relevant to snapuserd to a seperate
directory. Add OWNERS file.

No other code changes apart from moving files around
and fixing couple location of header paths
at few places.

Bug: 194642092
Test: Compile, Full OTA
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ib1d852bfeda4eca5c996d6cd7b057f141cb5ddad
2021-07-26 07:04:06 +00:00
David Anderson
9fd8862741 init: only mlock() system pages when performing snapuserd transitions.
Bug: 181032115
Test: manual test w/ VABC OTA
Change-Id: Ib4d2856b9b5eaf8688534f9d84edeb64d4b3244d
2021-03-05 15:44:25 -08:00
David Anderson
491e4da372 init: Add an selinux transition for snapuserd.
With compressed VAB updates, it is not possible to mount /system without
first running snapuserd, which is the userspace component to the dm-user
kernel module. This poses a problem because as soon as selinux
enforcement is enabled, snapuserd (running in a kernel context) does not
have access to read and decompress the underlying system partition.

To account for this, we split SelinuxInitialize into multiple steps:

First, sepolicy is read into an in-memory string.

Second, the device-mapper tables for all snapshots are rebuilt. This
flushes any pending reads and creates new dm-user devices. The original
kernel-privileged snapuserd is then killed.

Third, sepolicy is loaded from the in-memory string.

Fourth, we re-launch snapuserd and connect it to the newly created
dm-user devices. As part of this step we restorecon device-mapper
devices and /dev/block/by-name/super, since the new snapuserd is in a
limited context.

Finally, we set enforcing mode.

This sequence ensures that snapuserd has appropriate privileges with a
minimal number of permissive audits.

Bug: 173476209
Test: full OTA with VABC applies and boots
Change-Id: Ie4e0f5166b01c31a6f337afc26fc58b96217604e
2021-01-08 16:39:51 -08:00