platform_system_core/init/snapuserd_transition.h

/*
* Copyright (C) 2020 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#include <sys/types.h>
#include <memory>
#include <optional>
#include <string>
#include <vector>
#include <libsnapshot/snapshot.h>
#include "block_dev_initializer.h"
namespace android {
namespace init {
// Fork and exec a new copy of snapuserd.
void LaunchFirstStageSnapuserd();
class SnapuserdSelinuxHelper final {
using SnapshotManager = android::snapshot::SnapshotManager;
public:
SnapuserdSelinuxHelper(std::unique_ptr<SnapshotManager>&& sm, pid_t old_pid);
void StartTransition();
void FinishTransition();
// Return a helper for facilitating the selinux transition of snapuserd.
// If snapuserd is not in use, null is returned. StartTransition() should
// be called after reading policy. FinishTransition() should be called
// after loading policy. In between, no reads of /system or other dynamic
// partitions are possible.
static std::unique_ptr<SnapuserdSelinuxHelper> CreateIfNeeded();
private:
void RelaunchFirstStageSnapuserd();
void ExecSnapuserd();
// Blame history for the declaration below.
// Date: 2022-01-25 08:05:31 +01:00
//
// init: Wait for snapuserd before starting second stage
//
// This is a race between the init process and the bionic libc
// initialization of snapuserd.
//
//   init->fork() ----------------> SecondStageMain() -> PropertyInit()
//      |
//      |
//      v
//   execveat ---> __libc_init_common() -> __system_properties_init()
//   (snapuserd)
//
// When the init process calls PropertyInit(), the /dev/__properties__
// directory is created. If the bionic libc of the snapuserd daemon invokes
// __system_properties_init() _after_ init has called PropertyInit(), libc
// tries to initialize properties by reading
// /system/etc/selinux/plat_property_contexts. Since any read on /system has
// to be served by snapuserd, this specific read from libc cannot be
// serviced, leading to deadlock.
//
// Reproduce the race by inducing a sleep of 1500ms just before execveat()
// so that the init process calls PropertyInit() before bionic libc
// initialization. This leads to deadlock immediately, and additional kernel
// instrumentation with debug logs confirms the failure:
//
// ======================================================
// init: Relaunched snapuserd with pid: 428
// ext4_file_open: SNAPUSERD: path /system/etc/selinux/plat_property_contexts - Pid: 428 comm 8
// ext4_file_read_iter: SNAPUSERD for path: /system/etc/selinux/plat_property_contexts pid: 428 comm 8
// [ 25.418043][ T428] ext4_file_read_iter+0x3dc/0x3e0
// [ 25.423000][ T428] vfs_read+0x2e0/0x354
// [ 25.426986][ T428] ksys_read+0x7c/0xec
// [ 25.430894][ T428] __arm64_sys_read+0x20/0x30
// [ 25.435419][ T428] el0_svc_common.llvm.17612735770287389485+0xd0/0x1e0
// [ 25.442095][ T428] do_el0_svc+0x28/0xa0
// [ 25.446100][ T428] el0_svc+0x14/0x24
// [ 25.449825][ T428] el0_sync_handler+0x88/0xec
// [ 25.454343][ T428] el0_sync+0x1c0/0x200
// =====================================================
//
// Fix: Before starting init second stage, wait for the snapuserd daemon to
// be up and running. We do a simple probe by reading the system partition.
// This read will eventually be serviced by the daemon, confirming that the
// daemon is up and running.
//
// Furthermore, we are still in the kernel domain and sepolicy has not been
// enforced yet. Thus, access to these device mapper block devices is ok
// even though we may see audit logs.
//
// Note that the daemon will re-run __system_properties_init() as part of
// the WaitForSocket() call. This is subtle but important: since the bionic
// libc initialization had failed silently, this re-initialization must be
// done.
//
// Bug: 207298357
// Test: Induce the failure by explicitly delaying the call of execveat().
//       With fix, no issues observed. Tested incremental OTA on pixel ~15
//       times.
// Signed-off-by: Akilesh Kailash <akailash@google.com>
// Change-Id: I86c2de977de052bfe9dcdc002dcbd9026601d0f3
bool TestSnapuserdIsReady();
std::unique_ptr<SnapshotManager> sm_;
BlockDevInitializer block_dev_init_;
pid_t old_pid_;
std::vector<std::string> argv_;
};
// Remove /dev/socket/snapuserd. This ensures that (1) the existing snapuserd
// will receive no new requests, and (2) the next copy we transition to can
// own the socket.
void CleanupSnapuserdSocket();
// Kill an instance of snapuserd given a pid.
void KillFirstStageSnapuserd(pid_t pid);
// Save an open fd to /system/bin (in the ramdisk) into an environment
// variable. This is used to later execveat() snapuserd.
void SaveRamdiskPathToSnapuserd();
// Returns true if first-stage snapuserd is running.
bool IsFirstStageSnapuserdRunning();
// Return the pid of the first-stage instance of snapuserd, if it was started.
std::optional<pid_t> GetSnapuserdFirstStagePid();
// Blame history for the declaration below.
// Date: 2021-07-22 06:53:28 +02:00
//
// snapuserd: Allow connecting to the first-stage daemon.
//
// Currently there is no socket for daemon instances launched during the
// selinux phase of init. We don't create any sockets due to the complexity
// of the required sepolicy.
//
// This workaround will allow us to create the socket with very minimal
// sepolicy changes. init will launch a one-off instance of snapuserd in
// "proxy" mode, and then the following steps will occur:
//
//   1. The proxy daemon will be given two sockets, the "normal" socket that
//      snapuserd clients would connect to, and a "proxy" socket.
//   2. The proxy daemon will listen on the proxy socket.
//   3. The first-stage daemon will wake up and connect to the proxy daemon
//      as a client.
//   4. The proxy will send the normal socket via SCM_RIGHTS, then exit.
//   5. The first-stage daemon can now listen and accept on the normal
//      socket.
//
// Ordering of these events is achieved through a snapuserd.proxy_ready
// property.
//
// Some special-casing was needed in init to make this work. The snapuserd
// socket owned by snapuserd_proxy is placed into a "persist" mode so it
// doesn't get deleted when snapuserd_proxy exits. There's also a special
// case method to create a Service object around a previously existing pid.
//
// Finally, first-stage init is technically on a different updateable
// partition than snapuserd. Thus, we add a way to query snapuserd to see if
// it supports socket handoff. If it does, we communicate this information
// through an environment variable to second-stage init.
//
// Bug: 193833730
// Test: manual test
// Change-Id: I1950b31028980f0138bc03578cd455eb60ea4a58
// Return snapuserd info strings that were set during first-stage init.
std::vector<std::string> GetSnapuserdFirstStageInfo();
} // namespace init
} // namespace android
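
The socket handoff described in the history for GetSnapuserdFirstStageInfo() relies on passing an open file descriptor between processes with SCM_RIGHTS. The sketch below illustrates only that general POSIX mechanism and is not taken from snapuserd or init. The names SendFd, ReceiveFd, and DemoHandoff are hypothetical, a socketpair() stands in for the "proxy" socket, and a pipe stands in for the "normal" socket being handed off.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#include <cstring>

// Hypothetical helper: send `fd` over the connected Unix socket `channel`
// as SCM_RIGHTS ancillary data. Returns true on success.
bool SendFd(int channel, int fd) {
    char payload = 'F';  // one byte of ordinary data accompanies the fd
    iovec iov{&payload, sizeof(payload)};

    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(channel, &msg, 0) == static_cast<ssize_t>(sizeof(payload));
}

// Hypothetical helper: receive one fd sent by SendFd(). Returns -1 on failure.
int ReceiveFd(int channel) {
    char payload = 0;
    iovec iov{&payload, sizeof(payload)};

    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    if (recvmsg(channel, &msg, 0) <= 0) return -1;

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    int fd = -1;
    std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

// Round trip inside one process: hand the write end of a pipe across a
// socketpair, then check that the received descriptor refers to the same pipe.
bool DemoHandoff() {
    int channel[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, channel) != 0) return false;

    int pipe_fds[2];
    if (pipe(pipe_fds) != 0) {
        close(channel[0]);
        close(channel[1]);
        return false;
    }

    int received = SendFd(channel[0], pipe_fds[1]) ? ReceiveFd(channel[1]) : -1;

    char out = 0;
    bool ok = received >= 0 && write(received, "x", 1) == 1 &&
              read(pipe_fds[0], &out, 1) == 1 && out == 'x';

    if (received >= 0) close(received);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    close(channel[0]);
    close(channel[1]);
    return ok;
}
```

Calling DemoHandoff() from a test or a small main() exercises the whole path. In a real handoff the sender and receiver would be separate processes, and the sender could exit after sendmsg() returns because the receiver holds its own reference to the descriptor.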