Vold currently uses AServiceManager_getService() to connect to
keystore2, which has an internal timeout of 5s. Since a lot of vold
keystore2 connection failures are fatal, we instead use
AServiceManager_waitForService(), which will wait efficiently for
keystore2 to start, instead of timing out after 5s.
Bug: 185934601
Test: Cuttlefish boots.
Change-Id: Ib4e977a997e020082382e0686f448d1aa72834ec
In wait_and_unmount(), kill the processes with open files after umount()
has been failing for 2 seconds rather than 17 seconds. This avoids a
long boot delay on devices that use FDE.
Detailed explanation:
On FDE devices, vold needs to unmount the tmpfs /data in order to mount
the real, decrypted /data. On first boot, it also needs to unmount the
unencrypted /data in order to encrypt it in-place.
/data can't be unmounted if files are open inside it. In theory, init
is responsible for killing all processes with open files in /data, via
the property trigger "vold.decrypt=trigger_shutdown_framework".
However, years ago, commit 6e8440fd50 ("cryptfs: kill processes with
open files on tmpfs /data") added a fallback where vold kills the
processes itself. Since then, in practice people have increasingly been
relying on this fallback, as services keep being added that use /data
but don't get stopped by trigger_shutdown_framework.
This is slowing down boot, as vold sleeps for 17 seconds before it
actually kills the processes.
The problematic services include services that are now started
explicitly in the post-fs-data trigger rather than implicitly as part of
a class (e.g., tombstoned), as well as services that now need to be
started as part of one of the early-boot classes like core or early_hal
but can still open files in /data later (e.g. keystore2 and credstore).
Another complication is that on default-encrypted devices (devices with
no PIN/pattern/password), trigger_shutdown_framework isn't run at all,
but rather it's expected that the relevant services simply weren't
started yet. This means that we can't fix the problem just by fixing
trigger_shutdown_framework to kill all the needed processes.
Therefore, given that the vold fallback is being relied on in practice,
and FDE won't be supported much longer anyway (so simple fixes are very
much preferable here), let's just change wait_and_unmount() in vold to
use more appropriate timeouts. Instead of waiting for 17 seconds before
killing processes, just wait for 2 seconds. Keep the total timeout of
20 seconds, but spend most of it retrying killing the processes, and
only if the unmount is still failing.
This avoids the long boot delays in practice.
Bug: 187231646
Bug: 186165644
Test: Tested FDE on Cuttlefish, and checked logcat to verify that the
boot delay is gone.
Change-Id: Id06a9615a87988c8336396c49ee914b35f8d585b
Otherwise only the pids are shown, and it's hard to tell which
processes actually got killed.
Bug: 187231646
Change-Id: Icccf60d0ad4439d702f36ace31abe092df1c69c2
Otherwise, when system removes user's volume, it will hang
as there are mounts (obb and data mounts) still remain mounted in system.
Bug: 187122943
Test: atest UserLifecycleTests#managedProfileUnlock_stopped, it's not blocked anymore
Change-Id: Ic37985f98e6cbfe4fa38b981d3332c4dfc40c5b8
Otherwise, the only sign of what went wrong may be system_server
logging a "ServiceSpecificException".
Bug: 187079978
Change-Id: I59b2ba2b0e679dfd1ec1fd8fff6790256fbfdf29
Originally it kills all the apps with obb and data mounted.
Due to recent changes, all apps will have obb and data dirs mounted
in default root namespace. Hence all apps will be killed by
by KillProcessesWithMounts().
To fix this, we also check if the dir is mounted as tmpfs,
as the default namespace one is bind mounted to lowerfs,
which app data isolation is mounted as tmpfs, so we only
kill the process that have obb dir mounted as tmpfs.
Bug: 148049767
Test: Able to boot without warnings
Change-Id: I5f862ad6f64f5df739b68ea7c9815352bae3be5c
Merged-In: I45d9a63ed47cbc27aebb63357a43f51ad62275db
Fix KeymasterOperation::updateCompletely() to not treat an empty output
as an error, since for RSA signing (used by cryptfs / FDE) it is
expected that the output from update() be empty. The output is instead
produced at the end by finish().
This is one of a set of changes that is needed to get FDE working again
so that devices that launched with FDE can be upgraded to Android 12.
Bug: 186165644
Change-Id: Icf120f8b9526d051d0ebe16bc8ad1edf712241e1
This CL updates vold to use the updated storage key API that provides an
optional upgraded key blob. In this patch the upgraded key blob is not
yet stored by vold.
Bug: 185811713
Test: N/A
Change-Id: I39eeb20df0eb2b023479f3adebab264d29d00048
It's very useful to see the mkfs log in console to debug any issues.
Bug: 172378121
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Change-Id: Icdac3609860cf0bba3fa758cead885bd4960f2c0
This is needed so that system_server can remind itself about which users
have their storage unlocked, if system_server is restarted due to a
userspace reboot (soft restart).
Bug: 146206679
Test: see I482ed8017f7bbc8f7d4fd5a2c0f58629317ce4ed
Change-Id: I02f0494d827094bd41bcfe5f63c24e204b728595
(cherry picked from commit 1799debfd6)
* changes:
Remove Keymaster::isSecure() and simplify callers
Make vold use keystore2 instead of keymaster
Remove HardwareAuthToken support from vold::Keymaster
Now that isSecure() always returns true, we can remove it and simplify
all the callers (i.e. cryptfs). Refer to the commit description for
Iaebfef082eca0da8a305043fafb6d85e5de14cf8 for why this function always
return true.
Bug: 181910578
Test: Cuttlefish and bramble boot
Change-Id: I185dd8180bd7842b05295263f0b1aa7205329a88
Make vold use keystore2 for all its operations instead of directly using
keymaster. This way, we won't have any clients that bypass keystore2,
and we'll no longer need to reserve a keymaster operation for vold.
Note that we now hardcode "SecurityLevel::TRUSTED_ENVIRONMENT" (TEE)
when talking to Keystore2 since Keystore2 only allows TEE and STRONGBOX.
Keystore2 presents any SOFTWARE implementation as a TEE to callers when
no "real" TEE is present. As far as storage encryption is concerned,
there's no advantage to using a STRONGBOX when a "real" TEE is present,
and a STRONGBOX can't be present if a "real" TEE isn't, so asking
Keystore2 for a TEE is the best we can do in any situation.
The difference in behaviour only really affects the full disk encryption
code in cryptfs.cpp, which used to explicitly check that the keymaster
device is a "real" TEE (as opposed to a SOFTWARE implementation) before
using it (it can no longer do so since Keystore2 doesn't provide a way
to do this).
A little code history digging (7c49ab0a0b in particular) shows that
cryptfs.cpp cared about two things when using a keymaster.
- 1) that the keys generated by the keymaster were "standalone" keys -
i.e. that the keymaster could operate on those keys without
requiring /data or any other service to be available.
- 2) that the keymaster was a non-SOFTWARE implementation so that things
would still work in case a "real" TEE keymaster was ever somehow
added to the device after first boot.
Today, all "real" TEE keymasters always generate "standalone" keys, and
a TEE has been required in Android devices since at least Android N. The
only two exceptions are Goldfish and ARC++, which have SOFTWARE
keymasters, but both those keymasters also generate "standalone" keys.
We're also no longer worried about possibly adding a "real" TEE KM to
either of those devices after first boot. So there's no longer a reason
cryptfs.cpp can't use the SOFTWARE keymaster on those devices.
There's also already an upgrade path in place (see
test_mount_encrypted_fs() in cryptfs.cpp) to upgrade the kdf that's
being used once a TEE keymaster is added to the device. So it's safe for
cryptfs.cpp to ask for a TEE keymaster from Keystore2 and use it
blindly, without checking whether or not it's a "real" TEE, which is why
Keymaster::isSecure() just returns true now. A future patch will remove
that function and simplify its callers.
Bug: 181910578
Test: cuttlefish and bramble boot. Adding, switching between, stopping
and removing users work.
Change-Id: Iaebfef082eca0da8a305043fafb6d85e5de14cf8
HardwareAuthTokens are no longer used by vold since Android P. So remove
the auth token parameter from vold. This patch doesn't remove the token
from IVold.aidl, and the methods in VoldNativeService.cpp return an
error if a non-empty auth token is passed to them.
Bug: 181910578
Test: cuttlefish and bramble boot with patch
Change-Id: I1a9f54e10f9efdda9973906afd0a5de5a699ada5
So shell / root will always access to them directly not via fuse.
And zygote will be unmount these directories to prevent them being
abused for leaking app visibility.
Also, /mnt/androidwritable is not very useful now as it's the same as
/mnt/installer, but we should make shell / root to access /mnt/androidwritable
later and /mnt/installer should only access obb but not data dir.
Bug: 182997439
Test: Able to boot without errors
Test: df on /sdcard/Android/data shows it's no on fuse.
Change-Id: I2ad10b1e80c135f637d37ddf502ee010f89f4946
reboot maybe cause a deadlock scenario:
1:init->vdc->vold for abort_fuse blocked on futex hold by another
vold binder_x
2:binder_x blocked in binder_ioctl_write_read wait a dead service's
response
3:dead service is exiting and schedule a deferred work for put files
in binder_vma_close, after put files is completed, the binder_x will
eventually wake up
4:kworker execute binder_deferred_work is blocked on fuse request:
crash> bt 1707
PID: 1707 TASK: ffffffe366175e80 CPU: 2 COMMAND: "kworker/2:4"
#0 [ffffff801b8b3ac0] __switch_to at ffffff962ce88a60
#1 [ffffff801b8b3b10] __schedule at ffffff962e2d3d30
#2 [ffffff801b8b3b70] schedule at ffffff962e2d3ff4
#3 [ffffff801b8b3bc0] __fuse_request_send at ffffff962d20e008
#4 [ffffff801b8b3c00] fuse_request_send at ffffff962d20deac
#5 [ffffff801b8b3c30] fuse_flush at ffffff962d217fa4
#6 [ffffff801b8b3c80] filp_close at ffffff962d0bd7b4
#7 [ffffff801b8b3cb0] put_files_struct at ffffff962d0e7658
#8 [ffffff801b8b3d30] binder_deferred_func at ffffff962dc9e60c
#9 [ffffff801b8b3d90] process_one_work at ffffff962cee761c
#10 [ffffff801b8b3e00] worker_thread at ffffff962cee7a68
#11 [ffffff801b8b3e60] kthread at ffffff962ceecc14
waiting for init abort_fuse
suggested by maco, do not acquire lock when abort fuse.
Test: reboot stress test
Change-Id: If6dd7f5e9c413a16ba047204c33d82d6ff41c4ae
Signed-off-by: lijiazi <lijiazi@xiaomi.com>
The error messages that are printed when probing for rollback resistance
support on a device that doesn't support rollback-resistant keys can
make it sound like something is going wrong. Print a WARNING message
afterwards to try to make it clear what is going on. Also adjust or add
DEBUG messages when starting to generate each key so that it's easier to
distinguish the log messages for different key generation operations.
Bug: 182815123
Test: boot on device that doesn't support rollback-resistant keys and
check log.
Change-Id: I37a13eb5c1e839fb94581f3e7ec1cd8da0263d2b
The FUSE daemon is often holding fds on behalf of other apps and if a
volume is ejected the daemon would often get killed first while vold
is walking /proc/<pid>/fd to kill pids with open fds on the
volume. This is required for the volume unmount successfully.
To mitigate this, we avoid killing the FUSE daemon during the usual
/proc walk. This ensures that we first send SIGINT, SIGTERM and
SIGKILL to other apps first. There is an additional SIGKILL attempt
and on that last attempt, we kill the FUSE daemon as a last resort
Test: Manual
Bug: 171673908
Change-Id: I100d2ce4cb4c145cbb49e0696842e97dfba2c1c9
This allows libincremental_aidl-cpp to be built via cc_library instead
of aidl_interface.
BUG: 181266844
Test: builds
Change-Id: I4f0bc82629c0df758467aa074274b30f9dc6718d
This directory is used as a root for external storage on adopted storage
devices. It needs to be writable by processes holding the AID_MEDIA_RW
GID permission; in particular, it should be writable by the FUSE daemon.
On devices with sdcardfs, this was ensured automatically, because
sdcardfs presented a view of this directory that was writable, that we
could use for the FUSE daemon. But on devices without sdcardfs, the FUSE
daemon sees the raw filesystem and its permissions. This also means that
files created by the FUSE daemon will have their uid/gid set to the uid
of the FUSE daemon; to ensure these files stay writable to other system
applications that have AID_MEDIA_RW, use a default ACL to make sure the
gid stays AID_MEDIA_RW.
In particular, this fixes an issue with app cloning, where we want the
FUSE daemon of user 0 to be able to access the files of the app clone
user, and vice versa.
Bug: 154057120
Test: inspect uid/gid of /data/media/0 and contents
Change-Id: Ic5d63457ec917ea407b900dbb7773d89311780c6
Acquiring a wakelock can fail if the suspend service is unavailable.
Explicitly check that wakelock was acquired before performing
operations that require the device to stay on.
Bug: b/179229598
Test: Boot test on Pixel 4 device
Change-Id: If30087223e44098801a31d1bfd239ac22e891abe