Use vold version of writeStringToFile which fsync files, and
manually fsync directories after initialize global DE
(cherry picked from commit a98464f688)
Bug: 71810347
Test: Build pass and reboot stress test.
Original boot failure symptom is NOT reproducible.
Change-Id: I1ca8f8cf0ccfd01075a9c33f79042e58d99aea26
Merged-In: I1ca8f8cf0ccfd01075a9c33f79042e58d99aea26
Remove static definition of writeStringToFile, and
move it from KeyStorage to Utils
(cherry picked from commit 0bd2d11692)
Bug: 71810347
Test: Build pass and reboot stress test.
Change-Id: I38bfd27370ac2372e446dc699f518122e73c6877
Merged-In: I38bfd27370ac2372e446dc699f518122e73c6877
vdc currently only prints generic binder failure status on failure.
This doesn't help debugging early boot failures at all since we don't
know which exact vdc command failed. Fix that by adding the command as
part of the failure message.
Bug: 129946805
Test: Boot cuttlefish
Change-Id: Ic2367cf592d6b5bf23d6d4b1447baa1baf41afe7
Signed-off-by: Sandeep Patil <sspatil@google.com>
getopt_long() assumes an all-zeroes 'struct option' at the end of the
array. Add it.
Fortunately this isn't causing problems in practice because vold is
always passed valid command line options...
Test: Running 'vold --foo' no longer segfaults.
Change-Id: I2cd3af501cc1aa11327a8062ec492be1d23defdf
If more than the default number of loop devices is in use, we may need
to wait for the device path to be available.
Bug: 128873591
Bug: 122059364
Test: Set up adopted virtual disk and check that it loads on boot
Change-Id: I201dcc32043664076f50b0d6f40de6e5e1a65342
F2FS changes proper configurations along with gc_urgent, so idle-maint doesn't
need to set this redundantly.
Change-Id: I4a71a5d877a3bb9636e2b65132ec806edc56a8fe
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
The expression "!fd" calls the implicit conversion to int, but comparing
the raw fd against 0 does not work, since open() and other POSIX calls
returning a file descriptor use -1 to signal an error.
Test: m vold
Change-Id: I0847c276f39cb9dd09c7ffb96951276113418fc8
This changes the property from microsecond to milliseconds, as we don't
need that sort of precision here. Also switches from using ulseep, which
has been removed from POSIX, to nanosleep.
Test: Builds, Boots, Times
Change-Id: Iefbaf8489ba05d8d688542fd7d4305efb980e701
std::ifstream does not use O_CLOEXEC flag when opening files. This leads
to file descriptors being inherited by child processes. In the case of vold
this results in leaking FDs to less privileged children with no permission
for these files which occasionally leads to SELinux denials.
Bug: 129298168
Change-Id: Id2731782a25d65c9a7cbf25dc441f3e7a17609c1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Take action if we are running out of checkpoint space.
Configurable via ro.sys properties.
ro.sys.cp_usleeptime = Time to sleep between checks
ro.sys.cp_min_free_bytes = Min free space to act on
ro.sys.cp_commit_on_full = action to take. Either commits or reboots to
continue attempt without checkpoint, or retry
and eventually abort OTA
Test: Trigger a checkpoint and fill the disk.
Bug: 119769392
Change-Id: I977cc03b7aef9320d661c8a0d716f8a1ef0be347
abortChanges will attempt to pass a reboot message, and will only reboot
if the device is currently checkpointing. Additionally, it can opt to
attempt to prevent future attempts. This only works for non-bootloader
controlled updates. Failures are ignored, as it will always reboot the
device. In the unlikely event of such a failure, the device will
continue to retry as though you did not ask to prevent future attempts.
Test: vdc checkpoint abortChanges abort_retry_test 1
vdc checkpoint abortChanges abort_noretry_test 0
Change-Id: I7b6214765a1faaf4fd193c73331696b53ae572d2
This makes needCheckpoint return true when the device will or is using
checkpointing.
Test: vdc checkpoint startCheckpoint 1
reboot
vdc checkpoint needsCheckpoint
should return 1 before and after data mounts, and 0 once the
checkpoint has been committed
Change-Id: Ib57f4461d837f41a8110ed318168165a684d913a
Also add vdc checkpoint supportsFileCheckpoint
This is to allow tests to be specific to supported checkpoint mode.
Test: Built on Taimen and Crosshatch, made sure both new functions work
as expected
Change-Id: I0eab7453b13c0a2e31840ef9ad24a692cec55b00
The boot failure symptom is reproduced on Walleye devices. System boots
up after taking OTA and try to upgrade key, but keymaster returns "failed
to ugprade key". Device reboots to recovery mode because of the failure,
and finally trapped in bootloader screen. Possible scenario is:
(After taking OTA)
vold sends old key and op=UPGRADE to keymaster
keymaster creates and saves new key to RPMB, responses new key to vold
vold saves new key as temp key
vold renames temp key to main key -------------- (1) -- still in cache
vold sends old key and op=DELETE_KEY to keymaster
keymaster removes old key from RPMB ------------ (2) -- write directly to RPMB
==> SYSTEM INTERRUPTED BY CRASH OR SOMETHING; ALL CACHE LOST.
==> System boots up, key in RPMB is deleted but key in storage is old key.
Solution: A Fsync is required between (1) and (2) to cover this case.
Detail analysis: b/124279741#comment21
Bug: 112145641
Bug: 124279741
Test: Insert fault right after deleteKey in vold::begin (KeyStorage.cpp),
original boot failure symptom is NOT reproducible.
Change-Id: Ib8c349d6d033f86b247f4b35b8354d97cf249d26
This allows us to resume rolling back in the event of an unexpected
shutdown during the restore process. We save progress after we process
each log sector, and whenever restarting the current log sector would
result in invalid data.
Test: Run restore, interrupt it, and attempt to resume
Change-Id: I91cf0defb0d22fc5afdb9debc2963c956e9e171c
Restores the first n entries of a checkpoint. Allows automated testing
of interrupted restores.
Test: vdc checkpoint restoreCheckpoint [device] [n]
Change-Id: I47570e8eba0bc3c6549a04a33600df05d393990b
In preparation for restore code, we need to guarantee fsync happens.
Switch over to fd based operations to prepare for that.
Test: Successfully restores device over reboots
Change-Id: Ic9901779e8a4258bf8090d6a62fa9829e343fd39
Motivation:
Early processes launched before the runtime APEX - that hosts the bionic
libs - is activated can't use the bionic libs from the APEX, but from the
system partition (which we call the bootstrap bionic). Other processes
after the APEX activation should use the bionic libs from the APEX.
In order to let both types of processes to access the bionic libs via
the same standard paths /system/lib/{libc|libdl|libm}.so, some mount
namespace magic is used.
To be specific, when the device boots, the init initially bind-mounts
the bootstrap bionic libs to the standard paths with MS_PRIVATE. Early
processes are then executed with their own mount namespaces (via
unshare(CLONE_NEWNS)). After the runtime APEX is activated, init
bind-mounts the bionic libs in the APEX to the same standard paths.
Processes launched thereafter use the bionic libs from the APEX (which
can be updated.)
Important thing is that, since the propagation type of the mount points
(the standard paths) is 'private', the new bind-mount events for the
updated bionic libs should not affect the early processes. Otherwise,
they would experience sudden change of bionic libs at runtime. However,
other mount/unmounts events outside of the private mount points are
still shared across early/late processes as before. This is made possible
because the propagation type of / is 'shared' .
Problem:
vold uses the equality of the mount namespace to filter-out processes
that share the global mount namespace (the namespace of the init). However,
due to the aforementioned change, the early processes are not filtered
out because they have different mount namespaces. As a result,
umount2("/storage/") is executed on them and this unmount event
becomes visible to the global mount namespace (because as mentioned before /
is 'shared').
Solution:
Fiter-out the early processes by skipping a native (non-Java) process
whose UID is < AID_APP. The former condition is because all early
processes are native ones; i.e., zygote is started after the runtime
APEX is activated. The latter condition is to not filter-out native
processes created locally by apps.
Bug: 120266448
Test: m; device boots
Change-Id: I054deedc4af8421854cf35be84e14995523a259a