When appending, if the cluster should end after the given label, ensure
that it does.
Bug: 183985866
Test: cow_api_test#ResumeEndCluster
Change-Id: Ie93d09b3431755d0b9b92761619d55df7f9f6151
When opening in append mode, we could write less than what was present
before. This could result in data blocks referencing beyond the end of
the file, or partially written ops. Zeroing these out will prevent
invalid leftovers from potentially causing confusion.
Bug: 183985866
Test: cow_api_test
Change-Id: I56f0218f3ea5b83c0614d1b86e81a4ca885f5c5e
When opening in append mode, we ftruncate() the COW. This has three side
effects:
(1) If the COW is never modified, or Finalized(), the state of the COW
will have changed. Ideally it should only change on an explicit
write operation.
(2) Data after the current cluster will be accidentally thrown away.
(3) The ending "cluster" op will be thrown away if the current cluster
was incomplete, and thus the last valid label could be invalidated.
Bug: 183985866
Test: cow_api_test
Change-Id: I3c9a38553b7492a3d6e71d177d75ddb1b6490dfe
Example log line:
update_engine: Block device was lazily unmounted and is still in-use:
/dev/block/dm-28; possibly open file descriptor or attached loop device.
This will help diagnose bugs such as b/184715543 in the future.
Bug: N/A
Test: manual test
Change-Id: Ia6b17fe9bd1796d59be7fc0b355218509acfd4af
When all threads are terminated, the dm-user handlers are removed
from the list. When the last handler is removed, the daemon
shuts down gracefully.
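The shutdown rule above can be sketched roughly as follows. This is a
simplified illustration, not the daemon's code: `HandlerList`, `Add`,
`Remove` and `terminating` are hypothetical names, and handlers are
reduced to plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified handler registry. Real handlers own threads
// and device state; here each one is only a name.
class HandlerList {
  public:
    void Add(const std::string& name) { handlers_.push_back(name); }

    // Removes the named handler if present. Removing the last remaining
    // handler flags the daemon for a graceful shutdown.
    void Remove(const std::string& name) {
        auto it = std::find(handlers_.begin(), handlers_.end(), name);
        if (it == handlers_.end()) return;
        handlers_.erase(it);
        if (handlers_.empty()) terminating_ = true;
    }

    bool terminating() const { return terminating_; }

  private:
    std::vector<std::string> handlers_;
    bool terminating_ = false;
};
```

In this sketch the main loop would poll `terminating()` and exit once it
returns true.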
Bug: 183652708
Test: 1: Apply full OTA and verify daemon is terminated; reapply the OTA
to verify daemon is restarted again.
2: vts_libsnapshot_test
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ibd41223fc0eba884993a533fcc95661f72805db2
When worker threads were created, snapuserd was converted to a
std::shared_ptr. Earlier, memory was forcefully released
by setting snapuserd to nullptr, which worked because it
was a std::unique_ptr. Now, every worker thread holds
a reference. Clear the vector once all the worker
threads are terminated.
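A minimal sketch of the ownership problem, assuming hypothetical
`Daemon`, `Worker` and `Handler` types (no real threads are started; each
worker is reduced to the reference it holds):

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the state shared with the workers.
struct Daemon {
    std::vector<char> cow_metadata = std::vector<char>(1024);
};

// Hypothetical worker: it keeps the daemon alive through a shared_ptr.
struct Worker {
    explicit Worker(std::shared_ptr<Daemon> d) : daemon(std::move(d)) {}
    std::shared_ptr<Daemon> daemon;
};

struct Handler {
    std::shared_ptr<Daemon> daemon = std::make_shared<Daemon>();
    std::vector<Worker> workers;

    void StartWorkers(int count) {
        for (int i = 0; i < count; i++) workers.emplace_back(daemon);
    }

    // Resetting only `daemon` would leave the object alive, because each
    // worker still owns a reference. Clearing the vector drops those too.
    void FreeResources() {
        workers.clear();
        daemon.reset();
    }
};
```

With three workers the use count is four, so resetting the handler's
pointer alone releases nothing. The memory is freed only once the worker
vector is cleared as well.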
Test: Apply OTA and verify memory is released after OTA is applied
Bug: 183652708
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I256d26d98b02ad599aff49b92192226546c59b17
If somehow we wind up with snapshots with a source suffix, we could wind
up trying to unmap an in-use partition. Detect this case and allow the
snapshot to be deleted without the unmap.
Bug: 183567503
Test: vts_libsnapshot_test
Change-Id: I87dd5bb3a7b9be59dede624924374ccc47b563c2
Use a sorted std::vector instead of std::map to store
the mapping from chunk-id to COW operation.
Additionally, use shrink_to_fit to cut down vector
capacity when COW operations are stored.
On a full OTA of 1.8G, Anon RSS usage is
reduced from 120MB to 68MB. No variance observed
when merge was in progress.
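The data-structure change can be sketched as below. `CowOp`,
`ChunkEntry` and `ChunkTable` are hypothetical, simplified names used
only for illustration. The sketch shows the general technique (fill a
vector, sort once, look up with binary search), not the actual
snapuserd code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a COW operation.
struct CowOp {
    uint64_t new_block;
};

struct ChunkEntry {
    uint64_t chunk_id;
    const CowOp* op;
};

class ChunkTable {
  public:
    void Add(uint64_t chunk_id, const CowOp* op) { entries_.push_back({chunk_id, op}); }

    // Call once after all entries are added: sort by chunk id so lookups
    // can use binary search, then ask the vector to release spare
    // capacity (shrink_to_fit is a non-binding request).
    void Seal() {
        std::sort(entries_.begin(), entries_.end(),
                  [](const ChunkEntry& a, const ChunkEntry& b) {
                      return a.chunk_id < b.chunk_id;
                  });
        entries_.shrink_to_fit();
    }

    // Returns nullptr when chunk_id is not present. Requires Seal().
    const CowOp* Find(uint64_t chunk_id) const {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), chunk_id,
                                   [](const ChunkEntry& e, uint64_t id) {
                                       return e.chunk_id < id;
                                   });
        if (it == entries_.end() || it->chunk_id != chunk_id) return nullptr;
        return it->op;
    }

    size_t size() const { return entries_.size(); }

  private:
    std::vector<ChunkEntry> entries_;
};
```

A contiguous vector avoids the per-node allocation overhead of a
tree-based map, which is the general reason this kind of change reduces
memory. Lookups stay logarithmic.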
Bug: 182960300
Test: Full and Incremental OTA - verified memory usage
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I50cacbe0d03837a830dedcf9bd0ac9663fc68fa7
Add worker threads per partition to serve the IO request.
Remove memset of buffer in IO path which was impacting
4k IO performance.
update_verifier performance:
1: ~10-12 seconds with this change (both on full OTA and incremental
OTA); ~70 seconds observed without this changeset
2: ~8 seconds without the daemon once merge is completed
and snapshot devices are removed.
Bug: 181293939
Test: update_verifier, full OTA, incremental OTA
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Id90887f3f4a664ee5d39433715d1c166acbd6c60
update_engine and libsnapshot must agree on CowOptions parameters,
otherwise the COW size estimation may be incorrect.
Bug: N/A
Test: vts_libsnapshot_test
apply OTA, snapshotctl dump
Change-Id: I219ae458dfa19e4b3c96360d3b847edb2a01ebc8
This addresses bugs where unexpected edge cases in the snapshot state
could prevent a merge or data wipe from completing in recovery.
Invalid snapshots (e.g. on the wrong slot) are now ignored in
CheckMergeState(). This prevents those snapshots from being detected as
"cancelled" and thus falling into RemoveAllUpdateState.
ProcessUpdateState will no longer call RemoveAllUpdateState in recovery.
Furthermore, when RemoveAllUpdateState fails, we will no longer return
the "old" state. If this state is Merging, ProcessUpdateState can
loop forever.
Finally, HandleImminentDataWipe now guarantees the final state will be
either MergeFailed or None. For testing purposes, the old mechanism was
too susceptible to state machinery changes. And for practical purposes,
either we're going to wipe data (which removes the OTA), or a merge
failed and we can't. So the effective outcome is always no update or a
failed update.
Bug: 179006671
Test: vts_libsnapshot_test
Change-Id: Idcb30151e4d35cbeccf14369f09707ae94a57c66
QuerySnapshotStatus assumes IsSnapshotDevice() would return true.
Additionally, recovery does not have access to /dev/loop-control, which
cannot be used by libfiemap anyway. Access it on-demand instead of
preemptively.
Bug: N/A
Test: manual test
Change-Id: I0f746870d7a8ec6d666f0bdd2fef3464b214928b
`header` is only initialized in the `if` block of this condition. Hence,
its use in the `else` portion isn't correct. Refactor the code a bit to
make this kind of bug a bit harder to write in the future.
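One general way to make this class of bug harder to write is to return
the parsed value as a std::optional, so no declared-but-unassigned
variable is visible on the failure path. This is a sketch of that idea
only and not necessarily the refactoring used in this change.
`Header`, `kHypotheticalMagic`, `ReadHeader` and `HeaderIsValid` are
hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical header layout, for illustration only.
struct Header {
    uint32_t magic;
    uint32_t valid;
};

// Arbitrary value chosen for this sketch.
constexpr uint32_t kHypotheticalMagic = 0x534e4150;

// Returns a fully initialized header, or nullopt if the buffer is too
// small or does not carry the expected magic.
std::optional<Header> ReadHeader(const void* buffer, size_t size) {
    if (buffer == nullptr || size < sizeof(Header)) return std::nullopt;
    Header header;
    std::memcpy(&header, buffer, sizeof(header));
    if (header.magic != kHypotheticalMagic) return std::nullopt;
    return header;
}

bool HeaderIsValid(const void* buffer, size_t size) {
    auto header = ReadHeader(buffer, size);
    if (!header) return false;  // No header exists on this path.
    return header->valid == 1;
}
```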
Caught by the static analyzer:
> system/core/fs_mgr/libsnapshot/snapuserd.cpp:457:9: warning: 1st
function call argument is an uninitialized value
[clang-analyzer-core.CallAndMessage]
Bug: None
Test: TreeHugger
Change-Id: Ie56578520acf3cc972efa3336e40698feed20200
The exception size was hardcoded and its explanation hidden in one of
the comments.
Move it to a separate constant and better explain why it is 64 * 2 / 8.
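The arithmetic can be spelled out as in the sketch below. It assumes an
on-disk exception is a pair of 64-bit chunk numbers (old chunk, new
chunk), as in the dm-snapshot persistent format. All names here are
hypothetical and are not the constants introduced by this change.

```cpp
#include <cstdint>

// Assumption: one exception is two 64-bit chunk numbers.
struct DiskException {
    uint64_t old_chunk;
    uint64_t new_chunk;
};

// 64 bits per field * 2 fields / 8 bits per byte = 16 bytes.
constexpr uint64_t kBitsPerField = 64;
constexpr uint64_t kFieldsPerException = 2;
constexpr uint64_t kBitsPerByte = 8;
constexpr uint64_t kExceptionSizeBytes =
        kBitsPerField * kFieldsPerException / kBitsPerByte;

static_assert(kExceptionSizeBytes == 16, "two 64-bit fields");
static_assert(sizeof(DiskException) == kExceptionSizeBytes,
              "struct layout matches the constant");

// Hypothetical helper: how many exceptions fit in a chunk of this size.
constexpr uint64_t ExceptionsPerChunk(uint64_t chunk_size_bytes) {
    return chunk_size_bytes / kExceptionSizeBytes;
}
```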
Bug: 176972301
Test: m
Signed-off-by: Alessio Balsini <balsini@google.com>
Change-Id: Ifcb527540882222916ada07dacf3f76f87609539
Once the daemon is terminated, print the number of merged
ops and the total ops present in the COW file. This
helps determine whether the merge operation was interrupted
and how many pending operations were done during
each reboot until the merge is completed.
Bug: 167409187
Test: Incremental and full OTA
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ie33c840e80aaeca86f51adc8085cb4e306dca110
Enforce the checking of what chunk/sector/byte is written to the COW
size calculator to avoid possible overflows in the container.
If an undesired behavior occurs, the COW computation is aborted and
the class user is notified with an empty return value.
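The "abort and return empty" pattern might look like the sketch below.
`CheckedMul`, `CheckedAdd` and `EstimateCowSize` are hypothetical
helpers, and the size formula is deliberately simplified. It is not the
calculation done by the COW size calculator.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns nullopt instead of wrapping on overflow.
std::optional<uint64_t> CheckedMul(uint64_t a, uint64_t b) {
    if (a != 0 && b > std::numeric_limits<uint64_t>::max() / a) return std::nullopt;
    return a * b;
}

// Hypothetical helper: returns nullopt instead of wrapping on overflow.
std::optional<uint64_t> CheckedAdd(uint64_t a, uint64_t b) {
    if (b > std::numeric_limits<uint64_t>::max() - a) return std::nullopt;
    return a + b;
}

// Simplified estimate: a header plus chunk_bytes for every chunk. Any
// overflow aborts the computation and yields an empty value.
std::optional<uint64_t> EstimateCowSize(uint64_t header_bytes, uint64_t chunks,
                                        uint64_t chunk_bytes) {
    auto data = CheckedMul(chunks, chunk_bytes);
    if (!data) return std::nullopt;
    return CheckedAdd(header_bytes, *data);
}
```

Callers then have to check the result before using it, rather than
receiving a silently wrapped size.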
Bug: 176972301
Test: vts_libsnapshot_test
Change-Id: I29f7909780853434a09032c27c943f505c6d0d19
Allow batch merge of copy operations during merge.
When metadata is read from COW device, assign
the chunk-id by validating there is no overlap
of copy operations. Furthermore, detect the blocks
which are contiguous and batch merge them.
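Detecting contiguous blocks can be sketched as below. This covers only
the grouping step, not the overlap validation of copy operations. `Run`
and `GroupContiguous` are hypothetical names and not part of snapuserd.

```cpp
#include <cstdint>
#include <vector>

// A run of `count` consecutive blocks starting at `first_block`.
struct Run {
    uint64_t first_block;
    uint64_t count;
};

// Walks the blocks in the given order and extends the current run while
// each block directly follows the previous one. Any gap starts a new run.
std::vector<Run> GroupContiguous(const std::vector<uint64_t>& blocks) {
    std::vector<Run> runs;
    for (uint64_t block : blocks) {
        if (!runs.empty() && runs.back().first_block + runs.back().count == block) {
            runs.back().count++;
        } else {
            runs.push_back({block, 1});
        }
    }
    return runs;
}
```

Each run can then be merged in one batch instead of one block at a
time.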
No regression in merge time for full OTA (~35-40 seconds)
Merge for an incremental OTA of ~200M takes about 2 minutes,
compared to 15-20+ minutes without this change.
Add unit test to test ReadMetadata() functionality.
Multiple incremental OTA and full OTA tests were done on Pixel.
Ran adb reboot during merge and validated the merge resume operations.
Bug: 179629624
Test: incremental OTA and full OTA on pixel,
cow_snapuserd_test
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I4cd84e4923e42afacc796b8cec01738b1bb1f420
When the daemon transitions out of the selinux stage, we observe
intermittent hangs during OTA. As a workaround, we skip
the transition and let the daemon that was spawned during
the selinux stage continue running.
Bug: 179331261
Test: Incremental OTA, full OTA on pixel
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I622a0ed8afcd404bac4919b1de00728de2c12eaf
SnapshotManager::New() is now preferred in recovery. Previously we used
NewForFirstStageMount(), which is technically incorrect as that enables
code paths specifically for first-stage init.
We also explicitly label the snapuserd context, since rootfs in recovery
has unlabelled files.
Finally, we add a timeout to internal calls to
CreateSnapshotsAndLogicalPartitions. Without this, WaitForDevice() calls
will terminate immediately, which breaks VABC given the more complex
device stacking that is created.
Bug: 168258606
Test: fastboot snapshot-update merge
Change-Id: I3a663b95c0b1eabaf14e6fde409c6902653c3c5e
By accident, this was mounting partitions as well, which caused
conflicts in partial updates where some partitions don't have snapshots.
Test: update_device.py with partial OTA
Change-Id: I2db0e6269f0a02cbe8164fa2a72b887c352f56d8
Simulate merge interruption and merge restart, and
validate the data once the entire merge is completed.
Bug: 167409187
Test: cow_snapuserd_test
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ia940d5fbd2426bdf13347ffb6637d753b2228de6