CONFIG_ZRAM_WRITEBACK may be set to y while userspace never sets
/sys/block/zram0/backing_dev, in which case its value stays 'none'.
That case is equivalent to "CONFIG_ZRAM_WRITEBACK is not set".
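A minimal sketch of the check described above, assuming the sysfs contents have already been read into a string. The helper name HasZramBackingDevice is hypothetical and is not taken from the change itself.

```cpp
#include <string>

// Hypothetical helper: returns true only when the contents of
// /sys/block/zram0/backing_dev name a real backing device.
bool HasZramBackingDevice(std::string contents) {
    // sysfs values normally end with a newline; strip trailing whitespace.
    while (!contents.empty() &&
           (contents.back() == '\n' || contents.back() == ' ' || contents.back() == '\t')) {
        contents.pop_back();
    }
    // 'none' means writeback is compiled in but no device was configured.
    return !contents.empty() && contents != "none";
}
```

With this shape, a value of 'none' and an empty value both take the same path as a kernel built without CONFIG_ZRAM_WRITEBACK.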
Change-Id: I2df89ceee68e4685deef5113bada21be96779e9b
Signed-off-by: shisiyuan <shisiyuan@xiaomi.com>
On some devices, '/sys/block/zram0/backing_dev' exists even when zram is not swapped on, so its existence does not guarantee that zram is swapped on. Therefore, before killing backing_dev during userspace reboot, we should first check whether zram is swapped on.
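One way to check whether zram is swapped on is to look for the device in /proc/swaps. The sketch below assumes that approach and a hypothetical helper name, IsSwappedOn. The change itself may perform the check differently.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: scans the text of /proc/swaps for an active swap
// entry whose first column matches `device`.
bool IsSwappedOn(const std::string& proc_swaps, const std::string& device) {
    std::istringstream lines(proc_swaps);
    std::string line;
    std::getline(lines, line);  // The first line is the column header.
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string filename;
        if (fields >> filename && filename == device) {
            return true;
        }
    }
    return false;
}
```

A caller would read /proc/swaps into a string, call IsSwappedOn(contents, "/dev/block/zram0"), and skip the backing_dev teardown when it returns false.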
Test: as follows
- adb root
- adb shell swapoff /dev/block/zram0
- adb shell "echo 1 > /sys/block/zram0/reset"
- adb shell setprop test.userspace.reboot.flag 1
- adb reboot userspace
- (after the reboot completes) adb shell getprop test.userspace.reboot.flag (prints 1 on success)
Signed-off-by: luwei9 <luwei9@xiaomi.com>
Change-Id: Icca569cf8d64bc024b867dae2ab789fc9e76445a
This test spawns several services backed by the /system/bin/yes executable,
and then stops them with either SIGTERM or SIGKILL.
Ideally we want to unit test more of the reboot logic, but that requires a
bigger refactoring.
Test: atest CtsInitTestCases
Bug: 170315126
Bug: 174335499
Change-Id: Ife48b1636c6ca2d0aac73f4eb6f4737343a88e7a
Instead of operating on raw pointers, init now uses the name of each
service as its primary identifier. The only place that still uses
vector<Service*> is StopServices.
In addition, the ServiceList::services() function is removed, which should
help avoid similar bugs in the future.
Bug: 170315126
Bug: 174335499
Test: adb reboot
Test: atest CtsInitTestCases
Change-Id: I73ecd7a8c58c2ec3732934c595b7f7db814b7034
Merged-In: I73ecd7a8c58c2ec3732934c595b7f7db814b7034
Ignore-AOSP-First: fixing security vulnerability
(cherry picked from commit 8d6ae2dd8a)
Instead of operating on raw pointers, init now uses the name of each
service as its primary identifier. The only place that still uses
vector<Service*> is StopServices.
In addition, the ServiceList::services() function is removed, which should
help avoid similar bugs in the future.
Bug: 170315126
Bug: 174335499
Test: adb reboot
Test: atest CtsInitTestCases
Change-Id: I73ecd7a8c58c2ec3732934c595b7f7db814b7034
Ignore-AOSP-First: fixing security vulnerability
Store pertinent information about userspace reboot events in the case
of failure. This information consists of any services that failed to stop
cleanly, the contents of the default fstab and /proc/mounts, and
a list of mounts that failed to unmount. Each piece is stored only
when needed (i.e. mount information will not be stored if
everything unmounted, even if some services failed to stop).
Added new /metadata/userspacereboot directory to persist this
information. Information older than 3 days will be deleted.
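The 3-day retention rule can be expressed as a small predicate. This is only a sketch: the names kMaxAge and IsExpired are hypothetical, and the way the real code obtains a record's timestamp is not stated here.

```cpp
#include <chrono>

using Clock = std::chrono::system_clock;

// Retention window from the description above: 3 days.
constexpr std::chrono::hours kMaxAge(24 * 3);

// Hypothetical helper: returns true when a record written at `written`
// is older than the retention window as of `now`.
bool IsExpired(Clock::time_point written, Clock::time_point now) {
    return now - written > kMaxAge;
}
```

A cleanup pass would walk /metadata/userspacereboot and delete every entry for which IsExpired returns true.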
Test: adb reboot userspace with sigterm/sigkill timeouts set to
very low values
Test: Manual test of storing all other information
Bug: 151820675
Change-Id: I6cfbfae92a7fc6f6c984475cad2c50c559924866
Mounted apexes whose loop devices are backed by files on the /data
partition will prevent it from being unmounted cleanly. Unmounting them
and tearing down their loop devices should minimize that risk.
Note that this won't fix the issue completely, as there are a few (~2-3)
processes that keep restarting even after SIGKILL is sent, which means
that they can still hold references to apexes on the /data partition. In
practice, however, the probability of this is quite low.
Test: adb reboot
Test: put tzdata apex in /data/apex/active && adb reboot
Bug: 158152940
Change-Id: I4624567b3d0f304dba4c6e37b77abd89e57411de
Init starts ueventd in the default mount namespace to support loading
firmware from APEXes.
Bug: 155023652
Test: device boots
adb$ nsenter -t (pid of ueventd) -m ls /apex
=> shows all APEXes
Change-Id: Ibb8b33a07eb014752275e3bca4541b8b694dc64b
To ensure we can shut down cleanly, and don't hang on outstanding
requests to a FUSE host daemon that has already exited.
Bug: 153411204
Test: inspect logs during shutdown
Change-Id: I8e6479bd54dbc1fc85b087617aa6b16be9f15a3b
The exit of init panics the system *after* the process context (mm, stack,
etc.) is recycled, according to the Linux kernel's 'do_exit'
implementation. To preserve most of the init process context for debugging,
trigger the panic via proc-sysrq explicitly.
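As an illustration of the proc-sysrq route, the sketch below writes a single command character to a trigger file. The helper name TriggerSysrq is hypothetical, and the path is a parameter so the function can be exercised against an ordinary file. Writing 'c' to /proc/sysrq-trigger on a real device crashes the kernel immediately, so this is not something to run casually.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes one sysrq command character to `trigger_path`.
// With trigger_path == "/proc/sysrq-trigger" and command == 'c', the kernel
// panics while the calling process context is still intact.
bool TriggerSysrq(const std::string& trigger_path, char command) {
    std::ofstream trigger(trigger_path);
    if (!trigger) {
        return false;  // Could not open the trigger file.
    }
    trigger << command;
    trigger.flush();
    return trigger.good();
}
```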
Note: after this change, there will be no "Attempt to kill init" panic
when androidboot.init_fatal_panic is set.
Test: Insert data abort fault in init, the full process context is
preserved in memory dump captured after panic.
Bug: 155940351
Change-Id: I3393bd00f99b8cb432cfa19a105b7d636b411764
(cherry picked from commit be1cf9006a)
The exit of init panics the system *after* the process context (mm, stack,
etc.) is recycled, according to the Linux kernel's 'do_exit'
implementation. To preserve most of the init process context for debugging,
trigger the panic via proc-sysrq explicitly.
Note: after this change, there will be no "Attempt to kill init" panic
when androidboot.init_fatal_panic is set.
Test: Insert data abort fault in init, the full process context is
preserved in memory dump captured after panic.
Bug: 155940351
Change-Id: I3393bd00f99b8cb432cfa19a105b7d636b411764
Since this function is used in userspace reboot, we need to be more
diligent with error handling, e.g.:
* If init fails to read /sys/block/zram0/backing_dev, then fail and
fall back to hard reboot.
* Always call swapoff.
* Always reset zram.
* Tear down loop device only if zram is backed by a loop device.
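The list above can be sketched as a single function with injected operations. Everything named here (ZramOps, KillZramBackingDevice, the "/dev/block/loop" prefix test) is a hypothetical illustration rather than the actual init code. The sketch also stops at the first failure, which is an assumption, since the description does not say whether a failed swapoff should still be followed by a reset.

```cpp
#include <functional>
#include <string>

// Hypothetical hooks so the flow can be exercised without touching a device.
struct ZramOps {
    std::function<bool(std::string*)> read_backing_dev;  // false on read failure
    std::function<bool()> swapoff;
    std::function<bool()> reset_zram;
    std::function<bool(const std::string&)> teardown_loop;
};

bool StartsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Returns false when the caller should fall back to a hard reboot.
bool KillZramBackingDevice(const ZramOps& ops) {
    std::string backing_dev;
    if (!ops.read_backing_dev(&backing_dev)) return false;
    // swapoff and reset run regardless of what backing_dev contains.
    if (!ops.swapoff()) return false;
    if (!ops.reset_zram()) return false;
    // Only a loop-backed zram has a loop device to tear down.
    if (StartsWith(backing_dev, "/dev/block/loop")) {
        return ops.teardown_loop(backing_dev);
    }
    return true;
}
```

Injecting the operations keeps the ordering and the fallback decision testable on a host machine.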
Test: adb reboot userspace
Bug: 153917129
Change-Id: I4709da1d08cf427ad9c898cfb2506b6a29f1d680
Merged-In: I4709da1d08cf427ad9c898cfb2506b6a29f1d680
(cherry picked from commit a840d405eb)
Since this function is used in userspace reboot, we need to be more
diligent with error handling, e.g.:
* If init fails to read /sys/block/zram0/backing_dev, then fail and
fall back to hard reboot.
* Always call swapoff.
* Always reset zram.
* Tear down loop device only if zram is backed by a loop device.
Test: adb reboot userspace
Bug: 153917129
Change-Id: I4709da1d08cf427ad9c898cfb2506b6a29f1d680
As with other recovery mechanisms, the timeout is controlled by a
read-only property that can be configured per device.
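A sketch of turning such a property value into a timeout, assuming the value arrives as a string and that a caller-supplied fallback applies when it is missing or malformed. The helper name ParseTimeout and the fallback behavior are assumptions. The real default is not stated here.

```cpp
#include <cerrno>
#include <chrono>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses a millisecond count from a property value and
// returns `fallback` for empty, non-numeric, negative, or out-of-range input.
std::chrono::milliseconds ParseTimeout(const std::string& value,
                                       std::chrono::milliseconds fallback) {
    if (value.empty()) return fallback;
    char* end = nullptr;
    errno = 0;
    long long parsed = std::strtoll(value.c_str(), &end, 10);
    if (errno != 0 || end == value.c_str() || *end != '\0' || parsed < 0) {
        return fallback;
    }
    return std::chrono::milliseconds(parsed);
}
```

Here the value would come from init.userspace_reboot.started.timeoutmillis, the property used in the test steps.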
Test: adb root
Test: adb shell setprop init.userspace_reboot.started.timeoutmillis 2
Test: adb reboot userspace
Bug: 152803929
Change-Id: Id70710b46da798945ac5422ef7d69265911ea5ef
Merged-In: Id70710b46da798945ac5422ef7d69265911ea5ef
(cherry picked from commit d05535485f)
As with other recovery mechanisms, the timeout is controlled by a
read-only property that can be configured per device.
Test: adb root
Test: adb shell setprop init.userspace_reboot.started.timeoutmillis 2
Test: adb reboot userspace
Bug: 152803929
Change-Id: Id70710b46da798945ac5422ef7d69265911ea5ef
Devices in the lab are hitting an issue where they get stuck,
likely in the sync() call in DoReboot(), before we start the reboot
monitor thread and before we shut down services.
It's possible that concurrent writing to RW file systems is causing
this sync() call to take essentially forever. To protect against
this, we need to remove this sync(). Note that we will still call
sync() after shutting down services.
Note that the service shutdown code has a timeout and there is a
reboot monitor thread that will shut down the device if more than 30
seconds pass beyond that timeout. This change increases that margin
to 300 seconds to explicitly give the final sync() calls more time to
finish.
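The monitor-thread idea can be sketched with a condition variable that is waited on with a timeout. The class below is a hypothetical illustration, not the actual init implementation: the names RebootMonitor and Finish are invented, and the real monitor forces a shutdown rather than invoking a callback.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical watchdog: runs `on_timeout` unless Finish() is called
// before `timeout` elapses.
class RebootMonitor {
  public:
    RebootMonitor(std::chrono::milliseconds timeout, std::function<void()> on_timeout) {
        thread_ = std::thread([this, timeout, on_timeout] {
            std::unique_lock<std::mutex> lock(mutex_);
            // wait_for returns false if the timeout expired with done_ still false.
            if (!cv_.wait_for(lock, timeout, [this] { return done_; })) {
                on_timeout();
            }
        });
    }

    ~RebootMonitor() {
        Finish();
        thread_.join();
    }

    // Signals that the monitored work completed in time.
    void Finish() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
    }

  private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
    std::thread thread_;
};
```

In this shape, the shutdown path would create the monitor with the service timeout plus the 300-second margin and call Finish() once the final sync() returns.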
Bug: 150863651
Test: reboot functions normally
Test: put an infinite loop in DoReboot and the reboot monitor thread
triggers and shuts down the device appropriately
Merged-In: I6fd7d3a25d3225081388e39a14c9fdab21b592ba
Change-Id: I6fd7d3a25d3225081388e39a14c9fdab21b592ba
(cherry picked from commit 10615eb397)
Devices in the lab are hitting an issue where they get stuck,
likely in the sync() call in DoReboot(), before we start the reboot
monitor thread and before we shut down services.
It's possible that concurrent writing to RW file systems is causing
this sync() call to take essentially forever. To protect against
this, we need to remove this sync(). Note that we will still call
sync() after shutting down services.
Note that the service shutdown code has a timeout and there is a
reboot monitor thread that will shut down the device if more than 30
seconds pass beyond that timeout. This change increases that margin
to 300 seconds to explicitly give the final sync() calls more time to
finish.
Bug: 150863651
Test: reboot functions normally
Test: put an infinite loop in DoReboot and the reboot monitor thread
triggers and shuts down the device appropriately
Change-Id: I6fd7d3a25d3225081388e39a14c9fdab21b592ba
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible; the
only difference is that the socket's buffer must fill up before init deadlocks.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
A previous version of this change, which was later reverted, added locks
around all service operations and allowed the property thread to spawn
services directly. This was complex because this code
was not designed to be multi-threaded, and it was reverted due to
apparent issues during reboot. This change instead keeps a queue of
pending control messages, which init handles later. It
is less flexible but safer.
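The queue-based approach can be sketched as a small mutex-protected queue: the property thread pushes control messages and init's main loop drains them later. The type and member names below (ControlMessage, PendingControlMessages, Push, Pop) are hypothetical and only illustrate the pattern.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical message shape, e.g. {"restart", "hwservicemanager"}.
struct ControlMessage {
    std::string action;
    std::string target;
};

// Hypothetical queue shared between the property thread (producer)
// and init's main loop (consumer).
class PendingControlMessages {
  public:
    // Called from the property thread; never blocks on init's main loop.
    void Push(ControlMessage message) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(message));
    }

    // Called from init's main loop; returns false when nothing is pending.
    bool Pop(ControlMessage* out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        *out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

  private:
    std::mutex mutex_;
    std::queue<ControlMessage> queue_;
};
```

Because the lock is held only for the push or pop itself, the property thread never waits on service operations, which is the property that avoids the deadlock described above.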
Bug: 146877356
Bug: 148236233
Bug: 150863651
Bug: 151251827
Test: multiple reboot tests, safely restarting hwservicemanager
Merged-In: Ice773436e85d3bf636bb0a892f3f6002bdf996b6
Change-Id: Ice773436e85d3bf636bb0a892f3f6002bdf996b6
(cherry picked from commit 802864c782)
This is apparently causing problems with reboot.
This reverts commit d2dab830d3.
Bug: 150863651
Test: build
Merged-In: Ib8a4835cdc8358a54c7acdebc5c95038963a0419
Change-Id: Ib8a4835cdc8358a54c7acdebc5c95038963a0419
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible; the
only difference is that the socket's buffer must fill up before init deadlocks.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
A previous version of this change, which was later reverted, added locks
around all service operations and allowed the property thread to spawn
services directly. This was complex because this code
was not designed to be multi-threaded, and it was reverted due to
apparent issues during reboot. This change instead keeps a queue of
pending control messages, which init handles later. It
is less flexible but safer.
Bug: 146877356
Bug: 148236233
Bug: 150863651
Bug: 151251827
Test: multiple reboot tests, safely restarting hwservicemanager
Change-Id: Ice773436e85d3bf636bb0a892f3f6002bdf996b6
This is apparently causing problems with reboot.
This reverts commit 7205c62933.
Bug: 150863651
Test: build
Change-Id: Ib8a4835cdc8358a54c7acdebc5c95038963a0419
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible; the
only difference is that the socket's buffer must fill up before init deadlocks.
There are possible partial solutions here: the socket's buffer may be
increased, or property_service may only send messages for the
properties that init will take action on. However, all of these
solutions still lead to eventual deadlock. The only complete solution
is to handle these messages asynchronously.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
4) A lock for any actions with ServiceList or any Services, enforced
through thread annotations, particularly since this code was not
designed with the intention of being multi-threaded.
Bug: 146877356
Bug: 148236233
Test: boot
Test: kill hwservicemanager without deadlock
Merged-In: I84108e54217866205a48c45e8b59355012c32ea8
Change-Id: I84108e54217866205a48c45e8b59355012c32ea8
(cherry picked from commit 7205c62933)
If init is wedged, then the write will never succeed and reboot won't
happen.
Also, in the case of a normal reboot, move the call to PersistRebootReason
to the top of the DoReboot() function, to make sure we persist it even if
/data is not mounted.
Test: builds
Test: adb shell svc power reboot userspace
Test: atest CtsUserspaceRebootHostSideTestCases
Bug: 148767783
Change-Id: I4ae40e1f6fdc41cc0bcae57020fa3d3385dda1b4
Merged-In: I4ae40e1f6fdc41cc0bcae57020fa3d3385dda1b4
If init is wedged, then the write will never succeed and reboot won't
happen.
Also, in the case of a normal reboot, move the call to PersistRebootReason
to the top of the DoReboot() function, to make sure we persist it even if
/data is not mounted.
Test: builds
Test: adb shell svc power reboot userspace
Test: atest CtsUserspaceRebootHostSideTestCases
Bug: 148767783
Change-Id: I4ae40e1f6fdc41cc0bcae57020fa3d3385dda1b4
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible; the
only difference is that the socket's buffer must fill up before init deadlocks.
There are possible partial solutions here: the socket's buffer may be
increased, or property_service may only send messages for the
properties that init will take action on. However, all of these
solutions still lead to eventual deadlock. The only complete solution
is to handle these messages asynchronously.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
4) A lock for any actions with ServiceList or any Services, enforced
through thread annotations, particularly since this code was not
designed with the intention of being multi-threaded.
Bug: 146877356
Bug: 148236233
Test: boot
Test: kill hwservicemanager without deadlock
Change-Id: I84108e54217866205a48c45e8b59355012c32ea8
Helps support recovery and rollback boot reason history by
also recording the reboot reason in
/metadata/bootstat/persist.sys.boot.reason.
Test: manual
Bug: 129007837
Change-Id: Id1d21c404067414847bef14a0c43f70cafe1a3e2
Instead they will be logged from system_server. This CL just prepares
the ground for the logging CL to land.
Test: adb reboot userspace
Bug: 148767783
Change-Id: Ie9482ef735344ecfb0de8a37785d314a3c0417ff