Commit graph

35 commits

Author SHA1 Message Date
Tom Cherry
5938379e91 init: shutdown services in the opposite order that they started
Currently, the order that we kill to services during shutdown is the
order of services_ in ServiceManager and that is defacto the order in
which they were parsed, which is not a very useful ordering.

Related to this, we have seen a few issues during shutdown that may be
related to services with dependencies on other services, where the
dependency is killed first and the dependent service then misbehaves.

This change allows services to keep track of the order in which they
were started and shutdown then uses that information to kill running
services in the opposite order that they were started.

Bug: 64067984
Test: Boot and reboot bullhead

Change-Id: I6b4cacb03aed2a72ae98a346bce41ed5434a09c2
2017-07-26 16:48:06 -07:00
Wei Wang
a01c27eef8 Do not umount roofs even if it is R/W.
Latest device has rootfs instead of "/system" mount point

Bug: 37737296
Test: adb remount, reboot, and check log
Change-Id: I315ecf71e85255fc55c3a80619920b456bad0956
2017-07-25 10:55:10 -07:00
Wei Wang
1be2212319 init: Fire shutdown trigger for thermal shutdown
Recent change in init has bring normal shutdown sequence in
thermal-shutdown condition. This CL will make sure init fire shutdown
trigger where holds custom shutdown actions for vendor SoC/platform.

Bug: 63686426
Test: adb shell setprop sys.powerctl thermal-shutdown
Change-Id: Ieb8579fdf9c30c1a81d60466a7375c9784f3ca98
2017-07-24 13:12:22 -07:00
Keun-young Park
30173874fc init: Do full shutdown even for thermal shutdown
- Skipping SIGTERM / SIGKILL / umount brings race between block
  device driver and fs layer. Do umount before shutting down.
- Reduce timeout to 1 sec for thermal shutdown and skip other time
  taking part like fsck.
- Refactor waiting part to check time in ms so that 1 sec can
  have enough resolution.

bug: 63686426
Test: adb shell setprop sys.powerctl thermal-shutdown, adb shell setprop sys.powerctl reboot and check dmesg
Change-Id: I048bac767b328c8d656a97fe65dde5f2b5bf4ae5
2017-07-19 17:27:05 -07:00
Keun-young Park
c59b822d1f dump stack before kill all
- If problematic process is from user, kill all kills
  it and dump does not show problematic process.

bug: 37737296
Test: reboot and check log
Change-Id: Iaa4f7d12f5a40fa7528c6672567c36e30b140372
2017-07-18 18:52:25 -07:00
Keun-young Park
6e12b3887e Do not umount /vendor, /system, and /oem even if they are R/W.
- /vendor, /system, /oem can be remounted to R/W for development
  purpose.

- In such case, umounting these partitions can lead into some processes
  not running properly during shutdown or blocking umount of fs.

- So skip them. As it is dev feature, it is up to each developer to
  understand the risk. But for normal adb sync - reboot should be ok
  as shutdown involves sync operations.

bug: 37737296
Test: adb remount,reboot, and check last kmsg
Change-Id: Iab6a6374bc558375d359b3b49b14db93d363b1ad
2017-07-17 17:32:26 -07:00
Tom Cherry
ede0d53850 Move Timer from init to libbase
Test: boot bullhead
Test: new libbase unit tests

Change-Id: Ic398a1daa1fe92c10ea7bc1e6ac3f781cee9a5b5
2017-07-10 09:28:24 -07:00
Treehugger Robot
b3915d113d Merge "add "shutdown critical" to service" 2017-07-06 00:40:55 +00:00
Keun-young Park
cccb34fce8 add "shutdown critical" to service
- "shutdown critical" prevents killing the service during
  shutdown. And the service will be started if not running.
- Without it, services will be killed by SIGTERM / SIGKILL during shutdown.
- Even services with "shutdown critical" will be killed if shutdown
  times out.
- Removes ueventd and vold from hard coded list. Each service's rc will
  be updated to add "shutdown critical". watchdogd is still kept in the list.

bug: 37626581
Test: reboot and check last kmsg

Change-Id: Ie8cc699d1efbc59b9a2561bdd40fec64aed5a4bb
2017-07-05 14:55:22 -07:00
Wei Wang
eeab491efd init: Support custom shutdown actions
We have been seeing panics and errors during shutdown sequence in
some vendor's platform, and it is required to disable error handling
during shutdown.

This CL separates the shutdown request to execute another "shutdown"
trigger at the beginning of shutdown stage. And vendor can use this
trigger to add custom commands needed for shutting down gracefully.

Bug: 38203024
Bug: 62084631
Test: device reboot/shutdown
Change-Id: I3fac4ed59f06667d86e477ee55ed391cf113717f
2017-07-05 14:49:57 -07:00
Nick Kralevich
33391dad15 Remove unnecessary SELinux dependencies
These are unused.

Test: code compiles.
Change-Id: Idd707dfcc8f6daac3a489c791ecc364841cf31f9
2017-07-01 07:41:48 -07:00
Luis Hector Chavez
519e5f0592 init: Reland "Terminate gracefully when CAP_SYS_BOOT is absent"
This change makes it possible for Android running in a container to
terminate cleanly instead of calling abort() when requested to shut
down.

Bug: 62388055
Test: `adb reboot` on bullhead causes no kernel panics
Test: `adb reboot` on a system without CAP_SYS_BOOT makes init terminate
       nicely

Change-Id: I36b2298610f5b4a2bf8b05103d04804883df2c88
2017-06-29 14:41:23 -07:00
Guang Zhu
c22f93856f Revert "init: Terminate gracefully when CAP_SYS_BOOT is absent"
Bug: 63080844

This reverts commit 683ebc8059.

Change-Id: I6074ff09300fd30bfc66881ded1c4f868a845a91
2017-06-28 02:10:33 +00:00
Luis Hector Chavez
683ebc8059 init: Terminate gracefully when CAP_SYS_BOOT is absent
This change makes it possible for Android running in a container to
terminate cleanly instead of calling abort() when requested to shut
down.

Bug: 62388055
Test: setprop sys.powerctl reboot makes init terminate nicely

Change-Id: I31c7b475d89d7cbd665e135d9b8951dfd4bca80d
2017-06-27 13:51:46 -07:00
Tom Cherry
81f5d3ebef init: create android::init:: namespace
With some small fixups along the way

Test: Boot bullhead
Test: init unit tests
Change-Id: I7beaa473cfa9397f845f810557d1631b4a462d6a
2017-06-23 13:21:20 -07:00
Keun-young Park
7264bee975 add ueventd to shutdown critial process
- In some devices, some drivers still try to load firmware while shutting
  down, and crashes the kernel. So keep ueventd to prevent such case.

bug: 38203024
Test: reboots
Change-Id: I4f1910723254ccb69f8e9c78e8727fbd8c7eed3e
2017-05-18 20:58:10 +00:00
Treehugger Robot
59c74a3bd1 Merge "init: fix last_reboot_reason string" 2017-04-27 19:39:53 +00:00
Keun-young Park
47d15ed5b9 Merge "set default shutdown timeout to 6 secs" 2017-04-26 22:35:26 +00:00
Treehugger Robot
84d43c8df7 Merge "do not start shutdown animation from init" 2017-04-26 20:59:56 +00:00
Keun-young Park
7feab68238 set default shutdown timeout to 6 secs
- Test data shows that most shutdown finishes in 6 secs.
- The original 10 secs is too long wih no shutdown animation
  running in screen.

bug: 36657139
Test: check time with reboot
Change-Id: I9a805ddfde8156b066485902048d0cd01365c453
2017-04-26 13:58:31 -07:00
Keun-young Park
1663e97fe1 add additional dump for timeout
- add sysrq-trigger current tasks dump
- This helps detecting kernel thread stuck in a specific driver

bug: 37573746
Test: python packages/services/Car/tools/bootanalyze/bootanalyze.py -r -c packages/services/Car/tools/bootanalyze/config.yaml -n 2000 -f -e 15 -w 30  -v -a

Change-Id: Icb20b5fba63d601bb937f004f5889a9bc8340b34
2017-04-26 10:16:23 -07:00
Tom Cherry
47336cebc3 init: fix last_reboot_reason string
This got moved when refactoring the reboot commands.

Bug: 37540660
Test: verify bullhead's last_reboot_reason is correct
Change-Id: I3b86496fc469ca41645df7e7ba8bb51dd25b6b38
2017-04-26 16:17:08 +00:00
Keun-young Park
e2b04b71ae do not start shutdown animation from init
- init will only keep animation related services as shutdown critical.
- external component like system server can start shutdown animation.

bug: 37500823
Test: reboot
Change-Id: Ief328306eba7e3b15402ae27e6236767095f508c
2017-04-19 14:30:25 -07:00
Tom Cherry
98ad32a967 init: handle sys.powerctl immediately
Currently if a process sets the sys.powerctl property, init adds this
property change into the event queue, just like any other property.
The actual logic to shutdown the device is not executed until init
gets to the action associated with the property change.

This is bad for multiple reasons, but explicitly causes deadlock in
the follow scenario:

A service is started with `exec` or `exec_start`
The same service sets sys.powerctl indicating to the system to
shutdown
The same service then waits infinitely

In this case, init doesn't process any further commands until the exec
service completes, including the command to reboot the device.

This change causes init to immediately handle sys.powerctl and reboot
the device regardless of the state of the event queue, wait for exec,
or wait for property conditions.

Bug: 37209359
Bug: 37415192

Test: Init reboots normally
Test: Update verifier can reboot the system
Change-Id: Iff2295aed970840f47e56c4bacc93001b791fa35
2017-04-17 16:40:06 -07:00
Todd Poynor
fc827be3f9 reboot: fix owner and permissions of last_reboot_reason file
Default signature WriteStringToFile creates world-writeable files.
Set owner and group system and remove read/write for non-owner.

Bug: 37251463
Test: Manual: reboot, inspect
Change-Id: I6a29c678168dcae611b120dc52170f4eee7069a9
2017-04-13 18:03:59 -07:00
Keun-young Park
2ba5c8103d poll umount completion from /proc/mounts
- umount operation is asynchronous except for root partition.
  Returning from umount does not guarantee completion of
  umount. Poll /proc/mounts to confirm completion of umount.
- Treat all devices mounting to /data as emulated devices. This is
  future proof when fs other than sdcardfs is used.
- Drop quota sync from sync step. There is no differences in
  frequencies of quota error.
- Run umount in reverse order from mounting order so that any
  hidden dependency can be auto-resolved.
- Add dump of lsof and /proc/mounts when umount fails. lsof only runs
  when selinux is toggled into permissive mode. The dump is enabled
  only for non-user build.
- Keep logcat until vold shutdown in case vold has any error to report.

bug: 36551218
Test: python packages/services/Car/tools/bootanalyze/bootanalyze.py -r -c packages/services/Car/tools/bootanalyze/config.yaml -n 1000 -f -e 20 -w 30

Change-Id: I87b17b966d7004c205452d81460b02c6acf50d45
2017-04-10 15:41:15 -07:00
Tom Cherry
3f5eaae526 init: more header cleanup
Remove includes of "log.h" that really want <android-base/logging.h>
Fix header include order
Remove headers included in .cpp files that their associated .h already includes
Remove some unused headers

Test: boot bullhead
Change-Id: I2b415adfe86a5c8bbe4fb1ebc53c7b0ee2253824
2017-04-06 18:06:34 -07:00
Keun-young Park
7830d59500 add shutdown animation
- Run shutdown animation during shutdown if surfaceflinger is
  available / running.
- services necessary for animation should be added to animation
  class.
- Keep debugging tools while non-critical services are terminated:
  logd, adbd, tombstoned

bug: 36526187
Test: many reboots

Change-Id: I758f700a622c6005f3df9f29de2b55270055ad4d
2017-03-31 16:48:20 -07:00
Keun-young Park
c4ffa5c47d set zero shutdown timeout for eng build
- still it will take time to kill services, < 3 secs in tested device.

bug: 36678028
Test: reboot
Change-Id: I3f3eb83aede8cd950da12e3fcc259eeaf8517c3b
2017-03-29 12:25:33 -07:00
Tom Cherry
ccf23537ee init: replace property_get with its android::base equivalent
Slowly try to decouple property_service.cpp from the rest of init.

Test: Boot bullhead
Change-Id: I267ae0b057bca0bf657b97cb8bfbb18199282729
2017-03-29 10:07:54 -07:00
Tom Cherry
1ec1bd918c init: remove unused cutils includes
Test: Boot bullhead
Change-Id: I629f9c3863f00fa38f87a68442c2380d28764718
2017-03-28 16:22:33 -07:00
Keun-young Park
3ee0df9bdf update shutdown sequence and use shutdown_timeout to cover all wait
- Use ro.build.shutdown_timeout to cover the total time for shutdown.
  Limit wait time for termination only to half of shutdown_timeout
  with max of 3 secs as process not terminating by that time
  will not terminate anyway. It is better to move to the next
  stage quickly. fsck time for user shutdown is excluded from timeout.
- Change last detach to kill, sync, and umount. Last detach did not
  work in many tests.
- add sync after emulated partitions umount as it can trigger
  change in /data.

bug: 36551393
Test: many reboots
Change-Id: Ib75dc19af79b8326b02ccef6b16a8817ae7f8b0e
2017-03-27 13:44:50 -07:00
Keun-young Park
aa08ea458a add kill all for shutdown_timeout of 0
- If it is explicitly set to 0, active processes can block
  umount completely. Safe to kill all processes and umount.
- also add additional sync after emulated partition umount
  as that can change /data partition files

bug: 36004738
Test: many reboots

Change-Id: I6c9b07b6fdece44b9caec4e45ecf26a20d0eb96e
2017-03-23 18:01:24 -07:00
Keun-young Park
3cd8c6f912 add clear log for reboot start / end
- hard to tell if reboot itself is problem or not.

bug: 36004738
Test: reboot and check last kmsg
Change-Id: I0de0e10eac9ac336cc352ddee22a4a1d9e46cb79
2017-03-23 16:55:24 -07:00
Keun-young Park
8d01f63f50 remove emergency shutdown and improve init's reboot logic
- Emergency shutdown just marks the fs as clean while leaving fs
  in the middle of any state. Do not use it anymore.

- Changed android_reboot to set sys.powerctl property so that
  all shutdown can be done by init.

- Normal reboot sequence changed to
    1. Terminate processes (give time to clean up). And wait for
      completion based on ro.build.shutdown_timeout.
        Default value (when not set) is changed to 3 secs. If it is 0, do not
        terminate processes.
    2. Kill all remaining services except critical services for shutdown.
    3. Shutdown vold using "vdc volume shutdown"
    4. umount all emulated partitions. If it fails, just detach.
       Wait in step 5 can handle it.
    5. Try umounting R/W block devices for up to max timeout.
      If it fails, try DETACH.
      If umount fails to complete before reboot, it can be detected when
      system reboots.
    6. Reboot

- Log shutdown time and umount stat to log so that it can be collected after reboot

- To umount emulated partitions, all pending writes inside kernel should
  be completed.
- To umount /data partition, all emulated partitions on top of /data should
  be umounted and all pending writes should be completed.
- umount retry will only wait up to timeout. If there are too many pending
  writes, reboot will discard them and e2fsck after reboot will fix any file system
  issues.

bug: 36004738
bug: 32246772

Test: many reboots combining reboot from UI and adb reboot. Check last_kmsg and
      fs_stat after reboot.

Change-Id: I6e74d6c68a21e76e08cc0438573d1586fd9aaee2
2017-03-22 11:23:31 -07:00