Only thread group leaders should be registered with lmkd. Add a check to
ignore any non-leader TIDs and generate an error if such a condition is
detected. Run the same check before killing a process to detect cases of
non-leader TIDs being used to kill a process. This might happen if PIDs
overflow and a previously registered PID gets reused for a non-leader
thread in the following scenario:
1. pid X is a thread group leader and is registered with lmkd
2. pid X dies without lmkd knowing it and pid gets recycled
3. process Y creates a thread with tid X
4. lmkd kills pid X which results in process Y being killed
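The leader check can be sketched as follows. This is a minimal
illustration, not the code in this change: it assumes the thread group ID
is taken from the "Tgid:" line of /proc/<pid>/status, and the helper names
parse_tgid() and is_thread_group_leader() are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: extract the Tgid value from the text of a
 * /proc/<pid>/status file. Returns -1 if no Tgid line is found. */
long parse_tgid(const char *status_text) {
    const char *p = status_text;
    while (p != NULL && *p != '\0') {
        long tgid;
        /* The literal "Tgid:" only matches at the start of a line. */
        if (sscanf(p, "Tgid: %ld", &tgid) == 1) {
            return tgid;
        }
        p = strchr(p, '\n');
        if (p != NULL) p++;  /* step past the newline to the next line */
    }
    return -1;
}

/* Hypothetical helper: returns 1 if pid is a thread group leader,
 * 0 if it is a non-leader thread, -1 if the status file is unreadable. */
int is_thread_group_leader(pid_t pid) {
    char path[64];
    char buf[1024];
    snprintf(path, sizeof(path), "/proc/%d/status", (int)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    long tgid = parse_tgid(buf);
    if (tgid < 0) return -1;
    /* A leader's thread group ID equals its own ID. */
    return tgid == (long)pid ? 1 : 0;
}
```

In the scenario above, tid X would report process Y's ID as its Tgid, so
the check returns 0 and the kill can be skipped.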
Bug: 136408020
Test: lmkd_unit_test
Change-Id: I46c5a0b273f2b72cefc20ec59b80b4393f2a1a37
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Occasionally we see cases when 40ms polling is still too conservative.
Change to 10ms polling period. Since the polling happens only after PSI
signal and continues for 1sec this should not affect system performance.
Test: lmkd_unit_test
Bug: 129358844
Change-Id: Ib759b865b2104be23741fc0eacaa541e22d50dde
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Previous change If154dc364711bf7c86f32e24ddcd10be359386de called
"lmkd: Do not downgrade/ignore events when swap is full" added SwapTotal
into meminfo structure without adding the field into events.logtag file.
This results in logs that are missing the field, and all fields starting
with "SwapFree" get reordered as a result. Fix this by adding the missing
field into events.logtag.
Bug: 129274901
Test: Confirm correct information in the logcat
Change-Id: Ia4de3790a7e9d49a0e4cba8b3161a715eaf6532e
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
200ms was too lenient when under severe memory pressure.
Test: boots, works
Bug: 127765309
Change-Id: I8e047de6318574a107720c56473ed0f25582e182
Signed-off-by: Tim Murray <timmurray@google.com>
Log min_score_adj when lmkd kills a process to determine the oom_score
levels that lmkd considers during the kill.
Bug: 123024834
Change-Id: I986ae8f2808199b1654bc8d2a32dd88046c79aa3
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
lmkd can't kill processes because it compares the size of free swap with
free memory. Free swap is often larger than free memory when the system
is low on memory and has few swap-backed or swappable pages, which
eventually leads to I/O thrashing.
Test: TreeHugger
Bug: 124727769
Change-Id: Ia2848859aa97a24bd13c704acee4b86cd2d3f647
Bug: http://b/116873221
If not, building with coverage (and -O0) is broken since Clang does not
inline these functions, and does not emit a definition because they are
not static or extern.
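For illustration, a small sketch of the linkage issue described above. The
function name clamp_adj() is hypothetical and is not taken from lmkd.

```c
/* With plain `inline` (neither static nor extern), C99 does not emit an
 * out-of-line definition in this translation unit. At -O0 the call is
 * not inlined, so the linker may report an undefined reference:
 *
 *     inline int clamp_adj(int v, int lo, int hi) { ... }
 *
 * `static inline` gives the function internal linkage, so the compiler
 * emits a local definition whenever the call is not inlined. */
static inline int clamp_adj(int v, int lo, int hi) {
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}
```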
Test: Build with coverage (-O0)
Change-Id: I2880119834f764164a1aac10b696e76a2b462b71
We already know that "polling" must be non-zero at this point,
because it hasn't been modified since our check on line 1960.
So we remove this check for code clarity.
Test: TreeHugger
Change-Id: I069d9fd0eef70748a5333733dd0518d1ac8021b7
With new psi monitor support in the kernel, lmkd can use it to register
custom pressure levels. Add lmkd support for psi monitors when they are
provided by the kernel and use them by default. When the kernel does not
support psi, lmkd will fall back to vmpressure usage.
Add the ability to poll memory status after the initial psi event is
triggered, because the kernel throttles psi memory pressure events to one
per PSI tracking window (currently set to 1sec). The current implementation
polls every 200ms for 1sec duration after the initial event is triggered.
If ro.lmk.use_psi is set to false, psi logic will be disabled even when psi
is supported in the kernel.
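The polling window can be sketched as a pure time check. The constants
match the values stated in this message (200ms interval, 1sec window); the
struct and function names are hypothetical and this is not the actual lmkd
event loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSI_WINDOW_MS 1000      /* PSI tracking window: 1sec */
#define PSI_POLL_PERIOD_MS 200  /* polling interval after the first event */

struct poll_state {
    bool polling;          /* true while inside the polling window */
    int64_t window_start;  /* time of the initial psi event, in ms */
};

/* Called when the kernel delivers a psi event: start polling. */
void on_psi_event(struct poll_state *s, int64_t now_ms) {
    s->polling = true;
    s->window_start = now_ms;
}

/* Returns the timeout in ms until the next poll, or -1 to block until
 * the next psi event arrives. */
int next_timeout_ms(struct poll_state *s, int64_t now_ms) {
    if (!s->polling) return -1;
    if (now_ms - s->window_start >= PSI_WINDOW_MS) {
        s->polling = false;  /* window expired, stop polling */
        return -1;
    }
    return PSI_POLL_PERIOD_MS;
}
```

The returned value is shaped so that it could be passed as the timeout
argument of a call such as epoll_wait(), where -1 means wait indefinitely.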
Bug: 111308141
Test: lmkd_unit_test
Change-Id: I685774b176f393bab7412161773f5c9af51e0163
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
New library to create, register and destroy PSI monitors in a consistent
way with lmkd.
Test: used within lmkd
Bug: 111308141
Change-Id: If243a97f178e90fe41e2de90c7b858ba82440279
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
This reverts commit 1bef8c550c.
Reason for revert: AOSP is fixed with new vendor image
Change-Id: Ib341ac80e2f88c13a7815a490ea2d9422ebdf55f
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
lmkd's test depends on the socket_local_client() function from
libcutils, but since liblog also exposes this symbol weakly, that is
the symbol that gets used instead of the intended libcutils copy of
the symbol.
Test: build
Change-Id: I294fa157a7f50881bf6360922419976eb1ee3ac7
This is to measure an application's behavior with respect to being LMKed
(the longer an app lives before being LMKed, the better).
Bug: 119854389
Test: Manual
Change-Id: I4ef6433391c8758626334731d2b5de038e4468ae
Merged-In: I4ef6433391c8758626334731d2b5de038e4468ae
(cherry picked from I4ef6433391c8758626334731d2b5de038e4468ae)
find_and_kill_processes() does not kill multiple processes at a time
anymore. Remove support for bulk process killing.
Change-Id: Id09132a9cebe44589a1a3ebcbff800a16fa56557
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Kill a single process at a time and try to wait up to 100ms for
that process to reclaim memory before triggering another kill.
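A minimal sketch of the throttling decision, under the assumption that the
caller tracks the time of the last kill and whether the last victim still
exists. The function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define KILL_TIMEOUT_MS 100  /* max wait for the last victim to free memory */

/* Returns true if another kill may be triggered: either the previous
 * victim is already gone, or the 100ms wait has expired. */
bool may_kill_again(int64_t last_kill_ms, int64_t now_ms, bool victim_alive) {
    if (!victim_alive) return true;
    return now_ms - last_kill_ms >= KILL_TIMEOUT_MS;
}
```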
Test: boots, works
bug: 116877958
Change-Id: I6775d0534b3e3728c04389d3eae1a00e3cbf9f27
Introduce LMK_GETKILLCNT command for ActivityManager to get the number of
kills from lmkd.
Bug: 117126077
Test: used lmkd_unit_test to verify correct reporting
Change-Id: I09c720a7176b4df95efc544177cd2694f8d791be
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
lmkd sets the soft limit parameters for Go devices.
The limit for apps in the perceptible group is set to 16M.
However this limit is not sufficient for the keyboard app to
prevent pages from being reclaimed quickly. The mem usage of
the keyboard app is around 55M in most cases with some occasional
spikes to 70-80M. Increasing the limit to 64M improves the warm
startup latency for keyboard. It is still lower than the limits
set for foreground and visible apps.
Test: Go device (1G)
Bug: 117517805
Merged-In: Id50e49327cfd76126e41ef6503971845f29196af
Change-Id: Id50e49327cfd76126e41ef6503971845f29196af
lmkd keeps a list of pids registered by ActivityManager; however, on rare
occasions when the framework restarts and lmkd survives, that list has to
be purged. Implement a command that can be used to clear the pid list.
Bug: 116801366
Test: locally by killing zygote process
Change-Id: I71d6012f86bb83a73edd5b687e05a0848e0569b1
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
pid_remove() frees the structure representing a registered process, and
the pointer can't be used anymore. This change fixes an instance where the
pointer was used after it was freed. pid_remove() is moved to the end of
the function and comments are added to prevent a similar situation in the
future.
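The ordering can be illustrated with a simplified sketch. The struct
layout and the functions proc_remove() and kill_and_report() are
hypothetical stand-ins, not the lmkd code.

```c
#include <stdlib.h>

struct proc {   /* hypothetical record for a registered process */
    int pid;
    int oomadj;
};

/* Stand-in for pid_remove(): frees the record. */
void proc_remove(struct proc *p) {
    free(p);
}

/* Reads every field it needs first, then frees the record last. */
int kill_and_report(struct proc *p) {
    int pid = p->pid;  /* copy out before the record is freed */
    /* ... send the kill signal and log using the local copy `pid` ... */
    proc_remove(p);    /* p must not be dereferenced after this line */
    return pid;
}
```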
Bug: 117625315
Change-Id: I6a922952a31232497b3f9caf87d5a21bd402db94
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
An excessive number of failed kill reports, generated when lmkd can't find
an eligible process to kill or does not free enough memory, pollutes logs
and bugreports. Clean up kill reports to remove duplicate information and
rate limit failed kill attempts to 1 report per sec. The number of
suppressed failed kills will be reported in the next lmkd report.
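A minimal sketch of this kind of rate limiting, assuming a millisecond
clock supplied by the caller. The struct, the function name and the
message text are hypothetical. As a simplification, the suppressed count
is attached to the next failure report that gets emitted.

```c
#include <stdint.h>
#include <stdio.h>

#define FAIL_REPORT_INTERVAL_MS 1000  /* at most 1 failure report per sec */

struct fail_report_state {
    int reported;            /* nonzero once any report has been emitted */
    int64_t last_report_ms;  /* time of the last emitted report */
    unsigned suppressed;     /* failures skipped since that report */
};

/* Returns 1 and fills `out` if a report is emitted, 0 if suppressed. */
int report_failed_kill(struct fail_report_state *s, int64_t now_ms,
                       char *out, size_t out_size) {
    if (s->reported && now_ms - s->last_report_ms < FAIL_REPORT_INTERVAL_MS) {
        s->suppressed++;  /* remember it, but stay quiet */
        return 0;
    }
    snprintf(out, out_size, "kill failed (%u suppressed)", s->suppressed);
    s->reported = 1;
    s->last_report_ms = now_ms;
    s->suppressed = 0;
    return 1;
}
```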
Bug: 113864581
Test: Verified using lmkd_unit_test
Change-Id: I67fa1fec97613f136c7582115edcbc56b1503c9c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Introduce sys.lmk.minfree_levels system property to allow minfree level
reporting. The format for this property is:
<minfree 1>:<oom_adj 1>, <minfree 2>:<oom_adj 2>, ...
The maximum number of minfree levels is 6 and they are specified in
increasing order. For example:
sys.lmk.minfree_levels=18432:0,23040:100,27648:200,32256:300,55296:900,80640:906
sys.lmk.minfree_levels updates are ratelimited to once per second in order
to prevent DoS attacks.
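A sketch of how such a property value could be assembled. The function
name is hypothetical; the separator follows the example above (commas
with no spaces), and actually setting the property is left out.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_LEVELS 6  /* max number of minfree levels */

/* Writes "<minfree 1>:<oom_adj 1>,<minfree 2>:<oom_adj 2>,..." into out.
 * Returns the string length, or -1 if count is out of range or the
 * buffer is too small. */
int format_minfree_levels(const int *minfree, const int *oom_adj, int count,
                          char *out, size_t out_size) {
    if (count < 1 || count > MAX_LEVELS || out_size == 0) return -1;
    size_t used = 0;
    for (int i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used, "%s%d:%d",
                         i ? "," : "", minfree[i], oom_adj[i]);
        /* n >= remaining space means the output was truncated */
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```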
Bug: 111521182
Test: getprop sys.lmk.minfree_levels returns expected value
Change-Id: I80d75d6836650b12457d6a99ca88898535837a97
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
While troubleshooting memory pressure related issues it's hard to get a
good view of the memory state when an lmkd kill happens. Logging relevant
information from the /proc/meminfo file that was used to make a kill
decision is very helpful for further analysis. To do this efficiently we
are using Android Logger event library functions and log the data used for
the kill decision after the kill signal was issued.
Test: Run lmkd_unit_test and logcat -b events -v descriptive
Change-Id: Id5de41b9d91a04dd5d3eb9b85d4e1babe9755628
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
When the swap space is full, a pressure event is unlikely to resolve by
itself. In this case, do not downgrade or ignore the events.
Bug: 112056451
Test: Fill up swap on a 1GB device and check critical vmpressure events
are not downgraded.
Change-Id: If154dc364711bf7c86f32e24ddcd10be359386de
Initial change to remove memory.stat usage when per-application memcgs
are disabled was partially merged into AOSP under the following id:
Ib6dd7586d3ef1c64cb04d16e2d2b21fa9c8e6a3a
This change adds the missing parts.
Bug: 110384555
Change-Id: I1265021b1ede0e68efbf80d6430a959eaf46a69a
Merged-In: Ib6dd7586d3ef1c64cb04d16e2d2b21fa9c8e6a3a
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
We're passing a 'line' whose backing buffer is PAGE_MAX in size
into memory_stat_parse_line(). We protect against overflowing the
smaller LINE_MAX 'key' buffer via some C preprocessor macros to ensure
we limit the size.
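One common way to do this with the preprocessor is to stringify a width
macro into the sscanf() format. The sketch below uses that technique with
hypothetical names and sizes (KEY_BUF_SIZE, KEY_FIELD_WIDTH,
parse_stat_line); it is not the code in this change.

```c
#include <stdio.h>

#define KEY_BUF_SIZE 64     /* hypothetical size of the small key buffer */
#define KEY_FIELD_WIDTH 63  /* KEY_BUF_SIZE - 1, leaves room for the NUL */

#define STR_(x) #x
#define STR(x) STR_(x)      /* expand the macro first, then stringify */

_Static_assert(KEY_FIELD_WIDTH == KEY_BUF_SIZE - 1,
               "field width must leave room for the terminating NUL");

/* Parses "key value". Returns 0 on success, -1 on malformed input. */
int parse_stat_line(const char *line, char key[KEY_BUF_SIZE],
                    long long *value) {
    /* The format becomes "%63s %lld", so sscanf() never stores more
     * than 63 characters plus the NUL into key, however long line is. */
    if (sscanf(line, "%" STR(KEY_FIELD_WIDTH) "s %lld", key, value) != 2) {
        return -1;
    }
    return 0;
}
```

An oversized key is cut off at the field width, the value conversion then
fails, and the line is rejected instead of overflowing the buffer.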
Test: Local build with LMKD_LOG_STATS set for this file.
Bug: 76220622
Change-Id: I9e50d4270f7099e37a9bfc7fb9b9b95cc7adb086
Per-application memory.stat files are not available when per-application
memcgs are not used (per_app_memcg=false). Disable its usage based on
ro.config.per_app_memcg property.
minchan:
* correct indentation of memory_stat_parse
* move per_app_memcg check inside memory_stat_parse
* change low_ram_device to per_app_memcg
Bug: 110384555
Test: manual test to see lmkd log message with memory hogger
Merged-In: Ib6dd7586d3ef1c64cb04d16e2d2b21fa9c8e6a3a
Change-Id: Ib6dd7586d3ef1c64cb04d16e2d2b21fa9c8e6a3a
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
* New clang compiler requires variadic function to have
at least one named parameter type.
* Use ##__VA_ARGS__ to work with empty __VA_ARGS__.
* Fix one ALOG_ASSERT parameter bug in lmkd/lmkd.c.
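A small sketch of the ##__VA_ARGS__ pattern. The macro name FORMAT_MSG and
the "lmkd: " prefix are hypothetical; `##` before __VA_ARGS__ is a GNU
extension that both gcc and clang accept.

```c
#include <stdio.h>

/* `fmt` is the named parameter. The `##` removes the preceding comma
 * when no variadic arguments are passed, so FORMAT_MSG(buf, n, "text")
 * still expands to a valid snprintf() call. */
#define FORMAT_MSG(buf, size, fmt, ...) \
    snprintf((buf), (size), "lmkd: " fmt, ##__VA_ARGS__)
```

Both FORMAT_MSG(buf, sizeof(buf), "started") and
FORMAT_MSG(buf, sizeof(buf), "pid %d", 42) compile with this definition.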
Bug: 111614304
Test: make with WITH_TIDY=1
Change-Id: I90f35aa88527a6897954f69a35b256a157a725c5
Setting memory.soft_limit_in_bytes on high-end devices with large memory
reserves affects performance of memory-hungry applications that have
large workingsets and keep thrashing because of the memory limits imposed.
Limit the usage of memory.soft_limit_in_bytes to low-memory devices only.
Add debug messages for future troubleshooting to capture cases when
vmpressure events are being ignored.
Bug: 78916015
Test: collect vmstat while running a heavy app
Change-Id: Ib4434b96d2be802ef89960b573486eae8d12f198
Merged-In: Ib4434b96d2be802ef89960b573486eae8d12f198
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Logs that provide information about memory conditions during a process
kill event contain useful information and do not affect device
performance because these events are rare. Enabling them even when
ro.config.debug flag is not set will help in understanding low memory
conditions.
Bug: 79572814
Change-Id: Iae6e9bb612b9a7904ca491de3f1ddc727f24c7e0
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
1. let logs be written to statsd directly like all other stats logs.
+ stats log should not write to logd anymore (b/78239479)
2. fixed the log format
+ need to embed the elapsed real time in the log
3. fixed the log context reuse problem
+ reset the log context buffer and internal state before reuse
Bug: 78603347
Bug: 78239479
Test: tested with alloc_stress, and saw logs written to statsd
performance measurement (memory & cpu):
https://paste.googleplex.com/5508158646648832
Change-Id: I345f0eace8ba1687ff480fb88e9abba1d8533f76