Add -T to the pstree command so that it doesn't print the threads of
the running processes, the thread information hasn't been useful for
debugging any previous instances of stuck processes.
Log that there are stuck processes to stdout with a pointer to look
in soong.log.
Test: manual
Change-Id: I6459f2887a7e79591e8c451d06969f8709db3393
To maintain good backwards compatibility with the legacy partition
building behavior, allow actions to read
BUILD_BROKEN_INCORRECT_PARTITION_IMAGES so that we don't have to rerun
analysis.
Bug: 205632228
Test: Presubmits
Change-Id: I2b55c0143cbdaf010e6b5fd0c3d51d6930a94eff
To maintain good backwards compatibility with the legacy partition
building behavior, allow actions to read
BUILD_BROKEN_INCORRECT_PARTITION_IMAGES so that we don't have to rerun
analysis.
Bug: 205632228
Test: Presubmits
Change-Id: I8b25a62f24bc6d628fb055239b084f6ea535e38b
As-is: Even with NINJA_LOG option, ninja weight list is copied from
ninja log
To-be: uses usesninjalogasweightlist to uses ninja log file directly
If weight list is large(more than 100,000 lines), reading the whole list
spends a few seconds(1-3s). It is ignorable in full build, but it might
be considerable in small build or m nothing scenario.
If weight list comes from ninja log, we can just directly read ninja log
data instead of writing weight list from ninja log outside, and read the
weight list. It doesn't spend additional time because ninja log is
loaded by default.
Bug: 271527305
Test: build with NINJA_LOG
Change-Id: Id7a4cca95898ce439d0a682f9bfd954b462f1849
If this option is set, Soong generates ninja weight list including
modules in HugeModulesList in allowlists.go
Test: m --ninja_weight_source=soong
Bug: 273282046
Change-Id: Id92b7f9f9e8152c1c46ae071c5821a479cf47bce
1. extract .ninja_log as const
2. log an error during reading .ninja_log
Test: m nothing
Bug: 271527305
Change-Id: I395dd8419620bfa9fad3af23c96e5a22ca44e2fb
It has 3 options for now.
* ninja_log: uses ninja log file for data source
* evenly_distributed: pass empty list for ninja to consider every work as the same
* not_used: do not use critical path logic in ninja
In addition, I added the option in the metrics to track
Bug: 271527305
Test: m --ninja_weight_source=ninja_log|empty|not_used
Change-Id: Ib4c812c20606a34b17d3f0edb71057b477c4f90e
This contains the environment variables set by soong_ui during the ninja
execution. This file will be unused in single-tree builds, but will be
useful for multi-tree. The orchestrator will use this file to set the
correct environment for each inner tree build in the combined ninja
execution.
Test: m nothing compiles
Change-Id: I0570e34f51f426448464fb80030d4bea1cd52acb
This allows us to track how much time is spent in Clang.
Test: TOOLCHAIN_RUSAGE_OUTPUT=/tmp/rusage.txt m
Change-Id: Ib2961904f363bc59bd9d928bb055a96740cb9f17
This variable tells reproxy the number of times an action that
mismatches between local and remote-executions should be run remotely to
determine if the action is inherently non-deterministic.
An action is considered non-deterministic if the number of unique output
digests we get from the remote-run is > 1.
Bug: b/178842439
Change-Id: I0d50eed29b801be67a7ede9b25ce7f4535980f05
This requires linking Blueprint into soong_ui. It lets us avoid the
complicated dance of Ninja files and shell scripts: now the information
as to how soong_build is built is passed directly to Blueprint using a
struct that contains all the information the command line arguments used
to contain.
The ability to run Blueprint from the command line is kept (for now).
Some variables in bootstrap/command.go needed public accessor functions
because soong_build reads them. This will be disentangled by moving the
flag parsing to soong_build.
The presence of the flag definitions in Blueprint means that soong_ui
now also accepts them. This is not a problem in practice because they
are ignored and because soong_ui itself is hidden behind a few layers of
shell scripts.
Test: Presubmits + the new bootstrap_test.sh .
Change-Id: I6dca478f356f56a8aee1e457d71439272351390b
On top of improving the comments, I also refactored the status checker
to simplify and clarify its use case to be a ninja stuckness checker,
since it doesn't appear to have other purposes.
Test: TH presubmit
Bug: b/173474588
Change-Id: I2cf51a1ebf16071a24a1c13c06c7b1adf60256de
The stats output will now use the new "DEBUG" message type, which we can
always redirect to verbose.log.gz.
Test: m aprotoc (look in verbose.log.gz)
Change-Id: Ie1b58f12c008ff7d29f11ff7a9807488dba8a504
This replaces the _kati_always_build_ hack with a ninja feature so that
ninja can actually understand what's happening. This means that we can
turn on some more options and checks around expected output filenames:
* Remove the output file(s) before the command executes
* Error if the output file(s) don't exist after the execution
* Error if the output is a directory
They're turned on immediately during the soong bootstrap runs, as those
run a limited number of rules. The main ninja execution does not remove
the output files, and prints warnings instead of errors for the others.
I'll turn them on as we understand how often those warnings are seen.
Test: m (check for new warnings)
Test: treehugger (check for new warnings)
Change-Id: I7f5c1a361dd1498eb54a2c07a918f3b0aa086e4c
Trying to override the default directory for ccache by using the
CCACHE_DIR environment variable fails unless it is in the allowed
list.
Bug: 149670916
Test: manual
Change-Id: I8e7eea7a5c25d7ea5f0956fafc70d62522f3c4fc
These variables control whether we accept cached-results / not and
whether we should update the cache with locally executed results or not.
I need accpet-cached whitelisted to purge the the RBE cache for the
invalid cache entry we have set for the failure we say yesterday.
Bug: b/148387048
Change-Id: I7344fc083f82e0b7bc11084376a267d19cf30bb8
This is used by the art apex test.
Bug: 147197813
Test: m EMMA_INSTRUMENT=true EMMA_INSTRUMENT_FRAMEWORK=true art-check-debug-apex-gen
Change-Id: Id185cd35f16131f2c9a8afeba2e5b87834b0e620
We're passing in ASAN_OPTIONS=detect_leaks=0 on the command line, and
the build started failing after we limited the environment.
Test: forrest of aosp-master/aosp_x86_64-eng
Change-Id: I42c91897c7677e1a249412e5a8bc7bb1edb2f881
See the change that added ALLOW_NINJA_ENV for more information.
Test: m nothing; check out/soong.log for small list
Test: ALLOW_NINJA_ENV=true m nothing; check out/soong.log
Change-Id: I7761c6a07a7f8b0acee107e9c27c7739dd4b63ab
Ninja does not track changes in environment variables, so we get
potentially incorrect builds when environment variables change during
incremental builds if some action was using one of them.
Add a variable to limit exposure of these variables to ninja, and thus,
all actions run by ninja. Kati and Soong can still read environment
variables, they explicitly track which ones they read so that we can
re-run them appropriately.
This list is just the beginning, there's no good way to detect which
environment variables are currently being used and to pass them through.
So this initial change won't have a behavioral change, and we'll flip
the switch and see what fails or who complains, flipping it off and on
and adding to the list until we can make this always happen.
Also adds a board-specific `BUILD_BROKEN_NINJA_USES_ENV_VARS := ...`
list so that we can temporarily allow board-specific variables until
they're fixed.
Test: check out/soong.log
Test: ALLOW_NINJA_ENV=false m nothing; check out/soong.log
Test: set BUILD_BROKEN_NINJA_USES_ENV_VARS := OLDPWD
ALLOW_NINJA_ENV=false m nothing; check out/soong.log
Change-Id: I08e4834ce12100a577ef7d6a9a21b9e9d345cb93
Pass --default_pool=local_pool to kati when RBE or goma is enabled
to put most rules into the local_pool. Specific rules that support
RBE or goma will set .KATI_NINJA_POOL := none to remove themselves
from the local_pool. Passing --default_pool will also disable the
hack in kati that sets the pool based on the presence of the string
"/gomacc" in the command line.
Fixes: 143938974
Test: inspect pools in build-${TARGET-PRODUCT}.ninja for m USE_RBE=true
Test: inspect pools in build-${TARGET-PRODUCT}.ninja for m USE_GOMA=true
Change-Id: I839b2488383fcd63fffd613e25b0b9abcb72b567
Test: Built aosp_arm-user with and without USE_RBE. USE_RBE uses
a proxy script in place of rewrapper.
Change-Id: I5bf008a940513872d70b5b215bd6209f759826ae
When running ninja stream stdout to the status writer. This improves
the ninja -d explain behavior, and will also allow Soong to be put
into the ninja console pool letting it print timely output to the
console.
Bug: 80165685
Test: NINJA_ARGS="-d explain" m
Change-Id: I40f03fc01d837ad91d400eae378007a7ce807705
Wait for the ninja proto processing goroutine to notice the fifo
has closed and exit before continuing.
Test: m nothing
Change-Id: I8cf5f3b8bf6a91496c6d2bbbd3e811eb7f0c9d21
This really only initializes the sandbox, it does not attempt to change
the view of the filesystem, nor does it turn off networking.
Bug: 122270019
Test: m
Test: trigger nsjail check failure; lunch; m; cat out/soong.log
Test: USE_GOMA=true m libc
Change-Id: Ib291072dcee8247c7a15f5b6831295ead6e4fc22
So that ninja produces an error instead of just a warning when a dep
file is not produced.
Bug: 121058584
Test: check build logs for "depfile is missing" warning
Test: treehugger
Change-Id: I1cbaba866eaf293495c3c0b2b174190bcb2b0f9a
Test: Dumped the text formated based metrics file to out dir,
and checked the file.
Bug: b/63815990
Change-Id: Iff476f72a0be74eb53b6b26ef468d11c0f24a404
This way we don't appear hung at:
No need to regenerate ninja file
Change-Id: I8dbdaa2c1b1c5a6a73187d0e6061f363b62e10c9
Fixes: 122251150
Test: m nothing
We're only using it to distribute files in case of failure, which isn't
well supported currently, but can be handled for now by using the
DIST_DIR environment variable during the command execution.
This was at least one cause that we'd be re-running Soong during every
build server build, as the DIST_DIR values are unique.
Test: m dist
Change-Id: Ibd5e6b6c46695350de80b745bfb6a6aa685033a0
Ninja now knows how to write directly to a file (or in our case, a named
pipe). This works around an issue we were seeing on Mac, where Go would
just hang after 50-2000 proto messages. It's also just a simpler
solution.
Bug: 111544015
Test: `m` with updated ninja on both Linux & Mac
Change-Id: Ic91920d83a6d2ea0b79e82b467e2423d78189f12
This adds a new status package that merges the running of "actions"
(ninja calls them edges) of multiple tools into one view of the current
state, and gives that to a number of different outputs.
For inputs:
Kati's output parser has been rewritten (and moved) to map onto the
StartAction/FinishAction API. A byproduct of this is that the build
servers should be able to extract errors from Kati better, since they
look like the errors that Ninja used to write.
Ninja is no longer directly connected to the terminal, but its output is
read via the protobuf frontend API, so it's just another tool whose
output becomes merged together.
multiproduct_kati loses its custom status routines, and uses the common
one instead.
For outputs:
The primary output is the ui/terminal.Status type, which along with
ui/terminal.Writer now controls everything about the terminal output.
Today, this doesn't really change any behaviors, but having all terminal
output going through here allows a more complicated (multi-line / full
window) status display in the future.
The tracer acts as an output of the status package, tracing all the
action start / finish events. This replaces reading the .ninja_log file,
so it now properly handles multiple output files from a single action.
A new rotated log file (out/error.log, or out/dist/logs/error.log) just
contains a description of all of the errors that happened during the
current build.
Another new compressed and rotated log file (out/verbose.log.gz, or
out/dist/logs/verbose.log.gz) contains the full verbose (showcommands)
log of every execution run by the build. Since this is now written on
every build, the showcommands argument is now ignored -- if you want to
get the commands run, look at the log file after the build.
Test: m
Test: <built-in tests>
Test: NINJA_ARGS="-t list" m
Test: check the build.trace.gz
Test: check the new log files
Change-Id: If1d8994890d43ef68f65aa10ddd8e6e06dc7013a
This way we only have one way to start a build, which always has logging
/ tracing / etc, even if we don't need Kati.
There's two ways to use this:
As a direct replacement for mkdir out; cd out; ../bootstrap.bash;
./soong -- as long as --skip-make is always passed, we'll never run
Kati, and Soong will run outside of it's "make" mode. This preserves
most of the speed, and allows full user control over the Soong
configuration.
A (experimental, dangerous) way to temporarily bypass the product
variable and kati steps of a build. As long as a user is sure that
nothing has changed from the last build, and they know exactly which
Ninja targets they want to build (which may not be the same as the
arguments normally passed to 'm'), this can lead to shorter build
startup times.
Test: rm -rf out; m --skip-make libc
Test: rm -rf out; m libc; m --skip-make libc
Test: rm -rf out; mkdir out; cd out; ../bootstrap.bash; ./soong libc
Test: build/soong/scripts/build-ndk-prebuilts.sh
Change-Id: Ic0f91167b5779dba3f248a379fbaac67a75a946e
This doesn't catch all the possible causes of timeouts,
(like if Ninja is only partially stuck or if Kati is stuck)
but it should clarify some causes of stuckness
Bug: 62065855
Test: m -j showcommands NINJA_HEARTBEAT_INTERVAL=500ms
Change-Id: I73a792ae91873b19d7b336166a2d47f37c549906
Wrap os/exec.Cmd to use our Context and Config interfaces for automatic
logging and error handling. It also simplifies environment modification
based on the Config's environment.
This also adds sandboxing on Macs using sandbox-exec. A simple profile
is provided that only logs on violations, though multiproduct_kati on
AOSP has no violations. This isn't applied to ninja, only make / soong /
kati to start with. I measured <5% time increase in reading all
makefiles, and no noticable difference when kati doesn't regenerate.
I'd like to spin up a process to dump violation logs into our log file,
but the log reporting changed over the range of Mac versions that we
support, so that's going to be more complicated. Opening Console.app
works in all cases if you're local -- just search/filter for sandbox.
Linux sandboxing will be implemented later -- the sandbox definition is
opaque enough to support a different implementation.
Test: multiproduct_kati on AOSP master on Mac
Change-Id: I7046229333d0dcc8f426a493e0f7380828879f17
I missed this when converting to soong_ui.
Test: m -j blueprint_tools (check soong.log)
Test: SANITIZE_HOST=address m -j blueprint_tools
Change-Id: I01eb567db6848dc36dd679557291a4e600a63bba
This creates a rotating build.trace.gz in the out directory that can be
loaded with chrome://tracing. It'll include start and end timings for
make/soong/kati/ninja, and it will import and time-correct the ninja log
files.
Test: m -j; load out/build.trace.gz in chrome://tracing
Test: multiproduct_kati -keep; load out/multiproduct*/build.trace.gz
Change-Id: Ic060fa9515eb88d95dbe16712479dae9dffcf626
Right now this mostly just copies what Make is doing in
build/core/ninja.mk and build/core/soong.mk. The only major feature it
adds is a rotating log file with some verbose logging.
There is one major functional difference -- you cannot override random
Make variables during the Make phase anymore. The environment variable
is set, and if Make uses ?= or the equivalent, it can still use those
variables. We already made this change for Kati, which also loads all of
the same code and actually does the build, so it has been half-removed
for a while.
The only "UI" this implements is what I'll call "Make Emulation" mode --
it's expected that current command lines will continue working, and
we'll explore alternate user interfaces later.
We're still using Make as a wrapper, but all it does is call into this
single Go program, it won't even load the product configuration. Once
this is default, we can start moving individual users over to using this
directly (still in Make emulation mode), skipping the Make wrapper.
Ideas for the future:
* Generating trace files showing time spent in Make/Kati/Soong/Ninja
(also importing ninja traces into the same stream). I had this working
in a previous version of this patch, but removed it to keep the size
down and focus on the current features.
* More intelligent SIGALRM handling, once we fully remove the Make
wrapper (which hides the SIGALRM)
* Reading the experimental binary output stream from Ninja, so that we
can always save the verbose log even if we're not printing it out to
the console
Test: USE_SOONG_UI=true m -j blueprint_tools
Change-Id: I884327b9a8ae24499eb6c56f6e1ad26df1cfa4e4