Commit graph

2 commits

Author SHA1 Message Date
Ryan Prichard
129f7a1d8e Neon-optimized version of the GNU symbol calculation
On 64-bit walleye, improves the linker relocation benchmark from 71.9ms to
70.7ms (1.7% of the run-time).

On a 32-bit device, it improves the linker relocation benchmark from
205.5ms to 201.2ms (2.1% of the run-time).

$ adb shell taskset 10 /data/benchmarktest64/linker-benchmarks/linker-benchmarks --benchmark_repetitions=100 --benchmark_display_aggregates_only
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_gnu_hash_simple_mean        15232 ns        15212 ns          100
BM_gnu_hash_simple_median      15176 ns        15159 ns          100
BM_gnu_hash_simple_stddev        111 ns          110 ns          100
BM_gnu_hash_neon_mean          10265 ns        10252 ns          100
BM_gnu_hash_neon_median        10261 ns        10249 ns          100
BM_gnu_hash_neon_stddev         28.1 ns         23.9 ns          100

Bug: none
Test: linker-unit-tests
Test: linker-benchmarks
Change-Id: I3983bca1dddc9241bb70290ad3651d895f046660
2020-01-13 13:29:25 -08:00
Ryan Prichard
339ecef22d Optimize GNU hash linking for large inputs
Symbol lookup is O(L) where L is the number of libraries to search (e.g.
in the global and local lookup groups). Factor out the per-DSO work into
soinfo_do_lookup_impl, and optimize for the situation where all the DSOs
are using DT_GNU_HASH (rather than SysV hashes).

To load a set of libraries, the loader first constructs an auxiliary list
of libraries (SymbolLookupList, containing SymbolLookupLib objects). The
SymbolLookupList is reused for each DSO in a load group. (-Bsymbolic is
accommodated by modifying the SymbolLookupLib at the front of the list.)
To search for a symbol, soinfo_do_lookup_impl has a small loop that first
scans a vector of GNU bloom filters looking for a possible match.

There was a slight improvement from templatizing soinfo_do_lookup_impl
and skipping the does-this-DSO-lack-GNU-hash check.

Rewrite the relocation processing loop to be faster. There are specialized
functions that handle the expected relocation types in normal relocation
sections and in PLT relocation sections.

This CL can reduce the initial link time of large programs by around
40-50% (e.g. audioserver, cameraserver, etc). On the linker relocation
benchmark (64-bit walleye), it reduces the time from 131.6ms to 71.9ms.

Bug: http://b/143577578 (incidentally fixed by this CL)
Test: bionic-unit-tests
Change-Id: If40a42fb6ff566570f7280b71d58f7fa290b9343
2020-01-13 13:29:25 -08:00