platform_bionic/benchmarks
Elliott Hughes a4959aa6f8 Reimplement the <ctype.h> is* functions.
Following on from the towlower()/towupper() changes, add benchmarks for
most of <ctype.h>, rewrite the tests to cover the entire defined range
for all of these functions, and then reimplement most of the functions.

The old table-based implementation is mostly a bad idea on modern
hardware, with only ispunct() showing a significant benefit compared to
any other way I could think of writing it, and isalnum() a marginal but
still convincingly genuine benefit.

My new benchmarks make an effort to test an example from each relevant
range of characters to avoid, say, accidentally optimizing the behavior
of `isalnum('0')` at the expense of `isalnum('z')`.

Interestingly, clang is able to generate what I believe to be the
optimal implementations from the most readable code, which is
impressive. It certainly matched or beat all my attempts to be clever!
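
As a rough illustration of what "the most readable code" could look
like, here is a minimal sketch of range-check classifiers. The sketch_*
names are hypothetical and this is not the code from the change itself;
it assumes ASCII and the "C" locale.

```cpp
#include <cstdio>  // EOF

// Hypothetical range-check classifiers, assuming ASCII and the "C" locale.
// Like the <ctype.h> functions, they take an int and return nonzero on a match.
// EOF and any other out-of-range value simply falls outside every range,
// so no special case is needed.
inline int sketch_isdigit(int c) { return c >= '0' && c <= '9'; }
inline int sketch_islower(int c) { return c >= 'a' && c <= 'z'; }
inline int sketch_isupper(int c) { return c >= 'A' && c <= 'Z'; }
inline int sketch_isalpha(int c) { return sketch_islower(c) || sketch_isupper(c); }
inline int sketch_isgraph(int c) { return c >= '!' && c <= '~'; }
inline int sketch_isprint(int c) { return c >= ' ' && c <= '~'; }
```

Whether a compiler turns these into the same instructions as a
hand-tuned version depends on the compiler and target, so treat the
sketch as a starting point for measurement rather than a result.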

The BSD table-based implementations made a special case of EOF despite
having a `_ctype_` table that's offset by 1 to include EOF at index 0.
I'm not sure why they didn't take advantage of that, but removing the
explicit check for EOF measurably improves the generated code on arm and
arm64, so even the two functions that still use the table benefit from
this rewrite.
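
The offset-by-1 idea can be sketched as follows. The table, flag bit,
and function names here are hypothetical stand-ins for the BSD `_ctype_`
layout, and the sketch assumes EOF is -1 and that callers pass only EOF
or a value in the unsigned char range.

```cpp
#include <cstdio>  // EOF

static_assert(EOF == -1, "this sketch assumes EOF is -1");

// Hypothetical flag bit; the real table carries several classification bits.
constexpr unsigned char kSketchPunct = 0x01;

struct SketchTable {
  // Index 0 is reserved for EOF; character c lives at index c + 1.
  unsigned char flags[1 + 256] = {};
  SketchTable() {
    for (int c = '!'; c <= '~'; ++c) {
      bool alnum = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                   (c >= 'a' && c <= 'z');
      if (!alnum) flags[c + 1] = kSketchPunct;
    }
  }
};
static const SketchTable kSketchTable;

// With an explicit EOF branch before the lookup.
inline int sketch_ispunct_checked(int c) {
  return (c == EOF) ? 0 : (kSketchTable.flags[c + 1] & kSketchPunct);
}

// Without the branch: EOF lands on index 0, whose flags are all zero.
inline int sketch_ispunct(int c) {
  return kSketchTable.flags[c + 1] & kSketchPunct;
}
```

Both functions return the same answer for every valid input; the second
just lets the table entry at index 0 do the work of the EOF check.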

Here are the benchmark results:

arm64 before:
  BM_ctype_isalnum_n                 3.73 ns         3.73 ns    183727137
  BM_ctype_isalnum_y1                3.82 ns         3.81 ns    186383058
  BM_ctype_isalnum_y2                3.73 ns         3.72 ns    187809830
  BM_ctype_isalnum_y3                3.78 ns         3.77 ns    181383055
  BM_ctype_isalpha_n                 3.75 ns         3.75 ns    189453927
  BM_ctype_isalpha_y1                3.76 ns         3.75 ns    184854043
  BM_ctype_isalpha_y2                4.32 ns         3.78 ns    186326931
  BM_ctype_isascii_n                 2.49 ns         2.48 ns    275583822
  BM_ctype_isascii_y                 2.51 ns         2.51 ns    282123915
  BM_ctype_isblank_n                 3.11 ns         3.10 ns    220472044
  BM_ctype_isblank_y1                3.20 ns         3.19 ns    226088868
  BM_ctype_isblank_y2                3.11 ns         3.11 ns    220809122
  BM_ctype_iscntrl_n                 3.79 ns         3.78 ns    188719938
  BM_ctype_iscntrl_y1                3.72 ns         3.71 ns    186209237
  BM_ctype_iscntrl_y2                3.80 ns         3.80 ns    184315749
  BM_ctype_isdigit_n                 3.76 ns         3.74 ns    188334682
  BM_ctype_isdigit_y                 3.78 ns         3.77 ns    186249335
  BM_ctype_isgraph_n                 3.99 ns         3.98 ns    177814143
  BM_ctype_isgraph_y1                3.98 ns         3.95 ns    175140090
  BM_ctype_isgraph_y2                4.01 ns         4.00 ns    178320453
  BM_ctype_isgraph_y3                3.96 ns         3.95 ns    175412814
  BM_ctype_isgraph_y4                4.01 ns         4.00 ns    175711174
  BM_ctype_islower_n                 3.75 ns         3.74 ns    188604818
  BM_ctype_islower_y                 3.79 ns         3.78 ns    154738238
  BM_ctype_isprint_n                 3.96 ns         3.95 ns    177607734
  BM_ctype_isprint_y1                3.94 ns         3.93 ns    174877244
  BM_ctype_isprint_y2                4.02 ns         4.01 ns    178206135
  BM_ctype_isprint_y3                3.94 ns         3.93 ns    175959069
  BM_ctype_isprint_y4                4.03 ns         4.02 ns    176158314
  BM_ctype_isprint_y5                3.95 ns         3.94 ns    178745462
  BM_ctype_ispunct_n                 3.78 ns         3.77 ns    184727184
  BM_ctype_ispunct_y                 3.76 ns         3.75 ns    187947503
  BM_ctype_isspace_n                 3.74 ns         3.74 ns    185300285
  BM_ctype_isspace_y1                3.77 ns         3.76 ns    187202066
  BM_ctype_isspace_y2                3.73 ns         3.73 ns    184105959
  BM_ctype_isupper_n                 3.81 ns         3.80 ns    185038761
  BM_ctype_isupper_y                 3.71 ns         3.71 ns    185885793
  BM_ctype_isxdigit_n                3.79 ns         3.79 ns    184965673
  BM_ctype_isxdigit_y1               3.76 ns         3.75 ns    188251672
  BM_ctype_isxdigit_y2               3.79 ns         3.78 ns    184187481
  BM_ctype_isxdigit_y3               3.77 ns         3.76 ns    187635540

arm64 after:
  BM_ctype_isalnum_n                 3.37 ns         3.37 ns    205613810
  BM_ctype_isalnum_y1                3.40 ns         3.39 ns    204806361
  BM_ctype_isalnum_y2                3.43 ns         3.43 ns    205066077
  BM_ctype_isalnum_y3                3.50 ns         3.50 ns    200057128
  BM_ctype_isalpha_n                 2.97 ns         2.97 ns    236084076
  BM_ctype_isalpha_y1                2.97 ns         2.97 ns    236083626
  BM_ctype_isalpha_y2                2.97 ns         2.97 ns    236084246
  BM_ctype_isascii_n                 2.55 ns         2.55 ns    272879994
  BM_ctype_isascii_y                 2.46 ns         2.45 ns    286522323
  BM_ctype_isblank_n                 3.18 ns         3.18 ns    220431175
  BM_ctype_isblank_y1                3.18 ns         3.18 ns    220345602
  BM_ctype_isblank_y2                3.18 ns         3.18 ns    220308509
  BM_ctype_iscntrl_n                 3.10 ns         3.10 ns    220344270
  BM_ctype_iscntrl_y1                3.10 ns         3.07 ns    228973615
  BM_ctype_iscntrl_y2                3.07 ns         3.07 ns    229192626
  BM_ctype_isdigit_n                 3.07 ns         3.07 ns    228925676
  BM_ctype_isdigit_y                 3.07 ns         3.07 ns    229182934
  BM_ctype_isgraph_n                 2.66 ns         2.66 ns    264268737
  BM_ctype_isgraph_y1                2.66 ns         2.66 ns    264445277
  BM_ctype_isgraph_y2                2.66 ns         2.66 ns    264327427
  BM_ctype_isgraph_y3                2.66 ns         2.66 ns    264427480
  BM_ctype_isgraph_y4                2.66 ns         2.66 ns    264155250
  BM_ctype_islower_n                 2.66 ns         2.66 ns    264421600
  BM_ctype_islower_y                 2.66 ns         2.66 ns    264341148
  BM_ctype_isprint_n                 2.66 ns         2.66 ns    264415198
  BM_ctype_isprint_y1                2.66 ns         2.66 ns    264268793
  BM_ctype_isprint_y2                2.66 ns         2.66 ns    264419205
  BM_ctype_isprint_y3                2.66 ns         2.66 ns    264205886
  BM_ctype_isprint_y4                2.66 ns         2.66 ns    264440797
  BM_ctype_isprint_y5                2.72 ns         2.72 ns    264333293
  BM_ctype_ispunct_n                 3.52 ns         3.51 ns    198956572
  BM_ctype_ispunct_y                 3.38 ns         3.38 ns    201661792
  BM_ctype_isspace_n                 3.39 ns         3.39 ns    206896620
  BM_ctype_isspace_y1                3.39 ns         3.39 ns    206569020
  BM_ctype_isspace_y2                3.39 ns         3.39 ns    206564415
  BM_ctype_isupper_n                 2.76 ns         2.75 ns    254227134
  BM_ctype_isupper_y                 2.76 ns         2.75 ns    254235314
  BM_ctype_isxdigit_n                3.60 ns         3.60 ns    194418653
  BM_ctype_isxdigit_y1               2.97 ns         2.97 ns    236082424
  BM_ctype_isxdigit_y2               3.48 ns         3.48 ns    200390011
  BM_ctype_isxdigit_y3               3.48 ns         3.48 ns    202255815

arm32 before:
  BM_ctype_isalnum_n                 4.77 ns         4.76 ns    129230464
  BM_ctype_isalnum_y1                4.88 ns         4.87 ns    147939321
  BM_ctype_isalnum_y2                4.74 ns         4.73 ns    145508054
  BM_ctype_isalnum_y3                4.81 ns         4.80 ns    144968914
  BM_ctype_isalpha_n                 4.80 ns         4.79 ns    148262579
  BM_ctype_isalpha_y1                4.74 ns         4.73 ns    145061326
  BM_ctype_isalpha_y2                4.83 ns         4.82 ns    147642546
  BM_ctype_isascii_n                 3.74 ns         3.72 ns    186711139
  BM_ctype_isascii_y                 3.79 ns         3.78 ns    183654780
  BM_ctype_isblank_n                 4.20 ns         4.19 ns    169733252
  BM_ctype_isblank_y1                4.19 ns         4.18 ns    165713363
  BM_ctype_isblank_y2                4.22 ns         4.21 ns    168776265
  BM_ctype_iscntrl_n                 4.75 ns         4.74 ns    145417484
  BM_ctype_iscntrl_y1                4.82 ns         4.81 ns    146283250
  BM_ctype_iscntrl_y2                4.79 ns         4.78 ns    148662453
  BM_ctype_isdigit_n                 4.77 ns         4.76 ns    145789210
  BM_ctype_isdigit_y                 4.84 ns         4.84 ns    146909458
  BM_ctype_isgraph_n                 4.72 ns         4.71 ns    145874663
  BM_ctype_isgraph_y1                4.86 ns         4.85 ns    142037606
  BM_ctype_isgraph_y2                4.79 ns         4.78 ns    145109612
  BM_ctype_isgraph_y3                4.75 ns         4.75 ns    144829039
  BM_ctype_isgraph_y4                4.86 ns         4.85 ns    146769899
  BM_ctype_islower_n                 4.76 ns         4.75 ns    147537637
  BM_ctype_islower_y                 4.79 ns         4.78 ns    145648017
  BM_ctype_isprint_n                 4.82 ns         4.81 ns    147154780
  BM_ctype_isprint_y1                4.76 ns         4.76 ns    145117604
  BM_ctype_isprint_y2                4.87 ns         4.86 ns    145801406
  BM_ctype_isprint_y3                4.79 ns         4.78 ns    148043446
  BM_ctype_isprint_y4                4.77 ns         4.76 ns    145157619
  BM_ctype_isprint_y5                4.91 ns         4.90 ns    147810800
  BM_ctype_ispunct_n                 4.74 ns         4.73 ns    145588611
  BM_ctype_ispunct_y                 4.82 ns         4.81 ns    144065436
  BM_ctype_isspace_n                 4.78 ns         4.77 ns    147153712
  BM_ctype_isspace_y1                4.73 ns         4.72 ns    145252863
  BM_ctype_isspace_y2                4.84 ns         4.83 ns    148615797
  BM_ctype_isupper_n                 4.75 ns         4.74 ns    148276631
  BM_ctype_isupper_y                 4.80 ns         4.79 ns    145529893
  BM_ctype_isxdigit_n                4.78 ns         4.77 ns    147271646
  BM_ctype_isxdigit_y1               4.74 ns         4.74 ns    145142209
  BM_ctype_isxdigit_y2               4.83 ns         4.82 ns    146398497
  BM_ctype_isxdigit_y3               4.78 ns         4.77 ns    147617686

arm32 after:
  BM_ctype_isalnum_n                 4.35 ns         4.35 ns    161086146
  BM_ctype_isalnum_y1                4.36 ns         4.35 ns    160961111
  BM_ctype_isalnum_y2                4.36 ns         4.36 ns    160733210
  BM_ctype_isalnum_y3                4.35 ns         4.35 ns    160897524
  BM_ctype_isalpha_n                 3.67 ns         3.67 ns    189377208
  BM_ctype_isalpha_y1                3.68 ns         3.67 ns    189438146
  BM_ctype_isalpha_y2                3.75 ns         3.69 ns    190971186
  BM_ctype_isascii_n                 3.69 ns         3.68 ns    191029191
  BM_ctype_isascii_y                 3.68 ns         3.68 ns    191011817
  BM_ctype_isblank_n                 4.09 ns         4.09 ns    171887541
  BM_ctype_isblank_y1                4.09 ns         4.09 ns    171829345
  BM_ctype_isblank_y2                4.08 ns         4.07 ns    170585590
  BM_ctype_iscntrl_n                 4.08 ns         4.07 ns    170614383
  BM_ctype_iscntrl_y1                4.13 ns         4.11 ns    171495899
  BM_ctype_iscntrl_y2                4.19 ns         4.18 ns    165255578
  BM_ctype_isdigit_n                 4.25 ns         4.24 ns    165237008
  BM_ctype_isdigit_y                 4.24 ns         4.24 ns    165256149
  BM_ctype_isgraph_n                 3.82 ns         3.81 ns    183610114
  BM_ctype_isgraph_y1                3.82 ns         3.81 ns    183614131
  BM_ctype_isgraph_y2                3.82 ns         3.81 ns    183616840
  BM_ctype_isgraph_y3                3.79 ns         3.79 ns    183620182
  BM_ctype_isgraph_y4                3.82 ns         3.81 ns    185740009
  BM_ctype_islower_n                 3.75 ns         3.74 ns    183619502
  BM_ctype_islower_y                 3.68 ns         3.68 ns    190999901
  BM_ctype_isprint_n                 3.69 ns         3.68 ns    190899544
  BM_ctype_isprint_y1                3.68 ns         3.67 ns    190192384
  BM_ctype_isprint_y2                3.67 ns         3.67 ns    189351466
  BM_ctype_isprint_y3                3.67 ns         3.67 ns    189430348
  BM_ctype_isprint_y4                3.68 ns         3.68 ns    189430161
  BM_ctype_isprint_y5                3.69 ns         3.68 ns    190962419
  BM_ctype_ispunct_n                 4.14 ns         4.14 ns    171034861
  BM_ctype_ispunct_y                 4.19 ns         4.19 ns    168308152
  BM_ctype_isspace_n                 4.50 ns         4.50 ns    156250887
  BM_ctype_isspace_y1                4.48 ns         4.48 ns    155124476
  BM_ctype_isspace_y2                4.50 ns         4.50 ns    155077504
  BM_ctype_isupper_n                 3.68 ns         3.68 ns    191020583
  BM_ctype_isupper_y                 3.68 ns         3.68 ns    191015669
  BM_ctype_isxdigit_n                4.50 ns         4.50 ns    156276745
  BM_ctype_isxdigit_y1               3.28 ns         3.27 ns    214729725
  BM_ctype_isxdigit_y2               4.48 ns         4.48 ns    155265129
  BM_ctype_isxdigit_y3               4.48 ns         4.48 ns    155216846

I've also corrected a small mistake in the documentation for isxdigit().

Test: tests and benchmarks
Change-Id: I4a77859f826c3fc8f0e327e847886882f29ec4a3
2019-10-08 12:04:09 -07:00
spawn Revert "Revert "Add benchmarks that run simple programs"" 2019-09-26 16:18:37 -07:00
suites Fix benchmark-tests 2018-08-11 23:43:03 -07:00
test_suites Fix benchmark-tests 2018-08-11 23:43:03 -07:00
tests Fix test failures. 2018-11-07 14:30:55 -08:00
Android.bp Add trivial <ctype.h> benchmarks. 2019-09-26 21:47:01 -07:00
atomic_benchmark.cpp Add benchmarks for heap size retrieval 2018-10-18 17:56:58 -07:00
bionic_benchmarks.cpp Add option to define ranges sets for benchmarks 2018-08-11 00:27:27 +00:00
ctype_benchmark.cpp Reimplement the <ctype.h> is* functions. 2019-10-08 12:04:09 -07:00
expf_input.cpp Add expf and exp2f benchmark 2018-06-19 11:34:54 -03:00
get_heap_size_benchmark.cpp Add benchmarks for heap size retrieval 2018-10-18 17:56:58 -07:00
inttypes_benchmark.cpp benchmarks: remove more boilerplate. 2019-09-26 07:42:23 -07:00
logf_input.cpp Add logf and log2f benchmark 2018-06-19 11:34:54 -03:00
malloc_benchmark.cpp Add malloc benchmarks. 2019-04-05 14:45:15 -07:00
malloc_sql.h Add new malloc benchmarks. 2018-08-14 16:01:58 -07:00
math_benchmark.cpp Add pow benchmark 2018-08-08 18:04:48 -03:00
powf_input.cpp Add powf benchmark 2018-06-19 11:34:54 -03:00
property_benchmark.cpp switch to using android-base/file.h instead of android-base/test_utils.h 2018-11-14 15:46:49 -08:00
pthread_benchmark.cpp Fix/suppress bionic google-explicit-constructor warnings 2019-01-02 11:04:05 -08:00
README.md Revert "Revert "Add benchmarks that run simple programs"" 2019-09-26 16:18:37 -07:00
run-on-host.sh run-on-host fixes 2019-09-24 15:36:31 -07:00
semaphore_benchmark.cpp Modernise code to use override specifier 2019-03-29 14:27:27 -07:00
sincosf_input.cpp Add sinf/cosf/sincosf benchmark 2018-06-19 11:34:54 -03:00
stdio_benchmark.cpp switch to using android-base/file.h instead of android-base/test_utils.h 2018-11-14 15:46:49 -08:00
stdlib_benchmark.cpp benchmarks: remove more boilerplate. 2019-09-26 07:42:23 -07:00
string_benchmark.cpp Add benchmark for strncmp 2018-08-21 21:04:43 +00:00
time_benchmark.cpp bionic: benchmark: add clock_getres performance tests 2017-12-07 09:41:31 -08:00
unistd_benchmark.cpp benchmarks: remove more boilerplate. 2019-09-26 07:42:23 -07:00
util.cpp Fix bug in --bionic_cpu option handling. 2018-05-04 17:34:35 -07:00
util.h benchmarks: remove more boilerplate. 2019-09-26 07:42:23 -07:00
wctype_benchmark.cpp benchmarks: remove more boilerplate. 2019-09-26 07:42:23 -07:00

Bionic Benchmarks

[TOC]

libc benchmarks (bionic-benchmarks)

bionic-benchmarks is a command line tool for measuring the runtimes of libc functions. It is built on top of Google Benchmark with some additions to organize tests into suites.

Device benchmarks

$ mmma bionic/benchmarks
$ adb root
$ adb sync data
$ adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks
$ adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks

By default, bionic-benchmarks runs all of the benchmarks in alphabetical order. Pass --benchmark_filter=getpid to run just the benchmarks with "getpid" in their name.

Host benchmarks

See the benchmarks/run-on-host.sh script. The host benchmarks can be run with 32-bit or 64-bit Bionic, or the host glibc.

XML suites

Suites are stored in the suites/ directory and can be chosen with the command line flag --bionic_xml.

To choose a specific XML file, use the --bionic_xml=FILE.XML option. By default, this option searches for the XML file in the suites/ directory. If the file doesn't exist in that directory, it is looked up relative to the current directory. If the option specifies the full path to an XML file, such as /data/nativetest/suites/example.xml, that path is used as-is.

If no XML file is specified through the command-line option, the default is to use suites/full.xml. However, for the host bionic benchmarks (bionic-benchmarks-glibc), the default is to use suites/host.xml.
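
The lookup order described above can be sketched as a small function. This is an illustration only, not the code in bionic_benchmarks.cpp: the function name and parameters are hypothetical, and the existence check is passed in as a callback so the sketch does not depend on a real filesystem.

```cpp
#include <functional>
#include <string>

// Hypothetical helper mirroring the documented lookup order.
//   arg          - value given to --bionic_xml (empty if the flag was not used)
//   suites_dir   - location of the suites/ directory
//   default_name - "full.xml", or "host.xml" for the host glibc build
//   exists       - callback reporting whether a path exists
std::string ResolveSuitePath(const std::string& arg, const std::string& suites_dir,
                             const std::string& default_name,
                             const std::function<bool(const std::string&)>& exists) {
  if (arg.empty()) return suites_dir + "/" + default_name;  // No flag: use the default suite.
  if (arg[0] == '/') return arg;                            // Full path: used as-is.
  std::string in_suites = suites_dir + "/" + arg;
  if (exists(in_suites)) return in_suites;                  // Found in suites/.
  return arg;                                               // Else relative to the current directory.
}
```

With this sketch, `example.xml` resolves to `suites/example.xml` when that file exists, and to `example.xml` in the current directory otherwise.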

XML suite format

The format for a benchmark is:

<fn>
    <name>BM_sample_benchmark</name>
    <cpu><optional_cpu_to_lock></cpu>
    <iterations><optional_iterations_to_run></iterations>
    <args><space separated list of function args|shorthand></args>
</fn>

XML-specified values for iterations and cpu take precedence over those specified on the command line (via --bionic_iterations and --bionic_cpu, respectively).

To make small changes in runs, you can also schedule benchmarks by passing in their name and a space-separated list of arguments via the --bionic_extra command line flag, e.g. --bionic_extra="BM_string_memcpy AT_COMMON_SIZES" or --bionic_extra="BM_string_memcmp 32 8 8".

Note that a benchmark runs normally if extra arguments are passed in, but fails with a segfault if too few are passed in.
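
As a minimal sketch of the --bionic_extra value format (a benchmark name followed by space-separated arguments), the following splits such a string into its parts. The struct and function names are hypothetical and this is not how bionic-benchmarks itself parses the flag.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one --bionic_extra value.
struct ExtraBenchmark {
  std::string name;
  std::vector<std::string> args;  // Kept as strings: numbers or a shorthand such as AT_COMMON_SIZES.
};

// Split "NAME ARG1 ARG2 ..." on whitespace; the first token is the name.
ExtraBenchmark ParseExtra(const std::string& value) {
  std::istringstream in(value);
  ExtraBenchmark result;
  in >> result.name;
  for (std::string token; in >> token;) result.args.push_back(token);
  return result;
}
```

A caller could compare `args.size()` against the number of arguments a benchmark expects before running it, which would turn the too-few-arguments case into an error message instead of a crash. A shorthand counts as a single token here even though it expands to several runs.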

Shorthand

For the sake of brevity, multiple runs can be scheduled in one XML element by putting one of the following in the args field:

NUM_PROPS
MATH_COMMON
AT_ALIGNED_<ONE|TWO>BUF
AT_<any power of two between 2 and 16384>_ALIGNED_<ONE|TWO>BUF
AT_COMMON_SIZES

Definitions for these can be found in bionic_benchmarks.cpp, and example usages can be found in the suites directory.
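
To make the numbered shorthand concrete, here is a hedged sketch that checks whether a string matches the AT_<N>_ALIGNED_<ONE|TWO>BUF form with N a power of two between 2 and 16384. The function name is hypothetical; the authoritative definitions remain the ones in bionic_benchmarks.cpp.

```cpp
#include <regex>
#include <string>

// Hypothetical validator for the numbered shorthand only.
// The plain AT_ALIGNED_<ONE|TWO>BUF form is deliberately not matched here.
bool IsNumberedAlignedShorthand(const std::string& s) {
  static const std::regex re("AT_([1-9][0-9]*)_ALIGNED_(ONE|TWO)BUF");
  std::smatch m;
  if (!std::regex_match(s, m, re)) return false;
  if (m[1].length() > 5) return false;  // 16384 has five digits; also keeps stoul in range.
  unsigned long n = std::stoul(m[1].str());
  bool power_of_two = (n & (n - 1)) == 0;  // n is nonzero because the regex forbids a leading 0.
  return power_of_two && n >= 2 && n <= 16384;
}
```

For instance, `AT_16_ALIGNED_ONEBUF` and `AT_16384_ALIGNED_TWOBUF` pass this check, while `AT_3_ALIGNED_ONEBUF` and `AT_32768_ALIGNED_ONEBUF` do not.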

Unit Tests

bionic-benchmarks also has its own set of unit tests, which can be run from the binary in /data/nativetest[64]/bionic-benchmarks-tests.

Process startup time (bionic-spawn-benchmarks)

The spawn/ subdirectory has a few benchmarks measuring the time used to start simple programs (e.g. Toybox's true and sh -c true). Run it on a device like so:

m bionic-spawn-benchmarks
adb root
adb sync data
adb shell /data/benchmarktest/bionic-spawn-benchmarks/bionic-spawn-benchmarks
adb shell /data/benchmarktest64/bionic-spawn-benchmarks/bionic-spawn-benchmarks

Google Benchmark reports both a real-time figure ("Time") and a CPU usage figure. For these benchmarks, the CPU measurement only counts time spent in the thread calling posix_spawn, not that spent in the spawned process. The real-time figure is probably more useful, and it is the one used to determine the iteration count.
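
The difference between the two figures can be seen in a small standalone sketch, which is not part of bionic-spawn-benchmarks. It spawns `true`, waits for it, and records both wall-clock time and the calling thread's CPU time. The struct and function names are hypothetical, and the sketch assumes a `true` binary is on the PATH.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char** environ;

struct SpawnTiming {
  bool ok;              // True if the child ran and exited with status 0.
  double real_s;        // Wall-clock seconds, including the child's run time.
  double thread_cpu_s;  // CPU seconds charged to the calling thread only.
};

static double NowSeconds(clockid_t clock) {
  timespec ts;
  clock_gettime(clock, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Spawn `true`, wait for it to exit, and measure both clocks around the pair of calls.
SpawnTiming TimeSpawnTrue() {
  SpawnTiming t = {false, 0.0, 0.0};
  char arg0[] = "true";
  char* argv[] = {arg0, nullptr};
  double real0 = NowSeconds(CLOCK_MONOTONIC);
  double cpu0 = NowSeconds(CLOCK_THREAD_CPUTIME_ID);
  pid_t pid;
  if (posix_spawnp(&pid, "true", nullptr, nullptr, argv, environ) != 0) return t;
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return t;
  t.real_s = NowSeconds(CLOCK_MONOTONIC) - real0;
  t.thread_cpu_s = NowSeconds(CLOCK_THREAD_CPUTIME_ID) - cpu0;
  t.ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
  return t;
}
```

Because the child's own CPU time is never charged to the calling thread, `thread_cpu_s` would typically come out smaller than `real_s`.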

Locking the CPU frequency seems to improve the results of these benchmarks significantly, and it reduces variability.

Google Benchmark notes

Repetitions

Google Benchmark uses two settings to control how many times to run each benchmark, "iterations" and "repetitions". By default, the repetition count is one. Google Benchmark runs the benchmark a few times to determine a sufficiently-large iteration count.

Google Benchmark can optionally run a benchmark repeatedly and report statistics (median, mean, standard deviation) for the runs. To do so, pass the --benchmark_repetitions option, e.g.:

# ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4
...
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
BM_stdlib_strtoll              27.7 ns         27.7 ns     25290525
BM_stdlib_strtoll              27.7 ns         27.7 ns     25290525
BM_stdlib_strtoll              27.7 ns         27.7 ns     25290525
BM_stdlib_strtoll              27.8 ns         27.7 ns     25290525
BM_stdlib_strtoll_mean         27.7 ns         27.7 ns            4
BM_stdlib_strtoll_median       27.7 ns         27.7 ns            4
BM_stdlib_strtoll_stddev      0.023 ns        0.023 ns            4

There are 4 runs, each with 25290525 iterations. Measurements for the individual runs can be suppressed if they aren't needed:

# ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4 --benchmark_report_aggregates_only
...
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
BM_stdlib_strtoll_mean         27.8 ns         27.7 ns            4
BM_stdlib_strtoll_median       27.7 ns         27.7 ns            4
BM_stdlib_strtoll_stddev      0.043 ns        0.043 ns            4
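
For reference, the three aggregates can be sketched with their textbook definitions. The function names are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption about how the tool computes it rather than something this document states.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and Median assume a non-empty input.
double Mean(const std::vector<double>& v) {
  double sum = 0;
  for (double x : v) sum += x;
  return sum / v.size();
}

double Median(std::vector<double> v) {  // Taken by value so sorting does not disturb the caller.
  std::sort(v.begin(), v.end());
  std::size_t mid = v.size() / 2;
  return v.size() % 2 ? v[mid] : (v[mid - 1] + v[mid]) / 2;
}

// Sample standard deviation (divides by n - 1); returns 0 for fewer than two values.
double SampleStdDev(const std::vector<double>& v) {
  if (v.size() < 2) return 0;
  double mean = Mean(v), sum_sq = 0;
  for (double x : v) sum_sq += (x - mean) * (x - mean);
  return std::sqrt(sum_sq / (v.size() - 1));
}
```

Feeding in the rounded per-run times from the first listing (27.7, 27.7, 27.7, 27.8) gives a mean of 27.725 and a standard deviation of 0.05, not the 0.023 shown, presumably because the tool works from unrounded measurements.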

CPU frequencies

To get consistent results between runs, it can sometimes be helpful to restrict a benchmark to specific cores, or to lock cores at specific frequencies. Some phones have a big.LITTLE core setup, or at least allow some cores to run at higher frequencies than others.

A core can be selected for bionic-benchmarks using the --bionic_cpu option or using the taskset utility. For example, a Pixel 3 device has 4 Kryo 385 Silver cores followed by 4 Gold cores:

blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=0
...
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
BM_stdlib_strtoll       64.2 ns         63.6 ns     11017493

blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=4
...
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
BM_stdlib_strtoll       21.8 ns         21.7 ns     33167103

A similar result can be achieved using taskset. The first parameter is a bitmask of core numbers to pass to sched_setaffinity:

blueline:/ # taskset f /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll
...
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
BM_stdlib_strtoll       64.3 ns         63.6 ns     10998697

blueline:/ # taskset f0 /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll
...
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
BM_stdlib_strtoll       21.3 ns         21.2 ns     33094801
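
The masks used above follow from setting bit N for core N: cores 0 to 3 give 0xf and cores 4 to 7 give 0xf0. A minimal sketch of that arithmetic, with a hypothetical helper name:

```cpp
#include <cstdint>
#include <initializer_list>

// Build an affinity bitmask in which bit N set means core N is allowed.
// Assumes every core number is in the range 0..63.
inline std::uint64_t CoreMask(std::initializer_list<int> cores) {
  std::uint64_t mask = 0;
  for (int core : cores) mask |= std::uint64_t{1} << core;
  return mask;
}
```

So `CoreMask({0, 1, 2, 3})` is 0xf and `CoreMask({4, 5, 6, 7})` is 0xf0, and a single Gold core such as core 4 would be 0x10.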

To lock the CPU frequency, use the sysfs interface at /sys/devices/system/cpu/cpu*/cpufreq/. Changing the scaling governor to performance suppresses the warning that Google Benchmark otherwise prints:

***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
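
As a small sketch of inspecting that sysfs interface, the following reads the current governor for one core from its scaling_governor file. The function names are hypothetical, and the sketch only reads; writing a new governor normally requires root.

```cpp
#include <fstream>
#include <string>

// Path of the scaling_governor file for the given core number.
std::string GovernorPath(int cpu) {
  return "/sys/devices/system/cpu/cpu" + std::to_string(cpu) + "/cpufreq/scaling_governor";
}

// Returns the current governor name, or an empty string if the file cannot be read
// (for example, no such core, or no cpufreq support on this system).
std::string ReadGovernor(int cpu) {
  std::ifstream in(GovernorPath(cpu));
  std::string governor;
  std::getline(in, governor);
  return governor;
}
```

Checking that `ReadGovernor` returns `performance` for the cores in use is one way to confirm the setting took effect before starting a run.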

Some devices have a perf-setup.sh script that locks CPU and GPU frequencies. Some TradeFed benchmarks appear to be using the script. For more information: