ece43e14c9
cortex-a53/bionic/memmove.S looks like a more optimized version and should be used in most cases. It delegates small (<= 96 bytes) moves to memcpy.

The only exception is denver64, which uses its own memcpy that doesn't allow overlap for < 96 bytes copies. Only this variant needs generic/bionic/memmove.S.

Benchmark results look pretty close, though (on marlin).

Before: using generic/bionic/memmove.S

| Benchmark | Time | CPU | Iterations | Throughput |
|---|---|---|---|---|
| BM_string_memcpy/8/0/0 | 6 ns | 6 ns | 108872005 | 1.15787GB/s |
| BM_string_memcpy/64/0/0 | 7 ns | 7 ns | 107387438 | 9.14365GB/s |
| BM_string_memcpy/512/0/0 | 21 ns | 20 ns | 34165353 | 23.2734GB/s |
| BM_string_memcpy/1024/0/0 | 40 ns | 39 ns | 17766657 | 24.2346GB/s |
| BM_string_memcpy/8192/0/0 | 311 ns | 310 ns | 2259904 | 24.6339GB/s |
| BM_string_memcpy/16384/0/0 | 616 ns | 613 ns | 1143027 | 24.8852GB/s |
| BM_string_memcpy/32768/0/0 | 1322 ns | 1316 ns | 530799 | 23.1835GB/s |
| BM_string_memcpy/65536/0/0 | 2672 ns | 2661 ns | 229638 | 22.937GB/s |
| BM_string_memcpy/131072/0/0 | 5379 ns | 5357 ns | 128316 | 22.788GB/s |

After: using cortex-a53/bionic/memmove.S

| Benchmark | Time | CPU | Iterations | Throughput |
|---|---|---|---|---|
| BM_string_memcpy/8/0/0 | 6 ns | 6 ns | 116610749 | 1.24646GB/s |
| BM_string_memcpy/64/0/0 | 6 ns | 6 ns | 115634093 | 9.84708GB/s |
| BM_string_memcpy/512/0/0 | 21 ns | 21 ns | 34167322 | 22.8938GB/s |
| BM_string_memcpy/1024/0/0 | 39 ns | 39 ns | 17859445 | 24.3312GB/s |
| BM_string_memcpy/8192/0/0 | 311 ns | 310 ns | 2260192 | 24.6325GB/s |
| BM_string_memcpy/16384/0/0 | 610 ns | 608 ns | 1151889 | 25.0987GB/s |
| BM_string_memcpy/32768/0/0 | 1488 ns | 1482 ns | 532508 | 20.5988GB/s |
| BM_string_memcpy/65536/0/0 | 2421 ns | 2411 ns | 290502 | 25.3146GB/s |
| BM_string_memcpy/131072/0/0 | 5278 ns | 5256 ns | 132710 | 23.2234GB/s |

Test: Build and benchmark on marlin
Bug: http://b/63992911
Change-Id: Id85961aca18ba841bcbcfe0d8b162843eab30584
- bionic
- denver64/bionic
- generic/bionic
- syscalls
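The commit message says that the cortex-a53 memmove hands small moves (<= 96 bytes) to memcpy, which only works when that memcpy tolerates overlapping buffers. The following is a minimal C sketch of that idea, not the actual assembly: `sketch_memmove`, `small_copy_overlap_safe`, and `SMALL_MOVE_MAX` are hypothetical names, and the "read everything before writing anything" strategy for the small path is an assumption about how an overlap-tolerant small copy could behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Threshold taken from the commit message; the name is hypothetical. */
#define SMALL_MOVE_MAX 96

/* Hypothetical stand-in for an overlap-tolerant small copy: the whole
 * source is read into a temporary before any destination byte is written,
 * so overlapping src/dst ranges cannot corrupt each other. */
static void small_copy_overlap_safe(void *dst, const void *src, size_t n) {
    unsigned char tmp[SMALL_MOVE_MAX];
    assert(n <= SMALL_MOVE_MAX);
    memcpy(tmp, src, n); /* tmp never overlaps src */
    memcpy(dst, tmp, n); /* tmp never overlaps dst */
}

/* Sketch of a memmove that delegates small sizes to the copy above and
 * picks a copy direction for larger sizes. */
static void *sketch_memmove(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    if (n <= SMALL_MOVE_MAX) {
        small_copy_overlap_safe(dst, src, n);
    } else if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination is below source: copy forward. */
        for (size_t i = 0; i < n; i++) d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination is above source: copy backward. */
        for (size_t i = n; i > 0; i--) d[i - 1] = s[i - 1];
    }
    return dst;
}
```

If the small-copy routine instead wrote bytes while still reading the source, the first branch would be unsafe for overlapping buffers, which is the situation the commit message describes for denver64 and the reason that variant keeps generic/bionic/memmove.S.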