tequilaOS/platform_bionic

Author	SHA1	Message	Date
Mark Salyzyn	79249b0897	bionic: add vdso clock_getres clock_getres() should not be a hot call, nevertheless it is ~6-7 times faster for supported clock ids if it uses __vdso_clock_getres if available. There is a 3% performance penalty for unsupported clock ids via __vdso_clock_getres with respect to a direct syscall. [TL;DR] w/vdso32 kernel patches, locked cores to MAX, little cores only. BEFORE: hikey960 vdso (aarch64): ---------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_time_clock_getres 126 ns 126 ns 5577874 BM_time_clock_getres_syscall 127 ns 127 ns 5505016 BM_time_clock_getres_REALTIME 126 ns 126 ns 5574682 BM_time_clock_getres_BOOTTIME 126 ns 126 ns 5575237 BM_time_clock_getres_TAI 126 ns 126 ns 5576810 BM_time_clock_getres_unsupported 128 ns 128 ns 5480189 hikey960 vdso32 (aarch32): ---------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_time_clock_getres 199 ns 199 ns 3508708 BM_time_clock_getres_syscall 220 ns 220 ns 3184676 BM_time_clock_getres_REALTIME 199 ns 199 ns 3509697 BM_time_clock_getres_BOOTTIME 199 ns 199 ns 3513551 BM_time_clock_getres_TAI 200 ns 199 ns 3512412 BM_time_clock_getres_unsupported 196 ns 196 ns 3575609 x86_64 (glibc): --------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------- BM_time_clock_getres 252 ns 252 ns 2370263 BM_time_clock_getres_syscall 215 ns 215 ns 3287497 BM_time_clock_getres_REALTIME 214 ns 214 ns 3294228 BM_time_clock_getres_BOOTTIME 213 ns 213 ns 3277519 BM_time_clock_getres_TAI 213 ns 213 ns 3294991 BM_time_clock_getres_unsupported 206 ns 206 ns 3450654 imx7d_pico IOT nyc (w/arm,cpu-registers-not-fw-configured) (armv7a): (Virtual Timers) Benchmark Time(ns) CPU(ns) 
Iterations ------------------------------------------------------------------ BM_time_clock_getres 16 345 2000000 BM_time_clock_getres_syscall 16 339 2121212 BM_time_clock_getres_REALTIME 17 350 2058824 BM_time_clock_getres_BOOTTIME 17 345 2000000 BM_time_clock_getres_TAI 16 350 2000000 BM_time_clock_getres_unsupported 13 284 2500000 AFTER: hikey960 vdso (aarch64): --------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------- BM_time_clock_getres 18 ns 18 ns 37880389 BM_time_clock_getres_syscall 127 ns 127 ns 5520029 BM_time_clock_getres_REALTIME 18 ns 18 ns 37879962 BM_time_clock_getres_BOOTTIME 19 ns 18 ns 37878361 BM_time_clock_getres_TAI 131 ns 131 ns 5368484 BM_time_clock_getres_unsupported 97 ns 97 ns 7182864 hikey960 vdso32 (aarch32): --------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------- BM_time_clock_getres 36 ns 36 ns 19205240 BM_time_clock_getres_syscall 212 ns 212 ns 3297100 BM_time_clock_getres_REALTIME 36 ns 36 ns 19219109 BM_time_clock_getres_BOOTTIME 36 ns 36 ns 19222490 BM_time_clock_getres_TAI 206 ns 206 ns 3402868 BM_time_clock_getres_unsupported 159 ns 159 ns 4409492 imx7d_pico IOT nyc (wo/arm,cpu-registers-not-fw-configured) (armv7a): (Physical Timers) Benchmark Time(ns) CPU(ns) Iterations ------------------------------------------------------------------ BM_time_clock_getres 2 48 14000000 BM_time_clock_getres_syscall 14 335 2058824 BM_time_clock_getres_REALTIME 2 49 14583333 BM_time_clock_getres_BOOTTIME 2 48 14000000 BM_time_clock_getres_TAI 14 350 2058824 BM_time_clock_getres_unsupported 8 203 3500000 Test: taskset F \ /data/benchmarktest{64}/bionic-benchmarks/bionic-benchmarks \ --bionic_xml=vdso.xml --benchmark_filter=BM_time_clock_getres* Bug: 63737556 Change-Id: I80c0a5106625d76720287f715fcf145d2aad1705	2017-12-07 09:41:48 -08:00
Sebastian Pop	ed9bfc4616	[AArch64] Optimized memcmp Patch written by Wilco Dijkstra submitted for review to newlib: https://sourceware.org/ml/newlib/2017/msg00524.html This is an optimized memcmp for AArch64. This is a complete rewrite using a different algorithm. The previous version split into cases where both inputs were aligned, the inputs were mutually aligned and unaligned using a byte loop. The new version combines all these cases, while small inputs of less than 8 bytes are handled separately. This allows the main code to be sped up using unaligned loads since there are now at least 8 bytes to be compared. After the first 8 bytes, align the first input. This ensures each iteration does at most one unaligned access and mutually aligned inputs behave as aligned. After the main loop, process the last 8 bytes using unaligned accesses. This improves performance of (mutually) aligned cases by 25% and unaligned by >500% (yes >6 times faster) on large inputs. 2017-06-28 Wilco Dijkstra <wdijkstr@arm.com> * bionic/libc/arch-arm64/generic/bionic/memcmp.S (memcmp): Rewrite of optimized memcmp. 
GLIBC benchtests/bench-memcmp.c performance comparison for Cortex-A53: Length 1, alignment 1/ 1: 153% Length 1, alignment 1/ 1: 119% Length 1, alignment 1/ 1: 154% Length 2, alignment 2/ 2: 121% Length 2, alignment 2/ 2: 140% Length 2, alignment 2/ 2: 121% Length 3, alignment 3/ 3: 105% Length 3, alignment 3/ 3: 105% Length 3, alignment 3/ 3: 105% Length 4, alignment 4/ 4: 155% Length 4, alignment 4/ 4: 154% Length 4, alignment 4/ 4: 161% Length 5, alignment 5/ 5: 173% Length 5, alignment 5/ 5: 173% Length 5, alignment 5/ 5: 173% Length 6, alignment 6/ 6: 145% Length 6, alignment 6/ 6: 145% Length 6, alignment 6/ 6: 145% Length 7, alignment 7/ 7: 125% Length 7, alignment 7/ 7: 125% Length 7, alignment 7/ 7: 125% Length 8, alignment 8/ 8: 111% Length 8, alignment 8/ 8: 130% Length 8, alignment 8/ 8: 124% Length 9, alignment 9/ 9: 160% Length 9, alignment 9/ 9: 160% Length 9, alignment 9/ 9: 150% Length 10, alignment 10/10: 170% Length 10, alignment 10/10: 137% Length 10, alignment 10/10: 150% Length 11, alignment 11/11: 160% Length 11, alignment 11/11: 160% Length 11, alignment 11/11: 160% Length 12, alignment 12/12: 146% Length 12, alignment 12/12: 168% Length 12, alignment 12/12: 156% Length 13, alignment 13/13: 167% Length 13, alignment 13/13: 167% Length 13, alignment 13/13: 173% Length 14, alignment 14/14: 167% Length 14, alignment 14/14: 168% Length 14, alignment 14/14: 168% Length 15, alignment 15/15: 168% Length 15, alignment 15/15: 173% Length 15, alignment 15/15: 173% Length 1, alignment 0/ 0: 134% Length 1, alignment 0/ 0: 127% Length 1, alignment 0/ 0: 119% Length 2, alignment 0/ 0: 94% Length 2, alignment 0/ 0: 94% Length 2, alignment 0/ 0: 106% Length 3, alignment 0/ 0: 82% Length 3, alignment 0/ 0: 87% Length 3, alignment 0/ 0: 82% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 122% Length 5, alignment 0/ 0: 127% Length 5, alignment 0/ 0: 119% Length 5, alignment 0/ 0: 127% Length 6, alignment 0/ 0: 103% Length 
6, alignment 0/ 0: 100% Length 6, alignment 0/ 0: 100% Length 7, alignment 0/ 0: 82% Length 7, alignment 0/ 0: 91% Length 7, alignment 0/ 0: 87% Length 8, alignment 0/ 0: 111% Length 8, alignment 0/ 0: 124% Length 8, alignment 0/ 0: 124% Length 9, alignment 0/ 0: 136% Length 9, alignment 0/ 0: 136% Length 9, alignment 0/ 0: 136% Length 10, alignment 0/ 0: 136% Length 10, alignment 0/ 0: 135% Length 10, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 136% Length 11, alignment 0/ 0: 135% Length 12, alignment 0/ 0: 136% Length 12, alignment 0/ 0: 136% Length 12, alignment 0/ 0: 136% Length 13, alignment 0/ 0: 135% Length 13, alignment 0/ 0: 136% Length 13, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 14, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 15, alignment 0/ 0: 136% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 4, alignment 0/ 0: 115% Length 32, alignment 0/ 0: 127% Length 32, alignment 7/ 2: 395% Length 32, alignment 0/ 0: 127% Length 32, alignment 0/ 0: 127% Length 8, alignment 0/ 0: 111% Length 8, alignment 0/ 0: 124% Length 8, alignment 0/ 0: 124% Length 64, alignment 0/ 0: 128% Length 64, alignment 6/ 4: 475% Length 64, alignment 0/ 0: 131% Length 64, alignment 0/ 0: 134% Length 16, alignment 0/ 0: 128% Length 16, alignment 0/ 0: 119% Length 16, alignment 0/ 0: 128% Length 128, alignment 0/ 0: 129% Length 128, alignment 5/ 6: 475% Length 128, alignment 0/ 0: 130% Length 128, alignment 0/ 0: 129% Length 32, alignment 0/ 0: 126% Length 32, alignment 0/ 0: 126% Length 32, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 127% Length 256, alignment 4/ 8: 545% Length 256, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 128% Length 64, alignment 0/ 0: 171% Length 64, alignment 0/ 0: 171% Length 64, alignment 0/ 0: 174% Length 512, alignment 0/ 0: 126% Length 512, alignment 3/10: 585% Length 512, alignment 0/ 0: 126% 
Length 512, alignment 0/ 0: 127% Length 128, alignment 0/ 0: 129% Length 128, alignment 0/ 0: 128% Length 128, alignment 0/ 0: 129% Length 1024, alignment 0/ 0: 125% Length 1024, alignment 2/12: 611% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 256, alignment 0/ 0: 128% Length 256, alignment 0/ 0: 127% Length 256, alignment 0/ 0: 128% Length 2048, alignment 0/ 0: 125% Length 2048, alignment 1/14: 625% Length 2048, alignment 0/ 0: 125% Length 2048, alignment 0/ 0: 125% Length 512, alignment 0/ 0: 126% Length 512, alignment 0/ 0: 127% Length 512, alignment 0/ 0: 127% Length 4096, alignment 0/ 0: 125% Length 4096, alignment 0/16: 125% Length 4096, alignment 0/ 0: 125% Length 4096, alignment 0/ 0: 125% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 1024, alignment 0/ 0: 126% Length 8192, alignment 0/ 0: 125% Length 8192, alignment 63/18: 636% Length 8192, alignment 0/ 0: 125% Length 8192, alignment 0/ 0: 125% Length 16, alignment 1/ 2: 317% Length 16, alignment 1/ 2: 317% Length 16, alignment 1/ 2: 317% Length 32, alignment 2/ 4: 395% Length 32, alignment 2/ 4: 395% Length 32, alignment 2/ 4: 398% Length 64, alignment 3/ 6: 475% Length 64, alignment 3/ 6: 475% Length 64, alignment 3/ 6: 477% Length 128, alignment 4/ 8: 479% Length 128, alignment 4/ 8: 479% Length 128, alignment 4/ 8: 479% Length 256, alignment 5/10: 543% Length 256, alignment 5/10: 539% Length 256, alignment 5/10: 543% Length 512, alignment 6/12: 585% Length 512, alignment 6/12: 585% Length 512, alignment 6/12: 585% Length 1024, alignment 7/14: 611% Length 1024, alignment 7/14: 611% Length 1024, alignment 7/14: 611% The performance measured on the bionic-benchmarks on a hikey board with a new benchmark for unaligned memcmp submitted for review at https://android-review.googlesource.com/414860 The base is with the libc from /system/lib64. The bionic libc with this patch is in /data. 
hikey:/data # export LD_LIBRARY_PATH=/system/lib64 hikey:/data # ./bionic-benchmarks --benchmark_filter=BM_string_memcmp Run on (8 X 2.4 MHz CPU s) Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_string_memcmp/8 30 ns 30 ns 22955680 251.07MB/s BM_string_memcmp/64 57 ns 57 ns 12349184 1076.99MB/s BM_string_memcmp/512 305 ns 305 ns 2297163 1.56496GB/s BM_string_memcmp/1024 571 ns 571 ns 1225211 1.66912GB/s BM_string_memcmp/8k 4307 ns 4306 ns 162562 1.77177GB/s BM_string_memcmp/16k 8676 ns 8675 ns 80676 1.75887GB/s BM_string_memcmp/32k 19233 ns 19230 ns 36394 1.58695GB/s BM_string_memcmp/64k 36986 ns 36984 ns 18952 1.65029GB/s BM_string_memcmp_aligned/8 199 ns 199 ns 3519166 38.3336MB/s BM_string_memcmp_aligned/64 386 ns 386 ns 1810734 158.073MB/s BM_string_memcmp_aligned/512 1735 ns 1734 ns 403981 281.525MB/s BM_string_memcmp_aligned/1024 3200 ns 3200 ns 218838 305.151MB/s BM_string_memcmp_aligned/8k 25084 ns 25080 ns 28180 311.507MB/s BM_string_memcmp_aligned/16k 51730 ns 51729 ns 13521 302.057MB/s BM_string_memcmp_aligned/32k 103228 ns 103228 ns 6782 302.727MB/s BM_string_memcmp_aligned/64k 207117 ns 207087 ns 3450 301.806MB/s BM_string_memcmp_unaligned/8 339 ns 339 ns 2070998 22.5302MB/s BM_string_memcmp_unaligned/64 1392 ns 1392 ns 502796 43.8454MB/s BM_string_memcmp_unaligned/512 9194 ns 9194 ns 76133 53.1104MB/s BM_string_memcmp_unaligned/1024 18325 ns 18323 ns 38206 53.2963MB/s BM_string_memcmp_unaligned/8k 148579 ns 148574 ns 4713 52.5831MB/s BM_string_memcmp_unaligned/16k 298169 ns 298120 ns 2344 52.4118MB/s BM_string_memcmp_unaligned/32k 598813 ns 598797 ns 1085 52.188MB/s BM_string_memcmp_unaligned/64k 1196079 ns 1196083 ns 540 52.2539MB/s hikey:/data # export LD_LIBRARY_PATH=/data hikey:/data # ./bionic-benchmarks --benchmark_filter=BM_string_memcmp Benchmark Time CPU Iterations ---------------------------------------------------------------------- BM_string_memcmp/8 27 ns 27 ns 26198166 286.069MB/s 
BM_string_memcmp/64 45 ns 45 ns 15553753 1.32443GB/s BM_string_memcmp/512 242 ns 242 ns 2892423 1.97049GB/s BM_string_memcmp/1024 455 ns 455 ns 1537290 2.09436GB/s BM_string_memcmp/8k 3446 ns 3446 ns 203295 2.21392GB/s BM_string_memcmp/16k 7567 ns 7567 ns 92582 2.01657GB/s BM_string_memcmp/32k 16081 ns 16081 ns 43524 1.8977GB/s BM_string_memcmp/64k 31029 ns 31028 ns 22565 1.96712GB/s BM_string_memcmp_aligned/8 184 ns 184 ns 3800912 41.3654MB/s BM_string_memcmp_aligned/64 287 ns 287 ns 2438835 212.65MB/s BM_string_memcmp_aligned/512 1370 ns 1370 ns 511014 356.498MB/s BM_string_memcmp_aligned/1024 2543 ns 2543 ns 275253 384.006MB/s BM_string_memcmp_aligned/8k 20413 ns 20411 ns 34306 382.764MB/s BM_string_memcmp_aligned/16k 42908 ns 42907 ns 16132 364.158MB/s BM_string_memcmp_aligned/32k 88902 ns 88886 ns 8087 351.574MB/s BM_string_memcmp_aligned/64k 173016 ns 173007 ns 4122 361.258MB/s BM_string_memcmp_unaligned/8 212 ns 212 ns 3304163 36.0243MB/s BM_string_memcmp_unaligned/64 361 ns 361 ns 1941597 169.279MB/s BM_string_memcmp_unaligned/512 1754 ns 1753 ns 399210 278.492MB/s BM_string_memcmp_unaligned/1024 3308 ns 3308 ns 211622 295.243MB/s BM_string_memcmp_unaligned/8k 27227 ns 27225 ns 25637 286.964MB/s BM_string_memcmp_unaligned/16k 55877 ns 55874 ns 12455 279.645MB/s BM_string_memcmp_unaligned/32k 112397 ns 112366 ns 6200 278.11MB/s BM_string_memcmp_unaligned/64k 223493 ns 223482 ns 3127 279.665MB/s Test: bionic-benchmarks --benchmark_filter='BM_string_memcmp*' Change-Id: Ia16a8cf69c68b8c0533f025f03b925c9883bb708	2017-11-03 13:21:07 -04:00
dimitry	fa432524a6	Mark __BIONIC_WEAK_FOR_NATIVE_BRIDGE symbols To make it easier for Native Bridge implementations to override these symbols. Bug: http://b/67993967 Test: make Change-Id: I4c53e53af494bca365dd2b3305ab0ccc2b23ba44	2017-10-27 10:01:46 +02:00
Elliott Hughes	d4ca231ae2	Unified sysroot: kill arch-specific include dirs. <machine/asm.h> was internal use only. <machine/fenv.h> is quite large, but can live in <bits/...>. <machine/regdef.h> is trivially replaced by saying $x instead of x in our assembler. <machine/setjmp.h> is trivially inlined into <setjmp.h>. <sgidefs.h> is unused. Bug: N/A Test: builds Change-Id: Id05dbab43a2f9537486efb8f27a5ef167b055815	2017-10-12 13:19:51 -07:00
Elliott Hughes	8465e968a8	Add <sys/random.h>. iOS 10 has <sys/random.h> with getentropy, glibc >= 2.25 has <sys/random.h> with getentropy and getrandom. (glibc also pollutes <unistd.h>, but that seems like a bad idea.) Also, all supported devices now have kernels with the getrandom system call. We've had these available internally for a while, but it seems like the time is ripe to expose them. Bug: http://b/67014255 Test: ran tests Change-Id: I76dde1e3a2d0bc82777eea437ac193f96964f138	2017-09-29 05:31:35 +00:00
Elliott Hughes	896362eb0e	Add syncfs(2). GMM calls this system call directly at the moment. That's silly. Bug: http://b/36405699 Test: ran tests Change-Id: I1e14c0e5ce0bc2aa888d884845ac30dc20f13cd5	2017-08-24 16:31:49 -07:00
George Burgess IV	6cb0687932	Split our FORTIFY implementation into libc_fortify As requested in the bug. This also rips __memcpy_chk out of memcpy.S, which lets us cut down on copypasta (all of the implementations look identical). Bug: 12231437 Test: mma on aosp_{arm,arm64,mips,x86,x86_64} internal master; checkbuild on bullhead internal master; CtsBionicTestCases on bullhead. No new failures. Change-Id: I88c39ca166bacde0b692aa3063e743bb046a5d2f	2017-07-24 14:20:16 -07:00
Elliott Hughes	94072fbb4e	Switch to inline assembler in crtbegin. Using __builtin_frame_address was clever, but didn't work for arm64 (for reasons which were never investigated) and the ChromeOS folks claim it causes trouble for x86 with ARC++ (though without a reproducible test case). Naked functions turn out to be quite unevenly supported: some architectures do the right thing, others don't; some architectures warn, others don't (and the warnings don't always match the platforms that _actually_ have problems). Inline assembler also removes the guessing games: everyone knows what the couple of instructions _ought_ to be, and now we don't have to reason about what the compiler will actually do (yet still keep the majority of the code in C). Bug: N/A Test: builds, boots Change-Id: I14207ef50ca46b6eca273c3cb7509c311146a3ca	2017-05-23 14:47:16 -07:00
Kevin Brodsky	f19eeb8446	libc: ARM64: fix memset for non-standard ZVA sizes `372f19e9e2` ("libc: ARM64: update memset/strlen/memcpy/memmove to newlib/cortex-strings") introduced a bug in memset, only occurring on the [set_long + zero + non-standard ZVA size] path, more specifically when DCZID_EL0 reports a size different to 64 or 128. On platforms with such sizes reported by DCZID_EL0, various string* unit tests fail due to memset zeroing memory before and/or after the area it is supposed to set. Test: bionic-unit-tests --gtest_filter=string* Change-Id: Idb80c0269226e40e343645a58608e3f324378468	2017-05-16 11:29:49 +01:00
Jake Weinstein	28285f5338	libc: clean up ARM64 copyright notices Test: None needed Change-Id: I3626a92329e954f67bada6ed73f3033225bbfef5	2017-05-04 12:59:53 -04:00
Elliott Hughes	5109bb463d	Make all the ELF relocation constants available. BSD thinks you should only get the relocation constants for your target architecture, but it's often useful to have them all available at once. Rearrange the headers to enable that. Also update the (modified) NetBSD files to CVS HEAD. Also remove the unused BSDism R_TYPE. Bug: N/A Test: builds Change-Id: Iad5ef29192a732696e2b36af35144a9ca116aa46	2017-04-19 13:28:32 -07:00
Elliott Hughes	901601b48e	Remove unused elf_machdep.h cruft. Also add a few missing include guards. Bug: N/A Test: builds Change-Id: I9557303c81a4b11d430112528def038ecb5562a9	2017-04-17 16:25:09 -07:00
Yuanyuan Zhong	9d150dd9a0	bionic: arm64: generic: strcmp: align to 64B cache line Align strcmp to 64B. This will ensure the performance critical loop is within one 64B cache line. Change-Id: I88eef2f12b2a6442cacec9cdbdffbf17293e7d32 Signed-off-by: Yuanyuan Zhong <zyy@motorola.com> Reviewed-on: https://gerrit.mot.com/902536 SME-Granted: SME Approvals Granted SLTApproved: Slta Waiver <sltawvr@motorola.com> Tested-by: Jira Key <jirakey@motorola.com> Reviewed-by: Yi-Wei Zhao <gbjc64@motorola.com> Reviewed-by: Igor Kovalenko <igork@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>	2017-03-20 17:54:29 +00:00
Jake Weinstein	372f19e9e2	libc: ARM64: update memset/strlen/memcpy/memmove to newlib/cortex-strings * Bionic benchmarks results at the bottom * This is a squash of the following commits: libc: ARM64: optimize memset. This is an optimized memset for AArch64. Memset is split into 4 main cases: small sets of up to 16 bytes, medium of 16..96 bytes which are fully unrolled. Large memsets of more than 96 bytes align the destination and use an unrolled loop processing 64 bytes per iteration. Memsets of zero of more than 256 use the dc zva instruction, and there are faster versions for the common ZVA sizes 64 or 128. STP of Q registers is used to reduce codesize without loss of performance. Change-Id: I0c5b5ec5ab8a1fd0f23eee8fbacada0be08e841f libc: ARM64: improve performance in strlen Change-Id: Ic20f93a0052a49bd76cd6795f51e8606ccfbf11c libc: ARM64: Optimize memcpy. This is an optimized memcpy for AArch64. Copies are split into 3 main cases: small copies of up to 16 bytes, medium copies of 17..96 bytes which are fully unrolled. Large copies of more than 96 bytes align the destination and use an unrolled loop processing 64 bytes per iteration. In order to share code with memmove, small and medium copies read all data before writing, allowing any kind of overlap. On a random copy test memcpy is 40.8% faster on A57 and 28.4% on A53. Change-Id: Ibb9483e45bbc0e8ca3d5ce98a31c55dfd8a5ac28 libc: AArch64: Tune memcpy * Further tuning for performance. Change-Id: Id08eaab885f9743fa7575077924a947c1b88e4ff libc: ARM64: optimize memmove for Cortex-A53 * Sadly does not work on Denver or Kryo, so can't go to generic This is an optimized memmove for AArch64. All copies of up to 96 bytes and all backward copies are done by the new memcpy. The only remaining case is large forward copies which are done in the same way as the memcpy loop, but copying from the end rather than the start. 
Tested on the Nextbit Robin with MSM8992 (Snapdragon 808): Before BM_string_memcmp/8 1000k 27 0.286 GiB/s BM_string_memcmp/64 50M 20 3.053 GiB/s BM_string_memcmp/512 20M 126 4.060 GiB/s BM_string_memcmp/1024 10M 234 4.372 GiB/s BM_string_memcmp/8Ki 1000k 1726 4.745 GiB/s BM_string_memcmp/16Ki 500k 3711 4.415 GiB/s BM_string_memcmp/32Ki 200k 8276 3.959 GiB/s BM_string_memcmp/64Ki 100k 16351 4.008 GiB/s BM_string_memcpy/8 1000k 13 0.612 GiB/s BM_string_memcpy/64 1000k 8 7.187 GiB/s BM_string_memcpy/512 50M 38 13.311 GiB/s BM_string_memcpy/1024 20M 86 11.858 GiB/s BM_string_memcpy/8Ki 5M 620 13.203 GiB/s BM_string_memcpy/16Ki 1000k 1265 12.950 GiB/s BM_string_memcpy/32Ki 500k 2977 11.004 GiB/s BM_string_memcpy/64Ki 500k 8003 8.188 GiB/s BM_string_memmove/8 1000k 11 0.684 GiB/s BM_string_memmove/64 1000k 16 3.855 GiB/s BM_string_memmove/512 50M 57 8.915 GiB/s BM_string_memmove/1024 20M 117 8.720 GiB/s BM_string_memmove/8Ki 2M 853 9.594 GiB/s BM_string_memmove/16Ki 1000k 1731 9.462 GiB/s BM_string_memmove/32Ki 500k 3566 9.189 GiB/s BM_string_memmove/64Ki 500k 7708 8.501 GiB/s BM_string_memset/8 1000k 16 0.487 GiB/s BM_string_memset/64 1000k 16 3.995 GiB/s BM_string_memset/512 50M 37 13.489 GiB/s BM_string_memset/1024 50M 58 17.405 GiB/s BM_string_memset/8Ki 5M 451 18.160 GiB/s BM_string_memset/16Ki 2M 883 18.554 GiB/s BM_string_memset/32Ki 1000k 2181 15.022 GiB/s BM_string_memset/64Ki 500k 4563 14.362 GiB/s BM_string_strlen/8 1000k 8 0.965 GiB/s BM_string_strlen/64 1000k 16 3.855 GiB/s BM_string_strlen/512 20M 92 5.540 GiB/s BM_string_strlen/1024 10M 167 6.111 GiB/s BM_string_strlen/8Ki 1000k 1237 6.620 GiB/s BM_string_strlen/16Ki 1000k 2765 5.923 GiB/s BM_string_strlen/32Ki 500k 6135 5.341 GiB/s BM_string_strlen/64Ki 200k 13168 4.977 GiB/s After BM_string_memcmp/8 1000k 21 0.369 GiB/s BM_string_memcmp/64 1000k 28 2.272 GiB/s BM_string_memcmp/512 20M 128 3.983 GiB/s BM_string_memcmp/1024 10M 234 4.375 GiB/s BM_string_memcmp/8Ki 1000k 1732 4.728 GiB/s 
BM_string_memcmp/16Ki 500k 3485 4.701 GiB/s BM_string_memcmp/32Ki 500k 7031 4.660 GiB/s BM_string_memcmp/64Ki 200k 14296 4.584 GiB/s BM_string_memcpy/8 1000k 5 1.458 GiB/s BM_string_memcpy/64 1000k 7 8.952 GiB/s BM_string_memcpy/512 50M 36 13.907 GiB/s BM_string_memcpy/1024 20M 80 12.750 GiB/s BM_string_memcpy/8Ki 5M 572 14.307 GiB/s BM_string_memcpy/16Ki 1000k 1165 14.053 GiB/s BM_string_memcpy/32Ki 500k 3141 10.430 GiB/s BM_string_memcpy/64Ki 500k 7008 9.351 GiB/s BM_string_memmove/8 50M 7 1.074 GiB/s BM_string_memmove/64 1000k 9 6.593 GiB/s BM_string_memmove/512 50M 37 13.502 GiB/s BM_string_memmove/1024 20M 80 12.656 GiB/s BM_string_memmove/8Ki 5M 573 14.281 GiB/s BM_string_memmove/16Ki 1000k 1168 14.018 GiB/s BM_string_memmove/32Ki 1000k 2825 11.599 GiB/s BM_string_memmove/64Ki 500k 6548 10.008 GiB/s BM_string_memset/8 1000k 7 1.038 GiB/s BM_string_memset/64 1000k 8 7.151 GiB/s BM_string_memset/512 1000k 29 17.272 GiB/s BM_string_memset/1024 50M 53 18.969 GiB/s BM_string_memset/8Ki 5M 424 19.300 GiB/s BM_string_memset/16Ki 2M 846 19.350 GiB/s BM_string_memset/32Ki 1000k 2028 16.156 GiB/s BM_string_memset/64Ki 500k 4514 14.517 GiB/s BM_string_strlen/8 1000k 7 1.120 GiB/s BM_string_strlen/64 1000k 16 3.918 GiB/s BM_string_strlen/512 50M 64 7.894 GiB/s BM_string_strlen/1024 20M 104 9.815 GiB/s BM_string_strlen/8Ki 5M 664 12.337 GiB/s BM_string_strlen/16Ki 1000k 1291 12.682 GiB/s BM_string_strlen/32Ki 1000k 2940 11.143 GiB/s BM_string_strlen/64Ki 500k 6440 10.175 GiB/s Change-Id: I635bd2798a755256f748b2af19b1a56fb85a40c6	2016-11-28 19:35:12 +00:00
Elliott Hughes	beb8796624	Use ENTRY_PRIVATE in __bionic_clone assembler. Bug: N/A Test: bionic tests Change-Id: Ic651d628be009487a36d0b2e5bcf900b981b1ef9	2016-10-26 17:01:58 -07:00
Colin Cross	7510c33b61	Remove deprecated Android.mk files These directories all have Android.bp files that are always used now, delete the Android.mk files. Change-Id: Ib0ba2d28bff88483b505426ba61606da314e03ab	2016-05-26 16:41:57 -07:00
Elliott Hughes	eafad49bd6	Add <sys/quota.h>. It turns out that at least the Nexus 9 kernel is built without CONFIG_QUOTA. If we decide we're going to mandate quota functionality, I'm happy for us to be a part of CTS that ensures that happens, but I don't want to be first, so there's not much to test here other than "will it compile?". The strace output looks right though. Bug: http://b/27948821 Bug: http://b/27952303 Change-Id: If667195eee849ed17c8fa9110f6b02907fc8fc04	2016-04-06 11:06:09 -07:00
Elliott Hughes	7f72ad4d6c	Add sync_file_range to <fcntl.h>. Bug: http://b/27952303 Change-Id: Idadfacd657ed415abc11684b9471e4e24c2fbf05	2016-04-05 12:17:22 -07:00
Elliott Hughes	afe835d540	Move math headers in with the other headers. Keeping them separate is a pain for the NDK, and doesn't help the platform. Change-Id: I96b8beef307d4a956e9c0a899ad9315adc502582	2016-04-02 08:36:33 -07:00
Greg Hackmann	e2faf07d65	Add {get,set}domainname(2) {get,set}domainname aren't in POSIX but are widely-implemented extensions. The Linux kernel provides a setdomainname syscall but not a symmetric getdomainname syscall, since it expects userspace to get the domain name from uname(2). Change-Id: I96726c242f4bb646c130b361688328b0b97269a0 Signed-off-by: Greg Hackmann <ghackmann@google.com>	2016-03-25 14:16:58 -07:00
Josh Gao	0c3655a864	Add a checksum to jmp_buf on AArch64. Bug: http://b/27417786 Change-Id: I17c22dc28a46dd6b678b449b506b0da978f3793e	2016-03-03 12:45:08 -08:00
Elliott Hughes	784609317d	Mandate optimized __memset_chk for arm and arm64. This involves actually implementing assembler __memset_chk for arm64, but that's easily done. Obviously I'd like this for all architectures (and all the string functions), but this is low-hanging fruit... Change-Id: I70ec48c91aafd1f0feb974a2555c51611de9ef82	2016-03-02 11:58:41 -08:00
Elliott Hughes	3c6016f04a	Improve diagnostics from the assembler __memcpy_chk routines. Change-Id: Iec16c92ed80beee505cba2121ea33e3550197b02	2016-03-01 14:45:58 -08:00
Elliott Hughes	b83d6747fa	Improve FORTIFY failure diagnostics. Our FORTIFY _chk functions' implementations were very repetitive and verbose but not very helpful. We'd also screwed up and put the SSIZE_MAX checks where they would never fire unless you actually had a buffer as large as half your address space, which probably doesn't happen very often. Factor out the duplication and take the opportunity to actually show details like how big the overrun buffer was, or by how much it was overrun. Also remove the obsolete FORTIFY event logging. Also remove the unused __libc_fatal_no_abort. This change doesn't improve the diagnostics from the optimized assembler implementations. Change-Id: I176a90701395404d50975b547a00bd2c654e1252	2016-02-26 22:06:17 -08:00
Elliott Hughes	5f26c6bc91	Really add adjtimex(2), and add clock_adjtime(2) too. Change-Id: I81fde2ec9fdf787bb19a784ad13df92d33a4f852	2016-02-03 13:19:10 -08:00
Greg Hackmann	3f3f6c526b	Add adjtimex Change-Id: Ia92d35b1851e73c9f157a749dba1e98f68309a8d Signed-off-by: Greg Hackmann <ghackmann@google.com>	2016-01-28 13:41:22 -08:00
Elliott Hughes	42d949ff9d	Defend against -fstack-protector in libc startup. Exactly which functions get a stack protector is up to the compiler, so let's separate the code that sets up the environment stack protection requires and explicitly build it with -fno-stack-protector. Bug: http://b/26276517 Change-Id: I8719e23ead1f1e81715c32c1335da868f68369b5	2016-01-06 20:06:08 -08:00
Daniel Micay	4200e260d2	fix the mremap signature The mremap definition was incorrect (unsigned long instead of int) and it was missing the optional new_address parameter. Change-Id: Ib9d0675aaa098c21617cedc9b2b8cf267be3aec4	2015-11-06 13:14:43 -08:00
Dan Willemsen	268a673bd1	Switch to LOCAL_SRC_FILES_EXCLUDE This moves the generic arm/arm64/x86 settings into the main makefiles and makes the rest of them derivatives. This better aligns with how soong handles arch/cpu variants. Also updates the Android.bp to make it consistent with the make versions. Change-Id: I5a0275d992bc657459eb6fe1697ad2336731d122	2015-10-20 11:58:28 -07:00
Josh Gao	54db0df8d6	Implement setjmp cookies on AArch64. Bug: http://b/23942752 Change-Id: I81408ef0dd53010140b51e3083d357d3f2961112	2015-09-17 14:07:24 -07:00
Elliott Hughes	6f4594d5dc	Add preadv/pwritev. Bug: http://b/12612572 Change-Id: I38ff2684d69bd0fe3f21b1d371b88fa60d5421cb	2015-08-26 14:48:55 -07:00
Jake Weinstein	2926f9a31e	libc: remove bcopy from memmove on 64-bit architectures * bcopy is deprecated on LP64 by the following commit: `ce9ce28e5d` Change-Id: I6849916f0ec4a2d0db9a360999ad1dc8edda952b	2015-08-17 22:06:12 +00:00
Elliott Hughes	5891abdc66	Invalidate cached pid in vfork. Bug: http://b/23008979 Change-Id: I1dd900ac988cdbe10aad3abc53240c5d352891d5	2015-08-07 19:44:12 -07:00
Tim Murray	9876aa273d	Merge "Add support for cortex-a53 in bionic."	2015-06-16 19:04:14 +00:00
Tim Murray	a73b2c961f	Add support for cortex-a53 in bionic. allows -mcpu=cortex-a53 to be passed as part of a command line. Change-Id: Id4203a9fd197f4c3b661bad21ac58c32819fd687	2015-06-15 21:43:30 -07:00
Elliott Hughes	b1304935b6	Hide accidentally-exposed __clock_nanosleep. Bug: http://b/21858067 Change-Id: Iaa83a5e17cfff796aed4f641d0d14427614d9399	2015-06-15 19:39:04 -07:00
Elliott Hughes	be57a40d29	Add process_vm_readv and process_vm_writev. Bug: http://b/21761353 Change-Id: Ic8ef3f241d62d2a4271fbc783c8af50257bac498	2015-06-10 17:24:20 -07:00
Nick Kralevich	e1d0810cd7	Add O_PATH support for flistxattr() A continuation of commit `2825f10b7f`. Add O_PATH compatibility support for flistxattr(). This allows a process to list out all the extended attributes associated with O_PATH file descriptors. Change-Id: Ie2285ac7ad2e4eac427ddba6c2d182d41b130f75	2015-06-06 11:25:41 -07:00
Nick Kralevich	2825f10b7f	libc: Add O_PATH support for fgetxattr / fsetxattr Support O_PATH file descriptors when handling fgetxattr and fsetxattr. This avoids requiring file read access to pull extended attributes. This is needed to support O_PATH file descriptors when calling SELinux's fgetfilecon() call. In particular, this allows the querying and setting of SELinux file context by using something like the following code: int dirfd = open("/path/to/dir", O_DIRECTORY); int fd = openat(dirfd, "file", O_PATH | O_NOFOLLOW); char *context; fgetfilecon(fd, &context); This change was motivated by a comment in https://android-review.googlesource.com/#/c/152680/1/toys/posix/ls.c Change-Id: Ic0cdf9f9dd0e35a63b44a4c4a08400020041eddf	2015-06-01 15:51:56 -07:00
Yabin Cui	40a8f214a5	Hide rt_sigqueueinfo. Bug: 19358804 Change-Id: I38a53ad64c81d0eefdd1d24599e769fd8a477a56	2015-05-18 11:29:20 -07:00
Chih-Hung Hsieh	33f33515b5	Use unified syntax to compile with both llvm and gcc. All arch-arm and arch-arm64 .S files were compiled by gcc with and without this patch. The output object files were identical. When compiled with llvm and this patch, the output files were also identical to gcc's output. BUG: 18061004 Change-Id: I458914d512ddf5496e4eb3d288bf032cd526d32b	2015-05-11 17:15:03 -07:00
Dan Albert	7c2c01d681	Revert "Fix volantis boot." Bug: http://b/20065774 This reverts commit `76e1cbca75`.	2015-05-07 15:12:24 -07:00
Dan Albert	6f0d7005f9	Revert "Fix clang build." Bug: http://b/20065774 This reverts commit `0975a5d9d2`.	2015-05-07 15:12:16 -07:00
Dan Albert	f920f821e2	Revert "Try again to fix clang build." Bug: http://b/20065774 This reverts commit `dffd3c5838`. Change-Id: I5dd095ff4ab133baa2afcbd4c79fbee55d05c459	2015-05-07 15:11:48 -07:00
Dmitriy Ivanov	ea295f68f1	Unregister pthread_atfork handlers on dlclose() Bug: http://b/20339788 Change-Id: I874c87faa377645fa9e0752f4fc166d81fd9ef7e	2015-04-24 17:57:37 -07:00
Dimitry Ivanov	6c63ee41ac	Merge "Revert "Unregister pthread_atfork handlers on dlclose()""	2015-04-24 03:49:30 +00:00
Dimitry Ivanov	094f58fb2a	Revert "Unregister pthread_atfork handlers on dlclose()" The visibility control in pthread_atfork.h is incorrect. It breaks 64bit libc.so by hiding pthread_atfork. This reverts commit `6df122f852`. Change-Id: I21e4b344d500c6f6de0ccb7420b916c4e233dd34	2015-04-24 03:46:57 +00:00
Elliott Hughes	3da9373fe0	Merge "Simplify close(2) EINTR handling."	2015-04-23 21:14:25 +00:00
Elliott Hughes	3391a9ff13	Simplify close(2) EINTR handling. This doesn't affect code like Chrome that correctly ignores EINTR on close, makes code that tries TEMP_FAILURE_RETRY work (where before it might have closed a different fd and appeared to succeed, or had a bogus EBADF), and makes "goto fail" code work (instead of mistakenly assuming that EINTR means that the close failed). Who loses? Anyone actively trying to detect that they caught a signal while in close(2). I don't think those people exist, and I think they have better alternatives available. Bug: https://code.google.com/p/chromium/issues/detail?id=269623 Bug: http://b/20501816 Change-Id: I11e2f66532fe5d1b0082b2433212e24bdda8219b	2015-04-23 08:41:45 -07:00
Dmitriy Ivanov	6df122f852	Unregister pthread_atfork handlers on dlclose() Change-Id: I326fdf6bb06bed12743f08980b5c69d849c015b8	2015-04-22 19:19:37 -07:00

185 commits