Commit graph

9 commits

George Burgess IV
6cb0687932 Split our FORTIFY implementation into libc_fortify
As requested in the bug. This also rips __memcpy_chk out of memcpy.S,
which lets us cut down on copypasta (all of the implementations look
identical).

Bug: 12231437
Test: mma on aosp_{arm,arm64,mips,x86,x86_64} internal master;
checkbuild on bullhead internal master; CtsBionicTestCases on bullhead.
No new failures.
Change-Id: I88c39ca166bacde0b692aa3063e743bb046a5d2f
2017-07-24 14:20:16 -07:00
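
For reference, the shared replacement for those identical assembly stubs
reduces to a bounds check in front of memcpy. A minimal C sketch, with a
hypothetical stand-in for bionic's actual abort path:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Stand-in for bionic's fatal-error path (an assumption, not the real API). */
    static void fortify_fatal(const char* msg) {
      fprintf(stderr, "FORTIFY: %s\n", msg);
      abort();
    }

    /* What __builtin___memcpy_chk lowers to: dst_len is the compiler's idea
       of the destination buffer's size (__builtin_object_size). */
    void* memcpy_chk_sketch(void* dst, const void* src, size_t count,
                            size_t dst_len) {
      if (count > dst_len) {
        fortify_fatal("memcpy: prevented write past end of buffer");
      }
      return memcpy(dst, src, count);
    }
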
Christopher Ferris
866e7b6906 Fix assembler warnings.
There are a few instructions deprecated on armv8 that result in lots
of warnings. Add an arch directive so that these warnings go away.

This doesn't cause any problems because the instructions still
execute properly.

Bug: 38319728

Test: Built all of these assembler files and verified the warnings are gone.
Change-Id: If063defdd16f290c01975233c8d257d1b2005e76
2017-05-24 16:06:11 -07:00
Chitti Babu Theegala
cbfdc7f905 Fix streaming (memcpy) performance on Cortex-A7
Stream-mode detection for L1 on the Cortex-A7 core fails for addresses
that are not cache-line-size (64-byte) aligned, which leads to destination
data getting cached unnecessarily. ARM has confirmed this A7 issue.

This issue is solved by aligning the destination address to a 64-byte
boundary before entering the main loop of the memcpy routine.
Although the micro_bench memcpy score is lower when the L1 cache is
bypassed, that is desirable: it avoids unnecessary eviction of other
processes' data from L1, which is better for overall system performance.

The higher micro_bench memcpy numbers for alignments below 64 bytes look
good, but they come at the cost of L1 cache pollution: during memcpy/memset,
unnecessary data is pulled into L1, evicting other processes' data.
For example, during memset(0) the L1 cache fills with zeroes, which should
be avoided.
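
As a rough C sketch of the destination-alignment idea (illustrative only;
the real fix is in the ARM assembly memcpy):

    #include <stdint.h>
    #include <string.h>

    /* Copy a short head so the destination reaches a 64-byte cache-line
       boundary, then run the bulk loop from an aligned destination. */
    void* memcpy_aligned_dst(void* dst, const void* src, size_t n) {
      unsigned char* d = dst;
      const unsigned char* s = src;
      size_t head = (64 - ((uintptr_t)d & 63)) & 63; /* bytes to the boundary */
      if (head > n) head = n;
      n -= head;
      while (head--) *d++ = *s++;
      while (n >= 64) {        /* main loop: 64-byte blocks, dst aligned */
        memcpy(d, s, 64);      /* stands in for the unrolled block move */
        d += 64; s += 64; n -= 64;
      }
      while (n--) *d++ = *s++; /* tail */
      return dst;
    }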

Additionally, there is a second Cortex-A7 issue that impacts performance
for all alignments and all Android Wear versions: the A7's store buffer is
32 bytes, which limits back-to-back 32-byte stores. In the current
implementation, back-to-back 32-byte writes cause CPU stalls. This is
solved by interleaving loads and stores (see the sketch after this
commit), which avoids the stalls during memcpy by using the A7's internal
load and store buffers efficiently.

Change-Id: Ie5f12f2bb5d86f627686730416279057e4f5f6d0
2016-12-19 15:11:43 -08:00
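
The load/store interleaving described in the commit above only truly
exists at the instruction level in the .S file (a C compiler is free to
reschedule), but a C sketch conveys the shape of the change:

    #include <stdint.h>
    #include <stddef.h>

    /* Illustrative only: load/store pairs alternate (load, store, load,
       store) instead of a run of back-to-back stores (load, load, store,
       store), giving the 32-byte store buffer time to drain while the
       next load is in flight. */
    void copy_interleaved(uint64_t* dst, const uint64_t* src, size_t nwords) {
      size_t i;
      for (i = 0; i + 1 < nwords; i += 2) {
        uint64_t a = src[i];     /* load  */
        dst[i] = a;              /* store */
        uint64_t b = src[i + 1]; /* load  */
        dst[i + 1] = b;          /* store */
      }
      if (i < nwords) dst[i] = src[i]; /* odd tail word */
    }
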
Christopher Ferris
ecebb49ac6 Add cortex-a7 specific routines.
Test: Changed the angler target to use cortex-a7 and compiled.
Test: Booted this version on angler and ran bionic-unit-tests.

Change-Id: Ice7f6ea38a2569582161a8e659d7877918c1a45a
2016-11-28 12:49:36 -08:00
Elliott Hughes
382bd666e2 Stop including <machine/cpu-features.h>.
We're not looking at __ARM_ARCH__, because we don't support ARMv6.

Bug: http://b/18556103
Change-Id: I91fe096af697dc842a57e97515312e3530743678
2016-05-16 17:52:40 -07:00
Elliott Hughes
01d5b946ac Remove optimized code for bzero, which was removed from POSIX in 2008.
I'll come back for the last bcopy remnant...

Bug: http://b/26407170
Change-Id: Iabfeb95fc8a4b4b3992e3cc209ec5221040e7c26
2016-03-02 17:21:07 -08:00
Elliott Hughes
62e59646f8 Improve diagnostics from the assembler __memset_chk routines.
Change-Id: Ic165043ab8cd5e16866b3e11cfba960514cbdc57
2016-03-01 12:46:47 -08:00
Elliott Hughes
b83d6747fa Improve FORTIFY failure diagnostics.
Our FORTIFY _chk functions' implementations were very repetitive and verbose
but not very helpful. We'd also screwed up and put the SSIZE_MAX checks where
they would never fire unless you actually had a buffer as large as half your
address space, which probably doesn't happen very often.

Factor out the duplication and take the opportunity to actually show details
like how big the overrun buffer was, or by how much it was overrun.

Also remove the obsolete FORTIFY event logging.

Also remove the unused __libc_fatal_no_abort.

This change doesn't improve the diagnostics from the optimized assembler
implementations.

Change-Id: I176a90701395404d50975b547a00bd2c654e1252
2016-02-26 22:06:17 -08:00
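
The factoring described above reduces to a single shared check. A minimal
sketch with hypothetical names (bionic's real helper and message format
may differ):

    #include <stdio.h>
    #include <stdlib.h>
    #include <stddef.h>

    /* One shared helper replaces the per-function boilerplate and reports
       the sizes involved instead of a generic "buffer overflow detected". */
    void check_buffer_access(const char* fn, const char* action,
                             size_t claimed, size_t actual) {
      if (claimed > actual) {
        fprintf(stderr, "FORTIFY: %s: prevented %zu-byte %s into %zu-byte buffer\n",
                fn, claimed, action, actual);
        abort();
      }
    }

    /* e.g. from a _chk entry point:
       check_buffer_access("memset", "write", n, dst_len); */
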
Christopher Ferris
5930772286 Add optimized cortex-a7/cortex-a53 memset/memcpy.
Add an optimized memset that is ~20% faster for cortex-a7 and
cortex-a53.

Add a 32 bit optimized cortex-a53 memcpy that is about ~20% faster
on cached data.

Fix the cortex-a15 __str{cat,cpy}_chk.S, memcpy_base.S to remove
the phony functions, since they aren't needed any more. Then add
a direct include of these for cortex-a53.

Verified the new functions by stepping through all of the major
paths and verifying the backtrace is still correct.

Bug: 22696180
Change-Id: Iec92a3f82d51243cca76c9aff9f35d920ff865ae
2015-08-17 13:02:03 -07:00