58 commits

Author SHA1 Message Date
George Burgess IV
6cb0687932 Split our FORTIFY implementation into libc_fortify
As requested in the bug. This also rips __memcpy_chk out of memcpy.S,
which lets us cut down on copypasta (all of the implementations look
identical).

Bug: 12231437
Test: mma on aosp_{arm,arm64,mips,x86,x86_64} internal master;
checkbuild on bullhead internal master; CtsBionicTestCases on bullhead.
No new failures.
Change-Id: I88c39ca166bacde0b692aa3063e743bb046a5d2f
2017-07-24 14:20:16 -07:00
Christopher Ferris
866e7b6906 Fix assembler warnings.
There are a few instructions deprecated on armv8 that result in lots
of warnings. Add an arch directive so that these warnings go away.

This doesn't cause any problems because the instructions still
execute properly.

Bug: 38319728

Test: Built all of these assembler files and verified the warnings are gone.
Change-Id: If063defdd16f290c01975233c8d257d1b2005e76
2017-05-24 16:06:11 -07:00
Christopher Ferris
fe1af1a64b Small cleanup of cortex-a15 code.
Remove new version of the cortex-a15 that caused a regression. We are never
going to revisit that code, and it is only confusing things.

Also remove the setting of MEMCPY_BASE and use the correct include
directly.

Test: Compiled angler with 32 bit arch as cortex-a15. Ran 32 bit bionic
Test: unit tests on angler.

Change-Id: I9372c01758fd7a596849c87b1a3f805bb477c94f
2016-11-01 14:28:22 -07:00
Colin Cross
8428fb03c8 Merge "Remove deprecated Android.mk files" 2016-06-02 16:31:40 +00:00
Colin Cross
7510c33b61 Remove deprecated Android.mk files
These directories all have Android.bp files that are always used now,
delete the Android.mk files.

Change-Id: Ib0ba2d28bff88483b505426ba61606da314e03ab
2016-05-26 16:41:57 -07:00
Elliott Hughes
bdd8f896dd Improve diagnostics from the assembler __strcpy_chk routines.
Change-Id: Ib95b598f7f8338cc1a618c00232a4259dc4a6319
2016-05-26 16:38:34 -07:00
Elliott Hughes
c75da09f4f Improve diagnostics from the assembler __strcat_chk routines.
Change-Id: I44cbe5389c66de6618e581a6e302eea22c39d6fb
2016-05-26 14:55:00 -07:00
Elliott Hughes
382bd666e2 Stop including <machine/cpu-features.h>.
We're not looking at __ARM_ARCH__, because we don't support ARMv6.

Bug: http://b/18556103
Change-Id: I91fe096af697dc842a57e97515312e3530743678
2016-05-16 17:52:40 -07:00
Elliott Hughes
01d5b946ac Remove optimized code for bzero, which was removed from POSIX in 2008.
I'll come back for the last bcopy remnant...

Bug: http://b/26407170
Change-Id: Iabfeb95fc8a4b4b3992e3cc209ec5221040e7c26
2016-03-02 17:21:07 -08:00
Elliott Hughes
3c6016f04a Improve diagnostics from the assembler __memcpy_chk routines.
Change-Id: Iec16c92ed80beee505cba2121ea33e3550197b02
2016-03-01 14:45:58 -08:00
Elliott Hughes
62e59646f8 Improve diagnostics from the assembler __memset_chk routines.
Change-Id: Ic165043ab8cd5e16866b3e11cfba960514cbdc57
2016-03-01 12:46:47 -08:00
Elliott Hughes
b83d6747fa Improve FORTIFY failure diagnostics.
Our FORTIFY _chk functions' implementations were very repetitive and verbose
but not very helpful. We'd also screwed up and put the SSIZE_MAX checks where
they would never fire unless you actually had a buffer as large as half your
address space, which probably doesn't happen very often.

Factor out the duplication and take the opportunity to actually show details
like how big the overrun buffer was, or by how much it was overrun.
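
As an illustration of the kind of detail described above, here is a minimal C sketch of a failure message that reports both the buffer size and the size of the overrun. The function name format_overrun and the message wording are hypothetical and are not taken from bionic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not bionic's API): describe a write of `count` bytes
 * into a buffer that only holds `dst_len` bytes. Returns snprintf's result.
 */
static int format_overrun(char *out, size_t out_len, const char *fn,
                          size_t count, size_t dst_len) {
    /* Only meaningful for an actual overrun, so count - dst_len cannot wrap. */
    assert(count > dst_len);
    return snprintf(out, out_len,
                    "%s: prevented %zu-byte write into %zu-byte buffer "
                    "(overrun by %zu bytes)",
                    fn, count, dst_len, count - dst_len);
}
```

Calling format_overrun with "memcpy", a 20-byte write, and a 16-byte buffer produces a message that names the function, both sizes, and the 4-byte overrun.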

Also remove the obsolete FORTIFY event logging.

Also remove the unused __libc_fatal_no_abort.

This change doesn't improve the diagnostics from the optimized assembler
implementations.

Change-Id: I176a90701395404d50975b547a00bd2c654e1252
2016-02-26 22:06:17 -08:00
Dan Willemsen
268a673bd1 Switch to LOCAL_SRC_FILES_EXCLUDE
This moves the generic arm/arm64/x86 settings into the main makefiles
and makes the rest of them derivatives. This better aligns with how
soong handles arch/cpu variants.

Also updates the Android.bp to make it consistent with the make
versions.

Change-Id: I5a0275d992bc657459eb6fe1697ad2336731d122
2015-10-20 11:58:28 -07:00
Christopher Ferris
fdfcfce7c6 Fix over read in strcpy/stpcpy/strcat.
This bug will happen when these circumstances are met:

- Destination address & 0x7 == 1, strlen of src is 11, 12, 13.
- Destination address & 0x7 == 2, strlen of src is 10, 11, 12.
- Destination address & 0x7 == 3, strlen of src is 9, 10, 11.
- Destination address & 0x7 == 4, strlen of src is 8, 9, 10.

In these cases, the dest alignment code does a ldr which reads 4 bytes,
and it will read past the end of the source. In most cases, this is
probably benign, but if this crosses into a new page it could cause a
crash.
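
To see why a 4-byte ldr near the end of the source can fault, the following C sketch checks whether a 4-byte load starting at a given address spills into the next page. The 4096-byte page size and the helper name are assumptions for illustration; the commit itself does not state them.

```c
#include <stdint.h>

/* Assumed page size for illustration; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns 1 if a 4-byte load starting at `addr`
 * touches the following page, 0 if it stays within one page.
 */
static int word_load_crosses_page(uintptr_t addr) {
    uintptr_t offset = addr & (SKETCH_PAGE_SIZE - 1u);
    return offset > SKETCH_PAGE_SIZE - 4u;
}
```

A load at page offset 4092 stays within the page, while a load at offset 4093 reads one byte of the next page. If that next page is unmapped, a source string whose terminator sits at the very end of the page can fault even though the copy never needed the extra bytes.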

Fix the labels in the cortex-a9 strcat.

Modify the overread test to vary the dst alignment to expose this bug.
Also, shrink the strcat/strlcat overread cases since the dst alignment
variation increases the runtime too much.

Bug: 24345899
Change-Id: Ib34a559bfcebd89861985b29cae6c1e47b5b5855
2015-09-24 14:17:36 -07:00
Christopher Ferris
5930772286 Add optimized cortex-a7/cortex-a53 memset/memcpy.
Add an optimized memset that is ~20% faster for cortex-a7 and
cortex-a53.

Add a 32 bit optimized cortex-a53 memcpy that is about 20% faster
on cached data.

Fix the cortex-a15 __str{cat,cpy}_chk.S, memcpy_base.S to remove
the phony functions, since they aren't needed any more. Then add
a direct include of these for cortex-a53.

Verified the new functions by stepping through all of the major
paths and verifying the backtrace is still correct.

Bug: 22696180
Change-Id: Iec92a3f82d51243cca76c9aff9f35d920ff865ae
2015-08-17 13:02:03 -07:00
Christopher Ferris
795a8e3d69 Make all labels local.
Change the non-local labels to .L labels.

Change-Id: I720e894f2e311af8f4a0970303d8b86575fb69a5
2015-07-23 12:12:55 -07:00
Christopher Ferris
41efc92e35 Use assembly memmove for all arm32 processors.
Bug: 15110993
Change-Id: Ia3dcd6b8c4032f8c72b6f2e628b635ce99667c09
2015-04-08 16:53:16 -07:00
Elliott Hughes
76f8916b90 Clean up <stdlib.h> slightly.
Interestingly, this mostly involves cleaning up our implementation of
various <string.h> functions.

Change-Id: Ifaef49b5cb997134f7bc0cc31bdac844bdb9e089
2015-01-26 14:28:41 -08:00
Elliott Hughes
1ef6ec40e1 Move the generic arm memcmp.S into the generic directory.
Change-Id: I48e4d14a0dcddbb246edbac6d0329619574ab44d
2014-12-15 11:06:34 -08:00
Christopher Ferris
7d849ac378 Add stpcpy assembler version.
For generic, continue to use the C version of the code.

Bug: 13746695
Change-Id: I77426a70b06131f2373bb51265bea1240bb3f101
2014-09-30 19:23:26 -07:00
Christopher Ferris
c8bd2abab2 Cleanup arm assembly.
Remove the old arm directives.
Change the non-local labels to .L labels.
Add cfi directives to strcpy.S.

Change-Id: I9bafee1ffe5d85c92d07cfa8a85338cef9759562
2014-09-29 15:53:10 -07:00
Shu Zhang
6c80ccdeed denver: optimize memmove
Optimize 32-bit denver memmove with reversal memcpy.

Change-Id: Iaad0a9475248cdd7e4f50d58bea9db1b767abc88
2014-05-20 12:31:38 -07:00
Elliott Hughes
851e68a240 Unify our assembler macros.
Our <machine/asm.h> files were modified from upstream, to the extent
that no architecture was actually using the upstream ENTRY or END macros,
assuming that architecture even had such a macro upstream. This patch moves
everyone to the same macros, with just a few tweaks remaining in the
<machine/asm.h> files, which no one should now use directly.

I've removed most of the unused cruft from the <machine/asm.h> files, though
there's still rather a lot in the mips/mips64 ones.

Bug: 12229603
Change-Id: I2fff287dc571ac1087abe9070362fb9420d85d6d
2014-02-20 13:51:26 -08:00
Ying Wang
f25d677147 Reconfig libc's Android.mk to build for multilib
1. Moved arch-specific setup to their own files:
    - <arch>/<arch>.mk, arch-specific configs. Variables in those config
      end with the arch name.
    - removed the extra complexity introduced by function libc-add-cpu-variant-src,
      which seems to be not very useful these days.
2. Separated out the crt object files generation rules and set up the
   rules for both TARGET_ARCH and TARGET_2ND_ARCH.
3. Build all the libraries for both TARGET_ARCH and TARGET_2ND_ARCH,
  with the arch-specific LOCAL_ variables.

Bug: 11654773
Change-Id: I9c2d85db0affa49199d182236d2210060a321421
2014-02-12 13:58:34 -08:00
The Android Open Source Project
f00c938c7f Merge commit '811b0cdb2d6e4a697dbc63a678712759dd0db242' into HEAD
Change-Id: I786944f80fb1a2d502fed51dc2c391ed5db66761
2013-11-22 13:38:33 -08:00
Christopher Ferris
507cfe2e10 Add .cfi_startproc/.cfi_endproc to ENTRY/END.
Bug: 10414953
Change-Id: I711718098b9f3cc0ba8277778df64557e9c7b2a0
2013-11-19 16:31:24 -08:00
Elliott Hughes
f2a760dca7 am a85606e1: am c100a100: Merge "\'Avoid confusing "read prevented write" log messages\' 2."
* commit 'a85606e1563c2153bea3c73dfe4ca1588e778f22':
  'Avoid confusing "read prevented write" log messages' 2.
2013-10-15 17:38:17 -07:00
Elliott Hughes
68b67113a4 'Avoid confusing "read prevented write" log messages' 2.
This time it's assembler.

Change-Id: Iae6369833b8046b8eda70238bb4ed0cae64269ea
2013-10-15 17:17:05 -07:00
Christopher Ferris
289c460c55 am ac6bc319: Remove new aligned memcpy path for cortex-a15.
* commit 'ac6bc31942e58c8893c0695d9766d0f3e39335fe':
  Remove new aligned memcpy path for cortex-a15.
2013-10-15 16:17:14 -07:00
Christopher Ferris
ac6bc31942 Remove new aligned memcpy path for cortex-a15.
For some reason the new cortex-a15 memcpy code from ARM is really bad
for really large copies. This change forces us to go down the old path
for all copies.

All of my benchmarks show the new version is faster for large copies, but
something is going on that I don't understand.

Bug: 10838353
Change-Id: I01c16d4a2575e76f4c69862c6f78fd9024eb3fb8
2013-10-15 14:54:02 -07:00
Elliott Hughes
eb847bc866 Fix x86_64 build, clean up intermediate libraries.
The x86_64 build was failing because clone.S had a call to __thread_entry which
was being added to a different intermediate .a on the way to making libc.so,
and the linker couldn't guarantee statically that such a relocation would be
possible.

  ld: error: out/target/product/generic_x86_64/obj/STATIC_LIBRARIES/libc_common_intermediates/libc_common.a(clone.o): requires dynamic R_X86_64_PC32 reloc against '__thread_entry' which may overflow at runtime; recompile with -fPIC

This patch addresses that by ensuring that the caller and callee end up in the
same intermediate .a. While I'm here, I've tried to clean up some of the mess
that led to this situation too. In particular, this removes libc/private/ from
the default include path (except for the DNS code), and splits out the DNS
code into its own library (since it's a weird special case of upstream NetBSD
code that's diverged so heavily it's unlikely ever to get back in sync).

There's more cleanup of the DNS situation possible, but this is definitely a
step in the right direction, and it's more than enough to get x86_64 building
cleanly.

Change-Id: I00425a7245b7a2573df16cc38798187d0729e7c4
2013-10-09 16:00:17 -07:00
Nick Kralevich
6861c6f85e Make error messages even better!
Change-Id: I72bd1eb1d526dc59833e5bc3c636171f7f9545af
2013-10-04 11:43:30 -07:00
Christopher Ferris
aec1b3540a Remove the __ARM_FEATURE_DSP check.
The check for __ARM_FEATURE_DSP being defined is pointless since it
is always defined.

Bug: 10971279

Merge from internal master.

(cherry-picked from d2642fa70c)

Change-Id: If23ab3271f4da0c38cd531ffdc9a7e5eed6ec5dc
2013-10-02 23:14:01 -07:00
Nick Kralevich
32bbf8a63b libc: don't export unnecessary symbols
Symbols associated with the internal implementation of memcpy
like routines should be private.

Change-Id: I2b1d1f59006395c29d518c153928437b08f93d16
2013-10-02 16:54:58 -07:00
Christopher Ferris
16e185c908 __memcpy_chk: Fix signed cmp of unsigned values.
I accidentally did a signed comparison of the size_t values passed in
for three of the _chk functions. Changing them to unsigned compares.
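
The difference between the two comparisons can be shown with a short C sketch. The helper names are hypothetical, and the sketch does not reproduce the assembler; it only shows how a large size_t value is misread once it is treated as signed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Buggy form: both sizes are reinterpreted as signed before comparing.
 * Converting an out-of-range size_t to ptrdiff_t is implementation-defined;
 * on common two's-complement targets SIZE_MAX becomes -1.
 */
static int fits_signed(size_t count, size_t dst_len) {
    return (ptrdiff_t)count <= (ptrdiff_t)dst_len;
}

/* Fixed form: size_t is unsigned, so compare it as unsigned. */
static int fits_unsigned(size_t count, size_t dst_len) {
    return count <= dst_len;
}
```

On such targets, with count = SIZE_MAX and dst_len = 16, fits_signed reports that the copy fits because the count is read as -1, while fits_unsigned rejects it. The signed form also wrongly rejects a small count when dst_len is SIZE_MAX.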

Add three new tests to verify this failure is fixed.

Bug: 10691831

Merge from internal master.

(cherry-picked from 883ef2499c)

Change-Id: Id9a96b549435f5d9b61dc132cf1082e0e30889f5
2013-09-20 20:12:09 -07:00
Christopher Ferris
a57c9c084b Fix all debug directives.
The backtrace when a fortify check failed was not correct. This change
adds all of the necessary directives to get a correct backtrace.

Fix the strcmp directives and change all labels to local labels.

Testing:
- Verify that the runtime can decode the stack for __memcpy_chk, __memset_chk,
  __strcpy_chk, __strcat_chk fortify failures.
- Verify that gdb can decode the stack properly when hitting a fortify check.
- Verify that the runtime can decode the stack for a seg fault for all of the
  _chk functions and for memcpy/memset.
- Verify that gdb can decode the stack for a seg fault for all of the _chk
  functions and for memcpy/memset.
- Verify that the runtime can decode the stack for a seg fault for strcmp.
- Verify that gdb can decode the stack for a seg fault in strcmp.

Bug: 10342460
Bug: 10345269

Merge from internal master.

(cherry-picked from 05332f2ce7)

Change-Id: Ibc919b117cfe72b9ae97e35bd48185477177c5ca
2013-09-20 18:59:58 -07:00
Christopher Ferris
bd7fe1d3c4 Update all debug directives.
The libcorkscrew stack unwinder does not understand cfi directives,
so add .save directives so that it can function properly.

Also add the directives in to strcmp.S and fix a missing set of
directives in cortex-a9/memcpy_base.S.

Bug: 10345269

Merge from internal master.

(cherry-picked from 5f7ccea3ff)

Change-Id: If48a216203216a643807f5d61906015984987189
2013-09-20 13:49:38 -07:00
Christopher Ferris
883ef2499c __memcpy_chk: Fix signed cmp of unsigned values.
I accidentally did a signed comparison of the size_t values passed in
for three of the _chk functions. Changing them to unsigned compares.

Add three new tests to verify this failure is fixed.

Bug: 10691831
Change-Id: Ia831071f7dffd5972a748d888dd506c7cc7ddba3
2013-09-10 17:34:03 -07:00
Christopher Ferris
05332f2ce7 Fix all debug directives.
The backtrace when a fortify check failed was not correct. This change
adds all of the necessary directives to get a correct backtrace.

Fix the strcmp directives and change all labels to local labels.

Testing:
- Verify that the runtime can decode the stack for __memcpy_chk, __memset_chk,
  __strcpy_chk, __strcat_chk fortify failures.
- Verify that gdb can decode the stack properly when hitting a fortify check.
- Verify that the runtime can decode the stack for a seg fault for all of the
  _chk functions and for memcpy/memset.
- Verify that gdb can decode the stack for a seg fault for all of the _chk
  functions and for memcpy/memset.
- Verify that the runtime can decode the stack for a seg fault for strcmp.
- Verify that gdb can decode the stack for a seg fault in strcmp.

Bug: 10342460
Bug: 10345269

Change-Id: I1dedadfee207dce4a285e17a21e8952bbc63786a
2013-08-28 15:42:05 -07:00
Christopher Ferris
5f7ccea3ff Update all debug directives.
The libcorkscrew stack unwinder does not understand cfi directives,
so add .save directives so that it can function properly.

Also add the directives in to strcmp.S and fix a missing set of
directives in cortex-a9/memcpy_base.S.

Bug: 10345269

Change-Id: I043f493e0bb6c45bd3f4906fbe1d9f628815b015
2013-08-20 11:22:34 -07:00
Christopher Ferris
5f45d583b0 Create optimized __strcpy_chk/__strcat_chk.
This change pulls the memcpy code out into a new file so that the
__strcpy_chk and __strcat_chk can use it with an include.

The new versions of the two chk functions use assembly versions
of strlen and memcpy to implement this check. This allows near
parity with the assembly versions of strcpy/strcat. It also means that
as memcpy implementations get faster, so do the chk functions.

Other included changes:
- Change all of the assembly labels to local labels. The other labels
  confuse gdb and mess up backtracing.
- Add .cfi_startproc and .cfi_endproc directives so that gdb is not
  confused when falling through from one function to another.
- Change all functions to use cfi directives since they are more powerful.
- Move the memcpy_chk fail code outside of the memcpy function definition
  so that backtraces work properly.
- Preserve lr before the calls to __fortify_chk_fail so that the backtrace
  actually works.

Testing:

- Ran the bionic unit tests. Verified all error messages in logs are set
  correctly.
- Ran libc_test, replacing strcpy with __strcpy_chk and replacing
  strcat with __strcat_chk.
- Ran the debugger on nexus10, nexus4, and old nexus7. Verified that the
  backtrace is correct for all fortify check failures. Also verify that
  when falling through from __memcpy_chk to memcpy that the backtrace is
  still correct. Also verified the same for __memset_chk and bzero.
  Verified the two different paths in the cortex-a9 memset routine that
  save variables to the stack still show the backtrace properly.

Bug: 9293744

(cherry-picked from 2be91915dc)

Change-Id: Ia407b74d3287d0b6af0139a90b6eb3bfaebf2155
2013-08-15 11:13:39 -07:00
Christopher Ferris
59a13c122e Optimize __memset_chk, __memcpy_chk. DO NOT MERGE.
This change creates assembler versions of __memcpy_chk/__memset_chk
that are implemented in the memcpy/memset assembler code. This change
avoids an extra call to memcpy/memset, instead allowing a simple fall
through to occur from the chk code into the body of the real
implementation.

Testing:

- Ran the libc_test on __memcpy_chk/__memset_chk on all nexus devices.
- Wrote a small test executable that has three calls to __memcpy_chk and
  three calls to __memset_chk. First call dest_len is length + 1. Second
  call dest_len is length. Third call dest_len is length - 1.
  Verified that the first two calls pass, and the third fails. Examined
  the logcat output on all nexus devices to verify that the fortify
  error message was sent properly.
- I benchmarked the new __memcpy_chk and __memset_chk on all systems. For
  __memcpy_chk and large copies, the savings is relatively small (about 1%).
  For small copies, the savings is large on cortex-a15/krait devices
  (between 5% and 30%).
  For cortex-a9 and small copies, the speed up is present, but relatively
  small (about 3% to 5%).
  For __memset_chk and large copies, the savings is also small (about 1%).
  However, all processors show larger speed-ups on small copies (about 30% to
  100%).

Bug: 9293744

Merge from internal master.

(cherry-picked from 7c860db074)

Change-Id: I916ad305e4001269460ca6ebd38aaa0be8ac7f52
2013-08-14 18:14:43 -07:00
Christopher Ferris
f0c3d90913 Create optimized __strcpy_chk/__strcat_chk.
This change pulls the memcpy code out into a new file so that the
__strcpy_chk and __strcat_chk can use it with an include.

The new versions of the two chk functions use assembly versions
of strlen and memcpy to implement this check. This allows near
parity with the assembly versions of strcpy/strcat. It also means that
as memcpy implementations get faster, so do the chk functions.

Other included changes:
- Change all of the assembly labels to local labels. The other labels
  confuse gdb and mess up backtracing.
- Add .cfi_startproc and .cfi_endproc directives so that gdb is not
  confused when falling through from one function to another.
- Change all functions to use cfi directives since they are more powerful.
- Move the memcpy_chk fail code outside of the memcpy function definition
  so that backtraces work properly.
- Preserve lr before the calls to __fortify_chk_fail so that the backtrace
  actually works.

Testing:

- Ran the bionic unit tests. Verified all error messages in logs are set
  correctly.
- Ran libc_test, replacing strcpy with __strcpy_chk and replacing
  strcat with __strcat_chk.
- Ran the debugger on nexus10, nexus4, and old nexus7. Verified that the
  backtrace is correct for all fortify check failures. Also verify that
  when falling through from __memcpy_chk to memcpy that the backtrace is
  still correct. Also verified the same for __memset_chk and bzero.
  Verified the two different paths in the cortex-a9 memset routine that
  save variables to the stack still show the backtrace properly.

Bug: 9293744
Change-Id: Id5aec8c3cb14101d91bd125eaf3770c9c8aa3f57
(cherry picked from commit 2be91915dc)
2013-08-14 07:46:00 +00:00
Christopher Ferris
4e24dcc8d8 Optimize strcat/strcpy, small tweaks to strlen. DO NOT MERGE
Create one version of strcat/strcpy/strlen for cortex-a15/krait and another
version for cortex-a9.

Tested with the libc_test strcat/strcpy/strlen tests.
Including new tests that verify that the src for strcat/strcpy do not
overread across page boundaries.

NOTE: The handling of unaligned strcpy (same code in strcat) could probably
be optimized further such that the src is read 64 bits at a time instead of
the partial reads occurring now.

strlen improves slightly since it was recently optimized.

Performance improvements for strcpy and strcat (using an empty dest string):

cortex-a9
- Small copies vary from about 5% to 20% as the size gets above 10 bytes.
- Copies >= 1024, about a 60% improvement.
- Unaligned copies, from about 40% improvement.

cortex-a15
- Most small copies exhibit a 100% improvement, a few copies only
  improve by 20%.
- Copies >= 1024, about 150% improvement.
- Unaligned copies, about 100% improvement.

krait
- Most small copies vary widely, but on average 20% improvement, then
  the performance gets better, hitting about a 100% improvement when
  copying 64 bytes of data.
- Copies >= 1024, about 100% improvement.
- When copying MBs of data, about 50% improvement.
- Unaligned copies, about 90% improvement.

As strcat destination strings get larger in size:

cortex-a9
- about 40% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.

cortex-a15
- about 200% improvement for small dst strings (>=32).
- about 250% improvement for dst strings >= 1024.

krait
- about 25% improvement for small dst strings (>=32).
- about 100% improvement for dst strings >=1024.

Merge from internal master.

(cherry-picked from d119b7b6f4)

Change-Id: I296463b251ef9fab004ee4dded2793feca5b547a
2013-08-08 11:13:46 -07:00
Christopher Ferris
7c860db074 Optimize __memset_chk, __memcpy_chk.
This change creates assembler versions of __memcpy_chk/__memset_chk
that are implemented in the memcpy/memset assembler code. This change
avoids an extra call to memcpy/memset, instead allowing a simple fall
through to occur from the chk code into the body of the real
implementation.

Testing:

- Ran the libc_test on __memcpy_chk/__memset_chk on all nexus devices.
- Wrote a small test executable that has three calls to __memcpy_chk and
  three calls to __memset_chk. First call dest_len is length + 1. Second
  call dest_len is length. Third call dest_len is length - 1.
  Verified that the first two calls pass, and the third fails. Examined
  the logcat output on all nexus devices to verify that the fortify
  error message was sent properly.
- I benchmarked the new __memcpy_chk and __memset_chk on all systems. For
  __memcpy_chk and large copies, the savings is relatively small (about 1%).
  For small copies, the savings is large on cortex-a15/krait devices
  (between 5% and 30%).
  For cortex-a9 and small copies, the speed up is present, but relatively
  small (about 3% to 5%).
  For __memset_chk and large copies, the savings is also small (about 1%).
  However, all processors show larger speed-ups on small copies (about 30% to
  100%).

Bug: 9293744

Change-Id: I8926d59fe2673e36e8a27629e02a7b7059ebbc98
2013-08-06 15:38:29 -07:00
Christopher Ferris
d119b7b6f4 Optimize strcat/strcpy, small tweaks to strlen.
Create one version of strcat/strcpy/strlen for cortex-a15/krait and another
version for cortex-a9.

Tested with the libc_test strcat/strcpy/strlen tests.
Including new tests that verify that the src for strcat/strcpy do not
overread across page boundaries.

NOTE: The handling of unaligned strcpy (same code in strcat) could probably
be optimized further such that the src is read 64 bits at a time instead of
the partial reads occurring now.

strlen improves slightly since it was recently optimized.

Performance improvements for strcpy and strcat (using an empty dest string):

cortex-a9
- Small copies vary from about 5% to 20% as the size gets above 10 bytes.
- Copies >= 1024, about a 60% improvement.
- Unaligned copies, from about 40% improvement.

cortex-a15
- Most small copies exhibit a 100% improvement, a few copies only
  improve by 20%.
- Copies >= 1024, about 150% improvement.
- Unaligned copies, about 100% improvement.

krait
- Most small copies vary widely, but on average 20% improvement, then
  the performance gets better, hitting about a 100% improvement when
  copying 64 bytes of data.
- Copies >= 1024, about 100% improvement.
- When copying MBs of data, about 50% improvement.
- Unaligned copies, about 90% improvement.

As strcat destination strings get larger in size:

cortex-a9
- about 40% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.

cortex-a15
- about 200% improvement for small dst strings (>=32).
- about 250% improvement for dst strings >= 1024.

krait
- about 25% improvement for small dst strings (>=32).
- about 100% improvement for dst strings >=1024.

Change-Id: Ifd091ebdbce70fe35a7c5d8f71d5914255f3af35
2013-08-02 10:31:51 -07:00
Christopher Ferris
0aa9b52efa Add new optimized strlen for arm.
This optimized version is primarily targeted at cortex-a15.

Tested on all nexus devices using the system/extras/libc_test strlen test.
Tested alignments from 1 to 32 that are powers of 2.
Tested that strlen does not cross page boundaries at all alignments.

Speed improvements listed below:

cortex-a15
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~250% improvement.

cortex-a9
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~85% improvement.

krait
- Sizes >= 32 bytes, ~95% improvement.
- Sizes >= 1024 bytes, ~160% improvement.

Merge from internal master.

(cherry-picked from 2fc0717977)

Change-Id: I1ceceb4e745fd68e9d946f96d1d42e0cdaff6ccf
2013-07-16 16:47:37 -07:00
Christopher Ferris
2fc0717977 Add new optimized strlen for arm.
This optimized version is primarily targeted at cortex-a15.

Tested on all nexus devices using the system/extras/libc_test strlen test.
Tested alignments from 1 to 32 that are powers of 2.
Tested that strlen does not cross page boundaries at all alignments.

Speed improvements listed below:

cortex-a15
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~250% improvement.

cortex-a9
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~85% improvement.

krait
- Sizes >= 32 bytes, ~95% improvement.
- Sizes >= 1024 bytes, ~160% improvement.

Change-Id: I361b1a36ed89ab991f2a8f0abbf0d7416d39c8f5
2013-07-15 12:37:51 -07:00
Christopher Ferris
796cbe249b Rewrite memset for cortexa15 to use strd.
Merge from internal master.

(cherry-picked from commit 7ffad9c120)

Change-Id: Ia67f2a545399f4fa37b63d5634a3565e4f5482f9
2013-04-12 10:58:25 -07:00
Christopher Ferris
bf0d1ad72b Add missing branch in memcpy.S dst aligned case.
Merge from internal master.

(cherry-picked from commit 6ffaa931c3)

Change-Id: Ifdcf01fd122866cf0d4c5b5f7a997803561d7889
2013-04-10 17:21:29 -07:00