platform_bionic/libc/arch-arm
Christopher Ferris 21ede92d79 Update to latest cortexa15 memcpy code.
This uses the new code originally submitted as memcpy.a15.S as
the base. However, the old code handled unaligned src/dst better
so that was spliced in. I optimized the original unaligned code by
removing a few unnecessary instructions. I optimized the a15 code by
rewriting the pre and post code. I also modified the main loop to add
a pld so that larger copies would not stall waiting for memory.
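
As a rough illustration of why a preload in the main loop helps larger copies, here is a minimal C sketch. It is not the assembly from this change: `copy_with_prefetch`, `BLOCK`, and `PREFETCH_AHEAD` are hypothetical names and values, and `__builtin_prefetch` is the GCC/Clang builtin, which may be lowered to `pld` on ARM targets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tuning values; the real routine picks its own distances. */
enum { BLOCK = 64, PREFETCH_AHEAD = 256 };

/* Copy n bytes in fixed-size blocks, hinting the cache about data that
 * will be read a few blocks later so the loads do not stall on memory. */
void *copy_with_prefetch(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= BLOCK) {
        /* Only prefetch addresses that are still inside the source buffer. */
        if (n > PREFETCH_AHEAD)
            __builtin_prefetch(s + PREFETCH_AHEAD, 0, 0);
        memcpy(d, s, BLOCK);
        d += BLOCK;
        s += BLOCK;
        n -= BLOCK;
    }
    memcpy(d, s, n); /* post code: remaining tail of fewer than BLOCK bytes */
    return dst;
}
```

The prefetch is only a hint, so the copied bytes are the same with or without it; only the timing of large copies should differ.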

Test cases for the new memcpy:

- Copy every size from 0 to 1024 bytes, using whatever alignment
  is returned by malloc.

For each alignment case described below, the test copied from 0 to 128
bytes:
- Src and dst pointers are both aligned to the same value, starting
  at 1 and going through every power of two up to and including 128.
- Src aligned to double word boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to double word boundary.
- Src aligned to 16 bit boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to 16 byte boundary.
- Src aligned to word boundary, dst aligned to 1 byte from a word
  boundary.
- Src aligned to word boundary, dst aligned to 2 bytes from a word
  boundary.
- Src aligned to word boundary, dst aligned to 3 bytes from a word
  boundary.
- Src aligned to 1 byte from a word boundary, dst aligned to a word
  boundary.
- Src aligned to 2 bytes from a word boundary, dst aligned to a word
  boundary.
- Src aligned to 3 bytes from a word boundary, dst aligned to a word
  boundary.
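
The alignment cases above could be driven by a helper along the following lines. This is a sketch only, not the test code used for this change: `check_alignment_case`, `align_up`, and the fill patterns are hypothetical, and the pointer rounding assumes a platform where pointers convert cleanly to `uintptr_t`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_LEN = 128 };

/* Round p up to the next multiple of align (align must be a power of two). */
static unsigned char *align_up(unsigned char *p, size_t align) {
    uintptr_t v = (uintptr_t)p;
    return (unsigned char *)((v + align - 1) & ~(uintptr_t)(align - 1));
}

/* Place src at src_off bytes past a src_align boundary and dst at dst_off
 * bytes past a dst_align boundary, then copy every length from 0 to MAX_LEN.
 * Returns 0 if all copies match, -1 on mismatch or allocation failure. */
int check_alignment_case(size_t src_align, size_t src_off,
                         size_t dst_align, size_t dst_off) {
    /* Room for rounding up, the extra offset, and the largest copy. */
    unsigned char *src_raw = malloc(src_align + src_off + MAX_LEN);
    unsigned char *dst_raw = malloc(dst_align + dst_off + MAX_LEN);
    if (!src_raw || !dst_raw) {
        free(src_raw);
        free(dst_raw);
        return -1;
    }
    unsigned char *src = align_up(src_raw, src_align) + src_off;
    unsigned char *dst = align_up(dst_raw, dst_align) + dst_off;

    int ok = 1;
    for (size_t len = 0; len <= MAX_LEN && ok; len++) {
        for (size_t i = 0; i < len; i++)
            src[i] = (unsigned char)(i + len); /* pattern varies with len */
        memset(dst, 0xA5, MAX_LEN);
        memcpy(dst, src, len);
        ok = (memcmp(dst, src, len) == 0);
    }
    free(src_raw);
    free(dst_raw);
    return ok ? 0 : -1;
}
```

For example, `check_alignment_case(8, 0, 4, 0)` approximates the double word src, word dst case, and `check_alignment_case(4, 1, 4, 0)` the case with src 1 byte from a word boundary. Note that rounding up to 4 may happen to land on a stricter boundary; forcing an address that is word aligned but not double word aligned would need something like an alignment of 8 with an offset of 4.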

Cases to verify that the unaligned-source code path properly aligns to a
16 bit boundary:
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
  4 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
  8 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
  12 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
  16 + 128 bit boundary.

In all cases, a two-byte fencepost was placed at the end of the
destination to verify that only the requested number of bytes was copied.
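
A fencepost check of this kind might look like the sketch below. The helper name `copy_with_fencepost` and the fence byte value are assumptions made for illustration; only the two-byte fence length comes from the description above.

```c
#include <stdlib.h>
#include <string.h>

enum { FENCE_LEN = 2 };                        /* two-byte fencepost */
static const unsigned char FENCE_BYTE = 0xDE;  /* hypothetical marker value */

/* Copy len bytes from src into a fresh buffer that has a fencepost right
 * after the destination. Returns 0 if the data matches and the fencepost
 * is untouched, -1 otherwise (including allocation failure). */
int copy_with_fencepost(const unsigned char *src, size_t len) {
    unsigned char *dst = malloc(len + FENCE_LEN);
    if (!dst)
        return -1;

    memset(dst + len, FENCE_BYTE, FENCE_LEN);  /* plant the fencepost */
    memcpy(dst, src, len);

    int ok = (memcmp(dst, src, len) == 0);
    for (size_t i = 0; i < FENCE_LEN; i++)
        ok = ok && (dst[len + i] == FENCE_BYTE); /* overrun would clobber it */

    free(dst);
    return ok ? 0 : -1;
}
```

A source buffer that happens to contain the fence value is harmless here, since the fence bytes are compared only at positions past the requested length.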

Bug: 8005082
Change-Id: I700b2fab81941959d301ab1934c18fbd8ee3eee4
2013-03-30 14:32:49 -07:00
..
bionic Replace unnecessary ARM uses of <sys/linux-syscalls.h> with <asm/unistd.h>. 2013-03-21 23:07:11 -07:00
cortex-a7 libc/arm: add cortex-a7 cpu variant 2013-03-23 01:38:22 -07:00
cortex-a9 Create arch specific versions of strcmp. 2013-03-20 14:33:54 -07:00
cortex-a15 Update to latest cortexa15 memcpy code. 2013-03-30 14:32:49 -07:00
generic Create arch specific versions of strcmp. 2013-03-20 14:33:54 -07:00
include/machine Upgrade libm. 2013-02-01 14:51:19 -08:00
krait Create arch specific versions of strcmp. 2013-03-20 14:33:54 -07:00
syscalls Use the correct names for the __ARM_NR_* syscalls. 2013-03-22 13:53:43 -07:00
arm.mk Create arch specific versions of strcmp. 2013-03-20 14:33:54 -07:00
syscalls.mk Expose wait4 as wait4 rather than __wait4. 2013-03-21 16:14:06 -07:00