142 commits

Author SHA1 Message Date
Elliott Hughes
5eccb9646d Fix aarch64 futex assembly routines.
Also make the other architectures more similar to one another,
use NULL instead of 0 in calling code, and remove an unused #define.

Change-Id: I52b874afb6a351c802f201a0625e484df6d093bb
2013-12-20 16:58:06 -08:00
Ben Cheng
e3fb66dd01 Add __popcountsi2 to the function compat list.
This is needed if we use Clang to compile Bionic, which won't include
__popcountsi2 anymore as Clang generates inline instructions. However
prebuilt binary blobs still depend on libc.so to resolve __popcountsi2.

Change-Id: I9001a3884c4be250c0ceebcd79922783fae1a0b7
2013-12-19 16:26:40 -08:00
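
As a rough illustration of what the `__popcountsi2` compat entry above has to provide, the following sketch counts the set bits in a 32-bit value. The names `sketch_popcount32` and `builtin_popcount32` are hypothetical and are not bionic or libgcc source; `__builtin_popcount` is the GCC/Clang builtin, which the compiler may expand inline or lower to a library call depending on the target.

```c
#include <stdint.h>

/* Hypothetical helper (not bionic or libgcc source): counts set bits
 * by repeatedly clearing the lowest set bit. */
int sketch_popcount32(uint32_t x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;  /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Same result via the GCC/Clang builtin. Whether this becomes inline
 * instructions or a call into a support library is up to the compiler. */
int builtin_popcount32(uint32_t x) {
    return __builtin_popcount(x);
}
```

A prebuilt binary that was compiled to call the library routine still needs the symbol to resolve at load time, which is why a compat list entry matters even when newly compiled code no longer references it.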
Elliott Hughes
c54ca40aef Clean up some ARMv4/ARMv5 cruft.
Change-Id: I29e836fea4b53901e29f96c6888869c35f6726be
2013-12-13 14:02:30 -08:00
Christopher Ferris
ed45970ac5 Add cfi directives to all arm assembly.
Since the ENTRY/END macros now have .cfi_startproc/.cfi_endproc, most of the
custom arm assembly has no unwind information. Adding the proper cfi directives
for these and removing the arm directives.

Update the gensyscalls.py script to add these cfi directives for the generated
assembly. Also fix the references to non-uapi headers to the proper uapi
header.

In addition, remove the kill.S, tkill.S, tgkill.S for arm since they are not
needed at all. The unwinder (libunwind) is able to properly unwind using the
normal abort.

After this change, I can unwind through the system calls again.

Bug: 11559337
Bug: 11825869
Bug: 11321283

Change-Id: I18b48089ef2d000a67913ce6febc6544bbe934a3
2013-12-02 19:13:12 -08:00
Elliott Hughes
36d6188f8c Clean up forking and cloning.
The kernel now maintains the pthread_internal_t::tid field for us,
and __clone was only used in one place so let's inline it so we don't
have to leave such a dangerous function lying around. Also rename
files to match their content and remove some useless #includes.

Change-Id: I24299fb4a940e394de75f864ee36fdabbd9438f9
2013-11-19 14:08:54 -08:00
Elliott Hughes
70b24b1cc2 Switch pthread_create over to __bionic_clone.
Bug: 8206355
Bug: 11693195
Change-Id: I04aadbc36c87e1b7e33324b9a930a1e441fbfed6
2013-11-15 14:41:19 -08:00
Elliott Hughes
ed74484dcb Stop using the non-uapi <linux/err.h> header file.
We only need it for MAX_ERRNO, and it's time we had somewhere to put
the little assembler utility macros we've been putting off writing.

Change-Id: I9354d2e0dc47c689296a34b5b229fc9ba75f1a83
2013-11-07 10:31:05 -08:00
Elliott Hughes
bf425680e4 Let the compiler worry about implementing ffs(3).
It does at least as good a job as our old hand-written assembly anyway.

Change-Id: If7c4a1ac508bace0b71ee7b67808caa6eabf11d2
2013-10-24 16:29:40 -07:00
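
A minimal sketch of the idea in the ffs(3) commit above, assuming a GCC or Clang toolchain: the function body simply defers to the `__builtin_ffs` builtin. The names `sketch_ffs` and `reference_ffs` are hypothetical; the loop version is only there to show the semantics (1-based index of the least significant set bit, or 0 for an input of 0).

```c
/* Hypothetical wrapper: let the compiler pick the instruction sequence. */
int sketch_ffs(int x) {
    return __builtin_ffs(x);
}

/* Portable reference with the same semantics, for comparison only. */
int reference_ffs(int x) {
    unsigned int u = (unsigned int)x;
    if (u == 0) {
        return 0;               /* no bits set */
    }
    int pos = 1;                /* bit positions are 1-based */
    while ((u & 1u) == 0) {
        u >>= 1;
        ++pos;
    }
    return pos;
}
```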
Serban Constantinescu
7f70c9b64e AArch64: Fix uses of stack size for 32/64bit libc builds
This patch fixes stack size uses to size_t.

Change-Id: I0671c85ddb1c1aceaf9440a7c73c21fe528653fa
Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
2013-10-22 12:01:29 -07:00
Elliott Hughes
eb847bc866 Fix x86_64 build, clean up intermediate libraries.
The x86_64 build was failing because clone.S had a call to __thread_entry which
was being added to a different intermediate .a on the way to making libc.so,
and the linker couldn't guarantee statically that such a relocation would be
possible.

  ld: error: out/target/product/generic_x86_64/obj/STATIC_LIBRARIES/libc_common_intermediates/libc_common.a(clone.o): requires dynamic R_X86_64_PC32 reloc against '__thread_entry' which may overflow at runtime; recompile with -fPIC

This patch addresses that by ensuring that the caller and callee end up in the
same intermediate .a. While I'm here, I've tried to clean up some of the mess
that led to this situation too. In particular, this removes libc/private/ from
the default include path (except for the DNS code), and splits out the DNS
code into its own library (since it's a weird special case of upstream NetBSD
code that's diverged so heavily it's unlikely ever to get back in sync).

There's more cleanup of the DNS situation possible, but this is definitely a
step in the right direction, and it's more than enough to get x86_64 building
cleanly.

Change-Id: I00425a7245b7a2573df16cc38798187d0729e7c4
2013-10-09 16:00:17 -07:00
Elliott Hughes
c4c6e192ac pthread_exit should call __NR_exit with status 0.
We shouldn't have been passing the bottom 32 bits of the address used
for pthread_join to the kernel.

Change-Id: I487e5002d60c27adba51173719213abbee0f183f
2013-10-08 14:48:05 -07:00
Elliott Hughes
141029327c Merge "Move common arch-* code to arch-common directory" 2013-10-03 23:17:58 +00:00
Christopher Ferris
fc4d70fe54 Remove dead files.
memcpy.a15.S/strcmp.a15.S files were submitted by ARM for use as the basis
for the memcpy/strcmp implementations in cortex-a15.

memset.S was moved into the generic directory.

NOTE: memcpy.a9.S was submitted by Linaro to be the basis for the memcpy
for cortex-a9/cortex-a15 but has not been incorporated yet.

Bug: 10971279

Merge from internal master.

(cherry-picked from 48fc3e8b9f)

Change-Id: I8f9297578990d517f004e4e8840e2b2cbd5a47d8
2013-10-03 12:35:56 -07:00
Pavel Chupin
b49c17c2bf Move common arch-* code to arch-common directory
Will be helpful on adding x86_64

Change-Id: I96cf6fc7912c02f289c75f07ae0079c32d69173f
Signed-off-by: Pavel Chupin <pavel.v.chupin@intel.com>
2013-10-03 11:14:33 +04:00
Nick Kralevich
bdbdbb8319 Delete CAVEATS / fix spelling.
Change-Id: I0ed504271b7c2e4434d0d5f53bc10335c8cf7b5b
2013-08-27 17:05:19 -07:00
Christopher Ferris
4e24dcc8d8 Optimize strcat/strcpy, small tweaks to strlen. DO NOT MERGE
Create one version of strcat/strcpy/strlen for cortex-a15/krait and another
version for cortex-a9.

Tested with the libc_test strcat/strcpy/strlen tests.
Including new tests that verify that the src for strcat/strcpy do not
overread across page boundaries.

NOTE: The handling of unaligned strcpy (same code in strcat) could probably
be optimized further such that the src is read 64 bits at a time instead of
the partial reads occurring now.

strlen improves slightly since it was recently optimized.

Performance improvements for strcpy and strcat (using an empty dest string):

cortex-a9
- Small copies vary from about 5% to 20% as the size gets above 10 bytes.
- Copies >= 1024, about a 60% improvement.
- Unaligned copies, from about 40% improvement.

cortex-a15
- Most small copies exhibit a 100% improvement, a few copies only
  improve by 20%.
- Copies >= 1024, about 150% improvement.
- Unaligned copies, about 100% improvement.

krait
- Small copies vary widely, but average about a 20% improvement; performance
  then improves, reaching about a 100% improvement when copying 64 bytes
  of data.
- Copies >= 1024, about 100% improvement.
- When copying MBs of data, about 50% improvement.
- Unaligned copies, about 90% improvement.

As strcat destination strings get larger in size:

cortex-a9
- about 40% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.

cortex-a15
- about 200% improvement for small dst strings (>=32).
- about 250% improvement for dst strings >= 1024.

krait
- about 25% improvement for small dst strings (>=32).
- about 100% improvement for dst strings >=1024.

Merge from internal master.

(cherry-picked from d119b7b6f4)

Change-Id: I296463b251ef9fab004ee4dded2793feca5b547a
2013-08-08 11:13:46 -07:00
Ben Cheng
772b797b7b Update the comments to reflect the current status.
Change-Id: I3a6348b568230fe8b21d121e5b8d30561a9703c2
2013-08-02 15:53:18 -07:00
synergydev
efddf44c8e libgcc_compat: Introduce __aeabi_lasr for cortex-a9 and higher
This is needed when passing -mcpu=cortex-a9 or higher on a modern
toolchain for prebuilt library compatibility

Change-Id: I73eb2393377914ae26216a8c2828ad973d1c1225
2013-07-29 16:55:08 -07:00
Christopher Ferris
0aa9b52efa Add new optimized strlen for arm.
This optimized version is primarily targeted at cortex-a15.

Tested on all nexus devices using the system/extras/libc_test strlen test.
Tested alignments from 1 to 32 that are powers of 2.
Tested that strlen does not cross page boundaries at all alignments.

Speed improvements listed below:

cortex-a15
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~250% improvement.

cortex-a9
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~85% improvement.

krait
- Sizes >= 32 bytes, ~95% improvement.
- Sizes >= 1024 bytes, ~160% improvement.

Merge from internal master.

(cherry-picked from 2fc0717977)

Change-Id: I1ceceb4e745fd68e9d946f96d1d42e0cdaff6ccf
2013-07-16 16:47:37 -07:00
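
The page-boundary check mentioned in the strlen commit above can be approximated as follows. This is a sketch under assumptions, not the libc_test code: it maps two pages, makes the second one inaccessible, and places a string so that its terminator is the last byte of the first page. A strlen that reads past the terminator into the guard page would fault. The function name `sketch_strlen_at_page_end` is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if strlen reports len, 1 on a wrong length, -1 on setup failure.
 * An overread into the guard page would crash with SIGSEGV instead. */
int sketch_strlen_at_page_end(size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len >= (size_t)page) {
        return -1;
    }
    size_t psz = (size_t)page;
    char *map = mmap(NULL, 2 * psz, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) {
        return -1;
    }
    /* Second page becomes the guard page. */
    if (mprotect(map + psz, psz, PROT_NONE) != 0) {
        munmap(map, 2 * psz);
        return -1;
    }
    /* The terminator lands on the final byte of the readable page. */
    char *s = map + psz - (len + 1);
    memset(s, 'x', len);
    s[len] = '\0';
    size_t got = strlen(s);
    munmap(map, 2 * psz);
    return got == len ? 0 : 1;
}
```

Varying `len` also varies the start alignment of the string, which loosely mirrors the "all alignments" wording in the commit message.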
Elliott Hughes
da4a3e6515 EABI syscall cleanup.
We cleaned up the auto-generated ones a while back to not touch
the stack unnecessarily if they have <= 4 arguments. This patch
cleans up some hand-crafted ones.

Also improve comments in clone.S.

Change-Id: I8850bf98f2b26829385315304472a760e6880ed8
2013-07-16 11:52:24 -07:00
Will Newton
2753e12af5 libc/arch-arm/bionic/memcpy.a9.S: memcpy from cortex-strings.
This memcpy code uses NEON/VFP to achieve very good performance
on ARMv7-A processors. It is specifically tuned for A15 but should
provide good performance on A9 also. It is equivalent to the code
in cortex-strings rev 116.

This patch is a follow-up to the existing gerrit change:

I7f6f77995f3ca903ad9c66d14261441667a2a935

This version includes a tweak for performance on misaligned
buffers and splits the header comment into license and
documentation sections.

Change-Id: Ibd2e23c8d8e01357ba0247be1d05192de3ceba69
Signed-off-by: Will Newton <will.newton@linaro.org>
2013-07-03 10:20:43 -07:00
Will Newton
b61103dff4 libc/arch-arm/bionic/memcpy.a9.S: memcpy from cortex-strings.
This memcpy code uses NEON/VFP to achieve very good performance
on ARMv7-A processors. It is specifically tuned for A15 but should
provide good performance on A9 also. It is equivalent to the code
in cortex-strings rev 116.

This patch is a follow-up to the existing gerrit change:

I7f6f77995f3ca903ad9c66d14261441667a2a935

But this version includes a tweak for performance on misaligned
buffers.

Change-Id: I285abac0068f8ae29a1cbf7862ea8590aadaf0a7
Signed-off-by: Will Newton <will.newton@linaro.org>
2013-07-01 11:15:27 +01:00
Ben Cheng
7e6ce1a3c5 Fix abort(3) to raise SIGABRT rather than causing SIGSEGV.
tgkill() needs the .save stack unwinding directive to get the complete
stack trace.

BUG: https://code.google.com/p/android/issues/detail?id=16672

Change-Id: Ifb447dca2147a592c48baf32769dfc175d8aea72
2013-06-10 17:17:46 -07:00
Ben Cheng
a123b5d319 Use bl instead of blx to support interworking properly.
(cherry picked from commit 9e1905794b in master)

Change-Id: I9b8c35ea9e201e00f84315f9f105013c23c94d85
2013-05-31 14:39:23 -07:00
Andrew Hsieh
e8f46e8edd Remove redundant space within square brackets
The new "as" in binutils-2.23 (with gcc4.8) is more picky:
it expects register right after [

Change-Id: I876124841582070ab2083ffafe38bc333b5812d0
2013-04-25 15:05:03 +08:00
Elliott Hughes
8794ece296 Replace unnecessary ARM uses of <sys/linux-syscalls.h> with <asm/unistd.h>.
For some reason, socketcalls.c was only being compiled for ARM, where
it makes no sense. For x86 we generate stubs for the socket functions
that use __NR_socketcall directly.

Change-Id: I84181e6183fae2314ae3ed862276eba82ad21e8e
2013-03-21 23:07:11 -07:00
Elliott Hughes
5c2772f59d The SYS_ constants should cover all __NR_ values.
<sys/linux-syscalls.h> only contains constants for the syscalls
we're generating stubs for. We want all the syscalls available
on the architecture in question.

Keep using <sys/linux-syscalls.h> on ARM for now because the
__NR_ARM_set_tls and __NR_ARM_cacheflush values aren't in <asm/unistd.h>.

Change-Id: I66683950d87d9b18d6107d0acc0ed238a4496f44
2013-03-21 22:26:20 -07:00
Elliott Hughes
8f2a5a0b40 Clean up internal libc logging.
We only need one logging API, and I prefer the one that does no
allocation and is thus safe to use in any context.

Also use O_CLOEXEC when opening the /dev/log files.

Move everything logging-related into one header file.

Change-Id: Ic1e3ea8e9b910dc29df351bff6c0aa4db26fbb58
2013-03-15 16:12:58 -07:00
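
The O_CLOEXEC point in the logging commit above can be sketched like this. The helper names are hypothetical and the path is a parameter rather than a specific /dev/log file, so the snippet makes no claim about which files bionic opens or how.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes O_CLOEXEC on strict toolchains */
#endif
#include <fcntl.h>

/* Hypothetical helper: open a log device for writing with close-on-exec
 * set atomically, so the descriptor does not leak across exec(). */
int sketch_open_log(const char *path) {
    return open(path, O_WRONLY | O_CLOEXEC);
}

/* Returns nonzero if the descriptor has FD_CLOEXEC set. */
int sketch_is_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

Passing the flag to `open` avoids the window that exists when FD_CLOEXEC is set afterwards with a separate `fcntl` call, during which another thread could fork and exec.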
Elliott Hughes
9aceab5015 Use the kernel's MAX_ERRNO in the syscall stubs.
Bug: http://code.google.com/p/android/issues/detail?id=53104
Change-Id: Iaabf7025b153e96dc5eca231a33a32d4cb7d8116
2013-03-12 17:43:58 -07:00
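
For readers unfamiliar with the convention behind the MAX_ERRNO commit above, here is a C sketch of what a syscall stub does with the raw kernel return value. The real stubs are assembly; `sketch_syscall_result` and `SKETCH_MAX_ERRNO` are hypothetical names, and 4095 is the value of the kernel's MAX_ERRNO.

```c
#include <errno.h>

#define SKETCH_MAX_ERRNO 4095   /* mirrors the kernel's MAX_ERRNO */

/* Raw results in [-MAX_ERRNO, -1] are error codes: store the positive
 * value in errno and return -1. Anything else is a valid result, even
 * if it looks negative when treated as a signed number. */
long sketch_syscall_result(long raw) {
    if ((unsigned long)raw >= (unsigned long)-SKETCH_MAX_ERRNO) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

The single unsigned comparison covers the whole error range, which is why values just below it (for example a high address returned by a mapping call) pass through untouched.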
Ben Cheng
14283004f5 Add stack unwinding directives to memcpy.
Also include some Android specific header files.

Change-Id: Idbcbd43458ba945ca8c61bfbc04ea15fc0ae4e00
2013-03-01 14:56:04 -08:00
Greta Yorsh
eb149e954e Adding strcmp tuned for Cortex-A15.
The attached patch provides a new implementation of strcmp for ARM,
using LDRD instead of LDR whenever possible.

For older architectures that do not support LDRD, this implementation
uses the same algorithm as before.

Testing and benchmarking:
* Validation: successfully passes a test that compares different strings
of length 1-128 and offsets 0-8 from a word boundary. Checked on
qemu/A15/A9, ARM/Thumb mode, Big/Little Endian.
* Integration with gcc: no regression on qemu for arm-none-eabi --with-cpu
a15/a9 --with-mode arm/thumb.

Change-Id: I9e230e1b99dbdc9119b69ee858a89038c516a4ea
Signed-off-by: Vassilis Laganakos <vasileios.laganakos@arm.com>
2013-03-01 10:41:01 +00:00
Greta Yorsh
5b349fc22e Adding memcpy tuned for Cortex-A15.
The strategy for large block sizes is LDRD and STRD with offset addressing,
where the main loop copies 64 bytes in every iteration, (i.e., 8 calls to
LDRD and STRD pairs), interleaving load and stores (i.e., the pairs of LDRD
and STRD of the same data are consecutive instructions), and the writeback
of an updated address is a separate instruction, which allows us to write
back the accumulated update once per iteration.

This strategy is implemented in memcpy.S. In some configurations, a plain
version of memcpy (included from memcpy-stub.c) is used instead of the
optimized one.

Validation:
* Correctness: checked memcpy using a test harness for block sizes
ranging between 1 to 128, and source and destination buffers alignment
ranging in { 0,1,2,3,4,8,12 } bytes each.
* Performance: benchmarking on Cortex-A15 FPGA indicates that this strategy
is better for A15 than the strategy used by glibc and even slightly better
than using NEON. Benchmarking on Cortex-A9 bare metal and Linux shows
that the proposed strategy is reasonable: not as fast as the version of
memcpy from glibc (which is the best open source strategy for A9), but
comparable with csl and bionic.
* Integration with GCC: no regression for arm-none-eabi --with-cpu
cortex-a15 and cortex-a9.

Change-Id: Ied56354d8992c62ae3e02d582a2bd55585d814b9
Signed-off-by: Vassilis Laganakos <vasileios.laganakos@arm.com>
2013-03-01 10:40:50 +00:00
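
The large-block strategy described in the memcpy commit above can be loosely modeled in C. This is only a sketch of the loop shape, not the assembly in memcpy.S: each 8-byte `memcpy` pair stands in for one LDRD/STRD pair, offsets are fixed within the iteration, and the pointers are advanced once per 64 bytes. The name `sketch_copy64` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void *sketch_copy64(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 64) {
        /* Eight load/store pairs per iteration, load immediately
         * followed by the store of the same data. */
        for (size_t off = 0; off < 64; off += 8) {
            uint64_t w;
            memcpy(&w, s + off, sizeof w);   /* models LDRD */
            memcpy(d + off, &w, sizeof w);   /* models STRD */
        }
        /* One accumulated address update per iteration. */
        d += 64;
        s += 64;
        n -= 64;
    }
    /* Tail: fewer than 64 bytes left, copied bytewise. */
    while (n > 0) {
        *d++ = *s++;
        --n;
    }
    return dst;
}
```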
Elliott Hughes
40eabe24e4 Fix the pthread_setname_np test.
Fix the pthread_setname_np test to take into account that emulator kernels are
so old that they don't support setting the name of other threads.

The CLONE_DETACHED thread has been obsolete since 2.5 kernels.

Rename kernel_id to tid.

Fix the signature of __pthread_clone.

Clean up the clone and pthread_setname_np implementations slightly.

Change-Id: I16c2ff8845b67530544bbda9aa6618058603066d
2013-02-15 12:08:59 -08:00
Elliott Hughes
6719500dbd Add a bunch more missing ENDs to assembler routines.
This isn't everything; I've missed out those x86 files that are

Change-Id: Idb7bb1a68796d6c0b70ea2b5c3300e49da6c62d2
2013-02-13 15:12:32 -08:00
Elliott Hughes
73964c592c Everyone has CLZ.
Even armv5 had CLZ.

Change-Id: I51bc8d1166d09940fd0d3f4c7717edf26977082c
2013-02-13 14:40:48 -08:00
Elliott Hughes
9f878c2fca Really set errno if __pthread_clone fails.
If r0 == 0, we're the child. If r0 > 0, we're the parent.
Otherwise set errno.

The __bionic_clone code I copy & pasted was wrong. This patch
fixes both.

Bug: 3461078
Change-Id: Ibb7d6cc7e54e666841f2f0dc59a141a0b31982e4
2013-02-12 16:07:06 -08:00
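
The three-way branch spelled out in the commit above ("If r0 == 0, we're the child. If r0 > 0, we're the parent. Otherwise set errno.") looks roughly like this in C. The real code is ARM assembly; the enum and function names here are hypothetical.

```c
#include <errno.h>

enum sketch_clone_role { SKETCH_CHILD, SKETCH_PARENT, SKETCH_FAILED };

/* r0 is the raw value the clone syscall left in the return register. */
enum sketch_clone_role sketch_classify_clone(long r0) {
    if (r0 == 0) {
        return SKETCH_CHILD;    /* running in the new thread */
    }
    if (r0 > 0) {
        return SKETCH_PARENT;   /* r0 is the child's tid */
    }
    errno = (int)-r0;           /* negative value is -errno */
    return SKETCH_FAILED;
}
```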
Elliott Hughes
d7a3a403c1 Use ENTRY/END in ARM __get_sp.
Change-Id: If2f159b266f5fa4ad9d188a17d4cd318b605e446
2013-02-11 16:58:34 -08:00
Elliott Hughes
5e3fc43dde Fix __pthread_clone on ARM to set errno on failure.
MIPS and x86 appear to have been correct already.

(Also fix unit tests that ASSERT_EQ with errno so that the
arguments are in the retarded junit order.)

Bug: 3461078
Change-Id: I2418ea98927b56e15b4ba9cfec97f5e7094c6291
2013-02-11 16:39:10 -08:00
Elliott Hughes
f94fd3ccc6 Clean up ARM assembler files to use ENTRY/END.
We also don't need legacy syscall support (non-"swi 0").

Change-Id: Id1012e8ca18bf13f3f4e42200f39ba0e2e632cbf
2013-02-11 15:36:59 -08:00
Elliott Hughes
646e058136 Fix x86 build, remove void* arithmetic.
Change-Id: Idc7f14af2e094ac33de315e808176237af063bb8
2013-02-07 12:16:10 -08:00
Elliott Hughes
42b2c6a5ee Clean up the argc/argv/envp/auxv handling.
There's now only one place where we deal with this stuff, it only needs to
be parsed once by the dynamic linker (rather than by each recipient), and it's
now easier for us to get hold of auxv data early on.

Change-Id: I6314224257c736547aac2e2a650e66f2ea53bef5
2013-02-07 11:44:21 -08:00
Elliott Hughes
1e980b6bc8 Fix the duplication in the debugging code.
We had two copies of the backtrace code, and two copies of the
libcorkscrew /proc/pid/maps code. This patch gets us down to one.

We also had hacks so we could log in the malloc debugging code.
This patch pulls the non-allocating "printf" code out of the
dynamic linker so everyone can share.

This patch also makes the leak diagnostics easier to read, and
makes it possible to paste them directly into the 'stack' tool (by
using relative PCs).

This patch also fixes the stdio standard stream leak that was
causing a leak warning every time tf_daemon ran.

Bug: 7291287
Change-Id: I66e4083ac2c5606c8d2737cb45c8ac8a32c7cfe8
2013-01-18 22:20:06 -08:00
Ben Cheng
35f5385aa5 Add __aeabi_idiv to the dummy reference list.
If the platform code is compiled with -mcpu=cortex-a15, then without this
change prebuilt libraries built against -march=armv7 cannot resolve the
dependency on __aeabi_idiv (provided by libgcc.a).

Bug: 7961327

cherry-picked from internal master.

Change-Id: I8fe59a98eb53d641518b882523c1d6a724fb7e55
2013-01-14 15:33:40 -08:00
Henrik Smiding
884e4f839b Add optimized version of memset for Cortex A9
Adds new code to function memset, optimized for Cortex A9.

Copyright (C) ST-Ericsson SA 2010

Added neon implementation

Author: Henrik Smiding henrik.smiding@stericsson.com for ST-Ericsson.

Change-Id: Id3c87767953439269040e15bd30a27aba709aef6
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
2012-11-09 15:05:32 -08:00
Henrik Smiding
6d0bcdc832 Add optimized version of memcpy for Cortex A9
Adds new code to memcpy function, optimized for Cortex A9.
Adds new ARM-only loop, for operations where source and
destination are aligned.

Copyright (C) ST-Ericsson SA 2010

Modified neon implementation to fit Cortex A9 cache line size,
for those running with a 32-byte L2 cache line size.
Also split the implementation into aligned and unaligned access,
for those that allow unaligned memory access with Neon.
For totally aligned operations, arm-only code is used.

Change-Id: I95ebf6164cd6486b12a7e3e98e369db21e7e18d2
Author: Henrik Smiding henrik.smiding@stericsson.com for ST-Ericsson.
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
2012-11-08 18:02:14 -08:00
Elliott Hughes
c213291515 Merge "Add optimized version of memcmp for Cortex A9" 2012-11-08 17:48:19 -08:00
Andrew Hsieh
048569be54 Rename __dso_handle_so.c to __dso_handle_so.h
Also change libc/arch-arm/bionic/crtbegin_so.c to include it
as a header.

Change-Id: Ib91b0b8caf5c8b936425aa8a4fc1a229b2b27929
2012-09-07 12:49:41 +08:00
Elliott Hughes
b2c5bd543d Merge "ARM: warn about atexit() calls from legacy shared libraries" 2012-09-05 10:18:43 -07:00
Elliott Hughes
26f2e4a163 Merge "ARM: make CRT_LEGACY_WORKAROUND work as intended" 2012-09-05 09:43:35 -07:00
Nick Kralevich
069c64cdf2 Merge "ARM: make sure __on_dlclose() actually gets called" 2012-08-28 13:04:22 -07:00