Our FORTIFY _chk functions' implementations were very repetitive and verbose
but not very helpful. We'd also screwed up and put the SSIZE_MAX checks where
they would never fire unless you actually had a buffer as large as half your
address space, which probably doesn't happen very often.
Factor out the duplication and take the opportunity to actually show details
like how big the overrun buffer was, or by how much it was overrun.
Also remove the obsolete FORTIFY event logging.
Also remove the unused __libc_fatal_no_abort.
This change doesn't improve the diagnostics from the optimized assembler
implementations.
Change-Id: I176a90701395404d50975b547a00bd2c654e1252
On the path that only uses r0 in both the krait and cortex-a9
memset, remove the push and use r3 instead.
In addition, for cortex-a9, remove the artificial function since
it's not needed since dwarf unwinding is now supported on arm.
Change-Id: Ia4ed1cc435b03627a7193215e76c8ea3335f949a
When there is arm assembler of this format:
ldmxx sp!, {..., lr} or pop {..., lr}
bx lr
It can be replaced with:
ldmxx sp!, {..., pc} or pop {..., pc}
Change-Id: Ic27048c52f90ac4360ad525daf0361a830dc22a3
All arch-arm and arch-arm64 .S files were compiled
by gcc with and without this patch. The output object files
were identical. When compiled with llvm and this patch,
the output files were also identical to gcc's output.
BUG: 18061004
Change-Id: I458914d512ddf5496e4eb3d288bf032cd526d32b
Remove the old arm directives.
Change the non-local labels to .L labels.
Add cfi directives to strcpy.S.
Change-Id: I9bafee1ffe5d85c92d07cfa8a85338cef9759562
Our <machine/asm.h> files were modified from upstream, to the extent
that no architecture was actually using the upstream ENTRY or END macros,
assuming that architecture even had such a macro upstream. This patch moves
everyone to the same macros, with just a few tweaks remaining in the
<machine/asm.h> files, which no one should now use directly.
I've removed most of the unused cruft from the <machine/asm.h> files, though
there's still rather a lot in the mips/mips64 ones.
Bug: 12229603
Change-Id: I2fff287dc571ac1087abe9070362fb9420d85d6d
I originally modified the krait mainloop prefetch from cacheline * 8 to * 2.
This causes a perf degradation for copies bigger than will fit in the cache.
Fixing this back to the original * 8. I tried other multiples, but * 8 is th
sweet spot on krait.
Bug: 11221806
Change-Id: I1f75fad6440f7417e664795a6e7b5616f6a29c45
The x86_64 build was failing because clone.S had a call to __thread_entry which
was being added to a different intermediate .a on the way to making libc.so,
and the linker couldn't guarantee statically that such a relocation would be
possible.
ld: error: out/target/product/generic_x86_64/obj/STATIC_LIBRARIES/libc_common_intermediates/libc_common.a(clone.o): requires dynamic R_X86_64_PC32 reloc against '__thread_entry' which may overflow at runtime; recompile with -fPIC
This patch addresses that by ensuring that the caller and callee end up in the
same intermediate .a. While I'm here, I've tried to clean up some of the mess
that led to this situation too. In particular, this removes libc/private/ from
the default include path (except for the DNS code), and splits out the DNS
code into its own library (since it's a weird special case of upstream NetBSD
code that's diverged so heavily it's unlikely ever to get back in sync).
There's more cleanup of the DNS situation possible, but this is definitely a
step in the right direction, and it's more than enough to get x86_64 building
cleanly.
Change-Id: I00425a7245b7a2573df16cc38798187d0729e7c4
The check for __ARM_FEATURE_DSP being defined is pointless since it
is always defined.
Bug: 10971279
Merge from internal master.
(cherry-picked from d2642fa70c)
Change-Id: If23ab3271f4da0c38cd531ffdc9a7e5eed6ec5dc
I accidentally did a signed comparison of the size_t values passed in
for three of the _chk functions. Changing them to unsigned compares.
Add three new tests to verify this failure is fixed.
Bug: 10691831
Merge from internal master.
(cherry-picked from 883ef2499c)
Change-Id: Id9a96b549435f5d9b61dc132cf1082e0e30889f5
The backtrace when a fortify check failed was not correct. This change
adds all of the necessary directives to get a correct backtrace.
Fix the strcmp directives and change all labels to local labels.
Testing:
- Verify that the runtime can decode the stack for __memcpy_chk, __memset_chk,
__strcpy_chk, __strcat_chk fortify failures.
- Verify that gdb can decode the stack properly when hitting a fortify check.
- Verify that the runtime can decode the stack for a seg fault for all of the
_chk functions and for memcpy/memset.
- Verify that gdb can decode the stack for a seg fault for all of the _chk
functions and for memcpy/memset.
- Verify that the runtime can decode the stack for a seg fault for strcmp.
- Verify that gdb can decode the stack for a seg fault in strcmp.
Bug: 10342460
Bug: 10345269
Merge from internal master.
(cherry-picked from 05332f2ce7)
Change-Id: Ibc919b117cfe72b9ae97e35bd48185477177c5ca
The libcorkscrew stack unwinder does not understand cfi directives,
so add .save directives so that it can function properly.
Also add the directives in to strcmp.S and fix a missing set of
directives in cortex-a9/memcpy_base.S.
Bug: 10345269
Merge from internal master.
(cherry-picked from 5f7ccea3ff)
Change-Id: If48a216203216a643807f5d61906015984987189
I accidentally did a signed comparison of the size_t values passed in
for three of the _chk functions. Changing them to unsigned compares.
Add three new tests to verify this failure is fixed.
Bug: 10691831
Change-Id: Ia831071f7dffd5972a748d888dd506c7cc7ddba3
The backtrace when a fortify check failed was not correct. This change
adds all of the necessary directives to get a correct backtrace.
Fix the strcmp directives and change all labels to local labels.
Testing:
- Verify that the runtime can decode the stack for __memcpy_chk, __memset_chk,
__strcpy_chk, __strcat_chk fortify failures.
- Verify that gdb can decode the stack properly when hitting a fortify check.
- Verify that the runtime can decode the stack for a seg fault for all of the
_chk functions and for memcpy/memset.
- Verify that gdb can decode the stack for a seg fault for all of the _chk
functions and for memcpy/memset.
- Verify that the runtime can decode the stack for a seg fault for strcmp.
- Verify that gdb can decode the stack for a seg fault in strcmp.
Bug: 10342460
Bug: 10345269
Change-Id: I1dedadfee207dce4a285e17a21e8952bbc63786a
The libcorkscrew stack unwinder does not understand cfi directives,
so add .save directives so that it can function properly.
Also add the directives in to strcmp.S and fix a missing set of
directives in cortex-a9/memcpy_base.S.
Bug: 10345269
Change-Id: I043f493e0bb6c45bd3f4906fbe1d9f628815b015
This change pulls the memcpy code out into a new file so that the
__strcpy_chk and __strcat_chk can use it with an include.
The new versions of the two chk functions uses assembly versions
of strlen and memcpy to implement this check. This allows near
parity with the assembly versions of strcpy/strcat. It also means that
as memcpy implementations get faster, so do the chk functions.
Other included changes:
- Change all of the assembly labels to local labels. The other labels
confuse gdb and mess up backtracing.
- Add .cfi_startproc and .cfi_endproc directives so that gdb is not
confused when falling through from one function to another.
- Change all functions to use cfi directives since they are more powerful.
- Move the memcpy_chk fail code outside of the memcpy function definition
so that backtraces work properly.
- Preserve lr before the calls to __fortify_chk_fail so that the backtrace
actually works.
Testing:
- Ran the bionic unit tests. Verified all error messages in logs are set
correctly.
- Ran libc_test, replacing strcpy with __strcpy_chk and replacing
strcat with __strcat_chk.
- Ran the debugger on nexus10, nexus4, and old nexus7. Verified that the
backtrace is correct for all fortify check failures. Also verify that
when falling through from __memcpy_chk to memcpy that the backtrace is
still correct. Also verified the same for __memset_chk and bzero.
Verified the two different paths in the cortex-a9 memset routine that
save variables to the stack still show the backtrace properly.
Bug: 9293744
(cherry-picked from 2be91915dc)
Change-Id: Ia407b74d3287d0b6af0139a90b6eb3bfaebf2155
This change creates assembler versions of __memcpy_chk/__memset_chk
that is implemented in the memcpy/memset assembler code. This change
avoids an extra call to memcpy/memset, instead allowing a simple fall
through to occur from the chk code into the body of the real
implementation.
Testing:
- Ran the libc_test on __memcpy_chk/__memset_chk on all nexus devices.
- Wrote a small test executable that has three calls to __memcpy_chk and
three calls to __memset_chk. First call dest_len is length + 1. Second
call dest_len is length. Third call dest_len is length - 1.
Verified that the first two calls pass, and the third fails. Examined
the logcat output on all nexus devices to verify that the fortify
error message was sent properly.
- I benchmarked the new __memcpy_chk and __memset_chk on all systems. For
__memcpy_chk and large copies, the savings is relatively small (about 1%).
For small copies, the savings is large on cortex-a15/krait devices
(between 5% to 30%).
For cortex-a9 and small copies, the speed up is present, but relatively
small (about 3% to 5%).
For __memset_chk and large copies, the savings is also small (about 1%).
However, all processors show larger speed-ups on small copies (about 30% to
100%).
Bug: 9293744
Merge from internal master.
(cherry-picked from 7c860db074)
Change-Id: I916ad305e4001269460ca6ebd38aaa0be8ac7f52
This change pulls the memcpy code out into a new file so that the
__strcpy_chk and __strcat_chk can use it with an include.
The new versions of the two chk functions uses assembly versions
of strlen and memcpy to implement this check. This allows near
parity with the assembly versions of strcpy/strcat. It also means that
as memcpy implementations get faster, so do the chk functions.
Other included changes:
- Change all of the assembly labels to local labels. The other labels
confuse gdb and mess up backtracing.
- Add .cfi_startproc and .cfi_endproc directives so that gdb is not
confused when falling through from one function to another.
- Change all functions to use cfi directives since they are more powerful.
- Move the memcpy_chk fail code outside of the memcpy function definition
so that backtraces work properly.
- Preserve lr before the calls to __fortify_chk_fail so that the backtrace
actually works.
Testing:
- Ran the bionic unit tests. Verified all error messages in logs are set
correctly.
- Ran libc_test, replacing strcpy with __strcpy_chk and replacing
strcat with __strcat_chk.
- Ran the debugger on nexus10, nexus4, and old nexus7. Verified that the
backtrace is correct for all fortify check failures. Also verify that
when falling through from __memcpy_chk to memcpy that the backtrace is
still correct. Also verified the same for __memset_chk and bzero.
Verified the two different paths in the cortex-a9 memset routine that
save variables to the stack still show the backtrace properly.
Bug: 9293744
Change-Id: Id5aec8c3cb14101d91bd125eaf3770c9c8aa3f57
(cherry picked from commit 2be91915dc)
This change creates assembler versions of __memcpy_chk/__memset_chk
that is implemented in the memcpy/memset assembler code. This change
avoids an extra call to memcpy/memset, instead allowing a simple fall
through to occur from the chk code into the body of the real
implementation.
Testing:
- Ran the libc_test on __memcpy_chk/__memset_chk on all nexus devices.
- Wrote a small test executable that has three calls to __memcpy_chk and
three calls to __memset_chk. First call dest_len is length + 1. Second
call dest_len is length. Third call dest_len is length - 1.
Verified that the first two calls pass, and the third fails. Examined
the logcat output on all nexus devices to verify that the fortify
error message was sent properly.
- I benchmarked the new __memcpy_chk and __memset_chk on all systems. For
__memcpy_chk and large copies, the savings is relatively small (about 1%).
For small copies, the savings is large on cortex-a15/krait devices
(between 5% to 30%).
For cortex-a9 and small copies, the speed up is present, but relatively
small (about 3% to 5%).
For __memset_chk and large copies, the savings is also small (about 1%).
However, all processors show larger speed-ups on small copies (about 30% to
100%).
Bug: 9293744
Change-Id: I8926d59fe2673e36e8a27629e02a7b7059ebbc98
Streamline the memcpy a bit removing some unnecessary instructions.
The biggest speed improvement comes from changing the size of
the preload. On krait, the sweet spot for the preload in the main
loop is twice the L1 cache line size.
In most cases, these small tweaks yield > 1000MB/s speed ups. As
the size of the memcpy approaches about 1MB, the speed improvement
disappears.
Change-Id: Ief79694d65324e2db41bee4707dae19b8c24be62
This uses the new strcmp.a15.S code as the basis for new versions
of strcmp.S.
The cortex-a15 code is the performance optimized version of strcmp.a15.S
taken with only the addition of a few pld instructions.
The cortex-a9 code is the same as the cortex-a15 code except that the
unaligned strcmp code was taken from the original strcmp.S.
The krait code is the same as the cortex-a15 code except that one path
in the unaligned strcmp code was taken from the original strcmp.S code
(the 2 byte overlap case).
The generic code is the original unmodified strmp.S from the bionic
subdirectory.
All three new versions underwent these test cases:
Strings the same, all same size:
- Both pointers double word aligned.
- One pointer double word aligned, one pointer word aligned.
- Both pointers word aligned.
- One pointer double word aligned, one pointer 1 off a word alignment.
- One pointer double word aligned, one pointer 2 off a word alignment.
- One pointer double word aligned, one pointer 3 off a word alignment.
- One pointer word aligned, one pointer 1 off a word alignment.
- One pointer word aligned, one pointer 2 off a word alignment.
- One pointer word aligned, one pointer 3 off a word alignment.
For all cases where it made sense, the two pointers were also tested
swapped.
Different strings, all same size:
- Single difference at double word boundary.
- Single difference at word boudary.
- Single difference at 1 off a word alignment.
- Single difference at 2 off a word alignment.
- Single difference at 3 off a word alignment.
Different sized strings, strings the same until the end:
- Shorter string ends on a double word boundary.
- Shorter string ends on word boundary.
- Shorter string ends at 1 off a word boundary.
- Shorter string ends at 2 off a word boundary.
- Shorter string ends at 3 off a word boundary.
For all different cases, run them through the same pointer alignment
cases when the strings are the same size.
For all cases the two pointers were also tested swapped.
Bug: 8005082
Merge from internal master.
(cherry-picked from commit a9a5870d16)
Change-Id: I4c2b98f8a50804fb98ab67f75e9d660f1315a144
Move arch specific code for arm, mips, x86 into separate
makefiles.
In addition, add different arm cpu versions of memcpy/memset.
Bug: 8005082
Merge from internal master (acdde8c1cf).
Change-Id: I04f3d0715104fab618e1abf7cf8f7eec9bec79df
This uses the new strcmp.a15.S code as the basis for new versions
of strcmp.S.
The cortex-a15 code is the performance optimized version of strcmp.a15.S
taken with only the addition of a few pld instructions.
The cortex-a9 code is the same as the cortex-a15 code except that the
unaligned strcmp code was taken from the original strcmp.S.
The krait code is the same as the cortex-a15 code except that one path
in the unaligned strcmp code was taken from the original strcmp.S code
(the 2 byte overlap case).
The generic code is the original unmodified strmp.S from the bionic
subdirectory.
All three new versions underwent these test cases:
Strings the same, all same size:
- Both pointers double word aligned.
- One pointer double word aligned, one pointer word aligned.
- Both pointers word aligned.
- One pointer double word aligned, one pointer 1 off a word alignment.
- One pointer double word aligned, one pointer 2 off a word alignment.
- One pointer double word aligned, one pointer 3 off a word alignment.
- One pointer word aligned, one pointer 1 off a word alignment.
- One pointer word aligned, one pointer 2 off a word alignment.
- One pointer word aligned, one pointer 3 off a word alignment.
For all cases where it made sense, the two pointers were also tested
swapped.
Different strings, all same size:
- Single difference at double word boundary.
- Single difference at word boudary.
- Single difference at 1 off a word alignment.
- Single difference at 2 off a word alignment.
- Single difference at 3 off a word alignment.
Different sized strings, strings the same until the end:
- Shorter string ends on a double word boundary.
- Shorter string ends on word boundary.
- Shorter string ends at 1 off a word boundary.
- Shorter string ends at 2 off a word boundary.
- Shorter string ends at 3 off a word boundary.
For all different cases, run them through the same pointer alignment
cases when the strings are the same size.
For all cases the two pointers were also tested swapped.
Bug: 8005082
Change-Id: I5f3dc02b48afba2cb9c13332ab45c828ff171a1c
Move arch specific code for arm, mips, x86 into separate
makefiles.
In addition, add different arm cpu versions of memcpy/memset.
Bug: 8005082
Change-Id: I04f3d0715104fab618e1abf7cf8f7eec9bec79df