We're going to modify the __atomic_xxx implementation to provide
full memory barriers, to avoid problems for NDK machine code that
link to these functions.
First step is to remove their usage from our platform code.
We now use inlined versions of the same functions for a slight
performance boost.
+ remove obsolete atomics_x86.c (was never compiled)
NOTE: This improvement was benchmarked on various devices.
Comparing a pthread mutex lock + atomic increment + unlock
we get:
- ARMv7 emulator, running on a 2.4 GHz Xeon:
before: 396 ns after: 288 ns
- x86 emulator in KVM mode on same machine:
before: 27 ns after: 27 ns
- Google Nexus S, in ARMv7 mode (single-core):
before: 82 ns after: 76 ns
- Motorola Xoom, in ARMv7 mode (multi-core):
before: 121 ns after: 120 ns
The code has also been rebuilt in ARMv5TE mode for correctness.
Change-Id: Ic1dc72b173d59b2e7af901dd70d6a72fb2f64b17
This also allows us to optimize the case where we increment an
uncontended semaphore (no need to call futex_wake() then).
Change-Id: Iad48efe8551dc66dc89d3e3f18c001e5a6c1939f
We simply copy the stuff we need from cutils headers.
A future patch will change cutils to include the private <bionic_atomic_inline.h>
Change-Id: Ib6fd9a03bc9e337ce867bd606dc94c2b4438480a
Update ARM atomic ops to use LDREX/STREX. Stripped out #if 0 chunk.
Insert explicit memory barriers in pthread and semaphore code.
For bug 2721865.
Change-Id: I0f153b797753a655702d8be41679273d1d5d6ae7