Commit graph

17 commits

Author SHA1 Message Date
ahs
919fb7f2e0 AVX2 implementation for memset.
This patch adds handwritten AVX2 assembly for 64-bit memset, using
non-temporal stores for very large sizes. It also adds dynamic dispatch
for APIs that have multiple implementations.

The benchmark improvements are convincing for sizes of 512 bytes and up,
and although the slight regression for small sizes is unfortunate, it's
probably small enough to be acceptable.
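
A minimal sketch of the non-temporal store idea, in C intrinsics rather
than the patch's actual handwritten assembly (the function name and the
alignment handling here are illustrative):

  #include <immintrin.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Illustrative AVX2 memset core: for very large buffers, streaming
   * (non-temporal) stores bypass the cache instead of evicting it. */
  static void memset_avx2_nt(void *dst, int c, size_t n) {
      unsigned char *p = dst;
      __m256i v = _mm256_set1_epi8((char)c);
      /* Align to 32 bytes; _mm256_stream_si256 requires it. */
      while (((uintptr_t)p & 31) && n) { *p++ = (unsigned char)c; n--; }
      while (n >= 32) {
          _mm256_stream_si256((__m256i *)p, v);  /* non-temporal store */
          p += 32;
          n -= 32;
      }
      _mm_sfence();  /* order the streaming stores before returning */
      while (n--) *p++ = (unsigned char)c;  /* byte-wise tail */
  }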

Before:

  BM_string_memset/8/0            3.06 ns         3.04 ns    222703428 bytes_per_second=2.45261G/s
  BM_string_memset/16/0           3.50 ns         3.47 ns    202569932 bytes_per_second=4.29686G/s
  BM_string_memset/32/0           3.50 ns         3.48 ns    200064955 bytes_per_second=8.57386G/s
  BM_string_memset/64/0           3.49 ns         3.46 ns    201928186 bytes_per_second=17.2184G/s
  BM_string_memset/512/0          14.8 ns         14.7 ns     47776178 bytes_per_second=32.3887G/s
  BM_string_memset/1024/0         27.3 ns         27.1 ns     25884933 bytes_per_second=35.2515G/s
  BM_string_memset/8192/0          203 ns          201 ns      3476903 bytes_per_second=37.9311G/s
  BM_string_memset/16384/0         402 ns          399 ns      1750471 bytes_per_second=38.2725G/s
  BM_string_memset/32768/0         932 ns          925 ns       755750 bytes_per_second=33.0071G/s
  BM_string_memset/65536/0        2038 ns         2014 ns       347060 bytes_per_second=30.3057G/s
  BM_string_memset/131072/0       4012 ns         3980 ns       175186 bytes_per_second=30.6682G/s

After:

  BM_string_memset/8/0            3.32 ns         3.23 ns    208939089 bytes_per_second=2.3051G/s
  BM_string_memset/16/0           4.07 ns         3.98 ns    173479615 bytes_per_second=3.74822G/s
  BM_string_memset/32/0           4.07 ns         3.95 ns    177208119 bytes_per_second=7.54344G/s
  BM_string_memset/64/0           4.09 ns         4.00 ns    174729144 bytes_per_second=14.8878G/s
  BM_string_memset/512/0          10.7 ns         10.4 ns     65922763 bytes_per_second=45.6611G/s
  BM_string_memset/1024/0         18.0 ns         17.6 ns     40489136 bytes_per_second=54.3166G/s
  BM_string_memset/8192/0          109 ns          106 ns      6577711 bytes_per_second=71.7667G/s
  BM_string_memset/16384/0         221 ns          210 ns      3343800 bytes_per_second=72.684G/s
  BM_string_memset/32768/0         655 ns          623 ns      1153501 bytes_per_second=48.9781G/s
  BM_string_memset/65536/0        1547 ns         1495 ns       461702 bytes_per_second=40.8154G/s
  BM_string_memset/131072/0       2991 ns         2924 ns       240189 bytes_per_second=41.7438G/s

This patch also drops the wmemset() code: we don't even have a
microbenchmark for it, and we have as many implementations checked in as
we have non-test call sites (!), so at this point it seems we've spent
more time maintaining wmemset() than running it.
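
The dynamic-dispatch half of the patch amounts to choosing an
implementation once, at load time. A hedged sketch using a GNU ifunc
resolver (the mechanism and all names here are assumptions, not the
patch's actual code):

  #include <stddef.h>

  /* Two hypothetical implementations behind one exported symbol. */
  void *memset_generic(void *dst, int c, size_t n);
  void *memset_avx2(void *dst, int c, size_t n);

  /* The resolver runs once when the symbol is first bound; afterwards
   * callers jump directly to the implementation it returned. */
  static void *(*resolve_memset(void))(void *, int, size_t) {
      __builtin_cpu_init();
      return __builtin_cpu_supports("avx2") ? memset_avx2 : memset_generic;
  }

  void *my_memset(void *dst, int c, size_t n)
      __attribute__((ifunc("resolve_memset")));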

Test: bionic/tests/run-on-host.sh 64
Signed-off-by: ahs <amrita.h.s@intel.com>
Change-Id: Ie5047df5300638c1e4c69f8285d33d034f79c83b
2022-07-22 21:48:50 +00:00
jaishank
2e50fa7cf8 Optimized L2 cache value for Intel(R) Core architectures.
Performance Gain:
AnTuTu             - 4.80%
3D Mark Sling Shot - 3.47%
BaseMarkGPU        - 5.51%
GeekBench          - 3.19%

Test: ./tests/run-on-host.sh 64

Change-Id: I6122835a3f5fd97cc291623d1062fe25843a2d93
Signed-off-by: jaishank <jaishankar.rajendran@intel.com>
2019-11-12 15:58:34 +00:00
Shalini Salomi Bodapati
4ed2f475d8 Add AVX2 version of wmemset to bionic
Test: ./tests/run-on-host.sh 64
Change-Id: Id2f696cc60a10c01846ca3fe0d3a5d513020afe3
Signed-off-by: Shalini Salomi Bodapati <shalini.salomi.bodapati@intel.com>
2019-07-16 18:06:57 +05:30
Haibo Huang
8a0f0ed5e7 Make memcpy memmove
Bug: http://b/63992911
Test: Change BoardConfig.mk and compile for each variant
Change-Id: Ia0cc68d8e90e3316ddb2e9ff1555a009b6a0c5be
2018-06-11 18:12:45 +00:00
Jeremy Compostella
611ad621c6 Revert "Add 64-bit slm optimized strlcpy and strlcat."
This reverts commit 2e7145c048.

When src is at the end of a page, the SSE2-optimized strlcpy can issue
a movdqu instruction that crosses the page boundary. If the next page
is not mapped into the process, this leads to a segmentation fault.
This is rare, but it has been caught multiple times during robustness
testing.

We isolated a way to reproduce the issue outside of an Android device
and were able to resolve this particular case. However, additional
compliance and robustness tests found several other similar
page-crossing issues with this implementation.

In conclusion, this optimization needs to be rewritten from scratch
because the design itself is at fault. In the meantime, it is better to
remove it.
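
In outline, the failure mode is that a 16-byte unaligned load can touch
bytes past the ones the string actually owns. A hedged repro sketch (the
4 KiB page size is assumed and error checking is omitted):

  #include <emmintrin.h>
  #include <string.h>
  #include <sys/mman.h>

  int main(void) {
      /* Map two 4 KiB pages, then unmap the second, so the string ends
       * right at a page boundary with nothing mapped behind it. */
      char *pages = mmap(NULL, 8192, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      munmap(pages + 4096, 4096);
      char *src = pages + 4096 - 8;  /* 8 valid bytes, then a hole */
      memcpy(src, "1234567", 8);     /* 7 chars + NUL, all in-page */

      /* A byte-wise strlcpy stops at the NUL and never touches the
       * next page, but a movdqu reading 16 bytes from src crosses into
       * the unmapped page and takes SIGSEGV. */
      volatile __m128i v = _mm_loadu_si128((const __m128i *)src);
      (void)v;
      return 0;
  }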

Change-Id: If90450de430ba9b7cd9282a422783beabd701f3d
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
2018-04-12 14:00:43 -07:00
Elliott Hughes
a80ddc8a34 Fix x86-64 __memset_chk.
I can only assume I was testing the 32-bit implementation when I claimed
this worked. While improving the 32-bit code I realized that I'd used
signed comparisons instead of unsigned, and came back to find that the
64-bit code didn't work.

By way of apology, make x86-64 the first architecture where __memset_chk
falls through to memset.
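
The check itself is a single unsigned comparison. A C sketch of the
semantics (bionic's version is assembly, and the function name and
failure hook here are placeholders):

  #include <stddef.h>
  #include <stdlib.h>
  #include <string.h>

  /* dst_len is the destination size the compiler could prove, or
   * (size_t)-1 when unknown. The comparison must be unsigned: read as
   * a signed value, the "unknown" sentinel is -1, and every call would
   * falsely appear to overflow. */
  void *my_memset_chk(void *dst, int c, size_t n, size_t dst_len) {
      if (n > dst_len)   /* unsigned size_t comparison */
          abort();       /* the real code reports a fortify failure */
      return memset(dst, c, n);  /* otherwise fall through to memset */
  }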

Change-Id: I54d9eee5349b6a2abb2ce81e161fdcde09556561
2016-03-03 16:46:25 -08:00
Elliott Hughes
ff9bda7201 Merge "Mandate optimized assembler for x86-64 __memset_chk." 2016-03-03 22:18:46 +00:00
Elliott Hughes
01d5b946ac Remove optimized code for bzero, which was removed from POSIX in 2008.
I'll come back for the last bcopy remnant...

Bug: http://b/26407170
Change-Id: Iabfeb95fc8a4b4b3992e3cc209ec5221040e7c26
2016-03-02 17:21:07 -08:00
Elliott Hughes
61c95fe52d Mandate optimized assembler for x86-64 __memset_chk.
Change-Id: I4d6b452f3cf850d405e8f5d7da01d432603e606b
2016-03-02 16:39:29 -08:00
Jake Weinstein
2926f9a31e libc: remove bcopy from memmove on 64-bit architectures
* bcopy is deprecated on LP64 by the following commit:

  ce9ce28e5d

Change-Id: I6849916f0ec4a2d0db9a360999ad1dc8edda952b
2015-08-17 22:06:12 +00:00
Chih-Hung Hsieh
0a93df369c Fix opcode to compile with both gcc and llvm.
Bug: 17302991

Change-Id: I31febd9ad24312388068803ce247b295bd73b607
2015-04-23 21:40:31 +00:00
Varvara Rainchik
2e7145c048 Add 64-bit slm optimized strlcpy and strlcat.
Change-Id: Ic948934d91c83bbfdfd00c05ee8b14952e012549
Signed-off-by: Varvara Rainchik <varvara.rainchik@intel.com>
2014-11-12 17:32:28 +03:00
Varvara Rainchik
fce861498c Fix for slm-tuned memmove (both 32- and 64-bit).
Introduce a memmove test that catches the fault, and fix both the 32-
and 64-bit versions of the slm-tuned memmove.

Change-Id: Ib416def2610a0972e32c3b9b6055b54967643dc3
Signed-off-by: Varvara Rainchik <varvara.rainchik@intel.com>
2014-06-05 11:08:09 -07:00
Dan Albert
ce9ce28e5d Removes bcopy and bzero from bionic.
These symbols are still defined for LP32 for binary compatibility, but
the declarations have been replaced with the POSIX-recommended #defines.
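
Those replacements are, roughly (bionic's exact spelling may differ):

  #include <string.h>

  /* Legacy BSD names mapped onto their standard equivalents. Note that
   * bcopy takes (src, dst), the reverse of memmove's argument order. */
  #define bcopy(src, dst, len) memmove((dst), (src), (len))
  #define bzero(ptr, len)      memset((ptr), 0, (len))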

Bug: 13935372
Change-Id: Ief7e6ca012db374588ba5839f11e8f3a13a20467
2014-06-03 17:22:07 -07:00
Varvara Rainchik
a020a244ae Add 64-bit Silvermont-optimized string/memory functions.
Add the following functions:
bcopy, bzero, memcpy, memmove, memset, stpcpy, stpncpy, strcat, strcpy,
strlen, strncat, strncpy, memcmp, strcmp, strncmp.
Set all of these as the defaults.

Change-Id: Ic66b250ad8c349a43d25e2d4dea075604f6df6ac
Signed-off-by: Varvara Rainchik <varvara.rainchik@intel.com>
2014-05-12 17:37:07 -07:00
Elliott Hughes
bf425680e4 Let the compiler worry about implementing ffs(3).
It does at least as good a job as our old hand-written assembly anyway.
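
For reference, the compiler-implemented version amounts to a one-line
builtin (a sketch, not bionic's exact source):

  #include <strings.h>

  /* ffs(3): one plus the index of the least significant set bit, or 0
   * for 0. gcc and clang expand the builtin inline on x86-64, so no
   * hand-written assembly is needed. */
  int ffs(int i) {
      return __builtin_ffs(i);
  }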

Change-Id: If7c4a1ac508bace0b71ee7b67808caa6eabf11d2
2013-10-24 16:29:40 -07:00
Elliott Hughes
8ca530e559 Add ffs and memcmp16 to x86_64.
Change-Id: I652c1356f1c7c52299977181c2cf154386979380
2013-10-17 17:03:22 -07:00