| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This reverts commit e1154df8806b65a7706e1ba3e7b2a9f0fe2f32d0.
Revert "libc: Optimize strcat/strcpy, small tweaks to strlen"
This reverts commit dd2150b68a0e00d130bcf9b3b4af68a93971ae00.
Revert "libc: Update to latest cortexa15 memcpy code."
This reverts commit 392d2cd535185b4013ae7279587aaae3972737d1.
Revert "Bionic: Add AOSP master cortex optimizations"
This reverts commit 456c0e8cbe4ed2da25317e8e46ed327a14fbeb1c.
Revert "libc: krait: Restore prior version of strcmp."
This reverts commit d22231e03e36a34a149a90d2c01d867d62932f7f.
Revert "libc: krait: Use performance version of memcpy."
This reverts commit cc8a530aba0f1b616becbc0bbd40501f082d411f.
Revert "libc: krait: Use performance version of bcopy and memmove"
This reverts commit 9ef0a0f6002a3c0e0bcfa2b329811d9b3b8a534b.
Revert "tegra2: add missing strcpy and strcat"
This reverts commit 31a06f332f1176ae8e2a241dcae00f78112a3b2a.
Revert "libc: add tegra2 TARGET_CPU_VARIANT"
This reverts commit 4172806c7ee309fa78e20bb03313d0e8bc6bde0e.
Revert "Detect userspace memory leak for JB_MR2"
This reverts commit bba493121c8e135bd9cd145ab35e53cac1702a9e.
Revert "[libc] Add missing strcat.S to x86.mk Makefile"
This reverts commit b59b790f97dc58a931719524499269f2f3b904f2.
Revert "armv6: REX routines are currently broken, use alternatives"
This reverts commit 364c9574c789b046423f25903c9c2f36c1a32d01.
Revert "Bump the number of TLS slots to 128."
This reverts commit 5bce86fd740e13f4f6a55092d159af0200c46e72.
Revert "don't hardcode register r0/v1 when reading the TLS"
This reverts commit fb0f92fa912480e381c4a96ddd561e2360d6e1ca.
| |
arch-x86/string/strcat.S was not included in x86.mk.
This caused the build to fail when building for x86 targets.
Change-Id: I8001cdb25c2ff84c994603e648ef74c9c1daa5ea
| |
Change-Id: I3f72fa4524c09d3ea9abcc3a4a693f563e58ed53
| |
Bug: 9997352
Change-Id: I7bde7228d803e9d4bb83309c5891d54a07e3b025
| |
When a process consumes more memory than the limit specified by the
property, all allocations of that process are printed.
To enable this feature, set the following properties:
1. libc.debug.malloc 40
2. libc.debug.malloc.program <PROCESS_NAME>
3. libc.debug.malloc.maxprocsize <VALUE_IN_BYTES>
4. libc.malloc.minalloclim <VALUE_IN_BYTES>
Change-Id: I03a4de9643ec954802b26443ce5685975ea30f89
Signed-off-by: Maunik Shah <mshah@codeaurora.org>
| |
These routines are no longer in the common code since the recent optimisations:
strcpy from cortex-a9
strcat : keep standard one
Change-Id: Icfedf17aa4f9fa30e7f63b19bd6d4ac533e5e0e9
| |
this leads to much improved code when calling __get_tls()
Change-Id: I21d870fb33c33a921ca55c4e100772e0f7a8d1e4
| |
Tegra2 is a dual-core Cortex-A9 without NEON SIMD.
The routines are based on the 'generic' ARM code (memcpy, memset),
on cortex-a15 and a8 (strcmp),
and on armv7-a from cm10.1 (memchr and strlen).
Rebased; now for cm-10.2 only.
Change-Id: I1e7207e2d1ba02012fb5306f46b688903d31386c
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
| |
ported from cm10.1
bionic-benchmarks on cortex-a9
generic:
                        iterations     ns/op
BM_string_memmove/8       50000000        67    119.35 MiB/s
BM_string_memmove/64      10000000       175    364.59 MiB/s
BM_string_memmove/512      1000000      1078    474.86 MiB/s
BM_string_memmove/1K       1000000      2072    494.20 MiB/s
BM_string_memmove/8K        100000     16400    499.50 MiB/s
BM_string_memmove/16K        50000     32293    507.35 MiB/s
BM_string_memmove/32K        50000     66585    492.12 MiB/s
BM_string_memmove/64K        10000    160435    408.49 MiB/s
NEON-optimized:
                        iterations     ns/op
BM_string_memmove/8      100000000        25    319.06 MiB/s
BM_string_memmove/64      50000000        43   1472.60 MiB/s
BM_string_memmove/512     10000000       247   2069.74 MiB/s
BM_string_memmove/1K       5000000       463   2210.08 MiB/s
BM_string_memmove/8K        500000      3465   2363.69 MiB/s
BM_string_memmove/16K       500000      6894   2376.30 MiB/s
BM_string_memmove/32K       100000     15490   2115.38 MiB/s
BM_string_memmove/64K        50000     42097   1556.75 MiB/s
Change-Id: I89253a01fb811438089e16320ac265177a2ca152
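One way to read the memmove figures above is as a ratio of time per operation (the third column). The helper below is only a sketch of that arithmetic; the name `speedup` is hypothetical and is not part of bionic-benchmarks.

```c
/* Hypothetical helper (not part of bionic-benchmarks): ratio of the
 * baseline time per operation to the optimized time per operation,
 * i.e. how many times faster the optimized routine ran. */
double speedup(double baseline_ns, double optimized_ns) {
    return baseline_ns / optimized_ns;
}
```

For the 8-byte case, speedup(67, 25) is 67/25 = 2.68; for the 64K case, speedup(160435, 42097) is roughly 3.8.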
| |
Tested using a static version of the strlen libc_test program
on a nexus7 that uses the generic code.
Merge from internal master.
(cherry-picked from d8d10a8994472e40d19301b7087806630877b4d5)
Change-Id: I88f7dc01dc5b5c3ac2d5580d92153bc1bc36c564
| |
Create one version of strcat/strcpy/strlen for cortex-a15/krait/scorpion and another
version for cortex-a9.
Tested with the libc_test strcat/strcpy/strlen tests.
Including new tests that verify that the src for strcat/strcpy do not
overread across page boundaries.
NOTE: The handling of unaligned strcpy (same code in strcat) could probably
be optimized further such that the src is read 64 bits at a time instead of
the partial reads occurring now.
strlen improves slightly since it was recently optimized.
Performance improvements for strcpy and strcat (using an empty dest string):
cortex-a9
- Small copies vary from about 5% to 20% as the size gets above 10 bytes.
- Copies >= 1024, about a 60% improvement.
- Unaligned copies, about 40% improvement.
cortex-a15
- Most small copies exhibit a 100% improvement, a few copies only
improve by 20%.
- Copies >= 1024, about 150% improvement.
- Unaligned copies, about 100% improvement.
krait
- Small copies vary widely, but average about 20% improvement;
performance then gets better, reaching about 100% improvement when
copying 64 bytes of data.
- Copies >= 1024, about 100% improvement.
- When copying MBs of data, about 50% improvement.
- Unaligned copies, about 90% improvement.
As strcat destination strings get larger in size:
cortex-a9
- about 40% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.
cortex-a15
- about 200% improvement for small dst strings (>=32).
- about 250% improvement for dst strings >= 1024.
krait
- about 25% improvement for small dst strings (>=32).
- about 100% improvement for dst strings >=1024.
Merge from internal master.
(cherry-picked from d119b7b6f48fe507088cfb98bcafa99b320fd884)
Change-Id: I296463b251ef9fab004ee4dded2793feca5b547a
Conflicts:
libc/Android.mk
libc/arch-arm/generic/generic.mk
libc/arch-arm/krait/krait.mk
Fix strcpy.c that should have been strcpy.S
Merge from internal master.
(cherry-picked from 1ce665416307628f4bcaced86faa64bdf9c489c3)
Conflicts:
libc/arch-arm/generic/generic.mk
Change-Id: I376b831df42248baadde7202a30a68112f752ff7
| |
This uses the new code originally submitted as memcpy.a15.S as
the base. However, the old code handled unaligned src/dst better
so that was spliced in. I optimized the original unaligned code by
removing a few unnecessary instructions. I optimized the a15 code by
rewriting the pre and post code. I also modified the main loop to add
a pld so that larger copies would not stall waiting for memory.
Test cases for the new memcpy:
- Copy all sized values from 0 to 1024 bytes, using whatever alignment
is returned by malloc.
For each alignment case described below, the test copied from 0 to 128
bytes.
- Src and dst pointers are both aligned to the same value, starting
at one going through every power of two up to and including 128.
- Src aligned to double word boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to double word boundary.
- Src aligned to 16 bit boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to 16 byte boundary.
- Src aligned to word boundary, dst aligned to 1 byte from a word
boundary.
- Src aligned to word boundary, dst aligned to 2 bytes from a word
boundary.
- Src aligned to word boundary, dst aligned to 3 bytes from a word
boundary.
- Src aligned to 1 byte from a word boundary, dst aligned to a word
boundary.
- Src aligned to 2 bytes from a word boundary, dst aligned to a word
boundary.
- Src aligned to 3 bytes from a word boundary, dst aligned to a word
boundary.
Cases to verify the unaligned source code properly aligns to a 16 bit
boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
4 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
8 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
12 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to
16 + 128 bit boundary.
In all cases, a two byte fencepost was placed at the end of the
destination to verify that only the requested number of bytes were copied.
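The fencepost check described above can be sketched roughly as follows. This is not the test code from the commit; the function name `memcpy_respects_fencepost`, the fence byte values and the source fill pattern are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define FENCE_LEN 2  /* two byte fencepost, as described in the commit message */

/* Hypothetical check: returns 1 if memcpy() copied exactly len bytes and
 * left the two fence bytes after the destination untouched, 0 otherwise. */
int memcpy_respects_fencepost(size_t len) {
    unsigned char *src = malloc(len ? len : 1);
    unsigned char *dst = malloc(len + FENCE_LEN);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return 0;
    }

    /* Fill the source with a recognizable, non-constant pattern. */
    for (size_t i = 0; i < len; i++) {
        src[i] = (unsigned char)(i * 7 + 1);
    }

    /* Clear the destination and plant the fence right after it
     * (fence values are arbitrary). */
    memset(dst, 0, len);
    dst[len] = 0xde;
    dst[len + 1] = 0xad;

    memcpy(dst, src, len);

    int ok = memcmp(dst, src, len) == 0 &&
             dst[len] == 0xde && dst[len + 1] == 0xad;

    free(src);
    free(dst);
    return ok;
}
```

Calling it for every length from 0 to 1024 mirrors the first test case listed above. The alignment cases would additionally need src and dst offset within their buffers, which this sketch leaves out.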
Bug: 8005082
Merge from internal master.
(cherry-picked from commit 21ede92d794969f22cacbdb9f557818f1c5712b5)
Change-Id: Ief70c9e6dc8c6473ae245b6570b2c266fed9618c
Add missing branch in memcpy.S dst aligned case.
Merge from internal master.
(cherry-picked from commit 6ffaa931c362602a2b606a610c92326a425a876e)
Change-Id: Ifdcf01fd122866cf0d4c5b5f7a997803561d7889
Rewrite memset for cortexa15 to use strd.
Merge from internal master.
(cherry-picked from commit 7ffad9c120054eedebd5f56f8bed01144e93eafa)
Change-Id: Ia67f2a545399f4fa37b63d5634a3565e4f5482f9
| |
Spacing between brackets is an illegal ARM entry
PS1 - Initial patch
PS2 - update description and rebase
Signed-off-by: Paul Beeler <sparksco@gmail.com>
Change-Id: I52d7e95a8afb8b26f55fe0bc01853a15a95e90c2
| |
Change-Id: I9ba50a469de27269adc008c2a96767830ad6b70b
| |
libc/arch-arm/bionic/memcpy.a9.S: memcpy from cortex-strings.
This memcpy code uses NEON/VFP to achieve very good performance
on ARMv7-A processors. It is specifically tuned for A15 but should
provide good performance on A9 also. It is equivalent to the code
in cortex-strings rev 116.
This patch is a follow-up to the existing Gerrit change:
I7f6f77995f3ca903ad9c66d14261441667a2a935
But this version includes a tweak for performance on misaligned
buffers.
Change-Id: I285abac0068f8ae29a1cbf7862ea8590aadaf0a7
Signed-off-by: Will Newton <will.newton@linaro.org>
libc/arch-arm/bionic/memcpy.a9.S: memcpy from cortex-strings.
This memcpy code uses NEON/VFP to achieve very good performance
on ARMv7-A processors. It is specifically tuned for A15 but should
provide good performance on A9 also. It is equivalent to the code
in cortex-strings rev 116.
This patch is a follow-up to the existing Gerrit change:
I7f6f77995f3ca903ad9c66d14261441667a2a935
This version includes a tweak for performance on misaligned
buffers and splits the header comment into license and
documentation sections.
Change-Id: Ibd2e23c8d8e01357ba0247be1d05192de3ceba69
Signed-off-by: Will Newton <will.newton@linaro.org>
Add new optimized strlen for arm.
This optimized version is primarily targeted at cortex-a15.
Tested on all nexus devices using the system/extras/libc_test strlen test.
Tested alignments from 1 to 32 that are powers of 2.
Tested that strlen does not cross page boundaries at all alignments.
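A page-boundary check of the kind mentioned above can be sketched like this. It is not the libc_test code; the function name `strlen_at_page_end` is hypothetical, and the sketch assumes a POSIX system with mmap and mprotect. The string is placed so that its terminating NUL is the last byte of a readable page, and the following page is made inaccessible, so a strlen that reads past the end of the page would fault.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc in strict modes */
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check: builds a string of len 'a' characters that ends
 * exactly at a page boundary, with the next page set to PROT_NONE.
 * Returns what strlen() reports, or (size_t)-1 if the setup failed. */
size_t strlen_at_page_end(size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len >= (size_t)page) {
        return (size_t)-1;
    }
    size_t page_size = (size_t)page;

    char *base = mmap(NULL, 2 * page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return (size_t)-1;
    }

    /* Make the second page inaccessible so any overread faults. */
    if (mprotect(base + page_size, page_size, PROT_NONE) != 0) {
        munmap(base, 2 * page_size);
        return (size_t)-1;
    }

    /* The NUL terminator is the last byte of the first page. */
    char *s = base + page_size - (len + 1);
    memset(s, 'a', len);
    s[len] = '\0';

    size_t measured = strlen(s);

    munmap(base, 2 * page_size);
    return measured;
}
```

Varying len changes the start alignment of the string, which is how a loop over lengths would cover the different alignments.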
Speed improvements listed below:
cortex-a15
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~250% improvement.
cortex-a9
- Sizes >= 32 bytes, ~75% improvement.
- Sizes >= 1024 bytes, ~85% improvement.
krait
- Sizes >= 32 bytes, ~95% improvement.
- Sizes >= 1024 bytes, ~160% improvement.
Merge from internal master.
(cherry-picked from 2fc071797743b88a9a47427d46baed7c7b24f4d2)
Change-Id: I1ceceb4e745fd68e9d946f96d1d42e0cdaff6ccf
Conflicts:
libc/arch-arm/generic/generic.mk
libc/arch-arm/krait/krait.mk
libgcc_compat: Introduce __aeabi_lasr for cortex-a9 and higher
This is needed when passing -mcpu=cortex-a9 or higher on a modern
toolchain for prebuilt library compatibility
Change-Id: I73eb2393377914ae26216a8c2828ad973d1c1225
| |
* Optimized routines from Qualcomm for msm8660 class SoCs
* Enable with TARGET_CPU_VARIANT := scorpion
Change-Id: I01d0f22efba5a418ddd20fca0d0c570d855e0f6f
| |
Prelink support is required to load old vendor binary blobs
on many devices properly
This commit partially reverts 4688279db5dcc4004941e7f133c4a1c3617d842c
Change-Id: Iee5325e048daee78e8dcaa47c87454b908c89b4e
| |
Change-Id: Ifa732ec52fb582c34512cd1d09d918199ae7395e
| |
allow more QCOM targets to use optimized math functions.
Change-Id: I76ee1bf951ae1c8397fef3af6e9937ed8cad9b62
| |
Move integer representations of x bits on the integer side rather
than moving them to and from the FP registers.
Change-Id: I1895db385c9616cdae9ab6403f392dfbae292adc
| |
Optimized sqrt and sqrtf for arm by using hardware
opcode for sqrt rather than generic slow portable
code.
Change-Id: I84694159577aef6418710548085d8149c45e0e3f
| |
Call optimized pow optimistically and revert to full range
implementation if we detect an out-of-range input.
Change-Id: I6f3aa734adbf99484b7ff70736ef83a41e5815b8
| |
Modify sin/cos to improve performance while retaining bit-for-bit
accuracy with existing algorithm.
Change-Id: Iaba2dd731cd015732744705dad8bddb713b43067
| |
Use VFP calling convention for pow_neon handoff function by default.
Fix register usage collision between two different polynomial
coefficients in pow_neon. Remove conditional execution in pow_neon
and replace with branching.
Change-Id: I76095f4a006e2fb01a53943b66fd69bfa1fd3033
| |
Add assembly versions of sin/cos with integrated remainder pi/2
calculation. Directly extracted from binary libm.so compiled with
__ieee754_rem_pio2 calls inlined.
Change-Id: Ia093f420e58e794635e3a5f09e8236ae7601f1f6
| |
For internal functions, set the GCC attribute pcs("aapcs-vfp") on ARM
and use -fno-if-conversion to prefer branches over predicated
instructions (improves performance on architectures with good
branch prediction).
Change-Id: I8424e0e82a19d35e7e3b6e3e122dcdecdd5426fd
| |
Add a fast neon version of pow() suitable for relatively small
positive x and y (between 0 and 4). Run the standard
implementation in all other cases. Gives approximately 60%
performance improvement to AnTuTu FPU scores
Change-Id: I97e0712daeb2740764b26a44be0caaa39c481453
| |
Change-Id: I30e8dd6d4b2e7889aea8f5ed21182a5941bfb489
| |
Change-Id: I8ddcdb4d3a905dd746985435dcdb525ab5a1c947
| |
Change-Id: I587085aeb0c30ceccaa3f420594a194b129632b5
| |
Change-Id: I47d25d1da5b1a96bbc1b60f8acdaa31721a68e73
| |
Change-Id: I7ac05ab6add7802a4cc24fe36f7181e7cdfe07e0
| |
Restore the outside-kernel exclusion for some syscalls that was
removed by change I959b64280e184655ef8c713aa79f9e23cb1f7df4,
since these syscalls are used elsewhere.
Change-Id: I5b5bf3d78edd137e820d25281a375966b6c009ec
| |
This affects all command-line tools that print their usage on stdout
after an invalid-parameter error (printed on stderr).
Change-Id: I8e2cb3fda241ab989dc42055f15082f8b3ba1397
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
| |
__attribute__((const)) is mainly intended for the compiler to optimize
away repeated calls to a function that the compiler knows will return
the same value repeatedly.
When a function is marked __attribute__((const)), the compiler can choose
to call it just once and reuse the return value, which reduces code
size.
Here are the reference results by arm-eabi-size for crespo device:
[before]
   text    data     bss     dec     hex filename
 267715   10132   45948  323795   4f0d3
[after]
   text    data     bss     dec     hex filename
 267387   10132   45948  323467   4ef8b
Change-Id: I1d80465c0f88158449702d4dc6398a130eb77195
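The effect of __attribute__((const)) described above can be illustrated with a small sketch. The function names below are hypothetical and are not the bionic functions touched by this change; the attribute itself is the GCC/Clang function attribute named in the commit message.

```c
/* Hypothetical function, for illustration only. Its result depends only
 * on its argument and it has no side effects, so it may be marked const. */
__attribute__((const)) int scale_for_level(int level) {
    return level * level + 1;
}

/* Because scale_for_level() is marked const, the compiler is allowed to
 * evaluate it once here and reuse the result for the second use, instead
 * of emitting two calls. Whether it does so depends on the compiler and
 * the optimization level. */
int sum_two_calls(int level) {
    return scale_for_level(level) + scale_for_level(level);
}
```

The attribute is a promise from the programmer: a function that reads global state or has side effects should not be marked const, since the compiler may then drop calls that were actually needed.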
| | | | | | | | | | | | | | |
bad Olson ids."
* commit '2e7b8d6399fdea6e43dd07f353346324d2bf4ec4':
Don't search off the end of the index for bad Olson ids.