| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Change-Id: Ia0c1d81cecb2f05834842bf3e6a3cc9cb06e00c9
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This version is from Newlib.
Original comment of the author:
The attached patch provides a new implementation of strcmp for ARM, using
LDRD instead of LDR whenever possible.
For older architectures that do not support LDRD, this implementation uses
the same algorithm as before.
This patch replaces strcmp.c with strcmp.S. The huge inline assembly from
strcmp.c was converted into plain assembly and included in strcmp.S under
the appropriate predefines.
Testing and benchmarking:
* Validation: successfully passes a test that compares different strings of
length 1-128 and offsets 0-8 from a word boundary. Checked on qemu/A15/A9,
ARM/Thumb mode, Big/Little Endian. This test is also added to newlib
testsuite as part of this patch.
* Integration with gcc: no regression on qemu for arm-none-eabi --with-cpu
a15/a9 --with-mode arm/thumb.
* Performance (relative to the current strcmp in newlib, only in ARM mode):
On Dhrystone, the new implementation (ldrd) is 22% faster on Cortex-A15
FPGA, and 16% on Cortex-A9 VE2.
On synthetic benchmarks, which measure the average number of cycles for
strcmp on strings of length 4-128K and offsets 0,1,2,3,4,8 from a word
boundary, where the strings are equal, the new implementation is three times
faster for long strings, when the input strings have the same offset from a
word boundary, and up to 30% faster in other cases, on both A15 FPGA and A9
VE2.
Change-Id: I7a39770cad48c2ecc6b6d375165652cd4f0c0619
Signed-off-by: Dirk Rettschlag <dirk.rettschlag@gmail.com>
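Note: the doubleword-at-a-time idea behind the LDRD approach can be sketched in portable C. This is only an illustration and not the Newlib assembly; the name strcmp_dword_sketch and the helper has_zero_byte are hypothetical. It assumes that an aligned 8-byte load which runs past the terminator but stays inside the same aligned doubleword is harmless on the target, which hand-written assembly can rely on but strict ISO C does not guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the eight bytes in v is 0x00 (classic bit trick). */
static int has_zero_byte(uint64_t v) {
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

/* Hypothetical sketch: compare 8 bytes per step while both pointers are
 * 8-byte aligned, then finish byte by byte. */
int strcmp_dword_sketch(const char *a, const char *b) {
    if ((((uintptr_t)a | (uintptr_t)b) & 7u) == 0) {
        for (;;) {
            uint64_t wa, wb;
            memcpy(&wa, a, sizeof wa);   /* one doubleword from each string */
            memcpy(&wb, b, sizeof wb);
            if (wa != wb || has_zero_byte(wa)) {
                break;                   /* mismatch or terminator is in here */
            }
            a += 8;
            b += 8;
        }
    }
    /* Byte loop locates the exact differing byte or the terminator. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

Strings with different offsets from a doubleword boundary fall through to the byte loop here; the commit message reports a smaller gain for that case as well.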
|
|
|
|
|
|
|
|
|
|
|
|
| |
* This code is slower than the new implementation under every
circumstance on Scorpion devices. Remove it and standardize on one
implementation.
* Also remove "optimized" memmove_words, which was just a pointer to
memmove. Its performance was identical to the C version on Krait, and
it regressed on other platforms. This function is only used for
System.arraycopy in Dalvik.
Change-Id: I42d06fbd7c56d9874b80a0e768dbadf0fa5025a4
|
|
|
|
|
|
|
| |
* From Cortex-Strings via Linaro
* Performance is 10x that of the default version on microbenchmarks
Change-Id: I14c1b5ccbd0078219a1dbf9f02fb1591b3a2403c
|
|
|
|
|
|
|
| |
* This version is from Newlib and is written/recommended by ARM as the best
routine for Cortex-A15.
Change-Id: Ie9c2817dabda929ae4efaada5f6d467d53551ba4
|
|
|
|
|
|
|
| |
* The CodeAurora version of memmove provides roughly 20x the throughput
of the default version for all tested platforms. Enable it.
Change-Id: Ia9740f6b38a72c0bd6a6818c96d73e2c23bb5979
|
|
|
|
|
|
| |
* Too many weird regressions in this codepath.
This reverts commit 8c35d7eaeb67ace9a96922f16ba9e491dcde6534.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds new code to memcpy function, optimized for Cortex A9.
Adds new ARM-only loop, for operations where source and
destination are aligned.
Copyright (C) ST-Ericsson SA 2010
Modified neon implementation to fit Cortex A9 cache line size,
for those running with a 32-byte L2 cache line size.
Also split the implementation into aligned and unaligned access,
for those that allow unaligned memory access with Neon.
For totally aligned operations, arm-only code is used.
Change-Id: I95ebf6164cd6486b12a7e3e98e369db21e7e18d2
Author: Henrik Smiding henrik.smiding@stericsson.com for ST-Ericsson.
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds new code to function memcmp, optimized for Cortex A9.
Copyright (C) ST-Ericsson SA 2010
Added neon optimization
Change-Id: I8864d277042db40778b33232feddd90a02a27fb0
Author: Henrik Smiding henrik.smiding@stericsson.com for ST-Ericsson.
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
|
|
|
|
| |
This reverts commit 579dba196255d3f2db7664cfba2db4cb86f59aa9.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds new code to function memset, optimized for Cortex A9.
Copyright (C) ST-Ericsson SA 2010
Added neon implementation
Author: Henrik Smiding henrik.smiding@stericsson.com for ST-Ericsson.
Change-Id: Id3c87767953439269040e15bd30a27aba709aef6
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
|
|
|
|
|
|
|
|
|
|
| |
Use the kernel user helper feature to calculate the results
of gettimeofday and clock_gettime without actually calling
into kernel space. If the user helper patches have not been
applied to the kernel, the regular system calls are used as
a fallback.
Change-Id: I3aebc6ac19ab4743725648a1a279819e624cc5c4
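Note: the fallback behaviour described above can be sketched as follows. The kernel user helper itself is not shown in this log, so fast_path is a hypothetical placeholder for whatever entry point the patched kernel would expose; only the "try the helper, otherwise make the regular call" shape is illustrated.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical: set at startup if the kernel provides the user helper. */
typedef int (*fast_gettimeofday_fn)(struct timeval *tv);
static fast_gettimeofday_fn fast_path = NULL;

int gettimeofday_sketch(struct timeval *tv) {
    /* Prefer the helper when it exists and succeeds. */
    if (fast_path != NULL && fast_path(tv) == 0) {
        return 0;
    }
    /* Otherwise fall back to the regular call into the kernel. */
    return gettimeofday(tv, NULL);
}
```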
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
https://android.googlesource.com/platform/bionic into mr1
Conflicts:
libc/Android.mk
libc/bionic/system_properties.c
libc/kernel/common/linux/msm_mdp.h
libc/tools/gensyscalls.py
libm/Android.mk
linker/linker.c
Change-Id: I11944300d7fcf2fd9dc587d8c7a937bf5366bcc0
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Also change libc/arch-arm/bionic/crtbegin_so.c to include it
as a header.
Change-Id: Ib91b0b8caf5c8b936425aa8a4fc1a229b2b27929
|
| |/
| |
| |
| | |
Change-Id: I10e961d701e74aab07211ec7975f61167e387853
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Legacy ARM shared libraries use this generic version of atexit(),
which queues exit functions for invocation at program exit, at
which time the library may have been dlclose()'d, causing the
program to crash.
Change-Id: I41ae153c23268daa65ede7fb8966fc3e9caec369
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
|
| |\ \
| | |/ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
To properly support legacy ARM shared libraries, libc.so needs
to export the symbols __dso_handle and atexit, even though
these are now supplied by the crt startup code.
This patch reshuffles the existing CRT_LEGACY_WORKAROUND
conditionally compiled code slightly so it works as the
original author likely intended.
Change-Id: Id6c0e94dc65b7928324a5f0bad7eba6eb2f464b9
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add getsid() system call to bionic for
all architectures. This is needed for various tools
(e.g. perf).
Adding the getsid system call was done in 3 steps:
(1) add getsid system call (function name and syscall
number) to libc/SYSCALLS.TXT
(2) generate all necessary headers by calling
libc/tools/gensyscalls.py. This patch adds
the generated files since the build system
does not call gensyscalls.py.
(3) add the system call signature to libc/include/unistd.h
Change-Id: Id69a257e13ec02e1a44085a6b217a3f19ab025b1
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
|
| |\ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Change-Id: I280e5428b0543cccf17ca36baee4865395928cdb
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
|
| |\ \ \ |
|
| | | |/
| | |/|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The runtime linker parses the ELF section headers to
discover the size of the init_array and fini_array, so
there is no point in putting NULL terminators at the end.
Change-Id: I3246cd585efce9314155600277dd829e9f37d04f
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
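Note: a minimal sketch of why the terminators are unnecessary. When the array's size in bytes is already known from the ELF metadata, the loop is bounded by a count and never looks for a sentinel. The names run_init_array_sketch, first, second and sample_array are hypothetical and are not taken from the bionic linker.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Call every entry; the bound comes from the recorded size, not a NULL. */
void run_init_array_sketch(init_fn *array, size_t size_in_bytes) {
    size_t count = size_in_bytes / sizeof(init_fn);
    for (size_t i = 0; i < count; i++) {
        array[i]();
    }
}

/* Sample data for illustration only. */
static int calls;
static void first(void)  { calls += 1; }
static void second(void) { calls += 10; }
static init_fn sample_array[] = { first, second };
```

Calling run_init_array_sketch(sample_array, sizeof(sample_array)) runs both functions in order.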
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
None of the supported ARCHs actually populate these sections,
so there is no point in keeping them in the binaries.
Change-Id: I21a364f510118ac1114e1b49c53ec8c895c6bc6b
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
|
| |/
| |
| |
| |
| |
| |
| |
| | |
Useful if you're trying to defeat ASLR, otherwise not
so much ...
Change-Id: I17ebb50bb490a3967db9c3038f049adafe2b8ea7
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This header is used in the bionic build and should be propagated into
the sysroot on toolchain rebuild. Discussion of this header is here:
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00936.html
It is available already in mips NDK platforms:
development/ndk/platforms/android-9/arch-mips/include/link.h
Change-Id: I39ff467cdac9f448e31c11ee3e14a6200e82ab57
Signed-off-by: Pavel Chupin <pavel.v.chupin@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a GNU_STACK marker to crtend* files. This tells the linker
that these files do not require an executable stack.
When linking, a missing GNU_STACK marker in any .o file can prevent
the compiler from automatically marking the final executable as NX
safe (executable stack not required). In Android, we normally work
around this by adding -Wa,--noexecstack / -Wl,-z,noexecstack.
For files like crtend.S / crtend_so.S, which are included in every
executable / shared library, it's better to add the GNU_STACK note
directly to the assembly file. This allows the compiler to
automatically mark the final executable as NX safe without any
special command line options.
References: http://www.gentoo.org/proj/en/hardened/gnu-stack.xml
Change-Id: I07bd058f9f60ddd8b146e0fb36ba26ff84c0357d
|
| |
| |
| |
| |
| |
| |
| | |
(cherry-pick of 5467f25f82934d611c60f8bc57a05114f3c1bea0.)
Bug: 6925012
Change-Id: Ic5ea2fbd606311087de05d7a3594df2fa9b2fef9
|
| |
| |
| |
| |
| |
| |
| | |
Move the stack pointer so a captured signal does not corrupt
stack variables needed for __thread_entry.
Change-Id: I3e1e7b94a6d7cd3a07081f849043262743aa8064
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Rewrite
crtbegin.S -> crtbegin.c
crtbegin_so.S -> crtbegin_so.c
This change allows us to generate PIC code without relying
on text relocations.
As a consequence of this rewrite, also rewrite
__dso_handle.S -> __dso_handle.c
__dso_handle_so.S -> __dso_handle_so.c
atexit.S -> atexit.c
In crtbegin.c _start, place the __PREINIT_ARRAY__, __INIT_ARRAY__,
__FINI_ARRAY__, and __CTOR_LIST__ variables onto the stack, instead of
passing a pointer to the text section of the binary.
This change appears sorta wonky, as I attempted to preserve,
as much as possible, the structure of the original assembly.
As a result, you have C files including other C files, and other
programming ugliness.
Result: This change reduces the number of files with text-relocations
from 315 to 19 on my Android build.
Before:
$ scanelf -aR $OUT/system | grep TEXTREL | wc -l
315
After:
$ scanelf -aR $OUT/system | grep TEXTREL | wc -l
19
Change-Id: Ib9f98107c0eeabcb606e1ddc7ed7fc4eba01c9c4
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
crtbegin_dynamic and crtbegin_static are essentially identical,
minus a few trivial differences (comments and whitespace).
Eliminate duplicates.
Change-Id: Ic9fae6bc9695004974493b53bfc07cd3bb904480
|
| |\
| | |
| | |
| | | |
Change-Id: If00e354a5953ed54b31963d4f8ea77e1603c321e
|
| |\ \
| | | |
| | | |
| | | |
| | | | |
* commit '08b51e2c091d036c124259ae59eb7be6bbe346af':
Implement the "abort" stub in assembly for ARM.
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | | |
* commit '8657eafc3552f36c176667c1591beab255308da6':
Adjust memcpy for ARM Cortex A9 cache line size
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
ARM Cortex A8 uses a 64-byte cache line and ARM Cortex A9 uses a
32-byte cache line.
The following patch:
* Adds code to adjust memcpy cache line size to match A9 cache line
size.
* Adds a flag to select between 32-byte and 64-byte cache line
size.
Copyright (C) ST-Ericsson SA 2010
Modified neon implementation to fit Cortex A9 cache line size
Author: Henrik Smiding henrik.smiding@stericsson.com for
ST-Ericsson.
Change-Id: I8a55946bfb074e6ec0a14805ed65f73fcd0984a3
Signed-off-by: Christian Bejram <christian.bejram@stericsson.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Define SPARROW_NEON_OPTIMIZATION flag so that neon optimized
memmove and pow functions are used. Also add corresponding
definitions to the makefiles.
Change-Id: I12089fc7002e3ec294e63632bd84e395fbd24936
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move the stack pointer so a captured signal does not corrupt
stack variables needed for __thread_entry.
Change-Id: I3e1e7b94a6d7cd3a07081f849043262743aa8064
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This was broken by the scorpion/krait optimization merges; fix it.
Change-Id: I689d95d6db2061254729b4d9064203d464d9e0f6
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
(codeaurora)
Taken from the following commits to codeaurora
https://www.codeaurora.org/gitweb/quic/la/?p=platform/bionic.git;a=commit;h=a5333c8fbeb5190e3a8dc9f99af66eb50b01462c
https://www.codeaurora.org/gitweb/quic/la/?p=platform/bionic.git;a=commit;h=6077a9577667fc9999312a2c6daf4d3c77bdf294
Uses the following variables in BoardConfig.mk
TARGET_USE_KRAIT_BIONIC_OPTIMIZATION := true
TARGET_USE_KRAIT_PLD_SET := true
TARGET_KRAIT_BIONIC_PLDOFFS := 10
TARGET_KRAIT_BIONIC_PLDTHRESH := 10
TARGET_KRAIT_BIONIC_BBTHRESH := 64
TARGET_KRAIT_BIONIC_PLDSIZE := 64
Change-Id: Iee66f7698dc301507a012e27c91141f3f6925dcb
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The original memcmp() was tweaked for ARM9, which is not optimal for ARM
Cortex-A cores. This patch merges the prefetch optimizations from
ST-Ericsson and removes NEON slowdowns.
Reference experiment results on Nexus S (ARM Cortex-A8; 1 GHz) using
strbench program:
http://pasky.or.cz//dev/glibc/strbench/
[before]
size, samples, TIMES[s] - user, system, total)
4 262144 2.510000 0.000000 2.510000
8 131072 1.570000 0.010000 1.590000
32 32768 1.310000 0.000000 1.320000
[after]
size, samples, TIMES[s] - user, system, total)
4 262144 2.280000 0.000000 2.290000
8 131072 1.210000 0.000000 1.220000
32 32768 1.040000 0.000000 1.050000
Change-Id: I961847da96d2025f7049773cd2ddaa08579e78d6
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Merge the ARM optimized strlen() routine from Linaro. Although it is
optimized for ARM Cortex-A9, it is still measurably faster
than the original on Cortex-A8 machines.
Reference benchmark on Nexus S (ARM Cortex-A8; 1 GHz):
[before]
prc thr usecs/call samples errors cnt/samp size
strlen_1k 1 1 1.31712 97 0 1000 1024
[after]
prc thr usecs/call samples errors cnt/samp size
strlen_1k 1 1 1.05855 96 0 1000 1024
Change-Id: I809928804726620f399510af1cd1c852ed754403
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Cache line size for 7627A is 32 bytes. The existing memcpy routine
gives sub-optimal performance for this cache line size. The memcpy
routine has been optimized taking this into consideration. Currently
7627A is the only ARMv7 target with a cache line size of 32 bytes,
hence this optimized code is guarded by
CORTEX_CACHE_LINE_32 in memcpy.S and memset.S, which can be enabled
by defining TARGET_CORTEX_CACHE_LINE_32 in BoardConfig.mk.
This change also adds the corresponding cflag definition in Android.mk.
Change-Id: Idea0d1b977f60e0a690ddee36b1d9c67d3c241ef
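Note: a rough C sketch of how a compile-time flag such as CORTEX_CACHE_LINE_32 could select the block size of a copy loop. The real code lives in memcpy.S and is not reproduced here; memcpy_by_line_sketch, CACHE_LINE_SIZE and the prefetch distance of two lines are assumptions made for illustration. __builtin_prefetch is the GCC/Clang builtin and is only a hint.

```c
#include <stddef.h>
#include <string.h>

/* Assumed mapping: the flag switches the block size from 64 to 32 bytes. */
#ifdef CORTEX_CACHE_LINE_32
#define CACHE_LINE_SIZE 32u
#else
#define CACHE_LINE_SIZE 64u
#endif

void *memcpy_by_line_sketch(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n >= CACHE_LINE_SIZE) {
        /* Prefetch two lines ahead, but only while that address is
         * still inside the source buffer. */
        if (n >= 3 * CACHE_LINE_SIZE) {
            __builtin_prefetch(s + 2 * CACHE_LINE_SIZE);
        }
        memcpy(d, s, CACHE_LINE_SIZE);   /* copy one cache line */
        d += CACHE_LINE_SIZE;
        s += CACHE_LINE_SIZE;
        n -= CACHE_LINE_SIZE;
    }
    memcpy(d, s, n);                     /* remaining tail, possibly empty */
    return dst;
}
```

Building with -DCORTEX_CACHE_LINE_32 switches the loop to 32-byte blocks without touching the source.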
|
| |_|_|/
|/| | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Update the memcpy, memmove, and memset routines to use the
versions from CodeAurora when specified in the bionic/Android.mk
file (actually activated in the BoardConfig.mk file under
device/<vendor>/<board>). With this change, the mem* routines are
only used for the msm8660, while other platforms will use the
current Android mem* routines.
Future platforms can modify the makefile to use the CodeAurora-based
mem* routines as desired. This has the benefit of making the CodeAurora-
based routines opt-in instead of opt-out.
Also, PLDSIZE and PLDOFFS can be specified in the BoardConfig.mk as well,
so other platforms with different PLD tunings can use the same code
without modifying the source file itself.
Tests with FileCycler-0.3 showed a slight 1.1% improvement with these
files on an 8660v2, based on the average of three FileCycler runs with
and without the patch. Since the min/max values did not overlap, and
the average score showed an improvement, we can consider upstreaming these
modifications.
Change-Id: I6946076bc6a88a2a2c8667b09494e1eb31e01ee0
|
| |_|/
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Some SoCs that support NEON nevertheless perform better with a non-NEON than a
NEON memcpy(). This patch adds build variable ARCH_ARM_USE_NON_NEON_MEMCPY,
which can be set in BoardConfig.mk. When ARCH_ARM_USE_NON_NEON_MEMCPY is
defined, we compile in the non-NEON optimized memcpy() even if the SoC supports
NEON.
Change-Id: Ia0e5bee6bad5880ffc5ff8f34a1382d567546cf9
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| | |
So that we can always get the full stack trace regardless of gcc's handling
of the "noreturn" attribute associated with abort().
[cherry-picked from master]
BUG:6455193
Change-Id: I0102355f5bf20e636d3feab9d1424495f38e39e2
|
| |
| |
| |
| |
| |
| | |
This change mirrors cd15bac for statically-linked binaries.
Change-Id: Id870832a50b37f0ef3e79e1ed03ed31390bfc9ef
|
| |
| |
| |
| | |
Change-Id: I427a18811089cb280769ac8da3ed8adc00a65a10
|
|\ \ |
|