From 4e24dcc8d869db7303650d8444c8796445fbbc07 Mon Sep 17 00:00:00 2001
From: Christopher Ferris
Date: Mon, 15 Jul 2013 12:49:26 -0700
Subject: Optimize strcat/strcpy, small tweaks to strlen. DO NOT MERGE

Create one version of strcat/strcpy/strlen for cortex-a15/krait and
another version for cortex-a9.

Tested with the libc_test strcat/strcpy/strlen tests, including new
tests that verify that strcat/strcpy do not overread the src across
page boundaries.

NOTE: The handling of unaligned strcpy (same code in strcat) could
probably be optimized further such that the src is read 64 bits at a
time instead of the partial reads occurring now.

strlen improves slightly since it was recently optimized.

Performance improvements for strcpy and strcat (using an empty dest
string):

cortex-a9
- Small copies vary from about 5% to 20% as the size gets above 10 bytes.
- Copies >= 1024, about a 60% improvement.
- Unaligned copies, from about 40% improvement.

cortex-a15
- Most small copies exhibit a 100% improvement, a few copies only
  improve by 20%.
- Copies >= 1024, about 150% improvement.
- Unaligned copies, about 100% improvement.

krait
- Most small copies vary widely, but on average 20% improvement, then
  the performance gets better, hitting about a 100% improvement when
  copying 64 bytes of data.
- Copies >= 1024, about 100% improvement.
- When copying MBs of data, about 50% improvement.
- Unaligned copies, about 90% improvement.

As strcat destination strings get larger in size:

cortex-a9
- about 40% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.

cortex-a15
- about 200% improvement for small dst strings (>= 32).
- about 250% improvement for dst strings >= 1024.

krait
- about 25% improvement for small dst strings (>= 32).
- about 100% improvement for dst strings >= 1024.

Merge from internal master.

(cherry-picked from d119b7b6f48fe507088cfb98bcafa99b320fd884)

Change-Id: I296463b251ef9fab004ee4dded2793feca5b547a
---
 libc/arch-arm/generic/generic.mk | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'libc/arch-arm/generic/generic.mk')

diff --git a/libc/arch-arm/generic/generic.mk b/libc/arch-arm/generic/generic.mk
index 18cad9d..0b3f644 100644
--- a/libc/arch-arm/generic/generic.mk
+++ b/libc/arch-arm/generic/generic.mk
@@ -1,4 +1,6 @@
 $(call libc-add-cpu-variant-src,MEMCPY,arch-arm/generic/bionic/memcpy.S)
 $(call libc-add-cpu-variant-src,MEMSET,arch-arm/generic/bionic/memset.S)
+$(call libc-add-cpu-variant-src,STRCAT,string/strcat.c)
 $(call libc-add-cpu-variant-src,STRCMP,arch-arm/generic/bionic/strcmp.S)
+$(call libc-add-cpu-variant-src,STRCPY,arch-arm/generic/bionic/strcpy.c)
 $(call libc-add-cpu-variant-src,STRLEN,arch-arm/generic/bionic/strlen.c)
-- 
cgit v1.1
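
The page-boundary overread tests mentioned in the commit message are not part of this diff. As a rough illustration only, a check of that kind could look like the sketch below. It assumes POSIX `mmap`/`mprotect`; the helper name `strcpy_at_page_end` is hypothetical and is not taken from libc_test.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* assumption: exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from libc_test). Places `text` so that its
 * terminating NUL is the last readable byte of a page, with an inaccessible
 * guard page directly after it, then copies it with strcpy(). A strcpy()
 * that reads past the terminator into the guard page faults (SIGSEGV).
 * Returns 0 if the copy completed and matches, -1 on any setup failure.
 */
int strcpy_at_page_end(const char *text, char *dst, size_t dst_size)
{
    long page_l = sysconf(_SC_PAGESIZE);
    if (page_l <= 0)
        return -1;
    size_t page = (size_t)page_l;

    size_t len = strlen(text) + 1; /* bytes to copy, including the NUL */
    if (len > page || len > dst_size)
        return -1;

    /* Map two pages, then revoke all access to the second one. */
    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return -1;
    }

    /* Put the string flush against the end of the readable page. */
    char *src = base + page - len;
    memcpy(src, text, len);
    assert(src + len == base + page);

    strcpy(dst, src); /* faults here if the implementation overreads */

    int ok = (memcmp(dst, text, len) == 0);
    munmap(base, 2 * page);
    return ok ? 0 : -1;
}
```

Calling the helper with strings of varying length moves the start of `src` through different alignments relative to the page end, which is where a word-at-a-time copy loop would be most likely to read too far.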