| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add linear alloc. Moved ArtField to be native object. Changed image
writer to put ArtFields after the mirror section.
Savings:
2MB on low ram devices
4MB on normal devices
Total PSS measurements before (normal N5, 95s after shell start):
Image size: 7729152 bytes
23112 kB: .NonMoving
23212 kB: .NonMoving
22868 kB: .NonMoving
23072 kB: .NonMoving
22836 kB: .NonMoving
19618 kB: .Zygote
19850 kB: .Zygote
19623 kB: .Zygote
19924 kB: .Zygote
19612 kB: .Zygote
Avg: 42745.4 kB
After:
Image size: 7462912 bytes
17440 kB: .NonMoving
16776 kB: .NonMoving
16804 kB: .NonMoving
17812 kB: .NonMoving
16820 kB: .NonMoving
18788 kB: .Zygote
18856 kB: .Zygote
19064 kB: .Zygote
18841 kB: .Zygote
18629 kB: .Zygote
3499 kB: .LinearAlloc
3408 kB: .LinearAlloc
3424 kB: .LinearAlloc
3600 kB: .LinearAlloc
3436 kB: .LinearAlloc
Avg: 39439.4 kB
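The two "Avg" figures appear to be the sum of the per-region mean PSS values. A minimal sketch that reproduces them from the samples listed above (the helper name `total_of_means` is only illustrative):

```python
from statistics import mean

# PSS samples in kB, copied from the measurements above.
before = {
    ".NonMoving": [23112, 23212, 22868, 23072, 22836],
    ".Zygote": [19618, 19850, 19623, 19924, 19612],
}
after = {
    ".NonMoving": [17440, 16776, 16804, 17812, 16820],
    ".Zygote": [18788, 18856, 19064, 18841, 18629],
    ".LinearAlloc": [3499, 3408, 3424, 3600, 3436],
}


def total_of_means(samples):
    """Sum the mean PSS (kB) of every region."""
    return sum(mean(values) for values in samples.values())


before_avg = total_of_means(before)
after_avg = total_of_means(after)
print(f"before: {before_avg:.1f} kB")
print(f"after:  {after_avg:.1f} kB")
print(f"saved:  {before_avg - after_avg:.1f} kB")
```

The difference between the two totals is the per-process saving on this device.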
No reflection performance changes.
Bug: 19264997
Bug: 17643507
Change-Id: I10c73a37913332080aeb978c7c94713bdfe4fe1c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main motivation is to remove all ArtField functionality and field
access from the Java side. This also comes with some reflection
speedups and slowdowns.
Summary results:
getDeclaredField/getField are slower mostly due to JNI overhead.
However, there is a large speedup in getInt, setInt,
GetInstanceField, and GetStaticField.
Before timings (N5 --compiler-filter=everything):
benchmark ns linear runtime
Class_getDeclaredField 782.86 ===
Class_getField 832.77 ===
Field_getInt 160.17 =
Field_setInt 195.88 =
GetInstanceField 3214.38 ==============
GetStaticField 6809.49 ==============================
After:
Class_getDeclaredField 1068.15 ============
Class_getField 1180.00 ==============
Field_getInt 121.85 =
Field_setInt 139.98 =
GetInstanceField 1986.15 =======================
GetStaticField 2523.63 ==============================
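To make the speedups and slowdowns easier to compare, the relative change per benchmark can be derived from the two tables. This is only a sketch over the numbers quoted above; `percent_change` is an illustrative helper, not part of the benchmark suite.

```python
# Timings in ns, copied from the tables above.
before = {
    "Class_getDeclaredField": 782.86,
    "Class_getField": 832.77,
    "Field_getInt": 160.17,
    "Field_setInt": 195.88,
    "GetInstanceField": 3214.38,
    "GetStaticField": 6809.49,
}
after = {
    "Class_getDeclaredField": 1068.15,
    "Class_getField": 1180.00,
    "Field_getInt": 121.85,
    "Field_setInt": 139.98,
    "GetInstanceField": 1986.15,
    "GetStaticField": 2523.63,
}


def percent_change(old, new):
    """Positive means slower, negative means faster."""
    return (new - old) / old * 100.0


changes = {name: percent_change(before[name], after[name]) for name in before}
for name, change in changes.items():
    print(f"{name:24s} {change:+6.1f}%")
```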
Bug: 19264997
Change-Id: Ic0d0fc1b56b95cd6d60f8e76f19caeaa23045c77
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use an actual PathClassLoader when compiling apps, instead of a
side structure and cutout.
This CL sets up a minimal object 'cluster' that recreates the Java
side of a regular ClassLoader such that the Class-Linker will
recognize it and use the internal native fast-path.
This CL removes the now unnecessary compile-time-classpath and
replaces it with a single 'compiling-the-boot-image' flag in the
compiler callbacks.
Note: This functionality is *only* intended for the compiler, as
the objects have not been completely initialized.
Bug: 19781184
Change-Id: I7f36af12dd7852d21281110a25c119e8c0669c1d
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now when opening an oat file, the caller can pass an absolute dex
location used to resolve the absolute path for any relative
encoded dex locations in the oat file.
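As a rough illustration of the idea, the sketch below resolves a relative encoded location against the directory of the absolute dex location. The resolution rule, the function name, and the example paths are all assumptions made for illustration; the actual rule used by this change may differ.

```python
import posixpath


def resolve_dex_location(encoded_location, abs_dex_location=None):
    """Hypothetical helper: turn an encoded dex location into an absolute one.

    Absolute encoded locations are returned unchanged. Relative ones are
    resolved against the directory of abs_dex_location when it is given.
    """
    if posixpath.isabs(encoded_location) or abs_dex_location is None:
        return encoded_location
    base_dir = posixpath.dirname(abs_dex_location)
    return posixpath.normpath(posixpath.join(base_dir, encoded_location))


# Example with made-up paths.
print(resolve_dex_location("base.apk", "/data/app/example/base.apk"))
```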
Bug: 19550105
Change-Id: I6e9559afe4d86ac12cf0b90176b5ea696a83d0e7
|
| |
| |
| |
| |
| |
| |
| |
| | |
First step towards the compression of the StackMap (not
the DexRegisterMap). Next step will be to just use what is
needed (instead of byte -> word).
Change-Id: I4f81b2d05bf5cc508585e16fbbed1bafbc850e2e
|
| |
| |
| |
| |
| |
| | |
Helps diagnose related jank.
Change-Id: I38191cdda723c6f0355d0197c494a3dff2b6653c
|
|/
|
|
|
|
|
|
|
| |
- moved environment recording from code generator to stack map stream
- added creation/loading factory methods for the DexRegisterMap (hides
internal details)
- added new tests
Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
|
|
|
|
|
|
|
|
|
|
| |
- Remove GetVReg() and SetVReg() that were expecting to always succeed.
- Change Quick-only methods to take a FromQuickCode suffix.
- Change deopt to use dead values when GetVReg does not succeed:
the optimizing compiler will not have a location for uninitialized
Dex registers and potentially dead registers.
Change-Id: Ida05773a97aff8aa69e0caf42ea961f80f854b77
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Replace the current list-based (fixed-size) Dex register
encoding in stack maps emitted by the optimizing compiler
with another list-based variable-size Dex register
encoding compressing short locations on 1 byte (3 bits for
the location kind, 5 bits for the value); other (large)
values remain encoded on 5 bytes.
- In addition, use slot offsets instead of byte offsets to
encode the location of Dex registers placed in stack
slots at small offsets, as it enables more values to use
the short (1-byte wide) encoding instead of the large
(5-byte wide) one.
- Rename art::DexRegisterMap::LocationKind as
art::DexRegisterLocation::Kind, turn it into a
strongly-typed enum based on a uint8_t, and extend it to
support new kinds (kInStackLargeOffset and
kConstantLargeValue).
- Move art::DexRegisterEntry from
compiler/optimizing/stack_map_stream.h to
runtime/stack_map.h and rename it as
art::DexRegisterLocation.
- Adjust art::StackMapStream,
art::CodeGenerator::RecordPcInfo,
art::CheckReferenceMapVisitor::CheckOptimizedMethod,
art::StackVisitor::GetVRegFromOptimizedCode, and
art::StackVisitor::SetVRegFromOptimizedCode.
- Implement unaligned memory accesses in art::MemoryRegion.
- Use them to manipulate data in Dex register maps.
- Adjust oatdump to support the new Dex register encoding.
- Update compiler/optimizing/stack_map_test.cc.
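The encoding described above can be sketched as follows. Only the sizes come from the description (3 bits of kind plus 5 bits of value in one byte, or 5 bytes for large values); the bit order inside the byte, the numeric kind values, the 4-byte slot size, and the little-endian payload are assumptions chosen for this illustration.

```python
import struct

KIND_BITS, VALUE_BITS = 3, 5
KIND_MASK = (1 << KIND_BITS) - 1
MAX_SHORT_VALUE = (1 << VALUE_BITS) - 1  # 31
SLOT_SIZE = 4  # assumed stack slot size in bytes

# Hypothetical kind numbering, for illustration only.
IN_STACK, CONSTANT, IN_STACK_LARGE_OFFSET, CONSTANT_LARGE_VALUE = 1, 2, 3, 4
LARGE_KINDS = {IN_STACK_LARGE_OFFSET, CONSTANT_LARGE_VALUE}


def encode_stack_location(byte_offset):
    """Use a slot index so that small offsets fit in the 5 value bits."""
    slot, remainder = divmod(byte_offset, SLOT_SIZE)
    if remainder == 0 and 0 <= slot <= MAX_SHORT_VALUE:
        return bytes([(slot << KIND_BITS) | IN_STACK])
    # Large form: one kind byte followed by a 4-byte value.
    return bytes([IN_STACK_LARGE_OFFSET]) + struct.pack("<i", byte_offset)


def encode_constant(value):
    if 0 <= value <= MAX_SHORT_VALUE:
        return bytes([(value << KIND_BITS) | CONSTANT])
    return bytes([CONSTANT_LARGE_VALUE]) + struct.pack("<i", value)


def decode(data):
    """Return (kind, value, bytes_consumed) for the entry at the start of data."""
    kind = data[0] & KIND_MASK
    if kind in LARGE_KINDS:
        # The payload is not aligned, hence the need for unaligned reads.
        (value,) = struct.unpack_from("<i", data, 1)
        return kind, value, 5
    value = data[0] >> KIND_BITS
    if kind == IN_STACK:
        value *= SLOT_SIZE  # convert the slot index back to a byte offset
    return kind, value, 1
```

With these assumptions a stack offset of 124 bytes is slot 31 and still fits the 1-byte form, whereas stored as a byte offset it would not fit in 5 bits. That is the benefit of using slot offsets.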
Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
|
|
|
|
| |
Change-Id: I779b80b8139d9afdc28373f8c68edff5df7726ce
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New features list includes:
- Class filter option to limit the class search space.
- Method filter is applied only against the method
name, instead of the entire signature. Can be
combined with the class filter for maximum efficiency.
- Bulk dump of the class and method name lists only.
Can be combined with filters to limit results.
- Export embedded dex files from input oat files
to the filesystem (symlinks are not supported because
utility functions are used for OS and filesystem operations).
- addr2instr option to locate the method implementation
whose range contains the address and to limit disassembly
dumps. The input relative address is added to the oat
executable offset to calculate the search offset. Once the
method has been successfully located, its code is dumped and
the program aborts further analysis of the input file. Methods
located before the target address print only their
signature, skipping disassembly and other
info. The calculated search offset is also printed as
part of the initial header info.
- Little-endian dex instruction bytecode is printed
on the same line before the instruction string.
Some minor reordering has also been done for
more targeted results.
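The addr2instr lookup can be pictured as a range search over method code regions. The sketch below is not oatdump code: the `MethodCode` record, the function name, and the sample offsets are hypothetical, and only the "relative address plus executable offset" rule comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class MethodCode:
    signature: str
    code_offset: int  # offset of the method's code within the oat file
    code_size: int    # size of the method's code in bytes


def locate_method(methods, executable_offset, relative_addr):
    """Return (search_offset, method) where method contains the offset, or None."""
    search_offset = executable_offset + relative_addr
    for method in methods:
        if method.code_offset <= search_offset < method.code_offset + method.code_size:
            return search_offset, method
    return search_offset, None


# Example with made-up offsets.
methods = [
    MethodCode("void Foo.a()", 0x1000, 0x40),
    MethodCode("int Foo.b(int)", 0x1040, 0x80),
]
offset, hit = locate_method(methods, executable_offset=0x1000, relative_addr=0x50)
print(hex(offset), hit.signature if hit else "not found")
```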
Change-Id: I3116ee3c99c258718f46faea8ea4295da6ae2bf7
Signed-off-by: Anestis Bechtsoudis <anestis@census-labs.com>
|
|
|
|
|
|
| |
A few elements are not displayed yet (stack mask, inline info) though.
Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
|
|
|
|
|
| |
Bug: 18809837
Change-Id: Ie571eae8fc19ee9207390cff5c7e2a38071b126a
|
|
|
|
|
|
|
|
| |
Refactor and modify cmdline.h to allow oatdump to run without a
Runtime.
Bug: 18789891
Change-Id: I1d7a1585e3672d04e58dbac9a4d4bd835c1c9143
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Analyze the dirty memory pages of a running process per object;
this allows us to fine-tune the dirty object binning algorithm in
the image writer.
Also:
* Factor out oatdump command line parsing code into cmdline.h
* Factor out common build rules for building variations of binaries
* Add a gtest for imgdiag
Bug: 17611661
Change-Id: I3ac852a0d223af66f6d59ae5dbc3df101475e3d0
|
|
|
|
| |
Change-Id: I3bf3250fa866fd2265f1b115d52fa5dedc48a7fc
|
|
|
|
| |
Change-Id: I2d74e2d5b3c35a691c95339de0db9361847fca11
|
|
|
|
|
|
|
|
|
|
|
| |
Moved the gc_map field from OatMethod to OatQuickMethodHeader.
Deleted the ArtMethod gc_map_ field.
Bug: 17643507
Change-Id: Ifa0470c3e4c2f8a319744464d94c6838b76b3d48
(cherry picked from commit 807140048f82a2b87ee5bcf337f23b6a3d1d5269)
|
|
|
|
|
|
|
| |
Bug: 18473190
Change-Id: If505b4f62105899f4f1257d3bccda3e6eb0dcd7c
(cherry picked from commit c934e483ceabbd589422beea1fa35f5182ecfa99)
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This contains three changes:
- Use register aliases in the disassembly.
- When loading from a literal pool, show what is being loaded.
- Disassemble using absolute addresses on ARM64.
This ensures that addresses disassembled are coherent with instruction
location addresses shown.
Examples of disassembled instructions before and after the changes:
Before:
movz w17, #0x471f
ldr d9, pc+736 (addr 0x72690d50)
After:
movz wip1, #0x471f
ldr d9, pc+736 (addr 0x72690d50) (-745.133)
Change-Id: I72fdc160fac26f74126921834f17a581c26fd5d8
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement a check that aborts if a file has not been explicitly
flushed and closed by the time it is destructed.
Add WARN_UNUSED to FdFile methods.
Update dex2oat, patchoat, scoped_flock and some gtests to pass with
this.
(cherry picked from commit 9433ec60b325b708b9fa87e699ab4a6565741494)
Change-Id: I9ab03b1653e69f44cc98946dc89d764c3e045dd4
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Also, refactor how feature strings are handled so they are additive or
subtractive.
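One way to picture additive or subtractive feature strings is as edits applied to a baseline feature set. The syntax used below (comma-separated names, a leading `-` to remove a feature) and the feature names are assumptions for illustration, not necessarily the format the runtime accepts.

```python
def apply_feature_string(baseline, feature_string):
    """Return a new feature set after applying additions and removals.

    Assumed syntax: "name" or "+name" adds a feature, "-name" removes it.
    """
    features = set(baseline)
    for token in feature_string.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("-"):
            features.discard(token[1:])
        else:
            features.add(token.lstrip("+"))
    return features


# Hypothetical feature names.
print(sorted(apply_feature_string({"div"}, "-div,lpae")))
```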
Make MIPS have features for FPU 32-bit and MIPS v2. Use them in the quick
compiler rather than #ifdefs that wouldn't have worked in cross-compilation.
Add SIMD features for x86/x86-64 proposed in:
https://android-review.googlesource.com/#/c/112370/
Bug: 18056890
Change-Id: Ic88ff84a714926bd277beb74a430c5c7d5ed7666
|
| |
| |
| |
| |
| |
| |
| |
| | |
I intend to use oatdump for testing generated code, and
being able to filter on a method name will make the
testing more reliable.
Change-Id: Iaf7fef7228d9d8a901bd9b98452d244d42ca497e
|
| |
| |
| |
| | |
Change-Id: I53a3bfffb834284c5c3d2297305c7cdc241f8963
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Changed the class def index to use a HashMap instead of unordered_map
so that we can use FindWithHash to reduce how often we need to compute
hashes.
Fixed a bug in ClassLinker::UpdateClass where we didn't properly
handle classes with the same descriptor but different class loaders.
Introduced by previous CL.
Before (fb launch):
1.74% art::ComputeModifiedUtf8Hash(char const*)
After:
0.95% art::ComputeModifiedUtf8Hash(char const*)
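The saving comes from hashing a descriptor once and reusing that hash for every table probed. The toy table below only illustrates that pattern; the class and function names are hypothetical, and the simple multiplicative hash merely stands in for the real descriptor hash.

```python
class CountingHash:
    """String hash that counts how often it is computed."""

    def __init__(self):
        self.calls = 0

    def __call__(self, descriptor):
        self.calls += 1
        value = 0
        for byte in descriptor.encode("utf-8"):
            value = (value * 31 + byte) & 0xFFFFFFFF
        return value


class HashedTable:
    """Toy chained hash table whose operations accept a precomputed hash."""

    def __init__(self, bucket_count=64):
        self._buckets = [[] for _ in range(bucket_count)]

    def insert_with_hash(self, key, value, key_hash):
        bucket = self._buckets[key_hash % len(self._buckets)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def find_with_hash(self, key, key_hash):
        for existing, value in self._buckets[key_hash % len(self._buckets)]:
            if existing == key:
                return value
        return None


def find_in_all(tables, descriptor, hasher):
    """Probe every table while computing the hash only once."""
    key_hash = hasher(descriptor)
    for table in tables:
        found = table.find_with_hash(descriptor, key_hash)
        if found is not None:
            return found
    return None
```

Without the `*_with_hash` entry points, each table would have to rehash the descriptor on every probe.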
Bug: 18054905
Bug: 16828525
Change-Id: Iba2ee37c9837289e0ea187800ba4af322225a994
(cherry picked from commit 564ff985184737977aa26c485d0c1a413e530705)
|
| |
| |
| |
| |
| |
| |
| | |
Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general,
and -Wunused-but-set-parameter for GCC builds.
Change-Id: I81bbdd762213444673c65d85edae594a523836e5
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix associated errors about unused parameters and implicit sign conversions.
For sign conversion this was largely in the area of enums, so add ostream
operators for the affected enums and fix tools/generate-operator-out.py.
Tidy arena allocation code and arena allocated data types, rather than fixing
new and delete operators.
Remove dead code.
Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
|
|/
|
|
|
|
|
| |
Move to shared rather than static libraries. This avoids capturing all of
the static libraries' own library dependencies.
Change-Id: I2be96e92dad4ed1842d76b044745f2a2e15372eb
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Images (.art) compiled with PIC now have a new field added.
* isDexOptNeeded will now skip patching for apps compiled as PIC.
* First-boot patching now only copies boot.art; boot.oat is linked.
As a result, all system preopted dex files (with --compile-pic) no
longer take up any space in /data/dalvik-cache/<isa>.
Bug: 18035729
Change-Id: Ie1acad81a0fd8b2f24e1f3f07a06e6fdb548be62
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added more inlining, removed imt array allocation and replaced it
with a handle scope. Removed some unnecessary handle scopes.
Added logic to base interface method tables on the superclass so
that we don't need to reconstruct them for every interface (large win).
Facebook launch Dalvik KK MR2:
TotalTime: 3165
TotalTime: 3652
TotalTime: 3143
TotalTime: 3298
TotalTime: 3212
TotalTime: 3211
Facebook launch TOT before:
WaitTime: 3702
WaitTime: 3616
WaitTime: 3616
WaitTime: 3687
WaitTime: 3742
WaitTime: 3767
After optimizations:
WaitTime: 2903
WaitTime: 2953
WaitTime: 2918
WaitTime: 2940
WaitTime: 2879
WaitTime: 2792
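For a quick comparison, the three runs can be averaged. This sketch uses only the samples listed above; the log does not state the time unit, and the Dalvik run is labelled TotalTime while the other two are labelled WaitTime, so the Dalvik figure is not strictly comparable.

```python
from statistics import mean

# Samples copied from the launch measurements above (unit not stated).
dalvik_total = [3165, 3652, 3143, 3298, 3212, 3211]
tot_before_wait = [3702, 3616, 3616, 3687, 3742, 3767]
tot_after_wait = [2903, 2953, 2918, 2940, 2879, 2792]

before_mean = mean(tot_before_wait)
after_mean = mean(tot_after_wait)
reduction = (before_mean - after_mean) / before_mean * 100.0

print(f"Dalvik KK MR2 mean: {mean(dalvik_total):.1f}")
print(f"TOT before mean:    {before_mean:.1f}")
print(f"TOT after mean:     {after_mean:.1f}")
print(f"WaitTime reduction: {reduction:.1f}%")
```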
LinkInterfaceMethods no longer one of the hottest methods, new list:
4.73% art::ClassLinker::LinkVirtualMethods(art::Thread*, art::Handle<art::mirror::Class>)
3.07% art::DexFile::FindClassDef(char const*) const
2.94% art::mirror::Class::FindDeclaredStaticField(art::mirror::DexCache const*, unsigned int)
2.90% art::DexFile::FindStringId(char const*) const
Bug: 18054905
Bug: 16828525
(cherry picked from commit 1fb463e42cf1d67595cff66d19c0f99e3046f4c4)
Change-Id: I27cc70178fd3655fbe5a3178887fcba189d21321
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Tidy up InstructionSetFeatures so that it has a type hierarchy dependent on
architecture.
Add to instruction_set_test to warn when InstructionSetFeatures don't agree
with ones from system properties, AT_HWCAP and /proc/cpuinfo.
Clean up class linker entry point logic to not return entry points but to
test whether the passed code is the particular entrypoint. This works around
image trampolines that replicate entrypoints.
Bug: 17993736
Change-Id: I5f4b49e88c3b02a79f9bee04f83395146ed7be23
|
|/
|
|
|
|
|
| |
Added MemMap::Init for the case where we don't initialize the runtime.
Bug: 18000219
Change-Id: I1bd715e18838919c0773db5fa25623348326baa6
|
|
|
|
| |
Change-Id: I4e4ef3a2002fc59ebd9097087f150eaf3f2a7e08
|
|
|
|
|
|
|
|
|
| |
Don't do "if (ptr)". Use const. Use DISALLOW_COPY_AND_ASSIGN. Avoid public
member variables.
Move ValueObject to base and use in ELF builder.
Tidy VectorOutputStream to not use non-const reference arguments.
Change-Id: I2c727c3fc61769c3726de7cfb68b2d6eb4477e53
|
|
|
|
|
|
|
|
|
|
| |
Store the linker patches with each CompiledMethod instead of
keeping them in CompilerDriver. Reorganize oat file creation
to apply the patches as we're writing the method code. Add
framework for platform-specific relative call patches in the
OatWriter. Implement relative call patches for ARM.
Change-Id: Ie2effb3d92b61ac8f356140eba09dc37d62290f8
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Bring up a runtime for oatdump of an oat file if the boot image
option is given. This allows the verifier to be used on any oat file.
Some refactoring.
Change-Id: Ifa895f22b648c7064fb0837fb36a0118422a3462
|
|/
|
|
| |
Change-Id: If5d09b0d5cb9a8039b0037e6eedaae6785b1ded2
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added printing of OatClass offsets.
- Added printing of OatMethod offsets.
- Added bounds checks for code size size, code size, mapping table, gc map, vmap table.
- Added sanity check of 100k for code size.
- Added partial disassembly of questionable code.
- Added --no-disassemble to disable disassembly.
- Added --no-dump:vmap to disable vmap dumping.
- Reordered OatMethod info to be in file order.
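The bounds and sanity checks listed above amount to validating an offset and size pair before trusting it. A minimal sketch, assuming "100k" means 100 * 1000 bytes and using hypothetical names throughout:

```python
MAX_SANE_CODE_SIZE = 100 * 1000  # assumption: "100k" read as 100,000 bytes


def check_code_region(file_size, code_offset, code_size):
    """Return a list of problems found; an empty list means the region looks sane."""
    problems = []
    if code_offset > file_size:
        problems.append("code offset is past the end of the file")
    elif code_size > file_size - code_offset:
        problems.append("code extends past the end of the file")
    if code_size > MAX_SANE_CODE_SIZE:
        problems.append("code size exceeds the sanity limit")
    return problems


print(check_code_region(file_size=4096, code_offset=4000, code_size=200))
```

A dumper could use a non-empty result to decide that a method's code is questionable before attempting a full disassembly.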
Bug: 15567083
(cherry picked from commit 34fa79ece5b3a1940d412cd94dbdcc4225aae72f)
Change-Id: I2c368f3b81af53b735149a866f3e491c9ac33fb8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the class linker to accept class status from the boot image
in compiler mode. Update compiler driver to allow quickening for
boot image classes. Update method verifier to accept quickened
instructions in compiler mode when we just want to dump. Update
oatdump to the new verifier API.
Bug: 17316928
(cherry picked from commit 35439baf287b291b67ee406308e17fc6194facbf)
Change-Id: I9ef1bfd78b0d93625b89b3d662131d7d6e5f2903
|
|
|
|
|
|
|
|
|
|
| |
Note: this moves the miranda modifier to the upper 16 bits.
Bug: 16161620
(cherry picked from commit 7fc8f90b7160e879143be5cfd6ea3df866398884)
Change-Id: I2f591d53b7d1559171e70aaaf22225d94b4882f5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce virtual method dispatch in the method verifier and make more code
inline-able.
Add a StringPiece with const char* equality operator to avoid redundant
StringPieces and strlens.
Remove back link from register line to verifier and pass as argument to reduce
size of RegisterLine.
Remove instruction length from instruction flags and compute from the
instruction, again to reduce size.
Add suspend checks to resolve and verify to allow for easier monitor
inflation and reduce contention on Locks::thread_list_suspend_thread_lock_.
Change ThrowEarlierClassFailure to throw pre-allocated exception.
Avoid calls to Thread::Current() by passing self.
Template specialize IsValidClassName.
Make ANR reporting with SIGQUIT run using checkpoints rather than suspending
all threads. This makes the stack/lock analysis less lock error prone.
Extra Barrier assertions and condition variable time out is now returned as a
boolean both from Barrier and ConditionVariable::Wait.
2 threaded host x86-64 interpret-only numbers from 341 samples:
Before change: Avg 176.137ms 99% CI 3.468ms to 1060.770ms
After change: Avg 139.163ms 99% CI 3.027ms to 838.257ms
Reduction in average compile time after change is 20.9%.
Slow-down without change is 26.5%.
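The two percentages follow from the averages, reading the "After change" average as 139.163 ms. A short check of the arithmetic:

```python
before_ms = 176.137
after_ms = 139.163  # assumes the "After change" average is in milliseconds

saved_ms = before_ms - after_ms
reduction = saved_ms / before_ms * 100.0  # relative to the old time
slowdown = saved_ms / after_ms * 100.0    # relative to the new time

print(f"reduction: {reduction:.2f}%")
print(f"slow-down without change: {slowdown:.2f}%")
```

The computed values agree with the quoted 20.9% and 26.5% when truncated to one decimal place.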
Bug: 17471626 - Fix bug where RegTypeCache::JavaLangObject/String/Class/Throwable
could return unresolved type when class loading is disabled.
Bug: 17398101
Change-Id: Id59ce3cc520701c6ecf612f7152498107bc40684
|
|
|
|
|
|
|
| |
Now most of the methods supported by the compiler can be optimized,
instead of using the baseline.
Change-Id: I80ab36a34913fa4e7dd576c7bf55af63594dc1fa
|
|
|
|
|
|
|
|
|
| |
Refactors some classes in elf_writer_quick.h to elf_builder.h to
be more friendly for re-use. Use this in oatdump to add a symtab
to an oat file.
Bug: 17187621, 17322125
Change-Id: I2333291334fd98bd09cc5717fb83cb18efe3a029
|
|
|
|
|
|
|
|
|
| |
For both debugging and performance analysis, it is necessary to understand
stack layout. This patch adds capability to oatdump to print out the offsets
of the locals, ins, method*, and out VRs.
Change-Id: I73512f59e4fd2d2b12725a6c76d602182c46ff78
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the class linker for descriptor lookups from the compile driver so that
dex caches are populated.
Reduce the scope of functions for scanning class paths to just the class
linker where they are performed.
If we see more than a threshold number of find class def misses on a dex file,
lazily compute an index so that future lookups are constant time (part of the
collection code is taken from
https://android-review.googlesource.com/#/c/103865/3). Note that we take a lazy
approach so that we don't serialize on loading dex files; this avoids the
reason the index was removed in 8b2c0b9abc3f520495f4387ea040132ba85cae69.
Remove an implicit and unnecessary std::string creation for PrintableString.
Single threaded interpret-only dex2oat performance is improved by roughly 10%.
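The lazy index can be sketched as a miss counter that triggers a one-time index build. Everything here is illustrative: the class name, the threshold value, and the linear scan that stands in for the un-indexed lookup are assumptions.

```python
class DexFileLookup:
    """Toy class-def lookup that builds an index only after repeated misses."""

    def __init__(self, descriptors, miss_threshold=3):
        self._descriptors = list(descriptors)  # position is the class def index
        self._miss_threshold = miss_threshold
        self._misses = 0
        self._index = None

    @property
    def indexed(self):
        return self._index is not None

    def find_class_def(self, descriptor):
        if self._index is not None:
            # Fast path: dictionary lookup once the index exists.
            return self._index.get(descriptor)
        for position, candidate in enumerate(self._descriptors):
            if candidate == descriptor:
                return position
        self._misses += 1
        if self._misses > self._miss_threshold:
            # Too many misses: pay the indexing cost once.
            self._index = {d: i for i, d in enumerate(self._descriptors)}
        return None
```

Dex files that are rarely missed never pay for the index, which is the point of building it lazily rather than at load time.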
Bug: 16853450
Change-Id: Icf72df76b0a4328f2a24075e81f4ff267b9401f4
|
|
|
|
|
|
|
|
|
|
| |
Reduced memory used by byte and boolean fields from 4 bytes down to a
single byte and shorts and chars down to two bytes. Fields are now
arranged as Reference followed by decreasing component sizes, with
fields shuffled forward as needed.
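A layout along these lines can be sketched as follows: references first, then primitives in decreasing size, with a smaller field moved forward whenever it fits an alignment gap. The 4-byte reference size, the starting offset, and the field names are assumptions for illustration, not the runtime's actual algorithm.

```python
def layout_fields(fields, start_offset=0):
    """fields: list of (name, size, is_reference). Returns (offsets, end_offset)."""
    offsets = {}
    offset = start_offset
    # References come first, in declaration order.
    for name, size, is_reference in fields:
        if is_reference:
            offsets[name] = offset
            offset += size
    # Primitives follow, largest first (sorted() is stable, so ties keep order).
    remaining = sorted((f for f in fields if not f[2]), key=lambda f: f[1], reverse=True)
    while remaining:
        name, size, _ = remaining[0]
        gap = -offset % size  # padding needed to align the largest field
        if gap:
            # Shuffle forward the largest smaller field that fits the gap.
            filler = next((f for f in remaining if f[1] <= gap and offset % f[1] == 0), None)
            if filler is None:
                offset += gap  # nothing fits, so pad
                continue
            remaining.remove(filler)
            name, size = filler[0], filler[1]
        else:
            remaining.pop(0)
        offsets[name] = offset
        offset += size
    return offsets, offset


# Hypothetical class: an 8-byte header, one reference and four primitives.
fields = [("flag", 1, False), ("ref", 4, True), ("big", 8, False),
          ("count", 4, False), ("ch", 2, False)]
offsets, end = layout_fields(fields, start_offset=8)
print(offsets, end)
```

In this example the 4-byte `count` is moved ahead of the 8-byte `big` to fill the gap left after the reference, so no padding is needed.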
Bug: 8135266
Change-Id: I65eaf31ed27e5bd5ba0c7d4606454b720b074752
|