path: root/compiler/optimizing/code_generator.cc
* MIPS: Initial version of optimizing compiler for MIPS64R6. (Roland Levillain, 2015-06-26; 1 file, -6/+12)
    (cherry picked from commit 4dda3376b71209fae07f5c3c8ac3eb4b54207aa8)
    (amended for mnc-dev)
    Bug: 21555893
    Change-Id: I874dc356eee6ab061a32f8f3df5f8ac3a4ab7dcf
    Signed-off-by: Alexey Frunze <Alexey.Frunze@imgtec.com>
    Signed-off-by: Douglas Leung <douglas.leung@imgtec.com>
* ART: Remove old DCHECK that trips Baseline (David Brazdil, 2015-06-19; 1 file, -1/+0)
    Codegen verified that the entry block always falls through to the next
    block. While this is the case with Optimizing, it doesn't hold for
    Baseline; it also doesn't need to, since codegen handles that case fine.
    Bug: 21913514
    Change-Id: I751ef227e6cf103af3e7fc35fca4b01c663385a1
    (cherry picked from commit 015c7e63604c038e866d7af3850c557403cddc8b)
* Move mirror::ArtMethod to native (Mathieu Chartier, 2015-06-02; 1 file, -2/+9)
    Optimizing + quick tests are passing, devices boot.
    TODO: Test and fix bugs in mips64.

    Saves 16 bytes per most ArtMethods, 7.5MB reduction in system PSS. Some
    of the savings are from removal of virtual methods and direct methods
    object arrays.
    Bug: 19264997
    (cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)
    Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d

    Fix some ArtMethod related bugs
    Added root visiting for runtime methods, not currently required since
    the GcRoots in these methods are null. Added missing
    GetInterfaceMethodIfProxy in GetMethodLine; fixes --trace run-tests 005,
    044. Fixed optimizing compiler bug where we used a normal stack location
    instead of double on ARM64; this fixes the debuggable tests.
    TODO: Fix JDWP tests.
    Bug: 19264997
    Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3

    ART: Fix casts for 64-bit pointers on 32-bit compiler.
    Bug: 19264997
    Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457

    Fix JDWP tests after ArtMethod change
    Fixes Throwable::GetStackDepth for exception event detection after
    internal stack trace representation change. Adds missing
    ArtMethod::GetInterfaceMethodIfProxy call in case of proxy method.
    Bug: 19264997
    Change-Id: I363e293796848c3ec491c963813f62d868da44d2

    Fix accidental IMT and root marking regression
    Was always using the conflict trampoline. Also included fix for
    regression in GC time caused by extra roots. Most of the regression was
    IMT. Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due
    to detached thread.
    EvaluateAndApplyChanges: From ~2500 -> ~1980
    GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots
    Bug: 19264997
    Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0

    Fix bogus image test assert
    Previously we were comparing the size of the non moving space to the
    size of the image file. Now we properly compare the size of the image
    space against the size of the image file.
    Bug: 19264997
    Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a

    [MIPS64] Fix art_quick_invoke_stub argument offsets.
    ArtMethod reference's size got bigger, so we need to move other args and
    leave enough space for ArtMethod* and 'this' pointer. This fixes mips64
    boot.
    Bug: 19264997
    Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
* Add a parent environment to HEnvironment. (Nicolas Geoffray, 2015-05-11; 1 file, -5/+10)
    No functional change; this adds a placeholder for chaining inlined
    frames.
    Change-Id: I5ec57335af76ee406052345b947aad98a6a4423a
* Have HInvoke instructions know their number of actual arguments. (Roland Levillain, 2015-04-28; 1 file, -0/+1)
    Add an art::HInvoke::GetNumberOfArguments routine so that art::HInvoke
    and its subclasses can return the number of actual arguments of the
    called method. Use it in code generators and intrinsics handlers.

    Consequently, no longer remove a clinit check as last input of a static
    invoke if it is still present during baseline code generation, but
    ensure that static invokes have no such check as last input in
    optimized compilations.
    Change-Id: Iaf9e07d1057a3b15b83d9638538c02b70211e476
* Cleanup and improve stack map stream (Calin Juravle, 2015-04-23; 1 file, -8/+10)
    - Transform AddStackMapEntry into BeginStackMapEntry/EndStackMapEntry
      (sketched below). This allows for nicer code and fewer assumptions
      when searching for equal dex register maps.
    - Store the component sizes and their start positions as fields to
      avoid re-computation.
    - Store the current stack map entry as a field to avoid copy semantics
      when updating its value in the stack maps array.
    - Remove redundant methods and fix visibility for the remaining ones.
    Change-Id: Ica2d2969d7e15993bdbf8bc41d9df083cddafd24
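To make the Begin/End split concrete, here is a minimal standalone C++ sketch of the pattern. The type, fields, and method names beyond Begin/End are invented for illustration and only approximate the shape of ART's StackMapStream: an entry is accumulated between Begin and End and committed as a whole, so earlier, fully built entries can be compared against.

    // Sketch only: invented names, not the actual art::StackMapStream API.
    #include <cstdint>
    #include <vector>

    struct StackMapEntry {
      uint32_t dex_pc;
      uint32_t native_pc_offset;
      std::vector<int32_t> dex_register_locations;
    };

    class StackMapStream {
     public:
      void BeginStackMapEntry(uint32_t dex_pc, uint32_t native_pc_offset) {
        current_ = StackMapEntry{dex_pc, native_pc_offset, {}};
      }
      void AddDexRegisterLocation(int32_t location) {
        current_.dex_register_locations.push_back(location);
      }
      void EndStackMapEntry() {
        // Committing only on End means the entries_ list never contains a
        // half-built entry when searching for an equal dex register map.
        entries_.push_back(current_);
      }

     private:
      StackMapEntry current_{};
      std::vector<StackMapEntry> entries_;
    };

    int main() {
      StackMapStream stream;
      stream.BeginStackMapEntry(/*dex_pc=*/0, /*native_pc_offset=*/16);
      stream.AddDexRegisterLocation(42);
      stream.EndStackMapEntry();
      return 0;
    }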
* [optimizing] Fix a bug in moving the null check to the user. (Calin Juravle, 2015-04-22; 1 file, -2/+4)
    When deciding to move a null check to the user, we did not verify
    whether the next instruction checks the same object.
    Change-Id: I2f4533a4bb18aa4b0b6d5e419f37dcccd60354d2
* Opt compiler: Correctly require register or FPU register. (Alexandre Rames, 2015-04-20; 1 file, -0/+76)
    Also add a check that location summaries are correctly typed with
    respect to the HInstruction.
    Change-Id: I699762ff4e8f4e321c7db01ea005236ea1934af9
* Type MoveOperands. (Nicolas Geoffray, 2015-04-15; 1 file, -3/+8)
    The ParallelMoveResolver implementation needs to know whether a move is
    for 64 bits or not, to handle swaps correctly.

    Bug found, and test case courtesy of Serguei I. Katkov.
    Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
* Implement CFI for Optimizing. (David Srbecky, 2015-04-09; 1 file, -8/+11)
    CFI is necessary for stack unwinding in gdb, lldb, and libunwind.
    Change-Id: I1a3480e3a4a99f48bf7e6e63c4e83a80cfee40a2
* ART: Enable more Clang warnings (Andreas Gampe, 2015-04-06; 1 file, -2/+0)
    Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
* [optimizing] Implement x86/x86_64 math intrinsics (Mark Mendell, 2015-04-01; 1 file, -2/+6)
    Implement floor/ceil/round/RoundFloat on x86 and x86_64. Implement
    RoundDouble on x86_64.

    Add support for roundss and roundsd on both architectures. Support them
    in the disassembler as well.

    Add the instruction set features for x86, as the 'round' instruction is
    only supported if SSE4.1 is supported. Fix the tests to handle the
    addition of passing the instruction set features to x86 and x86_64. Add
    assembler tests for roundsd and roundss to x86_64 assembler tests.
    Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
    Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
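The roundsd instruction being wired up here is exposed to C++ through the SSE4.1 intrinsics header, which is also why the commit gates it on instruction set features. A standalone demo of its rounding modes, not ART code; assumes a compiler flag such as -msse4.1:

    // Demo of the SSE4.1 roundsd instruction's modes; not ART code.
    #include <smmintrin.h>  // SSE4.1 intrinsics
    #include <cstdio>

    template <int kMode>  // rounding mode must be a compile-time immediate
    double RoundWith(double value) {
      __m128d v = _mm_set_sd(value);
      // roundsd: round the low double of 'v' using the immediate mode.
      return _mm_cvtsd_f64(_mm_round_sd(v, v, kMode));
    }

    int main() {
      double x = 2.5;
      printf("floor=%.1f ceil=%.1f nearest=%.1f\n",
             RoundWith<_MM_FROUND_FLOOR>(x),            // 2.0
             RoundWith<_MM_FROUND_CEIL>(x),             // 3.0
             RoundWith<_MM_FROUND_TO_NEAREST_INT>(x));  // 2.0 (ties to even)
      return 0;
    }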
* Merge "ART: Boolean simplifier"David Brazdil2015-03-241-13/+3
|\
| * ART: Boolean simplifier (David Brazdil, 2015-03-24; 1 file, -13/+3)
    The optimization recognizes the negation pattern generated by 'javac'
    and replaces it with a single condition. To this end, boolean values
    are now consistently assumed to be represented by an integer.

    This is a first optimization which deletes blocks from the HGraph and
    does so by replacing the corresponding entries with null. Hence,
    existing code can continue indexing the list of blocks with the block
    ID, but must check for null when iterating over the list.
    Change-Id: I7779da69cfa925c6521938ad0bcc11bc52335583
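The negation pattern in question is easiest to see at the source level. A hypothetical before/after, written as plain C++ rather than HGraph nodes, with comments naming the IR constructs involved:

    // Shape of what javac emits for a negated condition: a control-flow
    // diamond that materializes 1 or 0 into a phi (names are illustrative).
    int BeforeSimplification(int x, int y) {
      int result;
      if (x < y) {       // HIf on an HCondition
        result = 0;
      } else {
        result = 1;
      }
      return result;     // HPhi(0, 1) at the merge point
    }

    // After the pass: the diamond and the phi collapse into one condition.
    int AfterSimplification(int x, int y) {
      return x >= y;     // single materialized HCondition, no branches
    }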
* | Unify ART's various implementations of bit_cast. (Roland Levillain, 2015-03-24; 1 file, -2/+2)
|/
    ART had several implementations of art::bit_cast:

    1. one in runtime/base/casts.h, declared as:
         template <class Dest, class Source>
         inline Dest bit_cast(const Source& source);

    2. another one in runtime/utils.h, declared as:
         template<typename U, typename V>
         static inline V bit_cast(U in);

    3. and a third local version, in runtime/memory_region.h, similar to
       the previous one:
         template<typename Source, typename Destination>
         static Destination MemoryRegion::local_bit_cast(Source in);

    This CL removes versions 2. and 3. and changes their callers to use 1.
    instead. That version was chosen over the others as:
    - it was the oldest one in the code base; and
    - its syntax was closer to the standard C++ cast operators, as it
      supports the following use:
        bit_cast<Destination>(source)
      since `Source' can be deduced from `source'.
    Change-Id: I7334fd5d55bf0b8a0c52cb33cfbae6894ff83633
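For reference, the retained shape (version 1) can be implemented safely with memcpy. This is a standalone sketch under that assumption, not the actual casts.h implementation, whose compile-time checks may differ:

    #include <cstdio>
    #include <cstdint>
    #include <cstring>

    template <class Dest, class Source>
    inline Dest bit_cast(const Source& source) {
      static_assert(sizeof(Dest) == sizeof(Source),
                    "bit_cast requires equally sized types");
      Dest dest;
      std::memcpy(&dest, &source, sizeof(dest));  // reinterpret the bits
      return dest;
    }

    int main() {
      float f = 1.0f;
      // `Source' is deduced from the argument, matching the preferred syntax.
      uint32_t bits = bit_cast<uint32_t>(f);
      printf("0x%08x\n", static_cast<unsigned>(bits));  // prints 0x3f800000
      return 0;
    }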
* Update locations of registers after slow paths spilling. (Nicolas Geoffray, 2015-03-16; 1 file, -34/+176)
    Change-Id: Id9aafcc13c1a085c17ce65d704c67b73f9de695d
* Merge "[optimizing] Don't record None locations in the stack maps."Nicolas Geoffray2015-03-131-133/+33
|\
| * [optimizing] Don't record None locations in the stack maps. (Nicolas Geoffray, 2015-03-13; 1 file, -133/+33)
    - Moved environment recording from code generator to stack map stream.
    - Added creation/loading factory methods for the DexRegisterMap (hides
      internal details).
    - Added new tests.
    Change-Id: Ic8b6d044f0d8255c6759c19a41df332ef37876fe
* | Refactor code in preparation of correct stack maps in slow path. (Nicolas Geoffray, 2015-03-13; 1 file, -46/+50)
|/
    Move the logic of saving/restoring live registers in slow path into the
    SlowPathCode method. Also add a RecordPcInfo helper to SlowPathCode,
    which will act as the placeholder for saving correct stack maps.
    Change-Id: I25c2bc7a642ef854bbc8a3eb570e5c8c8d2d030c
* Fix build breakage. (Nicolas Geoffray, 2015-03-13; 1 file, -1/+1)
    Change-Id: I86959eca5d8f5458ff75c78776b0af9db9c26800
* Merge "Tweak liveness when instructions are used in environments."Nicolas Geoffray2015-03-131-0/+5
|\
| * Tweak liveness when instructions are used in environments. (Nicolas Geoffray, 2015-03-12; 1 file, -0/+5)
    Instructions remain live when debuggable, but only instructions with
    object types remain live when non-debuggable. Enable
    StackVisitor::GetThisObject for optimizing.
    Change-Id: Id87b2cbf33a02450059acc9993995782e5f28987
* | Compress the Dex register maps built by the optimizing compiler. (Roland Levillain, 2015-03-12; 1 file, -19/+29)
|/
    - Replace the current list-based (fixed-size) Dex register encoding in
      stack maps emitted by the optimizing compiler with another list-based
      variable-size Dex register encoding compressing short locations on
      1 byte (3 bits for the location kind, 5 bits for the value); other
      (large) values remain encoded on 5 bytes (see the sketch after this
      entry).
    - In addition, use slot offsets instead of byte offsets to encode the
      location of Dex registers placed in stack slots at small offsets, as
      it enables more values to use the short (1-byte wide) encoding
      instead of the large (5-byte wide) one.
    - Rename art::DexRegisterMap::LocationKind as
      art::DexRegisterLocation::Kind, turn it into a strongly-typed enum
      based on a uint8_t, and extend it to support new kinds
      (kInStackLargeOffset and kConstantLargeValue).
    - Move art::DexRegisterEntry from
      compiler/optimizing/stack_map_stream.h to runtime/stack_map.h and
      rename it as art::DexRegisterLocation.
    - Adjust art::StackMapStream, art::CodeGenerator::RecordPcInfo,
      art::CheckReferenceMapVisitor::CheckOptimizedMethod,
      art::StackVisitor::GetVRegFromOptimizedCode, and
      art::StackVisitor::SetVRegFromOptimizedCode.
    - Implement unaligned memory accesses in art::MemoryRegion.
    - Use them to manipulate data in Dex register maps.
    - Adjust oatdump to support the new Dex register encoding.
    - Update compiler/optimizing/stack_map_test.cc.
    Change-Id: Icefaa2e2b36b3c80bb1b882fe7ea2f77ba85c505
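A standalone sketch of the short 1-byte form described in the first bullet: 3 bits of kind, 5 bits of value. The kind names and the bit layout (kind in the high bits) are assumptions made for illustration; only the 3/5 split is taken from the commit message.

    #include <cassert>
    #include <cstdint>
    #include <cstdio>

    // Stand-ins for a few DexRegisterLocation::Kind values (names invented).
    enum class Kind : uint8_t { kNone, kInRegister, kInStack, kConstant };

    uint8_t EncodeShortLocation(Kind kind, uint8_t value) {
      assert(value < 32);  // only values 0..31 fit the 5-bit short form
      return (static_cast<uint8_t>(kind) << 5) | value;
    }

    Kind DecodeKind(uint8_t encoded) { return static_cast<Kind>(encoded >> 5); }
    uint8_t DecodeValue(uint8_t encoded) { return encoded & 0x1f; }

    int main() {
      // Slot offsets (second bullet) keep small stack locations within the
      // 5-bit value: stack slot 3 fits, where byte offset 12 might not.
      uint8_t b = EncodeShortLocation(Kind::kInStack, 3);
      printf("kind=%u value=%u\n",
             static_cast<unsigned>(DecodeKind(b)),
             static_cast<unsigned>(DecodeValue(b)));
      return 0;
    }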
* [optimizing] Use callee-save registers for x86 (Mark Mendell, 2015-03-05; 1 file, -35/+30)
    Add ESI, EDI, EBP to available registers for non-baseline mode. Ensure
    that they aren't used when byte-addressable registers are needed.
    Change-Id: Ie7130d4084c2ae9cfcd1e47c26eb3e5dcac1ebd6
    Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
* Opt Compiler: ARM64: Enable explicit memory barriers over acquire/release (Serban Constantinescu, 2015-03-02; 1 file, -1/+3)
    Implement remaining explicit memory barrier code paths and temporarily
    enable the use of explicit memory barriers for testing.

    This CL also enables the use of instruction set features in the ARM64
    backend. kUseAcquireRelease has been replaced with
    PreferAcquireRelease(), which for now is statically set to false
    (prefer explicit memory barriers).

    Please note that we still prefer acquire-release for the ARM64
    Optimizing Compiler, but we would like to exercise the explicit memory
    barrier code path too.
    Change-Id: I84e047ecd43b6fbefc5b82cf532e3f5c59076458
    Signed-off-by: Serban Constantinescu <serban.constantinescu@arm.com>
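The two lowering strategies being toggled correspond to the two portable idioms below (standalone C++, not ART code): on ARM64, an acquire load maps to a single ldar instruction, while the relaxed-load-plus-fence form maps to a plain ldr followed by an explicit dmb barrier.

    #include <atomic>

    std::atomic<int> flag{0};
    int payload = 0;

    int ReadWithAcquireLoad() {
      // Acquire/release path: one load-acquire instruction (ldar on ARM64).
      int f = flag.load(std::memory_order_acquire);
      return f != 0 ? payload : -1;
    }

    int ReadWithExplicitBarrier() {
      // Explicit-barrier path: plain load, then a standalone fence
      // (ldr followed by dmb on ARM64).
      int f = flag.load(std::memory_order_relaxed);
      std::atomic_thread_fence(std::memory_order_acquire);
      return f != 0 ? payload : -1;
    }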
* Merge "Display optimizing compiler's CodeInfo objects in oatdump."Roland Levillain2015-02-201-1/+1
|\
| * Display optimizing compiler's CodeInfo objects in oatdump. (Roland Levillain, 2015-02-19; 1 file, -1/+1)
    A few elements are not displayed yet (stack mask, inline info) though.
    Change-Id: I5e51a801c580169abc5d1ef43ad581aadc110754
* | Ensure the graph is correctly typed. (Nicolas Geoffray, 2015-02-19; 1 file, -0/+2)
|/
    We used to be forgiving because of HIntConstant(0) also being used for
    null. We now create a special HNullConstant for such uses.

    Also, we need to run dead phi elimination twice during SSA building to
    ensure correctness.
    Change-Id: If479efa3680d3358800aebb1cca692fa2d94f6e5
* Avoid generating jmp +0. (Nicolas Geoffray, 2015-02-18; 1 file, -6/+36)
    When a block branches to a non-following block, but blocks in between
    do branch to it, we can avoid doing the branch.
    Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
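The underlying check is simple: a goto needs no instruction when its target is the block emitted immediately next. A minimal standalone sketch with invented names, not the actual code generator:

    #include <cstdio>

    struct HBasicBlock { int id; };

    struct CodeGeneratorSketch {
      int next_block_id;  // block laid out right after the current one

      bool GotoIsElided(const HBasicBlock& successor) const {
        // A branch to the very next block would assemble to jmp +0.
        return successor.id == next_block_id;
      }

      void HandleGoto(const HBasicBlock& successor) const {
        if (GotoIsElided(successor)) {
          return;  // fall through; no instruction emitted
        }
        printf("jmp block%d\n", successor.id);
      }
    };

    int main() {
      CodeGeneratorSketch cg{/*next_block_id=*/2};
      cg.HandleGoto(HBasicBlock{2});  // elided: target is the next block
      cg.HandleGoto(HBasicBlock{5});  // emits the branch
      return 0;
    }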
* Optimize leaf methods. (Nicolas Geoffray, 2015-02-06; 1 file, -14/+32)
    Avoid suspend checks and stack changes when not needed.
    Change-Id: I0fdb31e8c631e99091b818874a558c9aa04b1628
* Finally implement Location::kNoOutputOverlap. (Nicolas Geoffray, 2015-02-04; 1 file, -1/+1)
    The [i, i + 1) interval scheme we chose for representing lifetime
    positions is not optimal for doing this optimization. It however
    doesn't prevent recognizing a non-split interval during the
    TryAllocateFreeReg phase and trying to re-use its inputs' registers.
    Change-Id: I80a2823b0048d3310becfc5f5fb7b1230dfd8201
* Use a different block order when not compiling baseline. (Nicolas Geoffray, 2015-02-03; 1 file, -47/+43)
    Use the linearized order instead, as it puts blocks logically next to
    each other in a better way. Also, it does not contain dead blocks.
    Change-Id: Ie65b56041a093c8155e6c1e06351cb36a4053505
* Support callee-save registers on ARM. (Nicolas Geoffray, 2015-01-24; 1 file, -3/+1)
    Change-Id: I7c519b7a828c9891b1141a8e51e12d6a8bc84118
* Support callee save floating point registers on x64. (Nicolas Geoffray, 2015-01-23; 1 file, -0/+3)
    - Share the computation of core_spill_mask and fpu_spill_mask between
      backends.
    - Remove explicit stack overflow check support: we need to adjust them,
      and since they are not tested, they will easily bitrot.
    Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
* Enable core callee-save on x64. (Nicolas Geoffray, 2015-01-21; 1 file, -17/+36)
    Will work on other architectures and FP support in other CLs.
    Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
* Merge "ART: Replace NULL to nullptr in the optimizing compiler"Roland Levillain2015-01-211-1/+1
|\
| * ART: Replace NULL to nullptr in the optimizing compiler (Jean Christophe Beyler, 2015-01-21; 1 file, -1/+1)
    Replace the NULL macro with the C++ nullptr keyword.
    Change-Id: Ib6e48dd4bb3c254343383011b67372622578ca76
    Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
* | Revert "Revert "Fully support pairs in the register allocator.""Nicolas Geoffray2015-01-211-0/+8
| | | | | | | | | | | | This reverts commit c399fdc442db82dfda66e6c25518872ab0f1d24f. Change-Id: I19f8215c4b98f2f0827e04bf7806c3ca439794e5
* | Record implicit null checks at the actual invoke time. (Calin Juravle, 2015-01-21; 1 file, -0/+34)
    ImplicitNullChecks are recorded only for instructions directly (see NB
    below) preceded by NullChecks in the graph. This way we avoid recording
    redundant safepoints and minimize the code size increase.

    NB: ParallelMoves might be inserted by the register allocator between
    the NullChecks and their uses. These modify the environment, and the
    correct action would be to reverse their modification. This will be
    addressed in a follow-up CL.
    Change-Id: Ie50006e5a4bd22932dcf11348f5a655d253cd898
* | Revert "Fully support pairs in the register allocator."Nicolas Geoffray2015-01-211-8/+0
| | | | | | | | | | | | | | | | Libcore tests fail. This reverts commit 41aedbb684ccef76ff8373f39aba606ce4cb3194. Change-Id: I2572f120d4bbaeb7a4d4cbfd47ab00c9ea39ac6c
* | Fully support pairs in the register allocator. (Nicolas Geoffray, 2015-01-21; 1 file, -0/+8)
    Enabled on ARM for longs and doubles.
    Change-Id: Id8792d08bd7ca9fb049c5db8a40ae694bafc2d8b
* | Merge "Add implicit null checks for the optimizing compiler"Calin Juravle2015-01-201-5/+7
|\ \
| * | Add implicit null checks for the optimizing compiler (Calin Juravle, 2015-01-16; 1 file, -5/+7)
    - For backends: arm, arm64, x86, x86_64.
    - Fixed parameter passing for CodeGenerator.
    - 003-omnibus-opcodes test verifies that NullPointerExceptions work as
      expected.
    Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
* | | Do not use register pair in a parallel move. (Nicolas Geoffray, 2015-01-16; 1 file, -4/+2)
|/ /
    The ParallelMoveResolver does not work with pairs. Instead, decompose
    the pair into two individual moves.
    Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
* | [optimizing compiler] Compute live spill size (Mark Mendell, 2015-01-15; 1 file, -2/+5)
    The current stack frame calculation assumes that each live register to
    be saved/restored has the word size of the machine. This fails for x86,
    where a double in an XMM register takes up 8 bytes. Change the
    calculation to keep track of the number of core registers and the
    number of FP registers to handle this distinction.

    This is slightly pessimal, as the registers may not be active at the
    same time, but the only way to handle this would be to allocate both
    classes of registers simultaneously, or to remember all the active
    intervals, matching them up and computing the size of each safepoint
    interval.
    Change-Id: If7860aa319b625c214775347728cdf49a56946eb
    Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
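The fixed computation reduces to simple arithmetic: count the two register classes separately and give each its own slot size. A standalone sketch, with x86-flavored sizes assumed for illustration:

    #include <cstddef>
    #include <cstdio>

    constexpr size_t kCoreRegisterSize = 4;  // x86 GPR (machine word)
    constexpr size_t kFpRegisterSize = 8;    // XMM register holding a double

    size_t ComputeSpillAreaSize(size_t live_core_registers,
                                size_t live_fp_registers) {
      // The old assumption was (core + fp) * word size, which
      // under-allocates when FP registers hold 8-byte doubles.
      return live_core_registers * kCoreRegisterSize +
             live_fp_registers * kFpRegisterSize;
    }

    int main() {
      // 2 live GPRs + 1 live double: 2*4 + 1*8 = 16 bytes, not 3*4 = 12.
      printf("%zu\n", ComputeSpillAreaSize(2, 1));
      return 0;
    }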
* | Merge "Move code around in OptimizingCompiler::Compile to reduce stack space."Nicolas Geoffray2015-01-121-6/+5
|\ \
| * | Move code around in OptimizingCompiler::Compile to reduce stack space. (Nicolas Geoffray, 2015-01-12; 1 file, -6/+5)
    Also fix an (intentional) memory leak by allocating the CodeGenerator
    on the heap instead of the arena: code generators construct an
    Assembler object that requires destruction.
    Bug: 18787334
    Change-Id: I8cf0667cb70ce5b14d4ac334bd4487a562635f1b
* | | Implement double and float support for arm in register allocator. (Nicolas Geoffray, 2015-01-08; 1 file, -0/+8)
|/ /
    The basic approach is:
    - An instruction that needs two registers gets two intervals.
    - When allocating the low part, we also allocate the high part.
    - When splitting a low (or high) interval, we also split the high (or
      low) equivalent.
    - Allocation follows the (S/D register) requirement that low registers
      are always even and the high equivalent is low + 1 (sketched below).
    Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
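The even/odd pairing rule in the last bullet can be sketched in isolation (invented names, not the actual allocator): a 64-bit value claims S registers (low, low + 1) with an even low register, which is exactly how a D register overlays two S registers on ARM.

    #include <cstdio>

    constexpr int kNumberOfSRegisters = 32;
    bool s_register_in_use[kNumberOfSRegisters] = {};

    // Returns the even low register of a free (low, low + 1) pair, or -1.
    int AllocateRegisterPair() {
      for (int low = 0; low + 1 < kNumberOfSRegisters; low += 2) {
        if (!s_register_in_use[low] && !s_register_in_use[low + 1]) {
          s_register_in_use[low] = s_register_in_use[low + 1] = true;
          return low;  // D(low / 2) aliases S(low) and S(low + 1)
        }
      }
      return -1;  // no aligned pair is free; the caller must spill
    }

    int main() {
      s_register_in_use[0] = true;       // S0 already taken
      int low = AllocateRegisterPair();  // skips S0/S1; returns 2 (S2/S3 = D1)
      printf("allocated S%d/S%d (= D%d)\n", low, low + 1, low / 2);
      return 0;
    }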
* | Look at instruction set features when generating volatiles code (Calin Juravle, 2015-01-05; 1 file, -2/+4)
    Change-Id: Ia882405719fdd60b63e4102af7e085f7cbe0bb2a
* | ART: Swap-space in the compiler (Andreas Gampe, 2014-12-22; 1 file, -1/+1)
    Introduce a swap-space and corresponding allocator to transparently
    switch native allocations to memory backed by a file.
    Bug: 18596910
    (cherry picked from commit 62746d8d9c4400e4764f162b22bfb1a32be287a9)
    Change-Id: I131448f3907115054a592af73db86d2b9257ea33