path: root/compiler/optimizing/parallel_move_resolver.cc
Commit message [Author, Age, Files, Lines]
* Opt compiler: Implement parallel move resolver without using swap. [Zheng Xu, 2015-04-17, 1 file, -25/+284]
|   The algorithm of ParallelMoveResolverNoSwap() is almost the same as that of ParallelMoveResolverWithSwap(), except for the way it resolves circular dependencies. NoSwap() uses an additional scratch register to resolve the circular dependency. For example, (0->1) (1->2) (2->0) will be performed as (2->scratch) (1->2) (0->1) (scratch->0).
|
|   On architectures without swap register support, NoSwap() can reduce the number of moves from 3x(N-1) to (N+1) when there is a circular dependency with N moves.
|
|   In addition, the NoSwap() algorithm does not depend on architecture register layout information, which means it can support register pairs on arm32 and X/W, D/S registers on arm64 without additional modification.
|
|   Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
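The scratch-register idea described in the commit message above can be sketched as follows. This is a toy model and not the ART implementation: locations are plain integers, and the names `Move`, `ResolveCycleWithScratch`, `ApplyInOrder`, `SwapBasedMoveCount` and `ScratchBasedMoveCount` are hypothetical. It assumes the cycle is given in chain order and that the scratch location is not used by any move in the cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: a move is (source, destination), locations are indices.
using Move = std::pair<int, int>;

// Serialises one circular dependency, e.g. (0->1) (1->2) (2->0), using a
// scratch location. Assumption: `scratch` does not appear in `cycle`.
std::vector<Move> ResolveCycleWithScratch(const std::vector<Move>& cycle,
                                          int scratch) {
  std::vector<Move> out;
  if (cycle.empty()) return out;
  const Move last = cycle.back();
  // Save the source of the last move; this breaks the cycle.
  out.push_back(Move(last.first, scratch));
  // Emit the remaining moves back to front, so every destination has
  // already been read (or saved) before it is overwritten.
  for (std::size_t i = cycle.size() - 1; i-- > 0;) {
    out.push_back(cycle[i]);
  }
  // Complete the saved move from the scratch location.
  out.push_back(Move(scratch, last.second));
  return out;
}

// Executes moves one after another on a register file.
std::vector<int> ApplyInOrder(std::vector<int> regs,
                              const std::vector<Move>& moves) {
  for (const Move& m : moves) regs[m.second] = regs[m.first];
  return regs;
}

// Move counts quoted in the commit message for a cycle of n moves:
// n-1 swaps of three moves each, versus n moves plus one scratch save.
int SwapBasedMoveCount(int n) { return 3 * (n - 1); }
int ScratchBasedMoveCount(int n) { return n + 1; }
```

For the three-move cycle from the commit message, the sketch emits four moves (N+1), where emulating two swaps with three moves each would take six.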
* Merge "Revert "[optimizing] Improve x86 parallel moves/swaps"" [Calin Juravle, 2015-04-16, 1 file, -24/+0]
|\
| * Revert "[optimizing] Improve x86 parallel moves/swaps" [Guillaume Sanchez, 2015-04-15, 1 file, -24/+0]
| |   This reverts commit a5c19ce8d200d68a528f2ce0ebff989106c4a933.
| |
| |   This commit introduces a 30% performance regression on CaffeineLogic.
| |
| |   Change-Id: I917e206e249d44e1748537bc1b2d31054ea4959d
* | Type MoveOperands. [Nicolas Geoffray, 2015-04-15, 1 file, -4/+4]
|/
|   The ParallelMoveResolver implementation needs to know whether a move is for 64 bits or not, to handle swaps correctly.
|
|   Bug found, and test case courtesy of Serguei I. Katkov.
|
|   Change-Id: I9a0917a1cfed398c07e57ad6251aea8c9b0b8506
* [optimizing] Improve x86 parallel moves/swaps [Mark Mendell, 2015-04-10, 1 file, -0/+24]
|   Add a new constructor to ScratchRegisterScope that will supply a register if there is a free one, but not spill to force one. Use this to generate alternate code that doesn't use a temporary, as the spill/restore of a register generates extra instructions that aren't necessary on x86.
|
|   Here is the benefit for a 32 bit memory-to-memory exchange with no free registers:
|
|   < 50         push eax
|   < 53         push ebx
|   < 8B44244C   mov eax, [esp + 76]
|   < 8B5C246C   mov ebx, [esp + 108]
|   < 8944246C   mov [esp + 108], eax
|   < 895C244C   mov [esp + 76], ebx
|   < 5B         pop ebx
|   < 58         pop eax
|   ---
|   > FF742444   push [esp + 68]
|   > FF742468   push [esp + 104]
|   > 8F44244C   pop [esp + 72]
|   > 8F442468   pop [esp + 100]
|
|   Avoid using xchg instruction, as it is slow on smaller processors.
|
|   Change-Id: Id29ee3abd998577baaee552d55d23e60ae0c7871
|   Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
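The offsets in the four-instruction push/pop sequence from the commit message above differ between instructions because ESP moves with every push and pop. The toy model below (hypothetical names `ToyStack`, `PushMem`, `PopMem`, `ExchangeSlots`; not ART code) replays that sequence. It relies on the x86 rule that `push [esp + off]` computes its operand address before ESP is decremented, while `pop [esp + off]` computes it after ESP is incremented. The starting ESP value and the slot contents are arbitrary assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy 32-bit stack model: sparse memory of 32-bit words plus ESP.
struct ToyStack {
  std::map<std::uint32_t, std::uint32_t> mem;  // address -> word
  std::uint32_t esp;

  // push [esp + offset]: the operand address uses ESP before the decrement.
  void PushMem(std::uint32_t offset) {
    std::uint32_t value = mem[esp + offset];
    esp -= 4;
    mem[esp] = value;
  }

  // pop [esp + offset]: the operand address uses ESP after the increment.
  void PopMem(std::uint32_t offset) {
    std::uint32_t value = mem[esp];
    esp += 4;
    mem[esp + offset] = value;
  }
};

// Replays the sequence from the commit message. Relative to the ESP value
// on entry, the two exchanged slots live at offsets 68 and 100.
void ExchangeSlots(ToyStack& s) {
  s.PushMem(68);   // push slot A
  s.PushMem(104);  // push slot B (100 + 4, since ESP dropped by 4)
  s.PopMem(72);    // B's value into slot A (68 + 4, one word still pushed)
  s.PopMem(100);   // A's value into slot B (ESP is back at its entry value)
}
```

After `ExchangeSlots` the two slots have traded contents and ESP is unchanged, with no register used as a temporary.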
* Fix wrong assumptions about ParallelMove. [Nicolas Geoffray, 2015-03-31, 1 file, -27/+31]
|   Registers involved in single and double operations can drag stack locations as well, so it is possible to update a single stack location with a slot from a double stack location.
|
|   bug:19999189
|
|   Change-Id: Ibeec7d6f1b3126c4ae226fca56e84dccf798d367
* Improve ParallelMoveResolver to work with pairs. [Nicolas Geoffray, 2015-02-10, 1 file, -21/+94]
|   Change-Id: Ie2a540ffdb78f7f15d69c16a08ca2d3e794f65b9
* Do not use register pair in a parallel move. [Nicolas Geoffray, 2015-01-16, 1 file, -0/+3]
|   The ParallelMoveResolver does not work with pairs. Instead, decompose the pair into two individual moves.
|
|   Change-Id: Ie9d3f0b078cef8dc20640c98b20bb20cc4971a7f
* Remove constant moves after emitting them in parallel resolver. [Nicolas Geoffray, 2015-01-14, 1 file, -3/+5]
|   This fixes the case where a constant move requires a scratch register. Note that there is no backend that needs this for now, but X86 might with the move to hard float.
|
|   Change-Id: I37f6b8961b48f2cf6fbc0cd281e70d58466d018e
* ART: More warnings [Andreas Gampe, 2014-11-04, 1 file, -3/+3]
|   Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general, and -Wunused-but-set-parameter for GCC builds.
|
|   Change-Id: I81bbdd762213444673c65d85edae594a523836e5
* Stop converting from Location to ManagedRegister. [Nicolas Geoffray, 2014-10-09, 1 file, -2/+1]
|   Now the source of truth is the Location object that knows which register (core, pair, fpu) it needs to refer to.
|
|   Change-Id: I62401343d7479ecfb24b5ed161ec7829cda5a0b1
* Enable the register allocator on ARM. [Nicolas Geoffray, 2014-06-12, 1 file, -8/+8]
|   - Also fixes a few bugs/wrong assumptions in code not hit by x86.
|   - We need to differentiate between moves due to connecting siblings within a block, and moves due to control flow resolution.
|
|   Change-Id: Idd05cf138a71c8f36f5531c473de613c0166fe38
* Final CL to enable register allocation on x86. [Nicolas Geoffray, 2014-06-12, 1 file, -0/+60]
|   This CL implements:
|   1) Resolution after allocation: connecting the locations allocated to an interval within a block and between blocks.
|   2) Handling of fixed registers: some instructions require inputs/output to be at a specific location, and the allocator needs to deal with them in a special way.
|   3) ParallelMoveResolver::EmitNativeCode for x86.
|
|   Change-Id: I0da6bd7eb66877987148b87c3be6a983b4e3f858
* Import Dart's parallel move resolver. [Nicolas Geoffray, 2014-05-23, 1 file, -0/+150]
    And write a few tests while at it. A parallel move resolver will be needed for performing multiple moves that are conceptually parallel, for example moves at a block exit that branches to a block with phi nodes.

    Change-Id: Ib95b247b4fc3f2c2fcab3b8c8d032abbd6104cd7
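To illustrate what "conceptually parallel" means here, the sketch below contrasts parallel semantics (all sources are read before any destination is written) with naive in-order execution of the same moves. It is a toy model, not the resolver itself; `Move`, `ApplyParallel` and `ApplySequentially` are hypothetical names and locations are plain indices.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model: a move is (source, destination), locations are indices.
using Move = std::pair<int, int>;

// Parallel semantics: every source is read from the original state, so the
// order of the moves does not matter.
std::vector<int> ApplyParallel(const std::vector<int>& regs,
                               const std::vector<Move>& moves) {
  std::vector<int> result = regs;
  for (const Move& m : moves) result[m.second] = regs[m.first];
  return result;
}

// Naive in-order execution: a later move may read a location that an
// earlier move has already overwritten.
std::vector<int> ApplySequentially(std::vector<int> regs,
                                   const std::vector<Move>& moves) {
  for (const Move& m : moves) regs[m.second] = regs[m.first];
  return regs;
}
```

With two locations holding 1 and 2 and the moves (0->1) (1->0), `ApplyParallel` exchanges the values, whereas `ApplySequentially` leaves both locations holding 1. A parallel move resolver has to find an ordering, plus swaps or scratch locations for cycles, whose in-order execution matches the parallel result.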