| Commit message | Author | Age | Files | Lines |
| |
Added missing EntryPointToCodePointer.
This reverts commit a5ca888d715cd0c6c421313211caa1928be3e399.
Change-Id: Ia74df0ef3a7babbdcb0466fd24da28e304e3f5af
| |
Sorry, run-test crashes on target:
10-05 12:15:51.633 I/DEBUG (27995): Abort message: 'art/runtime/mirror/art_method.cc:349] Check failed: PcIsWithinQuickCode(reinterpret_cast<uintptr_t>(code), pc) java.lang.Throwable java.lang.Throwable.fillInStackTrace() pc=71e3366b code=0x71e3362d size=ad000000'
10-05 12:15:51.633 I/DEBUG (27995): r0 00000000 r1 0000542b r2 00000006 r3 00000000
10-05 12:15:51.633 I/DEBUG (27995): r4 00000006 r5 b6f9addc r6 00000002 r7 0000010c
10-05 12:15:51.633 I/DEBUG (27995): r8 b63fe1e8 r9 be8e1418 sl b6427400 fp b63fcce0
10-05 12:15:51.633 I/DEBUG (27995): ip 0000542b sp be8e1358 lr b6e9a27b pc b6e9c280 cpsr 40070010
10-05 12:15:51.633 I/DEBUG (27995):
Bug: 17950037
This reverts commit 2535abe7d1fcdd0e6aca782b1f1932a703ed50a4.
Change-Id: I6f88849bc6f2befed0c0aaa0b7b2a08c967a83c3
| |
Currently disabled by default unless -Xjit is passed in.
The proposed JIT is a method JIT that uses interpreter
instrumentation to request compilation of hot methods asynchronously
at runtime.
JIT options:
-Xjit / -Xnojit
-Xjitcodecachesize:N
-Xjitthreshold:integervalue
The JIT has a shared copy of a compiler driver which is accessed
by worker threads to compile individual methods.
Added JIT code cache and data cache, currently sized at 2 MB
capacity by default. Most apps, however, will fill only a small
fraction of this cache.
Added support to the compiler for compiling interpreter quickened
byte codes.
Added test target ART_TEST_JIT=TRUE and --jit for run-test.
TODO:
Clean up code cache.
Delete compiled methods after they are added to code cache.
Add more optimizations related to runtime checks e.g. direct pointers
for invokes.
Add method recompilation.
Move instrumentation to DexFile to improve performance and reduce
memory usage.
Bug: 17950037
Change-Id: Ifa5b2684a2d5059ec5a5210733900aafa3c51bca
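The threshold-driven compile request described in the commit above can be sketched roughly as follows. This is a minimal illustration only, not ART's implementation: HotMethodTracker, OnMethodEntered and Pending are hypothetical names, methods are identified by plain strings, and the queue stands in for whatever the worker threads would actually consume.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: per-method entry counters feeding a compile queue.
class HotMethodTracker {
 public:
  explicit HotMethodTracker(std::uint32_t threshold) : threshold_(threshold) {
    assert(threshold > 0);
  }

  // Would be called from an interpreter instrumentation hook on method entry.
  // Returns true exactly once, when the method first reaches the threshold.
  bool OnMethodEntered(const std::string& method) {
    std::uint32_t count = ++counts_[method];
    if (count == threshold_) {
      pending_.push_back(method);  // A worker thread would drain this queue.
      return true;
    }
    return false;
  }

  const std::vector<std::string>& Pending() const { return pending_; }

 private:
  std::uint32_t threshold_;
  std::unordered_map<std::string, std::uint32_t> counts_;
  std::vector<std::string> pending_;
};
```

The interpreter keeps running the method while the request is pending, which is why the hook only enqueues and returns.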
| |
Change-Id: Id718f8a4450adf1608306286fa4e6b9194022532
| |
Make several fields const in CompilationUnit. May benefit some Mir2Lir
code that repeats tests, and in general immutability is good.
Remove compiler_internals.h and refactor some other headers to reduce
overly broad imports (and thus forced recompiles on changes).
Change-Id: I898405907c68923581373b5981d8a85d2e5d185a
| |
Move the TypeInference pass to post-opt passes and make it
a PassMEMirSsaRep as we need to rerun the pass if the SSA
representation has changed. (Though we currently don't have
any pass that would require it.)
The results of MethodUseCount and ConstantPropagation passes
are used only in the BBOptimization and codegen and stay
valid across BBOptimization and SuspendCheckElimination, so
move them out of post-opt passes to just before the BBOpt
(and reverse the dependency between ConstantPropagation and
init reg locations passes).
Change-Id: If02c087107cef48d5f9f7c18b0a0ace370fe2647
| |
Change-Id: I85aa9e7d744b37ee3d2531c50470cd3fa87dc864
| |
.method public static getInt(I)I
.registers 2
const/4 v0, 0x0
if-ne v0, v0, :after
float-to-int v0, v0
:exit
add-int/2addr v0, v1
return v0
:after
move v1, v0
goto :exit
.end method
In this code sample, v1 is the single parameter to this method. In one
of the phi-nodes inserted between :exit and add-int/2addr, v1's two
incoming SSA regs are:
- the initial def of v1 as a parameter
- the v1 def'd at move v1, v0.
During type inference, because the 2nd def is a float (because of the
earlier float-to-int v0, v0) this will change the type of the 1st def to a
float as well, which is incorrect since the first parameter is known to be
non-float.
This fix checks during phi-node type-inference if an SSA reg that is the
initial def of a parameter vreg is about to be set as float when it was
not previously, and skips the inference if so.
In this case, when using a hard-float ABI, having the in-reg v1 set as
float causes FlushIns() to read the argument to the method from an FP reg,
when the argument will be passed in a core reg by any caller.
Also included is a smali test for this bug: compare difference between
./run-test --64 800
./run-test --64 --interpreter 800
when the vreg_analysis patch has not been applied.
(Requires 64-bit because 32-bit ARM currently does not use hard-float.)
getInt(I)I should return its argument, but it returns an incorrect
value.
Change-Id: I1d4b5be6a931fe853279e89dd820820f29823da1
Signed-off-by: Stephen Kyle <stephen.kyle@arm.com>
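The guard described in the commit above might look roughly like this. It is a simplified sketch under stated assumptions: SsaReg and InferPhiFloat are hypothetical names, only the float flag is modelled, and the whole phi is skipped when a non-float parameter def would otherwise be retyped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-SSA-register type info; only the float bit is modelled.
struct SsaReg {
  bool is_float = false;
  bool is_param_initial_def = false;  // Initial def of an "in" vreg.
};

// Propagates "float" across the registers joined by one phi (def and uses).
// Returns true if any register's type changed.
bool InferPhiFloat(std::vector<SsaReg>& regs, const std::vector<std::size_t>& phi_regs) {
  bool any_float = false;
  for (std::size_t r : phi_regs) {
    assert(r < regs.size());
    any_float = any_float || regs[r].is_float;
  }
  if (!any_float) return false;

  // The guard: a parameter's initial def that is not already float must not
  // be retyped, so skip inference for this phi entirely.
  for (std::size_t r : phi_regs) {
    if (regs[r].is_param_initial_def && !regs[r].is_float) return false;
  }

  bool changed = false;
  for (std::size_t r : phi_regs) {
    if (!regs[r].is_float) {
      regs[r].is_float = true;
      changed = true;
    }
  }
  return changed;
}
```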
| |
Changes compiler temporaries to have positive names. The numbering now
puts them above the code VRs (locals + ins, in that order). The patch also
introduces APIs to query the number of temporaries, locals and ins.
The compiler temp infrastructure suffered from several issues
which are also addressed by this patch:
- There is no longer a queue of compiler temps. The queue would be polluted
with Method* when post opts were called multiple times.
- Sanity checks have been added to allow requesting temps from the BE
and to prevent temp requests after the frame is committed.
- None of the structures holding temps can overflow because they are
allocated to hold the maximum number of temps. Thus temps can be requested
by the BE with no problem.
- Since the queue of compiler temps is no longer maintained, it is no
longer possible to refer to a temp that has invalid SSA (because it
was requested before SSA was run).
- The BE can now request temps after all ME allocations and is guaranteed
to actually receive them.
- ME temps are now treated like normal VRs in all cases with no special
handling. Only the BE temps are handled specially because there are no
references to them from MIRs.
- Deprecated and removed several fields in CompilationUnit that saved
register information and updated callsites to call the new interface from
MIRGraph.
Change-Id: Ia8b1fec9384a1a83017800a59e5b0498dfb2698c
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
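The numbering described in the commit above (locals, then ins, then temps, all with non-negative names) can be illustrated with a small sketch. VRegLayout and its members are hypothetical names chosen for this example; the real query APIs live on MIRGraph and may differ.

```cpp
#include <cassert>

// Hypothetical layout of virtual register numbers:
//   [0, num_locals)                      locals
//   [num_locals, num_locals + num_ins)   ins
//   [NumCodeVRs(), NumCodeVRs() + num_temps)  compiler temporaries
struct VRegLayout {
  int num_locals;
  int num_ins;
  int num_temps;

  int NumCodeVRs() const { return num_locals + num_ins; }
  int FirstIn() const { return num_locals; }
  int FirstTemp() const { return NumCodeVRs(); }

  bool IsIn(int v) const {
    assert(v >= 0);
    return v >= FirstIn() && v < NumCodeVRs();
  }

  bool IsTemp(int v) const {
    assert(v >= 0);
    return v >= FirstTemp() && v < FirstTemp() + num_temps;
  }
};
```

With 3 locals, 2 ins and 2 temps, vregs 3 and 4 are the ins and vregs 5 and 6 are the temps.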
|
|\ |
|
| |
| |
| |
| |
| |
| |
| | |
Modified FlagsOf to handle extended flags.
Change-Id: I9e47e0c42816136b2b53512c914200dd9dd11376
Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
|
|/
|
|
|
|
|
| |
Clean up the compiler: fewer extern functions, disentangle
compilers, hide some compiler specifics, lower global includes.
Change-Id: Ibaf88d02505d86994d7845cf0075be5041cc8438
| |
Rewrite the topological sort order to include a full loop
before the blocks that go after the loop. Add a new iterator
class LoopRepeatingTopologicalSortIterator that differs from
the RepeatingTopologicalSortIterator by repeating only loops
and repeating them early. It returns to the loop head if the
head needs recalculation when we reach the end of the loop.
In GVN, use the new loop-repeating topological sort iterator
and for a loop head merge only the preceding blocks' LVNs
if we're not currently recalculating this loop.
Also fix LocalValueNumbering::InPlaceIntersectMaps() which
was keeping only the last element of the intersection, avoid
some unnecessary processing during LVN merge and add some
missing braces to MIRGraph::InferTypeAndSize().
Bug: 16398693
Change-Id: I4e10d4acb626a5b8a28ec0de106a7b37f9cbca32
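As an illustration of the kind of in-place intersection mentioned in the commit above, the sketch below keeps every entry present in both maps rather than only the last one. It is not the LocalValueNumbering code: InPlaceIntersect is a hypothetical helper over std::map, and treating two entries as common only when both key and value match is an assumption.

```cpp
#include <cassert>
#include <map>

// Removes from *work every entry that is not present, with an equal value,
// in `other`. All common entries survive.
template <typename K, typename V>
void InPlaceIntersect(std::map<K, V>* work, const std::map<K, V>& other) {
  assert(work != nullptr);
  auto it = work->begin();
  while (it != work->end()) {
    auto found = other.find(it->first);
    if (found == other.end() || !(found->second == it->second)) {
      it = work->erase(it);  // erase() returns the next valid iterator.
    } else {
      ++it;
    }
  }
}
```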
| |
Add a method Invokes to test for the kInvoke flag.
Also moved IsPseudoMirOp to DecodedInstruction to use it for the various
query methods.
Change-Id: I59a2056b7b802b8393fa2b0d977304d252b38c89
Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
| |
The Dex specification is a bit loose - particularly in regards
to types. The Quick compiler, though, makes some assumptions
about types. In particular, it doesn't expect to encounter the
use of a long or double virtual register pair that was defined
by two independent 32-bit operations.
dx does not create such patterns (at least in recent memory).
However, at least one such case exists in the wild. The next
version of the Dex specification will add more type constraints
and formally disallow such cases. Meanwhile, existing code will
be handled by identifying these cases and reverting to interpretation
for the offending method.
Fix for internal b/15616104
Change-Id: Ibe9c423be9a952ff58cf8d985aa164885b8dd2ae
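One way to picture the check described in the commit above is a tracker that remembers which instruction last wrote each vreg. This is a sketch for straight-line code only, with hypothetical names (WidePairTracker, DefineNarrow, DefineWide, WideUseIsConsistent); the actual detection in the Quick compiler works on its own IR and is not shown in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracker: records the defining instruction of each vreg and
// whether that definition was a 64-bit (wide) one.
class WidePairTracker {
 public:
  explicit WidePairTracker(std::size_t num_vregs)
      : def_insn_(num_vregs, -1), wide_def_(num_vregs, false) {}

  // A 32-bit instruction `insn` writes vreg v.
  void DefineNarrow(std::size_t v, int insn) {
    assert(v < def_insn_.size());
    def_insn_[v] = insn;
    wide_def_[v] = false;
  }

  // A 64-bit instruction `insn` writes the pair (v, v + 1).
  void DefineWide(std::size_t v, int insn) {
    assert(v + 1 < def_insn_.size());
    def_insn_[v] = insn;
    def_insn_[v + 1] = insn;
    wide_def_[v] = true;
    wide_def_[v + 1] = true;
  }

  // A wide use of (v, v + 1) is consistent only if both halves came from the
  // same wide definition. Otherwise the method would go to the interpreter.
  bool WideUseIsConsistent(std::size_t v) const {
    assert(v + 1 < def_insn_.size());
    return wide_def_[v] && wide_def_[v + 1] &&
           def_insn_[v] != -1 && def_insn_[v] == def_insn_[v + 1];
  }

 private:
  std::vector<int> def_insn_;
  std::vector<bool> wide_def_;
};
```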
| |
MIRGraph::InlineCalls() was using the MIR opcode to recover
Dalvik instruction flags - something that is only valid for
Dalvik opcodes and not the set of extended MIR opcodes.
This is probably the 3rd or 4th time we've had a bug using
the MIR opcode in situations that are only valid for the Dalvik
opcode subset. I took the opportunity to scan the code for
other cases of this (didn't find any), and did some cleanup while
I was in the neighborhood.
We should probably rework the DalvikOpcode/MirOpcode model whenever we
get around to removing DalvikInstruction from MIR.
Internal bug b/15352667: out-of-bound access in mir_optimization.cc
Change-Id: I75f06780468880892151e3cdd313e14bfbbaa489
| |
This is the first of two CLs intended to fix up and make consistent
the handling of references in the Quick backend. A sibling
CL c/96237 updates the runtime to treat Method* as a compressed
reference when stored. This CL makes a similar change for the
backend.
As far as the general handling of in-register references, though,
the current Quick backend is not consistent even for non-Method*
references. Sometimes they are treated as references, but other
times are handled as if they were 32-bit ints. A subsequent CL
will deal with that issue.
Change-Id: I5591c5eea6cca6ed22208ab806fd38b959c9d03d
| |
Significant refactoring of register handling to unify usage across
all targets & 32/64 backends.
Reworked RegStorage encoding to allow expanded use of
x86 xmm registers; removed vector registers as a separate
register type. Reworked RegisterInfo to describe aliased
physical registers. Eliminated quite a bit of target-specific code
and generalized common code.
Use of RegStorage instead of int for registers now propagated down
to the NewLIRx() level. In future CLs, the NewLIRx() routines will
be replaced with versions that are explicit about what kind of
operand they expect (RegStorage, displacement, etc.). The goal
is to eventually use RegStorage all the way to the assembly phase.
TBD: MIPS needs verification.
TBD: Re-enable liveness tracking.
Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
| |
Accesses to oat_data_flow_attributes had no bounds checking.
This fix adds the check and also offers two functions to retrieve the
attributes: one taking a MIR and one taking a DecodedInstruction.
Change-Id: Ib4f1f749efb923a803d364a4eea83a174527a644
Signed-Off-By: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
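A bounds-checked lookup with the two entry points mentioned in the commit above could be shaped like this. Everything here is illustrative: the struct definitions are stand-ins, the four-entry table and its bit values are made up, LookupAttributes is a hypothetical helper, and throwing std::out_of_range is a choice for the sketch rather than what ART does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-ins for the real types; only the opcode matters here.
struct DecodedInstruction { std::size_t opcode; };
struct MIR { DecodedInstruction dalvik_insn; };

// Made-up attribute bits for a four-entry opcode space.
const std::array<std::uint64_t, 4> kAttributes = {{0x0, 0x1, 0x3, 0x4}};

// Single checked access point: every lookup funnels through here.
std::uint64_t LookupAttributes(std::size_t opcode) {
  if (opcode >= kAttributes.size()) {
    throw std::out_of_range("opcode outside attribute table");
  }
  return kAttributes[opcode];
}

std::uint64_t GetDataFlowAttributes(const DecodedInstruction& insn) {
  return LookupAttributes(insn.opcode);
}

std::uint64_t GetDataFlowAttributes(const MIR* mir) {
  assert(mir != nullptr);
  return GetDataFlowAttributes(mir->dalvik_insn);
}
```

Routing both overloads through one checked function keeps the bounds test in a single place.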
| |
Two things:
- Added a default initialization for the RegLocation.
- Added a default constructor and Reset for the GrowableArray's Iterator class.
Change-Id: I74d9c584304c77add42e0d66e4037ac45b890142
Signed-off-by: Jean Christophe Beyler <jean.christophe.beyler@intel.com>
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
Signed-off-by: Yixin Shou <yixin.shou@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
Signed-off-by: Udayan Banerji <udayan.banerji@intel.com>
| |
This saves more than 0.5s of boot.oat compilation time
on Nexus 5.
TODO: Move other stuff to the scoped allocator. This CL
alone increases the peak memory allocation. By reusing
the memory for other parts of the compilation we should
reduce this overhead.
Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
| |
This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace.
Ready. Fixed the original typo, plus some mechanical changes
for rebasing.
Still needs additional testing, but the problem with the original
CL appears to have been a typo in the definition of the x86
double return template RegLocation.
Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
| |
This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c.
Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
| |
For historical reasons, the Quick backend found it convenient
to consider all 64-bit Dalvik values held in registers
to be contained in a pair of 32-bit registers. Though this
worked well for ARM (with double-precision registers also
treated as a pair of 32-bit single-precision registers) it doesn't
play well with other targets. And, it is somewhat problematic
for 64-bit architectures.
This is the first of several CLs that will rework the way the
Quick backend deals with physical registers. The goal is to
eliminate the "64-bit value backed with 32-bit register pair"
requirement from the target-independent portions of the backend
and support 64-bit registers throughout.
The key RegLocation struct, which describes the location of
Dalvik virtual register & register pairs, previously contained
fields for high and low physical registers. The low_reg and
high_reg fields are being replaced with a new type: RegStorage.
There will be a single instance of RegStorage for each RegLocation.
Note that RegStorage does not increase the space used. It is
16 bits wide, the same as the sum of the 8-bit low_reg and
high_reg fields.
At a target-independent level, it will describe whether the physical
register storage associated with the Dalvik value is a single 32
bit, single 64 bit, pair of 32 bit or vector. The actual register
number encoding is left to the target-dependent code layer.
Because physical register handling is pervasive throughout the
backend, this restructuring necessarily involves large CLs with
lots of changes. I'm going to roll these out in stages, and
attempt to segregate the CLs with largely mechanical changes from
those which restructure or rework the logic.
This CL is of the mechanical change variety - it replaces low_reg
and high_reg from RegLocation and introduces RegStorage. It also
includes a lot of new code (such as many calls to GetReg())
that should go away in upcoming CLs.
The tentative plan for the subsequent CLs is:
o Rework standard register utilities such as AllocReg() and
FreeReg() to use RegStorage instead of ints.
o Rework the target-independent GenXXX, OpXXX, LoadValue,
StoreValue, etc. routines to take RegStorage rather than
int register encodings.
o Take advantage of the vector representation and eliminate
the current vector field in RegLocation.
o Replace the "wide" variants of codegen utilities that take
low_reg/high_reg pairs with versions that use RegStorage.
o Add 64-bit register target independent codegen utilities
where possible, and where not virtualize with 32-bit general
register and 64-bit general register variants in the target
dependent layer.
o Expand/rework the LIR def/use flags to allow for more registers
(currently, we lose out on 16 MIPS floating point regs as
well as ARM's D16..D31 for lack of space in the masks).
o [Possibly] move the float/non-float determination of a register
from the target-dependent encoding to RegStorage. In other
words, replace IsFpReg(register_encoding_bits).
At the end of the day, all code in the target independent layer
should be using RegStorage, as should much of the target dependent
layer. Ideally, we won't be using the physical register number
encoding extracted from RegStorage (i.e. GetReg()) until the
NewLIRx() layer.
Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
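To make the 16-bit claim in the commit above concrete, here is one possible packing of a storage kind plus up to two register numbers into 16 bits. The bit layout (2 bits of kind, 7 bits per register) and all names are assumptions for illustration; the commit states only the total width and the four storage shapes, and leaves register number encoding to the target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bits 15..14 kind, bits 13..7 high reg, bits 6..0 low reg.
class RegStorageSketch {
 public:
  enum Kind : std::uint16_t { k32Solo = 0, k64Solo = 1, k64Pair = 2, kVector = 3 };

  static RegStorageSketch Solo(Kind kind, unsigned reg) {
    assert(kind != k64Pair);
    return RegStorageSketch(Encode(kind, reg, 0));
  }

  static RegStorageSketch Pair(unsigned low, unsigned high) {
    return RegStorageSketch(Encode(k64Pair, low, high));
  }

  Kind GetKind() const { return static_cast<Kind>(bits_ >> 14); }
  bool IsPair() const { return GetKind() == k64Pair; }
  unsigned GetLowReg() const { return bits_ & 0x7fu; }
  unsigned GetHighReg() const { return (bits_ >> 7) & 0x7fu; }

 private:
  explicit RegStorageSketch(std::uint16_t bits) : bits_(bits) {}

  static std::uint16_t Encode(Kind kind, unsigned low, unsigned high) {
    assert(low <= 0x7fu && high <= 0x7fu);
    return static_cast<std::uint16_t>((kind << 14) | (high << 7) | low);
  }

  std::uint16_t bits_;
};

static_assert(sizeof(RegStorageSketch) == 2, "must stay as small as two 8-bit fields");
```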
| |
Compiler temporaries are a facility for having virtual register sized space
for dealing with intermediate values during MIR transformations. They receive
explicit space in managed frames so they can have a home location in case they
need to be spilled. The facility also supports "special" temporaries which
have specific semantic purpose and their location in frame must be tracked.
The compiler temporaries are treated in the same way as virtual registers
so that the MIR level transformations do not need to have special logic. However,
generated code needs to know stack layout so that it can distinguish between
home locations.
MIRGraph has received an interface for dealing with compiler temporaries. This
interface allows allocation of wide and non-wide virtual register temporaries.
The information about how temporaries are kept on stack has been moved to
stack.h. This was necessary because stack layout is dependent on where the
temporaries are placed.
Change-Id: Iba5cf095b32feb00d3f648db112a00209c8e5f55
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
| |
This reverts commit 8ff67e3338952c70ccf3b609559bf8cc0f379cfd.
Fix applied to loc.fp usage.
Change-Id: I1eb3005392544fcf30c595923ed25bcee2dc4859
| |
The invalid usage of loc.fp must be corrected before this change can be submitted.
This reverts commit 766a5e5940b469ab40e52770862c81cfec1d835b.
Change-Id: I1173a9bf829da89cccd9c2898f5e11164987a22b
| |
Currently, ART Quick mode assumes that a double FP register is composed
of two single consecutive FP registers. This is true for ARM and MIPS,
but not x86. This means that only half of the 8 XMM registers are
available for use by x86 doubles.
This patch breaks the assumption that a wide FP RegisterLocation must be
a paired set of FP registers. This is done by making some routines in
common code virtual and overriding them in the X86Mir2Lir class. For
these wide fp locations, the high register is set to the same value as
the low register, in order to minimize changes to common code. In a
couple of places, the common code checks for this case.
The changes are also supposed to allow the possibility of using the XMM
registers for vector operations, but that support is still WIP.
Change-Id: Ic6ef24ea764991c6f4d9fb88d483a619f5a468cb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
| |
Another round of compile-time tuning, this time yielding in the
vicinity of 3% total reduction in compile time (which means about
double that for the Quick Compile portion).
Primary improvements are skipping the basic block combine optimization
pass when using Quick (because we already have big blocks), combining
the null check elimination and type inference passes, and limiting
expensive local value number analysis to only those blocks which
might benefit from it.
Following this CL, the actual compile phase consumes roughly 60%
of the total dex2oat time on the host, and 55% on the target (Note,
I'm subtracting out the Deduping time here, which the timing logger
normally counts against the compiler).
A sample breakdown of the compilation time follows (this taken on
PlusOne.apk w/ a Nexus 4):
39.00% -> MIR2LIR: 1374.90 (Note: includes local optimization & scheduling)
10.25% -> MIROpt:SSATransform: 361.31
8.45% -> BuildMIRGraph: 297.80
7.55% -> Assemble: 266.16
6.87% -> MIROpt:NCE_TypeInference: 242.22
5.56% -> Dedupe: 196.15
3.45% -> MIROpt:BBOpt: 121.53
3.20% -> RegisterAllocation: 112.69
3.00% -> PcMappingTable: 105.65
2.90% -> GcMap: 102.22
2.68% -> Launchpads: 94.50
1.16% -> MIROpt:InitRegLoc: 40.94
1.16% -> Cleanup: 40.93
1.10% -> MIROpt:CodeLayout: 38.80
0.97% -> MIROpt:ConstantProp: 34.35
0.96% -> MIROpt:UseCount: 33.75
0.86% -> MIROpt:CheckFilters: 30.28
0.44% -> SpecialMIR2LIR: 15.53
0.44% -> MIROpt:BBCombine: 15.41
(cherry pick of 9e8e234af4430abe8d144414e272cd72d215b5f3)
Change-Id: I86c665fa7e88b75eb75629a99fd292ff8c449969
| |
Small, but measurable, improvement.
Change-Id: Ie3c7180f9f9cbfb1729588e7a4b2cf6c6d291c95
| |
Specialized the dataflow iterators and did a few other minor tweaks.
Showing ~5% compile-time improvement in a single-threaded environment;
less in multi-threaded (presumably because we're blocked by something
else).
Change-Id: I2e2ed58d881414b9fc97e04cd0623e188259afd2
| |
Previously we created arenas for each method. The problem with that approach
is that we needed to memset each memory allocation. This can be improved
by starting out with arenas that contain all zeroed memory and recycling
them for each method. When memory is given back to the arena pool, a single
memset zeroes out all of the memory that was used.
Always inlined the fast path of the allocation code.
Removed the "zero" parameter since the new arena allocator always returns
zeroed memory.
Host dex2oat time on target oat apks (2 samples each).
Before:
real 1m11.958s
user 4m34.020s
sys 1m28.570s
After:
real 1m9.690s
user 4m17.670s
sys 1m23.960s
Target device dex2oat samples (Mako, Thinkfree.apk):
Without new arena allocator:
0m26.47s real 0m54.60s user 0m25.85s system
0m25.91s real 0m54.39s user 0m26.69s system
0m26.61s real 0m53.77s user 0m27.35s system
0m26.33s real 0m54.90s user 0m25.30s system
0m26.34s real 0m53.94s user 0m27.23s system
With new arena allocator:
0m25.02s real 0m54.46s user 0m19.94s system
0m25.17s real 0m55.06s user 0m20.72s system
0m24.85s real 0m55.14s user 0m19.30s system
0m24.59s real 0m54.02s user 0m20.07s system
0m25.06s real 0m55.00s user 0m20.42s system
Correctness of Thinkfree.apk.oat verified by diffing both of the oat files.
Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
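The recycling scheme described in the commit above can be sketched as a bump allocator over pre-zeroed memory plus a pool that re-zeroes only the used prefix on release. This is a simplified illustration, not the ART allocator: Arena, ArenaPool and their methods are hypothetical names, all arenas share one fixed size, and there is no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

class Arena {
 public:
  explicit Arena(std::size_t size) : data_(size, 0), used_(0) {}

  // Bump allocation. No per-allocation memset: the memory is already zero.
  void* Alloc(std::size_t bytes) {
    bytes = (bytes + 7) & ~std::size_t{7};  // Round up to 8 bytes.
    if (bytes > data_.size() - used_) return nullptr;
    void* result = data_.data() + used_;
    used_ += bytes;
    return result;
  }

  // One memset over the used prefix restores the all-zero invariant.
  void Reset() {
    std::memset(data_.data(), 0, used_);
    used_ = 0;
  }

  std::size_t Used() const { return used_; }

 private:
  std::vector<std::uint8_t> data_;
  std::size_t used_;
};

class ArenaPool {
 public:
  explicit ArenaPool(std::size_t arena_size) : arena_size_(arena_size) {}

  // Hands out a recycled arena if one is available, else a fresh zeroed one.
  std::unique_ptr<Arena> Acquire() {
    if (!free_.empty()) {
      std::unique_ptr<Arena> arena = std::move(free_.back());
      free_.pop_back();
      return arena;
    }
    return std::unique_ptr<Arena>(new Arena(arena_size_));
  }

  void Release(std::unique_ptr<Arena> arena) {
    assert(arena != nullptr);
    arena->Reset();
    free_.push_back(std::move(arena));
  }

 private:
  std::size_t arena_size_;
  std::vector<std::unique_ptr<Arena>> free_;
};
```

Because Alloc always returns zeroed memory, a "zero" parameter on the allocation call becomes redundant, which matches the removal noted in the commit.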
| |
Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
| |
whitespace/labels, whitespace/semicolon issues
Change-Id: Ide4f8ea608338b3fed528de7582cfeb2011997b6
| |
Change-Id: I730bd87b476bfa36e93b42e816ef358006b69ba5
| |
Change-Id: Ifc678d59a8bed24ffddde5a0e543620b17b0aba9
| |
Change-Id: I456fc8d80371d6dfc07e6d109b7f478c25602b65
| |
Change-Id: Ide80939faf8e8690d8842dde8133902ac725ed1a
|
|
The runtime, compiler, dex2oat, and oatdump are now in separate trees
to prevent dependency creep. They can now be individually built
without rebuilding the rest of the art projects. dalvikvm and jdwpspy
were already this way. Builds in the art directory should behave as
before, building everything including tests.
Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81
|