summaryrefslogtreecommitdiffstats
path: root/src/glsl/loop_controls.cpp
Commit message (Collapse)AuthorAgeFilesLines
* glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.Paul Berry2013-12-091-8/+1
| | | | | | | | Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl/loops: Stop creating normatively bound loops in loop_controls.Paul Berry2013-12-091-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, when loop_controls analyzed a loop and found that it had a fixed bound (known at compile time), it would remove all of the loop terminators and instead set the loop's normative_bound field to force the loop to execute the correct number of times. This made loop unrolling easy, but it had a serious disadvantage. Since most GPU's don't have a native mechanism for executing a loop a fixed number of times, in order to implement the normative bound, the back-ends would have to synthesize a new loop induction variable. As a result, many loops wound up having two induction variables instead of one. This caused extra register pressure and unnecessary instructions. This patch modifies loop_controls so that it doesn't set the loop's normative_bound anymore. Instead it leaves one of the terminators in the loop (the limiting terminator), so the back-end doesn't have to go to any extra work to ensure the loop terminates at the right time. This complicates loop unrolling slightly: when deciding whether a loop can be unrolled, we have to account for the presence of the limiting terminator. And when we do unroll the loop, we have to remove the limiting terminator first. For an example of how this results in more efficient back end code, consider the loop: for (int i = 0; i < 100; i++) { total += i; } Previous to this patch, on i965, this loop would compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop (notice that both g8 and g4 are loop induction variables; one is used to terminate the loop, and the other is used to accumulate the total). After this patch, the same loop compiles to: mov(8) g4<1>.xD 0D loop: cmp.ge.f0(8) null g4<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl/loops: Get rid of loop_variable_state::max_iterations.Paul Berry2013-12-091-25/+16
| | | | | | | | This value is now redundant with loop_variable_state::limiting_terminator->iterations and ir_loop::normative_bound. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl/loops: Move some analysis from loop_controls to loop_analysis.Paul Berry2013-12-091-80/+26
| | | | | | | | | | | | | | | | | | | | Previously, the sole responsibility of loop_analysis was to find all the variables referenced in the loop that are either loop constant or induction variables, and find all of the simple if statements that might terminate the loop. The remainder of the analysis necessary to determine how many times a loop executed was performed by loop_controls. This patch makes loop_analysis also responsible for determining the number of iterations after which each loop terminator will terminate the loop, and for figuring out which terminator will terminate the loop first (I'm calling this the "limiting terminator"). This will allow loop unrolling to make use of information that was previously only visible from loop_controls, namely the identity of the limiting terminator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl/loops: Remove unnecessary list walk from loop_control_visitor.Paul Berry2013-12-091-33/+30
| | | | | | | | | | | When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop terminator that acts on a certain ir_variable, it doesn't need to walk the list of induction variables to find the loop_variable entry corresponding to the variable. It can just look it up in the loop_variable_state hashtable and verify that the loop_variable entry represents an induction variable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl/loops: replace loop controls with a normative bound.Paul Berry2013-12-091-13/+8
| | | | | | | | | | | | | | This patch replaces the ir_loop fields "from", "to", "increment", "counter", and "cmp" with a single integer ("normative_bound") that serves the same purpose. I've used the name "normative_bound" to emphasize the fact that the back-end is required to emit code to prevent the loop from running more than normative_bound times. (By contrast, an "informative" bound would be a bound that is informational only). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl: Fix inconsistent assumptions about ir_loop::counter.Paul Berry2013-11-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when ir_loop::counter is non-null, it points to a fresh ir_variable that should be used as the loop counter (as opposed to an ir_variable that exists elsewhere in the instruction stream). However, previous to this patch: (1) loop_control_visitor did not create a new variable for ir_loop::counter; instead it re-used the existing ir_variable. This caused the loop counter to be double-incremented (once explicitly by the body of the loop, and once implicitly by ir_loop::increment). (2) ir_clone did not clone ir_loop::counter properly, resulting in the cloned ir_loop pointing to the source ir_loop's counter. (3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting in the ir_variable being missed by reparenting. Additionally, most optimization passes (e.g. loop unrolling) assume that the variable mentioned by ir_loop::counter is not accessed in the body of the loop (an assumption which (1) violates). The combination of these factors caused a perfect storm in which the code worked properly nearly all of the time: for loops that got unrolled, (1) would introduce a double-increment, but loop unrolling would fail to notice it (since it assumes that ir_loop::counter is not accessed in the body of the loop), so it would unroll the loop the correct number of times. For loops that didn't get unrolled, (1) would introduce a double-increment, but then later when the IR was cloned for linking, (2) would prevent the loop counter from being cloned properly, so it would look to further analysis stages like an independent variable (and hence the double-increment would stop occurring). At the end of linking, (3) would prevent the loop counter from being reparented, so it would still belong to the shader object rather than the linked program object. Provided that the client program didn't delete the shader object, the memory would never get reclaimed, and so the shader would function properly. However, for loops that didn't get unrolled, if the client program did delete the shader object, and the memory belonging to the loop counter got re-used, this could cause a use-after-free bug, leading to a crash. This patch fixes loop_control_visitor, ir_clone, and ir_hierarchical_visitor to treat ir_loop::counter the same way the back-ends treat it: as a freshly allocated ir_variable that needs to be visited and cloned independently of other ir_variables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl: Hide many classes local to individual .cpp files in anon namespaces.Eric Anholt2013-09-231-0/+2
| | | | | | | | This gives the compiler the chance to inline and not export class symbols even in the absence of LTO. Saves about 60kb on disk. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>
* glsl: Fix loop bounds detection.Paul Berry2013-01-081-4/+4
| | | | | | | | | | | | | When analyzing a loop where the loop condition is expressed in the non-standard order (e.g. "4 > i" instead of "i < 4"), we were reversing the condition incorrectly, leading to a loop bound that was off by 1. Fixes piglit tests {vs,fs}-loop-bounds-unrolled.shader_test. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>
* Use C-style system headers in C++ code to avoid issues with std:: namespaceIan Romanick2011-02-211-1/+1
|
* Convert everything from the talloc API to the ralloc API.Kenneth Graunke2011-01-311-2/+2
|
* glsl: fix crash in loop analysis when some controls can't be determinedAras Pranckevicius2010-11-111-0/+3
| | | | Fixes loop-07.frag.
* glsl: Fix 'format not a string literal and no format arguments' warning.Vinson Lee2010-09-151-1/+1
| | | | | | Fix the following GCC warning. loop_controls.cpp: In function 'int calculate_iterations(ir_rvalue*, ir_rvalue*, ir_rvalue*, ir_expression_operation)': loop_controls.cpp:88: warning: format not a string literal and no format arguments
* loop_controls: fix analysis of already analyzed loopsLuca Barbieri2010-09-131-1/+8
| | | | | | The loop_controls pass didn't look at the counter values it put in ir_loop on previous iterations, so while the first iteration worked, subsequent ones couldn't determine max_iterations.
* glsl2: Use as_constant some places instead of constant_expression_valueIan Romanick2010-09-031-2/+2
| | | | | | | | | | | The places where constant_expression_value are still used in loop analysis are places where a new expression tree is created and constant folding won't have happened. This is used, for example, when we try to determine the maximal loop iteration count. Based on review comments by Eric. "...rely on constant folding to have done its job, instead of going all through the subtree again when it wasn't a constant."
* glsl2: Add module to perform simple loop unrollingIan Romanick2010-09-031-1/+3
|
* glsl2: Track the number of ir_loop_jump instructions that are in a loopIan Romanick2010-09-031-0/+4
|
* glsl2: Eliminate zero-iteration loopsIan Romanick2010-09-031-1/+7
|
* glsl2: Add module to suss out loop control variables from loop analysis dataIan Romanick2010-09-031-0/+282
This is the next step on the road to loop unrolling