Register Allocation and Assignment
Complexity of Target Machine
Level of Translation: expression, statement, basic block, routine, program?
Management of Scarce Resources
Approaches to Instruction Selection
For RISC targets, translate one IR instruction to one or more target instructions.
For CISC targets, translate several IR instructions to one target instruction.
Simplistic SPARC Instruction Selection for PCAT
Generate instructions directly from AST, using 3-address style.
Include explicit code for array and record calculations.
(Alternatively, could do one-to-one translation of IR.)
Take advantage of SPARC's M[reg+const] addressing mode to generate good code for frame references.
DO THIS: ld [
NOT THIS: add ld [
Use (small) constants directly where possible.
DO THIS: add
NOT THIS: mov 42, add
Fill delay slots with nop's, unless producing a ``canned'' sequence that can use them.
Register Allocation and Assignment
Task: Manage scarce resources (registers) in environment with imperfect information (static program text) about dynamic program behavior.
General aim is to keep frequently-used values in registers as much as possible, to lower memory traffic. Can have a large effect on program performance.
Variety of approaches are possible, differing in sophistication and in scope of analysis used.
Allocator may be unable to keep every ``live'' variable in registers; must then ``spill'' variables to memory. Spilling adds new instructions, which often affects the allocation analysis, requiring a new iteration.
If spilling is necessary, what should we spill? Some heuristics:
Don't spill variables used in inner loops.
Spill variables not used again for ``longest'' time.
Spill variables which haven't been updated since last read from memory.
Simplistic Register Management for PCAT
Assume variables ``normally'' live in memory.
Fetch values into registers just before (each) use in an expression.
Register use never spans statements.
Use auxiliary table to track register use.
Certain SPARC registers are reserved.
Remember that same register can be used as source and target:
might produce add %r3,%r5,%r3
Ignore possibility of spills.
Register Allocation for Expressions
Choice of evaluation order can affect number of registers needed.
If we compute left child first, need 4 regs, but doing right child first needs only 3.
Minimizing Registers Needed to Evaluate Expression Trees
Key idea (Sethi & Ullman): At each node, first evaluate subtree requiring largest number of registers to evaluate. Can then save result of this evaluation in a register while doing other subtree.
1. Label each node with minimum number of registers needed to evaluate subtree.
2. Use labels to guide order of code emission; emit code for higher-numbered subtree first.
3. If we run out of registers (when does this happen?), spill to temporary memory locations.
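A minimal sketch of steps 1 and 2, assuming a load/store machine where every leaf must be loaded into a register and every operator is binary. The Node class, the register names, and the pseudo-assembly syntax are hypothetical, and step 3 (spilling) is not handled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    op: str                          # operator mnemonic, or variable name for a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def label(n):
    """Minimum number of registers needed to evaluate the subtree rooted at n."""
    if n.left is None and n.right is None:
        return 1                     # assumption: a leaf is loaded into one register
    l, r = label(n.left), label(n.right)
    return max(l, r) if l != r else l + 1

def emit(n, free, code):
    """Append code for n to `code`; return the register holding the result."""
    if n.left is None and n.right is None:
        reg = free.pop()             # IndexError here means we would have to spill
        code.append(f"ld {n.op}, {reg}")
        return reg
    # Evaluate the subtree with the larger label first.
    if label(n.left) >= label(n.right):
        rl = emit(n.left, free, code)
        rr = emit(n.right, free, code)
    else:
        rr = emit(n.right, free, code)
        rl = emit(n.left, free, code)
    code.append(f"{n.op} {rl}, {rr}, {rl}")   # result overwrites the left operand
    free.append(rr)                  # the right operand's register is free again
    return rl

# a + (b + (c + d)) needs only 2 registers when the right child goes first.
tree = Node("add", Node("a"),
            Node("add", Node("b"), Node("add", Node("c"), Node("d"))))
code = []
result_reg = emit(tree, ["r2", "r1"], code)
```

Recomputing label at every node keeps the sketch short; a real implementation would store the label in the node during a single bottom-up pass.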
Sethi-Ullman Numbering Example
Other Issues in Tree Evaluation Order
Some machines (e.g., X86) allow one operand (e.g., left one) to be a complex expression, while the other must be a register, which also holds the result (``accumulator'' style):
These machines have different Sethi-Ullman numbering, e.g., right leaves may require no registers at all.
If we must spill registers, it is better to evaluate left child of non-commutative operators (like -,/) last, because they will require their left operand in a register anyhow.
Can use associativity to make trees ``less bushy,'' e.g.
Extend analysis of register use to program units larger than expressions but still completely analyzable at compile time.
Basic Block = sequence of instructions with single entry & exit.
If first instruction of BB is executed, so is remainder of block (in order).
To calculate basic blocks:
(1) Determine BB leaders:
(a) First statement in routine
(b) Target of any jump (conditional or unconditional).
(c) Statement following any jump.
(What about subroutine calls?)
(2) Basic block extends from leader to (but not including) next leader (or end of routine).
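The leader rules can be sketched as follows. The instruction representation is an assumption made for illustration: each instruction is a (text, target) pair, where target is the index of the jump destination, or None for a non-jump. Subroutine calls are treated as ordinary instructions here.

```python
def find_leaders(instrs):
    """Return the sorted indices of basic block leaders."""
    leaders = set()
    if instrs:
        leaders.add(0)                       # (a) first statement in routine
    for i, (_text, target) in enumerate(instrs):
        if target is not None:
            leaders.add(target)              # (b) target of any jump
            if i + 1 < len(instrs):
                leaders.add(i + 1)           # (c) statement following any jump
    return sorted(leaders)

def basic_blocks(instrs):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(instrs)
    ends = leaders[1:] + [len(instrs)]
    return [list(range(start, end)) for start, end in zip(leaders, ends)]

routine = [
    ("i := 0", None),             # 0
    ("if i >= n goto 5", 5),      # 1
    ("s := s + i", None),         # 2
    ("i := i + 1", None),         # 3
    ("goto 1", 1),                # 4
    ("return s", None),           # 5
]
blocks = basic_blocks(routine)    # [[0], [1], [2, 3, 4], [5]]
```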
Basic Block Example
Register Assignment Within Basic Blocks
Idea: Let program variables and temporaries stay in registers as long as possible.
Simplest to operate on one basic block at a time; can extend to multiple blocks with some effort.
Assume an infinite supply of registers; later ``spill'' some to memory if required.
Registers behave like a cache for memory locations.
Nasty problems for source-level debuggers and dump utilities - where is that variable?!?
To determine how long to keep a given variable in a register, need to know the range of instructions for which the variable is live.
A variable is live immediately following an instruction if its current value will be needed in the future.
It's easy to calculate live ranges within a basic block, just by working backwards through the block.
Can assume that all user variables are live at the end of a basic block, i.e., that their values may be used in a subsequent block. (If doing BB-level register allocation, must save the values back in memory at the end of the block anyhow.)
Treat temporaries like user variables, except that they are assumed dead at end of BB.
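The backward scan can be sketched as below, assuming each instruction is given as a (dest, sources) pair. This representation and the variable names are hypothetical. Temporaries are dead at the end of the block simply because they are left out of live_at_end.

```python
def live_after(block, live_at_end):
    """For each instruction, return the set of variables live immediately after it."""
    live = set(live_at_end)
    result = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        dest, sources = block[i]
        result[i] = set(live)
        live.discard(dest)        # dest is overwritten, so its old value is dead
        live.update(sources)      # operands must be live just before the instruction
    return result

block = [
    ("t1", ["a", "b"]),           # t1 := a + b
    ("t2", ["t1", "c"]),          # t2 := t1 - c
    ("a",  ["t2", "t1"]),         # a  := t2 + t1
]
sets = live_after(block, live_at_end={"a", "b", "c"})
```

In this example a is not live after the first instruction, because its value is overwritten by the third instruction before being read again.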
Can improve accuracy by calculating liveness over entire routines, including control flow statements, not just BBs.
Doing this requires iterative flow analysis, and the result is only a conservative approximation to true liveness.
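One common form of this iterative analysis, sketched under assumed data structures: each block is summarized by the set of variables it uses before defining them (use) and the set it defines (defs), and succ maps each block to its successors. The block names and the example routine are hypothetical.

```python
def liveness(blocks, succ):
    """blocks: {name: (use, defs)}; succ: {name: [successor names]}.
    Returns (live_in, live_out), iterating until nothing changes."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            # Live on exit: anything live on entry to some successor.
            out = set().union(*(live_in[s] for s in succ.get(b, [])))
            # Live on entry: used here, or live on exit and not redefined here.
            inn = use | (out - defs)
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out

blocks = {
    "B1": (set(), {"i"}),             # i := 0
    "B2": ({"i", "n"}, set()),        # if i >= n goto B4
    "B3": ({"s", "i"}, {"s", "i"}),   # s := s + i; i := i + 1; goto B2
    "B4": ({"s"}, set()),             # return s
}
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
live_in, live_out = liveness(blocks, succ)
```

The result is conservative because it treats every path in the flow graph as executable, whether or not the program can actually take it at run time.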
BB Code Generation using Liveness
Can combine code generation with ``greedy'' register allocation: bring each variable into a register when first needed, and leave it there as long as it's needed (if possible).
Maintain register descriptors saying which variable is in each register, and address descriptors saying where (in memory and/or a register) each variable is. For each IR instruction x := y op z (other instructions similar):
1. If y isn't in a register, load it into a free one, updating descriptors.
2. Similarly for z.
3. If y and/or z are no longer live following this instruction, mark their registers as free.
4. Choose a free register for x, updating descriptors.
5. Generate instruction op ry,rz,rx.
For the special case x := y, load y into a register, if necessary, and then mark that register as holding x too.
Must now be careful not to free a register unless none of its associated variables is live.
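A simplified sketch of steps 1 to 5. It assumes every instruction has the form x := y op z, ignores the x := y special case and spilling, and keeps only an address descriptor (variable to register) plus a free list in place of full register descriptors. The SPARC-like instruction syntax, with a variable name standing in for its memory address, is illustrative only.

```python
def gen_block(block, live_at_end, registers):
    """block: list of (x, op, y, z) meaning x := y op z. Returns a list of instructions."""
    # Live-after sets, computed by a backward scan of the block.
    live, live_after = set(live_at_end), [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, _, y, z = block[i]
        live_after[i] = set(live)
        live.discard(x)
        live.update((y, z))

    free = list(registers)   # registers holding nothing useful
    reg_of = {}              # address descriptor: variable -> register
    dirty = set()            # variables whose register copy is newer than memory
    code = []

    def ensure_loaded(v):
        if v not in reg_of:
            reg_of[v] = free.pop(0)   # IndexError when out of registers (no spilling)
            code.append(f"ld [{v}], {reg_of[v]}")
        return reg_of[v]

    for i, (x, op, y, z) in enumerate(block):
        ry, rz = ensure_loaded(y), ensure_loaded(z)      # steps 1 and 2
        for v in dict.fromkeys((y, z)):                  # step 3
            if v not in live_after[i]:
                free.insert(0, reg_of.pop(v))
                dirty.discard(v)
        if x not in reg_of:                              # step 4
            reg_of[x] = free.pop(0)
        dirty.add(x)
        code.append(f"{op} {ry}, {rz}, {reg_of[x]}")     # step 5
    # Save modified variables that are live at the end of the block.
    for v in sorted(dirty & set(live_at_end)):
        code.append(f"st {reg_of[v]}, [{v}]")
    return code

block = [("t1", "add", "a", "b"), ("t2", "sub", "t1", "c"), ("a", "add", "t2", "t1")]
code = gen_block(block, {"a", "b", "c"}, ["%r1", "%r2", "%r3", "%r4"])
```

In the first instruction, a is dead after its use, so its register is freed in step 3 and immediately reused for t1, giving add %r1, %r2, %r1.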
Register Interference Graphs
Mixing instruction selection and register allocation gets confusing; need a more systematic way to look at the problem.
Initially generate code assuming an infinite number of ``logical'' registers; calculate live ranges.
Build a register interference graph, which has
- a node for each logical register.
- an edge between two nodes if the corresponding registers are simultaneously live.
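A minimal sketch of graph construction, assuming liveness has already been computed and is supplied as one set of live logical registers per program point. The adjacency-set representation and the register names are assumptions.

```python
from itertools import combinations

def interference_graph(live_sets):
    """Return {register: set of registers it interferes with}."""
    graph = {}
    for live in live_sets:
        for v in live:
            graph.setdefault(v, set())           # a node for each logical register
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)                      # simultaneously live: add an edge
            graph[b].add(a)
    return graph

# t1 and t3 are never live at the same point, so they can share a register.
graph = interference_graph([{"t1", "t2"}, {"t2", "t3"}, {"t3"}])
```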
Coloring Interference Graphs
Interference Graph Example:
A coloring of a graph is an assignment of colors to nodes such that no two connected nodes have the same color. (Like coloring a map, where nodes=countries and edges connect countries with common border.)
Suppose we have k physical registers available. Then aim is to color interference graph with k or fewer colors. This implies we can allocate logical registers to physical registers without spilling.
In the general case, determining whether a graph can be k-colored is hard (NP-complete, and hence probably requires exponential time).
But a simple heuristic will usually find a k-coloring if there is one.
Graph Coloring Heuristic
1. Choose a node with fewer than k neighbors.
2. Remove that node. Note that if we can color the resulting graph with k colors, we can also color the original graph, by giving the deleted node a color different from all its neighbors.
3. Repeat until either
there are no nodes with fewer than k neighbors, in which case we must spill; or
the graph is gone, in which case we can color the original graph by adding the deleted nodes back in one at a time and coloring them.
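The heuristic above can be sketched as follows, assuming the graph is a dictionary mapping each node to the set of its neighbors. Colors are the integers 0 to k-1, and a result of None means the heuristic got stuck and a spill is needed. The choice of which eligible node to remove first (alphabetical here) is arbitrary.

```python
def color_graph(graph, k):
    """Return {node: color} using at most k colors, or None if we must spill."""
    work = {n: set(adj) for n, adj in graph.items()}   # working copy we can shrink
    removed = []
    while work:
        candidates = [n for n in sorted(work) if len(work[n]) < k]
        if not candidates:
            return None                  # no node with fewer than k neighbors
        n = candidates[0]
        removed.append(n)
        for m in work.pop(n):            # delete n and its edges
            work[m].discard(n)
    colors = {}
    for n in reversed(removed):          # add nodes back, last removed first
        used = {colors[m] for m in graph[n] if m in colors}
        colors[n] = min(c for c in range(k) if c not in used)
    return colors

# A triangle a-b-c with an extra node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
three = color_graph(graph, 3)
two = color_graph(graph, 2)              # None: the triangle needs 3 colors
```

When a node is added back, only the neighbors it had at removal time are already colored, and there were fewer than k of them, so a free color always exists.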
Finds a 3-coloring. There cannot be a 2-coloring (why not?).
Each ``color'' corresponds to a physical register, so 3 registers will do for