SSA Form: Static Single Assignment (SSA) Form.

- Every variable has one (static) definition (though the definition may be
  executed many times).

- For straight-line code, this is just like value numbering:

    Original Code:

        v <- 4
        w <- v + 5
        v <- 6
        w <- v + 7

    Code in SSA:

        v1 <- 4
        w1 <- v1 + 5
        v2 <- 6
        w2 <- v2 + 7

- For general control flow, we must introduce phi-nodes. These are fictional
  operations, (usually) not intended to have execution significance. To
  interpret these nodes, we must view the code as a CFG in which the in-edges
  of any node have a well-defined order.

Example 1:

Original CFG:

          |--------|
          |   P?   |
          |--------|
           /      \
          /        \
         V          V
     |------|    |------|
     | v<-4 |    | v<-5 |
     |------|    |------|
          \        /
           \      /
            V    V
        |------------|
        | w <- v + v |
        |------------|

CFG in SSA form:

          |--------|
          |   P?   |
          |--------|
           /      \
          /        \
         V          V
    |-------|    |-------|
    | v1<-4 |    | v2<-5 |
    |-------|    |-------|
          \        /
           \      /
            V    V
     |------------------|
     | v3 <- phi(v1,v2) |
     | w1 <- v3 + v3    |
     |------------------|

Example 2:

Original CFG:

    |-----------|
    |  i <- 0   |
    |  j <- 0   |
    -------------
          |    ----------------
          |    |              |
          V    V              |
    |-----------|             |
    |  i > N ?  |             |
    |-----------|             |
         /\                   |
        /  \                  |
     EXIT   \                 |
             \                |
      |------------|          |
      | j <- j + i |          |
      | i <- j + 1 |          |
      |------------|          |
             |                |
             |-----------------

CFG in SSA form:

    |-----------|
    |  i1 <- 0  |
    |  j1 <- 0  |
    -------------
          |    ----------------
        1 |    | 2            |
          V    V              |
    |-----------------|       |
    | i2 = phi(i1,i3) |       |
    | j2 = phi(j1,j3) |       |
    | i2 > N ?        |       |
    |-----------------|       |
         /\                   |
        /  \                  |
     EXIT   \                 |
             \                |
      |---------------|       |
      | j3 <- j2 + i2 |       |
      | i3 <- j2 + 1  |       |
      |---------------|       |
             |                |
             |-----------------

Where should we put phi assignments, and for which variables?

Simple answer: in every join node, for every variable in the program.
Too expensive!  (It suffices to put a phi assignment for x in join nodes that
aren't dominated by a single definition of x.  More details later.)

Larger Scopes for Value Numbering:

- Assume unique names via SSA form.

1. "Superlocal."  Do the analysis over paths in extended basic blocks.  An
   extended basic block (EBB) has one entry, but can have multiple exits.  It
   forms a subtree of the CFG: the root block may have multiple predecessors;
   each other block has a unique predecessor, which is inside the EBB.

2. "Dominator-based."  Do the analysis over paths in the dominator tree.

Aside on dominators:

Assume the CFG has a distinguished start node S, and has no disconnected
subgraphs (nodes unreachable from S).  Then we say that node d *dominates*
node n if all paths from S to n include d.  In particular, for all n,
n dominates n.

Fact: d dominates n iff d = n or d dominates all predecessors of n.

So the set D[n] of nodes that dominate n can be defined as:

    D[S] = { S }
    D[n] = { n } U (intersection over all p in pred[n] of D[p])

(where pred[n] = set of predecessors of n in the CFG)

Define the *immediate dominator* of n, idom(n), as follows:

    (1) idom(n) dominates n
    (2) idom(n) is not n
    (3) idom(n) does not dominate any other dominator of n (except n itself)

Fact: every node (except S) has a unique immediate dominator.  Hence the
immediate-dominator relation defines a tree, called the *dominator tree*,
whose nodes are the nodes of the CFG and where the parent of a node is its
immediate dominator.  We have

    D[n] = {n} U (ancestors of n in the dominator tree)

Fact: The dominator tree of a CFG can be computed in almost-linear time.
(See textbook Ch. 9 for details.)
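As a concrete illustration (not from the notes), here is a small Python sketch
that solves the D[n] equations above by fixpoint iteration and then reads off
idom(n) directly from conditions (1)-(3).  The function names and the little
diamond CFG at the end are illustrative assumptions; a real compiler would use
the almost-linear-time algorithm mentioned above.

    def dominators(succ, start):
        """succ maps each node to its list of successors; every node is assumed
        reachable from start, so every non-start node has a predecessor."""
        nodes = set(succ)
        pred = {n: [] for n in nodes}
        for n, ss in succ.items():
            for s in ss:
                pred[s].append(n)

        # Optimistic initialization: every node starts out "dominated by all
        # nodes", except the start node, which is dominated only by itself.
        doms = {n: set(nodes) for n in nodes}
        doms[start] = {start}

        changed = True
        while changed:
            changed = False
            for n in nodes - {start}:
                # D[n] = {n} U (intersection of D[p] over all predecessors p)
                new = {n} | set.intersection(*(doms[p] for p in pred[n]))
                if new != doms[n]:
                    doms[n] = new
                    changed = True
        return doms

    def immediate_dominators(doms, start):
        """idom(n): a dominator of n, other than n, that does not dominate
        any other dominator of n (except n itself)."""
        idom = {}
        for n, ds in doms.items():
            if n == start:
                continue
            strict = ds - {n}
            for d in strict:
                if not any(d in doms[other] for other in strict - {d}):
                    idom[n] = d
        return idom

    # Illustrative CFG (a simple diamond): S -> A, S -> B, A -> C, B -> C.
    succ = {"S": ["A", "B"], "A": ["C"], "B": ["C"], "C": []}
    doms = dominators(succ, "S")
    print(doms)                             # e.g. D[C] = {S, C}
    print(immediate_dominators(doms, "S"))  # {'A': 'S', 'B': 'S', 'C': 'S'}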
Even with dominators, we cannot find redundant expressions computed on
*different* paths.  A different approach: compute *available expressions*.
This is a classic *data flow analysis* problem.

In the SSA context, an expression is *available* at instruction n if it is
computed at least once on *every* path from the entry node to n.

Application: if an expression is available at a node where it is being
recomputed, the recomputation can be replaced by a variable holding the result
of the previous computation.

To compute available expressions we can solve the following dataflow
equations:

    gen["t <- b bop c"] = {b bop c}
    gen[other]          = {}

    in[s]  = intersection over all p in pred[s] of out[p]
    out[s] = in[s] U gen[s]

(with in[] of the entry node = {}, since nothing is available on entry)

We're interested in computing in[s]; for the moment, we'll just wave our hands
about how to do this.

Example (Muchnick, "Advanced Compiler Design & Implementation", Fig. 13.8):

        ------- ENTRY -------
                  |
                  V
    A ---------------
      c1 <- a1 + b1
      d1 <- a1 * c1
      e1 <- d1 * d1
      i1 <- 1
      ---------------
          |    _______________________
          |    |                     |
          V    V                     |
    B -----------------             |
      i3 = phi(i1,i2)               |
      c3 = phi(c1,c2)               |
      f[i3] <- a1 + b1              |
      c2 <- c3 * 2                  |
      c2 > d1 ?                     |
      -----------------             |
          |         |               |
          Y         N               |
          |         |               |
          V         V               |
    C -----------------   D ------------------
      g[i3] <- a1 * c2      g[i3] <- d1 * d1  |
      -----------------     ------------------
          |         |               |
          |         |               |
          V         V               |
    E ------------------            |
      i2 <- i3 + 1                  |
      i2 > 10?                      |
      ------------------            |
          |         |               |
          Y         N               |
          |         |               |
          |          ----------------
          V
     ------ EXIT ------

We have

    gen[A] = {a1+b1,a1*c1,d1*d1}
    gen[B] = {a1+b1,c3*2}
    gen[C] = {a1*c2}
    gen[D] = {d1*d1}
    gen[E] = {i3+1}

Here's a solution (the maximal one, which is what we want):

    in[A]  = {}
    out[A] = {a1+b1,a1*c1,d1*d1}
    in[B]  = {a1+b1,a1*c1,d1*d1}
    out[B] = {a1+b1,a1*c1,d1*d1,c3*2}
    in[C]  = {a1+b1,a1*c1,d1*d1,c3*2}
    out[C] = {a1+b1,a1*c1,d1*d1,c3*2,a1*c2}
    in[D]  = {a1+b1,a1*c1,d1*d1,c3*2}
    out[D] = {a1+b1,a1*c1,d1*d1,c3*2}
    in[E]  = {a1+b1,a1*c1,d1*d1,c3*2}
    out[E] = {a1+b1,a1*c1,d1*d1,c3*2,i3+1}

So the re-computations of a1+b1 in B and d1*d1 in D can be removed.

Here's another solution (a less useful one):

    in[A]  = {}
    out[A] = {a1+b1,a1*c1,d1*d1}
    in[B]  = {a1+b1}
    out[B] = {a1+b1,c3*2}
    in[C]  = {a1+b1,c3*2}
    out[C] = {a1+b1,c3*2,a1*c2}
    in[D]  = {a1+b1,c3*2}
    out[D] = {a1+b1,c3*2,d1*d1}
    in[E]  = {a1+b1,c3*2}
    out[E] = {a1+b1,c3*2,i3+1}

Note the importance of taking an "optimistic" view of in[B].
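To make the "optimistic" point concrete, here is a small Python sketch (mine,
not from the notes) that solves the equations for this five-block example by
straightforward iteration.  The pred and gen tables are my transcription of
the figure, and expressions are represented as plain strings, which is an
illustrative simplification.

    pred = {"A": ["ENTRY"], "B": ["A", "E"], "C": ["B"], "D": ["B"], "E": ["C", "D"]}
    gen = {
        "ENTRY": set(),
        "A": {"a1+b1", "a1*c1", "d1*d1"},
        "B": {"a1+b1", "c3*2"},
        "C": {"a1*c2"},
        "D": {"d1*d1"},
        "E": {"i3+1"},
    }

    # "Optimistic" initialization: every out[] set (except ENTRY's) starts at
    # the full universe of expressions; iteration only shrinks the sets and
    # converges to the maximal solution shown above.
    universe = set().union(*gen.values())
    out = {s: set(universe) for s in gen}
    out["ENTRY"] = set()                  # nothing is available at the entry
    avail_in = {s: set() for s in gen}    # "in" is a Python keyword

    changed = True
    while changed:
        changed = False
        for s in ["A", "B", "C", "D", "E"]:
            new_in = set.intersection(*(out[p] for p in pred[s]))
            new_out = new_in | gen[s]
            if new_in != avail_in[s] or new_out != out[s]:
                avail_in[s], out[s] = new_in, new_out
                changed = True

    for s in ["A", "B", "C", "D", "E"]:
        print(s, sorted(avail_in[s]))
    # in[B] comes out as {a1+b1, a1*c1, d1*d1}, so the recomputations of
    # a1+b1 in B and d1*d1 in D are detected as redundant.

Starting from empty sets instead of the full universe also reaches a fixpoint,
but it is the second, less useful solution above.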