Standard Optimizations

- Copy propagation

    int i = j + k;          =>    int i = j + k;
    int i1 = i;                   int i2 = i + 1;
    int i2 = i1 + 1;

- Constant propagation

    int i = 1;              =>    int i = 1;
    ...                           ...
    if (i > 0) then               e1;
      e1
    else e2;

- Constant folding & algebraic simplification

    int i = 3 + 8 + (a * 0);   =>   int i = 11;

- Dead-code elimination

- Common-subexpression elimination

    int i = r.a + 3;        =>    int i = r.a + 3;
    int k = r.a + 3;              r.b = i + i;
    r.b = i + k;

- Loop-invariant code motion

    while (...)             =>    i = 99 * ... * 99;
      i = 99 * ... * 99;          while (...)
      ...                           ...
      j = i * p;                    j = i * p;
      ...                           ...
    done                          done

- Utilizing predicates

    if (i = 0)              =>    if (i <> 0)
      j = j + i;                    j = 2;
    else
      j = 2;

Supported by
- Control flow analysis
- Data flow analysis
- Alias analysis

--------

The above techniques operate easily on basic blocks, and with additional effort on entire procedure bodies ("global" optimization).

But what if procedure calls intervene?

EXAMPLE

    func get(r) return r.a;
    proc set(r,x) r.b := x;

    int i = get(r) + 3;
    int k = get(r) + 3;
    set(r,i+k);

How can we know that get(r) returns the same value each time?

Use
- Interprocedural analysis
  Might discover a fact like "get is a pure function".
- Inlining

    int i = r.a + 3;
    int k = r.a + 3;
    r.b := i+k;

Utility of analyses not firmly established.
Analyses substantially complicate optimizer implementation.
Inlining/specialization can cause code bloat.
Neither mixes well with separate compilation.

-----------

But what if the call is to an unknown procedure, e.g., through a function pointer (C)?

Traditional answer: give up!

Not good enough for object-oriented languages, where
- any method call (in C++, any "virtual" call) is to an unknown address.
- object-oriented style encourages heavy use of small routines

EXAMPLE

    class myclass {
      int a;
      int b;
      get () {return a;}
      set (int x) {b := x;}
    }

    class yourclass extends myclass {
      get () {return b;}
    }

    proc foo (myclass r) {
      myclass p = new myclass;
      int j = p.get();
      int i = r.get();
      yourclass q = (yourclass) r;
      r.set (3);
    }

May also have additional runtime costs due to class-based narrowing checks.

---------

In Java's dynamic linking model, improvements based on static analysis are completely hopeless: not only is the target address unknown, it might not even exist in the image yet!

But we CAN do useful things if given the WHOLE program, or if we can optimize based on DYNAMIC profile information.

Goal: reduce number of dynamic dispatches and cost of runtime checks.

Programming language may give hints (C++ virtual, Java final).

-------

Intraprocedural data analysis

At each program step, classify each object variable as follows:
- Cone(C) (C or any subclass of C) - for self, vars with declared types, results of calls
- Class(C) (exactly class C) - for constants, results of new
- Union(S1,...,Sn) - after control flow merges
- Difference(S1,S2) - after unsuccessful type tests

Now for each method call c.m(...) we can hope to replace dynamic dispatch with something cheaper, if c can only refer to one of a finite set of classes.
- Do static lookup on each class and find out what method is invoked on that class.
- If only one possible method, call it directly or inline it.
- If several methods, can make an explicit chain of class tests on c, calling (or inlining) the appropriate method.

Either way, much better opportunities for optimization.

But if the set of classes is unbounded, these techniques don't work.

---------------

How can we associate Cone(C) variables to finite sets of classes?
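To make the stakes of that question concrete, here is a minimal sketch of the dispatch decision described above. It is written in Python rather than the notes' pseudocode, and it models only the myclass/yourclass example. The tables PARENT and DEFINES and the functions lookup and dispatch_plan are hypothetical names introduced for illustration. An unbounded Cone is modelled as None.

```python
# Toy hierarchy from the myclass/yourclass example: class -> superclass.
PARENT = {"myclass": None, "yourclass": "myclass"}
# Methods each class defines itself (inherited methods are not listed).
DEFINES = {"myclass": {"get", "set"}, "yourclass": {"get"}}


def lookup(cls, method):
    """Static lookup: walk up from cls to the first class defining method."""
    while cls is not None:
        if method in DEFINES[cls]:
            return f"{cls}::{method}"
        cls = PARENT[cls]
    raise LookupError(method)


def dispatch_plan(classes, method):
    """Decide how to compile a call c.method(...).

    classes is the finite set of classes c may refer to,
    or None when only an unbounded Cone is known (assumption of this sketch).
    """
    if classes is None:
        return ("dynamic",)                      # nothing cheaper is possible
    targets = {}
    for c in sorted(classes):                    # sorted for a stable order
        targets.setdefault(lookup(c, method), []).append(c)
    if len(targets) == 1:
        return ("direct", next(iter(targets)))   # call directly or inline
    return ("test-chain", targets)               # explicit chain of class tests
```

In foo, p comes from new myclass, so dispatch_plan({"myclass"}, "get") gives a direct call to myclass::get. The parameter r is only known as Cone(myclass), so dispatch_plan(None, "get") stays dynamic unless the cone can be bounded. If r were known to be one of the two classes, r.set(3) would still resolve directly to myclass::set, while r.get() would need a two-way test chain.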
Quick & Dirty Whole-program static analysis:

All based on class hierarchy (inheritance graph and associated method definitions):

                 class A
                 method m
                 method p
                /    \____________
               /                  \
              /                    \
       class B : A              class C : A
       method m                 method m
          /                      /     \
         /                      /       \
        /                      /         \
  class D : B          class E : C    class F : C
  ...                  method m       method p
                                        /    \
                                       /      \
                                      /        \
                              class G : F   class H : F
                              ...           ...

Idea 1: Unique Name
Look at the entire tree. If there's only one method defined with the desired name, that must be it!

Idea 2: Class Hierarchy Analysis (based on inheritance graph)
Use the graph to enumerate all Cone sets. For example, in the graph above suppose that F::p calls this.m(). We can tell that Cone(F) contains no overriding version of m, so we can change the call to a direct call to (or inlining of) C::m.

Idea 3: "Rapid Type Analysis" (based on inspection of code for missing classes)
Look at the entire code base. If no objects of a particular class are created, then that class can be effectively removed from the hierarchy and ignored when doing hierarchy analysis. Note that this requires the code, not just the inheritance graph.

Fancier schemes: interprocedural variants of class set stuff; all require looking at the effect of each code line on each variable; much more expensive than these dirty methods. Practical?

---------

Not all choices are equally probable.
Receiver class prediction tries to guess the most likely method(s). Can be used to order tests in chain. Less likely options can be:
- not inlined
- left dynamic
- not even compiled.

Can be hardwired for known methods, but works best based on profile information. To be useful, profiles should have well-defined peaks and be stable across inputs and modifications.

-------

Specialization is an alternative to inlining: make specialized versions of routines with some parameter (typically self) known. A.k.a. customization, cloning.
- hard to get the balance between code explosion and improved speed right.
- profiles can also guide specialization.
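Returning to Ideas 2 and 3, the following is a rough sketch of class hierarchy analysis with an optional Rapid Type Analysis filter, applied to the A..H hierarchy from the diagram. It is written in Python rather than the notes' pseudocode. The names PARENT, DEFINES, cone, lookup and targets are hypothetical. The diagram shows D, G and H only as "...", so the sketch assumes they define neither m nor p.

```python
# Inheritance graph from the diagram: class -> superclass.
PARENT = {"A": None, "B": "A", "C": "A", "D": "B",
          "E": "C", "F": "C", "G": "F", "H": "F"}
# Methods each class defines itself.
# Assumption: D, G, H (shown as "..." in the diagram) define neither m nor p.
DEFINES = {"A": {"m", "p"}, "B": {"m"}, "C": {"m"}, "D": set(),
           "E": {"m"}, "F": {"p"}, "G": set(), "H": set()}


def cone(cls):
    """Cone(cls): cls together with all of its transitive subclasses."""
    result = {cls}
    for child, parent in PARENT.items():
        if parent == cls:
            result |= cone(child)
    return result


def lookup(cls, method):
    """Static lookup: walk up from cls to the first class defining method."""
    while cls is not None:
        if method in DEFINES[cls]:
            return f"{cls}::{method}"
        cls = PARENT[cls]
    raise LookupError(method)


def targets(static_class, method, instantiated=None):
    """Methods a call may reach when the receiver is in Cone(static_class).

    instantiated, if given, is the set of classes the program ever creates
    with new; classes outside it are ignored (the Rapid Type Analysis idea).
    """
    classes = cone(static_class)
    if instantiated is not None:
        classes &= instantiated
    return {lookup(c, method) for c in classes}
```

With this model, targets("F", "m") is {"C::m"}, matching the F::p example: the call this.m() can become a direct call. A receiver declared as C gives {"C::m", "E::m"}, so class hierarchy analysis alone needs a test or a dynamic dispatch there. If the program never creates an E, for instance targets("C", "m", instantiated={"A", "F", "G"}), the set shrinks to {"C::m"} again.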