## Additional Slides to De Micheli Book **Sungho Kang** **Yonsei University** - Behavioral Synthesis - Resource allocation; Pipelining; Control flow parallelization; Communicating Sequential Processes; Partitioning - Sequential Synthesis - Register Movement and Retiming; State Minimization; State Assignment; Synthesis for Testable FSM's; State Machine Verification - Logic Synthesis - Extraction of combinational logic to HDL; Two-level minimization; Algebraic Decomposition; Multilevel Logic Minimization; Synthesis for Multifault Testability; Test Generation via Minimization; Technology mapping; Timing optimization - Technology Mapping - Mapping to Library of Logic Gates; Timing Optimization - Physical Design Synthesis - Cell Placement; Routing; Fabrication; Engineering Changes - A typical tradeoff is area versus delay - With just one full adder, this circuit can do 32-bit addition - But, it is 32x slower than a parallel adder (32 full adders, 1 bit output per clock cycle) - Holding other factors constant, the Area vs Delay tradeoff curve is typically parabolic - The first design requirement is meet Constraints on Chip Area and Critical Path Delay (0 to 1) - The next priority is to optimize a feasible design - Design 2 is optimal, in the sense that area and delay cannot both be decreased from this point - Tradeoff is now necessary, according to Policy - A typical design *Policy* is to optimize area subject to a delay constraint (2 to 3) - Often a preferred policy is to optimize delay subject to an area constraint (2 to 4) - Sometimes other factors, such as testability or power, take priority - Typically this moves the area-delay tradeoff curve up and to the right - Thus designs 1 and 2 are optimal - Typically performed in a technology independent view of the circuit - In this view gates are regarded as logic functions - These functions are converted to physical gates by Technology Mapping ``` a = xi yi b = xi' yi' e = a' b' zi = ec<sub>i-1</sub>' +e'c<sub>i-1</sub> c = xi yi d = xi + y + i f = dc<sub>i-1</sub> ci = c + f ``` In this view the gates of the full adder circuit are just logic equations - Common subfunctions shared - Functions "Technology Mapped" to negative gates - Faults Models - Stuck-at faults - Delay faults - Test vectors - Fault simulation - Automatic test pattern generation - Diagnosis - Testable Design - First step is to identify the Critical Path - Simplest delay model: "number of logic levels" - Static Delay Models: - Levels of Logic - Delay function of size, load - Worst, best case models - Dynamic Delay Models - Simplified device models - Full Spice analysis - In the 70s IBM found that 75% of all its cpu cycles went to critical path algorithms - Now the bottleneck is Formal Verification - Scalable algorithms required, for which cpu time and space increase: - linearly in problem size n (O(n)) - log-linearly (O(nlog(n))) - even quadratic $(O(n^2))$ too expensive - Graph - Edge - Vertex - Undirected graph - Digraph - All edges are directed - Mixed graph - DAG (directed acyclic graph) **Graph: ordered set of two sets** $$G = (V, E)$$ : a set of vertices or nodes : a set of edges or arcs 4; the successors (fanouts) of node 4 $$4 \rightarrow = \{1,3\}$$ → 4: the predecessors (fanins) of node 4 $$4_{\rightarrow} = \{1,2\}$$ - Transitive closure - fanout - fanin - Source - v has no predecessors - Sink - v has no successors ## • Intersecting 2 sets of sets ``` Ops Times/call Procedure SET_CARTESIAN_PRODUCT(G,H) { Best Worst m = |G|; n = |H|; c1 P = NULL c2 for (i = 1, 2, ..., m) { c3m 1 for (j = 1, 2, ..., n) { c4n m m P = P \cup (Gi \cap Hj) c5 mn mng return(P) c6 ``` - This problem is modeled as that of finding the longest path in a DAG (Directed Acyclic Graph) - Thus we digress for a while, and introduce the notions of sets and graphs - Then we discuss a scalable (linear) complexity algorithm for finding the longest path in a DAG ## **Longest Paths** Line 4 5 6 9 10 11 12 13 14 Procedure LONGEST\_PATH(V, E, L, I, spec) { Times/call Ops Best Worst n = |V|; m = |E|; q = |I|CO. $\mathbf{for}(v \in V)$ { $C_1$ $\lambda(v) = 0$ $D_v = |{ ightarrow} v|$ Q = I while $(Q eq \emptyset)$ { $v = \mathtt{DEQUEUE}(Q)$ $foreach(a \in \iota \rightarrow) \{$ $\lambda_a = \max(\lambda_a, (\lambda_v + L_{v,a}))$ $D_a = D_a - 1$ $v^* = \operatorname{select}(V, \lambda^*)$ $return(\pi, \lambda)$ } $if(D_a=0)$ QUEUE(Q,a) } $\lambda^* = \max_{v \in V} \{\lambda_v\}$ $\pi = \mathtt{BACK\_TRACE}(V, E, L, \lambda, v^*, (spec - \lambda^*))$ $\pi$ 1 $\pi$ $\pi$ 1 $\pi$ $\pi$ m 777. m $C_2$ C3Q CT $C_{5}$ Cal **C7** CR. Cq C10 $c_{11}\pi$ $c_{12}\pi$ C13 $c_{14}$ m Graph Algorithms $\pi$ $\pi$ $\pi$ $\pi$ m 777. m m $\pi$ - The slack of an edge(a,v) is the slack of v plus the difference between the length of the longest path to v and the longest path to v through (a,v) - In formula, slack<sub>a,v</sub> = slack<sub>v</sub> + $(\lambda_v (\lambda_a + L_{a,v}))$ - Here $\lambda_v$ is the length of the longest path to v, and ( $\lambda_a$ + L<sub>a,v</sub>) is the length of the longest path to v that passes through the edge (a,v) - The slack of a node u is defined as the minimum of its fanout edge slacks, so slack<sub>a</sub> = min slack<sub>v</sub> - A function F(n) is in the set O(g(n)) if and only if there exist positive constants $c_o$ and $n_o$ such that $F(n) \le c_o g(n)$ for all $n \ge n_o$ - This means that F(n) is asymptotically bounded <u>from above</u> by a linear function of g(n) - A function F(n) is in the set $\Omega(g(n))$ if and only if there exist positive constants $c_{\Omega}$ and $n_{\Omega}$ such that $F(n) \geq c_{\Omega} g(n)$ for all $n \geq n_{\Omega}$ - This means that F(n) is asymptotically bounded <u>from</u> <u>below</u> by a linear function of g(n)