Chapter 3  Scan Architectures and Techniques

Figure 3-1 Introduction to Scan-based Testing

Chip under Test without Scan

- >1,000,000 gates
- >5,000,000 faults
- >10,000 flip-flops
- >1,000 sequential depth
- <500 chip pins
* >2,000 gates/pin
* $2^M = 2^{1000}$ (M = sequential depth)

A deep sequential circuit

Chip under Test with Full-Scan

- >1,000,000 gates
- >5,000,000 faults
- no effective flip-flops
- no sequential depth
- <500 + 10,000 chip pins
* >95.23 gates/pin
* $2^M = 2^0 = 1$ (M = sequential depth)

A combinational circuit
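The gates-per-pin and state-count figures above can be checked with a few lines of arithmetic. This is a minimal sketch; the function names are illustrative and not from the source, and the gate, pin, and depth values are the lower or upper bounds quoted in the figure.

```python
def gates_per_pin(gates, pins):
    """Average number of gates that must be tested through each pin."""
    return gates / pins


def state_count(sequential_depth):
    """2^M, where M is the sequential depth quoted in the figure."""
    return 2 ** sequential_depth


GATES = 1_000_000

# Deep sequential circuit: only the 500 chip pins give access.
deep_sequential = gates_per_pin(GATES, 500)

# Combinational circuit: the 10,000 flip-flops act as extra pins.
combinational = gates_per_pin(GATES, 500 + 10_000)

print(deep_sequential, round(combinational, 2))
print(state_count(0), state_count(1000).bit_length())
```

The second ratio is about 95.2 gates per pin, which is why the 10,000 extra access points make the test problem tractable.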
Figure 3-2 An Example Non-Scan Circuit

Sequential Depth of 4
Combinational Width of 6

$2^{6+4} = 1024$ Vectors
A no-clock, combinational-only circuit with:
6 inputs plus 4 pseudo-inputs and
2 outputs plus 4 pseudo-outputs

Figure 3-3 Scan Effective Circuit
Figure 3-4 Flip-Flop versus Scan Flip-Flop
Set-Scan D Flip-Flop with Set at Higher Priority

Set-Scan D Flip-Flop with Scan-Shift at Higher Priority

Figure 3-5 Example Set-Scan Flip-Flops
Figure 3-6 An Example Scan Circuit with a Scan Chain
The scan cell provides observability and controllability of the signal path through the four transfer functions of a scan element.

**Operate:** D to Q through port a of the input multiplexer: allows normal transparent operation of the element.

**Scan Sample:** D to SDO through port a of the input multiplexer: gives observability of logic that fans into the scan element.

**Scan Load/Shift:** SDI to SDO through the b port of the multiplexer: used to serially load/shift data into the scan chain while simultaneously unloading the last sample.

**Scan Data Apply:** SDI to Q through the b port of the multiplexer: allows the scan element to control the value of the output, thereby controlling the logic driven by Q.

Figure 3-7 Scan Element Operations
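The four transfer functions can be illustrated with a small behavioral model. This is a sketch under stated assumptions: the class name `ScanCell` is hypothetical, the cell is modeled as a 2:1 multiplexer in front of a D flip-flop, and SDO is assumed to share the flip-flop output with Q.

```python
class ScanCell:
    """Behavioral model of a multiplexed-D scan flip-flop (illustrative only)."""

    def __init__(self, state=0):
        self.state = state

    def clock(self, d, sdi, se):
        """Rising clock edge: port a (D) when SE = 0, port b (SDI) when SE = 1."""
        self.state = sdi if se else d
        return self.q, self.sdo

    @property
    def q(self):
        return self.state

    @property
    def sdo(self):
        # Assumption: SDO is the same flip-flop output as Q.
        return self.state


cell = ScanCell()
operate_or_sample = cell.clock(d=1, sdi=0, se=0)   # D to Q and D to SDO
shift_or_apply = cell.clock(d=0, sdi=1, se=1)      # SDI to SDO and SDI to Q
print(operate_or_sample, shift_or_apply)
```

In this model Operate and Scan Sample are the same clocked event seen at two outputs, as are Scan Load/Shift and Scan Data Apply; only the SE value at the clock edge distinguishes the two pairs.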
Figure 3-8 Example Scan Test Sequencing

**Functional Operation Mode**
- While the clock is low, apply test data to SDI and place SE = 1

**Scan Shift Load/Unload Mode**
- At the rising edge of the clock, test data will be loaded
- Apply clocks for scan length

**Scan Apply Mode (Last Shift)**
- When chain is loaded, the last shift clock will apply scan data
- While the clock is low, place SE = 0
- Normal circuit response will be applied to D

**Scan Sample Mode**
- The next rising edge of the clock will sample D
- Return to Load/Shift mode to unload circuit response sample
- NOTE: unloading is simultaneous with loading the next test
- Repeat operations until all vectors have been applied
- NOTE: the chip’s primary inputs must be applied during the scan apply mode (after the last shift)
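The load, apply, sample, and unload sequence can be sketched as a small simulation. This is an illustrative model, not a tester program: the chain is a Python list, `toy_logic` is a hypothetical stand-in for the circuit's combinational logic (it inverts each cell), and primary inputs are omitted for brevity.

```python
def shift(chain, sdi):
    """One shift clock with SE = 1: SDI enters cell 0, the last cell drives SDO."""
    sdo = chain[-1]
    return [sdi] + chain[:-1], sdo


def load_unload(chain, vector):
    """Shift a full vector in while collecting what was already in the chain."""
    unloaded = []
    for bit in vector:
        chain, sdo = shift(chain, bit)
        unloaded.append(sdo)
    return chain, unloaded


def toy_logic(state):
    """Hypothetical combinational logic: each D input is the inverse of that cell's Q."""
    return [1 - q for q in state]


def run_scan_test(vectors, length, logic):
    chain = [0] * length
    responses = []
    chain, _ = load_unload(chain, vectors[0])      # first load, nothing to observe yet
    for next_vector in vectors[1:] + [[0] * length]:
        chain = logic(chain)                       # sample clock with SE = 0
        chain, out = load_unload(chain, next_vector)  # unload while loading the next test
        responses.append(out)
    return responses


print(run_scan_test([[1, 0, 1], [0, 0, 1]], 3, toy_logic))
```

Each response is collected during the same shift clocks that load the following vector, which is the simultaneous load/unload noted above; the final pass shifts in a dummy vector only to unload the last sample.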
Figure 3-9 Example Scan Testing Timing
Figure 3-10 Safe Scan Shifting

**Gated Clock Nets**
- Provide an Enable Signal
- Force the clock on

**Asynchronous or Synchronous Signals with Higher Priority than Scan—or Non-Scan Elements**
- Provide a Blocking Signal
- f_seB

**Driven Contention During Scan Shifting**
- Provide a Forced Mutual Exclusivity
- t_seB
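The forced mutual exclusivity rule for tristate drivers can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the polarity of t_seB is not modeled (a plain `shifting` flag is used instead), and the choice of driver 0 as the one left enabled is arbitrary.

```python
def drivers_in_contention(enables):
    """True when more than one tristate driver is enabled on the same net."""
    return sum(enables) > 1


def safe_shift_enables(enables, shifting):
    """While shifting, force exactly one driver on (driver 0, by assumption)."""
    if not shifting:
        return list(enables)
    return [1] + [0] * (len(enables) - 1)


raw = [1, 1, 0]   # arbitrary values that shifting scan data might place on the enables
print(drivers_in_contention(raw))
print(drivers_in_contention(safe_shift_enables(raw, shifting=True)))
```

Forcing exactly one driver on, rather than all drivers off, also keeps the net from floating during the shift.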

Design-for-Test for Digital IC’s and Embedded Core Systems
Alfred L. Crouch
© 1999 Prentice Hall, All Rights Reserved
Waveforms: CLK, SE, t_seB

Intervals: The Last Shift In, The Scan Sample, The First Shift Out

Faults Exercised Interval

Note: a tristate scan enable may be a separate signal that has slightly different timing than the flip-flop SE

Driven Contention during the Capture Cycle (t_seB de-asserted)

Figure 3-11 Safe Scan Vectors
A clocked, sequential circuit with depth=1:
6 inputs plus 3 pseudo-inputs and
2 outputs plus 3 pseudo-outputs

An Example Using a Chip with 1000 Scan Bits and 5 Scan Vectors
Red Space Is Wasted Tester Memory

One Long Scan Chain

<table>
<thead>
<tr>
<th>1000</th>
<th>1000</th>
<th>Vector Data</th>
<th>1000</th>
<th>1000</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>X’s on all other channels not actively used for parallel pin data</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Each Vector is 1000 Bits Long
So 5 Vectors Are 5000 Bits of Tester Memory

Many Variable Length Scan Chains

<table>
<thead>
<tr>
<th>Chain</th>
<th>Scan Bits per Vector</th>
<th>X Padding per Vector</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>120</td>
<td>60</td>
</tr>
<tr>
<td>2</td>
<td>80</td>
<td>100</td>
</tr>
<tr>
<td>3</td>
<td>100</td>
<td>80</td>
</tr>
<tr>
<td>4</td>
<td>110</td>
<td>70</td>
</tr>
<tr>
<td>5</td>
<td>90</td>
<td>90</td>
</tr>
<tr>
<td>6</td>
<td>180</td>
<td>0</td>
</tr>
<tr>
<td>7</td>
<td>20</td>
<td>160</td>
</tr>
<tr>
<td>8</td>
<td>100</td>
<td>80</td>
</tr>
<tr>
<td>9</td>
<td>100</td>
<td>80</td>
</tr>
<tr>
<td>10</td>
<td>100</td>
<td>80</td>
</tr>
</tbody>
</table>

Each Vector Is 180 Bits Long, So 5 Vectors Are 900 Bits of Tester Memory
Differences from Longest Chain (180) Are Full of X’s: Wasted Memory

Many Balanced Scan Chains

<table>
<thead>
<tr>
<th>100</th>
<th>100</th>
<th>Vector Data</th>
<th>100</th>
<th>100</th>
</tr>
</thead>
<tbody>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
<tr>
<td>100</td>
<td>100</td>
<td>Vector Data</td>
<td>100</td>
<td>100</td>
</tr>
</tbody>
</table>

Each Vector Is 100 Bits Long, So 5 Vectors Are 500 Bits of Tester Memory
**No Wasted Memory Space**

Figure 3-13 Multiple Scan Chains
Figure 3-14 The Borrowed Scan Interface
Figure 3-15 Clocking and Scan

- Scan Bypass Clocks
- Scan Testing an On-Chip Clock Source
Driven Contention during Scan Shifting

Asynchronous or Synchronous Signals with Higher Priority than Scan—or Non-Scan Elements

Gated Clock Nets

Figure 3-16 Scan-Based Design Rules
**Basic Netlist Scan Insertion**
- Element Substitution
- Ports, Routing & Connection of SE
- Ports, Routing & Connection of SDI-SDO

**Extras**
- Tristate “Safe Shift” Logic
- Asynchronous “Safe Shift” Logic
- Gated-Clock “Safe Shift” Logic
- Multiple Scan Chains
- Scan-Bit Re-Ordering
- Clock Considerations

Figure 3-17 DC Scan Insertion
Scan Fail Data Presented at Chip Interface Automatically Implicates the Cone of Logic at One Flip-Flop

Multiple Fails under the Single Fault Assumption Implicate Gates Common to Both Cones of Logic
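The cone-intersection idea can be sketched with sets. This is an illustrative example only: the flip-flop and gate names are hypothetical, and a real flow would extract each cone of logic from the netlist.

```python
def candidate_gates(cones, failing_flops):
    """Gates common to the cones of logic of every failing flip-flop."""
    sets = [cones[flop] for flop in failing_flops]
    return set.intersection(*sets) if sets else set()


# Hypothetical fan-in cones: flip-flop name -> gates feeding its D input.
CONES = {
    "FF1": {"U1", "U2", "U3"},
    "FF2": {"U3", "U4"},
}

print(candidate_gates(CONES, ["FF1"]))          # one fail implicates the whole cone
print(candidate_gates(CONES, ["FF1", "FF2"]))   # two fails narrow it to shared gates
```

Under the single fault assumption, each additional failing flip-flop can only shrink the candidate set, which is what makes scan fail data efficient for diagnosis.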

Figure 3-18 Stuck-At Scan Diagnostics
Basic Purpose

- Frequency Assessment
- Pin Specifications
- Delay Fault Content

Cost Drivers

- No Functional Vectors
- Fewer Overall Vectors
- Deterministic Grade

Figure 3-19 At-Speed Scan Goals
Separate Scan Enables for Tristate Drivers, Clock Forcing Functions, Logic Forcing Functions, Scan Interface Forcing Functions, and the Scan Multiplexor Control

Because the Different Elements Have Different Timing Requirements

Figure 3-20 At-Speed Scan Testing
Figure 3-21 At-Speed Scan Architecture
Figure 3-22 At-Speed Scan Interface
The Clock Domains and Logic Timing should be crafted so that the very next rising edge after the launch or last shift is the legal capture edge.
Cross-domain clock skew must be managed to less than the fastest flip-flop update time in the launching clock domain.

If it is not, the receiving flip-flop may receive new-new scan data before the capture clock arrives.

To prevent this outcome, constrain the ATPG tool to sample only one clock domain at a time during the sample interval.
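The skew rule and the one-domain-at-a-time fallback can be expressed as two small checks. This is a sketch with hypothetical names and example numbers; it does not represent any ATPG tool's constraint syntax.

```python
def skew_is_safe(skew_ps, fastest_update_ps):
    """Skew must be less than the fastest flip-flop update time in the launching domain."""
    return skew_ps < fastest_update_ps


def capture_schedule(domains):
    """One sample interval per clock domain, so no two domains capture together."""
    return [[domain] for domain in domains]


print(skew_is_safe(skew_ps=100, fastest_update_ps=250))
print(capture_schedule(["clk_a", "clk_b"]))   # hypothetical domain names
```

When `skew_is_safe` is false for a pair of domains, sampling them in separate intervals removes the cross-domain race at the cost of more vectors.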

Figure 3-24 Clock Skew and Scan Insertion
Figure 3-25 Scan Insertion for At-Speed Scan

**Design Flow Chart**

- **Behavior**
  - Model
  - Simulation Verification

- **Gates**
  - Synthesis
  - Timing Analysis
  - Specification Determination

- **Mask**
  - Place and Route
  - Mask and Fab

- **Silicon**
  - Silicon Test

**Scan Mode:** Fixed “Safe” Logic

- **Scan Mode**:
  - Bus_SE
  - Tristate_SE
  - Logic Force_SE
  - Architecture Development

- **Scan Shift SE**
- **Clock Force_SE**
- **Scan Data Connection Insertion**

- **Scan Chain Bit Re-Ordering**

**Scan Interface Control**

- **Scan Enable (SE):** Scan Shift
- **Force_SE:** Logic Forced States
- **Tristate_SE:** Internal Tristates
- **Bus_SE:** Scan Interface Control
Static Timing Analysis Provides Path Description of Identified Critical Path from the Q-output of R1 to the Device Output Pin—Out1

Isolated Combinational Logic
All Fan-in to Endpoint Is Accounted at this Endpoint
Fanout to other Endpoints is Evaluated at Those Endpoints

Period = 20ns : Output Strobe @ 15ns

<table>
<thead>
<tr>
<th>Path Element Description</th>
<th>Incremental Delay</th>
<th>Cumulative Delay</th>
</tr>
</thead>
<tbody>
<tr>
<td>Clk</td>
<td>2.2ns</td>
<td>Skew Amb.</td>
</tr>
<tr>
<td>R1.Q</td>
<td>0.0ns</td>
<td>0.0ns</td>
</tr>
<tr>
<td>U35.A</td>
<td>2.1ns</td>
<td>2.1ns</td>
</tr>
<tr>
<td>U35.Z</td>
<td>0.1ns</td>
<td>2.2ns</td>
</tr>
<tr>
<td>U37.A</td>
<td>3.2ns</td>
<td>5.4ns</td>
</tr>
<tr>
<td>U37.Z</td>
<td>0.2ns</td>
<td>5.6ns</td>
</tr>
<tr>
<td>U38.A</td>
<td>2.2ns</td>
<td>7.8ns</td>
</tr>
<tr>
<td>U38.Z</td>
<td>0.1ns</td>
<td>7.9ns</td>
</tr>
<tr>
<td>Out1</td>
<td>Dly=10.1</td>
<td>Slk=4.9ns</td>
</tr>
</tbody>
</table>

Timing Analysis Report
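The cumulative delay and slack columns of the report can be recomputed from the incremental delays. This is a sketch under one assumption, labeled in the code: the 2.2 ns on the Clk row is added to the 7.9 ns data path to give the reported delay of 10.1 ns, and slack is measured against the 15 ns output strobe. Values are held in picoseconds to keep the arithmetic exact.

```python
# Incremental delays in picoseconds, copied from the timing report table.
PATH = [
    ("R1.Q", 0), ("U35.A", 2100), ("U35.Z", 100),
    ("U37.A", 3200), ("U37.Z", 200), ("U38.A", 2200), ("U38.Z", 100),
]
CLK_PS = 2200       # Clk row of the report
STROBE_PS = 15000   # output strobe at 15 ns in a 20 ns period


def cumulative(path):
    """Running sum of incremental delays along the path."""
    total, rows = 0, []
    for name, increment in path:
        total += increment
        rows.append((name, total))
    return rows


def delay_and_slack(path, clk_ps, strobe_ps):
    # Assumption: the Clk row delay is added to the data path delay.
    delay = clk_ps + cumulative(path)[-1][1]
    return delay, strobe_ps - delay


print(cumulative(PATH))
print(delay_and_slack(PATH, CLK_PS, STROBE_PS))
```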

Figure 3-26 Critical Paths for At-Speed Testing
Polynomial: $X^3 + X + 1 = X^3 + X^1 + X^0$; evaluated at $X = 2$: $2^3 + 2^1 + 2^0 = 11$
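A linear feedback shift register built on this polynomial can be sketched in a few lines. This is an illustrative Galois-style implementation chosen for brevity, not necessarily the structure drawn in the figure; the constant 11 (binary 1011) is the polynomial value computed above.

```python
POLY = 0b1011   # X^3 + X + 1, i.e. 2^3 + 2^1 + 2^0 = 11
DEGREE = 3


def lfsr_step(state):
    """Multiply the state by X and reduce modulo the polynomial."""
    state <<= 1
    if state & (1 << DEGREE):
        state ^= POLY
    return state


def lfsr_sequence(seed=1):
    """All states visited before the register returns to the seed."""
    states = [seed]
    state = lfsr_step(seed)
    while state != seed:
        states.append(state)
        state = lfsr_step(state)
    return states


print(lfsr_sequence())   # 7 distinct nonzero states for a 3-bit register
```

The register visits all seven nonzero 3-bit states before repeating, which is the maximal-length behavior that makes such a register useful as a pseudo-random pattern generator.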

Figure 3-27 Logic BIST
Scan Testing Methodology

**Advantages**
- Direct Observability of Internal Nodes
- Direct Controllability of Internal Nodes
- Enables Combinational ATPG
- More Efficient Vectors
- Higher Potential Fault Coverage
- Deterministic Quality Metric
- Efficient Diagnostic Capability
- AC and DC Compliance

**Concerns**
- Safe Shifting
- Safe Sampling
- Power Consumption
- Clock Skew
- Design Rule Impact on Budgets

Figure 3-28 Scan Test Fundamentals Summary