
UNIVERSITY OF CALIFORNIA

Los Angeles

An 8-Bit 150-MHz CMOS A/D Converter

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in Electrical Engineering

by

Yun-Ti Wang

1999
Dedication

To my parents and Jing for their love and support.

Table of Contents

Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Organization

Chapter 2 ADC Applications and Architectures
2.1 Applications
2.1.1 Digital Oscilloscopes
2.1.2 Gigabit Ethernet
2.1.3 RGB-to-LCD Display Conversion
2.2 Architecture Review
2.2.1 Flash Architecture
2.2.2 Two-Step Architecture
2.2.3 Pipelined Architecture
2.2.4 Interleaved Architecture
2.2.5 Interpolating Architecture

Chapter 3 Proposed ADC Architecture
3.1 Sliding Interpolation
3.2 Embedded Pipelining
3.3 Addition of Interleaving
3.4 Clock Edge Reassignment
3.5 Reinterpolation
3.6 Effect of Nonlinearity in Sliding Interpolation

Chapter 4 Circuit Design and Layout Considerations
4.1 Introduction
4.2 Front-End Sample-and-Hold Circuit
4.3 Differential Amplifiers
4.4 Comparator
4.5 Clock Edge Reassignment
4.6 Control and Decode Circuit
4.7 ROM and Output Stage
4.8 Clock Generator
4.9 One Slice of First-Stage Signal Path
4.10 Floor Plan And Layout Considerations

Chapter 5 Experimental Results
5.1 Introduction
5.2 Design of Chip-on-Board Assembly
5.3 Test Setup
5.4 Experimental Results

Chapter 6 Conclusion and Future Work

Bibliography

List of Figures
Figure 2.1 Digital oscilloscope.
Figure 2.2 ADC application in Gigabit Ethernet.
Figure 2.3 ADC application in RGB-to-LCD display conversion.
Figure 2.4 Block diagram of an N-bit flash ADC.
Figure 2.5 Transfer curves of an N-bit flash ADC.
Figure 2.6 Mapping scheme of a 4-bit two-step ADC.
Figure 2.7 Block diagram of an N-bit, two-step ADC.
Figure 2.8 Block diagram of a pipelined ADC.
Figure 2.9 Block diagram of an interleaved ADC.
Figure 2.10 A 2x active interpolating ADC.
Figure 3.1 Traditional active 2x interpolation architecture.
Figure 3.2 Sliding interpolation architecture.
Figure 3.3 Flow diagram of multi-stage sliding interpolation.
Figure 3.4 Sliding interpolation architecture.
Figure 3.5 Block diagram of multi-stage sliding interpolation.
Figure 3.6 Detailed block diagram of multi-stage sliding interpolation.
Figure 3.7 Sliding mechanism.
Figure 3.8 Pipelined sliding interpolation ADC architecture.
Figure 3.9 Addition of interleaving scheme.
Figure 3.10 Complete ADC architecture with replica SHA.
Figure 3.11 (a) Timing mismatch in interleaved architecture, (b) generation of CK1 and CK2 by a frequency divider.
Figure 3.12 Basic concept of the clock edge reassignment.
Figure 3.13 Detailed operation of clock edge reassignment.
Figure 3.14 Reinterpolation (a) implementation, (b) error plot.
Figure 3.15 INL reduction by reinterpolation observed in Monte Carlo simulations.
Figure 3.16 Nonlinearity-induced error in 2x interpolation.
Figure 4.1 Dual-channel interleaved SHA.
Figure 4.2 Timing diagram for SHA.
Figure 4.3 A triple-channel interleaved SHA with a replica.
Figure 4.4 Preamplifier.
Figure 4.5 Reinterpolating and interpolating amplifiers.
Figure 4.6 Comparator used in the first stage (CMP_A).
Figure 4.7 Comparator used in stages 2 to 5 (CMP_B).
Figure 4.8 Clock edge reassignment circuit for a dual-channel system.
Figure 4.9 Operational diagram of a dual-channel CERA system.
Figure 4.10 Block diagram of NAND_FF with comparator.
Figure 4.11 Details of NAND_FF.
Figure 4.12 ROM and output stage.
Figure 4.13 Clock generator.
Figure 4.14 High-speed differential D latch.
Figure 4.15 Realization of a slice of the signal path in the first stage.
Figure 4.16 Simulated output of the 8-bit ADC.
Figure 4.17 Layout floor plan.
Figure 4.18 Detailed circuit arrangement in the first stage.
Figure 4.19 Sliding/multiplexing mechanism of the lower bank.
Figure 4.20 Sliding/multiplexing mechanism of the upper bank.
Figure 4.21 Die photo.
Figure 5.1 Chip-on-board assembly.
Figure 5.2 Zoom-in of the central cavity area.
Figure 5.3 Actual size of chip-on-board assembly.
Figure 5.4 Mother board and daughter board.
Figure 5.5 A combination of different kinds of boards.
Figure 5.6 Test setup.
Figure 5.7 DNL and INL at fin = 1.8 MHz and fsample = 150 MHz.
Figure 5.8 FFT at fin = 1.76 MHz.
Figure 5.9 SNDR and SFDR at fsample = 150 MHz.

List of Tables
Table 1: Measurement Summary

ACKNOWLEDGMENTS

First, I would like to express my sincere gratitude to Professor Behzad

Razavi for his guidance and support throughout my Ph.D. study. He is my role

model. I am also grateful to Professors William J. Kaiser, Frank M. Chang, and

James S. Gibson for serving on my Ph.D. committee.

Next, I would like to thank Jafar Savoj and Tai-Cheng Lee for the helpful

technical discussions with them and also Alireza Razzaghi for his proofreading of

my dissertation.

I am very grateful and feel very lucky that my wonderful parents are open-

minded, patient, and very supportive throughout this long graduate study process. I

am also thankful for the support from other members in our family.

Last but not least, I would like to express my deepest gratitude to Guo Jing,

my dear better half, who has generously and consistently provided me with the crucial

emotional support and love that I needed the most to complete this challenging

process. With the mutual understanding, caring, encouragement, and support

between us, I believe we can make more contributions to this world in the future.

ABSTRACT OF THE DISSERTATION

An 8-Bit 150-MHz CMOS A/D Converter

by

Yun-Ti Wang
Doctor of Philosophy in Electrical Engineering
University of California, Los Angeles, 1999
Professor Behzad Razavi, Chair

High-speed analog-to-digital converters (ADCs) with resolutions of 8 bits

find wide application in instrumentation and communication systems. For example,

portable digital oscilloscopes use 8-bit ADCs with sampling rates above one

hundred megahertz. Also, the Gigabit Ethernet standard with CAT-5 copper cable

requires four 125-MHz ADCs having a resolution of 7 to 8 bits to perform the front-

end analog-to-digital data conversion.

This dissertation presents an 8-bit, 5-stage interleaved and pipelined ADC

that performs analog processing only by means of open-loop circuits such as

differential pairs and source followers, thereby achieving a high conversion rate.

The concept of “sliding interpolation” is proposed to obviate the need for a large

number of comparators or interstage digital-to-analog converters and residue

amplifiers. The pipelining incorporates distributed sampling between the stages so

as to relax the linearity-speed trade-offs in the sample-and-hold functions. This

work also introduces a “clock edge reassignment” technique that suppresses timing

mismatch issues in interleaved systems. Moreover, in order to reduce the integral

nonlinearity error (INL) with negligible speed or power penalty, a “reinterpolation”

method is proposed.

Fabricated in a 0.6-µm CMOS technology, the ADC achieves a DNL of 0.62

LSB, an INL of 1.24 LSB, an SFDR of 50 dB, and an SNDR of 43.7 dB at a 150-MHz

sampling rate with low input frequencies. At an input frequency of 70 MHz, an

SNDR of 40 dB is attained. The converter draws 395 mW from a 3.3-V supply and

occupies an area of 1.2 × 1.5 mm².

Chapter 1

Introduction

1.1 Motivation
Analog-to-digital (A/D) conversion and digital-to-analog (D/A) conversion are

critical interfaces in mixed-signal processing systems. With the continuous advance of

semiconductor technology and scaling of devices, digital circuits have achieved both

high speed and low power dissipation. This trend has several impacts on mixed-signal

integrated circuits (ICs). First, increasingly more operations are performed by digital

circuits rather than by their analog counterparts. Second, the speed of the A/D and D/A

interfaces must scale with the speed of the digital circuits in order to fully utilize the

advantages of advanced technologies. Third, cost and performance make it desirable to

achieve high levels of integration on a single chip for mixed-signal processing

systems.

The above observations lead to several important design implications for analog

circuits. First, front-end analog signal processing and data conversion (including anti-

aliasing filters) still remain as important niches where analog solutions provide

advantages over digital approaches. A/D converters (ADCs) and D/A converters

(DACs) will continue to play an indispensable and significant role in mixed-signal

processing systems. Second, in video and communications applications, the transfer

rate of the data between the analog and digital domains continues to increase, creating

new challenges in the design of data converters. Third, when a data converter is

implemented on a chip along with a great deal of digital circuitry, it experiences

substantial substrate and supply noise. Thus, the noise immunity of data converters

becomes an extremely important issue in mixed-signal processing systems. Fourth, the

power consumption of data converters is a critical parameter in many of today’s

applications, impacting the cost of packaging as well as the battery lifetime in portable

products.

In general, A/D conversion requires higher power consumption and circuit

complexity than D/A conversion to achieve a given resolution and speed. Therefore,

ADCs often appear as the bottleneck in high-performance mixed-signal systems. This

observation underscores the importance of research and development to improve A/D

conversion algorithms and circuits for future applications [8]-[37].

The goal of this research is to develop new A/D conversion architectures and

circuit techniques that lead to high speed and moderate power consumption in CMOS

technology, with the objective of achieving a conversion rate well above 100 MHz at a

resolution of 8 bits. Such A/D converters find wide application in digital sampling

oscilloscopes, Gigabit Ethernet over CAT-5 twisted pair cables, RGB-to-LCD data

conversion, and imaging systems with a large number of pixels.

The concepts introduced in this research have been realized in the design of an

8-bit, 150-MHz A/D converter fabricated in a 0.6-µm CMOS technology. The

prototype achieved a signal-to-(noise+distortion) ratio (SNDR) of 43 dB at full

sampling rate while consuming 395 mW from a 3.3-V supply.

1.2 Thesis Organization


This dissertation presents both a theoretical study and experimental verification

of the novel architecture and circuit techniques developed during the course of this

research.

Chapter 2 reviews applications of ADCs and conventional ADC architectures.

Chapter 3 introduces the ADC architecture, presenting techniques such as sliding

interpolation, embedded pipelining and interleaving, clock edge reassignment, and

reinterpolation. Chapter 4 describes the design of each building block and various

trade-offs at the circuit level and the architecture level. Some critical layout issues are

also addressed. Chapter 5 presents the test procedure and the experimental results

obtained for the prototype and Chapter 6 provides a summary and recommendations for

future work.

Chapter 2

ADC Applications and Architectures

In this Chapter, we first describe general potential applications of A/D converters

with sampling rates above 100 MHz and resolutions of about 8 bits. These include digital

oscilloscopes, Gigabit Ethernet receivers, and LCD displays.

Next, we review a number of ADC architectures suited to high-speed operation

and study their speed-resolution-power trade-offs [9]-[21]. Of interest to us are flash,

two-step, pipelined, interleaved, and interpolating architectures.

2.1 Applications

2.1.1 Digital Oscilloscopes

Digital oscilloscopes employ high-speed ADCs to quantize the probed analog

signal. For portable digital oscilloscopes, low power consumption and cost are critical.

As shown in Fig. 2.1, an 8-bit A/D converter digitizes the input signal, a core DSP processes

the result, and an 8-bit DAC converts the signal back to an analog waveform, which is

then applied to the display.

Figure 2.1 Digital oscilloscope. [Block diagram: 8-bit ADC, DSP, 8-bit DAC, display.]

2.1.2 Gigabit Ethernet

Gigabit Ethernet over CAT-5 twisted-pair wires requires four 8-bit, 125-MHz

ADCs at the receiver end. The principal challenge in the design of these converters is

power consumption. As shown in Fig. 2.2, the ADCs must provide sufficient dynamic range

so as to accommodate a large echo and signal level variation due to the attenuation

through the cable.

Figure 2.2 ADC application in Gigabit Ethernet. [Block diagram: CAT-5 cable, 8-bit ADC clocked at 125 MHz, echo canceller, equalizer, demodulator.]

2.1.3 RGB-to-LCD Display Conversion

Computer CRT displays typically incorporate three 8-bit RAMDACs to display

the analog RGB images on a CRT monitor. With the advent of flat-panel LCD displays,

it is necessary to convert the analog RGB signals to the form required for LCD displays.

This is accomplished as shown in Fig. 2.3.

High-end LCD displays require an ADC conversion rate of 150 MHz. The power

dissipation is also critical because three such ADCs are integrated on one chip.

Figure 2.3 ADC application in RGB-to-LCD display conversion. [Block diagram: a DSP and three 8-bit DACs driving an RGB display; three 8-bit ADCs at 150 MHz and a DSP driving an LCD display.]

2.2 Architecture Review

2.2.1 Flash Architecture

Figure 2.4 shows the block diagram of an N-bit flash ADC. The analog input

signal is simultaneously compared with threshold voltages of the ADC by an array of $2^N - 1$

comparators, thereby producing a thermometer code. The result is subsequently

converted to a binary output by an encoder. The threshold levels are usually generated by

a ladder consisting of a string of matched resistors.
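
The flash conversion described above can be illustrated with a short behavioral sketch. The Python fragment below is a minimal model and not the circuit of this work: the function name `flash_adc`, the default 0-to-1-V threshold range, and the encoder that simply counts ones are assumptions made for illustration. The thresholds follow the labeling of Fig. 2.4, with the lowest tap at VMIN and the highest at VMAX.

```python
def flash_adc(vin, n_bits, v_min=0.0, v_max=1.0):
    """Ideal N-bit flash conversion (behavioral sketch).

    Assumption: 2**N - 1 evenly spaced thresholds from v_min to v_max,
    as an ideal resistor ladder would provide.
    """
    n_comp = 2 ** n_bits - 1
    step = (v_max - v_min) / (n_comp - 1)
    thresholds = [v_min + k * step for k in range(n_comp)]

    # Every comparator fires at the same time: 1 if the input exceeds its tap.
    thermometer = [1 if vin > t else 0 for t in thresholds]

    # Hypothetical encoder: the binary code is the number of ones.
    code = sum(thermometer)
    return thermometer, code
```

For a 3-bit converter and an input of 0.6 V, this sketch returns the thermometer code 1111000 (lowest comparator first) and the output code 4.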

Figure 2.4 Block diagram of an N-bit flash ADC. [Block diagram: ladder references from VMIN = VR[1] to VMAX = VR[2^N − 1], 2^N − 1 comparators, thermometer code, (2^N − 1)-to-N encoder, N-bit output.]

The operation of a flash ADC can be viewed from another perspective that leads

to techniques such as interpolation. Illustrated in Fig. 2.5 are the differential outputs of

each preamplifier in the ADC as the analog input varies from VMIN to VMAX. We note

that each output crosses zero when the input of the preamplifier crosses its respective

reference voltage. Hence, the ADC operation can be viewed as a collection of these zero

crossings.

Figure 2.5 Transfer curves of an N-bit flash ADC. [Plot: preamplifier differential outputs versus Vin, with 2^N − 1 zero crossings from VMIN = VR[1] to VMAX = VR[2^N − 1].]

The principal advantage of the flash architecture is its high throughput rate. The

conversion of each sample takes only a single clock period. However, many issues

limit the utility of this approach for resolutions above 6 bits. The exponential growth of

the input capacitance, power dissipation, and area with resolution is a critical drawback. Furthermore,

the offset of the comparators, the feedthrough of the analog input to the resistor ladder

[2], the slew-dependent comparator delay [4][7], and the problem of bubbles in the

thermometer code [2][4] degrade the static and dynamic performance substantially.

2.2.2 Two-Step Architecture

In order to avoid the exponential growth encountered in flash ADCs, the

quantization of the signal can be performed in two or more steps. The basic principle of

the two-step architecture can be illustrated by the mapping scheme of a 4-bit ADC as

shown in Fig. 2.6. In the two-step topology, the input range is first divided into four equal

segments, and a coarse quantizer is used to determine in which segment the analog input

lies, thus producing the most significant bits (MSBs). Next, each segment is subdivided

into four segments, and a fine quantizer detects the least significant bits (LSBs).

Figure 2.6 Mapping scheme of a 4-bit two-step ADC. [Coarse quantizer segments 00 to 11 with MSB overflow; fine quantizer segments 00 to 11 with LSB overflow.]

A more detailed description of the operation is given in Fig. 2.7, which shows the

block diagram of an N-bit, two-step ADC and the conversion flow. The

circuit consists of a sample-and-hold amplifier (SHA), two flash quantizers, a D/A

converter, and a subtractor.

Figure 2.7 Block diagram of an N-bit, two-step ADC. [(a) Block diagram: S/H, N1-bit coarse ADC, N1-bit DAC, residue Vres, N2-bit fine ADC, N-bit digital output buffer. (b) Conversion flow: coarse quantizer, subtractor, and fine quantizer, with example output “1011”.]

The conversion proceeds as follows: an analog input signal with magnitude of Vin

is sampled by the SHA and subsequently mapped onto the level V1 by the coarse

quantizer, resulting in the two MSBs, e.g., “10” in this case. Next, an analog residue, Vres,

is produced by the subtractor and digitized by the fine quantizer, thereby generating the

two LSBs.
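
The conversion flow above can be traced numerically. The following Python sketch is a behavioral model under assumed conditions (an ideal 0-to-1-V input range, ideal quantizers and DAC, and the hypothetical function name `two_step_adc`); it is meant only to mirror the 4-bit example, not to describe a particular circuit.

```python
def two_step_adc(vin, v_ref=1.0, n1=2, n2=2):
    """Ideal two-step conversion: coarse quantizer, DAC, subtractor, fine quantizer.

    Assumptions: input range 0..v_ref, no offsets, no overlap or correction.
    """
    coarse_levels = 2 ** n1
    fine_levels = 2 ** n2
    segment = v_ref / coarse_levels          # width of one coarse segment

    # Coarse quantizer: which segment does the held input lie in?
    msb = max(0, min(int(vin / segment), coarse_levels - 1))

    # DAC reconstructs the bottom of that segment (level V1); subtractor forms Vres.
    v1 = msb * segment
    vres = vin - v1

    # Fine quantizer subdivides the segment.
    lsb = max(0, min(int(vres / (segment / fine_levels)), fine_levels - 1))

    return msb, lsb, (msb << n2) | lsb
```

With an input of 0.7 V, the coarse quantizer returns “10”, the residue is about 0.2 V, and the fine quantizer returns “11”, so the overall code is “1011”.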

The primary advantage of the two-step topology is that it requires less hardware

and power than a flash architecture. However, this saving comes at the cost of longer

processing time, leading to a substantial reduction of the throughput rate.

2.2.3 Pipelined Architecture

From the above discussion, it is clear that the use of multiple stages can alleviate

the exponential growth present in flash topologies. The two-step architecture exemplifies

this benefit to a certain degree, but the low throughput rate limits the use of this approach.

Pipelining enables potentially faster conversion while avoiding the exponential

growth of power and hardware.

Figure 2.8 illustrates the block diagram of a pipelined ADC. The analog input is

applied to the first stage in the chain, and N1 bits are detected. The analog residue is also

generated and applied to the next stage. The same procedure repeats up to the end of the

chain. This concept is similar to the idea of an assembly line because the interstage

sampling allows all of the stages to operate concurrently.

A common approach to pipelining is based on a precision multiply-by-two stage

[7] that merges most of the interstage operations into a compact circuit. Usually used with

0.5 bits of overlap, this technique provides a modular implementation.

Figure 2.8 Block diagram of a pipelined ADC. [Block diagram: stages 1 to m resolving N1, ..., Nj, ..., Nm bits; stage j contains an S/H, an Nj-bit ADC, an Nj-bit DAC, and a gain of 2^Nj.]

The pipelined architecture offers a number of advantages. First, the throughput

rate is determined by the speed of only one stage in the pipeline. Second, interstage

residue amplification relaxes the precision required of subsequent stages. Third, the

power and hardware of pipelined converters grow almost linearly with the number of bits.

Also, overlap and digital correction [2] can be used to allow large offsets in the

comparators.
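
To make the role of overlap and digital correction concrete, the sketch below models a pipeline built from multiply-by-two stages with 0.5 bits of overlap. It uses the common textbook 1.5-bit-per-stage formulation (three-level decision, thresholds at ±Vref/4), which is an assumption on our part rather than a description of any circuit in this work; the function name `pipeline_adc` and the `offsets` parameter are hypothetical.

```python
def pipeline_adc(vin, n_stages, v_ref=1.0, offsets=None):
    """Behavioral 1.5-bit/stage pipeline with digital correction (sketch).

    Assumptions: input within -v_ref..+v_ref, ideal multiply-by-two residue
    amplifiers, comparator offsets given per stage in volts.
    Returns (signed output code, final residue).
    """
    if offsets is None:
        offsets = [0.0] * n_stages
    v = vin
    code = 0
    for i in range(n_stages):
        hi = v_ref / 4 + offsets[i]
        lo = -v_ref / 4 + offsets[i]
        d = 1 if v > hi else (-1 if v < lo else 0)   # three-level stage decision
        code = 2 * code + d                          # overlapped addition of stage decisions
        v = 2 * v - d * v_ref                        # residue passed to the next stage
    return code, v
```

In this model the input is recovered as `v_ref * code / 2**n_stages` with an error equal to the final residue divided by `2**n_stages`. As long as each comparator offset stays below Vref/4 in magnitude, the residue remains within ±Vref, so the error stays within one LSB even with large offsets. For example, an input of 0.3 V with seven stages yields the code 38 both with ideal comparators and with every threshold shifted by 0.2 V.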

The primary drawback of the conventional pipelined topology is the need for high

precision in the interstage SHAs, DACs, and subtractors, especially at the front end. The

precision typically mandates the use of op amps, imposing severe trade-offs among

speed, voltage swing, gain, and power dissipation. As device dimensions, supply

voltages, and the intrinsic gain ($g_m r_o$) of MOSFETs continue to scale down, the design

of op amps becomes increasingly more difficult.

2.2.4 Interleaved Architecture

In the pipelined topology, the conversion rate is still limited by the settling time

and accuracy requirements of the interstage operations. Interleaving can be used to

further improve the throughput rate.

The basic principle behind interleaving is illustrated in Fig. 2.9. The architecture

employs M identical sub-ADCs, each incorporating a SHA that tracks for T1 seconds and

holds for (M−1)T1 seconds. Thus, each sub-ADC is allotted (M−1)T1 seconds for one

conversion.

The use of multiple parallel channels, however, introduces serious difficulties due

to mismatches [14]. Offset mismatches generally cause fixed-pattern noise and tones at fck/M,

while gain mismatches introduce sideband modulation around fck/M.

The dynamic performance is severely affected by the timing mismatch among the

channels [5][6].
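
The effect of offset mismatch can be seen in a small numerical experiment. The Python sketch below is a simplified model with assumed names (`interleave`, `dft_magnitude`) and assumed offset values; it treats each channel as an ideal sampler plus a fixed offset and ignores gain and timing mismatch.

```python
import cmath


def interleave(samples, offsets):
    """Round-robin M-channel conversion: channel k handles samples k, k+M, k+2M, ...

    Each channel adds its own fixed offset (assumed values, in volts).
    """
    m = len(offsets)
    return [x + offsets[n % m] for n, x in enumerate(samples)]


def dft_magnitude(x, k):
    """Magnitude of DFT bin k, normalized by the record length."""
    n = len(x)
    acc = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(x))
    return abs(acc) / n


# Two channels with +10 mV and -10 mV offsets, zero input, 16 output samples.
out = interleave([0.0] * 16, [0.01, -0.01])
tone = dft_magnitude(out, 8)      # bin at half the aggregate output rate
elsewhere = dft_magnitude(out, 1)
```

With a zero input, the output is the offset pattern repeating every M samples. For the two-channel case above, all of the energy falls in the bin at half the aggregate output rate, with an amplitude of 10 mV, while the other bins remain empty.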

Figure 2.9 Block diagram of an interleaved ADC. [Block diagram and timing: M channels, each with SHAk and an N-bit sub-ADCk clocked by CKk; digital output at M x fck; each clock tracks for T1 and holds for (M − 1)T1.]

2.2.5 Interpolating Architecture

As mentioned in Section 2.2.1, one of the critical disadvantages of the flash

topology is the large input capacitance. This problem can be alleviated by applying

interpolation as shown in Fig. 2.10. The idea is that if Vin crosses (VR2 + VR1)/2, then Vo2

crosses zero, increasing the resolution by a factor of two. In essence, interpolation adds

zero crossings to the set of input/output characteristics of a flash stage.

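Figure 2.10 A 2x active interpolating ADC. [Circuit: preamplifiers with references VR1 and VR2 followed by interpolating amplifiers producing Vo1, Vo2, and Vo3; successive levels carry 2^0 + 1, 2^1 + 1, 2^2 + 1, ..., 2^N + 1 outputs.]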

Interpolation lends itself to implementation in submicron technologies because the

amplifiers used in Fig. 2.10 need not have an accurate gain, high linearity, or large output

swings. Also, it can reduce the differential nonlinearity (DNL) resulting from the offset

of the preamplifiers [30]. However, the simple scheme shown in Fig. 2.10 still requires

high power and substantial hardware because of the 2x growth in each interpolation step.

Furthermore, the offset voltages of the amplifiers lead to uncorrected integral

nonlinearity (INL).
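
The way interpolation creates a new zero crossing can be checked with a few lines of arithmetic. The sketch below assumes linear preamplifiers with an arbitrary gain of 4 and equal weighting of the two neighboring outputs; the function names `preamp` and `interpolate_2x` and the reference values are illustrative only.

```python
def preamp(vin, vref, gain=4.0):
    """Linearized differential preamplifier output (assumed gain)."""
    return gain * (vin - vref)


def interpolate_2x(vo_low, vo_high):
    """Interpolated output: the average of two adjacent preamplifier outputs."""
    return 0.5 * (vo_low + vo_high)


def vo2(vin, vr1=0.25, vr2=0.5):
    """Output between the preamplifiers sensing VR1 and VR2."""
    return interpolate_2x(preamp(vin, vr1), preamp(vin, vr2))
```

Because Vo2 is proportional to Vin − (VR2 + VR1)/2 in this model, it changes sign exactly at the midpoint of the two references (0.375 V for the values assumed here), which is the additional decision level gained without an extra ladder tap or input preamplifier.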

Chapter 3

Proposed ADC Architecture

In this Chapter, we describe the architecture of the proposed A/D converter. We

introduce the concept of “sliding interpolation” as a means of avoiding the exponential

growth of power and area, extending the idea to multiple stages. Next, we incorporate

a distributed sampling scheme between the stages so as to realize pipelining without op

amps. To further improve the conversion rate, dual-channel interleaving is employed in

all of the interpolative stages, while triple-channel interleaving is used in the front-end

sample-and-hold circuit. In order to minimize dynamic performance degradation due to

the timing mismatch among the channels in an interleaving system, a new technique,

namely “clock edge reassignment,” is proposed. The concept of “reinterpolation” is also

introduced to reduce the INL by roughly 30%. Finally, the effects of the gain and offset

mismatches among different channels are studied in a generic interleaved architecture

with interpolation.

3.1 Sliding Interpolation


Interpolation can generally be viewed as analog-to-digital conversion in terms

of zero-crossing points rather than direct amplitude quantization. The basic operation

can be described as follows. A group of preamplifiers first generates the difference

between the analog input signal and each tap voltage of a reference ladder. According

to their polarities, the outputs of these preamplifiers can be divided into two groups:

positive and negative, with a distinct boundary between them. This phenomenon is

similar to what happens inside a thermometer and can be used later to recover the actual

amplitude information of the original analog input signal. These preamplifier outputs

can then be fed into the next-level bank of interpolating amplifiers, whose outputs retain

the thermometer code property. With the aid of these interpolating amplifiers, this code

contains more divisions and hence a higher resolution. As long as the zero-crossing

boundary is unique and the code exhibits sufficient linearity, the original analog signal

can be recovered.

Before introducing the concept of sliding interpolation, let us first briefly review

the traditional active interpolation. As an example, a simple active 2x interpolation

circuit is shown in Fig. 3.1. While this scheme reduces the number of the input

preamplifiers and hence the input capacitance, it still requires a large number of

differential pairs and comparators. However, we recognize that for a given input level,

the outputs of only a few preamplifiers in the first stage are of interest. Thus, the

subsequent stages need not interpolate the outputs of all of the preamplifiers. We then

surmise that a compact interpolating stage can “slide” up and down if the analog input

value is roughly known. Shown in Fig. 3.2, the idea is to use a sub-ADC to determine

which preamplifier outputs must be interpolated and route these outputs to the

interpolating differential pairs through a differential multiplexer (MUX). The rest of the

preamplifier outputs are discarded.

Figure 3.1 Traditional active 2x interpolation architecture. [Circuit: preamplifiers with references VR1 and VR2 followed by interpolating amplifiers producing Vo1, Vo2, and Vo3.]

While, in principle, multiplexing and interpolating between only two outputs is

sufficient, in this design we process four preamplifier outputs to allow margin for offsets

of the comparators in the sub-ADC. When this concept is repeatedly applied to the

following stages, a multi-stage sliding interpolation system can be formed.

Figure 3.2 Sliding interpolation architecture. [Circuit: preamplifiers with references VR1 and VR2, a sub-ADC controlling a MUX, and interpolated outputs Vo1, Vo2, and Vo3.]

With sliding interpolation, the power and hardware grow only linearly with the number of stages,

rather than exponentially. These features make sliding interpolation a promising

architecture for high-speed ADCs.

The principle of multi-stage sliding interpolation is illustrated in Fig. 3.3. The

first stage has 16 preamplifiers to generate 16 zero crossings. If the analog input lies

between VR,j and VR,j+1, then a 4-bit coarse ADC and a 16-to-4 MUX route the outputs

of the preamplifiers sensing VR,j−1,..., VR,j+2 to the next interpolating stage.
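
The routing decision described above can be sketched in a few lines. The following Python fragment is an illustrative model with assumed details: the function name `select_window`, the uniformly spaced references, and in particular the clamping at the ends of the ladder are our assumptions and are not specified in the text.

```python
def select_window(vin, v_refs):
    """Return the indices of the four preamplifiers routed to the next stage.

    The coarse sub-ADC finds j such that v_refs[j] <= vin < v_refs[j + 1];
    the MUX then passes the outputs sensing v_refs[j - 1] .. v_refs[j + 2].
    """
    # Coarse decision: index of the highest reference not above the input.
    j = sum(1 for v in v_refs if vin >= v) - 1

    # Assumption: keep the four-wide window inside the ladder at both ends.
    j = max(1, min(j, len(v_refs) - 3))
    return [j - 1, j, j + 1, j + 2]


# Hypothetical ladder: 16 taps spaced 1/16 V apart, starting at 0 V.
ladder = [k / 16 for k in range(16)]
window = select_window(0.40, ladder)
```

For an input of 0.40 V on this assumed ladder, the input lies between taps 6 and 7, so the window is taps 5 through 8. The two outer outputs provide the margin that allows the sub-ADC comparators to have offsets without losing the zero crossing.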

Figure 3.3 Flow diagram of multi-stage sliding interpolation. [Stages 1 to 4 spanning Vmin to Vmax, with the window from VR,j−1 to VR,j+2 around Vin; a 16-to-4 MUX after stage 1 and 7-to-4 MUXes after stages 2 and 3.]

Since only 2x-interpolation is used, each stage, excluding the first one,

generates a total of seven outputs. Also a sub-ADC is used to detect two more bits in

each stage. The overall resolution increases by only one bit per stage because the second bit is used

for subsequent digital error correction.
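
The counts in this paragraph follow from simple bookkeeping, sketched below. The helper names are hypothetical, the amplifier outputs are idealized as plain numbers, and the five-stage total assumes a 4-bit first stage followed by four identical stages that each contribute one net bit.

```python
def interpolate_stage(inputs):
    """2x interpolation: keep every input and add the average of each adjacent pair."""
    outputs = []
    for a, b in zip(inputs, inputs[1:]):
        outputs.append(a)
        outputs.append(0.5 * (a + b))
    outputs.append(inputs[-1])
    return outputs


def total_resolution(first_stage_bits=4, later_stages=4, raw_bits=2, correction_bits=1):
    """Net resolution when one of the two bits per later stage is spent on correction."""
    return first_stage_bits + later_stages * (raw_bits - correction_bits)


# Four selected amplifier outputs straddling a zero crossing (illustrative values).
stage_out = interpolate_stage([3, 1, -1, -3])
```

Four inputs produce 4 + 3 = 7 outputs, and the interpolated value between 1 and −1 supplies a new zero crossing. Under the stated assumptions, the bit budget is 4 + 4 × (2 − 1) = 8 bits.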

Detection of zero crossings can be implemented by a simple differential

amplifier. Therefore, all of the decision levels in Fig. 3.3 can be replaced by amplifiers

as shown in Fig. 3.4.

Figure 3.4 Sliding interpolation architecture.

If the gain of every amplifier in each stage is about two, the input dynamic range

of the sub-ADCs remains nearly the same through the chain. All of the sub-ADCs can

therefore be realized in the same form, allowing a modular design.

Figure 3.5 Block diagram of multi-stage sliding interpolation: (a) zero-crossing flow, (b) signal path (SHA, preamplifiers, MUXs, interpolative amplifiers, and sub-ADCs).

The implementation of the sliding interpolation is shown in Fig. 3.5. The front-

end SHA samples and holds the analog input signal. In stage 1, the preamplifiers

generate 16 zero crossings, while the sub-ADC determines the four MSBs. In the

second and the following stages, each MUX is commanded by the sub-ADC in the

previous stage to select and route four amplified outputs to the interpolative amplifiers.

Stages 2 through 5 are identical, simplifying the design and layout.

Further details of the sliding interpolation are shown in Fig. 3.6. The first stage

incorporates 16 preamplifiers while each of the following interpolative stages requires

seven amplifiers. By virtue of this technique, the total number of differential pairs

reduces from roughly 500 to 50. The five sub-ADCs require a total of 28 comparators.
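
The rough figures quoted above can be reproduced with a simple counting model. The model is an assumption made for illustration: it counts one differential pair per amplifier output, lets a conventional chain interpolate every output of the previous stage (n outputs become 2n - 1), and assumes 16 comparators in the first sub-ADC and three in each 2-bit sub-ADC. Reinterpolation amplifiers are not counted.

```python
def interpolation_chain_outputs(first_stage=16, stages=5):
    """Outputs per stage when every stage 2x-interpolates all previous outputs."""
    counts = [first_stage]
    for _ in range(stages - 1):
        counts.append(2 * counts[-1] - 1)   # n originals plus (n - 1) midpoints
    return counts

def sliding_chain_amplifiers(first_stage=16, stages=5, per_stage=7):
    """Amplifiers per stage when only a 4-output window is interpolated (4 -> 7)."""
    return [first_stage] + [per_stage] * (stages - 1)

traditional = sum(interpolation_chain_outputs())   # 16 + 31 + 61 + 121 + 241 = 470
sliding = sum(sliding_chain_amplifiers())          # 16 + 4 * 7 = 44
comparators = 16 + 4 * 3                           # assumed split of the 28 comparators
```

Under these assumptions the two totals land close to the "roughly 500" and "50" mentioned in the text, and the comparator count equals 28.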

Figure 3.6 Detailed block diagram of multi-stage sliding interpolation (16 preamplifiers and a 4-bit sub-ADC in stage 1; 7 --> 4 MUXs, 2x interpolation, and 2-bit sub-ADCs in the following stages).

Figure 3.7 Sliding mechanism. Amplifier outputs (V) versus input (V, from -0.1 to 0.1) for the five stages of sliding interpolation, panels A (stage 1) through E (stage 5).

Figure 3.7 plots the amplifier outputs in each stage as sliding interpolation is

activated. The outputs of the first stage exhibit zero-crossing points that are separated

by 50 mV. After sliding interpolation with redundancy, zero crossings with 25-mV

spacing are generated, etc.
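
The progression of the zero-crossing spacing can be tabulated as follows. This is a minimal sketch: the 800-mV full-scale range is inferred here from 16 first-stage zero crossings spaced 50 mV apart and is an assumption, as are the function and variable names.

```python
def zero_crossing_spacing_mv(first_stage_mv=50.0, stages=5):
    """Input-referred zero-crossing spacing after each stage of 2x interpolation."""
    return [first_stage_mv / 2 ** k for k in range(stages)]

spacing = zero_crossing_spacing_mv()    # 50, 25, 12.5, 6.25, 3.125 mV
full_scale_mv = 16 * 50.0               # assumed full-scale range
lsb_mv = full_scale_mv / 2 ** 8         # LSB of an 8-bit converter over that range
```

Under this assumption the spacing after the fifth stage coincides with one LSB of the 8-bit converter.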

Sliding interpolation provides a number of benefits. First, as described before, it

lends itself well to multi-stage pipelining with no D/A converters or subtractors.

Second, it requires no precision gain in any of the building blocks, allowing the use of

simple differential pairs in the entire signal path. Third, it can include reinterpolation to

improve the precision.

Although the hardware size and the associated power consumption in the

sliding interpolation structure are substantially less than those in the traditional

interpolation method, the throughput rate is severely limited by the multi-stage

operation. For each held analog input sample, the overall A/D conversion is not

complete unless the digital data is generated by all of the sub-ADCs, an operation that

can easily take several tens of nanoseconds. For a higher conversion rate, a pipelining

scheme is needed.

3.2 Embedded Pipelining


As mentioned before, pipelining can improve the throughput rate. The question

is where and how it should be applied. As shown in Fig. 3.6, each interpolative stage

contains only two analog blocks, a MUX and an amplifier bank. Thus, pipelining can

be applied at only one of two points: at the input or output of the MUX.

Figure 3.8 Pipelined sliding interpolation ADC architecture.

As shown in Fig. 3.8, the interface between the multiplexer and the amplifier

bank is the best choice. This is so for two reasons. First, the multiplex switches can also

function as the sample-and-hold switches, significantly reducing the delay between the

two stages because only one switch appears in the signal path between two consecutive

stages. Second, the interconnection wires between the multiplexers and the

interpolative amplifiers exhibit a significant amount of parasitic capacitance, which can

now be utilized as the sample-and-hold capacitors. This type of distributed sample-and-

hold system is similar to that reported in [19]. Partitioning the conversion into several

equal-length time slots, the pipelining significantly improves the throughput rate.

Note that each stage in the pipeline operates in the sample mode for half of the

clock period and in the hold mode for the other half. On the other hand, the sub-ADC

in each stage operates only during the hold mode, raising the possibility of adding

interleaving to further increase the throughput rate.

3.3 Addition of Interleaving


Besides the reasons mentioned in the previous section, the addition of

interleaving is also desirable because, even though the maximum path “length” between

consecutive samplers in the pipeline corresponds to roughly two differential pairs, the

settling requirements still limit the conversion speed. As shown in Fig. 3.9, the

converter employs two identical interleaved channels to increase the speed. The

multiplexers (MUXs), distributed sample-and-holds, and 2x-interpolation amplifiers

are duplicated for the even and the odd channels whereas the front-end buffer, the

preamplifiers, and all of the sub-ADCs are shared between the two channels. The timing

is such that when one stage in the odd channel is in the sampling mode, the

corresponding stage in the even channel is in the hold/amplification mode and vice

versa.

Figure 3.9 Addition of interleaving scheme.

When the SHA in the odd channel is sampling the analog input, the SHA in the

even channel is holding and passing the previous analog sample to the preamplifiers

through the buffer. The sub-ADC in stage 1 then generates the four-bit digital code and

commands the MUX in the even channel of stage 2 to redirect the selected preamplifier

outputs to the interpolation amplifiers.

Even though the addition of interleaving increases the speed by almost a factor

of two, the first sub-ADC still creates three difficulties. First, due to the finite

impedance seen at the preamplifier outputs, the kickback noise generated by the sub-

ADC considerably disturbs the analog signals at the inputs of the MUX thereby

requiring a long settling time after the sub-ADC is strobed. Second, the sub-ADC

cannot begin its conversion until the front-end SHA, the buffer, and the preamplifier

outputs are settled. Since the buffer drives a relatively large capacitance, the settling in

this path is quite slow. Third, since the sub-ADC appears in the critical path, that is, the

preamplifier outputs must remain idle until the sub-ADC makes a decision, the

throughput rate is severely limited.

Figure 3.10 Complete ADC architecture with replica SHA.

Figure 3.10 illustrates a modification that alleviates all of the above issues. A

“replica” front-end SHA is added and its output directly drives the first sub-ADC.

Scaled down in device sizes and current levels by a factor of two with respect to the

main SHA, the replica prevents the large kickback noise of the sub-ADC from

corrupting the output of the preamplifiers. Also, the replica signal experiences a shorter

delay than that in the main path because of the much smaller load capacitance seen by

the replica buffer. Thus, the sub-ADC can be strobed much earlier than before.

The use of interleaving raises concern with respect to mismatches between the

offsets, gains, and timings of the two channels. The first two issues will be discussed in

Section 3.6. The problem of the timing mismatch and the proposed solution are

described in the next section.

3.4 Clock Edge Reassignment


Before proposing a solution for the timing mismatch problem in interleaved

systems, we revisit the problem itself to understand its nature. As shown in Fig. 3.11(a),

two interleaved channels, SHA1 and SHA2, require two corresponding clocks, CK1 and

CK2, which are generated by two different clock generators. In the ideal case, the

sampling edge of CK1 is placed precisely midway between the sampling edges of CK2

such that SHA1 and SHA2 sample the analog signal at evenly-spaced points in time.

Figure 3.11 (a) Timing mismatch in interleaved architecture (2T = Ta + Tb), (b) generation of CK1 and CK2 by a frequency divider.

This is usually accomplished by a frequency divider [Fig. 3.11(b)], producing

CK1 and CK2 with a nominal duty cycle of 50% even if the duty cycle of CKin deviates

from 50%. In reality, however, the devices in the clock generators of Fig. 3.11(a) or the

frequency divider of Fig. 3.11(b) suffer from substantial mismatches, especially at high

speeds, introducing large timing errors between CK1 and CK2. Since an 8-bit ADC

sampling a 75-MHz signal cannot tolerate timing mismatches greater than roughly

12 ps, frequency division does not provide the accuracy required in this design.
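
The order of magnitude of this tolerance can be checked with a slew-rate argument. The criterion used below (the timing error that shifts a full-scale sine by a given fraction of an LSB at its steepest point) is an assumption made for illustration; the text does not state which criterion leads to the quoted figure.

```python
import math

def max_timing_error_ps(f_in_hz, n_bits, lsb_fraction):
    """Timing error that moves a full-scale sine by lsb_fraction LSB at its zero crossing."""
    amplitude_lsb = 2 ** n_bits / 2                      # full-scale amplitude in LSB
    max_slope = 2 * math.pi * f_in_hz * amplitude_lsb    # LSB per second
    return lsb_fraction / max_slope * 1e12

one_lsb = max_timing_error_ps(75e6, 8, 1.0)
half_lsb = max_timing_error_ps(75e6, 8, 0.5)
```

For a 75-MHz input and 8 bits, the one-LSB and half-LSB criteria give values somewhat above and below ten picoseconds respectively, which is the same range as the tolerance quoted above.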

The problem of timing mismatch can be considerably relaxed if a single clock

drives both SHAs. Since the duty cycle of the clock may deviate from 50%, only one of

the edges must be used for the sampling command in both circuits. Figure 3.12

illustrates how this is accomplished by “clock edge reassignment.” Two switches, S1

and S2, and two “predictive” control signals, Vodd and Veven, are added to the system. A

master clock, CKmaster, with a frequency twice the sampling rate, is provided to the two

channels through the two switches. The predictive signals Vodd and Veven enable one of

the switches S1 or S2, thus routing the falling edge of CKmaster to either of the SHAs.

The timing mismatch is now equal to the propagation delay mismatch between S1 and

S2, and the two switches inside SHA1 and SHA2, a value that can be maintained well

below 10 picoseconds even with 20% mismatch between the sizes of the switches.

The timing of Vodd and Veven is quite relaxed so long as they contain the falling

edge of CKmaster with enough margin. Thus, they can be produced by a simple

nonoverlapping clock generator.
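
The benefit of using a single edge polarity can be illustrated with an idealized timing model. The sketch below is behavioral only: the clock period is normalized to one, the switches are assumed to have no delay mismatch, and all function names are hypothetical.

```python
def falling_edges(period, duty, count):
    """Falling-edge times of a clock that rises at k*period and stays high for duty*period."""
    return [k * period + duty * period for k in range(count)]

def reassign(edges):
    """Route successive falling edges alternately to SHA1 and SHA2."""
    return edges[0::2], edges[1::2]

def sample_intervals(sha1, sha2):
    """Time between consecutive sampling instants of the combined system."""
    merged = sorted(sha1 + sha2)
    return [b - a for a, b in zip(merged, merged[1:])]

def both_edge_intervals(period, duty, count):
    """Intervals if SHA1 sampled on rising edges and SHA2 on falling edges instead."""
    rising = [k * period for k in range(count)]
    return sample_intervals(rising, falling_edges(period, duty, count))

results = {}
for duty in (0.3, 0.5, 0.7):
    sha1, sha2 = reassign(falling_edges(1.0, duty, 8))
    results[duty] = sample_intervals(sha1, sha2)

uneven = both_edge_intervals(1.0, 0.3, 4)
```

In this model the reassigned falling edges are evenly spaced for any duty cycle, whereas mixing rising and falling edges produces intervals that alternate with the duty cycle.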

In reality, each SHA requires both a rising edge and a falling edge to perform

the sample and hold operations. As shown in Fig. 3.13, the falling edges of CK1x and

the rising edges of CK2x are alternately applied to the SHAs, while the rising edges of

CK1x and the falling edges of CK2x are discarded.

Figure 3.12 Basic concept of the clock edge reassignment.

The actual sequence of operation is as follows: during phase 1, the falling edge

of CK1x is routed to SHA1 and the rising edge of CK2x to SHA2. During phase 2, the

states of CK1 and CK2 are stored, and during phase 3, the falling edge of CK1x is re-

routed to SHA2 and the rising edge of CK2x to SHA1.

This concept can be easily extended from two channels to three or more

channels. As discussed in Section 4.2, the front-end sample-and-hold circuit used in this

work incorporates three channels.

Figure 3.13 Detailed operation of clock edge reassignment.

3.5 Reinterpolation
As mentioned in Chapter 3, an important benefit of interpolation is the reduction

of the differential nonlinearity resulting from the offset of the preamplifiers [2,3].

However, integral nonlinearity still remains uncorrected, demanding large input

devices. To alleviate the problem, a “reinterpolation” method is introduced here. As

depicted in Fig. 3.14(a), the original outputs (VA’s) from the preamplifiers are fed into

another bank of interpolation amplifiers to generate a second set of interpolated outputs

(VB’s) which, though different from VA’s, contain sufficient information to represent the

original analog input signal. If the offset components of the adjacent VA’s are

uncorrelated, the standard deviation of the offsets of the corresponding VB’s is reduced

by a factor of the square root of 2. As shown in Fig. 3.14(b), INLA or INLB is defined

as the maximum offset error of the zero crossings of VA’s or VB’s respectively. If only

the interpolated zero crossings, VB’s, are sensed by the following stages and the original

zero crossings, VA’s, are discarded, then, the overall INL is reduced by approximately

30%.

Figure 3.15 plots the maximum INL with and without reinterpolation as

predicted by Monte Carlo simulations, confirming the theoretical result. The reduction

of the INL translates into a higher tolerance of offsets in the preamplifiers, allowing

smaller input devices and a two-fold reduction in the capacitance seen by the buffer

driving the first stage.
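
The averaging effect itself is easy to reproduce numerically. The following sketch is not the Monte Carlo setup used for Fig. 3.15; it assumes ideal, uncorrelated Gaussian offsets with unit standard deviation and compares only the standard deviations of the original and reinterpolated offsets, not the maximum INL. All names and parameter values are hypothetical.

```python
import random
import statistics

def reinterpolation_sigma_ratio(n_trials=5000, n_preamps=16, sigma=1.0, seed=1):
    """Ratio of the original offset spread to the reinterpolated offset spread."""
    rng = random.Random(seed)
    originals, interpolated = [], []
    for _ in range(n_trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n_preamps)]   # offsets of the VA's
        b = [(x + y) / 2 for x, y in zip(a, a[1:])]             # offsets of the VB's
        originals.extend(a)
        interpolated.extend(b)
    return statistics.pstdev(originals) / statistics.pstdev(interpolated)

ratio = reinterpolation_sigma_ratio()
```

The ratio comes out close to the square root of two, in line with the roughly 30% reduction stated above.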

The redundancy associated with reinterpolation is necessary only in the first

stage of the pipeline, where the cumulative gain is still low; in stages 2 through 5 all

zero crossings are utilized. Thus, reinterpolation is obtained at the cost of a few

additional differential pairs.

Figure 3.14 Reinterpolation (a) implementation, (b) error plot. In (a), the original preamplifier outputs VA1, VA2, and VA3 are discarded after the reinterpolation amplifiers generate VB1 and VB2. In (b), the A's denote the original offsets and the B's the interpolated offsets, with

$B_1 = \frac{A_1 + A_2}{2}, \qquad \sigma_{B1} = \frac{\sqrt{\sigma_{A1}^2 + \sigma_{A2}^2}}{2} = \frac{\sigma_{\mathrm{original}}}{\sqrt{2}}$

Figure 3.15 INL reduction by reinterpolation observed in Monte Carlo simulations (offset averaging through interpolation by 2). Top panel: original outputs va1 through va16; bottom panel: reinterpolated outputs vb1 through vb16; both plotted as voltage versus input voltage from -1 V to 1 V.

3.6 Effect of Nonlinearity in Sliding Interpolation
While the first stage of interpolation by a factor of two is quite insensitive to the

nonlinearity of differential pairs [2], the subsequent reinterpolation and interpolation

are susceptible to nonlinearity in each differential pair. Figure 3.16 illustrates the effect.

Curves A and B are the original characteristics with the zero-crossing points at V0 and

V2. After the first 2x interpolation, curve C is generated with a zero crossing at V1 and a

slope of one half of the original one. If one more 2x interpolation is applied between

curves B and C as shown in the circled area in Fig. 3.16, the resulting zero-crossing

point should ideally fall midway between V1 and V2, i.e., at Vid. In practice, however,

the actual zero-crossing point, Vact, deviates from Vid because B and C exhibit different

slopes. The difference between Vact and Vid is denoted by δ.

In the worst case, curve A is flat for Vin > V1 and the slope of curve C is equal

to one half of that of B. Through a simple derivation, it can be shown that δ = (V2 - V1)/6

and hence curve D suffers from a DNL of 1/3 LSB. In order to further increase the

resolution through active 2x interpolation, the linear portion of curves A or B must be

extended accordingly.
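
The worst-case result can be verified with exact arithmetic. The sketch below models curve A as linear up to V1 and flat beyond it, curve B as linear everywhere, and each interpolation as the average of its two inputs. The polarity of the curves, the unit slope, and the function name are assumptions made for illustration.

```python
from fractions import Fraction

def second_interpolation_error(v0, v1, v2, slope=Fraction(1)):
    """Worst-case zero-crossing error delta of the second 2x interpolation."""
    def a(v):
        return slope * (min(v, v1) - v0)   # curve A, flat for v > v1
    def b(v):
        return slope * (v - v2)            # curve B, zero crossing at v2
    def c(v):
        return (a(v) + b(v)) / 2           # first interpolation, zero crossing at v1
    def d(v):
        return (b(v) + c(v)) / 2           # second interpolation between B and C
    # d is linear on [v1, v2], so its zero follows from the two endpoint values
    v_act = v1 - d(v1) * (v2 - v1) / (d(v2) - d(v1))
    v_id = (v1 + v2) / 2
    return v_act - v_id

delta = second_interpolation_error(Fraction(0), Fraction(1), Fraction(2))
lsb = Fraction(2 - 1, 2)                   # spacing after the second interpolation
dnl_in_lsb = delta / lsb
```

With V0, V1, and V2 spaced one unit apart, the model returns an error of one sixth of the spacing, that is, one third of the new LSB.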

Figure 3.16 Nonlinearity-induced error in 2x interpolation.

Chapter 4

Circuit Design and Layout Considerations

4.1 Introduction
In this chapter, the design of the ADC’s building blocks as well as various layout

considerations are discussed. All of the analog signal paths are implemented in

differential form to achieve a wide dynamic range and high immunity to common-mode

noise. For the sake of simplicity, some of the circuits are drawn in single-ended form.

4.2 Front-End Sample-and-Hold Circuit


The front-end SHA plays a critical role in the dynamic behavior of the converter.

In order to achieve fast settling, this circuit uses a simple top-plate sampling method and

a PMOS source follower as shown in Fig. 4.1.

Figure 4.1 Dual-channel interleaved SHA.

The interleaving is realized in the sampling network by alternately connecting

C1 and C2 to Vin whereas the source follower is shared between the two channels. Thus,

gain and offset mismatches arise primarily from the charge injection mismatches of S1-

S4. The n-well of the source follower is tied to its source to suppress nonlinearity and

gain error due to body effect. Simulations indicate that two such followers operating

differentially achieve a linearity of about 10 bits.

The input-dependent charge injection from S1 and S3 does introduce

nonlinearity but it is partially cancelled by the charge absorbed by S2 and S4. Also,

differential operation as well as large sampling capacitors (1 pF) improve the overall

linearity to about 9 bits.

The finite input capacitance of the source follower results in an equivalent

resistor connected between the outputs of the two channels, yielding a gain roll-off at

high frequencies. From another perspective, the capacitance seen at node X and

switches S2 and S4 form a switched-capacitor low-pass filter. With proper design, this

roll-off is limited to 1 dB at an input frequency of 75 MHz.

Figure 4.2 Timing diagram for SHA: (a) sample and quantize/MUX phases of CK, (b) sample, quantize, and MUX time slots, (c) clocks CKa, CKb, and CKc.

In the actual design, the front-end SHA is realized with triple-channel

interleaving. This is because the sampling phase is considerably shorter than the hold/

quantization/multiplexing phase, thereby requiring a clock duty cycle of about 30%

[Fig. 4.2(a)]. Since the duty cycle deviates substantially from 50%, it is difficult to

employ dual-channel interleaving without any “dead” time. To resolve this issue, the

clock period is divided into three equal time slots: one for front-end sampling, one for

sub-ADC (coarse quantization), and one for multiplexing [Fig. 4.2(b)]. The timing

diagram of Fig. 4.2(c) is then used to interleave three sampling capacitors. To generate

the time slots with reasonable accuracy, the 150-MHz clock is divided by 3 on the chip.

Figure 4.3 A triple-channel interleaved SHA with a replica (main SHA output to the preamplifiers, replica SHA output to the first sub-ADC).

The actual triple-channel interleaved SHA circuit is shown in Fig. 4.3. As

mentioned before, the replica is scaled down by a factor of two with respect to the main

SHA. The switches connected between Vin and the sampling capacitors use the same

timing sequence in both the main and the replica SHAs. However, the switches

connected between the sampling capacitors and the PMOS source followers have a

different timing sequence in the two SHAs. For each channel in the main SHA, the

operation sequence is: (1) sampling, (2) holding, (3) holding and connecting the held

sample to the PMOS source follower. On the other hand, the replica operates in a

slightly different sequence: (1) sampling, (2) holding and connecting the held sample

to the PMOS source follower (whose output is then sensed by the first sub-ADC), (3)

holding.
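
The two sequences can be laid out slot by slot. The sketch below assumes that the three channels are staggered by exactly one time slot, which is how the clock labels in Fig. 4.3 are read here; the phase names and helper functions are hypothetical.

```python
MAIN_SEQUENCE = ("sample", "hold", "hold+drive")       # drive = connected to the follower
REPLICA_SEQUENCE = ("sample", "hold+drive", "hold")

def phase(sequence, channel, slot):
    """Phase of channel 0, 1, or 2 in a given time slot (one-slot stagger assumed)."""
    return sequence[(slot - channel) % 3]

def schedule(sequence, slots=6):
    """One row per time slot, one column per channel."""
    return [[phase(sequence, ch, t) for ch in range(3)] for t in range(slots)]

main = schedule(MAIN_SEQUENCE)
replica = schedule(REPLICA_SEQUENCE)
```

In every slot exactly one capacitor samples the input and exactly one drives the source follower, and for a given sample the replica drives its follower one slot before the main SHA does, so the first sub-ADC can act before the main path is multiplexed.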

4.3 Differential Amplifiers


The A/D converter incorporates “differential difference amplifiers” in the first

stage and simple differential pairs in the subsequent stages. The resistors used in the

prototype are realized by non-silicided polysilicon.

As shown in Fig. 4.4, the preamplifier consists of two NMOS differential pairs

with source degeneration. In order to properly transform amplitude quantization into

zero crossings, the design of the preamplifiers requires special attention to several

issues. First, the input-referred offset of the circuit must be less than 1/4 LSB so that it

does not degrade the overall DNL and INL significantly. The offset arises from three

sources: mismatch between the input transistors, mismatch between the load resistors,

and mismatch between the tail current sources. The mismatch of the differential pair

typically dominates the overall offset.

Figure 4.4 Preamplifier. Component values: R1a, R2a: 4x200 Ω; M1a - M4a: 8x12/0.6; Rc1a, Rc2a: 200 Ω; I1a - I4a: 0.4 mA (4x12/1.5).

By virtue of reinterpolation the tolerable offset is 30% higher and, with the aid

of the data in [39], the dimensions of M1a - M4a are chosen as W/L = 100 µm/0.6 µm.

This results in a total gate area of 60 µm², about one half of that reported in [19]. The

matching requirement of the output resistors is alleviated by the gain of the preamplifier

and with proper layout. The mismatch of the tail current sources is reduced significantly

by using a channel length of 1.5 µm and a channel width of 48 µm.

The second issue relates to the gain of the preamplifier. The gain is chosen to be

around two as a trade-off between gain, linearity, and speed. Finally, although the

output of preamplifiers has a small swing, about 200 mV single-ended, the

preamplifiers do require a wide input common-mode range, approximately 800 mV.

This common-mode constraint limits the overdrive voltage of the input devices and the

tail current source. The current densities must therefore be low enough to consume a

reasonable headroom. This is possible for the current source but not for the input

transistors as their linearity determines the DNL and INL in subsequent interpolation

stages.

The reinterpolating and interpolating amplifiers have the same topology but

different device dimensions and bias currents. Figure 4.5 shows the details.

Figure 4.5 Reinterpolating and interpolating amplifiers. Reinterpolating amplifier: M1b, M2b: 12x8/0.6; R1b, R2b: 4x125 Ω; Rc1b: 2x200 Ω; I1b, I2b: 0.8 mA. Interpolating amplifier: M1b, M2b: 12x4/0.6; R1b, R2b: 2x500 Ω; Rc1b: 2x200 Ω; I1b, I2b: 0.4 mA.

4.4 Comparator
The design of the comparators used in the sub-ADCs directly impacts the speed

and power dissipation of the overall converter. Shown in Fig. 4.6 is the high-speed

comparator utilized in the first stage sub-ADC. When CK is low, Sb1 and Sb2 are off. All

the p-switches (S1 - S4) are on and the four internal nodes (P, Q, X, and Y) are pulled up

to VDD with the aid of two equalization switches (Seq and Seqx), placing the comparator

in the reset mode. When CK goes high, Sb1 and Sb2 turn on and M1 - M4 compare the

positive input voltage, Vin+, with the positive reference voltage, Vr+, and the negative

input voltage, Vin−, with the negative reference voltage, Vr−. When all of the reset and

equalization PMOS switches are off, the cross-coupled inverters (M5 - M8)

regeneratively amplify the difference between the inputs to rail-to-rail levels. The

digital outputs are buffered by inverters and then fed to the control circuit.

This comparator offers three important advantages over other topologies. First,

the static power dissipation is zero. When CK is low, no static current flows through the

circuit. When CK is high, M7 and M8 ensure that the current is zero. Second, the

comparator requires only a single-phase clock, greatly simplifying the routing of the

clock across the chip.

Figure 4.6 Comparator used in the first stage (CMP_A). Device sizes (W/L): S1, S2: 1.8/0.6; S3, S4: 3.0/0.6; Seq: 2.4/0.6; Seqx: 3.0/0.6; Sb1, Sb2: 6.0/0.6; M1 - M4: 10.8/0.6; M5 - M8: 2.4/0.6; Inv1, Inv2: p 4.8/0.6, n 1.2/0.6. Cf+, Cf−: 8 fF.

The third property of the comparator is that the effect of the offsets due to the

cross-coupled transistors is reduced by the dynamic gain of the input stage. This effect

can be described in two phases.

In the first phase, just after CK goes high, nodes X, Y, P, and Q are still precharged to VDD

and M5 - M8 are off. The input difference is therefore amplified by M1 - M4 and the

parasitic capacitances at nodes X and Y until Vx and Vy drop below VDD by VTHN. At

this point, M7 and M8 turn on but M5 and M6 are still off. The amplification then

continues while M5 and M6 contribute a small regenerative gain until M5 and M6 turn

on, and initiate the final regeneration.

Using SPICE, it is possible to calculate the contribution of M5 - M6 and M7 - M8

to the input-referred offset. With the device dimensions chosen in this design,

simulations suggest that the offset voltages of M5 and M6 are divided by a factor of 20

and those of M7 and M8 by a factor of 2. Since the channel area of M7 and M8 is about

one fourth of that of the input devices, they contribute roughly equal amounts of input-

referred offset. The overall offset of the comparator is about 10 mV.
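
The statement that the two pairs contribute roughly equally can be checked with a simple area-scaling estimate. The sketch assumes that the offset standard deviation scales as 1/sqrt(W*L) with the same matching coefficient for every pair, which is a simplification (M5 and M6 are PMOS devices); the device sizes are those listed with Fig. 4.6 and the function name is hypothetical.

```python
import math

def relative_input_referred_offset(w_in, l_in, w_x, l_x, division_factor):
    """Input-referred offset of pair X relative to the offset of the input pair.

    Assumes sigma ~ 1/sqrt(W*L) with one matching coefficient for both pairs.
    """
    raw_ratio = math.sqrt((w_in * l_in) / (w_x * l_x))
    return raw_ratio / division_factor

m7_m8 = relative_input_referred_offset(10.8, 0.6, 2.4, 0.6, division_factor=2)
m5_m6 = relative_input_referred_offset(10.8, 0.6, 2.4, 0.6, division_factor=20)
```

Under this assumption M7 and M8 contribute about as much input-referred offset as the input devices themselves, while the contribution of M5 and M6 is about one tenth as large.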

Another important phenomenon in the comparator is the large kickback noise

produced at the beginning of reset and regeneration modes. This effect is particularly

critical in the first stage and can introduce significant dynamic offsets, saturating the

second stage and creating nonlinearity. Adding a pair of cross-coupled capacitors with

proper value (around 8 fF) at the input reduces the kickback noise to an acceptable level.

As shown in Fig. 4.7, the comparators in stages 2 to 5 are basically the same as

that in the first stage, except for the input network. The multiplexers consisting of Si1 -

Si4 select the even- or odd-channel signals in a dual-channel interleaving mode. Due to

the accumulative gain after stage 1, larger comparator offsets can be tolerated in stages

2 to 5. Therefore, the input differential pair uses W/L = 5.4 µm/0.6 µm.

Figure 4.7 Comparator used in stages 2 to 5 (CMP_B). Device sizes (W/L): S1, S2: 1.8/0.6; S3, S4: 3.0/0.6; Seq: 2.4/0.6; Seqx: 3.0/0.6; Sb: 6.0/0.6; Si1 - Si4: 4.8/0.6; M1 - M2: 5.4/0.6; M3 - M6: 2.4/0.6; Inv1, Inv2: p 4.8/0.6, n 1.2/0.6.

Unlike the first stage, the comparators in stages 2 to 5 do not share the same

input line because they are driven by the interpolation amplifier outputs. Thus, the

kickback noise is less important here.

4.5 Clock Edge Reassignment
The clock edge reassignment (CERA) circuit for a dual-channel system is

shown in Fig. 4.8. With proper control signals, Spe and Spo pass the rising edges and Sno

and Sne pass the falling edges of the master clock to SHA1 and SHA2, respectively.

When Vodd is high, Sno and Spo are on, allowing SHA1 to receive a falling edge from A2

and SHA2 a rising edge from A1. When Veven is high, the reverse occurs. The falling

edge of A1 and the rising edge of A2 are discarded.

Figure 4.8 Clock edge reassignment circuit for a dual-channel system.

The operation of the CERA circuit is further illustrated in Fig. 4.9. The circuit

operates in two “pass” modes and one “block” mode. During the block mode, the clock

signals inside SHA1 or SHA2 are stored on the parasitic capacitance at each node. The

clock edge reassignment concept can be easily extended to a multi-channel system as

well.

Figure 4.9 Operational diagram of a dual-channel CERA system (pass mode with Vodd high and Veven low, pass mode with Vodd low and Veven high, and holding mode with both low).

4.6 Control and Decode Circuit
As shown in Fig. 4.10, the control circuit (NAND_FF) senses the outputs of two

adjacent comparators to generate three control signals, two applied to the MUX, and

one to the ROM. The two-input NAND gate performs 1-of-n encoding and its output

drives a D-type flip-flop, which produces the digital output at the falling edge of CK.

With the assertion of either CKey or CKoy , the control signal is routed to either the even-

channel MUX or the odd-channel MUX, performing interleaving operation in stages 2

to 5.
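
The 1-of-n encoding performed by the NAND gates can be sketched at the logic level. The polarity (active-low outputs), the use of the complement of the upper comparator output, and the virtual comparator above the top of the thermometer code are assumptions made for illustration; the flip-flop and the interleave-control gating are not modeled.

```python
def nand(a, b):
    return not (a and b)

def one_of_n(thermometer):
    """Active-low 1-of-n code from a thermometer code (index 0 = lowest comparator).

    Output i is low only where comparator i is high and comparator i + 1 is low.
    """
    padded = list(thermometer) + [False]   # assumed virtual low comparator on top
    return [nand(padded[i], not padded[i + 1]) for i in range(len(thermometer))]

code = one_of_n([True, True, False])
```

Only the gate at the transition of the thermometer code produces a low output, and that single output identifies the row of the ROM and the MUX setting.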

Figure 4.10 Block diagram of NAND_FF with comparator.

The detailed circuit of NAND_FF is shown in Fig. 4.11. The core is based on a

TSPC flip-flop structure [38]. The dual-input NAND is merged into the input stage of

the D-FF, while the two interleave-control NANDs are combined with the D-FF output

stage. INV3 and INV4 are scaled to drive the heavy capacitive load inside MUXs with

reasonable delay.

Figure 4.11 Details of NAND_FF. Device sizes (W/L): M1 - M3: 1.8/0.6; M4 - M5: 3.0/0.6; M6 - M7: 2.4/0.6; M8: 1.2/0.6; M9 - M12: 1.8/0.6; M13 - M14: 3.6/0.6; ND: p 4.8/0.6, n 2.4/0.6; Inv1, Inv2: p 4.8/0.6, n 2.4/0.6; Inv3, Inv4: p 19.2/0.6, n 9.6/0.6.

4.7 ROM and Output Stage
The ROM consists of a dynamic digital circuit with a precharging PMOS as

shown in Fig. 4.12.

Figure 4.12 ROM and output stage. Component values: Mp: 9.0/0.6; Mx: 2.4/0.6; Mn: 14.4/0.6; Mo: 18/0.6; Ro: 100 Ω.

When CKc is low, Mp precharges the output node Dx and Mx is controlled by

Din. When CKc goes high, Mp turns off and Mn turns on. The output Dx is then evaluated

and fed to the following pipelined register array and eventually the output driver. The

registers consist of dynamic TSPC D flip-flops.

The output driver is an open-drain NMOS device producing a current of about

6 mA. The current is drawn from an off-chip termination resistor of 100 Ω , generating

a voltage swing of 600 mV, a value sufficient for driving off-chip ECL buffers. The

small voltage swings allow sharp edges in the output data waveforms even with the

large capacitance due to the traces on the printed-circuit (PC) board.

One dedicated ground pad is used for all of the output drivers to ensure that the

large ground bounce does not disturb the sensitive analog sections.

4.8 Clock Generator


In a high-speed ADC system, the clock generator requires special attention. As

shown in Fig. 4.13, the clock generator contains four building blocks: two divide-by-

two circuits (DIV2a and DIV2b), one divide-by-three circuit (DIV3), and an output

buffer section (BUF).

The 300-MHz differential master clock signals, CKin+ and CKin−, drive DIV2a and

the output flip-flop of DIV2b. DIV2a produces 150-MHz outputs that are applied to

DIV2b and DIV3. DIV2b generates 75-MHz clocks required for interleaving the

interpolative stages and DIV3 produces 50-MHz clocks used in the triple-channel front-

end SHA.
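The divider chain can be checked with a few lines of arithmetic. The sketch below only mirrors the frequencies stated in the text; the variable names are illustrative. It also confirms that two channels at 75 MHz and three channels at 50 MHz each add up to the 150-MHz conversion rate.

```python
from fractions import Fraction

MASTER_MHZ = Fraction(300)   # differential master clock


def divide(freq_mhz, ratio):
    """Output frequency of an ideal divide-by-ratio circuit."""
    return freq_mhz / ratio


div2a = divide(MASTER_MHZ, 2)   # drives DIV2b and DIV3
div2b = divide(div2a, 2)        # dual-channel interpolative stages
div3 = divide(div2a, 3)         # triple-channel front-end SHA

for name, freq in (("DIV2a", div2a), ("DIV2b", div2b), ("DIV3", div3)):
    print(f"{name}: {float(freq):.0f} MHz")

# Aggregate rate of each interleaved group equals the conversion rate.
print(2 * div2b == div2a, 3 * div3 == div2a)
```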

The BUF section generates CKa1, CKa2, CK_3a, CK_3b, CK_3c for the front-end

SHAs, CK_2ec, CK_2oc, CK_2ey, CK_2oy for the dual-channel interleaving interpolative

stages, and CK and CKc for the comparators and the pipelining registers. The BUF

section is actually laid out in different parts of the chip, in proximity to the related

sections.

[Block diagram: DIV2a, DIV2b, and DIV3 built from D-type stages, driven by CKin+ and CKin−, followed by the BUF section and the local buffers. The unit inverter is labeled 3.0/1.2.]

Buffered outputs, in figure order: CKa1, CKa2, CK_3a, CK_3b, CK_3c, CKc, CK, CK_2ec, CK_2oc, CK_2ey, CK_2oy.
BUF sizes, in figure order: 8, 8, 4, 4, 4, 8, 8, 4, 4, 8, 8.
Local buffer sizes, in figure order: 16, 16, 8, 8, 8, 4x16, 4x16, 4x4, 4x4, 4x16, 4x16.

Figure 4.13 Clock generator.

Figure 4.14 shows the high-speed differential D-type latch used in the two

divide-by-two circuits. With the device dimensions shown here, the latch operates at

clock frequencies as high as 400 MHz.

[Schematic: differential latch with inputs D, outputs Q, devices M1 to M6, and clocked device Mn (CKc).]

Device dimensions (W/L):
M1 - M6: 3.0/0.6
Mn: 12/0.6

Figure 4.14 High-speed differential D latch.

4.9 One Slice of First-Stage Signal Path


Figure 4.15 shows the realization of a slice of the signal path in the first stage.

While interpolation by a factor of 2 tolerates large nonlinearity in differential pairs, the

reinterpolation scheme does require tighter linearity. Hence, the differential amplifiers

in the signal path employ resistive degeneration. The actual design is fully differential.

It is also important to note that the converter requires no floating capacitors and

can therefore utilize native metal-sandwich structures in digital CMOS technologies.

[Schematic: one slice of the first stage, showing Vin, VDD, the even and odd channels, and the slide command, control logic, and interleave command blocks.]

Figure 4.15 Realization of a slice of the signal path in the first stage.

The entire ADC is simulated at the transistor level by StarSim (previously called

ADM), a SPICE-like simulator. The result for typical process parameters and at room

temperature is shown in Fig. 4.16.

[Plot (a): output spectrum, magnitude (dB) versus frequency (MHz), with fsamp = 153.8462 MHz, fin = 37.2596 MHz (31/128 × fsamp), SNDR = 43.148 dB, and HD3 = −51.6222 dB. Plot (b): output (V) versus sample index.]

Figure 4.16 Simulated output of the 8-bit ADC.
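The input frequency annotated in Fig. 4.16 is a rational fraction of the sampling rate, which is the usual condition for coherent sampling. The sketch below reproduces that frequency and shows where the third harmonic folds. It assumes a 128-point record, which is suggested by the 31/128 ratio but not stated explicitly; the function names are illustrative.

```python
from math import gcd


def coherent_fin(fsamp_hz, cycles, n_points):
    """Input frequency that places exactly `cycles` periods in the record."""
    return fsamp_hz * cycles / n_points


def folded_bin(harmonic, cycles, n_points):
    """FFT bin (0 .. n_points/2) where a harmonic lands after aliasing."""
    k = (harmonic * cycles) % n_points
    return min(k, n_points - k)


FSAMP = 153.8462e6      # from the figure annotation
CYCLES, N = 31, 128     # N = 128 is an assumption

fin = coherent_fin(FSAMP, CYCLES, N)
hd3_bin = folded_bin(3, CYCLES, N)

print(f"fin = {fin / 1e6:.4f} MHz")
print(f"cycles and record length coprime: {gcd(CYCLES, N) == 1}")
print(f"HD3 folds to bin {hd3_bin} ({hd3_bin * FSAMP / N / 1e6:.2f} MHz)")
```

Because 31 and 128 share no common factor, every sample in the record falls on a distinct phase of the input sine.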

4.10 Floor Plan And Layout Considerations
The floor plan and layout of the ADC must deal with issues such as routing of

critical paths, power and ground isolation, and noise coupling from the digital sections

to the analog sections. Due to the nature of sliding interpolation, the high-speed digital

control signals must travel through the analog sections. Also, the sub_ADCs in stages

2 to 5 must be embedded with the interpolating stages. These issues underscore the

importance of careful layout to suppress various noise coupling effects.

Figure 4.17 shows the floor plan of the ADC. In order to reduce the wiring

capacitance in the critical path, the front-end building blocks in the first stage (CMP_A,

reference ladder, preamplifiers, MUX, and the distributed sample-and-hold) are folded

into a U shape. The front-end SHA output and the reference ladder are routed between

the comparator bank (CMP_A) and the preamplifier bank. The reference ladder is made

of silicide poly-resistor with a length of two squares (about 8 Ω) between consecutive

taps. Each preamplifier provides an empty stripe so that the digital control signals from

CMP_A to MUX can run through it without interfering with the analog signal path. The

digital signals have also been shielded on both sides with analog ground along the entire

path. The ROM generates the four corresponding digital bits in the first stage.

[Floor plan. Blocks shown: clock generator, SHA, reference ladder, CMP_A, preamplifier bank (slices −1 to 18), MUX and distributed sampling, interpolation amplifier, Amp2 (even and odd channels), CMP_B, ROM, and stage 2 with its own MUX and distributed sampling.]

Figure 4.17 Layout floor plan.

The MUX outputs are connected to metal-3 lines running vertically. Each

differential analog output pair is shielded by analog VDD lines on both sides, providing

isolation and forming the sampling capacitor as well. The even and the odd channels

are uniformly distributed within this building block. Amp2 performs reinterpolation

and contains 5 dual-channel sets which sense the outputs of the MUX and subsequently

reinterpolate the new outputs to drive the next stage.

A more detailed diagram of the first stage is shown in Fig. 4.18. Note that the

positioning of preamplifiers 1, 2, 3 and 13, 14, 15 creates a U-shaped layout. This

strategy is chosen because preamplifiers 1 and 15 share the same reference voltages,

etc.

The actual sliding/multiplexing mechanism is depicted in more detail for the

lower and upper banks in Fig. 4.19 and Fig. 4.20, respectively. The sliding/interleaving

command reaches a unit cell (in the middle column) in each slice from the left and then

connects to the cells in the adjacent two slices above and below.

Stage 2 consists of reinterpolation and interpolation amplifiers, and another

MUX and distributed sampling circuit. All of these circuits are in differential and dual-

channel form. Only three comparators are required in CMP_B to decide which sections

are needed to provide the zero-crossing information to the following stage. The ROM

creates the two bits based on the results of the comparators. Stages 3 to 5 are identical

to stage 2.

Since all of the switches in the MUX are PMOS devices, a large n-well is used

to accommodate them. With properly-spaced substrate contacts, the n-well isolates the

switches from the noisy common p-substrate.

[Schematic: first-stage slices 13 to 15 with CMP_A, NAND_FF, preamplifiers, and MUX unit cells controlled by Cntleven and Cntlodd. Signals: Vin+, Vin−, Vref+, Vref−, and outputs Vo1 to Vo3 (even and odd).]

Figure 4.18 Detailed circuit arrangement in the first stage.

[Diagram: MUX connections from slices 13 to 15 to outputs Vo2, Vo3, and Vo4 (even and odd channels).]

Figure 4.19 Sliding/multiplexing mechanism of the lower bank.

[Diagram: MUX connections from slices 13 to 15 to outputs Vo2, Vo3, and Vo4 (even and odd channels).]

Figure 4.20 Sliding/multiplexing mechanism of the upper bank.

[Die photo with labeled blocks: SHA, clock generator (CKin+, CKin−), reference ladder, inputs Vin+ and Vin−, ADC1, preamplifiers, MUX, interpolation amplifier, reinterpolation amplifier, ADC2, and stages 3 to 5.]

Figure 4.21 Die photo.

Fig. 4.21 shows the die photo. The chip measures 1.5 mm × 1.2 mm, with an active

area of about 1.2 mm². The analog differential input signals, Vin+ and Vin−, enter from the

left side of the chip and are shielded with a common VDD in metal 2. Digital outputs

leave the chip from the lower and the right sides of the chip. Three different power lines

are used in this layout, one for the analog section, one for the digital section, and one

for the first sub_ADC.

The front-end SHA is placed at the top-left corner, right above the reference

ladder so that its outputs readily reach the preamplifiers and the first sub_ADC. The

high-speed (300-MHz) input clocks, CKin+ and CKin−, and the clock generator are

placed on the top of the chip.

The modularity of the design can be seen in stages 2 to 5. The resulting layout is

quite compact and relatively easy to handle in transistor-level simulations.

Chapter 5

Experimental Results

5.1 Introduction
In this chapter, the test setup and the experimental results obtained from a

prototype fabricated in a 0.6-µm CMOS technology are described. Testing an 8-bit

converter at sampling rates greater than 100 MHz entails many challenges, requiring

great care in the design of the test board and the setup. In order to avoid the parasitics

of typical packages, a chip-on-board assembly is adopted. Since building a single chip-

on-board assembly to test high-speed ADCs is time-consuming and error-prone, a

mother-board-and-daughter-board configuration is used to reduce the work associated

with changing the device under test. The raw digital output data of the ADC is collected

by a logic analyzer and subsequently fed into a personal computer for error correction

and performance analysis. MATLAB is used to characterize the results of both low-

frequency and dynamic tests. INL and DNL are measured at low input frequencies (still

with a sampling rate of 150 MHz), while SNDR and SFDR are obtained for input

frequencies up to the Nyquist rate.

5.2 Design of Chip-on-Board Assembly


The first version of the test board is made of a double-copper-layer PC board.

As shown in Fig. 5.1, the chip (bare die) is mounted in the middle of the central cavity

on the board with conductive epoxy to reduce the ground inductance.

[Board layout with labeled connections: Vin+, Vin−, ICK+, ICK−, CKout, BootCntl, VR_2, VR8, VR18, NB1A, NB1B, NB1C, NB1S, NB1BC, AVdd, AVdd2, CVdd, DVdd1, DVdd2, GND, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]

Figure 5.1 Chip-on-board assembly.

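[Layout detail of the central cavity with the down-bond area and labeled connections: VIN+, VIN−, ICK+, ICK−, CKout, Bootcntl, VR_2, VR8, VR18, NB1A, NB1B, NB1C, NB1S, NB1BC, AVdd2, CVdd, DVdd1, DVdd2, GND, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]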

Figure 5.2 Zoom-in of the central cavity area.

The analog inputs are fed from the top of the board, while the complementary

clocks enter from the right and are terminated with 50-Ω resistors. The high-speed

digital outputs are placed on the bottom side of the board. The solid dots are through-

holes which connect the top ground areas to the bottom ground plane. Since there are

no protection diodes on the pads inside the chip, discrete diodes are used for all the

biasing nodes on the board to minimize the probability of damage due to electrostatic

discharge. Chip capacitors with values ranging from 1 nF to 0.1 nF are soldered from

bias lines to the ground traces in order to bypass the high-frequency noise.

The zoom-in diagram around the central cavity is shown in Fig. 5.2. The cavity

is a ground plane with a large through-hole in the middle, right beneath the chip, so that

the inductance of the ground node can be minimized. All of the ground pads are

down-bonded to this common ground plane. The traces for CVdd, DVdd1, and DVdd2,

the Vdd lines for the first sub_ADC and the digital circuits, are extended into the cavity area in

order to reduce the length of the corresponding bond wires.

[Board outline with labeled dimensions of 1.5" and 2" and holes for standoffs.]

Figure 5.3 Actual size of chip-on-board assembly.

The actual board size is as shown in Fig. 5.3. All the through holes are drilled

and some of the critical bypass chip capacitors (near the central cavity) are soldered

before the bare die is glued and bonded. The most delicate step follows: soldering

all of the other passive components while the fragile bond wires sit exposed in the

middle of the board. Even a slight touch can break a bond wire.

Unfortunately, the soldering work cannot be completely done before the bonding

because our bonding machine requires that the back side of the board be flat.

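[Board layout of the mother board and daughter board with labeled connections: Vin+, Vin−, ICK+, ICK−, CKout, BootCntl, Bias_CK, Vup, VR_2, VR8, VR18, NB1A, NB1B, NB1C, NB1S, NB1BS, AVdd, AVdd2, CVdd, DVdd, DVdd2, Gnd, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]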

Figure 5.4 Mother board and daughter board.

In order to reduce the overhead work, the mother-board-and-daughter-board

configuration shown in Fig. 5.4 is used. It is similar to a common chip-on-board

assembly, except that the central part of the board is made of a smaller detachable board

(a “daughter board”) which contains the mounted die and several critical bypass

capacitors between the bias lines and the ground plane. The two boards are connected

through thin copper-foil strips (1 to 3 mil thick), which have small inductance

and are narrow enough to be soldered on traces of both boards.

Once all of the strips on a daughter board are de-soldered, the daughter board

can be detached from the mother board, just like the case with a regular package.

Therefore, one mother board can be used for many daughter boards, reducing the

overhead work.

Off-chip ECL buffers are used in order to reliably read out the digital data

produced by the chip at the clock rate of 150 MHz. The ECL buffers are necessary here

to drive the input channels of the logic analyzer with reasonable rise-time and fall-time

and voltage swings. The ECL buffers are mounted on a different board to lower noise

coupling and allow more flexibility in the setup.

Figure 5.5 shows the layout of the boards used in the setup.

[Photo: daughter boards, mother board, and ECL boards, shown from the back side and the front side.]

Figure 5.5 A combination of different kinds of boards.

5.3 Test Setup
A proper test setup is crucial to measuring the true performance of the ADC.

Figure 5.6 shows the overall test setup.

[Block diagram: a signal generator and three clock generators (Clock_1 at 300 MHz, Clock_2 at 150 MHz, Clock_3 at 75 MHz) share a 10-MHz sync signal. The chip-on-board assembly sends 12 bits to the ECL buffers, which send 12 bits to the logic analyzer, which sends 12 bits to the PC.]

Figure 5.6 Test setup.

In this setup, four signal generators are synchronized by the 10-MHz Sync

signal and are used to provide the analog input and the three digital clock signals. The

analog input waveform is fed into the chip-on-board assembly through a bias-T and a

low-pass filter. The 300-MHz clock signal is applied to the chip to generate the on-chip

150-MHz master clock. Another off-chip 150-MHz clock is used to trigger the ECL

buffers in order to collect the digital output data from the chip. Since the maximum

operating speed of the logic analyzer is 100 MHz, 2x subsampling is used, requiring a

75-MHz clock to control the logic analyzer. Since the front-end sample-and-hold circuit

employs three interleaved channels, subsampling by a factor of two still reveals

possible mismatches between the channels.
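The reason 2x subsampling still exercises all three SHA channels is that 2 and 3 share no common factor. A minimal sketch of that argument follows; it assumes that consecutive samples rotate through the channels in a fixed order, and the function name is illustrative.

```python
def channels_seen(n_channels, decimation, n_samples):
    """Channel indices that appear in a decimated sample stream.

    Sample k of the full-rate stream is assumed to come from channel
    k mod n_channels; only every `decimation`-th sample is kept.
    """
    return sorted({(k * decimation) % n_channels for k in range(n_samples)})


print(channels_seen(3, 2, 6))   # three channels, keep every 2nd sample
print(channels_seen(3, 3, 6))   # counter-example: keep every 3rd sample
```

With a decimation factor of two, the kept samples cycle through channels 0, 2, 1, so a gain, offset, or timing mismatch in any channel still appears in the captured data. A decimation factor of three would capture a single channel only.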

Since there are five stages in the ADC, four extra bits are needed for the digital

error correction. Therefore, a total of 12 bits are collected by the logic analyzer through

the ECL buffers. Finally, the digital data is transferred to a PC to perform

characterization.
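The 12-bit count can be tallied from the per-stage outputs described in Chapter 4 (four bits from the first stage and two bits from each of stages 2 to 5). The sketch below is bookkeeping only; reading the four extra bits as one bit of overlap per stage boundary is an interpretation, not a statement of the correction algorithm itself.

```python
STAGE_BITS = [4, 2, 2, 2, 2]   # raw bits produced by stages 1 to 5
RESOLUTION = 8                 # final output word length

raw_bits = sum(STAGE_BITS)
redundant_bits = raw_bits - RESOLUTION
stage_boundaries = len(STAGE_BITS) - 1

print(f"raw bits collected : {raw_bits}")
print(f"redundant bits     : {redundant_bits}")
print(f"stage boundaries   : {stage_boundaries}")
```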

5.4 Experimental Results


The ADC has been fabricated in a 0.6-µm single-poly triple-metal CMOS

technology, occupying a total area of 1.2 mm × 1.5 mm and an active area of 1.2 mm².

The circuit is tested with a 3.3-V supply with differential input swings of 1.6 Vpp and a

sampling rate of 150 MHz.

Figure 5.7 shows the measured DNL and INL profiles obtained from code

density (histogram) tests using an average of 16 samples per code (4K samples in total).

After a normalized curve-fitting process, the maximum values of DNL and INL are 0.61

and 1.24 LSB, respectively.

[Plots: DNL (top) and INL (bottom) in LSB versus output code, 0 to 255. DNL = 0.61196 LSB; INL = 1.239 LSB.]

Figure 5.7 DNL and INL at fin = 1.8 MHz and fsample = 150 MHz.
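A code-density test with a sine input can be reduced to DNL and INL with the standard cumulative-histogram estimate sketched below. This is not the exact normalization and curve-fitting procedure used for Fig. 5.7, which is not detailed here. The function names and the ideal-quantizer test signal are illustrative assumptions, and pure Python is used so the sketch runs without extra libraries.

```python
import math


def sine_histogram_dnl_inl(codes, n_bits):
    """Estimate DNL and INL (in LSB) from codes taken with a sine input."""
    n_codes = 1 << n_bits
    hist = [0] * n_codes
    for c in codes:
        hist[c] += 1
    total = len(codes)

    # Cumulative histogram -> transition levels in units of the sine
    # amplitude (this undoes the arcsine-shaped sine distribution).
    transitions = []
    running = 0
    for count in hist[:-1]:
        running += count
        transitions.append(-math.cos(math.pi * running / total))

    # Widths of the interior codes; the two end codes are unbounded.
    widths = [b - a for a, b in zip(transitions, transitions[1:])]
    lsb = sum(widths) / len(widths)
    dnl = [w / lsb - 1.0 for w in widths]

    inl, acc = [], 0.0
    for d in dnl:
        acc += d
        inl.append(acc)
    return dnl, inl


def ideal_adc_codes(n_bits, n_samples, cycles, amplitude=1.02):
    """Codes from an ideal quantizer driven by a slightly overdriven sine."""
    n_codes = 1 << n_bits
    codes = []
    for k in range(n_samples):
        v = amplitude * math.sin(2.0 * math.pi * cycles * k / n_samples)
        code = int(math.floor((v + 1.0) / 2.0 * n_codes))
        codes.append(min(max(code, 0), n_codes - 1))
    return codes


codes = ideal_adc_codes(n_bits=8, n_samples=1 << 17, cycles=1021)
dnl, inl = sine_histogram_dnl_inl(codes, n_bits=8)
print(f"max |DNL| = {max(abs(d) for d in dnl):.3f} LSB")
print(f"max |INL| = {max(abs(i) for i in inl):.3f} LSB")
```

For an ideal quantizer both figures should be close to zero, which makes the sketch easy to sanity-check before applying it to measured codes. With only 16 × 256 = 4096 samples, as in the measurement, the statistical uncertainty per code is considerably larger than in this long synthetic record.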

The dynamic performance of the converter is measured in the frequency

domain. Figure 5.8 depicts the spectrum of the reconstructed signal at 1.76 MHz,

exhibiting harmonics 50 dB below the fundamental and a signal-to-(noise+distortion)

ratio (SNDR) of 43.7 dB, which corresponds to an effective number of bits (ENOB) of

approximately 7.
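The conversion between SNDR and ENOB uses the standard relation for a full-scale sine input, ENOB = (SNDR − 1.76 dB) / 6.02 dB. A minimal sketch follows, with illustrative function names.

```python
def enob(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


def sndr_for_bits(bits):
    """SNDR of an ideal quantizer with the given resolution."""
    return 6.02 * bits + 1.76


print(f"ENOB at 43.7 dB SNDR   : {enob(43.7):.2f} bits")
print(f"Ideal 8-bit SNDR       : {sndr_for_bits(8):.2f} dB")
```

The same relation shows that an ideal 8-bit quantizer would reach 49.92 dB, so the measured low-frequency SNDR sits about 6 dB, or one effective bit, below the ideal.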

The spurious-free dynamic range (SFDR) and SNDR as a function of the analog

input frequency are plotted in Fig. 5.9. SFDR starts from around 50 dB at low

frequencies and reaches about 44 dB at high frequencies. SNDR is about 43.7 dB at low

frequencies and about 40 dB (ENOB = 6.5) for frequencies above 40 MHz.

[Plot: magnitude (dB) versus frequency (MHz), with fsamp = 150 MHz, fin = 1.7578 MHz, SNDR = 43.7506 dB, HD2 = −61.7943 dB, HD3 = −53.2677 dB, and HD5 = −50.463 dB.]

Figure 5.8 FFT at fin = 1.76 MHz.

The consistent SNDR performance in the high frequency range up to the

Nyquist rate indicates that the clock edge reassignment technique indeed minimizes

timing mismatches in the interleaved system.

[Plot: SFDR and SNDR (dB) versus input frequency (MHz), from 10 to 80 MHz.]

Figure 5.9 SNDR and SFDR at fsample = 150 MHz.

Table 1 summarizes the overall performance. The analog power consumption is

higher than expected mainly due to the discrepancy between the target resistance values

and the actual values. Since the sheet resistance of the fabricated poly resistors is about

30% higher than that used in the simulations, more power is required to

improve the large-signal slewing behavior of the differential pairs.

Technology              0.6-µm, 1-poly, 3-metal CMOS
Resolution              8 bits
DNL                     0.61 LSB
INL                     1.24 LSB
Sampling Rate           150 MHz
SNDR @ fin = 1.8 MHz    43.7 dB
SNDR @ fin = 70 MHz     40 dB
Analog Input Swing      1.6 Vp-p
Input Capacitance       1.5 pF
Active Chip Area        1.2 mm²
Supply Voltage          3.3 V
Power Consumption
  Analog                330 mW
  Digital               53 mW
  Reference Ladder      12 mW
  Total                 395 mW

Table 1: Measurement Summary

Chapter 6

Conclusion and Future Work

This dissertation presents the design work and experimental results of an 8-bit,

150-MHz CMOS ADC. The research introduces a new sliding interpolation ADC

architecture that lends itself to pipelining without the need for interstage DACs or

subtractors. Two other circuit techniques, namely clock edge reassignment and

reinterpolation, are also presented to improve the precision.

The entire design uses only open-loop circuits in order to maximize the

throughput rate. A triple-channel interleaved topology and a front-end open-loop

sample-and-hold circuit are adopted to reduce the settling time in the critical path. The

clock edge reassignment technique relaxes the timing-mismatch problem in multi-

channel interleaved systems, enhancing the dynamic performance. The sliding

interpolation eliminates the need for interstage D/A converters, subtractors, and residue

amplifiers in traditional pipelining structures. Parasitic wiring capacitance is utilized for

the distributed sampling scheme. Since the converter requires no floating capacitors, it

is well suited to digital CMOS processes.

The ADC avoids the use of op amps, incorporating only source followers and

differential pairs. Thus, it suffers much less from headroom-gain-speed trade-offs. The

prototype is functional even with a 2.5-V supply, though it was designed for a 3.3-V

system. The simple and modular design of the pipelining structure results in a compact

layout requiring a core area of only 1.2 mm² in a 0.6-µm CMOS process.

The fabricated prototype delivered the performance reported here on the first try.

Many aspects of the design can be reexamined and improved in future work.

First, due to a layout error, the gain of the differential pairs in the reinterpolation

stage was equal to one rather than two, leading to a higher input-referred offset voltage.

Second, the kickback noise of the comparators in the first sub-ADC still creates a large

dynamic offset, sometimes pushing the following stage to the edge of the overlap range.

Third, the layout of the resistor ladder should preferably avoid current-carrying contacts

to provide better matching.

In deep-submicron technologies, many of the techniques described in this

dissertation can be used. For higher resolutions, the charge injection and nonlinearity

issues of the front-end SHA require additional work and the comparator kickback noise

must be reduced. Also, the trade-off between the offset voltage and the gate capacitance

must be relaxed, perhaps through the use of averaging [37]. Moreover, the problem of

charge injection in the distributed sample-and-hold circuits should be treated more

carefully. Finally, the power dissipation in the pipelined stages can be scaled down

because the precision requirements become more relaxed as the signal travels through

the pipeline.

Bibliography

[1] R. Gregorian and G. C. Temes, Analog MOS Integrated Circuits for Signal
Processing, John Wiley and Sons, New York, 1986.

[2] B. Razavi, Principles of Data Conversion System Design, IEEE Press, New
York, 1995.

[3] P. R. Gray and R. G. Meyer, Analysis and Design of Analog Integrated
Circuits, Third Edition, John Wiley and Sons, New York, 1992.

[4] B. Razavi, “Design of Sample-and-Hold Amplifiers for High-Speed Low-
Voltage A/D Converters,” Proc. CICC, pp. 59-66, May 1997.

[5] W. Black and D. A. Hodges, “Time Interleaved Converter Arrays,” IEEE J.
Solid-State Circuits, vol. SC-15, pp. 1022-1029, Dec. 1980.

[6] Y. C. Jenq, “Digital Spectra of Nonuniformly Sampled Signals:
Fundamentals and High-Speed Waveform Digitizers,” IEEE Trans. Instrum.
Meas., vol. 37, pp. 245-251, June 1988.

[7] B. S. Song, M. F. Tompsett, and K. R. Lakshmikumar, “A 12-Bit 1-
Msample/s Capacitor-Averaging Pipelined A/D Converter,” IEEE J. Solid-
State Circuits, vol. SC-23, pp. 1324-1333, Dec. 1988.

[8] R. J. van de Plassche and P. Baltus, “An 8-b 100-MHz Full-Nyquist Analog-
to-Digital Converter,” IEEE J. Solid-State Circuits, vol. SC-23, pp. 1334-
1344, Dec 1988.

[9] N. Fukushima et al., “A CMOS 40 MHz 8b 105mW Two-Step ADC,”
ISSCC Dig. Tech. Pap., pp. 14-15, Feb. 1989.

[10] N. Shiwaku, Y. Tung, T. Hiroshima, K.-S. Tan, T. Kurosawa, K. McDonald,
and M. A. Chiang, “A Rail-to-Rail Video-Band Full Nyquist 8-Bit A/D
Converter,” Proc. CICC, pp. 26.2/1-4, May 1991.

[11] T. Matsuura, H. Kojima, E. Imaizumi, K. Usui, and S. Ueda, “An 8-b 50-
MHz 225-mW Submicron CMOS ADC Using Saturation Eliminated
Comparators,” Proc. CICC, pp. 6.4/1-4, May 1990.

[12] K. Tsuji, K. Sugiyama and N. Sugawa, “A CMOS 20 MHz 8 Bit 50 mW
ADC for Mixed Analog/Digital ASICs,” Proc. CICC, pp. 26.3/1-4, May
1991.

[13] G. T. Tuttle, S. Fallahi, and A. A. Abidi, “An 8-b CMOS Vector A/D
Converter,” ISSCC Dig. Tech. Pap., pp. 38-39, Feb. 1993.

[14] C. S. G. Conroy, D. W. Cline, and P. R. Gray, “An 8-b 85-MS/s Parallel
Pipelined A/D Converter in 1-µm CMOS,” IEEE J. Solid-State Circuits, vol.
SC-28, pp. 447-454, April 1993.

[15] A. Abrial, J. Bouvier, J.-M. Fournier, and P. A. Senn, “Low-Power 8-b
13.5-MHz Video CMOS ADC for Visiophony ISDN Applications,” IEEE J.
Solid-State Circuits, vol. SC-28, pp. 725-729, July 1993.

[16] B. Nauta and A. G. W. Venes, “A 70 MS/s, 110 mW, 8-b CMOS Folding
Interpolating A/D Converter,” ISSCC Dig. Tech. Pap., pp. 276-277, Feb.
1995.

[17] Chung-Yu Wu, Chih-Cheng Chen, and Jhy-Jer Cho, “A CMOS Transistor-
Only 8-b 4.5-MS/s Pipelined Analog-to-Digital Converter Using Fully-
Differential Current-Mode Circuit Techniques,” IEEE J. Solid-State
Circuits, vol. SC-30, pp. 522-532, May 1995.

[18] M. Bracey, W. Redman-White, J. Richardson, and J. B. Hughes, “A Full
Nyquist 15 MS/s 8-b Differential Switched-Current A/D Converter,” IEEE
J. Solid-State Circuits, vol. SC-31, pp. 945-951, July 1996.

[19] A. G. W. Venes and R. J. van de Plassche, “An 80-MHz, 80-mW, 8-b CMOS
Folding A/D Converter with Distributed Track-and-Hold Preprocessing,”
IEEE J. Solid-State Circuits, vol. SC-31, pp. 1846-1853, Dec. 1996.

[20] K. Nagaraj, H. S. Fetterman, J. Anidjar, S. H. Lewis, R. G. Renninger, “A
250-mW, 8-b, 52-MS/s Parallel Pipelined A/D Converter with Reduced
Number of Amplifiers,” IEEE J. Solid-State Circuits, vol. SC-32, pp. 312-
320, March 1997.

[21] W. Bright, “An 8b 75 MS/s 70 mW Parallel Pipelined ADC Incorporating
Double Sampling,” ISSCC Dig. Tech. Pap., pp. 146-147, Feb. 1998.

[22] C. W. Mangelsdorf et al., “A 400-MHz Input Flash Converter with Error
Correction,” IEEE J. Solid-State Circuits, vol. SC-25, pp. 184-191, Feb.
1990.

[23] Y. Akazawa et al., “A 400 MSPS 8 b Flash AD Conversion LSI,” ISSCC Dig.
Tech. Pap., pp. 98-99, Apr. 1987.

[24] T. Tsukada et al., “CMOS 8b 25 MHz Flash ADC,” ISSCC Dig. Tech. Pap.,
pp. 34-35, Feb. 1985.

[25] A. G. Dingwall and V. Zazzu, “An 8-MHz CMOS Subranging 8-Bit A/D
Converter,” IEEE J. Solid-State Circuits, vol. SC-20, pp. 1138-1143, Dec.
1985.

[26] K. Sone, Y. Nishida, and N. Nakadai, “A 10-b 100-Msample/sec Subranging
BiCMOS ADC,” IEEE J. Solid-State Circuits, vol. SC-28, pp. 1187-1199,
Dec. 1993.

[27] S. H. Lewis and P. R. Gray, “A Pipeline 5-Msample/s 9-bit Analog-to-
Digital Converter,” IEEE J. Solid-State Circuits, vol. SC-22, pp. 954-961,
Dec. 1987.

[28] M. Yotsuyanagi, T. Etoh, and K. Hirata, “A 10 Bit 50 MHz Pipelined CMOS
A/D Converter with S/H,” IEEE J. Solid-State Circuits, vol. SC-28, pp. 292-
300, March 1993.

[29] W. T. Colleran and A. A. Abidi, “A 10-b 75-MHz Two-Step Pipelined
Bipolar A/D Converter,” IEEE J. Solid-State Circuits, vol. SC-28, pp. 1187-
1199, Dec. 1993.

[30] C. Lane, “A 10-Bit, 60-MS/s Flash ADC,” Proc. BCTM, pp. 44-47, Sep.
1989.

[31] H. Kimura et al., “A 10-b 300-MHz Interpolated Parallel A/D Converter,”
IEEE J. Solid-State Circuits, vol. SC-28, pp. 438-446, Apr. 1993.

[32] M. Steyaert, R. Roovers, and J. Craninckx, “A 100 MHz 8 Bit CMOS
Interpolating A/D Converter,” Proc. CICC, pp. 28.1.1-28.1.4, May 1993.

[33] K. Kusumoto, A. Matsuzawa, and K. Murata, “A 10-b 20-MHz 30-mW
Pipelined Interpolating CMOS ADC,” IEEE J. Solid-State Circuits, vol. SC-
28, pp. 1200-1206, Dec. 1993.

[34] R. J. van de Plassche, “An 8-Bit 100-MHz Full-Nyquist Analog-to-Digital
Converter,” IEEE J. Solid-State Circuits, vol. SC-23, pp. 1334-1344, Dec.
1988.

[35] R. J. van de Plassche and P. Baltus, “An 8 b 100 MHz Folding ADC,” ISSCC
Dig. Tech. Pap., pp. 222-223, Feb. 1988.

[36] J. van Valburg and R. J. van de Plassche, “An 8-b 650-MHz Folding ADC,”
IEEE J. Solid-State Circuits, vol. SC-27, pp. 1662-1666, Dec. 1992.

[37] K. Bult and A. Buchwald, “An embedded 240-mW 10-b 50-MS/s CMOS
ADC in 1-mm2,” IEEE J. Solid-State Circuits, vol. SC-32, pp. 1887-1895,
Dec. 1997.

[38] F. Lu, H. Samueli, J. Yuan, and C. Svensson, “A 700-MHz 24-b Pipelined
Accumulator in 1.2-µm CMOS for Application As a Numerically
Controlled Oscillator,” IEEE J. Solid-State Circuits, vol. SC-28, pp. 878-
886, Aug. 1993.

[39] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, “Matching
Properties of MOS Transistors,” IEEE J. Solid-State Circuits, vol. SC-24,
pp. 1433-1439, Oct. 1989.

[40] B. M. Gordon, “Linear Electronic Analog/Digital Conversion
Architectures, Their Origins, Parameters, Limitations, and Applications,”
IEEE Trans. Circuits Syst., vol. CAS-25, pp. 391-418, July 1978.

[41] S. K. Tewksbury, et al., “Terminology Related to the Performance of S/H,
A/D, and D/A circuits,” IEEE Trans. Circuits Syst., vol. CAS-25, pp. 419-
426, July 1978.

[42] J. Doernberg, H. S. Lee, and D. A. Hodges, “Full-Speed Testing of A/D
Converters,” IEEE J. Solid-State Circuits, vol. SC-19, pp. 820-827, Dec.
1984.
