Los Angeles
in Electrical Engineering
by
Yun-Ti Wang
1999
Dedication
Table of Contents
3.4 Clock Edge Reassignment
Chapter 6 Conclusion and Future Work
Bibliography
List of Figures
Figure 2.1 Digital oscilloscope.
Figure 3.11 (a) Timing mismatch in interleaved architecture, (b) generation
simulations.
Figure 4.8 Clock edge reassignment circuit for a dual-channel system.
Figure 4.14 High-speed differential D latch.
Figure 4.15 Realization of a slice of the signal path in the first stage.
Figure 5.7 DNL and INL at fin = 1.8 MHz and fsample = 150 MHz.
List of Tables
Table 1: Measurement Summary
ACKNOWLEDGMENTS
Razavi for his guidance and support throughout my Ph.D. study. He is my role
Next, I would like to thank Jafar Savoj and Tai-Cheng Lee for the helpful
technical discussions with them and also Alireza Razzaghi for his proofreading of
my dissertation.
I am very grateful and feel very lucky that my wonderful parents are open-
minded, patient, and very supportive throughout this long graduate study process. I
am also thankful for the support from other members in our family.
Last but not least, I would like to express my deepest gratitude to Guo Jing,
my dear better half, who has generously and consistently provided me with the crucial
emotional support and love that I needed the most to complete this challenging
between us, I believe we can make more contributions to this world in the future.
ABSTRACT OF THE DISSERTATION
by
Yun-Ti Wang
Doctor of Philosophy in Electrical Engineering
University of California, Los Angeles, 1999
Professor Behzad Razavi, Chair
portable digital oscilloscopes use 8-bit ADCs with sampling rates above one
hundred megahertz. Also, the Gigabit Ethernet standard with CAT-5 copper cable
requires four 125-MHz ADCs having a resolution of 7 to 8 bits to perform the front-
differential pairs and source followers, thereby achieving a high conversion rate.
The concept of “sliding interpolation” is proposed to obviate the need for a large
amplifiers. The pipelining incorporates distributed sampling between the stages so
work also introduces a “clock edge reassignment” technique that suppresses timing
method is proposed.
LSB, INL of 1.24 LSB, SFDR of 50 dB, and SNDR of 43.7 dB at 150 MHz
sampling rate with low input frequencies. When the input frequency is 70 MHz, an
SNDR of 40 dB is attained. The converter draws 395 mW from a 3.3-V supply and
Chapter 1
Introduction
1.1 Motivation
Analog-to-digital (A/D) conversion and digital-to-analog (D/A) conversion are
semiconductor technology and scaling of devices, digital circuits have achieved both
high speed and low power dissipation. This trend has several impacts on mixed-signal
integrated circuits (ICs). First, increasingly more operations are performed by digital
circuits rather than by their analog counterparts. Second, the speed of the A/D and D/A
interfaces must scale with the speed of the digital circuits in order to fully utilize the
achieve the high levels of integration on a single chip for mixed-signal processing
systems.
The above observations lead to several important design implications for analog
circuits. First, front-end analog signal processing and data conversion (including anti-
aliasing filters) still remain as important niches where analog solutions provide
advantages over digital approaches. A/D converters (ADCs) and D/A converters
rate of the data between the analog and digital domains continues to increase, creating
new challenges in the design of data converters. Third, when a data converter is
substantial substrate and supply noise. Thus, the noise immunity of data converters
applications, impacting the cost of packaging as well as the battery lifetime in portable
products.
complexity than D/A conversion to achieve a given resolution and speed. Therefore,
The goal of this research is to develop new A/D conversion architectures and
circuit techniques that lead to high speed and moderate power consumption in CMOS
technology, with the objective of achieving a conversion rate well above 100 MHz at a
resolution of 8 bits. Such A/D converters find wide application in digital sampling
oscilloscopes, Gigabit Ethernet over CAT-5 twisted pair cables, RGB-to-LCD data
The concepts introduced in this research have been realized in the design of an
of the novel architecture and circuit techniques developed during the course of this
research.
reinterpolation. Chapter 4 describes the design of each building block and various
trade-offs at the circuit level and the architecture level. Some critical layout issues are
also addressed. Chapter 5 presents the test procedure and the experimental results
obtained for the prototype and Chapter 6 provides a summary and recommendations for
future work.
Chapter 2
with sampling rates above 100 MHz and resolutions of about 8 bits. These include digital
2.1 Applications
Digital oscilloscopes employ high speed ADCs to quantize the probed analog
signal. For portable digital oscilloscopes, low power consumption and cost are critical.
As shown in Fig. 2.1, an 8-bit A/D converter digitizes the input signal, a core DSP processes
the result, and an 8-bit DAC converts the signal back to an analog waveform, which is
[Figure 2.1: Digital oscilloscope (blocks: 8-bit ADC, DSP, 8-bit DAC, display).]
Gigabit Ethernet over CAT-5 twisted-pair wires requires four 8-bit, 125-MHz
ADCs at the receiver end. The principal challenge in the design of these converters is
power consumption. As shown in Fig. 2.2, the ADCs must provide sufficient dynamic range
so as to accommodate a large echo and signal level variation due to the attenuation
the analog RGB images on a CRT monitor. With the advent of flat-panel LCD displays,
it is necessary to convert the analog RGB signals to the form required for LCD displays.
High-end LCD displays require an ADC conversion rate of 150 MHz. The power
dissipation is also critical because three such ADCs are integrated on one chip.
[Figure 2.3: RGB-to-LCD data conversion (labels: RGB, 8-bit DACs, 8-bit ADCs, 150 MHz, display).]
Figure 2.4 shows the block diagram of an N-bit flash ADC. The analog input
signal is simultaneously compared with threshold voltages of the ADC by an array of 2^N −
converted to a binary output by an encoder. The threshold levels are usually generated by
[Figure 2.4: Block diagram of an N-bit flash ADC: 2^N − 1 comparators with references from VMIN = VR[1] to VMAX = VR[2^N − 1], producing a thermometer code that a (2^N − 1)-to-N encoder converts to the N-bit output.]
The operation of a flash ADC can be viewed from another perspective that leads
to techniques such as interpolation. Illustrated in Fig. 2.5 are the differential outputs of
each preamplifier in the ADC as the analog input varies from VMIN to VMAX. We note
that each output crosses zero when the input of the preamplifier crosses its respective
reference voltage. Hence, the ADC operation can be viewed as a collection of these zero
crossings.
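The zero-crossing view can be illustrated with a short behavioral model. This is only a sketch under stated assumptions: the function name `flash_adc`, the uniformly spaced reference taps, and the ideal ones-counting encoder are illustrative choices, not the circuit described in this work.

```python
def flash_adc(vin, vmin, vmax, n_bits):
    """Behavioral N-bit flash ADC: returns (thermometer code, output code)."""
    if n_bits < 2:
        raise ValueError("this sketch assumes n_bits >= 2")
    n_comp = 2 ** n_bits - 1
    # Assumed: uniformly spaced taps with VR[1] = vmin and VR[2^N - 1] = vmax.
    step = (vmax - vmin) / (n_comp - 1)
    refs = [vmin + k * step for k in range(n_comp)]
    # Each preamplifier output crosses zero when vin crosses its reference.
    preamp_out = [vin - vr for vr in refs]
    thermometer = [1 if v > 0 else 0 for v in preamp_out]
    # Idealized encoder: the number of ones is the output code.
    return thermometer, sum(thermometer)


if __name__ == "__main__":
    thermo, code = flash_adc(vin=0.5, vmin=0.0, vmax=1.2, n_bits=3)
    print(thermo, code)
```

The boundary between ones and zeros in the thermometer code marks the pair of adjacent zero crossings that bracket the input.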
[Figure 2.5: Differential preamplifier outputs versus Vin from VMIN to VMAX, showing 2^N − 1 zero crossings at the reference voltages VR[1] to VR[2^N − 1].]
The principal advantage of the flash architecture is its high throughput rate. The
conversion of each sample takes only one single clock period. However, many issues
limit the utility of this approach for resolutions above 6 bits. The exponential growth of
the input capacitance, power dissipation, and area are critical drawbacks. Furthermore,
the offset of the comparators, the feedthrough of the analog input to the resistor ladder
[2], the slew-dependent comparator delay [4][7], and the problem of bubbles in the
thermometer code [2][4] degrade the static and dynamic performance substantially.
2.2.2 Two-Step Architecture
quantization of the signal can be performed in two or more steps. The basic principle of
the two-step architecture can be illustrated by the mapping scheme of a 4-bit ADC as
shown in Fig. 2.6. In the two-step topology, the input range is first divided into four equal
segments, and a coarse quantizer is used to determine in which segment the analog input
lies, thus producing the most significant bits (MSBs). Next, each segment is subdivided
into four segments, and a fine quantizer detects the least significant bits (LSBs).
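[Figure 2.6: Mapping scheme of a 4-bit two-step ADC: coarse codes 00 to 11, each segment subdivided into fine codes 00 to 11.]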
A more detailed description of the operation is depicted in Fig. 2.7, where the
block diagram of an N-bit, two-step ADC and the conversion flow are illustrated. The
converter, and a subtractor.
[Figure 2.7: (a) Block diagram of an N-bit two-step ADC (S/H, coarse ADC giving N1 MSBs, DAC, residue Vres, fine ADC giving N2 LSBs); (b) conversion flow mapping Vin to the level V1 and the residue to the digital output “1011”.]
The conversion proceeds as follows: an analog input signal with magnitude of Vin
is sampled by the SHA and subsequently mapped onto the level V1 by the coarse
quantizer, resulting in the two MSBs, e.g., “10” in this case. Next, an analog residue, Vres,
is produced by the subtractor and digitized by the fine quantizer, thereby generating the
two LSBs.
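The conversion flow above can be sketched as an idealized 4-bit model. The function name `two_step_adc`, the 0-to-1 V input range, and the ideal (offset-free) quantizers are assumptions made for illustration only.

```python
def two_step_adc(vin, vmin=0.0, vmax=1.0, n1=2, n2=2):
    """Idealized two-step conversion: returns (MSBs, LSBs, combined code)."""
    coarse_step = (vmax - vmin) / 2 ** n1
    # Coarse quantizer: find the segment containing vin (clamped to the range).
    msb = min(max(int((vin - vmin) / coarse_step), 0), 2 ** n1 - 1)
    # DAC output: the level V1 at the bottom of the selected segment.
    v1 = vmin + msb * coarse_step
    # Subtractor: the analog residue passed to the fine quantizer.
    vres = vin - v1
    fine_step = coarse_step / 2 ** n2
    lsb = min(max(int(vres / fine_step), 0), 2 ** n2 - 1)
    return msb, lsb, (msb << n2) | lsb


if __name__ == "__main__":
    msb, lsb, code = two_step_adc(0.72)
    print(format(msb, "02b"), format(lsb, "02b"), format(code, "04b"))
```

With this assumed range, an input of 0.72 V falls in the third coarse segment and the fourth fine segment, which corresponds to the “1011” example in Fig. 2.7.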
The primary advantage of the two-step topology is that it requires less hardware
and power than a flash architecture. However, this saving is obtained at the cost of longer
From the above discussion, it is clear that the use of multiple stages can alleviate
the exponential growth present in flash topologies. The two-step architecture exemplifies
this benefit to a certain degree, but the low throughput rate limits the use of this approach.
Figure 2.8 illustrates the block diagram of a pipelined ADC. The analog input is
applied to the first stage in the chain, and N1 bits are detected. The analog residue is also
generated and applied to the next stage. The same procedure repeats up to the end of the
chain. This concept is similar to the idea of an assembly line because the interstage
[7] that merges most of the interstage operations into a compact circuit. Usually used with
[Figure 2.8: Block diagram of a pipelined ADC (per-stage blocks: S/H, Nj-bit ADC, Nj-bit DAC, gain of 2^Nj; Nj bits resolved per stage).]
rate is determined by the speed of only one stage in the pipeline. Second, interstage
residue amplification relaxes the precision required of subsequent stages. Third, the
power and hardware of pipelined converters grow almost linearly with the number of bits.
Also, overlap and digital correction [2] can be used to allow large offsets in the
comparators.
The primary drawback of the conventional pipelined topology is the need for high
precision in the interstage SHAs, DACs, and subtractors, especially at the front end. The
precision typically mandates the use of op amps, imposing severe trade-offs among
speed, voltage swing, gain, and power dissipation. As device dimensions, supply
voltages, and the intrinsic gain (g_m r_o) of MOSFETs continue to scale down, the design
In the pipelined topology, the conversion rate is still limited by the settling time
The basic principle behind interleaving is illustrated in Fig. 2.9. The architecture
employs M identical sub-ADCs, each incorporating a SHA that tracks for T1 seconds and
holds for (M−1)T1 seconds. Thus, each sub-ADC is allotted (M−1)T1 seconds for one
conversion.
The use of multiple parallel channels, however, introduces serious difficulties due
to mismatches [14]. Tones at fck/M and fixed-pattern noise are generally caused by offset
mismatches, and sideband modulation around fck/M is introduced by gain mismatches.
The dynamic performance is severely affected by the timing mismatch among the
channels [5][6].
[Figure 2.9: Time-interleaved architecture: M channels (SHA1/Sub-ADC1 to SHAM/Sub-ADCM, N bits each) clocked by CK1 to CKM, with the digital output at M x fck; each clock tracks for T1 and holds for (M−1)T1.]
2.2.5 Interpolating Architecture
topology is the large input capacitance. This problem can be alleviated by applying
interpolation as shown in Fig. 2.10. The idea is that if Vin crosses (VR2 + VR1)/2, then Vo2
crosses zero, increasing the resolution by a factor of two. In essence, interpolation adds
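[Figure 2.10: Interpolation: preamplifiers with references VR1 and VR2 followed by interpolating amplifiers producing Vo1, Vo2, and Vo3; stage labels 2^0 + 1, 2^1 + 1, 2^2 + 1, ..., 2^N + 1.]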
amplifiers used in Fig. 2.10 need not have an accurate gain, high linearity, or large output
swings. Also, it can reduce the differential nonlinearity (DNL) resulting from the offset
of the preamplifiers [30]. However, the simple scheme shown in Fig. 2.10 still requires
high power and substantial hardware because of the 2x growth in each interpolation step.
nonlinearity (INL).
Chapter 3
growth of power and area, extending the idea to multiple stages. Next, we incorporate
all of the interpolative stages, while triple-channel interleaving is used in the front-end
the timing mismatch among the channels in an interleaving system, a new technique,
introduced to reduce the INL by roughly 30%. Finally, the effects of the gain and offset
with interpolation.
of zero-crossing points rather than direct amplitude quantization. The basic operation
between the analog input signal and each tap voltage of a reference ladder. According
to their polarities, the outputs of these preamplifiers can be divided into two groups:
positive and negative, with a distinctive boundary between them. This phenomenon is
similar to what happens inside a thermometer and can be used later to recover the actual
amplitude information of the original analog input signal. These preamplifier outputs
can then be fed into the next-level bank of interpolating amplifiers, whose outputs retain
the thermometer code property. With the aid of these interpolating amplifiers, this code
contains more divisions and hence a higher resolution. As long as the zero-crossing
boundary is unique and the code exhibits sufficient linearity, the original analog signal
can be recovered.
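The way 2x interpolation adds zero crossings while retaining the thermometer-code property can be checked with a small numerical sketch. The tanh-shaped preamplifier characteristic, the gain value, and all function names here are assumptions chosen for illustration; the only property relied on is that every amplifier has the same odd-symmetric, monotonic shape.

```python
import math


def preamp_outputs(vin, refs, gain=20.0):
    """Saturating preamplifier outputs, one per reference tap (assumed tanh shape)."""
    return [math.tanh(gain * (vin - vr)) for vr in refs]


def interpolate_2x(outputs):
    """Insert the average of each adjacent pair between the original outputs."""
    result = []
    for a, b in zip(outputs, outputs[1:]):
        result.extend([a, (a + b) / 2])
    result.append(outputs[-1])
    return result


def thermometer(outputs):
    """Polarity of each output: 1 for positive, 0 otherwise."""
    return [1 if v > 0 else 0 for v in outputs]


if __name__ == "__main__":
    refs = [0.1, 0.2, 0.3, 0.4]
    for vin in (0.23, 0.27):
        print(vin, thermometer(interpolate_2x(preamp_outputs(vin, refs))))
```

In this model the interpolated output between the taps at 0.2 V and 0.3 V changes sign at 0.25 V, so the two inputs above yield different codes even though both lie between the same pair of reference taps.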
Before introducing the concept of sliding interpolation, let us first briefly review
circuit is shown in Fig. 3.1. While this scheme reduces the number of the input
preamplifiers and hence the input capacitance, it still requires a large number of
differential pairs and comparators. However, we recognize that for a given input level,
the outputs of only a few preamplifiers in the first stage are of interest. Thus, the
subsequent stages need not interpolate the outputs of all of the preamplifiers. We then
surmise that a compact interpolating stage can “slide” up and down if the analog input
value is roughly known. As shown in Fig. 3.2, the idea is to use a sub-ADC to determine
which preamplifier outputs must be interpolated and route these outputs to the
interpolating differential pairs through a differential multiplexer (MUX). The rest of the
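[Figure 3.1: Multi-stage interpolation: preamplifiers with references VR1 and VR2 followed by interpolating amplifiers producing Vo1, Vo2, and Vo3; stage labels 2, 2^1 + 1, 2^2 + 1, ..., 2^N + 1.]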
sufficient, in this design we process four preamplifier outputs to allow margin for offsets
of the comparators in the sub-ADC. When this concept is repeatedly applied to the
[Figure 3.2: Sliding interpolation (labels: Vin, VR1, VR2, sub-ADC, MUX, Vo1, Vo2, Vo3).]
Through the sliding interpolation, the power and hardware grow only linearly,
first stage has 16 preamplifiers to generate 16 zero crossings. If the analog input lies
between VR,j and VR,j+1, then a 4-bit coarse ADC and a 16-to-4 MUX route the outputs
[Figure 3.3: Four-stage sliding interpolation between Vmin and Vmax (labels: Vin, VR,j−1, VR,j+2; MUX 16 --> 4 after stage 1, MUX 7 --> 4 after stages 2 and 3).]
Since only 2x-interpolation is used, each stage, excluding the first one,
generates a total of seven outputs. Also, a sub-ADC is used to detect two more bits in
each stage. The overall resolution is increased by one bit because the second bit is used
amplifier. Therefore, all of the decision levels in Fig. 3.3 can be replaced by amplifiers
If the gain of every amplifier in each stage is about two, the input dynamic range
of the sub-ADCs remains nearly the same through the chain. All of the sub-ADCs can
The implementation of the sliding interpolation is shown in Fig. 3.5. The front-
end SHA samples and holds the analog input signal. In stage 1, the preamplifiers
generate 16 zero crossings, while the sub-ADC determines the four MSBs. In the
second and the following stages, each MUX is commanded by the sub-ADC in the
previous stage to select and route four amplified outputs to the interpolative amplifiers.
Further details of the sliding interpolation are shown in Fig. 3.6. The first stage
seven amplifiers. By virtue of this technique, the total number of differential pairs
reduces from roughly 500 to 50. The five sub-ADCs require a total of 28 comparators.
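The hardware figures quoted above can be reproduced with simple arithmetic. This is a bookkeeping sketch only: the function name is hypothetical, the "roughly 500" baseline is assumed to come from plain 2x interpolation in which each stage doubles the previous one, and the comparator split (2^4 in the first sub-ADC, 3 in each later one) is one assumption consistent with the stated total of 28.

```python
def sliding_interpolation_budget(first_stage_bits=4, n_stages=5, amps_per_stage=7):
    """Count differential pairs, comparators, and net resolution (illustrative only)."""
    preamps = 2 ** first_stage_bits  # 16 zero crossings in stage 1
    # Sliding interpolation: 16 preamplifiers plus seven amplifiers per later stage.
    diff_pairs_sliding = preamps + (n_stages - 1) * amps_per_stage
    # Assumed baseline: every stage doubles the number of amplifiers.
    diff_pairs_full = sum(preamps * 2 ** k for k in range(n_stages))
    # Each later stage detects two bits, one of them redundant: one net bit per stage.
    resolution_bits = first_stage_bits + (n_stages - 1)
    # Assumed comparator split across the five sub-ADCs.
    comparators = 2 ** first_stage_bits + (n_stages - 1) * (2 ** 2 - 1)
    return {
        "diff_pairs_sliding": diff_pairs_sliding,
        "diff_pairs_full": diff_pairs_full,
        "resolution_bits": resolution_bits,
        "comparators": comparators,
    }


if __name__ == "__main__":
    print(sliding_interpolation_budget())
```

Under these assumptions the counts come out to 44 versus 496 differential pairs, 28 comparators, and 8 bits, in line with the rounded figures in the text.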
[Figure 3.7: Sliding interpolation (5 stages). Panels A to E plot the amplifier outputs (V, from −1 to 1) of stages 1 to 5 versus input (V, from −0.1 to 0.1).]
Figure 3.7 plots the amplifier outputs in each stage as sliding interpolation is
activated. The outputs of the first stage exhibit zero-crossing points that are separated
by 50 mV. After sliding interpolation with redundancy, zero crossings with 25-mV
lends itself well to the multi-stage pipelining with no D/A converters or subtractors.
Second, it requires no precision gain in any of the building blocks, allowing the use of
simple differential pairs in the entire signal path. Third, it can include reinterpolation to
Although the hardware size and the associated power consumption in the
sliding interpolation structure are substantially less than those in the traditional
operation. For each held analog input sample, the overall A/D conversion is not
complete unless the digital data is generated by all of the sub-ADCs, an operation that
can easily take several tens of nanoseconds. For a higher conversion rate, a pipelining
scheme is needed.
is where and how it should be applied. As shown in Fig. 3.6, each interpolative stage
contains only two analog blocks, a MUX and an amplifier bank. Thus, pipelining can
be applied at only one of two points: at the input or output of the MUX.
[Figure 3.8: Pipelining with distributed sampling (labels: SHA, 16 preamplifiers, 4-bit and 2-bit sub-ADCs, MUX 16 --> 4, distributed sampling and 2x interpolation, digital error correction, to stages 3 to 5).]
As shown in Fig. 3.8, the interface between the multiplexer and the amplifier
bank is the best choice. This is so for two reasons. First, the multiplex switches can also
function as the sample-and-hold switches, significantly reducing the delay between the
two stages because only one switch appears in the signal path between two consecutive
stages. Second, the interconnection wires between the multiplexers and the
now be utilized as the sample-and-hold capacitors. This type of distributed sample-and-
hold system is similar to that reported in [19]. Partitioning the conversion into several
equal-length time slots, the pipelining significantly improves the throughput rate.
Note that each stage in the pipeline operates in the sample mode for half of the
clock period and in the hold mode for the other half. On the other hand, the sub-ADC
in each stage operates only during the hold mode, raising the possibility of adding
interleaving is also desirable because, even though the maximum path “length” between
consecutive samplers in the pipeline corresponds to roughly two differential pairs, the
settling requirements still limit the conversion speed. As shown in Fig. 3.9, the
converter employs two identical interleaved channels to increase the speed. The
are duplicated for the even and the odd channels whereas the front-end buffer, the
preamplifiers, and all of the sub-ADCs are shared between the two channels. The timing
is such that when one stage in the odd channel is in the sampling mode, the
corresponding stage in the even channel is in the hold/amplification mode and vice
versa.
[Figure 3.9: Two-channel interleaved pipeline, stages 1 to 3 (labels: buffer, sub-ADCs, SHA (even), MUX (even), distributed S/H and 2x interpolation, ON/OFF, digital output).]
When the SHA in the odd channel is sampling the analog input, the SHA in the
even channel is holding and passing the previous analog sample to the preamplifiers
through the buffer. The sub-ADC in stage 1 then generates the four-bit digital code and
commands the MUX in the even channel of stage 2 to redirect the selected preamplifier
Even though the addition of interleaving increases the speed by almost a factor
of two, the first sub-ADC still creates three difficulties. First, due to the finite
impedance seen at the preamplifier outputs, the kickback noise generated by the
sub-ADC considerably disturbs the analog signals at the inputs of the MUX, thereby
requiring a long settling time after the sub-ADC is strobed. Second, the sub-ADC
cannot begin its conversion until the front-end SHA, the buffer, and the preamplifier
outputs are settled. Since the buffer drives a relatively large capacitance, the settling in
this path is quite slow. Third, since the sub-ADC appears in the critical path, that is, the
preamplifier outputs must remain idle until the sub-ADC makes a decision, the
[Figure 3.10: Interleaved pipeline with a replica front-end SHA (labels: main SHA (even) and buffer, replica SHA (odd, even) and buffer, sub-ADCs, MUX (even), distributed S/H and 2x interpolation, ON/OFF, digital error correction, digital output).]
Figure 3.10 illustrates a modification that alleviates all of the above issues. A
“replica” front-end SHA is added and its output directly drives the first sub-ADC.
Scaled down in device sizes and current levels by a factor of two with respect to the
main SHA, the replica prevents the large kickback noise of the sub-ADC from
corrupting the output of the preamplifiers. Also, the replica signal experiences a shorter
delay than that in the main path because of the much smaller load capacitance seen by
the replica buffer. Thus, the sub-ADC can be strobed much earlier than before.
The use of interleaving raises concern with respect to mismatches between the
offsets, gains, and timings of the two channels. The first two issues will be discussed in
Section 3.6. The problem of the timing mismatch and the proposed solution are
systems, we revisit the problem itself to understand its nature. As shown in Fig. 3.11(a),
two interleaved channels, SHA1 and SHA2, require two corresponding clocks, CK1 and
CK2, which are generated by two different clock generators. In the ideal case, the
sampling edge of CK1 is placed precisely midway between the sampling edges of CK2
such that SHA1 and SHA2 sample the analog signal at evenly-spaced points in time.
Figure 3.11 (a) Timing mismatch in interleaved architecture, (b) generation of CK1 and
CK2 by a frequency divider.
CK1 and CK2 with a nominal duty cycle of 50% even if the duty cycle of CKin deviates
from 50%. In reality, however, the devices in the clock generators of Fig. 3.11(a) or the
frequency divider of Fig. 3.11(b) suffer from substantial mismatches, especially at high
speeds, introducing large timing errors between CK1 and CK2. Since an 8-bit ADC
sampling a 75-MHz signal cannot tolerate timing mismatches greater than roughly
12 ps, frequency division does not provide the accuracy required in this design.
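To get a feel for the scale of this requirement, the worst-case sampling error caused by a timing offset can be estimated from the maximum slope of a full-scale sine wave. This is a first-order sketch under that assumption; the function name is hypothetical, and the exact criterion behind the tolerance quoted above is not stated in this passage.

```python
import math


def skew_error_lsb(f_in_hz, skew_s, n_bits):
    """Worst-case error, in LSBs, when a full-scale sine is sampled skew_s seconds late.

    For A*sin(2*pi*f*t) the maximum slope is 2*pi*f*A and 1 LSB = 2*A / 2**n_bits,
    so the first-order error in LSBs is pi * f * skew * 2**n_bits.
    """
    return math.pi * f_in_hz * skew_s * 2 ** n_bits


if __name__ == "__main__":
    for skew_ps in (1, 5, 12, 20):
        err = skew_error_lsb(75e6, skew_ps * 1e-12, 8)
        print(f"{skew_ps:>2} ps -> {err:.2f} LSB")
```

Under this approximation, a 12-ps offset at 75 MHz and 8 bits corresponds to a worst-case error of somewhat less than one LSB.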
drives both SHAs. Since the duty cycle of the clock may deviate from 50%, only one of
the edges must be used for the sampling command in both circuits. Figure 3.12
and S2, and two “predictive” control signals, Vodd and Veven, are added to the system. A
master clock, CKmaster, with a frequency twice the sampling rate, is provided to the two
channels through the two switches. The predictive signals Vodd and Veven enable one of
the switches S1 or S2, thus routing the falling edge of CKmaster to either of the SHAs.
The timing mismatch is now equal to the propagation delay mismatch between S1 and
S2, and the two switches inside SHA1 and SHA2, a value that can be maintained well
below 10 picoseconds even with 20% mismatch between the sizes of the switches.
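The benefit of using a single edge of one master clock can be illustrated with an idealized timing model. The sketch below assumes that each SHA samples at half the master clock rate and ignores switch delay mismatch entirely; the function names and the "both-edge" baseline (one half-rate clock whose two edges drive the two SHAs) are illustrative assumptions.

```python
def falling_edges(period, duty, n_cycles):
    """Falling-edge times of a clock that rises at k*period and is high for duty*period."""
    return [k * period + duty * period for k in range(n_cycles)]


def reassigned_instants(period, duty, n_cycles):
    """Route successive falling edges of the master clock alternately to SHA1 and SHA2."""
    edges = falling_edges(period, duty, n_cycles)
    return edges[0::2], edges[1::2]


def both_edge_instants(period, duty, n_cycles):
    """Baseline: a half-rate clock whose falling edge drives SHA1 and rising edge SHA2."""
    slow = 2 * period
    sha1 = [k * slow + duty * slow for k in range(n_cycles // 2)]
    sha2 = [(k + 1) * slow for k in range(n_cycles // 2)]
    return sha1, sha2


def spacings(sha1, sha2):
    """Intervals between consecutive sampling instants of the combined system."""
    merged = sorted(sha1 + sha2)
    return [b - a for a, b in zip(merged, merged[1:])]


if __name__ == "__main__":
    print(spacings(*reassigned_instants(1.0, 0.4, 8)))
    print(spacings(*both_edge_instants(1.0, 0.4, 8)))
```

With a 40% duty cycle, the reassigned edges remain uniformly spaced, whereas the both-edge baseline alternates between long and short intervals.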
The timing of Vodd and Veven is quite relaxed so long as they contain the falling
edge of CKmaster with enough margin. Thus, they can be produced by a simple
In reality, each SHA requires both a rising edge and a falling edge to perform
the sample and hold operations. As shown in Fig. 3.13, the falling edges of CK1x and
the rising edges of CK2x are alternately applied to the SHAs, while the rising edges of
CK1x and the falling edges of CK2x are discarded.
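[Figure 3.12: Clock edge reassignment (labels: CKmaster, switches S1 and S2, control signals Vodd and Veven, SHA1, SHA2, Vin; timing diagram of CKmaster, Vodd, and Veven with edges routed to SHA1 and to SHA2).]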
The actual sequence of operation is as follows: during phase 1, the falling edge
of CK1x is routed to SHA1 and the rising edge of CK2x to SHA2. During phase 2, the
states of CK1 and CK2 are stored, and during phase 3, the falling edge of CK1x is re-
This concept can be easily extended from two channels to three or more
channels. As discussed in Section 6.2, the front-end sample-and-hold circuit used in this
[Figure 3.13: Three phases (1, 2, 3) of clock edge reassignment for SHA1 (labels: A1, CK1x, CK1), with the resulting CK1 and CK2 waveforms and edge spacing T.]
3.5 Reinterpolation
As mentioned in Chapter 2, an important benefit of interpolation is the reduction
of the differential nonlinearity resulting from the offset of the preamplifiers [30].
depicted in Fig. 3.14(a), the original outputs (VA’s) from the preamplifiers are fed into
(VB’s) which, though different from VA’s, contain sufficient information to represent the
original analog input signal. If the offset components of the adjacent VA’s are
uncorrelated, the standard deviation of the offsets of the corresponding VB’s is reduced
by a factor of the square root of 2. As shown in Fig. 3.14(b), INLA or INLB is defined
as the maximum offset error of the zero crossings of VA’s or VB’s respectively. If only
the interpolated zero crossings, VB’s, are sensed by the following stages and the original
zero crossings, VA’s, are discarded, then the overall INL is reduced by approximately
30%.
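The square-root-of-2 reduction can be checked with a short Monte Carlo sketch. It models only the averaging of adjacent, uncorrelated Gaussian offsets, B = (A_j + A_j+1)/2, and is not a reproduction of the simulations reported in this work; the function names, seed, and sample sizes are arbitrary choices.

```python
import math
import random


def rms(values):
    """Root-mean-square value of a list (the offsets are zero-mean by construction)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def offset_ratio_after_reinterpolation(n_preamps=16, sigma=1.0, n_trials=2000, seed=1):
    """Ratio of the rms offset of the VB outputs to that of the VA outputs."""
    rng = random.Random(seed)
    va, vb = [], []
    for _ in range(n_trials):
        offsets = [rng.gauss(0.0, sigma) for _ in range(n_preamps)]
        va.extend(offsets)
        # Reinterpolation: each VB is the average of two adjacent VA outputs.
        vb.extend((a + b) / 2 for a, b in zip(offsets, offsets[1:]))
    return rms(vb) / rms(va)


if __name__ == "__main__":
    ratio = offset_ratio_after_reinterpolation()
    print(f"rms(VB)/rms(VA) = {ratio:.3f}, reduction = {100 * (1 - ratio):.0f}%")
```

The ratio approaches 1/sqrt(2), a reduction of about 29%, which matches the approximate 30% figure stated above.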
Figure 3.15 plots the maximum INL with and without reinterpolation as
predicted by Monte Carlo simulations, confirming the theoretical result. The reduction
of the INL translates into a higher tolerance of offsets in the preamplifiers, allowing
smaller input devices and a two-fold reduction in the capacitance seen by the buffer
stage of the pipeline, where the cumulative gain is still low; in stages 2 through 5 all
zero crossings are utilized. Thus, reinterpolation is obtained at the cost of a few
[Figure 3.14: (a) Reinterpolation: preamplifier outputs VA1 to VA3 feed reinterpolation amplifiers producing VB1 and VB2, followed by interpolation amplifiers; (b) INLA and INLB versus Vin, with B1 = (A1 + A2)/2.]
[Figure 3.15: Offset averaging mechanism through interpolation by 2. Panel 1 plots v(va1) to v(va16) with a vertical axis from −20 mV to 20 mV; panel 2 plots v(vb1) to v(vb16x) with a vertical axis from −12 mV to 10 mV. Horizontal axis in both panels: voltage from −1 V to 1 V.]
3.6 Effect of Nonlinearity in Sliding Interpolation
While the first stage of interpolation by a factor of two is quite insensitive to the
are susceptible to nonlinearity in each differential pair. Figure 3.16 illustrates the effect.
Curves A and B are the original characteristics with the zero-crossing points at V0 and
V2. After the first 2x interpolation, curve C is generated with a zero crossing at V1 and a
slope of one half of the original one. If one more 2x interpolation is applied between
curves B and C as shown in the circled area in Fig. 3.16, the resulting zero-crossing
point should ideally fall midway between V1 and V2, i.e., at Vid. In practice, however,
the actual zero-crossing point, Vact, deviates from Vid because B and C exhibit different
In the worst case, curve A is flat for Vin > V1 and the slope of curve C is equal
to one half of that of B. Through a simple derivation, it can be shown that δ = (V2 − V1)/6
and hence curve D suffers from a DNL of 1/3 LSB. In order to further increase the
extended accordingly.
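The worst-case result can be verified with exact arithmetic. The sketch assumes, as stated above, that curve B has slope s with a zero crossing at V2 and curve C has slope s/2 with a zero crossing at V1, and that the interpolated curve D is proportional to B + C; the function name is hypothetical.

```python
from fractions import Fraction


def worst_case_second_interpolation(v1, v2):
    """Return (Vid, Vact, delta, DNL in LSB) for interpolation between curves B and C."""
    v1, v2 = Fraction(v1), Fraction(v2)
    # D = B + C = 0  ->  s*(v - V2) + (s/2)*(v - V1) = 0  ->  v = (2*V2 + V1)/3
    v_act = (2 * v2 + v1) / 3
    v_id = (v1 + v2) / 2  # ideal midpoint
    delta = v_act - v_id
    lsb = (v2 - v1) / 2  # spacing of zero crossings after this interpolation
    dnl = ((v_act - v1) - lsb) / lsb
    return v_id, v_act, delta, dnl


if __name__ == "__main__":
    v_id, v_act, delta, dnl = worst_case_second_interpolation(0, 6)
    print(v_id, v_act, delta, dnl)
```

For any V1 and V2, the deviation evaluates to (V2 − V1)/6 and the DNL to 1/3 LSB, in agreement with the values quoted in the text.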
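[Figure 3.16: Effect of nonlinearity: curves A, B, and C versus Vin with zero crossings at V0, V2, and V1; in the enlarged region, interpolation between B and C yields curve D, whose actual zero crossing Vact deviates from the ideal position Vid by δ.]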
Chapter 4
4.1 Introduction
In this chapter, the design of the ADC’s building blocks as well as various layout
considerations are discussed. All of the analog signal paths are implemented in
differential form to achieve a wide dynamic range and high immunity to common-mode
noise. For the sake of simplicity, some of the circuits are drawn in single-ended form.
In order to achieve fast settling, this circuit uses a simple top-plate sampling method and
[Figure 4.1: Front-end SHA (labels: Vin, switches S1 to S4 clocked by CK1a and CK1b, capacitors C1 and C2, node X, source follower M1, Vout, VDD).]
C1 and C2 to Vin whereas the source follower is shared between the two channels. Thus,
gain and offset mismatches arise primarily from the charge injection mismatches of S1-
S4. The n-well of the source follower is tied to its source to suppress nonlinearity and
gain error due to body effect. Simulations indicate that two such followers operating
nonlinearity but it is partially cancelled by the charge absorbed by S2 and S4. Also,
differential operation as well as large sampling capacitors (1 pF) improve the overall
The finite input capacitance of the source follower results in an equivalent
resistor connected between the outputs of the two channels, yielding a gain roll-off at
high frequencies. From another perspective, the capacitance seen at node X and
switches S2 and S4 form a switched-capacitor low-pass filter. With proper design, this
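[Figure 4.2: (a) Clock CK with sample and quantize/MUX phases; (b) clock period divided into three time slots; (c) interleaving clocks CKa, CKb, and CKc.]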
interleaving. This is because the sampling phase is considerably faster than the hold/
[Fig. 4.2(a)]. Since the duty cycle deviates substantially from 50%, it is difficult to
employ dual-channel interleaving without any “dead” time. To resolve this issue, the
clock period is divided into three equal time slots: one for front-end sampling, one for
sub-ADC (coarse quantization), and one for multiplexing [Fig. 4.2(b)]. The timing
diagram of Fig. 4.2(c) is then used to interleave three sampling capacitors. To generate
the time slots with reasonable accuracy, the 150-MHz clock is divided by 3 on the chip.
[Figure 4.3: Main and replica SHAs (labels: Vin, source followers M1m and M1r, switches S1m to S6m and S1r to S6r, capacitors C1m to C3m and C1r to C3r, VDD).]
mentioned before, the replica is scaled down by a factor of two with respect to the main
SHA. The switches connected between Vin and the sampling capacitors use the same
timing sequence in both the main and the replica SHAs. However, the switches
connected between the sampling capacitors and the PMOS source followers have a
different timing sequence in the two SHAs. For each channel in the main SHA, the
operation sequence is: (1) sampling, (2) holding, (3) holding and connecting the held
sample to the PMOS source follower. On the other hand, the replica operates in a
slightly different sequence: (1) sampling, (2) holding and connecting the held sample
to the PMOS source follower (whose output is then sensed by the first sub-ADC), (3)
holding.
stage and simple differential pair in the subsequent stages. The resistors used in the
As shown in Fig. 4.4, the preamplifier consists of two NMOS differential pairs
zero crossings, the design of the preamplifiers requires special attention to several
issues. First, the input-referred offset of the circuit must be less than 1/4 LSB so that it
does not degrade the overall DNL and INL significantly. The offset arises from three
sources: mismatch between the input transistors, mismatch between the load resistors,
and mismatch between the tail current sources. The mismatch of the differential pair
typically dominates the overall offset.
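As a rough numerical illustration of the 1/4-LSB requirement, the sketch below converts it to volts. It assumes that the full-scale range equals the 1.6-Vpp differential input swing reported in the measurements and that the LSB refers to the overall 8-bit resolution; both are assumptions made for this example only.

```python
def lsb_volts(full_scale_vpp: float, bits: int) -> float:
    """Size of one LSB for an ideal converter with the given full scale."""
    return full_scale_vpp / (2 ** bits)


def offset_budget_volts(full_scale_vpp: float, bits: int,
                        fraction: float = 0.25) -> float:
    """Largest tolerable input-referred offset, as a fraction of one LSB."""
    return fraction * lsb_volts(full_scale_vpp, bits)


if __name__ == "__main__":
    # Assumed values: 1.6 Vpp differential full scale, 8 bits.
    print(f"LSB    = {lsb_volts(1.6, 8) * 1e3:.3f} mV")
    print(f"budget = {offset_budget_volts(1.6, 8) * 1e3:.3f} mV")
```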
[Figure 4.4: preamplifier schematic. Labels: VDD, Vout+, Vout−, Vin+, Vin−, Vir+, Vir−. Component values: R1a, R2a: 4 x 200 Ω; M1a to M4a: 8 x 12/0.6; Rc1a, Rc2a: 200 Ω; I1a to I4a: 0.4 mA (4 x 12/1.5).]
By virtue of reinterpolation, the tolerable offset is 30% higher and, with the aid
of the data in [39], the dimensions of M1a - M4a are chosen as W/L = 100 µm/0.6 µm.
This results in a total gate area of 60 µm², about one half of that reported in [19]. The
matching requirement of the output resistors is alleviated by the gain of the preamplifier
and with proper layout. The mismatch of the tail current sources is reduced significantly
The second issue relates to the gain of the preamplifier. The gain is chosen to be
around two as a trade-off between gain, linearity, and speed. Finally, although the
output of preamplifiers has a small swing, about 200 mV single-ended, the
This common-mode constraint limits the overdrive voltage of the input devices and the
tail current source. The current densities must therefore be low enough to consume a
reasonable headroom. This is possible for the current source but not for the input
transistors as their linearity determines the DNL and INL in subsequent interpolation
stages.
The reinterpolating and interpolating amplifiers have the same topology but
different device dimensions and bias currents. Figure 4.5 shows the details.
[Figure 4.5: reinterpolating/interpolating amplifier schematic. Labels: VDD, R1b, R2b, Vout+, Vout−, Vin+, Vin−, M1b, M2b, Rc1b.]
4.4 Comparator
The design of the comparators used in the sub-ADCs directly impacts the speed
and power dissipation of the overall converter. Shown in Fig. 4.6 is the high-speed
comparator utilized in the first stage sub-ADC. When CK is low, Sb1 and Sb2 are off. All
the p-switches (S1 - S4) are on and the four internal nodes (P, Q, X, and Y) are pulled up
to VDD with the aid of two equalization switches (Seq and Seqx), placing the comparator
in the reset mode. When CK goes high, Sb1 and Sb2 turn on and M1 - M4 compare the
positive input voltage, Vin+, with the positive reference voltage, Vr+, and the negative
input voltage, Vin−, with the negative reference voltage, Vr−. When all of the reset and
equalization PMOS switches are off, the cross-coupled inverters (M5 - M8)
regeneratively amplify the difference between the inputs to rail-to-rail levels. The
digital outputs are buffered by inverters and then fed to the control circuit.
This comparator offers three important advantages over other topologies. First,
the static power dissipation is zero. When CK is low, no static current flows through the
circuit. When CK is high, M7 and M8 ensure that the current is zero. Second, the
comparator requires only a single-phase clock, greatly simplifying the routing of the
[Figure 4.6: high-speed comparator used in the first-stage sub-ADC. Labels: CK, S1 to S4, Seq, Seqx, Sb1, Sb2, M1 to M8, Inv1, Inv2, Cf+, Cf−, nodes P, Q, X, Y, inputs Vin+, Vin−, Vr+, Vr−, outputs Vout+, Vout−.]
The third property of the comparator is that the effect of the offsets due to the
cross-coupled transistors is reduced by the dynamic gain of the input stage. This effect
This is because when CK goes high, nodes X, Y, P, and Q are precharged to VDD
and M5 - M8 are off. The input difference is therefore amplified by M1 - M4 and the
parasitic capacitances at nodes X and Y until Vx and Vy drop below VDD by VTHN. At
this point, M7 and M8 turn on but M5 and M6 are still off. The amplification then
continues while M5 and M6 contribute a small regenerative gain until M5 and M6 turn
to the input-referred offset. With the device dimensions chosen in this design,
and that of M7 and M8 by a factor of 2. Since the channel area of M7 and M8 is about
one fourth of that of the input devices, they contribute roughly equal amounts of input-
produced at the beginning of reset and regeneration modes. This effect is particularly
critical in the first stage and can introduce significant dynamic offsets, saturating the
second stage and creating nonlinearity. Adding a pair of cross-coupled capacitors with
proper value (around 8 fF) at the input reduces the kickback noise to an acceptable level.
As shown in Fig. 4.7, the comparators in stages 2 to 5 are basically the same as
that in the first stage, except for the input network. The multiplexers consisting of Si1 -
Si4 select the even- or odd-channel signals in a dual-channel interleaving mode. Due to
the accumulative gain after stage 1, larger comparator offsets can be tolerated in stages
2 to 5. Therefore, the input differential pair uses W/L = 5.4 µm/0.6 µm.
[Figure 4.7: comparator used in stages 2 to 5. Labels: CK, CK_2ec, S1 to S4, Si1, Si3, M1 to M6, Inv1, Inv2, inputs Vie+, Vie−, outputs Vout+, Vout−.]
Unlike the first stage, the comparators in stages 2 to 5 do not share the same
input line because they are driven by the interpolation amplifier outputs. Thus, the
4.5 Clock Edge Reassignment
The clock edge reassignment (CERA) circuit for a dual-channel system is
shown in Fig. 4.8. With proper control signals, Spe and Spo pass the rising edges and Sno
and Sne pass the falling edges of the master clock to SHA1 and SHA2, respectively.
When Vodd is high, Sno and Spo are on, allowing SHA1 to receive a falling edge from A2
and SHA2 a rising edge from A1. When Veven is high, the reverse occurs. The falling
Figure 4.8 Clock edge reassignment circuit for a dual-channel system. [Labels: CK, A1, A2, CK1x, CK2x, Spe, Spo, Sno, Sne, Vodd, Veven.]
The operation of the CERA circuit is further illustrated in Fig. 4.9. The circuit
operates in two “pass” modes and one “block” mode. During the block mode, the clock
signals inside SHA1 or SHA2 are stored on the parasitic capacitance at each node. The
well.
[Figure 4.9: operation of the CERA circuit. Left panel: Vodd = High and Veven = Low (Spo and Sno shown). Right panel: Vodd = Low and Veven = High (Spe and Sne shown). Waveforms: Vodd, Veven, CK1, CK2, with modes labeled 1, 2, 3 and a holding interval.]
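The switch assignment of the CERA circuit can be summarized in a small behavioral sketch. The function names are hypothetical, and two points are assumptions of this model rather than statements from the text: that both SHAs are blocked when Vodd and Veven are both low, and that the two controls are never high at the same time.

```python
def cera_switches(v_odd: bool, v_even: bool) -> dict:
    """On/off state of the four pass switches for the given control signals."""
    if v_odd and v_even:
        # Assumption: the two control signals are mutually exclusive.
        raise ValueError("Vodd and Veven must not be high simultaneously")
    return {"Spe": v_even, "Sne": v_even, "Spo": v_odd, "Sno": v_odd}


def cera_edges(v_odd: bool, v_even: bool) -> dict:
    """Master-clock edge passed to each SHA, or None in the block mode."""
    sw = cera_switches(v_odd, v_even)
    sha1 = "rising" if sw["Spe"] else "falling" if sw["Sno"] else None
    sha2 = "rising" if sw["Spo"] else "falling" if sw["Sne"] else None
    return {"SHA1": sha1, "SHA2": sha2}


if __name__ == "__main__":
    print("Vodd high :", cera_edges(True, False))
    print("Veven high:", cera_edges(False, True))
    print("block     :", cera_edges(False, False))
```

When neither control is asserted, the model returns None for both SHAs, standing in for the block mode in which the previous clock levels are held on the parasitic node capacitances.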
4.6 Control and Decode Circuit
As shown in Fig. 4.10, the control circuit (NAND_FF) senses the outputs of two
adjacent comparators to generate three control signals, two applied to the MUX, and
one to the ROM. The two-input NAND gate performs 1-of-n encoding and its output
drives a D-type flip-flop, which produces the digital output at the falling edge of CK.
With the assertion of either CKey or CKoy , the control signal is routed to either the even-
to 5.
[Figure 4.10: control circuit (NAND_FF) driven by two adjacent comparators (CMP). Labels: Oa2+, Oa2−, Odax2+, Odax2−, Odax1+, D1, D2, D, Q, Oe2−, Oo2− (to MUX), Od2 (to ROM), CK, CKey, CKoy.]
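The 1-of-n encoding performed on adjacent comparator outputs can be illustrated as follows. This is a generic sketch of thermometer-to-1-of-n conversion, not a gate-level description of NAND_FF: the signal polarities, the implied zero above the top comparator, and the function names are assumptions.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return int(not (a and b))


def one_of_n(thermometer) -> list:
    """Active-high select lines: line i is 1 where the code steps from 1 to 0.

    Assumption: a 0 is implied above the topmost comparator.
    """
    t = list(thermometer) + [0]
    return [int(t[i] and not t[i + 1]) for i in range(len(thermometer))]


def one_of_n_active_low(thermometer) -> list:
    """Same selection expressed with a NAND, so the chosen line is 0."""
    t = list(thermometer) + [0]
    return [nand(t[i], 1 - t[i + 1]) for i in range(len(thermometer))]


if __name__ == "__main__":
    code = [1, 1, 1, 0, 0]
    print(one_of_n(code))
    print(one_of_n_active_low(code))
```

Each output depends only on two neighboring comparators, which is why a single two-input gate per slice is sufficient.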
The detailed circuit of NAND_FF is shown in Fig. 4.11. The core is based on a
TSPC flip-flop structure [38]. The dual-input NAND is merged into the input stage of
the D-FF, while the two interleave-control NANDs are combined with the D-FF output
stage. INV3 and INV4 are scaled to drive the heavy capacitive load inside MUXs with
reasonable delay.
[Figure 4.11: detailed circuit of NAND_FF. Labels: D1, D2, Oe−, Od, ND, Inv1, Inv3, M1 to M14. Device dimensions: M1 - M3: 1.8/0.6; M4 - M5: 3.0/0.6; M6 - M7: 2.4/0.6; M8: 1.2/0.6; M9 - M12: 1.8/0.6; M13 - M14: 3.6/0.6; Inv1,2: p 4.8/0.6, n 2.4/0.6; Inv3,4: p 19.2/0.6, n 9.6/0.6; ND: p 4.8/0.6, n 2.4/0.6.]
4.7 ROM and Output Stage
The ROM consists of a dynamic digital circuit with a precharging PMOS as
[Figure: ROM and output stage. Labels: ROM, Din, Dx, Dout, PAD, CKc, D flip-flops. Device dimensions: Mp: 9.0/0.6; Mx: 2.4/0.6; Mn: 14.4/0.6; Mo: 18/0.6; Ro: 100 Ω.]
Din. When CKc goes high, Mp turns off and Mn turns on. The output Dx is then evaluated
and fed to the following pipelined register array and eventually the output driver. The
6 mA. The current is drawn from an off-chip termination resistor of 100 Ω , generating
a voltage swing of 600 mV, a value sufficient for driving off-chip ECL buffers. The
small voltage swings allow sharp edges in the output data waveforms even with the
One dedicated ground pad is used for all of the output drivers to ensure that the
large ground bounce does not disturb the sensitive analog sections.
shown in Fig. 4.13, the clock generator contains four building blocks: two divide-by-
two circuits (DIV2a and DIV2b), one divide-by-three circuit (DIV3), and an output
The 300-MHz differential master clock signals, CKin+ and CKin−, drive DIV2a and
the output flip-flop of DIV2b. DIV2a produces 150-MHz outputs that are applied to
DIV2b and DIV3. DIV2b generates 75-MHz clocks required for interleaving the
interpolative stages and DIV3 produces 50-MHz clocks used in the triple-channel front-
end SHA.
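The frequency plan of the divider chain can be checked with a few lines of arithmetic. The function name and dictionary keys are illustrative; the division ratios and the 300-MHz master clock are those given in the text.

```python
from fractions import Fraction


def clock_tree_mhz(master_mhz: int = 300) -> dict:
    """Output frequency of each divider, in MHz."""
    div2a = Fraction(master_mhz, 2)   # master clock divided by two
    div2b = div2a / 2                 # second divide-by-two, fed by DIV2a
    div3 = div2a / 3                  # divide-by-three, also fed by DIV2a
    return {"DIV2a": div2a, "DIV2b": div2b, "DIV3": div3}


if __name__ == "__main__":
    for name, freq in clock_tree_mhz().items():
        print(f"{name}: {float(freq):.0f} MHz")
```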
The BUF section generates CKa1, CKa2, CK_3a, CK_3b, CK_3c for the front-end
SHAs, CK_2ec, CK_2oc, CK_2ey, CK_2oy for the dual-channel interleaving interpolative
stages, and CK and CKc for the comparators and the pipelining registers. The BUF
section is actually laid out in different parts of the chip, in proximity to the related
sections.
[Figure 4.13: clock generator, comprising DIV2a, DIV2b, DIV3, and local BUFs built from D flip-flops and scaled inverters. Input: differential CKin. Buffered outputs: CKa1, CKa2, CK_3a, CK_3b, CK_3c, CKc, CK, CK_2ec, CK_2oc, CK_2ey, CK_2oy.]
Figure 4.14 shows the high-speed differential D-type latch used in the two
divide-by-two circuits. With the device dimensions shown here, the latch operates at
Figure 4.14 High-speed differential D latch. [Labels: D, Q, CKc, M1 to M6, Mn. Device dimensions: M1 - M6: 3.0/0.6; Mn: 12/0.6.]
reinterpolation scheme does require tighter linearity. Hence, the differential amplifiers
in the signal path employ resistive degeneration. The actual design is fully differential.
It is also important to note that the converter requires no floating capacitors and
[Figure labels: Even Channel, Odd Channel, VDD, Vin, Slide Command, Control Logic, Interleave Command.]
Figure 4.15 Realization of a slice of the signal path in the first stage.
The entire ADC is simulated at the transistor level by StarSim (previously called
ADM), a SPICE-like simulator. The result for typical process parameters and at room
[Figure: simulated ADC output. Panel (a): spectrum in dB versus frequency (MHz), with fsamp = 153.8462 MHz, fin = 37.2596 MHz (31/128 * fsamp), SNDR = 43.148 dB, HD3 = -51.6222 dB. Panel (b): reconstructed waveform in volts versus sample index.]
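The input frequency quoted for the simulation is a rational fraction of the sampling rate, which is the usual way of placing a test tone exactly on an FFT bin. The sketch below reproduces that arithmetic; treating 128 as the record length and 31 as the number of input cycles is an assumption based on the ratio 31/128, and the function name is hypothetical.

```python
from math import gcd


def coherent_fin(fs_mhz: float, cycles: int, record_len: int) -> float:
    """Input frequency that fits `cycles` whole periods into `record_len` samples.

    The cycle count and the record length are required to be coprime so that
    every sample falls on a distinct phase of the input.
    """
    if gcd(cycles, record_len) != 1:
        raise ValueError("cycles and record_len should be coprime")
    return fs_mhz * cycles / record_len


if __name__ == "__main__":
    fin = coherent_fin(153.8462, 31, 128)
    print(f"fin = {fin:.4f} MHz")
```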
4.10 Floor Plan And Layout Considerations
The floor plan and layout of the ADC must deal with issues such as: routing of
critical paths, power and ground isolation, noise coupling from the digital sections to
the analog sections, etc. Due to the nature of sliding interpolation, the high-speed digital
control signals must travel through the analog sections. Also, the sub-ADCs in stages
2 to 5 must be embedded with the interpolating stages. These issues underscore the
Figure 4.17 shows the floor plan of the ADC. In order to reduce the wiring
capacitance in the critical path, the front-end building blocks in the first stage (CMP_A,
reference ladder, preamplifiers, MUX, and the distributed sample-and-hold) are folded
into a U shape. The front-end SHA output and the reference ladder are routed between
the comparator bank (CMP_A) and the preamplifier bank. The reference ladder is made
taps. Each preamplifier provides an empty stripe so that the digital control signals from
CMP_A to MUX can run through it without interfering with the analog signal path. The
digital signals have also been shielded on both sides with analog ground along the entire
path. The ROM generates the four corresponding digital bits in the first stage.
[Figure 4.17: floor plan of the ADC. Blocks: Clock Generator, SHA, Preamp, CMP_A, ROM, MUX and Distributed Sampling, Interpolation Amp, Amp2, CMP_B. Slices are numbered and marked even (e) or odd (o).]
The MUX outputs are connected to metal-3 lines running vertically. Each
differential analog output pair is shielded by analog VDD lines on both sides, providing
isolation and forming the sampling capacitor as well. The even and the odd channels
are uniformly distributed within this building block. Amp2 performs reinterpolation
and contains 5 dual-channel sets which sense the outputs of the MUX and subsequently
A more detailed diagram of the first stage is shown in Fig. 4.18. Note that the
strategy is chosen because preamplifiers 1 and 15 share the same reference voltages,
etc.
lower and upper banks in Fig. 4.19 and Fig. 4.20, respectively. The sliding/interleaving
command reaches a unit cell (in the middle column) in each slice from the left and then
connects to the cells in the adjacent two slices above and below.
MUX and distributed sampling circuit. All of these circuits are in differential and dual-
channel form. Only three comparators are required in CMP_B to decide which sections
are needed to provide the zero-crossing information to the following stage. The ROM
creates the two bits based on the results of the comparators. Stages 3 to 5 are identical
to stage 2.
Since all of the switches in the MUX are PMOS devices, a large n-well is used
to accommodate them. With properly-spaced substrate contacts, the n-well isolates the
[Figure 4.18: detailed diagram of the first stage. Labels: Vref−, Vref+, Vin−, Vin+, Vo1 to Vo3 (even and odd), unit cell, Cntleven, Cntlodd; slices 13 to 15 shown.]
[Figures 4.19 and 4.20: lower and upper banks. Labels: Vo2 to Vo4 (even and odd); slices 13 to 15 shown.]
[Figure 4.21: die photo. Blocks: Clock Generator, SHA, Interpolation Amp, Ref Ladder, ADC1, ADC2. Pins: CKin+, CKin−, Vin+, Vin−.]
Fig. 4.21 shows the die photo. The chip size is 1.5 mm x 1.2 mm, with an active
area of about 1.2 mm². The analog differential input signals, Vin+ and Vin−, enter from the
left side of the chip and are shielded with a common VDD in metal 2. Digital outputs
leave the chip from the lower and the right sides of the chip. Three different power lines
are used in this layout, one for the analog section, one for the digital section, and one
The front-end SHA is placed at the top-left corner, right above the reference
ladder, so that its outputs readily reach the preamplifiers and the first sub-ADC. The
high-speed (300-MHz) input clocks, CKin+ and CKin−, and the clock generator are
The modularity of the design can be seen in stages 2 to 5. The resulting layout is
Chapter 5
Experimental Results
5.1 Introduction
In this chapter, the test setup and the experimental results obtained from a
converter at sampling rates greater than 100 MHz entails many challenges, requiring
great care in the design of the test board and the setup. In order to avoid the parasitics
with changing the device under test. The raw digital output data of the ADC is collected
by a logic analyzer and subsequently fed into a personal computer for error correction
and performance analysis. MATLAB is used to characterize the results of both low-
frequency and dynamic tests. INL and DNL are measured at low input frequencies (still
with a sampling rate of 150 MHz), while SNDR and SFDR are obtained for input
As shown in Fig. 5.1, the chip (bare die) is mounted in the middle of the central cavity on
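[Figure 5.1: test board with the bare die in the central cavity. Trace labels: NB1A, NB1B, NB1C, AVdd2, VR8, ICK+, ICK−, GND, CVdd, DVdd1, DVdd2, CKout, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]
[Figure 5.2: zoom-in around the central cavity. Labels: Bootcntl, VIN+, VIN−, VR18, VR_2, AVdd2, NB1A, NB1B, NB1BC, NB1C, NB1S, VR8, ICK+, ICK−, Down Bond, GND, CVdd, DVdd1, DVdd2, CKout, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]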
The analog inputs are fed from the top of the board, while the complementary
clocks enter from the right with 50-Ω termination resistors. The high-speed
digital outputs are placed on the bottom side of the board. The solid dots are through-
holes which connect the top ground areas to the bottom ground plane. Since there are
no protection diodes on the pads inside the chip, discrete diodes are used for all the
biasing nodes on the board to minimize the probability of damage due to electrostatic
discharge. Chip capacitors with values ranging from 1 nF to 0.1 nF are soldered from
bias lines to the ground traces in order to bypass the high-frequency noise.
The zoom-in diagram around the central cavity is shown in Fig. 5.2. The cavity
is a ground plane with a large through-hole in the middle, right beneath the chip, so that
the inductance of the ground node can be minimized. All of the ground pads are down-
bonded to this common ground plane. The traces for CVdd, DVdd1, and DVdd2 (the Vdd
lines for the first sub-ADC and the digital circuits) are extended into the cavity area in
[Figure 5.3: actual board size, with marked dimensions of 1.5” and 2”.]
The actual board size is as shown in Fig. 5.3. All the through holes are drilled
and some of the critical bypass chip capacitors (near the central cavity) are soldered
before the bare die is glued and bonded. Then, it comes to the most stressful part: using
a soldering iron to solder all other passive components while the fragile bond wires are
sitting in the middle. Any slight touch on one bond wire could easily break it.
Unfortunately, the soldering work cannot be completely done before the bonding
because our bonding machine requires that the back side of the board be flat.
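[Figure 5.4: board configuration with a detachable central section. Trace labels: NB1A, NB1B, NB1BS, NB1C, NB1S, AVdd2, VR8, Bias_CK, ICK+, ICK−, Gnd, CVdd, DVdd, DVdd2, Vup, CKout, and digital outputs A0 to A3, B0, B1, C0, C1, D0, D1, E0, E1.]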
configuration shown in Fig. 5.4 is used. It is similar to a common chip-on-board
assembly, except that the central part of the board is made of a smaller detachable board
(a “daughter board”) which contains the mounted die and several critical bypass
capacitors between the bias lines and the ground plane. The two boards are connected
through thin copper-foil stripes (with 1~3 mil thickness), which have small inductance
Once all of the stripes on a daughter-board are de-soldered, the daughter board
can be detached from the mother board, just like the case with a regular package.
Therefore, one mother board can be used for many daughter boards, reducing the
overhead work.
Off-chip ECL buffers are used in order to reliably read out the digital data
produced by the chip at the clock rate of 150 MHz. The ECL buffers are necessary here
to drive the input channels of the logic analyzer with reasonable rise-time and fall-time
and voltage swings. The ECL buffers are mounted on a different board to lower noise
Figure 5.5 shows the layout of the boards used in the setup.
[Figure 5.5: layout of the boards used in the setup. Labels: Daughter Boards, Mother Board, ECL Boards.]
5.3 Test Setup
A proper test setup is crucial to measuring the true performance of the ADC.
[Figure: test setup. Labels: Synchronization Signal, Generator, 10 MHz, 12 bits, PC.]
In this setup, four signal generators are synchronized by the 10-MHz Sync
signal and are used to provide the analog input and the three digital clock signals. The
analog input waveform is fed into the chip-on-board assembly through a bias-T and a
low-pass filter. The 300-MHz clock signal is applied to the chip to generate the on-chip
150-MHz master clock. Another off-chip 150-MHz clock is used to trigger the ECL
buffers in order to collect the digital output data from the chip. Since the maximum
operating speed of the logic analyzer is 100 MHz, 2x subsampling is used, requiring a
75-MHz clock to control the logic analyzer. Since the front-end sample-and-hold circuit
Since there are five stages in the ADC, four extra bits are needed for the digital
error correction. Therefore, a total of 12 bits are collected by the logic analyzer through
characterization.
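The bit count can be verified from the per-stage outputs described earlier: four bits from the first stage and two bits from each of stages 2 to 5. The sketch below is only bookkeeping; it does not reproduce the error-correction algorithm itself, and the names are illustrative.

```python
# Bits produced by the ROM of each stage (first stage: 4; stages 2 to 5: 2 each).
STAGE_BITS = (4, 2, 2, 2, 2)
RESOLUTION = 8  # final resolution of the converter, in bits


def raw_bits(stage_bits=STAGE_BITS) -> int:
    """Total number of bits captured by the logic analyzer per sample."""
    return sum(stage_bits)


def extra_bits(stage_bits=STAGE_BITS, resolution: int = RESOLUTION) -> int:
    """Redundant bits available for digital error correction."""
    return raw_bits(stage_bits) - resolution


if __name__ == "__main__":
    print("raw bits  :", raw_bits())
    print("extra bits:", extra_bits())
```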
technology, occupying a total area of 1.2 mm x 1.5 mm and an active area of 1.2 mm2.
The circuit is tested with a 3.3-V supply with differential input swings of 1.6 Vpp and a
Figure 5.7 shows the measured DNL and INL profiles obtained from code
density (histogram) tests using 16 times the total number of codes (4K samples).
After a normalized curve-fitting process, the maximum values of DNL and INL are 0.61
and 1.24 LSB, respectively.
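For readers unfamiliar with the code density test, a minimal sketch of the histogram calculation follows. It assumes an input that is uniformly distributed over the full scale (for example, a slow ramp), so that every code is expected to occur equally often. The input waveform and the normalized curve-fitting step used in the actual measurement are not reproduced here, and the function name is hypothetical.

```python
def dnl_inl_from_codes(codes, n_codes):
    """Estimate DNL and INL (in LSB) from a list of output codes.

    Assumes a uniformly distributed input. The two end codes are discarded
    because they also collect out-of-range samples.
    """
    hist = [0] * n_codes
    for c in codes:
        hist[c] += 1

    inner = hist[1:-1]
    ideal = sum(inner) / len(inner)          # expected hits per code
    dnl = [h / ideal - 1.0 for h in inner]   # relative code-width error

    inl, acc = [], 0.0
    for d in dnl:                            # INL is the running sum of DNL
        acc += d
        inl.append(acc)
    return dnl, inl


if __name__ == "__main__":
    # 8-bit example with 16 hits per code (4K samples), i.e. an ideal converter.
    ideal_codes = [c for c in range(256) for _ in range(16)]
    dnl, inl = dnl_inl_from_codes(ideal_codes, 256)
    print(max(abs(d) for d in dnl), max(abs(i) for i in inl))
```

With 16 hits per code on average, a single miscounted sample changes the DNL estimate of that code by 1/16 LSB, which gives a sense of the resolution of the test at this record length.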
[Plot annotations: DNL = 0.61196 LSB, INL = 1.239 LSB; vertical axes in LSB, horizontal axis: code (0 to 255).]
Figure 5.7 DNL and INL at fin = 1.8 MHz and fsample = 150 MHz.
domain. Figure 5.8 depicts the spectrum of the reconstructed signal at 1.76 MHz,
ratio (SNDR) of 43.7 dB, which corresponds to an effective number of bits (ENOB) of
about 7.
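The ENOB figure follows from the standard relation between SNDR and resolution for a full-scale sinusoidal input. The relation is not quoted in the text and is given here as the assumed basis of the number:

```latex
\[
\mathrm{ENOB} \;=\; \frac{\mathrm{SNDR} - 1.76\ \mathrm{dB}}{6.02\ \mathrm{dB/bit}}
\;=\; \frac{43.7 - 1.76}{6.02} \;\approx\; 6.97\ \text{bits}
\]
```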
The spurious-free dynamic range (SFDR) and SNDR as a function of the analog
input frequency are plotted in Fig. 5.9. SFDR starts from around 50 dB at low
frequencies and reaches about 44 dB at high frequencies. SNDR is about 43.7 dB at low
frequencies and about 40 dB (ENOB = 6.5) for frequencies above 40 MHz.
[Figure 5.8: spectrum of the reconstructed signal, magnitude (dB) versus frequency (MHz). Annotations: fsamp = 150 MHz, fin = 1.7578 MHz, SNDR = 43.7506 dB, HD2 = -61.7943 dB, HD3 = -53.2677 dB, HD5 = -50.463 dB.]
Nyquist rate indicates that the clock edge reassignment technique indeed minimizes
[Figure 5.9: SFDR and SNDR (dB) as a function of the analog input frequency.]
higher than expected mainly due to the discrepancy between the target resistance values
and the actual values. Since the sheet resistance of the fabricated poly resistors is about
30% higher than that used in the simulations, more power consumption is required to
Technology                0.6-µm, 1-poly, 3-metal CMOS
Resolution                8 bits
DNL                       0.62 LSB
INL                       1.24 LSB
Sampling Rate             150 MHz
SNDR @ fin = 1.8 MHz      43.7 dB
SNDR @ fin = 70 MHz       40 dB
Analog Input Swing        1.6 Vp-p
Input Capacitance         1.5 pF
Active Chip Area          1.2 mm²
Supply Voltage            3.3 V
Power Consumption
  Analog                  330 mW
  Digital                 53 mW
  Reference Ladder        12 mW
  Total                   395 mW
Table 1: Measurement Summary
Chapter 6
Conclusion and Future Work
This dissertation presents the design work and experimental results of an 8-bit,
150-MHz CMOS ADC. The research introduces a new sliding interpolation ADC
architecture that lends itself to pipelining without the need for interstage DACs or
subtractors. Also, two other circuit techniques, namely clock edge reassignment and
The entire design uses only open-loop circuits in order to maximize the
sample-and-hold circuit are adopted to reduce the settling time in the critical path. The
interpolation eliminates the need for interstage D/A converters, subtractors, and residue
the distributed sampling scheme. Since the converter requires no floating capacitors, it
The ADC avoids the use of op amps, incorporating only source followers and
prototype is functional even with a 2.5-V supply, though it was designed for a 3.3-V
system. The simple and modular design of the pipelining structure results in a compact
layout requiring a core area of only 1.2 mm2 in a 0.6-µm CMOS process.
The fabricated prototype delivered the performance reported here on the first try.
Many aspects of the design can be reexamined and improved in future work.
First, due to a layout error, the gain of the differential pairs in the reinterpolation
stage was equal to one rather than two, leading to a higher input-referred offset voltage.
Second, the kickback noise of the comparators in the first sub-ADC still creates a large
dynamic offset, sometimes pushing the following stage to the edge of the overlap range.
Third, the layout of the resistor ladder must preferably avoid current-carrying contacts
dissertation can be used. For higher resolutions, the charge injection and nonlinearity
issues of the front-end SHA require additional work and the comparator kickback noise
must be reduced. Also, the trade-off between the offset voltage and the gate capacitance
must be relaxed, perhaps through the use of averaging [37]. Moreover, the problem of
carefully. Finally, the power dissipation in the pipelined stages can be scaled down
because the precision requirements become more relaxed as the signal travels through
the pipeline.
Bibliography
[1] R. Gregorian and G. C. Temes, Analog MOS Integrated Circuits for Signal
Processing, John Wiley and Sons, New York, 1986.
[2] B. Razavi, Principles of Data Conversion System Design, IEEE Press, New
York, 1995.
[8] R. J. van de Plassche and P. Baltus, “An 8-b 100-MHz Full-Nyquist Analog-
to-Digital Converter,” IEEE J. Solid-State Circuits, vol. SC-23, pp. 1334-
1344, Dec. 1988.
[11] T. Matsuura, H. Kojima, E. Imaizumi, K. Usui, and S. Ueda, “An 8-b 50-
MHz 225-mW Submicron CMOS ADC Using Saturation Eliminated
Comparators,” Proc. CICC, pp. 6.4/1-4, May 1990.
[13] G. T. Tuttle, S. Fallahi, and A. A. Abidi, “An 8-b CMOS Vector A/D
Converter,” ISSCC Dig. Tech. Pap., pp. 38-39, Feb. 1993.
[16] B. Nauta and A. G. W. Venes, “A 70 MS/s, 110 mW, 8-b CMOS Folding
Interpolating A/D Converter,” ISSCC Dig. Tech. Pap., pp. 276-277, Feb.
1995.
[17] Chung-Yu Wu, Chih-Cheng Chen, and Jhy-Jer Cho, “A CMOS Transistor-
Only 8-b 4.5-MS/s Pipelined Analog-to-Digital Converter Using Fully-
Differential Current-Mode Circuit Techniques,” IEEE J. Solid-State
Circuits, vol. SC-30, pp. 522-532, May 1995.
[19] A. G. W. Venes and R. J. van de Plassche, “An 80-MHz, 80-mW, 8-b CMOS
Folding A/D Converter with Distributed Track-and-Hold Preprocessing,”
IEEE J. Solid-State Circuits, vol. SC-31, pp. 1846-1853, Dec. 1996.
[23] Y. Akazawa et al., “A 400 MSPS 8 b Flash AD Conversion LSI,” ISSCC Dig.
Tech. Pap., pp. 98-99, Apr. 1987.
[24] T. Tsukada et al., “CMOS 8b 25 MHz Flash ADC,” ISSCC Dig. Tech. Pap.,
pp. 34-35, Feb. 1985.
[25] A. G. Dingwall and V. Zazzu, “An 8-MHz CMOS Subranging 8-Bit A/D
Converter,” IEEE J. Solid-State Circuits, vol. SC-20, pp. 1138-1143, Dec.
1985.
[30] C. Lane, “A 10-Bit, 60-MS/s Flash ADC,” Proc. BCTM, pp. 44-47, Sep.
1989.
[33] K. Kusumoto, A. Matsuzawa, and K. Murata, “A 10-b 20-MHz 30-mW
Pipelined Interpolating CMOS ADC,” IEEE J. Solid-State Circuits, vol. SC-
28, pp. 1200-1206, Dec. 1993.
[35] R. J. van de Plassche and P. Baltus, “An 8 b 100 MHz Folding ADC,” ISSCC
Dig. Tech. Pap., pp. 222-223, Feb. 1988.
[36] J. van Valburg and R. J. van de Plassche, “An 8-b 650-MHz Folding ADC,”
IEEE J. Solid-State Circuits, vol. SC-27, pp. 1662-1666, Dec. 1992.
[37] K. Bult and A. Buchwald, “An embedded 240-mW 10-b 50-MS/s CMOS
ADC in 1-mm2,” IEEE J. Solid-State Circuits, vol. SC-32, pp. 1887-1895,
Dec. 1997.
[41] S. K. Tewksbury, et al., “Terminology Related to the Performance of S/H,
A/D, and D/A circuits,” IEEE Trans. Circuits Syst., vol. CAS-25, pp. 419-
426, July 1978.