

Brigham Young University BYU ScholarsArchive

Theses and Dissertations

2004-04-29

# Design and Analysis of Charge-Transfer Amplifiers for Low-Power Analog-to-Digital Converter Applications

William Joel Marble Brigham Young University - Provo


Part of the Electrical and Computer Engineering Commons

### **BYU ScholarsArchive Citation**

Marble, William Joel, "Design and Analysis of Charge-Transfer Amplifiers for Low-Power Analog-to-Digital Converter Applications" (2004). *Theses and Dissertations*. 35. https://scholarsarchive.byu.edu/etd/35

This Dissertation is brought to you for free and open access by BYU ScholarsArchive. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of BYU ScholarsArchive. For more information, please contact scholarsarchive@byu.edu, ellen\_amatangelo@byu.edu.

# DESIGN AND ANALYSIS OF CHARGE-TRANSFER AMPLIFIERS FOR LOW-POWER ANALOG-TO-DIGITAL CONVERTER APPLICATIONS

by

William J. Marble

A dissertation submitted to the faculty of

Brigham Young University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Electrical and Computer Engineering

Brigham Young University

August 2004

Copyright © 2004 William J. Marble

All Rights Reserved

## BRIGHAM YOUNG UNIVERSITY

### GRADUATE COMMITTEE APPROVAL

of a dissertation submitted by

William J. Marble

This dissertation has been read by each member of the following graduate committee and by majority vote has been found to be satisfactory.

| Date | Committee Member       |
|------|------------------------|
|      | Craig S. Petrie, Chair |
|      | Donald T. Comer        |
|      | David J. Comer         |
|      | Brian D. Jeffs         |
|      | Mark L. Manwaring      |

### BRIGHAM YOUNG UNIVERSITY

As chair of the candidate's graduate committee, I have read the dissertation of William J. Marble in its final form and have found that (1) its format, citations, and bibliographical style are consistent and acceptable and fulfill university and department style requirements; (2) its illustrative materials including figures, tables, and charts are in place; and (3) the final manuscript is satisfactory to the graduate committee and is ready for submission to the university library.

Date

Craig S. Petrie Chair, Graduate Committee

Accepted for the Department

Michael A. Jensen Graduate Coordinator

Accepted for the College

Douglas M. Chabries Dean, College of Engineering and Technology

### ABSTRACT

# DESIGN AND ANALYSIS OF CHARGE-TRANSFER AMPLIFIERS FOR LOW-POWER ANALOG-TO-DIGITAL CONVERTER APPLICATIONS

William J. Marble

Electrical and Computer Engineering

Doctor of Philosophy

The demand for low-power A/D conversion techniques has motivated the exploration of charge-transfer amplifiers (CTAs) to construct efficient, precise voltage comparators. Despite notable advantages over classical, continuous-time architectures, little is understood about the dynamic behavior of CTAs or their utility in precision A/D converters. Accordingly, this dissertation presents several advancements related to the design and analysis of charge-transfer amplifiers for low-power data conversion.

First, an analysis methodology is proposed which leads to a deterministic model of the voltage transfer function. The model is generalized to any timing scheme and can be extended to account for nonlinear threshold modulation. The model is compared with simulation results and test chip measurements, and shows good agreement over a broad range of circuit parameters.

Three new charge-transfer amplifier architectures are proposed to address the limitations of existing designs: first, a truly differential CTA which improves upon the pseudo-differential configuration; second, a CTA which achieves more than a 10x reduction in input capacitance with a moderate reduction in common-mode range; third, a CTA which combines elements of the first two but also operates without a precharge voltage and achieves a nearly rail-to-rail input range. Results from test chips fabricated in 0.6 $\mu$m CMOS are described.

Power dissipation in CTAs is considered and an idealized power consumption model is compared with measured test chip results. Four figures of merit (FOMs) are also proposed, incorporating power dissipation, active area, input charging energy and accuracy. The FOMs are used to compare the relative benefits and costs of particular charge-transfer amplifiers with respect to flash A/D converter applications.

The first 10-bit CTA-based A/D converter is reported. It consumes a low dynamic power of 600 $\mu$W/MSPS from a 2.1 V supply, 40% less than the current state of the art of 1 mW/MSPS. This subranging-type converter incorporates capacitive interpolation to achieve a nearly ideal comparator count and power consumption. A distributed sample-and-hold (S/H) eliminates the need for a separate S/H amplifier. A test chip, fabricated in 0.6 $\mu$m 2P/3M CMOS, occupies 2.7 mm<sup>2</sup> and exhibits 8.2 effective bits at 2 MSPS.

### ACKNOWLEDGMENTS

I wish to express my sincere appreciation to my research advisor, Dr. Craig Petrie, for his guidance and support during this effort. It has been a great pleasure to learn from him during the process of expanding and refining my research. He has been a generous mentor and a good friend.

I also want to thank Dr. Don Comer for the many valuable discussions early on in my graduate research. His encouragement to take risks, coupled with his incisive understanding of circuit techniques, was instrumental in creating an environment in which I was able to embrace this research topic and eventually experience many exciting opportunities to work on the cutting edge of low-power circuit design.

This work would not have been possible without the commitment of AMI Semiconductor. I owe a special thanks to the managers who enthusiastically supported me: Ryan Cameron, Jerry Downey and Robert Klosterboer.

One of the most influential forces on the course of this research has been Dr. Koji Kotani, whose original work sparked my interest in charge-transfer amplifiers five years ago. Though we have never met, our many email discussions have greatly enriched my graduate studies. I am truly grateful for his perspectives and ideas and hope we will be able to collaborate further in the future.

My deepest thanks go to my parents for their encouragement in all of my endeavors. Arguably, my interest in power-efficient circuits was motivated at a young age by their countless sermons on conservation and frugality. They have been a primary source of inspiration and teaching throughout my life.

Finally, I express my sincere gratitude to my dear wife Rose for her constant support and patience throughout this effort.

To Rosemary

# Contents

| Section | Title |
|---------|-------|
|         | Acknowledgments |
|         | Dedication |
|         | List of Tables |
|         | List of Figures |
| 1       | Abbreviations and Conventions |
| 2       | Introduction |
| 2.1     | A Brief History of Charge-transfer Amplifiers |
| 2.1.1   | Transconveyance |
| 2.2     | Low-power A/D Converters |
| 2.2.1   | A/D Converter Specifications |
| 2.2.2   | Voltage Comparators |
| 2.3     | Contributions of this Work |
| 2.4     | Organization of the Dissertation |
| 3       | Existing CTA Architectures |
| 3.1     | NMOS Charge-transfer Amplifier |
| 3.2     | CMOS Charge-transfer Amplifier |
| 3.3     | Pseudo-differential Charge-transfer Amplifier |
| 3.4     | Feedback Charge-transfer Amplifier |
| 4       | Analysis of Dynamic Behavior |
| 4.1     | Analysis of the NMOS Charge-transfer Amplifier |
| 4.2     | Analysis of the CMOS Charge-transfer Amplifier |
| 4.3     | Analysis of the Fully-differential Charge-transfer Amplifier |
| 4.3.1   | Voltage Transfer Function |
| 4.3.2   | Statistical Variation and Offset Voltage |
| 4.4     | Design Tradeoffs and Considerations |
| 4.5     | Experimental Verification |
| 4.5.1   | Comparator Cell |
| 4.6     | Summary |
| 5       | New Architectures for Practical Applications |
| 5.1     | Enhanced Differential Charge-transfer Amplifier |
| 5.1.1   | Experimental Results |
| 5.2     | Direct-coupled Charge-transfer Amplifier |
| 5.3     | $V_{PR}$-less Charge-transfer Amplifier |
| 5.4     | Experimental Results of the Direct-coupled and $V_{PR}$-less CTAs |
| 5.4.1   | Input Range and Offset Voltage |
| 5.4.2   | Dynamic Power |
| 5.5     | Summary |
| 6       | Dynamic Power |
| 6.1     | Dynamic Power of the NMOS Charge-transfer Amplifier |
| 6.2     | Dynamic Power of the CMOS Charge-transfer Amplifier |
| 6.3     | Discussion |
| 6.4     | Application to New Architectures |
| 6.5     | Figures of Merit |
| 6.6     | Subthreshold Operation |
| 6.7     | Summary |
| 7       | A 10-bit CTA-based A/D Converter |
| 7.1     | Types of A/D Converters |
| 7.2     | Averaging |
| 7.3     | Interpolation |
| 7.4     | Voltage Comparator |
| 7.5     | Gain Enhancement |
| 7.6     | Distributed Sampling |
| 7.7     | Settling of the Fine References |
| 7.8     | Preliminary Study |
| 7.9     | Subranging A/D Converter |
| 7.10    | Fabrication Results |
| 7.10.1  | Dynamic Power |
| 7.10.2  | Linearity |
| 7.10.3  | Spectral Performance |
| 7.11    | Potential Applications |
| 7.12    | Summary |
| 8       | Conclusion |
| 8.1     | Contributions of the Dissertation |
| 8.2     | Future Work |
| A       | Common Random Access Memory Architectures |
| B       | Matlab Model of a Fully-differential CTA Voltage Transfer Function |
|         | Bibliography |


# List of Tables

| Table | Title |
|-------|-------|
| 1.1   | List of conventions |
| 1.2   | List of abbreviations |
| 6.1   | Capacitor node voltages in an idealized NMOS CTA |
| 6.2   | Typical circuit parameters for a CTA in a 0.6 $\mu$m CMOS process |
| 6.3   | Components of the power profile of a typical NMOS CTA |
| 6.4   | Components of the power profile of a typical CMOS CTA |
| 6.5   | Figures of merit for known charge-transfer amplifiers |
| 6.6   | Figures of merit for zero mean offset charge-transfer amplifiers |
| A.1   | Common random access memory architectures |

# List of Figures

| Figure | Title |
|--------|-------|
| 2.1    | Symbolic charge-transfer amplifier |
| 2.2    | Equivalent circuit of a charge-transfer amplifier |
| 2.3    | Symbolic A/D converter |
| 2.4    | Digitization of an analog signal by an 8-bit A/D converter |
| 2.5    | Block diagram of an N-bit flash A/D converter |
| 2.6    | Dynamic performance of a 10-b A/D converter as a function of input frequency |
| 2.7    | Dynamic performance of a 10-b A/D converter as a function of sample rate |
| 2.8    | Voltage comparator (a) symbol and (b) voltage transfer function |
| 2.9    | Offset voltage of a comparator |
| 2.10   | Diagram of a low-offset comparator |
| 2.11   | Output voltage waveforms for (a) static and (b) dynamic comparators |
| 3.1    | Diagrams of (a) NMOS CTA, (b) classical NMOS amplifier, (c) CTA timing chart |
| 3.2    | NMOS CTA in its operating phases: (a) reset, (b) precharge and (c) amplify |
| 3.3    | CMOS charge-transfer amplifier |
| 3.4    | Pseudo-differential charge-transfer amplifier |
| 3.5    | Feedback charge-transfer amplifier |
| 4.1    | Equivalent circuits of the NMOS CTA during the (a) reset, (b) precharge and (c) amplify phases |
| 4.2    | Fixed-gain curves for the NMOS CTA |
| 4.3    | Normalized amplitude plots for various $t_a:t_p$ ratios (level 1 Spice models) |
| 4.4    | Normalized amplitude plots for various $V_{SS}$ potentials (level 1 Spice models) |
| 4.5    | Precharging with and without threshold voltage modulation, and comparison to Spice simulations |
| 4.6    | CMOS charge-transfer amplifier |
| 4.7    | Enhanced differential charge-transfer amplifier |
| 4.8    | Simplified portion of a DCTA |
| 4.9    | Comparative normalized amplitude plot |
| 4.10   | Comparative normalized amplitude plot |
| 4.11   | Simulated offset voltage |
| 4.12   | Comparator circuit |
| 4.13   | Measured offset voltage |
| 5.1    | Fully differential CTA in the (a) reset, (b) precharge and (c) amplify phases |
| 5.2    | Simulation waveforms showing the charge-transfer process |
| 5.3    | 4-bit flash A/D converter block diagram |
| 5.4    | Measured waveforms of a sampled sawtooth signal |
| 5.5    | Diagrams of (a) conventional CMOS CTA, (b) direct-coupled CMOS CTA and (c) timing diagram |
| 5.6    | Waveforms illustrating the limits of CMR in a direct-coupled CMOS CTA |
| 5.7    | Common-mode range of (a) direct-coupled CTA, (b) fully-differential direct-coupled CTA, (c) $V_{PR}$-less CTA, and (d) fully-differential $V_{PR}$-less CTA |
| 5.8    | Fully-differential direct-coupled charge-transfer amplifier |
| 5.9    | Waveforms showing the effects of the cutoff and convergence conditions |
| 5.10   | Waveforms showing the effects of the bulk condition |
| 5.11   | Single-ended $V_{PR}$-less charge-transfer amplifier |
| 5.12   | Fully-differential $V_{PR}$-less charge-transfer amplifier |
| 5.13   | Measured common-mode range of experimental comparators |
| 5.14   | Measured offset voltage of experimental comparators |
| 6.1    | NMOS CTA in its operating phases: (a) reset, (b) precharge and (c) amplify |
| 6.2    | Dynamic power profile of an ideal NMOS CTA |
| 6.3    | Dynamic power profile of an ideal CMOS CTA |
| 6.4    | Dynamic power consumption of charge-transfer amplifiers with similar circuit components |
| 6.5    | Measured 1-$\sigma$ bandwidth (SB1) of a CTA operated in subthreshold |
| 7.1    | Functional block diagram of a flash A/D converter |
| 7.2    | Subranging A/D converter architecture |
| 7.3    | Example of an averaging scheme |
| 7.4    | Result of averaging in an array of amplifiers |
| 7.5    | Averaging applied to charge-transfer amplifiers |
| 7.6    | Distortion at the ends in a standard averaging scheme |
| 7.7    | Cross-coupling with CTAs |
| 7.8    | Virtual cross-coupling via "dummy" amplifiers |
| 7.9    | Result of cross-coupling at the ends in an averaging scheme |
| 7.10   | Capacitive interpolation |
| 7.11   | CTA-based comparators |
| 7.12   | Improved CTA-based comparator |
| 7.13   | Positive, capacitive feedback illustration |
| 7.14   | Example of gain enhancement in a two-stage cascade of CTAs |
| 7.15   | Sample waveforms showing gain enhancement at 10 kSPS |
| 7.16   | Sample waveforms showing gain enhancement at 200 kSPS |
| 7.17   | Input circuit modification for distributed sample-and-hold |
| 7.18   | Timing summary of a sampling method |
| 7.19   | Optional timing diagrams for extended settling of fine references |
| 7.20   | Survey of the performance of 10-bit A/D converters |
| 7.21   | Interpolation with fully differential charge-transfer amplifiers |
| 7.22   | Die photo of the A/D converter test chip |
| 7.23   | Measured dynamic power dissipation of the A/D converter |
| 7.24   | Measured nonlinearity of the A/D converter |
| 7.25   | Typical measured vs. simulated dynamic performance of the A/D converter |
| 7.26   | Recent A/D converter performance requirements for several classes of commercial products |

# Chapter 1

## Abbreviations and Conventions

The abbreviations and labeling conventions contained in Tables 1.1 and 1.2 are used commonly throughout this dissertation. Most are also defined in the text when used for the first time.

| Label      | Description                                    |
|------------|------------------------------------------------|
| $P_D$      | Dynamic power, $\mu$W/MSPS                     |
| $V_{DD}$   | Power supply voltage                           |
| $V_{SS}$   | Ground supply voltage                          |
| $V_{SUP}$  | Supply voltage $(V_{DD}-V_{SS})$               |
| $V_{IN}$   | Input voltage                                  |
| $V_{REF}$  | Reference voltage                              |
| $V_O$      | Output voltage (also labeled $V_{OUT}$)        |
| $V_{O_i}$  | $i^{th}$ output voltage                        |
| $V_{PR}$   | Precharge voltage                              |
| $V_{TN}$   | NMOS threshold voltage                         |
| $V_{TP}$   | PMOS threshold voltage                         |
| $V_{OS}$   | Offset voltage                                 |
| $S_k$      | $k^{th}$ MOS switch                            |
| $*S_k$     | Logical complement of $S_k$                    |
| $M_k$      | $k^{th}$ MOS device in a circuit               |
| $C_T$      | Transfer capacitance                           |
| $C_L$      | Load capacitance                               |
| $C_C$      | Coupling capacitance                           |
| ACC        | Relative amplifier accuracy, in bits           |
| $\mu_A$    | Mean amplifier offset voltage                  |
| $\sigma_A$ | Standard deviation of amplifier offset voltage |

Table 1.1: List of conventions

Table 1.2: List of abbreviations

| Abbreviation | Meaning                                          |
|--------------|--------------------------------------------------|
| ADC          | Analog-to-digital converter                      |
| BE           | Body effect                                      |
| CMOS         | Complementary metal oxide semiconductor          |
| CMR          | Common-mode range                                |
| CMRR         | Common-mode rejection ratio                      |
| CTA          | Charge-transfer amplifier                        |
| DCCTA        | Direct-coupled charge-transfer amplifier         |
| DCTA         | Differential charge-transfer amplifier           |
| DIBL         | Drain induced barrier lowering                   |
| DNL          | Differential nonlinearity                        |
| DRAM         | Dynamic random access memory                     |
| ENOB         | Effective number of bits                         |
| FCTA         | Feedback charge-transfer amplifier               |
| FET          | Field effect transistor                          |
| FOM          | Figure of merit                                  |
| FSR          | Full scale range                                 |
| IC           | Integrated circuit                               |
| INL          | Integral nonlinearity                            |
| kSPS         | Kilo-samples per second                          |
| LSB          | Least significant bit                            |
| MOS          | Metal oxide semiconductor                        |
| MSB          | Most significant bit                             |
| MSPS         | Mega-samples per second                          |
| NMOS         | N-channel metal oxide semiconductor              |
| PLCTA        | Precharge voltage-less charge-transfer amplifier |
| PMOS         | P-channel metal oxide semiconductor              |
| PSRR         | Power-supply rejection ratio                     |
| RAM          | Random access memory                             |
| RMS          | Root mean squared                                |
| RPC          | Residual precharge current                       |
| SAR          | Successive approximation register                |
| SNDR         | Signal to noise and distortion ratio             |
| SPS          | Samples per second                               |
| SRAM                 | Static random access memory                      |

# Chapter 2

## Introduction

Charge-transfer amplifiers (CTAs) have been used as sense amplifiers in digital integrated circuits, such as random access memory (RAM)<sup>1</sup>, for over thirty years. A simple CTA with only a few transistors and a capacitor offers the best combination of speed and power for such a low-precision application (e.g., 1-bit). For this reason, the use and study of charge-transfer amplifiers has, until recently, been the exclusive domain of memory array designers. But the seminal work by Kotani, Ohmi and Shibata in 1997 [1,2] expanded the field of charge-transfer amplifiers into analog applications such as precise sense amplifiers, comparators and A/D converters, opening many new doors for research and discovery. This dissertation explores the design and analysis of charge-transfer amplifiers and reports several related advancements to the field.

The advantages inherent in the construction of charge-transfer amplifiers create a compelling case for furthering their use in precise sensing applications such as A/D converters. CTAs can be made extremely power efficient; they are compact and straightforward to design; and they are notably robust and tolerant to common CMOS process variations.

Initial research into the construction of CTA-based comparators has been promising, yet there remains a great deal to be studied and proven about the analysis, design and use of charge-transfer amplifiers in an A/D converter environment. This dissertation is aimed at exploring the relatively new field of *precision charge-transfer amplifiers*, first by providing a comprehensive review of the existing literature and research, and second by reporting several advancements to the field, namely (1) dynamic analysis techniques, (2) novel design architectures and (3) the first demonstration of a 10-bit A/D converter utilizing a completely CTA-based methodology.

<sup>1</sup> The construction of stored-charge memory circuits is outside the scope of this work. For reference, Appendix A lists several types of memories in which CTAs have previously been utilized.

### 2.1 A Brief History of Charge-transfer Amplifiers

Charge-transfer amplifiers operate on the principle of capacitive charge sharing. If two capacitors are electrically coupled such that charge, but not voltage, is shared, then it is possible to achieve voltage amplification by transferring charge from the larger capacitor to the smaller one. The voltage gain is simply the capacitance ratio. This type of voltage amplification consumes no static (DC) current and arbitrarily small dynamic current, depending on the sizes of the capacitors. In contrast to continuous-time amplifiers, which dissipate power all of the time, CTAs consume current only when needed and in proportion to the size of the signal.
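The charge-sharing principle described above can be sketched numerically. The following is a minimal illustration, assuming an ideal transfer in which all charge displaced on the larger capacitor arrives on the smaller one. The function names and the example values (1 pF, 100 fF, 10 mV) are hypothetical and are not taken from this dissertation; parasitics, incomplete transfer and signal polarity are ignored.

```python
def ideal_voltage_gain(c_large, c_small):
    """Ideal charge-transfer gain magnitude: the capacitance ratio."""
    if c_large <= 0 or c_small <= 0:
        raise ValueError("capacitances must be positive")
    return c_large / c_small


def output_step(delta_v_in, c_large, c_small):
    """Output voltage step when the charge displaced on c_large
    is transferred entirely onto c_small (idealized assumption)."""
    delta_q = c_large * delta_v_in   # charge displaced on the larger capacitor
    return delta_q / c_small         # same charge appearing on the smaller one


# Hypothetical example: 1 pF transfer capacitor, 100 fF load capacitor.
gain = ideal_voltage_gain(1e-12, 100e-15)        # ratio of about 10
dv_out = output_step(0.010, 1e-12, 100e-15)      # 10 mV in, about 100 mV out
```

Because the only charge moved is the signal charge itself, the sketch also reflects why the dynamic current scales with the capacitor sizes and the signal amplitude.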

Engineers at IBM were awarded the first two patents for charge-transfer amplifiers: Yao in 1973 [3] and Dennard in 1976 [4]. Yao's patent claims a straightforward NMOS circuit used for amplifying stored charge in a memory element. Dennard's patent covers essentially the same circuit as Yao's, but adds a pseudo-differential input configuration and integrates a dynamic latch on the back end. These two patents formed the foundation for the next 20+ years of research in CTA design, and are still widely referenced in the literature on stored charge memory detection circuits (for example, see [5]).

In 1976, Heller [6] presented a CTA in the IEEE Journal of Solid-State Circuits (JSSC), which built upon Yao's and Dennard's patents. Heller continued this work for several years [7], using the designs exclusively in memory circuits. More recently, Kawashima and Kim reported state-of-the-art performance in G-bit scale SRAM and DRAM circuits respectively, each featuring incrementally better charge-transfer amplifier designs [8,9].

Then in 1997 [1, 2] Kotani demonstrated a novel approach to the use of CTAs by constructing a 4-bit flash A/D converter with 8 bits of accuracy out of comparators that consisted of a low-accuracy latch and a novel charge-transfer amplifier. Unlike the CTAs used as DRAM sense amplifiers, which use a grounded-gate configuration, Kotani's amplifier was gate-driven in a manner similar to a classical MOS amplifier stage. The new design exhibited both low offset fluctuation and dual-polarity signal amplification, two critical attributes for a comparator preamplifier. Kotani's groundbreaking research introduced the world to a new way of thinking about CTAs.

The "CMOS CTA" [2] incorporated parallel NMOS and PMOS charge transfer amplifiers in order to cancel systematic offset due to circuit imbalance. The result was high tolerance to fluctuations in device parameters, such as threshold voltage and effective length. Although the mean offset voltage was large and unpredictable across sample die, the standard deviation was small, even across variations in device parameters and external conditions. These properties are ideal for flash A/D converters (although not necessarily for other architectures such as subranging or pipelined).

Applying his new CTA to a 4-bit flash ADC, Kotani demonstrated 8 bits of accuracy (in terms of DNL, not INL or gain offset) and up to 10 mega-samples per second (MSPS) operation. Purely dynamic power was around 4 $\mu$W/MSPS per comparator from a 3.3 V supply, not counting the reference voltage generator. The matching accuracy was achieved with near-minimum-scale devices. For comparable speed and matching performance, a continuous-time preamplifier would have required about twice the current and at least ten times larger devices, assuming a typical $A_{VT}$ matching coefficient of 20 mV·$\mu$m.

Kotani later theorized that CTAs could be applied to 10-bit accurate ADCs (or the limit of on-chip capacitor matching) and up to several tens of MSPS, depending on the process geometry. He also concluded that all of this could be achieved by using small active devices and with up to 50% less overall power than the best reported 10-bit ADC [10].

Kotani also published a latched feedback CTA (FCTA) at the 1999 ISSCC [10,11]. This circuit incorporated a latching mechanism into the CTA itself (unlike Dennard's original patent, which simply included a dynamic latch at the back end). A fully differential FCTA achieved very low offset by incorporating two active feedback CMOS CTAs in parallel, in contrast to "dummy differential" methods suggested in all previous memory-related CTA literature.

Apart from the publications and patents that resulted from the research for this dissertation, no other literature on precision charge-transfer amplifiers exists to the author's knowledge.

#### 2.1.1 Transconveyance

An instructive visualization of a charge-transfer amplifier appears in Figure 2.1. In response to an input stimulus,  $\Delta V_{IN}$ , the amplifier transfers a proportional amount of charge to or from the load,  $C_L$ . Whereas a transconductance amplifier conveys a proportional output current in response to an input voltage, a CTA conveys a proportional amount of charge,

$$\Delta Q = g_c \Delta V_{IN}, \qquad (2.1)$$

where  $g_c$  is a new term which is called *transconveyance* (units of Farads). A perhaps more intuitive terminology, *trans-capacitance*, is also suggested because  $g_c$  often correlates to the value of an actual circuit capacitor.



Figure 2.1: Symbolic charge-transfer amplifier

Given a load capacitance,  $C_L$ , it is straightforward to convert (2.1) into an open loop voltage gain,  $A_O$ , by projecting  $\Delta Q$  onto  $C_L$ 

$$A_O = \frac{\Delta V_{OUT}}{\Delta V_{IN}} = \frac{g_c}{C_L}. \tag{2.2}$$

A proposed idealized equivalent circuit of a charge-transfer amplifier is shown in Figure 2.2. The input voltage,  $V_{IN}$ , is switched onto the capacitor,  $C_1$ , at times  $t = t_1, t_2, ..., t_n$  and switched off of the capacitor immediately. Each time the voltage is switched onto  $C_1$ , an impulse of charging current  $i_{C_1}$  flows in the capacitor in order to establish the new input voltage. Specifically, this current equals

$$i_{C_1} = Q_i \cdot \delta(t - t_i) \tag{2.3}$$

since the integral of  $i_{C_1}$  over each  $t \in (t_{i-}, t_{i+})$  equals  $Q_i$ , or the total amount of charge added to  $C_1$  at  $t_i$ . It is clear that

$$Q_{i} = C_{1}(V_{IN_{i}} - V_{IN_{i-1}}) = C_{1}\Delta V_{IN_{i}}. \tag{2.4}$$



Figure 2.2: Equivalent circuit of a charge-transfer amplifier

A current controlled current source pulls a copy of  $i_{C_1}$  from the load capacitor,  $C_L$ . The amount of charge transferred from  $C_L$  at time  $t_i$  equals  $Q_i$ . The change in output voltage is

$$\Delta V_{OUT_i} = \Delta V_{IN_i} \frac{C_1}{C_L}. \tag{2.5}$$

In a discrete-time sense, this is exactly the same result described in (2.1) and depicted in Figure 2.1 for a transconveyance of $g_c = C_1$. The realization of such circuits is one of the primary topics of this dissertation.
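The idealized model of (2.3) through (2.5) can be illustrated with a short numerical sketch. The function name, the capacitor values and the input samples below are assumptions chosen only for illustration; the code simply applies (2.4) and (2.5) to each sample and reports the size of each output step as given by (2.5).

```python
def cta_output_steps(v_in_samples, c1, c_load, v_in_initial=0.0):
    """Apply Eqs. (2.4) and (2.5) to a sequence of sampled input voltages.

    Returns the output voltage step produced at each sample instant.
    """
    steps = []
    v_prev = v_in_initial
    for v_now in v_in_samples:
        delta_v_in = v_now - v_prev    # change in input at sample i
        q_i = c1 * delta_v_in          # Eq. (2.4): charge delivered to C1
        steps.append(q_i / c_load)     # Eq. (2.5): same charge projected onto C_L
        v_prev = v_now
    return steps


# Hypothetical values: C1 = 1 pF, C_L = 100 fF, so g_c / C_L = 10.
samples = [0.010, 0.015, 0.005]  # volts
for v, dv_out in zip(samples, cta_output_steps(samples, c1=1e-12, c_load=100e-15)):
    print(f"V_IN = {v * 1e3:5.1f} mV -> output step = {dv_out * 1e3:7.1f} mV")
```

With these assumed capacitors each input change is scaled by the capacitance ratio $C_1/C_L = 10$, regardless of the absolute input level.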

### 2.2 Low-power A/D Converters

Due to advancements in digital signal processing (DSP) and the downscaling of CMOS technology, it is almost always desirable to implement most of the functions of an electrical system with digital algorithms and filters. Yet, signals occurring in nature (i.e., speech, light intensity, motion, chemical composition, etc.) are inherently analog, leading to the conclusion that any IC which interfaces to the real world must include some form of A/D conversion. As a result of demand for higher levels of performance and integration in IC applications, there is a widespread need for power-efficient A/D converters.

Figure 2.3 shows a commonly accepted ADC circuit symbol, and Figure 2.4 illustrates the digitization of an analog signal by an 8-bit sampling ADC. The digital output represents the closest approximation to the sampled signal at discrete points in time.



Figure 2.3: Symbolic A/D converter

The field of A/D converters has so matured over the years that entire textbooks are now dedicated to the study of single architectures. Some designs have been reported with speed and resolution performance near the fundamental limits of physics. While some opportunities do still exist for research in maximizing the product of A/D converter speed and resolution, the most fruitful and relevant area for research and innovation is now in minimizing power while preserving high performance.



Figure 2.4: Digitization of an analog signal by an 8-bit A/D converter

Flash converters are popular for their simplicity and low latency. The classical N-bit flash consists of a reference voltage generator (such as a resistor ladder), $2^N - 1$ comparators, and binary encoding logic, as shown in Figure 2.5. Since comparators consume a substantial portion of the ADC power and silicon area, flash converters are considered economical only up to about 7 bits. Beyond this point, the size and power become prohibitively large and inefficient due to the $2^N$ growth factor in comparator count. However, the flash converter is a key building block in several common architectures achieving greater than 7 bits. For instance, 10-bit subranging ADCs are typically implemented with two 5-bit flash converters, and 16-bit delta-sigma ADCs are often designed with 4- to 6-bit flash quantizers to increase stability, resolution and speed.
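The growth in comparator count can be made concrete with a small sketch. The function names are hypothetical, and the two-step count assumes two plain flash stages with no over-range or redundancy comparators, which a practical subranging design may add.

```python
def flash_comparators(n_bits):
    """Comparators in a classical N-bit flash: one per code transition."""
    return 2 ** n_bits - 1


def two_step_comparators(coarse_bits, fine_bits):
    """Comparators in a two-step converter built from two flash stages.

    Assumption: no extra over-range or redundancy comparators.
    """
    return flash_comparators(coarse_bits) + flash_comparators(fine_bits)


for n in (4, 7, 10):
    print(f"{n}-bit flash: {flash_comparators(n)} comparators")
print(f"10-bit two-step (5 + 5): {two_step_comparators(5, 5)} comparators")
```

Under these assumptions a direct 10-bit flash needs 1023 comparators, while two 5-bit stages need 62, which is why the flash block is normally used as a sub-circuit at higher resolutions.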

#### 2.2.1 A/D Converter Specifications

Common specifications given for A/D converters are listed below.

Figure 2.5: Block diagram of an N-bit flash A/D converter

- **Resolution:** The *resolution* of an A/D converter is given by the number of bits, N. The resolution is the smallest increment of output that the converter can produce. For example, a 10-bit A/D converter has a resolution of 10 bits or 1 part in $2^{10}$ or 0.098%.

- **Integral Nonlinearity:** The integral nonlinearity (INL) measures the maximum allowable deviation from an ideal straight line drawn between zero and full-scale outputs. Often linearity is given in fractions of an LSB. For example, ±1/2 LSB INL for a 10-bit A/D converter means that the converter output never deviates more than 1/2 LSB from the straight-line ideal output.
- **Differential Nonlinearity:** The differential nonlinearity (DNL) measures the maximum deviation from ideal of the output between two adjacent codes. It is expressed in fractions of an LSB. For example, ±1/2 LSB DNL for a 10-bit converter means that the output between adjacent codes never deviates from the ideal, which is 1 LSB, by more than 1/2 LSB. In other words, from one output code to the next, the change in input is always between 1/2 and 1 1/2 LSB.
- **Monotonicity:** An A/D converter is said to be *monotonic* if each step in the digital code output corresponds to an increase in the analog input, however small or large. In general, an A/D converter is expected to be monotonic if the DNL is less than 1 LSB.
- **Latency:** The *latency* measures how long it takes from the time an A/D converter begins a conversion to the time when a valid digital output is ready. When the converter uses a clock to determine timing, latency is expressed in terms of clock cycles. Otherwise, latency is given in units of time.
- **SNR:** The signal-to-noise ratio (SNR) is a spectral measurement of the output of an A/D converter in response to a perfect sine-wave input. Given in dB, SNR is the ratio of signal magnitude to the noise floor of a converter. SNR changes with sample rate, generally becoming worse at higher sample rates and as the input frequency approaches the Nyquist rate (when the input frequency is half the sample rate) or above. SNR is a function of the resolution.
- **SDR:** The signal-to-distortion ratio (SDR) is a spectral measurement of the output of an A/D converter in response to a perfect sine-wave input. Given in dB, SDR is the ratio of signal magnitude to the combined magnitudes of the harmonic distortion peaks (usually the sum of 3rd, 5th and 7th harmonics). SDR changes with sample rate, generally becoming worse at higher sample rates and as the input frequency approaches the Nyquist rate or above.
- **SNDR:** The signal-to-noise-and-distortion ratio (SNDR, also called SINAD or SDNR) is a spectral measurement that combines the SNR and SDR. It measures the total quality of the output of an A/D converter in response to a perfect sine-wave input.



Figure 2.6: Dynamic performance of a 10-b A/D converter as a function of input frequency

- **ENOB:** The effective number of bits (ENOB) is a dynamic measure of resolution. Usually specified at a given sample rate and input frequency, ENOB is always less than or equal to the ideal resolution number. For example, a 10-bit resolution converter might only have 9.3 effective bits at the Nyquist sampling rate. ENOB is a convenient way of incorporating SNR and SDR into a single number that relates to the ideal converter resolution.

Figure 2.6 shows the relationship between SNR, SDR, SNDR and ENOB for a theoretical 10-bit A/D converter. The ideal 10-bit ADC curve represents the lowest noise floor possible with perfect 10-bit quantization. Combining the noise and distortion contributions measured by the SNR and SDR gives the SNDR in dB (left axis), which translates directly to ENOB (right axis). The SDR becomes markedly worse as the input frequency approaches and surpasses the Nyquist rate, while noise remains relatively constant.
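The translation between the two axes of Figure 2.6 can be sketched numerically. The text does not state the conversion formulas, so the code below assumes the standard textbook relations: noise and distortion add as powers, and an ideal N-bit quantizer driven by a full-scale sine wave has an SNR of 6.02N + 1.76 dB. The function names are hypothetical.

```python
import math


def sndr_db(snr_db, sdr_db):
    """Combine SNR and SDR (both in dB) by adding noise and distortion powers."""
    noise_power = 10 ** (-snr_db / 10)       # relative to signal power
    distortion_power = 10 ** (-sdr_db / 10)  # relative to signal power
    return -10 * math.log10(noise_power + distortion_power)


def enob(sndr):
    """Effective number of bits from SNDR in dB (standard relation, assumed)."""
    return (sndr - 1.76) / 6.02


def ideal_snr_db(n_bits):
    """SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 6.02 * n_bits + 1.76


# Hypothetical operating point: SNR = 60 dB, SDR = 65 dB.
total = sndr_db(60.0, 65.0)
print(f"ideal 10-bit SNR: {ideal_snr_db(10):.2f} dB")
print(f"SNDR: {total:.2f} dB, ENOB: {enob(total):.2f} bits")
```

Because the two error powers add, the SNDR is always lower than the smaller of SNR and SDR, which is why ENOB tracks whichever of the two degrades first.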



Figure 2.7: Dynamic performance of a 10-b A/D converter as a function of sample rate

Figure 2.7 shows how SNDR and ENOB drop as the sample rate increases. The input signal remains fixed at 4.4 MHz. When the converter reaches 20 MSPS, ENOB is about 9.6. Dynamic performance begins to drop considerably above 20 MSPS; once the speed reaches 40 MSPS, ENOB is about 5. The converter shown here would be specified with a maximum sample rate of about 20 MSPS.

- **Dynamic Range:** The dynamic range of an A/D converter measures the range of input signals that can be converted successfully. Given in dB, dynamic range is the ratio of the full-scale input range to the noise floor. For example, if the full-scale input range is 2 V (peak-to-peak) and the noise floor is 1 mV, then the dynamic range is 20log(2 V/1 mV) = 66 dB. The lowest possible noise floor for an N-bit converter is always a function of N.
- **Active Area:** The *active area* gives the area occupied by the fabricated A/D converter not counting external circuitry such as pads, power routing, company labels, scribe lines, etc. Active area is given in mm<sup>2</sup> or in square mils.
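The INL and DNL definitions in the list above can be illustrated with a short sketch that derives both from a set of measured code-transition voltages. The function name and the transition values are hypothetical, and the LSB size is taken from an end-point fit (a straight line through the first and last transitions), which is one common convention.

```python
def dnl_inl(transitions):
    """Compute DNL and INL (in LSB) from code-transition voltages.

    transitions[k] is the input voltage at which the output code
    changes from k to k + 1. The LSB is the average step between the
    first and last transitions (end-point fit).
    """
    n_steps = len(transitions) - 1
    lsb = (transitions[-1] - transitions[0]) / n_steps
    # DNL: deviation of each code width from the ideal width of 1 LSB.
    dnl = [(transitions[k + 1] - transitions[k]) / lsb - 1.0
           for k in range(n_steps)]
    # INL: running sum of DNL, i.e. deviation of each transition
    # from the straight line through the end points.
    inl = [0.0]
    for step in dnl:
        inl.append(inl[-1] + step)
    return dnl, inl


# Hypothetical transition voltages, in units of the nominal LSB.
levels = [0.5, 1.5, 2.25, 3.5, 4.5]
dnl, inl = dnl_inl(levels)
print("max |DNL| =", max(abs(d) for d in dnl), "LSB")
print("max |INL| =", max(abs(i) for i in inl), "LSB")
```

A converter whose maximum |DNL| stays below 1 LSB in such a calculation would be expected to be monotonic, in line with the monotonicity entry above.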



Figure 2.8: Voltage comparator (a) symbol and (b) voltage transfer function

#### 2.2.2 Voltage Comparators

One of the most important building blocks of an A/D converter is the comparator. A comparator is a 1-bit quantizer (or 1-bit A/D). The input consists of two voltages for comparison, and the output is a logical 1 or 0. A typical comparator is shown in Figure 2.8(a). The inputs are  $V_{IN}$  and  $V_{REF}$ , and the output is  $D_O$ .

The transfer function of a comparator is

$$D_O = \begin{cases} 1 & \text{if } V_{IN} \ge V_{REF} \\ 0 & \text{if } V_{IN} < V_{REF} \end{cases} \tag{2.6}$$

as shown in Figure 2.8(b).

A common way of constructing a comparator is to use a high-gain amplifier; in an ideal comparator, the gain is infinite. Like all amplifiers, comparators have offset voltage that can be measured directly at the input by observing the difference between the reference voltage and the so-called trip point,  $V_{TRIP}$ , as illustrated in Figure 2.9.

Most precision A/D converters rely on highly accurate comparators (i.e., comparators with low offset voltage). A method of constructing a low-offset comparator is shown in Figure 2.10. It comprises a preamplifier and a latching circuit, where the latching circuit outputs a logical 1 or 0. Latching circuits use regenerative feedback in order to guarantee full CMOS logic level outputs, and usually have a large offset voltage (30 mV or more). A small- to moderate-gain preamplifier can be designed with low offset voltage in order to boost the difference signal to a level that overcomes the latch offset voltage.

Figure 2.9: Offset voltage of a comparator

Figure 2.10: Diagram of a low-offset comparator

For example, if the latch input-referred offset voltage is 30 mV but the application requires accuracy of 4 mV, then a preamplifier is used to reduce the input-referred offset of the comparator by boosting the minimum-level 4 mV input difference signal above 30 mV before it reaches the latch. This assumes, of course, that the preamplifier offset is sufficiently lower than 4 mV to begin with.

The comparator input-referred offset is

$$V_{OS-COMP} = \frac{V_{OS-LATCH}}{A_{PREAMP}} + V_{OS-PREAMP} \tag{2.7}$$

where  $V_{OS-PREAMP}$  is the preamplifier offset voltage,  $V_{OS-LATCH}$  is the latch offset and  $A_{PREAMP}$  is the gain of the preamplifier stage.
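As a worked illustration of (2.7), the sketch below revisits the 30 mV latch and 4 mV accuracy example. The 1 mV preamplifier offset and the function names are assumptions introduced only for this example.

```python
def comparator_offset(v_os_latch, a_preamp, v_os_preamp):
    """Input-referred comparator offset, Eq. (2.7)."""
    return v_os_latch / a_preamp + v_os_preamp


def min_preamp_gain(v_os_latch, v_os_target, v_os_preamp):
    """Smallest preamplifier gain that meets a target comparator offset."""
    budget = v_os_target - v_os_preamp  # offset left over for the latch term
    if budget <= 0:
        raise ValueError("preamplifier offset already exceeds the target")
    return v_os_latch / budget


# 30 mV latch offset and 4 mV required accuracy (from the text);
# 1 mV preamplifier offset is a hypothetical value.
gain = min_preamp_gain(v_os_latch=0.030, v_os_target=0.004, v_os_preamp=0.001)
total = comparator_offset(v_os_latch=0.030, a_preamp=gain, v_os_preamp=0.001)
print(f"required gain: {gain:.1f}, resulting offset: {total * 1e3:.2f} mV")
```

The error raised for a non-positive budget reflects the caveat in the text: no amount of gain helps unless the preamplifier offset is already below the required accuracy.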

#### **Continuous and Dynamic Amplifiers**

Continuous-time amplifiers perform the amplification function on a constant basis without clocking or switching. Figure 2.11(a) shows the input-output waveforms for a typical high-gain voltage comparator with a continuous-time preamplifier (this could also be viewed as simply a high-gain amplifier). At any given time, the output represents a highly amplified version of the difference between  $V_{IN}$  and  $V_{REF}$ .

A dynamic amplifier contains either switches or clocks to improve the functionality. In contrast to continuous-time amplifiers, dynamic amplifiers are characterized by sampling, or amplifying based on the time-sampled difference between $V_{IN}$ and $V_{REF}$. For example, Figure 2.11(b) shows the input-output waveforms for a hypothetical comparator with a dynamic amplifier. The input difference is sampled and amplified at the labeled sample points, and for a certain amount of time the output represents a highly amplified version of the sample. Before the next sample begins, however, the output is reset to some common-mode value. In this illustration the common-mode output is a voltage near mid-supply.

#### Low Power Comparators

Efficient voltage comparators are critical to the construction of a low-power A/D converter. Reducing the power drawn by a voltage comparator essentially comes down to developing an economical preamplifier. (Dynamic latch circuits are usually designed with hybrid logic gates, and as such consume very little power.) In principle, amplifiers with the lowest dynamic power dissipation and zero static bias current make the best candidates for constructing an efficient voltage comparator. The ideal amplifier fits this description and can also be implemented in a small silicon area, exhibits a wide input range and has low input capacitance. This topic is the motivation behind the work reported in this dissertation.



Figure 2.11: Output voltage waveforms for (a) static and (b) dynamic comparators

### 2.3 Contributions of this Work

This dissertation focuses on the analysis, design and application of charge-transfer amplifiers for low-power flash A/D converters. The contributions of the dissertation are summarized below.

1. Techniques for analysis of the dynamic behavior of charge-transfer amplifiers. Until now, the published literature concerning the behavior of CTAs has been limited to qualitative analysis backed by Spice simulations [2,11]. Not only is the lack of formal analysis intellectually unsatisfying, it also leaves future designers with the equivalent of an idealized opamp model: useful, but not enough to design practical circuits in the real world. This dissertation provides, for the first time, a quantitative study of the dynamic behavior of CTAs, including:
   - A development of the voltage transfer function;
   - An analysis of offset voltage and its dependencies.
2. Novel amplifier architectures to address the weaknesses of early designs. Early CTAs are limited in practical use by several factors, including high input capacitance, large and unpredictable offset voltage, and special reference voltage requirements. Three new architectures are presented to overcome previous limitations with respect to the following attributes:
   - Offset voltage;
   - Input capacitance;
   - Precharge voltage requirements.
3. Power dissipation analysis and figures of merit. The dynamic power dissipation of charge-transfer amplifiers is considered. An idealized model is formulated and discussed in light of actual circuit measurements. Figures of merit are proposed to compare the relative merits of existing and future charge-transfer amplifiers.
4. A 10-bit CTA-based A/D converter. In all existing literature, 4-bit ADCs have been used to predict the potential of CTAs (up to 10 bits of precision without trimming); yet the design of a 4-bit ADC is elementary in comparison with ADC architectures needed in many practical applications. This dissertation describes the implementation and evaluation of the industry's first 10-bit CTA-based A/D converter. In addition to achieving the desired precision, the reported converter is marked by the following unique features:
   - <u>Power dissipation</u>. The reported converter consumes 40% less dynamic power than any published 10-bit A/D converter as of January 2003.
   - <u>Capacitive interpolation</u>. The number of CTAs in the finebank is reduced from 62 to 23 through the first known use of capacitive interpolation of charge-transfer amplifier outputs.
   - <u>Cascaded CTAs</u>. The first use of cascaded fully-differential CTAs helps realize low offset voltage and high gain.
   - <u>Distributed sample-and-hold</u>. A novel input switching scheme allows the existing input capacitors of the CTA preamplifiers to perform a distributed sample-and-hold function. This eliminates the need for a separate sample-and-hold amplifier and reduces the overall power dissipation.

The contributions presented in this dissertation are published in [12–17]. Some of the design techniques are covered under U.S. patents [18–21], each ascribed to the author.

### 2.4 Organization of the Dissertation

The dissertation is organized as follows.

Chapter 3 presents a summary of existing CTA architectures beginning with a well-known NMOS CTA. A pseudo-differential architecture is discussed. Two recent amplifiers, the CMOS CTA and feedback CTA are also presented.

Chapter 4 introduces an analysis technique which can be applied to determine the frequency-dependent voltage transfer function. Examples of the analysis are given for the NMOS CTA, the CMOS CTA and the fully-differential CTA. The results are compared with Spice simulations. Offset analysis techniques are also considered and fabrication results are provided.

Chapter 5 describes three new CTAs which are designed for practical use. The first is a truly differential amplifier, which is also introduced in the analysis of Section 4.3. Next, an amplifier with low input capacitance is presented. Finally, an amplifier requiring no precharge voltage is proposed. Various performance tradeoffs are examined and experimental results are given.

Chapter 6 provides a quantitative discussion regarding the dynamic power dissipation of charge-transfer amplifiers. This chapter also presents four figures of merit (FOMs) for use in evaluating the relative advantages of using existing or future charge-transfer amplifiers. The FOMs are intentionally related to specific A/D converter architectures.

Chapter 7 discusses the design, fabrication and testing of a 10-bit CTA-based A/D converter. Design techniques and challenges are discussed. Several optimization methods are explored. Measurements from silicon test chips show that the converter performs well over a wide supply voltage range and consumes less dynamic power than any reported 10-bit converter in the literature at the time of testing.

Chapter 8 summarizes the dissertation and presents conclusions about the impact of this research. Additional research topics for future work are suggested.

Appendix A contains a table showing common memory architectures which have incorporated CTAs in the past.

Appendix B shows the Matlab code used to implement the dynamic behavior model of the fully differential charge-transfer amplifier in Section 4.3.

# Chapter 3

# **Existing CTA Architectures**

This chapter gives a brief introduction to four known charge-transfer amplifier architectures. First, the NMOS CTA is a gate-driven circuit with similarity to the traditional NMOS inverting amplifier. Second, the CMOS CTA comprises multiple amplification channels to alleviate two obvious limitations of the NMOS CTA. Third, the pseudo-differential charge-transfer amplifier has been suggested for achieving zero mean offset voltage. Finally, the feedback CTA incorporates a latching mechanism for improved response time and infinite gain.

In this dissertation, the following notation is adopted for MOS switches. $S_1$ and $*S_1$ denote the control status and complement respectively of switch $S_1$. For example, if $S_1$ is open then $*S_1$ is closed, and vice versa. Note that in the NMOS CTA described immediately below, $*S_2$ is used in the circuit but not $S_2$. The reason for not using $S_2$ and reversing the control polarity will become apparent with the introduction of additional CTAs. In those circuits, it is conventional for $S_2$ and $*S_2$ to follow a certain polarity, and the NMOS CTA is drawn for consistency with the same scheme.

## 3.1 NMOS Charge-transfer Amplifier

The well-known NMOS CTA is shown in Figure 3.1. An analysis of the dynamic behavior of this device appears in Section 4.1 and a discussion on the power dissipation is provided in Section 6.1.

Figure 3.1: Diagrams of (a) NMOS CTA, (b) classical NMOS amplifier, (c) CTA timing chart

Figure 3.2 shows the NMOS CTA during the (a) reset, (b) precharge and (c) amplify phases. In the reset phase both $S_1$ and $S_2$ are open. All circuit nodes are discharged in this phase. Capacitor $C_T$ is discharged to $V_{SS}$ and the output node is discharged to the precharge voltage, $V_{PR}$.

In the precharge phase  $S_1$  is closed.  $V_O$  remains tied to  $V_{PR}$ , but  $C_T$  is now disconnected from  $V_{SS}$ . The transfer capacitor,  $C_T$ , precharges through MN1 towards the drain voltage,  $V_{PR}$ . Device cutoff stops the precharging when node B reaches  $V_{IN}-V_{TN}$ , where  $V_{TN}$  is the NMOS threshold. This process occurs fairly rapidly, depending on the size of  $C_T$  and the strength of MN1.

The final phase is the amplify phase. First, the output is detached from $V_{PR}$. For any *positive* $\Delta V_{IN}$ at the gate node, MN1 becomes active and conducts current from $C_L$ to $C_T$. Current flow continues until node B rises by $\Delta V_{IN}$, at which point MN1 is again cut off. The charge conveyed to $C_T$ equals $C_T \cdot \Delta V_{IN}$. Since all of the charge came from $C_L$, $V_O$ experiences a voltage drop of $\Delta V_{IN}(C_T/C_L)$.

Figure 3.2: NMOS CTA in its operating phases: (a) reset, (b) precharge and (c) amplify

In this way, a MOS device is used to create a controlled current source to transfer charge from $C_L$ to $C_T$. The inverting gain of this amplifier is $-C_T/C_L$. This circuit is a simple, yet robust illustration of a charge-transfer amplifier as described in Section 2.1.1. No static power is consumed by the NMOS CTA and the dynamic current can be made arbitrarily small by appropriate selection of $C_T$ and $C_L$.
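The idealized amplify-phase behavior of the NMOS CTA can be summarized in a few lines. This is a minimal sketch that assumes complete precharge and ignores subthreshold conduction and parasitic capacitance; the function name and component values are hypothetical.

```python
def nmos_cta_delta_vo(delta_v_in, c_t, c_l):
    """Ideal output voltage change of the NMOS CTA in the amplify phase.

    A positive input step turns MN1 on until node B has risen by
    delta_v_in, moving C_T * delta_v_in of charge out of C_L.
    For a zero or negative step MN1 stays cut off and no charge moves.
    """
    if delta_v_in <= 0:
        return 0.0
    charge = c_t * delta_v_in  # charge conveyed to C_T
    return -charge / c_l       # same charge removed from C_L


# Hypothetical capacitors: C_T = 1 pF, C_L = 100 fF (gain of -10).
for dv in (0.005, 0.0, -0.005):
    dvo = nmos_cta_delta_vo(dv, c_t=1e-12, c_l=100e-15)
    print(f"delta V_IN = {dv * 1e3:+5.1f} mV -> delta V_O = {dvo * 1e3:+6.1f} mV")
```

The output changes only for positive input steps, with a slope set entirely by the capacitance ratio.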

Note that when  $V_{SS} > V_{PR}-V_{TN}$ , MN1 starts out in subthreshold. According to an idealized analysis, no current flows through MN1 in this situation and no voltage gain is achieved. In reality, however, a subthreshold current continues to flow and this effect is not negligible. The study and analysis of subthreshold CTAs is outside the scope of this work. Section 6.6 describes some measurements taken on a CTA-based comparator operating in deep subthreshold. Subthreshold CTA analysis and design is suggested as an area for future work in Section 8.2.



Figure 3.3: CMOS charge-transfer amplifier

The source follower action gives this circuit an inherent immunity to fluctuations in device parameters, such as threshold voltage and transconductance. Moreover, there is only one restriction on  $V_{SS}$ : it must be sufficiently low to "erase" memory from the previous cycle. This is because MN1 must bias in the active region in each precharge phase.

## 3.2 CMOS Charge-transfer Amplifier

The NMOS CTA has two significant problems. First, its offset voltage is inherently large when any residual current exists in MN1 at the start of the amplify phase. This current is projected directly onto  $C_L$ , resulting in an uncontrollable offset. Even if precharging is complete, subthreshold current still leads to a potentially large offset if the load capacitance is small. The second problem is the single polarity amplification. Only positive  $\Delta V_{IN}$  gets amplified by the circuit. When  $\Delta V_{IN}$  is negative, MN1 becomes more cutoff and no charge transfer occurs.

To solve these two problems, Kotani developed the CMOS CTA shown in Figure 3.3 [1]. It comprises parallel NMOS and PMOS CTAs with gate and drain nodes shared. The NMOS channel amplifies signals of positive polarity in the manner described above. The PMOS device is also precharged to cutoff, but since the gate-source "on" voltage is negative, the PMOS channel amplifies only signals of negative polarity. In the ideal scenario, $C_T$ devices are matched as closely as possible so that positive and negative polarity signals are amplified with the same gain.
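The dual-polarity behavior can be sketched with an idealized model in which each channel handles one input polarity. The names `c_tn` and `c_tp` for the NMOS-side and PMOS-side transfer capacitors are hypothetical, and offset, subthreshold current and parasitics are ignored.

```python
def cmos_cta_delta_vo(delta_v_in, c_tn, c_tp, c_l):
    """Ideal output change of a CMOS CTA with separate transfer capacitors.

    Positive input steps are amplified by the NMOS channel (gain -c_tn/c_l),
    negative steps by the PMOS channel (gain -c_tp/c_l).
    """
    c_t = c_tn if delta_v_in > 0 else c_tp
    return -delta_v_in * c_t / c_l


# Hypothetical values with a 2% mismatch between the transfer capacitors.
c_tn, c_tp, c_l = 1.00e-12, 0.98e-12, 100e-15
for dv in (0.005, -0.005):
    dvo = cmos_cta_delta_vo(dv, c_tn, c_tp, c_l)
    print(f"delta V_IN = {dv * 1e3:+5.1f} mV -> delta V_O = {dvo * 1e3:+6.2f} mV")
```

In this model any mismatch between the two transfer capacitors appears directly as a gain difference between positive and negative inputs, which is why close matching of the $C_T$ devices is desirable.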

If the NMOS and PMOS devices are designed for matching betas, then the residual precharge current in the amplify phase can be reduced. This can never be a perfect matching, however, because NMOS and PMOS transistors vary differently over temperature and processing. Test chip measurements of fabricated CMOS CTAs have shown a normally distributed offset voltage with mean of -11 mV and standard deviation of 1.4 mV (16 samples from the same lot) [2].

## 3.3 Pseudo-differential Charge-transfer Amplifier

An obvious, but effective, means of zeroing the offset of a CMOS CTA is to use the pseudo-differential CTA (PDCTA) configuration shown in Figure 3.4. The upper channel is the CMOS CTA of Figure 3.3 and the lower channel is a second CMOS CTA with both inputs tied directly to  $V_{REF}$ . No gain is contributed by this channel; it only serves to null the offset. When both channels are laid out for parallel matching, the offsets from each channel are subtracted to yield an overall zero mean offset. It is a fairly straightforward exercise to extrapolate the offset and power data from the CMOS CTA to the PDCTA. While effective in reducing offset, this configuration is technically wasteful in terms of area. A superior fully-differential CTA is proposed as a part of this work in Section 5.1.
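One way to carry out the offset extrapolation is sketched below. It assumes that the two channels have identically distributed, statistically independent offsets, which is a simplification because matched layout may correlate them. The per-channel figures are the CMOS CTA measurements quoted in Section 3.2, and the function name is hypothetical.

```python
import math


def difference_offset(mean_a, sigma_a, mean_b, sigma_b):
    """Mean and standard deviation of (offset_a - offset_b).

    Assumes the two offsets are independent random variables.
    """
    mean = mean_a - mean_b
    sigma = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    return mean, sigma


# CMOS CTA offset statistics from Section 3.2, in millivolts.
CMOS_CTA_MEAN_MV = -11.0
CMOS_CTA_SIGMA_MV = 1.4

mean_mv, sigma_mv = difference_offset(
    CMOS_CTA_MEAN_MV, CMOS_CTA_SIGMA_MV,
    CMOS_CTA_MEAN_MV, CMOS_CTA_SIGMA_MV,
)
print(f"PDCTA offset estimate: mean = {mean_mv:.1f} mV, sigma = {sigma_mv:.2f} mV")
```

Under these assumptions the systematic offset cancels, while the random component grows by a factor of $\sqrt{2}$; the result is an estimate, not a measured value.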

## 3.4 Feedback Charge-transfer Amplifier

A charge-transfer amplifier with a dynamic feedback mechanism (FCTA) has also been proposed recently [10]. This amplifier amplifies and then rapidly latches the input signal while preserving a relatively low power dissipation. The circuit appears in Figure 3.5.

Figure 3.4: Pseudo-differential charge-transfer amplifier

The unique feature of this amplifier is the addition of MOS devices MN2 and MP2. Like the CMOS CTA, the FCTA functions in three phases. The reset and precharge phases are identical to the CMOS CTA. At the end of the precharge phase, the source nodes of MN1 and MP1 are $V_{PR}-V_{TN}$ and $V_{PR}-V_{TP}$ respectively.

Figure 3.5: Feedback charge-transfer amplifier

In the amplify phase, the output (drain) and input (gate) nodes of MN1 and MP1 are disconnected from the precharge voltage, $V_{PR}$. If, for example, $V_{REF} > V_{IN}$ then a positive $\Delta V_{IN}$ is coupled onto the input node. This causes MN1 to turn on and conduct current from $C_L$ to $C_T$. The output voltage decreases as a result of charge transfer, just as in the CMOS CTA. But the decrease in output voltage also causes a decrease in the gate voltage of MP2, turning that device on incrementally and beginning a secondary charge transfer between $C_T$ and $C_C$. Since the left node of $C_T$ was precharged to $V_{DD}$, this secondary charge transfer simultaneously pulls the source node of MN1 down further and increases the voltage at the input node. Thus, a positive feedback network is created and the input signal is amplified with infinite gain until the output of the FCTA converges (i.e., the source-drain voltages of MN1 and MP2 go to zero).

When $V_{REF} < V_{IN}$, a similar positive feedback mechanism is formed by MP1 and MN2. The FCTA is advantageous in the sense that when it interfaces to a CMOS level dynamic latch, the input-referred offset component of the latch is negligible. However, the mean offset of the FCTA is nonzero due to the imbalance between NMOS and PMOS channels. Test circuits fabricated in 1.2 $\mu$m CMOS showed a typical 25 mV mean offset with 0.6 mV standard deviation (27 samples from the same lot). As a result of the latching mechanism, the observed power dissipation was 21.5 $\mu$W/MSPS, or roughly five times greater than the average power of a CMOS CTA.

# Chapter 4

# Analysis of Dynamic Behavior

Most of the published advancements in charge-transfer amplifier design provide an adequate qualitative discussion about the dynamic behavior [2, 6, 7, 10]. The multi-phase operation of CTAs is not as easily or neatly summarized as, for example, the properties of an operational amplifier. But prior to this work, no attempt has been made to quantify the behavior of charge-transfer amplifiers in terms of characteristics such as voltage gain, bandwidth and offset voltage. For these reasons, a quantified analysis of the dynamic behavior of charge-transfer amplifiers is considered in depth here.

This chapter begins by providing an analysis of a simple NMOS CTA. A generalized model during each timing phase is constructed. The model, implemented in Matlab, is compared with Spice simulations. The analysis techniques are then extended to a more recent CMOS CTA [2] and to the fully-differential CTA proposed in Chapter 5 [12]. For the latter, simulation results are compared with the model implemented in Matlab. Regions where the model breaks down are also discussed with some suggestions for future research.

## 4.1 Analysis of the NMOS Charge-transfer Amplifier

Equivalent circuits for the NMOS CTA during each operational phase are displayed in Figure 4.1. Without loss of generality, closed switches have been replaced with resistors and open switches have become open circuits. Dashed faded lines indicate circuit components which can be safely neglected during a phase of operation, such as the MOSFET itself during the reset phase.



Figure 4.1: Equivalent circuits of the NMOS CTA during the (a) reset, (b) precharge and (c) amplify phases

In the following sections, it is assumed for simplicity that  $V_{PR} = 0$  V and that  $V_{SS} = -V_{DD}$  (symmetric supply).

## **Reset Phase**

The reset phase illustrated in Figure 4.1(a) begins with nodes B and $V_O$ at arbitrary potentials, as determined by the prior amplify phase. They are now effectively isolated from one another by disengaging M1 from any current path. The transient responses at nodes B and $V_O$ are RC exponential, described by the following two equations:

$$V_{B_R}(t) = (V_{B_R}(0) - V_{SS})e^{-t/R_{S1}C_T} + V_{SS}$$
(4.1)

where  $V_{B_R}(0)$  is the initial potential on node B and  $R_{S1}$  is the on resistance of S1, and

$$V_{O_R}(t) = (V_{O_R}(0) - V_{PR})e^{-t/R_{S2}C_L} + V_{PR}$$
(4.2)

where  $V_{O_R}(0)$  is the initial potential on node  $V_O$  and  $R_{S2}$  is the on resistance of S2. The time constant for  $V_B$  is generally of primary concern, as  $V_O$  remains connected to  $V_{PR}$  during the precharge phase.

Given the typical dimensions of CTA switches and capacitors, a time constant on the order of a nanosecond is expected for the reset phase. The number of time constants required for a particular level of settling (in volts) depends on $V_{SS}$. A larger negative supply requires fewer time constants to reach the goal. However, a small negative supply is favorable for minimizing power. This tradeoff is similar to one encountered in digital design, where high supply rails facilitate faster clock rates but also increase the power consumption sharply (dynamic switching power grows with the square of the supply voltage). The NMOS CTA can remain in the reset phase indefinitely before proceeding to the precharge and amplify phases.
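The dependence of settling on $V_{SS}$ can be illustrated with a minimal Python sketch based on the exponential form of (4.1). The function name, the starting voltage and the target voltage are illustrative assumptions and are not taken from the circuit simulations.

```python
import math

def time_constants_to_reach(v_start, v_goal, v_ss):
    """Number of RC time constants for node B to fall from v_start to v_goal
    while discharging toward v_ss, using V(t) = (V(0) - V_SS) e^{-t/RC} + V_SS."""
    if not (v_ss < v_goal <= v_start):
        raise ValueError("need v_ss < v_goal <= v_start")
    return math.log((v_start - v_ss) / (v_goal - v_ss))

# Hypothetical case: node B starts at -0.5 V and must fall to -1.0 V.
for v_ss in (-1.05, -1.65, -2.5):
    n = time_constants_to_reach(-0.5, -1.0, v_ss)
    print(f"V_SS = {v_ss:5.2f} V -> {n:.2f} time constants")
```

For a fixed voltage target, a more negative $V_{SS}$ shrinks the logarithm's argument, so fewer time constants are needed.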

As the following sections show, the settling time in the reset phase is practically negligible compared to the minimum time for the precharge phase. On a fixed-clock scheme, even if the duration of the reset phase,  $t_r$ , equals the duration of the precharge phase,  $t_p$ , it is safe to assume that the resetting will always be complete except at very high sample rates. But since this analysis breaks down for other reasons at high frequencies, it is assumed that resetting is always complete once the precharge phase starts. For the NMOS CTA, this means that  $V_B = V_{SS}$  at the start of the precharge phase.

#### **Precharge Phase**

An equivalent circuit for the precharge phase is shown in Figure 4.1(b). Transistor M1 forms a current path for $C_T$ to charge towards the drain voltage, $V_{PR}$, until precharging is stopped by transistor cutoff. When this happens, $V_B = -V_{TN}$. Before continuing the analysis, some attention is given to the on resistance of the MOS switches. $R_{S3}$ is normally made small by using wide transistors so that threshold fluctuations in the pass gate devices do not introduce dynamic offset into the CTA. Charge injection errors from the switch comprising $R_{S3}$ are small in effect because the switch changes state at the beginning of the precharge phase, whereas the critical injection errors are those occurring at the start of the amplify phase. On the other hand, $R_{S2}$ is made as large as possible with near minimum width transistors. This is preferable in order to minimize charge injection errors and reduce the output capacitance. Circuit simulations show that the precharge current does not exceed several tens of microamps for an appreciable time period, so even a comparatively large $R_{S2}$ reduces the drain voltage only slightly for a brief time.

Except for high sample rates which are basically impractical, the effects of  $R_{S2}$  and  $R_{S3}$  are negligible. Therefore, this analysis does not account for finite switch resistance. Their effects are described at a later point and illustrated with Spice simulations.

In the precharge phase, it is preferable to assume that transistor M1 is biased in the active region. This assumption avoids a separate analysis for each of three possible operating regions (active, triode, or subthreshold). An active bias definitely exists when the drain and gate voltages are equal *and* when subthreshold conduction currents do not pull the device substantially below cutoff. For now, an active-region bias is considered uniformly true and the square law is applied<sup>1</sup> [22] in order to write the current equality at node B as

$$C_T \frac{dV_{B_P}(t)}{dt} = \frac{\beta_n}{2} [V_{TN} + V_{B_P}(t)]^2, \qquad (4.3)$$

which can also be written

$$\frac{2C_T}{\beta_n} [V_{TN} + V_{B_P}(t)]^{-2} dV_{B_P}(t) = dt, \qquad (4.4)$$

<sup>1</sup>For submicron CMOS devices, the square law begins to become invalid. The relationship between current and effective gate voltage becomes asymptotically linear in deep submicron technologies. As shown later in this chapter, simulation results agree well with a model formulated on the basis of the square law for 0.6 $\mu$m CMOS. However, this agreement may not be observed for CTAs designed in a smaller process geometry.

where  $\beta_n = \mu_n C_{OX}(W/L)$  describes the relative strength of M1,  $\mu_n$  is the electron mobility,  $C_{OX}$  is the oxide capacitance per unit area and W/L is the transistor width/length ratio. Now both sides of (4.4) are integrated to find  $V_{B_P}(t)$ 

$$-\frac{2C_T}{\beta_n} [V_{TN} + V_{B_P}(t)]^{-1} = t + t_0$$

$$V_{TN} + V_{B_P}(t) = -\frac{2C_T}{\beta_n} \cdot \frac{1}{t + t_0}$$

$$V_{B_P}(t) = -V_{TN} - \frac{2C_T}{\beta_n} \cdot \frac{1}{t + t_0}.$$
(4.5)

The constant term,  $t_0$ , satisfies an initial condition that  $V_{B_P}(0) = V_{SS}$ . It also modifies the rate at which B settles toward its final value of  $-V_{TN}$ . Precharging is not exponential in t, so  $t_0$  is not called a time constant, but rather an *initial condition* constant or boundary condition constant.
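As a quick consistency check (a worked step added here, not part of the original derivation), differentiating (4.5) and substituting into (4.3) confirms that the solution satisfies the node equation. The left side of (4.3) becomes

$$C_T \frac{dV_{B_P}(t)}{dt} = C_T \cdot \frac{2C_T}{\beta_n} \cdot \frac{1}{(t + t_0)^2} = \frac{2C_T^2}{\beta_n (t + t_0)^2},$$

and the right side becomes

$$\frac{\beta_n}{2} [V_{TN} + V_{B_P}(t)]^2 = \frac{\beta_n}{2} \left( \frac{2C_T}{\beta_n} \right)^2 \frac{1}{(t + t_0)^2} = \frac{2C_T^2}{\beta_n (t + t_0)^2}.$$

Both sides agree. Their common value is the precharge current itself, which under these first-order assumptions decays as $1/(t + t_0)^2$ rather than exponentially.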

Interestingly, if nonzero  $R_{S3}$  were assumed in the above development, it would not be feasible to arrive at an analytical solution like (4.5). Rather, the equations would reduce to the following transcendental equation

$$-\frac{2C_T}{\beta_n} [V_{TN} + V_{B_P}(t)]^{-1} - 2R_{S3}C_T \ln\left(V_{TN} + V_{B_P}(t)\right) = t + t_0, \qquad (4.6)$$

which has no closed form solution.

Enumeration of the absolute limitations on the length of the precharge phase is a subjective process. On one hand, voltage gain is achieved even when node B does not fully precharge to  $-V_{TN}$ . However, if the precharge phase is too short, residual precharge current (RPC) will carry over into the amplify phase, leading to a high dynamic offset. This tradeoff is almost irrelevant in the newer differential architectures, but it is a severe problem in single-ended output designs.

A general form of  $t_0$  is found by solving (4.5) at t = 0 (recall that  $V_{B_P}(0) = V_{SS}$ ),

$$t_0 = \frac{2C_T}{\beta_n} \left( \frac{1}{-V_{TN} - V_{SS}} \right). \tag{4.7}$$

From (4.7) it is once again clear that  $V_{SS}$  must be less than  $-V_{TN}$ , in this case to avoid a negative  $t_0$ . Actually, a negative  $t_0$  is not physically impossible, but it does indicate a situation where no precharging can occur.

Now that  $t_0$  is known, (4.5) describes all of the interesting activity during the precharge phase. The circuit conditions at the end of the precharge phase can be determined by evaluating (4.5) at  $t_p$  seconds. Doing so sets up a boundary condition for the amplify phase.
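A minimal Python sketch of (4.5) and (4.7) follows, assuming ideal switches and a fixed threshold voltage. The capacitor, $\beta_n$ and supply values are those quoted for Figure 4.5, while the 0.7 V threshold and the function names are assumptions chosen for illustration; this is not the Matlab model of Appendix B.

```python
def precharge_t0(c_t, beta_n, v_tn, v_ss):
    """Boundary-condition constant t0 from (4.7)."""
    headroom = -v_tn - v_ss
    if headroom <= 0:
        raise ValueError("V_SS must be below -V_TN for precharging to occur")
    return (2.0 * c_t / beta_n) / headroom

def v_b_precharge(t, c_t, beta_n, v_tn, v_ss):
    """Node-B voltage t seconds into the precharge phase, from (4.5)."""
    t0 = precharge_t0(c_t, beta_n, v_tn, v_ss)
    return -v_tn - (2.0 * c_t / beta_n) / (t + t0)

C_T, BETA_N, V_SS = 500e-15, 1e-3, -2.5   # values quoted for Figure 4.5
V_TN = 0.7                                 # assumed threshold, not from the text

for t in (0.0, 1e-9, 10e-9, 100e-9):
    v_b = v_b_precharge(t, C_T, BETA_N, V_TN, V_SS)
    print(f"t = {t:.0e} s  V_B = {v_b:.4f} V")
```

Evaluating `v_b_precharge` at $t_p$ gives the boundary condition handed to the amplify phase.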

## **Amplify Phase**

The amplify phase begins when the output node is disconnected from  $V_{PR}$ . An equivalent circuit at this point is shown in Figure 4.1(c). At the onset,  $V_B$  is still charging towards a final potential of  $-V_{TN}$ . According to (4.5), the initial voltage  $V_{B_A}(0)$  is

$$V_{B_A}(0) = V_{B_P}(t_p) \qquad (4.8)$$

$$= -V_{TN} - \frac{2C_T}{\beta_n} \cdot \frac{1}{t_{pe}}, \qquad (4.9)$$

where  $t_{pe} = t_0 + t_p$ . Applying a positive  $\Delta V_{IN}$  at the gate of M1 activates the NMOS device temporarily, until M1 becomes cutoff again by the same process described earlier. In a manner similar to the development of (4.5), the voltage at node B during the amplify phase is

$$V_{B_A}(t) = -V_{TN} + \Delta V_{IN} - \frac{2C_T}{\beta_n} \cdot \frac{1}{t+t_1}$$
(4.10)

where  $t_1$  is another boundary condition constant. The value of  $t_1$  is found by solving (4.10) at t = 0 and noting that  $V_{B_A}(0) = V_{B_P}(t_p)$ ,

$$t_1 = \frac{t_{pe}}{1 + \frac{\beta_n}{2C_T} \Delta V_{IN} t_{pe}}.$$
 (4.11)

It can be seen from (4.11) that both $\Delta V_{IN}$ and $t_{pe}$ affect the dynamic properties of the amplify phase. There is an intuitive explanation behind this result. When $\Delta V_{IN}$ is large, M1 becomes more strongly activated and conducts more current at the start of the amplify phase. Conversely, when $\Delta V_{IN}$ is small, so is the relative on current of M1. Note that since $t_{pe} = t_p + t_0$, it is an indirect function of supply voltage, threshold voltage and MOS W/L ratio. Therefore, the dependence of $t_1$ on $t_{pe}$ in (4.11) reveals a highly nonlinear relationship between the behavior of the amplify phase and all of the internal and external circuit parameters. It is important to mention that $t_1 \leq t_{pe}$, with equality occurring when $\Delta V_{IN} = 0$ V.
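The boundary-condition step can be sketched in Python directly from (4.10) and (4.11). The function names are illustrative assumptions, a positive $\Delta V_{IN}$ is assumed, and the threshold voltage is held fixed as in the first-order analysis.

```python
def amplify_t1(t_pe, delta_v_in, c_t, beta_n):
    """Boundary-condition constant t1 from (4.11)."""
    return t_pe / (1.0 + (beta_n / (2.0 * c_t)) * delta_v_in * t_pe)

def v_b_amplify(t, t_pe, delta_v_in, c_t, beta_n, v_tn):
    """Node-B voltage t seconds into the amplify phase, from (4.10)."""
    t1 = amplify_t1(t_pe, delta_v_in, c_t, beta_n)
    return -v_tn + delta_v_in - (2.0 * c_t / beta_n) / (t + t1)

# Hypothetical operating point: t_pe = 5 ns, 10 mV input step.
print(amplify_t1(5e-9, 0.010, 500e-15, 1e-3))
print(v_b_amplify(0.0, 5e-9, 0.010, 500e-15, 1e-3, 0.7))
```

At $t = 0$ the second function returns the same voltage as (4.9), which is the continuity condition that defines $t_1$.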

The equations above indicate that a relationship exists between the amplification process and the lengths of the operating phases. In fact, for a fixed gain, there is a minimum amplify phase duration,  $t_{a-min}$ , which can be expressed explicitly in terms of  $t_p$ . This relationship is described as follows.

When the precharge phase is extremely short, M1 has a high residual precharge current (RPC) in the amplify phase. This gives the MOS device a functionally large gain-bandwidth product and expedites the charge transfer process, thereby facilitating a comparatively short amplify phase. A long precharge phase leaves the MOS device with almost no RPC, necessitating a longer amplify phase for a small input stimulus. Figure 4.2 illustrates the dependence of $t_{a-min}$ on $t_p$ for constant gain factors. The curves are defined by the following equation, a result of algebraically combining (4.5), (4.7), (4.10) and (4.11):

$$t_{a-min} = -\frac{1}{2}(t_1 + t_{pe}) + \frac{1}{2}\sqrt{(t_1 + t_{pe})^2 - 4t_1t_{pe} + \frac{8C_T(t_1 - t_{pe})}{\beta_n(\alpha - 1)\Delta V_{IN}}}, \quad (4.12)$$

where $\alpha$ is the normalized gain ranging from 0 to 1 (as long as $\alpha > 0$, the radical will never become imaginary). The open loop voltage gain, $A_O$, is now modified from the idealized result in (2.5) to a more realistic expression,

$$A_O = \frac{\Delta V_O}{\Delta V_{IN}} = \alpha \frac{C_T}{C_L}.$$
(4.13)

In the NMOS CTA, it should be clear that $\alpha \in (0,1)$.

The value of  $t_{a-min}$  is calculated vs.  $t_p$  for fixed- $\alpha$  (fixed gain) in Figure 4.2. Several important dynamic properties are apparent from a comparison of the results for two values of  $\Delta V_{IN}$ , 10 mV and 100 mV. First, over a wide range, decreasing  $t_p$  results in a proportional decrease in  $t_{a-min}$ . This suggests that the CTA possesses an ability to dynamically self-adjust its response time according to the sampling frequency. In contrast to continuous-time, single-pole dynamic amplifiers, where settling time and resolution time are absolute, the CTA demonstrates a capability to reduce or extend the time needed for amplification in response to the degree of precharging.



Figure 4.2: Fixed-gain curves for the NMOS CTA

At high speeds (small $t_p$ and $t_{a-min}$) the fixed-$\alpha$ curves overlap for any input signal size. The reason for this behavior is the high RPC that exists for short $t_p$. The high current allows basically the same amount of charge transfer to occur whether the input is small or large. Also, there is a clear limit to how much $t_{a-min}$ can be reduced just by lowering $t_p$. This suggests a maximum operating frequency of the CTA. It is noteworthy that the range over which $t_{a-min}$ tracks $t_p$ is anywhere from three to six orders of magnitude, implying that the gain response ought to appear flat over as many orders of magnitude in sample rate.

When $t_{pe}$ becomes large, $t_{a-min}$ levels off at a certain point because a device in cutoff has a fixed, signal-size-dependent response time no matter how long it has been in cutoff.

Figure 4.3: Normalized amplitude plots for various $t_a:t_p$ ratios (level 1 Spice models)

Admittedly, the above analysis is primarily theoretical, since in most applications it is more practical to fix the $t_a:t_p$ ratio at an integer ratio, such as 1:1 or 2:1. This represents the case where a simple clock input is used to generate all three timing phases. For an arbitrary $t_a:t_p$ ratio, the following equation for $\alpha$ is obtained by rearranging (4.12),

$$\alpha = 1 - \frac{2C_T}{\Delta V_{IN}\beta_n} \left( \frac{1}{t_a + t_1} - \frac{1}{t_a + t_{pe}} \right).$$
(4.14)

Figure 4.3 shows three amplitude plots generated from (4.14) for three simple integer  $t_a:t_p$  ratios. Simulations with level 1 Spice models and ideal switches are also shown to confirm agreement between theory and first-order physical behavior. Important second-order effects necessary for agreement with more realistic transistor models are discussed in the following sections.
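One way to see where (4.14) comes from, assuming $\alpha$ is taken as the net change at node B (stimulated response minus unstimulated response) divided by $\Delta V_{IN}$, is the following worked step. With no input applied, (4.5) simply continues through the amplify phase, so after $t_a$ seconds node B sits at $-V_{TN} - \frac{2C_T}{\beta_n} \cdot \frac{1}{t_a + t_{pe}}$. Subtracting this from (4.10) evaluated at $t_a$ gives

$$\alpha \, \Delta V_{IN} = \Delta V_{IN} - \frac{2C_T}{\beta_n} \left( \frac{1}{t_a + t_1} - \frac{1}{t_a + t_{pe}} \right),$$

which is (4.14) after dividing by $\Delta V_{IN}$. Two limits serve as a check. At $t_a = 0$, (4.11) gives

$$\frac{1}{t_1} - \frac{1}{t_{pe}} = \frac{\beta_n}{2C_T} \Delta V_{IN},$$

so $\alpha = 0$, and as $t_a \to \infty$ the bracketed term vanishes, so $\alpha \to 1$.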

With reference to Figure 4.3, favoring the amplify phase provides better gain performance, although it demands increasingly complex timing circuitry and a faster input clock to maintain a given sample rate. As expected, voltage gain can be made flat or nearly flat over a wide range of frequencies.
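A minimal Python sketch of (4.14) is given below, together with a numerical search for the shortest amplify time that reaches a target normalized gain. The bisection search is an assumption made for illustration (it inverts (4.14) numerically rather than using a closed form), and the parameter values, threshold voltage and function names are likewise hypothetical. Ideal switches and a fixed $V_{TN}$ are assumed.

```python
def alpha(t_a, t_p, delta_v_in, c_t, beta_n, v_tn, v_ss):
    """Normalized gain from (4.14), using t0 from (4.7) and t1 from (4.11)."""
    a = 2.0 * c_t / beta_n
    t0 = a / (-v_tn - v_ss)                       # (4.7)
    t_pe = t0 + t_p
    t1 = t_pe / (1.0 + delta_v_in * t_pe / a)     # (4.11)
    return 1.0 - (a / delta_v_in) * (1.0 / (t_a + t1) - 1.0 / (t_a + t_pe))

def t_a_min(target_alpha, t_p, delta_v_in, c_t, beta_n, v_tn, v_ss, iters=100):
    """Shortest amplify time reaching target_alpha, found by bisection."""
    if not 0.0 < target_alpha < 1.0:
        raise ValueError("target_alpha must lie strictly between 0 and 1")
    args = (t_p, delta_v_in, c_t, beta_n, v_tn, v_ss)
    lo, hi = 0.0, 1e-12
    while alpha(hi, *args) < target_alpha:        # grow the bracket
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if alpha(mid, *args) < target_alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical device: C_T = 500 fF, beta_n = 1 mA/V^2, V_TN = 0.7 V, V_SS = -2.5 V.
DEVICE = (0.010, 500e-15, 1e-3, 0.7, -2.5)        # first entry is a 10 mV input
for ratio in (1, 2, 3):                           # t_a : t_p = ratio : 1
    t_p = 10e-9
    print(ratio, alpha(ratio * t_p, t_p, *DEVICE))
```

Sweeping `t_p` with a fixed `ratio` traces one amplitude curve per ratio, while `t_a_min` traces a fixed-gain curve.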

Looking now at performance vs. supply voltage, Figure 4.4 shows normalized gain amplitude plots for several values of $V_{SS}$. A larger supply does in fact permit faster sampling rates, up to 10x faster when the supply is raised from 1.05 V to 1.65 V. Of course, the power expense may outweigh the speed benefit over alternative amplifier architectures that are optimized for high speed. Both Figures 4.3 and 4.4 seem to demonstrate good first-order agreement between simulation and analysis.

Figure 4.4: Normalized amplitude plots for various $V_{SS}$ potentials (level 1 Spice models)

## **Threshold Voltage Effects**

The formulation leading to the closed form solutions in (4.12) and (4.14) is admittedly simplified, particularly with respect to second-order effects on threshold voltage. In reality, the body effect plays a critical role in the precharge and amplify phases. This becomes apparent if the CTA is viewed correctly as a source follower amplifier. For example, a positive applied $\Delta V_{IN}$ tends to increase the source voltage $(V_B)$ proportionally, but the source follower gain is always less than unity. As the source voltage increases, so does the threshold voltage, thereby limiting the rise at $V_B$ to some amount less than $\Delta V_{IN}$. A source follower gain of 0.8 – 0.9 is not uncommon, depending on process and scaling parameters.



Figure 4.5: Precharging with and without threshold voltage modulation, and comparison to Spice simulations

If used in a recursive calculation, (4.4) and (4.5), and the consequent equations (4.12) and (4.14), are still valid when  $V_{TN}$  is modulated by the body effect. The body effect is most commonly described as [23]

$$V_{TN} = V_{THO} + \gamma \left( \sqrt{\Phi_F + |V_{SB}|} - \sqrt{\Phi_F} \right)$$
(4.15)

where $V_{THO}$ is the threshold voltage when $V_{SB} = 0$ V (the "zero bias threshold"), $\Phi_F$ is a reference voltage related to semiconductor doping and $\gamma$ is the so-called body effect coefficient, with units of $\sqrt{V}$. Values for $\gamma$ range from 0.3 to 0.9. To illustrate the relationship between time and $V_B$ during the precharge phase, along with the potential error caused by ignoring the body effect, Figure 4.5 compares dynamic settling behavior from (4.5) against Spice simulation using BSIM3v3.1 models. Calculations with and without the body effect are shown for $V_{SS} = -2.5$ V, $C_T = 500$ fF and $\beta_n = 1$ mA/V$^2$. Clearly the accuracy of the model is improved by accounting for the body effect.

Yet another important factor must be considered for a reliably accurate estimate of threshold modulation in a CTA. In the amplify phase, an applied signal simultaneously increases $V_B$ and decreases $V_O$ due to the inverting voltage gain. Again viewing the transistor amplifier as a source follower, it may be said that $V_{DS}$ always converges toward a smaller value in response to a positive input signal. This tends to raise the threshold voltage in the amplify phase due to drain-induced barrier lowering (DIBL). One way to interpret DIBL is that the change in threshold voltage is linearly proportional (with opposite sign) to the drain-source potential [23]. For submicron-length MOS devices, DIBL can further reduce the source follower gain to 0.6 – 0.7.

The above two processes are easily applied to the analysis presented here, although they were neglected for simplicity in the foregoing first-order equations and simulations. One method would be to construct a recursive environment in which sources of threshold modulation are constantly incorporated during the precharge and amplify phases. While doing so eliminates the simplicity of a single closed form equation for voltage gain, it is necessary to create a realistic model of the voltage transfer function. The Matlab model described in Section 4.3 (see also Appendix B) incorporates threshold modulation by adding one extra calculation step in each of the precharge and amplify phases.
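As a minimal illustration of a recursive threshold calculation, the Python sketch below applies (4.15) in a fixed-point loop to estimate the fully settled precharge voltage, $V_B = -V_{TN}(V_{SB})$. It is not the Appendix B script. It assumes the bulk of M1 is tied to $V_{SS}$, so that $V_{SB} = V_B - V_{SS}$, it ignores DIBL, and the parameter values and function names are hypothetical.

```python
import math

def v_tn_body(v_sb, v_th0, gamma, phi_f):
    """Threshold voltage including the body effect, following (4.15)."""
    return v_th0 + gamma * (math.sqrt(phi_f + abs(v_sb)) - math.sqrt(phi_f))

def settled_precharge_voltage(v_ss, v_th0, gamma, phi_f, tol=1e-12, max_iter=200):
    """Fixed-point solve of V_B = -V_TN(V_B - V_SS); bulk assumed tied to V_SS."""
    v_b = -v_th0                                  # first guess: no body effect
    for _ in range(max_iter):
        v_next = -v_tn_body(v_b - v_ss, v_th0, gamma, phi_f)
        if abs(v_next - v_b) < tol:
            return v_next
        v_b = v_next
    raise RuntimeError("threshold iteration did not converge")

# Hypothetical parameters: V_SS = -2.5 V, V_THO = 0.7 V, gamma = 0.5, Phi_F = 0.7 V.
print(settled_precharge_voltage(-2.5, 0.7, 0.5, 0.7))
```

The same update could be applied at each time step of (4.5) or (4.10) rather than only at the settled point, at the cost of losing the single closed-form expression.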

#### 4.2 Analysis of the CMOS Charge-transfer Amplifier

As explained earlier, residual precharge current introduces large offset errors in NMOS CTAs, especially at high sample rates. Furthermore, subthreshold current from M1 is projected onto $C_L$ in the amplify phase, thereby creating a potentially large offset which is difficult to characterize. In addition, the NMOS CTA amplifies only in one polarity. Negative $\Delta V_{IN}$ is not amplified by the charge-transfer mechanism. For these two reasons, the CMOS CTA was proposed.

The CMOS charge-transfer amplifier is essentially composed of a PMOS CTA in parallel with an NMOS CTA. Residual precharge current from the PMOS channel is opposite in direction relative to RPC from the NMOS channel, canceling the net current projected onto $C_L$. Moreover, the circuit amplifies positive and negative polarities as explained in Section 3.2. Figure 4.6 shows the CMOS CTA with respective component channels outlined. It is a fairly simple exercise to show that an appropriate $\beta$-ratio (with carefully selected drain areas) leads to a theoretical nulling of the offset voltage. Of course, the nulling is never perfect over temperature or process variations. Nominally, $\beta_n = \beta_p$ is achieved by making MP1 2 – 3 times wider than MN1. Dynamic behavioral analysis of the CMOS CTA leads to timing constraints and speed limitations similar to those discussed previously for the NMOS CTA. The analysis is not repeated here for brevity.

Figure 4.6: CMOS charge-transfer amplifier

At sample rates above a few MSPS, the CMOS CTA begins to amplify through both channels, since neither MP1 nor MN1 completely cuts off in the precharge phase. Accordingly, twice the  $C_T/C_L$  gain of the NMOS CTA is expected at higher speeds, particularly for signals in the mV regime. For example, taking into account attenuation from DIBL and the body effect, a CMOS CTA with  $C_T/C_L$  ratio of 5 operating above a few MSPS would be expected to yield a midband voltage gain of

$$A_{CMOS-CTA} = 2 \cdot \frac{C_T}{C_L} \cdot A_{DIBL+BE} \approx 2 \cdot 5 \cdot 0.6 = 6, \qquad (4.16)$$

where  $A_{DIBL+BE}$  is the approximate loss due to the body effect and DIBL, which as discussed in the previous section is potentially as low as 0.6. The result in (4.16) was observed almost exactly through the simulations featured in [2].



Figure 4.7: Enhanced differential charge-transfer amplifier

## 4.3 Analysis of the Fully-differential Charge-transfer Amplifier

Due to the inevitable imbalance between the PMOS and NMOS channels, a CMOS CTA exhibits a finite mean offset voltage. In fact, the offset voltage can be quite large and is naturally impossible to predict deterministically. An enhanced differential CTA (DCTA) architecture is proposed in this work as a means of further improving offset voltage. An analysis of the dynamic behavior of the DCTA is given here as a continuation of the analysis above. However, the advantages and unique attributes of this new circuit are explained in further detail in Section 5.1 in connection with the introduction of two other novel charge-transfer amplifier architectures.

Shown in Figure 4.7, the DCTA nulls any native offset imbalance by matching each device to a same-type differential counterpart. It also processes input signals in true differential mode. This affords the possibility of greater precision and area efficiency in ADC applications. In this circuit, two CMOS CTAs are placed in parallel with one key alteration: capacitors  $C_{T1}$  and  $C_{T2}$  cross-couple the CTAs dynamically. The configuration allows a single-ended input signal to be amplified differentially. The modified connection of  $C_{T1}$  and  $C_{T2}$  allows both channels to process the input signal, rather than just cancel offset voltage in a pseudo-differential fashion, and also eliminates half of the transfer capacitors (leaving the same number of  $C_T$  elements as in the CMOS CTA).

Just as before, the reset phase is RC exponential, dominated by the time constant associated with the  $C_T$  capacitors. No further discussion about the reset phase is given here. The precharge and amplify phases are considered in detail in this section. Before moving further, the following terms are defined for simplicity in the equations of the analysis:

$$a_{n} = \frac{2C_{T}}{\beta_{n}}, \qquad b_{n} = \sqrt{\frac{\beta_{n}}{\beta_{p}}}, \qquad a_{p} = \frac{2C_{T}}{\beta_{p}}, \qquad b_{p} = \sqrt{\frac{\beta_{p}}{\beta_{n}}}$$

$$A_{n} = a_{n}(2b_{n} + 1), \qquad B_{nn} = -4a_{n}V_{TN}, \qquad B_{np} = 4a_{n}V_{TP}$$

$$A_{p} = a_{p}(2b_{p} + 1), \qquad B_{pp} = 4a_{p}V_{TP}, \qquad B_{pn} = -4a_{p}V_{TN}$$

It is preferable, but not required, that  $\beta_n = \beta_p$ . This has already been shown to be a desirable design goal for CTAs. If this assumption is true, then  $a_n = a_p$ ,  $b_n = b_p$ ,  $B_{pp} = B_{np}$ , and  $B_{nn} = B_{pn}$ .
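The shorthand terms defined above translate directly into a small helper, sketched here in Python. The function name, the dictionary layout and the example values (including a negative $V_{TP}$) are assumptions made for illustration.

```python
import math

def dcta_constants(c_t, beta_n, beta_p, v_tn, v_tp):
    """Shorthand terms used in the DCTA precharge and amplify equations."""
    a_n = 2.0 * c_t / beta_n
    a_p = 2.0 * c_t / beta_p
    b_n = math.sqrt(beta_n / beta_p)
    b_p = math.sqrt(beta_p / beta_n)
    return {
        "a_n": a_n, "a_p": a_p, "b_n": b_n, "b_p": b_p,
        "A_n": a_n * (2.0 * b_n + 1.0),
        "A_p": a_p * (2.0 * b_p + 1.0),
        "B_nn": -4.0 * a_n * v_tn,
        "B_np": 4.0 * a_n * v_tp,
        "B_pp": 4.0 * a_p * v_tp,
        "B_pn": -4.0 * a_p * v_tn,
    }

# Hypothetical matched design: beta_n = beta_p = 1 mA/V^2, C_T = 500 fF.
k = dcta_constants(500e-15, 1e-3, 1e-3, 0.7, -0.9)
print(k["A_n"], k["B_np"])
```

With matched $\beta$ values the helper reproduces the equalities listed above, which makes it a convenient sanity check before evaluating the node equations.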

#### **Precharge Phase**

In the precharge phase nodes B and D behave identically, as do nodes A and C. For brevity, analysis of the dynamic behavior at B and C is expanded here and the results are equated to nodes A and D.

Figure 4.8 is a simplified schematic showing B and C with the associated transfer capacitor,  $C_{T1}$ . At  $t=0^-$  nodes B and C are reset to  $V_{SS}$  and  $V_{DD}$  respectively. By Kirchhoff's Law, the current through MN1 equals the current through  $C_{T1}$  and MP2. The following set of two equations with two unknowns represents this equality:



Figure 4.8: Simplified portion of a DCTA

$$\frac{\beta_n}{2} (V_{TN} + V_{B_P}(t))^2 = C_T \frac{dV_{CAP}(t)}{dt}$$
(4.17)

$$V_{C_P}(t) + V_{TP} = -b_n (V_{B_P}(t) + V_{TN}), \qquad (4.18)$$

which is simplified to one equation by replacing $V_{CAP}$ in (4.17) with $V_{B_P}-V_{C_P}$ and substituting $V_{C_P}$ with the equivalent in terms of $V_{B_P}$ obtained by rearranging (4.18). The resulting equality, where $V_{B_P}(t) + V_{TN} = V_1(t)$, is given and solved for $V_{B_P}(t)$ as follows:

$$V_{1}(t)^{2} = a_{n} \frac{d}{dt} \left( V_{TP} + V_{TN} b_{n} + V_{B_P}(t)(b_{n} + 1) \right)$$

$$\int dt = a_{n} \int V_{1}(t)^{-2} \cdot d(V_{TP} + V_{TN} b_{n} + V_{B_P}(t)(b_{n} + 1))$$

$$t + t_{0} = a_{n} \left[ -V_{1}(t)^{-1} - 2b_{n} V_{1}(t)^{-1} + V_{TP} V_{1}(t)^{-2} \right]$$

$$V_{1}(t) = \frac{1}{2} \frac{-A_{n}}{t + t_{0}} - \frac{1}{2} \sqrt{\left(\frac{A_{n}}{t + t_{0}}\right)^{2} + \frac{B_{np}}{t + t_{0}}}$$

$$V_{B_P}(t) = -V_{TN} - \frac{1}{2} \left[ \frac{A_{n}}{t + t_{0}} + \sqrt{\left(\frac{A_{n}}{t + t_{0}}\right)^{2} + \frac{B_{np}}{t + t_{0}}} \right], \quad (4.19)$$

where  $t_0$  satisfies the initial conditions, but is not equal to the  $t_0$  derived previously for the NMOS CTA. By following a similar procedure beginning with (4.17) and (4.18),  $V_{C_P}(t)$  is found to be

$$V_{C_P}(t) = -V_{TP} + \frac{1}{2} \left[ \frac{A_p}{t+t_0} + \sqrt{\left(\frac{A_p}{t+t_0}\right)^2 + \frac{B_{pp}}{t+t_0}} \right].$$
(4.20)

Due to the circuit symmetry, the results obtained in (4.19) and (4.20) for  $V_{B_P}(t)$  and  $V_{C_P}(t)$  are equated to  $V_{D_P}(t)$  and  $V_{A_P}(t)$  respectively.

It is now necessary to determine  $t_0$ , which in this case preserves the continuity of  $V_{CAP}$  from  $t=0^-$  to  $t=0^+$ . A deterministic solution for  $t_0$  is not as easily found as before for the NMOS CTA. The  $C_T$  capacitor maintains its voltage across the t=0 boundary, while  $V_B$  and  $V_C$  may themselves be discontinuous. In mathematical terms, the following equalities apply:

$$V_{B_P}(0^-) = V_{SS} \qquad (4.21)$$

$$V_{C_P}(0^-) = V_{DD} \qquad (4.22)$$

$$V_{B_P}(0^-) \neq V_{B_P}(0^+) \qquad (4.23)$$

$$V_{C_P}(0^-) \neq V_{C_P}(0^+) \qquad (4.24)$$

$$V_{B_P}(0^-) - V_{C_P}(0^-) = V_{B_P}(0^+) - V_{C_P}(0^+) = V_{SS} - V_{DD}. \qquad (4.25)$$

From (4.25), the reset-precharge boundary equation is derived for the purpose of determining  $t_0$ ,

$$-V_{TN} + V_{TP} - \left[\frac{A_n}{t_0} + \sqrt{\left(\frac{A_n}{t_0}\right)^2 + \frac{B_{np}}{t_0}}\right] = V_{SS} - V_{DD}, \qquad (4.26)$$

which, if modified slightly, is of the form

$$|Ax + \sqrt{A^2 x^2 + Bx}| = C, \qquad (4.27)$$

where $x = 1/t_0$, $A = A_n$, $B = B_{np}$ and $C = V_{DD} - V_{SS} - V_{TN} + V_{TP}$. The absolute value does apply because only the magnitude (not phase) of $V_{CAP}$ is of interest. Depending on the sign of $A^2x^2 + Bx$, one of two possible real solutions for x exists:

$$x = \begin{cases} \frac{C^2}{B+2AC}, & A^2 x^2 \ge Bx \\ \\ \frac{-C^2}{B}, & A^2 x^2 < Bx \end{cases}$$
(4.28)

which is a general form used to calculate the candidate values of  $t_0$ . Assuming  $\beta_n = \beta_p$ , (4.28) is used to find a solution for  $t_0$ ,

$$t_{0} = \begin{cases} \frac{B+2AC}{C^{2}}, & \frac{A^{2}}{t_{0}^{2}} \ge \frac{B}{t_{0}} \\ \\ \frac{B}{-C^{2}}, & \frac{A^{2}}{t_{0}^{2}} < \frac{B}{t_{0}}. \end{cases}$$
(4.29)

Note that both possible solutions must be calculated, and then a decision made as to which one is correct depending on the sign in the radical and the sign of $t_0$. For a reasonable set of circuit parameters, the second solution will always be positive and the first will either be positive or negative. The larger value is always chosen and the smaller value provides a mathematical, but not physical, solution to the differential equations of (4.17) and (4.18). Once a value for $t_0$ is obtained, $V_{B_P}(0^+)$ and $V_{C_P}(0^+)$ follow by (4.19) and (4.20).
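The candidate calculation can be sketched in Python for the matched case $\beta_n = \beta_p$. Both solutions of (4.28) are computed, and as an added safeguard (an assumption of this sketch, not a step stated in the text) each candidate is substituted back into (4.27) before the larger surviving value is returned. The magnitude in (4.27) is evaluated with complex arithmetic so that a negative radicand is handled. $V_{TP}$ is taken as a negative number, as the sign conventions of (4.20) suggest; all names and values are hypothetical.

```python
import cmath

def dcta_t0(c_t, beta, v_tn, v_tp, v_dd, v_ss, rel_tol=1e-9):
    """Reset-precharge boundary constant t0 for a matched DCTA (beta_n = beta_p)."""
    a_n = 2.0 * c_t / beta
    A = 3.0 * a_n                    # A_n = a_n (2 b_n + 1) with b_n = 1
    B = 4.0 * a_n * v_tp             # B_np
    C = v_dd - v_ss - v_tn + v_tp

    def lhs(x):
        # |A x + sqrt(A^2 x^2 + B x)| from (4.27); complex sqrt covers radicand < 0
        return abs(A * x + cmath.sqrt(A * A * x * x + B * x))

    candidates = []                  # candidate values of x = 1/t0 from (4.28)
    if B + 2.0 * A * C != 0.0:
        candidates.append(C * C / (B + 2.0 * A * C))
    if B != 0.0:
        candidates.append(-C * C / B)

    valid = [1.0 / x for x in candidates
             if x > 0.0 and abs(lhs(x) - C) <= rel_tol * abs(C)]
    if not valid:
        raise ValueError("no candidate satisfies the boundary equation")
    return max(valid)

# Hypothetical matched design with +/-2.5 V supplies.
print(dcta_t0(500e-15, 1e-3, 0.7, -0.9, 2.5, -2.5))
```

Substituting the returned $t_0$ into (4.19) and (4.20) at $t = 0$ should reproduce $V_{B_P}(0^+) - V_{C_P}(0^+) = V_{SS} - V_{DD}$.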

# **Amplify Phase**

Unlike in the precharge phase, differential counterpart nodes (e.g., A and C, B and D) do not behave the same in the amplify phase once  $\Delta V_{IN}$  is applied to node X. By a process similar to that above, it can be shown that  $V_{B_A}(t)$  and  $V_{C_A}(t)$  are described by

$$V_{B_{A}}(t) = -V_{TN} + \Delta V_{IN} - \frac{1}{2} \left[ \frac{A_{n}}{t + t_{1_{BC}}} + \sqrt{\left(\frac{A_{n}}{t + t_{1_{BC}}}\right)^{2} + \frac{B_{np}}{t + t_{1_{BC}}} + \frac{4a_{n}b_{n}\Delta V_{IN}}{t + t_{1_{BC}}}} \right]$$
(4.30)

and

$$V_{C_A}(t) = -V_{TP} + \frac{1}{2} \left[ \frac{A_p}{t + t_{1_{BC}}} + \sqrt{\left(\frac{A_p}{t + t_{1_{BC}}}\right)^2 + \frac{B_{pp}}{t + t_{1_{BC}}} + \frac{4a_p b_n \Delta V_{IN}}{t + t_{1_{BC}}}} \right], \quad (4.31)$$

where $t_{1_{BC}}$ is used to maintain continuity of $V_{CAP}$ at the precharge-amplify phase boundary. When $\Delta V_{IN} = 0$ V, $t_{1_{BC}}$ reduces to $t_0 + t_p$. Analysis of $V_{A_A}(t)$ and $V_{D_A}(t)$ yields

$$V_{D_A}(t) = -V_{TN} - \frac{1}{2} \left[ \frac{A_n}{t + t_{1_{AD}}} + \sqrt{\left(\frac{A_n}{t + t_{1_{AD}}}\right)^2 - \frac{B_{nn}}{t + t_{1_{AD}}} + \frac{4a_n b_p \Delta V_{IN}}{t + t_{1_{AD}}}} \right]$$
(4.32)

$$V_{A_{A}}(t) = -V_{TP} + \Delta V_{IN} + \frac{1}{2} \left[ \frac{A_{p}}{t + t_{1_{AD}}} + \sqrt{\left(\frac{A_{p}}{t + t_{1_{AD}}}\right)^{2} - \frac{B_{pn}}{t + t_{1_{AD}}} + \frac{4a_{p}b_{p}\Delta V_{IN}}{t + t_{1_{AD}}}} \right], \quad (4.33)$$

where  $t_{1_{AD}}$  maintains the continuity of  $V_{CAP}$  in the corresponding signal path.

The constants  $t_{1_{BC}}$  and  $t_{1_{AD}}$  depend on many factors, including  $t_p$ ,  $t_0$  and  $\Delta V_{IN}$ . The general form in (4.28) also applies to the determination of  $t_{1_{BC}}$  and  $t_{1_{AD}}$  (where  $x = 1/t_{1_{ij}}$ ), with the one notable alteration being that  $C = V_{C_A}(0^-) - V_{B_A}(0^-) - V_{TN} + V_{TP} \pm \Delta V_{IN}$  ( $+\Delta V_{IN}$  for  $t_{1_{BC}}$  and  $-\Delta V_{IN}$  for  $t_{1_{AD}}$ ).

With dynamic voltage equations for all nodes and associated boundary condition terms in hand, it is possible to compute the amplifier voltage gain response. One problem in this type of circuit, where devices often operate near threshold, is that input voltages of sufficiently large positive or negative polarities will not be fully amplified by one or the other signal path due to device cutoff. With the NMOS CTA, it was assumed simply that only positive polarity signals applied; however, at sufficiently high speeds, the NMOS CTA (and likewise the CMOS CTA and DCTA) can in fact amplify a wide range of dual-polarity signals, provided that the input signal does not drive the active devices into the cutoff region. This effect, which grays the boundary between large-signal and small-signal analysis, is rather easily accounted for in the computational environment by setting cutoff or subthreshold conduction rules.

Still another effect which causes trouble in the mathematical formulation is device saturation (to borrow from bipolar junction transistor terminology), whereby the source and drain nodes of active transistors converge toward zero during the charge transfer process. A categorically large signal effect, this can be viewed as a variation of the clipping that occurs in static amplifiers when the ideal amplitude of the output signal exceeds the supply voltage. Moreover, the source-drain voltage convergence first pushes active devices into the linear region, rendering the above stated mathematical models inaccurate. Therefore, equations developed here apply more specifically to small signal (mV range, not to be confused with small signal AC analysis) estimation. It should be emphasized, however, that small signal estimation provides the most useful information about accuracy and offset voltage for comparator preamplifiers. Furthermore, amplifier designers are already accustomed to similar limitations in large signal analyses.

#### 4.3.1 Voltage Transfer Function

The mathematical formulations presented above were implemented in a Matlab script (see Appendix B) for comparison with the Spice simulated voltage transfer function of a typical DCTA. Normalized voltage gain was obtained by observing net changes at nodes A through D (the stimulated responses minus the no-stimulus responses), rather than monitoring the output and dividing by the capacitor ratio,  $C_T/C_L$ . Either method would yield the same gain result, but the methodology followed here provides inherent normalization for a more universal characteristic. Typical device model parameters were used to correlate results with BSIM3v3.1 Spice models for AMI Semiconductor's 0.6 $\mu$m CMOS process. Recursive calculations were performed in both the precharge and amplify phases to account for second-order threshold modulation and consequent gain degradation.



Figure 4.9: Comparative normalized amplitude plot

Figure 4.9 contrasts the calculated gain transfer function with simulation for a 1 mV input stimulus and various  $t_a:t_p$  ratios. Figure 4.10 shows gain curves for a range of operating supply voltages. Good correlation is observed for sample frequencies below around 45 MHz. Computational errors above that rate are due to the neglected switch resistances, which introduce poles in the simulated gain transfer function and limit the actual available bandwidth. However, the agreement of computation and simulation data indicates that the analysis and equations presented in this work are reliable up to moderate video or ultrasound speeds. A model that includes finite switch resistance would provide improved validity at operating frequencies approaching the functional limits of the amplifier. This is considered in Chapter 8 as an area for future research.

While it is difficult to see from the linear y-axis of Figures 4.9 and 4.10, the simulation response reveals two identical high-frequency poles, where the roll-off is 40 dB/decade as opposed to the 20 dB/decade response of a single pole amplifier. This follows because there are two identical poles associated with the two transfer capacitors in the DCTA.
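The 40 dB/decade figure follows directly from cascading two identical first-order poles. As a minimal sketch (frequency is normalized to the pole frequency, which is not a value taken from the simulations, and the function names are illustrative), the asymptotic slope can be checked numerically:

```python
import math

def gain_db(f_over_fp: float, n_poles: int) -> float:
    """Magnitude in dB of n identical real poles at normalized frequency f/fp."""
    return -10.0 * n_poles * math.log10(1.0 + f_over_fp ** 2)

def slope_db_per_decade(n_poles: int, f_start: float = 100.0) -> float:
    """Roll-off measured over one decade, far above the pole frequency."""
    return gain_db(10.0 * f_start, n_poles) - gain_db(f_start, n_poles)

single = slope_db_per_decade(1)   # single-pole amplifier
double = slope_db_per_decade(2)   # two identical poles, as in the DCTA
print(f"one pole:  {single:.2f} dB/decade")
print(f"two poles: {double:.2f} dB/decade")
```

Well above the pole frequency the two-pole slope is twice the single-pole slope, which is the behavior described for the simulated DCTA response.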



Figure 4.10: Comparative normalized amplitude plot

An interesting property of the transfer function is that the normalized gain exceeds unity over midband frequencies. This is due to the fact that voltage gain essentially doubles as a byproduct of having both nodes of the  $C_T$  devices transfer charge to or from the load capacitors. This quadruples the area efficiency over the pseudo-differential CTA because half the transfer capacitors are used *and* each contributes twice as much charge-transfer capacity to the overall gain. Ideally, the gain factor reaches 4 V/V at midband frequencies, where both NMOS and PMOS input channels are activated, and 2 V/V at low frequencies as described in Section 4.2. However, the benefit in voltage gain is not without cost, since the extra charge adds to the amplifier's power consumption. The gradual increase in gain with frequency is a consequence of the nonlinear  $t_p$  dependence in the gain equations.

#### 4.3.2 Statistical Variation and Offset Voltage

In the DCTA, device mismatch results in two modes of offset voltage contribution: *charge injection* and *channel mismatch errors*. Each of these is examined now. Offset voltage is first evaluated qualitatively and then simulated using a Monte Carlo approach to estimating the worst-case behavior. In Section 4.5, measured offset data are compared with the simulations.

## **Charge Injection**

In Figure 4.7, there are 8 matched pairs of CMOS switches (16 total) providing dynamic bias, isolation, and signal coupling to the amplifier. As discussed in Section 4.4, the 4 switch pairs controlled by S1 or \*S1 do not change state at the start of the amplify phase and hence contribute a negligible amount of offset voltage by charge injection. The remaining 4 switch pairs change state on the rising edge of S2 (the onset of the amplify phase) and therefore are more likely to cause dominant errors.

Charge injection mismatch from any of the 3 switch pairs at the input coupling capacitors is projected onto the input nodes X and Y, contributing an offset equal to the total differential charge divided by the input capacitance. The output switch pair likewise projects mismatched charge injection onto the output nodes. Any resulting offset here is divided by the amplifier gain to yield an input-referred value. It is assumed that offset due to charge sharing from the output switch pair, when divided by the amplifier gain, is small compared to that caused by the input switches.
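The two projections described above can be put into a short numerical sketch. All charge and capacitance values below are hypothetical placeholders chosen only to illustrate the arithmetic, and the assumption that the output error voltage equals the mismatched charge divided by $C_L$ is a simplification introduced here:

```python
def charge_injection_offset(dq_input: float, c_input: float,
                            dq_output: float, c_load: float,
                            gain: float) -> tuple:
    """Return (input term, output term) of input-referred offset in volts.

    dq_input  : net differential charge injected at nodes X/Y (C), hypothetical
    c_input   : capacitance seen at an input node (F), hypothetical
    dq_output : net differential charge injected at the outputs (C), hypothetical
    c_load    : load capacitance C_L (F)
    gain      : amplifier voltage gain (V/V)
    """
    v_input_term = dq_input / c_input             # appears directly at the input
    v_output_term = (dq_output / c_load) / gain   # output error referred to the input
    return v_input_term, v_output_term

# Hypothetical example: 0.05 fC mismatch at each location,
# 200 fF input capacitance, 100 fF load, gain of 10.
v_in_term, v_out_term = charge_injection_offset(0.05e-15, 200e-15,
                                                0.05e-15, 100e-15, 10.0)
total = v_in_term + v_out_term
print(f"input switches:  {v_in_term * 1e6:.0f} uV")
print(f"output switches: {v_out_term * 1e6:.0f} uV (input-referred)")
```

With equal injected charge at both locations, the output contribution is reduced by the gain, which is the reason the text treats the input switches as dominant.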

Input charge sharing depends more on the input voltage and on the behavior of controlling clock edges (S2 and its complement) than on any other circuit parameter [22, 24]. For this reason, it is expected that the input offset will be signal dependent, but fixed over frequency. In this case, simple CMOS switches sized as described in Section 4.4 lead to a worst-case (Monte Carlo) simulated input-referred offset of about 500  $\mu$ V.

## **Channel Mismatch Errors**

Channel mismatch also influences offset voltage. In Figure 4.7, there are essentially two amplification "channels," one through CT1 and the other through CT2. Counterpart devices – capacitors, switches, or transistors – in each path directly affect the balanced precharging and charge-transfer process. Any mismatch in these



Figure 4.11: Simulated offset voltage

devices results in bias current shifts between the channels. The imbalance becomes progressively worse with frequency because more residual current exists in the amplify phase.

The most critical matched devices are the transfer capacitors  $(C_T)$ , the output capacitors  $(C_L)$ , and the active transistors (MN1, MN2, MP1, and MP2). The next most critical devices are switches in the signal path, specifically those coupling the transfer capacitors to the respective transistor source nodes (the S1 switches). As mentioned earlier, offset from these switches is safely neglected if they are scaled to a W/L ratio of about 10–20.

In Monte Carlo simulations, the dynamic offset of the present amplifier due to channel mismatch ranges from 600  $\mu$ V at 100 Hz to 4 mV at 30 MHz. Combining channel mismatch with charge injection in simulation leads to the predicted offset curves given in Figure 4.11, where part (a) presents the data with a linear frequency scale and part (b) uses a log scale to show the low-frequency detail.

## 4.4 Design Tradeoffs and Considerations

This section describes circuit design parameters used in simulation, and layout techniques used in fabricated test chips. A number of important design considerations presented here draw upon the analysis in [2]. The obvious decision which must be made first deals with the tradeoff between voltage gain and sampling bandwidth. In practice, large gain reduces the input referred offset voltage, but at the expense of speed, since the transfer capacitor size scales linearly with the desired gain. Moreover, larger transfer capacitors increase the dynamic power dissipation proportionally.

In designing a comparator with a standard latch in 0.6  $\mu$m CMOS, the load capacitance of the CTA is about 100 fF. A capacitor ratio of 6 can be set by making  $C_T$  around 600 fF. This was done in a test chip in order to realize a target midband voltage gain of 10 (referring to the data in Figure 4.9 and assuming a  $t_a:t_p$  ratio of 2:1).
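As a quick consistency check on these numbers, the gain expression $\Delta V_{OUT} = \alpha (C_T/C_L) \Delta V_{IN}$ given in (5.1) can be solved for the normalized gain $\alpha$ that the target implies. The value below is inferred from the quoted capacitor ratio and target gain; it is not read from Figure 4.9:

$$\frac{C_T}{C_L} = \frac{600\ \text{fF}}{100\ \text{fF}} = 6, \qquad \alpha = \frac{\Delta V_{OUT}/\Delta V_{IN}}{C_T/C_L} = \frac{10}{6} \approx 1.67$$

A target gain of 10 therefore presumes a normalized midband gain of roughly 1.67 at the 2:1 timing ratio.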

#### Dimensioning and Layout

Design of the active source follower transistors is critical. First, they should be small devices to preserve speed and minimize input capacitance. With the input signal being dynamically coupled onto node X, capacitive voltage division attenuates the difference signal unless the ratio of  $C_C$  to the combined input gate capacitance is large. The other consideration is that the betas of NMOS and PMOS source follower devices should be equal for best RPC cancellation at the output nodes.

Under these constraints, a test chip was constructed with W/L of 5/0.6 and 13/0.6 respectively for the active NMOS and PMOS transistors. All switches were implemented with a complementary "T-gate" structure with minimum gate length. In cases where switches shunt directly to  $V_{DD}$  ( $V_{SS}$ ), a simple PMOS (NMOS) switch would have sufficed. However, the choice was made in this experiment to use T-gate
CMOS switches for maximum flexibility in reducing charge injection. With reference to Figure 4.7, switch widths were as follows (all in  $\mu$ m):

- Output precharge switches discharge only a small capacitance and remain tied to  $V_{PR}$  during both the reset and precharge phases. Since the diffusion capacitance of these devices reduces the overall voltage gain by adding to  $C_L$ , they were scaled to  $W_n/W_p = 1.5/2.1$ , which also minimizes the total amount of possible charge injected onto the output nodes.
- S1 switches appear directly in the signal path between the active devices and associated transfer capacitors. The on resistance should be small compared to the drain-source resistance of the respective source followers or else offset fluctuations in the switch transistors will introduce dynamic offset errors, not to mention that the analysis predicts best speed performance for zero on resistance. Since S1 does not change state at the start of the amplify phase, charge sharing is not so critical. The switches were therefore scaled to  $W_n/W_p = 11/20$ .
- \*S1 switches discharge the transfer capacitors to a supply rail. Speed in the reset phase is the primary concern, and since these switches also do not change state at the start of the amplify phase they were sized identically to the S1 switches, or  $W_n/W_p = 11/20$ .
- Input precharge switches discharge the input nodes X and Y to  $V_{PR}$ . Since the node capacitance to discharge is potentially the parallel combination of  $C_C$  and two gate capacitors, these devices are cautiously scaled up to  $W_n/W_p = 5.2/5.8$  so as to obtain an acceptable tradeoff between speed and total charge injection.
- Input switches tie the input and reference signals into the amplifier and are subject to essentially the same speed and feedthrough constraints as the input precharge switches. These are also scaled to  $W_n/W_p = 5.2/5.8$ .

All matched differential transistors, including switch transistors, were situated proximally in layout with diffusion guard bars around each for consistent boundary conditions. Poly-poly transfer capacitors were drawn in a common centroid configuration in order to give each capacitor node equal parasitic exposure and to improve overall linearity. Metal traces, contacts, and via connections were also matched differentially to further cancel any first order parasitic capacitance differences between channel nodes.

## 4.5 Experimental Verification

Unless very large transfer capacitors are used, it is impossible to directly probe the output of a CTA in a test chip due to the high sensitivity to load capacitance. Therefore, the voltage gain is observed indirectly by measuring the offset voltage of a voltage comparator where the latch offset is known. Comparator test cells were constructed in AMI Semiconductor's  $0.6\mu m$  2P/2M CMOS process with the above-described amplifier design parameters. A low power dynamic comparator provided a practical implementation for indirectly observing the voltage gain and directly observing the dependence of offset on sample rate.

#### 4.5.1 Comparator Cell

The DCTA interfaces with a high speed, low power dynamic latch [25], as shown in Figure 4.12, to form an efficient and fast comparator. The latch resets during the DCTA precharge phase, tracks the amplifier outputs during the amplify phase and performs the amplify-latch function during the reset phase. Latch input capacitance provides the DCTA load,  $C_L$ , which was designed for 100 fF nominal.  $C_T$  devices, implemented as poly-poly capacitors for improved predictability (primarily linearity and temperature coefficient), were set to 600 fF nominal, for a  $C_T/C_L$  ratio of 6 and an estimated midband voltage gain of around 10, based on the normalized amplitude plot of Figure 4.9. On-chip timing circuitry generated the clock signals from a single input clock, so that  $t_a = 2t_p$  and  $t_p = t_r$ .



Figure 4.12: Comparator circuit

The simulated standard deviation of the latch offset was approximately  $\pm 10$  mV with nonminimum device sizes and assumed matched layout techniques. With a DCTA gain of 10, the predicted input referred offset (which is the latch offset divided by the DCTA gain) was  $\pm 1$  mV, in addition to the simulated CTA offset of 1.1–4.5 mV according to Figure 4.11.

Sixteen test chips were fabricated, each containing one test cell. The comparator was tested on a 2.5 V supply at sample rates ranging from 100 Hz to 30 MHz. The mean offset was 0.54 mV at 10 kHz and 0.61 mV at 7.5 MHz. Offset voltage was measured stochastically by applying a 50 mV, low-frequency sine wave to the input and varying the DC offset until the observed average output (a logical 1 or 0) was a square wave with 50% duty cycle. This method allowed an average canceling of any white noise superimposed on the input signal.
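The measurement procedure can be illustrated with a simplified, noise-free model. The sketch below assumes an ideal comparator whose only non-ideality is a fixed input-referred offset `v_os`; it searches by bisection for the DC level at which the output duty cycle reaches 50%. The function names, the sample count, and the bisection search are illustrative choices, not the bench procedure used for the test chips:

```python
import math

def duty_cycle(dc_level: float, v_os: float,
               amplitude: float = 50e-3, n_samples: int = 10_000) -> float:
    """Fraction of samples over one sine period for which the comparator outputs 1."""
    ones = 0
    for k in range(n_samples):
        phase = 2.0 * math.pi * (k + 0.5) / n_samples
        v_in = amplitude * math.sin(phase) + dc_level
        if v_in - v_os > 0.0:          # ideal comparator with offset v_os
            ones += 1
    return ones / n_samples

def measure_offset(v_os: float, lo: float = -50e-3, hi: float = 50e-3,
                   iterations: int = 30) -> float:
    """Bisect on the DC level until the output duty cycle crosses 50%."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if duty_cycle(mid, v_os) < 0.5:
            lo = mid                   # too few ones: raise the DC level
        else:
            hi = mid                   # enough ones: lower the DC level
    return 0.5 * (lo + hi)

true_offset = 0.54e-3                  # example value only (the mean quoted at 10 kHz)
estimate = measure_offset(true_offset)
print(f"estimated offset: {estimate * 1e3:.3f} mV")
```

The resolution of this model is set by the number of samples per period. In a real measurement the averaging over many cycles additionally suppresses white noise, which this sketch does not model.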

Figure 4.13 shows measured standard deviation of the offset. Again, (a) uses a linear x-axis and (b) uses the log scale for low-frequency detail. As predicted in Figure 4.11, offset becomes worse at high frequencies due to dynamic channel mismatch. The measured low frequency offset has a standard deviation of about 1 mV, meaning that 68.3% of a given sample set would have offset less than 1 mV,



Figure 4.13: Measured offset voltage

95.5% would have offset less than 2 mV, and 99.7% less than 3 mV. When compared with the worst-case low frequency simulation estimate of 2.1 mV (1.1 mV from the DCTA plus roughly 1 mV input-referred from the latch – see Figure 4.11) these data indicate good correlation with simulation and add further validity to the analysis.
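The offset budget and the quoted population fractions can be reproduced with a few lines of arithmetic. The sketch below assumes a zero-mean Gaussian offset distribution and a simple linear sum of the latch and DCTA terms, as the worst-case estimate in the text does; the function names are illustrative:

```python
import math

def coverage_within(k_sigma: float) -> float:
    """Fraction of a zero-mean Gaussian population with |x| < k_sigma * sigma."""
    return math.erf(k_sigma / math.sqrt(2.0))

def input_referred(latch_offset: float, gain: float) -> float:
    """Latch offset divided by the preamplifier gain."""
    return latch_offset / gain

latch_term = input_referred(10e-3, 10.0)   # 10 mV latch offset, DCTA gain of 10
dcta_term = 1.1e-3                         # simulated low-frequency DCTA offset
worst_case = latch_term + dcta_term

print(f"latch term: {latch_term * 1e3:.1f} mV, total: {worst_case * 1e3:.1f} mV")
for k in (1, 2, 3):
    print(f"within {k} sigma: {100.0 * coverage_within(k):.2f} %")
```

With a measured standard deviation of about 1 mV, the 2.1 mV worst-case estimate corresponds to roughly two standard deviations of the measured distribution.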

## 4.6 Summary

A methodology for analyzing the dynamic behavior of charge-transfer amplifiers has been provided. A generalized voltage transfer function has been developed for any timing scheme or supply voltage. By incorporating a recursive calculation, the equations can be extended to account for the significant effects of nonlinear threshold modulation. This analysis places specific emphasis on creating a deterministic, straightforward estimate of the voltage transfer function. Models of the NMOS CTA and DCTA were developed. The model was shown to be reasonably accurate up to 45 MSPS by comparison with BSIM3 simulations.

Offset voltage was also investigated to verify the gain profile and demonstrate frequency dependent characteristics. It has been shown to result from the superposition of two apparently independent sources: charge injection and differential channel matching. The analysis reveals which devices contribute the majority of the offset voltage, providing a good starting point for designing and constructing low-offset charge transfer amplifiers. Measurements made on fabricated test chips show a strong correlation between Monte Carlo simulations and physically observed offset voltage.

## Chapter 5

# New Architectures for Practical Applications

This chapter describes three charge-transfer amplifiers with practical benefits in A/D applications. These new architectures are designed to solve the following three problems inherent in the earlier CTAs mentioned in Chapter 3.

1. Early-generation CTAs have a large mean offset voltage due to inherent circuit imbalance when trying to match PMOS and NMOS transistors over processing and temperature. The pseudo-differential architecture is a possible solution. But since it is technically just a "brute force" approach, it is potentially wasteful of die area and power.
2. Input capacitance can be prohibitively high when coupling devices are used for input isolation during the precharge phase. Without this isolation, common mode range becomes significantly restricted. Coupling capacitors are made large to avoid signal attenuation by capacitive voltage division at the inputs. These large devices not only add to the input loading, but also create large switching transients at the input. This can be detrimental for even moderate speed flash A/D converters, where noise on the reference ladder can result in both poor SNR and systematic nonlinearity.
3. A mid-supply precharge reference voltage is required for dynamic biasing. Generating this reference can be costly whether it is done on-chip or off-chip.

The first proposed amplifier uses a new dynamic charge-transfer mechanism to amplify in true differential mode with half the number of transfer capacitors, or the same number as in the single-ended CMOS CTA. This differential CTA (DCTA), which was analyzed in the previous chapter, maintains all other desirable characteristics of charge-transfer amplifiers, including low power, variable supply and insensitivity to device parameter fluctuations.

The second amplifier requires no input coupling capacitors, yet possesses good dynamic biasing properties over a wide input range. This "direct-coupled" CTA (DCCTA) is shown to have improved common-mode input range as compared with direct-coupled versions of simpler charge-transfer amplifiers.

The third amplifier is also direct-coupled, but in addition requires no precharge reference voltage. Analysis of this precharge-voltage-less CTA ($V_{PR}$-less CTA or PLCTA) shows that the input range is nearly rail-to-rail.

For each of the amplifiers presented in this chapter, experimental data are reported to prove the theoretical advantages and demonstrate the most important performance tradeoffs.

## 5.1 Enhanced Differential Charge-transfer Amplifier

Figure 5.1 shows the DCTA in its three operational phases. Capacitors  $C_{T1}$  and  $C_{T2}$  have the common value  $C_T$ . Similarly,  $C_{L1}$  and  $C_{L2}$  have value  $C_L$ . During the reset phase (a), the output nodes are tied to a precharge voltage  $V_{PR}$ , as are nodes X and Y. Nodes A and C are reset to  $V_{DD}$  and nodes B and D to  $V_{SS}$ . Static current through the MOS devices is prevented by opening the \*S1 switches, which are complements of the S1 switches. Meanwhile, the differential input nodes are connected to  $V_{IN}$  and  $V_{REF}$ . In the precharge phase (b), nodes A, B, C and D are disconnected from their reset voltages and connected to their respective transistor source nodes. Nodes A and C now precharge through devices MP1 and MP2 respectively towards their drain voltage,  $V_{PR}$ , but are limited by MOS cutoff to  $V_{PR}-V_{TP}$ . In the same manner, nodes B and D precharge through MN1 and MN2 to a final state of  $V_{PR}-V_{TN}$ . When the amplify phase (c) begins, nodes X, Y,  $V_{O1}$  and  $V_{O2}$  are disconnected from  $V_{PR}$ . The inputs are both tied to  $V_{REF}$  and the difference signal  $\Delta V_{IN} = V_{REF}-V_{IN}$  is projected onto node X.



Figure 5.1: Fully differential CTA in the (a) reset, (b) precharge and (c) amplify phases

The incremental rise at node X turns MN1 on and further cuts off MP1. Channel current commences through MN1 from  $V_{O1}$  to node B, which now rises towards its new cutoff value of  $V_{PR}-V_{TN}+\Delta V_{IN}$ . Charge transfer occurs between  $C_{T2}$  and  $C_{L1}$  through MN1 until node B reaches the cutoff level. A voltage drop proportional to  $C_T/C_L$  results at node  $V_{O1}$ . Initially  $C_{T2}$  dynamically couples most of the signal at node B to node C. This dynamic coupling mechanism turns MP2 incrementally on and permits current flow from node C to  $V_{O2}$ . The dynamically coupled signal is soon discharged as charge transfer from  $C_{T2}$  to  $C_{L2}$  reduces the voltage on node C back to its steady state value of  $V_{PR}-V_{TP}$ . Node  $V_{O2}$  reacts to the charge transfer with a voltage rise proportional to the ratio  $C_T/C_L$  and a truly differential output signal is therefore produced by the DCTA's dynamic differential



Figure 5.2: Simulation waveforms showing the charge-transfer process

charge-transfer mechanism. The output magnitude is written

$$\Delta V_{OUT} = \alpha \frac{C_T}{C_L} \Delta V_{IN} \tag{5.1}$$

where  $\alpha$  is a constant that ideally ranges from 2 to 4 depending on the sample rate. In reality,  $\alpha$  is always inhibited by threshold modulation.
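To make the range of (5.1) concrete, the following sketch evaluates the ideal output swing at the two bounds of $\alpha$ for the 600 fF / 100 fF capacitor values used in the test chip and a 1 mV input. The function name is illustrative, and the result ignores threshold modulation, so real outputs are expected to be smaller:

```python
def dcta_output(delta_vin: float, c_t: float, c_l: float, alpha: float) -> float:
    """Ideal differential output of the DCTA per (5.1), in volts."""
    return alpha * (c_t / c_l) * delta_vin

C_T = 600e-15       # transfer capacitance (F)
C_L = 100e-15       # load capacitance (F)
DELTA_VIN = 1e-3    # 1 mV input difference

low_rate = dcta_output(DELTA_VIN, C_T, C_L, alpha=2.0)   # ideal low-frequency bound
midband = dcta_output(DELTA_VIN, C_T, C_L, alpha=4.0)    # ideal midband bound
print(f"alpha = 2: {low_rate * 1e3:.1f} mV")
print(f"alpha = 4: {midband * 1e3:.1f} mV")
```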

It should be noted that the projection of a positive  $\Delta V_{IN}$  onto node X also causes MP1 to become further cut off. Since at high speeds the transistor has not fully reached cutoff, this mechanism reduces the precharge current from node A to  $V_{O1}$  and causes an additional voltage drop at  $V_{O1}$  relative to node  $V_{O2}$ . The incrementally reduced precharge current through MP1 is projected to node D through  $C_{T1}$ , bringing MN2 closer to cutoff as well. This action results in a relative potential increase at  $V_{O2}$ . In this manner two differential charge-transfer paths add to the DCTA gain.

Figure 5.2 shows simulation waveforms of the operation at 5 MSPS. In each successive cycle, the input is alternated between  $\pm 10$  mV. A positive input leads



Figure 5.3: 4-bit flash A/D converter block diagram

to a positive  $\Delta V_{OUT}$ , as referenced in Figure 5.1(c). The individual output node voltages,  $V_{O1}$  and  $V_{O2}$  are also shown. The node deltas are approximately equal in magnitude but opposite in direction. Current through the active devices is shown to incrementally increase or decrease as described upon input signal application.

#### 5.1.1 Experimental Results

The DCTA interfaces with a high-speed, low-power dynamic latch [25], as shown in Figure 4.12, to form an efficient, fast comparator. The latch resets during the DCTA precharge phase, tracks the amplifier outputs during the amplify phase and performs the amplify-latch function during the reset phase.

A 4-bit flash ADC was constructed as shown in Figure 5.3. DCTA preamplifiers were set to a gain of 10 using the method described in Section 4.3. Test chips were fabricated in 0.6  $\mu$m, 2P/2M CMOS. The active area occupies 0.55 mm$^2$. A single-poly process would have been sufficient, since linear capacitors are not a prerequisite for accuracy of the comparators or linearity of the overall converter. The supply voltage can be as low as 2.1 V, limited by architecture, and as high as the process allows for reliability. The input range is full scale. Typical test waveforms of



Figure 5.4: Measured waveforms of a sampled sawtooth signal

the ADC sampling a 50 kHz sawtooth wave at 7.5 MSPS are shown in Figure 5.4. When the resistor string current is cut to 4.9  $\mu$ A, simulating the conditions for 10-bit resolution, the ADC exhibits DNL below 0.7 LSB and INL below 1 LSB.

The peak sample rate is above 30 MSPS and the dynamic power is less than 1.2 $\mu$W/MSPS per amplifier at 2.1 V. It should be noted that the purely dynamic power dissipation makes the comparators described here particularly efficient in low frequency applications. For example, at 40 kSPS the consumption per comparator is just 56 nW.

## 5.2 Direct-coupled Charge-transfer Amplifier

Figure 5.5(a) shows the CMOS CTA, which operates in three phases as described in Section 4.2. The operation is highly tolerant to fluctuations in device parameters, but also occupies a lot of area, with roughly 30-40% consumed by input coupling capacitors. In addition, these capacitors create a serious loading problem for the input source. Figure 5.5(b) depicts the same CTA with input coupling capacitors



Figure 5.5: Diagrams of (a) conventional CMOS CTA, (b) direct-coupled CMOS CTA and (c) timing diagram

removed. This reduces the silicon area significantly and limits charge kickback, but restricts the input common-mode range. As shown in Figure 5.5(c), two clock signals S1 and S2 generate the three operating phases.

The first problem with the amplifier in Figure 5.5(b) occurs if  $V_{REF}$  is close to either supply rail, forcing MN1 or MP1 into cutoff at the beginning of the precharge phase. This is named the *cutoff condition*. For example, if  $V_{REF} < V_{IN} < V_{SS} + V_{TN}$ , then the NMOS device is needed for charge transfer because the input voltage increases from the precharge phase to the amplify phase (the PMOS device is not used for charge transfer in this case). However, since the NMOS gate-source voltage remains below the threshold voltage, the device is cut off and charge transfer cannot occur. The cutoff condition is summarized as

$$V_{SS} + V_{TN} \le V_{REF} \le V_{DD} + V_{TP}. \tag{5.2}$$



Figure 5.6: Waveforms illustrating the limits of CMR in a direct-coupled CMOS CTA

Figure 5.6(a) illustrates the cutoff condition on MN1 with waveforms for nodes A and B (in the transition from reset to precharge phase). Note that  $V_{REF} < V_{SS} + V_{TN}$  and no current flows in MN1. Similarly, Figure 5.6(b) illustrates the scenario where MP1 violates the cutoff condition. In each case, either MN1 or MP1 is inactive and cannot transfer any charge at small input signal voltage, leading to zero gain for a positive or negative input signal polarity.

Another problem occurs when the source-drain voltage converges to zero during precharging; this is named the *convergence condition*. For example, if  $V_{REF}$  is sufficiently high, then precharging continues only until  $V_{DS}$  goes to zero. MN1 is now useless for charge transfer for a positive applied gate voltage during the amplify phase. The same problem happens to MP1 when  $V_{REF}$  is low enough. The convergence



Figure 5.7: Common-mode range of (a) direct-coupled CTA, (b) fully-differential direct-coupled CTA, (c) $V_{PR}$-less CTA, and (d) fully-differential $V_{PR}$-less CTA

condition is expressed as

$$V_{PR} + V_{TP} \le V_{REF} \le V_{PR} + V_{TN}. \tag{5.3}$$

Note that if  $V_{REF}$  causes the CTA to operate near the edge of the convergence condition in the precharge phase, there is no source-drain voltage remaining for amplification in the amplify phase. Therefore, to prevent convergence in the amplify phase, the actual convergence condition becomes slightly more severe than (5.3). In practice, however, this additional constraint is negligible. Figures 5.6(c) and (d) show cases where  $V_A$  and  $V_B$  reach  $V_{PR}$ , violating the convergence condition in MP1 and MN1 respectively.

Figure 5.8: Fully-differential direct-coupled charge-transfer amplifier

The common-mode range, defined by a combination of the cutoff and convergence conditions, is plotted versus supply voltage in Figure 5.7(a), where  $V_{TN} = 0.7$ V and  $V_{TP} = -0.9$ V are assumed. The solid line shows the net input range, or the smallest range allowed by either condition. Note that the cutoff condition tracks supply voltage, while the convergence condition remains constant for any supply, as suggested by (5.2) and (5.3).
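The net input range can be sketched numerically by intersecting (5.2) and (5.3). The example below uses the threshold voltages quoted above and assumes, for illustration only, that $V_{SS} = 0$ and that $V_{PR}$ sits at mid-supply; the function names and the supply values swept are not taken from Figure 5.7:

```python
from typing import Optional, Tuple

V_TN = 0.7    # NMOS threshold (V), as quoted in the text
V_TP = -0.9   # PMOS threshold (V), as quoted in the text

def cutoff_range(v_dd: float, v_ss: float = 0.0) -> Tuple[float, float]:
    """Limits on V_REF from the cutoff condition (5.2)."""
    return v_ss + V_TN, v_dd + V_TP

def convergence_range(v_pr: float) -> Tuple[float, float]:
    """Limits on V_REF from the convergence condition (5.3)."""
    return v_pr + V_TP, v_pr + V_TN

def net_range(v_dd: float, v_ss: float = 0.0,
              v_pr: Optional[float] = None) -> Optional[Tuple[float, float]]:
    """Smallest range allowed by either condition, or None if it is empty."""
    if v_pr is None:
        v_pr = 0.5 * (v_dd + v_ss)   # assumption: mid-supply precharge voltage
    lo_c, hi_c = cutoff_range(v_dd, v_ss)
    lo_v, hi_v = convergence_range(v_pr)
    lo, hi = max(lo_c, lo_v), min(hi_c, hi_v)
    return (lo, hi) if lo <= hi else None

for supply in (1.5, 2.5, 3.3, 5.0):
    print(supply, net_range(supply))
```

Under these assumptions the width of the convergence window stays fixed at $V_{TN} - V_{TP}$ while the cutoff window widens with supply, so the cutoff condition sets the range at low supply voltages and the convergence condition takes over at higher ones.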

Improving on the single-ended version, a fully-differential direct-coupled architecture [12], shown in Figure 5.8, overcomes the cutoff condition because the  $C_T$  capacitors force the same current through series NMOS and PMOS channels while isolating the source nodes. Figure 5.9(a) shows the case where NMOS channels are initially cut off. MP1 and MP2 push on MN2 and MN1 through  $C_{T1}$  and  $C_{T2}$ , respectively, to establish a compromise bias current. In this case, cutoff recovery occurs as the source nodes of all transistors drop. Likewise, Figure 5.9(b) demonstrates cutoff recovery when PMOS devices are initially cut off.

Even in a fully-differential circuit, the convergence condition still exists and limits CMR according to (5.3), as shown in Figs. 5.9(c) and (d). Also, the push-pull action of PMOS and NMOS devices pumps the source voltages above or below



Figure 5.9: Waveforms showing the effects of the cutoff and convergence conditions

the respective supply, leading to substrate charge injection and possible breakdown if the supply voltage is near the maximum allowed by process. Moreover, the source pumping also limits common-mode range by a new condition called the *bulk condition*. When the source-bulk diode becomes forward biased, the diode diverts drain-source current away from the transfer capacitor and into the bulk, effectively clipping the dynamic response. In essence, this is simply another type of cutoff condition.

The bulk condition is illustrated briefly in Figure 5.10, where  $V_{REF}$  is just below  $V_{DD}$ . PMOS devices start out in cutoff (violation of the initial cutoff condition) whereas NMOS devices start out highly turned on. Kirchhoff's Law forces the source nodes higher in search of a common series current. However, to achieve conduction the PMOS source must rise above  $V_{REF} + |V_{TP}|$ , which exceeds the source clipping



Figure 5.10: Waveforms showing the effects of the bulk condition

voltage of  $V_{DD} + V_{Diode}$ , where  $V_{Diode}$  is the forward diode voltage (typically 0.6–0.7 V). The bulk condition is summarized as

$$V_{SS} - V_{Diode} + V_{TN} < V_{REF} < V_{DD} + V_{Diode} + V_{TP}, \tag{5.4}$$

as illustrated graphically in Figure 5.7(b). Note that while the convergence condition dominates input CMR for the supply range shown, the bulk condition becomes the limiting factor at lower supply voltages.
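The supply voltage at which the bulk condition takes over from the convergence condition can be estimated by equating the corresponding limits of (5.3) and (5.4). The worked example below assumes $V_{SS} = 0$, a mid-supply precharge voltage $V_{PR} = V_{DD}/2$, $V_{TN} = 0.7$ V, $V_{TP} = -0.9$ V, and $V_{Diode} = 0.65$ V (the middle of the typical range quoted above); these choices are illustrative and the result shifts with them:

$$\begin{aligned}
\text{upper limit:}\quad & V_{DD} + V_{Diode} + V_{TP} < V_{PR} + V_{TN}
\;\Rightarrow\; V_{DD} - 0.25 < \tfrac{1}{2}V_{DD} + 0.7
\;\Rightarrow\; V_{DD} < 1.9\ \text{V} \\
\text{lower limit:}\quad & V_{SS} - V_{Diode} + V_{TN} > V_{PR} + V_{TP}
\;\Rightarrow\; 0.05 > \tfrac{1}{2}V_{DD} - 0.9
\;\Rightarrow\; V_{DD} < 1.9\ \text{V}
\end{aligned}$$

Under these assumptions the bulk condition is the tighter constraint only for supplies below about 1.9 V, and the convergence condition dominates above that.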

A straightforward circuit analysis shows that if the power supply voltage is below  $V_{TN} + |V_{TP}|$ , precharging cannot occur at all, because the active NMOS and PMOS devices are both cut off. However, this limitation is not so severe in conventional CMOS technologies, where subthreshold conduction allows enough current to maintain gain.

In summary, removing the input coupling capacitors from a CTA adversely affects common-mode range, but limits loading on the input and reference voltages and reduces size. A fully differential architecture alleviates the cutoff condition, but the convergence and bulk conditions still constrain the input common-mode range substantially.

Experimental results for the DCCTA are given along with results for the $V_{PR}$-less CTA in Section 5.4.



Figure 5.11: Single-ended $V_{PR}$-less charge-transfer amplifier

## 5.3 $V_{PR}$-less Charge-transfer Amplifier

The direct-coupled CTA described in the previous section requires a precharge voltage to reset the output capacitor and define the output common-mode voltage level. This section introduces a modification that eliminates the need for this precharge reference voltage. Additionally, it shall be demonstrated that this new architecture restores the full rail-to-rail input common-mode range, even with direct-coupled inputs.

Figure 5.11 presents a CTA that requires no precharge voltage, yet is similar in operation to the amplifier of Figure 5.5(b). Inside the left-hand dashed box is the earlier CMOS charge-transfer amplifier, with the drain nodes disconnected. Coupling the drains in the right-hand dashed box is a dynamic reference voltage generator [20]. The dynamic reference voltage generator serves two purposes: (1) it provides acceptable precharge bias conditions without a precharge voltage, and (2) it generates a near mid-supply common-mode output voltage. The capacitive reference generator operates within the three CTA phases. During the reset phase, all nodes of  $C_R$  capacitors are discharged to  $V_{SS}$ . In the precharge phase, the output node is disconnected and the lower  $C_R$  capacitor is connected to  $V_{DD}$  while the upper  $C_R$  capacitor remains tied to  $V_{SS}$ . Therefore, by capacitive voltage division, the output node settles to

$$V_{OUT_{Precharge}} = \frac{C_R}{2C_R + C_L} V_{DD}. \tag{5.5}$$
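The capacitive division behind (5.5) can be checked with a short calculation. The sketch below takes $V_{SS} = 0$ as the reference, so that the capacitor ratio in (5.5) scales $V_{DD}$, and assumes that $C_L$ and the upper $C_R$ both return to $V_{SS}$; the capacitor values are hypothetical and the function name is illustrative:

```python
def precharge_output(v_dd: float, c_r: float, c_l: float) -> float:
    """Output node voltage after the lower C_R plate steps from 0 V to v_dd.

    The floating output node sees C_R to the stepped plate and (C_R + C_L)
    to the 0 V reference, so charge conservation gives a capacitive divider.
    """
    return v_dd * c_r / (2.0 * c_r + c_l)

V_DD = 2.5          # supply voltage (V)
C_L = 100e-15       # load capacitance (F)

for c_r in (100e-15, 500e-15, 1e-12):      # hypothetical C_R values
    v_out = precharge_output(V_DD, c_r, C_L)
    print(f"C_R = {c_r * 1e15:6.0f} fF -> V_OUT = {v_out:.3f} V")
```

The output approaches mid-supply from below as $C_R$ becomes large compared with $C_L$, which is consistent with the "near mid-supply" description of the common-mode output voltage.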

In addition to establishing a useful common-mode output voltage, precharging in this manner offers the advantage that both NMOS and PMOS devices precharge in the "on" state for any input voltage. This benefit occurs because the target source voltage for NMOS and PMOS is no longer a mid-supply  $V_{PR}$  but rather the supply voltage opposite to the starting source voltage. Finally, when the amplify phase begins the output remains floating at the voltage determined by (5.5) and the inner nodes of the  $C_R$  capacitors are disconnected from their respective supplies. When an input stimulus is applied, charge transfer occurs as in previous CTAs with each  $C_R$  now acting as a coupler to the output node. Since operation is charge-based, these capacitors have virtually no effect on voltage gain except for a slight loss due to bottom plate parasitics.

For the reasons given above, this configuration eliminates the convergence condition entirely. However, the cutoff condition still exists according to (5.2). Figure 5.7(c) depicts the CMR of this amplifier, which simply follows the cutoff condition.

As discussed in Section 5.2, the fully-differential implementation of the direct-coupled architecture eliminates the cutoff condition (replacing it with the bulk condition). Figure 5.7(c) shows that the proposed $V_{PR}$-less configuration also eliminates the convergence condition. Therefore, it is logical to combine the benefits of the fully-differential direct-coupled CTA (small size, low input capacitance and elimination of the cutoff condition) and the $V_{PR}$-less CTA (no precharge voltage and elimination of the convergence condition). The resulting amplifier appears in Figure 5.12, where the left-hand dashed box contains the earlier differential circuit and the right-hand dashed box contains a fully-differential adaptation of the dynamic voltage reference generator used in Figure 5.11. The switch $*S_2$ discharges any residual differential voltage after each cycle. Spice simulations confirm that the cutoff and convergence conditions are eliminated by the combination of previous configurations, leaving only the bulk condition. The resulting common-mode input range is shown in Figure 5.7(d).

Figure 5.12: Fully-differential $V_{PR}$-less charge-transfer amplifier

In comparison to the differential CTA in Figure 5.8, power consumption is expected to increase due to the addition of switched  $C_R$  capacitors. With typical capacitor sizes, the expected power increase of the amplifier itself is about 80%; however, this figure neglects the potential to save power and/or cost by removing the external precharge reference voltage generator.

### 5.4 Experimental Results of the Direct-coupled and $V_{PR}$-less CTAs

Experimental voltage comparators were constructed to verify the functionality and input range of the proposed CTAs. To evaluate performance, input-referred offset voltage and frequency response were also measured.

Each comparator comprised a CTA preamplifier and a dynamic latch. Test structures were fabricated in AMI Semiconductor's 0.6 $\mu$m double-poly, triple-metal CMOS process. Two of the proposed preamplifiers were used for comparison: the fully-differential direct-coupled CTA (DCCTA) from Section 5.2 and the fully-differential $V_{PR}$-less CTA (PLCTA) from Section 5.3. The dynamic latch [25] was designed for low offset by using non-minimum device sizes and common-centroid layout techniques. The worst-case (3$\sigma$) simulated latch offset was 53 mV.

The simulated midband voltage gain was 6.5 V/V and 5.9 V/V respectively for the DCCTA and PLCTA with  $C_T = 600$  fF and  $C_R = 200$  fF (a discussion about modeling the dynamic frequency response of CTAs is found in [13]). The inferred latch input capacitance, representing the preamplifier load, was about 100 fF. Under these conditions, the expected input-referred 1 $\sigma$  comparator offsets were

$$V_{OS-DCCTA} = \frac{53\ \mathrm{mV}}{(6.5\ \mathrm{V/V})(3)} = 2.72\ \mathrm{mV} \tag{5.6}$$

$$V_{OS-PLCTA} = \frac{53\ \mathrm{mV}}{(5.9\ \mathrm{V/V})(3)} = 2.99\ \mathrm{mV}. \tag{5.7}$$
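The arithmetic in (5.6) and (5.7) divides the worst-case (3$\sigma$) latch offset by the preamplifier gain and then by three to obtain a 1$\sigma$ figure. A minimal sketch, in which the function name is hypothetical and only the latch contribution is modeled (the preamplifier's own mismatch is ignored, as in the two equations):

```python
def input_referred_sigma(latch_offset_3sigma, gain):
    """1-sigma input-referred offset from a 3-sigma latch offset and preamp gain."""
    return latch_offset_3sigma / (gain * 3.0)


latch_3sigma = 53e-3  # simulated worst-case latch offset, in volts
for name, gain in (("DCCTA", 6.5), ("PLCTA", 5.9)):
    sigma = input_referred_sigma(latch_3sigma, gain)
    print(f"{name}: {sigma * 1e3:.2f} mV (1 sigma)")
```

Because the latch term scales inversely with gain, the lower gain of the PLCTA translates directly into a slightly larger expected offset.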

The DCCTA-based comparator occupied 0.0148 mm<sup>2</sup> and the PLCTA version occupied 0.0166 mm<sup>2</sup> (an increase of 12% due to the added  $C_R$  capacitors).

#### 5.4.1 Input Range and Offset Voltage

Absolute input range was measured by sweeping the input voltage while monitoring the change in offset at a low sample rate (25 kSPS). Since gain cannot be measured directly as a result of high sensitivity to output capacitance, the CMR limits were estimated by the levels where input-referred offset shifted by one standard deviation. While not a perfect measurement, this methodology provides a reasonable indication of the loss in gain symptomatic of the convergence, cutoff and bulk conditions.

Twenty prototypes were selected at random for the measurements. Measurement results for the input range appear in Figure 5.13, where solid lines represent the theoretical limits developed above and dashed lines show the silicon data. Examining first the DCCTA, the input range closely matches the theoretical convergence condition at high supply voltages but becomes inconsistent at low voltages. One explanation for the difference is that CTAs respond more slowly as the supply voltage is reduced, leading to an apparent loss in gain (and therefore an increase in offset voltage) at the measurement frequency. The PLCTA input range agrees almost exactly with the theoretical bulk condition. Nearly full-scale input range is achieved, since the only limitation on input range is the bulk condition.

Figure 5.13: Measured common-mode range of experimental comparators

Offset behavior and frequency response of the test devices were also experimentally measured. Figure 5.14 shows the observed offset of 20 samples at several sample rates with a 3 V supply. The offset of the DCCTA-based comparator is below 2 mV (1$\sigma$) at 25 kSPS and 6.4 MSPS, and rises to about 4 mV at 25 MSPS. These results compare favorably with the predicted value in (5.6). The PLCTA-based comparator offset was about 2 mV at 25 kSPS and 3 mV at 6.4 MSPS and 25 MSPS, in agreement with (5.7). The offset increases with frequency because residual precharge current introduces MOSFET mismatch components on top of the capacitor mismatch [13]. With the fully-differential architectures, the mean offset was approximately zero at all frequencies.



Figure 5.14: Measured offset voltage of experimental comparators

It is important to note that the preamplifier offset adds to the input-referred latch offset. In a 3 V system, the single-preamplifier comparators achieve about 8 bits of accuracy (e.g., a flash ADC consisting of 255 such comparators would achieve less than 1/2 LSB of differential nonlinearity). However, the accuracy can be raised by adding a second preamplifier stage.

The differences in offset dependence on sample rate are interesting. At low speed, the DCCTA offset is lower than the PLCTA offset by about 33%. This is partly due to the slight difference in gain which results from the parasitic bottom plate loading of the $C_R$ capacitors, but mainly a result of the extra mismatch introduced by the parasitic capacitors. Mismatch of the $C_R$ capacitors themselves does not cause additional offset. At high speed, however, the DCCTA offset exceeds the offset of the PLCTA by about 20%, with a noticeably higher rate of frequency-dependent increase. The output nodes of the PLCTA settle faster due to the lower voltage-dependent resistivity of the reset switches, which discharge to $V_{SS}$ rather than $V_{PR}$, as in the DCCTA. In addition, the output cut switch in the PLCTA helps dissipate any residual differential output charge.

#### 5.4.2 Dynamic Power

In a single preamplifier, dynamic power dissipation is a strong function of the input signal (see Chapter 6). To measure the expected average dynamic power, a 6-bit flash converter architecture was used. After separating power in the preamplifiers from the other circuitry while applying a full-scale sinusoidal input signal, the total power was divided by the number of preamplifiers, 63, to calculate average dynamic power per CTA. The resulting power per preamplifier was 3.33 $\mu$W/MSPS and 6.03 $\mu$W/MSPS for the DCCTA and PLCTA, respectively. These measured results agree well with simulation data over process corners. The data also agree closely with the simulations described in the next chapter, particularly with the simulation data given in Figure 6.4.

It should be noted that the high ratio of PLCTA to DCCTA power figures may be misleading because the PLCTA eliminates the cost of supplying a precharge voltage. If generated off-chip, the precharge voltage requires a dedicated package pin. If integrated on-chip, a precharge voltage generator adds die area and consumes a potentially high static power, depending on the accuracy and sample rate requirements of the application.

### 5.5 Summary

The enhanced DCTA presented in Section 5.1 overcomes several limitations of prior charge-transfer amplifiers. It enables low untrimmed offset with zero mean. A novel differential charge-transfer mechanism creates truly differential processing while at the same time reducing the number of transfer capacitors by half as compared with the pseudo-differential CMOS CTA. The architecture preserves other desired characteristics, such as low-power operation, supply voltage scalability and tolerance to process variations. Experimental test circuits consumed low dynamic power of a few $\mu$W/MSPS from a 2.1 V supply. This is on the same order of magnitude as the single-ended-output CMOS CTA.

Two other new types of advanced CTAs, the DCCTA and PLCTA, were presented in Sections 5.2 and 5.3. Both amplifiers are direct-coupled, allowing for a reduction in die area and input capacitance. The absence of input coupling capacitors also limits the amount of charge kickback on amplifier inputs and reference voltages, such as a resistor ladder in a flash ADC. In the PLCTA, the precharge reference voltage is eliminated completely by separating the PMOS and NMOS output nodes and adding a new dynamic biasing circuit. The output circuitry also generates a near mid-supply common-mode output voltage. Use of the enhanced fully-differential configuration permits nearly rail-to-rail input range by overcoming both the cutoff and convergence conditions. The reported amplifiers preserve the other attractive features of CTAs and consume dynamic power on the same order as the DCTA. As such, this new work improves on the prior CTA designs and provides new alternatives for implementing practical low-power CTA-based A/D converters.

# Chapter 6

# **Dynamic Power**

This chapter discusses a methodology for predicting the dynamic power dissipation of a charge-transfer amplifier. A comparison of theory to measurement data is also presented and the differences are discussed. Figures of merit (FOMs) incorporating power, area and accuracy are proposed, with specific relevance to flash A/D converters. The FOMs provide an objective standard for evaluating the costs and benefits of all of the known CTA architectures.

The most straightforward method of determining the power dissipation of a CTA is to start by examining  $Q_{CYCLE}$ , the charge consumed in each cycle,

$$Q_{CYCLE} = \int_{CYCLE} I(t)dt$$

where I(t) is the time-varying amplifier current drawn from the power supply. However, since I(t) is hard to predict deterministically, the net charge can be written more simply as the sum of products of node capacitances and net cyclic voltage change. For N total circuit nodes, this becomes

$$Q_{CYCLE} = \sum_{i=1}^{N} C_i \Delta V_i, \qquad (6.1)$$

where  $C_i$  represents the capacitance on the *i*th node and  $\Delta V_i$  is the net cyclic voltage change on that node. The average current (charge per unit time) is the cycle charge divided by the cycle time. Average current can also be written as  $Q_{CYCLE}$  times the sample frequency,  $f_S$ ,

$$\begin{aligned} I &= f_S \cdot Q_{CYCLE} \\ &= f_S \sum_{i=1}^{N} C_i \Delta V_i, \end{aligned} \tag{6.2}$$

from which power is obtained by multiplying the cycle charge contribution of each node by the corresponding supply voltage,

$$P = f_S \sum_{i=1}^{N} (C_i \Delta V_i) V_{SUP_i}, \qquad (6.3)$$

where  $V_{SUP_i}$  represents the voltage of the power supply used for charging the *i*th node, referenced to the supply used for discharging the *i*th node.

A term used commonly in this chapter is *dynamic power*, or  $P_D$ . The units of this term are W/SPS (Watts per sample per second) or more commonly  $\mu$ W/MSPS (micro-Watts per mega-sample per second). Dynamic power is defined as

$$\begin{aligned} P_D &= \frac{P}{f_S} \\ &= \sum_{i=1}^{N} (C_i \Delta V_i) V_{SUP_i}. \end{aligned} \tag{6.4}$$

Dynamic power is a preferable notion for power in the analysis of CTAs because it provides a more generalized value for comparison. In much of the literature, the terms "power" and "dynamic power" are used interchangeably. A distinction is provided here to avoid confusion. (Note that the units of dynamic power can be rearranged into units of energy per sample, or J/Sample. This unit is not used because it adds a step when calculating power in watts once the sample frequency is known.)
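The node-by-node sums in (6.3) and (6.4) translate directly into a few lines of code. The following is a minimal sketch: the function names and the two-node example are hypothetical and do not correspond to any circuit in this dissertation. Inputs are in SI units, so the dynamic power comes out in joules per sample (equivalently W/SPS); multiplying by $10^{12}$ converts it to $\mu$W/MSPS.

```python
from typing import Iterable, Tuple

# One entry per circuit node: (C_i in farads, net cyclic swing in volts,
# supply voltage used to charge that node in volts).
Node = Tuple[float, float, float]


def dynamic_power(nodes: Iterable[Node]) -> float:
    """Dynamic power P_D in J/sample (W/SPS), following (6.4)."""
    return sum(c * dv * v_sup for c, dv, v_sup in nodes)


def average_power(nodes: Iterable[Node], f_s: float) -> float:
    """Average power in watts at sample rate f_s, following (6.3)."""
    return f_s * dynamic_power(nodes)


# Hypothetical example: two nodes charged from a 1.25 V supply.
example_nodes = [(0.6e-12, 0.5, 1.25), (0.1e-12, 0.3, 1.25)]
p_d = dynamic_power(example_nodes)
print(f"P_D = {p_d * 1e12:.4f} uW/MSPS")
print(f"P   = {average_power(example_nodes, 1e6) * 1e6:.4f} uW at 1 MSPS")
```

Keeping $P_D$ separate from $f_S$ mirrors the distinction drawn in the text: the same node list can be evaluated at any sample rate without recomputing the charge terms.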

### 6.1 Dynamic Power of the NMOS Charge-transfer Amplifier

With reference to Figure 6.1 (equivalent to Figure 3.2, shown again here for convenience), the capacitor node voltages of the NMOS CTA during the reset, precharge and amplify phases are shown in Table 6.1.

The cycle charge is calculated from Table 6.1 by summing the products of node capacitances ( $C_T$  and  $C_L$ ) and the differences in each node's voltage from the reset phase to the amplify phase:

$$\begin{aligned} Q_{CYCLE} &= C_T \Delta V_{C_T} + C_L \Delta V_{C_L} \\ &= C_T (V_{PR} - V_{TN} + \Delta V_{IN} - V_{SS}) + C_L \left(\Delta V_{IN} \frac{C_T}{C_L}\right) \\ &= C_T (V_{PR} - V_{TN} - V_{SS} + 2\Delta V_{IN}). \end{aligned} \tag{6.5}$$

|       | Reset Phase | Precharge Phase   | Amplify Phase                                     |
|-------|-------------|-------------------|---------------------------------------------------|
| $C_T$ | $V_{SS}$    | $V_{PR} - V_{TN}$ | $V_{PR} - V_{TN} + \Delta V_{IN}$                 |
| $C_L$ | $V_{PR}$    | $V_{PR}$          | $V_{PR} - \Delta V_{IN} \left( C_T / C_L \right)$ |

Table 6.1: Capacitor node voltages in an idealized NMOS CTA

Note that the cycle charge does *not* depend on the load capacitance,  $C_L$ , but only on the transfer capacitance,  $C_T$ . This is because while the charge dissipated at the output node is proportional to  $C_L$ , it also varies linearly with the voltage gain, which is inversely proportional to  $C_L$ . The net effect is a complete cancellation of  $C_L$  from the cycle charge.
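The cancellation of $C_L$ can be confirmed by evaluating the two node terms of (6.5) separately for several load capacitances. This is a minimal sketch under the idealized node voltages of Table 6.1. The function name, the numerical values ($C_T = 0.6$ pF, $V_{PR} = 0$ V, $V_{SS} = -1.25$ V, $V_{TN} = 0.95$ V) and the 50 mV input step are assumptions chosen for illustration, and the result holds only while the input stays below the saturating value.

```python
import math


def nmos_cycle_charge(c_t, c_l, v_pr, v_ss, v_tn, dv_in):
    """Cycle charge of the idealized NMOS CTA from the Table 6.1 node voltages."""
    # C_T node: reset at V_SS, ends the amplify phase at V_PR - V_TN + dV_IN.
    dv_ct = (v_pr - v_tn + dv_in) - v_ss
    # C_L node: reset at V_PR, pulled down by dV_IN * C_T / C_L during amplify.
    dv_cl = dv_in * c_t / c_l
    return c_t * dv_ct + c_l * dv_cl


c_t, v_pr, v_ss, v_tn, dv_in = 0.6e-12, 0.0, -1.25, 0.95, 0.05
closed_form = c_t * (v_pr - v_tn - v_ss + 2.0 * dv_in)  # last line of (6.5)

for c_l in (50e-15, 100e-15, 200e-15):
    q = nmos_cycle_charge(c_t, c_l, v_pr, v_ss, v_tn, dv_in)
    # Each load value gives the same charge as the closed form.
    print(f"C_L = {c_l * 1e15:5.0f} fF  Q_CYCLE = {q * 1e12:.3f} pC  "
          f"matches closed form: {math.isclose(q, closed_form)}")
```

The output-node swing shrinks exactly as fast as $C_L$ grows, so the product $C_L \Delta V_{C_L}$ always equals $C_T \Delta V_{IN}$.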

The same supply voltage,  $V_{PR} - V_{SS}$ , is used to charge all nodes in the NMOS CTA. Therefore, by a combination of (6.4) and (6.5), the input-dependent dynamic power is

$$P_D = C_T (V_{PR} - V_{SS}) \left[ V_{PR} - V_{TN} - V_{SS} + 2\Delta V_{IN} \right]. \tag{6.6}$$

The dynamic power increases linearly with input voltage until the NMOS transistor becomes drain-source saturated. The saturating input voltage is calculated by setting the input-stimulated source and drain voltages equal,

$$V_{PR} - V_{TN} + \Delta V_{IN-SAT} = V_{PR} - \Delta V_{IN-SAT} \frac{C_T}{C_L}, \qquad (6.7)$$

where the left hand side represents the source voltage, which is increased by exactly  $\Delta V_{IN-SAT}$  once the drain-source convergence has occurred. On the right hand side, when this happens the drain voltage has been decreased by exactly  $\Delta V_{IN-SAT} \frac{C_T}{C_L}$ . The saturating input voltage is found by solving (6.7),

$$\Delta V_{IN-SAT} = \frac{V_{TN}}{1 + \frac{C_T}{C_L}}. \tag{6.8}$$

When $\Delta V_{IN} \leq 0$, only the precharging and resetting of $C_T$ contribute dynamic power. This amounts to

$$P_{D_0} = C_T (V_{PR} - V_{SS}) (V_{PR} - V_{SS} - V_{TN}). \tag{6.9}$$



Figure 6.1: NMOS CTA in its operating phases: (a) reset, (b) precharge and (c) amplify

Figure 6.2 shows how $P_D$ remains fixed above $\Delta V_{IN-SAT}$ and is linearly dependent on $\Delta V_{IN}$ between 0 and $\Delta V_{IN-SAT}$, with slope

$$\frac{dP_D}{d\Delta V_{IN}} = 2C_T (V_{PR} - V_{SS}). \tag{6.10}$$

The maximum dynamic power, $P_{D_{MAX}}$, is

$$\begin{aligned} P_{D_{MAX}} &= C_T (V_{PR} - V_{SS}) \left[ V_{PR} - V_{TN} - V_{SS} + 2\Delta V_{IN-SAT} \right] \\ &= C_T (V_{PR} - V_{SS}) \left[ V_{PR} - V_{SS} - V_{TN} \frac{C_T - C_L}{C_T + C_L} \right]. \end{aligned} \tag{6.11}$$
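The second line of (6.11) follows from substituting (6.8) into the first. The intermediate algebra, written out using only the symbols already defined, is

$$\begin{aligned} 2\Delta V_{IN-SAT} &= \frac{2V_{TN}}{1 + \frac{C_T}{C_L}} = \frac{2 V_{TN} C_L}{C_T + C_L}, \\ -V_{TN} + 2\Delta V_{IN-SAT} &= -V_{TN}\,\frac{(C_T + C_L) - 2C_L}{C_T + C_L} = -V_{TN}\,\frac{C_T - C_L}{C_T + C_L}. \end{aligned}$$

For $C_T > C_L$ the threshold term therefore still lowers the maximum dynamic power, but by less than it lowers $P_{D_0}$ in (6.9).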



Figure 6.2: Dynamic power profile of an ideal NMOS CTA

#### Supply Voltage Dependence

The supply voltage,  $V_{SUP}$ , is considered to be the difference between maximum and minimum circuit voltages. In this case

$$V_{SUP} = V_{PR} - V_{SS}. \tag{6.12}$$

Both (6.9) and (6.11) have square and linear terms in  $V_{SUP}$ . That is to say, the equations describing dynamic power have the form  $a(V_{SUP})^2 - b(V_{SUP})$ . Thus the overall dynamic power has both positive square and negative linear dependencies on  $V_{SUP}$ . This point becomes important when designing an A/D converter for minimal power dissipation. The best solution inevitably involves a tradeoff between low power (low supply voltage) and high bandwidth (high supply voltage).
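Expanding (6.9) and (6.11) with $V_{SUP} = V_{PR} - V_{SS}$ makes the two coefficients explicit. The identification below is a worked restatement that treats $V_{TN}$ as independent of $V_{SUP}$, which is an idealization:

$$\begin{aligned} P_{D_0} &= C_T V_{SUP}^2 - \left(C_T V_{TN}\right) V_{SUP}, \\ P_{D_{MAX}} &= C_T V_{SUP}^2 - \left(C_T V_{TN}\,\frac{C_T - C_L}{C_T + C_L}\right) V_{SUP}. \end{aligned}$$

In both cases $a = C_T$. The linear coefficient is $b = C_T V_{TN}$ for (6.9) and $b = C_T V_{TN}(C_T - C_L)/(C_T + C_L)$ for (6.11), which is positive whenever $C_T > C_L$.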

#### Threshold Modulation

As seen in Chapter 4, threshold modulation plays an important role in the dynamic behavior of CTAs. In the analysis above, each instance of $V_{TN}$ applies to the *biased threshold*, meaning the threshold modulated by second-order effects such as DIBL and the body effect. In practice, depending on the technology, if the zero-bias threshold were 0.7 V, then the biased threshold might be 0.85–0.95 V. Since the threshold is always subtracted in the equations above, increasing the threshold means less power dissipation.

#### Example Power Estimation

The parameters in Table 6.2 might be typical for constructing a charge-transfer amplifier with a voltage gain of 6 V/V in a submicron CMOS process. With these parameters, the components of the power profile are shown in Table 6.3.

| Parameter | Value | Units         |
|-----------|-------|---------------|
| $V_{PR}$  | 0     | V             |
| $V_{DD}$  | 1.25  | V             |
| $V_{SS}$  | -1.25 | V             |
| $V_{TN}$  | 0.95  | V             |
| $V_{TP}$  | -1.15 | V             |
| $C_T$     | 0.6   | $\mathrm{pF}$ |
| $C_L$     | 0.1   | $\mathrm{pF}$ |

Table 6.2: Typical circuit parameters for a CTA in a 0.6  $\mu$ m CMOS process

Table 6.3: Components of the power profile of a typical NMOS CTA

| Profile Parameter             | Value | Units          |
|-------------------------------|-------|----------------|
| $\Delta V_{IN-SAT}$           | 0.14  | V              |
| $P_{D_0}$                     | 2.3   | $\mu W/MSPS$   |
| $P_{D_{MAX}}$                 | 4.3   | $\mu W/MSPS$   |
| $\frac{dP_D}{d\Delta V_{IN}}$ | 1.5   | $\mu W/MSPS/V$ |

The average dynamic power drawn by an NMOS CTA is calculated by

$$P_{D-AVG} = \int_{\Delta V_{IN}} P_D(\Delta V_{IN})\, g(\Delta V_{IN})\, d\Delta V_{IN} \tag{6.13}$$

where  $g(\Delta V_{IN})$  is the probability density function of  $\Delta V_{IN}$ . For instance, if  $\Delta V_{IN}$ is uniformly distributed between 0 V and 1 V, then (6.13) reduces to the average power in Figure 6.2 for  $\Delta V_{IN} \in [0,1]$ . In this case, referring again to Table 6.3, the predicted dynamic power would be 4.15  $\mu$ W/MSPS. This is roughly equivalent to a 10-bit subranging A/D consuming just 300  $\mu$ W at 1 MSPS, not counting the resistor ladder and encoding logic.
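The profile of Figure 6.2 and the average in (6.13) can be evaluated with a short script. The sketch below is illustrative only: the function names and the midpoint-rule integration are assumptions, and the inputs are the Table 6.2 values expressed in pF and V, so that results are in pJ per sample, which is numerically equal to $\mu$W/MSPS. Evaluated literally in this way, the saturation voltage and slope agree with Table 6.3, whereas the power levels and the average come out a factor of ten below the quoted 2.3, 4.3 and 4.15 $\mu$W/MSPS. The sketch reports the literal values and makes no attempt to reconcile that difference.

```python
def nmos_profile(c_t, c_l, v_pr, v_ss, v_tn):
    """Profile parameters of the ideal NMOS CTA; pF and V in, uW/MSPS (pJ) out."""
    v_sup = v_pr - v_ss
    dv_sat = v_tn / (1.0 + c_t / c_l)    # saturating input, (6.8)
    p_d0 = c_t * v_sup * (v_sup - v_tn)  # floor for dV_IN <= 0, (6.9)
    slope = 2.0 * c_t * v_sup            # slope below saturation, (6.10)
    p_dmax = p_d0 + slope * dv_sat       # plateau, first line of (6.11)
    return dv_sat, p_d0, p_dmax, slope


def nmos_power(dv_in, profile):
    """Piecewise-linear P_D(dV_IN): flat, then linear, then flat again."""
    dv_sat, p_d0, _, slope = profile
    clamped = min(max(dv_in, 0.0), dv_sat)
    return p_d0 + slope * clamped


def average_uniform(power_fn, lo, hi, steps=20_000):
    """Midpoint-rule estimate of (6.13) for dV_IN uniform on [lo, hi]."""
    width = (hi - lo) / steps
    total = sum(power_fn(lo + (k + 0.5) * width) for k in range(steps))
    return total / steps


profile = nmos_profile(c_t=0.6, c_l=0.1, v_pr=0.0, v_ss=-1.25, v_tn=0.95)
p_avg = average_uniform(lambda dv: nmos_power(dv, profile), 0.0, 1.0)
print("dV_sat, P_D0, P_DMAX, slope:", [round(x, 3) for x in profile])
print("average over [0, 1] V:", round(p_avg, 3))
```

Replacing the uniform density with any other $g(\Delta V_{IN})$ only requires weighting the samples inside `average_uniform`.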

### 6.2 Dynamic Power of the CMOS Charge-transfer Amplifier

The power profile of a CMOS CTA is shown in Figure 6.3. For $\Delta V_{IN} > 0$, the profile is identical to that of the NMOS CTA except for the addition of a constant term to account for the reset and precharge power in the PMOS channel.



Figure 6.3: Dynamic power profile of an ideal CMOS CTA

Following a development similar to that in Section 6.1, it can be shown that the dynamic power profile is defined by the following:

$$P_{D_0} = P_{D_0-N} + P_{D_0-P} \tag{6.14}$$

$$P_{D_0-N} = C_T \left[ (V_{PR} - V_{SS})(V_{PR} - V_{TN} - V_{SS}) \right] \tag{6.15}$$

$$P_{D_0-P} = C_T \left[ (V_{DD} - V_{PR})(V_{DD} - V_{PR} + V_{TP}) \right] \tag{6.16}$$

$$P_{D_{MAX-N}} = C_T (V_{PR} - V_{SS}) \left[ V_{PR} - V_{SS} + V_{TN} \frac{C_L - C_T}{C_L + C_T} \right] + P_{D_0-P} \tag{6.17}$$

$$P_{D_{MAX-P}} = C_T (V_{DD} - V_{PR}) \left[ V_{DD} - V_{PR} - V_{TP} \frac{C_L - C_T}{C_L + C_T} \right] + P_{D_0-N} \tag{6.18}$$

$$\Delta V_{IN-SAT-N} = \frac{V_{TN}}{1 + \frac{C_T}{C_L}} \tag{6.19}$$

$$\Delta V_{IN-SAT-P} = \frac{V_{TP}}{1 + \frac{C_T}{C_L}} \tag{6.20}$$

$$\frac{dP_D}{d\Delta V_{IN}} = \begin{cases} -2C_T(V_{DD} - V_{PR}) & \Delta V_{IN-SAT-P} < \Delta V_{IN} < 0\\ 2C_T(V_{PR} - V_{SS}) & 0 < \Delta V_{IN} < \Delta V_{IN-SAT-N}. \end{cases} \tag{6.21}$$

It is expected that the maximum power attributed to the PMOS channel is greater than the maximum power from the NMOS channel, since in general  $|V_{TP}| > |V_{TN}|$ . Table 6.4 shows the parameters of the resulting power profile when the values of Table 6.2 are assumed.

| Profile Parameter             | Value                                              | Units          |
|-------------------------------|----------------------------------------------------|----------------|
| $\Delta V_{IN-SAT-N}$         | 0.14                                               | V              |
| $\Delta V_{IN-SAT-P}$         | -0.16                                              | V              |
| $P_{D_0-N}$                   | 2.25                                               | $\mu W/MSPS$   |
| $P_{D_0-P}$                   | 0.75                                               | $\mu W/MSPS$   |
| $P_{D_0}$                     | 3.0                                                | $\mu W/MSPS$   |
| $P_{D_{MAX-N}}$               | 5.0                                                | $\mu W/MSPS$   |
| $P_{D_{MAX-P}}$               | 5.5                                                | $\mu W/MSPS$   |
| $\frac{dP_D}{d\Delta V_{IN}}$ | $1.5 \ (0 < \Delta V_{IN} < \Delta V_{IN-SAT-N})$  | $\mu W/MSPS/V$ |
| $\frac{dP_D}{d\Delta V_{IN}}$ | $-1.5 \ (\Delta V_{IN-SAT-P} < \Delta V_{IN} < 0)$ | $\mu W/MSPS/V$ |

Table 6.4: Components of the power profile of a typical CMOS CTA

As an example of typical average dynamic power dissipation, if the input signal is uniformly distributed over -1 V to 1 V and the parameters in Table 6.2 are assumed, then expected dynamic power dissipation calculated by using (6.13) would be 5.08  $\mu$ W/MSPS.
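Equations (6.14) through (6.21) define a two-sided piecewise-linear profile, which can be averaged in the same way as (6.13). The following is a minimal sketch with hypothetical function names, using the Table 6.2 values in pF and V so that results are in pJ per sample ($\mu$W/MSPS). With these literal inputs, the saturation voltages and slopes agree with Table 6.4, while the power levels come out at one tenth of those tabulated (for example, an average near 0.51 rather than 5.08 $\mu$W/MSPS). The sketch does not attempt to reconcile this.

```python
def cmos_profile(c_t, c_l, v_dd, v_pr, v_ss, v_tn, v_tp):
    """Parameters of the ideal CMOS CTA power profile; v_tp is negative."""
    k = 1.0 + c_t / c_l
    p0_n = c_t * (v_pr - v_ss) * (v_pr - v_tn - v_ss)  # (6.15)
    p0_p = c_t * (v_dd - v_pr) * (v_dd - v_pr + v_tp)  # (6.16)
    return {
        "sat_n": v_tn / k,                      # (6.19)
        "sat_p": v_tp / k,                      # (6.20), negative
        "p0": p0_n + p0_p,                      # (6.14)
        "slope_n": 2.0 * c_t * (v_pr - v_ss),   # (6.21), dV_IN > 0
        "slope_p": -2.0 * c_t * (v_dd - v_pr),  # (6.21), dV_IN < 0
    }


def cmos_power(dv_in, p):
    """Piecewise-linear P_D(dV_IN): V-shaped around zero with two plateaus."""
    if dv_in >= 0.0:
        return p["p0"] + p["slope_n"] * min(dv_in, p["sat_n"])
    return p["p0"] + p["slope_p"] * max(dv_in, p["sat_p"])


def average_uniform(power_fn, lo, hi, steps=20_000):
    """Midpoint-rule estimate of (6.13) for dV_IN uniform on [lo, hi]."""
    width = (hi - lo) / steps
    total = sum(power_fn(lo + (k + 0.5) * width) for k in range(steps))
    return total / steps


params = cmos_profile(c_t=0.6, c_l=0.1, v_dd=1.25, v_pr=0.0, v_ss=-1.25,
                      v_tn=0.95, v_tp=-1.15)
p_max_n = cmos_power(params["sat_n"], params)  # equals (6.17)
p_max_p = cmos_power(params["sat_p"], params)  # equals (6.18)
p_avg = average_uniform(lambda dv: cmos_power(dv, params), -1.0, 1.0)
print("P_D0:", round(params["p0"], 3))
print("P_DMAX-N, P_DMAX-P:", round(p_max_n, 3), round(p_max_p, 3))
print("average over [-1, 1] V:", round(p_avg, 3))
```

The PMOS plateau exceeds the NMOS plateau here because $|V_{TP}| > |V_{TN}|$ pushes the PMOS saturation point further from zero.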

### 6.3 Discussion

The formulations of Sections 6.1 and 6.2 are based on idealized assumptions. But how does the analysis compare with reality?

Consider the CMOS CTA. Kotani measured 4.3  $\mu$ W/MSPS dynamic power per comparator drawn from a 3 V supply when  $C_T = 500$  fF [2]. If the circuit values from Table 6.2 are modified for  $C_T = 500$  fF and  $V_{SUP} = 3$  V ( $V_{DD} = 1.5$  V and  $V_{SS} = -1.5$  V), then the analysis above predicts 9.14  $\mu$ W/MSPS, or 113% higher than measured. This glaring difference between measurement and theory may be accounted for by any of the following circumstances:

1. <u>Process</u>. The circuits measured by Kotani were fabricated in a different CMOS process than the one used for all circuits described in this dissertation. It is possible that the threshold voltage was higher in [2] than in this work.
2. <u>Threshold Modulation</u>. The threshold voltages above are based on a supply of 2.5 V. When the supply voltage is raised to 3 V, the swing on MOS source nodes becomes much larger. This increases the modulation caused by the body effect, which implies a higher threshold voltage. If an additional 0.05 V of threshold modulation is assumed at a 3 V supply, then the analysis above predicts 7.25 $\mu$W/MSPS, slightly closer to the measurements. This example shows that power estimates can be far too conservative because threshold modulation is difficult to predict without performing simulations.
3. <u>Incomplete Charging</u>. The development above assumes complete charging and discharging of all nodes in the CTA. This is an unrealistic assumption, especially for the source nodes which drive the majority of the power dissipation. Even at 100 kSPS, the NMOS source node, for example, does not reach $V_{PR}-V_{TN}$ in the precharge phase. Above 10 MSPS, the precharging is so incomplete that the MOS transistors begin to act like devices with source nodes continuously biased near their respective supply voltages rather than precharged to the point of cutoff. If the numbers above are modified for 0.2 V less precharging at the source nodes, then the analysis above predicts 6.55 $\mu$W/MSPS, again slightly closer to the measurements.
4. <u>Aggregate Charge Cancellation</u>. In a flash converter, such as the one used in [2] for the average power measurements, some of the CTA outputs will be saturated high and others will be saturated low. When all of the CTA outputs are reset to $V_{PR}$, the excess charge stored on the high nodes is absorbed by the charge-depleted low nodes. In fact, if the input were uniformly distributed across the input range and the NMOS and PMOS thresholds were equal, then the output nodes would on average draw zero net charge from $V_{PR}$. Therefore, removing the dynamic power term attributed to resetting the output nodes gives a fair approximation of the dynamic power of individual CTAs in an A/D converter array. Doing so in the analysis above reduces the predicted dynamic power to 7.95 $\mu$W/MSPS.

If the last three considerations above are all incorporated together as described, then the predicted power drops to 4.45  $\mu$ W/MSPS, almost exactly in line with the measurements in [2]. Although this is close enough to be considered an accurate estimate of reality, items (2) and (3) above are not easily generalized for a closed form model and are, in fact, highly nonlinear functions which can only be solved numerically (or, better, in a Spice simulator). Item (4) above is easy to add into the estimating calculations.

### 6.4 Application to New Architectures

It is a straightforward exercise to apply the techniques used in this chapter to create a model of the dynamic power for any of the amplifiers described in recent literature [10, 15] or for those in Chapter 5. The models are not derived in this dissertation for two reasons: first, because the resulting power profiles are identical to Figure 6.3 (with different parameters of course), and second, because the idealized models have been shown to be inaccurate. When estimating the power of a CTA-based comparator or A/D converter, it is always recommended to

- Perform an idealized analysis to obtain a rough estimate of the power, with the foreknowledge that the estimate will be higher than reality.
- Simulate the transient power of the individual CTA or CTA-based comparator in order to understand the underlying sources of power dissipation and the charge conveyance characteristics.
- Simulate the transient power of the comparator array in the A/D converter with references and input voltages connected normally. It is expected that an immediate average power reduction will be observed due to aggregate charge sharing at the output nodes. The total power of the amplifiers divided by the number of amplifiers provides an estimate of the average individual contributions that is reasonably accurate compared to what will be observed in silicon.

Figure 6.4 shows the simulated dynamic power of each of the charge-transfer amplifiers discussed in this report. The sample rate was 100 kSPS and the circuit values of Table 6.2 were assumed for a typical 0.6 $\mu$m CMOS process. $\Delta V_{IN}$ was uniformly distributed from -1 V to 1 V, except for the NMOS CTA where the distribution was uniform over 0 V to 1 V. When plotted with a log y-axis, the data show a square dependence on supply voltage, minus what appears to be a linear term as discussed above with respect to (6.9) and (6.11). If the dependence on supply voltage were square only, then the data would appear as perfect lines on the log-scale graph.

### 6.5 Figures of Merit

Figure 6.4: Dynamic power consumption of charge-transfer amplifiers with similar circuit components

A figure of merit (FOM) can be used to compare the overall performance (or the cost) of charge-transfer amplifiers. Since the purpose of this dissertation involves the use of charge-transfer amplifiers in data converter applications, a commonly used FOM for A/D converters is used as the basis for introducing CTA-related FOMs. The most widely accepted figure of merit for A/D converters is [26]

$$FOM = \frac{P}{2^{ENOB} \cdot 2BW} \tag{6.22}$$

where BW is the input bandwidth, ENOB is the effective number of bits when sampling at Nyquist (i.e., for non-oversampling converters) and P is the power dissipation. The units are technically joules, but it is more common to refer to this figure as "joules per bit transition" (or simply  $\frac{pJ}{step}$  or  $\frac{fJ}{step}$ ).
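As a quick illustration of (6.22), the sketch below evaluates the figure of merit for a hypothetical converter. The function name and the example numbers (1 mW, 8 effective bits, 10 MHz input bandwidth) are assumptions chosen only to show the unit handling, and do not describe any converter in this dissertation.

```python
def adc_fom(power_w, enob_bits, bandwidth_hz):
    """Energy per conversion step in joules, following (6.22)."""
    return power_w / (2.0 ** enob_bits * 2.0 * bandwidth_hz)


# Hypothetical converter: 1 mW, ENOB = 8 bits, 10 MHz input bandwidth.
fom_joules = adc_fom(power_w=1e-3, enob_bits=8.0, bandwidth_hz=10e6)
print(f"FOM = {fom_joules * 1e12:.3f} pJ/step")
```

Each additional effective bit at the same power and bandwidth halves the figure, which is why offset (and hence accuracy) enters the CTA-specific figures of merit exponentially.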

Some FOMs also incorporate active circuit area, supply voltage and/or minimum feature size [27]. Many additional customized figures of merit have been proposed through the years, usually to show that a particular converter offers the "best" overall performance for a particular application. Since the sample rate is already built into the units of dynamic power, none of the figures of merit suggested below contain a bandwidth or sample rate component. Four CTA-related FOMs are presented here, all based on (6.22) for relevance to A/D converter applications. All of the reported charge-transfer amplifiers are compared in terms of the proposed figures of merit. A second comparison is also performed for those CTAs exhibiting zero mean offset voltage, since only these amplifiers can be used in subranging A/D converters and many flash converters requiring zero global offset (or gain offset).

The four proposed figures of merit account for various combinations of power, accuracy, area and input capacitance, as follows. In each case, a smaller figure of merit indicates better performance.

$$FOM_1 = \frac{P}{2^{ACC}} \tag{6.23}$$

$$FOM_2 = \frac{P + C_{IN}V_{SUP}^2}{2^{ACC}} \tag{6.24}$$

$$FOM_3 = \frac{P \cdot A}{2^{ACC}} \tag{6.25}$$

$$FOM_4 = \frac{(P + C_{IN}V_{SUP}^2)A}{2^{ACC}}. \tag{6.26}$$

Here, ACC represents the relative accuracy in bits, as explained in the next paragraph. A is the active area and $C_{IN}$ is the input capacitance. The recommended units of FOM<sub>1</sub> and FOM<sub>2</sub> are $\frac{fJ}{step}$, the same units as in (6.22). The units of FOM<sub>3</sub> and FOM<sub>4</sub> are $\frac{pJ-\mu m^2}{step}$. FOM<sub>1</sub> measures the raw energy per sampled bit step and FOM<sub>3</sub> describes both the energy and silicon area required for a sampled bit step. FOM<sub>2</sub> incorporates input charging energy by adding the term $C_{IN}V_{SUP}^2$. This allows the figure of merit to represent not only the energy consumed by the CTA, but also the worst-case energy drawn from the source for input charging.<sup>1</sup> FOM<sub>4</sub> is a combination of all of the elements of the other three. If the input capacitance goes to zero, then $FOM_1 = FOM_2$ and $FOM_3 = FOM_4$.

<sup>1</sup>The worst-case input charging energy occurs when the source drives the input from one supply rail to the other in each successive cycle. This represents sampling at or below the Nyquist rate. The actual energy required to charge the input capacitors is $\frac{1}{2}C_{IN}V_{SUP}^2$, but twice that amount is actually drawn from the source due to resistive heat dissipation. The energy dissipated in the form of resistive heat always equals the energy delivered to the capacitor, regardless of the source's Thévenin equivalent resistance.

The term ACC is intended to represent the best achievable accuracy, in bits, of a CTA-based A/D converter. The formula is

$$ACC = \log_2\left(\frac{V_{FS}}{V_{OS-IN}}\right) \tag{6.27}$$

where  $V_{OS-IN}$  is the input-referred offset of the CTA (not counting the native offset) and  $V_{FS}$  is the full-scale range of the A/D converter in volts. For instance, if  $V_{FS}$ = 2.5 V and  $V_{OS-IN}$  = 2.5 mV, then ACC = 9.97 bits. In other words, the CTA is capable of realizing at best a 9.97 bit accurate flash A/D. This is not the same as the *effective number of bits* (ENOB) used in (6.22), but it does provide an insightful yardstick for comparison of the different amplifiers for converter applications.

The term  $V_{OS-IN}$  is calculated as the sum of the inherent offset of the amplifier  $(3\sigma_{CTA})$  plus the offset of the latch  $(3\sigma_{LATCH})$  scaled by the gain of the CTA,

$$V_{OS-IN} = 3\sigma_{CTA} + \frac{3\sigma_{LATCH}}{A_{CTA}}.$$
 (6.28)

In this way, any figure of merit utilizing the factor  $2^{ACC}$  accounts for both the CTA offset and the residual input-referred offset of the latch. Therefore, the offset behavior of the latch to be used must be known before it is possible to calculate these A/D converter based figures of merit for a charge-transfer amplifier. This is an important feature of the FOMs because it appropriately includes the CTA gain in the figure of merit. Note that  $V_{OS-IN}$  does not include the native, or mean, offset. The mean offset is irrelevant for many flash converters since it translates into a global offset but not a bitwise offset. The issue of mean offset is addressed later in this section for converter applications requiring zero global offset.
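The effect of the CTA gain on the input-referred offset in (6.28) can be illustrated with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the example values are chosen to match the CMOS CTA and FCTA entries of Table 6.5 with a 10 mV latch offset and a 2.5 V full-scale range.

```python
import math

def v_os_in(sigma_cta, sigma_latch, gain):
    """Input-referred offset per (6.28), in volts. gain may be math.inf."""
    return 3.0 * sigma_cta + 3.0 * sigma_latch / gain

def acc_bits(v_fs, v_offset):
    """Relative accuracy in bits per (6.27)."""
    return math.log2(v_fs / v_offset)

# sigma_CTA = 1.4 mV, sigma_LATCH = 10 mV, A_CTA = 4.1 (CMOS CTA values)
finite_gain = v_os_in(1.4e-3, 10e-3, 4.1)
# sigma_CTA = 0.6 mV with infinite gain (FCTA values): latch term vanishes
infinite_gain = v_os_in(0.6e-3, 10e-3, math.inf)

print(f"{finite_gain * 1e3:.2f} mV, ACC = {acc_bits(2.5, finite_gain):.2f} bits")
print(f"{infinite_gain * 1e3:.2f} mV, ACC = {acc_bits(2.5, infinite_gain):.2f} bits")
```

With a gain of 4.1, the latch contributes more to the input-referred offset than the amplifier itself, which is why the gain must appear in the figure of merit.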

Table 6.5 compares the figures of merit for all of the known charge-transfer amplifiers at 100 kSPS and 2.5 V supply. The full-scale input range,  $V_{FS}$, equals 2.5 V for all amplifiers except the NMOS CTA (1.25 V), the DCCTA (1.3 V), and the PLCTA (2.4 V). The input capacitance was assumed to be 0.7 pF for all but the NMOS CTA (0.35 pF), the DCCTA (0.07 pF), and the PLCTA (0.07 pF). The assumed latch offset voltage,  $\sigma_{LATCH}$, was 10 mV.

| CTA   | $P_D$ ($\mu W/MSPS$) | $A_{CTA}$ (V/V) | Area ($\mu m^2$) | $V_{OS-CTA}$, $\mu$ (mV) | $V_{OS-CTA}$, $\sigma$ (mV) | $FOM_1$ ($\frac{fJ}{step}$) | $FOM_2$ ($\frac{fJ}{step}$) | $FOM_3$ ($\frac{pJ-\mu m^2}{step}$) | $FOM_4$ ($\frac{pJ-\mu m^2}{step}$) |
|-------|------|----------|------|-----|-----|------|------|-------|-------|
| NMOS  | 0.4  | 2.4      | 2940 | 125 | 6.0 | 9.6  | 22.9 | 28.4  | 67.4  |
| CMOS  | 0.8  | 4.1      | 4700 | 12  | 1.4 | 3.8  | 24.1 | 18.0  | 113.4 |
| PDCTA | 1.5  | 4.1      | 9390 | 0   | 1.2 | 6.6  | 25.3 | 61.7  | 237.4 |
| FCTA  | 13.0 | $\infty$ | 5170 | 25  | 0.6 | 9.4  | 12.5 | 48.4  | 64.7  |
| DCTA  | 3.4  | 7.1      | 6570 | 0   | 1.1 | 10.2 | 23.4 | 67.3  | 153.8 |
| DCCTA | 3.3  | 8.2      | 4930 | 0   | 1.4 | 20.2 | 20.9 | 99.4  | 103.0 |
| PLCTA | 6.3  | 8.9      | 5520 | 0   | 2.1 | 25.2 | 26.8 | 139.2 | 148.1 |

Table 6.5: Figures of merit for known charge-transfer amplifiers

Some of the results in the table above are intuitive, but the implications warrant a discussion.

First, input charging adds a significant amount of overhead in the capacitively coupled CTAs. For instance, the CMOS CTA has a comparatively low FOM<sub>1</sub> of 3.8, but FOM<sub>2</sub> is 6.3 times higher at 24.1 due to the input charging energy. In other words, the energy conceivably required to drive the input of the CMOS CTA is more than five times the energy that the amplifier itself consumes. As shown by the collective results for FOM<sub>2</sub>, the input charging energy totally drowns out any power advantage that the CMOS CTA might have over the fully differential configurations that consume much more internal power. Because of their low input capacitance, the DCCTA and the PLCTA either beat or nearly match the FOM<sub>2</sub> performance of the CMOS CTA.

The FCTA provides a superior  $FOM_2$  and  $FOM_4$ . This was not an expected outcome based on the FCTA's high dynamic power dissipation. Even though its internal energy is much larger than that of the other amplifiers, the low input referred offset variance, infinite gain and relatively small size lead to superior overall performance.

The pseudo-differential amplifier (PDCTA) is the worst performing when area and input charging are considered (FOM<sub>4</sub>). This validates some of the earlier assumptions made in Chapter 5 that the enhanced differential architecture offers advantages over the "brute force" approach to offset nullification.

The FOM<sub>2</sub> results are almost equal for all of the amplifiers except the FCTA. As mentioned above, the reason for this result is that FOM<sub>2</sub> becomes almost totally dominated by the input charging energy for the NMOS CTA, CMOS CTA, PDCTA and DCTA. Whereas the DCCTA and PLCTA look much worse in terms of FOM<sub>1</sub>, which accounts for internal energy only, they are equally competitive on FOM<sub>2</sub> due to their low input capacitance.

A possible conclusion to draw from Table 6.5 is that for flash converters where global offset is not a problem, the best options for circuit performance only (not area) would be the CMOS CTA, FCTA or DCCTA. On the basis of area and power, the CMOS CTA is superior. The following section goes into greater depth regarding the best options for converters requiring zero global offset.
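The rankings discussed above can be reproduced directly from the tabulated values. The sketch below transcribes the four figure-of-merit columns of Table 6.5 and sorts them, treating a lower value as better since each figure expresses energy (or energy times area) per step. The helper names are hypothetical.

```python
# (FOM1, FOM2, FOM3, FOM4) transcribed from Table 6.5
TABLE_6_5 = {
    "NMOS":  (9.6, 22.9, 28.4, 67.4),
    "CMOS":  (3.8, 24.1, 18.0, 113.4),
    "PDCTA": (6.6, 25.3, 61.7, 237.4),
    "FCTA":  (9.4, 12.5, 48.4, 64.7),
    "DCTA":  (10.2, 23.4, 67.3, 153.8),
    "DCCTA": (20.2, 20.9, 99.4, 103.0),
    "PLCTA": (25.2, 26.8, 139.2, 148.1),
}

def rank(fom_index, table=TABLE_6_5):
    """Amplifier names sorted from best (lowest) to worst for FOM 1..4."""
    return sorted(table, key=lambda name: table[name][fom_index - 1])

def input_charging_ratio(name, table=TABLE_6_5):
    """FOM2 / FOM1: how strongly input charging inflates the energy figure."""
    fom1, fom2 = table[name][0], table[name][1]
    return fom2 / fom1

for k in (1, 2, 3, 4):
    print(f"FOM{k}: " + " < ".join(rank(k)))
print(round(input_charging_ratio("CMOS"), 1))   # 6.3
```

The ratio FOM<sub>2</sub>/FOM<sub>1</sub> is close to unity for the DCCTA and PLCTA, which reflects their small input capacitance.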

#### Zero Mean Offset Amplifiers

Amplifiers with no systematic (zero mean) offset are necessary for subranging A/D converters, as well as some flash converters with tight global offset specifications. The figures of merit for zero mean offset CTAs are now considered separately from the other amplifier architectures.

Table 6.6 lists the figure of merit data for those amplifiers in Table 6.5 exhibiting zero mean offset. The data are unchanged and are repeated in this table only for convenience in making comparisons.

The PDCTA actually consumes the least power for a given accuracy, according to  $FOM_1$ . Looking at  $FOM_2$ , input charging adds a significant amount of energy to the PDCTA and DCTA, but makes almost no difference in the DCCTA and PLCTA. The best performing amplifier in terms of  $FOM_2$  is the DCCTA.

When area is a primary concern, either  $FOM_3$  or  $FOM_4$  is useful. The PDCTA and DCTA are almost equally advantageous in terms of  $FOM_3$, where input

| CTA   | $P_D$ ($\mu W/MSPS$) | $A_{CTA}$ (V/V) | Area ($\mu m^2$) | $V_{OS-CTA}$, $\mu$ (mV) | $V_{OS-CTA}$, $\sigma$ (mV) | $FOM_1$ ($\frac{fJ}{step}$) | $FOM_2$ ($\frac{fJ}{step}$) | $FOM_3$ ($\frac{pJ-\mu m^2}{step}$) | $FOM_4$ ($\frac{pJ-\mu m^2}{step}$) |
|-------|-----|-----|------|---|-----|------|------|-------|-------|
| PDCTA | 1.5 | 4.1 | 9390 | 0 | 1.2 | 6.6  | 25.3 | 61.7  | 237.4 |
| DCTA  | 3.4 | 7.1 | 6570 | 0 | 1.1 | 10.2 | 23.4 | 67.3  | 153.8 |
| DCCTA | 3.3 | 8.2 | 4930 | 0 | 1.4 | 20.2 | 20.9 | 99.4  | 103.0 |
| PLCTA | 6.3 | 8.9 | 5520 | 0 | 2.1 | 25.2 | 26.8 | 139.2 | 148.1 |

Table 6.6: Figures of merit for zero mean offset charge-transfer amplifiers

charging energy is neglected. However, the combined figure of merit, FOM<sub>4</sub>, shows that the PDCTA performs worst by far, followed by the DCTA. The DCCTA appears to offer the best overall performance in terms of FOM<sub>4</sub>, although the PLCTA may be preferred if rail-to-rail input range is important to the application. The fact that FOM<sub>4</sub> of the PLCTA is slightly better than that of the DCTA is noteworthy because it shows that the reduction in input capacitance and die area overcomes the near doubling in power. One noteworthy advantage of the DCTA is that the input capacitors can be used to perform a temporal sampling function, whereas the DCCTA and PLCTA require a separate S/H circuit for applications where the reference voltage is compared to a time-delayed input voltage, such as in a subranging converter. For this reason, the DCTA was chosen to implement the 10-bit ADC described in the next chapter.

#### 6.6 Subthreshold Operation

As mentioned earlier, subthreshold conduction is not negligible in charge-transfer amplifiers. Conduction currents exist in the active devices even when the supply voltage is below the ideally predicted minimum. In this case, the devices operate in the subthreshold region with a current that decreases exponentially with supply voltage. Although small, this current does achieve amplification through charge transfer. The peak sample rate then drops exponentially with supply voltage, or linearly with the subthreshold current.

In laboratory tests, operation of the PDCTA comparator was measured as the supply voltage was decreased from the nominal 2.1 V down to 0 V. A summary of the results for a single representative test chip appears in Figure 6.5. The so-called  $1-\sigma$  bandwidth (SB1) was determined by measuring the frequency at which the offset voltage shifted by one standard deviation. This method of measurement approximates the frequency limitation imposed by subthreshold conduction. It is assumed that  $\sigma$  is already known, although there is no reason why an arbitrary offset shift could not be chosen as well. For example, 1 mV, 2 mV or a voltage equaling 1/2 LSB for an N-bit A/D converter could also be selected as the triggering offset shift.

The SB1 was observed to be about 40 MSPS for supply voltages above 2 V. The bandwidth drops off exponentially below 2 V, as expected. At 1.2 V supply, the SB1 was 18 kSPS. While this is not the lowest voltage published for an 18 kSPS comparator, the power dissipation at this operating point was just 3.78 pW (determined by simulation), or 0.21 pW/kSPS. The same PLCTA consumes four orders of magnitude higher dynamic power when operated at 2.1 V supply. To the author's knowledge, no other comparator sampling in the tens of kSPS range has achieved power dissipation on the order of picowatts.
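The quoted operating points lend themselves to a short back-of-the-envelope calculation. The sketch below is illustrative only. It normalizes the simulated power to the sample rate and estimates the slope of the roll-off in decades of sample rate per volt, under the assumption (not stated in the measurements) that the roll-off starts exactly at 2 V and follows a straight line on a logarithmic rate axis down to 1.2 V. The function names are hypothetical.

```python
import math

def power_per_ksps(power_pw, rate_ksps):
    """Power normalized to sample rate, in pW/kSPS."""
    return power_pw / rate_ksps

def decades_per_volt(v_hi, rate_hi, v_lo, rate_lo):
    """Slope of log10(sample rate) versus supply voltage between two points."""
    return math.log10(rate_hi / rate_lo) / (v_hi - v_lo)

print(round(power_per_ksps(3.78, 18.0), 2))               # 0.21
print(round(decades_per_volt(2.0, 40e6, 1.2, 18e3), 1))   # 4.2
```

Under these assumptions the peak sample rate falls by roughly four decades per volt of supply reduction in the subthreshold region.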

Below about 1 V, the PLCTA began to perform poorly in terms of offset voltage. Theoretically, the reduction in SB1 could be expected to follow the exponential trend asymptotically down to 0 V, as suggested in the figure by the dashed line. But with the sample rates of interest being well below 1 kSPS, parasitic leakage currents begin to change the bias conditions to the point that reliable amplification is no longer possible. At such low sample rates, it is recommended to extend the reset phase as long as possible in order to inhibit the effects of parasitic leakage in the precharge and amplify phases.

Further study of subthreshold CTAs may lead to breakthroughs in low-power A/D converter design. This topic is suggested in Section 8.2 as an attractive area for future work. The potential benefits and applications of a subthreshold CTA-based A/D converter are highlighted briefly in Section 7.11.



Figure 6.5: Measured 1- $\sigma$  bandwidth (SB1) of a CTA operated in subthreshold

#### 6.7 Summary

A simplified, accurate technique for estimating power dissipation in CTAs does not exist. As a rule, the simplified analysis methodology in this chapter provides an estimate which is always overly conservative. Knowledge of the physics involved leads to a more accurate and intuitive, albeit perhaps ad hoc, model. But a truly accurate prediction will probably always require the aid of a transient simulation. It should be pointed out that the measured dynamic power agrees almost exactly with the simulated power for the CMOS CTA, as well as for the other, more recent amplifiers [2, 10, 12, 14, 15].

Four figures of merit have been proposed for comparing the relative advantages of each existing and/or future charge-transfer amplifier in flash and subranging A/D converter applications. The FOMs provide an objective tool for measuring performance and determining the best amplifier for a given set of system constraints.

In addition, the operation of a charge-transfer amplifier in subthreshold was considered briefly. The results of test chip measurements show that the power dissipation and peak sample rate drop exponentially with supply voltage. Future work involving the use of subthreshold CTAs holds promise for achieving orders of magnitude in power reduction.

# Chapter 7

# A 10-bit CTA-based A/D Converter

This chapter describes a 10-bit A/D converter in which charge-transfer amplifiers are used to achieve low dynamic power dissipation of 400  $\mu$W/MSPS drawn from a 2.1 V supply (plus the resistive reference generator). This work builds upon previous CTA-based ADCs [2,12] and demonstrates for the first time the applicability of charge-transfer amplifiers in a 10-bit converter. The two-step subranging converter resolves 5 coarse bits and 5 fine bits. Capacitive interpolation allows a reduction in power to near the ideal equivalent of 62 comparators, albeit with added design and layout complexity. A test chip was fabricated in 0.6  $\mu$m 2P/3M CMOS. The active area occupies 2.7 mm<sup>2</sup>, and the converter exhibits good behavior over a wide range of supply voltages and sample rates.

Also included in this chapter are discussions related to the following techniques and features used to optimize the reported converter.

- The choice of a subranging converter for this experiment and the relevance of flash and subranging type converters;
- The use of interpolation and averaging techniques with CTAs;
- Methods to improve linearity and increase the peak sample rate in flash and subranging A/D converters;
- Implementation of a distributed sample-and-hold function at no additional expense by using the input coupling capacitors;
- A modified CTA-latch interface to improve the precision at high sample rates.

#### 7.1 Types of A/D Converters

The principle of A/D conversion was illustrated in Section 2.2. Many types of A/D converters exist today. Two commonly known A/D converter architectures are the *flash converter* and the *subranging converter*. When referring to an N-bit flash converter, one generally refers to a circuit where the conversion from analog signal to N digital bits is performed in a single step, or a "flash." An N-bit subranging converter most often implies a circuit where the conversion from analog signal to N digital bits occurs in two or more steps, each step becoming more precise than the previous step by "sub-ranging" into the result of the previous step, in order to obtain N total bits. For example, a 2-step N-bit subranging converter would convert *m* bits (the *coarse bits*) in the first step, and then, by sub-ranging, *n* bits (the *fine bits*) in the second step, where m + n = N. There are several other types and classes of A/D converters, each with a unique set of advantages and disadvantages. Noteworthy examples include *pipeline*, *flash-flash*, *successive approximation register* (SAR), *incremental*, *integrating*, *logarithmic*, *dual-slope* and *delta-sigma* ( $\Delta\Sigma$ ).
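The hardware saving of the two-step approach can be made concrete by counting comparators. The sketch below assumes the ideal case of $2^N - 1$ comparators for an N-bit flash section, with no over-range or redundancy comparators; the function names are hypothetical.

```python
def flash_comparators(n_bits):
    """Ideal comparator count of an n-bit flash converter."""
    return 2 ** n_bits - 1

def two_step_comparators(coarse_bits, fine_bits):
    """Ideal comparator count of a two-step subranging converter."""
    return flash_comparators(coarse_bits) + flash_comparators(fine_bits)

print(flash_comparators(10))        # 1023
print(two_step_comparators(5, 5))   # 62
```

For N = 10 with m = n = 5, the ideal count drops from 1023 to 62, the figure quoted at the start of this chapter.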

#### Flash Converters

In most flash converters, the analog input voltage is presented at the first input of an array of differential or pseudo-differential amplifier cells (see Figure 7.1). At the second input of each amplifier cell is presented one of a progressive set of partitions of a *reference voltage*, such that for any given input voltage, there exists some "low" portion of the amplifier cells where the corresponding reference voltage is progressively lower than the input voltage, and for the remaining "high" portion of the amplifier cells, the corresponding reference voltage is progressively higher than the input voltage.

Through the array of amplifiers, the input voltage is separated into translatable information packets (e.g., the output of each amplifier cell), which are subsequently detected by a combination of additional analog circuitry and digital logic, and finally projected onto an N-bit digital map. The resulting N-bit code represents,



Figure 7.1: Functional block diagram of a flash A/D converter

in the ideal case, the closest digital number approximation of the original analog voltage.
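The behavior described above can be sketched as an idealized model in which each amplifier cell is reduced to a perfect comparator. The reference taps are assumed to be uniformly spaced at $k \cdot V_{REF}/2^N$, which makes the model truncate rather than round; the function and variable names are hypothetical.

```python
def flash_convert(v_in, v_ref, n_bits):
    """Ideal flash conversion.

    Returns the thermometer code (one entry per comparator, lowest
    reference first) and the resulting binary output code.
    """
    levels = 2 ** n_bits
    # One reference tap per comparator: k * V_REF / 2**N, k = 1 .. 2**N - 1
    references = [k * v_ref / levels for k in range(1, levels)]
    thermometer = [1 if v_in >= ref else 0 for ref in references]
    code = sum(thermometer)      # number of comparators that tripped
    return thermometer, code

thermometer, code = flash_convert(1.3, 2.5, 3)
print(thermometer, code)   # [1, 1, 1, 1, 0, 0, 0] 4
```

The cells whose references lie below the input form the "low" portion of the array and the remaining cells form the "high" portion; the boundary between the two determines the output code.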

The overall performance of flash converters relies most heavily on the properties of the amplifier cells. Therefore, novel advancements in the construction and methods of use of the amplifier elements of the flash converter are extremely important to the performance of a particular flash converter and the overall behavior of a particular system based on that converter.

#### Subranging Converters

In the majority of subranging converters (see Figure 7.2), the analog input is first converted into m coarse bits, by an m-bit flash converter. The remaining n



Figure 7.2: Subranging A/D converter architecture

fine bits are then converted by using the coarse bits to create a new, smaller set of references which feed a second n-bit flash converter along with the original analog input [28, 29].

The subranging converter reported here follows the architecture in Figure 7.2. A reference ladder feeds a 5-bit coarse section with the "coarse" references. Based on the resulting coarse bits, a range of "fine" references is selected via an analog mux and fed to a 5-bit fine section. The 5 fine bits are then combined with the 5 coarse bits, digitally corrected and registered as a 10-bit output word.
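The two-step flow can be summarized with an idealized behavioral model. This is a sketch under simplifying assumptions: comparators and references are perfect, the fine references span exactly one coarse step, and digital correction is omitted. The function names are hypothetical and do not describe the actual circuit.

```python
def quantize(v, v_low, v_high, n_bits):
    """Ideal n-bit flash over [v_low, v_high): count references at or below v."""
    levels = 2 ** n_bits
    step = (v_high - v_low) / levels
    code = sum(1 for k in range(1, levels) if v >= v_low + k * step)
    return code, step

def subranging_convert(v_in, v_ref, coarse_bits=5, fine_bits=5):
    """Two-step conversion: coarse bits first, then fine bits in the sub-range."""
    coarse, coarse_step = quantize(v_in, 0.0, v_ref, coarse_bits)
    # The coarse result selects the sub-range that feeds the fine section.
    fine_low = coarse * coarse_step
    fine, _ = quantize(v_in, fine_low, fine_low + coarse_step, fine_bits)
    # Concatenate the coarse and fine bits into one output word.
    return (coarse << fine_bits) | fine

print(subranging_convert(0.5, 1.0))   # 512
```

In this ideal model the concatenated result equals that of a single 10-bit quantizer. In a real converter, errors in the coarse decision make the digital correction step necessary.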

#### 7.2 Averaging

Averaging helps to reduce the effects of mismatch in an array of matched amplifiers [30–34]. Shown in Figure 7.3 is the method of capacitive averaging. In normal averaging schemes, resistors or capacitors are connected between corresponding



Figure 7.3: Example of an averaging scheme

output terminals of adjacent amplifiers (or any array of repetitive cells, for that matter). The averaging devices act to reduce the effects of cell mismatches by averaging these mismatches over neighboring cells.

The net effect of averaging is shown in Figure 7.4. A large offset in amplifier #5 is reduced by distributing the offset among neighboring amplifiers. Of course, the apparent offset of the nearby devices becomes larger, but the worst case offset of any given amplifier in the array is smaller than without averaging. Overall, this can greatly increase the accuracy of an array of amplifiers in a flash A/D converter. A good summary of the impact of averaging is that "one dummy can't take down the whole team" [35].
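The redistribution of a single large offset can be illustrated with a deliberately simple model. The sketch below is not a solution of the averaging network. It assumes that each cell's apparent offset is a weighted blend of its own offset and those of its two nearest neighbors, with a hypothetical weight `w` per neighbor. End cells have only one neighbor, which is the origin of the end effects discussed later in this section.

```python
def average_offsets(offsets, w=0.25):
    """First-order model of averaging over nearest neighbours.

    Each neighbour contributes weight w; the cell keeps the remainder.
    """
    n = len(offsets)
    averaged = []
    for i in range(n):
        neighbours = [offsets[j] for j in (i - 1, i + 1) if 0 <= j < n]
        own_weight = 1.0 - w * len(neighbours)
        averaged.append(own_weight * offsets[i] + w * sum(neighbours))
    return averaged

# Nine amplifiers; amplifier #5 (index 4) has an 8 mV offset.
raw = [0.0, 0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0]
print(average_offsets(raw))
# [0.0, 0.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0]
```

The worst-case offset drops from 8 mV to 4 mV while the two neighbors each pick up 2 mV, which is the qualitative behavior described above.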



Figure 7.4: Result of averaging in an array of amplifiers

Figure 7.5 shows how averaging can be used with charge-transfer amplifiers. Although it may appear to be identical to Figure 7.3, averaging with CTAs is unique by virtue of the fact that the averaging capacitors constitute a significant portion of the charge-transfer amplifier load impedance. Thus they constitute in large part the actual signal carrying elements and become much more significant in the overall performance than in a classical averaging scheme. In practice this is largely bad news, since the averaging elements must be designed more carefully for matching and parasitics to avoid degradation of the gain and offset of the individual CTAs.

Averaging tends to "pull" on the end nodes as shown in Figure 7.6. This is because the end devices experience the effects of averaging from only one side but not the other. The pulling at the ends causes distortion in an A/D converter. Two known methods are directed at reducing distortion at the ends. One technique is



Figure 7.5: Averaging applied to charge-transfer amplifiers

shown in Figure 7.7, where the ends are cross-coupled appropriately so as to pull the ends back in line. This method may not be practical in most flash A/D converters because long wires (with unacceptably large parasitic capacitances) are needed to create the cross-coupling connections.

A second method is to use "dummy" amplifiers at the ends, as in Figure 7.8. The input polarities of the dummy amplifiers are swapped and the reference voltage fed to the dummy amplifier at one end is the same reference voltage given at the reference input of the processing amplifier at the other end. Weak averaging is performed on a strong signal in this way to reduce distortion at the ends. Note the input and reference voltage labels and the




Figure 7.6: Distortion at the ends in a standard averaging scheme

distinction between dummy amplifier cells and weak averaging capacitors,  $C_W$. In the A/D reported here, extra dummy amplifiers are used, requiring more die area and dissipating more dynamic power. However, this approach effectively provides the same benefit as cross-coupling but without the high parasitic capacitances associated with long end-to-end cross-coupling wires. This is particularly important for CTA-based applications.

The desired result of using the above-described circuits is to reduce distortion at the ends as illustrated in Figure 7.9. The thick dashed line represents the distortion without any compensation for pulling. The thin dashed line indicates the ideal transfer function and the thick line shows the results of correction through



Figure 7.7: Cross-coupling with CTAs

dummy devices at the ends. Appropriate choice of the weak averaging capacitors can lead to arbitrarily close approximation of the ideal curve. Careful simulation with back-annotated parasitics should be incorporated after the circuit layout is completed.

#### 7.3 Interpolation

Interpolation is useful to reduce power dissipation and heat [29, 30, 36]. Figure 7.10 illustrates the principle of interpolation. The outputs of two adjacent amplifiers are averaged, so to speak, by the interpolating capacitors. As a result, a



Figure 7.8: Virtual cross-coupling via "dummy" amplifiers

third "interpolated" output is created. This new voltage is then useful as though there were a third amplifier in between the two amplifiers.<sup>1</sup> The method of interpolation offers the benefit of eliminating the need for fully one-half of the amplifiers in each amplifier stage of a flash A/D converter.
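The interpolated voltage can be sketched with an idealized charge-sharing model. The assumptions are that the interpolated node floats, starts discharged, and carries no parasitic load, and that the two amplifiers are linear with equal gain. The function names and the numeric values are hypothetical.

```python
def interpolate(v1, v2, c1=1.0, c2=1.0):
    """Voltage at the node joining two interpolating capacitors.

    c1 couples the node to v1 and c2 couples it to v2.
    """
    return (c1 * v1 + c2 * v2) / (c1 + c2)

def linear_amp(v_in, v_ref, gain):
    """Idealized linear amplifier output."""
    return gain * (v_in - v_ref)

# Two amplifiers with references at 1.0 V and 1.5 V, gain of 4.
v_in = 1.25
out_a = linear_amp(v_in, 1.0, 4.0)
out_b = linear_amp(v_in, 1.5, 4.0)
print(interpolate(out_a, out_b))   # 0.0
```

With equal capacitors the interpolated output crosses zero when the input lies midway between the two references, so it behaves like the output of an amplifier whose reference sits at that midpoint.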

Just as in averaging, interpolation with CTAs is unique by virtue of the fact that the interpolation capacitors constitute a significant portion of the charge-transfer amplifier load impedance. Therefore they also act as signal carrying elements and become much more significant in the overall performance than in a classical interpolation scheme.

<sup>1</sup>If the two amplifiers are linear then the interpolated output can be a near-perfect average of the other two outputs.



Figure 7.9: Result of cross-coupling at the ends in an averaging scheme

Distortion at the edges also occurs in interpolation. The technique described above for reducing bending at the edges in averaging also applies to interpolation. A method that is conceptually the same as in Figure 7.8 is to use reverse polarity dummy devices at the ends.

#### 7.4 Voltage Comparator

Figure 7.11 shows two CTA-based comparators. Part (a) depicts a comparator based on Kotani's CMOS CTA [2] and part (b) contains a comparator using the differential CTA proposed in Section 5.1. Both of these comparators perform rather well at moderate speeds, but neither is well suited for high-speed operation without improvements to the CTA-latch interface.



Figure 7.10: Capacitive interpolation

The dynamic latch comparator has a large residual charge imbalance at the end of the latch hold stage, which corresponds to the beginning of the CTA precharge phase. Due to the high sensitivity to output charge of CTAs, the imbalance can appear as an offset voltage at high speed. The polarity of the offset depends on the state of the latch, or the result of the previous cycle. This residual charge imbalance may be eliminated at low to moderate operating frequencies by simply connecting the latch directly to the output of the CTA, as shown in Figure 7.11. The residual charge imbalance on the latch input nodes couples into the charge-transfer amplifier during the precharge phase. Not only does this residual imbalance appear on the output nodes of the CTA, but it also couples to the amplifier inputs through the drain-gate overlap capacitor. At high speed there is insufficient time to dissipate all



Figure 7.11: CTA-based comparators

of the residual charge from the CTA. Of course, this leads to a potentially large offset voltage which is both frequency and signal dependent.

An improved comparator is shown with a timing diagram in Figure 7.12. The proposed changes include the addition of cut switches between the CTA and the latch and a modification to the three-phase timing as follows. The introduction of latch reset switches labeled S3 allows the latch to recover, or zero-out the charge imbalance that is a result of the latch decision process, during the precharge phase



Figure 7.12: Improved CTA-based comparator

of the charge-transfer amplifier. During this time the latch is isolated from the CTA. This allows the latch to recover normally during the CTA precharge phase without passing any residual charge imbalance onto the amplifier. Dynamic stability in the precharge phase is guaranteed as a result. Simulations showed that this isolation allowed an increase in peak sample rate of the comparator from 15 MSPS to about 50 MSPS.



Figure 7.13: Positive, capacitive feedback illustration

#### 7.5 Gain Enhancement

When classical amplifiers are arranged in a cascade configuration, the overall gain equals the product of the individual stage gains. The same principle can be applied with charge-transfer amplifiers to create a high-gain amplifier cascade.

Since the gain of a charge-transfer amplifier depends directly on the inverse of the load capacitance, it is possible in principle to increase the gain dramatically by decreasing the load capacitance. This is made possible by utilizing the principles of positive feedback.

Figure 7.13 illustrates how positive feedback can be used to create a virtual negative capacitor. When the gain, A, is negative, the virtual input capacitor appears as what is commonly called a "Miller capacitor." However, when A is positive and greater than unity, the input-referred component of  $C_F$  appears as a negative capacitance. This same principle has been applied previously to equalize power systems and improve response time in communication circuits [37, 38].
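The sign change described above follows from the standard Miller relation, in which a capacitor $C_F$ connected between the input and output of an amplifier with voltage gain A appears at the input as $C_F(1 - A)$. The sketch below evaluates this relation. It is assumed, not confirmed, to correspond to the equation shown in Figure 7.13, and the function names and values are hypothetical.

```python
def input_referred_capacitance(c_f, gain):
    """Miller relation: C_F across an amplifier of gain A looks like C_F * (1 - A)."""
    return c_f * (1.0 - gain)

def effective_load(c_l, c_f, gain):
    """Load seen by the driving stage: fixed load plus input-referred C_F."""
    return c_l + input_referred_capacitance(c_f, gain)

print(input_referred_capacitance(1.0, -10.0))  # 11.0 (classic Miller multiplication)
print(input_referred_capacitance(1.0, 3.0))    # -2.0 (negative capacitance)
print(effective_load(10.0, 1.0, 3.0))          # 8.0  (reduced load)
```

A positive gain greater than unity therefore subtracts from the load capacitance of the driving stage, which is the mechanism exploited for gain enhancement.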

Figure 7.14 is a detailed schematic representation of a negative capacitance generator applied to a two-stage cascade of CTAs [19]. The first stage CTA comprises all circuit elements in the signal path up to the differential nodes P and Q. A modified second stage CTA comprises the remaining circuit elements. The modified second stage CTA is essentially the same as the first stage, but without the input coupling capacitors.



Figure 7.14: Example of gain enhancement in a two-stage cascade of CTAs

Capacitors  $C_{F1}$  and  $C_{F2}$  form positive feedback paths which, assuming the gain of the second stage is greater than unity, act to decrease the load capacitance on the first stage. The operation of this positive feedback connection is different for a CTA than for a continuous time amplifier. In the continuous time amplifier, the equation shown in Figure 7.13 is generally valid, except when the signal frequency is high enough that amplifier phase shift becomes dominant. The positive, capacitive feedback on the CTA, however, exists in a sampled environment and thus has a different mode of operation, which is described as follows.

At a given sampling rate the feedback capacitors will cause the CTA to initially exhibit reduced gain due to the added load capacitance at nodes P and Q. However, as the CTA proceeds further into the amplify phase, the positive feedback capacitors will couple some of the output signal back into the signal path. This process dynamically boosts the first stage CTA gain and thus increases the overall amplifier gain over that achievable simply by cascading two CTAs. These feedback capacitors do add load capacitance at the output of the second stage CTA, thereby decreasing



Figure 7.15: Sample waveforms showing gain enhancement at 10 kSPS

its part of the overall gain. But, this decrease is overcome by the significant increase in the first stage CTA gain if capacitors are chosen such that

$$C_T \gg C_L > C_F. \tag{7.1}$$

The second stage CTA does not see the step function  $\Delta V_{IN}$ , but rather the output of the first stage amplifier, which charges at a rate inversely proportional to the elapsed time according to (4.30) and (4.31). This causes the second stage CTA to produce its output even more slowly, thus producing at first the appearance of decreased overall gain. However, if given sufficient time during the amplify phase, the feedback connections of  $C_{F1}$  and  $C_{F2}$  will quickly boost the first stage gain and the overall gain. Appropriate timing is important to the application of this method for gain enhancement.

Figure 7.15 shows SPICE simulation results where the sample rate is 10 kSPS. The amplifier is stimulated with a 1 mV differential input, so the amplifier response in mV during the amplify phase represents the gain. The use of feedback capacitors with no other circuit alterations clearly boosts the gain in the amplify phase, but



Figure 7.16: Sample waveforms showing gain enhancement at 200 kSPS

not until after a certain amount of time has elapsed. Prior to this required delay, no gain advantage is observed. Figure 7.16 shows a similar set of simulations where the sample rate is 200 kSPS. Here the gain advantage is again visible after a sufficient elapsed time. The amplify phase should have a time duration greater than or equal to that of the precharge phase to ensure a gain advantage.

#### 7.6 Distributed Sampling

In subranging converters, the input voltage is required first by the coarse section and then later by the fine section. Therefore, an accurate sample-and-hold function is critical to the operation. But even in recent subranging converters, a dedicated sample-and-hold circuit can consume as much as 40% of the total converter power [29]. Distributed sampling has been used to reduce the overall power dissipation and increase speed by spreading out the performance requirements of the individual sample-and-hold circuits [39–42].



Figure 7.17: Input circuit modification for distributed sample-and-hold

A circuit diagram illustrating the construction of a charge-transfer amplifier configured for distributed sampling is shown in Figure 7.17 with a summary of the clock phases. The input coupling capacitors,  $C_C$, now serve a dual role as the sampling capacitors and as isolation devices for dynamic biasing. The input and reference voltages are not sampled simultaneously, which allows a delay between the instant the input voltage is sampled and the instant the reference voltage is sampled. This scheme permits the fine bank of amplifiers to operate after the references have been adjusted (or sub-ranged) by the result of the coarse bank.

It is critical that the clock phases for the sampling switches are driven in precisely the same manner (i.e., same rise/fall time and same drive strength). Not only must the driving logic be identical, but the transmission lines to the switches should be matched as closely as possible in the layout. Even a small difference in the rise times of the sampling switches can result in sizeable input offset errors.

Figure 7.18 gives the construction of one possible timing order. The input voltage is sampled simultaneously by the coarse and fine sections; the sampled input is used immediately by the coarse section to evaluate the coarse bits, and is subsequently used by the fine section to evaluate the fine bits after a delay.

### 7.7 Settling of the Fine References

To reduce dynamic errors in a subranging converter, it is favorable to allow more time for the fine references to settle once they have been adjusted by the result of the coarse section. Two proposed methods of accomplishing extended fine reference settling time are shown in the timing diagram of Figure 7.19. In option (a), the CTA amplify phases of the coarse and fine sections are shortened to one clock partition. (In Chapter 4, it was shown that giving the amplify phase two clock cycles maximized the gain.) In option (b), the CTA amplify phase of the coarse section amplifiers is one clock partition, whereas the fine section amplifiers are given two clock partitions.

The tradeoff between giving the CTAs a high gain and allowing the fine references to settle completely can only be made once the accuracy requirements of the converter have been decided. But as a general rule, the input capacitance of the fine bank is large enough that the best performance is achieved by maximizing the amount of time for the fine references to settle.

| Section | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 |
|---------|----------|----------|----------|----------|----------|
| Coarse Section | Sample Input Voltage | Evaluate Coarse Bits | Select Fine References | Hold Fine References | Hold Fine References |
| Fine Section | Sample Input Voltage | Hold Input Voltage | Hold Input Voltage | Sample Fine References | Evaluate Fine Bits |

A/D Converter Functional Periods

Figure 7.18: Timing summary of a sampling method

Figure 7.19: Optional timing diagrams for extended settling of fine references

Referring again to Figure 7.19, option (a) dedicates four clock partitions for settling of the fine references. By comparison, if both coarse and fine sections were given two clock partitions for their amplify phases, this settling period would be cut in half. The fine section amplifiers do lose some gain on account of having only one clock partition available in the amplify phase, but this loss is small compared to the potential doubling of the accuracy of the fine references. In option (b), three clock partitions are allowed for settling of the fine references. This method still increases the accuracy of the fine references, but without any loss in the gain performance of the fine bank. Since the overall accuracy is so dependent on the fine bank amplifiers, this option may be preferable in some circumstances.

Figure 7.20: Survey of the performance of 10-bit A/D converters

### 7.8 Preliminary Study

Figure 7.20 shows the reported power dissipation in a broad survey of 10-bit converters in the open literature spanning the range 100 kSPS up to 1 GSPS. The highest dissipation is 25 mW/MSPS and the lowest (state of the art) is 1 mW/MSPS. Based on previous measurement data [1, 2, 12, 13], it has been suggested that power benefits inherent in CTA technology could push the state of the art down as low as 400 $\mu$W/MSPS, a reduction of 60% over the best reported 10-bit converters.

In this work, the fully differential charge-transfer amplifier (DCTA) of Section 5.1 is used for its low offset voltage characteristics (see Figure 5.1). A voltage comparator is constructed using the CTA as a preamplifier to reduce the input-referred offset voltage of a dynamic latch comparator. The charge-transfer amplifier has an offset voltage below 2.1 mV and consumes purely dynamic current. It has also been shown to be robust over a wide range of operating conditions and tolerant to large fluctuations in device parameters such as threshold voltage and transconductance. The circuit is also useful in that a sample-and-hold is built into the front end via the input coupling capacitor, $C_C$.

When the CTA is used as a preamplifier to a dynamic latch comparator, the latch input capacitance of about 100 fF acts as the preamplifier load,  $C_L$ . In the architecture of this particular charge-transfer amplifier,  $g_c$  equals roughly  $1.6 \cdot C_T$ , so it is possible to achieve stage gains of 10 V/V by using a small transfer capacitance of 600 fF. With these design parameters and a fairly standard dynamic latch [25], the comparator dissipates roughly 5  $\mu$ W/MSPS from a 2.1 V supply. Carrying that number through to the ideal 10-bit subranging converter with 62 comparators leads to an estimated 320  $\mu$ W/MSPS, not counting dynamic power drawn by the encoding logic and static power dissipated in the resistor ladder.
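The arithmetic behind these estimates can be sketched in a few lines of Python. This is only a rough check under stated assumptions: the stage gain is taken to be approximately $g_c / C_L$ with $g_c \approx 1.6 \cdot C_T$ (this reading of $g_c$ is an interpretation of the text, not a formula quoted from it), and the function names are hypothetical.

```python
# Rough check of the preamplifier gain and comparator-bank power estimates.
# Assumption (ours): stage gain ~ g_c / C_L, where g_c ~ 1.6 * C_T.

G_C_FACTOR = 1.6  # g_c / C_T, quoted as "roughly 1.6" in the text


def stage_gain(c_t_ff, c_l_ff, factor=G_C_FACTOR):
    """Approximate CTA stage gain (V/V) from transfer and load capacitance in fF."""
    return factor * c_t_ff / c_l_ff


def bank_power_uw_per_msps(n_comparators, per_comparator_uw_per_msps):
    """Total comparator power, assuming every comparator dissipates the same amount."""
    return n_comparators * per_comparator_uw_per_msps


gain = stage_gain(c_t_ff=600.0, c_l_ff=100.0)
bank_power = bank_power_uw_per_msps(n_comparators=62, per_comparator_uw_per_msps=5.0)

print(f"stage gain ~ {gain:.1f} V/V")
print(f"comparator bank ~ {bank_power:.0f} uW/MSPS")
```

With the rounded figure of 5 $\mu$W/MSPS per comparator, the product comes to 310 $\mu$W/MSPS, which is close to the roughly 320 $\mu$W/MSPS quoted above; the small difference is consistent with the per-comparator figure being rounded.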

### 7.9 Subranging A/D Converter

As mentioned before, timing is important in subranging converters both architecturally and to accommodate differences in comparator design. In this converter, conversion takes four master clock cycles and follows the scheme outlined in Figure 7.19(b). The coarse section is allowed one half clock cycle for the precharge and amplify phases. The fine section is allowed one half clock cycle for the precharge phase, but a full clock cycle for the amplify phase. This is done (a) so that the coarse bits evaluate early, allowing more time for the fine references to settle through the analog mux, and (b) because a longer amplify phase improves the accuracy of the finebank preamplifiers. As shown, three half-cycle periods are allotted for the fine references to settle before the finebank amplify phase begins.

Interpolation in the finebank allows a reduction in the number of CTA preamplifiers and also reduces the total capacitive load seen by the analog mux to the finebank amplifiers, resulting in a faster settling of the fine references. A simplified (e.g., switches and input coupling capacitors removed) illustration of the 2:1 interpolation scheme applied to the output of fully-differential charge-transfer amplifiers is shown in Figure 7.21. For every two adjacent CTA preamplifiers, the outputs are capacitively interpolated to produce a third output, emulating a third preamplifier. Interpolation in this way reduces the number of preamplifiers by 50%. In this work, two stages of preamplifiers were used in the finebank (for increased gain), each stage with 2:1 interpolation applied for a total of 4:1 interpolation benefit as seen by the analog mux.

Figure 7.21: Interpolation with fully differential charge-transfer amplifiers

When using interpolation with CTAs, the implementation is considerably more difficult than with a classical preamplifier. This is because classical amplifiers derive voltage gain by an R/R ratio or by a $g_m R$ product. With charge-transfer amplifiers, on the other hand, the load capacitance directly determines both the voltage gain and the input-referred offset voltage. Care must be taken to minimize parasitic capacitance in the interpolating capacitors and to match all parasitics carefully in the differential signal path. For example, wire lengths and trace proximities were carefully controlled in this design in an effort to achieve low differential offset and also to preserve voltage gain by limiting the load capacitance seen by the charge-transfer preamplifiers.

Figure 7.22: Die photo of the A/D converter test chip

Five additional comparators at each end of the fine bank were used to provide overlap for digital error correction [29]. In the aggregate, a total of 41 comparators (31 for the fine bits and 10 for error correction) were implemented in the finebank, consisting of 11 first-stage CTA preamplifiers, 21 second-stage preamplifiers and 41 latches. Four additional CTAs were used to correct bending at the edges.
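The stage counts above follow from the rule that 2:1 interpolation inserts one extra output between each adjacent pair of amplifiers, so a stage with n outputs feeds a stage with 2n - 1 inputs. A minimal sketch of that bookkeeping is given below; the function name is hypothetical.

```python
def interpolate_2to1(n_outputs):
    """Outputs available after 2:1 interpolation: the n originals plus one
    interpolated output between each adjacent pair."""
    return 2 * n_outputs - 1


first_stage = 11                               # first-stage CTA preamplifiers
second_stage = interpolate_2to1(first_stage)   # second-stage preamplifiers
latches = interpolate_2to1(second_stage)       # latch comparators

fine_bits_comparators = 31
overlap_per_end = 5
required = fine_bits_comparators + 2 * overlap_per_end

print(first_stage, second_stage, latches, required)
```

The two cascaded 2:1 stages take 11 preamplifiers to 21 and then to 41 outputs, matching the 31 fine-bit comparators plus 5 overlap comparators at each end.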

### 7.10 Fabrication Results

Test chips were fabricated on 0.6 $\mu$m 3M/2P CMOS by AMI Semiconductor. A digital image of the fabricated circuit appears in Figure 7.22. $C_T$ capacitors were implemented in poly-poly cap, not to improve preamplifier linearity (which is irrelevant in a comparator application) but rather to ensure reliability of the target voltage gain in the preamplifiers. The reference ladder was a continuous strip of polysilicon, without bends, spanning the center of the converter. The coarse and fine sections were situated on either side of the reference ladder for convenience. Clock generation circuitry was placed on the edge farthest from the finebank.

Figure 7.23: Measured dynamic power dissipation of the A/D converter

#### 7.10.1 Dynamic Power

Dynamic power dissipation in the ADC was measured at 2.1 V, 2.5 V and 3.3 V. Neglecting DC power drawn by the reference ladder (1.3 k$\Omega$), observations are displayed in Figure 7.23. At 2.1 V, dynamic power is just under 400 $\mu$W/MSPS. Including the resistor ladder, the total power is above 1 mW/MSPS, but according to simulations the ladder power could be made as low as 50% of the core power while still preserving the desired accuracy. For example, the resistor power could be as low as 200 $\mu$W at 1 MSPS or 400 $\mu$W at 2 MSPS. It is emphasized that the low dynamic power dissipation underscores the potential CTAs can offer in total power reduction. This is especially applicable at low to moderate speeds, where the reference ladder current can be made small relative to the power consumed by the core. The reference ladder resistance can be increased to fit the sample rate and accuracy requirements of the application.

#### Comparison to the State of the Art

The measured dynamic power dissipation plus a nominal 50% for the resistor bias is 600  $\mu$ W/MSPS, or 40% lower than the state of the art of 1 mW/MSPS. At the time of this writing, the next lowest power for a 10-bit A/D converter was reported in December, 2003 at 690  $\mu$ W/MSPS at 80 MSPS [43]. However, that converter followed a pipelined architecture, which has inherent power dissipation advantages over the subranging architecture used in this work but does not offer the benefit of low latency.
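The percentages quoted in this comparison can be reproduced with a short calculation. The sketch below assumes the 50% ladder overhead is applied to the 400 $\mu$W/MSPS core figure; the helper names are hypothetical.

```python
def total_power(core_uw_per_msps, ladder_fraction):
    """Core power plus a resistor-ladder overhead expressed as a fraction of the core."""
    return core_uw_per_msps * (1.0 + ladder_fraction)


def reduction_percent(value, reference):
    """Percentage by which `value` is lower than `reference`."""
    return 100.0 * (1.0 - value / reference)


CORE = 400.0              # uW/MSPS, measured dynamic power at 2.1 V
STATE_OF_THE_ART = 1000.0  # uW/MSPS, i.e. 1 mW/MSPS

total = total_power(CORE, ladder_fraction=0.5)
print(f"total: {total:.0f} uW/MSPS")
print(f"core only vs. state of the art: {reduction_percent(CORE, STATE_OF_THE_ART):.0f}% lower")
print(f"core + ladder vs. state of the art: {reduction_percent(total, STATE_OF_THE_ART):.0f}% lower")
```

The core-only figure gives the 60% reduction cited in the preliminary study, and adding the nominal ladder overhead gives the 40% reduction cited here.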

Another low-power A/D for hearing aid applications, published in February, 2004, achieved 3  $\mu$ W total power at 300 SPS [44]. The accuracy was less than 10 bits, but the design also incorporated temperature compensation and an input diode to translate a log-scale input current into a linear voltage. By comparison, the converter reported here would consume just 0.54  $\mu$ W at 300 Hz (an 82% reduction in power), assuming a nominal 50% overhead for the resistor ladder.

#### 7.10.2 Linearity

Linearity performance was measured by sampling at low speed (256 kSPS) while sweeping the input with a full scale, 50 Hz linear ramp, providing 5 samples per step. The digital outputs were acquired into a high-speed logic analyzer and transferred to a computer for analysis. DNL and INL plots of a single representative converter appear in Figure 7.24. The DNL clearly exceeds 1 LSB, meaning that some codes are missing from the converter. Reexamination of the design shows that this nonlinearity resulted from comparator offsets induced by mismatches in the interpolation capacitors. Therefore, improvements in the interpolation design are still needed. The INL plot reveals evidence of bending at the lower edge to a degree of just under 3 LSB. This indicates that the dummy correction amplifier at the lower end was not coupled strongly enough into the amplifier array (see Section 7.3).
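The figure of 5 samples per step follows from the sample rate, ramp rate and number of codes. The sketch below checks that number and then shows one common way to estimate DNL and INL from ramp data, the code-density (histogram) method. The text does not say which analysis was actually used, so the histogram approach and all function names here are assumptions for illustration.

```python
from collections import Counter


def samples_per_code(sample_rate_hz, ramp_rate_hz, n_bits):
    """Average number of samples landing on each code during one full-scale ramp."""
    return sample_rate_hz / ramp_rate_hz / (2 ** n_bits)


def dnl_inl_from_ramp(codes, n_bits):
    """Code-density estimate of DNL and INL (in LSB) for a linear ramp input.

    The two end codes are excluded because they absorb any overrange samples.
    """
    n_codes = 2 ** n_bits
    counts = Counter(codes)
    inner = [counts.get(c, 0) for c in range(1, n_codes - 1)]
    ideal = sum(inner) / len(inner)          # average hits per interior code
    dnl = [hits / ideal - 1.0 for hits in inner]
    inl, running = [], 0.0
    for d in dnl:                            # INL as the running sum of DNL
        running += d
        inl.append(running)
    return dnl, inl


print(samples_per_code(256e3, 50, 10))

# Toy 3-bit example: code 3 is missing and its samples fall on code 4 instead.
toy_codes = [c for c in (0, 1, 2, 4, 4, 5, 6, 7) for _ in range(5)]
toy_dnl, toy_inl = dnl_inl_from_ramp(toy_codes, n_bits=3)
print(toy_dnl)
```

In the toy example the missing code shows up as a DNL of -1 LSB, with the neighbouring code absorbing its samples at +1 LSB, which is the kind of signature that mismatch-induced comparator offsets would be expected to produce.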


Figure 7.24: Measured nonlinearity of the A/D converter

The extent of the bending at the low end covers most of the bottom half of the input range. This performance indicates a high sensitivity of the 4:1 interpolated input devices to bending at the ends. Since there were only 11 input stage amplifiers in the fine bank, the effects of pulling on just a few of these devices near the low end affected INL over a broad portion of the input range. A conclusion to be drawn from this result is that great care must be taken to balance the tradeoff between the overall linearity goal and the savings gained from interpolation. In this case, the INL could be corrected by one of two methods: first, by coupling the dummy devices near the low end more strongly by using a larger capacitor, or second, by using a lesser degree of interpolation (e.g., 2:1 rather than 4:1).

#### 7.10.3 Spectral Performance

Dynamic performance was measured against simulation by applying a pure sine wave input at -0.1 dB-FS (2.46 V at 2.5 V supply) with a frequency equal to 1/8.33 times the sample rate, with sample rates varying from 20 kSPS up to 2 MSPS. Data were acquired into a logic analyzer and finally transferred to a computer for spectral analysis in Matlab. Figure 7.25 shows the measured SNDR (signal to noise and distortion ratio) compared with simulation results. Below about 1 MSPS, the converter yields 8.2 effective bits. The majority of the distortion is directly caused by the sources of nonlinearity mentioned above. It may appear that the performance is poor compared with the ideal 10 bits, but in reality it is quite common for 10-bit converters to exhibit between 8 and 9 effective bits at the full sample rate. Improving the interpolation as described above would move the present design from the low end to the high end of this range.

Figure 7.25: Typical measured vs. simulated dynamic performance of the A/D converter

The maximum sample rate is also architecturally limited by the fine reference settling period, which is in turn dominated by the large CTA input sampling capacitors. These capacitors were actually made quite large (700 fF) in order to prevent signal attenuation from capacitive voltage division between $C_C$ and the input gate nodes of the active MOSFETs. Higher speed operation could be made possible by using nonsampling CTA preamplifiers, which would require a separate sample-and-hold circuit, and by using recent ADC architectural techniques, such as absolute value processing and fully differential design as in [29].

In spite of limitations incurred by the architecture and interpolation scheme, the reported ADC clearly demonstrates the potential for 10-bit accuracy. Dynamic power up to 60% lower than the state of the art is made feasible by using low-power charge-transfer amplifiers near the minimum allowable supply voltage for CTAs. Further improvements to the integration of CTAs into particular ADC architectures, as well as optimizations to the charge-transfer amplifiers themselves, could lead to significant steps forward in low-power A/D converter design.
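Effective bits and SNDR are related by the standard expression for a full-scale sine wave, SNDR = 6.02 N + 1.76 dB. The sketch below converts between the two; it assumes this standard relation applies to the figures reported here, and the function names are hypothetical.

```python
def sndr_db_from_enob(enob):
    """SNDR in dB for a full-scale sine input, given the effective number of bits."""
    return 6.02 * enob + 1.76


def enob_from_sndr_db(sndr_db):
    """Effective number of bits implied by a measured SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


measured_enob = 8.2
print(f"{measured_enob} effective bits ~ {sndr_db_from_enob(measured_enob):.1f} dB SNDR")
print(f"ideal 10 bits ~ {sndr_db_from_enob(10):.1f} dB SNDR")
```

Under this relation, 8.2 effective bits corresponds to an SNDR of about 51 dB, roughly 11 dB below the ideal 10-bit value of about 62 dB.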

### 7.11 Potential Applications

The 10-bit converter reported in this work is potentially well suited for applications with a signal bandwidth on the order of 1 MHz. By reducing the capacitive loading on the resistor ladder, utilization of 0.25 or 0.18 $\mu$m CMOS processing would immediately raise the peak sample rate of the subranging architecture to 5–15 MSPS. Further architectural optimizations such as absolute value processing and a fully differential resistor ladder might realistically be expected to help achieve 25–50 MSPS operation while also increasing the performance to greater than 9 effective bits (see, for example, the subranging converter described in [29]).

Despite the challenges already discussed, the low dynamic power dissipation made available with charge-transfer amplifiers appears to offer a significant enough advantage as to merit serious consideration for practical applications. In combination with the methods of interpolation and distributed sampling demonstrated in this dissertation, additional design techniques may lead to unprecedented efficiency in the process of analog-to-digital conversion.

In order to understand the relevance of this work with respect to existing technologies, it is important to consider the potential applications of a CTA-based A/D converter. Figure 7.26 shows the recent A/D converter performance needs for several popular classes of products [45]. Expected power dissipation increases up and to the right.



Figure 7.26: Recent A/D converter performance requirements for several classes of commercial products

The converter design reported in this dissertation could easily be used in Bluetooth applications,<sup>2</sup> especially in battery-powered nodes such as wireless earpieces, software radios and handheld computers. Cost being a critical factor in the feasibility of Bluetooth, utilizing a 0.25 or 0.18  $\mu$ m CMOS technology would probably be necessary in order to shrink the CTAs enough to make this approach competitive.

Video applications are also possible with a CTA-based converter achieving moderately faster operation of greater than 3 MSPS. With the majority of 10-bit video-rate converters consuming in excess of 5-10 mW, the charge-transfer amplifier approach may present an advantageous alternative on the basis of power alone. Ultrasound, which is very high-speed video, requires an A/D sampling at 30-70 MSPS. A CTA could conceivably be designed to fit this application if the converter architecture incorporated averaging to reduce offset voltages and interpolation to limit the size and power.

<sup>2</sup>Bluetooth transmits at frequencies around 2.4 GHz, with 22-79 channels spaced 1 MHz apart.

The charge-transfer amplifier approach is also potentially useful for appliance control applications in the 50 to 300 kHz range. Applications falling under this category include load balancing in washing machines or dampness sensors for dryers. In most appliances, optimization for low power dissipation is not necessarily critical, but the electrically noisy environment often causes difficulty for purely analog circuits. The inherent robustness of charge-transfer amplifiers is definitely an architectural advantage over the continuous time approach.

At the low end of the spectrum, moderately accurate ADCs are needed for telephony. If a figure of 600 $\mu$W/MSPS is assumed for dynamic power of a 10-bit CTA-based converter, then an 8-bit converter could theoretically be designed to consume 150 $\mu$W/MSPS or 1.5 $\mu$W at 10 kSPS. In a typical 4 kHz telephony application requiring 8 kSPS sampling, the charge-transfer amplifier approach could yield a 1.2 $\mu$W converter. This is below the expected consumption of a low-power A/D used for telephony today. Moreover, with reference to Section 6.6, a CTA can achieve up to five orders of magnitude in power reduction if the circuit is powered below the ideal minimum of 2.1 V. For example, based on the results in Figure 6.5, a 6-bit, 18 kSPS flash A/D powered at 1.2 V could theoretically consume just 238 pW by using a PLCTA-based comparator methodology (plus the resistor ladder power). In such an approach, the dominant power consumer would actually be the resistor ladder, which must draw enough current to allow noise spikes to settle in between samples.
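The telephony figures can be checked with a short calculation. The sketch assumes, as the numbers in the text imply, that power per MSPS falls by a factor of two for each bit of resolution removed and that power scales linearly with sample rate; both scaling rules and the function names are assumptions made for illustration.

```python
def scale_with_resolution(power_uw_per_msps, bits_from, bits_to):
    """Scale power per MSPS assuming a factor of two per bit of resolution (assumption)."""
    return power_uw_per_msps / (2 ** (bits_from - bits_to))


def power_at_rate_uw(power_uw_per_msps, sample_rate_sps):
    """Power in uW at a given sample rate, assuming linear scaling with rate."""
    return power_uw_per_msps * sample_rate_sps / 1e6


p10 = 600.0                                  # uW/MSPS, 10-bit estimate including ladder
p8 = scale_with_resolution(p10, bits_from=10, bits_to=8)

print(f"8-bit estimate: {p8:.0f} uW/MSPS")
print(f"at 10 kSPS: {power_at_rate_uw(p8, 10e3):.1f} uW")
print(f"at 8 kSPS:  {power_at_rate_uw(p8, 8e3):.1f} uW")
```

These reproduce the 150 $\mu$W/MSPS, 1.5 $\mu$W and 1.2 $\mu$W figures quoted above.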

Finally, charge-transfer amplifiers could be used to provide a power advantage in industrial sensors. Applications in this field are numerous, ranging from automobile oil temperature sensors, to blood glucose meters, to strain gauges used in helicopters, trains, airplanes and buildings. The sample rate in these applications can actually fall well below 1 kSPS, and in some cases a sample may be taken only once per week or once per month. But such devices would also be expected to operate for many years powered by a single 10 mAh battery. A charge-transfer amplifier-based approach is likely to yield attractive tradeoffs between cost and power for such applications.

### 7.12 Summary

This chapter has reported the first implementation of a 10-bit CTA-based A/D converter. The converter consumed less than 400  $\mu$ W/MSPS of dynamic power in the core and it was estimated by simulation that approximately 200  $\mu$ W/MSPS is sufficient for the resistor ladder. The core power was linearly dependent on the master clock frequency. To optimize for efficiency, the resistor ladder power must be programmed in the design phase based on prior knowledge of the application's sample rate requirements.

The use of 4:1 capacitive interpolation in the finebank was vital to achieving such low power dissipation. The interpolation capacitors appear to interact much more strongly with the CTAs than with previously reported implementations using classical amplifiers. This led to some distortion at the low end and also missing codes throughout the converter's range. Even with careful design and back-annotated simulations, it is clear that improved analysis techniques and simulation accuracy will be required in order for the full benefit of interpolation to be realized in CTA-based A/D converters.

A distributed sample-and-hold scheme was devised to leverage existing input coupling capacitors and eliminate the need for a separate S/H amplifier. Proper design of the switch drivers and careful management of the layout parasitics helped prevent global offset errors that are a symptom of mismatched clocks driving the input sampling switches. Proper division of the timing sequence between coarse and fine sections resulted in an acceptable tradeoff between accuracy in the fine bank and settling time for the fine references. A maximum sample rate of 1 MSPS was achieved, limited by the settling of the fine references.

Potential applications for CTA-based converters were also considered. In some cases, the low power dissipation aspect would be the most important advantage. But the die area or cost cannot be ignored. The figures of merit proposed in Section 6.5 may be helpful in evaluating for the best overall performance in a low-power, low-cost application.

## Chapter 8

## Conclusion

Charge-transfer amplifiers offer unique advantages in the design of efficient A/D converters. This dissertation has explored several important aspects relating to the design, analysis and implementation of CTAs and CTA-based converters. A methodology for analyzing the dynamic behavior of charge-transfer amplifiers has been shown to yield relatively accurate predictions of the voltage transfer function over a wide range of frequencies. Additionally, three new charge-transfer amplifiers were proposed, each improving over existing designs with respect to practical considerations.

The dynamic power consumption of charge-transfer amplifiers was examined and it was shown that a simplified model exists but significantly overestimates the actual power due to second order effects such as threshold modulation and incomplete precharging at high speeds. In connection with this analysis, four FOMs were proposed. Finally, a 10-bit CTA-based A/D converter was designed, fabricated and tested, demonstrating for the first time the potential for CTAs to be used in precise, ultra low-power data converter applications.

### 8.1 Contributions of the Dissertation

The specific contributions of this dissertation are:

1. An analysis of the dynamic behavior of charge-transfer amplifiers, leading to a generalized expression for the voltage transfer function. The resulting model was implemented in Matlab for the NMOS CTA and the DCTA. Up to a certain frequency, calculations agreed quite well with Spice simulations of the voltage gain over a number of circuit parameters (e.g., transfer capacitance and threshold voltage) and external conditions (e.g., supply voltage and timing ratio of the clock phases). The model does break down above a certain sample rate due to the effects of finite MOS switch resistance. But, including this resistance in the model led to intractable equations. This problem is reiterated below as a topic for future research.

2. Examination of the sources of offset voltage in fully differential CTAs. Two sources of offset voltage were identified: charge injection and channel mismatch. Charge injection appears as a constant offset term that is probably overshadowed by channel mismatch errors, especially at high speed. Channel mismatch error has two components: capacitors and active MOS transistors. Capacitor matching contributes a fixed offset, whereas MOS matching is negligible at low sample rates but becomes dominant as the frequency increases.
3. Development of a truly differential charge-transfer amplifier. The proposed differential-mode CTA (DCTA) improves over the pseudo-differential amplifier in that two CMOS CTA channels are dynamically coupled by sharing of the transfer capacitors. The connections of these capacitors are such that the charge on both plates contributes to the voltage gain, enhancing the gain by a factor of two at no additional cost. Moreover, the number of transfer capacitors is reduced from four to two in comparison to the pseudo-differential configuration, reducing die area by about 25%.
4. Development of a CTA with 10x reduction in input capacitance. The proposed direct-coupled CTA (DCCTA) overcomes the cutoff condition, allowing a relatively wide common-mode range at low supply voltages. At larger supply voltages, the saturation condition limits the advantages of this amplifier.

5. Development of a CTA with no precharge voltage. The proposed $V_{PR}$-less CTA (PLCTA) is designed with a modified output switching network that accomplishes two purposes. First, the need for a precharge voltage is eliminated by decoupling the PMOS and NMOS drain nodes from each other. Second, the switching network dynamically generates a suitable output common-mode voltage. This preserves the CTA's usefulness in interfacing to another amplifier or a latching comparator. By combining the benefits of the fully-differential architecture and the direct-coupled CTA, the proposed amplifier achieves nearly rail-to-rail input range at any supply voltage.
6. An analysis of the dynamic power consumption of charge-transfer amplifiers. The idealized analysis was straightforward to develop and led to an intuitive input-dependent power profile. However, comparison with measurement data revealed that the model overestimated the actual power dissipation by more than 100%. After examining the dynamic behavior, it was shown that aggregate charge sharing in flash A/D converters reduces the average power per CTA. It was also shown that threshold modulation and incomplete precharging combine to further reduce the power consumption. Since including these effects in a compact empirical model is difficult, and because simulation results have been shown to accurately predict the power consumed by CTAs, it is recommended to use a combination of idealized analysis (which is always overly conservative) followed by local and global simulations to predict the dynamic power per amplifier and of the combined A/D converter.
7. Figures of merit for charge-transfer amplifiers. Four figures of merit were proposed, each linked directly to the commonly accepted figure of merit for A/D converters. An objective comparison of the overall performance of all reported charge-transfer amplifiers revealed which amplifiers are best suited to satisfy particular converter constraints, cost limitations and/or system parameters.

8. A 10-bit subranging CTA-based A/D converter. The converter reported here uses only charge-transfer amplifiers and dynamic latch comparators in the construction of the coarse and fine sections. A timing scheme was utilized which allowed an optimal tradeoff between coarse bank accuracy, fine bank accuracy and settling time for the fine references. The converter was implemented in 0.6 $\mu$m CMOS and consumed 400 $\mu$W/MSPS of core power plus the power of the resistor ladder. It is asserted that a rule of 50% overhead for the resistor ladder is sufficient and, as a result, a total of 600 $\mu$W/MSPS is possible up to 1 MSPS, the limit of this converter. This is 40% lower than the current state of the art of 1 mW/MSPS. In addition to the reported converter, future CTA-based converters can potentially be applied to several applications, including industrial sensors, telephony, appliance controls, Bluetooth and video/ultrasound.
9. Interpolation of an array of charge-transfer amplifiers. A 4:1 interpolation scheme was utilized within the fine bank of the reported A/D converter in order to reduce size, power dissipation and loading on the resistor ladder and input source. The implementation was successful in the sense that power, area and noise were dramatically reduced. However, it was discovered that interpolation with CTAs presents unique challenges as well. In spite of careful design and layout practices, distortion at the lower end was observed which cut the overall performance to 8.2 effective bits with nonlinearity above the acceptable 10-bit level. In addition, mismatches in the interpolation capacitors led to systematic offsets which appeared as patterned DNL. The problem of successfully implementing interpolation with arrays of charge-transfer amplifiers is suggested below as an area for future study and optimization.
10. A distributed sample-and-hold utilizing existing input coupling capacitors of charge-transfer amplifiers. The A/D converter presented in this dissertation required no S/H amplifier because that function was folded into the existing coupling capacitors and input switching network of the fine bank CTAs. Careful management of the switch drivers led to good sampling accuracy.

11. A gain enhancement method for cascaded charge-transfer amplifiers. In cases where two or more CTAs are cascaded to increase the forward gain, a method has been proposed in which a small capacitor in positive feedback around the trailing stages is used to add a virtual negative capacitance at the input nodes. This has the effect of boosting the gain of the first stage CTA. The resulting overall increase in forward gain leads to lower input-referred offset voltage by a factor on the order of 25–50%. The cost of using this technique is small, since the size of the feedback capacitors can be on the order of a few tens of fF in a typical CMOS process.

### 8.2 Future Work

In the course of this work, the following topics have been identified as areas for future research:

- Analysis of the high-speed behavior of charge-transfer amplifiers. One of the problems with the analytical model developed in Chapter 4 was an inherent inaccuracy at high sampling rates. The reason for this limitation is that finite switch resistance was not accounted for in the calculations. Doing so led to equations with no closed form solution. But as CTAs are implemented in smaller geometry processes, it will be important to have a reliable model of the behavior up to the maximum sample rate (on the order of 100–1000 MSPS).
- Novel offset reduction techniques, particularly for high speed operation, of charge-transfer amplifiers. It was shown earlier that the offset voltage increases steadily with sample rate due to the influence of matching in the active MOS transistors. One solution for the future may be to simply use larger transistors for better matching. However, this approach adds two new complications: first, the larger drain junctions add load capacitance which decreases the overall gain; and second, due to the larger gate capacitance, voltage division at the inputs becomes worse unless the size of the input coupling capacitors is increased. Efficient methods of trimming CTAs may be very attractive for certain applications.

- Improved methods of interpolation for charge-transfer amplifiers. As described above, interpolation with CTAs presents a number of challenging design propositions. Distortion at the ends is difficult to control accurately, leading to poor spectral performance and high INL. In contrast to interpolation with continuous time amplifiers, any mismatch in the interpolating capacitors leads directly to offset errors in the CTAs. The result is potentially poor DNL performance. The development of a reliable method of interpolation that builds on the unique construction of CTAs would represent not only a novel (and assuredly patentable) improvement, but would also considerably advance the usefulness of charge-transfer amplifiers in precision A/D converters.
- An accurate dynamic power model for charge-transfer amplifiers. The idealized power analysis in Chapter 6 provided an intuitive means of predicting the dynamic power consumption of CTAs. However, the model was shown to be lacking with respect to multiple important factors. A model that bridges the gap between unrealistically high idealized predictions and estimates obtained from lengthy transient simulations would help lead to more satisfying design analyses.
- Subthreshold charge-transfer amplifiers. CTAs can in fact be operated below the minimum supply voltage predicted by the idealized equations. In 0.6 $\mu$m CMOS, for example, subthreshold conduction allows a CTA to amplify well below the 2.1 V limit suggested by summing the absolute modulated values of $V_{TN}$ and $V_{TP}$. The power dissipation of a subthreshold CTA drops exponentially with supply voltage, but so does the peak sample rate. Nevertheless, a study into the design and performance of subthreshold CTAs may lead to unprecedented reductions in the power dissipation of low-frequency (below 100 kSPS) A/D converters.
- A two-phase charge-transfer amplifier. One of the common features of all reported CTAs to date is a three-phase operation. For practical considerations, this means two complete clock cycles are required for each CTA cycle. This limits the advantages available with CTAs as compared to many switched amplifier circuits that operate in just one clock cycle, or two clock phases. It would represent a significant advantage if a CTA could function on just two phases as well. It should be possible to devise a scheme whereby the reset phase is absorbed into the precharge phase by appropriate switching of either the transfer capacitors or the active MOS devices.

- Application to new printed circuit technologies. The inherent robustness of charge-transfer amplifier architectures is an advantage in CMOS technologies, to be sure. But emerging design mediums may prove even more favorable for a CTA-based approach to A/D conversion. Circuits are now being integrated directly onto nontraditional substrates in order to reduce size and cost. Organic thin film transistors (OTFTs), also called "plastic transistors," are becoming popular through disruptive carbon-based technologies which hold promise in the five to twenty year time frame as cheap and efficient mediums for manufacturing displays and other human interface circuits. The relative insensitivity of CTAs to variations in most transistor properties may lead to the feasible construction of high-performance amplifiers, comparators and A/D converters in these new organic technologies without requiring particularly high quality transistors.

Appendix

## Appendix A

## **Common Random Access Memory Architectures**

Table A.1 provides a list of the most common random-access memory (RAM) architectures. Charge-transfer amplifiers have been used as low-power, high-speed sense amplifiers in RAM applications since 1972 [8,9,46,47].

| Type     | Full name                                      | Description |
|----------|------------------------------------------------|-------------|
| DRAM     | Dynamic Random Access Memory                   | The most common type of memory; it must be constantly refreshed or it will lose its contents. |
| SRAM     | Static Random Access Memory                    | Faster and more reliable than DRAM; the term "static" implies that it does not require refreshing. SRAM is more expensive to produce than DRAM. |
| FPM RAM  | Fast Page Mode Random Access Memory            | A type of DRAM that allows faster access to row or page data. Sometimes called Page-Mode Memory, it eliminates the need for a row address if data is located in the previously accessed row. |
| EDO DRAM | Extended Data Out Dynamic Random Access Memory | Faster than conventional DRAM, which can access only one block of data at a time. EDO DRAM can start fetching the next block of memory at the same time that it sends the previous block to the output. |
| SDRAM    | Synchronous Dynamic Random Access Memory       | Can run at much higher clock speeds than conventional DRAM. SDRAM synchronizes itself with the system clock and is capable of running about three times faster than conventional FPM RAM, and about twice as fast as EDO DRAM. |

Table A.1: Common random access memory architectures

## Appendix B

## **Matlab Model of a Fully-differential CTA Voltage Transfer Function**

The following Matlab script was used to calculate the transfer function of a fully-differential charge-transfer amplifier according to the model developed in Section 4.3.

In the first part of the code, input parameters pertaining to the process, circuit parameters and external conditions are initialized. Next, the precharge phase is initialized with these parameters. Behavior during the precharge phase is then modelled by computing the voltage at key nodes. Since only the voltages at the end of the precharge phase are of interest, the interim voltages are not calculated directly. Finally, the amplify phase response is computed with corrections made for threshold modulation. A subthreshold conduction parameter is included in the script to allow for the added gain introduced by subthreshold conduction for small signals. This parameter is used to ensure continuity between the low-frequency and high-frequency responses.
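The threshold-modulation correction mentioned above takes the usual body-effect form in the script: each threshold is its zero-bias value plus a square-root term in the source-to-bulk bias. The following minimal sketch isolates that one step for the NMOS device, using the constants `vtno`, `k1n` and `phif` as they appear in the script. The function handle `vtn_of` and the bias values in `vsb` are illustrative assumptions and are not taken from the dissertation.

```matlab
% Body-effect (threshold modulation) sketch for the NMOS device.
% Constants are copied from the script; vsb values are hypothetical.
vtno = 0.655;   % NMOS zero-bias threshold (V)
k1n  = 0.88;    % NMOS body-effect coefficient, named k1n in the script
phif = 0.7;     % surface-potential term, named phif in the script (V)

% Threshold as a function of source-to-bulk bias vsb >= 0 (V).
% vtn_of is a hypothetical helper name introduced for this sketch.
vtn_of = @(vsb) vtno + k1n .* (sqrt(phif + vsb) - sqrt(phif));

vsb = [0 0.5 1.0];      % hypothetical source-to-bulk biases (V)
vtn = vtn_of(vsb);      % modulated thresholds (V)

% Print one line per bias point; at vsb = 0 the result equals vtno.
fprintf('vsb = %.2f V -> vtn = %.3f V\n', [vsb; vtn]);
```

The threshold rises monotonically with bias, which is why the script recomputes `vtn` and `vtp` at every time step as the node voltages move.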

```matlab
%% Script for computing the gain of a DCTA        %%
%% (Differential Charge Transfer Amp.)            %%
%% Author: William J. Marble                      %%
%% Date: Dec 06, 2000                             %%
%%                                                %%
%% INPUT VARIABLES                                %%
%%   dv = Delta(Vin)                              %%
%%   vss = negative supply (-1.05 to -2.0)        %%
%%   vdd = positive supply (1.05 to 2.0)          %%
%%   ct = transfer capacitance (F)                %%
%%   co = load capacitance (F)                    %%
%%   bn = Kp(W/L) of NMOS transistors             %%
%%   bp = Kp(W/L) of PMOS transistors             %%
%%   vtno = NMOS zero-bias threshold              %%
%%   vtpo = PMOS zero-bias threshold              %%
%%                                                %%
%% OUTPUT VARIABLES                               %%
%%   a = alpha, the gain scaling factor           %%
%%   where GAIN = alpha(ct/co)                    %%
%%   The output is computed for frequencies       %%
%%   spanning 1Hz to 1GHz.                        %%
%%   This code ignores the speed-limiting         %%
%%   effects of switch resistance.                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear;

bp=.0008;
vtno=.655;
vtpo=-.958;
vtn=vtno;
vtp=vtpo;
k1n=0.88;
k1p=0.55;
phif=0.7;
an=2*ct/bn;
ap=2*ct/bp;
Bn=sqrt(bn/bp);
Bp=sqrt(bp/bn);

%% Subthreshold conduction parameter

vbo=vss-(vtn+vtp)/2;
vco=-vtp-Bn*(vbo+vtn);
fs=logspace(0,9,250);
tp=1./(4.*fs);
ta=1./(2.*fs);
j=[];
m=[];
for i=1:250
  m=[m;linspace(0,tp(i),100)];
end
for i=1:250
  j=[j;linspace(0,ta(i),100)];
end
inr=(bn/2).*(vbo+vtn).^2;
ipr=(bp/2).*(vco+vtp).^2;
A=an*(2*Bn+1);
B=4*an*vtp;
C=-(vtn+vbo-vco-vtp);
to1=(B+2*A*C)/C^2;
to2=-B/C^2;
if to2>to1
  to=to2;
else
  to=to1;
end;
tpe=to+tp;

%% Precharge phase
for i=1:250
  vtn=vtno;
  vtp=vtpo;
  for k=1:100
    B(i,k)=4*an*vtp;
    vbp(i,k)=-vtn-abs((1/2).*((A)./(m(i,k)+to)+sqrt(((A)./ ...
       (m(i,k)+to)).^2+(B(i,k))./(m(i,k)+to))));
    iinp(i,k)=(bn/2).*(vtn+vbp(i,k)).^2;
    vcp(i,k)=-vtp+abs((1/2).*((A)./(m(i,k)+to)-sqrt(((A)./ ...
       (m(i,k)+to)).^2+(B(i,k))./(m(i,k)+to))));
    iipp(i,k)=(bp/2).*(vtp+vcp(i,k)).^2;
    vtn=vtno+k1n*(sqrt(phif+(vbp(i,k)-vss))-sqrt(phif));
    vtp=vtpo-k1p*(sqrt(phif-(vcp(i,k)-vdd))-sqrt(phif));
  end
  Cb(i)=-(vbp(i,100)+vtn-dv-vcp(i,100)-vtp);
  Ca(i)=-(vbp(i,100)+vtn+dv-vcp(i,100)-vtp);
end

%% Time constants
t11=(B(:,100)'+2.*A.*Cb)./Cb.^2;
t12=-B(:,100)'./Cb.^2;
if t12>t11
  t1b=t12;
else
  t1b=t11;
end;
t13=(B(:,100)'+2.*A.*Ca)./Ca.^2;
t14=-B(:,100)'./Ca.^2;
if t14>t13
  t1a=t14;
else
  t1a=t13;
end;

%% Transient response
for i=1:250
  vtn=vtno+k1n*(sqrt(phif+abs(vbp(i,100)-vss))-sqrt(phif));
  vtp=vtpo-k1p*(sqrt(phif+abs(vcp(i,100)-vdd))-sqrt(phif));
  for k=1:100
    B(i,k)=4*an*vtp;
    vba(i,k)=-vtn+dv-abs((1/2).*((A)./(j(i,k)+t1b(i))+sqrt(((A)./ ...
       (j(i,k)+t1b(i))).^2+B(i,k)./(j(i,k)+t1b(i))+4*an*Bn*dv./ ...
       (j(i,k)+t1b(i)))));
    vda(i,k)=-vtn-dv-abs((1/2).*((A)./(j(i,k)+t1a(i))+sqrt(((A)./ ...
       (j(i,k)+t1a(i))).^2+B(i,k)./(j(i,k)+t1a(i))-4*an*Bn*dv./ ...
       (j(i,k)+t1a(i)))));
    ina1(i,k)=(bn/2).*(vtn+vba(i,k)-dv).^2;
    ina2(i,k)=(bn/2).*(vtn+vda(i,k)+dv).^2;
    vca(i,k)=-vtp+abs((1/2).*((A)./(j(i,k)+t1b(i))-sqrt(((A)./ ...
       (j(i,k)+t1b(i))).^2+(B(i,k))./(j(i,k)+t1b(i))+4*an*Bn*dv./ ...
       (j(i,k)+t1b(i)))));
    vaa(i,k)=-vtp+abs((1/2).*((A)./(j(i,k)+t1a(i))-sqrt(((A)./ ...
       (j(i,k)+t1a(i))).^2+(B(i,k))./(j(i,k)+t1a(i))-4*an*Bn*dv./ ...
       (j(i,k)+t1a(i)))));
    ipa2(i,k)=(bp/2).*(vtp+vca(i,k)).^2;
    ipa1(i,k)=(bp/2).*(vtp+vaa(i,k)).^2;
    vban(i,k)=-vtn-abs((1/2)*((A)./(j(i,k)+tpe(i))+sqrt(((A)./ ...
       (j(i,k)+tpe(i))).^2+(B(i,k))./(j(i,k)+tpe(i)))));
    vcan(i,k)=-vtp+abs((1/2)*((A)./(j(i,k)+tpe(i))-sqrt(((A)./ ...
       (j(i,k)+tpe(i))).^2+(B(i,k))./(j(i,k)+tpe(i)))));
    ipan(i,k)=(bp/2)*(vtp+vcan(i,k)).^2;
    inan(i,k)=(bn/2)*(vtn+vban(i,k)).^2;

    %% Body effect
    vtn=vtno+k1n*(sqrt(phif+(vba(i,k)-vss))-sqrt(phif));
    vtp=vtpo-k1p*(sqrt(phif-(vca(i,k)-vdd))-sqrt(phif));

    %% Subthreshold parameters
    if Cb(i)<=0
      vba(i,k)=vban(i,k)-Cb(i)/ec+dv/ec;
      vca(i,k)=vcan(i);
      ina1(i,k)=0;
      ipa2(i,k)=0;
    end
    if Ca(i)<=0
      vda(i,k)=vban(i,k)-Ca(i)/ec-dv/ec;
      vaa(i,k)=vcan(i);
      ina1(i,k)=0;
      ipa2(i,k)=0;
    end
  end
end

title('Gain Magnitude Plot')
xlabel('Sample Rate (Hz)')
ylabel('Normalized Gain, alpha')
axis([1 1e9 0 2])
```
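One Matlab detail is worth keeping in mind when reading the time-constant selection in the script. An `if` applied to a vector comparison such as `t12>t11` takes the true branch only when every element of the comparison is true, so the script selects one whole vector or the other. The sketch below contrasts that behavior with element-wise selection using `max`. The sample values and the names `t1b_script` and `t1b_elem` are made up for illustration, and whether element-wise selection was the intent is an assumption the source does not settle.

```matlab
% Illustrative values only; not taken from the dissertation.
t11 = [1 5 3];
t12 = [2 4 6];

% Whole-vector selection, written the same way as in the script.
% The condition is false here because t12 > t11 fails at index 2,
% so the else branch is taken for all elements.
if t12 > t11
    t1b_script = t12;
else
    t1b_script = t11;
end

% Element-wise selection of the larger value at each sample rate.
t1b_elem = max(t11, t12);

disp(t1b_script);
disp(t1b_elem);
```

When the two candidate vectors never cross over the swept sample rates, both forms give the same result; they differ only when the comparison changes sign somewhere in the sweep.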

## Bibliography

- [1] K. Kotani et al., "CMOS Charge-Transfer Preamplifier for Offset-Fluctuation Cancellation in Low-Power, High-Accuracy Comparators," in *Digest of Technical Papers, IEEE Symposium on VLSI Circuits*, June 1997, pp. 21–22.
- [2] K. Kotani et al., "CMOS Charge-Transfer Preamplifier for Offset-Fluctuation Cancellation in Low-Power A/D Converters," *IEEE Journal of Solid State Circuits*, vol. 33, no. 5, pp. 762–768, May 1998.
- [3] Y. Yao, "Stored Charge Memory Detection Circuit," United States Patent Number 3,760,381, September 1973, Assigned to International Business Machines Corporation.
- [4] R. Dennard and D. Spampinato, "Differential Charge Transfer Sense Amplifier," United States Patent Number 3,949,381, April 1976, Assigned to International Business Machines Corporation.
- [5] P. Diodato, "Embedded DRAM: More than Just a Memory," IEEE Communications Magazine, Online Edition, July 2000.
- [6] J. Heller et al., "High Sensitivity Charge Transfer Sense Amplifier," IEEE Journal of Solid State Circuits, vol. SC-11, pp. 596–601, October 1976.
- [7] J. Heller, "Cross-coupled Charge-transfer Sense Amplifier," in Digest of Technical Papers, IEEE International Solid State Circuits Conference (ISSCC), San Francisco, 1979, pp. 20-21.
- [8] J. Kim et al., "Boosted Charge Transfer Preamplifier for Low Power Gbit-scale DRAM," *Electronics Letters*, vol. 34, no. 18, pp. 1785–1791, September 1998.

- [9] S. Kawashima et al., "A Charge-transfer Amplifier and an Encoded-bus Architecture for Low-Power SRAMs," *IEEE Journal of Solid State Circuits*, vol. 33, no. 5, pp. 793–799, May 1998.
- [10] K. Kotani et al., "Feedback Charge-Transfer Comparator with Zero Static Power," in Digest of Technical Papers, IEEE International Solid State Circuits Conference (ISSCC), San Francisco, February 1999, pp. 328–329.
- [11] K. Kotani et al., "Charge Transfer Amplifier Circuit, Voltage Comparator, and Sense Amplifier," United States Patent Number 6,150,851, November 2000, Assigned to T. Ohmi, Kabushiki Kaisha Ultraclean Technology Research Institute, Tokyo, Japan.
- [12] W. Marble and D. T. Comer, "Ultra Low Power A/D Converters Using an Enhanced Differential Charge Transfer Amplifier," in *Digest of Technical Papers*, *IEEE European Solid State Circuits Conference (ESSCIRC)*, Stockholm, Sweden, September 2000.
- [13] W. Marble and D. T. Comer, "Analysis of the Dynamic Behavior of a Charge-transfer Amplifier," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 48, no. 7, pp. 793–804, July 2001.
- [14] W. Marble, "A 10-b Charge-transfer Amplifier-based A/D Converter with 400 μW/MSPS Dynamic Power Dissipation," in *Digest of Technical Papers, IEEE European Solid State Circuits Conference (ESSCIRC)*, Florence, Italy, September 2002.
- [15] W. Marble et al., "Practical Charge-Transfer Amplifier Design Architectures for Low-Power Flash A/D Converters," *IEEE Transactions on Circuits and Systems-I: Regular Papers*, In Press.
- [16] W. Marble, "Charging Ahead: Low Power Dissipation and Robust Performance From New Interface Technology," *New Electronics*, England, pp. 35–36, October 2002.

- [17] W. Marble, "Effektsnålare A/D Med Ny Laddningsförstärkare, (Ultra Efficient A/D Conversion with Novel Charge-Transfer Amplifiers) Trans. and Ed. G. Lilliesköld," *Elektronik i Norden, Sweden*, p. 28, February 2003.
- [18] W. Marble, "Differential Mode Charge Transfer Amplifier," United States Patent Number 6,249,181, June 2001, Assigned to AMI Semiconductor, Inc.
- [19] W. Marble, "Systems and Methods for Enhancing Charge Transfer Amplifier Gain," United States Patent Number 6,356,148, March 2002, Assigned to AMI Semiconductor, Inc.
- [20] W. Marble, "Reference-free Charge Transfer Amplifier," United States Patent Number 6,566,943, May 2003, Assigned to AMI Semiconductor, Inc.
- [21] W. Marble, "A/D Converters Based on Transconveyance Amplifiers," United States Patent Number 6,606,049, August 2003, Assigned to AMI Semiconductor, Inc.
- [22] R. Pierret, Semiconductor Device Fundamentals, Reading, Addison Wesley, 1996.
- [23] D. Foty, MOSFET Modeling with Spice, Reading, Prentice Hall, 1997.
- [24] W. Wilson et al., "Measurement and Modeling of Charge Feedthrough in n-Channel MOS Analog Switches," *IEEE Journal of Solid State Circuits*, vol. 20, no. 6, pp. 1206–1213, December 1985.
- [25] P. Cusinato et al., "Analysis of the Behavior of a Dynamic Latch Comparator," *IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications*, vol. 45, no. 3, pp. 294–298, March 1998.
- [26] G. Gielen, "Nyquist-Rate Data Converters: Overview and Figures of Merit," Presentation Notes, Course on CMOS Data Converters for Communications, May 2002, Barcelona, Spain.

- [27] M. Vogels and G. Gielen, "Figure of Merit Based Selection of A/D Converters," in Design, Automation and Test in Europe, Munich, Germany, March 2003, pp. 1090–1091.
- [28] A. Dingwall and V. Zazzu, "An 8-MHz CMOS Subranging 8-Bit A/D Converter," *IEEE Journal of Solid State Circuits*, vol. SC-20, no. 6, pp. 1138–1143, December 1985.
- [29] B. Brandt and J. Lutsky, "A 75-mW, 10-b, 20-MSPS CMOS Subranging ADC with 9.5 Effective Bits at Nyquist," *IEEE Journal of Solid State Circuits*, vol. 34, no. 12, pp. 1788–1795, December 1999.
- [30] K. Kusumoto et al., "A 10-b 20-MHz 30-mW Pipelined Interpolating CMOS ADC," *IEEE Journal of Solid State Circuits*, vol. 28, no. 12, pp. 1200–1206, December 1993.
- [31] K. Kattman and J. Barrow, "Repetitive Cell Matching Technique for Integrated Circuits," United States Patent Number 5,175,550, December 1992, Assigned to Analog Devices, Inc.
- [32] K. Bult and A. Buchwald, "Analog To Digital Converter," United States Patent Number 6,169,510, January 2001, Assigned to Broadcom Corporation.
- [33] K. Kattman and J. Barrow, "A Technique for Reducing Differential Non-Linearity Errors in Flash A/D Converters," in *IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, February 1991, pp. 170–171.
- [34] P. Figueiredo and J. Vital, "Averaging Technique in Flash Analog-to-Digital Converters," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 51, no. 2, pp. 233–253, February 2004.
- [35] K. Bult, "High-Speed CMOS ADCs," IEEE International Solid State Circuits Conference (ISSCC), Tutorial Session, February 1999.

- [36] W. Colleran, "A 10-bit, 100 MS/s A/D Converter Using Folding, Interpolation, and Analog Encoding," Ph.D. Dissertation, University of California at Los Angeles, 1993.
- [37] F. Kiko, "Compensated Transformer Circuit Utilizing Negative Capacitance Simulating Circuit," United States Patent Number 3,832,654, August 1974, Assigned to Lorain Products Corporation.
- [38] R. Rolfe and M. Shoji, "Single Terminal Negative Capacitance Generator for Response Time Enhancement," United States Patent Number 4,443,882, April 1984, Assigned to Bell Telephone Laboratories, Inc.
- [39] M. Clara et al., "An 11 bit Oversampled ADC for 3rd Generation VDSL in 0.18 μm CMOS," in *Austrochip 2001*, Vienna, Austria, October 2001, pp. 17–23.
- [40] R. Senger et al., "A 150 MSample/s Folding and Current Mode Interpolating ADC in 0.35 μm CMOS," Unpublished Report, University of Michigan, Fall 2002.
- [41] M. Ito et al., "A 10 bit 20 MS/s 3V Supply CMOS A/D Converter," IEEE Journal of Solid State Circuits, vol. 29, no. 12, pp. 1531–1536, December 1994.
- [42] Y. Wang, "An 8-bit 150-MHz CMOS A/D Converter," Ph.D. Dissertation, University of California at Los Angeles, 1999.
- [43] P. Kim et al., "A 69-mW 10-bit 80-MSample/s Pipelined CMOS ADC," IEEE Journal of Solid State Circuits, vol. 39, no. 2, pp. 308–319, February 2004.
- [44] J. Sit and R. Sarpeshkar, "A Micropower Logarithmic A/D With Offset and Temperature Compensation," *IEEE Journal of Solid State Circuits*, vol. 39, no. 2, pp. 308–319, February 2004.
- [45] International Technology Roadmap for Semiconductors, "System Drivers, 2003 Edition," URL: http://public.itrs.net, 2003.

- [46] I. Fukushi et al., "A Low-power SRAM Using Improved Charge Transfer Sense Amplifiers and a Dual-Vth CMOS Circuit Scheme," in *Digest of Technical Papers, IEEE Symposium on VLSI Circuits*, 1998, pp. 142–145.
- [47] I. Fukushi et al., "Dual-Vth 0.25 μm CMOS Cells and Macros for 1V Low-power LSIs," *Fujitsu Scientific & Technical Journal*, vol. 36.1, pp. 72–81, June 2000.