Adv. Radio Sci., 4, 287–291, 2006 www.adv-radio-sci.net/4/287/2006/ © Author(s) 2006. This work is licensed under a Creative Commons License.

# Advances in Radio Science

# 2.5 Gbps clock data recovery using 1/4th-rate quadricorrelator frequency detector and skew-calibrated multi-phase clock generator

#### S. Tontisirin and R. Tielert

Technische Universität Kaiserslautern, Germany

Abstract. A Gb/s clock and data recovery (CDR) circuit using a 1/4th-rate digital quadricorrelator frequency detector and a skew-calibrated multi-phase voltage-controlled oscillator is presented. With the 1/4th-rate clock architecture, the coil-free oscillator can run at a lower frequency, which provides sufficiently low-jitter operation. Moreover, the circuit is an inherent 1-to-4 DEMUX. A skew calibration scheme is applied to reduce phase offsets in the multi-phase clock generator. With the frequency detector, the CDR can have a small loop bandwidth and a wide pull-in range, and can operate without a local reference clock. This 1/4th-rate CDR is implemented in standard $0.18\,\mu\text{m}$ CMOS technology. It has an active area of $0.7\,\text{mm}^2$ and consumes 100 mW at a 1.8 V supply. The CDR has low-jitter operation over a wide data-rate range from 1 to 2.25 Gb/s. The measured bit-error rate is less than $10^{-12}$ for 2.25 Gb/s incoming $2^7 - 1$ PRBS data with a peak-to-peak jitter of 0.7 unit interval (UI) modulated at 10 MHz.

### 1 Introduction

At the receiver side of an optical or high-speed serial communication link, the clock and data recovery circuit has to generate a clock synchronized with the incoming serial data for use in data regeneration and demultiplexing. With regard to cost, a monolithic IC solution for clock and data recovery is attractive, especially a CMOS IC without an on-chip coil, which occupies a large area. At high frequency, the ring-based voltage-controlled oscillator (VCO) is far inferior in jitter performance, but at an adequately low frequency it can provide sufficiently low-jitter operation. With the 1/4th-rate architecture, the required clock frequency is reduced; hence low-jitter operation is achievable. Moreover, power consumption can also be decreased.

Correspondence to: S. Tontisirin (tontirisin@eit.uni-kl.de)



Fig. 1. Block diagram of 1/4th-rate CDR with frequency detector and skew calibration.

The jitter transfer bandwidth of a CDR is normally kept small to improve noise performance. In turn, the pull-in range is reduced. CDRs with a frequency initial loop (Cao, 2002) or with a frequency synthesizer loop (Farjad-Rad, 2004) require a local reference clock. A CDR with a frequency acquisition loop using a digital quadricorrelator frequency detector (DQFD) can have low-jitter operation and a wide pull-in range because there is no trade-off between loop bandwidth and pull-in range. It can operate without a local reference clock. The full-rate clock architecture was presented in Pottbäcker (1992). The half-rate clock approach (Yang, 2004) lowers the power consumption and relaxes the required clock frequency.

In this work, a 1/4th-rate CDR using digital quadricorrelator frequency detector is described. The skew calibration scheme (Wu, 2001) is applied to reduce phase offsets in the required multi-phase clock generator.

### 2 Overall architecture

Fig. 2. Block diagram of phase frequency detector.

Table 1. Operation table of phase detector.

| Serial data transition | Meaning         | Output               |
|------------------------|-----------------|----------------------|
| State-1                | Clock too late  | F-up-state-1 = '1'   |
| State-2                | Clock too late  | F-up-state-2 = '1'   |
| State-3                | Clock too early | F-down-state-3 = '1' |
| State-4                | Clock too early | F-down-state-4 = '1' |

The block diagram of the proposed 1/4th-rate CDR using a digital quadricorrelator frequency detector is depicted in Fig. 1. The CDR consists of a fully differential 8-phase ring-oscillator-based VCO, a skew calibration circuit, 16 sense amplifiers, a phase frequency detector (PFD), a charge pump and loop filter components. The skew-calibrated 16-phase clocks are obtained by using both clock edges of the VCO. They drive the 16 sense amplifiers that sample the incoming serial data stream. The 16 data samples, covering 4 serial data bit periods, are used in the PFD to generate frequency up/down signals that control the frequency and phase of the VCO so that it is synchronized to the incoming serial data.
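The sampling arrangement described in this section (16 clock phases spread over 4 data bit periods) can be illustrated with a minimal sketch. The function name `sampling_grid`, the dictionary keys, and the mapping of clk-k to a particular VCO stage and edge are assumptions made for illustration only; the text states only that both clock edges of the 8-phase VCO are used.

```python
def sampling_grid(data_rate_bps, vco_stages=8, rate_divisor=4):
    """List the sampling instants of a 1/4th-rate CDR within one clock period.

    Assumption: clk-0..clk-7 come from the rising edges of the 8 VCO phases
    and clk-8..clk-15 from the falling edges (illustrative indexing only).
    """
    ui_ps = 1e12 / data_rate_bps            # one unit interval in picoseconds
    n_phases = 2 * vco_stages               # both clock edges are used
    grid = []
    for k in range(n_phases):
        offset_ui = k * rate_divisor / n_phases   # phase spacing of 1/4 UI
        grid.append({
            "clk": k,
            "stage": k % vco_stages,
            "edge": "rising" if k < vco_stages else "falling",
            "offset_ui": offset_ui,
            "offset_ps": offset_ui * ui_ps,
        })
    return grid


grid = sampling_grid(2.25e9)
samples_per_bit = len(grid) // 4            # 16 samples over 4 bit periods
```

With these assumptions, adjacent phases are 0.25 UI apart, so each data bit is sampled four times, and clk-0, clk-4, clk-8 and clk-12 are exactly 1 UI apart.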

#### 2.1 Phase frequency detector

The block diagram of the PFD is depicted in Fig. 2. The PFD senses the phase difference between the 1/4th-rate clock and the input data. In order to have automatic data recovery, the PFD has to sample the data by the clock instead of sampling the clock by the data, as described in Pottbäcker (1992) and Yang (2004). The data from the sense amplifiers are retimed so that all samples are synchronous. When the CDR is locked, the data sampled by clk-0, clk-4, clk-8 and clk-12 represent the regenerated 4-bit output of the demultiplexer.

Fig. 3. Time diagram of phase detector.

Table 2. Operation table of frequency detector.

| Signal         | Set condition                    | Reset condition      |
|----------------|----------------------------------|----------------------|
| Q1             | F-down-state-4 = '1'             | F-up-state-2 = '1'   |
| Q2             | F-up-state-1 = '1'               | F-down-state-3 = '1' |
| F-down-disable | Rising edge of Q1 when Q2 = '1'  | Falling edge of Q1   |
| F-up-disable   | Rising edge of Q2 when Q1 = '1'  | Falling edge of Q2   |

The PFD operates at the 1/4th-rate clock, so it can be implemented in CMOS logic, which consumes less power than high-speed current-mode logic. Its operation is as follows. Each 1/4th of the clock period, i.e. 1 unit interval (UI), is divided into 4 phase states, state-1, state-2, state-3 and state-4, as shown in Fig. 3. The operation of the phase detector (PD) is summarized in Table 1. The PD steers clk-0, clk-4, clk-8 and clk-12 to sample at the middle of the data eye. If serial data transitions occur in state-3 or state-4, the sampling points are too early and the PD generates F-down-state-3 or F-down-state-4 signals. If serial data transitions occur in state-1 or state-2, the sampling points are too late and the PD generates F-up-state-1 or F-up-state-2 signals. In the phase-locked condition, serial data transitions alternate between state-2 and state-3; hence the PD generates equal numbers of F-up-state-2 and F-down-state-3 signals.
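The phase detector behaviour in Table 1 can be written as a small lookup. This is only a behavioural sketch, not the circuit: the names `PD_OUTPUT`, `pd_outputs` and `up_minus_down` are hypothetical, and the input is assumed to be a sequence of phase states (1 to 4) in which successive data transitions were observed.

```python
from collections import Counter

# Table 1 as a lookup: phase state of a serial data transition -> PD output.
PD_OUTPUT = {
    1: "F-up-state-1",    # clock too late
    2: "F-up-state-2",    # clock too late
    3: "F-down-state-3",  # clock too early
    4: "F-down-state-4",  # clock too early
}


def pd_outputs(states):
    """Translate a sequence of transition states (1..4) into PD output names."""
    return [PD_OUTPUT[s] for s in states]


def up_minus_down(states):
    """Net number of F-up pulses minus F-down pulses for a state sequence."""
    counts = Counter(name.split("-")[1] for name in pd_outputs(states))
    return counts["up"] - counts["down"]
```

For the locked sequence 2, 3, 2, 3 the net count is zero, which matches the balance of F-up-state-2 and F-down-state-3 signals described in the text; a sequence dominated by state-1 and state-2 gives a positive net count.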

Fig. 4. Time diagram of frequency detector: frequency is too low.

Fig. 5. Time diagram of frequency detector: frequency is too high.

The frequency detector (FD) monitors the outputs of the PD to detect a frequency difference. If the VCO frequency is not equal to 1/4th of the data rate, the serial data transition edges rotate through the phase states of the VCO clock, and the frequency difference determines the rotation frequency. As shown in Fig. 4, if the VCO frequency is lower than 1/4th of the data rate, the serial data transitions move cyclically through state-2, state-1, state-4, state-3, state-2, state-1 and so on. They rotate in the opposite direction if the VCO frequency is too high, as shown in Fig. 5. Because of this cyclical walking, the PD alone generates equal numbers of F-up and F-down signals, so the loop cannot be driven into lock. Therefore, in the proposed CDR, the FD detects the cyclical walking of the serial data transitions and generates the signals F-up-disable and F-down-disable, which produce the imbalance between F-up and F-down signals needed to drive the VCO frequency to 1/4th of the data rate. The operation of the FD is shown in Table 2. Q1 and Q2 are used to detect the cyclical walking. If the VCO frequency is too low, Q2 leads Q1, as shown in Fig. 4. F-down-disable is set by the rising edge of Q1 when Q2 is '1' and reset by the falling edge of Q1. F-up-disable is set by the rising edge of Q2 when Q1 is '1' and reset by the falling edge of Q2. If Q2 leads Q1, F-down-disable is generated while F-up-disable stays at '0'. If the VCO frequency is too high, Q1 leads Q2 and F-up-disable is generated while F-down-disable stays at '0', as depicted in Fig. 5. In the lock condition, serial data transitions alternate between state-2 and state-3; hence Q1 and Q2 stop beating and F-down-disable and F-up-disable disappear automatically. As a result, the FD does not disturb the lock condition.
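The set and reset rules of Table 2 can be replayed in a short behavioural sketch to see how the disable signals unbalance the F-up and F-down counts. The function name `simulate_fd` is hypothetical. Two assumptions are made that the text does not specify: each list entry is one data transition observed in the given phase state, and a disable flag updated by a transition already gates the pulse produced by that same transition.

```python
def simulate_fd(states):
    """Replay transition phase states (1..4) through Tables 1 and 2.

    Returns (ups, downs): the F-up and F-down pulses that pass the gating.
    Assumption: disable flags are updated before the current pulse is gated.
    """
    q1 = q2 = False
    f_down_disable = f_up_disable = False
    ups = downs = 0
    for s in states:
        prev_q1, prev_q2 = q1, q2
        # Table 2: Q1 is set by F-down-state-4 and reset by F-up-state-2.
        if s == 4:
            q1 = True
        elif s == 2:
            q1 = False
        # Table 2: Q2 is set by F-up-state-1 and reset by F-down-state-3.
        if s == 1:
            q2 = True
        elif s == 3:
            q2 = False
        # F-down-disable: set on rising Q1 while Q2 = '1', reset on falling Q1.
        if q1 and not prev_q1 and q2:
            f_down_disable = True
        if prev_q1 and not q1:
            f_down_disable = False
        # F-up-disable: set on rising Q2 while Q1 = '1', reset on falling Q2.
        if q2 and not prev_q2 and q1:
            f_up_disable = True
        if prev_q2 and not q2:
            f_up_disable = False
        # Table 1 pulses, gated by the disable flags.
        if s in (1, 2) and not f_up_disable:
            ups += 1
        if s in (3, 4) and not f_down_disable:
            downs += 1
    return ups, downs


too_low = simulate_fd([2, 1, 4, 3] * 3)    # VCO too slow: rotation 2, 1, 4, 3
too_high = simulate_fd([1, 2, 3, 4] * 3)   # VCO too fast: opposite rotation
locked = simulate_fd([2, 3] * 4)           # lock: alternating state-2, state-3
```

Under these assumptions the too-low rotation leaves only F-up pulses, the too-high rotation leaves a surplus of F-down pulses, and the locked sequence stays balanced because Q1 and Q2 never rise.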



Fig. 6. Simplified VCO schematic.



Fig. 7. Skew calibration scheme.

#### 2.2 VCO design and skew calibration scheme

A simplified schematic of the VCO is shown in Fig. 6. It consists of 8 fully differential delay units. Four configurable frequency ranges can be selected by switching the resistive loads. Replica biasing is applied to control the voltage swing. A V-I converter circuit converts the control voltage to the bias current of the delay units. The layout of the VCO is designed to balance the delays of the individual units. A common-centroid topology and dummy cells are used to improve matching.

Fig. 8. Die photograph.

Table 3. Performance summary.

| Parameter                 | Value                                                                      |
|---------------------------|----------------------------------------------------------------------------|
| Technology                | $0.18\,\mu\mathrm{m}$ CMOS                                                 |
| Active area + loop filter | $0.7\,\mathrm{mm^2} + 0.65\,\mathrm{mm^2}$                                 |
| Power consumption         | 100 mW at 1.8 V supply                                                     |
| VCO gain                  | 80 MHz/V                                                                   |
| VCO frequency ranges      | 240–570 MHz                                                                |
| CDR data rate             | 1–2.25 Gb/s                                                                |
| Loop bandwidth            | 1 MHz                                                                      |
| Pull-in range             | more than 100 MHz                                                          |
| Bit-error rate (BER)      | less than $10^{-12}$ at 2.25 Gb/s, jitter p-p 0.7 UI, modulation at 10 MHz |

In order to reduce phase offsets further, a skew calibration scheme, as described in Wu (2001), is used. The block diagram is shown in Fig. 7a. Each clock phase has its own delay-locked loop (DLL) to calibrate its phase position. The phase control hierarchy is shown in Fig. 7b. Clk-8 is calibrated to lie in the middle between 2 successive clk-0 edges. Clk-4 is adjusted to lie in the middle between clk-0 and clk-8. Similarly, clk-12 is controlled to lie in the middle between clk-8 and the next clk-0. The remaining clocks are controlled in the same way.
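The midpoint hierarchy described above can be sketched as a recursive bisection. The text gives the reference pairs only for clk-8, clk-4 and clk-12; extending the same bisection to the remaining phases is an assumption here, as are the names `calibration_hierarchy` and `calibrate` and the idealized model in which each DLL places its clock exactly at the midpoint of its two references.

```python
def calibration_hierarchy(n_phases=16):
    """Return {phase: (left_ref, right_ref)} in calibration order.

    Index n_phases stands for the next clk-0 edge. Assumption: phases other
    than clk-8, clk-4 and clk-12 follow the same bisection rule.
    """
    refs = {}

    def bisect(lo, hi):
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        refs[mid] = (lo, hi)     # parents are inserted before their children
        bisect(lo, mid)
        bisect(mid, hi)

    bisect(0, n_phases)
    return refs


def calibrate(edges, period):
    """Idealized calibration: move every phase to the midpoint of its references.

    edges[k] is the edge time of clk-k; clk-0 is the fixed reference.
    """
    n = len(edges)
    out = list(edges) + [edges[0] + period]   # append the next clk-0 edge
    for mid, (lo, hi) in calibration_hierarchy(n).items():
        out[mid] = 0.5 * (out[lo] + out[hi])
    return out[:n]


hierarchy = calibration_hierarchy()
# Time unit: 1/16 of the clock period; every phase except clk-0 starts 0.3 late.
skewed = [0.0] + [k + 0.3 for k in range(1, 16)]
calibrated = calibrate(skewed, period=16.0)
```

In this idealized model the calibrated edges end up evenly spaced regardless of the initial skew, since each phase is positioned only relative to phases that were calibrated before it.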

#### 2.3 Loop characteristic

A standard loop filter built from on-chip passive capacitors and resistors is used. The linear model approximation of a phase-locked loop (PLL) remains valid for the close-to-lock behavior of this CDR because the FD is inactive once frequency acquisition is achieved. The loop behavior during frequency acquisition can be approximated as a control loop with one pole at the origin, which is always stable.
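As a brief worked illustration of the stability remark, the acquisition loop can be written as a single-integrator loop. The gain symbol $K$ is an assumption introduced here: it lumps the detector, charge pump, filter and VCO gains into one positive constant and does not appear in the original text.

$$
G(s) = \frac{K}{s}, \qquad H(s) = \frac{G(s)}{1 + G(s)} = \frac{K}{s + K}
$$

The closed loop then has a single real pole at $s = -K$, which lies in the left half-plane for every $K > 0$, consistent with the statement that such a loop is always stable.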

### 3 Experimental results

**Fig. 9.** (a) DEMUX output and recovered clock at 562.5 MHz (2.25 Gb/s). (b) Jitter histogram of the corresponding recovered clock, jitter 5.9 ps (rms).

The 1/4th-rate CDR is implemented in $0.18\,\mu\text{m}$ CMOS technology. The CDR excluding the loop filter components occupies an area of $0.7\,\text{mm}^2$. The on-chip loop filter capacitor of 600 pF occupies an additional area of $0.65\,\text{mm}^2$. The die photograph is shown in Fig. 8 and the performance summary in Table 3. The CDR operates at data rates from 1 to 2.25 Gb/s and consumes 100 mW from a 1.8 V supply. Its loop bandwidth is 1 MHz and its pull-in range is larger than 100 MHz.

The CDR was measured using NRZ data with a PRBS of $2^7 - 1$. Fig. 9a shows the 4-bit output of the DEMUX and the recovered clock at 562.5 MHz, corresponding to the incoming data rate of 2.25 Gb/s. The recovered clock has a jitter of 5.9 ps (rms). Its jitter histogram is depicted in Fig. 9b. The CDR was also tested using 2.25 Gb/s serial data with a peak-to-peak jitter of 312 ps (0.7 UI) at 10 MHz modulation. The jitter histograms of the incoming serial data and the corresponding recovered clock are shown in Fig. 10a and Fig. 10b, respectively. The recovered clock has a jitter of 15.9 ps (rms) and the bit-error rate (BER) was measured to be less than $10^{-12}$.
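The relations between the quoted data rate, recovered clock frequency and jitter figures can be checked with a few lines of arithmetic. The helper name `cdr_numbers` and the variable names are hypothetical; the input values are the ones reported in this section.

```python
def cdr_numbers(data_rate_bps, rate_divisor=4):
    """Recovered clock frequency and unit interval for a 1/4th-rate CDR."""
    return {
        "clock_hz": data_rate_bps / rate_divisor,
        "ui_ps": 1e12 / data_rate_bps,
    }


nums = cdr_numbers(2.25e9)

# Input jitter of 0.7 UI peak-to-peak expressed in picoseconds.
input_jitter_pp_ps = 0.7 * nums["ui_ps"]

# Reported rms jitter of the recovered clock, expressed as a fraction of one UI.
rms_clean_ui = 5.9 / nums["ui_ps"]       # without input jitter modulation
rms_stressed_ui = 15.9 / nums["ui_ps"]   # with 0.7 UI modulation at 10 MHz
```

The clock frequency comes out as 562.5 MHz and one UI as about 444 ps, so 0.7 UI is roughly 311 ps, in line with the 312 ps quoted for the input jitter.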

### 4 Conclusions

**Fig. 10.** (a) Jitter histogram of 2.25 Gb/s serial data with a peak-to-peak jitter of 0.7 UI modulated at 10 MHz. (b) Jitter histogram of the corresponding recovered clock, jitter 15.9 ps (rms).

A 1/4th-rate CDR using a digital quadricorrelator frequency detector has been described. The 1/4th-rate architecture allows a lower operating frequency; hence a coil-free oscillator can meet the jitter requirement. All logic units can be implemented in CMOS logic, which leads to lower power consumption compared to high-speed current-mode logic. Furthermore, this 1/4th-rate CDR is an inherent 1-to-4 demultiplexer, so no additional demultiplexer circuit is required. The CDR can have a small loop bandwidth and a wide pull-in range, and it operates without a local reference clock. The CDR, implemented in $0.18\,\mu\text{m}$ CMOS technology, showed low-jitter operation at data rates from 1 to 2.25 Gb/s. Higher data rates will be achieved by parameter changes in the oscillator range.

Acknowledgement. The test chip is sponsored by the Gesellschaft für Schwerionenforschung (GSI) Darmstadt.

#### References

- Cao, J., Green, M., Momtaz, A., Vakilian, K., Chung, D., Jen, K.-C., Caresosa, M., Wang, X., Tan, W.-G., Cai, Y., Fujimori, I., and Hairapetian, A.: OC-192 Transmitter and Receiver in Standard 0.18 μm CMOS, IEEE J. of Solid-State Circuits, 1768–1780, 2002.
- Farjad-Rad, R., Nguyen, A., Tran, J. M., Greer, T., Poulton, J., Dally, W. J., Edmondson, J. H., Senthinathan, R., Rathi, R., Edward Lee, M.-J., and Ng, H.-T.: A 33 mW 8 Gb/s CMOS Clock Multiplier and CDR for Highly Integrated I/O, IEEE J. of Solid-State Circuits, 1553–1561, 2004.
- Pottbäcker, A., Langmann, U., and Schreiber, H.-U.: A Si Bipolar Phase and Frequency Detector IC for Clock Extraction up to 8Gb/s, IEEE J. of Solid-State Circuits, 1747–1751, 1992.
- Wu, L. and Black Jr., C.: A Low Jitter Skew-Calibrated Multi-Phase Clock Generator for Time Interleaved Application, ISSCC, 396–397, 2001.
- Yang, R.-J., Chen, S.-P., and Liu, S.-I.: A 3.125 Gb/s Clock and Data Recovery Circuit for the 10-Gbase-LX4 Ethernet, IEEE J. of Solid-State Circuits, 1356–1360, 2004.