# A Digital PLL made from Standard Cells

Thomas Olsson\* and Peter Nilsson\*

*ABSTRACT:* A fully integrated digital PLL used as a clock multiplying circuit is designed. The PLL has no off-chip components and is made from cells found in most standard cell libraries. It is therefore portable between processes as an IP-block in netlist format. Using a 0.35µm standard CMOS process and a 3.0V supply voltage, the PLL is designed for a frequency range of 170MHz to 360MHz and occupies an on-chip area of about 0.07mm<sup>2</sup>.

# 1. INTRODUCTION

The PLL is a widely used circuit for clocking digital IP-blocks. Traditionally, a PLL is made as a partly analog building block. However, integrating an analog PLL in a noisy digital environment is difficult. In addition, the analog PLL is sensitive to process variations and must therefore be redesigned for each new process.

Robust and easily implemented fully digital clock multipliers without phase locking are proposed in [1] and [2]. These clock multipliers produce a fixed number of cycles for each period of an external reference clock signal, followed by an idle margin. For many digital applications, such simpler clock generators are a possible solution. However, for a number of applications, for instance those using synchronous on-chip communication, a PLL is necessary to ensure correct functionality.

In this paper, an implementation of a digital PLL is presented. The digital PLL is designed as an IP-block described as a netlist linked to a standard cell library. For most digital applications, a standard cell description of the PLL simplifies the design, since the design including the PLL becomes portable between technologies.



Figure 1. PLL block structure.

The block structure of the digital PLL is shown in figure 1. It is divided into a phase detector, a loop filter, an oscillator and a frequency divider. The phase detector is a slightly modified standard type IV detector [3].

The loop filter consists of a counter and a first order digital recursive filter. The counter performs an integration of the output from the phase detector. The result is then filtered using the digital filter. The oscillator is a numerically controlled oscillator (NCO).

# 2. NUMERICALLY CONTROLLED OSCILLATOR

Local clock generators based on ring oscillators have many advantages such as robustness, small size and low power consumption. The ring oscillator in its simplest form consists of an odd number of inverters connected in a circular chain. Such a circuit has no stable operating point and will therefore oscillate. The ring oscillator frequency is determined by the propagation time through the chain of inverters.
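As a rough illustration of the last point, the usual first-order model of an N-stage ring oscillator gives f = 1/(2·N·t_d), since a transition has to travel around the ring twice per period. The sketch below assumes the same delay in every stage, which is a simplification. The function name and the 200 ps stage delay are illustrative choices and are not taken from the design.

```python
def ring_oscillator_frequency(num_stages: int, stage_delay_s: float) -> float:
    """First-order estimate: a transition travels around the ring twice per period."""
    if num_stages < 3 or num_stages % 2 == 0:
        raise ValueError("a simple inverter ring needs an odd number of stages (>= 3)")
    return 1.0 / (2 * num_stages * stage_delay_s)


# Hypothetical example: 7 stages with an assumed delay of 200 ps per stage.
f_hz = ring_oscillator_frequency(7, 200e-12)
print(f"{f_hz / 1e6:.0f} MHz")  # prints "357 MHz"
```

In this model, lowering the stage delay is the only way to raise the frequency once the number of stages is fixed.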

There are many methods to manipulate the ring oscillator frequency. The most straightforward technique is to change the propagation delay by changing the number of inverters. Other techniques are to use current starved inverters [4] or a delay line of controllable capacitors [5].

The difficulty when making an all standard cell PLL is often to implement an oscillator with high resolution. Often, a variable number of inverters is used for implementing a variable delay. However, this results in delay steps of several hundred ps, which gives an inaccurate and unstable phase lock for high frequency applications.



Figure 2. NCO.

The oscillator for the digital PLL is a 7-stage ring oscillator with one inverter replaced by a NAND-gate for shutting down the ring oscillator during idle mode. To change the frequency of the ring oscillator, a set of 21 inverting tri-state gates is connected in parallel with each of the six remaining inverters (see figure 2). When the tri-state gates are enabled, additional current drive is added to each inverter stage. The 126 tri-state gates are controlled by a 126-bit vector, C, which is decoded from a 7-bit control word, W. The vector C is all ones for W=0 and all zeros for W=125 to 127. For W<126, the number of zeros in C is equal to W.

<sup>\*</sup> Dept. of Electroscience, Lund University. P.O. Box 118, SE-22100, Lund, Sweden.

Table 1 gives the number of enabled tri-state gates in parallel with each inverter for different values of the control word W.

| W   | Inv1 | Inv2 | Inv3 | Inv4 | Inv5 | Inv6 |
|-----|------|------|------|------|------|------|
| 0   | 21   | 21   | 21   | 21   | 21   | 21   |
| 1   | 20   | 21   | 21   | 21   | 21   | 21   |
| 2   | 20   | 20   | 21   | 21   | 21   | 21   |
| ... | ...  | ...  | ...  | ...  | ...  | ...  |
| 123 | 0    | 0    | 0    | 0    | 1    | 1    |
| 124 | 0    | 0    | 0    | 0    | 0    | 1    |
| 125 | 0    | 0    | 0    | 0    | 0    | 0    |
| 126 | 0    | 0    | 0    | 0    | 0    | 0    |
| 127 | 0    | 0    | 0    | 0    | 0    | 0    |

Table 1. Number of enabled tri-state gates.
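The decoding from W to C can be sketched as a thermometer code. The example below is a minimal behavioral model and not the actual decoder netlist. It assumes that bit k of C drives a gate on inverter k mod 6, so that disabled gates are spread round-robin over the inverters, and that the number of zeros is min(W, 126). Under these assumptions it reproduces the first rows of table 1. How the real decoder treats the last few control words is not modeled.

```python
NUM_INVERTERS = 6         # inverter stages that carry tri-state gates
GATES_PER_INVERTER = 21
NUM_GATES = NUM_INVERTERS * GATES_PER_INVERTER  # 126


def decode(w: int) -> list[int]:
    """Return the 126-bit vector C (1 = gate enabled) for a 7-bit word W.

    Assumption: the first min(W, 126) bits of C are cleared.
    """
    if not 0 <= w <= 127:
        raise ValueError("W must be a 7-bit value")
    zeros = min(w, NUM_GATES)
    return [0] * zeros + [1] * (NUM_GATES - zeros)


def enabled_per_inverter(w: int) -> list[int]:
    """Count enabled gates per inverter, assuming bit k belongs to inverter k % 6."""
    c = decode(w)
    return [sum(c[i::NUM_INVERTERS]) for i in range(NUM_INVERTERS)]


for w in (0, 1, 2):
    print(w, enabled_per_inverter(w))
```

Because each increment of W can only clear bits, the total number of enabled gates never increases with W in this model.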

The period time versus digital control word for the NCO at 3.0V supply voltage is shown in figure 3. The plot in figure 3 has a slope between 10ps/bit and 55ps/bit. It is important to keep the slope low, since the slope sets the resolution of the NCO.
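For orientation, an average step size can be estimated from the end points of the 170MHz to 360MHz range stated in the abstract. The sketch assumes that these end points correspond to the extremes of the control word and that the period changes over about 125 steps. Both are assumptions made for this example only.

```python
F_MIN_HZ = 170e6
F_MAX_HZ = 360e6
NUM_STEPS = 125  # assumed number of period steps over the control range

period_max_s = 1.0 / F_MIN_HZ   # about 5.88 ns
period_min_s = 1.0 / F_MAX_HZ   # about 2.78 ns
avg_step_ps = (period_max_s - period_min_s) / NUM_STEPS * 1e12
print(f"average step: {avg_step_ps:.1f} ps/bit")  # prints "average step: 24.8 ps/bit"
```

The resulting average of roughly 25ps/bit lies inside the 10ps/bit to 55ps/bit interval read from figure 3.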

A negative slope must also be avoided, since this might cause the PLL to be unstable. The decoding from W to C makes a negative slope impossible, since adding current drive by enabling a tri-state gate cannot slow down the oscillator.

The simulations indicate a frequency range of 170-360MHz. Since the minimum NCO frequency is less than 50% of the maximum frequency, a large actual frequency range can be obtained by dividing the frequency by factors of two using a set of flip-flops.
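The argument can be checked with a few lines of arithmetic. The sketch below divides the simulated 170-360MHz range by successive powers of two and tests that neighboring ranges overlap. The helper names are illustrative, and three divider stages is an arbitrary choice for the example.

```python
F_MIN_MHZ = 170.0
F_MAX_MHZ = 360.0


def divided_ranges(num_dividers: int) -> list[tuple[float, float]]:
    """Frequency range after 0..num_dividers divide-by-two stages."""
    return [(F_MIN_MHZ / 2**k, F_MAX_MHZ / 2**k) for k in range(num_dividers + 1)]


def is_gap_free(ranges: list[tuple[float, float]]) -> bool:
    """True if each divided range reaches up to the bottom of the previous one."""
    return all(lower_hi >= upper_lo
               for (upper_lo, _), (_, lower_hi) in zip(ranges, ranges[1:]))


ranges = divided_ranges(3)
for k, (lo, hi) in enumerate(ranges):
    print(f"divide by {2**k}: {lo:.2f} to {hi:.2f} MHz")
print("gap-free:", is_gap_free(ranges))  # prints "gap-free: True"
```

With these numbers the divided ranges are 85 to 180MHz, 42.5 to 90MHz and 21.25 to 45MHz, so every frequency from 21.25MHz to 360MHz is covered.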



Figure 3. Period time vs. digital control word (W).

# 3. PHASE DETECTOR

The type IV phase detector has in its original configuration two outputs controlling the oscillator frequency: one for signaling "UP" and one for signaling "DOWN" for the duration of the phase error. A slight modification is made to produce one signal "UP" or "DOWN" and one signal "EVENT" showing the length of the phase error. The modified phase detector is shown in figure 4.



Figure 4. Phase detector.

Due to the internal delay of the phase detector, the pulses at nodes "UP" and "DOWN" are longer than the actual phase error. There is therefore always a pulse at both nodes "UP" and "DOWN". For the "EVENT" signal in figure 4, the internal delay is canceled out. Thereby, a more accurate measurement of the phase error is achieved.

This PLL implementation works as a synchronous digital circuit, which thus needs a clock pulse for updating all registers. At the end of each phase error, a short pulse is produced at the node "UPDATE". This pulse, which is as long as the internal delay of the phase detector, is further delayed and used as a system clock. Figure 5 shows a simulation of the "UP", "DOWN", "EVENT" and "UPDATE" pulses. For the simulation, the phase error is set to 1.40ns, which gives an "EVENT" signal of about 1.42ns. The "UPDATE" pulse is about 800ps long, independent of the phase error.



Figure 5. The "EVENT" and "UPDATE" signals.

# 4. LOOP FILTER

The loop filter consists of a counter and a digital recursive filter. The counter measures the phase error and the result from the counter is used to update the control word for the NCO. In order to get a high resolution when measuring the phase error, the counter is preferably clocked at a frequency higher than the output from the NCO. To avoid using a very high frequency for this purpose, clock edges (flanks) instead of complete pulses are counted during the phase error. This makes it possible to use the output from the NCO instead of implementing an extra oscillator.

The circuit of figure 6 is used for gating a clock signal when "EVENT" is low. The clock pulses at the output "clk\_out" are always complete pulses. Each of the two three-bit counters is equipped with the circuit of figure 6. One counter is clocked with the gated version of the clock and the other counter is clocked with the gated version of the inverted clock. Figure 7 is a simulation showing the "EVENT" signal, the clock signal and the clock bursts counting up the two counters. The results from the two counters are then added to get the number of clock edges during the phase error.
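The counting scheme can be illustrated with a purely behavioral model. The sketch below is not the gate-level circuit of figure 6. It assumes an ideal clock with 50% duty cycle and an even period given in integer picoseconds, an "EVENT" window that starts at time 0, and a first rising edge at an offset between 0 and one period. All names are illustrative.

```python
COUNTER_MAX = 7  # largest value of a three-bit counter


def edges_in_window(first_edge_ps: int, period_ps: int, window_ps: int) -> int:
    """Number of edges at first_edge_ps + n * period_ps that fall in [0, window_ps)."""
    if window_ps <= first_edge_ps:
        return 0
    return -(-(window_ps - first_edge_ps) // period_ps)  # ceiling division


def count_phase_error(window_ps: int, period_ps: int, phase_ps: int = 0) -> int:
    """Sum of the rising-edge and falling-edge counters, each saturating at 7."""
    rising = min(edges_in_window(phase_ps, period_ps, window_ps), COUNTER_MAX)
    falling = min(edges_in_window(phase_ps + period_ps // 2, period_ps, window_ps),
                  COUNTER_MAX)
    return rising + falling


# Hypothetical numbers: 4000 ps clock period, 9000 ps phase error,
# first rising edge 1000 ps after the start of the window.
print(count_phase_error(9000, 4000, 1000))  # prints 4
```

In this example the rising edges at 1000ps and 5000ps and the falling edges at 3000ps and 7000ps are counted, so the phase error is resolved in steps of half a clock period.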



Figure 6. Clock-gating circuit.



Figure 7. Gated clock pulses.



Figure 8. Step response.

Since the counters are only three bits wide, they will often saturate during the initial phase of an impulse response. To get a faster impulse response and thereby a shorter lock time for the PLL, the output from the counters is multiplied by 4 whenever both counters are saturated. The effect of this is shown in figure 8, where the lower plot shows the improvement in step response.
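The saturation rule can be written down in a few lines. The sketch below is a behavioral model only, with illustrative names. It assumes that a three-bit counter saturates at 7 and that the factor 4 is applied to the sum of the two counters.

```python
COUNTER_MAX = 7  # saturation value of each three-bit counter
BOOST = 4        # gain applied when both counters are saturated


def counter_output(rising_count: int, falling_count: int) -> int:
    """Sum of the two counters, multiplied by 4 if both are saturated."""
    for count in (rising_count, falling_count):
        if not 0 <= count <= COUNTER_MAX:
            raise ValueError("counter value out of three-bit range")
    total = rising_count + falling_count
    if rising_count == COUNTER_MAX and falling_count == COUNTER_MAX:
        total *= BOOST
    return total


print(counter_output(3, 4))  # prints 7
print(counter_output(7, 7))  # prints 56
```

A large phase error thus produces a correction of 56 instead of 14, while small phase errors near lock are passed on unchanged.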

Figure 9 shows the first order recursive filter, which is used for stabilizing the PLL feedback. The signal "W" is the digital control word to the NCO and the signal "Counter" is the sum of the two three-bit counters.



Figure 9. Recursive filter.

Figure 10 is a plot showing the digital control word when phase lock is achieved. The plot of figure 10 is a zoomed-in version of the lower part of figure 8. Since an exact frequency is never found, the control word will oscillate between two values during phase lock. This oscillation gives a phase noise, which is equal to the resolution of the NCO.

Since simulating a PLL is extremely time-consuming, a MATLAB model of the digital PLL is used for the simulations of figures 8 and 10.



# 5. DESIGN OVERVIEW

Figure 11 shows the block structure for the digital PLL. The phase detector controls the two counters, which in this configuration are clocked by an extra oscillator. The sum of the two counters is added to the output of the digital filter. A decoder transforms the output from the digital filter (W) to control signals (C) for the NCO. The output of the NCO is divided by the multiplication factor and then compared in phase to the reference. The signal "UPDATE" from the phase detector is used for clocking the register and for resetting the counters.



Figure 11. Block structure.

# 6. CHIP LAYOUT

Figure 12 shows the complete chip layout of a prototype chip containing the digital PLL, which can be used as an on-chip IP-block. The actual core area for the digital PLL is limited to 0.07mm<sup>2</sup>. The prototype chip has been sent for fabrication.



Figure 12. Chip layout.

# 7. CONCLUSIONS

A prototype of a standard-cell digital PLL clock multiplier is designed using a 0.35µm CMOS process. The digital PLL is designed for a frequency range of 170-360MHz and occupies about 0.07mm<sup>2</sup> of on-chip area. Instead of a charge pump and an analog filter, two three-bit counters and a recursive digital filter are used as a loop filter. A numerically controlled oscillator with high resolution is made from a ring oscillator with additional tri-state gates. The high resolution enables accurate frequency control and low phase noise.

The digital PLL is implemented using cells found in an ordinary standard-cell library, which makes it portable between technologies in a netlist format.

A prototype chip containing the digital PLL has been sent for fabrication.

# 8. REFERENCES

- [1] P. Nilsson and M. Torkelson, "A Monolithic Digital Clock-Generator for On-Chip Clocking of Custom DSP's", *IEEE J. Solid-State Circuits*, vol. 31, no. 5, pp. 700-706, May 1996.
- [2] T. Olsson, P. Nilsson, T. Meincke, A. Hemani and M. Torkelson. "A Digitally Controlled Low-Power Clock Multiplier for Globally Asynchronous Locally Synchronous Designs", In Proceedings of ISCAS'2000, Geneva, May 2000.
- [3] R. E. Best, *Phase-Locked Loops*, McGraw-Hill, 1984.
- [4] J. M. Rabaey, *Digital Integrated Circuits: A Design Perspective*, Prentice Hall, 1996.
- [5] P. Andreani, F. Bigongiari, R. Roncella, R. Saletti and P. Terreni, "A Digitally Controlled Shunt Capacitor CMOS Delay Line", *Analog Circuits and Signal Processing*, Kluwer Academic Publishers, vol. 18, pp. 89-96.
- [6] D. Mijuskovic et al., "Cell-Based Fully Integrated CMOS Frequency Synthesizers", *IEEE J. Solid-State Circuits*, vol. 29, pp. 271-279, March 1994.