# Design and Analysis of a Performance-Optimized CMOS UWB Distributed LNA

Payam Heydari, Senior Member, IEEE

Abstract-In this paper, the systematic design and analysis of a CMOS performance-optimized distributed low-noise amplifier (DLNA) comprising bandwidth-enhanced cascode cells will be presented. Each cascode cell employs an inductor between the common-source and common-gate devices to enhance the bandwidth, while reducing the high-frequency input-referred noise. The noise analysis and optimization of the DLNA accurately accounts for the impact of thermal noise of line terminations and all device noise sources of each CMOS cascode cell including flicker noise, correlated gate-induced noise and channel thermal noise on the overall noise figure. A three-stage performance-optimized wideband DLNA has been designed and fabricated in a 0.18-$\mu$m SiGe process, where only MOS transistors were utilized. Measurements of the test chip show a flat noise figure of 2.9 dB, a forward gain of 8 dB, and input and output return losses below -12 dB and -10 dB, respectively, across the 7.5 GHz UWB band. The circuit exhibits an average IIP3 of -3.55 dBm. The 872 $\mu$m $\times$ 872 $\mu$m DLNA chip consumes 12 mA of current from a 1.8-V DC voltage.

*Index Terms*—CMOS, distributed amplifier, linearity, low-noise amplifier, noise figure, radio-frequency (RF) integrated circuits, SiGe, stochastic analysis, ultra-wideband (UWB).

## I. INTRODUCTION

ULTRA-WIDEBAND (UWB) wireless radio is capable of carrying extremely high data rates over a short distance (e.g., less than 15 meters). The spread spectrum characteristics of wideband wireless systems, and the ability of the UWB wireless receivers to resolve multipath fading, make UWB systems a promising wireless scheme for a variety of high-rate, short- to medium-range wireless communications. Despite the attributes enumerated for UWB wireless radios, the RF front-end, particularly the low-noise amplifier (LNA), entails several design challenges due to stringent requirements. A key building block in the UWB receiver's RF front-end, the UWB LNA must retain good performance (i.e., low noise figure and high gain) across the system's wideband frequency spectrum from 3.1 to 10.6 GHz. Importantly, the same set of design requirements should be satisfied in a UWB LNA design regardless of the type of UWB system (i.e., impulse radio or multiband) being used [1]. In fact, the input signal power at the receiver after the UWB antenna and the pre-filter circuit is too low to allow any pre-processing for appropriate sub-band filtering in a multiband UWB receiver utilizing all available sub-bands. Even if such signal processing were possible, each sub-band would require a distinct LNA circuit, which leads to a bank of LNAs in the receiver. Such a design solution is, however, inefficient from both chip area and performance perspectives, demanding alternative circuit design techniques. The authors of [2] and [3] independently designed the first lumped LNA circuits for the UWB radio using a cascode circuit and high-order wideband bandpass filters (BPFs) to provide wideband input matching. The noise figures (NFs) reported in [2] and [3] were not flat across the 7.5-GHz bandwidth, and the minimum NFs obtained in these works were 4 dB and 2.5 dB (in bipolar technology), respectively. The in-band NF of the LNA in [2] increases to as much as 8 dB. A CMOS common-gate (CG) amplifier, which provides a wideband input match with good reverse isolation and inherent stability, has been used in [4] to design a UWB LNA circuit. However, the NF of the CG LNA is considerably larger than that of the CMOS common-source or cascode LNAs. Previously employed in common-source LNAs in [5] and [6], the $g_m$-boosting technique was proposed by [7] to improve the NF performance of a UWB CG LNA.

The author is with the Department of Electrical Engineering, University of California, Irvine, CA 92697 USA (e-mail: payam@uci.edu).

Digital Object Identifier 10.1109/JSSC.2007.903046

On the other hand, recent advances in high-speed integrated circuits and continuous scaling of minimum feature sizes of silicon-based devices have increased the interest in on-chip implementation of transmission lines (TLs), which are key components of broadband distributed circuits. An important concern regarding distributed topologies is, however, higher power dissipation and larger chip area compared to lumped circuits. Both the power dissipation and the area of a distributed circuit increase with the number of stages, suggesting a compromise between power dissipation and gain-bandwidth product (GBW). Despite consuming more power than the conventional lumped circuits, the distributed architectures are highly amenable to technology scaling, which makes them a topology of choice for future developments of silicon-based millimeter-wave (MMW) broadband ICs.

Silicon-based distributed circuits have gained considerable attention during the past decade. Inspired by Beyer's work in [8], Kleveland *et al.* presented a CMOS distributed amplifier (DA) and distributed ring oscillator [9]. Ref. [10] presented the design of a conventional DA and utilized a simulated-annealing-based optimization methodology to optimize the design performance. Refs. [11] and [12] used the differential and conventional DA topologies, respectively, and fabricated those circuits in advanced CMOS technologies to achieve better performance. Ref. [13] presented the noise analysis of the distributed amplifier, which was utilized later by [14] to design and analyze a low-power distributed LNA circuit. Despite providing a useful approach for the high-frequency noise analysis of the DA, [13] (and [14]), however, suffers from an analytical misconception, which will be explained in detail in Section III-A1. Briefly speaking, [13] first calculated the Fourier transform of the noise current (and not the Fourier transform of its autocorrelation) at the load termination, which itself is a nonstationary random process [15], while omitting the *partial* correlation between the gate-induced and channel thermal noise. The squared magnitude of this Fourier transform was then taken to be the power spectral density (PSD) of the noise.

Manuscript received August 14, 2006; revised March 29, 2007. This work was supported in part by a National Science Foundation (NSF) CAREER Award under Contract ECS-0449433 and by Intel Corporation through a UC-Micro grant. Equipment was provided by Broadcom Corporation. Chip fabrication was provided by Jazz Semiconductor.

Fig. 1. The block diagram of a distributed circuit incorporating (a) actual CPWs, or (b) artificial LC circuits.

This paper presents the analysis and design of a performance-optimized CMOS distributed LNA (DLNA) incorporating bandwidth-enhanced cascode cells. A brief summary of the design methodology of this DLNA first appeared in [16]. The DLNA's noise analysis takes into account the impact of thermal noise of line terminations and all existing device noise sources of each cascode cell including flicker noise, correlated gate-induced noise and channel thermal noise on the overall noise figure. The proposed stochastic modeling of noise can easily be extended to any other DA topology. As will be explained in detail, the proposed LNA achieves a lower flat NF over a wider bandwidth than lumped implementations presented in [2]–[4]. It is noteworthy that the design of the prefilter preceding the wideband LNA in the receiver chain, which is used to filter out-of-band frequencies below 3.5 GHz and to reduce strong interference due to the 5 GHz UNII and ISM bands, is beyond the scope of this paper.

The remainder of this paper is organized as follows. Section II gives a brief overview of distributed circuits. Section III describes the circuit topology and a method to calculate the bandwidth-enhancing inductors. This section presents the noise analysis and performance optimization methodology for the proposed DLNA, by first giving a brief overview of the current state of knowledge. Section IV provides measurement results of the fabricated DLNA. Finally, Section V presents conclusions of this paper.

## II. BACKGROUND: DISTRIBUTED CIRCUITS

The distributed topology incorporating transmission lines (TLs) was originally proposed by Ginzton *et al.* [17]. Insufficient technological capability to design area-efficient distributed circuits delayed the usability of these circuits for a long time. They reappeared in the 1980s using a variety of processes, such as GaAs or other III-V technologies, and recently in CMOS process. Examples include distributed amplifiers [9]–[11], distributed mixers [18], and distributed oscillators [9], [19]. The renewed interest in distributed circuits is mainly due to the capability of designing on-chip TLs, and high-Q inductors.

Fig. 1 shows the general block diagram of a DA comprising TLs and gain stages distributed along the TLs, where each gain stage can simply be a common-source (or common-emitter in bipolar technology) stage. The TLs can be realized using either coplanar waveguides [see Fig. 1(a)] or cascaded *LC* circuits [see Fig. 1(b)].

As a fundamental property, integrated circuits incorporating on-chip TLs trade delay for bandwidth [8], [20]. In frequency domain, the transistor's parasitic capacitances are absorbed into the constants of the TL [20], as also demonstrated in Fig. 1(a) and (b). Hence, the circuit bandwidth is set by the cutoff frequency of the TLs.

The design of silicon-based distributed integrated circuits is a topic of active research (for example, see [12], [21], [22]).

## III. CMOS PERFORMANCE-OPTIMIZED DLNA

The LNA is a key building block in a UWB wireless receiver. Challenges in UWB LNA design involve achieving 1) an NF of around 3.5 dB [23], 2) a relatively flat gain of at least 6 dB [2], 3) a minimum reverse isolation of -20 dB [2], and 4) good linearity (e.g., IIP3 > -8 dBm, as specified in [23]).

The LNA in this work is based on distributed circuit topology. In addition to the attributes enumerated in Section II, distributed circuits are capable of providing an inherent *wideband input/output matching*. This property is particularly useful in UWB RFIC design.

In a conventional CMOS DA, where each cell only employs a common-source transistor, the input-output coupling through the overlap gate-drain capacitance of each transistor causes the real part of the DA's propagation constant to become negative, resulting in the amplitude growth of the output waveform at the far-end load termination. The conventional DA is thus potentially unstable. In addition, any voltage/current variation in either the gate or drain TL's terminations will be coupled to the other TL through $C_{\rm GD}$ of the common-source transistor. A DA with cascode cells can mitigate these deleterious effects [16], [20], [21]. However, the common-gate transistors of each cascode cell begin to contribute significant noise to the output at high frequencies, thereby degrading the circuit's NF.



Fig. 2. Circuit schematic of the proposed N-stage distributed LNA (N = 3 in our design).

Fig. 2 shows the schematic of the proposed N-stage UWB DLNA comprising uniform gate and drain artificial LC TLs and identical cascode cells. Each cell employs a cascode configuration to guarantee stability across the entire bandwidth by providing isolation between the cell's input and output terminals. The interstage inductors of the gate (drain) TL along with the gate (drain) parasitic capacitances of transistors $M_{ak1}$ ($M_{ak2}$), $1 \le k \le N$, constitute cascaded LC ladder circuits with characteristic impedance of $Z_G = \sqrt{L_G/C_{i,cs}}$ ($Z_D = \sqrt{L_D/C_{o,cg}}$), where $C_{i,cs}$ is the input capacitance of the common-source stage and $C_{o,cg}$ is the output capacitance of the common-gate stage within each cascode cell. Both $Z_G$ and $Z_D$ stay constant over a wide range of frequencies. In this design, both $Z_G$ and $Z_D$ are chosen to match the 50 $\Omega$ source/load resistances.
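As a rough illustration of the relation $Z = \sqrt{L/C}$ used for the gate and drain lines, the sketch below sizes the per-section inductance for a 50 $\Omega$ line and estimates the ladder cutoff frequency. The capacitance values are hypothetical placeholders, not figures from this design, and the cutoff expression $f_c = 1/(\pi\sqrt{LC})$ is the textbook result for an ideal lossless constant-k ladder, assumed here for illustration only.

```python
import math

def ladder_inductance(c_shunt, z0=50.0):
    """Series inductance per section so that sqrt(L / C) equals z0."""
    return z0 ** 2 * c_shunt

def ladder_cutoff_hz(l_series, c_shunt):
    """Cutoff of an ideal constant-k LC ladder: f_c = 1 / (pi * sqrt(L * C))."""
    return 1.0 / (math.pi * math.sqrt(l_series * c_shunt))

# Hypothetical per-cell capacitances (assumed, not taken from the paper).
c_i_cs = 200e-15  # input capacitance of the common-source stage, in farads
c_o_cg = 120e-15  # output capacitance of the common-gate stage, in farads

l_g = ladder_inductance(c_i_cs)  # gate-line inductor per section
l_d = ladder_inductance(c_o_cg)  # drain-line inductor per section

print(f"L_G = {l_g * 1e9:.3f} nH, gate-line cutoff = {ladder_cutoff_hz(l_g, c_i_cs) / 1e9:.1f} GHz")
print(f"L_D = {l_d * 1e9:.3f} nH, drain-line cutoff = {ladder_cutoff_hz(l_d, c_o_cg) / 1e9:.1f} GHz")
```

With a larger input than output capacitance, the same 50 $\Omega$ target yields a larger gate-line inductor than drain-line inductor.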

As indicated in Fig. 2, each cascode cell incorporates an inductor $L_{Ck}$, $1 \le k \le N$, for the following reason: recall that the gate and drain TLs boost the BW by absorbing the input and output parasitic capacitances of each cell. These TLs do not, however, affect the frequency roll-off due to the large parasitic capacitance seen at the internal node of a conventional cascode cell, where the drain of the common-source transistor is connected directly to the source of the common-gate transistor. Moreover, the input-referred noise of each cascode cell in the absence of this BW-enhancing inductor may rise considerably at high frequencies, because the internal node's parasitic capacitances will lower the equivalent impedance seen at this node to ground. The above problems are alleviated by using inductors $L_{Ck}$, $1 \le k \le N$. The proposed DLNA topology is based on a uniform distributed architecture; therefore, $L_{Ck} = L_{Cr} = L_C$ for all $k \neq r$.

In the absence of $L_C$, the circuit bandwidth is primarily limited by the pole associated with the internal node of the cascode cells, whose value is $p_{cascode} = [g_{m,cg}^{-1}(C_{i,cg} + C_{o,cs})]^{-1}$, where $C_{o,cs}$ is the output capacitance of the common-source transistor, $C_{i,cg}$ is the input capacitance of the common-gate transistor, and $g_{m,cg}$ is the transconductance of the common-gate transistor in each cascode cell (cf. Fig. 2). The inductance $L_C$ that leads to less than 10% of ripple in the passband and a maximum increase of bandwidth, along with the resulting boosted bandwidth, is determined using the following analysis.

Fig. 3(a) and (b) show the AC equivalent and high-frequency small-signal model of the kth cascode cell with BW-enhancing inductor $L_C$, seen from the internal node of the cascode cell. The high-frequency model of Fig. 3(b) is used to obtain the transfer function $V_{dk}(s)/V_{gk}(s)$.

 $L_C$  makes the equivalent impedance  $Z_{o,cs}$ , seen looking up from  $V_{dk}$  and expressed as  $Z_{o,cs}(s) = (L_C C_{i,cg} s^2 + g_{m,cg} L_C s + 1)/(g_{m,cg} + C_{i,cg} s)$ , behave inductively at high frequencies. This impedance effectively determines the *series* resonant frequency  $\omega_{n,z} = (L_C C_{i,cg})^{-1/2}$  of the transfer function  $V_{dk}(s)/V_{gk}(s)$  of the kth cell, and is in parallel with the output impedance of common-source transistor  $M_{ak1}$  which is capacitive. Using the circuit model of Fig. 3(b), the transfer function  $V_{dk}(s)/V_{gk}(s)$  of the kth cell is readily obtained as

$$\frac{V_{dk}}{V_{gk}}(s) = \frac{g_{m,cs}g_{m,cg}^{-1}\left(\frac{s^2}{\omega_{n,z}^2} + \frac{2\zeta_z}{\omega_{n,z}}s + 1\right)}{\frac{s^2}{\omega_{n,p}^2} + \frac{2\zeta_p}{\omega_{n,p}}s + 1} = \frac{g_{m,cs}g_{m,cg}^{-1}(L_C C_{i,cg}s^2 + g_{m,cg}L_C s + 1)}{(g_{m,cg}^{-1}C_{i,cg}s + 1)L_C C_{o,cs}s^2 + g_{m,cg}^{-1}(C_{i,cg} + C_{o,cs})s + 1},$$
for  $1 \le k \le N$ . (1)
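The closed form in (1) can be checked numerically against its circuit origin: the common-source transconductance current driving $Z_{o,cs}(s)$ in parallel with the capacitance $C_{o,cs}$. The sketch below does this at a few frequencies. All element values are hypothetical, and the inverting sign of the common-source stage is omitted, as it is in (1).

```python
import math

def z_o_cs(s, l_c, c_i_cg, gm_cg):
    """Impedance seen looking up from V_dk: L_C in series with (1/gm_cg || C_i,cg)."""
    return (l_c * c_i_cg * s ** 2 + gm_cg * l_c * s + 1) / (gm_cg + c_i_cg * s)

def gain_closed_form(s, gm_cs, gm_cg, l_c, c_i_cg, c_o_cs):
    """Right-hand side of (1)."""
    num = (gm_cs / gm_cg) * (l_c * c_i_cg * s ** 2 + gm_cg * l_c * s + 1)
    den = ((c_i_cg * s / gm_cg + 1) * l_c * c_o_cs * s ** 2
           + (c_i_cg + c_o_cs) * s / gm_cg + 1)
    return num / den

def gain_from_impedances(s, gm_cs, gm_cg, l_c, c_i_cg, c_o_cs):
    """gm_cs times (Z_o,cs || 1/(s*C_o,cs)), built directly from the small-signal model."""
    z_up = z_o_cs(s, l_c, c_i_cg, gm_cg)
    z_cap = 1 / (s * c_o_cs)
    return gm_cs * z_up * z_cap / (z_up + z_cap)

# Hypothetical element values (assumed for illustration only).
params = dict(gm_cs=0.03, gm_cg=0.03, l_c=1e-9, c_i_cg=400e-15, c_o_cs=250e-15)

max_rel_err = 0.0
for f_ghz in (1.0, 3.1, 7.0, 10.6, 20.0):
    s = 2j * math.pi * f_ghz * 1e9
    a = gain_closed_form(s, **params)
    b = gain_from_impedances(s, **params)
    max_rel_err = max(max_rel_err, abs(a - b) / abs(b))
    print(f"{f_ghz:5.1f} GHz: |V_dk/V_gk| = {abs(a):.4f}")

print(f"largest relative mismatch between the two forms: {max_rel_err:.2e}")
```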

In the absence of  $C_{i,cg}$ , the parallel resonant frequency of the transfer function  $V_{dk}(s)/V_{gk}(s)$  should have been

$$\omega_{n,p}^{\text{(unloaded)}} = (L_C C_{o,cs})^{-1/2}.$$

Fig. 3. (a) AC equivalent of the BW-enhanced cascode cell. (b) Small-signal model.

$C_{i,cg}$, however, lowers the parallel resonant frequency down to $(1 + g_{m,cg}^{-2}C_{i,cg}^2\omega^2)^{-1/2}(L_C C_{o,cs})^{-1/2}$, which is smaller than $(L_C C_{o,cs})^{-1/2}$. This *loaded* resonant frequency is, therefore, frequency-dependent. Because the goal is to obtain $L_C$ so as to maximize the -3 dB bandwidth $\omega_{-3dB}$, the frequency offset $\sqrt{1 + g_{m,cg}^{-2}C_{i,cg}^2\omega^2}$ of the loaded resonant frequency is evaluated at frequencies close to $\omega_{-3dB}$. The parallel resonant frequency $\omega_{n,p}$ thus approximately becomes

$$\omega_{n,p}^2 \approx \frac{1}{(L_C C_{o,cs})} \cdot \frac{1}{\sqrt{1 + g_{m,cg}^{-2} C_{i,cg}^2 \omega_{-3dB}^2}}.$$
 (2)

To increase the bandwidth while avoiding large frequency peaking, the transfer function  $V_{dk}(s)/V_{gk}(s)$  should hold specific characteristics including the following.

- 1) The numerator of (1) should be in the form of a maximally flat polynomial, implying that the damping factor  $\zeta_z$  is  $1/\sqrt{2}$  (see Fig. 4).
- 2) The denominator of (1) should exhibit small peaking in frequency domain, which leads to additional BW increase. A damping factor of 1/2 (i.e., $\zeta_p = 0.5$) results in a peaking of 1.25 dB. Additionally, the parallel resonant frequency $\omega_{n,p}$ becomes equal to the 0-dB frequency, where the magnitude response of the transfer function crosses the 0 dB axis after experiencing 1.25 dB peaking (see Fig. 4).
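The two damping-factor choices above can be checked with a few lines of arithmetic on the normalized second-order factor $1 - w^2 + j2\zeta w$, where $w = \omega/\omega_n$. The sketch below is a generic second-order calculation, not the full transfer function (1): it confirms that $\zeta_z = 1/\sqrt{2}$ gives the maximally flat magnitude $\sqrt{1 + w^4}$, and that $\zeta_p = 0.5$ gives about 1.25 dB of peaking with a 0-dB crossing at $w = 1$.

```python
import math

def quad_mag(w, zeta):
    """Magnitude of 1 - w^2 + j*2*zeta*w, with w normalized to the natural frequency."""
    return abs(complex(1.0 - w * w, 2.0 * zeta * w))

def peaking_db(zeta):
    """Peak of 1/quad_mag over w, in dB; valid for zeta < 1/sqrt(2)."""
    return 20.0 * math.log10(1.0 / (2.0 * zeta * math.sqrt(1.0 - zeta ** 2)))

zeta_z = 1.0 / math.sqrt(2.0)  # numerator damping factor (guideline 1)
zeta_p = 0.5                   # denominator damping factor (guideline 2)

# Guideline 1: the numerator magnitude squared reduces to 1 + w^4.
for w in (0.1, 0.5, 1.0, 2.0):
    print(f"w = {w}: |num|^2 = {quad_mag(w, zeta_z) ** 2:.6f}, 1 + w^4 = {1 + w ** 4:.6f}")

# Guideline 2: peaking of the denominator response and its value at w = 1.
w_peak = math.sqrt(1.0 - 2.0 * zeta_p ** 2)
peak = peaking_db(zeta_p)
mag_at_wn_db = -20.0 * math.log10(quad_mag(1.0, zeta_p))
print(f"peak of {peak:.3f} dB at w = {w_peak:.4f}; response at w = 1 is {mag_at_wn_db:.3f} dB")
```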

By choosing  $\omega_{n,p} = \omega_{n,z}$ , the 0-dB cutoff frequency of the transfer function  $V_{dk}(s)/V_{gk}(s)$  is boosted to  $\omega_{n,p}$ . Moreover, it results in a frequency peaking of less than 10%, as also shown in Fig. 4. This criterion along with the above design guidelines 1 and 2 provide sufficient information to calculate the inductance  $L_C$  and the new 3-dB bandwidth as follows:

$$L_C = \sqrt{2}g_{m,cg}^{-2}(C_{i,cg} + C_{o,cs})$$
(3)

$$\omega_{-3dB} = \frac{g_{m,cg}\sqrt{(C_{i,cg}/C_{o,cs})^2 - 1}}{C_{i,cg}}.$$
 (4)
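For a quick sense of scale, the sketch below evaluates the cascode pole without $L_C$ together with (3) and (4) exactly as printed. The transconductance and capacitance values are hypothetical and chosen only for illustration; note that (4) yields a real bandwidth only when $C_{i,cg} > C_{o,cs}$, which the helper checks explicitly.

```python
import math

def cascode_pole(gm_cg, c_i_cg, c_o_cs):
    """Internal-node pole without L_C, in rad/s: gm_cg / (C_i,cg + C_o,cs)."""
    return gm_cg / (c_i_cg + c_o_cs)

def bw_inductor(gm_cg, c_i_cg, c_o_cs):
    """Inductance from (3): sqrt(2) * (C_i,cg + C_o,cs) / gm_cg^2, in henries."""
    return math.sqrt(2.0) * (c_i_cg + c_o_cs) / gm_cg ** 2

def boosted_bw(gm_cg, c_i_cg, c_o_cs):
    """Bandwidth from (4), in rad/s; requires C_i,cg > C_o,cs."""
    ratio = c_i_cg / c_o_cs
    if ratio <= 1.0:
        raise ValueError("(4) is real only when C_i,cg exceeds C_o,cs")
    return gm_cg * math.sqrt(ratio ** 2 - 1.0) / c_i_cg

# Hypothetical device values (assumed, not taken from the paper).
gm_cg = 0.03       # siemens
c_i_cg = 400e-15   # farads
c_o_cs = 250e-15   # farads

pole = cascode_pole(gm_cg, c_i_cg, c_o_cs)
l_c = bw_inductor(gm_cg, c_i_cg, c_o_cs)
bw = boosted_bw(gm_cg, c_i_cg, c_o_cs)

two_pi = 2.0 * math.pi
print(f"pole without L_C: {pole / two_pi / 1e9:.2f} GHz")
print(f"L_C from (3):     {l_c * 1e9:.3f} nH")
print(f"bandwidth (4):    {bw / two_pi / 1e9:.2f} GHz")
```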

The bias for cascode transistors in all constituent cells is provided by a single current mirror, as shown in Fig. 2. The artificial *LC* gate line provides the wideband input impedance matching, thereby obviating the need for inductive degeneration for each cascode cell of the DLNA circuit.

The spiral inductors with Q-factors of 10 at 10 GHz are designed to realize interstage delay lines because they exhibit a larger inductance per unit length than CPWs or microstrip lines at the UWB frequency range and also keep the circuit floorplan from spreading too much in one dimension. TL inductors are designed such that the same characteristic impedance of 50 $\Omega$ is obtained at each tap-point of the gate and drain lines so as to maximize the power transfer toward the load termination. The gate line's inductor $L_G$ is larger than the drain line's inductor $L_D$, because the input capacitance is larger than the output capacitance of each cell. To verify the bandwidth improvement, the DLNA with and without the inductor $L_C$ was simulated. As will be extensively discussed in Section III-A, a three-stage circuit will result in minimum NF. The simulation result is demonstrated in Fig. 5, showing approximately 3.5 GHz bandwidth improvement.

Fig. 4. (1) Normalized magnitude response without BW-enhancing inductor. (2) Normalized magnitude response with BW-enhancing inductor. (3) Numerator polynomial. (4) Denominator of the transfer function.

The circuit's NF is a function of the load terminations, parasitic capacitances of the cascode stage, the propagation constants of the *LC* TLs, and the number of stages. A comprehensive noise figure analysis of the DLNA will be provided in Section III-A. It intends to address specific issues arising from the analysis presented in [13] by calculating the PSD of noise in the DLNA more accurately.

Fig. 5. Simulation results of a conventional cascode amplifier, a three-stage DLNA without $L_C$, and a three-stage DLNA with $L_C$.

### A. Noise Analysis

The dominant intrinsic noise sources in the DLNA are: 1) thermal noise from the input source impedance ($R_S = Z_G$; $Z_G$ is the gate line's characteristic impedance defined earlier), 2) thermal noise from the gate and drain terminations, and 3) dominant noise sources associated with each MOS transistor including the channel thermal noise, gate-induced noise, and flicker noise. Despite the fact that flicker noise presents negligible impact on a high-frequency LNA circuit, for the sake of completeness, its contribution to the overall NF will be accounted for. The distributed structure of the DLNA provides several paths for any given signal/noise source in the circuit. Depending on the traveling direction of the wave toward the far-end termination, wave propagation falls into two classes, namely forward and backward propagation. For the same input and output matching impedances, the in-band forward power gain from the input terminal to the output is maximized when drain and gate TLs have the same propagation constants (i.e., $\beta_d = \beta_g = \beta$). This maximum forward power gain $G_p^{(F)}$ is expressed as (see [13] for more details)

$$G_p^{(F)} = \frac{N^2 g_{m,cs}^2 Z_D Z_G}{4}.$$
 (5)

The backward power gain  $G_p^{(R)}$  at the near-end drain termination is expressed as [13]

$$G_p^{(R)} = \frac{g_{m,cs}^2 Z_D Z_G}{4} \left(\frac{\sin N\beta}{\sin \beta}\right)^2.$$
 (6)
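To make the difference between (5) and (6) concrete, the sketch below evaluates both gains for a three-stage amplifier over a few values of the per-section phase $\beta$. The transconductance is a hypothetical value, and the limit of $(\sin N\beta/\sin\beta)^2$ as $\sin\beta \to 0$ is handled explicitly as $N^2$.

```python
import math

def forward_gain(n, gm_cs, z_d=50.0, z_g=50.0):
    """Forward power gain from (5)."""
    return n ** 2 * gm_cs ** 2 * z_d * z_g / 4.0

def array_factor_sq(n, beta):
    """(sin(N*beta) / sin(beta))^2, using the limit N^2 where sin(beta) vanishes."""
    s = math.sin(beta)
    if abs(s) < 1e-12:
        return float(n ** 2)
    return (math.sin(n * beta) / s) ** 2

def backward_gain(n, gm_cs, beta, z_d=50.0, z_g=50.0):
    """Backward power gain from (6)."""
    return gm_cs ** 2 * z_d * z_g / 4.0 * array_factor_sq(n, beta)

n_stages = 3
gm_cs = 0.03  # hypothetical transconductance in siemens (assumed, not from the paper)

g_fwd = forward_gain(n_stages, gm_cs)
print(f"forward gain: {10 * math.log10(g_fwd):.2f} dB")
for beta in (0.0, math.pi / 6, math.pi / 4, math.pi / 2):
    g_bwd = backward_gain(n_stages, gm_cs, beta)
    print(f"beta = {beta:.3f} rad: backward gain = {g_bwd:.4f} (linear)")
```

The forward gain grows as $N^2$ at every in-band frequency, while the backward gain matches it only in the limit $\beta \to 0$ and is smaller over most of the band.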

To better clarify the forward and backward propagation phenomena, consider Fig. 6 showing the block diagram of a four-stage DA with a test current source applied to the input tap of the third cell. This figure clearly demonstrates backward and forward propagations of the wave, generated by $I_{\text{test}}$, toward the load termination.

For convenience, MOS transistors and gate/drain inductors are assumed to be lossless. The use of the inductance $L_C$ in (3) allows us to keep the source-terminal impedance of each common-gate transistor large across the UWB frequency range. Therefore, the noise contribution of common-gate transistors $M_{ak2}$, $1 \le k \le N$, can be neglected. The measurement results in Section IV indeed verify the accuracy of this observation. The voltage across the input capacitance of each cascode cell is amplified into the small-signal current $g_{m,cs}V_{gk}$ for $1 \le k \le N$, and the current from each cell flows in both directions with a phase constant $\beta_d = \beta_g = \beta$ per *LC* section of the drain TL (cf. Figs. 7 and 8). The noise analysis, described in the following, accounts for the impact of high-frequency gate-induced noise, and therefore, is an extension of [18]. It is based on a rigorous stochastic modeling with some similarities to the approach presented in [13]. Section III-A1 briefly overviews basic concepts of the stationary random process and the procedure introduced in [13] for noise analysis.

Fig. 6. Block diagram schematic of a four-stage DA with a test current source demonstrating the backward and forward propagations.

Fig. 7. Forward propagation of dominant device noise sources of the kth cell of the DLNA.

1) Background and Current State of Knowledge: Device noise sources in electronic circuits are implicitly assumed to fall in the class of wide-sense stationary (WSS) processes [15]. For a WSS random process x(t), the first-order (i.e., mean) statistical average is time-invariant, and the second-order (i.e., autocorrelation function) statistical average at time values $t_1$ and $t_2$, defined as $\Phi_x(t_1, t_2) = \overline{x(t_1)x(t_2)}$, depends only on the difference $t_1 - t_2$. Consequently, it only needs to be indexed by one variable rather than two, i.e., $\Phi_x(t_1, t_2) = \Phi_x(t_1 - t_2)$ (see [15]). Most importantly, the Fourier transform of the autocorrelation of a WSS process, widely known as the power spectral density (PSD), is a deterministic function whose integral is the average power of noise. On the other hand, the Fourier transform $X(j\omega)$ of the noise x(t) is defined as $X(j\omega) = \int_t x(t) \exp(-j\omega t) dt$ [15]. In contrast to deterministic signals, the Fourier transform of a random process does not carry useful insight with practical implications, as it is a random process by itself.

Fig. 8. Backward propagation of dominant MOSFET noise sources of the kth cell of the DLNA.

In an original work presented in [13], the noise figure of the conventional DA, where each cell is simply a common-source transistor, was calculated. The noise sources that were taken into account in the analysis were channel thermal noise and gate-induced noise of transistors and thermal noise of source and load resistive terminations. For the sake of argument, the analytical procedure in [13] is summarized:

- 1) The output noise contribution of the *r*th stage in an *N*-stage distributed amplifier is calculated. In doing so:
  - a) It calculates the Fourier transform of the output noise current due to forward and backward amplifications of noise generators of the *r*th stage.
  - b) It calculates the magnitude square of the Fourier transform of the total current in the load termination due to the *r*th section by combining currents due to forward and backward amplifications, vectorially.
  - c) It assumes that the magnitude square of Fourier transform of the total current obtained in step b is equal to the PSD of the noise current, i.e.,  $S_{Id}(\omega) = |I_d(\omega)|^2$ where  $S_{Id}(\omega)$  and  $I_d(\omega)$  denote the PSD and the Fourier transform of the noise current  $I_d$ , respectively. This is false, as  $|I_d(\omega)|^2$  is a random process itself, and cannot be equal to the PSD of noise. In fact, a theorem, proved in [15] and restated in the following, clearly specifies the relationship between a random process and its Fourier transform:

Theorem 1 ([15, p. 515]): Suppose that x(t) is a stationary random process with autocorrelation $\Phi_x(t_1 - t_2)$ and PSD $S_x(\omega)$. The Fourier transform of x(t), $X(\omega)$, is a nonstationary white random process with autocorrelation expressed as $\overline{X(\omega)X^*(\omega')} = 2\pi S_x(\omega)\delta(\omega - \omega')$, where $\delta(.)$ is the delta function. Consequently, it is $\int_0^{\omega} \overline{X(\omega)X^*(\omega')}d\omega'$, and not $|X(\omega)|^2$ (which is a random process), that is equal to $2\pi S_x(\omega)$.

More importantly, [13] ignores the partial correlation between the gate-induced and thermal noise sources.

- 2) Finally, the noise contributions from all N stages are obtained by adding all the noise contributions for all r values from 1 to N.

We address the above problems by developing an analytical approach based on calculation of the autocorrelation of the DLNA's output noise. Considering that the properties of Fourier transforms for deterministic signals also hold for random signals, we will first calculate the Fourier transform of the noise current due to forward and backward amplifications. Additionally, we take into account the frequency response of each cell. We will then calculate the autocorrelation functions of the output noise at the load termination. The PSD of noise will then be obtained by taking the Fourier transform of the autocorrelation functions for the DLNA circuit of Fig. 2. This approach will be illustrated in detail in Section III-A2.
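The distinction between $|X(\omega)|^2$ of a single noise record and the PSD can be illustrated with a small Monte Carlo experiment. The sketch below is a discrete-time analogy using only the Python standard library, not the DLNA analysis itself: for unit-variance white Gaussian noise, the normalized squared DFT magnitude at one bin fluctuates from record to record by about as much as its mean, while its ensemble average approaches the flat PSD level of 1. The record length, trial count, bin index, and seed are arbitrary choices.

```python
import cmath
import math
import random

def squared_dft_bin(samples, twiddles):
    """|X[k]|^2 / N for one record, using precomputed exp(-j*2*pi*k*i/N) factors."""
    x_k = sum(x * w for x, w in zip(samples, twiddles))
    return abs(x_k) ** 2 / len(samples)

random.seed(1)
n_samples, n_trials, k_bin = 64, 2000, 5
sigma = 1.0  # white-noise standard deviation; the PSD level is sigma^2

twiddles = [cmath.exp(-2j * math.pi * k_bin * i / n_samples) for i in range(n_samples)]

values = []
for _ in range(n_trials):
    record = [random.gauss(0.0, sigma) for _ in range(n_samples)]
    values.append(squared_dft_bin(record, twiddles))

ensemble_mean = sum(values) / n_trials
spread = math.sqrt(sum((v - ensemble_mean) ** 2 for v in values) / n_trials)

print(f"first three single-record values: {[round(v, 3) for v in values[:3]]}")
print(f"ensemble average: {ensemble_mean:.3f} (PSD level is {sigma ** 2:.3f})")
print(f"record-to-record standard deviation: {spread:.3f}")
```

A single record therefore says little about the PSD at a given frequency; only the statistical average does, which is why the analysis proceeds through the autocorrelation function.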

2) Noise Contribution of MOS Transistors: Figs. 7 and 8 demonstrate the forward and backward propagations of dominant noise sources of the *k*th cell, respectively. To perform the noise analysis of partially correlated channel thermal noise  $I_{d,k}$  and gate-induced noise  $I_{g,k}$  of the *k*th stage, the gate-induced noise is first decomposed into its correlated and uncorrelated components [20], [24], [25], i.e.,

$$\begin{split} \overline{I_{g,k}^2} &= \overline{I_{g,u_k}^2} + \overline{I_{g,c_k}^2} \\ &= 4k_B T \delta g_{g,k} \left(1 - |c|^2\right) + 4k_B T \delta g_{g,k} |c|^2 \quad \text{for } 1 \le k \le N \end{split}$$
(7)

where $k_B$ is Boltzmann's constant ($1.38065 \times 10^{-23}$ J/K), $T$ is the absolute temperature, $g_{g,k} = \zeta \omega^2 C_{GS,k}^2 / g_{m,k}$ for $1 \leq k \leq N$, $\delta$ is a technology-dependent constant, and $c$ is the correlation coefficient [defined as $c = \overline{I_{g,k}I_{d,k}^*} / (\overline{I_{g,k}^2}\,\overline{I_{d,k}^2})^{1/2}$] whose value for long-channel devices is approximately $j0.395$ [20], [24]. Moreover, $g_{m,k} = g_{m,cs_k}$ for $1 \leq k \leq N$.
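The split in (7) is simple to evaluate numerically. The sketch below uses $|c| = 0.395$ from the text; the temperature, the value $\delta = 4/3$ (a commonly quoted long-channel figure), and the gate conductance $g_g$ are assumptions made only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def gate_noise_split(g_g, temperature=290.0, delta=4.0 / 3.0, c_mag=0.395):
    """Return (uncorrelated, correlated) gate-induced noise PSDs in A^2/Hz, per (7)."""
    total = 4.0 * K_B * temperature * delta * g_g
    correlated = total * c_mag ** 2
    uncorrelated = total * (1.0 - c_mag ** 2)
    return uncorrelated, correlated

g_g = 1e-4  # hypothetical gate conductance in siemens (assumed)
uncorr, corr = gate_noise_split(g_g)
fraction_correlated = corr / (uncorr + corr)

print(f"uncorrelated part: {uncorr:.3e} A^2/Hz")
print(f"correlated part:   {corr:.3e} A^2/Hz")
print(f"correlated fraction |c|^2: {fraction_correlated:.4f}")
```

With $|c| = 0.395$, only about 16% of the gate-induced noise power is correlated with the channel thermal noise, yet this part is what produces the cross terms in the output noise.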

All the cells distributed along constituent gate and drain TLs of the DLNA in Fig. 2 are contributors to the output noise power as well as the overall noise figure. Similar to the approach presented in [13] and summarized in the previous subsection, the noise contribution of MOSFETs of the kth stage to the output is calculated by accounting for both forward and backward propagations of these noise sources. Because of nonzero correlation between correlated noise sources, the overall average power of additive combination of these noise sources is not equal to sum of the average powers of individual noise sources [15]. This notion will be taken into consideration during the forthcoming noise calculations.

In calculating the noise contribution of MOSFETs, the TLs are assumed to have identical propagation constants. The DLNA's power gain with the same input and output matching impedances will be maximized if the LC TLs have identical propagation constants [8].

First, the forward amplification of noise sources associated with the kth cell is studied. Besides widening the BW, the inductor  $L_C$  reduces the noise contribution of the cascode transistor  $M_{ak2}$  of the kth cell in Fig. 2. The dominant noise sources are, therefore, the gate-induced noise, channel thermal noise, and low-frequency flicker noise of the common-source transistors  $M_{ak1}$ . Fig. 7 shows the forward amplification of dominant noise sources of the kth cell through the signal paths of this cell and N - k + 1 cells. Using Fig. 7, the Fourier transform of the output noise current due to MOSFET noise sources associated with the kth cell and their forward-propagated replicas is

$$I_{o,k}^{(F)}(\omega) = \frac{1}{2} \left[ I_{d,k}(\omega) + (N-k+1) \left( I_{g,ck}(\omega) + I_{g,uk}(\omega) + I_{1/f,k}(\omega) \right) H_{g,k}(\omega) \right] e^{-j(N-k+1/2)\beta}$$
(8)

where $I_{o,k}^{(F)}(\omega)$ denotes the Fourier transform of the output noise current due to forward amplification of MOSFET noise sources of the kth cell. $I_{d,k}$ and $I_{1/f,k}$ represent the Fourier transforms of the channel thermal noise and flicker noise currents of $M_{ak1}$, respectively. $I_{g,ck}$ and $I_{g,uk}$ are the Fourier transforms of the correlated and uncorrelated components of the gate-induced noise current of $M_{ak1}$, respectively. $H_{g,k}(\omega)$ is the input-output transfer function of the kth cell. With identical cells and identical TL inductors, the corresponding noise sources of the DLNA will be identical, i.e., $I_{d,k}(\omega) = I_{d,r}(\omega) = I_d(\omega)$; $I_{g,k}(\omega) = I_{g,r}(\omega) = I_g(\omega)$; $I_{1/f,k}(\omega) = I_{1/f,r}(\omega) = I_{1/f}(\omega)$ for $1 \le k, r \le N$. Furthermore, $H_{g,k}(\omega) = H_{g,r}(\omega) = H_g(\omega)$ for $1 \le k, r \le N$.

The backward propagations of gate and flicker noise sources of the *k*th cell, shown in Fig. 8, contribute to the output noise current. The backward-propagated noises are all correlated with the original noise sources at the gate terminal of the *k*th cell. Therefore, the Fourier transform of the noise current is calculated as (cf. Fig. 8)

$$I_{o,k}^{(B)}(\omega) = \frac{1}{2} \left[ I_{g,c}(\omega) + I_{g,u}(\omega) + I_{1/f}(\omega) \right] H_g(\omega) \frac{\sin(k-1)\beta}{\sin\beta} e^{-j(N+1/2)\beta}.$$
(9)

The Fourier transform of backward-propagated noise current,  $I_{o,k}^{(B)}(\omega)$ , reaches its peak when  $\sin(k-1)\beta/\sin\beta = k-1$  for  $\beta = l\pi$  and  $l \in \mathbb{Z}$ .

The time-domain noise current at the output, $i_{o,k}(t)$, defined as $i_{o,k}(t) = i_{o,k}^{(F)}(t) + i_{o,k}^{(B)}(t)$, due to MOSFET noise sources of the kth stage is a random process, meaning that its Fourier transform is a random process itself. On the other hand, as pointed out in Section III-A1, the PSD of noise is not equal to the squared magnitude of its Fourier transform. The PSD of noise $i_{o,k}(t)$ should therefore be obtained by taking the Fourier transform of its autocorrelation function, $\Phi_{o,k}(m)$, which is defined as

$$\Phi_{o,k}(m) = \overline{i_{o,k}(t)i_{o,k}(t+m)}$$
(10)

where

$$\begin{split} i_{o,k}(t) &= \frac{1}{2} \Big\{ i_d(t) + (N - k + 1) \left[ i_g(t_1) + i_{1/f}(t_1) \right] * h_g(t_1) \\ &\quad + (k - 1) \left[ i_g(t_2) + i_{1/f}(t_2) \right] * h_g(t_2) \Big\} \\ t_1 &= t - (N - k + 0.5) \sqrt{LC}, \\ t_2 &= t - (N + 0.5) \sqrt{LC}. \end{split}$$
(11)

The symbol \* in (11) denotes the convolution operation.  $h_g(t)$ and  $H_g(\omega)$  represent the impulse response and current-gain transfer function of each cell, respectively. After a certain amount of mathematical effort, the upper-bound of the autocorrelation is found using the following expression:

$$\begin{split} \Phi_{o,k}(m) &= \frac{1}{4} \Big\{ \overline{I_d^2} \delta(m) + \big[ (N - k + 1)^2 + (k - 1)^2 \big] \\ &\quad \times \left( \overline{I_g^2} + \overline{I_{1/f}^2} \right) h_g(m) * h_g(-m) + (N - k + 1) \\ &\quad \times \big[ \Phi_{I_g,I_d}(m - t + t_1) * h_g(m - t + t_1) \\ &\quad + \Phi_{I_g,I_d}^*(t_1 - t - m) * h_g(t_1 - t - m) \big] \\ &\quad + (k - 1) \big[ \Phi_{I_g,I_d}(m - t + t_2) * h_g(m - t + t_2) \\ &\quad + \Phi_{I_g,I_d}^*(t_2 - t - m) * h_g(t_2 - t - m) \big] \Big\}. \end{split}$$
(12)

 $\Phi_{I_g,I_d}(m)$  is the cross-correlation of stochastic processes  $i_g(t)$  and  $i_d(t)$  with power spectral density of  $S_{I_g,I_d}(\omega)$ . The channel thermal noise of transistor is a white noise process, implying that its autocorrelation is an impulse function  $\overline{I_d^2}\delta(m)$  [see the first term of (12)].

The PSD of the output noise current  $S_{o,k}(\omega)$  due to the MOSFET noise sources of the *k*th stage and all its forward- and backward-propagated replicas is obtained by taking the Fourier transform of (12), which results in

$$\begin{split} S_{o,k}(\omega) &= \frac{1}{4} \Big\{ \overline{I_d^2} + \big[ (N - k + 1)^2 + (k - 1)^2 \big] \left( \overline{I_g^2} + \overline{I_{1/f}^2} \right) \left| H_g(\omega) \right|^2 \\ &\quad + 2(N - k + 1) \operatorname{Re} \big[ S_{I_g, I_d}(\omega) H_g(\omega) e^{-j\omega(t - t_1)} \big] \\ &\quad + 2(k - 1) \operatorname{Re} \big[ S_{I_g, I_d}(\omega) H_g(\omega) e^{-j\omega(t - t_2)} \big] \Big\} \end{split}$$
(13)

where Re[.] represents the real part of a complex variable. The input and output capacitances of cascode cells have already been absorbed into the gate and drain TLs. Moreover,  $L_C$  has resonated out the effect of parasitic capacitances at the internal node of each cascode cell. Therefore,  $H_g(\omega)$  is simplified to the DC current gain  $g_m R_{GG}$  of each cell, where  $g_m = g_{m,cs}$  and  $R_{GG} = Z_G/2 + r_{gate}/3$  [ $r_{gate}$  is the physical gate resistance]. The 1/3 factor in  $r_{gate}/3$  is to model the distributed effect of gate resistance in MOS devices with large widths. The PSD of the output noise current due to the MOSFET noise sources of the *k*th stage and all its forward- and backward-propagated replicas thus becomes

$$S_{o,k}(\omega) = \frac{\overline{I_d^2}}{4} \left\{ 1 + \left[ (N - k + 1)^2 + (k - 1)^2 \right] \left| \frac{\kappa_c}{c} \right|^2 \times (\omega \tau_{\rm GS})^2 + 2N\kappa_c (\omega \tau_{\rm GS}) \right\} + \frac{\overline{I_{1/f}^2}}{4} g_m^2 R_{GG}^2 \times \left[ (N - k + 1)^2 + (k - 1)^2 \right]$$
(14)

where  $\kappa_c = |c| \sqrt{\delta \zeta / \gamma}$  ( $\gamma$  is the channel thermal-noise coefficient and is technology-dependent),  $\tau_{\rm GS} = R_{GG}C_{i,cs}$ , and  $\overline{I_{1/f}^2} = \overline{V_{1/f}^2} / R_{GG}^2$  with  $\overline{V_{1/f}^2}$  being the average power of flicker noise voltage [25].

Consequently, the overall PSD of the output noise current,  $S_{Io}^{\text{MOS}}(\omega)$ , due to MOSFET noise sources is

$$\begin{split} S_{Io}^{\text{MOS}}(\omega) &= \sum_{k=1}^{N} S_{o,k}(\omega) \\ &= \frac{\overline{I_d^2}}{4} \left\{ N + \frac{N(2N^2 + 1)}{3} \left| \frac{\kappa_c}{c} \right|^2 (\omega \tau_{\text{GS}})^2 + 2N^2 \kappa_c (\omega \tau_{\text{GS}}) \right\} \\ &\quad + \frac{g_m^2 \overline{V_{1/f}^2}}{4} \cdot \frac{N(2N^2 + 1)}{3}. \end{split} \tag{15}$$
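The step from (14) to (15) rests on the identity $\sum_{k=1}^{N}[(N-k+1)^2 + (k-1)^2] = N(2N^2+1)/3$. The following is a minimal numerical check of that identity only; the function names are illustrative and do not come from the paper.

```python
def stage_weight_sum(n_stages: int) -> int:
    """Direct sum over k = 1..N of (N - k + 1)^2 + (k - 1)^2, the weight in (14)."""
    return sum((n_stages - k + 1) ** 2 + (k - 1) ** 2
               for k in range(1, n_stages + 1))


def closed_form(n_stages: int) -> int:
    """Closed form N(2N^2 + 1)/3 that appears in (15)."""
    return n_stages * (2 * n_stages ** 2 + 1) // 3


# Compare the direct sum with the closed form for a few stage counts.
for n in range(1, 7):
    assert stage_weight_sum(n) == closed_form(n)
    print(n, closed_form(n))
```

For the three-stage case the weight is 19, so the gate-induced and flicker noise terms are weighted 19 times per unit of $\overline{I_g^2}$ or $\overline{I_{1/f}^2}$, against a factor of only 3 for the channel thermal noise.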

3) Noise Contribution of Source and Load Impedances: Simple calculations reveal that the noise contributions of the source impedance  $R_s = Z_G$ , the gate-line termination  $Z_G$ , and the drain-line termination  $Z_D$  to the output are calculated as follows (see [13]):

$$\overline{V_{n,out}^2}\Big|_{\text{Source impedance}} = 4k_BT \frac{N^2 Z_G^2 Z_D g_m^2}{4} \tag{16}$$

$$\overline{V_{n,out}^2}\Big|_{\text{Gate termination}} = 4k_BT \frac{Z_G^2 Z_D g_m^2}{4} \left(\frac{\sin N\beta}{\sin \beta}\right)^2 \tag{17}$$

$$\overline{V_{n,out}^2}\Big|_{\text{Drain termination}} = k_B T Z_D. \tag{18}$$

4) Calculation and Optimization of the Overall NF: So far, noise contributions of various noise sources to the output noise power of the DLNA have been calculated [cf. (15)–(18)]. Substituting the results of (15)–(18) in the definition of the spot NF yields

$$NF_{tot} = NF_{HF} + \frac{1}{4k_BTZ_T} \cdot \frac{2\pi K_{1/f}}{C_{ox}WL\omega} \cdot \frac{2N^2 + 1}{3N}$$
(19)

where

$$NF_{\rm HF} = 1 + \frac{1}{(Ng_m Z_T)^2} + \left(\frac{\sin N\beta}{N\sin\beta}\right)^2 + \frac{\gamma}{Ng_m Z_T} \times \left[1 + \frac{(2N^2 + 1)}{3} \left|\frac{\kappa_c}{c}\right|^2 (\omega\tau_{\rm GS})^2 + 2N\kappa_c(\omega\tau_{\rm GS})\right] \quad (20)$$

and NF<sub>HF</sub> denotes the high-frequency NF and  $Z_T = Z_G = Z_D$ .
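As a rough aid for exploring (20), the sketch below evaluates its four terms for a given operating point. It is a direct transcription of (20) under stated assumptions: all numerical values in the example call are hypothetical placeholders rather than design values from this work, and `c_mag` stands for $|c|$.

```python
import math


def nf_hf(n, gm, z_t, beta, gamma, kappa_c, c_mag, omega_tau):
    """High-frequency noise factor (linear, not dB) following the terms of (20).

    beta must not be an integer multiple of pi, otherwise sin(beta) is zero.
    """
    gain_term = 1.0 / (n * gm * z_t) ** 2                       # drain termination
    gate_term = (math.sin(n * beta) / (n * math.sin(beta))) ** 2  # gate termination
    bracket = (1.0
               + (2 * n ** 2 + 1) / 3.0 * (kappa_c / c_mag) ** 2 * omega_tau ** 2
               + 2 * n * kappa_c * omega_tau)
    mosfet_term = gamma / (n * gm * z_t) * bracket               # MOSFET noise sources
    return 1.0 + gain_term + gate_term + mosfet_term


# Hypothetical operating point, chosen only to exercise the formula.
f = nf_hf(n=3, gm=0.04, z_t=50.0, beta=math.pi / 6,
          gamma=2.0, kappa_c=1.0, c_mag=0.4, omega_tau=0.1)
print(f"NF_HF = {f:.3f} (linear) = {10 * math.log10(f):.2f} dB")
```

Sweeping `n` with the other arguments held fixed gives a quick view of how the gain and gate-termination terms fall with the number of stages while the bracketed MOSFET term grows.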

The flicker noise corner frequency,  $f_{\text{corner}}$ , is simply determined by equating the midrange frequency value of NF<sub>HF</sub> with the low-frequency value of NF<sub>tot</sub>, resulting in

$$f_{\rm corner} = \frac{K_{1/f}}{C_{\rm ox}WL} \cdot \frac{2N^2 + 1}{12} \cdot \frac{g_m}{k_BT} \left(\frac{1}{Ng_mZ_T} + \gamma\right) \tag{21}$$

where  $K_{1/f}$  is the process-dependent flicker noise constant with typical values less than  $10^{-26}$  V<sup>2</sup>F [25]. Eq. (21) states that  $f_{\text{corner}}$  increases in proportion with  $N^2$ .
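To get a feel for the magnitudes in (21), the sketch below evaluates the expression numerically. The 240 µm/0.18 µm device size, $N = 3$, $Z_T = 50\ \Omega$, and $\gamma = 2.21$ are taken from elsewhere in this paper, and $K_{1/f} = 10^{-26}$ V²F is the upper end of the typical range quoted above; the oxide capacitance per unit area, the transconductance, and the temperature are assumptions made purely for illustration.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K


def flicker_corner_hz(k_flicker, c_ox, width, length, n, gm, z_t, gamma, temp=300.0):
    """Flicker-noise corner frequency from (21), in SI units (result in Hz)."""
    return (k_flicker / (c_ox * width * length)
            * (2 * n ** 2 + 1) / 12.0
            * gm / (K_B * temp)
            * (1.0 / (n * gm * z_t) + gamma))


f_corner = flicker_corner_hz(
    k_flicker=1e-26,   # V^2 F, upper end of the typical range
    c_ox=8.5e-3,       # F/m^2, assumed for a 0.18-um process
    width=240e-6,      # m
    length=0.18e-6,    # m
    n=3,
    gm=0.033,          # S, assumed
    z_t=50.0,          # ohm
    gamma=2.21,
)
print(f"f_corner = {f_corner / 1e6:.2f} MHz")
```

With these assumed values the corner lands below 1 MHz, far under the 3.1-GHz lower edge of the UWB band.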

Eqs. (19) and (20) provide useful design guidelines for the distributed LNA circuit of Fig. 2. First, the second term of (20) is inversely proportional to the forward power gain of the circuit, so it is significantly reduced by increasing the power gain and the number of stages. The third term represents the contribution of the gate termination. When  $N\beta$  is close to zero or  $\pi$ , this term adds an additional factor of one to the circuit's NF, setting the minimum NF to 3 dB. However, for other values of  $N\beta$ , this term is less than unity and decreases with the number of stages N. This implies that for  $N\beta \neq l\pi$ ,  $l \in \mathbb{Z}$ , the noise powers are superimposed at the output incoherently whereas the signal and its propagated replicas are added coherently. As a result, the contribution of the gate termination to the overall NF becomes inversely proportional to  $N^2$ , and can be made smaller than unity.
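The behaviour of the gate-termination term $(\sin N\beta / N\sin\beta)^2$ can be illustrated numerically. The sketch below is only a demonstration of the third term of (20): the $\beta$ values are arbitrary sample points, not phase constants extracted from the fabricated design.

```python
import math


def gate_termination_term(n: int, beta: float) -> float:
    """Third term of (20): (sin(N*beta) / (N*sin(beta)))^2."""
    return (math.sin(n * beta) / (n * math.sin(beta))) ** 2


# As beta approaches zero the term tends to one, which alone doubles the
# noise factor contributed by the source (a 3-dB floor).
print(f"floor when the term equals one: {10 * math.log10(1 + 1):.2f} dB")

for beta in (1e-3, 0.25, 0.5, 0.75):
    row = ", ".join(f"N={n}: {gate_termination_term(n, beta):.3f}" for n in (2, 3, 4))
    print(f"beta = {beta:5.3f} rad -> {row}")
```

For the sample points away from zero the term drops below one, and over this range it shrinks further as stages are added.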

Both the second and the third terms are inversely proportional to  $N^2$ , which can be assumed to be negligible momentarily to simplify the calculations. Differentiating the circuit NF with respect to N yields

$$N_{\rm opt} = \left\{ \frac{1}{2} \left[ 1 + \frac{3}{\frac{\gamma}{g_m Z_T} \left( \left| \frac{\kappa_c}{c} \right| \omega \tau_{\rm GS} \right)^2 + \frac{1}{4k_B T Z_T} \frac{2\pi K_{1/f}}{W L C_{\rm ox} \omega}} \right] \right\}^{1/2} \tag{22}$$

As an approximation, the noise contribution of the flicker noise can be neglected, which simplifies (22) to

$$N_{\rm opt} \approx \sqrt{\frac{1}{2} \left[ 1 + \frac{3}{\left( |\kappa_c/c| \omega \tau_{\rm GS} \right)^2} \right]}.$$
 (23)
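Eq. (23) depends on a single dimensionless quantity, $x = |\kappa_c/c|\,\omega\tau_{\rm GS}$. The sketch below evaluates it; the sample value $x = 0.42$ is hypothetical and was picked only because it yields an optimum close to three stages, not because it was extracted from the design.

```python
import math


def n_opt_no_flicker(x: float) -> float:
    """Optimum number of stages from (23), with x = |kappa_c / c| * omega * tau_GS."""
    return math.sqrt(0.5 * (1.0 + 3.0 / x ** 2))


for x in (0.3, 0.42, 0.6):  # hypothetical sample values
    n_real = n_opt_no_flicker(x)
    print(f"x = {x:.2f}: N_opt = {n_real:.2f} (nearest integer {round(n_real)})")
```

Because $x$ grows with frequency, the real-valued optimum falls toward the top of the band, so in practice the result has to be rounded to an integer that is acceptable over the whole band.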

The device sizes are to be calculated to maximize gain across the UWB frequency band. [26] presented contours of constant gain-bandwidth product as function of gate and drain TLs' attenuations without any consideration for the noise figure minimization. The design guidelines presented in [8] and [26] to maximize the GBW are primarily based on calculation of optimum gate and drain attenuation factors without providing any quantitative discussion on the impact of number of stages N. In fact, [26] stated that for N greater than 4 the DA's frequency response does not change appreciably.

The design goal of this paper is to maximize the gain and minimize the NF across the UWB band. To achieve this goal, we introduce a design procedure based on the approach proposed in [26] with N being set to optimum number of stages  $N_{opt}$  from (22). The design optimization procedure utilizes the GBW expression obtained from [26, eq. (1)] in terms of the -3 dB bandwidth, i.e.,

$$A_0\omega_{-3\mathrm{dB}} = 4K_A X_{-3\mathrm{dB}}\omega_{\mathrm{max}} \tag{24}$$

where

$$\begin{split} &A_0 = \text{DC gain,} \\ &\omega_{-3\text{dB}} = -3 \text{ dB cutoff frequency of the amplifier (rad/s),} \\ &\omega_{\text{max}} = \text{MOSFET's maximum frequency of oscillation (rad/s),} \\ &K_A = \sqrt{ab}e^{-b}, \\ &a = N(r_{\text{gate}}C_{i,cs}/3\sqrt{L_DC_{o,cg}}) = N(r_{\text{gate}}\sqrt{C_{i,cs}/L_G}/3), \\ &b = N(R_{o,cg}C_{o,cg}/\sqrt{L_GC_{i,cs}}) = NR_{o,cg}\sqrt{C_{o,cg}/L_D}, \\ &X_{-3\text{dB}} = \omega_{-3\text{dB}}\sqrt{C_{i,cs}L_G} = \omega_{-3\text{dB}}\sqrt{C_{o,cg}L_D}, \end{split}$$

where  $R_{o,cg}$  denotes the output resistance of the common-gate stage in each cascode cell.



Fig. 9. Normalized gain-bandwidth contours for number of stages varying from N = 3 to N = 6.

TABLE I

COMPARISON BETWEEN THE  $K_A X_{-3dB,opt}$  FOR OPTIMUM a AND b VALUES AND  $K_A X_{-3dB}$  FOR a AND b VALUES WHEN N = 6 (a = 0.75; b = 0.32 FOR N = 6)

|                    | N = 3  | N = 4  | N = 5  | N = 6  |
|--------------------|--------|--------|--------|--------|
| $K_A X_{-3dB}$     | 0.2570 | 0.2563 | 0.2558 | 0.2555 |
| $K_A X_{-3dB,opt}$ | 0.2680 | 0.2568 | 0.2559 | 0.2555 |

To ensure a flat frequency response across the UWB bandwidth, the -3 dB cut-off frequency is set to 13 GHz. The factors  $K_A$  and  $X_{-3dB}$  are both functions of gate and drain line attenuations as demonstrated in [8] and [26]. The GBW for our application is several orders of magnitude smaller than  $\omega_{\text{max}}$ , implying that the  $K_A X_{-3dB}$  cannot exceed 0.25. For N = 6, [26] plotted the normalized gain-bandwidth contours, noticed that there is a single maximum at a = 0.75 and b = 0.32, and predicted a maximum value of 0.255. This value is about 2% greater than the expected value of 0.25, which is due to approximations used for attenuations of gate and drain TLs in equations used to derive  $X_{-3dB}$  [26]. To investigate the effect of N on the maximum GBW, the normalized gain-bandwidth contours are simulated for the DLNA of Fig. 2 with N varying from 3 to 6. Fig. 9 shows the simulation results.
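For reference, the factor $K_A = \sqrt{ab}\,e^{-b}$ at the quoted optimum can be evaluated directly. In the sketch below, dividing the tabulated product 0.2555 (Table I, N = 6) by $K_A$ to obtain an implied $X_{-3dB}$ is an inference made here for illustration, not a value stated in the paper or in [26].

```python
import math


def k_a(a: float, b: float) -> float:
    """K_A = sqrt(a*b) * exp(-b), as defined below (24)."""
    return math.sqrt(a * b) * math.exp(-b)


A_OPT, B_OPT = 0.75, 0.32   # optimum attenuation factors quoted for N = 6
KA_X_TABLE = 0.2555         # K_A * X_-3dB entry of Table I for N = 6

ka = k_a(A_OPT, B_OPT)
x_implied = KA_X_TABLE / ka  # inferred normalized -3 dB frequency
print(f"K_A = {ka:.4f}, implied X_-3dB = {x_implied:.3f}")
```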

Table I shows the  $K_A X_{-3dB}$  factors for optimum values of a and b for a specific number of stages N, and compares those with the  $K_A X_{-3dB}$  factors obtained for optimum a and b values when N = 6. This comparison shows a small sensitivity of the  $K_A X_{-3dB}$  factor with respect to a and b values. Based on the assertion of [26], which was also confirmed by simulation data in Table I, GBW will not change with N greater than 4. Therefore, a and b values for N = 6 are used. Procedure 1 summarizes the proposed approach for the performance-optimized DLNA design.

Procedure 1:

- 1) For a flat magnitude response across the UWB band, set  $f_{-3dB} = 13$  GHz. The TLs' cutoff frequency  $f_c$ , defined as  $f_c = 2/[2\pi\sqrt{L_GC_{i,cs}}] = 2/[2\pi\sqrt{L_DC_{o,cg}}]$ , is calculated so as to ensure that  $N\beta \neq l\pi$ ,  $l \in Z$ . To achieve maximum gain for frequencies up to the UWB upper corner frequency, we set a = 0.75 and b = 0.32. Moreover,  $N = N_{opt}$ , and  $N_{opt}$  is obtained by (22) for minimum NF.
- 2) The maximum bias current for which the MOS transistors of each cell remain in saturation is calculated for the bias circuit used in the DLNA of Fig. 2. This current is readily calculated as  $I_{D,\text{max}} = V_{\text{THN}}/N_{\text{opt}}Z_T$ .
- 3) Using (24), calculate the maximum DC gain,  $A_0$ .
- 4) [26, eq. (2)] gives the DC gain of a conventional distributed amplifier as

$$A_0 = \frac{g_m Z_T^2}{2} \frac{\sinh(b)}{\sinh(b/N_{\text{opt}})} e^{-b}.$$
 (25)

This equation holds for the DLNA of Fig. 2 with identically matched transistors  $M_{ak2}$  and  $M_{ak1}$  for each cascode cell. All the parameters in (25) are expressed with respect to the gate aspect ratio of transistors, W/L.

5) Using step 4, calculate the W/L. This W/L results in minimum NF and maximum gain.



Fig. 10. NF comparison for different number of stages.

6) Using (19) and (20), obtain minimum NF.

In calculating the NF and gain expressions, the device data provided by the foundry have been used. In doing so, a test structure on the same 0.18- $\mu$ m SiGe technology was fabricated to experimentally characterize various individual components including the MOS transistors and varactors, transmission lines, short structures, open structures, and thru structures. Measurement of individual MOSFET transistors in the test structure provides the technology-dependent parameters. Applying Procedure 1 to the DLNA of Fig. 2 results in the optimum W/L ratio of 240  $\mu$ m/0.18  $\mu$ m. Using (22), the optimum number of stages for 50  $\Omega$  load terminations is readily calculated once the optimum W/L ratio is obtained. For the DLNA circuit of Fig. 2,  $N_{opt} = 3$ . To verify these calculations, the DLNA was designed and simulated in Cadence. Four performance-optimized DLNA circuits with number of stages varying from N = 1 to N = 4 were separately designed and simulated. To capture the gate-induced noise in simulations, the BSIM4 level 54 MOS model has been utilized. Fig. 10 shows simulated noise figure with respect to frequency. It shows that the three-stage DLNA achieves a minimum NF of 2.1 dB across the UWB spectral band. Section IV will summarize measurement results of a three-stage DLNA prototype, which was designed and fabricated in a 0.18- $\mu$ m SiGe process.

### B. Linearity Analysis

A notch filter centered around the 802.11a 5 GHz frequency enhances the spurious-free dynamic range (SFDR) of the DLNA. Nevertheless, the proposed UWB DLNA must remain linear when receiving the desired weak wideband signal in the presence of in-band narrowband interfering signals. An analytical study of the circuit's linearity and the third-order intercept point (IP3) provides useful insight into the circuit's large-signal performance.

To capture the short-channel effects of submicron CMOS technology including mobility degradation and velocity saturation, the analysis uses the well-known I-V characteristic of the submicron MOS transistor [25], i.e.,

$$I_D = \frac{1}{2}\mu_0 C_{\rm ox} \left(\frac{W}{L}\right) \frac{(V_{\rm GS} - V_{\rm TH})^2}{1 + \left(\frac{\mu_0}{2\nu_{\rm sat}L} + \theta\right) (V_{\rm GS} - V_{\rm TH})}$$
(26)

where  $\mu_0$  represents the low-field mobility,  $\nu_{sat}$  is the saturated drift velocity, and  $\theta$  is the process-dependent parameter [25]. Assuming the input DC bias voltage to be equal to the threshold voltage, the above equation is simplified to

$$I_D \approx \frac{1}{2}\mu_0 C_{\rm ox} \left(\frac{W}{L}\right) V_{\rm in}^2 \left[1 - \eta \left(\frac{\mu_0}{2\nu_{\rm sat}L} + \theta\right) V_{\rm in}\right] \quad (27)$$

where  $\eta$  is a corrective factor ranging from 0.3 to 0.5 to improve the accuracy of the approximation. To estimate the intercept points, we first determine the DLNA output in response to the input sinusoidal voltage  $V_{in}(t) = V_{im} \cos \omega_{in} t$ . The signal at the near-end input terminal travels down the gate line and is amplified by each cell once it arrives at that cell's input terminal. The amplified signal then travels toward the load termination, while being combined with the signals at subsequent tap-points along the drain TL. The signal propagation mechanism is quantified using

$$V_o(t) = \sum_{k=1}^{N} V_{o,k} \left( t - (N - k + 1/2)\sqrt{LC} \right).$$
(28)

 $V_{o,k}$  is the signal amplified by the *k*th stage, i.e.,  $V_{o,k}(t) = I_{D,k}(t)Z_D/2$ , and is related to the input voltage using the *I*-V characteristic of each cascode cell.

$$V_{o}(t) = V_{\rm im} \left( 1 - \eta \left( \frac{\mu_{0}}{2\nu_{\rm sat}L} + \theta \right) \frac{V_{\rm im}^{2}}{4} \right) \cos \omega_{\rm in}(t - N\sqrt{LC}) + \left( \eta \left( \frac{\mu_{0}}{2\nu_{\rm sat}L} + \theta \right) \frac{V_{\rm im}^{3}}{4} \right) \cos 3\omega_{\rm in}(t - N\sqrt{LC}).$$
(29)
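The common delay $N\sqrt{LC}$ in (29) follows from adding the gate-line and drain-line delays of each signal path. The sketch below checks this under one assumption that is not stated explicitly in the text: that the input reaches the $k$th cell after a gate-line delay of $(k - 1/2)\sqrt{LC}$, which complements the drain-line delay $(N - k + 1/2)\sqrt{LC}$ of (28). The per-section values 942 pH and 277 fF are the gate-line values quoted in Section IV and are used here only as an example.

```python
import math


def path_delays_s(n: int, l_section: float, c_section: float) -> list:
    """Total delay (gate line + drain line) through each of the n cells, in seconds."""
    tau = math.sqrt(l_section * c_section)   # delay of one LC section
    delays = []
    for k in range(1, n + 1):
        gate_delay = (k - 0.5) * tau         # assumed delay to reach cell k
        drain_delay = (n - k + 0.5) * tau    # delay from cell k to the load, per (28)
        delays.append(gate_delay + drain_delay)
    return delays


delays = path_delays_s(3, 942e-12, 277e-15)
for k, d in enumerate(delays, start=1):
    print(f"path through cell {k}: {d * 1e12:.2f} ps")
```

Every path has the same total delay, which is why the cell outputs add in phase at the load and the summation in (28) collapses to the single delayed cosine of (29).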

The third-order input intercept point is thus obtained as

$$\mathrm{IIP3} = 10 \log \left[4 - \eta \left(\frac{\mu_0}{2\nu_{\text{sat}}L} + \theta\right) V_{\text{im}}^2\right] - 10 \log \left[\frac{3}{4}\eta \left(\frac{\mu_0}{2\nu_{\text{sat}}L} + \theta\right) V_{\text{im}}^2\right]. \tag{30}$$

Eq. (30) states that the IIP3 of the DLNA is equal to that of a lumped LNA that uses the same cascode cell.

## **IV. MEASUREMENT RESULTS**

The UWB DLNA circuit of Fig. 2 was fabricated in a 0.18- $\mu$ m SiGe BiCMOS process while only MOS devices were utilized. Square spiral inductors were all fabricated on the top-most metal layer and exhibited a Q-factor of 10 at 10 GHz. The LNA test-chip occupies a total area of 872  $\mu$ m × 872  $\mu$ m including the pad ring. The chip was directly mounted on a high-frequency board. Both input and output terminals of the proposed distributed LNA were terminated to on-chip square spiral inductors for matched termination. DC pads incorporate ESD protection. To minimize the parasitic effects of chip-board interface, the chip was solder bumped, and flipped on the board. Fig. 11 shows the chip micrograph.

A test structure was separately fabricated in the same 0.18- $\mu$ m SiGe technology to experimentally characterize various individual passive and active components including transistors, MOS varactors, transmission lines, short structures, open structures, and thru structures. Of particular interest is the characterization of the noise parameters of the MOSFET, which was carried out by the foundry. The measured average values of  $\gamma$ ,  $\delta$ , and  $\zeta$  are 2.21, 4.1, and 5.2, respectively. Calculations using the holistic thermal model developed in the BSIM4 model result in  $\gamma = 2.14$ ,  $\delta = 3.94$ , and  $\zeta = 5$ .

Fig. 11. Die photo of the UWB DLNA.

Fig. 12. Measured forward gain and noise figure.

Fig. 13. Measured and simulated  $s_{21}$  versus frequency.

S-parameter measurements of the circuit were carried out using the Anritsu 37247A vector network analyzer (VNA). Gate biasing was provided by bias tees. Fig. 12 shows the measured  $s_{21}$  and NF of the DLNA under operating conditions of  $V_{\rm DD} = 1.8$  V and an overall current consumption of 12 mA. The DLNA exhibits a flat NF of 2.9 dB across the entire 7.5 GHz UWB frequency band. As explained in Section III-A4, at frequencies near or much lower than the lines' cutoff frequency, the far-end termination impedance at the gate load will add 3 dB to the total NF, because the third term in (20) approaches its maximum value of one. For  $N\beta \neq l\pi$ ,  $l \in \mathbb{Z}$ , the third term is less than unity and decreases with the number of stages N, and the contribution of the gate termination to the overall NF becomes inversely proportional to  $N^2$ , and can be made smaller than unity. In our design, the gate line's inductance is chosen to be 942 pH and the gate input capacitance is 277 fF, resulting in a line cut-off frequency of  $2/[2\pi\sqrt{L_GC_G}] = 19.6$  GHz. Consequently, the noise contribution of the gate load resistance becomes negligible. The measured forward gain of the LNA circuit remains at 8 dB for frequencies up to 11 GHz. It experiences a +1.6 dB overshoot at 11.6 GHz, as also indicated in Fig. 12.

Fig. 14. Comparison between the measured NF and (19).

Fig. 15. Measured and simulated input and output return losses.

TABLE II

MEASURED IIP3 OF THE DLNA WITH RESPECT TO FREQUENCY

| $f_{RF}$ (GHz) | IIP3 (dBm) |
|----------------|------------|
| 3              | -4.1       |
| 4              | -4.0       |
| 5              | -3.8       |
| 6              | -3.6       |
| 7              | -3.4       |
| 8              | -3.3       |
| 9              | -3.2       |
| 10             | -3.0       |

TABLE III

PERFORMANCE COMPARISON OF LNA CIRCUITS PRESENTED IN PRIOR WORK AND THE PROPOSED DLNA

| Ref | BW (GHz) | s21 (dB) | NF (dB) | s11 (dB) | IIP3 (dBm) | Power (mW) | Topology | Technology |
|-----|----------|----------|---------|----------|------------|------------|----------|------------|
| [2] | 3-10 | 9.3 | 4-8 | <-9.9 | -6.7 | 9 | Lumped CMOS Cascode | 0.18µm CMOS |
| [3] | 3-10 | 21 | 2.5-4.2 | -14 < s11 < -9 | >-5.5 | 30 | Lumped Bipolar Cascode | SiGe BiCMOS, using bipolar |
| [7] | 1.3-12.3 | 8.2 | 4.4-5.3 | <-7.2 | 7.6-8.3 | 4.5 | Lumped Differential Common-Gate | 0.18µm CMOS |
| [9] | 3 dB: ~6 | 6.5 | N/A | <-9 | N/A | 52 | CMOS Distributed Common-Source | 0.18µm CMOS |
| [10] | 0.5-4 | 6.5 ± 1.2 | 5.4-8 | <-6 | N/A | 83.4 | CMOS Distributed Common-Source | 0.6µm standard CMOS |
| [14] | 0.04-6.2 | 8 ± 0.6 | 4.2-6.2 | <-16 | 3 | 9 | CMOS Distributed Cascode | 0.18µm CMOS |
| [20] | 0.1-23 | 14.5 ± 0.9 | 5 | <-9 | -0.5 | 54 | Bipolar Distributed Cascode | SiGe BiCMOS, using bipolar |
| [21] | 1-25 | 7.8 ± 1.3 | 4.8-7 | <-10 | 4.7 | 54 | CMOS Downsized Distributed | 0.18µm SiGe, only CMOS |
| [27] | 0.5-14 | 10.6 | 3.4-5.4 | <-10 | 10 @ 10 GHz | 52 | CMOS Distributed Cascode | 0.18µm CMOS |
| [28] | 3.5-4.5 | 16 | 3.9-4.2 | <-10 | -4.5 | 7.2 | Differential LNA w/ Resistive Feedback | 0.18µm CMOS |
| [29] | 0.5-10 | 13 | 2.9-3.3 | <-7 | -7.5 | 9.6 | Bipolar Common-Emitter w/ Resistive Feedback | 0.18µm SiGe, using bipolar |
| [30] | 0-11 | 10 (high-gain) | 3.2-6 | <-20 | N/A | 100 | CMOS Distributed Cascode | 0.18µm CMOS |
| [31] | 3.1-10.6 | 10.87-12.02 | 4.7-5.6 | <-11 | <-10.6 | 10.57 | Lumped CMOS LNA w/ Dual Feedback | 0.18µm CMOS |
| This Work | 0.1-11 | 8 | 2.9 (flat) | <-12 | -3.55 | 21.6 | Performance-Optimized CMOS Distributed Cascode with BW-Enhancing Inductor | 0.18µm SiGe, only CMOS |
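The gate-line cutoff frequency quoted in the text can be reproduced from the stated component values. The sketch below simply evaluates $f_c = 2/[2\pi\sqrt{L_G C_G}]$ with 942 pH and 277 fF; the comparison against the 10.6-GHz upper edge of the UWB band is added here for illustration.

```python
import math


def line_cutoff_hz(l_section: float, c_section: float) -> float:
    """Cutoff frequency of the artificial line, f_c = 2 / (2*pi*sqrt(L*C))."""
    return 2.0 / (2.0 * math.pi * math.sqrt(l_section * c_section))


L_G = 942e-12       # gate-line inductance per section, H
C_G = 277e-15       # gate input capacitance per section, F
F_UWB_MAX = 10.6e9  # upper edge of the UWB band, Hz

f_c = line_cutoff_hz(L_G, C_G)
print(f"f_c = {f_c / 1e9:.1f} GHz, UWB upper edge at {F_UWB_MAX / f_c:.2f} of f_c")
```

The computed value is about 19.7 GHz, within roughly half a percent of the 19.6 GHz quoted above, and places the upper band edge at a little over half the line cutoff.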

Designing a performance-optimized DLNA with eleven inductors for wideband operation from 3.1 to 10.6 GHz demands careful layout development and post-layout extraction/simulation. Fig. 13 shows the simulated and measured forward gain  $s_{21}$ , verifying the accuracy of post-layout simulation. Fig. 14 compares the measured NF of the DLNA with (19). This comparison verifies an earlier analytical assessment in Section III, which states that (19) sets an upper limit for the NF of the DLNA.

Fig. 15 depicts the measured and simulated input and output return losses,  $s_{11}$  (dB) and  $s_{22}$  (dB).  $s_{11}$  and  $s_{22}$  remain below -12 dB and -10 dB, respectively, across the UWB frequency band. Post-layout simulation driven by electromagnetic extraction of the entire circuit layout yields an accurate simulation result that closely follows the chip measurement. The good measured return losses once again confirm an essential attribute of DAs, namely wideband input/output matching. Simulations predicted slightly better  $s_{11}$  and  $s_{22}$ . The discrepancy can be attributed to the off-chip flip-chip measurements.

Fig. 16 shows plots of measured and simulated reverse isolation  $s_{12}$  (dB) and the LNA's gain  $s_{21}$  (dB) versus frequency. The in-band isolation varies between -50 dB and -27 dB, which is verified by both simulation and measurement. Fig. 16 demonstrates the accuracy of  $s_{12}$  and  $s_{21}$  simulations compared to measurement. The superior input–output isolation is partly due to the utilization of BW-enhanced cascode cells in the proposed DLNA.

Fig. 16. Measured and simulated reverse isolation and gain.

The linearity and third-order intercept measurements were performed using the Agilent 8565 spectrum analyzer. The measured input-referred 1 dB compression point ( $P_{\rm in,1dB}$ ) at two input frequencies of 4 GHz and 9 GHz was -13.1 dBm and -12.2 dBm, respectively. The result from the two-tone test measurement at 7 GHz is shown in Fig. 17. The DLNA exhibits an IIP3 of -3.4 dBm and an OIP3 of 6.2 dBm at 7 GHz. Furthermore, the IP3 measurement was carried out for RF frequencies ranging from 3 GHz to 10 GHz. Table II summarizes the result of the IP3 measurement, where the average IIP3 is -3.55 dBm.
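The quoted average can be reproduced from the entries of Table II. The sketch below takes the arithmetic mean of the dBm values, which matches the -3.55 dBm figure; the second calculation, an average taken in linear power (mW) and converted back to dBm, is included only as an alternative for comparison and is not a figure reported in this paper.

```python
import math

# Measured IIP3 from Table II: frequency in GHz -> IIP3 in dBm.
IIP3_DBM = {3: -4.1, 4: -4.0, 5: -3.8, 6: -3.6, 7: -3.4, 8: -3.3, 9: -3.2, 10: -3.0}

values = list(IIP3_DBM.values())

# Arithmetic mean of the dBm values.
average_db = sum(values) / len(values)

# Alternative: mean of the linear powers in mW, converted back to dBm.
linear_mw = [10 ** (v / 10) for v in values]
average_linear_dbm = 10 * math.log10(sum(linear_mw) / len(linear_mw))

print(f"mean of dBm values:  {average_db:.2f} dBm")
print(f"mean of linear power: {average_linear_dbm:.2f} dBm")
```

The two averages differ by only a few hundredths of a dB because the IIP3 varies by about 1 dB across the band.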



Fig. 17. Measured two-tone test at 7-GHz frequency.

The proposed DLNA retains flat gain and input/output return losses, and a relatively constant NF over a wide range of frequencies. It also exhibits good linearity across the band. Table III compares the circuit performance of this LNA with other recently published works.

## V. CONCLUSION

This paper presented the analysis and design of a performance-optimized distributed LNA (DLNA) for UWB receivers. A detailed analysis of noise in the DLNA was provided, which can easily be extended to any other DA topology. A three-stage DLNA using bandwidth-enhancing inductors was fabricated in a 0.18- $\mu$ m SiGe BiCMOS process, where only MOS transistors were used. Measurements of the DLNA show a 2.9-dB noise figure and a forward gain of 8 dB over the 7.5-GHz UWB bandwidth. The circuit exhibits an average IIP3 of -3.55 dBm and an input-referred 1-dB compression point of at least -13.1 dBm. The overall current consumption is 12 mA from a 1.8-V supply voltage.

#### ACKNOWLEDGMENT

The author would like to thank D. Lin and D. Pi for assisting with the layout development and Cadence simulation of the DLNA, Broadcom Corporation for providing the measurement equipment, and Jazz Semiconductor for fabricating the test chip.

#### REFERENCES

- [1] S. Roy *et al.*, "Ultrawideband radio design: The promise of high-speed, short-range wireless connectivity," *Proc. IEEE*, vol. 92, no. 2, pp. 295–311, Feb. 2004.
- [2] A. Bevilacqua and A. M. Niknejad, "An ultra-wideband LNA for 3.1 to 10.6 GHz wireless receivers," in *IEEE ISSCC Dig. Tech. Papers*, 2004, pp. 382–383.
- [3] A. Ismail and A. Abidi, "A 3 to 10 GHz LNA using wideband LC-ladder matching network," in *IEEE ISSCC Dig. Tech. Papers*, 2004, pp. 384–385.

- [4] B. Razavi et al., "A UWB CMOS transceiver," IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2555–2562, Dec. 2005.
- [5] X. Li, S. Shekhar, and D. J. Allstot, "G<sub>m</sub>-boosted common-gate LNA and differential colpitts VCO/QVCO in 0.18- μm CMOS," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2609–2619, Dec. 2005.
- [6] A. Shameli and P. Heydari, "A novel ultra-low power (ULP) low noise amplifier using differential inductor feedback," in *Proc. European Solid-State Circuits Conf. (ESSCIRC)*, 2006, pp. 352–355.
- [7] A. Shekhar, X. Li, and D. J. Allstot, "A CMOS 3.1–10.6 GHz UWB LNA employing staggered compensated series peaking," in *Proc. IEEE RFIC Symp.*, 2006, pp. 63–66.
- [8] J. B. Beyer et al., "MESFET distributed amplifier design guidelines," *IEEE Trans. Microw. Theory Tech.*, vol. 32, no. 3, pp. 268–275, Mar. 1984.
- [9] B. Kleveland *et al.*, "Exploiting CMOS reverse interconnect scaling in multigigahertz amplifier and oscillator design," *IEEE J. Solid-State Circuits*, vol. 36, no. 10, pp. 1480–1488, Oct. 2001.
- [10] B. M. Ballweber, R. Gupta, and D. J. Allstot, "A fully integrated 0.5–5.5-GHz CMOS distributed amplifier," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 231–239, Feb. 2000.
- [11] H.-T. Ahn and D. J. Allstot, "A 0.5–8.5-GHz fully differential CMOS distributed amplifier," *IEEE J. Solid-State Circuits*, vol. 37, no. 8, pp. 985–993, Aug. 2002.
- [12] H. Shigematsu et al., "40 Gb/s CMOS distributed amplifier for fiber-optic communication systems," in *IEEE ISSCC Dig. Tech. Papers*, 2004, pp. 476–477.
- [13] C. S. Aitchison, "The intrinsic noise figure of the MESFET distributed amplifier," *IEEE Trans. Microw. Theory Tech.*, vol. MTT-33, no. 6, pp. 460–466, Jun. 1985.
- [14] F. Zhang and P. R. Kinget, "Low-power programmable gain CMOS distributed LNA," *IEEE J. Solid-State Circuits*, vol. 41, no. 6, pp. 1333–1343, Jun. 2006.
- [15] A. Papoulis and S. Pillai, Probability, Random Variables and Stochastic Processes, 4th ed. New York: McGraw-Hill, 2002.
- [16] P. Heydari and D. Lin, "A performance optimized CMOS distributed LNA for UWB receivers," in *Proc. IEEE Custom Integrated Circuits Conf. (CICC)*, 2005, pp. 337–340.
- [17] E. L. Ginzton, W. R. Hewlett, J. H. Jasberg, and J. D. Noe, "Distributed amplification," *Proc. IRE*, pp. 956–969, Aug. 1948.
- [18] A. Q. Safarian, A. Yazdi, and P. Heydari, "Design and analysis of an ultra wideband distributed CMOS mixer," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 5, pp. 618–629, May 2005.
- [19] H. Wu and A. Hajimiri, "Silicon-based distributed voltage-controlled oscillator," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 493–502, Mar. 2001.
- [20] T. H. Lee, *The Design of CMOS Radio-Frequency Integrated Circuits*, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 2004.
- [21] Q. He and M. Feng, "Low-power, high-gain, and high-linearity SiGe BiCMOS wideband low-noise amplifier," *IEEE J. Solid-State Circuits*, vol. 39, no. 6, pp. 956–959, Jun. 2004.
- [22] A. Yazdi, D. Lin, and P. Heydari, "A 1.8 V three-stage 25 GHz 3 dB-BW differential non-uniform downsized distributed amplifier," in *IEEE ISSCC Dig. Tech. Papers*, 2005, pp. 156–157.
- [23] R. Roovers, D. M. W. Leenaerts, J. Bergervoet, K. S. Harish, R. C. H. van de Beek, G. van der Weide, H. Waite, Y. Zhang, S. Aggarwal, and C. Razzell, "An interference-robust receiver for ultra-wideband radio in SiGe BiCMOS technology," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2563–2572, Dec. 2005.
- [24] J.-S. Goo, H.-T. Ahn, D. J. Ladwig, Z. Yu, T. H. Lee, and R. W. Dutton, "A noise optimization technique for integrated low-noise amplifiers," *IEEE J. Solid-State Circuits*, vol. 37, no. 8, pp. 994–1002, Aug. 2002.
- [25] Y. Tsividis, *Operation and Modeling of the MOS Transistor*. New York: McGraw-Hill, 1999, pp. 440–512.
- [26] R. C. Becker and J. B. Beyer, "On gain-bandwidth product for distributed amplifiers," *IEEE Trans. Microw. Theory Tech.*, vol. MTT-34, no. 6, pp. 736–738, Jun. 1986.
- [27] R. Liu *et al.*, "A 0.5–14 GHz 10.6 dB CMOS cascode distributed amplifier," in *Symp. VLSI Circuits Dig. Tech. Papers*, 2003, pp. 139–140.
- [28] S. Iida *et al.*, "A 3.1 to 5.1 GHz CMOS DSSS UWB transceiver for WPANs," in *IEEE ISSCC Dig. Tech. Papers*, 2005, pp. 214–215.
- [29] Y. Park, C.-H. Lee, J. D. Cressler, J. Laskar, and A. Joseph, "A very low power SiGe LNA for UWB application," in *IEEE MTT-S Dig.*, 2005, pp. 1041–1044.

- [30] X. Guan and C. Nguyen, "Low-power-consumption and high-gain CMOS distributed amplifiers using cascade of inductively coupled common-source gain cells for UWB systems," *IEEE Trans. Microw. Theory Tech.*, vol. 54, no. 8, pp. 3278–3283, Aug. 2006.
- [31] C.-T. Fu and C.-N. Kuo, "3-11-GHz UWB LNA using dual feedback for broadband matching," in *IEEE RFIC Symp. Dig. Papers*, 2006, pp. 67–70.



**Payam Heydari** (S'98–M'00) received the B.S. and M.S. degrees (with honors) in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 1992 and 1995, respectively. He received the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, in 2001.

During the summer of 1997, he was with Bell Labs, Lucent Technologies, Murray Hill, NJ, where he worked on noise analysis in deep-submicron very large-scale integrated (VLSI) circuits. During the summer of 1998, he was with IBM T. J. Watson Research Center, Yorktown Heights, NY, where he worked on gradient-based optimization and sensitivity analysis of custom-integrated circuits. In August 2001, he joined the University of California, Irvine, where he is currently an Associate Professor of electrical engineering. His research interest is the design of high-speed analog, radio-frequency (RF), and mixed-signal integrated circuits. He has authored or co-authored more than 55 journal and conference papers.

Dr. Heydari is the recipient of the 2007 IEEE Circuits and Systems Society Guillemin-Cauer Best Paper Award, the 2005 IEEE Circuits and Systems Society Darlington Best Paper Award, the 2005 National Science Foundation (NSF) CAREER Award, the 2005 Henry Samueli School of Engineering Teaching Excellence Award, the Best Paper Award at the 2000 IEEE International Conference on Computer Design (ICCD), the 2000 Honorable Award from the Department of Electrical Engineering-Systems at the University of Southern California, and the 2001 Technical Excellence Award in the area of Electrical Engineering from the Association of Professors and Scholars of Iranian Heritage (APSIH). He was recognized as the 2004 Outstanding Faculty at the EECS Department of the University of California, Irvine. His name was included in the 2006 Who's Who in America and Who's Who in Science and Engineering. He is an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-PART I, and is a Guest Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS. He currently serves on the Technical Program Committees of the IEEE Custom Integrated Circuits Conference (CICC), the International Symposium on Low-Power Electronics and Design (ISLPED), and the International Symposium on Quality Electronic Design (ISQED). He was the Student Design Contest Judge for the DAC/ISSCC Design Contest Award in 2003, and a Technical Program Committee member of the IEEE Design, Automation and Test in Europe (DATE) conference from 2003 to 2004.