PAD MODELING BY USING ARTIFICIAL NEURAL NETWORK

X. P. Li
School of Telecommunication Engineering
Beijing University of Posts and Telecommunications
Beijing, China

J. J. Gao
School of Information and Science Technology
East China Normal University
Shanghai, China

Abstract—A PAD modeling technique for microwave on-wafer measurement, based on a combination of the conventional equivalent circuit model and an artificial neural network (ANN), is presented in this paper. The PAD capacitances are determined from the S parameters of PAD test structures of different sizes, obtained by electromagnetic (EM) simulation, and are described as functions of the dimensions of the PAD structure by using sub-ANNs. Good agreement is obtained between the ANN-based model and the EM simulated results up to 40 GHz. The de-embedding procedure for a PHEMT device utilizing the ANN-based PAD model is demonstrated.

1. INTRODUCTION

The intrinsic characteristics of on-wafer devices are of interest for IC engineering. However, they cannot be measured by placing coplanar probes directly on the device. Instead, on-wafer measurement requires probe pads and interconnect lines leading to the device under test (DUT). These pads significantly limit the performance of the devices or circuits that use them, because a pad is particularly sensitive to substrate effects owing to its large metal plate area, e.g., 60 μm × 60 μm [1–6]. The high-frequency cross-talk and power loss through the bonding pads to the substrate can result in poor gain, which hampers design optimization at the circuit level. To illustrate the impact of the pads and interconnects on the device measurement, consider $H_{21}$, the current gain of a transistor with a shorted output, defined as:

$$H_{21} = \frac{Y_{21}}{Y_{11}}$$

(1)

Any parasitic capacitance will affect the measured admittances $Y_{21}$ and $Y_{11}$, so de-embedding the probe pads and interconnects is critical to obtaining an accurate $H_{21}$ measurement. Failure to do so can change each of the four S parameters by 1 to 2 dB. This error compounds when the S parameters are used in complex calculations, making transistor benchmarks such as $f_T$ and $f_{max}$ appear around 25% lower than they actually are. The impact of the pads and interconnects becomes larger as the device dimensions get smaller, particularly on a conductive substrate.
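
As a rough illustration of Eq. (1), the following Python sketch evaluates $H_{21}$ for a deliberately simplified transistor model, assuming $Y_{21} = g_m$ and $Y_{11} = j\omega C_{in}$, and then adds a hypothetical pad capacitance in parallel with the input. The function names and all element values are assumptions chosen for illustration; they are not data from this work.

```python
import math


def h21(y21: complex, y11: complex) -> complex:
    """Short-circuit current gain, Eq. (1): H21 = Y21 / Y11."""
    return y21 / y11


def toy_h21(freq_hz, gm, c_in, c_pad=0.0):
    """H21 of a toy model with Y21 = gm and Y11 = j*2*pi*f*(c_in + c_pad).

    gm, c_in and c_pad are hypothetical values, not measured data.
    """
    omega = 2.0 * math.pi * freq_hz
    return h21(complex(gm, 0.0), 1j * omega * (c_in + c_pad))


def toy_ft(gm, c_in, c_pad=0.0):
    """Frequency at which |H21| = 1 for the toy model."""
    return gm / (2.0 * math.pi * (c_in + c_pad))


GM = 40e-3      # S, assumed transconductance
C_IN = 100e-15  # F, assumed intrinsic input capacitance
C_PAD = 25e-15  # F, assumed pad capacitance

ft_intrinsic = toy_ft(GM, C_IN)
ft_with_pad = toy_ft(GM, C_IN, C_PAD)
print(f"f_T without pad: {ft_intrinsic / 1e9:.1f} GHz")
print(f"f_T with pad:    {ft_with_pad / 1e9:.1f} GHz")
```

In this toy model the pad capacitance simply adds to the input capacitance, so the apparent $f_T$ scales as $C_{in}/(C_{in}+C_{pad})$. A real pad also contributes cross-talk and output capacitance, which the sketch ignores.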

To remove the pad effect, pad de-embedding is needed [7]. Pad de-embedding is usually done in one of two ways: by obtaining the pad S parameters from EM simulation, or by measuring the S parameters of a dummy device [8–12]. Recently, ANNs have been widely used in RF design, and their combination with EM simulation in CAD has been investigated by researchers [13–19]. ANN modeling techniques are efficient alternatives to conventional methods such as numerical modeling methods, which can be computationally expensive; analytical methods, which can be difficult to obtain for new devices; and empirical models, whose range and accuracy can be limited.

In this paper, a pad modeling technique for microwave on-wafer measurement is proposed, which is based on the combination of the conventional equivalent circuit model of the pad and an artificial neural network (ANN) [19]. Each circuit element in the pad equivalent circuit model can be regarded as a sub-artificial neural network (SANN). The pad capacitances can be obtained directly from the S parameters, by using EM simulation or measurement of pads with different dimensions. Good agreement is obtained between the equivalent circuit model results and the EM simulated results. An example of PHEMT pad de-embedding utilizing the proposed technique is demonstrated.

The organization of this paper is as follows: the artificial neural network (ANN) is introduced in Section 2; the ANN-based pad modeling is described in Section 3; the pad de-embedding technique is demonstrated by using pseudomorphic high electron mobility transistors (PHEMTs) in Section 4; the conclusion is given in Section 5.

2. ANN TECHNIQUE INTRODUCTION

The multi-layer perceptron (MLP) [19] is a widely used neural network structure. The neurons in an MLP neural network are grouped into layers. The first and last layers are called the input and output layers, respectively. Between the input and output layers lies the central part of the neural network, called the hidden layer. Depending on the complexity of the input and the desired output, the number of hidden layers and the number of neurons in each layer can vary. Because there always exists a three-layer perceptron that can approximate an arbitrary nonlinear, continuous, multi-dimensional function \(f\) with any desired accuracy, a typical MLP neural network consists of an input layer, a hidden layer and an output layer, as shown in Fig. 1 [20].

![Diagram of a three-layer MLP structure.](image)

**Figure 1.** Three-layer MLP structure.

For a given input \(x\), the output of the three-layer MLP neural network can be computed by:

\[
y = w_0^3 + \sum_{i=1}^{n} w_i^3 \sigma \left( w_{i0}^2 + \sum_{j=1}^{m} w_{ij}^2 x_j \right)
\]

(2)

i.e.,

\[
y = \begin{bmatrix} w_0^3 & w_1^3 & \cdots & w_i^3 & \cdots & w_n^3 \end{bmatrix} \begin{bmatrix} 1 & z_1 & \cdots & z_i & \cdots & z_n \end{bmatrix}^T
\]

(3)

with the hidden-neuron outputs \(z_i\) obtained by applying \(\sigma\) element-wise:

\[
\begin{pmatrix}
z_1 \\
z_2 \\
\vdots \\
z_i \\
\vdots \\
z_n
\end{pmatrix} = \sigma \left(
\begin{pmatrix}
w_{10}^2 & w_{11}^2 & \cdots & w_{1j}^2 & \cdots & w_{1m}^2 \\
w_{20}^2 & w_{21}^2 & \cdots & w_{2j}^2 & \cdots & w_{2m}^2 \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
w_{i0}^2 & w_{i1}^2 & \cdots & w_{ij}^2 & \cdots & w_{im}^2 \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
w_{n0}^2 & w_{n1}^2 & \cdots & w_{nj}^2 & \cdots & w_{nm}^2
\end{pmatrix}
\begin{pmatrix}
1 \\
x_1 \\
x_2 \\
\vdots \\
x_j \\
\vdots \\
x_m
\end{pmatrix} \right)
\]

(4)

where \(\sigma(\cdot)\) is an activation function. The overall nonlinear relationship between input and output is realized by the various activation patterns of the neurons, whose activation functions are typically smooth switch functions, e.g., the sigmoid function, which can be expressed as:

\[
\sigma(\gamma) = \frac{1}{1 + e^{-\gamma}}
\]

(5)

\(w_{ij}^l\) represents the weight of the link between the \(j^{th}\) neuron of the \((l-1)^{th}\) layer and the \(i^{th}\) neuron of the \(l^{th}\) layer. \(w_0^3\) and \(w_{i0}^2\) represent the biases of the neurons of the output and hidden layers, respectively.
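
The forward computation described above can be sketched in a few lines of Python. This is a minimal illustration of a three-layer MLP with a sigmoid hidden layer and a linear output neuron; the function names, the bias-first weight layout and the numerical weights are assumptions made for the example.

```python
import math


def sigmoid(gamma: float) -> float:
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-gamma))


def mlp_forward(x, w_hidden, w_out):
    """Output y of a three-layer MLP for one input vector x.

    w_hidden[i] = [w_i0, w_i1, ..., w_im]  (bias first) for hidden neuron i.
    w_out       = [w_0,  w_1,  ..., w_n]   (bias first) for the output neuron.
    """
    z = []
    for row in w_hidden:
        # Weighted sum of the inputs plus the bias of this hidden neuron.
        gamma = row[0] + sum(w * xj for w, xj in zip(row[1:], x))
        z.append(sigmoid(gamma))
    # Linear output neuron: bias plus weighted sum of hidden outputs.
    return w_out[0] + sum(w * zi for w, zi in zip(w_out[1:], z))


# Hypothetical weights: m = 2 inputs, n = 2 hidden neurons.
w_hidden = [[0.0, 1.0, -1.0],
            [0.5, 0.0, 0.0]]
w_out = [0.1, 2.0, -1.0]
y = mlp_forward([0.3, 0.3], w_hidden, w_out)
print(y)
```

Storing the bias as the first entry of each weight row mirrors the leading 1 in the augmented input and hidden vectors of the matrix form.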

The neural model is then trained to learn the input-output relationship from the training data (samples of input-output data). Specifically, training determines the neural model parameters, i.e., the neural network weights \(w_{ij}^l\), such that the output predicted by the ANN model best matches the training data. The testing data (new input-output samples) are used to test the accuracy of the ANN model.

3. ANN BASED PAD MODELING TECHNIQUE

The neural-based microwave device modeling technique combines the conventional equivalent circuit and the artificial neural network (ANN) modeling technique. Each intrinsic nonlinear circuit element can be modeled by using a SANN.

A typical pad profile on a GaAs substrate and the corresponding equivalent circuit model are shown in Fig. 2(a) and (b), respectively, where \(C_1\) is the capacitance between the signal pad and ground, \(C_3=C_1\) for a symmetrical pad structure, and \(C_2\) is the cross-talk capacitance between the two signal pads. The variation of the pad dimensions is shown in Table 1.

**Figure 2.** Pad and its equivalent circuit model. (a) Pad structure. (b) Equivalent circuit model.

The pad capacitances can be described as follows:

\[
C_1 = \frac{\text{imag}(Y_{11} + Y_{12})}{2\pi f}
\]

(6)

\[
C_2 = -\frac{\text{imag}(Y_{12})}{2\pi f}
\]

(7)

\[
C_3 = \frac{\text{imag}(Y_{22} + Y_{12})}{2\pi f}
\]

(8)

where \( f \) is the frequency, \( Y_{ij} \) \((i, j = 1, 2)\) are the Y parameters of the pad, and \( Y_{11} = Y_{22} \), \( Y_{12} = Y_{21} \).
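
The three expressions for \(C_1\), \(C_2\) and \(C_3\) above translate directly into code. The Python sketch below extracts the capacitances from pad Y parameters at a single frequency and checks the result against a lossless capacitive pi network. The helper `pi_network_y` and the capacitance values are assumptions used only for this round-trip check.

```python
import math


def pad_capacitances(freq_hz, y11, y12, y22):
    """Return (C1, C2, C3) in farads from pad Y-parameters at one frequency."""
    omega = 2.0 * math.pi * freq_hz
    c1 = (y11 + y12).imag / omega
    c2 = -y12.imag / omega
    c3 = (y22 + y12).imag / omega
    return c1, c2, c3


def pi_network_y(freq_hz, c1, c2, c3):
    """Y-parameters (y11, y12, y21, y22) of a lossless capacitive pi network."""
    jw = 1j * 2.0 * math.pi * freq_hz
    return jw * (c1 + c2), -jw * c2, -jw * c2, jw * (c3 + c2)


# Round trip with assumed values (not taken from this paper).
f0 = 10e9
y11, y12, y21, y22 = pi_network_y(f0, 30e-15, 5e-15, 30e-15)
c1, c2, c3 = pad_capacitances(f0, y11, y12, y22)
print(c1, c2, c3)
```

The extraction divides by \(2\pi f\), so the DC point must be skipped when sweeping from 0 Hz.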

Table 1. Values of the variable pad input parameters.

<table>
<thead>
<tr>
<th>Parameters</th>
<th>Notation</th>
<th>Values (µm)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Width</td>
<td>( W )</td>
<td>50–100</td>
</tr>
<tr>
<td>Length</td>
<td>( L )</td>
<td>50–100</td>
</tr>
<tr>
<td>Slot</td>
<td>( S )</td>
<td>50–300</td>
</tr>
</tbody>
</table>

In Fig. 2(b), \( C_1, C_2 \) and \( C_3 \) can be described by using sub-ANN as follows:

\[ C_1 = f_{ANN}^{C_1}(W, L, S) \]  

(9)

\[ C_2 = f_{ANN}^{C_2}(W, L, S) \]  

(10)

\[ C_3 = f_{ANN}^{C_3}(W, L, S) \]  

(11)

where \( f_{ANN} \) represents the ANN of each element of the pad structure. The pad capacitances are thus functions of \( W, L \) and \( S \). The corresponding ANN-based pad equivalent circuit model is shown in Fig. 3.

The training data were obtained from EM simulation over a frequency range of 0 to 40 GHz for different pad dimensions. After the pad capacitance versus dimensions was obtained, the training was conducted by using a combination of the Conjugate-Gradient and Back Propagation methods until the difference between the training data and the output of the ANN model was less than 1%. Five neurons were used in the ANN model.
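
To make the training step concrete, the following Python sketch fits a small MLP with three inputs (normalized \(W\), \(L\), \(S\)) and five sigmoid neurons to a capacitance-like target. It is a simplified illustration, not the procedure used here: it uses plain full-batch gradient descent with back-propagated gradients rather than the combined Conjugate-Gradient and Back Propagation scheme, and the target function `toy_target`, the sample grid, the learning rate and the epoch count are all assumptions standing in for EM-simulated data.

```python
import math
import random


def sigmoid(g):
    return 1.0 / (1.0 + math.exp(-g))


def forward(x, w_hid, w_out):
    """Return (y, z): network output and hidden-neuron outputs."""
    z = [sigmoid(r[0] + sum(w * xj for w, xj in zip(r[1:], x))) for r in w_hid]
    y = w_out[0] + sum(w * zi for w, zi in zip(w_out[1:], z))
    return y, z


def loss(data, w_hid, w_out):
    """Half mean squared error over the data set."""
    return sum((forward(x, w_hid, w_out)[0] - t) ** 2 for x, t in data) / (2 * len(data))


def train(data, n_hidden=5, epochs=500, lr=0.1, seed=0):
    """Full-batch gradient descent with back-propagated gradients."""
    rng = random.Random(seed)
    m = len(data[0][0])
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(m + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        g_hid = [[0.0] * (m + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        for x, t in data:
            y, z = forward(x, w_hid, w_out)
            err = (y - t) / len(data)
            g_out[0] += err
            for i, zi in enumerate(z):
                g_out[i + 1] += err * zi
                # Error propagated back through the sigmoid of hidden neuron i.
                delta = err * w_out[i + 1] * zi * (1.0 - zi)
                g_hid[i][0] += delta
                for j, xj in enumerate(x):
                    g_hid[i][j + 1] += delta * xj
        w_out = [w - lr * g for w, g in zip(w_out, g_out)]
        w_hid = [[w - lr * g for w, g in zip(r, gr)] for r, gr in zip(w_hid, g_hid)]
    return w_hid, w_out


def normalize(value, lo, hi):
    return (value - lo) / (hi - lo)


def toy_target(wn, ln, sn):
    """Synthetic stand-in for an EM-simulated capacitance (arbitrary units)."""
    return 0.2 + 0.6 * wn * ln - 0.1 * sn


# Sample grid over the ranges of Table 1 (grid points are assumed).
data = []
for W in (50, 75, 100):
    for L in (50, 75, 100):
        for S in (50, 175, 300):
            x = [normalize(W, 50, 100), normalize(L, 50, 100), normalize(S, 50, 300)]
            data.append((x, toy_target(*x)))

w_hid0, w_out0 = train(data, epochs=0)  # untrained weights, same seed
w_hid, w_out = train(data)
initial_loss = loss(data, w_hid0, w_out0)
final_loss = loss(data, w_hid, w_out)
print(initial_loss, final_loss)
```

Normalizing the dimensions to the unit interval keeps the sigmoid inputs in a well-conditioned range. A stopping rule based on a relative error threshold, such as the 1% criterion mentioned above, could replace the fixed epoch count.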

Fig. 4, Fig. 5 and Fig. 6 show the pad capacitance versus width \( W \), length \( L \) and slot \( S \), respectively. Figs. 4–6 compare the data obtained from the ANN model with the EM simulated (training) data for the pad. The solid line and the line with circles show the EM simulated results and the ANN modeled results, respectively. Good agreement is obtained. When a device or circuit of a different size is considered, its pad performance can be obtained efficiently from the ANN function instead of from EM simulation. The pad represented by the sub-ANNs and the DUT can be considered together, as shown in Fig. 7.

**Figure 3.** ANN based pad equivalent circuit model.

**Figure 4.** Pad capacitance versus its width \( (L = 50 \mu m \text{ and } S = 100 \mu m) \).

4. PAD DE-EMBEDDING METHOD FOR PHEMT

In Fig. 7, let the DUT be an AlGaAs/InGaAs/GaAs pseudomorphic high electron mobility transistor (PHEMT) with 0.25 µm mushroom gates, grown and fabricated using the process technology developed by NTU. The layer structure of the wafer, from bottom to top, consists of a GaAs undoped buffer layer, a 140 Å undoped $In_{0.22}Ga_{0.78}As$ strained layer, a 40 Å $Al_{0.25}Ga_{0.75}As$ spacer layer, a $5 \times 10^{12}$ cm$^{-2}$ Si $\delta$-doping plane, a 220 Å i-$Al_{0.25}Ga_{0.75}As$ source layer, and a Si-doped 450 Å $n^+$-GaAs cap layer.

In this paper, a PI-gate PHEMT is used, which has a gate width of $2 \times 40 \mu m$ (number of gate fingers × unit gate width) and pad dimensions of $W \times L \times S=60 \mu m \times 60 \mu m \times 200 \mu m$.

The main interest in RF probing for device characterization is to characterize the intrinsic device, in order to model its behavior at GHz frequencies when it is embedded in an IC design environment. The intrinsic device in an IC design environment will not have probe pads attached to it, except when it is used as a test structure. Because a measurement on wafer with calibrated probe tips contains the intrinsic device characteristics plus the pad parasitics, the probe pad parasitic effect must be de-embedded from the measurement. Pad capacitance cannot be removed by calibration methods (such as SOLT, TRL, LRM, etc.). Dummy devices are therefore introduced: two layout patterns, one including the DUT and the other (the dummy) excluding it, are fabricated on the same wafer, as shown in Fig. 8. Here, we examine both pad de-embedding and probe pad layout techniques, since they are closely related. Proper probe layout rules, in addition to the technology design rules, must be followed. The pad de-embedding procedure can be summarized as follows [7, 9].

1) Calibrate the network analyzer up to the tips of the probe by using either on-wafer or off-wafer calibration standard patterns.

2) Verify the calibration on the measurement wafer. Verification of the calibration can be done using high-Q inductors or capacitors. Verification will not be correct if it is done on the standard pattern where calibration is done. This is because those patterns are already used for calibration.

3) Measure the S-parameters of the dummy device and convert them to Y-parameters.

4) Measure the S-parameters of the DUT and convert them to Y-parameters.

5) Subtract the dummy Y-parameters from the DUT Y-parameters, and convert the results back to S-parameters.
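
Steps 3) to 5) can be sketched in Python as follows. This is a minimal example for two-port data at a single frequency, assuming equal, real reference impedances of 50 Ω at both ports and using the standard conversions \(Y = \frac{1}{Z_0}(I - S)(I + S)^{-1}\) and \(S = (I - Z_0 Y)(I + Z_0 Y)^{-1}\). The function names and the numerical Y-parameters in the check are assumptions, not measured data.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]


def mat_inv(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def mat_add(a, b, sign=1.0):
    """Element-wise a + sign * b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def s_to_y(s, z0=50.0):
    """Y = (1/Z0) (I - S)(I + S)^-1."""
    m = mat_mul(mat_add(IDENTITY, s, -1.0), mat_inv(mat_add(IDENTITY, s)))
    return [[e / z0 for e in row] for row in m]


def y_to_s(y, z0=50.0):
    """S = (I - Z0 Y)(I + Z0 Y)^-1."""
    zy = [[z0 * e for e in row] for row in y]
    return mat_mul(mat_add(IDENTITY, zy, -1.0), mat_inv(mat_add(IDENTITY, zy)))


def deembed_pad(s_dut_meas, s_dummy, z0=50.0):
    """Steps 3-5: subtract the dummy (pad) Y-parameters from the measured ones."""
    y_meas = s_to_y(s_dut_meas, z0)
    y_dummy = s_to_y(s_dummy, z0)
    return y_to_s(mat_add(y_meas, y_dummy, -1.0), z0)


# Synthetic check at 10 GHz with assumed pad and intrinsic Y-parameters.
jw = 1j * 2.0 * math.pi * 10e9
y_pad = [[jw * 35e-15, -jw * 5e-15], [-jw * 5e-15, jw * 35e-15]]
y_int = [[0.002 + 0.005j, -0.0005j], [0.04 - 0.01j, 0.003 + 0.002j]]

s_meas = y_to_s(mat_add(y_int, y_pad))  # device with pads in parallel
s_dummy = y_to_s(y_pad)                 # pads alone
s_int = deembed_pad(s_meas, s_dummy)
y_rec = s_to_y(s_int)
print(y_rec)
```

The Y-parameter subtraction is valid when the pad parasitics act in parallel with the device, as in a purely capacitive pad model. In an ANN-based flow, `s_dummy` could be generated from the modeled pad capacitances instead of a measured dummy structure.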

By using ANN-based pad modeling, the S parameters of the pad are obtained, and good agreement is found between the ANN-based method and the EM simulated results, as shown in Fig. 9–Fig. 12. The discrepancy between the ANN-based and EM simulated results for the magnitude of $S_{11}$ at higher frequencies arises because loss is omitted in the ANN-based model.

By using the ANN-based pad de-embedding method, the pad effect is removed, as shown in Fig. 13. It can be seen that the pad capacitance affects $S_{12}$ and $S_{22}$ more strongly than the phase of $S_{21}$ and the magnitude of $S_{11}$.

**Figure 10.** Comparison of $S_{21}$ magnitude of pad between ANN-based method and EM simulated result.

**Figure 11.** Comparison of $S_{11}$ phase of pad between ANN-based method and EM simulated result.

**Figure 12.** Comparison of $S_{21}$ phase of pad between ANN-based method and EM simulated result.

**Figure 13.** Comparison of S parameters of PHEMT devices with and without pad.

**Figure 14.** High-frequency gain $H_{21}$ with and without pad.

Fig. 14 shows the effect of the pad on the high-frequency gain $H_{21}$, which is used to determine $f_T$. The discrepancy between the PHEMT with and without the pad is around 10 GHz.

5. CONCLUSION

In order to investigate the effect of pad capacitance on the performance of on-wafer devices, a pad modeling technique based on the combination of the conventional equivalent circuit model and an artificial neural network (ANN) is proposed. The pad modeling technique describes the relationship between the pad capacitance and the pad dimensions by means of the ANN in the equivalent circuit model up to 40 GHz. Good agreement between the ANN-based modeling and the EM simulated results demonstrates the validity of the pad modeling technique. Finally, the effect of the pad on AlGaAs/InGaAs/GaAs pseudomorphic high electron mobility transistors (PHEMTs) is demonstrated by using the proposed pad modeling technique.

ACKNOWLEDGMENT

The work was supported by both the National Natural Science Foundation of China and National High Tech 863 Research Plan of China (No.2006AA04A102).
REFERENCES


