Fusion of Physical Mechanism and Data-Driven Methods for Online Thickness Measurement and Error Compensation in SiC CMP

Junjie Lin; Taotao Chen; Yicheng Ren; Zhilong Song; Binghai Lyu; Julong Yuan; Wenhong Zhao

PMC · DOI:10.3390/mi17030313·February 28, 2026

Fusion of Physical Mechanism and Data-Driven Methods for Online Thickness Measurement and Error Compensation in SiC CMP

Junjie Lin, Taotao Chen, Yicheng Ren, Zhilong Song, Binghai Lyu, Julong Yuan, Wenhong Zhao

PDF

Open Access

TL;DR

This paper introduces a high-precision method for measuring and compensating errors in the thickness of silicon carbide wafers during polishing.

Contribution

A hierarchical hybrid error compensation method combining physical models and LSTM networks for online SiC CMP thickness measurement.

Findings

01

The proposed method achieves an RMSE of 0.47471 μm in thickness measurement.

02

The MAPE is reduced to 0.1102% after applying the compensation method.

03

Measurement deviation is maintained within ±1 μm of reference values.

Abstract

The thickness of silicon carbide (SiC) wafers is a crucial parameter that significantly affects the performance of devices, and its high-precision online measurement during chemical mechanical polishing (CMP) faces challenges from complex process-induced errors. To address this issue, this study develops a non-contact online thickness measurement system based on oblique-incidence laser triangulation and proposes a hierarchical hybrid error compensation method. Deterministic systematic errors caused by optical interference from polishing slurry are first compensated by combining an optical propagation physical model with experimental calibration. Subsequently, a Long Short-Term Memory (LSTM) network model is introduced to compensate for nonlinear, time-series-related dynamic random errors, primarily induced by temperature drift and associated thermal effects. Experimental results…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Chemicals1

SiC

Figures12

Click any figure to enlarge with its caption.

Funding1

—National Natural Science Foundation of China

Keywords

online thickness measurementoblique-incidence laser triangulationerror compensationLSTMSiCCMP

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Surface Polishing Techniques · Advanced Measurement and Metrology Techniques · Advanced machining processes and optimization

Full text

1. Introduction

Chemical mechanical polishing (CMP) plays a vital role in attaining global nanoscale planarization of wafers within the semiconductor manufacturing industry [1]. This process, which integrates chemical corrosion with mechanical grinding, is essential for the surface processing of third-generation semiconductor materials such as silicon carbide (SiC) [2]. Thickness serves as a vital geometric parameter for SiC, with a decisive influence on its physical properties and device performance. Consequently, precise control of wafer thickness is a fundamental requirement in the semiconductor manufacturing process [3]. However, traditional offline thickness measurement methods necessitate interrupting the polishing process for inspection, leading to considerable time delays. This interruption hinders the real-time acquisition of the material removal rate (MRR) for process adjustment and the accurate determination of the polishing endpoint, which can result in defects such as over-polishing or under-polishing of the wafer [4]. Therefore, the development of a high-precision, non-contact, and real-time online thickness measurement technique is of critical engineering significance for achieving precise control and intelligent enhancement of the SiC wafer CMP process.

To achieve online thickness measurement, researchers both domestically and internationally have employed various techniques to study wafer thickness during the polishing process. These techniques include eddy current, capacitive, ultrasonic, and laser approaches. The eddy current technique, which relies on electromagnetic induction, offers a low cost and fast response [5]; however, it is primarily used for detecting the thickness of metal films and has limitations for non-conductive dielectric layers [6]. The capacitive method characterized by its non-contact nature, shows potential for thickness detection of multi-layer dielectric films [7]. However, its non-linear output characteristic constrain measurement resolution at the micron scale [8]. The ultrasonic method determines thickness based on differences in acoustic impedance; however, its applicability to sub-micron thickness measurement is restricted by the acoustic wavelength, which limits achievable resolution [9]. In comparison, laser-based methods demonstrate significant potential for online thickness measurement owing to their non-contact operation and high spatial resolution. Specifically, the oblique-incidence laser triangulation method, through optimized optical path design, enables simultaneous detection of the upper and lower surfaces of semi-transparent materials such as SiC, thereby improving measurement accuracy to the sub-micron level [10,11]. Consequently, it is regarded as one of the most promising solutions for online thickness measurement. However, during the polishing process, this method is still affected by two major categories of errors: deterministic systematic errors induced by changes in optical interface characteristics caused by polishing slurry, and nonlinear, time-series-related dynamic random errors resulting from factors such as temperature drift and mechanical vibration. The coupling of these errors significantly degrades the accuracy of online thickness measurement.

Regarding the issue of error compensation in thickness measurement, scholars have proposed various improvement methods. For instance, in ultrasonic oil film thickness measurement, Jia et al. [12] analyzed the influence of temperature mechanism by establishing a multi-physics field coupling model and proposed a compensation method that combines sound speed-temperature modeling with signal correction. Similarly, Palanisamy et al. [13] achieved real-time temperature compensation by establishing a relationship model between thermal distribution and acoustic wave propagation speed, along with a multi-channel signal processing method, effectively reducing errors in thickness measurement. These methods have demonstrated effectiveness in addressing the first category of deterministic systematic errors. However, for the second category of dynamic random errors, their inherent complexity makes them difficult to be accurately modeled and compensated using traditional physics-based approaches.

In recent years, the integration of data-driven methods with ultra-precision machining has emerged as a critical research direction in advanced manufacturing. Deep learning models are increasingly being applied to processes such as polishing and grinding to model complex behaviors, predict outcomes like material removal rate (MRR) and surface quality [14,15], and achieve process optimization under dynamic conditions, often outperforming traditional empirical or physics-based models. In parallel, time-series models based on recurrent neural networks (RNNs)—notably Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and bidirectional LSTM (BiLSTM)—have been widely adopted across industrial domains. Their applications span tool wear prediction [16], remaining useful life estimation [17], power load forecasting [18], discharge state detection in micro-EDM [19], and process state monitoring in ultra-precision manufacturing [20]. Recent studies confirm that such models are particularly effective in capturing long-term temporal dependencies and nonlinear dynamics inherent in complex manufacturing processes.

Building upon the proven efficacy of LSTM in handling complex temporal data, this study focuses on the design of an online thickness measurement system based on oblique-incidence laser triangulation and proposes a hybrid compensation method suitable for the complex conditions of CMP. The proposed method elucidates the physical mechanism and deterministic characteristics of optical errors caused by the polishing slurry through optical propagation theory modeling, and achieves effective compensation for the deterministic systematic errors introduced by the slurry’s optical interference and system installation through experimental calibration. Furthermore, an LSTM model is introduced to adaptively learn and compensate for the remaining dynamic random errors. This approach successfully combines the interpretability of a physical model with the nonlinear fitting capability of a data-driven model, thereby enhancing the accuracy of thickness measurements under complex conditions. Consequently, it provides a reliable data foundation for real-time measurements, endpoint judgment, along with closed-loop control of the SiC CMP process.

2. Design and Principles of the Online Thickness Measurement System

To achieve online measurement of wafer thickness during the SiC CMP process, this section presents the design of an integrated online thickness measurement system based on oblique-incidence laser triangulation. As illustrated in Figure 1, the system integrates a laser triangulation thickness sensor, a temperature sensor, a main control circuit board, and an independent power supply, all housed within the polishing head. The LK-H008W laser thickness sensor (Keyence Corporation, Osaka, Japan) and the MLX90614 infrared temperature sensor (Melexis NV, Ypres, Belgium) are used to collect thickness and temperature data, respectively, with real-time wireless data transmission facilitated via Wi-Fi connection. The key specifications of the LK-H008W sensor are summarized in Table 1. Notably, its resolution is 0.2 $[eqn]$ , while the MLX90614 sensor has a resolution of 0.01 °C.

This study employs the oblique-incidence laser triangulation method for thickness measurement, which offers distinct advantages for semi-transparent materials such as SiC. It is particularly suitable for online measurement scenarios with a rotating polishing head. The underlying principle is shown in Figure 2. As the laser beam goes through a wafer at a specific incidence angle, reflection spots are formed on both the upper and lower surfaces of the wafer. The laser thickness sensor determines the wafer thickness by capturing the imaging displacement of these two reflection spots, based on the principles of geometric optics [11]:

[eqn]

Table 2 lists the definitions and typical values of the geometric parameters in Equation (1).

3. Analysis of Deterministic Systematic Errors and Establishment of Compensation Model

In the online thickness measurement system of CMP, this paper categorizes system errors into deterministic systematic errors and dynamic random errors based on their generation mechanisms. Deterministic systematic errors refer to inherent deviations that arise from the optical interference introduced by the polishing slurry as well as from the system integration and installation. In contrast, dynamic random errors are caused by time-series-related and highly random factors during the polishing process, such as temperature drift and mechanical vibration. This section focuses on analyzing the deterministic systematic errors resulting from the optical interference of the polishing slurry, establishing a compensation model for these errors, and ultimately calibrating and validating this model through experiments.

3.1. Calibration of Laser Thickness Sensor

The accuracy of the laser thickness sensor’s output serves as a foundational prerequisite for the subsequent compensation of slurry-induced optical errors. A high-precision laser thickness calibration platform was constructed, as illustrated in Figure 3a. During the calibration process, the Shenzhen Zhongtu Instrument Co., Ltd. (Shenzhen, China) WD4000 Patternless Wafer Geometric Measurement System employed as the reference true value for thickness, as illustrated in Figure 3b. This system, which operates on the principle of spectral confocal microscopy, offers a resolution of ±0.01 $[eqn]$ , thereby providing a reliable reference true value for sensor calibration.

The calibration measurements were conducted on a set of 4H-SiC single-crystal wafers (n-type) with a (0001) crystallographic orientation (Si-face). In the SiC CMP process, the wafer thickness typically falls within the range of ~350–500 $[eqn]$ , a range within which 4H-SiC exhibits sufficient effective semi-transparency at 655 nm for the present measurement configuration. This was confirmed experimentally: the laser triangulation sensor consistently and robustly captured the dual-spot signals reflected from both the upper and lower surfaces of each wafer, validating the practical feasibility of the optical measurement.

The raw sensor output showed an excellent linear correlation with the high-precision thickness values measured by the WD4000 system, achieving a coefficient of determination $[eqn]$ = 0.99996. This relationship was modeled using linear regression:

[eqn]

where $[eqn]$ is the calibrated thickness, $[eqn]$ is the raw measured thickness, $[eqn]$ is the scale factor, and $[eqn]$ is the offset. The fitting results yield $[eqn]$ = 1.76607 and $[eqn]$ = 9.0201, as shown in the linear regression plot in Figure 4.

As shown by the data in Table 3, the calibrated measurement errors are all controlled within ±0.15%. This verifies the high linearity and accuracy of the sensor, establishing a reliable foundation for the subsequent separation and compensation of optical errors in the CMP process.

3.2. Optical Mechanism Analysis of Slurry-Induced Measurement Errors

To analyze the impact of polishing slurry on thickness measurement during the CMP process, this section establishes a theoretical model based on optical propagation mechanisms. This model elucidates the root causes and deterministic characteristics of measurement errors, thereby providing a theoretical basis for subsequent deterministic compensation through experimental calibration.

3.2.1. Qualitative Analysis of Optical Interface Changes and Error Sources

Under benchmark conditions without polishing slurry (dry condition), the laser reflects at the “SiC-air” interface, resulting in a clear optical path and stable signal, with only fixed errors caused by installation present. In the actual CMP working condition (wet condition), the optical interface changes to “SiC-polishing slurry,” thereby introducing optical interference in two aspects:

Reflection signal attenuation and distortion. The polishing slurry has a higher refractive index than air, resulting in a significant drop in effective reflectivity at the interface. This reduction in reflectivity decreases the signal-to-noise ratio (SNR) of the detected spot, thereby directly affecting the accuracy of centroid positioning.
Optical path changes due to the equivalent optical medium layer. The polishing slurry infiltrates and fills surface pores of the polishing pad, forming an equivalent optical medium layer between the wafer and pad, as illustrated in Figure 5. The laser beam traverses this equivalent optical medium layer composed of the slurry and pad before returning, resulting in an optical path difference (OPD). It is important to note that under CMP conditions, reflections occur at both the SiC-slurry interface (the wafer’s lower surface) and the slurry-polishing pad interface. Due to the sub-micron thickness of the equivalent optical medium layer, these two reflected beams have a minimal optical path difference. Consequently, on the imaging sensor of the laser triangulation system, they are not resolved as two separate spots but merge into a single composite spot (Spot 2 in Figure 5b).

The combined effect of these factors diminishes the accuracy of centroid positioning of the light spot. This results in a deviation in the imaging displacement of the spot as calculated by the sensor, ultimately leading to errors in the thickness measurement results.

3.2.2. Reflectivity Attenuation and Its Influence on Spot Centroid Accuracy

The core of thickness calculation lies in accurately acquiring the centroid positions of the upper and lower surface reflection spots on the image sensor. The accuracy of this directly depends on the SNR of the reflected light signal. A reduction in reflected light intensity directly lowers the SNR for the lower surface spot, leading to decreased accuracy in centroid positioning on the image sensor [21].

To quantitatively evaluate the influence of optical interface variations on the reflected signal intensity, the refraction and reflection behavior at the SiC interface is analyzed using Snell’s law and Fresnel equations. According to Snell’s Law, the refraction angle ( $[eqn]$ ) is governed by the following equation:

[eqn]

where $[eqn]$ and $[eqn]$ are the refractive indices of the two media, and $[eqn]$ is the incident angle.

The reflectivity for s-polarization and p-polarization, denoted as $[eqn]$ and $[eqn]$ respectively, are:

[eqn]

[eqn]

For a non-polarized light source, the effective reflectivity ( $[eqn]$ ) is the mean of the two:

[eqn]

Based on actual parameters, the computed effective reflectivity for different interfaces is shown in Table 4.

As shown in Table 4, when the interface transitions from SiC–air to SiC–polishing slurry, the reflectivity decreases from 21.8% to 11.2%, representing a reduction of 48.6%. This pronounced attenuation of reflected intensity directly degrades the SNR of the lower-surface spot and introduces a deterministic centroid displacement in the captured image.

In this work, the Fresnel calculations adopt simplifying assumptions, including real refractive indices and an unpolarized light approximation, to establish a clear first-order physical model capturing the dominant deterministic effect, namely the slurry-induced reflectivity change. Secondary effects such as absorption and partial polarization, which are expected to be of secondary magnitude under CMP measurement conditions, are implicitly incorporated through experimental calibration.

Therefore, under CMP conditions, this reflectivity attenuation introduces a deterministic image displacement deviation $[eqn]$ , which can be modeled as linearly proportional to the reflectivity difference between the dry and wet states:

[eqn]

where $[eqn]$ is an experimentally calibrated coefficient, $[eqn]$ is the effective reflectivity under dry conditions (SiC–air interface), and $[eqn]$ is that under wet conditions (SiC–slurry interface).

3.2.3. Optical Path Difference and Phase Change in the Equivalent Optical Medium Layer

Under CMP conditions, the polishing slurry permeates and fills the pores on the surface of the polishing pad, forming an equivalent optical medium layer with a thickness $[eqn]$ and an average refractive index $[eqn]$ between the wafer and pad. As illustrated in Figure 5, this medium layer alters the laser propagation path. In addition to the primary reflection at the SiC–polishing slurry interface, an additional reflection occurs at the polishing slurry–polishing pad interface. As noted in Section 3.2.1, because these two reflecting interfaces are extremely close, their reflected beams merge into a single composite spot (Spot 2 in Figure 5b) on the imaging sensor, leading to an error when the system calculates the overall centroid of this composite spot.

This equivalent optical medium layer increases the round-trip optical path length of the laser within the medium, generating an additional OPD $[eqn]$ :

[eqn]

where $[eqn]$ is the thickness of the equivalent optical medium layer, $[eqn]$ is the average refractive index of the polishing slurry, and $[eqn]$ is the refraction angle of the laser within the slurry. Equation (8) quantifies the additional optical path difference introduced by the equivalent slurry layer formed during CMP.

When the laser wavelength is $[eqn]$ , this OPD induces a phase change $[eqn]$ in the reflected light:

[eqn]

In an oblique-incidence laser triangulation system, this phase change is interpreted by the optical sensor as a change in the geometrical thickness of the SiC wafer. This misinterpretation ultimately manifests as an equivalent displacement $[eqn]$ of the spot centroid on the imaging plane. Under CMP conditions, this displacement can be modeled as linearly proportional to the optical path difference $[eqn]$ :

[eqn]

where $[eqn]$ is a proportionality coefficient related to the optical path structure and imaging parameters of the system, and $[eqn]$ is a composite calibration coefficient that absorbs the known wavelength $[eqn]$ and the system-specific coefficient $[eqn]$ .

3.2.4. Integrated Modeling of Slurry-Induced Optical Errors

In actual measurements, both reflectivity attenuation and optical path change jointly determine the total deviation $[eqn]$ of the spot imaging displacement. Given the deterministic nature of these errors under stable conditions, we model their combined effect as a linear superposition of the two independent contributions defined in Formulas (7) and (10):

[eqn]

Consequently, the relationship between the imaging displacements under wet ( $[eqn]$ ) and dry ( $[eqn]$ ) conditions is:

[eqn]

By substituting the imaging displacements $[eqn]$ and $[eqn]$ into the thickness calculation Formula (1), we obtain the corresponding thickness readings $[eqn]$ and $[eqn]$ . The deterministic systematic error introduced by the optical interference of the polishing slurry is defined as:

[eqn]

where $[eqn]$ is a system-specific transfer coefficient that converts the imaging displacement error into a thickness error.

In this study, stable CMP conditions refer to a quasi-steady operating regime reached shortly after polishing initiation, in which the overall polishing state can be regarded as locally stable over a short time window. Under this condition, the measurement outputs of the sensors—including the laser triangulation sensor and the temperature sensor—exhibit good repeatability. This theoretical model demonstrates that under stable CMP conditions, both $[eqn]$ and $[eqn]$ are deterministic. Notably, $[eqn]$ is not random noise; instead, it exhibits a deterministic and repeatable systematic deviation.

3.3. Engineering Implementation of the Deterministic Systematic Errors Compensation Model

Based on the theoretical analysis presented, this paper constructs an optical interference compensation model suitable for CMP conditions. The expression for this model is as follows:

[eqn]

where $[eqn]$ is the thickness value after optical compensation, $[eqn]$ is the optical error quantity in the model to be calibrated, and $[eqn]$ is the system static error.

The structural framework of this compensation model is illustrated in Figure 6. This model elucidates the physical sources of the optical errors induced by the polishing slurry. Under the actual dynamic CMP conditions, the distribution of the polishing slurry layer and the microstructure of the polishing pad are subject to constant change. Parameters within the theoretical model, such as equivalent thickness and effective reflectivity, are challenging to measure accurately in real time. Consequently, this paper directly calibrates the optical error quantity and the system’s static error through comparative experiments. This approach eliminates the need for real-time measurement of the microscopic, dynamic optical parameters during processing. Instead, it extracts a stable compensation value under controlled process conditions, thereby offering greater practicality for engineering applications.

3.4. Experimental Calibration and Validation of the Deterministic Error Compensation Model

3.4.1. Calibration Experiment Under Dry and CMP Conditions

The calibration experiments were completed on a UNIPOL-1200S automatic pressure grinding and polishing machine (Shenyang Kejing Automation Equipment Co., Ltd., Shenyang, China). The polishing head integrated with the laser thickness sensor was installed on the main spindle. A schematic of the experimental platform is provided in Figure 7, and the key polishing process parameters are detailed in Table 5.

SiC wafers with known thicknesses were selected as standard samples for the experiment. The calibration experiment included measurements under two conditions: dry and wet. The baseline measurement for the dry condition was conducted with the polishing machine halted and without the use of polishing slurry. The calibrated integrated thickness measurement system was employed to measure the standard samples, and the thickness reading $[eqn]$ was recorded. Following the initiation of the polishing machine and allowing the process to stabilize, stable CMP condition measurements were performed on the same set of wafers at identical measurement positions, with the thickness reading recorded as $[eqn]$ .

By substituting the measurement data from the dry condition (where $[eqn]$ = 0) and the wet condition into Formula (14), the calibration parameters can be solved simultaneously:

[eqn]

where $[eqn]$ refers to the measurement value obtained from the integrated laser thickness measurement system under the dry conditions, and $[eqn]$ represents the measurement value under wet conditions.

3.4.2. Calibration of Deterministic Systematic Errors and Model Validation

Using Equation (15) to analyze the calibration samples, we obtained values of $[eqn]$ $[eqn]$ and $[eqn]$ . These parameters were subsequently substituted into the compensation model (Equation (14)) to adjust the wet-condition measured value $[eqn]$ of the validation samples:

[eqn]

The calibration and validation results are presented in Table 6. After correction by the deterministic compensation model, the absolute error ( $[eqn]$ ) at each measurement point was effectively controlled within ±0.5 $[eqn]$ , and the relative error ( $[eqn]$ ) was maintained below 0.1%. These results demonstrate the effectiveness and accuracy of both the compensation model and the parameter calibration method. This confirms that the compensation model can reliably eliminate the deterministic system errors introduced by the optical interference of the polishing slurry and the system integration/installation. Consequently, it stably restores the online thickness measurement accuracy under CMP conditions to a level close to the offline benchmark, providing a highly reliable solution for online thickness detection during the CMP process.

4. Research on Dynamic Error Compensation Model Based on LSTM

After processing with the deterministic error compensation model established in Section 3, the deterministic systematic errors caused by the optical interference of the polishing slurry have been effectively compensated. However, during the actual SiC CMP process, the measurement system continues to be affected by various time-varying factors such as temperature drift and mechanical vibration. These dynamic random errors are characterized by nonlinearity, time-series-related correlation, and significant randomness, making them difficult to fully describe with a fixed physical model.

To address this problem, this section constructs a dynamic error compensation model based on Long Short-Term Memory (LSTM) networks. This model utilizes the time-series data of optically compensated thickness and real-time monitored process temperature as inputs. It establishes a nonlinear time-series mapping from the measured values to the true thickness, thereby achieving high-precision compensation for residual dynamic random errors under complex working conditions.

4.1. Construction of the LSTM Dynamic Error Compensation Model

4.1.1. Selection of Model Inputs

The LSTM architecture is capable of learning long-range temporal dependencies in time-series data, while alleviating the vanishing gradient problem typically encountered in conventional recurrent neural networks [22]. It exhibits strong modeling and analytical capabilities when processing time-series data. To construct an LSTM model capable of accurately compensating for dynamic random errors, we select the thickness and temperature data collected in real-time during the CMP process, along with their rates of change, as inputs. All data are sourced from the integrated online thickness measurement system designed in Section 1, which enables continuous and stable data acquisition throughout the polishing process.

The selected thickness sequence undergoes preprocessing. Initially, this sequence is corrected using the deterministic error compensation model established in Section 3, which eliminates the systematic deviation introduced by the optical interference of the polishing slurry. Subsequently, the sequence undergoes offset processing based on the initial thickness, to mitigate the impact of variations in the initial thickness of different wafers on model training. Furthermore, the rate of change in thickness directly reflects the instantaneous material removal rate (MRR). The variation trend of this rate contains critical dynamic process information, such as the evolution of the polishing pad state and fluctuations in process stability [23]. Therefore, the introduction of this feature enhances the model’s ability to learn the underlying patterns of thickness error changes over time.

Temperature is a primary time-varying factor that significantly influences the accuracy of the online measurement system. Its effects are multifaceted. On one hand, the heating of sensors and optical components can induce drift, resulting in random errors in the measured values [24]. On the other hand, the rate of temperature change affects both the chemical reaction rate and the mechanical friction coefficient in the polishing zone, thereby altering the material removal process [25,26]. Consequently, introducing temperature information enables the model to learn and compensate for the complex nonlinear errors induced by thermal effects.

Mechanical vibration is a potential error source in CMP, and its signal characteristics correlate with the variation in SiC surface roughness during polishing [27]. However, in the experiments conducted here, the SiC wafers exhibited low and gradually changing surface roughness, which resulted in vibration signals lacking distinct temporal variation characteristics. Consequently, their practical impact on thickness measurement can be regarded as secondary noise. Therefore, vibration features were not incorporated into the model, while temperature and its rate of change were identified as the key inputs for compensating dynamic random errors.

4.1.2. Basic Architecture of LSTM

The core innovation of LSTM lies in the introduction of the cell state, as depicted in Figure 8. This cell state enables information to be transmitted over long distances within a sequence, with control managed collectively by the input gate ( $[eqn]$ ), forget gate ( $[eqn]$ ), and output gate ( $[eqn]$ ). The input gate determines which new information to retain in the cell state, while the forget gate decides which information to eliminate from it. The output gate, in turn, determines what information to output based on the current state of the cell.

The forward propagation calculation process of a single LSTM unit is shown in Table 7. At sampling time $[eqn]$ , the input is the current input vector $[eqn]$ and the output is the hidden state $[eqn]$ . Here, $[eqn]$ and $[eqn]$ represent the weight matrices and bias vectors in the network; $[eqn]$ is the Sigmoid activation function, which controls the proportion of gated information flow; and $[eqn]$ maps values to the range of −1 to 1.

4.1.3. Structure of the LSTM Dynamic Compensation Model

Based on the input features selected in Section 4.1.1 and the basic structure of the LSTM unit described in Section 4.1.2, a dynamic error compensation model based on LSTM is constructed. This model consists of an input layer, an LSTM layer, and a fully connected output layer. The overall structure is presented in Figure 9.

The model employs a sequence-to-point prediction architecture, using the multidimensional time-series data collected during the CMP process as input. It predicts the compensation value for the current thickness measurement error. The input feature vector $[eqn]$ comprises four dimensions:

[eqn]

where $[eqn]$ is the preprocessed thickness measurement value, which has undergone optical compensation and offset correction, as defined by Formula (18); $[eqn]$ denotes the real-time process temperature; and $[eqn]$ and $[eqn]$ are the rates of change in thickness and temperature at time t, respectively, calculated using Formulas (19) and (20).

[eqn]

[eqn]

[eqn]

where t − 1 denotes the immediate previous sampling time step.

The model uses a sequence of features from a continuous time window as input in order to capture temporal dependencies. With a look-back window length of $[eqn]$ , the features from $[eqn]$ consecutive time steps are stacked into a three-dimensional input tensor:

[eqn]

where $[eqn]$ is the input tensor of the LSTM model. Each row of $[eqn]$ corresponds to the feature vector at a single time step, constructed from the variables defined in Formulas (17)–(20).

The LSTM layer adopts a single-layer structure with U hidden units. This layer performs temporal feature extraction on the input sequence and outputs the hidden state of the last time step:

[eqn]

A fully connected output layer then maps this hidden state to a scalar output, which is the predicted thickness error compensation value $[eqn]$ :

[eqn]

where $[eqn]$ and $[eqn]$ are learnable parameters. The final compensated thickness estimate $[eqn]$ is obtained as:

[eqn]

This structure enables adaptive learning of the nonlinear temporal error patterns within the time series of thickness, temperature, and their rates of change, thereby achieving effective compensation for dynamic random errors.

4.1.4. Hyperparameter Settings and Training Method

The key hyperparameters of the LSTM were determined through a grid search over the following ranges: sequence length $[eqn]$ , hidden size $[eqn]$ , number of LSTM layers {1, 2, 3}, and learning rate {0.01, 0.02, 0.03}. These ranges were predefined based on common practices in time-series prediction tasks and the characteristics of the dataset used in this study. The configuration shown in Table 8 (L = 20, U = 64, single layer, learning rate = 0.02) was selected due to its consistently low Mean Squared Error (MSE) on the validation set, stable training behavior, and its effectiveness in avoiding overfitting—a phenomenon observed with more complex architectures (e.g., three layers) or higher learning rates.

The model was implemented in a Python 3.9.19 environment based on the PyTorch 1.13.1 framework, with training accelerated through CUDA 11.6. The optimization process utilized the Adaptive Moment Estimation (Adam) algorithm, which computes adaptive learning rates for different parameters. Model training uses the mean squared error (MSE) as the loss function:

[eqn]

where $[eqn]$ is the batch size, $[eqn]$ is the predicted error compensation, and $[eqn]$ is the corresponding true error.

4.2. Experimental Settings Under CMP Conditions

To validate the effectiveness of the LSTM dynamic error compensation model under complex CMP working conditions, this section builds an experimental platform based on the UNIPOL-1200S automatic pressure grinding and polishing machine, as illustrated in Figure 7. The core component of this platform is the self-developed integrated polishing head described in Section 1 which internally integrates a laser triangulation thickness sensor and an infrared temperature sensor. This integration enables real-time acquisition of thickness and process temperature data.

To construct a temporally aligned dataset for model training and evaluation, the polishing process was paused every 2.5 min to obtain thickness reference values $[eqn]$ using the WD4000 system. At the same time, the integrated online thickness sensor was triggered at the start of each polishing pause to synchronously record a set of thickness measurements $[eqn]$ along with temperature data, thereby building a temporally aligned sequential dataset. This approach established a direct one-to-one pairing between an optically compensated online measurement and a high-precision reference value at each time step $[eqn]$ (k = 0, 1, 2…). The true error $[eqn]$ used for model training at each step was then computed directly as $[eqn]$ . The polishing process parameters are detailed in Table 9.

A total of 300 temporally aligned data points were collected throughout the polishing experiment. Following common practice for model development and evaluation, the entire dataset was randomly split into a training set (70%, 210 samples) and a held-out test set (30%, 90 samples). The training set was used for LSTM model learning and hyperparameter tuning (via grid search, as described in Section 4.1.4). The completely independent test set was strictly reserved for the final, unbiased evaluation of model generalization performance, with all results reported in Section 4.3 (Table 10, Figure 10 and Figure 11).

4.3. Results Analysis and Discussion

To validate the effectiveness of the proposed LSTM dynamic error compensation model under CMP working conditions, we compare it with three baselines: (1) the results with only deterministic optical error compensation (denoted as ‘None’), (2) a classical autoregressive (AR) time-series model, and (3) the Extended Kalman Filter (EKF). The comparison encompasses thickness time-series tracking capability, error distribution characteristics, and prediction consistency. The experimental results are illustrated in Figure 10, Figure 11 and Figure 12, and the numerical error metrics are summarized in Table 10.

As shown in Table 10, the AR model achieves almost identical error metrics to the ‘None’ baseline, indicating that linear time-series models cannot capture the complex nonlinear dynamics of the residual error. The EKF yields a moderate improvement, reducing the RMSE by 24.2% compared to ‘None’. However, the proposed LSTM model significantly outperforms all baselines, achieving an RMSE of 0.47471 $[eqn]$ and an MAPE of 0.1102%—a 44.0% reduction in RMSE and a 40.1% reduction in MAPE relative to EKF. This demonstrates that the LSTM’s ability to model long-term temporal dependencies and nonlinear patterns is essential for compensating the dynamic random errors in CMP thickness measurement.

Figure 10 presents a time-series comparison between the reference thickness and the thickness after LSTM compensation on both the training and test sets. It can be observed that the LSTM output stably follows the monotonically decreasing trend of the thickness throughout the entire polishing process and remains synchronized even during phases where the rate of thickness change fluctuates significantly. This stability is attributed to the gating mechanism of the LSTM: its Cell State can retain the overall trend of thickness variation over the long term, while the Forget Gate dynamically filters out irrelevant fluctuations caused by short-term process disturbances.

Figure 11 illustrates the variation in thickness error over time before and after LSTM compensation. After deterministic optical error compensation, the residual error still exhibits clear time-series correlation and random fluctuation characteristics, with its amplitude increasing significantly during certain periods. This behavior indicates that the residual error is not white noise, but rather dynamic random error resulting from the coupling of multiple CMP process disturbances, such as temperature drift and mechanical vibration. These characteristics are consistent with the complex, non-stationary nature of the CMP environment described in the previous sections.

After LSTM compensation, the range of error fluctuation is significantly reduced, and the error distribution becomes more concentrated. This demonstrates that the proposed model does not merely smooth the measurement signal, but effectively learns the intrinsic temporal evolution patterns of the dynamic error.

Figure 12 presents a scatter plot of the LSTM-compensated thickness against the reference values, along with the linear regression fit. Across the entire thickness range, the data points are closely distributed around the ideal line y = x, indicating good linear consistency and interval stability. This result confirms that the proposed compensation model does not introduce additional systematic bias with respect to thickness variation, and maintains stable accuracy over the full measurement range.

To further contextualize the choice of LSTM, we compare it with other advanced time-series models that have been widely applied in industrial process monitoring, such as Gated Recurrent Unit (GRU) networks and Transformer-based architectures. Compared with GRU, LSTM features a more explicit memory cell structure, which is particularly advantageous for capturing long-term temporal dependencies in slowly varying CMP processes. Although Transformer-based models have demonstrated strong performance in large-scale time-series learning tasks, they generally require extensive training datasets and incur higher computational costs. These requirements limit their practical applicability in small-sample, physics-constrained manufacturing scenarios such as online CMP thickness measurement. Therefore, considering model robustness, interpretability, and data availability, LSTM is adopted in this study as a suitable compromise between model complexity and compensation performance.

In summary, by leveraging its strong time-series modeling capability and adaptive learning of multidimensional process features, the LSTM-based dynamic compensation model significantly enhances the accuracy of online thickness measurement during the CMP process. Following deterministic optical error compensation, the proposed method further addresses residual dynamic random errors that cannot be adequately modeled using fixed physical models or linear filtering approaches. By integrating thickness, temperature, and their temporal variations, the model captures the coupling relationship between thermal effects, mechanical disturbances, and material removal behavior under CMP conditions. As a result, it achieves effective compensation for nonlinear, time-varying errors, providing a reliable foundation for online endpoint determination and closed-loop control in the SiC CMP process.

5. Conclusions

To achieve high-precision online wafer thickness measurement during SiC CMP, this paper proposed a non-contact integrated measurement system based on oblique-incidence laser triangulation, together with a hierarchical hybrid error compensation framework combining physical mechanisms and data-driven methods.

A deterministic error model based on optical propagation theory and experimental calibration was established to compensate for optical interference induced by the polishing slurry, effectively suppressing the thickness measurement error to within ±0.5 $[eqn]$ under stable CMP conditions. Subsequently, residual dynamic random errors were further mitigated by introducing an LSTM-based model that utilizes thickness, temperature, and their variation rates as inputs. After hierarchical hybrid compensation, the system achieved an RMSE of 0.47471 $[eqn]$ and an MAE of 0.38440 $[eqn]$ , with the deviation between online measurements and reference values effectively controlled within ±1 $[eqn]$ .

In summary, this study addresses the coexistence of deterministic and dynamic random errors under complex CMP conditions through a hierarchical compensation framework integrating physical modeling and data-driven learning, providing a reliable online thickness measurement solution for hard and brittle materials such as SiC. Future work will focus on incorporating multi-sensor information, such as vibration and pressure, to enable higher-dimensional process state modeling and intelligent CMP applications including endpoint detection and adaptive parameter control.

Bibliography27

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Tian F. Wang T. Lu X. Guo J. Endpoint Detection Based on Optical Method in Chemical Mechanical Polishing Micromachines 202314205310.3390/mi 1411205338004910 PMC 10673209 · doi ↗ · pubmed ↗
2Zhong Z.-W. Recent Developments and Applications of Chemical Mechanical Polishing Int. J. Adv. Manuf. Technol.20201091419143010.1007/s 00170-020-05740-w · doi ↗
3Jäker P. Aegerter D. Kyburz T. Städler R. Fonjallaz R. Detlefs B. Koziej D. Flow Cell for Operando X-Ray Photon-in-Photon-out Studies on Photo-Electrochemical Thin Film Devices Open Res. Eur.202227410.12688/openreseurope.14433.237645301 PMC 10446061 · doi ↗ · pubmed ↗
4Hirano K. Sato T. Suzuki N. Real-Time Prediction of Material Removal Rate for Advanced Process Control of Chemical Mechanical Polishing CIRP Ann.20247326927210.1016/j.cirp.2024.04.028 · doi ↗
5Qu Z. Wang W. Yang Z. Bao Q. Zheng Y. High-Precision Thickness Measurement of Cu Film on Si-Based Wafer Using Erasable Printed Eddy Current Coil and High-Sensitivity Associated Circuit Techniques IEEE Trans. Ind. Electron.2022699556956510.1109/TIE.2021.3111570 · doi ↗
6Wang C. Wang T. Tian F. Lu X. Edge Effects of an Eddy-Current Thickness Sensor During Chemical Mechanical Polishing IEEE Trans. Semicond. Manuf.202235243110.1109/TSM.2022.3141053 · doi ↗
7Abdollahi-Mamoudan F. Ibarra-Castanedo C. Maldague X.P.V. Advancements in and Research on Coplanar Capacitive Sensing Techniques for Non-Destructive Testing and Evaluation: A State-of-the-Art Review Sensors 202424498410.3390/s 2415498439124031 PMC 11314776 · doi ↗ · pubmed ↗
8Lee H. Shin K. Moon W. Capacitive Measurements of Si O 2 Films of Different Thicknesses Using a MOSFET-Based SPM Probe Sensors 202121407310.3390/s 2112407334199213 PMC 8231994 · doi ↗ · pubmed ↗