Integrating a Convolutional Neural Network and MultiHead Attention with Long Short-Term Memory for Real-Time Control During Drying: A Case Study of Yuba (Tofu Skin)

Jiale Guo; Jie Wu; Lixuan Zhang; Ziqin Peng; Lixuan Wei; Wuxia Li; Jingzhi Shen; Yanhong Liu

PMC · DOI: 10.3390/foods15020245 · January 9, 2026

Open Access

TL;DR

This study uses a new AI model to improve drying of yuba, reducing time while maintaining quality, with potential for other food applications.

Contribution

A novel CNN-LSTM-MHA network is introduced for intelligent drying control, improving prediction accuracy and product quality.

Findings

01

The CNN-LSTM-MHA model achieved high prediction accuracy (R^2^: 0.9855–0.9999) for drying properties of yuba.

02

Intelligent drying reduced drying time and improved yuba's texture, color, and nutritional content compared to fixed-temperature drying.

Abstract

Achieving comprehensive improvements in the drying rate (DR) and the quality after drying of agricultural products is a major goal in the field of drying. To further shorten the drying time while improving product quality, this study introduced a Convolutional Neural Network (CNN) and MultiHead Attention (MHA) to enhance the prediction accuracy of the Long Short-Term Memory (LSTM) network regarding the properties of dried samples. These properties included DR, shrinkage rate (SR), and total color difference (ΔE). The CNN-LSTM-MHA network was proposed, developing a novel hot-air drying (HAD) scenario utilizing an intelligent temperature control system based on the real dynamics of material properties. The results of drying experiments with temperature-sensitive yuba showed that the CNN-LSTM-MHA network’s predictive accuracy was better than that of other networks, as evidenced by its…

Funding

National Natural Science Foundation of China; Guangdong Institute of Modern Agricultural Equipment

Keywords

Long Short-Term Memory; Convolutional Neural Network; MultiHead Attention; hot-air drying; real-time control

Taxonomy

Topics: Food Drying and Modeling · Microencapsulation and Drying Processes · Spectroscopy and Chemometric Analyses

Full text

1. Introduction

Drying is a crucial method for extending the shelf life and preventing mildew growth on agricultural products. However, the long-term exposure of agricultural products to high-temperature environments during drying affects their quality [1]. To solve this problem, researchers have developed various techniques, such as intermittent grain drying with tempering and temperature control during different drying stages of fruits and vegetables [2,3]. However, these methods only set different drying temperatures in several stages, which is insufficient for complex drying systems.

Real-time drying temperature control (RTDTC) technology can improve the quality of dried products while reducing drying time. For instance, Vilas et al. [4] applied RTDTC technology during freeze drying, and their results showed that RTDTC reduced the drying time by about 30% compared with the standard strategy. Nadian et al. [5] predicted the drying time, color, and shrinkage rate (SR) of kiwifruit slices under a microwave–hot air drying process based on an Artificial Neural Network (ANN) model. Their results showed that the drying time was shortened by 40%, while the total color difference (ΔE) was reduced by 73.42%. However, the inherent characteristics of the ANN model made it difficult to establish a correlation between quality and drying time, which limited the prediction accuracy of the model, because drying is a process in which a product’s state changes over time [6]. Therefore, applying a prediction model that can capture the time series relationship within the data is expected to enhance the accuracy of model predictions.

The Long Short-Term Memory (LSTM) network is particularly adept at capturing the temporal dynamics inherent in the input data [7]. In their study on the drying kinetics and color prediction of apple slices, Jia et al. [6] confirmed that the LSTM network achieved the highest accuracy (coefficient of determination (R^2^) > 0.98). However, their research did not integrate the predicted results into the dryer for the real-time control of the drying temperature, which limited further optimization of the drying process. Guo et al. [8] utilized the LSTM model for the real-time control of drying parameters (temperature, humidity, and air velocity) during the hot-air drying (HAD) of Pleurotus eryngii. Their results showed that the composite scores of the samples after real-time control of the drying parameters were 6% higher than the highest score obtained in the full-factorial experiment [8]. However, they found that when the drying parameters were changed frequently, the LSTM’s prediction accuracy decreased significantly. Therefore, it is necessary to optimize the LSTM further in order to improve its prediction accuracy in situations involving frequent changes in drying parameters. In addition, temperature-sensitive samples are better suited to testing the effectiveness of a real-time control system based on the improved LSTM.

Yuba is a product that is sensitive to drying temperature. It is a film that forms on the surface of soybean milk when boiled, and is usually dried before being transported and sold [9]. Therefore, using yuba for drying experiments can intuitively demonstrate the effectiveness of the real-time temperature control system based on the improved LSTM.

The purposes of this study were therefore to (1) improve the structure of LSTM and test the prediction accuracy of the network and (2) explore the effects of different drying temperature control methods on drying kinetics and qualities. Compared with our previous research [8], this work proposes a CNN-LSTM-MHA model that addresses the accuracy loss of LSTM under frequent parameter fluctuations by integrating local feature extraction and multi-dimensional attention.

2. Materials and Methods

2.1. Raw Materials

The yuba (the production method of which is shown in Figure 1) was purchased from a local supermarket in Haidian District, Beijing, China, and stored in thermally sealed polyethylene bags at 4 °C for no longer than one week before use. Before drying, the samples were cut to a length of 60.00 ± 4.87 mm. The initial moisture content of the fresh samples was measured to be 56.64% ± 0.89% on a wet basis (w.b.).

2.2. Experimental Equipment and Design

The HAD system used to conduct the drying experiments on yuba was fully described in our previous study [8]. The system was designed and established in our lab for the online monitoring of the appearance qualities and DR of agricultural products in order to control the temperature, humidity, and air velocity during drying [8]. To ensure precise data acquisition and control of drying parameters throughout the experiment, an SHT30 temperature and humidity sensor (Sensirion, Frauenfeld, Switzerland) was employed. This sensor offers a temperature accuracy of ±0.3 °C and a humidity accuracy of ±3% RH. Additionally, a portable air velocity meter (HT9829, Xinsite, Dongguan, China) with a precision of ±5% was utilized to measure air velocity, ensuring accurate pulse width modulation (PWM) regulation. At the top of the drying chamber, a Raspberry PI 4B, an industrial camera (DF200, Jieruiweitong, Shenzhen, China), and a ring shadowless lamp (R-90, Topvision, Shenzhen, China) were mounted. The system also included a gas heating and circulation unit, comprising a heater, air ducts, and a centrifugal fan, to regulate air velocity and temperature. The centrifugal fan was linked to an exhaust pipe, which directed the expelled gas into the drying chamber. Furthermore, a weight sensor was integrated into the drying chamber to enable the real-time monitoring of sample weight. A water tank and humidification system were also incorporated to adjust the relative humidity of the hot air.

Before drying, the yuba sticks were placed evenly on three trays, with twelve sticks for each tray. One yuba stick was placed in the upper tray for appearance quality tracking. The drying air velocity was set at 1.50 ± 0.20 m/s. During drying, images of the sample on the upper tray were taken every 1 min until a final moisture content (w.b.) of 10% (drying endpoint) was reached. The design of specific experimental parameters is shown in Table 1.

2.3. The Real-Time Control Strategy of Drying Temperature

2.3.1. LSTM Model

The LSTM model, a widely utilized variant of Recurrent Neural Networks (RNNs), excels in handling long-term dependencies, thereby improving the network’s capacity to model temporal dynamics in sequential data [10]. This advantage is attributed to the specialized architecture of LSTM cells, which incorporate three gate mechanisms, as depicted in Figure 2. The recurrent nature of LSTM allows for sequential data processing and iterative updates of internal states at every time step, enabling the network to effectively learn temporal patterns in time series data. The memory cell in LSTM is governed by three gate units: (1) the function of the input gate is to determine which information from the current time step’s input should be saved to the cell state; (2) the function of the forget gate is to decide which information from the previous time step’s cell state should be discarded; and (3) the function of the output gate is to control which parts of the cell state need to be output as the current output value. Specifically, after receiving the data, the forget gate decides which information to keep from the previous time step, and the input gate updates the current time step’s input into the state, generating a candidate state. Then, the cell state is updated by means of element-wise multiplication with the outputs of the forget gate and the input gate. Finally, the output gate controls the hidden state output for the current time step and, together with the updated cell state, determines the final output result of the LSTM, as described in Equations (1)–(6) [8]. The interplay of these gates, combined with the application of tanh and sigmoid activation functions, mitigates the issue of gradient decay during backpropagation.

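$$i_t = \sigma\left(W_i \cdot [h_{t-1}, X_t] + b_i\right) \tag{1}$$

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, X_t] + b_f\right) \tag{2}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{3}$$

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, X_t] + b_o\right) \tag{4}$$

$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, X_t] + b_C\right) \tag{5}$$

$$h_t = o_t \odot \tanh(C_t) \tag{6}$$

where $f_t$ , $i_t$ , $\tilde{C}_t$ , $o_t$ , $C_t$ , and $h_t$ are the forget gate, input gate, candidate cell state, output gate, current cell state, and hidden layer state; $X_t$ is the input vector at time t; $W$ is the weight coefficient matrix; $b$ is the bias of the corresponding cell state; $\sigma$ is the sigmoid function; and $\odot$ denotes element-wise multiplication. The initial cell state (C0) and initial hidden state (h0) are initialized to zero vectors, which helps to avoid biasing the model towards any particular state at the beginning of the sequence.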

Equation (1) applies a sigmoid layer to $h_{t-1}$ and $X_t$ to determine which new data to include (the input gate, with weight matrix $W_i$ and bias $b_i$). Equation (2) decides whether to forget details from the previous cell (the forget gate, with weight matrix $W_f$ and bias $b_f$). Equation (5) passes $h_{t-1}$ and $X_t$ through a tanh layer to obtain the candidate state $\tilde{C}_t$. Equation (3) combines the long-term memory $C_{t-1}$, scaled by the forget gate, with the candidate state $\tilde{C}_t$, scaled by the input gate; these element-wise products enable selective data transmission. In Equations (4) and (6), $h_{t-1}$ and $X_t$ are the inputs for the LSTM output unit, and the updated cell state $C_t$ is processed by the tanh layer to obtain the outcome.

In practice, to enhance computational efficiency, the weight matrices associated with the gates are often fused into a single matrix. This approach reduces the number of matrix multiplications required during forward propagation, thereby optimizing the computational resources and accelerating the training process. For instance, the weight matrices W_i_, W_f_, and W_o_ are combined into a single weight matrix, whose output is then split during the computation to extract the contribution for each gate.
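The fused-weight idea can be illustrated with a minimal, standard-library sketch of one LSTM time step. This is not the authors' implementation: the row ordering of the fused matrix (input gate, forget gate, output gate, candidate state), the inclusion of the candidate weights in the fused matrix, and all names (`lstm_step`, `W`, `b`) are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step using a single fused weight matrix.

    W has 4*H rows and (H + D) columns, where H is the hidden size and D
    the input size. Assumed row order: input gate, forget gate, output
    gate, candidate state (H rows each). b has 4*H entries.
    """
    H = len(h_prev)
    z = h_prev + x_t  # concatenation [h_{t-1}, X_t]
    # One fused matrix-vector product yields all four pre-activations.
    pre = [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]
    # Split the fused result back into the per-gate blocks.
    i_t = [sigmoid(p) for p in pre[0:H]]
    f_t = [sigmoid(p) for p in pre[H:2 * H]]
    o_t = [sigmoid(p) for p in pre[2 * H:3 * H]]
    c_hat = [math.tanh(p) for p in pre[3 * H:4 * H]]
    # Cell state: keep part of the old memory, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Hidden state: output gate applied to the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the new cell state is simply half of the previous one, which is a convenient sanity check.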

2.3.2. Improvement of LSTM Network

(1) Convolutional Neural Network (CNN)

CNNs are capable of automatically extracting features from data through convolutional, pooling, and fully connected layers [8]. The structure of the CNN utilized in this study is depicted in Figure 3. We employ a one-dimensional convolutional network to perform local feature extraction on the input data. The extracted features are then used as the input for the LSTM network, enabling the LSTM to perform sequence modeling based on more meaningful and compact representations. Following Jia et al. [6], the convolutional layer uses a kernel size of 3 and a padding of 1 to capture local patterns. Furthermore, the pooling layer reduces the feature dimensions, effectively decreasing the computational complexity of the subsequent LSTM and helping to prevent overfitting while preserving crucial features.
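To make the effect of kernel size 3 with padding 1 concrete, the following is a minimal single-channel sketch in plain Python. The kernel values and the pooling window of 2 are hypothetical; the text specifies only the kernel size and padding, and the actual network learns its kernels over multiple channels.

```python
def conv1d_same(seq, kernel, bias=0.0):
    """1D convolution (cross-correlation) with zero padding of len(kernel)//2.

    For a kernel of size 3 this is padding = 1, so the output has the
    same length as the input sequence.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k)) + bias
        for i in range(len(seq))
    ]


def max_pool1d(seq, size=2):
    """Non-overlapping max pooling; the window size is an assumption."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


# Hypothetical usage on a short normalized drying-rate sequence.
features = conv1d_same([0.1, 0.4, 0.9, 0.7, 0.3, 0.2], [0.25, 0.5, 0.25])
pooled = max_pool1d(features)
```

The convolution keeps one output per time step, while pooling halves the sequence length that the LSTM has to process.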

In conclusion, based on the advantages of CNNs mentioned above, applying a CNN to the feature extraction of LSTM can reduce the computational complexity while improving the prediction accuracy of LSTM. However, due to the fixed gating structure of LSTM, it has poor adaptability when predicting sequence data of different complexities (such as changes in the appearance quality and moisture content of the samples).

(2) MultiHead Attention (MHA)

The introduction of MHA enables LSTM to focus on multiple positions in the input sequence simultaneously. This mechanism can improve the flexibility and expressiveness of the model by processing multiple attention heads in parallel (each head focuses on different features and time steps). Furthermore, MHA can also enhance the model’s ability to capture long-term dependencies and mitigate vanishing gradients [11]. The attention mechanism implements scaled dot product attention through operations on three vectors: query (Q), key (K), and value (V). As presented in Figure 4a, during scaled dot product attention, the dot products between queries and all keys are calculated to gauge the significance of every key. Subsequently, the dot product results are divided by $\sqrt{d}$ , where d represents the dimensionality of both K and Q, so that they do not grow too large as the dimensionality increases, which avoids numerical overflow or gradient explosion. After that, a SoftMax function is utilized to generate weights. These weights signify the relative importance of each key-value pair with respect to a specific query. Eventually, each attention weight is multiplied by the corresponding value to yield the output, and the relevant attention function is expressed as Equation (7) [12]:

$$\mathrm{Attention}(Q, K, V) = \mathrm{SoftMax}\left(\frac{QK^{T}}{\sqrt{d}}\right)V \tag{7}$$

MHA can analyze the input features from various perspectives by performing parallel attention functions h times, taking different linear projections of Q, K, and V as inputs, as illustrated in Figure 4b. In this context, h represents the quantity of heads in MHA. Subsequently, the outputs of the attention mechanisms are combined and subjected to further linear projection. The formulation of MHA is given by Equations (8) and (9) [12]:

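$$\mathrm{MHA}(Q, K, V) = \mathrm{Concat}(H_1, H_2, \ldots, H_h)\,W^{O} \tag{8}$$

$$H_i = \mathrm{Attention}\left(QW_i^{Q}, KW_i^{K}, VW_i^{V}\right) \tag{9}$$

where ( $W_i^{Q}$ , $W_i^{K}$ , $W_i^{V}$ , $W^{O}$ ) are the linear projection matrices and $H_i$ denotes the output of a single attention function. According to the study of Shu et al. [12], a value of h = 6 was adopted in our study.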
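The attention computation described above can be sketched in plain Python as follows. This is an illustrative simplification, not the network used in the study: the learned linear projections of Q, K, and V and the final output projection are replaced by simple feature slicing and concatenation, and the function names are hypothetical.

```python
import math


def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are lists of row vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(len(V[0]))
        ])
    return out


def multi_head(X, h):
    """Self-attention over h heads, each seeing an equal slice of the features.

    Assumption: slicing stands in for the learned per-head projections,
    and plain concatenation stands in for the output projection.
    """
    d = len(X[0])
    assert d % h == 0, "feature size must be divisible by the number of heads"
    step = d // h
    heads = []
    for i in range(h):
        Xi = [row[i * step:(i + 1) * step] for row in X]
        heads.append(attention(Xi, Xi, Xi))
    # Concatenate the head outputs for every time step.
    return [sum((head[t] for head in heads), []) for t in range(len(X))]
```

For example, `multi_head(hidden_states, 6)` would attend over a sequence of LSTM hidden states with six heads, provided the hidden size is divisible by 6.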

The improved LSTM presented in this study is referred to as “CNN-LSTM-MHA”, and its structure is shown in Figure 5. Figure 5b shows that the MHA module and the CNN module are alternately distributed among the LSTM units. Specifically, in CNN-LSTM-MHA, after the data is preprocessed, it is input into CNN. CNN extracts the local features of the data through convolution operations and then inputs the local feature maps into LSTM to capture the temporal dependencies in the feature maps and generate hidden states. Subsequently, the hidden state is input into the MHA to calculate the correlation between time steps, highlighting the key time steps or features.

2.3.3. Real-Time Control Logic of Drying Temperature Based on CNN-LSTM-MHA

The CNN-LSTM-MHA-based real-time management approach for the HAD of yuba sticks is depicted in Figure 6. As shown, once the drying parameters (drying endpoint, initial temperature, and air velocity) are established, the system gathers the images, quality metrics, and air temperature data. Subsequently, these data are transformed into SR, ΔE, moisture content, and DR every 1 min and are normalized to values ranging from 0 to 1. Using these data, CNN-LSTM-MHA predicts SR, ΔE, and DR at different temperatures in the subsequent stages. Then, based on these predictions, the system determines the possible optimal drying temperature for the next stage and adjusts the air temperature to the best setting using a proportional-integral-derivative (PID) controller. This adjustment process is based on the comprehensive scoring method [8]. Specifically, the model predicts SR, ΔE, and DR at different drying temperatures in the next stage, and the system selects the temperature setting with the highest comprehensive score of these metrics. Subsequently, the drying equipment is regulated to this temperature through PID. This process is carried out once every minute. Ultimately, the system assesses the current moisture content of the samples to decide if the drying process is complete. If the drying endpoint is not yet achieved, the cycle continues. Once the drying endpoint is met, the drying process terminates.
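The selection step of this control logic can be sketched as below. Everything here is illustrative: the `predict` callable stands in for the trained CNN-LSTM-MHA network, the weighted-sum score is only one possible form of a comprehensive score (the actual scoring method is described in [8]), and the candidate temperatures, weights, and PID gains are hypothetical.

```python
def comprehensive_score(pred, weights):
    """Weighted sum of normalized predictions (assumed form of the score)."""
    return sum(weights[name] * pred[name] for name in weights)


def choose_temperature(predict, state, candidates, weights):
    """Return the candidate temperature with the highest predicted score.

    predict(state, T) must return a dict with normalized 'DR', 'SR', 'dE'.
    """
    return max(
        candidates,
        key=lambda T: comprehensive_score(predict(state, T), weights),
    )


class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_predict(state, T):
    # Hypothetical stand-in for the network: faster drying at higher T,
    # least shrinkage near 60 degrees C. Not based on measured data.
    return {"DR": T / 100.0, "SR": ((T - 60) / 10.0) ** 2, "dE": 0.0}


weights = {"DR": 1.0, "SR": -1.0, "dE": 0.0}  # hypothetical weights
target = choose_temperature(toy_predict, None, [50, 55, 60, 65, 70], weights)
heater_signal = PID(kp=2.0, ki=0.5, kd=0.0).update(target, measured=55.0, dt=1.0)
```

In the described system this selection would be repeated once per minute until the measured moisture content reaches the drying endpoint.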

2.4. Model Training and Evaluation

2.4.1. Model Training and Application Environment

The experimental environment of the network trained and applied in this study is shown in Table 2. The training and application environment of different models were the same.

2.4.2. Model Training

The parameters were configured as follows: all input data were in ‘csv’ format, the network’s input size was set to 5 (current moisture content, drying time, DR, SR, and ΔE), the hidden layer size was 256, the output size was set to 3 (the next stage of DR, SR, and ΔE), and the mean square error (MSE) function was used to measure the loss. The Adaptive Moment Estimation (Adam) optimizer was selected, with a learning rate of 0.01 and a total of 5000 training iterations. The dataset included real-time moisture content, DR, SR, and ΔE normalized to the range of 0 to 1. The dataset was divided into a training set, a validation set, and a test set at a 6:2:2 ratio (time-continuous). During the model training process, the Rolling Window method (RWM) was used to extract the dataset, and the average training time was approximately 13.5 min.
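The data preparation steps named above (normalization to the range 0 to 1, a time-continuous 6:2:2 split, and rolling-window extraction) can be sketched in plain Python. The window length and the column indices of the targets are hypothetical, since the text does not state them, and the helper names are illustrative.

```python
def min_max_normalize(column):
    """Scale a list of values to the range 0 to 1."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in column]


def chronological_split(rows, ratios=(6, 2, 2)):
    """Split time-ordered rows into train/validation/test without shuffling."""
    n = len(rows)
    total = sum(ratios)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return (
        rows[:n_train],
        rows[n_train:n_train + n_val],
        rows[n_train + n_val:],
    )


def rolling_windows(rows, window, target_idx):
    """Pair each window of consecutive rows with the targets of the next row."""
    samples = []
    for start in range(len(rows) - window):
        x = rows[start:start + window]
        next_row = rows[start + window]
        y = [next_row[i] for i in target_idx]
        samples.append((x, y))
    return samples


# Hypothetical usage: rows hold the 5 inputs per minute in the order
# (moisture content, drying time, DR, SR, dE); targets are DR, SR, dE.
rows = [[1.0 - 0.1 * i, float(i), 0.1, 0.02 * i, 0.05 * i] for i in range(10)]
train, val, test = chronological_split(rows)
train_samples = rolling_windows(train, window=3, target_idx=[2, 3, 4])
```

Splitting before windowing keeps every window inside a single subset, so no test-period row leaks into a training sample.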

2.4.3. Model Evaluation

The R^2^, root mean square error (RMSE), and mean absolute error (MAE) were used as indexes to evaluate the prediction accuracy of the CNN-LSTM-MHA model, as shown in Equations (10)–(12) [13]:

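$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}} \tag{10}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}} \tag{11}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{12}$$

where $y_i$ is the true value; $\hat{y}_i$ is the predicted value of the CNN-LSTM-MHA network; $\bar{y}$ is the average of the true values; and $n$ is the number of data points.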
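For reference, the three evaluation indexes can be computed with a few lines of plain Python using their standard definitions; the function names are illustrative.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n
```

Because the study reports these metrics on data normalized to the range 0 to 1, RMSE and MAE values are directly comparable across DR, SR, and ΔE.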

2.5. Drying Kinetics and Physicochemical Properties

2.5.1. Drying Kinetics

The moisture ratio (MR) of yuba was calculated according to the moisture content (dry basis, d.b.), assuming a zero value of the equilibrium moisture content, as shown in Equation (13) [14]:

$$\mathrm{MR} = \frac{M_t}{M_0} \tag{13}$$

where $M_0$ is the initial moisture content of fresh yuba (g/g, d.b.), determined as described in Section 2.1, and $M_t$ is the moisture content of yuba (g/g, d.b.) after drying for t min, obtained from the real-time measurement of the material mass by the weight sensor (average of two consecutive readings).

Also, the DR of yuba was calculated according to Equation (14) [14]:

$$\mathrm{DR} = \frac{M_{t_1} - M_{t_2}}{t_2 - t_1} \tag{14}$$

where $M_{t_1}$ and $M_{t_2}$ are the moisture content (d.b.) of yuba at drying times $t_1$ and $t_2$ , respectively, g/g, while $t_1$ and $t_2$ are the drying times at two weight readings, min.
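A small worked sketch of these quantities in Python, assuming a zero equilibrium moisture content as stated above. The conversion from wet basis to dry basis uses the standard relation M_db = M_wb / (1 − M_wb); the function names and the sample readings are hypothetical.

```python
def wet_to_dry_basis(mc_wb):
    """Convert a wet-basis moisture fraction to dry basis (g water / g dry matter)."""
    return mc_wb / (1.0 - mc_wb)


def moisture_ratio(m_t, m_0):
    """MR with the equilibrium moisture content taken as zero."""
    return m_t / m_0


def drying_rate(m_t1, m_t2, t1, t2):
    """Moisture lost per unit dry matter per minute between two readings."""
    return (m_t1 - m_t2) / (t2 - t1)


# Initial and endpoint moisture contents reported in the text (wet basis).
m0_db = wet_to_dry_basis(0.5664)
m_end_db = wet_to_dry_basis(0.10)
mr_at_endpoint = moisture_ratio(m_end_db, m0_db)

# Hypothetical pair of dry-basis readings taken 10 min apart.
dr = drying_rate(1.00, 0.90, 0.0, 10.0)
```

Under these assumptions, the reported initial moisture content of 56.64% (w.b.) corresponds to roughly 1.31 g/g on a dry basis, and the 10% (w.b.) endpoint to roughly 0.11 g/g.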

2.5.2. Shrinkage Rate

The DeepLabV3+ model (Figure 7) was used for the automatic semantic segmentation of yuba stick images during the drying process. For model training, the images of yuba were divided into a dataset at a ratio of 6:4 (2837:1892 images). The training parameters were set as follows: the input image size was 512 × 512 pixels, the batch size was 8, the Adam optimizer was used, the maximum learning rate was 0.01, the minimum learning rate was 0.0001, the momentum was 0.9, and the number of iterations was 50. The accuracy of the trained model met the application requirements: the mean intersection over union (MIoU), recall value, and mean average precision (mAP) were 98.38, 97.08, and 98.04, respectively. The yuba’s projected area was determined by counting the segmented pixels; subsequently, the yuba stick’s SR was computed using Equation (15) [14]:

$$\mathrm{SR} = \frac{A_0 - A_t}{A_0} \times 100\% \tag{15}$$

where $A_t$ represents the projected area of the yuba stick when the drying time is $t$ min, pixels, and $A_0$ is the projected area of the fresh yuba stick, pixels.
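Given binary segmentation masks, the shrinkage rate reduces to a pixel count. The sketch below assumes masks are nested lists in which truthy entries mark yuba pixels and that SR is the relative loss of projected area in percent; the mask layout and function names are illustrative rather than the authors' code.

```python
def projected_area(mask):
    """Number of pixels labelled as yuba in a binary mask (list of rows)."""
    return sum(sum(1 for px in row if px) for row in mask)


def shrinkage_rate(mask_t, mask_0):
    """Percentage loss of projected area relative to the fresh sample."""
    a0 = projected_area(mask_0)
    return (a0 - projected_area(mask_t)) / a0 * 100.0


# Tiny hypothetical masks: a 4 x 4 fresh sample that loses one column.
fresh = [[1, 1, 1, 1] for _ in range(4)]
dried = [[1, 1, 1, 0] for _ in range(4)]
sr = shrinkage_rate(dried, fresh)
```

Because both areas are counted in pixels from the same fixed camera, no physical calibration is needed for the ratio.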

2.5.3. Color

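The yuba stick area was delineated using the DeepLabV3+ model for semantic segmentation, and the average R, G, and B values within this area were obtained. These values were subsequently transformed into L* (lightness/darkness), a* (redness/greenness), and b* (yellowness/blueness) values, respectively. The ΔE value between time intervals was then computed based on Equation (16) [15]:

$$\Delta E = \sqrt{\left(\Delta L^{*}\right)^{2} + \left(\Delta a^{*}\right)^{2} + \left(\Delta b^{*}\right)^{2}} \tag{16}$$

where $\Delta L^{*}$ , $\Delta a^{*}$ , and $\Delta b^{*}$ are the differences in L*, a*, and b* between the two time points being compared.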

Yellow is considered to be the most popular color among consumers for yuba sticks [16]. The Yellowness index (YI) value was then computed based on Equation (17) [15]:

$$\mathrm{YI} = \frac{142.86\, b^{*}}{L^{*}} \tag{17}$$

where L* and b* refer to the color parameters of dried samples.
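The color pipeline can be sketched as follows. The text does not state which RGB-to-L*a*b* conversion was used, so the sketch assumes 8-bit sRGB values and the D65 reference white; the yellowness index uses the commonly cited form 142.86 · b*/L*, which is likewise an assumption here. All function names are illustrative.

```python
import math


def _linearize(c):
    """Undo the sRGB gamma for one 8-bit channel."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """CIE L*a*b* companding function."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def rgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE L*a*b* (D65 white point assumed)."""
    rl, gl, bl = _linearize(r), _linearize(g), _linearize(b)
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def delta_e(lab_1, lab_2):
    """Total color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_1, lab_2)))


def yellowness_index(L, b):
    """Yellowness index in its commonly used form (assumed)."""
    return 142.86 * b / L


# Hypothetical mean RGB values of the segmented yuba at two time points.
lab_start = rgb_to_lab(230, 215, 170)
lab_later = rgb_to_lab(222, 198, 132)
color_change = delta_e(lab_start, lab_later)
```

A camera-based conversion of this kind depends on stable illumination, which is presumably why the setup uses a ring shadowless lamp.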

2.5.4. Rehydration Ratio

The dried samples were rehydrated following the method described by Zhou et al. [17] with minor modifications. The dried yuba sticks (5 ± 0.5 g) were first weighed, then immersed in water (sample-to-water ratio of 1 g:20 mL) at 85 °C for 20 min. After cooling, the remaining water on the surface was wiped away with a paper towel, and the rehydrated samples were weighed. The rehydration ratio (RR) was estimated by applying Equation (18):

$$\mathrm{RR} = \frac{W_a}{W_b} \tag{18}$$

where $W_a$ and $W_b$ refer to the sample weight after and before water absorption, respectively, g.

2.5.5. Texture

The texture was analyzed following the method outlined by Sun et al. [18] with a few modifications. The texture of yuba sticks was determined by a texture profile analyzer (TA. XTPLUS/50, Stable Micro System, Surrey, UK). The yuba sticks after rehydration (refer to Section 2.5.4) were cut into 1.4 cm × 5 cm sticks, and then a P/5 cylindrical probe was used to puncture the yuba sticks at a test speed of 1 mm/s, with the trigger force set to 150 g, deformation set to 70%, and probe residence time set to 1 s.

2.5.6. Protein Content (PC)

The PC of yuba sticks was analyzed according to the method described by Peng et al. [19] with some modifications. The Kjeldahl method was applied to measure the total nitrogen content in the yuba sticks, and a nitrogen-to-protein conversion factor of 5.71 was used to approximate the protein content, as per the guidelines set by AOAC in 2005.

2.5.7. Fat Content (FC)

The FC of yuba sticks was evaluated following the method presented by Zhu et al. [20] with slight modifications. Dried yuba sticks were cracked and ground into powder. Yuba powder (5 g) was subjected to extraction in a Soxhlet siphoning-type extractor filled with petroleum ether (30~60 °C) for 6 h. The residue was dried to a constant weight in a drying oven at 45 °C for 4 h and then weighed.

2.5.8. Microstructure

The cross-sectional morphology of the dried yuba sticks was inspected under a scanning electron microscope (SEM, SU3500, Hitachi, Tokyo, Japan) at a magnification of 1500×. Following a slightly adjusted method of Pei et al. [21], the sample segments underwent gold coating (Beijing Zhongquan Co., Ltd., Beijing, China) for 20 s at a pressure of 6~8 Pa prior to being scanned at a voltage of 15 kV.

2.6. Statistical Analysis

Data were expressed as the mean ± standard deviation (SD) of three replicates (n = 3). The statistical analysis was conducted using IBM SPSS Statistics (version 27.0, SPSS Inc., Chicago, IL, USA). One-way analysis of variance (ANOVA) was applied to assess the statistical differences among groups, with Duncan’s multiple range test for multiple comparisons being used to identify significant differences at p < 0.05.

3. Results and Discussion

3.1. Evaluation Results of CNN-LSTM-MHA Model

Table 3 shows the experimental results for SR, ΔE, and DR compared to their predicted results from the six models. The results indicate that the CNN-LSTM-MHA model demonstrated the most precise fit between actual and predicted values, with all R^2^ values surpassing 0.9855. Furthermore, the CNN-LSTM-MHA model reflected the smallest difference between the actual and predicted values, with overall lower RMSE and MAE values (0.0001 ≤ RMSE ≤ 0.0099, 0.0001 ≤ MAE ≤ 0.0120).

The higher prediction accuracy of CNN-LSTM-MHA compared with LSTM may be explained by two factors. Firstly, the introduction of CNN enhanced the model’s ability to extract key features from the data, especially at moments when the data fluctuates, which can provide more valuable time series data as the input to LSTM [6]. Secondly, the introduction of MHA can focus on and extract input features in parallel from different representation subspaces, such as focusing on extracting the fluctuation trend of DR on one head and focusing on extracting the correlation between DR and SR on another head. Therefore, the features extracted from these different perspectives can provide a more comprehensive input of time series data for the LSTM model, which can enable it to process the time series data better [11], as shown in Figure 8.

Compared to other models (ANN, PR, XGB, and LR), the CNN-LSTM-MHA network’s higher accuracy came from its ability to learn long-term dependencies within sequential data, thereby enabling it to capture more complex patterns and trends when dealing with variations in time series data [22]. This attribute can be explained by the unit structure of the LSTM model. Yang et al. [22] indicated that the cell states in the LSTM can transmit information across multiple time steps, which enables the LSTM to store and recall long-term information. In addition, the forget gate in the LSTM unit can selectively forget some information. These attributes enhanced the ability of LSTM to capture features in time series data with complex fluctuations and effectively inhibited the vanishing and explosion of gradients.

3.2. Real-Time Drying Temperature Control Results

3.2.1. Real-Time Drying Temperature Control

The HAD temperature of yuba sticks was controlled in real time based on the CNN-LSTM-MHA network. The resulting temperature changes in the drying chamber are shown in Figure 9. Although the temperature variation trends across the three experiments were not entirely consistent, the general trend remained the same. This may be due to inherent individual variations present within the samples. This also showed the advantage of real-time drying temperature control of CNN-LSTM-MHA based on the real dynamic state of the samples, ensuring that the samples were always in the optimal drying environment. The drying temperature increased to around 70 °C during the initial drying stage (around 0~100 min). Referring to Figure 8, this temperature rise during this period may be due to the rapid increase in ΔE. To facilitate the rapid increase in ΔE, the CNN-LSTM-MHA network elevated the drying temperature to 70 °C to promote the yellowing of samples.

During the middle (around 100~130 min) and late (around 130~330 min) stages, the drying temperature decreased rapidly and then increased to approximately 65 °C. CNN-LSTM-MHA decreased the temperature rapidly at this stage, most likely to inhibit the shrinkage of the samples when the moisture content ranged from 0.6 to 0.4 g/g (d.b.). This is supported by the study of Jiang et al. [23], whose research on carrot slices also indicated that a decrease in drying temperature inhibited SR, which is consistent with the general understanding. In the late drying stage, as the moisture was removed, the water vapor partial pressure difference between the samples and the drying medium gradually decreased, leading to a decrease in DR. Therefore, to speed up dehydration, it was necessary to use high temperatures to provide a greater driving force for mass transfer [24].

3.2.2. Accuracy of CNN-LSTM-MHA Network Real-Time Prediction

Figure 10 shows the changes in the observed and predicted values of the DR, SR, and ΔE of yuba sticks. The point-by-point predictions in Figure 10 rely on the model’s sequential learning ability, which integrates historical time series data to capture dynamic correlations between drying parameters. This supports prediction robustness even with frequent temperature adjustments. Figure 10a shows a slight deviation between actual and predicted values, with an R^2^ value of 0.9616, indicating a negative effect on the prediction of DR under frequent variations in drying temperature. This may be due to the limited amount of data available for the network’s reference during early drying stages, which compromised its prediction accuracy for time series data variations, such as DR. However, compared with our previous research [8], the prediction accuracy of the network was significantly improved (R^2^ increased from 0.8475 to 0.9410–0.9998). As shown in Figure 10b,c, the R^2^ values between the predicted and actual values of the CNN-LSTM-MHA were 0.9872 and 0.9985, respectively, indicating accurate predictions of the yuba sticks’ SR and ΔE changes after altering the drying temperature. Consequently, the data variation curve predicted by the CNN-LSTM-MHA network provides a basis for controlling drying temperatures.

3.3. Drying Kinetics and Physicochemical Properties

3.3.1. Drying Kinetics

The MR and DR curves of yuba sticks under different drying temperature control methods are shown in Figure 11. According to Figure 11a, increasing the drying temperature reduces the drying time. The time required to dry the CNN-LSTM-MHA samples until the final moisture content (10%) was reached was 320.59 min. This achieved a 43.36%, 25.06%, 17.53%, and 5.88% reduction in drying time compared with the 50, 55, 60, and 65 °C drying scenarios, respectively. However, the drying time for the CNN-LSTM-MHA-treated samples increased by 4.92% compared to the samples dried at 70 °C.
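As a rough cross-check of these percentages, the drying times of the fixed-temperature runs can be back-calculated from the reported 320.59 min. The sketch assumes each percentage is expressed relative to the corresponding fixed-temperature drying time; the resulting values are implied estimates, not measurements reported in the text, and the function name is illustrative.

```python
def implied_reference_time(t_controlled, reduction_pct):
    """Fixed-temperature drying time implied by a percentage reduction.

    reduction_pct is the reduction achieved by the controlled run relative
    to the reference run; a negative value means the controlled run was slower.
    """
    return t_controlled / (1.0 - reduction_pct / 100.0)


t_controlled = 320.59  # min, reported for the CNN-LSTM-MHA run
reported = {50: 43.36, 55: 25.06, 60: 17.53, 65: 5.88, 70: -4.92}
implied = {
    temp: implied_reference_time(t_controlled, pct)
    for temp, pct in reported.items()
}
for temp, minutes in sorted(implied.items()):
    print(f"{temp} degC: about {minutes:.0f} min")
```

Under this assumption the implied times decrease monotonically with temperature, from roughly 566 min at 50 °C to roughly 306 min at 70 °C, which is consistent with the trend described for Figure 11a.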

There is a positive correlation between drying temperature and heat and mass transfer. The increase in drying temperature improves the heat and mass transfer efficiency, thereby shortening the drying time [24]. This is the typical drying behavior of many food samples [25]. This also clarifies why the drying time of the CNN-LSTM-MHA-treated samples was longer than that of the samples dried at 70 °C, as the drying temperature under CNN-LSTM-MHA control was maintained between 65 and 70 °C during the drying process.

As presented in Figure 11b, a rapid increase in DR was observed, followed by a gradual decrease during the initial drying stage. The initial increase in DR can be attributed to the excess moisture on the surface of the yuba sticks, leading to a rapid moisture removal rate. As drying advanced, surface moisture evaporation became less significant, while inherent moisture diffusion within the samples started to dominate the process [26]. These observations align with studies on persimmon, blueberries, and kiwifruit slices [27,28,29].

It must be mentioned that a higher DR was observed in the late drying stage for the CNN-LSTM-MHA-treated samples than others. This higher DR can be attributed to the pulsed temperature changes (50–70 °C) during the initial and middle stages, avoiding the damage to the microstructure caused by thermal stress that occurs under continuous high temperatures. Wang et al. [30] also pointed out that less damage to the microstructure during drying could promote the migration and diffusion of moisture. In addition, higher drying temperatures in the late drying stage further enhanced heat and mass transfer during the drying process. This effectively shortened the drying time required for the CNN-LSTM-MHA process compared to other constant-temperature drying scenarios [24].

3.3.2. Shrinkage Rate

The SRs of the yuba sticks under different drying temperature control methods are summarized in Figure 12a. As the drying temperature increased from 50 to 60 °C, the SRs of the samples gradually increased from 11.74% to 19.26%, which was consistent with expectations. However, at 65 and 70 °C, the SR decreased by 13.14% and 16.82%, respectively, which is unexpected, as higher temperatures typically increase shrinkage. The SR of CNN-LSTM-MHA-treated samples was significantly lower by 6.02% and 11.95% compared to 55 and 60 °C, but slightly higher by 5.01%, 1.37%, and 5.86% compared to 50, 65, and 70 °C, respectively.

Li et al. [31] reported that moisture evaporation during drying is usually accompanied by significant volume shrinkage, which is mostly affected by moisture migration and temperature. The increase in SR between 50 and 60 °C can be attributed to the diminishment of viscoelastic stresses within the pores due to the loss of moisture, leading to a subsequent decrease in the inherent pore pressures [32]. This is supported by Figure 13, which shows a moderate positive correlation between SR and hardness (r = 0.67). This correlation indicates that shrinkage was related to the reduction or disappearance of micropores in the samples, which could also increase their hardness, as is widely observed [31]. However, the decrease in SR between 60 and 70 °C may be associated with the reduction in drying time, which helped to maintain the microstructure [33]. When the temperature was above 60 °C, the positive impact of the shortened drying time predominated. This was also confirmed in the study of Li et al. [31], which pointed out that the shrinkage behavior of materials in the drying process is a synergistic effect of the drying method, drying temperature, and drying time, where the shortening of drying time can inhibit shrinkage. Additionally, this phenomenon has also been observed in studies on potatoes by McMinn & Magee [34] and Wang & Brennan [35]. They attributed the lower SR at high temperature to shell hardening of the samples, which restricted their deformation and inhibited their shrinkage.

The CNN-LSTM-MHA-controlled samples did not exhibit the minimum SR, possibly because this network focused more on the automatic optimization of DR and color, which could result in the drying temperature at some stages not being the most suitable for optimizing SR. This is supported by Figure 8 and Figure 9. For example, before 50 min, the drying temperature fluctuated between 55 and 65 °C, while this temperature range increased the SR of the samples, as shown in Figure 12a. However, this temperature range had a favorable effect on yuba sticks’ DR, as explained in Section 3.3.1. This indicated that the selection of the optimal drying temperature by the CNN-LSTM-MHA network was based on the balance between the DR, SR, and color of yuba sticks, which is beneficial for comprehensively improving the drying performance of samples.
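The trade-off described above can be illustrated with a simple weighted score over candidate temperatures. This is a hypothetical sketch only: the paper's actual control rule is not reproduced in this section, and the function `select_temperature`, the weights, and the normalized predictions below are all assumptions made for illustration.

```python
def select_temperature(predictions, weights):
    """Pick the candidate temperature with the best weighted score.

    predictions: {temperature: {"DR": x, "SR": y, "dE": z}} with values
                 normalized to a comparable scale (assumed, e.g. 0 to 1)
    weights:     {"DR": w1, "SR": w2, "dE": w3}, all non-negative
    A higher DR is rewarded; higher SR and higher ΔE are penalized.
    """
    def score(p):
        return (weights["DR"] * p["DR"]
                - weights["SR"] * p["SR"]
                - weights["dE"] * p["dE"])
    return max(predictions, key=lambda t: score(predictions[t]))

# Hypothetical normalized predictions for three candidate temperatures (°C).
preds = {
    55: {"DR": 0.4, "SR": 0.3, "dE": 0.2},
    60: {"DR": 0.6, "SR": 0.6, "dE": 0.4},
    65: {"DR": 0.8, "SR": 0.5, "dE": 0.6},
}

# Weighting DR heavily favors the hotter setting ...
print(select_temperature(preds, {"DR": 1.0, "SR": 0.2, "dE": 0.2}))
# ... while weighting SR and color heavily favors the cooler one.
print(select_temperature(preds, {"DR": 0.2, "SR": 1.0, "dE": 1.0}))
```

A scheme of this kind makes explicit why a controller that prioritizes DR and color need not produce the minimum SR at every stage.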

3.3.3. Color

Color is a key quality indicator for dried yuba sticks and significantly impacts consumer acceptance [16]. The effects of different drying temperature control methods on the color parameters of yuba sticks are presented in Table 4. There were no significant differences in the samples’ L* and a* values (p > 0.05), indicating that the drying temperature did not affect the samples’ brightness/darkness and redness/greenness values. The lowest YI value was obtained at 50 °C, significantly lower than that of the high-temperature-dried samples (p < 0.05). The b* and ΔE of the samples were consistent with the changing trend of YI, indicating that the color change during drying was primarily yellowing.
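The color indices discussed here are commonly computed from CIELAB coordinates as ΔE = sqrt(ΔL*² + Δa*² + Δb*²) and YI = 142.86 · b*/L*. The sketch below assumes these widely used forms. The paper's own equations appear in its methods section and are not shown here, and the sample values are hypothetical.

```python
import math

def total_color_difference(lab, lab_ref):
    """Euclidean distance between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(lab, lab_ref)))

def yellowness_index(l_star, b_star):
    """Yellowness index in the commonly used form 142.86 * b* / L*."""
    if l_star <= 0:
        raise ValueError("L* must be positive")
    return 142.86 * b_star / l_star

# Hypothetical fresh and dried readings (not data from Table 4).
fresh = (70.0, 2.0, 20.0)
dried = (69.0, 2.5, 28.0)
print(round(total_color_difference(dried, fresh), 2))
print(round(yellowness_index(dried[0], dried[2]), 2))
```

When L* and a* barely change, as reported above, ΔE is dominated by the b* term, which is why b*, ΔE, and YI move together.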

The yellowing of high-temperature-dried samples can be ascribed to non-enzymatic browning (Maillard reaction) during drying, resulting in a more yellowish color [36]. Higher b* values were associated with higher YI values in high-temperature-dried samples, giving a strong correlation (0.96 ≤ r ≤ 1) between b*, ΔE, and YI, as demonstrated in Figure 13.
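Correlation coefficients such as those in Figure 13 are typically Pearson's r. A minimal sketch of that calculation is given below, assuming Pearson's r is the statistic used. The function name and the sample values are illustrative and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical b* and YI values across treatments (illustration only).
b_star = [24.1, 26.3, 27.0, 27.8, 28.5, 29.2]
yi = [49.5, 54.0, 55.8, 57.1, 58.9, 60.4]
print(round(pearson_r(b_star, yi), 3))
```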

The CNN-LSTM-MHA-treated samples exhibited the highest YI values, signifying a more pronounced yellow color. This greater YI is possibly due to the real-time drying temperature control promoting the Maillard reaction. In their review, Pathare et al. [15] pointed out that temperature reduction would inhibit the occurrence of the Maillard reaction. In conclusion, the CNN-LSTM-MHA-based real-time temperature control technology can represent a recommended scenario for yuba stick drying, achieving an impressive final product with an excellent color.

3.3.4. Rehydration Ratio

The capacity of a dried substance to revert to its initial condition upon rehydration reflects its structural integrity, which can be assessed through the rehydration process [14]. The RR results of the dried yuba sticks are shown in Figure 12b. Samples subjected to HAD at between 50 and 65 °C displayed no significant differences in RR. This indicated that a low drying temperature had a limited effect on the samples’ rehydration performance. However, the RR significantly reduced to the lowest level after samples were subjected to the 70 °C drying conditions, showing the worst water absorption capacity. Notably, the CNN-LSTM-MHA-managed samples exhibited the highest RR (p < 0.05), being 9.33%, 11.48%, 5.06%, 8.40%, and 18.15% higher than the constant-temperature treatments from 50 to 70 °C, respectively, suggesting superior water absorption capabilities.

The RR of samples dried at 70 °C decreased, in accordance with Aradwad et al.’s [37] observation that the RR is closely related to the porosity and cavity of the sample microstructure. The samples dried at 70 °C precipitated a large amount of oil and solidified into a shell on their surface, which made it difficult for external water to be transported through the micropores in the samples. The highest RR of CNN-LSTM-MHA-treated samples can be attributed to the CNN-LSTM-MHA network’s dynamic adjustment of temperatures during the drying process, which optimized the DR and facilitated uniform moisture evaporation. This control mitigated inherent stress and damage within the samples, suppressed pore tearing, and maintained a more intact structure of the protein–fat network (SPFN), consistent with Ogawa & Adachi’s review [38] of pasta’s RR.

3.3.5. Texture

Texture is a physical characteristic determined by the composition and microstructure [18]. The hardness, springiness, cohesiveness, gumminess, and chewiness of rehydrated samples after drying under different drying temperature control methods are summarized in Table 5. The hardness of samples dried at 50 and 70 °C was significantly lower than that of the samples dried at other temperatures (p < 0.05), exhibiting an initial increase followed by a decrease with increasing temperature. Springiness showed a similar trend, increasing up to 60 °C and then decreasing, but without significant differences (p > 0.05), indicating that the drying temperature had little impact on the yuba sticks’ springiness. The other textural properties of the samples, particularly gumminess, exhibited a similar trend to hardness, resulting in more palatable yuba with reduced gumminess at 50 and 70 °C. The samples dried at 55 °C exhibited significantly higher chewiness compared to the other samples (p < 0.05).

It is generally believed that the hardness of samples increases with higher drying temperatures due to less damage occurring to the microstructure at lower temperatures, resulting in a softer texture upon rehydration. In this study, the decrease in hardness at high temperatures may be attributed to the low SR of the high-temperature-treated samples and their loose inherent structure (r = 0.67). The 50 °C samples had lower gumminess, which may be because protein cross-linking to form a film is slower at lower drying temperatures, potentially leading to a looser structure and decreased gumminess, aligning with the findings of Gennadios & Weller [39]. The reduced gumminess at 70 °C may be attributed to the minimum RR: the samples absorbed the least amount of water, which diminished their gumminess. This agrees with a study of sweet potato starch noodles by Xiang et al. [40], in which the gumminess of noodles decreased with decreasing RR. The CNN-LSTM-MHA-treated samples exhibited higher hardness and chewiness, as well as the highest springiness, which may be more suitable for consumers who prefer a crisp texture in yuba sticks.

3.3.6. Protein Content

The PC under different drying temperature control methods is presented in Figure 12c. As the temperature increased, the PC gradually decreased. The PC of the 50 °C samples was significantly higher than that of the samples treated at 55 to 70 °C and those treated with the CNN-LSTM-MHA network (p < 0.05) by 5.86%, 6.99%, 8.96% and 10.85%, respectively, indicating that low-temperature drying was beneficial to the retention of protein in yuba sticks. The PC of the CNN-LSTM-MHA-treated samples was not significantly different from that of the samples treated at 55 °C (p > 0.05), but was significantly higher than those of the samples dried at 60 to 70 °C (p < 0.05), by 2.46%, 4.35%, and 6.15%, respectively, suggesting that the temperature control method based on the CNN-LSTM-MHA network can shorten the drying time to a certain degree and improve the retention rate of proteins.

The reason for the decrease in PC with increasing temperature may be that high temperature promoted the hydrolysis of proteins into polypeptides and amino acids, which then participated in the Maillard reaction to produce ketones, aldehydes, and other volatile compounds. This is consistent with Liao et al.’s [41] study on the steaming and drying of black soybeans, which also attributed the decrease in PC to the intense Maillard reaction involving sugar and protein in the samples under a high-temperature environment, which is commonly considered to constitute a risk of nutrient loss for soybean products. This is further supported by Figure 13, where PC was negatively correlated with b*, ΔE, and YI (−0.73 ≤ r ≤ −0.64). In addition, dos Santos et al. [42] reported that the PC of soybeans after convection drying was related not only to drying temperature but also to drying time, with a shorter drying time inhibiting the denaturation and degradation of proteins. This helps to explain why the PC of CNN-LSTM-MHA-treated samples was significantly higher than that of samples treated at 60–70 °C (p < 0.05).

3.3.7. Fat Content

The FC under different drying temperature control methods is shown in Figure 12d. The FC of the samples decreased by 10.33%, 16.94%, 17.91%, and 21.25%, respectively, as the drying temperature increased from 50 to 70 °C. In addition, CNN-LSTM-MHA-treated samples maintained a higher FC than samples treated at 60 to 70 °C (3.18%, 4.39%, and 8.83%, respectively), though not higher than those dried at 50 and 55 °C, indicating that CNN-LSTM-MHA-treated samples exhibited a more desirable fat retention rate than high-temperature-dried samples.

Liao et al. [41] reported that during the heating process, especially at high temperatures, lipids undergo thermal degradation and secondary reactions of thermal degradation products, thus further producing volatile compounds. This phenomenon may have contributed to the decrease in fat content to some extent. Further research is needed to understand the underlying mechanisms involved in the variation in FC during drying. On the other hand, Farmer [43] argued that lipid oxidation and Maillard reactions usually do not occur alone, and that each reaction may be modified by the reactants, intermediates, and products of others. Many of the effects of lipid–Maillard interactions appear to be due to reactions between carbonyl compounds (from the degradation of lipids or sugars) and amines (NH_3_, amino acids, ethanolamine) or mercaptans (H_2_S, mercaptoacetaldehyde). This is consistent with the results in Figure 13, which show that FC is negatively correlated with b*, ΔE, and YI (−0.80 ≤ r ≤ −0.74). CNN-LSTM-MHA-treated samples were exposed to low temperatures for some time during the drying process, which may have reduced the intensity of the above reactions, therefore leading to a higher FC than in samples that were dried at high temperatures throughout the process.

3.3.8. Microstructure

The microstructures of the yuba stick cross-sections after dehydration at different drying temperatures were observed by means of SEM and are shown in Figure 14. The dried samples treated at 50 and 55 °C had smaller pores than those dried at higher temperatures, and the SPFN was more complete. A significant increase in pore size was observed upon surpassing a drying temperature of 60 °C, accompanied by the tearing of pores. Dense pores were observed in the CNN-LSTM-MHA-treated samples without the pores being torn. This preservation of the SPFN’s integrity, relative to other temperature drying methods, likely contributes to a superior rehydration capacity.

The main reason for the small pore size of low-temperature-treated samples is that the moisture evaporated more slowly at lower temperatures and had less impact on inherent structures, consistent with the findings of Aksoy et al. [44]. Conversely, the larger pores at higher temperatures can be explained by the significant acceleration in the moisture evaporation rate as the temperature increases, which reduces the drying time. The SPFN then undergoes short-term intense stress, precipitating the emergence of large and torn pores [45].

The microstructure of CNN-LSTM-MHA-treated samples was well maintained, which may be attributed to the slow and gradual temperature increase in the initial drying stage. This avoided accelerated moisture evaporation caused by rapid temperature increases and effectively reduced microstructural damage. This finding is similar to the conclusion of Yao et al. [46] on the microstructural changes of fruits and vegetables during drying. In the middle drying stage, the CNN-LSTM-MHA network reduced the temperature, which also might reduce the migration rate of moisture, contributing to the microstructure maintenance [46]. During drying, the moisture is gradually removed, and the partial pressure of water vapor between the samples and the drying medium gradually decreases, resulting in a decrease in the DR, as shown in Figure 11b. Therefore, in the late drying stage, even if the temperature was raised to about 65 °C, the moisture evaporation rate would still be low and would not damage the microstructure.

4. Conclusions

In this study, a CNN and MHA were introduced to improve the prediction accuracy of the LSTM network when drying parameters change frequently. The CNN-LSTM-MHA network was employed to predict the real-time DR, SR, and ΔE changes in samples during drying, aiming to optimize drying temperature in real time. Compared with other commonly used networks, the CNN-LSTM-MHA network demonstrated the highest accuracy (R^2^: 0.9855–0.9999; RMSE: 0.0001–0.0099; MAE: 0.0001–0.0120). Even when the drying temperature changed frequently, the prediction accuracy of CNN-LSTM-MHA for DR, SR, and ΔE was still high (R^2^ ≥ 0.9616). In addition, based on the CNN-LSTM-MHA network, an intelligent real-time temperature control system was developed and compared to constant-temperature scenarios (50, 55, 60, 65, and 70 °C), showing significant improvements in drying time (5.88–43.36%), SR (6.02–11.95%), color, RR (5.06–18.15%), texture, PC (2.46–6.15%), FC (3.18–8.83%), and microstructure. These results demonstrate the potential application of this novel drying strategy in dried yuba stick processing. However, a limitation of this study is that only a single drying parameter (temperature) was controlled, which limited the overall improvement of the drying process. In future studies, multiple parameters for controlling the drying process, such as humidity and air velocity, should also be considered to comprehensively optimize the drying process.

Bibliography

1. Jimoh K.A., Hashim N., Shamsudin R., Che Man H., Jahari M. Recent Advances of Optical Imaging in the Drying Process of Grains—A Review. J. Stored Prod. Res. 2023, 103, 102145. doi:10.1016/j.jspr.2023.102145
2. Mahmood N., Liu Y., Munir Z., Zhang Y., Niazi B.M.K. Effects of Hot Air Assisted Radio Frequency Drying on Heating Uniformity, Drying Characteristics and Quality of Paddy. LWT 2022, 158, 113131. doi:10.1016/j.lwt.2022.113131
3. Usama M., Ali Z., Ndukwu M.C., Sathyamurthy R. The Energy, Emissions, and Drying Kinetics of Three-Stage Solar, Microwave and Desiccant Absorption Drying of Potato Slices. Renew. Energy 2023, 219, 119509. doi:10.1016/j.renene.2023.119509
4. Vilas C., Alonso A.A., Balsa-Canto E., López-Quiroga E., Trelea I.C. Model-Based Real Time Operation of the Freeze-Drying Process. Processes 2020, 8, 325. doi:10.3390/pr8030325
5. Nadian M.H., Abbaspour-Fard M.H., Martynenko A., Golzarian M.R. An Intelligent Integrated Control of Hybrid Hot Air-Infrared Dryer Based on Fuzzy Logic and Computer Vision System. Comput. Electron. Agric. 2017, 137, 138–149. doi:10.1016/j.compag.2017.04.001
6. Jia Z., Liu Y., Xiao H. Deep Learning Prediction of Moisture and Color Kinetics of Apple Slices by Long Short-Term Memory as Affected by Blanching and Hot-Air Drying Conditions. Processes 2024, 12, 1724. doi:10.3390/pr12081724
7. Dong F., Bi Y., Hao J., Liu S., Yi W., Yu W., Lv Y., Cui J., Li H., Xian J. A New Comprehensive Quantitative Index for the Assessment of Essential Amino Acid Quality in Beef Using Vis-NIR Hyperspectral Imaging Combined with LSTM. Food Chem. 2024, 440, 138040. doi:10.1016/j.foodchem.2023.138040. PMID: 38103505
8. Guo J., Liu Y., Lei D., Peng Z., Mowafy S., Li X., Jia Z., Ai Z., Xiao H. Combining DeepLabV3+ and LSTM for Intelligent Drying Strategy Optimization in Fruits and Vegetables Based on Appearance Quality: A Case Study of Pleurotus eryngii. Comput. Electron. Agric. 2025, 230, 109929. doi:10.1016/j.compag.2025.109929