Quantification of feeding intensity and feeding control of largemouth bass based on water surface vibration characteristics
Yufei Zhang, Andong Liu, Yulei Zhang, Qi Ni, Haigen Zhang, Hongqiao Song, Yong Wang, Xiaoyan Cheng

TL;DR
This study uses water surface vibrations and deep learning to precisely control feeding in largemouth bass aquaculture.
Contribution
A novel vibration-based model using LSTM for real-time feeding control in aquaculture is developed and validated.
Findings
Vibration displacement increases significantly with fish size and density.
The LSTM model outperformed GRU and Transformer in predicting feeding intensity.
The vibration-LSTM system achieved low residual feed rates and lower costs than other methods.
Abstract
In response to the demand for precise feeding in high-density aquaculture, this study established a dynamic prediction model for fish feeding intensity by integrating vibration signal quantification and deep learning. Through multidimensional experiments (fish size: 50–300 g; stocking density: 20–60 fish/group; feeding speed: 1-3 g/s; feed particle size: 2#4#6#), we quantified the three-axis displacement signals of Micropterus salmoides during feeding. Results demonstrated significant effects of all parameters on water surface fluctuations (p < 0.05). Vibration displacement exhibited linear relationships with fish size and density. The 300 g group showed 109.7% higher peak amplitude than the 50 g group, while the 60-fish density group exceeded the 20-fish group by 141.9%. Optimal palatability (4#) reduced fluctuation frequency by 42%. A predictive model for feeding vibration patterns…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 16
Figure 17
Figure 18| Experiment name | Fish weight (g) | Number of fish per tank (n) | Feed input rate (g/s) | Feed particle size |
|---|---|---|---|---|
| Size experiment | 50, 150, 300 | 30 | 1 | 4# |
| Density experiment | 50 | 20, 40, 60 | 1 | 4# |
| Feed input rate experiment | 300 | 30 | 1, 2, 3 | 4# |
| Feed particle size experiment | 300 | 30 | 1 | 2#, 4#, 6# |
| Intelligent feeding control simplified pseudo code |
|---|
| 1INPUT: Vibration data (10 Hz), Fish params (S, D, |
| 2OUTPUT: Feeder control signals |
| 3PROCEDURE Main Control: |
| 4Feeder speed ← 2.0 g/s |
| 5Buffer ← [0]*50, decline count ← 0 |
| 6SET_FEEDER_SPEED (feeder speed) |
| 7WHILE TRUE: |
| 8// Data acquisition and processing |
| 9D_sum ← SUM_ABS(FILTER_NOISE(READ_VIBRATION())) |
| 10Buffer. APPEND(D_sum). POP_FRONT() |
| 11// Periodic prediction (5 s intervals) |
| 12IF SYSTEM_TIME % 5 s == 0: |
| 13Prediction ← LSTM_MODEL. PREDICT(NORMALIZE([S, D, feeder_speed,Φ] + buffer)) |
| 14// Trend analysis |
| 15First = MEAN(prediction[0:25]), second = MEAN(prediction[25:50]) |
| 16Decline count ← (second < 0.9*first)? decline_count+1: 0 |
| 17// Control decision |
| 18IF decline count ≥ 2: STOP_FEEDER(); BREAK |
| 19ELSE: MAINTAIN_SPEED() |
| 20END IF |
| 21DELAY(100 ms) |
| 22END |
| INDEX | RMSE | MAE | R2 |
|---|---|---|---|
| LSTM | 69.43 | 48.00 | 0.883 |
| GRU | 85.15 | 67.43 | 0.828 |
| Transformer | 80.98 | 59.66 | 0.618 |
| Method | Meal 1 (300 g) | Meal 2 (300 g) | Meal 3 (300 g) | Meal 1 (150 g) | Meal 2 (150 g) | Meal 3 (150 g) | Meal 1 (40 fish density) | Meal 2 (40 fish density) |
|---|---|---|---|---|---|---|---|---|
| Float method | 32.1 g (RFR: 0%) | 2 g (RFR: 0%) | 0 g (RFR: 0%) | 26 g (RFR: 0%) | 1.8 g (RFR: 0%) | 0 g (RFR: 0%) | 60 g (RFR: 0.8%) | – |
| Optical flow | 27.3 g (RFR: 0%) | 6.5 g (RFR: 0%) | 6.8 g (RFR: 0%) | 11.2 g (RFR: 0%) | 4.6 g (RFR: 0%) | 6.3 g (RFR: 0%) | 35.6 g (RFR: 0%) | 22.3 g (RFR: 2.69%) |
| Graph neural network | 36 g (RFR: 1.64%) | – | – | 20.9 g (RFR: 0%) | 6.9 g (RFR: 0%) | 0 g (RFR: 0%) | 64.6 g (RFR: 6.58%) | – |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsWater Quality Monitoring Technologies · Aquaculture Nutrition and Growth · Fish Ecology and Management Studies
Introduction
1
In factory-based largemouth bass farming, feed costs account for more than 60% of the total farming costs, and whether feeding is reasonable is directly related to the profit margin of aquatic products (Zhou et al., 2018). Traditional feeding methods usually rely on the experience and intuition of skilled workers. However, with the aging of the population, the labor shortage problem is becoming more and more serious, and manual feeding alone can no longer meet the needs of industrial scale (Zheng and Zheng, 2025). Although various automatic feeders have appeared on the market, conventional automatic feeder technology cannot achieve autonomous decision-making, and its promotion and application are greatly restricted. Therefore, intelligent feeding technology with autonomous decision-making is the future development direction of fish feeding methods (Zhu et al., 2022).
Recent research has focused on estimating fish feeding intensity by analyzing behavioral signals through machine vision, acoustics, and water surface fluctuations (Wu et al., 2022; Biazi and Marques, 2023; Føre et al., 2024). Vision-based methods, such as optical flow and deep learning models, classify feeding activity into discrete states (e.g., none, weak, medium, and strong) with high accuracy (>90%) (Xu et al., 2022; Hu et al., 2025). Advanced techniques like 3D convolutional neural networks and Graph Convolutional Networks (GCNs) have further improved performance by modeling spatiotemporal dynamics of fish schools (Ubina et al., 2021).
Acoustic methods analyze sounds generated during feeding—such as chewing, swallowing, and bait impacts—using underwater hydrophones. By leveraging time-frequency representations (e.g., Mel spectrograms, STFT) and feature fusion strategies, these approaches achieve high classification accuracy (>96%) (Qi et al., 2023; Du et al., 2023a,b). Transformer-based models, such as the Audio Spectrum Swin Transformer, support rapid inference and real-time monitoring of feeding intensity (Zeng et al., 2023). However, acoustic methods are susceptible to ambient noise from operational equipment (e.g., pumps, aerators), which can mask biologically relevant signals. To address this, multimodal fusion frameworks that combine audio, video, and sonar data have been proposed, achieving robustness in challenging conditions (Du et al., 2024).
Vibration-based methods, which capture water surface fluctuations via accelerometers mounted on floating platforms, offer a complementary approach. Xiang (2022) developed an adaptive feeding algorithm based on three-axis accelerometer feedback and proposed a quantitative index of feeding intensity. Pan et al. (2023) used a six-axis accelerometer to identify distinct feeding stages in crucian carp, leading to an automatic control method that improved feed utilization by 7.66%.
Despite these advances, current appetite assessment often relies on subjective visual evaluation, producing only qualitative labels without quantifiable metrics. Furthermore, systematic relationships between water surface fluctuations and key culture parameters—such as fish size, stocking density, feeding rate, and feed particle size-remain unestablished for species like largemouth bass. Crucially, existing studies classify appetite into discrete states but do not support the continuous, quantitative prediction of feeding intensity essential for dynamic, proportional feed control.
To bridge this gap, we propose a predictive model based on Long Short-Term Memory (LSTM) networks. LSTMs, a type of recurrent neural network (RNN), excel at capturing long-range temporal dependencies in sequential data (Hochreiter and Schmidhuber, 1997). Feeding behavior in largemouth bass is a continuous dynamic process where current surface vibrations are influenced by prior states. LSTM’s gating mechanisms mitigate the vanishing gradient problem common in standard RNNs, making the network particularly suitable for modeling such temporal sequences and predicting future fluctuations. Therefore, we adopt an LSTM architecture to model the complex, time-dependent relationships among multi-dimensional inputs (fish size, density, feeding rate, and feed size), real-time triaxial vibration signals, and future feeding intensity. The model’s output is the predicted future vibration displacement. We then utilize this displacement value as a continuous, quantitative proxy for feeding intensity, which forms the basis for dynamic feed control.
Materials and methods
2
Ethical statement
2.1
All research on animal participation in this article has passed the welfare and ethical review of experimental animals by FMIRI (FMIRI-AWE-2024-001). It should be noted that our experiments were conducted in an industrial recirculating water system (RAS) given its indispensable role in aquaculture and its rapid growth trend.
Experimental system
2.2
The recirculating aquaculture system (RAS) comprised six 1-m^3^ circular rearing tanks (water volume: 600 L; radius: 75 cm, water depth: 40 cm), each equipped with a screw-type feeder (feeding speed: 0–5 g/s). Water quality was rigorously controlled: temperature 26–30 °C, dissolved oxygen ≥5 mg/L (maintained at 5.5 ± 0.5 mg/L during acclimation/experiment), pH 7.2–8.5 (maintained at 7.2 ± 0.5 during acclimation/experiment), nitrate ≤0.5 mg/L, and total ammonia nitrogen (TAN) ≤ 0.8 mg/L.
Vibration data acquisition employed a WT-VB01-485 triaxial sensor (47 × 38 × 15 mm) embedded within a 60° hardened EVA float (Ø 180 × 30 mm), with electronic components waterproofed using conformal coating. This sensor assembly was secured 15–20 cm downstream of the feeder outlet via an 80 g silicone-weighted data line and surrounded by a Ø 60 cm float ring to confine floating feed dispersion. Video monitoring utilized a camera (120 fps, 1920 × 1,080 resolution) mounted 40 cm above the water surface on an adjustable bracket. Real-time observational control was facilitated through Wi-Fi connectivity to a mobile device (Figure 1).
Experimental system device.
Experimental design
2.3
Juvenile largemouth bass (Micropterus salmoides) were grouped by size: small (50 g ± 4 g), medium (150 g ± 7 g), and large (300 g ± 11 g). Fish were stocked at densities of 20–60 per tank and fed once daily (16:00) with Tongwei California bass expanded feed (Specifications 2#, 4#, 6#; Crude protein ≥46.0%, Crude fat ≥6.0%). The feed input rate was set to 1, 2, or 3 g/s as an experimental variable. To ensure statistical robustness, each experimental condition (e.g., each fish size or density level) was replicated across two independent rearing tanks, with three repeated feeding trials conducted per tank. The detailed experimental design is summarized in Table 1.
Overview of the proposed approach
2.4
To address the demand for precise feeding control in high-density aquaculture, this study established a dynamic prediction model for fish feeding intensity by integrating vibration signal quantification and deep learning. The research workflow comprised three key stages (Figure 2):
Water surface fluctuation parameter extraction.
Analysis of Feeding Fluctuation Patterns: Through synchronous video verification, we analyzed the vibration characteristics of largemouth bass during feeding under variations in key parameters: fish size (S), stocking density (D), feeding rate (V), and feed particle size (Φ).
Prediction Modeling of Water Surface Fluctuations: A predictive model was developed using the multi-dimensional experimental data. The input variables included time (t), the static parameters (S, D, V, Φ), and the real-time triaxial vibration displacement. The output variable was the summed vibration displacement over the next 5 s (prediction horizon N = 5), representing a quantitative proxy for the predicted feeding intensity. The Z-score normalization method was applied to eliminate dimensional differences and enhance model convergence. The modeling architecture utilized a Long Short-Term Memory (LSTM) network. All experimental runs (408 trials in total) were pooled into a unified dataset. Each data instance consisted of a 5-s window (sampled at 10 Hz) of the summed triaxial vibration displacement, paired with the corresponding static parameters (S, D, V, Φ). This structured dataset was then randomly split into training (80%) and validation (20%) sets for model development.
Algorithm Deployment and Feeding Control Testing: The trained model was compressed and deployed on an embedded system (Orange Pi AiPRO) for real-time prediction (with a delay ≤ 50 ms). The feeding system adjusted the feeding rate (ΔV) based on the real-time feedback of the predicted feeding intensity, enabling anticipatory control, including early feeding cessation. The performance of this algorithm was rigorously compared against optical flow and graph neural network (GCN) methods to evaluate its efficacy.
Data preprocessing and feature engineering
2.5
The triaxial displacement signals (X, Y, Z) were measured using the WT-VB01-485 sensor. The X and Y axes represent horizontal displacements on the water surface plane, capturing lateral wave movements, while the Z axis corresponds to vertical displacement, capturing heave motion. This configuration allows comprehensive quantification of surface agitation dynamics. Figure 2 compares feeding vibrations versus environmental noise (aeration bubbles/water flow), confirming that our filtering method effectively isolates feeding signals. The Python Serial module enabled real-time data capture with amplitude filtering:
The amplitude filtering method is described in Equation 1. Where Draw(t) is raw displacement and &threshold = 0.5 mm (empirically calibrated to bubble noise). Triaxial displacements (Xi, Yi, Zi) were summed per 5 s analysis window:
The triaxial displacement summation is calculated using Equation 2. Maximum wave height per window derived as Hmax[k]=max(D∑[t5_k_:t_5k + 5_]). Strong inter-axis correlation (ρXY = 0.93, ρXZ = 0.88, ρYZ = 0.91) validated signal summation for amplifying feeding signatures.
Prediction modeling of water surface fluctuations for largemouth bass feeding
2.6
The core of our predictive framework was a Long Short-Term Memory (LSTM) network, designed to forecast the sum of triaxial vibration displacement 5 s into the future. This predicted displacement sum serves as a quantitative proxy for assessing feeding intensity at the water surface. The modeling procedure was structured as follows.
As shown in Figure 3, the model integrated both static experimental parameters and dynamic time-series data. The static input features included fish size (S, g), stocking density (D, fish/tank), feed input rate (V, g/s), and feed particle size (Φ, mm). The latter was numerically encoded using the approximate diameters of the commercial feed types (3 mm, 5 mm, and 7 mm for #2, #4, and #6, respectively). The primary dynamic temporal input was the real-time summed absolute triaxial vibration displacement (DΣ). The model was designed to output the predicted future value of this vibration displacement sum (DΣ) at a horizon of t + 5 seconds.
Model input and output structure.
As shown in Figure 4, prior to model training, all input features were normalized using Z-score normalization to mitigate scale differences and accelerate convergence. The complete dataset, comprising 408 experimental runs, was partitioned into training and validation subsets with an 8:2 ratio.
Data partitioning and preprocessing configuration.
As shown in Figure 5, the architecture of the LSTM network was configured to capture complex temporal dependencies. The input layer receives the normalized feature vector. This is followed by two stacked LSTM layers, each containing 128 units with tanh activation functions, to learn hierarchical temporal patterns from the sequential data. To prevent over fitting, a dropout layer with a rate of 0.2 was applied after the LSTM layers. The extracted features were then passed through a fully connected (Dense) layer with 64 units and a ReLU activation function for non-linear transformation. Finally, the output layer employs a linear activation unit to produce the final continuous prediction of the displacement value.
Architectural and training parameters of the LSTM model.
For the training configuration, a historical window of 5 s (corresponding to a time step of 50 at a 10 Hz sampling rate) was used as the input context for each prediction. The model was trained using the Adam optimizer with a learning rate of 0.001 and a batch size of 32. The loss function was defined as Mean Squared Error (MSE). To further guard against over fitting, an early stopping callback was implemented to halt training if the validation loss failed to improve for 20 consecutive epochs.
To benchmark the performance of the proposed LSTM model, we also implemented and trained two other prominent sequential models on the same dataset:
Gated Recurrent Unit (GRU): A variant of RNNs similar to LSTM but with a simplified gating mechanism, often offering comparable performance with fewer parameters. Our implemented GRU network comprised two stacked layers with 128 units each, using tanh activations, followed by a dropout layer (0.2) and a dense output layer with a linear activation. It was trained with the same configuration (Adam optimizer, lr = 0.001, MSE loss) as the LSTM model to ensure a fair comparison (Cho et al., 2014).
Transformer: A model architecture based on self-attention mechanisms, which has shown remarkable success in various sequence processing tasks by weighing the importance of different time steps. The Transformer encoder implemented here utilized a single-head self-attention mechanism with positional encoding for the 50-step input sequence. It consisted of two encoder layers (hidden dimension 128), a feed-forward network (dimension 256), and a final linear projection layer for regression. Training hyperparameters matched those used for LSTM and GRU (Vaswani et al., 2017).
All models were trained to perform the same regression task: predicting the summed vibration displacement t + 5 s. Their performance was evaluated and compared using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R^2^).
Algorithm deployment and feeding control testing
2.7
To enable engineering application, an embedded intelligent feeding system prototype was developed for closed-loop control testing. The hardware adopts a three-layer architecture:
Perception layer: Comprising a WT-VB01M triaxial vibration sensor (10 Hz sampling) and Arduino Mega 2,560 MCU for signal acquisition/preprocessing (amplitude filtering and triaxial displacement summation).
Decision layer: Centered on an Orange Pi AiPRO embedded platform (quad-core Cortex-A55 CPU, 4GB RAM) executing the trained LSTM model to analyze current/predicted vibration trends.
Execution layer: Utilizing a custom screw-type feeder (0–5 g/s speed control) receiving stop/speed-adjust commands via serial communication.
Software control logic: The system continuously acquires vibration signals, processes them (amplitude filtering and triaxial summation), and inputs them into the LSTM model to predict values for the next 5 s (Table 2). If a declining trend is detected over two consecutive analysis cycles (5 s/cycle), a stop-feeding command is issued; otherwise, the current speed is maintained. Performance was validated against optical flow and graph neural network (GCN) methods in repeated feeding scenarios. For the GCN-based method, we replicated the architecture and training protocol described by Wei et al. (2022), which constructs a graph from time-series features and applies graph convolutional layers for appetite state classification. This implemented GCN model was then applied to our experimental setup for closed-loop control comparison. Evaluation metrics included actual feed input, residual feed [for residual feed rate calculation: RFR (Residual feed rate %) = (Residual feed mass/Total feed mass) × 100%], and algorithm control signals (Figure 6).
Configuration of the embedded intelligent feeding system.
Results and discussion
3
Parametric influences on water surface fluctuations
3.1
Multidimensional experiments revealed distinct displacement patterns governed by biological and operational parameters, providing a foundation for the predictive model.
Size effect: Larger bass generated significantly greater water surface displacements due to their higher mass and energy output during feeding. Analysis of all replicate observations (n = 6 per size group, from 2 tanks × 3 trials) showed that the 300 g cohort generated 109.7% higher peak displacement than the 50 g group (p < 0.05, Figure 7). The relationship between fish size (S) and peak displacement (Y) across all individual trials was characterized by a linear model (Y = 1.7316S + 236.41, R^2^ = 0.9262, Figure 8). While the linear fit is strong, visual inspection of the scatter plot suggests a potential saturation trend between the 150 g and 300 g groups, which may indicate a non-linear scaling of feeding energy transfer with body mass in larger individuals.
Water surface fluctuations of feeding largemouth bass of different sizes.
Peak fluctuation patterns of water surface ingestion for largemouth bass of different sizes.
Density effect: Increased stocking density led to more intense collective feeding activity. Based on all replicate data (n = 6 per density level), the 60-fish groups exhibited 141.9% greater fluctuation amplitude than the 20-fish groups (Figure 9). The density-dependent effect was also highly linear across individual trials (Y = 5.4250D + 47.11, R^2^ = 0.9828, Figure 10), where Y represents peak displacement and D represents density.
Fluctuation of water surface for feeding of largemouth bass at different densities.
Peak fluctuation patterns of water surface ingestion of largemouth bass at different densities.
Feeding rate effect: The rate of feed delivery influenced the temporal dynamics of feeding. Higher feeding speeds (3 g/s) accelerated consumption, producing earlier vibration peaks (at approximately 20s) compared to slower rates (1 g/s, peak at ~35 s) (Figure 11). This suggests that fish respond to higher feed availability by intensifying their feeding effort over a shorter duration.
Water surface fluctuations of feeding of largemouth bass at different feeding rates.
Feed particle size effect: The size of feed pellets significantly affected the initial feeding behavior. Smaller #2 pellets (3 mm) induced chaotic, high-frequency initial vibrations, likely due to competitive scrambling for numerous small particles. In contrast, the optimally palatable #4 feed (5 mm) resulted in more stable, low-frequency oscillations (Figure 12), indicating efficient capture and consumption with reduced energy expenditure. This confirms that feed palatability can modulate the characteristics of surface fluctuations.
Water surface fluctuations of feeding of largemouth bass under different feed particle sizes.
From a biomechanical perspective, the feeding behavior of fish generates kinetic energy, which is transferred to the water body through body movements, thereby causing water surface fluctuations. The positive correlations observed in the size and density experiments align with findings from acoustic studies on largemouth bass (Wei et al., 2022). The feeding speed and feed particle size experiments demonstrate that operational parameters directly influence feeding kinetics. Faster feeding rates and larger pellets can shorten feeding time, a phenomenon also observed in catfish (Khater et al., 2021). However, it is crucial to note that exceeding the fish’s acceptable threshold for these parameters does not linearly improve efficiency and may instead increase energy waste. The quantification of these relationships provides a critical theoretical basis for precise feeding control, an aspect often overlooked in existing studies that focus solely on intensity classification without parametric context.
Predictive model performance
3.2
The developed LSTM model achieved robust forecasting performance, effectively integrating static biometric parameters (S, D) with dynamic, real-time vibration sequences. The model’s primary evaluation metrics, calculated on the validation set, were RMSE = 69.43 μm, MAE = 48.00 μm, and R^2^ = 0.883, indicating high accuracy in predicting the future 5-s vibration displacement. Training converged stably within 200 epochs (Figure 13), demonstrating the architectural efficiency and appropriateness of the training configuration.
Training and validation loss convergence of the LSTM prediction model.
A critical strength of the model was its temporal precision. As shown in Figure 14, the predicted waveform closely tracked the actual measured displacement, capturing key inflection points (rise, peak, and decline) with a temporal deviation of less than 50 ms. This capability is fundamental for anticipatory control, allowing the system to initiate feeding cessation before satiation is fully reached, thereby minimizing waste.
Example prediction of the LSTM model against actual data.
The model’s regression capability was further validated through an internal architecture comparison (Table 3). The LSTM achieved the best overall forecasting metrics, outperforming both the GRU (RMSE = 85.15 μm, R^2^ = 0.828) and the Transformer (RMSE = 80.98 μm, R^2^ = 0.618) models. The superior performance of LSTM can be attributed to its ability to capture long-range temporal dependencies in the sequential feeding data more effectively than the GRU, while the Transformer model may have been hampered by its relatively higher data requirements for optimal performance on this specific task. The reliability of the LSTM’s continuous predictions is further corroborated by the narrow confidence intervals obtained through uncertainty quantification (Figure 15), which show high model certainty around the predicted values.
LSTM model prediction with confidence intervals for uncertainty quantification.
Feasibility demonstration and comparative analysis
3.3
Our vibration-LSTM approach demonstrates distinct and contextually complementary advantages compared to alternative methods. A comparative analysis with Wei et al.’s graph convolutional network (GCN) highlights a fundamental difference in objective. The GCN approach is architected for high-accuracy classification of discrete appetite states (e.g., weak, medium, and strong). In contrast, our framework is designed for continuous regression, delivering quantitative predictions of feeding intensity that are a prerequisite for dynamic, proportional feed modulation.
Computationally, the LSTM implementation requires only 1.2 MB of memory and achieves 5.7 × faster inference (48 ms vs. 275 ms) on equivalent embedded hardware, making it more suitable for cost-effective, real-time deployment.
In closed-loop feeding tests, this advantage translated into superior practical performance. The vibration-based control system consistently minimized waste, achieving residual feed rates (RFR) of ≤0.8% across all trials (Table 4). This system significantly outperformed both optical flow (2.69% RFR) and GCN-based (6.58% RFR) methods. The feeding protocol was as follows: if no residual feed was detected at the end of a meal, the system would immediately initiate the next feeding session for up to three consecutive meals. The appearance of residual feed indicated fish satiety, at which point feeding was terminated.
Vibration-based control (Figure 16) provided continuous, high-resolution control signals (full-scale output range 0–255). This enabled precise modulation and, crucially, anticipatory cessation. For instance, with 300 g fish, the system commanded a sharp reduction in feed upon predicting a decline, achieving 0% residual feed across meals.
Feeding control profile using the vibration-LSTM method.
The curve shows the real-time summed vibration displacement (“Vib_sum,” red line) and the corresponding normalized control signal (“Out,” green line, range 0–255) sent to the feeder.
Optical flow control (Figure 17) also offered a full output range but exhibited reduced sensitivity to subtle appetite changes. This resulted in less precise control and higher residual feed (e.g., 6.5 g/6.8 g for 300 g fish), as it reacted to rather than predicted feeding dynamics.
Feeding control profile using the optical flow method.
The curve shows the optical flow magnitude (“EK,” in arbitrary units, blue line) computed from video frames, and the resulting normalized control signal (“Out,” orange line, range 0–255).
GCN-based control (Figure 18) operates on a classification paradigm. This approach adjusts the feeding rate to predetermined levels (e.g., 30 for ‘weak’, 120 for ‘strong’ appetite). While effective for state differentiation, this step-wise control is less suited for the rapid, continuous adjustment needed to minimize waste, resulting in higher residual feed rates compared to the regression-based vibration method.
Feeding control profile using the GCN-based method.
The curve shows the classified appetite level (“Appetite_State,” discrete levels, blue line) output by the GCN model and the corresponding step-wise control signal (“Out,” orange line).
Embedded deployment confirmed real-time efficacy with an inference latency of 48.2 ± 1.7 ms, fulfilling industrial constraints (<50 ms). The control logic, triggering cessation after two consecutive declining trends, prevented 89.7% of overfeeding events with a low false positive rate (3.2%). Hardware optimization reduced power consumption to 8.3 W—47% lower than GCN implementations typically requiring GPU acceleration. The combination of markedly reduced feed waste and low hardware cost (approximately $200) positions the vibration-LSTM system as an economically attractive solution for precision feeding in RAS facilities.
Conclusions
4
This study establishes a robust, data-driven framework for the real-time quantification and predictive control of feeding intensity in largemouth bass aquaculture. By integrating water surface vibration analysis with a LSTM (vibration-LSTM) deep learning model, we have transitioned beyond qualitative appetite assessment to a continuous, quantitative, and anticipatory feeding strategy. The core achievements of this work are threefold:
Parametric Quantification: We demonstrated and quantified significant linear relationships between vibration amplitude and key biological parameters (fish size: R^2^ = 0.926; stocking density: R^2^ = 0.983), providing a foundational model for understanding feeding energy expenditure.Dynamic Prediction Model: We developed a vibration-LSTM architecture capable of accurately forecasting feeding intensity 5 s into the future (R^2^ = 0.883), enabling proactive control rather than reactive response.Embedded Implementation and Validation: We successfully deployed the model on a low-cost, embedded platform (Orange Pi AiPRO), achieving real-time operation (latency ≤50 ms) and demonstrating superior performance in closed-loop tests, reducing residual feed rates to ≤0.8%—significantly outperforming optical flow (2.69%) and GCN-based (6.58%) methods.
This vibration-driven, predictively controlled strategy, characterized by its low hardware cost and demonstrated high feed utilization, presents a cost-effective and industrially viable solution for precision aquaculture.
Limitations and future work
5
The proposed vibration-based intelligent feeding system, while promising, has several limitations that offer pathways for future research. The primary challenge lies in system scalability and feed type dependency.
Feasibility demonstration and comparative analysis
5.1
Validation trials in larger, commercial-scale RAS (20 m^3^ tanks) revealed a performance decline attributable to wave damping effects over longer distances and increased spatial heterogeneity in fish distribution and feeding activity. A single-point surface sensor may not adequately capture the feeding dynamics of a large, heterogeneous tank. To transition from experimental to industrial applicability, two strategies are recommended:
Tank-Specific Transfer Learning: Pre-trained models should be fine-tuned using a small amount of data collected from the specific commercial tank, allowing the system to adapt to its unique hydrodynamic and behavioral characteristics.
Coupled Multi-Sensor Arrays: Deploying a network of vibration sensors at strategic locations across the water surface can provide a comprehensive view of feeding activity, mitigating the limitations of a single sensor and improving signal fidelity in large volumes.
Detection efficacy for sinking feeds
5.2
A significant limitation was identified with the use of sinking feeds, where the model’s detection efficacy dropped to 76.4%, compared to 98.3% for floating pellets. This performance gap arises because sinking feeds are consumed below the surface, generating considerably weaker surface vibrations. To develop a universally applicable system, future work should focus on:
Strategic Sub-Surface Sensor Placement: Deploying vibration sensors at multiple depths to directly capture feeding activity in the water column.
Hybrid Hydrophone Integration: Fusing vibration data with underwater acoustic signals, as the crunching and chewing sounds of fish consuming sinking pellets provide a strong, complementary signal.
Sinking-Feed-Specific Model Retraining: Collecting dedicated datasets of sinking feed consumption and retraining the model to recognize the associated, albeit subtler, vibration signatures.
Broader research directions
5.3
Beyond addressing these immediate limitations, this work opens several avenues for future research. These include extending the framework to multi-species applications, developing adaptive noise cancelation algorithms to enhance robustness in noisy environments, and exploring reinforcement learning paradigms to optimize long-term feeding strategies that maximize growth and fish welfare.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Biazi V. Marques C. (2023). Industry 4.0-based smart systems in aquaculture: a comprehensive review. Aquac. Eng. 103:102360. doi: 10.1016/j.aquaeng.2023.102360 · doi ↗
- 2Cho K. Van Merriënboer B. Gulcehre C. Bahdanau D. Bougares F. Schwenk H. . (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1724–1734.
- 3Du Z. Cui M. Wang Q. Liu X. Xu Y. Bai Z. . (2023 a). Feeding intensity assessment of aquaculture fish using mel spectrogram and deep learning algorithms. Aquac. Eng. 102:102345. doi: 10.1016/j.aquaeng.2023.102345 · doi ↗
- 4Du Z. Cui M. Xu X. Bai Z. Han J. Li W. . (2024). Harnessing multimodal data fusion to advance accurate identification of fish feeding intensity. Biosyst. Eng. 246, 135–146. doi: 10.1016/j.biosystemseng.2024.08.001 · doi ↗
- 5Du Z. Xu X. Bai Z. Liu X. Hu Y. Li W. . (2023 b). Feature fusion strategy and improved ghostnet for accurate recognition of fish feeding behavior. Comput. Electron. Agric. 214:108310. doi: 10.1016/j.compag.2023.108310 · doi ↗
- 6Føre M. Alver M. O. Alfredsen J. A. Rasheed A. Hukkelås T. Bjelland H. V. . (2024). Digital twins in intensive aquaculture — challenges, opportunities and future prospects. Comput. Electron. Agric. 218:108676. doi: 10.1016/j.compag.2024.108676 · doi ↗
- 7Hochreiter S. Schmidhuber J. (1997). Long short-term memory. Neural Comput. 9, 1735–1780. doi: 10.1162/neco.1997.9.8.17359377276 · doi ↗ · pubmed ↗
- 8Hu W. Yang X. Ma P. Fu T. Zhou C. (2025). DCA-MVIT: fused DS Gated convolution and CA attention for fish feeding behavior recognition in recirculating aquaculture systems. Aquaculture 598:742008. doi: 10.1016/j.aquaculture.2024.742008 · doi ↗
