Off-Grid Underwater Acoustic Source Direction-of-Arrival Estimation Method Based on Iterative Empirical Mode Decomposition Interval Threshold

Chuanxi Xing; Guangzhi Tan; Saimeng Dong

PMC · DOI:10.3390/s24175835·September 8, 2024

Off-Grid Underwater Acoustic Source Direction-of-Arrival Estimation Method Based on Iterative Empirical Mode Decomposition Interval Threshold

Chuanxi Xing, Guangzhi Tan, Saimeng Dong

PDF

Open Access

TL;DR

This paper introduces a new method for accurately estimating the direction of underwater sound sources in noisy shallow seas using advanced signal processing techniques.

Contribution

The novel approach combines iterative EMD interval thresholding with off-grid sparse Bayesian learning for improved DOA estimation.

Findings

01

The algorithm achieves 100% resolution probability at an 8° azimuthal separation between signal sources.

02

It maintains 100% resolution probability even at a low signal-to-noise ratio of −9 dB.

03

The method outperforms existing algorithms in accuracy, runtime, and robustness.

Abstract

To solve the problem that the hydrophone arrays are disturbed by ocean noise when collecting signals in shallow seas, resulting in reduced accuracy and resolution of target orientation estimation, a direction-of-arrival (DOA) estimation algorithm based on iterative EMD interval thresholding (EMD-IIT) and off-grid sparse Bayesian learning is proposed. Firstly, the noisy signal acquired by the hydrophone array is denoised by the EMD-IIT algorithm. Secondly, the singular value decomposition is performed on the denoised signal, and then an off-grid sparse reconstruction model is established. Finally, the maximum a posteriori probability of the target signal is obtained by the Bayesian learning algorithm, and the DOA estimate of the target is derived to achieve the orientation estimation of the target. Simulation analysis and sea trial data results show that the algorithm achieves a…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Chemicals1

-IIT

Diseases2

injury to people or property noise

Figures26

Click any figure to enlarge with its caption.

Funding2

—National Natural Science Foundation of China
—Basic Research Special General project of Yunnan Province, China

Keywords

hydrophone arraysDOAempirical mode decompositionBayesian learning algorithm

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsUnderwater Acoustics Research · Direction-of-Arrival Estimation Techniques · Speech and Audio Processing

Full text

1. Introduction

In recent years, with the increasing demand for marine resource development, environmental protection, and national defense, hydrophones have been widely used in various fields such as oil and gas exploration, subsea geological surveys, marine ecological monitoring, noise pollution control, and submarine detection [1]. However, despite significant advancements in hydrophone technology, many challenges remain. The noise in marine environments is complex and variable, comprising both natural sources (such as waves, marine life, and rain) and artificial sources (such as ships and military sonar) [2]. The non-stationary and complex characteristics of this noise greatly increase the difficulty of signal processing. Furthermore, marine noise often consists of a mixture of multiple noise sources, forming a complex noise field that further complicates the extraction and identification of target signals. Against this backdrop, it is crucial to develop effective denoising algorithms and methods to improve the positioning accuracy of hydrophone signals and address the challenges posed by complex underwater environments and noise interference [3].

In the field of underwater acoustic DOA estimation, hydrophone arrays have been widely employed for source localization [4]. Traditional spatial spectrum estimation algorithms include Conventional Beamforming (CBF) [5,6], Minimum Variance Distortionless Response (MVDR), and subspace algorithms with super-resolution capabilities such as MUSIC and ESPRIT [7,8]. However, in unfavorable environments such as low signal-to-noise ratios, the performance of these algorithms in estimating the target orientation can be significantly reduced. In order to enhance the precision of location determination, a number of sparse signal processing algorithms, including Bayesian inference (SBI), have been put forth as potential solutions [9]. Nevertheless, these methods rely on predefined spatial sampling grids. Despite the high grid density, the fact that actual signal sources do not precisely align with these grid points limits the angle resolution. To address this issue, gridless algorithms and algorithms that incorporate off-grid errors have been widely studied. The authors of [10] proposed a gridless convex optimization problem to address this issue by recovering the covariance matrix of a virtual array. The authors of [11] introduced a perturbed Sparse Signal Representation (SSR) model, incorporating bias parameters into the DOA estimation framework to address these challenges. The authors of [12] tackled the angle resolution limitation caused by sources not precisely aligning with grid points by constructing and optimizing a co-array tensor embedded with displacement information. The authors of [13] introduced the off-grid sparse Bayesian inference (OGSBI) algorithm, which incorporates off-grid errors into the array’s manifold matrix, thereby avoiding the substantial computational burden associated with mesh refinement and overcoming the angle resolution limitations.

The non-stationarity of underwater acoustic signals coupled with the presence of interfering noises renders traditional stationary signal processing methods inadequate. To address this, algorithms such as empirical mode decomposition (EMD) and wavelet transform are utilized due to their proficient handling of non-stationary signals. In 1998, Hilbert Huang et al. proposed an adaptive signal processing method, empirical mode decomposition (EMD), for the analysis of non-linear non-stationary time series, which is based on the characteristic time scale of the signal itself. The method does not require the setting of a basis function. The intrinsic modal function is generated adaptively based on the analyzed signal [14]. In the studies by [15,16], ensemble empirical mode decomposition (EEMD) and modified ensemble EMD (MEEMD) algorithms have been proposed to reduce the aliasing problem in empirical modal decomposition by adding noise to the original data and also to extend it to the field of high-dimensional processing. In the study by [17], an iterative EMD interval thresholding denoising algorithm is proposed, which is based on the idea of EMD with translation-invariant wavelet thresholding, and builds different noisy versions of the original signal by changing the position of the first intrinsic mode function (IMF) sample through multiple iterations, and then obtains multiple denoised versions of the signal by the IMF thresholding method and averaging them to enhance the denoising ability.

Specifically, the contributions of our work can be summarized as follows:

The algorithm leverages EMD for the adaptive decomposition of non-stationary signals into their respective IMFs. To overcome EMD’s limitations in handling complex noise, IIT is applied to iteratively threshold the IMFs, effectively filtering out low-energy noise. This combined approach significantly enhances noise suppression, ensuring accurate DOA estimation even under low signal-to-noise ratio (SNR) conditions.
The integration with OGSBI further optimizes signal reconstruction, enhancing target resolution and addressing grid mismatch issues, which is particularly effective when dealing with closely spaced signal sources.
The use of singular value decomposition (SVD) reduces computational complexity, making this method not only more accurate than traditional algorithms but also more efficient in real-time applications.

The paper is structured as follows: Section 2 introduces the background of EMD (empirical mode decomposition) and discusses the traditional DOA (direction of arrival) estimation method in detail. It also establishes the signal reception model and elaborates on the principles and processes of the EMDIIT-OGSBI algorithm. Section 3 compares the EMDIIT-OGSBI algorithm with the MUSIC algorithm and the OGSBI-SVD algorithm through simulation analysis and analyzes the superiority of the algorithm in this paper. Section 4 presents the source of the experimental data from the sea trial and uses it to validate the algorithm. The conclusions are given in Section 5.

2. Methods

2.1. Signal Processing Model for Hydrophone Arrays

Multiple hydrophone array elements arranged in space according to a certain geometric position constitute a base array, which has better directivity than a single array element. It is assumed that there are M uniformly deployed hydrophone arrays for hydroacoustic signal localization [18], with neighboring arrays spaced $[eqn]$ and $[eqn]$ being the signal wavelength. Under the plane wave assumption [19], the hydrophone base array with an array streamlines of $[eqn]$ receives $[eqn]$ far-field uncorrelated narrowband signals. As shown in Figure 1, the incident azimuth is $[eqn]$ , and $[eqn]$ denotes transposition, where $[eqn]$ denotes the orientation of the $[eqn]$ signal. The orientation of the hydrophone array is measured relative to the zero angle, which corresponds to the broadside angle of the array. At the instantaneous moment $[eqn]$ , the base array output vector $[eqn]$ is expressed as [20]

[eqn]

where

$[eqn]$ is the array signal,
$[eqn]$ is the source signal,
$[eqn]$ is the array noise,
$[eqn]$ is the array guidance vector, and $[eqn]$ is the array stream vector for the $[eqn]$ signal received by the $[eqn]$ array element, with

[eqn]

If each source signal has $[eqn]$ snapshots, the model for multiple snapshots can be written as

[eqn]

where

$[eqn]$ is the signal vector received by the array,
$[eqn]$ is the signal vector emitted by the source, and
$[eqn]$ is the noise vector to which the array is subjected.

2.2. Iterative EMD Interval Thresholding Methodology

The EMD-IIT denoising algorithm combines EMD with a translation-invariant wavelet thresholding method [21], and the section below describes the specific principles involved in this denoising algorithm.

2.2.1. Empirical Mode Decomposition

The basic idea of EMD [22] is that for all signals, the time series can be decomposed into a finite number of IMFs with different characteristic scales, each representing a characteristic oscillation in a time scale and with local orthogonality and adaptive properties. In the analysis of non-stationary signals, the interference of the same frequency components between different components over time can be effectively avoided.

In the signal model established above for array reception, where the array element received signal is $[eqn]$ , we perform EMD of each array element signal separately to obtain IMFs and residuals for each row vector, as follows:

[eqn]

where $[eqn]$ is the residual, which is a slowly varying function of the non-zero mean with only a few extreme values. Here, $[eqn]$ , $[eqn]$ , and $[eqn]$ .

The EMD algorithm proceeds as follows, taking as an example the first array received signal $[eqn]$ of the array received signal as the input signal:

(1)Find the local maxima and minima of a, obtain the sequence of maxima and minima, and interpolate the maxima and minima, respectively, to obtain the upper and lower envelopes of $[eqn]$ . The two envelopes are asymmetric to each other.
(2)Average the maxima and minima envelopes, and obtain the envelope mean $[eqn]$ . Subtract the envelope mean from the original signal to obtain the first component $[eqn]$ .
(3)Use $[eqn]$ as the input and repeat steps 1 and 2 to obtain $[eqn]$ , iterating continuously. Use the standard deviation of the two adjacent decomposition components as the criterion for stopping the iteration, which is generally taken as $[eqn]$ . At this point, obtain $[eqn]$ . Calculate $[eqn]$ as $[eqn]$ .
(4) $[eqn]$ is the first IMF, noted as $[eqn]$ , and subtract the input signal from $[eqn]$ to obtain the residual quantity $[eqn]$ .
(5)Use $[eqn]$ as the new input signal and re-execute steps 1 to 5 to obtain a new residual $[eqn]$ and a second $[eqn]$ , and so on, to obtain $[eqn]$ . At this point, the residual $[eqn]$ becomes a monotonic function and cannot be decomposed into IMF again. The final result is $[eqn]$ .

The iterative flow chart of the EMD algorithm is shown in Figure 2.

2.2.2. Wavelet Threshold Denoising

Wavelet threshold denoising has the feature of suppressing the useless part of the signal and enhancing the useful part [23,24]. The following describes the principle of wavelet threshold denoising:

The following noise signal $[eqn]$ is given by

[eqn]

where $[eqn]$ is the original signal, $[eqn]$ is the white noise signal following a standard Gaussian distribution, and $[eqn]$ is the noise variance. The wavelet threshold denoising method starts with the selection of a suitable wavelet basis $[eqn]$ , and the discrete wavelet transform (DWT) is obtained as

[eqn]

where $[eqn]$ is the wavelet decomposition coefficient. Then, a threshold is set for quantization, i.e., $[eqn]$ , where C is a constant. The basic principle of wavelet thresholding is to set all wavelet coefficients below that threshold to zero, and to keep those above that threshold directly or to process them accordingly. Currently, the most common thresholding functions are the hard and soft threshold values proposed by Donoho [25], which are defined as

[eqn]

and

[eqn]

For the selection of the threshold value, the common threshold formula is $[eqn]$ , which guarantees a high probability that all noisy components will have a low amplitude. Here, $[eqn]$ is the length of the sampled signal, and $[eqn]$ is the standard deviation of the noisy signal, which is estimated as [26]

[eqn]

After processing with hard or soft thresholding, the denoised signal is obtained by inverting the processed wavelet coefficients, as follows:

[eqn]

where $[eqn]$ .

2.2.3. Implementation of Iterative EMD Interval Thresholding

The IMF threshold denoising method in this paper is related to the wavelet threshold denoising method. Wavelet denoising is the thresholding of wavelet components, whereas in the EMD case, IMF thresholding is performed on samples from each IMF. However, the fact is that the noise contained in each IMF is colored, i.e., each mode has a different energy in it. In this sense, EMD denoising is most closely related to wavelet denoising of signals corrupted by colored noise, where the threshold must be scale-dependent, and by adapting the threshold function to the nature of each IMF, threshold denoising of the IMF obtained by EMD can locally exclude the low-energy IMF parts disturbed by high noise. Based on the idea of wavelet thresholding, the IMF threshold is given by

[eqn]

for hard thresholding and given by

[eqn]

for soft thresholding. In both thresholding cases, we can see that the thresholds are also different for the different IMFs, where $[eqn]$ represents the $[eqn]$ thresholded IMF. The thresholds we study become multiples of the independent thresholds for each IMF, i.e., $[eqn]$ , where $[eqn]$ represents the energy of the $[eqn]$ IMF, which is given by

[eqn]

where $[eqn]$ is the energy of the first IMF. According to [27], the values of parameters $[eqn]$ and $[eqn]$ are 0.719 and 2.01.

The reconstruction of the denoised signal is given by

[eqn]

In particular, the addition of the $[eqn]$ and $[eqn]$ parameters gives us more flexibility in excluding noisy low-order IMFs and in selecting thresholds for higher-order IMFs. Regarding the parameter $[eqn]$ , it is calculated by conventional EMD denoising [28,29,30]. The denoised signal must be reconstructed from IMFs of order $[eqn]$ and higher; in other words, the energy of the noise is lower than the energy of the signal in IMFs of order $[eqn]$ and higher. The best choice for the $[eqn]$ parameter is

[eqn]

And the best choice for the parameter is $[eqn]$ .

For an independent IMF sample amplitude, it is not possible to infer whether it corresponds to a noisy or useful signal. However, it is possible to guess whether the signal within this interval is noise-dominated or signal-dominated, based on the single extreme value $[eqn]$ corresponding to the interval $[eqn]$ adjacent to the over-zero point, where $[eqn]$ and $[eqn]$ . If a strong signal is present within this interval, the absolute value of the extreme value will exceed the threshold; conversely, if the signal is small, the absolute value of the extreme value will be below the threshold. Thus, the modified hard and soft thresholds, referred to as the EMD interval threshold (EMD-IT), are

[eqn]

and

[eqn]

where $[eqn]$ represents the sample values from interval $[eqn]$ to $[eqn]$ in the $[eqn]$ IMF.

Based on the idea of translation-invariant wavelet thresholding [28], multiple denoised versions of the signal are obtained by iteration, and their denoising power is enhanced by averaging them. In the EMD case, different denoised versions of the noisy signal can only be obtained by thresholding different versions of the IMF. We know that under Gaussian white noise conditions, the first IMF is mainly noise; in other words, it contains more noise than the others. By changing the position of the first IMF sample and then adding the newly generated noise signal to the sum of the remaining IMFs, we can obtain a different noisy version of the original signal. In fact, when the first IMF contains only noise, the total noise variance of the newly generated noise signal is the same as that of the original noise signal. Figure 3 shows the flowchart of the algorithm referred to as iterative EMD interval thresholding (EMD-IIT), in the following steps:

(1)EMD expansion of the initial noisy signal $[eqn]$ .
(2)Local reconstruction using only the last L-1 IMFs, $[eqn]$ .
(3)Randomly changing the sample position of the first IMF, $[eqn]$ .
(4)Constructing a different noisy version of the original signal, $[eqn]$ .
(5)Performing EMD processing on the newly obtained noisy signal $[eqn]$ .
(6)The denoised version of the original signal, $[eqn]$ of $[eqn]$ , is obtained by denoising the IMFs of the newly obtained noisy signal $[eqn]$ by Formula (16) or Formula (17).
(7)Iterating $[eqn]$ more times in steps 3–6 (typically, $[eqn]$ is set to 20) to obtain $[eqn]$ denoised versions of x, i.e., $[eqn]$ .
(8)Averaging of the noise-reduced signal, $[eqn]$ .

2.3. The Algorithm for Iterative EMD Interval Thresholding and Off-Grid Sparse Bayesian Learning

The DOA estimation algorithm in this paper consists of the following parts: the first part is to perform EMD-IIT denoising on the noisy signal received by each array element to obtain the signal with noise interference removed; the second part is to specify the signal space by building an off-grid sparse grid model and to decompose the denoised signal into singular values to further reduce the sensitivity to noise; and the third part is to learn by Bayesian continuous iteration, updating the hyperparameters, and finally reaching a state of convergence to obtain the DOA estimates. The overall framework of the algorithm is shown in Figure 4.

2.3.1. EMD-IIT Denoising

According to Formula (3), we obtain the noise-bearing signal received by the hydrophone arrays, i.e., $[eqn]$ , where $[eqn]$ is the signal of $[eqn]$ snapshots. Here, with the EMD-IIT algorithm above, we denoise the noisy signal. Since each array is subjected to EMD-IIT, the signal matrix received by the $[eqn]$ array element is defined as $[eqn]$ for ease of calculation. We write the noise-bearing signal in the following form:

[eqn]

Then, the EMD-IIT algorithm is used to denoise each row of the noisy signal separately to obtain the denoised signal $[eqn]$ . Then, the new array signal vector after removing the noise is

[eqn]

where $[eqn]$ is the vector signal of an $[eqn]$ dimension.

2.3.2. Off-Grid Sparse Model and Singular Value Decomposition

The spatial angle range $[eqn]$ is uniformly divided into $[eqn]$ grid points, with each point representing a potential incident direction, such as $[eqn]$ , and $[eqn]$ . From this, the grid interval $[eqn]$ can be determined. It is evident that the target orientation can be reinterpreted as an overcomplete sparse representation across the $[eqn]$ divided grid points. In the off-grid sparse model, there is an issue where the target incidence direction does not align perfectly with the grid points, resulting in a mismatch, represented as $[eqn]$ ; generally a denser grid point can reduce the error, but this does not contain all possible incidence directions and increases the computational effort. To address this issue, an off-grid error is incorporated into the array’s prevalence matrix by performing a first-order Taylor expansion between two neighboring grid points, allowing the steering vector to be approximated as

[eqn]

where $[eqn]$ denotes the nearest grid point to $[eqn]$ , $[eqn]$ , and $[eqn]$ . By letting $[eqn]$ , the grid error can be expressed as

[eqn]

According to Formula (20), the overcomplete array popularity matrix can be expressed as

[eqn]

Considering the case where the background noise has been filtered out above, we have $[eqn]$ , where $[eqn]$ and $[eqn]$ . Let $[eqn]$ , where $[eqn]$ and $[eqn]$ are matrices consisting of the first $[eqn]$ columns of $[eqn]$ and the remaining $[eqn]$ columns, respectively. By singular value decomposition, we obtain matrices $[eqn]$ containing all signal information, where $[eqn]$ , and where the first part $[eqn]$ retains most of the signal information and is used in the signal reconstruction process later, and the second part is discarded. Let $[eqn]$ ; then, we can represent the signal after singular value decomposition as

[eqn]

where the signal $[eqn]$ still has joint sparsity [29].

2.3.3. Sparse Bayesian Inference

The Bayesian inference method is used for the estimation of hydroacoustic target orientation, and the optimal estimate can be obtained. Assuming that the noise signal obeys the complex Gaussian distribution, the following likelihood function of $[eqn]$ is given by

[eqn]

where CN represents the complex Gaussian distribution, $[eqn]$ , where $[eqn]$ is the noise variance and $[eqn]$ is usually unknown, which is assumed to obey the Gamma prior distribution.

Let the prior probability of the sparse signal $[eqn]$ be

[eqn]

where $[eqn]$ is a set of hyperparameters, $[eqn]$ , which represents the source signal power incident to the array in each direction and also affects the sparsity of the sparse signal. Here, $[eqn]$ represents the covariance matrix of the sparse signal.

Combining the prior information and the likelihood function, the joint probability density function is obtained as

[eqn]

The posterior probability density function of $[eqn]$ can be obtained from Bayesian inference as

[eqn]

where the posterior mean $[eqn]$ and posterior covariance matrix $[eqn]$ of the sparse signal are given separately by

[eqn]

and

[eqn]

where $[eqn]$ and $[eqn]$ are functions with respect to the three hyperparameters $[eqn]$ , $[eqn]$ , and $[eqn]$ . We use a maximum posterior probability criterion to maximize the probability $[eqn]$ , which leads to the derivation of two parameters $[eqn]$ and $[eqn]$ . These two parameters are given by

[eqn]

and

[eqn]

where $[eqn]$ , $[eqn]$ , and $[eqn]$ is a positive constraint taking small values.

The grid error $[eqn]$ determines the accuracy of the target orientation estimation. We can use the expectation maximization criterion to find the grid error so that the expectation $[eqn]$ is maximized, which is equivalent to minimizing $[eqn]$ , as follows:

[eqn]

where $[eqn]$ is a constant depending on $[eqn]$ . $[eqn]$ is a semi-positive definite matrix whose expression is given by

[eqn]

[eqn]

The angle correction vector is obtained by the above derivation as [30]

[eqn]

Next, we obtain the expression for $[eqn]$ by taking the derivative of $[eqn]$ by Formula (32) and setting it to zero as

[eqn]

where $[eqn]$ is $[eqn]$ without the $[eqn]$ entry for a vector $[eqn]$ . By means of the constraint $[eqn]$ , we have

[eqn]

In the above Bayesian inference, we maximize the posterior probability to find out the updated formula of the two hyperparameters $[eqn]$ and $[eqn]$ . Then, through these two hyperparameters, we use the expectation maximization criterion to find out the update formula of the angle correction vector hyperparameter $[eqn]$ and obtain the final grid error $[eqn]$ . Finally, we initialize $[eqn]$ , $[eqn]$ , and $[eqn]$ and keep iterating Formulas (30), (31), and (37) until convergence, so that we can calculate the estimated value of the K DOAs as $[eqn]$ , where $[eqn]$ .

The specific flow of the algorithm for iterative EMD interval thresholding and off-grid sparse Bayesian learning is as follows:

(1)Pass the received signal from the array through EMD-IIT to obtain the denoised signal $[eqn]$ .
(2)Construct the off-grid sparse model to obtain the overcomplete sparse dictionary $[eqn]$ and perform the singular value decomposition of $[eqn]$ to obtain $[eqn]$ .
(3)Initialize the hyperparameters $[eqn]$ and $[eqn]$ for the noise and signal, respectively, and initialize the angle correction vector $[eqn]$ , the mean $[eqn]$ , and the variance $[eqn]$ to zero.
(4)Use Formulas (28) and (29) to solve for the mean and variance, respectively.
(5)Use Formulas (31)–(33) to obtain the updated hyperparameters $[eqn]$ , $[eqn]$ and $[eqn]$ .
(6)When $[eqn]$ or the maximum number of iterations is reached, continue to the next step; if it does not converge, skip to step 4.
(7)Calculate DOA estimates for the target.

3. Simulation Analysis

In order to verify the feasibility of the algorithms in this paper, simulation analysis is performed in this section and the estimated performance of the three algorithms is compared. If not specified, the following parameters and initial values were used in the DOA estimation of the objectives: In OGSBI-SVD and EMDIIT-OGSBI, $[eqn]$ and $[eqn]$ are set. $[eqn]$ , $[eqn]$ , and $[eqn]$ are initialized. The uniform line array element $[eqn]$ is initialized with a grid distance of $[eqn]$ , $[eqn]$ is set, and the maximum number of iterations is 800. The original signal is a frequency band signal with a center frequency of $[eqn]$ , $[eqn]$ , the sampling frequency is 10 kHz, $[eqn]$ is set as the underwater sound velocity, and the adjacent array element spacing is $[eqn]$ . The signal-to-noise ratio is calculated as $[eqn]$ , where $[eqn]$ is the signal power, $[eqn]$ is the noise power, and the spatial angle is divided into $[eqn]$ , and the noise is complex Gaussian white noise. The root mean square error (RMSE) is defined as $[eqn]$ , where $[eqn]$ denotes the number of Monte Carlo experiments, $[eqn]$ is the number of source signals, and $[eqn]$ denotes the orientation estimate of the $[eqn]$ signal source in the first experiment.

3.1. EMD-IIT Denoising Analysis

To visualize the denoising performance of the EMD-IIT algorithm, the noisy signals received by each array element are compared and analyzed with the original signals in this section. The time–frequency spectrum of the original signal is shown in Figure 5. Let the incident directions of the two target signals be [−3.6° 11°]; the number of snapshots is $[eqn]$ . The time–frequency spectrum of the received signals of each array element at a signal-to-noise ratio of 5 dB is given in Figure 6. Figure 7 gives the time–frequency spectrum of the received signals of each array element after the EMD-IIT denoising algorithm. We can see that after the EMD-IIT denoising algorithm, a portion of the frequency interference has been effectively removed from the time–frequency spectrum of the array element. This is due to the enhanced noise reduction capability of the EMD-IIT algorithm through multiple noises averaging by the translation-invariant wavelet threshold. The signal-to-noise ratio after denoising and the signal-to-noise ratio before denoising are given in Table 1 for each array element, which further illustrates the good denoising performance of EMD-IIT.

3.2. Spatial Power Spectrum Estimation Analysis

To validate the proposed method and highlight its superiority, we compare it under identical conditions with several existing algorithms, including the MUSIC algorithm, the off-grid sparse Bayesian inference algorithm (OGSBI), and the MUSIC-like algorithm. This comparative analysis demonstrates the advantages of our algorithm in terms of performance and accuracy. The orientation of the two target signals is [−3.6° 11°], $[eqn]$ , and the number of snapshots is $[eqn]$ . Figure 8 illustrates that the EMDIIT-OGSBI algorithm, as proposed in this paper, is capable of matching the target incident orientation, exhibiting higher spatial–spectral gain and a narrower main flap width. The OGSB-SVD algorithm is able to distinguish the target orientation; however, the main flap width is wider and comprises four pseudo-peaks. The MUSIC algorithm is capable of approximating the target orientation; however, the main flap width is wider, and the peak is not discernible. The MUSIC-like algorithm is able to align with the target orientation; however, its main lobe width is wider, and the spatial gain is inferior in comparison to the algorithm proposed in this paper.

3.3. Root Mean Square Error Analysis

3.3.1. RMSE of the Algorithm at Different Numbers of Monte Carlo Trials

Figure 9 shows the variation in RMSE with the number of Monte Carlo trials. The two target orientations in this experiment are [−3.6° 11°], the signal-to-noise ratio is 0 dB, the number of snapshots is 1024, and other parameters are unchanged. It can be seen from the graph that when the number of Monte Carlo trials is less than 100, the RMSE values of the four algorithms converge less, and the DOA estimates of the algorithms have a chance at this time. To avoid this problem, the number of Monte Carlo trials in the following root mean square error analysis is 200.

3.3.2. RMSE of the Algorithm at Different Signal-to-Noise Ratios

Figure 10 shows the variation in root mean square error with signal-to-noise ratios for the four algorithms. The number of snapshots is 1024, $[eqn]$ , and the two target signal orientations are [−14.3° 6°], respectively. As shown in Figure 9, the MUSIC algorithm has the relatively highest RMSE value at low SNRs, and its orientation estimation performance is limited. The OGSBI-SVD algorithm and the MUSIC-like algorithm have higher RMSE values; however, their performance improves as the signal-to-noise ratio increases, leading to better orientation estimation under the influence of weak noise. In contrast, the EMDIIT-OGSBI algorithm in this paper has a relatively lower RMSE value and a relatively more outstanding DOA estimation accuracy and still has a high DOA estimation accuracy at low signal-to-noise ratios. This is due to the suppression of Gaussian noise by the EMDIIT denoising algorithm, and the singular value decomposition of the off-grid sparse observation matrix also reduces the effect of noise interference.

3.3.3. RMSE of Algorithms at Different Snap Counts

Figure 11 shows a comparison of the RMSEs of the four algorithms for different numbers of snapshots. The range of the number of snapshots is $[eqn]$ , the signal-to-noise ratio is −3 dB, the two target orientations are $[eqn]$ , and other conditions remain unchanged. As can be seen from Figure 11, the RMSE values of the four algorithms gradually decrease as the number of snapshots increases, while the RMSE values of the proposed algorithm are relatively lower, and the target orientation can be accurately estimated under the low snapshot condition.

3.3.4. RMSE of the Algorithm at Different Grid Distances

Figure 12 shows the RMSE values of the proposed algorithm in this paper as a function of the S/N ratio at grid distances $[eqn]$ . The two target orientations are $[eqn]$ , the number of snapshots is 1024, the signal-to-noise ratio range is $[eqn]$ , and other conditions remain unchanged. As can be seen from Figure 12, the RMSE values of the four different grid distances gradually decrease as the signal-to-noise ratio increases. The finer the grid distance is divided, the smaller the RMSE value is. We can also see that the difference in RMSE between coarse and fine grid distance is insignificant, which shows that the proposed algorithm still has high accuracy in estimating the target orientation under coarse grid distance. It is worth noting that the algorithm in this paper can still maintain a high estimation accuracy even with coarse grid spacing.

3.4. Analysis of the Discriminative Probability of Compact Sound Sources

This section focuses on the analysis of the spatial resolution probability of compact sound sources. Under a uniform line array, the spatial resolution capability of the algorithm in this paper for compact sound sources is analyzed. We set the incident directions of the two target signals as $[eqn]$ , the signal-to-noise ratio as 5 dB, the number of snapshots as 1024, and the grid distance as 1°, with other conditions being constant. Figure 13 gives the spatial–spectral estimation plots of the four algorithms for compact sound sources. From Figure 13, it can be seen that compared with the MUSIC, MUSIC-like, and OGSBI-SVD algorithms, the algorithms in this paper have a better spatial resolution for the compact sound sources and have narrower main flap widths and sharper peaks, indicating that the orientation estimation of the compact sound sources also has higher accuracy. To further investigate the discrimination ability of the algorithm proposed in this paper for spatially tight signal DOA estimation, the discrimination probabilities of the three algorithms are analyzed at different DOA intervals with a signal-to-noise ratio of 0 dB with other conditions being constant. The DOAs of the two target signals are defined as $[eqn]$ and $[eqn]$ , and their corresponding spatial power spectrum values are $[eqn]$ and $[eqn]$ ; the DOA interval $[eqn]$ was varied from 2° to 30° in steps of 2°, and 200 Monte Carlo experiments are performed at each DOA interval. The intermediate values of the two target DOAs are set to $[eqn]$ , and $[eqn]$ corresponds to a spatial power spectrum value of $[eqn]$ ; if $[eqn]$ is satisfied, the discrimination of the compact sound sources is successful. As can be seen from Figure 14, the EMDIIT-OGSBI algorithm in this paper achieves a 100% resolution probability at source intervals $[eqn]$ . The OGSBI-SVD algorithm achieves 100% resolution probability at $[eqn]$ ; the MUSIC algorithm achieves 100% resolution probability at $[eqn]$ ; and the MUSIC-like algorithm achieves 100% resolution probability at $[eqn]$ . It can be seen that the algorithm has a good spatial resolution of spatially immediate neighboring signals.

3.5. Analysis of the Discriminative Probability at Different Signal-to-Noise Ratios

To investigate the algorithms’ ability to discriminate between targets at low signal-to-noise ratios, the following simulations are made. The incident directions of the two target signals are randomly generated in the range of $[eqn]$ , the separation between them is 14°, and the signal-to-noise ratio range is $[eqn]$ , with all other conditions being equal. Figure 15 shows the resolution probabilities of the four algorithms as the signal-to-noise ratio varies. From this figure, it can be seen that the MUSIC and OGSBI-SVD algorithms have a poor resolution of the target at a low S/N ratio, and their resolution ability is improving as the S/N ratio increases. Compared to the MUSIC and OGSBI-SVD algorithms, the MUSIC-like algorithm has better target resolution capability at low signal-to-noise ratios, with its resolution probability reaching 100% at a low SNR of −4 dB. In contrast to the other three algorithms, the proposed EMDIIT-OGSBI algorithm still has a very strong resolution of the target at low SNRs, and the resolution probability reaches 100% only at a low SNR of −9 dB, which effectively shows that the algorithm still has an excellent resolution of the target signal at low SNRs.

3.6. Algorithm Runtime Analysis

Figure 16 shows the comparison of the algorithm running time of the four algorithms with different grid spacing. SNR = 5 dB was set, the number of snaps is 1024, and the two target incidence directions are $[eqn]$ , respectively. We can see that the running time of the four algorithms decreases as the grid spacing increases. The MUSIC algorithm has the shortest runtime because it does not involve iterative operations; however, this also results in its poor orientation estimation performance, as discussed earlier. The MUSIC-like algorithm requires the calculation of fourth-order cumulants, which increases its runtime. The EMDIIT-OGSBI algorithm includes iterative operations not only in the noise reduction process but also in the azimuth angle estimation, leading to a longer runtime than the MUSIC algorithm. Nevertheless, its runtime is still shorter than that of the OGSBI-SVD algorithm. This is because, in the EMDIIT-OGSBI algorithm, the EMDIIT method effectively reduces the impact of background noise, significantly decreasing the number of iterations required by OGSBI. Additionally, singular value decomposition is employed to reduce the signal’s dimensionality and discard unnecessary signal matrices, further reducing the runtime.

4. Validating Algorithms with Sea Trial Data

The data obtained from a sea trial experiment conducted in a sea area are used to verify the performance of the algorithm. The experimental sea area on that day was calm in terms of ocean noise, with no other passing vessels on the surface and low wind speeds. The sound velocity profiles of the sea surface on that day are shown in Figure 17, which were measured at 13:52 p.m., 14:57 p.m., and 16:41 p.m. In the sea trial, the sound source transmitting equipment UW350 was lifted at a depth of 5 m and the transmitting signal was a broadband long pulse signal of 200–600 Hz, with a signal length of 15 s and a sampling frequency of 10 kHz. The transmitted signal is shown in Figure 18. The time–frequency spectrum of the sound source is shown in Figure 19. The number of hydrophone array elements is eight, the array element spacing is half a wavelength, and the experimental sea depth is 25.5 m. The above equipment deployment depth and seawater depth are measured by the depth sensor. The schematic layout of the sea trial is shown in Figure 20.

DOA verification of the target was carried out by selecting data from two different locations from the sea trials. The target orientations were $[eqn]$ , which were placed at 14:57 p.m. and 16:41 p.m., respectively. Figure 21 and Figure 22 show the signal’s time–frequency spectrum of the eight array elements collected at two time periods, respectively. It can be seen that the noise of the day interferes with the original signal. The results of DOA spatial spectrum estimation at different target orientations are given in Figure 23 and Figure 24, respectively. In the orientation estimation of the sea trial data, the number of snapshots chosen is 512 and 1024. From these figures, it can be seen that the MUSIC algorithm can roughly distinguish the target orientation, has some deviation in the target orientation estimation, has a wide main flap width and poor spatial power spectrum gain, and the OGSBI-SVD algorithm can approximate the target orientation, with a relatively narrow width of the main flap and a more pronounced peak. However, there are pseudo-peaks in the spatial power spectrum, which will seriously affect the discrimination of the true orientation; the MUSIC-like algorithm is relatively stable and is capable of distinguishing target direction angles with a narrower main lobe width. However, it exhibits poor spatial gain, whereas the EMDIIT-OGSBI algorithm in this paper has a narrower main flap width and higher spatial–spectral gain, which can estimate the target orientation more accurately. In summary, the DOA estimates of the four algorithms are approximately equal to the actual deployed orientation. In comparison, both the EMDIIT-OGSBI algorithm and the MUSIC-like algorithm demonstrate high accuracy at low snapshot counts, are less affected by the number of snapshots, and exhibit better robustness.

To better compare the performance of the four algorithms for the orientation estimation of the sea trial data, 200 Monte Carlo experiments are conducted on the sea trial experimental data of the three positions, and the number of snapshots is chosen as 1024, and the mean value and root mean square error of the DOA estimation results are recorded in Table 2. The estimation results of the algorithm in this paper are closer to the target bearing and have a stronger suppression ability to the background noise interference of the ocean, which is generally consistent with the numerical simulation analysis.

5. Conclusions

This paper investigates the application of EMD-IIT-based denoising with off-grid sparse Bayesian learning algorithms for hydroacoustic direction-of-arrival (DOA) estimation. The proposed algorithm effectively reduces noise, improves target resolution, and enhances estimation accuracy.

The EMDIIT-OGSBI algorithm demonstrates strong robustness, shorter runtime, and reduced sensitivity to grid spacing and the number of snapshots compared to conventional algorithms. It maintains high resolution for target signals and neighboring signals in the presence of strong noise interference. Overall, the algorithm addresses the challenges of low precision and poor resolution in target DOA estimation under low signal-to-noise ratios, making it a relevant and beneficial approach in hydroacoustic signal processing.

Bibliography30

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Yang D.S. Zhu Z.R. Tian Y.Z. Theoretical Bases and Application Development Trend of Vector Sonar Technology J. Unmanned Undersea Syst.201826185192
2Huang H.N. LIY. Underwater Acoustic Detection: Current Status and Future Trends China Acad. J. Electron. Publ. House 201934264271
3Yang Y.X. Han Y.N. Zhao R.Q. Liu X.H. Wang Y. Ocean Acoustic Target Detection Technologies: A Review J. Unmanned Undersea Syst.201826369386
4Xing C.X. Wan Z.L. Jiang S.Y. Yu R.M. Direction of arrival estimation based on high-order cumulant by sparse reconstruction of underwater acoustic signals Acta Acust.202247440450
5Li Q.H. Wei C.H. Iterative inverse beamforming algorithm and its application in multiple targets detection of passive sonar Chin. J. Acoust.202241744749
6Jiang S. Liu S. Jin M. High-dimensional MVDR beamforming based on a second unitary transformation Signal Process.202320510886910.1016/j.sigpro.2022.108869 · doi ↗
7Chowdhury M.W.T.S. Mastora M. Performance analysis of MUSIC algorithm for DOA estimation with varying ULA parameters Proceedings of the 23rd International Conference on Computer and Information Technology (ICCIT)Rome, Italy 22–23 July 2020
8Li W. Liao W. Fannjiang A. Super-resolution limit of the ESPRIT algorithm IEEE Trans. Inf. Theory 2020664593460810.1109/TIT.2020.2974174 · doi ↗