Research on AUV Underwater Localization Method Based on an n-Shaped Array

Chuang Han; Mengran Gao; Tao Shen; Chengli Guo

PMC · DOI: 10.3390/s26061845 · March 15, 2026

Open Access

TL;DR

This paper introduces a new method for localizing an autonomous underwater vehicle (AUV) using an n-shaped hydrophone array to improve recovery accuracy.

Contribution

A novel AUV localization method using an n-shaped array with MUSIC and SAGE algorithms for improved underwater positioning.

Findings

01

The proposed method effectively handles coherent signals caused by underwater transmission impairments.

02

Simulation results show the method achieves good parameter estimation performance.

03

The algorithm is extended to support both far-field and near-field localization scenarios.

Abstract

During continuous navigation of the mother ship, an autonomous underwater vehicle (AUV) can be recovered through an underwater hangar, and the accurate localization of the AUV relative to the mother ship is a key step in the recovery process. To address the AUV localization problem, an n-shaped hydrophone array is designed based on the spatial configuration of the underwater hangar. Since underwater acoustic signals are susceptible to multipath propagation, co-channel interference, and other transmission impairments, the signals received by the array often exhibit coherence. Accordingly, a far-field sound source localization method based on the n-shaped array is proposed. The proposed algorithm first applies spatial smoothing to the x-axis and y-axis subarrays and jointly constructs a received data vector, followed by eigenvalue decomposition of the corresponding covariance matrix. The…

Funding

Program for Young Talents of Basic Research in Universities of Heilongjiang Province

Keywords

n-shaped array; MUSIC algorithm; SAGE algorithm; coherent signals; far-field localization; near-field source localization

Taxonomy

Topics: Underwater Acoustics Research · Direction-of-Arrival Estimation Techniques · Underwater Vehicles and Communication Systems

Full text

1. Introduction

AUVs are essential platforms for underwater exploration and reconnaissance. Common recovery methods for AUVs include ship-based recovery using a support vessel and recovery via fixed platforms. In certain operational scenarios, however, the support vessel is required to recover the AUV while maintaining continuous navigation. Under such conditions, recovery through an underwater docking hangar mounted on the mother ship has received increasing attention. In underwater environments, acoustic waves constitute the primary medium for information transmission. AUVs are typically equipped with acoustic beacons or generate radiated noise during operation, which can be captured by a receiving array. By processing the received acoustic signals, the spatial parameters of the AUV can be estimated. Therefore, the AUV recovery and localization problem in underwater environments can be formulated as an acoustic source localization problem. Source localization is one of the fundamental research topics in the field of array signal processing, and its primary objective is to process the signals received by a sensor array in order to estimate the spatial parameters of the source, such as azimuth, elevation, and range. This framework can be further extended to the estimation of additional parameters, including signal frequency and time delay [1].

The array geometry plays a crucial role in determining the performance of source localization algorithms, and different array configurations demonstrate notable differences in parameter estimation accuracy and implementation complexity. In underwater acoustic source localization studies, commonly employed array structures include the Uniform Linear Array (ULA), the L-shaped array, and the planar array. The ULA is structurally simple and easy to implement; however, it is inherently constrained in multidimensional parameter estimation and cannot provide the unambiguous and accurate joint estimation of multiple spatial parameters such as azimuth, elevation, and range. Although the planar array can offer higher-dimensional spatial information, its structure is more complex and typically requires a larger number of array elements as well as higher deployment precision, which may lead to deployment challenges and increased system costs in practical underwater engineering scenarios. Considering the structural characteristics and spatial layout of the underwater hangar of the mother ship, an n-shaped hydrophone array is deployed near the hangar entrance. This array configuration makes efficient use of the available physical space and provides advantages in multidimensional parameter estimation. The n-shaped array can receive acoustic signals emitted by an acoustic beacon carried by the AUV, as well as radiated noise generated by the vehicle itself. By processing and analyzing the received signals, the spatial parameters of the target sound source, including angular parameters and range, can be accurately estimated, thereby enabling reliable localization of the AUV.

For underwater operational missions of AUVs, localization requirements typically include both direction of arrival (DOA) estimation under long-range conditions and accurate position estimation under short-range conditions. Depending on the distance between the sound source and the receiving array relative to the array aperture, source localization problems are generally classified into far-field and near-field localization. When the sound source is located in the far-field region, the wavefront impinging on the array can be well approximated as a plane wave. In contrast, when the source lies within the Fresnel region of the array, the plane-wave assumption is no longer valid [2]. In this case, the received wavefront is spherical, and the phase differences across the array depend on both the source direction and range [3]. Consequently, near-field sound source localization depends not only on the DOA but also on the distance between the source and the reference sensor of the array. For far-field acoustic source localization, commonly used approaches include conventional beamforming (CBF), Minimum Variance Distortionless Response (MVDR) [4], and the Multiple Signal Classification (MUSIC) algorithm [5]. Among these methods, the MUSIC algorithm exhibits superior estimation performance; however, because it relies on spectral peak search to determine the source direction, it suffers from relatively high computational complexity. To address this issue, a series of improved MUSIC-based methods [6,7,8] as well as the Estimation of Signal Parameters via Rotational Invariance Technique (ESPRIT) [9,10] and the propagator method [11,12,13] have been proposed. For near-field acoustic source localization, several methods have also been developed, including the near-field MVDR algorithm [14], the maximum likelihood (ML) method [15], the near-field MUSIC algorithm [16,17], and the covariance approximation (CA) method [18].
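The far-field and Fresnel distinction above can be made concrete with a small sketch. The two bounds used here, 0.62·sqrt(D³/λ) and 2D²/λ, are the commonly quoted antenna-theory limits and are not taken from this paper. The 1500 m/s sound speed and the 1 m aperture are illustrative assumptions, and `field_region` is a hypothetical helper name.

```python
import math

def field_region(distance_m: float, aperture_m: float, wavelength_m: float) -> str:
    """Classify a source distance using commonly quoted bounds (an assumption here)."""
    fresnel_inner = 0.62 * math.sqrt(aperture_m ** 3 / wavelength_m)
    far_field_start = 2.0 * aperture_m ** 2 / wavelength_m
    if distance_m < fresnel_inner:
        return "reactive near field"
    if distance_m < far_field_start:
        return "Fresnel (radiating near field)"
    return "far field"

# Illustrative numbers: 8 kHz tone, assumed sound speed 1500 m/s, assumed 1 m aperture.
wavelength = 1500.0 / 8000.0   # 0.1875 m
print(field_region(5.0, 1.0, wavelength))    # between the two bounds
print(field_region(50.0, 1.0, wavelength))   # beyond 2 D^2 / lambda
```

With these numbers the far-field bound sits near 10.7 m, so a source at 5 m would need the spherical-wave model while a source at 50 m would not.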

During underwater propagation, acoustic signals are easily affected by multipath propagation and co-channel interference, which often cause the signals received by the array to exhibit coherence. When the incident signals are coherent, the covariance matrix of the array observations becomes rank-deficient. As a result, the dimension of the signal subspace is smaller than the actual number of sound sources, leading to performance degradation or even failure of conventional DOA estimation algorithms. To address DOA estimation under coherent signal conditions, extensive research efforts have been reported in the literature [19,20,21,22]. In [19], a joint covariance matrix was constructed using a spatial smoothing technique, and the ESPRIT algorithm was then applied to estimate the directions of arrival, thereby improving estimation accuracy. In [20], a spatial-smoothing-based ESPRIT was proposed. In this approach, a modified covariance matrix is first obtained through spatial smoothing applied to two parallel linear arrays, and the DOA estimation is subsequently achieved by exploiting the rotational invariance property between subarrays. In [21], the received data vectors of the array elements were used to construct a Toeplitz matrix. A modified covariance matrix was then obtained through Hermitian transpose-based correction and forward–backward processing, and coherent-signal DOA estimation was finally realized by integrating the ESPRIT algorithm.
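The rank deficiency caused by coherence can be checked numerically. The sketch below assumes a half-wavelength uniform linear array and noise-free snapshots, which is a simplification of the n-shaped array considered in this paper. All names are illustrative.

```python
import numpy as np

def ula_steering(theta_deg, num_sensors, spacing_over_lambda=0.5):
    """Plane-wave steering vector of a uniform linear array (assumed model)."""
    m = np.arange(num_sensors)
    return np.exp(2j * np.pi * spacing_over_lambda * m * np.sin(np.deg2rad(theta_deg)))

rng = np.random.default_rng(0)
M, N = 8, 500
A = np.column_stack([ula_steering(-20.0, M), ula_steering(30.0, M)])

s1 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
s2_independent = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
s2_coherent = 0.8 * np.exp(0.7j) * s1   # scaled, phase-shifted copy, as multipath would produce

def signal_covariance_rank(S):
    X = A @ S                            # noise-free array snapshots
    R = X @ X.conj().T / S.shape[1]      # sample covariance matrix
    return int(np.linalg.matrix_rank(R, tol=1e-8))

rank_independent = signal_covariance_rank(np.vstack([s1, s2_independent]))
rank_coherent = signal_covariance_rank(np.vstack([s1, s2_coherent]))
print(rank_independent, rank_coherent)   # 2 1
```

Two independent sources span a two-dimensional signal subspace, whereas the coherent pair collapses to a single dimension, which is the failure mode that spatial smoothing is meant to repair.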

The methods discussed above mainly focus on DOA estimation for coherent signals in far-field scenarios. Research on near-field coherent signal localization has also attracted increasing attention [23,24,25,26]. In [23], an efficient iterative algorithm was proposed for the localization of near-field coherent sources. In each iteration, a covariance matrix containing information from only a single source is constructed using the alternating oblique projection (AOP) technique, the DOA is estimated based on the vector inner-product principle, and the corresponding range parameter is finally obtained using one-dimensional ML estimation. In [24], a focusing technique was first applied to approximate the near-field signal model as a far-field model, after which spatial smoothing and ESPRIT were employed for decorrelation and coarse estimation, and the AOP was finally applied iteratively to achieve refined estimation. In [25], a planar-array-based approach was proposed for near-field coherent sound source localization. By properly designing the sensor positions, a covariance matrix that does not suffer from rank deficiency under coherent source conditions was constructed. The azimuth angle and range of the sound source were then estimated through two separate one-dimensional searches. In time-reversal (TR) applications, the MUSIC algorithm has also been widely studied [27,28]. In [27], a target localization method based on the multistatic array data matrix was proposed. By performing singular value decomposition (SVD) on the data matrix and combining time-reversal imaging with the MUSIC algorithm, the target position can be estimated. This method demonstrates robustness and high-resolution localization capability in complex propagation environments. In [28], the TR-MUSIC imaging and localization method was investigated, and a theoretical performance analysis model for target position estimation error was established. The root mean square error (RMSE) expression under high signal-to-noise ratio (SNR) conditions was derived, and the influence of noise on localization accuracy was analyzed using SVD perturbation theory.

To provide a clearer overview of existing research, the related literature is summarized and categorized according to several criteria, including array configuration, signal model (far-field or near-field), signal characteristics (coherent or incoherent), and localization algorithms, as shown in Table 1. Most existing studies focus on conventional array configurations such as the ULA, the L-shaped array, and planar arrays. In contrast, this paper proposes an n-shaped array based on the structural characteristics of the mother ship’s underwater cabin and investigates far-field and near-field acoustic source localization methods suitable for AUV localization scenarios.

In conventional MUSIC-based localization, a two-dimensional joint search over elevation and azimuth angles is required in far-field scenarios. In near-field cases, an additional range parameter must be introduced, resulting in a three-dimensional search over elevation, azimuth, and range. Due to the strong coupling among these parameters, the achievable scanning resolution is limited, which degrades estimation accuracy. To address the aforementioned issues, this paper proposes an improved MUSIC-based sound source localization algorithm based on an n-shaped array. The proposed algorithm first applies spatial smoothing to the x-axis and y-axis subarrays and jointly constructs a received data vector, after which an equivalent covariance matrix is formed to effectively restore the dimensionality of the signal subspace in the presence of coherent sources. The MUSIC algorithm is then employed to obtain coarse estimates of the source angles, which are subsequently used as initial values for the SAGE algorithm to perform refined optimization of the angular parameters in a continuous parameter space, thereby effectively improving the estimation accuracy. Simulation results demonstrate that the proposed algorithm achieves good estimation performance.

The structure of this paper is as follows: Section 2 describes the far-field and near-field signal models for the n-shaped array. Section 3 introduces the fundamental principles of the SAGE algorithm and subsequently presents the proposed far-field and near-field acoustic source localization algorithms based on the n-shaped array. Section 4 presents simulation results and analysis of the proposed algorithms. Section 5 concludes the paper.

Symbols: matrices, vectors and scalars are represented by capital bold letters, lower-case bold letters and lowercase letters, respectively. $[eqn]$ , $[eqn]$ , $[eqn]$ denote conjugate transpose, transpose and conjugate, respectively. $[eqn]$ represents mathematical expectation. $[eqn]$ and $[eqn]$ denote $[eqn]$ identity matrix and diagonal matrix. $[eqn]$ represents the exponential function, where $[eqn]$ denotes Euler’s number.

2. Signal Model

During AUV underwater operational missions, the positioning requirements generally involve target direction estimation under long-range conditions and high-precision position estimation under short-range conditions. In long-range scenarios, the primary objective is to estimate the target direction, thereby providing coarse guidance information, while in short-range scenarios, accurate joint estimation of angular and range parameters is required to meet the stringent localization requirements during the recovery phase. These two operational stages correspond to the far-field and near-field acoustic source localization problems, respectively. In long-range scenarios, with respect to the receiving array, the AUV can be reasonably approximated as a far-field acoustic source, and the localization task primarily involves DOA estimation. In contrast, during the short-range recovery stage, the AUV enters the near-field region of the array, where joint estimation of angular and range parameters is required for precise localization. Therefore, simultaneous investigation of far-field and near-field models is essential for establishing a unified theoretical framework capable of meeting the full-process localization requirements of AUV operations.

Due to the structural characteristics of the underwater hangar and the fact that hydrophones are typically installed on the inner side of the cabin wall, the array geometry must be compatible with the planar structure of the cabin wall as well as the limited available installation space. The n-shaped array configuration proposed in this paper allows array elements to be arranged along two orthogonal directions of the cabin wall, thereby preserving a relatively large array aperture while satisfying practical installation constraints. In addition, this configuration facilitates the cable routing and mechanical mounting of the hydrophones, making it suitable for space-constrained underwater cabin environments. The scenario is illustrated in Figure 1.

The structure of the n-shaped array is illustrated in Figure 2. The array consists of three uniform linear arrays arranged in the x–y plane along the x-axis, the y-axis, and a direction parallel to the x-axis, respectively, with the sensor at the coordinate origin serving as the central reference element. Each subarray is composed of M sensors, resulting in a total of 3M−2 array elements. The inter-element spacing is denoted by d, and the distance between the subarray along the x-axis and the subarray parallel to the x-axis is denoted by $[eqn]$ . It is assumed that K narrowband signals impinge on the n-shaped array and that the number of sources K is known a priori. The directional information of the k-th incident signal is characterized by the pair of angles $[eqn]$ . Here, $[eqn]$ and $[eqn]$ denote the elevation angle and the azimuth angle, respectively: the elevation angle is the angle between the incident signal and the positive z-axis, and the azimuth angle is the angle between the projection of the incident signal onto the x–y plane and the positive x-axis. The distance from the $[eqn]$ -th source to the reference array element is denoted by $[eqn]$ .
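The element count of 3M−2 can be reproduced with a short geometry sketch. It assumes that the subarray parallel to the x-axis is offset by (M−1)d, so that the y-axis subarray shares one corner element with each of the other two subarrays. That offset is an assumption made here to match the stated count, and `n_shaped_elements` is a hypothetical helper.

```python
def n_shaped_elements(M, d):
    """Return unique (x, y) sensor coordinates of an n-shaped array (assumed layout)."""
    top = M - 1   # assumed offset, in units of d, of the subarray parallel to the x-axis
    x_axis = [(m, 0) for m in range(M)]
    y_axis = [(0, m) for m in range(M)]
    parallel = [(m, top) for m in range(M)]
    # dict.fromkeys drops the shared corner elements while keeping insertion order.
    unique = list(dict.fromkeys(x_axis + y_axis + parallel))
    return [(i * d, j * d) for i, j in unique]

elements = n_shaped_elements(M=5, d=0.09)
print(len(elements))   # 13, i.e. 3*5 - 2
```

Working on integer grid indices before scaling by d avoids floating-point comparisons when the duplicated corner elements are removed.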

The received data vectors of the three subarrays located along the x-axis, the y-axis, and the direction parallel to the x-axis can be expressed as

[eqn]

[eqn]

[eqn]

where $[eqn]$ denotes the incident signal vector. The noise vector of the x-axis subarray is denoted by $[eqn]$ , that of the y-axis subarray is denoted by $[eqn]$ , and the noise vector of the subarray parallel to the x-axis is denoted by $[eqn]$ ; all noise vectors are assumed to be zero-mean Gaussian white noise with variance $[eqn]$ .

As shown in Figure 2, it is assumed that K far-field narrowband signals impinge on the n-shaped array. The array steering vector matrices corresponding to the subarrays along the x-axis, the y-axis, and the subarray parallel to the x-axis are denoted by $[eqn]$ , $[eqn]$ , and $[eqn]$ , respectively, and can be expressed as follows:

[eqn]

[eqn]

[eqn]

where

[eqn]

[eqn]

[eqn]

In the near-field scenario, the array steering vector matrices corresponding to the subarrays along the x-axis, the y-axis, and the subarray parallel to the x-axis are denoted by $[eqn]$ , $[eqn]$ , and $[eqn]$ , respectively, and can be expressed as follows:

[eqn]

[eqn]

[eqn]

The steering vectors corresponding to the x-axis subarray, the y-axis subarray, and the subarray parallel to the x-axis are given, respectively, as follows:

[eqn]

[eqn]

[eqn]

where $[eqn]$ , $[eqn]$

[eqn]

[eqn]

[eqn]
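As an illustration of how the two signal models relate, the following sketch evaluates a plane-wave steering vector and an exact spherical-wave steering vector for the same sensor positions. The sign convention, the neglect of amplitude decay, the 1500 m/s sound speed, and the 0.09 m spacing are assumptions made here, so these functions are not necessarily identical to the expressions above.

```python
import numpy as np

def unit_direction(theta, phi):
    """Unit vector for elevation theta (from +z) and azimuth phi (from +x), in radians."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def far_field_steering(positions, theta, phi, wavelength):
    """Plane-wave phases relative to the reference element at the origin."""
    return np.exp(2j * np.pi / wavelength * positions @ unit_direction(theta, phi))

def near_field_steering(positions, theta, phi, r, wavelength):
    """Exact spherical-wave phases relative to the origin (amplitude decay ignored)."""
    source = r * unit_direction(theta, phi)
    path = np.linalg.norm(source - positions, axis=1)
    return np.exp(2j * np.pi / wavelength * (r - path))

wavelength = 0.1875                                        # 8 kHz, assumed 1500 m/s
x_subarray = np.array([[m * 0.09, 0.0, 0.0] for m in range(5)])
theta, phi = np.deg2rad(40.0), np.deg2rad(60.0)

a_far = far_field_steering(x_subarray, theta, phi, wavelength)
mismatch_close = np.max(np.abs(
    near_field_steering(x_subarray, theta, phi, 1.0, wavelength) - a_far))
mismatch_distant = np.max(np.abs(
    near_field_steering(x_subarray, theta, phi, 2000.0, wavelength) - a_far))
print(mismatch_close, mismatch_distant)
```

The mismatch shrinks as the range grows, which is why a single exact model can serve both operating stages while the plane-wave form is only adequate at long range.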

3. Algorithm Principle

To address the requirements of long-range orientation and short-range precise localization in the AUV positioning process, this paper proposes a far-field acoustic source localization algorithm and a near-field acoustic source localization algorithm based on the n-shaped array, respectively. This section systematically presents the theoretical foundations and key steps of the two algorithms.

3.1. SAGE Algorithm

The space-alternating generalized expectation-maximization (SAGE) algorithm reformulates the joint multi-source estimation problem as an alternating estimation process performed for individual sources. Its core principle is to update only a subset of parameters associated with the current source during each iteration while keeping the parameters of the remaining sources fixed [29,30]. By adopting this space-alternating strategy, the algorithm reduces computational complexity and enhances convergence stability.

Assume that the signal received by the array is given by

[eqn]

where $[eqn]$ denotes the steering vector corresponding to the k-th signal, $[eqn]$ represents the complex envelope of the k-th signal, and $[eqn]$ is additive white Gaussian noise with zero mean and variance $[eqn]$ .

Given that the number of sources K is known, the goal is to estimate the desired parameters from the observation data $[eqn]$ . The complete data corresponding to the k-th signal is defined as

[eqn]

The SAGE algorithm consists of two steps: the expectation step (E-step) and the maximization step (M-step).

1. E-step

The core assumption of the SAGE algorithm is that, if the estimates of the other $[eqn]$ sources are known, the expected observation corresponding to the k-th source can be reconstructed.

Based on conditional probability, and given the observation data $[eqn]$ and the current parameter estimates $[eqn]$ , the complete data estimate corresponding to the k-th source is obtained:

[eqn]

Under the Gaussian white noise assumption, the following approximation is commonly adopted:

[eqn]

where $[eqn]$ denotes the current reconstruction of the k-th signal component, and i denotes the iteration index. Let $[eqn]$ denote the initialization stage. The initial angle estimates $[eqn]$ are obtained from the MUSIC spectrum, and the corresponding source signals are initialized via least-squares projection. At iteration $[eqn]$ , the E-step uses the estimates from iteration $[eqn]$ , and the M-step updates the parameters to obtain the i-th estimates. The iterations continue until the relative parameter variation falls below a predefined threshold or the maximum iteration number is reached.

2. M-step

After obtaining the estimate $[eqn]$ , the parameters of the k-th source, $[eqn]$ and $[eqn]$ , are updated by maximizing the conditional expected log-likelihood function. Under the additive white Gaussian noise assumption, the parameter update problem can be reformulated as a least-squares optimization:

[eqn]

Assuming that $[eqn]$ is known, taking the partial derivative of (23) with respect to $[eqn]$ and setting it to zero yields

[eqn]

By substituting (24) into (23), the problem can be reformulated as maximizing the spatial projection power:

[eqn]

For a uniform linear array, $[eqn]$ , (25) can be simplified as follows:

[eqn]
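The E-step and M-step described above can be sketched for a one-dimensional case. This is a minimal illustration under stated assumptions: a half-wavelength uniform linear array instead of the n-shaped array, a single angle per source, and a fine local grid standing in for a continuous optimizer. The function names and all numerical settings are hypothetical.

```python
import numpy as np

def steering_matrix(theta_deg, M):
    """Half-wavelength ULA steering vectors, one column per angle (assumed model)."""
    omega = np.pi * np.sin(np.deg2rad(np.atleast_1d(theta_deg)))
    return np.exp(1j * np.outer(np.arange(M), omega))

def sage_refine(X, theta_init, max_iter=30, half_width=3.0, step=0.01, tol=1e-4):
    M = X.shape[0]
    theta = np.array(theta_init, dtype=float)
    K = theta.size
    A = steering_matrix(theta, M)
    S = np.linalg.lstsq(A, X, rcond=None)[0]       # least-squares waveform initialisation
    offsets = np.arange(-half_width, half_width + step / 2, step)
    for _ in range(max_iter):
        previous = theta.copy()
        for k in range(K):
            others = [j for j in range(K) if j != k]
            # E-step: subtract the current reconstruction of all other sources.
            Xk = X - A[:, others] @ S[others, :]
            # M-step: maximise the projection power |a^H x_k|^2 / M on a local grid.
            grid = theta[k] + offsets
            G = steering_matrix(grid, M)
            power = np.sum(np.abs(G.conj().T @ Xk) ** 2, axis=1) / M
            theta[k] = grid[np.argmax(power)]
            A[:, k] = steering_matrix(theta[k], M)[:, 0]
            S[k, :] = A[:, k].conj() @ Xk / M       # closed-form waveform update
        if np.max(np.abs(theta - previous)) < tol:
            break
    return theta

rng = np.random.default_rng(0)
M, N = 10, 200
true_theta = np.array([10.0, 25.0])
s1 = np.exp(2j * np.pi * rng.random(N))
S_true = np.vstack([s1, 0.9 * np.exp(0.5j) * s1])    # fully coherent pair
noise = np.sqrt(0.05) * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
X = steering_matrix(true_theta, M) @ S_true + noise   # about 10 dB SNR per source

theta_hat = sage_refine(X, theta_init=[8.5, 26.5])    # coarse values, e.g. from a MUSIC grid
print(np.round(theta_hat, 2))
```

Because each source is updated against a residual from which the other sources have been removed, every inner search is one-dimensional, which is the computational saving the space-alternating strategy is meant to provide.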

3.2. Far-Field Sound Source Location Algorithm

Due to multipath propagation and other effects in the underwater environment, the signals received by an array often exhibit coherence, under which the covariance matrix of the received data becomes rank-deficient. In this case, the subspace orthogonality property on which the conventional MUSIC algorithm relies no longer holds, making it difficult to achieve effective parameter estimation directly. To address the rank-deficiency problem, the covariance matrix of the received data can be preprocessed using spatial smoothing techniques to restore its full-rank property, after which the MUSIC algorithm can be applied to perform sound source localization.

The spatial smoothing algorithm requires the array structure to satisfy translational invariance. For the n-shaped array, both the subarrays along the x-axis and the y-axis possess translational invariance and can therefore be processed using spatial smoothing. However, for the subarray parallel to the x-axis, the inter-element phase differences include a term related to $[eqn]$ , where $[eqn]$ is a constant, which violates the translational invariance between subarrays and thus precludes the direct application of spatial smoothing. Consequently, for DOA estimation of coherent far-field signals based on the n-shaped array, spatial smoothing can first be applied to the x-axis and y-axis subarrays to mitigate signal coherence and restore the full-rank property of the covariance matrix, after which the azimuth and elevation angles of the sound sources can be estimated using the MUSIC algorithm.

The arrays along the x- and y-axes are each partitioned into L subarrays, with each subarray consisting of $[eqn]$ array elements. Then, the received data from the p-th subarrays along the x-axis and the y-axis can be expressed, respectively, as follows:

[eqn]

[eqn]

where $[eqn]$ and $[eqn]$ represent the first q rows of the steering vector matrices $[eqn]$ and $[eqn]$ associated with the x-axis and y-axis subarrays, i.e., the first q rows of (4) and (5), respectively, and $[eqn]$ and $[eqn]$ are diagonal matrices given by

[eqn]

[eqn]
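The translational invariance that spatial smoothing relies on can be verified numerically: the steering matrix of each shifted subarray equals the first-subarray steering matrix multiplied by a power of a diagonal phase matrix. The sketch assumes a half-wavelength line array with one angle per source and the standard relation q = M − L + 1 between subarray length and number of subarrays, neither of which is confirmed by the text above.

```python
import numpy as np

M, L = 8, 3
q = M - L + 1                                    # elements per subarray (assumed relation)
angles = np.deg2rad([-20.0, 30.0])
omega = np.pi * np.sin(angles)                   # per-element phase step, half-wavelength spacing
A = np.exp(1j * np.outer(np.arange(M), omega))   # M x K steering matrix of the full line array
A1 = A[:q, :]                                    # first q rows
D = np.diag(np.exp(1j * omega))                  # diagonal phase-shift matrix

# Subarray number p + 1 (zero-based shift p) should equal A1 @ D**p.
max_error = max(
    np.max(np.abs(A[p:p + q, :] - A1 @ np.linalg.matrix_power(D, p)))
    for p in range(L)
)
print(max_error < 1e-12)
```

A subarray whose phase contains an extra constant offset, such as the one parallel to the x-axis, does not fit this pattern together with the others, which is consistent with its exclusion from the smoothing step.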

A received data vector jointly formed by the p-th forward spatially smoothed subarrays of the x-axis and y-axis is constructed, whose expression can be written as

[eqn]

The corresponding steering vector of the joint subarray can be expressed as

[eqn]

By averaging the covariance matrices of the L subarrays, the forward spatially smoothed covariance matrix $[eqn]$ is obtained

[eqn]

The forward–backward spatially smoothed covariance matrix can then be further derived

[eqn]

By performing eigenvalue decomposition on the covariance matrix, the signal subspace $[eqn]$ and the noise subspace $[eqn]$ can be obtained

[eqn]

Therefore, the spatial spectrum function can be obtained as follows:

[eqn]
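The chain from forward–backward spatial smoothing to the MUSIC spectrum can be sketched as follows. For brevity the example uses a single half-wavelength line array and a one-dimensional angle search rather than the joint x-axis and y-axis vector of this paper. The helper names, the 10 dB noise level, and the peak picker are illustrative assumptions.

```python
import numpy as np

def steering_matrix(theta_deg, n):
    omega = np.pi * np.sin(np.deg2rad(np.atleast_1d(theta_deg)))
    return np.exp(1j * np.outer(np.arange(n), omega))

def fbss_covariance(X, L):
    """Forward-backward spatially smoothed covariance from L overlapping subarrays."""
    M, N = X.shape
    q = M - L + 1
    Rf = np.zeros((q, q), dtype=complex)
    for p in range(L):
        Xp = X[p:p + q, :]
        Rf += Xp @ Xp.conj().T / N
    Rf /= L
    J = np.fliplr(np.eye(q))                     # exchange matrix
    return 0.5 * (Rf + J @ Rf.conj() @ J)

def music_spectrum(R, K, grid_deg):
    _, eigvecs = np.linalg.eigh(R)               # eigenvalues in ascending order
    Un = eigvecs[:, : R.shape[0] - K]            # noise subspace
    A = steering_matrix(grid_deg, R.shape[0])
    return 1.0 / np.sum(np.abs(Un.conj().T @ A) ** 2, axis=0)

def top_peaks(spectrum, grid, K):
    interior = np.arange(1, len(grid) - 1)
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
    idx = interior[is_peak]
    best = idx[np.argsort(spectrum[idx])[-K:]]
    return np.sort(grid[best])

rng = np.random.default_rng(0)
M, N, K, L = 10, 500, 2, 3
true_theta = np.array([-20.0, 30.0])
s1 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
S = np.vstack([s1, 0.8 * np.exp(0.7j) * s1])     # coherent pair
noise = np.sqrt(0.05) * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
X = steering_matrix(true_theta, M) @ S + noise

grid = np.arange(-90.0, 90.0 + 0.05, 0.1)
estimates = top_peaks(music_spectrum(fbss_covariance(X, L), K, grid), grid, K)
print(estimates)
```

Smoothing trades aperture for rank: the effective array shrinks from M to q elements, so the number of subarrays is usually kept as small as the number of coherent sources allows.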

Under far-field conditions, the MUSIC algorithm based on the n-shaped array requires a two-dimensional joint search over the azimuth and elevation angles. Owing to the strong coupling between these two parameters, the achievable search resolution and estimation accuracy are limited. To overcome this limitation, an improved MUSIC algorithm is proposed. First, coarse estimates of the source angles are obtained using the MUSIC algorithm, and these estimates are then used as initial values for refined estimation of the angular parameters in a continuous parameter space via the SAGE algorithm, thereby effectively improving the estimation accuracy of two-dimensional far-field DOA estimation.

Coarse estimates of the source angles are obtained by performing peak searching on the spatial spectrum

[eqn]

where $[eqn]$ denotes the coarse estimation results, which can be used as the initial values for the SAGE algorithm.

By subtracting the contributions of all sources except the k-th source from the original array observations, the equivalent observation corresponding to the k-th source can be obtained

[eqn]

The signal waveform is subsequently updated

[eqn]

The angular parameters are updated based on the maximum likelihood criterion $[eqn]$ , $[eqn]$

[eqn]

Let the maximum number of iterations be $[eqn]$ and the allowable estimation error be $[eqn]$ . The iteration process terminates when the iteration count reaches $[eqn]$ or when the condition in (41) is satisfied:

[eqn]

where $[eqn]$ .
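A termination rule of this kind can be sketched generically. The relative-change test below is one plausible reading of the stopping condition and is an assumption, as are the helper names and the toy update used to exercise the loop.

```python
import math

def has_converged(previous, current, tolerance):
    """Relative change of the parameter vector between two iterations."""
    change = math.sqrt(sum((c - p) ** 2 for p, c in zip(previous, current)))
    scale = math.sqrt(sum(c ** 2 for c in current))
    return change <= tolerance * max(scale, 1e-12)

def iterate(update, initial, max_iterations, tolerance):
    """Apply `update` until convergence or until the iteration budget is spent."""
    estimate = list(initial)
    for iteration in range(1, max_iterations + 1):
        new_estimate = update(estimate)
        if has_converged(estimate, new_estimate, tolerance):
            return new_estimate, iteration
        estimate = new_estimate
    return estimate, max_iterations

# Toy update that halves the remaining error, standing in for one refinement sweep.
target = [30.0, 45.0]
halve_error = lambda est: [e + 0.5 * (t - e) for e, t in zip(est, target)]
final, used = iterate(halve_error, [28.0, 47.0], max_iterations=50, tolerance=1e-6)
print(final, used)
```

Keeping both the tolerance test and the iteration cap guards against the case where the estimates oscillate between neighbouring values and never meet the tolerance.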

For coherent signals, the procedure of the proposed improved MUSIC-based far-field sound source localization algorithm is summarized as follows:

1. Construct the received data vector $[eqn]$ by jointly combining the forward spatially smoothed subarrays of the x-axis and y-axis subarrays, compute the forward–backward spatially smoothed covariance matrix $[eqn]$ of the joint subarray, and perform eigenvalue decomposition to obtain the noise subspace $[eqn]$ .
2. Construct the spatial spectrum function $[eqn]$ based on the orthogonality between the steering vector $[eqn]$ and the noise subspace $[eqn]$ .
3. Obtain coarse estimates of the source elevation and azimuth angles $[eqn]$ through two-dimensional spectral peak searching, and use them as the initial values for the SAGE algorithm.
4. Construct the equivalent observation signal $[eqn]$ of the k-th source according to (38).
5. Update the signal waveform parameters, and then update the angular parameters $[eqn]$ and $[eqn]$ based on the maximum likelihood criterion.
6. Terminate the iteration when the maximum number of iterations $[eqn]$ is reached or when the condition in (41) is satisfied, and output the final estimates of $[eqn]$ and $[eqn]$ ; otherwise, return to step 4.

The flowchart of the algorithm is shown in Figure 3.

3.3. Near-Field Sound Source Location Algorithm

The above far-field sound source localization algorithm is extended to the near-field case. As discussed in Section 3.2, the subarray parallel to the x-axis does not satisfy translational invariance; therefore, a received data vector is constructed by jointly combining the p-th forward spatially smoothed subarrays of the x-axis and y-axis:

[eqn]

The corresponding steering vector of the joint subarray can be expressed as

[eqn]

By averaging the covariance matrices of the L subarrays, the forward spatially smoothed covariance matrix $[eqn]$ is obtained

[eqn]

The forward–backward spatially smoothed covariance matrix can then be further derived

[eqn]

By performing eigenvalue decomposition on the covariance matrix, the signal subspace $[eqn]$ and the noise subspace $[eqn]$ can be obtained

[eqn]

Therefore, the spatial spectrum function can be obtained as follows:

[eqn]

From Equation (47), it can be seen that under near-field conditions, the MUSIC algorithm requires a three-dimensional search over the elevation, azimuth, and range parameters, resulting in a relatively high computational burden. Moreover, owing to the pronounced nonlinear coupling between the angular and range parameters in the near-field array model, different combinations of angle and range may produce similar phase responses; when a direct three-dimensional joint parameter search is performed, the spatial spectrum is prone to exhibiting spurious peaks, thereby degrading the accuracy of parameter estimation. To reduce parameter coupling, the range parameter can be fixed, transforming the original three-dimensional search problem into a two-dimensional search, which effectively reduces the computational complexity while maintaining reliable angle estimation performance.

The range parameter is fixed at $[eqn]$ , where $[eqn]$ is a temporarily assumed value, and a new spatial spectrum function is then constructed according to the orthogonality between the steering vector and the noise subspace

[eqn]

Coarse estimates of the source angles $[eqn]$ are obtained through a two-dimensional spectral peak search, and the resulting coarse estimates $[eqn]$ together with $[eqn]$ are used as the initial values for the iterative SAGE algorithm. The SAGE algorithm is then employed to obtain refined estimates of both the angles and the range $[eqn]$ . The iteration is terminated when the maximum number of iterations $[eqn]$ is reached or when the stopping condition in (49) is satisfied.

[eqn]

where $[eqn]$ .
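The idea of fixing the range for a coarse angle search and then refining angle and range together can be illustrated with a single-source sketch. With one source the SAGE refinement reduces to a single maximum-likelihood M-step, implemented here as a local grid search over the projection power. The line-array geometry, the 1500 m/s sound speed, the fixed range of 5 m, and all grid settings are assumptions for illustration only.

```python
import numpy as np

WAVELENGTH = 0.1875                 # 8 kHz tone, assumed sound speed 1500 m/s
X_POS = 0.09 * np.arange(12)        # assumed 12-element line array on the x-axis

def nf_steering(theta_deg, r):
    """Spherical-wave steering vectors; inputs broadcast, sensors on the last axis."""
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))[..., None]
    r = np.asarray(r, dtype=float)[..., None]
    path = np.sqrt(r ** 2 + X_POS ** 2 - 2.0 * r * X_POS * np.sin(theta))
    return np.exp(2j * np.pi / WAVELENGTH * (r - path))

def projection_power(R, theta_deg, r):
    """a^H R a / M for every (theta, r) combination after broadcasting."""
    a = nf_steering(theta_deg, r)
    return np.real(np.sum((a.conj() @ R) * a, axis=-1)) / X_POS.size

rng = np.random.default_rng(1)
true_theta, true_r, N = 20.0, 2.0, 100
s = np.exp(2j * np.pi * rng.random(N))
noise = np.sqrt(0.05) * (rng.standard_normal((12, N)) + 1j * rng.standard_normal((12, N)))
X = np.outer(nf_steering(true_theta, true_r), s) + noise
R = X @ X.conj().T / N

# Coarse stage: the range is fixed at an assumed value and only the angle is searched.
r_fixed = 5.0
coarse_grid = np.arange(-60.0, 60.5, 1.0)
theta_coarse = coarse_grid[np.argmax(projection_power(R, coarse_grid, r_fixed))]

# Refinement stage: joint local search over angle and range around the coarse angle.
theta_grid = theta_coarse + np.arange(-20.0, 20.05, 0.1)
r_grid = np.arange(0.5, 10.0, 0.05)
P = projection_power(R, theta_grid[:, None], r_grid[None, :])
i, j = np.unravel_index(np.argmax(P), P.shape)
theta_hat, r_hat = theta_grid[i], r_grid[j]
print(theta_coarse, theta_hat, r_hat)
```

In this toy setting the coarse angle obtained with a mismatched range is biased, so the refinement window has to be wide enough to contain the true angle; how wide depends on the array aperture and on how far the assumed range is from the true one.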

For coherent signals, the procedure of the proposed improved MUSIC-based near-field sound source localization algorithm is summarized as follows:

1. Construct the received data vector $[eqn]$ by jointly combining the forward spatially smoothed subarrays of the x-axis and y-axis subarrays, compute the forward–backward spatially smoothed covariance matrix $[eqn]$ of the joint subarray, and perform eigenvalue decomposition to obtain the noise subspace $[eqn]$ .
2. Construct the spatial spectrum function based on the orthogonality between the steering vector $[eqn]$ and the noise subspace $[eqn]$ , and fix the range parameter $[eqn]$ to obtain a modified spatial spectrum function $[eqn]$ .
3. Obtain coarse estimates of the source elevation and azimuth angles $[eqn]$ through two-dimensional spectral peak searching, and use these estimates together with the fixed range parameter $[eqn]$ as the initial values for the SAGE algorithm.
4. Construct the equivalent observation signal $[eqn]$ of the k-th source.
5. Update the signal waveform $[eqn]$ , and then update the angular and range parameters $[eqn]$ and $[eqn]$ based on the maximum likelihood criterion.
6. Terminate the iteration when the maximum number of iterations $[eqn]$ is reached or when the condition in (49) is satisfied, and obtain the final estimates $[eqn]$ and $[eqn]$ ; otherwise, return to step 4.

The flowchart of the algorithm is shown in Figure 4.

The computational complexity of the proposed far-field and near-field acoustic source localization algorithms mainly comprises two stages: the FBSS-MUSIC initialization stage and the SAGE-based refinement stage. Compared with the far-field localization algorithm, the steering vector of the near-field n-shaped array contains a quadratic phase term; however, the complexity of generating the steering vector in a single evaluation remains $[eqn]$ , and therefore does not change the dominant computational complexity order. In addition, during the initialization stage of the near-field localization algorithm, no discrete search is performed with respect to the range parameter $[eqn]$ ; instead, the spatial spectrum is evaluated only over the two-dimensional angular domain. Consequently, the computational complexity of the spatial spectrum search in the initialization stage is the same for both the far-field and near-field algorithms. In summary, the computational complexities of the far-field and near-field source localization algorithms are of the same order.

The complexity analysis includes the construction of the spatially smoothed covariance matrix, eigenvalue decomposition, two-dimensional spatial spectrum search, and the iterative parameter optimization process involved in the SAGE algorithm. In the initialization stage, the complexity of constructing the spatially smoothed covariance matrix is $[eqn]$, the complexity of eigenvalue decomposition is $[eqn]$, and the complexity of the two-dimensional spatial spectrum search is $[eqn]$. In the refinement stage, considering K sources and a maximum of $[eqn]$ iterations, the resulting computational complexity is $[eqn]$. Therefore, the overall computational complexity of the proposed algorithm can be summarized as follows:

[eqn]

where L denotes the number of spatial smoothing subarrays, $[eqn]$ represents the length of each spatial smoothing subarray, M is the number of array elements in the subarray of the n-shaped array, and N denotes the number of snapshots. $[eqn]$ represents the number of elevation angle search points, and $[eqn]$ is the elevation angle search step size. $[eqn]$ denotes the number of azimuth angle search points, and $[eqn]$ is the azimuth angle search step size.

4. Simulation and Analysis

Assume that two narrowband coherent acoustic sources, denoted by $[eqn]$ and $[eqn]$ with $[eqn]$, are located in the far-field region. Both sources have a frequency of 8 kHz and an amplitude of 1 V, and the sampling frequency is set to 80 kHz. The time-domain waveforms of $[eqn]$ and $[eqn]$ are illustrated in Figure 5. In the simulation, additive white Gaussian noise is added at different SNR levels, where the SNR is defined as the ratio of the signal power to the noise power, and its expression is given as follows:

[eqn]

where the signal power and noise power are denoted by $[eqn]$ and $[eqn]$ , respectively.
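As a minimal sketch of this noise model, the snippet below adds white Gaussian noise to an 8 kHz, unit-amplitude tone sampled at 80 kHz so that a target SNR is met. It assumes the usual decibel form, SNR = 10 log10(signal power / noise power), which matches the dB values quoted in the experiments but is not copied from the paper's equation. The function name `add_awgn` and the record length are hypothetical.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Return (noisy signal, noise) with noise power set from the target SNR in dB."""
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)     # assumed dB definition of SNR
    noise = np.sqrt(p_noise) * rng.standard_normal(signal.shape)
    return signal + noise, noise

fs, f0 = 80_000, 8_000                 # sampling and source frequencies from the text
t = np.arange(8000) / fs               # illustrative record length (assumption)
s1 = np.sin(2 * np.pi * f0 * t)        # amplitude of 1 V
rng = np.random.default_rng(0)
noisy, noise = add_awgn(s1, 10.0, rng)

# Empirical SNR of this realization, for comparison with the 10 dB target.
measured_snr_db = 10 * np.log10(np.mean(s1 ** 2) / np.mean(noise ** 2))
print(measured_snr_db)
```

The measured value fluctuates around the target from one realization to the next, which is one reason the experiments average over many Monte Carlo trials.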

In this paper, the RMSE is adopted as a performance metric to evaluate the estimation accuracy of the proposed algorithm. The RMSE of the localization parameters is defined as follows:

[eqn]

where T denotes the number of Monte Carlo trials, $[eqn]$ represents the estimated value of the localization parameter of the k-th source obtained in the j-th Monte Carlo trial, and $[eqn]$ denotes the true value of the localization parameter of the k-th source.
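A small sketch of this metric is given below. It computes one RMSE value per source over T Monte Carlo trials. Whether the paper additionally averages over the K sources is not visible here, so the per-source form is an assumption, and the function name `rmse_per_source` and the sample values are hypothetical.

```python
import numpy as np

def rmse_per_source(estimates, truth):
    """RMSE over trials; estimates has shape (T, K), truth has shape (K,)."""
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.sqrt(np.mean((estimates - truth) ** 2, axis=0))

# Two trials, two sources: every estimate is off by exactly one degree.
estimates = [[29.0, 61.0],
             [31.0, 59.0]]
truth = [30.0, 60.0]
rmse = rmse_per_source(estimates, truth)
print(rmse)
```

A single figure per parameter, if needed, could be obtained by averaging the squared errors over the sources before taking the square root.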

4.1. Effectiveness Analysis

Simulation Experiment 1: For coherent signals, the effectiveness of the proposed improved MUSIC-based far-field sound source localization algorithm using an n-shaped array is validated.

Two narrowband coherent acoustic sources, denoted by $[eqn]$ and $[eqn]$, impinge on the n-shaped array. The incident angles are represented by $[eqn]$, where $[eqn]$ denotes the elevation angle and $[eqn]$ denotes the azimuth angle. The incident directions of the two sources are $[eqn]$ and $[eqn]$, respectively. The numbers of array elements in the x-axis subarray, the y-axis subarray, and the subarray parallel to the x-axis are all set to $[eqn]$. The sound speed is $[eqn]$ m/s, and the inter-element spacing is $[eqn]$. $[eqn]$ dB, and the number of snapshots is $[eqn]$. The estimation results are shown in Figure 6. It can be observed that the estimated angles agree well with the true values, demonstrating that the proposed improved MUSIC-based far-field sound source localization algorithm for the n-shaped array is effective and accurate.

When $[eqn]$ narrowband coherent sources impinge on the n-shaped array from the far-field space, the incident directions are $[eqn]$, $[eqn]$, $[eqn]$, and $[eqn]$, respectively. The numbers of array elements in the three subarrays of the n-shaped array, namely the x-axis subarray, the y-axis subarray, and the subarray parallel to the x-axis, are all set to 8, while the other parameters remain the same as above. The estimation results are shown in Figure 7. As can be observed from Figure 7, the estimated angles are generally consistent with the true values, indicating that the proposed algorithm remains effective when the number of sources is greater than two.
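The kind of check performed in Simulation Experiment 1 can be sketched in one dimension as follows. This is a simplified illustration, not the paper's method: it uses a single uniform linear array instead of the n-shaped array, a one-dimensional angle grid instead of the two-dimensional elevation and azimuth search, and illustrative source angles. The forward-backward spatial smoothing and the MUSIC spectrum follow their textbook forms. All function names and parameter values are hypothetical.

```python
import numpy as np

def steering(theta_deg, M, d_over_lambda=0.5):
    """Far-field ULA steering vectors, shape (M, G), for angles in degrees."""
    theta = np.deg2rad(np.atleast_1d(np.asarray(theta_deg, dtype=float)))
    m = np.arange(M)[:, None]
    return np.exp(-2j * np.pi * d_over_lambda * m * np.sin(theta)[None, :])

def fbss_covariance(Y, sub_len):
    """Forward-backward spatially smoothed covariance of size (sub_len, sub_len)."""
    M, N = Y.shape
    R = Y @ Y.conj().T / N
    L = M - sub_len + 1                                   # number of subarrays
    Rf = sum(R[i:i + sub_len, i:i + sub_len] for i in range(L)) / L
    J = np.fliplr(np.eye(sub_len))                        # exchange matrix
    return 0.5 * (Rf + J @ Rf.conj() @ J)

def music_spectrum(R, K, grid_deg):
    """MUSIC pseudo-spectrum from the noise subspace of R."""
    _, V = np.linalg.eigh(R)                              # eigenvalues ascending
    En = V[:, :R.shape[0] - K]                            # noise eigenvectors
    A = steering(grid_deg, R.shape[0])
    return 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)

def top_peaks(spectrum, grid_deg, K):
    """Angles of the K largest local maxima, sorted in ascending order."""
    idx = [i for i in range(1, len(spectrum) - 1)
           if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]]
    idx = sorted(idx, key=lambda i: spectrum[i], reverse=True)[:K]
    return np.sort(grid_deg[idx])

# Illustrative scenario: two coherent sources, 8 elements, 500 snapshots.
rng = np.random.default_rng(1)
M, N, K = 8, 500, 2
true_angles = np.array([-20.0, 30.0])
s = np.exp(2j * np.pi * 0.1 * np.arange(N))
S = np.vstack([s, 0.8 * np.exp(0.7j) * s])                # fully coherent pair
noise = np.sqrt(0.05) * (rng.standard_normal((M, N))
                         + 1j * rng.standard_normal((M, N)))
Y = steering(true_angles, M) @ S + noise

grid = np.arange(-90.0, 90.1, 0.1)
R_smoothed = fbss_covariance(Y, sub_len=6)
estimated = top_peaks(music_spectrum(R_smoothed, K, grid), grid, K)
print(estimated)
```

The subarray length trades aperture against decorrelation: shorter subarrays give more averaging terms but a smaller effective array.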

Simulation Experiment 2: For coherent signals, the effectiveness of the proposed improved MUSIC-based near-field sound source localization algorithm is validated using an n-shaped array.

Two narrowband coherent acoustic sources, denoted by $[eqn]$ and $[eqn]$, impinge on the n-shaped array. The incident angles and ranges of the sources are characterized by the parameter set $[eqn]$, where $[eqn]$ denotes the elevation angle, $[eqn]$ denotes the azimuth angle, and $[eqn]$ denotes the range. The parameters of the two sources are $[eqn]$ and $[eqn]$, respectively. The n-shaped array consists of three subarrays aligned along the x-axis, the y-axis, and parallel to the x-axis, each containing $[eqn]$ array elements. The sound speed is $[eqn]$ m/s, and the inter-element spacing is $[eqn]$. $[eqn]$ dB, and the number of snapshots is 500. The estimation results are illustrated in Figure 8. It can be observed that the estimated angles are in good agreement with the true values, demonstrating that the proposed improved MUSIC-based near-field sound source localization algorithm for the n-shaped array is effective and achieves satisfactory estimation accuracy.

When $[eqn]$ narrowband coherent sources impinge on the n-shaped array from the near-field space, the incident directions are $[eqn]$ , $[eqn]$ , $[eqn]$ , and $[eqn]$ , respectively. The numbers of array elements in the three subarrays of the n-shaped array, namely the x-axis subarray, the y-axis subarray, and the subarray parallel to the x-axis, are all set to 10, while the other parameters remain the same as above. The estimation results are shown in Figure 9. As can be observed from Figure 9, the estimated angles are generally consistent with the true values, indicating that the proposed algorithm remains effective when the number of sources is greater than two.

4.2. Performance Analysis

4.2.1. RMSE Versus SNR

In this paper, the improved CBF algorithm and the improved MVDR algorithm are selected for comparison with the proposed method. The proposed approach is developed on the basis of the MUSIC algorithm in combination with the SAGE algorithm and represents a high-resolution subspace-based estimation method. The improved CBF algorithm represents conventional spatial spectrum beamforming methods, whereas the improved MVDR algorithm represents adaptive beamforming methods with stronger interference suppression capability. By comparing these three categories of methods under identical array configurations and signal conditions, a relatively representative performance evaluation framework can be established, which facilitates a clearer assessment of the performance improvement achieved by the proposed algorithm. The following presents the simulation experiments.

Simulation Experiment 3: RMSE versus SNR of angle estimation for the improved MUSIC-based far-field sound source localization algorithm, the improved CBF, and the improved MVDR algorithm.

Two narrowband coherent sound sources, denoted by $[eqn]$ and $[eqn]$ , impinge on the n-shaped array, with their incident directions given by $[eqn]$ and $[eqn]$ , respectively. The number of snapshots is set to 500, and the SNR ranges from $[eqn]$ dB to 20 dB with a step size of 5 dB. For each SNR level, 200 Monte Carlo trials are conducted. Other parameters are the same as those used in Simulation Experiment 1. Figure 10 presents the curves of the RMSE of the elevation and azimuth angle estimates versus the SNR for the three algorithms.

As shown in Figure 10, as the SNR increases, the RMSE of the angle estimates for all three algorithms gradually decreases, indicating that reduced noise influence leads to improved estimation accuracy. Under coherent signal conditions, the proposed improved MUSIC-based far-field sound source localization algorithm consistently achieves the smallest RMSE across all SNR levels, demonstrating superior angle estimation performance compared with the improved MVDR [4,14] and CBF [31] algorithms. The proposed algorithm also maintains good performance under low-SNR conditions; when the SNR is −5 dB, the RMSEs of the elevation and azimuth angles are approximately $[eqn]$ and $[eqn]$ , respectively, demonstrating the robustness of the algorithm.

Simulation Experiment 4: RMSE versus SNR for the improved MUSIC-based near-field sound source localization algorithm, the improved CBF algorithm, and the improved MVDR algorithm.

Two narrowband coherent sound sources, denoted by $[eqn]$ and $[eqn]$, impinge on the n-shaped array, with their incident angles and ranges given by $[eqn]$ and $[eqn]$, respectively. The number of snapshots is set to 500, and the SNR ranges from $[eqn]$ dB to 20 dB with a step size of 5 dB. For each SNR level, 200 Monte Carlo trials are conducted. Other parameters are the same as those used in Simulation Experiment 2. Figure 11 presents the curves of the RMSE of the elevation angle, azimuth angle, and range estimates versus the SNR for the three algorithms.

As shown in Figure 11, the proposed algorithm achieves higher estimation accuracy for all three parameters than the improved MVDR and CBF algorithms, and the RMSE of both the angle and range estimates decreases gradually as the SNR increases. Under low-SNR conditions, noise components affect the statistical characteristics of the covariance matrix, thereby degrading parameter estimation accuracy and resulting in relatively large errors. As the SNR increases, the influence of noise diminishes, thereby improving the parameter estimation accuracy. The proposed algorithm applies spatial smoothing to the subarray data and constructs an equivalent covariance matrix to restore the rank of the signal subspace under coherent-source conditions, thereby avoiding estimation errors caused by rank deficiency. By combining coarse estimation using MUSIC with refined optimization using the SAGE algorithm, the method reduces discretization errors and the impact of parameter coupling, enabling the rapid and stable estimation of angle and range parameters under medium- to high-SNR conditions. As a result, the RMSE performance and stability of the proposed algorithm are consistently superior to those of the comparison algorithms, demonstrating its robustness and high-precision estimation capability.
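The rank-restoration argument can be checked numerically in a noise-free setting. The sketch below is an illustration under assumptions not taken from the paper: a single uniform linear array with half-wavelength spacing, forward-only smoothing, and illustrative angles and amplitudes. It compares the rank of the covariance matrix of two fully coherent sources before and after spatial smoothing. The function names are hypothetical.

```python
import numpy as np

def steering_matrix(theta_deg, M, d_over_lambda=0.5):
    """Far-field ULA steering matrix, one column per source angle (degrees)."""
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))
    m = np.arange(M)[:, None]
    return np.exp(-2j * np.pi * d_over_lambda * m * np.sin(theta)[None, :])

def forward_smooth(R, sub_len):
    """Average the covariance blocks of all overlapping subarrays."""
    L = R.shape[0] - sub_len + 1
    return sum(R[i:i + sub_len, i:i + sub_len] for i in range(L)) / L

M = 8
A = steering_matrix([-20.0, 30.0], M)
g = np.array([1.0, 0.8 * np.exp(0.7j)])   # complex gains of one common waveform
Rs = np.outer(g, g.conj())                # source covariance of a coherent pair
R = A @ Rs @ A.conj().T                   # noise-free array covariance

rank_raw = np.linalg.matrix_rank(R, tol=1e-9)
rank_smoothed = np.linalg.matrix_rank(forward_smooth(R, sub_len=6), tol=1e-9)
print(rank_raw, rank_smoothed)
```

Without smoothing, the signal subspace collapses to one dimension, so a MUSIC search for two sources has no valid noise subspace to work with. After smoothing, the two-dimensional signal subspace is recovered.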

4.2.2. RMSE Versus Number of Snapshots

Simulation Experiment 5: RMSE of angle estimation versus the number of snapshots for the improved MUSIC-based far-field sound source localization algorithm, the improved CBF algorithm, and the improved MVDR algorithm.

Two narrowband coherent sound sources, denoted by $[eqn]$ and $[eqn]$ , impinge on the n-shaped array, with their incident directions given by $[eqn]$ and $[eqn]$ , respectively. $[eqn]$ dB. The number of snapshots varies from 100 to 1000, and 200 Monte Carlo trials are conducted for each snapshot setting. The other parameters are the same as those used in Simulation Experiment 1. Figure 12 presents the curves of the RMSE of the elevation and azimuth angle estimates versus the number of snapshots for the three algorithms.

As shown in Figure 12, as the number of snapshots increases, the RMSE of the angle estimates obtained by the proposed algorithm gradually decreases, indicating that the method effectively exploits the additional statistical information provided by more snapshots, resulting in improved subspace estimation stability and enhanced parameter estimation accuracy. In contrast, the RMSE of the improved MVDR and CBF algorithms decreases only marginally as the number of snapshots increases and tends to saturate, primarily because these algorithms are limited by their resolution and interference suppression capabilities. With a small number of snapshots, the RMSE of the proposed algorithm is relatively large: in the presence of coherent signals, a limited sample size reduces the accuracy of covariance matrix estimation, which degrades the separation between the signal and noise subspaces and increases the parameter estimation errors. However, as the number of snapshots increases, this effect diminishes rapidly, and the RMSE curve stabilizes at a lower error level, further validating the advantages of the proposed method in terms of statistical stability and estimation accuracy.

Simulation Experiment 6: RMSE of angle and range estimation versus the number of snapshots for the improved MUSIC-based near-field sound source localization algorithm, the improved CBF algorithm, and the improved MVDR algorithm.

Two narrowband coherent sound sources, denoted by $[eqn]$ and $[eqn]$, impinge on the n-shaped array, with their incident angles and ranges given by $[eqn]$ and $[eqn]$, respectively. $[eqn]$ dB. The number of snapshots varies from 400 to 1000, and 200 Monte Carlo trials are conducted for each snapshot setting. The other parameters are the same as those used in Simulation Experiment 2. Figure 13 presents the curves of the RMSE of the elevation angle, azimuth angle, and range estimates versus the number of snapshots for the three algorithms.

Figure 13 shows the RMSE of the elevation angle, azimuth angle, and range estimates as functions of the number of snapshots under coherent signal conditions. The results show that the proposed algorithm achieves superior estimation performance for all three parameters compared with the improved CBF and MVDR algorithms. When the number of snapshots increases from 400 to 500, the RMSE of the proposed algorithm decreases significantly, primarily because, in the presence of coherent signals, a limited sample size reduces the accuracy of covariance matrix estimation, resulting in degraded separation between the signal and noise subspaces. As the number of snapshots increases, the sample statistics gradually converge, the subspace estimation error diminishes rapidly, and the spatial spectrum becomes more stable, resulting in a significant reduction in RMSE. This behavior demonstrates the good convergence properties and robustness of the proposed algorithm.

4.2.3. Comparison of Different Array Configurations

Simulation Experiment 7: RMSE versus SNR for coherent signals using an improved MUSIC-based far-field sound source localization algorithm with n-shaped and L-shaped arrays.

The structure of the n-shaped array is shown in Figure 2, where each of the three subarrays consists of eight elements. The L-shaped array is composed of two subarrays aligned along the x-axis and the y-axis, each containing eight elements. The inter-element spacing of both arrays is set to $[eqn]$ . The number of snapshots is 200, and the SNR varies from $[eqn]$ dB to 20 dB with a step size of 5 dB. Two narrowband coherent sound sources impinge on the n-shaped array and L-shaped array, respectively, with identical incident angles and ranges. For each SNR level, 200 Monte Carlo trials are performed. Figure 14 shows the RMSE curves of the elevation and azimuth angle estimates versus the SNR for far-field sound source localization using the improved MUSIC algorithm based on the two array configurations.

As shown in Figure 14, the n-shaped array shows relatively large estimation errors at $[eqn]$ dB; however, when the SNR increases from −10 dB to −5 dB, the RMSE decreases significantly. As the SNR increases further, the RMSE gradually stabilizes. This behavior can be attributed to the reduced influence of noise on the array covariance matrix at higher SNR levels, resulting in a corresponding decrease in estimation error.

Across all SNR conditions, the RMSEs of both the elevation and azimuth angle estimates obtained using the n-shaped array are smaller than those obtained with the L-shaped array. This can be attributed to the more balanced element distribution of the n-shaped array in two-dimensional space, which provides a more effective array aperture and enables the proposed algorithm to achieve lower estimation errors when implemented with this configuration. In contrast, the two-dimensional spatial sampling structure of the L-shaped array is more limited and exhibits lower sensitivity to angular variations. In the presence of noise perturbations, the main peak of the spatial spectrum becomes more susceptible to shifts, thereby reducing peak localization accuracy and increasing parameter estimation errors.

Simulation Experiment 8: RMSE versus SNR for coherent signals using an improved MUSIC-based near-field sound source localization algorithm with n-shaped and L-shaped arrays.

Each of the three subarrays of the n-shaped array consists of $[eqn]$ elements, while each subarray of the L-shaped array contains $[eqn]$ elements. The two array configurations have the same total number of array elements, which is 19, and the same inter-element spacing, which is set to $[eqn]$. The number of snapshots is set to 200, and the SNR varies from $[eqn]$ dB to 20 dB with a step size of 5 dB. Two narrowband coherent sound sources impinge on the n-shaped array and the L-shaped array, respectively, with identical incident angles and identical ranges. For each SNR level, 200 Monte Carlo trials are performed. Figure 15 presents the RMSE curves of the angle and range estimates versus SNR for the two array configurations, obtained using the improved MUSIC-based near-field sound source localization algorithm.

As shown in Figure 15, the n-shaped array exhibits relatively large estimation errors for the target parameters under low SNR conditions; however, as the SNR increases, the influence of noise on the sample covariance matrix diminishes, leading to more accurate subspace estimation and a corresponding significant reduction in parameter estimation errors. With an equal total number of array elements, the application of the proposed near-field sound source localization algorithm enables the n-shaped array to achieve lower RMSEs in the estimation of elevation, azimuth, and range parameters, while exhibiting a decreasing and more stable error trend as the SNR increases.

The simulation results demonstrate the advantage of the n-shaped array in joint angle and range estimation. By employing a multi-directional element layout to form a larger array aperture, the n-shaped array improves steering vector diversity and subspace stability, thereby enhancing parameter resolution and noise robustness, which enables lower estimation errors under different SNR conditions. In contrast, the geometric degrees of freedom of the L-shaped array are more limited, restricting its spatial sampling capability; consequently, it exhibits higher RMSEs across the entire SNR range.
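To make the equal-element-count comparison concrete, the sketch below builds element coordinates for the two geometries and counts the distinct elements. The layout is an assumption: the n-shaped array is taken as an x-axis subarray, a y-axis subarray, and a third subarray parallel to the x-axis that starts at the far end of the y-axis subarray, with corner elements shared between adjoining subarrays. The actual layout is defined by Figure 2 of the paper, which is not reproduced here. The per-subarray counts, the spacing value, and the function names are illustrative and hypothetical.

```python
import numpy as np

def n_shaped_positions(m, d):
    """Assumed n-shaped layout with m elements per subarray and spacing d."""
    idx = np.arange(m) * d
    x_leg = np.column_stack([idx, np.zeros(m)])            # along the x-axis
    y_bar = np.column_stack([np.zeros(m), idx])            # along the y-axis
    far_leg = np.column_stack([idx, np.full(m, idx[-1])])  # parallel to the x-axis
    # Shared corner elements appear twice above; keep each position once.
    return np.unique(np.vstack([x_leg, y_bar, far_leg]), axis=0)

def l_shaped_positions(m, d):
    """L-shaped layout with m elements per subarray and a shared origin element."""
    idx = np.arange(m) * d
    x_leg = np.column_stack([idx, np.zeros(m)])
    y_bar = np.column_stack([np.zeros(m), idx])
    return np.unique(np.vstack([x_leg, y_bar]), axis=0)

d = 0.09375                               # illustrative spacing in metres
n_pos = n_shaped_positions(7, d)          # 3 * 7 - 2 distinct elements
l_pos = l_shaped_positions(10, d)         # 2 * 10 - 1 distinct elements
print(len(n_pos), len(l_pos))
```

Under this shared-corner assumption, both example arrays have the same number of distinct elements, while the n-shaped array places a full subarray away from the coordinate axes.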

5. Conclusions

To meet the requirements of long-range direction estimation and short-range precise localization in the AUV localization process, this paper presents far-field and near-field acoustic source localization algorithms based on the n-shaped array. The proposed method first constructs a spatially smoothed data vector and subsequently forms the corresponding equivalent covariance matrix. The MUSIC algorithm is then employed to obtain coarse angle estimates, which serve as initializations for the SAGE algorithm. The SAGE algorithm is subsequently employed to refine the angular and range parameters, thereby enhancing estimation accuracy. Simulation results demonstrate that, under both far-field and near-field conditions and in the presence of coherent sources, the proposed algorithm outperforms the improved CBF and MVDR algorithms, yielding lower estimation errors and superior estimation accuracy. Furthermore, when the proposed algorithm is applied to compare the n-shaped array with the commonly used two-dimensional L-shaped array, the n-shaped array achieves superior estimation performance, thereby demonstrating the effectiveness of the n-shaped array design.

In addition, although the proposed n-shaped array–based far-field and near-field acoustic source localization method demonstrates promising performance in simulation studies, practical underwater acoustic environments are typically characterized by significant complexity and uncertainty. Factors such as environmental mismatch and colored noise may influence both the array signal model and the performance of the algorithm. In practical engineering applications, mechanical installation errors and platform motion may cause the array geometry to deviate from the ideal array model. Small position deviations of the hydrophones may lead to steering vector mismatch, thereby degrading parameter estimation accuracy. Furthermore, in practical applications, the number of sources is often difficult to determine accurately in advance, which places higher demands on the robustness of the algorithm. In this work, the performance characteristics of the proposed algorithm are mainly analyzed under controlled simulation conditions, while the effects of environmental uncertainties, array geometry mismatches, and platform motion on localization performance, as well as the corresponding compensation strategies, will be further investigated and validated in future studies.

Future research will focus on the following aspects. First, underwater experiments will be conducted to further investigate the effects of environmental uncertainties in complex underwater conditions, array geometry mismatches, and platform motion on localization performance, to evaluate the engineering feasibility and operational stability of the proposed algorithm. Second, when the number of sources is unknown, information-theoretic criteria such as the Akaike Information Criterion (AIC) or the Minimum Description Length (MDL) criterion can be incorporated into the proposed algorithm to improve its applicability in complex practical environments. In addition, for parameter estimation under limited snapshot or low SNR conditions, the robustness and computational efficiency of the algorithm can be further enhanced. Finally, the potential application of the n-shaped array configuration in other array signal processing problems may also be investigated. These research directions will contribute to further improving the algorithmic performance and its practical engineering applicability.

Bibliography


1. Krim, H.; Viberg, M. Two decades of array signal processing research: The parametric approach. IEEE Signal Process. Mag. 1996, 13, 67–94. doi:10.1109/79.526899
2. Liang, J.; Liu, D. Passive Localization of Near-Field Sources Using Cumulant. IEEE Sens. J. 2009, 9, 953–960. doi:10.1109/JSEN.2009.2025580
3. He, Q.; Cheng, Z.; Wang, Z.; He, Z. Mixed far-field and near-field source separation and localization based on FOC matrix differencing. Digit. Signal Process. 2022, 131, 103753. doi:10.1016/j.dsp.2022.103753
4. Liu, G.; Sun, H.; Jin, D. Experimental Research of Vector Hydrophone MVDR Algorithm. J. Inf. Comput. Sci. 2015, 12, 1329–1336. doi:10.12733/jics20105550
5. Schmidt, R. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280. doi:10.1109/TAP.1986.1143830
6. Wen, F.; Wan, Q.; Fan, R.; Wei, H. Improved MUSIC Algorithm for Multiple Noncoherent Subarrays. IEEE Signal Process. Lett. 2014, 21, 527–530. doi:10.1109/LSP.2014.2308271
7. Ning, G.; Jiang, S.; Zhao, X.; Yang, C. A 2D-DOA estimation algorithm for double L-shaped array in unknown sound velocity environment. IEICE Trans. Commun. 2020, 103, 240–246. doi:10.1587/transcom.2019EBP3007
8. Merkofer, J.P.; Revach, G.; Shlezinger, N.; Routtenberg, T.; van Sloun, R.J.G. DA-MUSIC: Data-Driven DOA Estimation via Deep Augmented MUSIC Algorithm. IEEE Trans. Veh. Technol. 2024, 73, 2771–2785. doi:10.1109/TVT.2023.3320360