Method for Improving Positioning Accuracy of Rotating Scanning Satellite Images via Multi-Source Satellite Data Fusion

Liwei Wang; Peng Wang; Yamin Zhang; Yi Wang; Bo Chen

PMC · DOI:10.3390/s26030850 · January 28, 2026

Open Access

TL;DR

This paper introduces a method to improve the positioning accuracy of rotating scanning satellite images by fusing data from multiple satellite sources.

Contribution

The novel approach uses grid-based feature extraction and joint adjustment to achieve meter-level accuracy without dense ground control points.

Findings

01

The framework achieved a planar accuracy of 4.01 m and edge matching RMSE of 2.52 m using ZY-3 and GF-2 imagery.

02

Meter-level positioning accuracy (4.68 m in mountainous areas and 5.22 m in plains) was achieved for simulated ultra-wide rotating scanning imagery.

03

Multi-source fusion effectively corrects geometric distortions and improves positioning accuracy in rotating scanning systems.

Abstract

What are the main findings? A multi-source collaborative positioning framework integrates ZY-3 and GF-2 imagery to achieve a planar accuracy of 4.01 m and an edge matching RMSE of 2.52 m. The proposed grid-based feature extraction and joint adjustment method attains meter-level positioning accuracy (4.68 m and 5.22 m) for simulated ultra-wide rotating scanning imagery. What are the implications of the main findings? The approach effectively mitigates complex geometric distortions in rotating scanning systems without dense…

Funding

—National Key Research and Development Program of China
—National Key Research and Development Program of China
—Shenzhen Higher Education Institutions Stabilization Support Program Project
—National Natural Science Foundation of China
—National Natural Science Foundation of Shenzhen

Keywords

geometric positioning; joint adjustment; rotating scanning; RPC model

Taxonomy

Topics: Satellite Image Processing and Photogrammetry · Remote Sensing in Agriculture · Synthetic Aperture Radar (SAR) Applications and Techniques

Full text

1. Introduction

With the expansion of remote sensing applications, requirements for information acquisition have evolved remarkably. Critical applications, such as global change monitoring [1] and natural resource surveys, now demand a dual capability: rapid, large-area coverage (i.e., wide swath) combined with high spatial resolution [2]. In recent years, high-resolution observation technology has advanced rapidly, providing rich information resources for the geosciences [3]. However, constraints inherent to existing imaging mechanisms—such as camera fields of view and satellite orbital regression cycles—create a mutual trade-off between spatial resolution and swath width in optical remote sensing satellites [4]. Consequently, simultaneously achieving meter-level resolution and thousand-kilometer-level coverage with a single satellite remains a formidable challenge [5].

To surmount these limitations, the rotating scanning imaging system has been proposed. Unlike the mainstream static pushbroom imaging employed by most optical satellites, the CMOS sensor in a rotating scanning system is aligned along the flight direction but rotates continuously 360° around an axis perpendicular to the trajectory. This mechanism allows the satellite to sweep the ground continuously, acquiring image strips that are subsequently stitched into a seamless panoramic view of the target area.

However, satellite remote sensing imagery obtained via this imaging method faces several key challenges. First, the imaging geometry results from the complex coupling between the satellite’s orbital motion and the high-speed rotation of the sensor, introducing high-order and time-varying geometric distortions. Such dynamic deformation renders traditional static geometric models and standard Rational Polynomial Coefficient (RPC) correction methods inadequate for achieving high-precision geolocation, often leading to model instability. Second, rotating-scanning satellites typically cover ultra-wide swaths, often on the order of thousands of kilometers [2], making it practically impossible to obtain a sufficient number of well-distributed, field-surveyed ground control points (GCPs) across the entire coverage area. To bridge this gap, this paper proposes a multi-source collaborative positioning framework that leverages existing high-precision satellite data (e.g., ZY-3) as a control skeleton to automate accuracy transfer, bypassing the GCP acquisition bottleneck.

In light of these challenges, this study proposes a positioning framework based on multi-source satellite data fusion. The main contributions of this paper are as follows:

Establishment of a Positioning Framework via Multi-Source Satellite Data Fusion: We have constructed a comprehensive framework for the precise positioning of ultra-wide swath imagery. By synergistically utilizing multi-source high-resolution optical satellite imagery and DEM data, we established a joint adjustment model that proficiently facilitates error compensation and accuracy transfer across heterogeneous data sources.
Development of High-Precision Tie Point Extraction Technology: Addressing the severe deformation characteristics of ultra-wide images, we developed a robust automatic tie point extraction technique. This method integrates an image blocking and gridding strategy with geometric constraints and Least Squares Matching (LSM) optimization, ensuring a balanced distribution and high-precision extraction of feature points.
Comprehensive Experimental Validation: We designed a rigorous validation scheme using simulated wide-swath data derived from real satellite imagery. Through quantitative evaluation of both absolute positioning accuracy and relative edge matching accuracy, we verified the method’s effectiveness and engineering practicality for rotating scanning satellite applications.

This study presents significant advancements in the geometric rectification of spaceborne sensors, primarily through two key contributions: (1) To address the critical challenge of acquiring Ground Control Points (GCPs) in ultra-wide swath imaging, an automated control information transfer mechanism utilizing multi-source reference imagery has been developed, thereby overcoming the technical bottleneck associated with control data scarcity for swaths spanning thousands of kilometers. (2) Targeting the severe geometric instability inherent in rotating scanning sensors, a joint adjustment model is proposed to successfully suppress kilometer-level distortions. By specifically modeling and compensating for high-order time-varying errors, this model achieves a substantial reduction in absolute positioning error from the kilometer-scale commonly reported in the literature to within 10 m.

2. Related Work

The geometric processing of optical satellite imagery has evolved notably, driven by the diversity of imaging mechanisms and the demand for high-precision geospatial products. This section reviews the current state of research in three key areas relevant to this study: geometric modeling of whiskbroom/scanning systems, jitter detection and correction, and advanced computational processing for remote sensing data.

2.1. Geometric Modeling and Calibration of Whiskbroom Imaging Systems

While geometric correction for static pushbroom cameras is relatively mature [6,7], the modeling of whiskbroom scanning systems presents unique challenges due to the complex coupling of scanning motion and satellite flight. Early research by Breuer and Albertz established foundational concepts for airborne whiskbroom scanners, emphasizing the need for hybrid auxiliary data to correct panoramic and motion-induced distortions [8]. Building on these principles, Uto et al. developed low-cost whiskbroom imagers using optical fiber bundles, demonstrating that hardware-specific geometric distortions can be mitigated through precise optical coupling and registration algorithms [9].

In the context of spaceborne platforms, the “rotating scanning” or “conical scanning” mechanism has been increasingly explored to achieve ultra-wide swaths. Wang et al. proposed a conceptual rotational mode for optical conical scanning imaging small satellites, analyzing the imaging degradation caused by the compound motion of satellite flight and sensor rotation [10]. To mitigate image blur in such dynamic systems, Sun et al. introduced an image motion compensation method using two-axis fast steering mirrors, establishing a quantitative relationship between the scanning trajectory and compensation rates [11].

Specific missions have driven further methodological innovations. For the SDGSAT-1 Thermal Infrared Spectrometer (TIS), Hu et al. detailed the system design which utilizes a 1-D scanning mirror to achieve a 300 km swath [12], while Li et al. proposed a three-step in-orbit calibration strategy decoupling errors from exterior orientation and scanning compensation parameters [13]. Similarly, for the HJ-2 A/B satellites, Zhang et al. introduced a multi-focal-plane-array joint calibration method to improve band-to-band registration accuracy [14]. Additionally, Sun developed a geometric calibration model for linear-array whiskbroom satellites based on look-angle corrections, utilizing cubic polynomial surfaces to fit correction quantities [15].

Most relevant to this study, Liu et al. constructed a rigorous imaging model specifically for linear array sensors performing circular scanning perpendicular to the orbit, proposing a correction method based on orbital attitude parameters and spline function fitting [16]. Furthermore, Li et al. applied rigorous photogrammetric processing to Tianwen-1 HiRIC imagery, demonstrating that bundle adjustment with high-frequency jitter compensation can achieve sub-pixel accuracy even in complex deep-space environments [17]. Zhang et al. also developed a calibration method for airborne linear-array multi-camera systems, utilizing bundle adjustment with orientation constraints to ensure high-precision stitching [18].

Despite these advances in imaging models, existing methods primarily focus on sensor-specific calibration or standard linear scanning mechanisms. They often lack a unified solution for the complex, high-order geometric distortions characteristic of continuous 360° rotating scanning systems over ultra-wide swaths. To address this, this study builds upon these rigorous geometric principles, proposing a refined model capable of integrating external constraints to suppress the specific non-linear deformations inherent in rotating scanning imagery.

2.2. Attitude Jitter Detection and Correction

For high-resolution satellites, micro-vibrations or “jitter” can remarkably deteriorate geometric quality. Rotating scanning systems are particularly susceptible to high-frequency attitude variations during the scan cycle. Zhang et al. addressed this by deriving an image-based jitter inversion model that accounts for the Time Delay Integration (TDI) effect [19]. Additionally, Zhang et al. conducted a comprehensive simulation of vibration influences on rotating scanning satellites, analyzing how different vibration frequencies affect geometric positioning and providing a theoretical basis for payload structural design [20].

However, while these studies successfully characterize vibration impacts, practical correction of these high-frequency attitude variations remains challenging, especially in scenarios lacking dense GCPs. Therefore, distinct from purely internal jitter detection, this paper proposes to mitigate these geometric instabilities by incorporating multi-source reference data into a joint adjustment framework, thereby compensating for positioning errors induced by dynamic sensor motion.

2.3. Multi-Source Fusion and Advanced Geometric Processing

The synergy of multi-source data is critical for improving mapping accuracy. Silva et al. demonstrated a fusion framework combining GEDI, ICESat-2, and NISAR data, proving that integrating sparse lidar samples with wall-to-wall radar data can yield spatially comprehensive biomass maps [21]. To reduce dependence on ground control points (GCPs), Afsharnia et al. proposed a geometric correction method based on DEM matching, correcting orbital parameters without ground control [22]. Similarly, Tatar introduced a geolocation bias correction for Cartosat-1 using virtual GCPs generated from open-source DEMs [23]. Zhou et al. achieved high-accuracy georeferencing for GF-6 WFV images using a similar reference-based approach [24].

In terms of advanced processing, deep learning and implicit neural representations are reshaping geometric pipelines. Song et al. evaluated deep learning-based features against handcrafted features for satellite image matching, finding that learning-based methods offer superior robustness [25]. Lu et al. proposed SatMVS, transferring Multi-View Stereo (MVS) neural networks to satellite imagery for robust height estimation [26]. Furthermore, Marí et al. explored the application of Neural Radiance Fields (NeRF) to multi-date Earth observation data, demonstrating that implicit representations can model geometry and appearance changes effectively [27]. Finally, Liu et al. proposed a parallel optimization approach on DCU clusters to handle the massive data volume generated by wide-swath satellites [28].

Nevertheless, directly applying these general fusion strategies to rotating scanning systems is hindered by the substantial differences in resolution and viewing geometry compared to standard reference satellites (e.g., ZY-3, GF-2). Bridging this gap, our work develops a robust multi-source fusion pipeline. While learning-based methods mentioned above show promise in general scenarios, they often require extensive training on domain-specific datasets (e.g., rotating scanning distortions) and can lack the rigorous geometric interpretability required for metric-level joint adjustment. Therefore, this study prioritizes a classical yet rigorous photogrammetric approach to ensure deterministic sub-pixel precision.

3. Methods

This study employs an integrated collaborative positioning framework to rectify the geometric distortions of rotating scanning satellite imagery by fusing multi-source high-resolution data (Figure 1). A high-precision control grid is first constructed by extracting feature points from reference ZY-3 and GF-2 imagery using a grid-based strategy and determining their 3D coordinates via a reference DEM. To address the radiometric and geometric disparities between heterogeneous sensors, a coarse-to-fine matching approach combining Normalized Cross-Correlation (NCC) and Least Squares Matching (LSM) is utilized to generate reliable tie points. Subsequently, for the target rotating scanning images, a Multi-Directional Spatial Context Information (MSCI) matching method is applied to accurately transfer control information. Finally, a multi-image joint adjustment model, reinforced by robust estimation, is implemented to optimize the orientation parameters and generate high-precision Digital Orthophoto Maps (DOM).

3.1. Multi-Source Control Point Grid Generation Method

3.1.1. Geometric Control Principle of Heterogeneous Sensor Integration

Integrating data from diverse platforms (e.g., GF-1, GF-2, ZY-3) presents inherent challenges due to differences in spatial resolution, spectral response, and initial positioning accuracy. To address these challenges, this section employs a heterogeneous sensor integration strategy centered on constructing a hierarchical, robust joint adjustment system. Its fundamental principle is as follows: first, all images to be fused are treated as a unified photogrammetric block. Second, high-precision tie point extraction establishes geometric links between all overlapping image pairs. Finally, within a unified adjustment model, the systematic error model parameters for all images and the three-dimensional ground coordinates for all tie points are jointly solved. In this process, we adopt a “hierarchical constraint” strategy within a unified photogrammetric block adjustment. The core principle is to assign higher weights to images with superior initial positioning accuracy (such as ZY-3), allowing them to function as a “control skeleton”. Images with lower initial accuracy are then geometrically constrained to these anchors. Through this mechanism, geometric accuracy is transferred and distributed throughout the entire block, unifying the geometric datum across the dataset.
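The weighting idea behind the “control skeleton” can be illustrated with a minimal sketch. It assumes weights equal to the inverse variance of each observation, which is a common convention but is not stated in the text; the function name `fuse_observations` and all numeric values are hypothetical.

```python
import numpy as np

def fuse_observations(values, sigmas):
    """Weighted least-squares estimate of one coordinate from several observations.

    Weights are taken as 1 / sigma**2 (an assumption made for illustration).
    Returns the fused value and its formal standard deviation.
    """
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    estimate = np.sum(weights * values) / np.sum(weights)
    sigma_est = np.sqrt(1.0 / np.sum(weights))
    return estimate, sigma_est

# Hypothetical easting (m) of one tie point as located by an accurate image
# (sigma = 5 m) and by a coarse image (sigma = 20 m).
est, sig = fuse_observations([1000.0, 1040.0], [5.0, 20.0])
print(est, sig)
```

The fused value stays close to the high-accuracy observation, which is the behavior the hierarchical constraint relies on: accurate images anchor the solution and less accurate images are pulled toward them.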

3.1.2. High-Precision Tie Point Extraction Strategy

We employ a robust two-step matching strategy: “Normalized Cross-Correlation (NCC) Coarse Matching” followed by “Least Squares Matching (LSM) Fine Matching”.

NCC Coarse Matching: To mitigate the effects of geometric deformation and radiometric differences, NCC is used for initial matching. The NCC coefficient is calculated as follows:

[eqn]

where $[eqn]$ represents the dimensions of the template window, $[eqn]$ and $[eqn]$ denote the pixel gray values at corresponding positions in the template and search images, respectively, and $[eqn]$ and $[eqn]$ represent the average gray values of the two windows. The location yielding the maximum NCC coefficient is thus regarded as the approximate position of the corresponding point.
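As a rough illustration of the coarse matching step, the sketch below computes the standard zero-mean NCC coefficient and exhaustively searches for its maximum. The function names `ncc` and `ncc_search` are hypothetical, and the brute-force search stands in for whatever search strategy the authors actually use.

```python
import numpy as np

def ncc(template, window):
    """Zero-mean normalized cross-correlation of two equally sized windows."""
    t = template.astype(float) - template.mean()
    w = window.astype(float) - window.mean()
    denom = np.sqrt(np.sum(t * t) * np.sum(w * w))
    if denom == 0.0:  # flat window: correlation is undefined, treat as no match
        return 0.0
    return float(np.sum(t * w) / denom)

def ncc_search(template, search):
    """Slide the template over the search image and return the best (row, col)."""
    th, tw = template.shape
    best_score, best_pos = -2.0, (0, 0)
    for r in range(search.shape[0] - th + 1):
        for c in range(search.shape[1] - tw + 1):
            score = ncc(template, search[r:r + th, c:c + tw])
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Because both windows are mean-centered and normalized, the score is unchanged by a linear gain and offset in gray values, which is why NCC tolerates moderate radiometric differences between sensors.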

To enhance efficiency and robustness against rotation, we implement two optimization strategies:

Pyramid Strategy: Matching initiates at the top level of a Gaussian pyramid (low resolution) and progressively refines to the bottom level, considerably reducing the search space.
Rotation Template Generation: Using the RPC files, we estimate the relative rotation angle $[eqn]$ between images. The template is then rotated to compensate for deformation:

[eqn]

where $[eqn]$ represents the original template coordinates; $[eqn]$ represents the rotated coordinates; and $[eqn]$ represents the rotation center.
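The rotation compensation can be sketched as a plain 2D rotation of template coordinates about a center. The counter-clockwise sign convention and the function name `rotate_coords` are assumptions, since the source equation is not reproduced here.

```python
import numpy as np

def rotate_coords(x, y, theta, center):
    """Rotate points (x, y) by angle theta (radians) about center = (cx, cy).

    Uses the counter-clockwise convention in a right-handed frame (assumed).
    """
    cx, cy = center
    c, s = np.cos(theta), np.sin(theta)
    xr = cx + c * (x - cx) - s * (y - cy)
    yr = cy + s * (x - cx) + c * (y - cy)
    return xr, yr
```

In practice the rotated coordinates would be used to resample the template before NCC is evaluated, so that the template and search window share roughly the same orientation.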

LSM Fine Matching: For sub-pixel accuracy, LSM is employed to solve for geometric and radiometric parameters. The fundamental observation equation is:

[eqn]

where $[eqn]$ are the object space coordinates; $[eqn]$ denote the sensor’s perspective center; $[eqn]$ is the effective focal length; and $[eqn]$ are the elements of the rotation matrix $[eqn]$ determined by the platform’s roll, pitch, and yaw.

We utilize an affine transformation model to describe the geometric relationship between windows, solving for 6 geometric parameters and 2 radiometric parameters. The linearized error equation for each pixel is

[eqn]

This system is solved iteratively using the least squares method, $[eqn]$ , until the parameter corrections fall below a preset threshold.
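The iterative refinement can be sketched in a reduced form. This example is a simplification, not the authors' implementation: it estimates only a 2-parameter shift plus the 2 radiometric parameters (gain and offset) instead of the full 6-parameter affine model, and it uses bilinear resampling. The names `bilinear`, `lsm_refine`, and `scene` are hypothetical.

```python
import numpy as np

def bilinear(img, xs, ys):
    """Sample img at float coordinates (xs = columns, ys = rows)."""
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    fx, fy = xs - x0, ys - y0
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bot = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bot * fy

def lsm_refine(template, search, x_init, y_init, iters=20, tol=1e-4):
    """Refine template position in search: template ~ gain * search(x+dx, y+dy) + offset."""
    h, w = template.shape
    gy_img, gx_img = np.gradient(search)          # derivatives along rows, columns
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    dx, dy, gain, offset = float(x_init), float(y_init), 1.0, 0.0
    for _ in range(iters):
        xs, ys = xx + dx, yy + dy
        g = bilinear(search, xs, ys)
        gx = bilinear(gx_img, xs, ys)
        gy = bilinear(gy_img, xs, ys)
        # Linearized error equations: one row per pixel, unknowns (ddx, ddy, dgain, doffset).
        A = np.column_stack([(gain * gx).ravel(), (gain * gy).ravel(),
                             g.ravel(), np.ones(g.size)])
        l = (template - (gain * g + offset)).ravel()
        delta, *_ = np.linalg.lstsq(A, l, rcond=None)
        dx, dy = dx + delta[0], dy + delta[1]
        gain, offset = gain + delta[2], offset + delta[3]
        if np.max(np.abs(delta[:2])) < tol:       # stop when the shift update is tiny
            break
    return dx, dy, gain, offset

def scene(x, y):
    """Smooth synthetic gray-value surface used only for this demo."""
    return (100 * np.exp(-((x - 32) ** 2 + (y - 30) ** 2) / (2 * 6.0 ** 2))
            + 60 * np.exp(-((x - 24) ** 2 + (y - 38) ** 2) / (2 * 5.0 ** 2)))

gy_grid, gx_grid = np.mgrid[0:64, 0:64].astype(float)
search = scene(gx_grid, gy_grid)
ty, tx = np.mgrid[0:21, 0:21].astype(float)
template = 1.2 * scene(tx + 22.4, ty + 20.7) + 5.0   # true shift (22.4, 20.7)
dx, dy, gain, offset = lsm_refine(template, search, 22, 21)
```

The integer starting position plays the role of the NCC result; the loop then recovers the sub-pixel part of the shift together with the radiometric gain.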

3.1.3. Control Point Library Generation via Moravec Operator

To generate the control point library, we utilize the Moravec corner detection operator [24]. While more modern detectors like Harris, SIFT, or learning-based methods offer higher rotation invariance, the Moravec operator is selected for this framework due to its notably lower computational complexity and high processing speed—critical for handling the ultra-wide swath data involved in rotating scanning imagery. This involves computing the Sum of Squared Differences (SSD) in four directions (horizontal, vertical, diagonal, and anti-diagonal):

[eqn]

The “Interest Value” for each pixel is determined by the minimum SSD among the four directions: $[eqn]$ . Following non-maximum suppression to retain only local maxima, the extracted corners are mapped to object space coordinates using the reference DOM and DSM, forming a high-precision control information library.
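The interest value computation can be sketched as follows. This uses the photogrammetric variant of the Moravec operator, in which squared differences of adjacent pixels are summed along four lines through the window center; whether the authors use exactly this variant is an assumption, and the function name `moravec_interest` and half-window `k` are hypothetical. Non-maximum suppression is left out for brevity.

```python
import numpy as np

def moravec_interest(img, k=2):
    """Moravec interest value: minimum directional SSD over four directions.

    For each pixel, squared differences of adjacent pixels are summed along
    the horizontal, vertical, diagonal, and anti-diagonal lines of half-length k.
    """
    g = img.astype(float)
    rows, cols = g.shape
    iv = np.zeros_like(g)
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]  # (d_row, d_col)
    for r in range(k, rows - k):
        for c in range(k, cols - k):
            ssd = []
            for dr, dc in directions:
                total = 0.0
                for i in range(-k, k):
                    a = g[r + i * dr, c + i * dc]
                    b = g[r + (i + 1) * dr, c + (i + 1) * dc]
                    total += (a - b) ** 2
                ssd.append(total)
            iv[r, c] = min(ssd)  # a corner needs contrast in every direction
    return iv
```

Taking the minimum over directions is what separates corners from edges: along an edge at least one direction has near-zero SSD, so the interest value collapses, while at a corner all four sums are large.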

3.2. Collaborative Positioning Based on Semantic Features

Once the control library is established, the next phase is the collaborative positioning of the target rotating scanning imagery.

3.2.1. Control Point Transfer Using Spatial Context Features

Due to the complex geometric distortions and non-linear radiometric differences between the reference and rotating scanning images, traditional intensity-based matching is often insufficient. We therefore propose a matching method based on Multi-Directional Spatial Context Information (MSCI). Instead of using raw pixel values, this method aggregates the gray value relationships between a central pixel and its neighbors to construct structural features, enhancing robustness against contrast variations.

As depicted in Figure 2, the local neighborhood around a feature point is partitioned into multiple directional cells. Within each cell, the gray value relationships between the central pixel $[eqn]$ and its neighbors ( $[eqn]$ ) are aggregated into a structural feature, so that the descriptor encodes both the spatial relationship and the intensity contrast between cells.

By linking each directional cell in Figure 2 to a specific dimension in the formula, the MSCI model ensures that the matching cost calculated in Equation (3) reflects the actual spatial structure of the satellite imagery.

To suppress noise inherent in multi-source data (e.g., SAR or LiDAR intensity), Gaussian filtering is applied to the feature channels. Subsequently, the features are normalized to create the MSCI descriptor, enhancing robustness against contrast variations:

[eqn]

Finally, matching is performed in the frequency domain. By calculating the Cross-Power Spectrum via Fast Fourier Transform (FFT), we obtain a pulse function where the peak location indicates the precise translation shift.
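The frequency-domain step can be sketched with standard phase correlation. For simplicity the example operates on raw gray values and recovers integer shifts only; in the described method the inputs would be MSCI descriptor channels, whose exact formula is not reproduced here. The function name `phase_correlation` is hypothetical.

```python
import numpy as np

def phase_correlation(ref, moved):
    """Estimate the integer (row, col) shift that maps ref onto moved.

    The normalized cross-power spectrum keeps only phase, so its inverse FFT
    is (ideally) a single pulse located at the translation.
    """
    F = np.fft.fft2(ref)
    G = np.fft.fft2(moved)
    cross = G * np.conj(F)
    cross /= np.abs(cross) + 1e-12          # avoid division by zero
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, corr.shape)]
    return tuple(int(s) for s in shift)
```

Because the spectrum is normalized to unit magnitude, the result depends on structure rather than absolute brightness, which complements the contrast normalization of the descriptor.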

3.2.2. Multi-Image Joint Adjustment and Robust Estimation

We employ a multi-image joint adjustment model based on collinearity condition equations to optimize the orientation parameters. The process includes extracting tie points between overlapping strips and transferring control points from the library [24]. Error equations are constructed for both control points (using affine parameters as unknowns) and tie points (using object coordinates as unknowns). To eliminate gross errors generated during automatic matching, we apply robust estimation using the Iteratively Reweighted Least Squares (IRLS) method. The weight matrix is updated in each iteration using the Huber weight function:

[eqn]

This ensures that observations with large residuals are down-weighted, preventing them from distorting the final adjustment solution.
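The reweighting loop can be sketched on a generic linear model. The Huber weight used here is the standard form (weight 1 for residuals up to a threshold `k`, and `k / |r|` beyond it); the threshold value, the function names, and the toy line-fitting problem are assumptions standing in for the actual adjustment equations.

```python
import numpy as np

def huber_weights(residuals, k):
    """Huber weights: 1 inside the threshold k, k / |r| outside."""
    a = np.abs(residuals)
    w = np.ones_like(a)
    mask = a > k
    w[mask] = k / a[mask]
    return w

def irls(A, y, k=1.0, iters=50, tol=1e-8):
    """Iteratively reweighted least squares for y ~ A @ x with Huber weights."""
    x, *_ = np.linalg.lstsq(A, y, rcond=None)          # ordinary LS start
    for _ in range(iters):
        sw = np.sqrt(huber_weights(y - A @ x, k))
        x_new, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:
            break
    return x

# Toy problem: a line y = 2 t + 1 with three gross errors (hypothetical data).
t = np.arange(20.0)
y = 2.0 * t + 1.0
y[[3, 10, 15]] += np.array([50.0, -40.0, 60.0])
A = np.column_stack([t, np.ones_like(t)])
x_ols, *_ = np.linalg.lstsq(A, y, rcond=None)
x_rob = irls(A, y, k=1.0)
```

Comparing `x_ols` with `x_rob` shows the effect described above: the three mismatches drag the ordinary solution away from the true line, while the reweighted solution stays close to it.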

4. Results

4.1. Experiment Area and Data Sources

To validate the effectiveness of the proposed method across different geomorphological conditions, an experimental site spanning N 39.465° to N 40.461°, E 115.672° to E 116.496° was selected. This region provides a natural laboratory for comparative analysis due to its distinct topographic dichotomy: the northwestern part is dominated by rugged mountainous terrain, while the southeastern part consists of the flat North China Plain.

The experimental dataset consists of six multi-source high-resolution optical satellite images:

Reference Data: Two scenes of ZY-3 panchromatic imagery (2.1 m resolution) and four scenes of GF-2 satellite imagery with high internal geometric stability served as the control backbone;
Target Data: Rotating scanning satellite simulation data from the research by Xue [1]. To simulate the rotating scanning imagery, the original GF-2 push-broom data was re-projected onto a cylindrical surface. We applied a transformation based on the scanning angle $[eqn]$ , where $[eqn]$ is the nominal scan rate and $[eqn]$ represents simulated attitude jitter. This ensures the resulting imagery contains the characteristic S-shape distortion and scale variation typical of rotating sensors. For this comparative study, 80 representative chips were selected: 40 chips covering the western mountainous areas and 40 chips covering the southeastern plains;
Auxiliary Data: SRTM DEM data were utilized to provide elevation constraints for the joint adjustment and orthorectification processes.
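To make the scale variation in the simulated target data concrete, the sketch below assumes the scan angle has the form θ(t) = ω·t + δ(t), with ω the nominal scan rate and δ(t) the jitter term. This form is inferred from the description and is not confirmed by the source equation. It also assumes flat ground and a fixed platform height; all names and values are hypothetical.

```python
import numpy as np

def scan_angle(t, omega, jitter):
    """Scan angle theta(t) = omega * t + jitter(t), in radians (assumed form)."""
    return omega * t + jitter(t)

def cross_track_offset(theta, height):
    """Cross-track ground distance from nadir for look angle theta (flat ground)."""
    return height * np.tan(theta)

def cross_track_scale(theta):
    """Ground distance swept per unit angle relative to nadir: 1 / cos^2(theta)."""
    return 1.0 / np.cos(theta) ** 2

# Hypothetical values: 500 km height, small sinusoidal jitter of 1e-4 rad at 50 Hz.
jitter = lambda t: 1e-4 * np.sin(2 * np.pi * 50.0 * t)
theta = scan_angle(np.linspace(0.0, 0.5, 6), omega=1.5, jitter=jitter)
offsets = cross_track_offset(theta, height=500e3)
```

Under these assumptions the ground footprint of one angular step doubles by a 45° look angle, which gives a sense of why the cross-track direction dominates the uncorrected error.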

4.2. Control Point Extraction and Joint Adjustment Accuracy

Using the grid-based feature extraction and MSCI matching strategy, a dense network of control and tie points was established. As illustrated in Figure 3, the study area is divided into regular grids, and tie points are extracted within each grid to ensure an even spatial distribution. This distribution serves as the physical basis for the ‘control skeleton’ mentioned above, ensuring that the high-precision constraints from ZY-3 are uniformly propagated through the adjustment equations discussed in Section 3.2.2. On average, 2046 points per chip were extracted for the mountainous region and 2475 points per chip for the plain region. The higher point density in the plain region is attributed to the abundance of man-made structures (roads and building corners), which are conducive to feature matching.

Following the multi-image joint adjustment, we evaluated the positioning accuracy using high-precision checkpoints. Table 1 summarizes the positioning accuracy after multi-image joint adjustment. The results demonstrate that the systematic errors across the heterogeneous sensors were efficiently unified, achieving a consistent planar accuracy level. The average planar positioning accuracy (RMSE) after adjustment reached 4.01 m. Considering the reference imagery itself has a planar accuracy of approximately 5 m (leading to a theoretical error propagation of 6.41 m), the adjustment result of 4.01 m significantly outperforms the theoretical value, demonstrating the efficacy of the error compensation model.
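The text does not show how the 6.41 m figure is derived. One reading that reproduces it numerically, offered here as an assumption rather than the authors' stated derivation, is a root-sum-square combination of the 5 m reference accuracy with a second independent 4.01 m component:

$$\sqrt{5.00^2 + 4.01^2} = \sqrt{25.00 + 16.08} \approx 6.41\ \text{m}$$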

Furthermore, we assessed the relative consistency between adjacent images. As shown in Table 2, the average edge matching RMSE between all overlapping image pairs was 2.52 m. This high level of consistency indicates that systematic errors between multi-source images were successfully eliminated, satisfying the requirements for seamless stitching.

4.3. Collaborative Positioning Performance on Ultra-Wide Images

The core of the experiment involved evaluating the cooperative positioning accuracy of the simulated ultra-wide rotating scanning image segments. We conducted a rigorous comparison between the original direct positioning and the proposed collaborative adjustment for both mountainous and plain sub-regions.

To compare the model’s performance in mountainous and plain regions, the two areas were processed and analyzed separately. Table 3 presents the initial direct positioning accuracy for the mountainous areas and Table 4 for the plain areas. Due to the high-order distortions inherent in the rotating scanning mechanism, the original imagery exhibited severe geometric displacement.

The initial errors were heavily dominated by the X-direction (cross-track/scanning direction), reaching over 80 m in mountainous areas. This is consistent with the physical characteristics of rotating scanning, where scanning non-linearity and terrain-induced relief displacement are most pronounced in the cross-track direction. Mountainous regions exhibited higher original errors (82.22 m) compared to plains (78.87 m) because the extreme elevation changes further amplified the relief displacement in the scanning geometry.

After applying the proposed collaborative positioning correction based on multi-source control data, the accuracy improved drastically. As detailed in Table 5 and Table 6, the experimental data reveals several key insights:

Effective Geometric Recovery: The planar accuracy (RMSE XY) was reduced to under 5.5 m for all regions, meeting the requirements for meter-level positioning.
Terrain Adaptability: Interestingly, the correction accuracy in mountainous areas (4.68 m) was slightly superior to that in plain areas (5.22 m). This demonstrates that the MSCI descriptor and DEM-assisted adjustment are highly effective at capturing stable topographic structural features in rugged terrain, which are less susceptible to temporal land-use changes.
Error Isotropy: Following the adjustment, the extreme X-direction bias was eliminated. The residual errors in the X and Y directions are now within the same order of magnitude (3–4 m), indicating that the systematic scanning distortions were successfully modeled and suppressed.
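The accuracy figures above can be computed from checkpoint residuals as in the following sketch. It assumes the planar RMSE is the root-sum-square of the X and Y RMSE values, a common convention that the text does not spell out; the function name `planar_rmse` and the sample residuals are hypothetical.

```python
import numpy as np

def planar_rmse(dx, dy):
    """Return (RMSE_X, RMSE_Y, RMSE_XY) from checkpoint residuals in metres."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    rmse_x = np.sqrt(np.mean(dx ** 2))
    rmse_y = np.sqrt(np.mean(dy ** 2))
    rmse_xy = np.sqrt(rmse_x ** 2 + rmse_y ** 2)  # planar error (assumed convention)
    return rmse_x, rmse_y, rmse_xy

# Hypothetical residuals (measured minus reference) at four checkpoints.
rx, ry, rxy = planar_rmse([3.0, -3.0, 3.0, -3.0], [4.0, 4.0, -4.0, -4.0])
```

With this convention, per-axis residuals of 3 to 4 m combine to a planar value of roughly 4 to 6 m, which is in line with the reported 4.68 m and 5.22 m.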

4.4. Digital Orthophoto Map (DOM) Generation

Based on the high-precision orientation parameters obtained from the joint adjustment and the external DEM, we generated a seamless Digital Orthophoto Map (DOM) for the entire experimental area using pixel-by-pixel rectification. Visual inspection of the generated DOM products reveals clear image textures and accurate planar positions (Figure 4). The seamless mosaic further validates the high edge-matching accuracy achieved, proving the method’s engineering practicality for producing high-quality mapping products from rotating scanning satellite data.

4.5. Computational Cost and Operational Applicability

To evaluate the operational feasibility of the proposed framework, the processing time was recorded on a Dell 7040 workstation equipped with two Intel Xeon Gold 5118 processors (2.3 GHz) and 64 GB of RAM. For a typical image chip, the automated grid-based feature extraction and MSCI matching require approximately 45.2 s, while the joint adjustment with robust estimation converges in less than 2.5 s. The total processing time for the ultra-wide swath simulation (80 chips) was approximately 64 min. This efficiency indicates that the method is suitable for large-scale, near-real-time satellite data processing without manual intervention.
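The reported total is consistent with the per-chip timings if the 80 chips are assumed to be processed one after another and the 2.5 s adjustment time is taken as an upper bound:

$$80 \times (45.2\ \text{s} + 2.5\ \text{s}) = 3816\ \text{s} \approx 63.6\ \text{min}$$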

5. Discussion

The results of this study demonstrate that the proposed multi-source data fusion framework proficiently addresses the geometric challenges of rotating scanning satellites. A critical interpretation of the results reveals a counter-intuitive finding: despite the higher initial geometric complexity in mountainous areas (82.22 m initial error), the final collaborative positioning accuracy (4.68 m) is superior to that of the plain areas (5.22 m). This disparity highlights the relative contribution of the MSCI descriptor. In urban plains, high-frequency temporal changes in buildings and shadows introduce ‘textural noise’ that slightly degrades matching precision. In contrast, the rugged terrain provides stable, unique topographic skeletons (e.g., ridges and valleys) that the MSCI descriptor captures with higher semantic fidelity. This suggests that the proposed method is particularly robust for remote, unpopulated mountainous regions where GCPs are hardest to obtain.

Despite these promising results, several avenues for improvement remain:

Integration of On-board Data: Future work should incorporate satellite attitude and Inertial Measurement Unit (IMU) data to construct a more rigorous physical imaging model, thereby reducing error accumulation at the source.
Robustness in Complex Terrain: While the method performed well in the test area, extracting reliable tie points in extreme terrains (e.g., high relief mountains) or urban canyons remains challenging. Developing adaptive filtering and matching algorithms for these scenarios is necessary.
Multi-Modal Fusion: To further enhance positioning capabilities, particularly in varying weather conditions, we aim to explore the online collaborative processing of heterogeneous data, including Optical, SAR, and LiDAR, to achieve intelligent, near-real-time earth observation processing.
Transferability and Limitations in Other Terrains: While the experimental results demonstrate high accuracy across mountainous and plain areas, the transferability of the proposed method to other global terrains warrants further discussion:

High-Relief Mountains: The synergy between MSCI semantic features and DEM-based elevation constraints allows the model to effectively rectify non-linear relief displacements. However, in extreme high-relief areas (e.g., the Himalayas), excessive radar/optical shadowing or snow cover may reduce the number of available tie points, potentially requiring higher-resolution reference DEMs to maintain sub-pixel accuracy.
Deserts and Low-Texture Regions: A primary limitation arises in extremely homogeneous landscapes such as vast sand deserts or consistent ice sheets. Because the MSCI descriptor relies on local spatial contextual structures, the absence of distinct textural gradients in these regions would lead to sparse or unreliable feature matching. In such cases, the framework would rely more heavily on the satellite’s initial ephemeris data or would require the integration of multi-modal data (e.g., SAR intensity features) which are less dependent on optical texture.
Dependency on Reference Quality: The final absolute positioning accuracy is intrinsically capped by the precision of the reference ‘skeleton’ (e.g., ZY-3 imagery). In remote regions where high-precision reference imagery or GCP-calibrated base maps are unavailable, the absolute accuracy may degrade toward the level of the target satellite’s raw orientation parameters.
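The low-texture limitation described above could, in principle, be screened for before matching. The following is a minimal sketch that flags grid cells whose mean gradient magnitude falls below a threshold; it is not the MSCI descriptor or any part of the published pipeline, and the function name, cell size, and threshold are hypothetical values chosen for illustration.

```python
import numpy as np


def low_texture_mask(image: np.ndarray, cell: int = 64,
                     min_mean_gradient: float = 2.0) -> np.ndarray:
    """Boolean grid; True marks cells whose mean gradient magnitude is below the threshold."""
    gy, gx = np.gradient(image.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    # Crop to a whole number of cells, then average the gradient magnitude per cell.
    cropped = magnitude[: rows * cell, : cols * cell]
    per_cell = cropped.reshape(rows, cell, cols, cell).mean(axis=(1, 3))
    return per_cell < min_mean_gradient


# Synthetic scene: flat (desert-like) left half, noisy (textured) right half.
rng = np.random.default_rng(0)
scene = np.full((128, 128), 100.0)
scene[:, 64:] += rng.normal(0.0, 20.0, size=(128, 64))
mask = low_texture_mask(scene)
print(mask)
```

Cells flagged in such a mask would be candidates for the fallbacks mentioned in the text, for example relying on the initial ephemeris or on features from another modality.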

6. Conclusions

This study established a collaborative positioning framework for ultra-wide rotating scanning satellite imagery. By utilizing ZY-3 imagery as a geometric skeleton and incorporating DEM constraints, we achieved a meter-level positioning accuracy of 4.68 m (RMSE XY) for mountainous areas and 5.22 m for plain areas.
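For reference, a planar RMSE of this kind is commonly computed from check-point residuals as the square root of the mean of dx² + dy². The sketch below assumes that convention, since the exact definition is not restated in this section; the residual values are made up for illustration and are not the paper's data.

```python
import math


def planar_rmse(residuals_xy):
    """Planar RMSE in metres: sqrt(mean(dx^2 + dy^2)) over check-point residuals."""
    if not residuals_xy:
        raise ValueError("at least one check point is required")
    total = sum(dx * dx + dy * dy for dx, dy in residuals_xy)
    return math.sqrt(total / len(residuals_xy))


# Made-up (east, north) residuals in metres, for illustration only.
example = [(3.0, 4.0), (-3.0, -4.0), (4.0, -3.0), (0.0, 5.0)]
print(planar_rmse(example))  # prints 5.0
```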

Validation Conditions and Dependencies: The results demonstrate that the method efficiently mitigates high-order scanning distortions in both mountainous (4.68 m) and plain (5.22 m) environments. However, the final accuracy remains highly dependent on the precision of the reference imagery and the vertical quality of the auxiliary DEM.

Limitations and Future Work: A primary limitation is the sensitivity to significant temporal land-use changes between multi-source sensors, which can introduce matching outliers in rapidly developing urban areas. Future research will focus on integrating multi-temporal robust filtering to further enhance reliability in dynamically changing landscapes.
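As a rough illustration of how matching outliers caused by land-use change might be screened, the sketch below applies a generic median absolute deviation (MAD) test to tie-point residual norms. This is a standard robust-statistics technique substituted here for illustration; it is not the multi-temporal filter proposed as future work, and the function name and the threshold k are assumptions.

```python
import numpy as np


def mad_inlier_mask(residuals: np.ndarray, k: float = 3.0) -> np.ndarray:
    """True for tie points whose residual norm lies within k robust sigmas of the median."""
    norms = np.linalg.norm(residuals, axis=1)
    median = np.median(norms)
    mad = np.median(np.abs(norms - median))
    sigma = 1.4826 * mad  # MAD scaled to match the standard deviation under normality
    if sigma == 0.0:
        return norms == median
    return np.abs(norms - median) <= k * sigma


# Made-up tie-point residuals in metres; the last one mimics a changed building.
residuals = np.array([
    [1.0, 0.0], [0.0, 1.2], [-0.9, 0.0], [0.0, -1.1], [0.8, 0.0], [30.0, -40.0],
])
keep = mad_inlier_mask(residuals)
print(keep)
```

Only the points marked True would be passed on to the adjustment; a real implementation would likely iterate this test together with re-estimation.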

Bibliography

1. Xue W., Wang P., Zhong L. Geometric Processing of New Ultra-large Swath Optical Remote Sensing Satellite Images. Remote Sens. Inf. 2021, 36, 60–65.
2. Liu X., Xue W., Wang P. Construction and accuracy assessment of rational function model for perpendicular-orbit circular scanning satellite images. Opt. Precis. Eng. 2023, 31, 2898–2909. doi:10.37188/OPE.20233119.2898
3. Wang Y., Fu T., Zhou Y., Kong Q., Yu W., Liu J., Wang Y., Chen B. TENet: Attention-Frequency Edge-Enhanced 3D Texture Enhancement Network. Sensors 2025, 25, 715. doi:10.3390/s25030715. PMID 39943353. PMCID PMC11819683
4. Xue W., Liu X., Zhao L., Wang P., Zhang X., Li W. Research on High-Precision Geometric Correction of Agile Optical Satellite Images. J. Geo-Inf. Sci. 2025, 27, 2701–2712.
5. Zhang T., Li Y. rpcPRF: Generalizable MPI Neural Radiance Field for Satellite Camera. arXiv 2023, arXiv:2310.07179. doi:10.48550/arXiv.2310.07179
6. Zhang G., Xu K., Zhang Q., Li D. Correction of Pushbroom Satellite Imagery Interior Distortions Independent of Ground Control Points. Remote Sens. 2018, 10, 98. doi:10.3390/rs10010098
7. Tao C.V., Hu Y. A comprehensive study of the rational function model for photogrammetric processing. Photogramm. Eng. Remote Sens. 2001, 67, 1347–1357.
8. Breuer M., Albertz J. Geometric correction of airborne whiskbroom scanner imagery using hybrid auxiliary data. Int. Arch. Photogramm. Remote Sens. 2000, 33, 93–100.