Two-Line Element Estimation Using Machine Learning

Rasit Abay; Sudantha Balage; Melrose Brown; Russell Boyce

arXiv:1902.04189·physics.space-ph·June 4, 2019

Two-Line Element Estimation Using Machine Learning

Rasit Abay, Sudantha Balage, Melrose Brown, Russell Boyce

PDF

TL;DR

This paper explores machine learning techniques, including Monte-Carlo methods and neural networks, to estimate Two-Line Elements (TLEs) for space objects, achieving high accuracy with reduced computational effort.

Contribution

It introduces a Monte-Carlo approach for TLE estimation without initial guesses and trains ML models to map orbital evolution to TLEs using large datasets.

Findings

01

Monte-Carlo method achieves residual errors below 1 km.

02

ML models successfully estimate TLEs for certain cases.

03

The approach reduces computational costs compared to traditional methods.

Abstract

Two-line elements are widely used for space operations to predict the orbit with a moderate accuracy for 2-3 days. Local optimization methods, such as the nonlinear least squares method with differential corrections, can estimate a TLE as long as there exists an initial estimate that provides the desired precision. Global optimization methods to estimate TLEs are computationally intensive, and estimating a large number of them is prohibitive. In this paper, the feasibility of estimating TLEs using machine learning methods is investigated. First, a Monte-Carlo approach to estimate a TLE, when there are no initial estimates that provide the desired precision, is introduced. The proposed Monte-Carlo method is shown to estimate TLEs with residual mean squared errors below 1 km for space objects with varying area-to-mass ratios and orbital characteristics. Second, gradient boosting decision…

Tables11

Table 1. Table 1: Feature vectors

Features	Equations/Values
Instance	[2.452, 2.402, …, 0.049, 0.0]
Mean motion (rad/h)	$n = \sqrt{\frac{μ}{a^{3}}}$
Ex	$E_{x} = e c o s (ω + Ω)$
Ey	$E_{y} = e s i n (ω + Ω)$
Hx	$H_{x} = t a n (\frac{i}{2}) c o s (Ω)$
Hy	$H_{y} = t a n (\frac{i}{2}) s i n (Ω)$
Lv	$L v = ν + ω + Ω$

Table 2. Table 2: Hyperparameters of XGBoost parameters

Parameters	$B^{*}$	$i$	$Ω$	$e$	$ω$	$m$	$n$
Learning rate	0.1	0.3	0.3	0.3	0.2	0.2	0.1
No. of estimators	100	80	70	70	80	80	100
Maximum depth	20	20	20	20	25	25	25
Min. child weight	7	9	11	13	27	27	1
Gamma	-	-	-	-	0.7	1.0	-
Subsample	0.7	0.7	0.6	0.7	0.1	0.2	1.0
Col. sample by tree	0.7	0.8	0.7	0.7	0.2	0.3	1.0
Col. sample by level	0.7	0.8	0.7	0.7	0.2	0.3	1.0
Alpha	0.2	0.2	0.2	0.1	0.9	0.7	0.0

Table 3. Table 3: Bias-Variance tradeoff for XGBoost

Models	Training Squared Mean Error	Test Squared Mean Error
$B^{*}$	3.9e-5	6.3e-5
$i$	7.1e-6	2.1e-5
$Ω$	3.5e-4	2.9e-3
$e$	1.6e-6	1.3e-5
$ω$	0.21	0.32
$m$	0.15	0.26
$n$	9.8e-9	2.2e-8

Table 4. Table 4: Parameters to normalize and standardize the input data for the neural networks

Features	Min	Max	$μ$	$σ$
Instance	0.0	2.45	1.225	0.721543
Mean motion (rad/h)	1.532504	4.479097	3.662984	0.221211
Ex	-0.489632	0.430255	-0.000028	0.015722
Ey	-0.327102	0.434307	0.000077	0.015767
Hx	-3.138717	3.138880	0.001386	0.693990
Hy	-3.138954	3.138859	0.009009	0.7005792
Lv	-3.141592	3.141592	0.01232161	1.809887

Table 5. Table 5: Fully-connected neural network hyperparameters

Parameters	$B^{*}$	$i$	$Ω$	$e$	$ω$	$m$	$n$
Learning rate	9e-5	9e-5	9e-5	9e-5	9e-5	9e-5	1e-5
Optimizer	Adam	Adam	Adam	Adam	Adam	Adam	Adam
$β_{1}$	9e-4	9e-4	9e-4	9e-4	9e-4	9e-4	9e-4
$β_{2}$	999e-6	999e-6	999e-6	999e-6	999e-6	999e-6	999e-6
Decay	1e-07	1e-07	1e-07	1e-07	1e-07	1e-07	1e-07
Epoch number	50	50	50	50	50	50	50

Table 6. Table 6: Bias-Variance tradeoff for fully-connected neural network

Models	Training Squared Mean Error	Test Squared Mean Error
$B^{*}$	1.4e-6	2.4e-6
$i$	1.3e-4	9.8e-5
$Ω$	0.02	0.01
$e$	1.8e-7	2e-7
$ω$	0.24	0.22
$m$	0.2	0.3
$n$	4.6e-9	5.3e-9

Table 7. Table 7: The parameters of the test cases used for the numerical propagator

Parameters	Test case #1	Test case #2	Test case #3
Epoch (UTC)	2018-12-12T18:47	2014-01-15T10:37	2009-08-03T15:18
Period (min)	$99.71$	$94.65$	$97.32$
Mass (kg)	$800$	$3$	$1$
Area ( $m^{2}$ )	$6$	$0.03$	$0.1$
$n$ ( $\frac{r a d}{s}$ )	$0.00105$	$0.001106$	$0.001076$
$i$ (deg)	$101.45$	$85.12$	$97.60$
$Ω$ (deg)	$53.03$	$90.16$	$253.29$
$ω$ (deg)	$9.33$	$204.08$	$235.32$
$m$ (deg)	$359.44$	$196.9$	$178.81$
$e$	$0.022353$	$0.011556$	$0.025822$
$C_{D}$	$2.2$	$1.5$	$0.8$

Table 8. Table 8: The statistical description for the performance of the proposed method in terms of the number of loops and the root mean square error

Parameters	Test case #1		Test case #2		Test case #3
	$μ$	$σ$	$μ$	$σ$	$μ$	$σ$
Final RMSE (km)	4.92	2.92	4.95	2.76	4.98	2.82
Inner loop number	54.39	50.22	43.91	42.38	55.28	58.06
Inner loop RMSE (km)	659.49	482.67	556.65	416.40	619.22	447.76
Outer loop number	1.43	0.83	1.41	0.71	1.38	0.74
Outer loop RMSE (km)	7.0	4.33	7.26	4.4	7.0	4.17

Table 9. Table 9: Mean and variance of residuals for different n 𝑛 n values

Mean and Variance	$n < 0.053$	$n > 0.067$	$0.053 < n < 0.067$
Mean (XGBoost)	0.0006	0.0003	6.5e-5
Mean (Neural Network)	7.2e-5	7e-5	5.2e-5
Variance (XGBoost)	3.2e-7	9.4e-8	1.8e-8
Variance (Neural Network)	8.5e-9	2.5e-8	2.6e-9

Table 10. Table 10: Mean and variance of residuals for the mean anomaly and the argument of perigee

Machine learning models	Mean ( $r a d$ )	Variance ( $r a d^{2}$ )
Mean Anomaly (XGBoost)	0.19	0.23
Mean Anomaly (Neural Network)	0.21	0.26
Argument of Perigee (XGBoost)	0.24	0.27
Argument of Perigee (Neural Network)	0.20	0.18

Table 11. Table 11: Best machine learning models for predicting mean elements of a TLE

Mean elements	Best machine learning models
$B^{*}$	Gradient boosting trees
$n$	Fully-connected neural network (unless the absolute residual between two models are larger than 0.0002 rad/min for values of $n$ between 0.053 rad/min and 0.067 rad/min)
$i$	Gradient boosting trees (unless the absolute residual between two models are larger than 0.014 rad)
$Ω$	Gradient boosting trees
$ω$	Gradient boosting trees
$m$	Fully-connected neural network
$e$	Fully-connected neural network

Equations12

B^{*} = \frac{1}{2} \frac{C _{D} A}{m} ρ

B^{*} = \frac{1}{2} \frac{C _{D} A}{m} ρ

n = \frac{μ}{a ^{3}}

n = \frac{μ}{a ^{3}}

δ tl e_{e p oc h} = (\frac{\partial f _{s g p 4}}{\partial tl e _{e p oc h}})^{- 1} δ r v_{e p oc h} = (A_{t_{0}})^{- 1} δ b_{t_{0}}

δ tl e_{e p oc h} = (\frac{\partial f _{s g p 4}}{\partial tl e _{e p oc h}})^{- 1} δ r v_{e p oc h} = (A_{t_{0}})^{- 1} δ b_{t_{0}}

δ tl e_{e p oc h} = (A^{T} W A)^{- 1} A^{T} W b = (i \sum N A_{(t_{i})}^{T} A_{(t_{i})})^{- 1} i \sum N A_{(t_{i})}^{T} b_{(t_{i})}

δ tl e_{e p oc h} = (A^{T} W A)^{- 1} A^{T} W b = (i \sum N A_{(t_{i})}^{T} A_{(t_{i})})^{- 1} i \sum N A_{(t_{i})}^{T} b_{(t_{i})}

z_{n} = \frac{x - min ( x )}{ma x ( x ) - min ( x )}

z_{n} = \frac{x - min ( x )}{ma x ( x ) - min ( x )}

z_{s} = \frac{x - μ _{(x)}}{σ _{(x)}}

z_{s} = \frac{x - μ _{(x)}}{σ _{(x)}}

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

∎

11institutetext: Rasit Abay

11email: [email protected]

1University of New South Wales, Canberra, Australian Capital Territory 2600, Australia

Two-Line Element Estimation Using Machine Learning

Rasit Abay1

Sudantha Balage1

Melrose Brown1

Russell Boyce1

(Received: date / Accepted: date)

Abstract

Two-line elements are widely used for space operations to predict the orbit with a moderate accuracy for 2-3 days. Local optimization methods, such as the nonlinear least squares method with differential corrections, can estimate a TLE as long as there exists an initial estimate that provides the desired precision. Global optimization methods to estimate TLEs are computationally intensive, and estimating a large number of them is prohibitive. In this paper, the feasibility of estimating TLEs using machine learning methods is investigated. First, a Monte-Carlo approach to estimate a TLE, when there are no initial estimates that provide the desired precision, is introduced. The proposed Monte-Carlo method is shown to estimate TLEs with root mean square errors below 1 km for space objects with varying area-to-mass ratios and orbital characteristics. Second, gradient boosting decision trees and fully-connected neural networks are trained to map orbital evolution of space objects to the associated TLEs using 8 million publicly available TLEs from the US space catalog. The desired precision in the mapping to estimate a TLE is achieved for one of the three test cases, which is a low area-to-mass ratio space object.

Keywords:

Two-line elements TLE Machine learning Orbit determination

1 Introduction

Two-line element sets (TLEs) and Simplified General Perturbations #4 (SGP4) are widely used for space operations to predict trajectories of space objects with moderate accuracy (tens of kilometers) for 2-3 days. The trajectories produced by SGP4 are defined in True Equator Mean Equinox (TEME) reference frame DV2013 . The orbits of tracked Resident Space Objects around Earth are specified and updated in the US space catalog as TLEs. TLEs consist of mean elements that can only be used with SGP4 to propagate orbits with moderate accuracy. SGP4 provides the reference trajectory that is updated by fitting the actual trajectory, and TLEs are the updated parameters that approximate it. SGP4 includes zonal terms up to J5 of the gravitational potential of the Earth, neutral atmosphere with exponential decay and partially modeled third-body mass interactions DV2013 . Bstar ( $B^{*}$ ) is an element of a TLE that determines the effect of air drag on the trajectories of the space objects. Eq. 1 shows the relationship between drag coefficient ( $C_{D}$ ), area-to-mass ratio ( $\frac{A}{m}$ ), atmospheric density ( $\rho$ ) and $B^{*}$ DV2013 . However, it should be noted that the $B^{*}$ parameter in the TLE is highly sensitive to any perturbations, such as solar radiation pressure, air drag, third-body mass gravitational interactions, maneuvers and mismodeling of the Earth’s gravitational potential. Therefore, $B^{*}$ is always adjusted during the estimation scheme of TLEs, and it can even be negative (7.9% of the test data used in this work have negative $B^{*}$ ) DV2013 .

[TABLE]

Mean motion ( $n$ ) is one of the 6 independent mean elements, which include the inclination, right ascension of the ascending node, argument of perigee, mean anomaly and eccentricity, that is required for computations of TLEs DV2013 . Eq. 2 shows the relationship between the semi-major axis ( $a$ ), the geocentric gravitational constant ( $\mu$ ) and the mean motion of a space object. Mean motion has 8 decimal places in TLEs.

[TABLE]

Inclination ( $i$ ) is another independent mean element of a TLE that defines the orientation of the orbital plane with respect to the Earth. Inclination has 4 decimal places in a TLE, and its unit is degrees. The right ascension of the ascending node ( $\Omega$ ) is a mean element of a TLE that defines the orientation of an orbit with respect to the $\hat{z}$ axis, which is parallel to the rotation axis of the Earth. It changes primarily due to the $J2$ perturbation that is caused by the oblateness of the Earth, and this change is a function of inclination. The right ascension of the ascending node has 4 decimal places in the TLE, and it is in degrees. Argument of perigee ( $\omega$ ) is a mean element of a TLE that defines the orientation of the orbital plane with respect to the Earth around the axis parallel to the angular momentum vector of the orbit. The advance of the argument of perigee is the slowest frequency of the orbital evolution for objects orbiting the Earth while the change in the mean anomaly (Keplerian orbital frequency) is the fastest and the nodal precession is the intermediate frequency. Mean anomaly ( $m$ ) is a mean element of a TLE that defines the position of a space object in its orbit. In addition, it is the fastest changing parameter. Eccentricity ( $e$ ) is a mean element of a TLE that defines the shape of the orbit while semi-major axis ( $a$ ) defines the size of the orbit. Eccentricity has 7 decimal places. The slow rate of change in the $a$ and $e$ of the low-Earth orbits is primarily due to air drag, which is a non-gravitational perturbation, and this is related to the orbital decay of the resident space objects in LEO.

There are a few studies that investigate different approaches to estimate TLEs. As stated above, TLEs include the mean states estimated by fitting observations to the dynamics provided by SGP4, and they can only be used with SGP4 VC2006 . Keplerian orbital elements are used as initial estimates of TLEs by the differential corrections and nonlinear least squares methods VC2008 ; JGMK1996 ; LBS2002 . Kalman filter is investigated to estimate TLEs using onboard Global Positioning System (GPS) data MG2000 . However, the above methods, which search for the local minimum of the objective function, which is the sum of the squares of the position and velocity errors, depend on the availability of a reasonable initial estimate of a TLE. Although methods, such as genetic algorithms GOH2018 and invasive weed optimization BLD2015 , that do not require an initial estimate of the TLEs have been investigated, they are reported to be computationally intensive as they search for the global optimum. To summarize, the two main approaches in the literature to estimate a TLE in literature use either computationally expensive global search or local search which depends on having an initial estimate. This represents a significant shortcoming in the current state-of-the-art of orbit determination using TLEs. The present work addresses this shortcoming.

This work utilizes machine learning methods, namely the gradient boosting trees and fully-connected neural networks, to approximate the inverse mapping of publicly available SGP4 algorithm VC2006 for LEO objects by learning to map the orbital evolution to the associated TLE. The capability of approximating such mapping using machine learning will enable to represent time series orbital data in latent space that can be used with orbit propagators. The gradient boosting trees are machine learning models that are suitable for determining non-linear and sharp decision boundaries. The boosting is achieved by adding weak learners, such as decision trees, to determine the complex decision boundaries. The first successful application of such an idea is adaptive boosting (AdaBoost) FS1995 . AdaBoost generates a sequence of classifiers, and each classifier puts more weight on the samples that are not classified accurately in the previous iteration. Such an approach enables each classifier to capture different features in the input data, and more complex decision boundaries can be defined in the end KJ2013 . The boosting idea is further improved by defining the framework as a numerical optimization in the function space Fr2001 . The objective function is minimized by adding weak learners using the steepest-descent method.

Gradient boosting of regression trees are shown to be efficient both for regression and classification problems Fr2001 . The constraints imposed on trees and the weight updates, and the subsampling of data at each iteration help to reduce the overfitting Fr2002 . Moreover, L1 and L2 regularizations of the leaf weights in addition to the structure of the trees yield better generalization capability because they force the gradient boosting decision trees to be less complex CG2016 .

Artificial neural networks are machine learning models that are suitable for determining smooth nonlinear decision boundaries. According to the universal approximation theory Cy1989 ; HSW1989 , any Borel measurable function can be approximated by a feedforward neural network as long as it includes enough hidden units GYC2016 . In addition, a neural network may also approximate any discrete function irrespective of its dimensions given that dimension is finite. Although the universal approximation theory states that any closed and bounded continuous function can be approximated by multilayer perceptrons, the training scheme to approximate that function might fail due to overfitting or failure of the optimization algorithm GYC2016 .

Different techniques have been developed to solve the above issues regarding the neural network training, such as deeper networks, regularization and gradient descent with momentum. Figure 1 presents a simple neural network with an activation function of the rectified linear unit in the hidden layer. In this work, it is assumed that the publicly available SGP4 algorithm VC2006 is a bounded and closed function, and the supervised machine learning models are investigated to approximate it. The contributions of the present study are : 1) a Monte-Carlo approach to estimate a TLE with the desired precision without an initial estimate as is required by the differential correction methods and 2) supervised machine learning methods which can map an orbital evolution to a TLE which can be used as an initial estimate with desired precision. The trained machine learning models can also be used to improve the computational efficiency of the proposed Monte-Carlo approach to estimate a TLE.

The paper is structured as follows: In section 2, a Monte-Carlo approach and supervised learning models to estimate TLEs are explained. In section 3, an investigation of the performances of the proposed methods is presented. In section 4, the discussion regarding the feasibility of using machine learning methods to approximate the inverse mapping of publicly available SGP4 algorithm VC2006 for LEO objects is provided. In section 5, the conclusions and future work based on the results are discussed.

2 Methodology

2.1 Monte-Carlo approach

In this section, a Monte-Carlo approach that searches for the global minimum of the sum of the squares of the position and velocity errors to estimate a TLE without any initial data is presented. However, the proposed method is computationally intensive. Therefore, this work also investigates the feasibility of using machine learning to predict reasonable initial estimates of TLEs that can be utilized by the proposed Monte-Carlo approach. Figure 2 outlines the proposed method that extends the differential corrections method using the Monte-Carlo technique to estimate a TLE that satisfies a given criterion. The following procedure describes how to obtain such TLEs.

First, a position and a velocity vector are sampled from normal distributions which have the given position and velocity vector components (user-provided state for the epoch of the desired TLE) as mean values, and 3 km for the position and 0.003 km/s for the velocity as the associated standard deviations (STEP #1 in Figure 2). Then, the sampled position and velocity vectors are mapped to the corresponding Keplerian elements (STEP #2 in Figure 2). Next, the mapped Keplerian elements are fed into SGP4 (Keplerian elements are assumed to be the initial estimate of the TLE with desired precision, and $B^{*}$ is taken as 0 at this step) to propagate to the epoch ( $t=0$ ) to obtain new position and velocity vectors. The differential corrections to the Keplerian elements are computed by mapping the residual vector ( $\delta rv_{epoch}$ ), which is the difference between the above obtained state vectors and the user-provided states, to the corrections ( $\delta tle_{epoch}$ ) iteratively by using Eq. 3 (STEP #3 in Figure 2). Once the process is converged (the magnitude of the position error vector is $10^{-8}$ ), an osculating TLE is obtained, which provides the tangential orbit at the given point only ( $t=0$ ). Next, the $B^{*}$ value ( $\mu=0.0001$ and $\sigma=0.00009$ ) is sampled from the normal distribution computed from 12 years of all LEO TLEs in the official US space catalog (STEP #4 in Figure 2). The osculating TLE, which includes six independent elements that define the orbit, and the sampled $B^{*}$ are assumed to be the TLE with desired precision. The above computed TLE is used to propagate an orbit which is compared to the orbit generated by high-precision numerical orbit propagator, which utilizes the libraries of Orekit (version 9.1) low level space dynamics library MVP2010 , and the root mean square error is computed. The process starts from the beginning if the root mean square error (RMSE) does not satisfy the criterion, which is an RMSE of 30 km and smaller for this work to exclude the cases which may have secular error growth (STEP #5 and the end of inner loop in Figure 2). If the criterion at the end of the inner loop is satisfied, the iterative differential corrections and nonlinear least squares method is used to search for the TLE that corresponds to the local minimum by using Eq. 4 (STEP #6 in Figure 2). The sample that satisfies the error criterion (10 km and smaller in present study) is retained, and the process is terminated (STEP #7 and the end of outer loop in Figure 2).

The differential corrections scheme iteratively improves the sampled osculating TLE to match the position vector at the epoch time. The stopping criterion is selected as $10^{-8}$ and smaller for the magnitude of position error in kilometers. Eq. 3 shows how a Jacobian matrix, which is computed by the finite-difference method, relates the change in the TLE to the change in the position and velocity vectors at the epoch.

[TABLE]

where $tle_{epoch}$ is the TLE parameters at the epoch, $f_{sgp4}$ is SGP4, $rv_{epoch}$ are the propagated position and velocity vectors at the epoch, $A_{t_{0}}$ is the Jacobian, and $\delta b_{t_{0}}$ is the residual at the epoch. Eq. 4 shows the normal equation that relates all of the changes in the position and velocity vectors associated with the observational data available to the change in the TLE at epoch time VC2008 . In this work, all observational data are assumed to have the same weight, therefore, $W$ is assumed to be the identity matrix. Eq. 3 is used for differential corrections in the inner loop and Eq. 4 is used for differential corrections with nonlinear least squares method in the outer loop (Figure 2).

[TABLE]

where $tle_{epoch}$ is the TLE parameters at the epoch, $f_{sgp4}$ is SGP4, $rv_{epoch}$ are the propagated position and velocity vectors at any time, $A_{t_{i}}$ is the Jacobian, and $\delta b_{t_{i}}$ is the residual at any time.

2.2 Supervised Machine Learning Models

2.2.1 Feature selection

In the present work, Equinoctial orbital elements and a parameter that we shall call instance (Table 1) that defines the sequence of the data with fixed time interval are selected as feature vectors (input data for the machine learning models). The above orbital elements are non-singular, and the magnitude of their values are bounded, thus well-suited for the training process. The Keplerian mean motion, in radians per hour, replaces semi-major axis of the Equinoctial elements. The feature vectors are generated from 8,206,374 TLEs which are all LEO objects that have semi-major axes of 8,378 km and smaller in the space catalog. TLEs are restricted to the LEO orbital regime to ensure that all space objects are subject to a significant orbital perturbation due to atmospheric drag.

Figure 3 outlines the feature and target vectors. The dimensions of the feature and target vectors are 7x50 and 7x1 respectively. The target vectors are published TLEs in the official space catalog111Available at http://www.space-track.com.. TLEs from the official space catalog are used for convenience because mappings for orbital regimes that are not populated by resident space objects are not desired. Therefore, TLEs are just used as initial conditions for SGP4 to generate associated time series orbital data. The feature vectors are generated by propagating the TLEs backward in time using SGP4 for 50 orbital periods and mapping each point to the associated Equinoctial orbital elements. Since TLEs are mostly computed from the history of orbital evolutions of the space objects which are obtained from past observations, the training and development test data include orbits that are propagated backward in time. The equinoctial orbital elements are selected as feature vectors because the Keplerian orbital elements have singularities and osculating TLEs are computationally expensive to obtain.

2.2.2 Boosting trees

An optimized distributed gradient boosting library called XGBoost (version 0.7) is used for this work222Available at https://github.com/dmlc/xgboost.. A gradient boosting tree model is trained for each element of a TLE, namely Bstar ( $B^{*}$ ), eccentricity ( $e$ ), inclination ( $i$ ), mean anomaly ( $m$ ), mean motion ( $n$ ), right ascension of the ascending node ( $\Omega$ ), and the argument of perigee ( $\omega$ ). The construction of individual gradient boosting trees for each TLE parameter is chosen because they cannot map to a multidimensional output for regression problems due to the architecture of the decision trees, whereas neural networks can. However, the above approach is expected to yield better function approximation because the complexity of the model is utilized to obtain one parameter only. Another disadvantage of XGboost is that it can not be efficiently used for incremental learning. All available data should be presented for constructing each tree. Therefore, the data should fit the memory to train the models. Due to the memory restriction (a workstation with 128 GB RAM and 64 AMD OpteronTM 6376 processors are used), the training data include TLEs from 01 January 2017 to 21 April 2018, while test data include TLEs from 02 January 2016 to 01 April 2016 to select the best performing machine learning models. The models are trained with 8,206,374 TLEs and tested with 1,278,900 TLEs. A subset of the test data (69,632 TLEs from 27 March 2016 to 01 April 2016) is also used to determine the accuracy of the TLEs predicted by the selected best machine learning models. The test data are chosen from another time period as stated above to enable the trained machine learning models to generalize the predictions. Table 2 shows the gradient boosting tree hyperparameters that are optimized for training XGBoost models. The learning rate controls the step size for the gradient-based optimizer. The number of estimators controls the number of trees. The maximum depth limits the depth of each tree which is used to increase the complexity of the model. The minimum child weight is required to avoid overfitting of the training data. A node is split as long as the resultant split leads to a positive reduction in the loss function (mean square error in this work). Gamma can control the split of nodes. The rest of the parameters in Table 2 are used to avoid overfitting by forcing the model to be less complex. The hyperparameters of the gradient boosting trees are tuned empirically by considering the bias-variance tradeoff to ensure the trained models have generalization capability (Table 3). The grid search methods are not preferred because the models are trained with large amount of data.

2.2.3 Deep neural networks

Since neural networks can be trained with additional data without being trained from scratch, they are more versatile compared to other methods such as decision trees. In this paper, a fully-connected neural network architecture (Keras-version 2.2.0 using Tensorflow backend-version 1.8.0) is chosen to approximate the TLE estimation scheme (Figure 4). The fully-connected neural network learns the mapping between the input and output by connecting all neurons available, and it has 133,801 trainable parameters (total number of weights and biases). It requires a 1D input vector; therefore, 7x50 input matrices are flattened into 1D input vectors with 350 elements (Figure 3). Unlike decision trees, neural networks are sensitive to the scaling of the input data; therefore, the input data are either standardized (Eq. 6) or normalized (Eq. 5) based on the resultant performance of the model. Table 4 shows the parameters that are used to standardize and normalize the input data. For the fully-connected machine learning models considered in this work, it is found that the standardization outperforms the normalization. It should be noted that the parameters in Table 4 are obtained from only the training data not to create a bias in the machine learning model. The memory restriction is not an issue for training neural networks due to their versatility. The same data excluding the ones with missing values (not a number (NaN) values due to numerical artifacts) that are used to train the gradient boosting trees are used to compare the performances of the two different machine learning methods. XGBoost can be trained with missing values whereas neural networks cannot; therefore, NaN values (0.005% of the total input data) are removed from the input data. Table 5 shows the hyperparameters of the fully-connected neural network that are optimized for neural network models. The Adam optimizer is chosen due to its computational efficiency for problems with large amount of data, and it computes the moving average of the gradients and squared gradients. The parameters of the Adam optimizer that control the decay rate of the moving averages KA2015 are $\beta_{1}$ and $\beta_{2}$ . The learning rate controls the step size for the Adam optimizer while the decay parameter decays the learning rate at each epoch. The hyperparameters of the fully-connected neural networks are tuned empirically by considering the bias-variance tradeoff to ensure the trained models have generalization capability (Table 6). The grid search methods are not preferred because the models are trained with large amount of data.

[TABLE]

where $x$ are the input data.

3 Results and Discussions

3.1 Monte-Carlo Approach

A novel approach to estimate a TLE from a given ephemeris without any initial estimate of the TLE using a Monte-Carlo method is introduced. Three particular cases with varying area-to-mass ratios, orbital characteristics and solar activity phases, where the standard differential corrections with the nonlinear least squares method VC2008 cannot generate a valid TLE, are chosen as test cases. Initial orbital parameters for the test cases are propagated forward in time using a three-degree-of-freedom (3-DOF) numerical orbit propagator over 1 day. The orbital evolution is defined in the Vehicle Velocity, Local Horizontal (VVLH) reference frame (Figure 5).The perturbations included are: air drag (Harris-Priester model), the Earth’s aspherical gravitational potential with order and degree 20 (Holmes-Featherstone model), solar radiation pressure (Lambertian sphere model), and Sun and Moon third-body gravitational perturbations. Parameters of the test cases used for the numerical propagation are given in Table 7. In addition, the statistical description for the performance of the proposed method in terms of the number of loops and root mean square error in the orbital evolution during both inner and outer loops are provided in Table 8. For each test case, 100 TLEs are computed using the proposed method.

Figure 6 shows the absolute errors in the orbital evolution between the orbit propagated by SGP4 using the Keplerian elements and the TLE obtained from the Combined Space Operations Center (CSpOC). The most significant error is in the along-track direction while the second largest error is in the cross-track direction for all test cases. The drastic divergence of orbital evolution leads to large corrections to the initial TLE estimate (Keplerian elements), and this results in divergence in the standard TLE estimation scheme VC2008 . The nonlinear least squares and differential corrections method minimize the mean square errors. The secular error growth results in TLEs with large errors. Therefore, an efficient algorithm that searches for bounded error evolution is required. Because a valid TLE should fit the whole orbit, an osculating TLE at the epoch time that fits the whole orbit is searched for using the Monte-Carlo method. The inner loop stopping criteria is selected as RMSE of 30 km (Figure 2). The outer loop is terminated when the standard TLE estimation scheme VC2008 computes a valid TLE (RMSE of 10 km and smaller for this work). Figure 7 shows the absolute relative distance over one day when the inner loop criterion is satisfied. Figure 8 shows the absolute relative distance when a valid TLE is computed. The above TLE satisfies both the inner and the outer loops simultaneously (Figure 2).

In conclusion, the present Monte-Carlo method can estimate a TLE without any initial estimate. Such capability is beneficial to space situational awareness (SSA) because TLEs are required for establishing first contact with the spacecraft during the Launch and Early Orbit Phase (LEOP). Most ground stations have propriety software that requires TLEs to track the satellites. Although the proposed method addresses the difficulty related to unavailability of the initial estimate, the inner loop is iterated 50 times on the average to find an initial estimate with desired precision. Therefore, the feasibility of using machine learning methods to generate initial estimates with the desired precision is investigated in the following sections.

3.2 Machine Learning Models

In this section, the performances of the gradient boosting trees and the fully-connected neural networks that are trained to map an orbital evolution to the associated TLE are presented using two different approaches. For the first performance test, the metric is chosen as the absolute difference between the TLEs that are predicted by machine learning models and the CSpOC TLEs. The first test is also used to determine the best combinations of machine learning models which is selected for the second test. For the second performance test, the metric is chosen as the absolute relative distance in the VVLH reference frame between the positions propagated by the estimated TLEs and the CSpOC TLEs.

3.2.1 $B^{*}$ machine learning model

The absolute residual plot and the scatter plot with regression of the predictions of the $B^{*}$ using machine learning models are presented in Figure 9. Both machine learning models have difficulties in determining the decision boundary that can accurately map the features to the target values, and this may be related to the modification of $B^{*}$ to account for perturbations DV2013 . There are outliers observed in the $B^{*}$ data that are kept (99.9% of $B^{*}$ values in the test data are between -0.05 and 0.05 (1/Earth radius)) which may affect the machine learning models. The poor performance of the models can be attributed to the modification of the $B^{*}$ value during the estimation process as the tree-based models are insensitive to the outliers in the data. Both methods output values close to the average of all target values ( $10^{-4}$ ). The 48% of all $B^{*}$ values in the test data are between 0.0001 and 0.0009 (1/Earth radius). The percentage of residuals smaller than 0.0001 is 28.2% for the gradient boosting model and 25.1% for neural network model. The regression plots (Figure 9) show that the performance of both machine learning models is similar for different values of $B^{*}$ . Therefore, the gradient boosting tree $B^{*}$ model is chosen as the best model irrespective of the $B^{*}$ values because the percentage of residuals smaller than 0.0001 is higher for the gradient boosting trees.

3.2.2 Mean motion machine learning model

The absolute residual plot and the scatter plot with regression of the predictions of mean motion ( $n$ ) using machine learning models are presented in Figure 10. The residuals smaller than 0.0001 (rad/min) are 86.2% of the test data for the gradient boosting tree model and 89.4% for fully-connected network model. The above criterion of 0.0001 (rad/min) is the RMSE of the predictions of the test data. The regression plot of the gradient boosting tree model in Figure 10 shows that the variance becomes larger for values smaller than 0.053 (rad/min) and bigger than 0.067 (rad/min), and the regression plot for the fully-connected neural network shows that there are outliers in the predictions. Table 9 presents the mean and variance of residuals for different mean motion values. The fully-connected neural network outperforms the gradient boosting trees regarding prediction accuracy for all values of mean motion. Therefore, the fully-connected neural network model is chosen as the best machine learning model for predicting mean motion. For the orbital evolution accuracy test at the end of the section, the fully-connected neural network model is used to predict mean motion unless the absolute difference between the prediction of neural network and the prediction of the gradient boosting trees are larger than 0.0002 (rad/min) for values of $n$ between 0.053 (rad/min) and 0.067 (rad/min) to account for outliers present in the predictions of the fully-connected neural network model (Figure 10). The above threshold value of 0.0002 (rad/min) is $\mu+3\sigma$ of the residuals for the fully-connected neural network for values of $n$ between 0.053 (rad/min) and 0.067 (rad/min).

3.2.3 Inclination machine learning model

The absolute residual plot and the scatter plot with regression of the predictions of inclination ( $i$ ) using machine learning models are presented in Figure 11. For the gradient boosting trees, the mean of the residuals is 0.002 radians, and the variance of the residuals is 1.6e-5 square radians. For the fully-connected neural network, the mean of the residuals is 0.008 radians, and the variance of the residuals is 1.9e-5 square radians. The gradient boosting tree model is chosen as the best machine learning model for predicting inclination. However, the regression plot for the gradient boosting trees shows that there are outliers in the predictions. Therefore, the gradient boosting tree model is used to predict inclination unless the absolute difference between the prediction of neural network and the prediction of the gradient boosting trees are larger than 0.014 (rad), which is $\mu+3\sigma$ of the residuals for the gradient boosting trees, for the orbital evolution accuracy test at the end of the section.

3.2.4 Right ascension of the ascending node machine learning model

The absolute residual plot and the scatter plot with regression of the predictions of right ascension of the ascending node ( $\Omega$ ) using machine learning models are presented in Figure 12. For the gradient boosting trees, the mean of the residuals is 0.0045 radians, and the variance of the residuals is 0.0029 square radians. For the fully-connected neural network, the mean of the residuals is 0.076 radians, and the variance of the residuals is 0.0115 square radians. The equinoctial element $Lv$ ( $\Omega+\omega+\nu$ ) is in radians in the features. This leads to the large variances in the residuals of the predictions of both methods because the machine learning models fail to recognize the cyclic behavior when any of the mean elements, namely $\Omega$ , $\omega$ , $\nu$ , switch values between $2\pi$ and [math] as the orbits evolve in time. The advancement in the values of the above mentioned mean elements between $2\pi$ and [math] occurs for the $0.08\%$ of the test data. This issue will be addressed in future studies by introducing additional features into the features, such as $\Omega$ , $\omega$ , $\nu$ , and applying data transformations. The gradient boosting tree model is chosen as the best machine learning model for predicting $\Omega$ because the mean of the residuals for its predictions is one order magnitude lower than that of the fully-connected neural network.

3.2.5 Argument of the perigee and mean anomaly machine learning models

The absolute residual plots and the scatter plots with regression of the predictions of the argument of perigee ( $\omega$ ) and the mean anomaly ( $m$ ) using machine learning models are presented in Figure 13 and Figure 14 respectively. Machine learning models for both mean elements are analyzed together due to their similar performances in the residuals of their predictions. Table 10 shows the mean and variance of the residuals of the predictions for the machine learning models. According to the Table 10, the gradient boosting tree model is chosen as the best machine learning model for predicting argument of perigee ( $\omega$ ), and the fully-connected neural network is chosen as the best machine learning model for predicting the mean anomaly ( $m$ ). However, both the argument of perigee and the mean anomaly machine learning models have relatively poor performance in determining the decision boundary that can map the orbital evolution to the mean elements accurately. In addition to the difficulty of capturing the cyclic behavior of angles close to [math] and $2\pi$ due to $Lv$ feature (as discussed for $\Omega$ above), all machine learning models learn a representation that has large variations in the residuals of its predictions, and this may indicate that the mean anomaly and the argument of perigee are modified to fit the perturbed orbit as it is the case for $B^{*}$ . In addition, there is no significant relationship between the eccentricity and the accuracy of the predictions of the argument of perigee (Figure 15). In Figure 15, the distributions of the eccentricities for the residuals of predictions of $\omega$ for values smaller than $0.001$ and larger than $0.1$ are presented. The main reason for the poor performance of the models for the parameters is the fact that the sum of these two parameters define the location of the object in the orbit. Therefore, the estimator (non-linear least squares with differential corrections method) changes each parameter arbitrarily to reduce the sums of the squared errors without changing the actual location of the object in the orbit because most LEO objects have small eccentricities, and this behavior is observed for all test cases in Section 3.1.

3.2.6 Eccentricity machine learning model

The absolute residual plot and the scatter plot with regression of the predictions of eccentricity ( $e$ ) using machine learning models are presented in Figure 16. For the gradient boosting trees, the mean of the residuals is 0.0008, and the variance of the residuals is 1.23e-5. For the fully-connected neural network, the mean of the residuals is 0.0003, and the variance of the residuals is 7.59e-8. The fully-connected neural network is chosen as the best machine learning model.

3.2.7 The performance evaluation of the selected best machine learning models

The performances of machine learning methods differ based on the features selected. Two different machine learning models are trained to predict each mean elements of a TLE, and the best performing model is chosen to be validated by the errors in the orbital evolution. The error in the orbital evolution is computed by propagating TLEs estimated by the selected best machine learning models and the associated CSpOC TLEs backward in time using SGP4 for 50 orbital periods (Figure 3). Table 11 summarizes the best performing models for each mean element that are discussed in detail above.

The performance of the machine learning models is evaluated by replacing the inner loop of the Monte-Carlo approach, which searches for an initial estimate with desired precision, with the selected best machine learning models. For all three test cases (Table 7), the orbital evolution (Figure 3) associated with the desired TLE is feed into the selected best machine learning models to obtain an initial estimate. The osculating Keplerian mean anomaly and argument of perigee are replaced with the counterparts estimated by the machine learning models because of the poor performance of the models for these two parameters.The selected best machine learning models are able to provide an initial estimate with desired precision only for the test case #1, which is a low area-to-mass ratio space object. For the other test cases, the estimated TLEs have RMSE values over 100 km, and this indicates that the precision that is required for mapping the orbital evolution to the associated initial estimate of the desired TLE should be higher. Figure 17 shows the absolute errors in the orbital evolution between the orbit propagated using SGP4 with TLEs estimated by the selected best machine learning models as compared to the TLE obtained from the CSpOC. Figure 18 shows the absolute relative errors between the numerically propagated orbits and the orbits propagated by SGP4 when the initial estimates that are computed by the machine learning models are improved by nonlinear least squares and differential corrections methods.

In conclusion, the present best machine learning models (selected based on their performance) that are trained with TLEs obtained from the official US space catalog have the potential to provide an initial estimate to estimate a TLE with desired precision. The inner loop (Figure 3), which searches for the global minimum by using brute force approach, in the Monte-Carlo TLE estimation method can be replaced by machine learning models because there is no secular error growth in the orbital evolution of TLEs predicted by machine learning models (test case #1). Therefore, the standard differential corrections with nonlinear least squares method VC2008 can converge to a local minimum, and the computation time can be reduced significantly.

4 Conclusions and Future Work

In this paper, a new Monte-Carlo TLE estimation method that does not require an initial estimate of the desired TLE is developed, and the feasibility of approximating the inverse mapping of publicly available SGP4 algorithm VC2006 for LEO objects using publicly available TLEs and state-of-the-art machine learning methods is investigated. The present Monte-Carlo TLE estimation method can estimate a TLE without any initial estimate from an orbital evolution with an RMSE value smaller than 1 km over 1 day for space objects with varying area-to-mass ratios and orbital characteristics. The selected best machine learning models are shown to provide an initial estimate with desired precision for the associated TLE of a low area-to-mass ratio object.

To the author’s knowledge, this work is the first effort to approximate a mapping between orbital evolution and the TLE parameters using machine learning. There are a couple of benefits of developing such mappings for SSA. First, the computational power that is required to estimate TLEs on a daily basis can be used for other important tasks. Second, the capability of reducing the dimension of orbital evolution down to mean elements using machine learning models provides valuable insights in utilizing them for orbit determination. The initial results of machine learning models that can predict TLEs are promising. The TLEs predicted by the machine learning model can determine the reference orbit with the desired precision. Additional input parameters, non-discontinuous transformations of cyclic parameters, and different machine learning architectures could enhance the capabilities of models. Developing different machine learning models for different orbital regimes in LEO can improve the current prediction accuracy of the models as well. For future studies, the feasibility of enhancing the predictions of the right ascension of the ascending node and the inclination, and the location of the space object within the reference trajectory using physics-embedded machine learning models will be investigated.

Conflict of interests

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Bibliography18

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1(1) D.A. Vallado, Fundamentals of astrodynamics and applications , 4th edn. (Microcosm Press, 2013)
2(2) D. Vallado, P. Crawford, R. Hujsak, T. Kelso, Revisiting spacetrack report# 3 p. 6753 (2006)
3(3) D. Vallado, P. Crawford, Sgp 4 orbit determination p. 6770 (2008)
4(4) E. Jochim, E. Gill, O. Montenbruck, M. Kirschner, Gps based onboard and onground orbit operations for small satellites, Acta Astronautica 39 (9-12), 917 (1996)
5(5) B. Lee, Norad tle conversion from osculating orbital element, Journal of Astronomy and Space Sciences 19 (4), 395 (2002)
6(6) O. Montenbruck, E. Gill, Real-time estimation of sgp 4 orbital elements from gps navigation data pp. 26–30 (2000)
7(7) S.T. Goh, K.S. Low, Real-time estimation of satellite’s two-line elements via positioning data pp. 1–7 (2018)
8(8) H. Bolandi, M. Ashtari Larki, S. Sedighy, M. Zeighami, M. Esmailzadeh, Estimation of simplified general perturbations model 4 orbital elements from global positioning system data by invasive weed optimization algorithm, Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering 229 (8), 1384 (2015)

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Two-Line Element Estimation Using Machine Learning

Abstract

Keywords:

1 Introduction

2 Methodology

2.1 Monte-Carlo approach

2.2 Supervised Machine Learning Models

2.2.1 Feature selection

2.2.2 Boosting trees

2.2.3 Deep neural networks

3 Results and Discussions

3.1 Monte-Carlo Approach

3.2 Machine Learning Models

3.2.1 B∗B^{*}B∗ machine learning model

3.2.2 Mean motion machine learning model

3.2.3 Inclination machine learning model

3.2.4 Right ascension of the ascending node machine learning model

3.2.5 Argument of the perigee and mean anomaly machine learning models

3.2.6 Eccentricity machine learning model

3.2.7 The performance evaluation of the selected best machine learning models

4 Conclusions and Future Work

Conflict of interests

3.2.1 $B^{*}$ machine learning model