Genetic Informed Trees (GIT*): Path Planning via Reinforced Genetic Programming Heuristics

Liding Zhang; Kuanqi Cai; Zhenshan Bing; Chaoqun Wang; and Alois Knoll

arXiv:2508.20871·cs.RO·August 29, 2025


TL;DR

This paper introduces GIT*, a path planning method that enhances heuristic functions by integrating environmental data and reinforced genetic programming, leading to improved efficiency and solution quality in complex spaces.

Contribution

It presents GIT*, a novel path planning approach that combines environmental data integration with reinforced genetic programming for superior heuristic guidance.

Findings

01. GIT* outperforms existing planners in high-dimensional problems.

02. Incorporating environmental data improves heuristic accuracy.

03. Reinforced genetic programming enhances computational efficiency.

Abstract

Optimal path planning involves finding a feasible state sequence between a start and a goal that optimizes an objective. This process relies on heuristic functions to guide the search direction. While a robust function can improve search efficiency and solution quality, current methods often overlook available environmental data and simplify the function structure due to the complexity of information relationships. This study introduces Genetic Informed Trees (GIT*), which improves upon Effort Informed Trees (EIT*) by integrating a wider array of environmental data, such as repulsive forces from obstacles and the dynamic importance of vertices, to refine heuristic functions for better guidance. Furthermore, we integrated reinforced genetic programming (RGP), which combines genetic programming with reward system feedback to mutate genotype-generative heuristic functions for GIT*. RGP…

Tables (3)

Table 1. TABLE I: Example performance results for randomly generated environment

Algorithm	$t_{init}^{\min}$	$t_{init}^{med}$	$t_{init}^{\max}$	$c_{init}^{\min}$	$c_{init}^{med}$	$c_{init}^{\max}$	$c_{final}^{\min}$	$c_{final}^{med}$	$c_{final}^{\max}$	Success
EIT*	0.19	$\infty$	$\infty$	2.5	$\infty$	$\infty$	2.5	$\infty$	$\infty$	0.48
$GIT^{*}_{\psi_{i}}$	0.16	0.39	$\infty$	2.34	5.05	$\infty$	2.33	3.53	$\infty$	0.72
Weights $w[m^{i}]$	1.0	3.5	0.5	1.0	2.5	1.0	1.0	2.5	1.0	3.0

Table 2. TABLE II: Parameter settings for the training

Parameter Name	Value or Description
Population Size	1500
Number of Generations	100
Selection Method	Tournament Selection
Crossover Rate	0.8
Mutation Rate	0.1
Maximum Tree Depth	4
Tournament Size	5
Crossover Type	Subtree Crossover
Mutation Type	Point Mutation

Table 3. TABLE III: Benchmarks evaluation comparison (Fig. 7)

	Adaptively Informed Trees (AIT*)			Effort Informed Trees (EIT*)			Genetic Informed Trees (GIT*)			$t_{init}^{med}$ improvement of GIT* vs. AIT* / EIT* (%)
Problem	$t_{init}^{med}$	$c_{init}^{med}$	$c_{final}^{med}$	$t_{init}^{med}$	$c_{init}^{med}$	$c_{final}^{med}$	$t_{init}^{med}$	$c_{init}^{med}$	$c_{final}^{med}$	
DW-$\mathbb{R}^{4}$	0.1299	1.9571	1.7151	0.0252	2.4051	1.3693	0.0201	2.0619	1.3634	84.53 / 20.23
DW-$\mathbb{R}^{8}$	0.1947	3.1492	2.6388	0.0357	4.0910	2.2892	0.0279	3.3791	2.3109	85.67 / 21.84
RR-$\mathbb{R}^{4}$	0.0853	1.7570	1.5282	0.0587	1.8392	1.4715	0.0472	1.6874	1.4595	44.67 / 19.59
RR-$\mathbb{R}^{8}$	1.1843	4.4697	4.3599	0.1889	4.6789	2.8588	0.1429	4.1716	2.8450	87.93 / 24.35
GE-$\mathbb{R}^{4}$	0.0191	1.2457	0.9900	0.0082	1.2909	0.9126	0.0064	1.2678	0.9083	66.49 / 21.95
GE-$\mathbb{R}^{8}$	0.3834	1.7854	1.6605	0.0941	1.4970	1.4086	0.0512	1.6634	1.3636	86.64 / 45.59
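The last column of Table III can be read as the relative reduction in median initial-solution time of GIT* against AIT* and against EIT*. That reading is an assumption, but it reproduces the tabulated percentages up to rounding of the listed times. A minimal sketch, with `improvement_percent` as a hypothetical helper name:

```python
def improvement_percent(baseline: float, candidate: float) -> float:
    """Relative reduction of `candidate` with respect to `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Median initial-solution times from the first row of Table III (DW, R^4).
t_ait, t_eit, t_git = 0.1299, 0.0252, 0.0201

vs_ait = improvement_percent(t_ait, t_git)  # GIT* against AIT*
vs_eit = improvement_percent(t_eit, t_git)  # GIT* against EIT*
print(f"{vs_ait:.2f} / {vs_eit:.2f}")
```

Because the times in the table are already rounded to four decimals, the recomputed values can differ from the printed column in the last digit.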

Equations

$\sigma^{*}=\arg\min_{\sigma\in\Sigma}\left\{s(\sigma)\;\middle|\;\sigma(0)=\mathbf{x}_{\text{start}},\ \sigma(1)\in X_{\text{goal}},\ \forall t\in[0,1],\ \sigma(t)\in X_{\text{free}}\right\},$

$\textit{Fitness}(\psi)=\sum_{(\rho_{\psi},y)\in\mathcal{D}}\left(y-\hat{y}(\rho_{\psi},\psi)\right)^{2},$

$\psi^{*}=\arg\min_{\psi}\textit{Fitness}(\psi).$

$\psi^{*}$

$\phi(\psi_{i},D_{\text{benchmark}})$

$\alpha=\frac{v^{i}_{GIT^{*}_{\psi_{i}}}-v^{i}_{EIT^{*}}}{v^{i}_{EIT^{*}}},$

$s_{\text{base}}[m^{i}]=\delta+\delta\times\alpha,$

$s_{\text{total}}^{\theta}=\sum_{i=1}^{n}\left(s_{\text{base}}[m^{i}]+s_{\text{bonus}}[m^{i}]\right)\cdot w[m^{i}],$

$\rho_{\psi}=\overline{s_{\text{total}}}+c_{1}\sigma^{2}_{s_{\text{total}}}+c_{2}|\psi|.$

$\psi^{*}$

$GIT^{*}$

$\mathrm{key}_{\mathcal{R}}^{\mathrm{GIT}^{*}}(\mathbf{x}_{s},\mathbf{x}_{t}):=\left(\left(g(\mathbf{x}_{t})-\pi\right)\times\frac{\log\left(1+\left|U[\mathbf{x}_{t}]-U[\mathbf{x}_{s}]\right|\right)}{1+w_{\text{dyn}}[\mathbf{x}_{t}]},\ \bar{e}(\mathbf{x}_{s})+\bar{e}(\mathbf{x}_{s},\mathbf{x}_{t})\times\log\left(\bar{d}(\mathbf{x}_{t})\right)\right).$

$F_{\text{rep}}(q):=\begin{cases}-\frac{k_{r}\cdot q\cdot q_{\text{obs}}}{r^{2}}&\text{if }r\leq\rho_{0}\\ 0&\text{otherwise}\end{cases},$

$U_{\text{rep}}(q):=\begin{cases}-\frac{k_{r}\cdot q\cdot q_{\text{obs}}}{r}&\text{if }r\leq\rho_{0}\\ 0&\text{otherwise}\end{cases},$

$F_{\text{attr}}(q):=\frac{k_{a}\cdot q\cdot q_{\text{goal}}}{r^{2}},$

$U_{\text{attr}}(q):=\frac{k_{a}\cdot q\cdot q_{\text{goal}}}{r}.$

$w_{\text{dyn}}[\mathbf{x}_{t}]:=\sum_{\mathbf{x}_{\text{neighbor}}\in X_{\text{neighbors}}(\mathbf{x}_{t})}I\left(\mathbf{x}_{\text{neighbor}}\in\mathcal{T_{R}}(X_{\text{neighbors}}(\mathbf{x}_{t}))\right),$

$\varepsilon_{\text{infl}}^{*}=1.0+\frac{\log(D(\theta))+D(\theta)}{N_{\text{samples}}+\log(N_{\text{samples}})+1},$

$\varepsilon_{\text{trunc}}^{*}=1.0+\frac{3\pi}{N_{\text{samples}}},$

$P(t+1):=\text{select}(\text{crossover}(\text{mutate}(P(t)))),$

$\lim_{t\to\infty}P(Z_{t}=\psi^{*}):=1,$

$\lim_{n\to\infty}P\left(\{V_{\mathcal{F}}\cup V_{\mathcal{R}}\}\cap X_{\text{goal}}\neq\emptyset\right)=1,$

$r(q)>\eta\left(2\left(1+\frac{1}{d}\right)\left(\frac{\lambda(X_{\hat{f}})}{\zeta_{d}}\right)\left(\frac{\log(q)}{q}\right)\right)^{\frac{1}{d}},$

$P\left(\limsup_{q\to\infty}\min_{\sigma\in\Sigma_{q}}\{c(\sigma)\}=c^{*}\right)=1,$

Full text

Genetic Informed Trees (GIT*): Path Planning via Reinforced Genetic Programming Heuristics

Liding Zhang (1), Kuanqi Cai (1), Zhenshan Bing (1), Chaoqun Wang (2), Alois Knoll (1)

(1) L. Zhang, K. Cai, Z. Bing, and A. Knoll are with the Department of Informatics, Technical University of Munich, Germany. [email protected]

(2) C. Wang is with the School of Control Science and Engineering, Shandong University, Shandong, China.

(Corresponding authors: Zhenshan Bing; Kuanqi Cai.)

Abstract

Optimal path planning involves finding a feasible state sequence between a start and a goal that optimizes an objective. This process relies on heuristic functions to guide the search direction. While a robust function can improve search efficiency and solution quality, current methods often overlook available environmental data and simplify the function structure due to the complexity of information relationships. This study introduces Genetic Informed Trees (GIT*), which improves upon Effort Informed Trees (EIT*) by integrating a wider array of environmental data, such as repulsive forces from obstacles and the dynamic importance of vertices, to refine heuristic functions for better guidance. Furthermore, we integrated reinforced genetic programming (RGP), which combines genetic programming with reward system feedback to mutate genotype-generative heuristic functions for GIT*. RGP leverages a multitude of data types, thereby improving computational efficiency and solution quality within a set timeframe. Comparative analyses demonstrate that GIT* surpasses existing single-query, sampling-based planners in problems ranging from $\mathbb{R}^{4}$ to $\mathbb{R}^{16}$ and was tested on a real-world mobile manipulation task. A video showcasing our experimental results is available at https://youtu.be/URjXbc_BiYg.

Index Terms:

Genetic algorithm, reinforced genetic programming, generative heuristics, optimal path planning.

I Introduction

Path planning is a fundamental challenge in robotic automation, involving the determination of a sequence of valid states that guide a robot from a starting point to a desired goal while avoiding obstacles [1]. Many algorithms have been proposed to address this problem, such as the A* algorithm [2], Artificial Potential Field (APF) algorithm [3], and sampling-based algorithms [4]. The A* algorithm’s performance declines with higher dimensionality, while the APF algorithm often converges to local minima. Sampling-based algorithms have gained popularity due to their efficient exploration of the state space [5]. However, they often require significant time to find the optimal solution. In multi-dimensional environments, such as autonomous vehicles and robot manipulators, it is essential to compute an efficient path to conserve power [6].

The motivation for this paper is to improve the convergence rate and find successful solutions faster with lower initial solution costs based on the genetic-based generation of heuristics. Sampling-based algorithms like Rapidly-exploring Random Trees (RRT) [7], Probabilistic Roadmaps (PRM) [8], and variant algorithms of RRTs [9] have been widely used for recent path planning work and have demonstrated effectiveness in practical applications. However, these algorithms’ performance fluctuates greatly in different environments. On the other hand, in optimization algorithm research, a combined method known as Reinforced Genetic Programming (RGP) [10] is proposed. We introduced genotype-generative heuristic (G-heuristic) functions based on RGP for optimal edge evaluation, incorporating a fitness reward function to facilitate autonomous learning and adjustment of exploration strategies based on environmental feedback. This approach integrates the Genetic Algorithm (GA) [11] to assess bio-inspired chromosome behavior (e.g., crossover, mutation, reproduction) for integration with sampling-based planners. However, the G-heuristic cannot be directly applied to robot path planning because it does not consider environmental constraints (e.g., obstacle avoidance) or robustness across various scenarios. Therefore, the G-heuristic must be trained across different benchmarking datasets using reward feedback for robustness.

Inspired by RGP technology, this paper presents the Genetic Informed Trees (GIT*) algorithm, which generates a heuristic function using problem-specific information via RGP. This heuristic enhances efficiency by minimizing expanded vertices. GIT* uses invalid samples within obstacles and start/goal points to create an APF, incorporating obstacle shapes and locations, and tracks sample visit frequency to account for the dynamic importance of states. The G-heuristics represent a symbolic regression problem tackled by RGP. It involves evolving nonlinear expressions to refine the heuristic. As shown in Fig. 2, G-heuristic enables GIT* to find the initial solution quickly and then expand. GIT* incorporates additional graph search techniques, such as truncation and inflation, to balance exploitation and exploration, dynamically modified using RGP. GIT* has shown improvements over state-of-the-art (SOTA) methods in time to find the initial solution, initial solution quality, and final solution quality in both generalized simulation benchmarks and real-world experiments.

The contributions of this paper are summarized as follows:

1. An efficient optimal genotype-generative heuristic function based on reinforced genetic programming, trained with a dataset from the random problem domain.

2. A novel sampling-based path planning algorithm, GIT*, which integrates the trained genotype-generative heuristic function to rapidly obtain high-quality solutions.

3. A demonstration of the effectiveness of GIT* across environments of various dimensions and across optimization objectives.

II Related work

Heuristic functions optimize path planning by estimating goal-state costs, which is crucial across multiple dimensions. Informed planners with heuristics outperform their uninformed counterparts [12]. Effective heuristics should be accurate and efficient, yet balancing these traits can be challenging [13].

RRT-Connect [14] extends the RRT framework by growing two trees: one from the start state and the other from the goal, using heuristic-guided planning to accelerate path convergence. However, RRT-Connect lacks asymptotic optimality and does not improve solution quality with more computation time [15]. Cost heuristics in tree growth are demonstrated by Heuristically-guided RRT (hRRT) [16] and Generalized Bidirectional RRT (GBRRT) [17]. hRRT uses a priori heuristics for exploration within RRT’s Voronoi regions, while GBRRT, a bidirectional RRT variant, employs reverse tree-computed heuristics to guide the forward tree. However, these algorithms do not provide bounds on solution quality [18].

To overcome this limitation, some planners combine graph-based and sampling-based approaches. Batch Informed Trees (BIT*) [19] uses A* on a random geometric graph (RGG) formed by random samples, improving approximation as samples increase. While BIT* efficiently refines its search, it still has limitations in using problem-specific information. Advanced BIT* (ABIT*) [20] enhances BIT* by introducing inflated and truncated factors, balancing the exploration and exploitation. Adaptively Informed Trees (AIT*) [13] and Effort Informed Trees (EIT*) [13] further enhance efficiency with bidirectional search strategies. EIT* uses adaptive sparse collision checks, reducing expensive collision detections. The forward and reverse trees inform each other, sharing complementary information to optimize the search [9]. However, EIT* did not leverage the information from the invalid sample/nearest neighbor in the planning domain. GIT* integrates APF and dynamic importance for further guidance.

II-A Genetic-based Path Planning Method

Genetic sampling-based algorithms utilize genetic operations like crossover and mutation to generate candidate solutions in the problem space. Hybridizing-RRT [21] uses a hybrid path generation scheme that combines RRT with an island-parallel genetic algorithm (GA) to efficiently find $G^{3}$-continuous $\eta^{3}$-spline paths that optimize path length and curvature. This approach leverages RRT injections to maintain genetic diversity and prevent premature convergence in complex map scenarios. Genetic-RRT [22] uses GA to optimize paths planned by RRTs. This approach retains multiple optimal solutions, increasing the likelihood of finding an asymptotically optimal path with more iterations. However, genetic-based algorithms typically use GA only to optimize path length, often overlooking edge evaluation during exploration, which makes it difficult to converge rapidly to an initial solution. Our method enhances search efficiency by employing RGP to create G-heuristics.

II-B Applications of Symbolic Regression

Symbolic regression, a popular application of genetic algorithms (GA) [23], discovers mathematical expressions that accurately represent datasets without presupposing a specific mathematical form. Unlike traditional regression models that require predefined functional relationships, symbolic regression explores all possible expressions, uncovering complex data relationships, nonlinear interactions, and dynamic patterns [24]. Inspired by symbolic regression, our work incorporates genetic programming to generate heuristic functions within the GIT* algorithm for path planning. This integration allows GIT* to leverage a broader range of data, improving computational efficiency and solution quality.

The Open Motion Planning Library (OMPL) [25] is commonly used in benchmarking motion planning algorithms. It provides a comprehensive framework and tools for researchers to evaluate algorithms. Genetic Informed Trees (GIT*) is integrated into the OMPL framework, the Planner-Arena benchmark database [26], and Planner Developer Tools (PDT) [27].

III Problem Formulation

We define the optimal planning problem according to the definition provided in [4] and consider the symbolic regression problem defined in [28] as a tool to address optimal planning.

Problem Definition 1 (Optimal Planning): Consider a path planning problem with an $n$-dimensional state space $X\subseteq\mathbb{R}^{n}$. Let $X_{\text{obs}}\subset X$ represent states in collision with obstacles, and $X_{\text{free}}=cl(X\setminus X_{\text{obs}})$ denote the resulting permissible states, where $cl(\cdot)$ represents the closure of a set. The initial/start state is denoted by $\mathbf{x}_{\text{start}}\in X_{\text{free}}$, and the set of desired final/goal states is $X_{\text{goal}}\subset X_{\text{free}}$. A sequence of states $\sigma:[0,1]\mapsto X$ forms a continuous map (i.e., a collision-free, feasible path), and $\Sigma$ represents the set of all nontrivial paths.

The optimal solution, represented as the queue vector $\sigma^{*}$ , corresponds to the path that minimizes a selected scalar cost function $s:\Sigma\mapsto\mathbb{R}_{\geq 0}$ . This path connects the initial state $\mathbf{x}_{\text{start}}$ to any goal state $\mathbf{x}_{\text{goal}}\in X_{\text{goal}}$ through the free space:

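$$\sigma^{*}=\arg\min_{\sigma\in\Sigma}\left\{s(\sigma)\;\middle|\;\sigma(0)=\mathbf{x}_{\text{start}},\ \sigma(1)\in X_{\text{goal}},\ \forall t\in[0,1],\ \sigma(t)\in X_{\text{free}}\right\},$$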

where $\mathbb{R}_{\geq 0}$ denotes non-negative real numbers. The cost of the optimal path is $s^{*}$ , and $t$ is the timestep of the exploration.

Considering a discrete set of states, $X_{\text{samples}}\subset X$, as a graph whose edges are determined algorithmically by a transition function, we can describe its properties using the probabilistic model of implicit dense RGGs when these states are randomly sampled, i.e., $X_{\text{samples}}=\{\mathbf{x}\sim\mathcal{U}(X)\}$, as discussed in [29].

The characteristics of anytime, almost-surely asymptotically optimal sampling-based planners, together with their definition, are provided in [12].

Problem Definition 2 (Symbolic Regression): Symbolic regression aims to find a mathematical expression that best fits a given dataset. The process involves searching the space of mathematical expressions to identify the one that minimizes the error between the predicted output $\hat{y}$ and the actual output $y$ over a dataset $\mathcal{D}$ [30]. This can be formulated as an optimization problem where the objective is to minimize the sum of squared errors, represented by the fitness function.

The fitness function, $\textit{Fitness}(\cdot)$ , quantifies the error between the predicted and actual outputs. It is defined as:

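$$\textit{Fitness}(\psi)=\sum_{(\rho_{\psi},y)\in\mathcal{D}}\left(y-\hat{y}(\rho_{\psi},\psi)\right)^{2},$$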

where $\psi$ represents the symbolic expression, $\rho_{\psi}$ denotes the input fitness value, $\hat{y}(\rho_{\psi},\psi)$ is the predicted output generated by the symbolic expression $\psi$ , and $y$ is the actual output in the dataset $\mathcal{D}$ . The goal is to find the expression $\psi^{*}$ that minimizes $\textit{Fitness}(\psi)$ , thus minimizing the sum of squared errors over the dataset. This optimization problem can be expressed as:

$$\psi^{*}=\arg\min_{\psi}\textit{Fitness}(\psi).$$

In this context, the fitness function measures how well a given symbolic expression $\psi$ fits the dataset $\mathcal{D}$ . By minimizing the fitness function, we aim to find the symbolic expression that best fits the data, as discussed in [28].
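To make Problem Definition 2 concrete, the following is a minimal sketch of a sum-of-squared-errors fitness and of picking the minimizing expression from a small candidate set. The dataset, the candidate expressions, and the names `sse_fitness` and `candidates` are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple


def sse_fitness(expr: Callable[[float], float],
                dataset: Sequence[Tuple[float, float]]) -> float:
    """Sum of squared errors between predictions expr(x) and targets y."""
    return sum((y - expr(x)) ** 2 for x, y in dataset)


# Hypothetical dataset generated by y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

# Hypothetical candidate symbolic expressions.
candidates: Dict[str, Callable[[float], float]] = {
    "2x+1": lambda x: 2.0 * x + 1.0,
    "x^2": lambda x: x * x,
    "3x": lambda x: 3.0 * x,
}

# The best expression is the one with the lowest fitness value.
best_name = min(candidates, key=lambda name: sse_fitness(candidates[name], data))
print(best_name, sse_fitness(candidates[best_name], data))
```

A real symbolic regression run searches a far larger expression space than three fixed candidates; the sketch only shows what is being minimized.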

IV Algorithm

This section explains how to use RGP to learn heuristic functions from the benchmark dataset. The learned G-heuristics are then applied in GIT* to achieve fast and high-quality path planning. Finally, we prove that GIT* guarantees probabilistic completeness and asymptotic optimality.

IV-A Notation

The state space of the planning problem is denoted by $X\subseteq\mathbb{R}^{n}$ , where $n\in\mathbb{N}$ . The start point is represented by $\mathbf{x}_{\text{start}}\in X$ , and the goals are denoted by $X_{\text{goal}}\subset X$ . The sampled states are denoted by $X_{\text{sampled}}$ . The forward and reverse search trees are represented by $\mathcal{T_{F}}=(V_{\mathcal{F}},E_{\mathcal{F}})$ and $\mathcal{T_{R}}=(V_{\mathcal{R}},E_{\mathcal{R}})$ , respectively. The vertices in these trees, denoted by $V_{\mathcal{F}}$ and $V_{\mathcal{R}}$ , correspond to valid states. The edges in the forward tree, $E_{\mathcal{F}}\subseteq V_{\mathcal{F}}\times V_{\mathcal{F}}$ , represent valid connections between states, while the edges in the reverse tree, $E_{\mathcal{R}}\subseteq V_{\mathcal{R}}\times V_{\mathcal{R}}$ , may traverse invalid regions of the problem domain. An edge comprises a source state, $\mathbf{x}_{s}$ , and a target state, $\mathbf{x}_{t}$ , denoted as $(\mathbf{x}_{s},\mathbf{x}_{t})$ . The true connection cost between two states in configuration space ( $\mathcal{C}$ -space) is represented by the function $c:X\times X\rightarrow[0,\infty)$ .

Let $A$ be a set and let $B,C$ be subsets of $A$ . The notation $B\stackrel{{\scriptstyle+}}{{\leftarrow}}C$ is used to denote $B\leftarrow B\cup C$ and $B\stackrel{{\scriptstyle-}}{{\leftarrow}}C$ is used to denote $B\leftarrow B\setminus C$ .

GIT*-specific Notation: Let $\Theta$ be the space of all path planning problems and $\Xi$ be the space of all path planning algorithms. The dataset consisting of $k$ path planning problems is represented as $D_{\text{benchmark}}^{k}=\{\theta_{1},\theta_{2},\ldots,\theta_{k}\}$. The function $\Phi:\Xi\times\Theta\to\{(m_{1},v_{1}),(m_{2},v_{2}),\ldots,(m_{k},v_{k})\}$ quantifies the expected performance of running an algorithm $\xi\in\Xi$ on a path planning problem $\theta\in\Theta$ one hundred times, with the performance measured by $k$ distinct indicators. The elements $(m_{i},v_{i})$ belong to a set $M\times\mathcal{V}$, where $M$ is the set of all possible metrics and $\mathcal{V}$ is the set of all possible values.

In the genetic programming process, an individual is denoted as $\psi\in\mathcal{E}$ . The individual corresponding to the heuristic function of EIT* is defined as $\psi_{EIT^{*}}$ . The individuals generated in the same iteration form a population $\mathcal{P}$ , with size denoted as $\mathcal{O}$ . The probabilities of mutation and crossover, the two types of genetic operations, are denoted as $p_{m}$ and $p_{c}$ respectively. We define the fitness loss function $\phi:\mathcal{E}\to[0,\infty)$ , which quantitatively evaluates the performance of individuals on the dataset $D_{\text{benchmark}}^{k}$ . The evaluated fitness value of an individual is denoted as $\rho_{\psi}:=\phi(\psi)$ . The algorithm obtained by substituting the heuristic function in EIT* with the individual $\psi_{i}$ is denoted by $GIT^{*}_{\psi_{i}}$ . The reward function to assess the improvement of algorithm performance is denoted as $\chi:M\times\mathcal{V}\rightarrow[0,\infty)$ . The function $U:X\to[0,\infty)$ provides the magnitude of potential energy of a state in APF.

IV-B Reinforced Genetic Programming (RGP)

This subsection introduces RGP and its adaptation to improve the heuristic function in sampling-based path planning. RGP uses a reward function to evaluate candidate models on unlabeled data, enabling model evolution.

As shown in Fig. 3, RGP retains the iterative evolutionary process of traditional genetic programming (GP). Initially, a primitive set is established to generate individuals and populations in the evolutionary cycle. This set includes the essential components for generating individuals. An algorithm outlines the rules for assembling individuals from these components. Multiple individuals created using the primitive set form a population of candidate solutions. Each individual $\psi_{i}$ represents a heuristic function and corresponds to a new algorithm $GIT^{*}_{\psi_{i}}$. The performance of this new algorithm is assessed using a Reinforced Fitness Evaluation Function, which compares the fitness of $GIT^{*}_{\psi_{i}}$ with the baseline (EIT*) algorithm.

Based on fitness values, exceptional individuals from the previous generation are chosen for the next, preserving superior genetic segments and removing inferior ones. This iterative process involves selecting parents, performing crossover and mutation to introduce new genetic segments, and evaluating the new population’s fitness. Crossover mixes genetic material between parents, creating offspring with diverse traits, while mutation introduces random changes for unique variations. The process continues iteratively until a termination condition, such as a specific number of generations or satisfactory fitness, is met. The best-performing individual $\psi^{*}$ is then selected as the genotype-generative heuristic function for GIT*. The pseudocode for this process is illustrated in Alg. 1.
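The evolutionary cycle described above can be sketched as a small genetic programming loop. This is a toy illustration under stated assumptions, not the paper's Alg. 1: expression trees are nested tuples over a hypothetical primitive set (`+`, `-`, `*`, `x`, constants), the fitness is a plain regression error rather than a planner benchmark, the best individual is carried over unchanged (elitism), and the population and generation counts are scaled down from Table II. Tournament size 5, crossover rate 0.8, mutation rate 0.1, maximum depth 4, subtree crossover, and point mutation follow Table II.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]  # hypothetical primitive set


def random_tree(rng, depth):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))


def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0


def paths(tree, path=()):
    """Yield the index path of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)


def subtree_crossover(rng, a, b, max_depth):
    """Graft a random subtree of b into a random position of a."""
    child = replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))
    return child if depth(child) <= max_depth else a


def point_mutation(rng, tree):
    """Swap one node for another primitive of the same arity."""
    path = rng.choice(list(paths(tree)))
    node = get(tree, path)
    if isinstance(node, tuple):
        return replace(tree, path, (rng.choice(list(OPS)), node[1], node[2]))
    return replace(tree, path, rng.choice(TERMINALS))


def evolve(fitness, pop_size=60, generations=20, p_c=0.8, p_m=0.1,
           max_depth=4, tournament_size=5, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, max_depth) for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        offspring = [best]  # elitism: an assumption, not stated in the text
        while len(offspring) < pop_size:
            child = min(rng.sample(population, tournament_size), key=fitness)
            if rng.random() < p_c:
                mate = min(rng.sample(population, tournament_size), key=fitness)
                child = subtree_crossover(rng, child, mate, max_depth)
            if rng.random() < p_m:
                child = point_mutation(rng, child)
            offspring.append(child)
        population = offspring
        best = min(population, key=fitness)
    return best


# Toy fitness: squared error against y = x^2 + 1 (lower is better).
samples = [(x, x * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]


def sse(tree):
    return sum((y - evaluate(tree, x)) ** 2 for x, y in samples)


best = evolve(sse)
print(best, sse(best))
```

In RGP the call to `sse` would be replaced by the reinforced fitness evaluation, which runs the planner with the candidate heuristic on the benchmark and scores it against EIT*.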

Unlike traditional GP, RGP employs a reward function to assess the fitness of individuals, known as the Reinforced Fitness Evaluation Function. In traditional GP, the dataset $D$ comprises input data $x_{i}\in X$ and corresponding label data $l_{i}\in L$ . The objective is for each individual’s model, $\psi_{i}$ , to simulate the mapping from inputs to labels, $\phi:X\rightarrow L$ , minimizing the discrepancy between predicted outputs and actual labels to optimize model performance.

However, in our path planning problem, only the environment and problem description are provided as input data without any labels. We adopted an incentive-based approach to evaluate an individual’s fitness using our unlabeled dataset $D_{\text{benchmark}}$ . The objective is to identify an individual $\psi^{*}$ from the set $\mathcal{E}$ to replace the heuristic of EIT* and maximize the performance improvement over the EIT* baseline, as measured by the loss function $\phi$ . This is intended to optimize the effectiveness of the algorithm by adjusting its G-heuristic:

$$\psi^{*}=\arg\min_{\psi_{i}\in\mathcal{E}}\phi(\psi_{i},D_{\text{benchmark}}).$$

For problem description input $\theta_{i}$ , we use the existing algorithm EIT* performance as a control group, assessed over a set number of trials. We then compare this with the performance of a new algorithm $GIT^{*}_{\psi_{i}}$ , generated by replacing EIT*’s heuristic function with individual $\psi_{i}$ , tested under identical conditions. If $GIT^{*}_{\psi_{i}}$ outperforms EIT* on any performance metric, the reward function $\chi$ decreases the fitness score proportionally to the degree of improvement. Conversely, if $GIT^{*}_{\psi_{i}}$ performs worse than EIT* on any metric, $\chi$ increases the fitness score accordingly. This method ensures that a lower fitness score indicates the superior performance of an individual compared to EIT* within the dataset $D_{\text{benchmark}}$ .

To illustrate how the fitness of an individual $\rho_{\psi_{i}}$ is assessed, consider the following example. Table I presents the performance results of EIT* and $GIT^{*}_{\psi_{i}}$ . These two algorithms were tested 100 times on the Random Rectangle problems across different dimensions, with time limits for each run (unsuccessful runs were considered as infinite costs). Ten critical metrics were evaluated, reflecting the algorithm’s performance in terms of time to find the initial solution, the cost of the initial solution, the cost of the optimal solution within the time limit, and the final success rate over 100 runs of finding solutions.

When assessing the fitness $\rho$ of an individual $\psi_{i}$ , these metrics must be taken into consideration, and the corresponding weights for each metric should be set according to the specific application context. Below, we present the reward function system and rules used in our subsequent experiments:

1) Initial score: The initial score for an individual is 800, ensuring the final computed fitness is greater than 0.

2) Weights of metrics: To determine the specific weights $w[m^{i}]$ for each metric $m^{i}$ , the weighting depends on the specific application scenario and requirements. The weights for the metrics are provided in the weights row of Table I.

3) Base score for each metric: For each metric $m^{i}$, a base score $s_{\text{base}}$ is assigned based on whether $GIT^{*}_{\psi_{i}}$ outperforms $EIT^{*}$ and on the magnitude of the difference. First, it is assessed whether the value $v^{i}_{GIT^{*}_{\psi_{i}}}$ achieved by $GIT^{*}_{\psi_{i}}$ outperforms the value $v^{i}_{EIT^{*}}$ achieved by $EIT^{*}$. If $GIT^{*}_{\psi_{i}}$ is superior, a fixed score $\delta$ is subtracted; otherwise, it is added. To quantify the degree of superiority, this score is scaled by a coefficient $\alpha$, calculated as the difference between $v^{i}_{GIT^{*}_{\psi_{i}}}$ and $v^{i}_{EIT^{*}}$ relative to $v^{i}_{EIT^{*}}$:

$$\alpha=\frac{v^{i}_{GIT^{*}_{\psi_{i}}}-v^{i}_{EIT^{*}}}{v^{i}_{EIT^{*}}},$$

The base score for the metric $m^{i}$ is then:

$$s_{\text{base}}[m^{i}]=\delta+\delta\times\alpha,$$

4) Handling infinity as a special case: Some metrics may be infinite if solutions are not found in time. If both EIT* and $GIT^{*}_{\psi_{i}}$ record infinity for a metric, $s_{\text{base}}$ is set to 0. If only one does, $s_{\text{base}}$ is $2\times\delta$ .

5) Bonus for significant success rate enhancement: Given the importance of the success rate, substantial differences between $GIT^{*}_{\psi_{i}}$ and EIT* in this metric should impact the overall fitness evaluation. If the difference $v^{\text{success}}_{GIT^{*}_{\psi_{i}}}-v^{\text{success}}_{EIT^{*}}$ exceeds 5% but is less than 15%, a bonus $s_{\text{bonus}}[m^{\text{success}}]=\delta$ is applied. For differences exceeding 15%, $s_{\text{bonus}}[m^{\text{success}}]=2\times\delta$ .

6) Calculation of total score: The total score for a run on path planning problem $\theta$ is equal to the sum of all metrics' base scores and bonuses, each multiplied by its respective weight. The total score is expressed as:

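$$s_{\text{total}}^{\theta}=\sum_{i=1}^{n}\left(s_{\text{base}}[m^{i}]+s_{\text{bonus}}[m^{i}]\right)\cdot w[m^{i}],$$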
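Rules 1) to 6) can be exercised on the Table I values with a short sketch. Several details below are assumptions made only for illustration: $\delta=10$ (the text gives no value), a sign convention in which contributions are negative when $GIT^{*}_{\psi_{i}}$ is better and positive when it is worse, magnitude $\delta+\delta|\alpha|$, a zero contribution for exact ties, the $2\delta$ infinity case following the same sign convention, the bonus being applied only to improvements, and success rates expressed as fractions.

```python
import math

DELTA = 10.0           # hypothetical; the text does not state delta
INITIAL_SCORE = 800.0  # rule 1
INF = math.inf


def base_score(v_git, v_eit, higher_is_better=False, delta=DELTA):
    """Signed base score for one metric (negative means GIT* is better)."""
    git_inf, eit_inf = math.isinf(v_git), math.isinf(v_eit)
    if git_inf and eit_inf:      # rule 4: both failed
        return 0.0
    if git_inf or eit_inf:       # rule 4: only one failed
        return 2.0 * delta if git_inf else -2.0 * delta
    if v_git == v_eit:           # tie: assumed to contribute nothing
        return 0.0
    alpha = (v_git - v_eit) / v_eit  # assumes v_eit != 0
    git_better = v_git > v_eit if higher_is_better else v_git < v_eit
    magnitude = delta + delta * abs(alpha)
    return -magnitude if git_better else magnitude


def success_bonus(v_git, v_eit, delta=DELTA):
    """Rule 5: extra reward for a large success-rate gain (assumed sign)."""
    gain = v_git - v_eit
    if gain > 0.15:
        return -2.0 * delta
    if gain > 0.05:
        return -delta
    return 0.0


def total_score(git, eit, weights):
    """Rule 6: weighted sum over metrics; the last metric is the success rate."""
    total = 0.0
    last = len(weights) - 1
    for i, (g, e, w) in enumerate(zip(git, eit, weights)):
        is_success = i == last
        bonus = success_bonus(g, e) if is_success else 0.0
        total += (base_score(g, e, higher_is_better=is_success) + bonus) * w
    return total


# Rows of Table I.
eit = [0.19, INF, INF, 2.5, INF, INF, 2.5, INF, INF, 0.48]
git = [0.16, 0.39, INF, 2.34, 5.05, INF, 2.33, 3.53, INF, 0.72]
weights = [1.0, 3.5, 0.5, 1.0, 2.5, 1.0, 1.0, 2.5, 1.0, 3.0]

s_total = total_score(git, eit, weights)
print(s_total, INITIAL_SCORE + s_total)
```

Under these assumptions the example individual scores well below the initial 800, consistent with the statement that a lower fitness indicates better performance than EIT*. Other sign conventions or values of $\delta$ would change the numbers but not the ranking logic.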

The above rules evaluate an individual’s total score within a specific problem context. To measure generalizability, we use randomly generated problem descriptions as a dataset $D_{\text{benchmark}}$ . The average total scores within this dataset are included in the fitness calculation. To ensure stability across problems, we include the variance of total scores. Lastly, we consider the number of nodes as a complexity measure to avoid overfitting from complex expressions. The final fitness calculation formula is as follows:

$$\rho_{\psi}=\overline{s_{\text{total}}}+c_{1}\sigma^{2}_{s_{\text{total}}}+c_{2}|\psi|.$$

where $\psi$ denotes the individual. $\overline{s_{\text{total}}}$ represents the mean of the total scores across each problem definition in the benchmark. $\sigma^{2}_{s_{\text{total}}}$ is the variance of the total scores, multiplied by the coefficient $c_{1}$ . $|\psi|$ signifies the size of the individual $\psi$ , multiplied by the coefficient $c_{2}$ .
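The final fitness combines the mean total score, its variance across problems, and the expression size. A minimal sketch follows; the coefficient values `c1` and `c2`, the use of the population variance, and the example scores are assumptions chosen for illustration.

```python
from statistics import mean, pvariance


def final_fitness(total_scores, expression_size, c1=0.01, c2=0.5):
    """Mean score + c1 * variance of scores + c2 * number of nodes."""
    return mean(total_scores) + c1 * pvariance(total_scores) + c2 * expression_size


# Two hypothetical individuals with the same mean score and size,
# but different stability across benchmark problems.
stable = final_fitness([480.0, 500.0, 520.0], expression_size=7)
unstable = final_fitness([400.0, 500.0, 600.0], expression_size=7)
print(stable, unstable)
```

With equal means, the individual whose scores vary less across problems receives the lower (better) fitness, and a larger expression is penalized through the `c2` term.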

During practical training, several techniques can reduce unnecessary computation. A segmented system can evaluate an individual without testing the complete benchmark. The benchmark's scenarios are divided into segments of increasing difficulty. If the fitness over the first $i$ segments is significantly worse than a baseline, the individual performs below the expected threshold set by EIT*. Consequently, the fitness score can be directly assessed and recorded as $L_{\psi}$ without testing the entire benchmark, as such an individual is likely to be quickly eliminated in the evolutionary process.
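The segmented evaluation can be sketched as an early-exit loop. The function name, the `tolerance` margin, and the rule of comparing the running mean with the baseline are assumptions; the text only states that clearly underperforming individuals need not be run on the full benchmark. Fitness is treated as a value to minimize, so a larger value is worse.

```python
def segmented_fitness(evaluate_segment, segments, baseline, tolerance):
    """Evaluate segments in order of difficulty and stop early when the
    running mean fitness is clearly worse (larger) than the baseline.

    Returns the fitness estimate and the number of segments evaluated.
    """
    scores = []
    for count, segment in enumerate(segments, start=1):
        scores.append(evaluate_segment(segment))
        running = sum(scores) / len(scores)
        if running > baseline + tolerance:
            return running, count  # recorded without finishing the benchmark
    return sum(scores) / len(scores), len(scores)


# Hypothetical per-segment fitness values for two individuals.
weak = {"easy": 900.0, "medium": 950.0, "hard": 990.0}
strong = {"easy": 500.0, "medium": 620.0, "hard": 710.0}
order = ["easy", "medium", "hard"]

weak_fit, weak_used = segmented_fitness(weak.get, order, baseline=800.0, tolerance=50.0)
strong_fit, strong_used = segmented_fitness(strong.get, order, baseline=800.0, tolerance=50.0)
print(weak_fit, weak_used, strong_fit, strong_used)
```

The weak individual is rejected after the first segment, which saves the cost of running the planner on the remaining, harder scenarios.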

IV-C Genetic Informed Trees (GIT*)

In Section IV-B, we use the RGP to evaluate the best individual $\psi^{*}$ of the generated population. In this subsection, the evaluated best individual $\psi^{*}$ is utilized in the GIT* to guide robot path planning, allowing the robot to rapidly converge on the initial solution while maintaining path quality.

Problem-specific information falls into three categories: search tree information $g(\mathbf{x})$ , heuristic information $\hat{h}(\mathbf{x})$ , and environmental information (e.g., dimensionality $D(\theta)$ and obstacle details). GIT* uses the RGP to generate evolving individuals that combine these information types into complex expressions. These expressions are integrated into the EIT* heuristic function to form new algorithms, $GIT^{*}_{\psi}$ , with the optimal GIT* algorithm being selected based on performance:

[TABLE]
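To make the idea of an individual as an expression over the primitive set concrete, the sketch below represents an individual as a nested tuple and evaluates it for one state. The operator set and the terminal names (`g`, `h_hat`, `D`) are illustrative assumptions, not the primitive set used for training.

```python
import operator

# Hypothetical binary operators available to the expression trees.
OPS = {"add": operator.add, "mul": operator.mul, "max": max}


def evaluate(tree, terminals):
    """Evaluate a tree given as (op, left, right), a terminal name, or a constant."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, terminals), evaluate(right, terminals))
    if isinstance(tree, str):
        return terminals[tree]
    return float(tree)


def size(tree):
    """Number of nodes in the tree, usable as a complexity measure."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


if __name__ == "__main__":
    psi = ("add", "g", ("mul", 1.5, "h_hat"))  # g + 1.5 * h_hat
    values = {"g": 2.0, "h_hat": 4.0, "D": 8.0}
    print(evaluate(psi, values), size(psi))
```

The node count returned by `size` plays the role of $|\psi|$ in the fitness calculation.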

When GIT* trains its heuristic function using RGP, information is stored in the primitive set to generate individuals. The search tree-related information includes $g(\mathbf{x}_{s})$ , while prior heuristic information includes $\hat{h}(\mathbf{x}_{t})$ and $\hat{c}(\mathbf{x}_{s},\mathbf{x}_{t})$ . The effort-related terms are $\bar{e}(\mathbf{x}_{\mathrm{s}})$ , which estimates the effort to find and validate a path from $\mathbf{x}_{\mathrm{s}}$ to the goal; $\bar{e}(\mathbf{x}_{\mathrm{s}},\mathbf{x}_{\mathrm{t}})$ , which estimates the computational effort required to find and validate a path between the two states; and $\bar{d}(\mathbf{x}_{\mathrm{t}})$ , which estimates the effort from $\mathbf{x}_{\mathrm{t}}$ to the start. Environmental information comprises not only $D(\theta)$ but also two variables that record information about obstacles and the dynamic importance of states. According to the GA model, after natural selection, the winning G-heuristic function generated by RGP can be equivalently represented by $\mathrm{key}_{\mathcal{R}}^{\mathrm{GIT}^{*}}$ , which extracts the next edge from the reverse queue:

[TABLE]

where $U[\mathbf{x}_{t}]$ refers to the potential energy of the current state in an artificial potential field (APF), and $w_{\textit{dyn}}$ refers to dynamic importance, represented by the number of times the current state has been visited. The following subsections detail how these variables are obtained.

IV-C1 Potential field variable $U[\mathbf{x}_{t}]$

Understanding obstacle characteristics like shapes, numbers, and locations is crucial for guiding the search tree to either circumvent obstacles for quicker solutions or approach them to reduce costs. However, these characteristics are often unknown beforehand.

GIT* approximates the environment by sampling points in the $\mathcal{C}$ -space to acquire information about obstacles, denoted as ${X}_{\text{obs}}$ . Those randomly sampled points undergo a validity check (e.g., collision detection) to determine if they are inside obstacles. Invalid points, denoted as $\mathbf{x}_{\text{invalid}}$ , indicate locations within obstacles, gradually outlining their shapes and locations as sampling increases. GIT* also employs the APF method to conceptualize the navigation space as a force field where obstacles generate repulsive forces, and targets generate attractive forces (Alg. 3, line 2). $\mathbf{x}_{\text{invalid}}$ and $\mathbf{x}_{\text{goal}}$ generate repulsive and attractive forces with target state $\mathbf{x}_{\text{t}}$ , respectively, and the potential field is dynamically adjusted based on the obstacle data. The calculated data is then utilized in the primitive set as candidates for RGP to generate G-heuristic individuals.

•

Repulsive force: Generated around invalid samples, these forces prevent entry into these areas. The magnitude of the repulsive force is:

[TABLE]

where $k_{r}$ is a proportionality constant, $q$ is the charge equivalent of the path planner, $q_{\text{obs}}$ is the charge equivalent of the obstacle, $r$ is the distance between the path planner and the obstacle, and $\rho_{0}$ is the threshold distance beyond which the force is not exerted.

•

Repulsive potential energy: The potential energy is:

[TABLE]

•

Attractive force: Produced by the target, these forces guide the path planner towards the target, navigating around repulsive regions. The magnitude of the attractive force is:

[TABLE]

where $k_{a}$ is another proportionality constant, $q$ is the charge equivalent of the path planner, $q_{\text{goal}}$ is the charge equivalent of the target, and $r$ is the distance between the path planner and the target.

•

Attractive potential energy: The potential energy is:

[TABLE]

The potential energy in the APF can be calculated using these formulae, recording information about obstacles and incorporating it into the primitive set to construct the heuristic function. As potential energy increases, indicating proximity to obstacles, the heuristic function’s value increases, reducing the likelihood of state selection. When $w_{\textit{dyn}}[\mathbf{x}_{t}]$ is high, indicating frequent visits, the heuristic function’s value decreases, increasing the likelihood of exploration. As $\widehat{g}\left(\mathbf{x}_{\mathrm{t}}\right)$ increases, indicating greater distance from the start, the heuristic function’s value increases, making the node less likely to be searched.
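The potential field variable can be illustrated with a small sketch. The exact expressions are not reproduced here, so the code assumes Coulomb-like inverse-distance potentials, sums the repulsive contributions over all invalid samples within $\rho_{0}$, and gives the attractive potential a negative sign so that it decreases toward the goal. The function names and default constants are hypothetical.

```python
import math


def repulsive_potential(x, invalid_samples, k_r=1.0, q=1.0, q_obs=1.0,
                        rho_0=1.0, eps=1e-9):
    """Sum of inverse-distance repulsive potentials from invalid samples.

    Assumption: Coulomb-like form k_r * q * q_obs / r, cut off beyond rho_0.
    """
    total = 0.0
    for x_invalid in invalid_samples:
        r = math.dist(x, x_invalid)
        if r <= rho_0:
            total += k_r * q * q_obs / max(r, eps)  # eps avoids division by zero
    return total


def attractive_potential(x, goal, k_a=1.0, q=1.0, q_goal=1.0, eps=1e-9):
    """Inverse-distance attractive potential (assumed negative, lowest at the goal)."""
    r = math.dist(x, goal)
    return -k_a * q * q_goal / max(r, eps)


def potential(x, invalid_samples, goal):
    """Total potential U[x] under the assumptions above."""
    return repulsive_potential(x, invalid_samples) + attractive_potential(x, goal)


if __name__ == "__main__":
    invalid = [(0.5, 0.0), (3.0, 0.0)]
    print(potential((0.0, 0.0), invalid, goal=(0.0, 2.0)))
```

With this form, states close to invalid samples receive a high potential, which matches the qualitative behavior described above.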

IV-C2 Dynamic importance variable $w_{\textit{dyn}}[\mathbf{x}_{t}]$

In incremental asymptotically optimal sampling-based planners like GIT*, certain sample points in $\mathcal{C}$ -space are visited frequently, often during nearest-neighbor operations (e.g., path rewiring). These samples may lie along essential routes between the start and goal states, serve as conduits connecting regions, or be located in narrow corridors. Frequently visited samples in free space therefore guide the search toward regions worth exploring, which improves search efficiency. GIT* tracks the number of visits to each sample point, capturing its dynamic importance (Alg. 4 and 5), and navigates toward states of higher importance. These strategies help GIT* search more efficiently during the path optimization phase. As with the APF discussed in Section IV-C1, the number of visits to a state (i.e., its dynamic importance) is included in RGP’s primitive set to generate G-heuristics.

Formally, the dynamic importance of a state $\mathbf{x}_{t}$ , denoted as $w_{\textit{dyn}}[\mathbf{x}_{t}]$ , is calculated as follows:

[TABLE]

where $\mathbb{I}(\cdot)$ is the indicator function that equals 1 if the condition is true and 0 otherwise.

Each time a sample point appears in the nearest neighbors of the reverse tree queue, the dynamic importance of the corresponding state is incremented by 1, emphasizing frequently visited states for path optimization. Furthermore, the inflation factor speeds up the search by biasing the goal, resulting in rapid initial solutions. The truncation factor optimizes the search by stopping it when the solution quality is satisfied.
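The visit-counting rule can be sketched with a counter keyed by state. In this illustration a fixed-radius neighborhood stands in for the planner's nearest-neighbor query, and states are plain coordinate tuples; both are assumptions made for brevity.

```python
import math
from collections import Counter


def update_dynamic_importance(w_dyn, states, query, radius):
    """Increment the visit count of every state in the neighborhood of `query`.

    Assumption: neighbors are all states within `radius` of the query state.
    """
    for x in states:
        if x != query and math.dist(x, query) <= radius:
            w_dyn[x] += 1
    return w_dyn


if __name__ == "__main__":
    states = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    w_dyn = Counter()
    update_dynamic_importance(w_dyn, states, (0.0, 0.0), radius=1.5)
    update_dynamic_importance(w_dyn, states, (0.5, 0.0), radius=1.5)
    print(w_dyn)
```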

IV-C3 Inflation and truncation factor function

The traditional inflation and truncation factor update strategy is a user-adjustable parameter that can be tailored to specific application scenarios and requirements. However, this strategy lacks flexibility as it requires manual adjustments in each scenario to achieve optimal performance. The updated function for the inflation factor derived from this training session is:

[TABLE]

where $D(\theta)$ is the dimensionality of the path planning problem $\theta$ , and $N_{\text{samples}}$ is the current number of samples taken.

As the problem’s dimensionality increases, this expression’s value also increases, biasing the search towards rapidly finding feasible solutions based on heuristics rather than ensuring the lowest cost solution. In higher-dimensional spaces, fewer obstacles relative to the overall space decrease the probability of blocking the path, enhancing the success rate and reducing the time to find initial solutions. As the number of samples increases, the value decreases, leading GIT* to focus on low-cost solutions after several sampling batches, aligning with practical requirements.

The updated function for the truncation factor is:

[TABLE]

where $N_{\text{samples}}$ represents the current number of samples taken.

As $N_{\text{samples}}$ increases, the value decreases, indicating a tendency to exploit the current approximation rather than explore new ones. This is suitable for the later stages of the search when $N_{\text{samples}}$ is large.

V Analysis

In this section, we first provide a convergence analysis and asymptotic properties to establish the feasibility of the proposed algorithm. We also explain why GIT* requires less computation time and examine the advantages of reinforced genetic programming (RGP) mathematically.

V-A Reinforced Genetic Programming Training Analysis

Due to the randomness of the RGP algorithm and variability in training parameters, results from each RGP instance are unique. Practical applications need to consider specific objectives, use cases, datasets, and time constraints for parameter settings. Table II details the chosen parameters. A large population enhances diversity but raises computational cost; a size of 1500 was chosen as the best trade-off on our equipment. Running 100 generations balances solution quality against overfitting; a crossover rate of 0.8 promotes exploration without excessive disruption; a mutation rate of 0.1 maintains diversity and prevents premature convergence; a maximum tree depth of 4 avoids both overfitting and underfitting; and a tournament size of 5 balances selection pressure and diversity. The fitness variation across generations is shown in Fig. 5.
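The parameters listed above can be collected in a small configuration object, shown here together with a generic tournament selection step. The class and function names are hypothetical, and the selection routine is the textbook form (sample $k$ individuals without replacement, keep the fittest, higher fitness being better), not necessarily the exact variant used in training.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RGPConfig:
    """Training parameters as stated in the text."""
    population_size: int = 1500
    generations: int = 100
    crossover_rate: float = 0.8
    mutation_rate: float = 0.1
    max_tree_depth: int = 4
    tournament_size: int = 5


def tournament_select(population, fitness, k, rng):
    """Draw k distinct individuals at random and return the fittest one."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


if __name__ == "__main__":
    config = RGPConfig()
    rng = random.Random(0)
    population = list(range(20))  # toy individuals whose fitness is their value
    print(tournament_select(population, lambda x: x, config.tournament_size, rng))
```

A larger tournament raises selection pressure, since the chance that a strong individual is among the contenders grows with $k$.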

V-B Proof of Convergence in Genetic Programming

Research has explored the convergence properties of genetic programming (GP) for symbolic regression [31]. The global optimum in a symbolic regression problem is the best possible individual, i.e., the one that achieves the minimum or maximum fitness of the objective function across the entire state space [32]. Convergence to the global optimum means generating a sequence of solutions whose limit is the global optimum. This study adopts a probabilistic interpretation. Rudolph [23] modeled genetic programming using a Markov chain framework and demonstrated convergence when the population retains the best solution. The natural selection, crossover, and mutation processes in GP mimic biological evolution.

Let $\mathcal{P}(t)$ be the population at time $t$ , and $\psi^{*}$ be the global optimum. GP maintains a diverse population $\mathcal{P}(t)$ over generations to escape local optima:

[TABLE]

This helps GP escape local optima, unlike greedy search methods, which may converge quickly to local optima.

Let $Z_{t}$ denote a sequence of random variables representing the best fitness within a population at step $t$ . The convergence property of genetic programming, which preserves the best solution in the population, can then be formalized as:

$$\lim_{t\rightarrow\infty}P\left(Z_{t}=\psi^{*}\right)=1$$

where $\psi^{*}$ represents the global optimum. This expression indicates that the probability of the best fitness $Z_{t}$ equaling that of the global optimum $\psi^{*}$ approaches unity as the number of iteration steps $t$ approaches infinity.

Through mutation and crossover, GP maintains diversity and explores the search space effectively. Selection mechanisms favor individuals with higher fitness, leading to gradual improvement. Consequently, the probability of finding the global optimum $\psi^{*}$ increases with each iteration.
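The role of retaining the best solution can be illustrated on a toy problem. The sketch below is a bitstring genetic algorithm, not GP over expression trees, and it picks parents uniformly at random; it only demonstrates that with elitism the best fitness in the population never decreases from one generation to the next. All names and parameter values are illustrative.

```python
import random


def evolve_with_elitism(n_bits=16, pop_size=20, generations=40,
                        mutation_rate=0.1, seed=0):
    """Return the best fitness per generation for a toy one-max problem."""
    rng = random.Random(seed)
    fitness = sum  # toy objective: number of ones in the bitstring
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        best_history.append(fitness(elite))
        children = [elite[:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(population, 2)
            cut = rng.randrange(1, n_bits)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]  # independent bit-flip mutation
            children.append(child)
        population = children
    return best_history


if __name__ == "__main__":
    print(evolve_with_elitism())
```

Because the elite is copied into the next generation, the recorded best fitness is monotonically non-decreasing, which is the property the Markov chain argument relies on.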

V-C Probabilistic Completeness and Asymptotic Optimality

Most informed tree-based path planning algorithms have been proven to be probabilistically complete and asymptotically optimal, and GIT* can also guarantee these two properties. GIT* utilizes uniform sampling strategies. As the number of iterations $n$ approaches infinity, the entire state space will be explored, satisfying the following equation:

[TABLE]

which means that if there is a feasible path, it must be found by the GIT*. Therefore, the probabilistic completeness of the optimal path planner is guaranteed.

The GIT* implements the same Choose Parent and Rewire strategies as the EIT*. It means that if the rewiring radius $r(q)$ in Choose Parent and Rewire processes satisfies:

[TABLE]

Here, $q$ denotes the number of sampled states in the informed set, $\eta>1$ is a tuning parameter, $\lambda(\cdot)$ denotes the Lebesgue measure, $d$ is the dimensionality of the workspace, $\lambda(X_{\hat{f}})$ is the Lebesgue measure of the informed set $X_{\hat{f}}$ , and $\zeta_{d}$ is the volume of the unit ball in the current workspace. Then, with reference to Lemmas 56, 71, and 72 in [4], the following equation holds:

[TABLE]

where $q$ is the number of samples, $\Sigma_{q}\subset\Sigma$ is the set of valid paths from the start to the goal found by the planner from those samples, $c:\Sigma\rightarrow[0,\infty)$ is the cost function, and $c^{*}$ is the optimal solution cost. It indicates that GIT* can find an optimal path, if one exists, as the number of iterations goes to infinity. Therefore, asymptotic optimality is guaranteed.
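For intuition about how the connection radius shrinks with the number of samples, the sketch below evaluates the random geometric graph radius in the form commonly used by BIT*-family planners, $r(q)=\eta\left(2\left(1+\frac{1}{d}\right)\frac{\lambda(X_{\hat{f}})}{\zeta_{d}}\frac{\log q}{q}\right)^{1/d}$. That this is the bound intended by the condition above is an assumption, and the function names are hypothetical.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Lebesgue measure of the d-dimensional unit ball."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def rgg_radius(q: int, d: int, informed_measure: float, eta: float = 1.001) -> float:
    """Connection radius for q samples in a d-dimensional informed set.

    Assumption: the radius follows the BIT*-style expression given in the
    lead-in; eta = 1.001 matches the RGG constant used in the experiments.
    """
    if q < 2:
        raise ValueError("q must be at least 2 so that log(q) is positive")
    scale = 2 * (1 + 1 / d) * (informed_measure / unit_ball_volume(d))
    return eta * (scale * math.log(q) / q) ** (1 / d)


if __name__ == "__main__":
    for q in (100, 1000, 10000):
        print(q, rgg_radius(q, d=4, informed_measure=1.0))
```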

VI Experiments

In this paper, we utilize the Planner Developer Tools (PDT) [27] and MoveIt [33] to benchmark motion planner behaviors. GIT* was tested against SOTA algorithms in both simulated random scenarios (Fig. 6) and real-world manipulation problems (Fig. 8). The comparison involved several versions of RRT-Connect, Informed RRT*, BIT*, AIT*, ABIT*, and EIT* sourced from the Open Motion Planning Library (OMPL) [25]. The evaluations were conducted on a computer with an Intel i7 3.90 GHz processor and 32GB of LPDDR3 3200 MHz memory. These comparisons were carried out in simulated environments of dimensions $\mathbb{R}^{4}$ and $\mathbb{R}^{8}$ . The primary objective for the planners was to minimize path length (cost). The RGG constant $\eta$ was uniformly set to 1.001, and the rewire factor was set to 1.2 for all planners.

In the case of RRT-based algorithms, a goal bias of 5% was employed, and the maximum edge lengths were determined based on the dimensionality of the space. All batched algorithms utilized a batch size of 100. BIT*, AIT*, ABIT*, and EIT* maintained a linear combination heuristic function of Euclidean distance and effort, respectively. GIT* utilized optimal G-heuristic (Eq. 12) to extract the next edge from the reverse queue, which was selected based on the fitness of RGP.

VI-A Simulation Experimental Tasks

The planners were tested across three distinct benchmarks in two domains: $\mathbb{R}^{4}$ and $\mathbb{R}^{8}$ . In the first scenario, a constrained environment resembling a dividing wall with several narrow gaps was simulated, allowing valid paths in multiple general directions for non-intersecting solutions (Fig. 6a). Each planner underwent 100 runs with varying random seeds; the computation time for each anytime asymptotically optimal planner is shown in the labels. The overall success rates and median path lengths for all planners are depicted in Fig. 7a and 7b. GIT* finds the initial solution quickly in both dimensions, whereas EIT* requires more time to find the initial solution.

In the second test scenario, random widths were assigned to axis-aligned hyperrectangles, generated arbitrarily within the $\mathcal{C}$ -space (Fig. 6b). Random rectangle problems were created for each dimension of the $\mathcal{C}$ -space, with each planner undergoing 100 runs for every instance. Fig. 7c and 7d illustrate that the proposed method achieves the highest success rates and lowest median path costs within the computation time compared with the other planners. This indicates that GIT* can recognize promising regions via environmental information (e.g., APF) where feasible paths likely lie, thereby biasing the sampling process toward these regions. As a result, GIT* outperformed the other planners and quickly found an initial solution.

The last test problem consisted of a hollow, axis-aligned hyperrectangle enclosing the goal state, configured such that even in higher dimensions, the goal can only be reached through the face of the hyperrectangle farthest from the start state (Fig. 6c). This problem is challenging for GIT* because there are many invalid edges close to the root of the reverse search tree, often requiring large parts to be repaired (Fig. 7e and 7f). Nevertheless, the figures show that GIT* achieves the best performance in finding the initial solution and converging to the optimal solution compared with the SOTA planners.

As observed in Table III, there’s a median initial time improvement across varied benchmark scenarios, correlating with dimensionality. For instance, in the $\text{DW}-\mathbb{R}^{4}$ scenario, GIT* exhibits a lower initial median time (i.e., median value over 100 trials) of 0.0201s compared to 0.0252s for EIT* and 0.1299s for AIT*. This trend is consistent across other scenarios, such as $\text{RR}-\mathbb{R}^{4}$ and $\text{GE}-\mathbb{R}^{4}$ , where GIT* consistently shows reduced initial median times.

In the $\text{GE}-\mathbb{R}^{8}$ scenario, GIT* demonstrates an initial median time of 0.0512s, compared to 0.0941s for EIT* and 0.3834s for AIT*. This indicates an improvement in initial convergence time of approximately 45.59% compared to EIT*.
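The reported improvement follows directly from the two median times, taking EIT* as the reference:

```latex
\frac{t_{\mathrm{EIT}^{*}} - t_{\mathrm{GIT}^{*}}}{t_{\mathrm{EIT}^{*}}}
  = \frac{0.0941 - 0.0512}{0.0941}
  = \frac{0.0429}{0.0941}
  \approx 0.4559
```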

Overall, Table III highlights the advantages of GIT* in achieving lower initial median times compared to SOTA, thereby enhancing the efficiency of path planning algorithms.

VI-B Real-world Path Planning Tasks

To evaluate the algorithm’s performance in real-world scenarios, three numerical experiments are conducted on a single-arm manipulator and mobile manipulator (DARKO) to demonstrate the efficiency and extensibility of GIT* compared with three SOTA path planning algorithms: Batch Informed Trees (BIT*) [19], Adaptively Informed Trees (AIT*) [13], and Effort Informed Trees (EIT*) [13].

We compare GIT* with AIT* and EIT* in single-arm manipulator environments to evaluate their performance in converging to the optimal solution cost and success rate over 30 runs. The first environment (Beer Barrel) consists of simple cup holder obstacles. The second and third environments (Shelf and Kitchen) are confined to the DARKO robot and cluttered with narrow spaces. A collision-free path connecting the start state to the goal region is required. GIT* demonstrated its effective G-heuristic during multiple experimental tasks (Fig. 8). The detailed behavior of real-world experiments can be viewed in the accompanying video.

VI-B1 Beer Barrel Cup Placement Task

Fig. 8a showcases the start and goal configuration of the cup placement task. In this task, we utilize a single robotic manipulator to grab a beer cup and place it under the beer tap of the beer barrel keg while avoiding obstacles. The following graph illustrates the performance of AIT*, EIT*, and GIT* in terms of solution cost and success rate. All planners were given 1.0 seconds to address the beer barrel cup placement problem. Over the course of 30 trials, GIT* achieved a 100% success rate with a median solution cost of 13.8972. EIT* had a success rate of 96.67% with a median solution cost of 15.1332. AIT* was 93.33% successful, with a median solution cost of 19.2183.

VI-B2 Industry Shelf Container Rearrangement Task

The initial and final configurations for the shelf task are depicted in Fig. 8b. This task involves extracting an industry-standard container from a position between two other boxes on the lower bottom layer and repositioning it on the third layer of the shelf, again between two containers. Due to component standardization, the challenge lies in the precise insertion of industry containers into narrow spaces. The task aims to place the industry-standard container between two larger containers on the shelf, with a tolerance scope of $\leq$ 5mm, making the planning of a collision-free feasible path particularly difficult. Each planner was allocated 5.0 seconds to solve this confined, limited space pull-out and insertion problem. Across 30 trials, GIT* achieved an 86.67% success rate with a median solution cost of 15.9745. EIT* had a 76.67% success rate with a median solution cost of 19.1045. AIT* managed a 56.67% success rate with a median solution cost of 18.2672.

VI-B3 Kitchen Model Pan Cooking Task

For the third task, we utilized the DARKO robot positioned in front of a kitchen model. The start and goal configurations are illustrated in Fig. 8c. This task is particularly challenging as the manipulator must navigate the geometric shape of the pan within a cluttered oven while also avoiding collisions between the base robot and the kitchen shelves. The complexity is further heightened by the need for precise movements in a confined space. Each planner was allotted 10.0 seconds to solve this kitchen pan reallocation problem. Over the course of 30 trials, GIT* achieved a 30% success rate with a median solution cost of 15.8860. EIT* had a success rate of 26.67% with a median solution cost of 19.2746. AIT* managed a 16.67% success rate with a median solution cost of 20.9824.

In short, compared with the AIT* and the EIT*, the GIT* achieves the best performance on finding the initial solution and converging to the optimal solution.

VI-C Discussion

VI-C1 Comparison With SOTA Planner

To showcase the advantages of GIT*, we compared its performance with AIT* and EIT* using success rate and solution cost metrics in three real-world tasks (Fig. 8): placing cups on beer barrel faucets, rearranging industrial containers on shelves, and cooking pans in a kitchen model, and six simulation tasks (Fig. 6) across multi-dimensions with randomly generated seeds.

From the experimental results, we observe that EIT* performs much better than AIT* in both simulation environments (Table III) and real-world scenarios. However, GIT* outperforms the SOTA planners owing to its use of problem-specific environmental information via RGP and the integration of the G-heuristic. In low-dimensional problem domains, the initial solution time and cost show only minimal improvement, as shown in Fig. 7(a, c, and e). In high-dimensional domains, the linear combination heuristic struggles to guide the search efficiently, as shown in Fig. 7(b, d, and f).

In the first real-world environment, GIT* exceeded EIT* by 3.33 percentage points in success rate (100% versus 96.67%) and reduced the solution cost by approximately 8.17%. Compared to AIT*, GIT* improved the success rate by 6.67 percentage points and reduced the solution cost by approximately 27.68%, as shown in Fig. 8(a). In the second real-world experiment, the benchmark results show that the G-heuristic helps solve cluttered tasks, achieving the highest success rate and the lowest solution cost among the evaluated planners. In relative terms, GIT* outperformed EIT* by 13.04% in success rate (86.67% versus 76.67%) and reduced the solution cost by approximately 16.38%. Compared to AIT*, GIT* improved the success rate by a relative 52.94% and reduced the solution cost by approximately 12.56%, as shown in Fig. 8(b). In the third real-world experiment, GIT* outperformed EIT* by a relative 12.5% in success rate (30% versus 26.67%) and reduced the solution cost by about 17.58%. Compared to AIT*, GIT* improved the success rate by a relative 80% and reduced the solution cost by approximately 24.32%, as shown in Fig. 8(c). These results highlight the effectiveness of the G-heuristic at avoiding obstacles in narrow environments such as the kitchen model.
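The comparisons in this discussion can be recomputed from the success rates and median costs reported in Section VI-B. The short script below is only a convenience for checking the arithmetic; the function and dictionary names are hypothetical. It reports the success-rate difference both in percentage points and as a relative gain, since the two readings differ noticeably when the baseline rate is low.

```python
def relative_gain(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


def cost_reduction(new: float, old: float) -> float:
    """Relative decrease of `new` with respect to `old`, in percent."""
    return 100.0 * (old - new) / old


# (success rate in %, median solution cost) as reported for each task.
RESULTS = {
    "beer barrel": {"GIT*": (100.00, 13.8972), "EIT*": (96.67, 15.1332),
                    "AIT*": (93.33, 19.2183)},
    "shelf": {"GIT*": (86.67, 15.9745), "EIT*": (76.67, 19.1045),
              "AIT*": (56.67, 18.2672)},
    "kitchen": {"GIT*": (30.00, 15.8860), "EIT*": (26.67, 19.2746),
                "AIT*": (16.67, 20.9824)},
}


def compare(task: str, baseline: str) -> dict:
    """Compare GIT* against a baseline planner on one task."""
    git_rate, git_cost = RESULTS[task]["GIT*"]
    base_rate, base_cost = RESULTS[task][baseline]
    return {
        "success_points": git_rate - base_rate,
        "success_relative": relative_gain(git_rate, base_rate),
        "cost_reduction": cost_reduction(git_cost, base_cost),
    }


if __name__ == "__main__":
    for task in RESULTS:
        for baseline in ("EIT*", "AIT*"):
            row = compare(task, baseline)
            print(task, baseline, {k: round(v, 2) for k, v in row.items()})
```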

From the discussion, one may conclude that using RGP to train an optimal G-heuristic across all benchmarks can improve the initial convergence rate and initial path length. Furthermore, GIT* can utilize environmental information (i.e., APF and dynamic importance) to search through more promising regions, which accelerates finding the initial path. GIT* achieved the highest success rate and lowest solution cost among the tested SOTA planners, emphasizing its potential for real-world applications.

VI-C2 Limitations and Future Work

While GIT* demonstrates superior performance, it has limitations. The current implementation is tailored for specific tasks with predefined start and goal configurations, limiting its adaptability to variable environments and tasks. Future work could enhance GIT*’s generalization capabilities by integrating neural network-driven approaches to learn from diverse human demonstrations, improving its extension ability across different environments. This aligns with advancements in neural network-based path planning and promises to enhance GIT*’s robustness and versatility. Furthermore, future designs will consider human acceptability and comfort when planning trajectories.

VII Conclusion

In this paper, we introduced the Genetic Informed Trees (GIT*) algorithm, a novel path planning approach that leverages Reinforced Genetic Programming (RGP) to refine heuristic functions for enhanced guidance. By incorporating additional environmental data, such as repulsive forces from obstacles and the dynamic importance of vertices, GIT* improves search efficiency and solution quality. The integration of RGP allows GIT* to mutate genotype-generative heuristic functions (G-heuristics), adapting to various problem domains. Our comparative analyses demonstrate that GIT* consistently outperforms existing single-query, sampling-based planners across different scenarios, including simulation benchmarks and real-world robot manipulation tasks. The optimal G-heuristic exhibits notable improvements over SOTA methods in terms of both success rate and solution cost, showcasing its robustness and adaptability, particularly in handling complex, cluttered environments with high precision and efficiency.

In conclusion, GIT* enhances rapid initial pathfinding and reduces solution costs. GIT* shows promising potential for future research and applications in motion planning.

Bibliography


[1] L. Zhang, K. Cai, Z. Sun, Z. Bing, C. Wang, L. Figueredo, S. Haddadin, and A. Knoll, “Motion planning for robotics: A review for sampling-based planners,” Biomimetic Intelligence and Robotics, vol. 5, no. 1, p. 100207, 2025.
[2] P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, pp. 100–107, 1968.
[3] O. Khatib, “Real-time obstacle avoidance for manipulators and mobile robots,” The International Journal of Robotics Research, vol. 5, no. 1, pp. 90–98, 1986.
[4] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” The International Journal of Robotics Research, vol. 30, no. 7, pp. 846–894, 2011.
[5] K. Cai, W. Chen, D. Dugas, R. Siegwart, and J. J. Chung, “Sampling-based path planning in highly dynamic and crowded pedestrian flow,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 12, pp. 14732–14742, 2023.
[6] L. Zhang, S. Wang, K. Cai, Z. Bing, F. Wu, C. Wang, S. Haddadin, and A. Knoll, “APT*: Asymptotically optimal motion planning via adaptively prolated elliptical r-nearest neighbors,” IEEE Robotics and Automation Letters, vol. 10, no. 10, pp. 10242–10249, 2025.
[7] S. M. LaValle and J. J. Kuffner Jr, “Randomized kinodynamic planning,” The International Journal of Robotics Research, vol. 20, no. 5, pp. 378–400, 2001.
[8] L. E. Kavraki, P. Svestka, J.-C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional configuration spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.
