Collision-Free Multi Robot Trajectory Optimization in Unknown Environments using Decentralized Trajectory Planning
Vijay Arvindh, Govind Aadithya R, Shravan Krishnan, Sivanathan K

TL;DR
This paper introduces an online decentralized trajectory optimization method for multi-robot systems operating in unknown environments, enabling collision-free navigation through local mapping and inter-robot communication.
Contribution
It presents a novel decentralized trajectory planning algorithm that accounts for dynamic obstacles and robot interactions using local maps and current state communication.
Findings
Successfully tested in Gazebo simulations with ROS
Achieves collision-free trajectories in unknown environments
Utilizes local object-based maps and inter-robot communication
Abstract
Multi robot systems have the potential to be utilized in a variety of applications. In most of the previous works, the trajectory generation for multi robot systems is implemented in known environments. To overcome that we present an online trajectory optimization algorithm that utilizes communication of robots' current states to account to the other robots while using local object based maps for identifying obstacles. Based upon this data, we predict the trajectory expected to be traversed by the robots and utilize that to avoid collisions by formulating regions of free space that the robot can be without colliding with other robots and obstacles. A trajectory is optimized constraining the robot to remain within this region.The proposed method is tested in simulations on Gazebo using ROS.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRobotic Path Planning Algorithms · Control and Dynamics of Mobile Robots · Robotics and Sensor-Based Localization
Collision-Free Multi Robot Trajectory Optimization in Unknown Environments using Decentralized Trajectory Planning
Vijay Arvindh
Govind Aadithya R
Shravan Krishnan and Sivanathan K This work was supported by SRM Institute of Science and Technology The authors are with Autonomous Systems Lab, SRM Institute of Science and Technology,IndiaEmail:vijay_arvindh,govinda_adithya,shravan_krishnan)@srmuniv.edu.in, [email protected]**Corresponding Author: Shravan Krishnan
Abstract
Multi robot systems have the potential to be utilized in a variety of applications. In most of the previous works, the trajectory generation for multi robot systems is implemented in known environments. To overcome that we present an online trajectory optimization algorithm that utilizes communication of robots’ current states to account to the other robots while using local object based maps for identifying obstacles. Based upon this data, we predict the trajectory expected to be traversed by the robots and utilize that to avoid collisions by formulating regions of free space that the robot can be without colliding with other robots and obstacles. A trajectory is optimized constraining the robot to remain within this region.The proposed method is tested in simulations on Gazebo using ROS.
Keywords- Multi-Robot System, Collision Avoidance, Trajectory Optimization
I Introduction
Multi-robots systems are a group of individual entities working together so as to maximize their own performance while accounting for some higher goals. The trajectories generated in such scenarios will have to ensure that the robots don’t collide with one another while keeping up with their dynamic limits. The trajectory generation process in multi agent systems has long since been done in a centralized manner wherein the trajectories are generated before hand and transmitted across to individual robots. These methods show excellent performance and safety when the number of robots are known beforehand and limited in number as scalability is a problem in them. Recently, this has branched out to decentralized approaches that attempt to plan trajectories in known environments using a variety of different approaches, but it is important to mitigate disturbances and unmodeled errors by modifying the trajectories during run time, making it more like an implicit feedback.
Our approach attempts to solve this problem using two steps, i) Collision-free regions that the robot can be safe within to avoid collisions with other robots and ii) Generate a trajectory for the robot constraining it within the safe region. Other obstacles also have to be considered while planning the trajectory. To achieve this the obstacles are stored using primitive model representations and used as soft constraint for the objective with the obstacles approximated as circles of appropriate radii. The approach exploits differential flatness [1] based trajectory generation for nth order systems property of robots [2], [3]. and constrains the generated trajectory to stay within a safe region at specified discrete time points in two dimensional environments.
The contributions can be stated as
A decentralized trajectory optimization algorithm for multi robot systems. 2. 2.
A simple method for obstacle representation in 2D environments based on Lidar data under assumption of no uncertainty 3. 3.
Extensive simulation experiments of the proposed algorithm
The algorithm requires a sharing of data amongst the agents and assumes that the robots are equipped with depth sensors (RADAR,LiDAR). An advantage of the proposed algorithm is a continuous time parametrization of the trajectory generation problem with discrete inter-robot collision avoidance. Moreover,the trajectory optimization is solved as a convex optimization problem.
The rest of the paper is organised as: Related works are presented in Section II. A formal problem definition and assumptions are provided in Section III. Section V details the formulation of the safe region following while the obstacle representation is explained in IV. The trajectory optimisation formulation is given in Section VI. The results are discussed in Section VII The paper is concluded in Section VIII.
II Related Work
In [4], a centralized multi robot trajectory planner for obstacle free environments was proposed utilizing tools from non linear optimization and calculus of variation and exploiting a two step process with first step accounting for the collision avoidance. A centralized mixed integer programming based approach to multi robot path planning was proposed by [5] with collision avoidance accounted for using binary integer constraints.
In [6], a decentralized method was proposed using polygonal representations for obstacles utilizing a switched systems to achieve collision-avoidance. Generating regions of free space for the robots was attempted using voronoi cells was tried out by [7]. They utilize a Receding Horizon control based approach which they are able to formulate as a convex quadratic program.
Distributed collision avoidance for multi robot systems have also been attempted [8] in a method called reciprocal velocity obstacle, exploiting the concept of Velocity obstacles proposed by [9]. This approach assumed other agents continued movement in a straight line with collisions accounted in future by relative velocities.[10] proposed a collaborative collision avoidance for non holonomic robots with grid based mapped environments whilst respecting the vehicular constraints and also accounting for potential tracking error bounds for the robot that is planning.
A fully distributed algorithm for navigation in unknown environments was proposed by [11] using incremental sequential convex optimisation for trajectory generation in a model predictive control setting. Distributed re-planning for multiple robots with each robots having different planning cycles in known obstacle filled environments was attempted by [12].The transmission of previously generated trajectories and plan trajectories while avoiding collisions with these trajectories and utilize conservative approximations to account for deviations from these expected trajectories by the other robots.
III Problem formulation
Consider robots in a 2 dimensional workspace with an unknown number of obstacles and their sizes. The position of ith robot is represented by . Each of the N robots have a set of time stamped desired poses. We also assume every robot gets to know the current time stamped state of other robots in the vicinity at frequent intervals. We take the state as . Where is the velocity of ith along the two axes, is acceleration of ith robot. Therefore, a common reference frame for all the robots is a requisite. Then it is required that the robots go from their current positions through all their desired poses at as close as possible to the specified timestamps as possible. That is for all the robots, a trajectory is to be planned that ensures that robot traverses from its current position to a within a region from the desired position within the specified time while not colliding with any of the other robots and/or obstacles in the environment.
Furthermore, we assume that each robot is equipped with a rangefinding based sensor that can give the depth information of the obstacles and that the depth sensor only perceives obstacles within a sensing region. We also assume that the robots do not know the number of robots in the environment and their desired poses and just utilize the states received by them for forward prediction.
The constraints on the overall system are:
The current state of all the robots 2. 2.
The states remain within the feasible set of the respective robots 3. 3.
The positions of any robots at any time from to should not coincide.(Inter-Agent Collision Avoidance) 4. 4.
The position of an obstacle and a robot should not overlap (Obstacle Avoidance)
IV Obstacle Detection
A local map representation is implemented for obstacles in the environment. The robot is assumed to be employed with a 2D laser range finder, whose data is utilized to find the shape of the object and its center. The rangefinder data from the lidar provides the distance of the obstacles.
The detection of reflection pattern is achieved by using a search through the data that is available in the lidar, that isn’t infinity(no reflection back). The set of reflections together between a non reflection form a single obstacle. as the resolution of a lidar is known beforehand and hence, using the resolution and distance to the obstacle from the robots current position, the obstacle point’s position can be comprehended by :
[TABLE]
This results in a primitive but a simple structure of the robot’s obstacles to be noted at. The selection of circle is motivated by the reason that higher sided convex polygons can be approximated accurately by circles of appropriate radius. Moreover, it is easier to approximate from observing a shape partially.
The robot’s obstacles with a threshold formed by their radius are utilized for formulating the obstacle avoidance. That is the obstacle’s center and it’s radius are used for obstacle from the lidar data. This method while being primitive, allows for simpler obstacle representations for a robot functioning in a 2 dimensional environment. Hence, the obstacles can be easily stored as with their center and sizes. It also has a drawback of over approximating the obstacles at times by formulating drastic radii.
V Convex Region
For the formulation of the safe regions for the robots to plan trajectories within, We forward simulate other robot’s trajectories utilizing their current states.
V-A Forward Simulation
Forward simulation is done utilizing the current state of the robot. The forward simulation of the robots is done by:
[TABLE]
where is the time stamp of the robot’s transmitted state.
The forward simulation is done for specified time horizon based upon the discretization .
V-B Accommodating the size of robots
At each of the discretized time points, utilizing the transmitted size of the robots, we formulate regions depending on the transmitted data. For simplicity that if three numbers are sent, The robot is modeled as a cuboid or a cube, two numbers a cylinder and a single number as a sphere. In the case of three numbers, robots are modeled as a square of diagonal of largest side, thereby allowing the robot to rotate freely. In case of two or one, the robot is modeled as a circle. A robot inflated to its size is represented as .
[TABLE]
Where is formulated as and is the radius of the robot
V-C Hyperplane
Considering a time between and , supporting hyperplanes are formulated for all that are within their appropriate moving volumes and as convex and a supporting hyperplane hence, exists. [13]. A support hyperplane for can be formulated as
[TABLE]
where is the boundary of the set
The intersection of all the support hyperplanes is convex polyhedron as an intersection of convex hyperplane is convex. We Constrain the robot to remain within the generated polyhedron at the discretized time points with polyhedron at each time point represented as . But as this region constrains the overall robot as a point we subtract the robot’s dimensions from the to constrain the robot to be within the region. the convex regions at specific time points. As the robot’s modeled so that it’s size is invariant to rotation, dilating the robot according to it’s shape of the robot.
VI Trajectory Generation
The generation of trajectory by the robot can be formulated as an optimization problem that tries to optimize the system while ensuring that
{argmini!}
xC_int + C_final + C_collision \addConstraintx(t_1)=x_0 \addConstraint∑_t_disc=1^thτH(t_disc) ≤h , t_1 ≤t ≤t_2 \addConstraint¨x≤x ≤¯¨x
While the above mentioned problem is a continuous time problem and thereby a infinite dimensional problem. To transform the problem into a finite dimension, the trajectory is represented by polynomial in each dimension.
[TABLE]
[TABLE]
VI-A Objective Function
is the integral cost functional that specifies the objective for the derivative over the integral. is the cost at the end of the time horizon. is the collision cost for static obstacles along the trajectory.
VI-A1 Derivative Cost
Derivative penalty utilized is square of the integral of nth derivative and n-1th squared over time horizon th: It is represented as:
[TABLE]
Where are tuning weights for the objective.
As the time horizon is known before hand and the initial time and position are known, We can formulate this cost with a closed form solution as a Quadratic Objective with the decision vector as:
[TABLE]
With formulated by integrating Equation 7 and substituting , and separating according to the coefficients of the polynomial.
VI-A2 End Cost
We add an end point quadratic cost for the final position along the trajectory as a soft constraint for two reasons. One, to allow to robot to plan appropriate trajectory if a robot or an obstacle is occupying or blocking the path directly to the end point. Two, in scenarios where the robot’s end pose’s time stamp is beyond it’s trajectory planning horizon, this cost tries to drive the robot as close as possible to it, while ensuring the dynamic limits aren’t violated by the hard dynamic constraint
[TABLE]
The final position is penalized but if required addition penalties on velocity, acceleration can be added. The cost can be reformulated with respect to the decision variables resulting in:
[TABLE]
VI-A3 Collision Cost
For the generated trajectory to be collision free with respect to the obstacles in the environment,the following penalty is used:
[TABLE]
Where
[TABLE]
Here is euclidean distance to each obstacle. Similar cost functions for collision avoidance have been utilized for collision avoidance for autonomous cars [14], aerial robots [15]. is a smoothness tuning parameter that allows to increase or decrease the smoothness of the collision cost.
For efficient optimization, a quadratic approximation of the obstacle cost around the previous optimized trajectory is used resulting in:
[TABLE]
VI-B Constraints
The trajectory optimisation is constrained by the derivatives of the trajectory staying within the feasible limits(Dynamic Constraints), Staying within the convex region and way-points.
VI-B1 Waypoint constraints
The trajectory has to also pass through the given time stamped poses along the trajectory, This results in linear equality constraints on the trajectory.
[TABLE]
where is the stack of poses at their time.
Moreover, as the end pose’s time is given in this scenario, the polynomial while minimizing the costs only for the specific time horizon also ensures that the robot reaches the end goal at the desired time stamp
VI-B2 Convex Region
We require that the generated trajectory also remains within the feasible convex region generated at the specific time samples. This constrain is formulated as
[TABLE]
where is the map from the polynomial coefficients to the positions. As both are linear with the polynomial coefficients this results in a convex constraint.
VI-B3 Dynamic constraints
Dynamic constraint on the robot is an infinite dimensional and hence we apply the constrains at specific points on the trajectory. This results in added Inequality constrain on the system
[TABLE]
Where is the number of discrete points wherein the constraints are added and represent the minimal and maximal limits of the derivatives.
The resulting optimization problem can be formulated as a Non Linear Program
{mini}
DD^T H_net D + F_net^T D \addConstraintA_ContD=0 \addConstraintA_wayD=P \addConstraint∑_t_disc=1^thτH(t_disc) T D ≤h \addConstraint∑_i=1^n d ≤A_dynD≤¯d
Where is formulated by and by
The Non Linear Program in Equation 4 is a Convex QP and can be solved using available solvers
VII Results and Discussion
The algorithm was implemented in C++ and integrated into Robot Operating System(ROS) and tested on a workstation with Intel Xeon E5 1630v5 processor, 32GB of RAM and a Nvidia Quadro M4000 GPU. A degree of -1 was utilized for the polynomials with being three seconds. We utilized a of 0.1 seconds for the other agents’ prediction. The threshold distance for the obstacle is kept at 0.75m.The algorithm was run at frequency of 25Hz. For solving the QP, qpOASES[16] was used.
To handle infeasible QPs that arise due to the inequality constraints, We utilize a two step process for the same. In the initial step, we apply the previous solution for some time and in second step relax the dynamic constraints.
The proposed algorithm was tested with two sets of robots one a Turtlebot 3 Burger and Waffle (utilizing the native sensor suite available on these robots) and two, AscTec Firefly and Neo using the a high fidelity simulator [17] with robots mounted with Velodyne Puck. The weight of the Velodyne Puck was modified to allow firefly to fly with it. Moreover, Velodyne Puck is a 3D sensor but we limited the sensing to a 2D region. The experiments included both homogeneous and heterogeneous interactions in an intersection like environment, a cross road-like structure which was formed with the help of walls. We utilize an intersection-like environment as it is an important usage of the labeled multi robot problem.
The robots were spawned randomly. The desired timestamped poses are predefined in for every robot. For the multirotors, the altitude was fixed at 1. For the ground robots, the generated trajectory is tracked using the MPC proposed in [18] which was solved using qpOASES [16] with a time horizon of 2 secs and a discretization of 10Hz. The aerial robots tracked trajectories with the controller proposed in [19]. The failure rates are shown in Fig 4 and Fig 5. The trajectories of Aerial robots at the intersection-like environment is shown in Fig 6.
VII-A Discussions
During the course of simulation, in a few experiments, the robots collided with the obstacles. The collisions in some cases are due to the inaccurate representation of the obstacles and also the unaccounted sensors noise. Moreover, the LIDAR measurements are available at 5Hz whereas the trajectory optimization algorithm runs at 25Hz. The usage of a single polynomial is also a potential cause for this problem. The algorithm also shows some unnecessary non smoothness in its transition towards the final end pose. The safe region generation is conservative but also results in available free-space being neglected. This at times results in the QP being infeasible.
VIII Conclusion
A decentralized algorithm for collision free navigation of multiple robots in unknown two dimensional environments was proposed in this work. The proposed algorithm parametrized trajectories by a time parameterized polynomial and generated safe regions based on the prediction of other robots in the environment. A method for obstacle representation was utilized that allowed for simpler collision avoidance with the obstacles. The proposed method was tested extensively in simulations using gazebo for up to eight aerial robots and ten turtlebots.
Utilizing piecewise spline representations of non uniform B-Splines or bezier curves are another avenue for research. The collision representation is a discrete time representation and a continuous time representation of the collision is an avenue for future research. Moreover, sophisticated models of prediction can be utilized for a better prediction accuracy. The sensors used had low update rates. Utilizing faster sensors or utilizing RGB-D or Stereo cameras for local maps is an another important possible extension. Moreover, incorporating a higher variety of primitives for the obstacles will allow for a much more accurate obstacle representation. Another avenue for future research would be test the capabilities of the algorithm in an unstructured environment.
Supplementary Material
The experiments in Gazebo can be found at https://youtu.be/wGu0GMOTeH8
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] M. Fliess, J. Lévine, P. Martin, and P. Rouchon, “Flatness and defect of non-linear systems: introductory theory and examples,” International journal of control , vol. 61, no. 6, pp. 1327–1361, 1995.
- 2[2] D. Mellinger and V. Kumar, “Minimum snap trajectory generation and control for quadrotors,” in Robotics and Automation (ICRA), 2011 IEEE International Conference on . IEEE, 2011, pp. 2520–2525.
- 3[3] R. Walambe, N. Agarwal, S. Kale, and V. Joshi, “Optimal trajectory generation for car-type mobile robot using spline interpolation∗,” IFAC-Papers On Line , vol. 49, no. 1, pp. 601–606, 2016.
- 4[4] S. Tang, J. Thomas, and V. Kumar, “Hold or take optimal plan (hoop): A quadratic programming approach to multi-robot trajectory generation,” The International Journal of Robotics Research , p. 0278364917741532, 2018.
- 5[5] T. Schouwenaars, B. De Moor, E. Feron, and J. How, “Mixed integer programming for multi-vehicle path planning,” in Control Conference (ECC), 2001 European . IEEE, 2001, pp. 2603–2608.
- 6[6] M. Sutorius and D. Panagou, “Decentralized hybrid control for multi-agent motion planning and coordination in polygonal environments,” IFAC-Papers On Line , vol. 50, no. 1, pp. 6977–6982, 2017.
- 7[7] D. Zhou, Z. Wang, S. Bandyopadhyay, and M. Schwager, “Fast, on-line collision avoidance for dynamic vehicles using buffered voronoi cells,” IEEE Robotics and Automation Letters , vol. 2, no. 2, pp. 1047–1054, 2017.
- 8[8] J. Van den Berg, M. Lin, and D. Manocha, “Reciprocal velocity obstacles for real-time multi-agent navigation,” in Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on . IEEE, 2008, pp. 1928–1935.
