Estimation of continuous environments by robot swarms: Correlated networks and decision-making
Mohsen Raoufi, Pawel Romanczuk, Heiko Hamann

TL;DR
This paper presents a decentralized control algorithm enabling robot swarms to explore unbounded environments, reach consensus on environmental features, and adapt their network topology dynamically, demonstrated through real-world experiments.
Contribution
It introduces a novel approach for continuous environment estimation in robot swarms, considering the causal loop between network topology and decision-making, validated by real-world experiments.
Findings
Higher precision in environmental feature estimation compared to control
Effective consensus achievement in real-world swarm experiments
Dynamic network topology influences convergence time
Mohsen Raoufi1,2,3, Pawel Romanczuk1,2,4 and Heiko Hamann1,5
* This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2002/1 “Science of Intelligence” – project number 390523135.
1 Mohsen Raoufi, Pawel Romanczuk and Heiko Hamann are with Science of Intelligence, Research Cluster of Excellence, Marchstr. 23, 10587 Berlin, Germany
2 Mohsen Raoufi and Pawel Romanczuk are with Institute for Theoretical Biology, Department of Biology, Humboldt Universität zu Berlin, Berlin, Germany
3 Mohsen Raoufi is with Department of Electrical Engineering and Computer Science, Technical University of Berlin, Berlin, Germany
4 Pawel Romanczuk is with Bernstein Center for Computational Neuroscience, Berlin, Germany
5 Heiko Hamann is with Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
Abstract
Collective decision-making is an essential capability of large-scale multi-robot systems to establish autonomy on the swarm level. A large portion of literature on collective decision-making in swarm robotics focuses on discrete decisions selecting from a limited number of options. Here we assign a decentralized robot system with the task of exploring an unbounded environment, finding consensus on the mean of a measurable environmental feature, and aggregating at areas where that value is measured (e.g., a contour line). A unique quality of this task is a causal loop between the robots’ dynamic network topology and their decision-making. For example, the network’s mean node degree influences time to convergence while the currently agreed-on mean value influences the swarm’s aggregation location, hence, also the network structure as well as the precision error. We propose a control algorithm and study it in real-world robot swarm experiments in different environments. We show that our approach is effective and achieves higher precision than a control experiment. We anticipate applications, for example, in containing pollution with surface vehicles.
I Introduction
Accepted at the 2023 IEEE International Conference on Robotics and Automation (ICRA).
Collective decision-making in large-scale decentralized multi-robot systems is required to coordinate and organize the system [1, 2, 3, 4]. For example, a robot swarm needs to collectively agree on a common direction in flocking or on a task allocation [5]. While task allocation is an example of a discrete consensus problem similar to best-of-n problems (collectively choosing from a finite and countable set), the flocking example is a continuous consensus achievement problem [6]. Large portions of the collective decision-making literature in swarm robotics focus on discrete problems, such as the popular collective perception benchmark scenario [4]. Here we focus on a continuous consensus achievement problem [7, 8] in the form of a decentralized estimation scenario [9]. In our previous work we studied the effect of diverse information on the accuracy of collective estimation, which gives rise to the exploration-exploitation trade-off [1]. To achieve sufficiently diverse information, the swarm needs to expand and sample from a larger area, which leads to a collective dispersal behavior. Among the distributed dispersion methods proposed in the literature, some use information that is either costly or not available on all swarm platforms [10]. However, an approximate estimate of distance has proved sufficient to achieve this goal. The performance of a greedy gradient-descent algorithm for dispersion was predicted to be challenging, especially with a large number of robots [11]. To overcome this, we propose a threshold-based random walk algorithm that proves to be efficient enough for larger swarms.
In addition, we require a form of exploitation of the collective decision, as the robots need to react to their collective decisions and aggregate at areas that are determined by their consensus. This comes with a design challenge. Should the robots separate a consensus finding phase from an exploitation phase? Either they synchronize and determine an end of the collective decision-making process, or they asynchronously switch to exploitation and try to keep finding a consensus on the go. Here we propose a solution choosing the asynchronous option. Consequently, we face another challenge. As the robots initiate their exploitation process, they try to move towards the designated area while continuing to communicate with neighbors. They form a dynamic network topology while following the collective decision-making protocol. We know that the network topology influences the decision-making process [12, 13, 14, 15, 16], and hence the emerging process is self-referential (network influences consensus, consensus influences spatial displacement). In that regard, there is a large body of literature studying this effect from a network point of view. An example of such a phenomenon is homophily in social networks [17, 18]. However, the co-evolution of network and opinion dynamics has been overlooked in swarm robotics. In this paper, we show how a swarm of real robots disperses in an unbounded environment and then aggregates at the points it has agreed on.
II Method
Following our previous work [1], we study the co-evolution of network structure and collective estimation for a swarm of real robots. The value to estimate is a continuous, spatially distributed scalar feature of the environment. In our experiments, this will be realized by a spatially varying light intensity field. The swarm’s goal is to estimate a global property of the distributed feature and approach it in the physical space. Our focus is on estimation and localization of the environmental field’s mean value (see Fig. 1).
We define two phases: exploration and exploitation. Having separate phases for exploration and exploitation has been shown to be more efficient than mixed phases [19]. During initial exploration (see Sec. II-A), we program the swarm to expand. The aim is for the individual robots to collect diverse estimates of the environmental feature. The robots are supposed to cover as much area as possible while keeping the network largely connected. The communication range and the swarm size determine how far the swarm can expand without becoming disconnected. We define the end of exploration as the moment when the collective achieves maximal area coverage while still maintaining connectivity. During the subsequent exploitation phase (see Sec. II-B), robots communicate to achieve a consensus on the mean value and, at the same time, try to move toward the spots in the environment where the measured intensity is closer to the consensus. We showed previously that combining these components produces an emergent contour-capturing behavior [1]. A possible application is to contain pollution or localize the position of a resource in the environment [20, 21, 22, 23].
We minimized the requirements with respect to the robotic platform to enable the implementation of the algorithm even on minimal robots, here specifically the Kilobot platform [24]. Although some algorithmic details are specific to our implementation on Kilobots, our model is generally applicable regardless of the swarm robotic platform. The requirements are: a) a fully distributed algorithm with no central control, b) only local environmental information available, c) communication only with local neighbors within a limited communication range, d) no prior information about either the environment or the neighbors, and e) an unbounded arena.
II-A Exploration
With exploration, the variation and diversity of information available to the swarm increases. During the exploration phase, no information is aggregated. As we demonstrated before [1], the exploration phase reduces the trueness error (systematic bias). In principle, any dispersion behavior may achieve the goal in an unbounded environment. However, due to limited connectivity of the distributed robots, a pure random dispersion may disconnect robots from their neighbors and fragment the network. Blind random motion in an unbounded environment is dangerous as robots might get lost and never find their way back to the swarm [25].
As an alternative, we suggest a random walk that preserves the connectivity of the network. A robot needs to know the approximate distance to its neighbors. We will show that even with noisy distance estimates the method is able to keep the swarm largely connected. With Kilobots, the distance is estimated from the strength of the received infra-red (IR) signal [24]. Hence, the algorithm we implemented on the robots makes the random walk conditional on the distance to the nearest neighbor. Once the minimum distance to local neighbors goes below the threshold, the robot stops and waits for its local neighbors to finish their random walk, then it switches to exploitation. Violations of the desired distance take the robot back to the dispersion phase. By the end of this phase, the collective has the potential to make a less biased (or bias-free) estimation. Then, the swarm exploits the information distributed throughout the collective to increase the precision. See Fig. 2 for an illustration of how exploration and exploitation can modulate the trueness and precision components of the total accuracy error.
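The conditional random walk described above can be sketched as follows. This is a minimal illustration and not the Kilobot firmware: the function name, the motion labels, and the choice to hold position when no neighbor is heard are hypothetical. The direction of the comparison (keep walking while the nearest neighbor is closer than the threshold, hold position once it is farther) is also an assumption of this sketch, chosen to match the remark in Sec. IV-A that robots measuring strong signals from each other continue moving.

```python
import math
import random


def dispersion_step(distances, threshold, rng=random):
    """Decide one control tick of a threshold-based random walk.

    distances: estimated distances to the neighbors heard in this tick
    threshold: desired minimum spacing (assumed smaller than the comm range)
    Returns (phase, motion); both labels are hypothetical.
    """
    # Nearest-neighbor distance; infinity if no message was received.
    d_min = min(distances, default=math.inf)
    if d_min < threshold:
        # Assumption: a neighbor is too close, so keep dispersing randomly.
        return "disperse", rng.choice(["forward", "left", "right"])
    # Spacing satisfied (or nobody heard): stop instead of wandering off.
    return "wait", "stop"
```

Because the decision is recomputed every tick, a robot that is waiting falls back to dispersion as soon as a neighbor violates the desired spacing.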
II-B Exploitation
Exploitation operates not only in the information domain, but also in the real physical space. By exploiting the information contained in the swarm, the collective estimation converges to the mean value in the information domain. The exploitation in the physical space results in individual robots converging towards the mean contours of the environmental field. Here, we introduce two mechanisms for each of the domains: local averaging and consensus-based phototaxis.
II-B1 Local averaging
The first part of exploitation is used to reach consensus in the information domain, which is achieved through local communication among robots. The interactions in this phase facilitate the wisdom of crowds effect [26, 27] by enabling the agents to average their imperfect estimates of environmental cues [28, 14]. The updating rule comes from the local averaging of the DeGroot model [29], which we modified by adding a memory term [1]. The final updating rule is formulated as:
[TABLE]
Here, each robot updates its estimation based on what it measures and on the average of its neighbors’ estimations, combined through a weighting factor. Robots repeat these updates for a fixed number of iterations. The output of this phase is the consensus value (although not all robots necessarily hold exactly the same opinion about the consensus). Robots use this value as input for the next phase.
The updating equation (Eq. 1) can be reformulated from a network point of view [7, 30]. This would convert the model to a linear system whose transition matrix is the normalized weighted adjacency matrix of the network, its states are agents’ estimation and the measurements are the inputs. Assuming the general system without input, the result of such local averaging, given that the network of communication is fully connected, is the mean value of information available within the collective [30]. Later, we briefly discuss how the connectivity of the network (mean node degree, in particular) changes the dynamic of this system.
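To make the local averaging concrete, the following sketch iterates a DeGroot-style update with a measurement input on a fixed neighbor list. The exact form of Eq. 1 is not reproduced here; the weighting `alpha * measurement + (1 - alpha) * neighbor_mean`, the symbol `alpha`, and the fallback to a robot's own estimate when it has no neighbors are assumptions made for illustration only.

```python
def local_average_step(estimates, measurements, neighbors, alpha):
    """One synchronous update of all robots.

    estimates:    current estimate of each robot
    measurements: each robot's own (fixed) measurement, used as input
    neighbors:    neighbors[i] is the list of indices robot i can hear
    alpha:        assumed weight of the own measurement, in [0, 1]
    """
    updated = []
    for i, own in enumerate(estimates):
        if neighbors[i]:
            neighbor_mean = sum(estimates[j] for j in neighbors[i]) / len(neighbors[i])
        else:
            # Assumption: an isolated robot falls back on its own estimate.
            neighbor_mean = own
        updated.append(alpha * measurements[i] + (1.0 - alpha) * neighbor_mean)
    return updated


def run_consensus(measurements, neighbors, alpha, iterations):
    """Repeat the update for a fixed number of iterations."""
    estimates = list(measurements)
    for _ in range(iterations):
        estimates = local_average_step(estimates, measurements, neighbors, alpha)
    return estimates
```

With a nonzero `alpha` the robots do not end up with identical values, which is in line with the remark that not all robots necessarily share exactly the same opinion about the consensus.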
II-B2 Consensus-based Phototaxis (CBPT)
We implement a sample-based pseudo gradient descent for the motion of robots which implements homophily on networks. Homophily is the tendency to interact more with like-minded agents in a social group [17]. We require a collective motion that moves robots sharing similar opinions closer to each other and thus establishes links [1]. As a pseudo gradient descent method, we choose the bio-inspired phototaxis method. By CBPT the robots are guided to areas where light measurements match the consensus value.
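A sample-based pseudo gradient descent of this kind can be sketched as a comparison of successive samples. The sketch below is an illustration under stated assumptions, not the implementation used on the Kilobots: the function name, the motion labels, the random turn on a worse sample, and the optional stopping tolerance are all hypothetical.

```python
import random


def cbpt_step(light, previous_error, consensus, tolerance=0.0, rng=random):
    """Choose the next motion from the latest light sample.

    light:          current light measurement
    previous_error: |measurement - consensus| at the previous sample, or None
    consensus:      value agreed on during local averaging
    Returns (error, motion), where error is passed back in at the next call.
    """
    error = abs(light - consensus)
    if error <= tolerance:
        # Assumption: the robot stops once the sample matches the consensus.
        return error, "stop"
    if previous_error is None or error < previous_error:
        # The mismatch shrank (or this is the first sample): keep heading.
        return error, "forward"
    # The mismatch did not shrink: try a different heading.
    return error, rng.choice(["left", "right"])
```

No explicit gradient is computed; only the sign of the change between two samples is used, which is why the method is called a pseudo gradient descent.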
III Metrics and Setup
III-A Covered Area
We measure the area that is covered by the swarm. We consider a disk of fixed radius centered at each robot’s position. For Kilobots, we choose this radius, specified relative to the robot radius, to be roughly half the communication range. We calculate the collective coverage as the area of the union of these disks, so that overlapping regions are counted only once:
[TABLE]
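One simple way to evaluate such a covered area from tracked positions is to rasterize the arena and count the grid cells whose centers fall inside at least one disk. The sketch below is an approximation written for illustration; the function name, the `cell` resolution parameter, and the grid-counting approach itself are assumptions rather than the procedure used in the paper.

```python
import math


def covered_area(positions, radius, cell=1.0):
    """Approximate the area of the union of equal-radius disks.

    positions: list of (x, y) robot positions
    radius:    coverage radius of each robot
    cell:      side length of the square grid cells (smaller is more accurate)
    """
    if not positions:
        return 0.0
    x_min = min(p[0] for p in positions) - radius
    x_max = max(p[0] for p in positions) + radius
    y_min = min(p[1] for p in positions) - radius
    y_max = max(p[1] for p in positions) + radius
    nx = math.ceil((x_max - x_min) / cell)
    ny = math.ceil((y_max - y_min) / cell)
    r2 = radius * radius
    count = 0
    for ix in range(nx):
        cx = x_min + (ix + 0.5) * cell
        for iy in range(ny):
            cy = y_min + (iy + 0.5) * cell
            # A cell counts once, no matter how many disks contain it.
            if any((cx - px) ** 2 + (cy - py) ** 2 <= r2 for px, py in positions):
                count += 1
    return count * cell * cell
```

Two robots at the same position yield the same area as a single robot, which reflects the non-overlapping definition of coverage.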
III-B Network Properties
The inter-agent communication network plays a critical role in the whole scenario. It is challenging to determine the existence of actual robot-robot communication links forming the network, as these links are noisy and difficult to extract from the robot swarm during an experiment. For simplicity, we assume that there is a link between two robots if the distance between them is less than the average communication range. The assumed communication range follows [31]. The links are estimated based on robot positions and distances obtained from tracking via a top-view camera. As this is only an estimation, false positives and false negatives for links between robots are possible.
We quantify the connectivity of the network using two metrics: mean node degree and giant component size. Although the communication network of Kilobots is not necessarily undirected (signal strength is not always symmetric), we assume an undirected network for simplicity. Consequently, the in- and out-degrees of every node are equal, as are the mean in- and out-degrees. As the second network metric we use the giant component size, that is, the number of nodes in the largest connected component of the network. This way we quantify how many robots have disconnected from the main cluster (implemented with the NetworkX library [32]).
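The two network metrics can be computed directly from tracked positions. The paper reports using NetworkX for this; the sketch below uses only the Python standard library so that it stays self-contained, and its function names and the example communication range are hypothetical.

```python
import math
from collections import deque


def build_links(positions, comm_range):
    """Undirected adjacency: a link exists if two robots are closer than comm_range."""
    adjacency = {i: set() for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < comm_range:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def mean_degree(adjacency):
    """Average number of links per robot (0.0 for an empty network)."""
    if not adjacency:
        return 0.0
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def giant_component_size(adjacency):
    """Number of nodes in the largest connected component (breadth-first search)."""
    seen = set()
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best
```

For example, three robots in a chain plus one far-away robot give a mean degree of 1.0 and a giant component of size 3, so one robot counts as disconnected from the main cluster.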
III-C Accuracy Metrics
The collective estimation (accuracy) error is decomposed into trueness and precision errors, which relates to the bias-variance decomposition of the total error. We showed that the generality and case-independence of these metrics enable their usage in various conditions (see [1] for details). As the ground truth for estimation, we take the mean value of the light intensity across the environment. Given the individual estimation of each robot and the collective estimation of the swarm, we obtain for the trueness, precision, and accuracy errors:
[TABLE]
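The decomposition can be illustrated with a short computation. The sketch assumes the standard bias-variance form with squared errors and takes the collective estimation to be the mean of the individual estimations; whether the paper uses exactly these definitions is an assumption, and the function name is hypothetical.

```python
def accuracy_errors(estimates, ground_truth):
    """Return (trueness, precision, accuracy) errors under a squared-error assumption.

    trueness:  squared offset of the collective (mean) estimate from the truth
    precision: mean squared spread of individual estimates around the collective one
    accuracy:  mean squared offset of individual estimates from the truth
    """
    n = len(estimates)
    collective = sum(estimates) / n
    trueness = (collective - ground_truth) ** 2
    precision = sum((e - collective) ** 2 for e in estimates) / n
    accuracy = sum((e - ground_truth) ** 2 for e in estimates) / n
    return trueness, precision, accuracy
```

Under these definitions the accuracy error equals the sum of the trueness and precision errors, so a swarm can have a small trueness error (an unbiased mean) while individual robots are still far from the ground truth.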
As we have no direct access to a robot’s current estimation, we use its position as an indicator of its estimation. For each environmental distribution, there is a mapping between the camera-detected Cartesian robot positions and the coordinate of interest. For example, in the radial distribution of Fig. 1, the mapping is:
[TABLE]
where the mapping uses the distribution’s center and the detected robot’s position in the captured frame.
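For the radial case, the description suggests that the coordinate of interest is the Euclidean distance of the detected position from the distribution's center. A sketch under that assumption, with a hypothetical function name and pixel coordinates as they would come from the tracking camera:

```python
import math


def radial_coordinate(position, center):
    """Distance (in pixels) of a detected robot from the distribution's center."""
    return math.hypot(position[0] - center[0], position[1] - center[1])
```

Robots on the same contour line of a radial field then share the same coordinate value, regardless of their angular position.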
III-D Experimental Setup
In our experiments we use Kilobot swarms [24] of up to 40 robots on a whiteboard arena. For tracking we use a downward-facing camera and the Hough circle transform from the OpenCV library [33]. Unless otherwise mentioned, we used the same parameters as in [1].
IV Results and Discussion
We study each component of our scenario (dispersion, consensus, CBPT) as stand-alone swarm tasks. Later, we combine these components to form a complex scenario.
IV-A Dispersion
The aim of dispersion is to increase the covered area. We measure how much area is covered by robots (Fig. 3-a). To indicate the dynamic network structure, we measure the mean degree of the network (Fig. 3-b). The results in Fig. 3 indicate that the collective initially starts from a dense distribution with a small covered area and high network connectivity. Due to dispersion, the collective expands and covers a larger area while the mean degree decreases. This increase in the covered area can lead to a lower trueness error in the collective estimation. The network gets sparser (reduced node degrees) while the giant component size of the network does not change significantly, suggesting that the network connectivity is largely preserved. Later we show how reduced connectivity results in slower convergence during the decision-making process. Both the covered area and the mean degree converge to steady-state values. Once robots stop moving, we finish the experiment.
In Fig. 3-c, we show the size of the giant component. The algorithm keeps the majority of the swarm connected while a few robots disconnect from the swarm. In our analysis we found that often two (or more) robots stick to each other and while measuring strong signals from each other, they continue moving. They detach from the swarm, although they are members of a small cluster. As a control experiment, we tested a random walk diffusion algorithm that does not try to preserve connectivity (solid orange line in Fig. 3-c). Almost half of the swarm disconnects within three minutes. In comparison, our algorithm preserves connectivity well.
IV-B Consensus
Consensus occurs only in the information domain, which makes it difficult to measure in a real robot experiment. We therefore simulated the consensus algorithm on a static network to show how the precision error changes over time (Fig. 4-a) and how its dynamics change with network properties, namely the mean degree. We studied spatial networks with N nodes and different connectivity to investigate the role of the mean degree. To do so, we distributed N agents uniformly in an environment and drew a deterministic network with a specific communication range. Then, we varied the communication range (as a ratio to the environment size) to obtain networks with various mean degrees. As agents share and update their estimations of the mean value of the distribution, they converge to the consensus estimation, and thus the precision error decreases (Fig. 4-a). This is the well-known speed-vs-accuracy trade-off happening over the course of decision-making.
In Fig. 4-b, we show how the mean node degree of the network influences the accuracy (precision) of the steady-state collective estimation. A higher mean degree leads to a lower precision error. With respect to the speed of consensus, we measured the time to reach a steady state using a threshold and recorded the first passage time of the precision error. The peaks in Fig. 4-c mark the mean degree at which convergence is slowest. The convergence time drops significantly for both lower and higher degrees. A low or zero mean degree means there are few or no links in the network. In that case convergence is fast because little or no information flows, but the result is not accurate. As known from graph theory, the network is immediately fully connected once the mean degree exceeds a critical value. This is where the second largest eigenvalue of the network adjacency matrix becomes less than one.
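The simulation procedure described in this subsection can be sketched as follows. This is a minimal illustration and not the code behind Fig. 4: it uses plain DeGroot averaging without input (each agent averages over itself and its neighbors), random initial estimates, a unit-square environment, and a settling criterion that compares the precision error to its final value. All of these choices, as well as the function and parameter names, are assumptions.

```python
import math
import random


def simulate_precision(n, comm_range, iterations, seed=0):
    """Precision error over time on a static spatial network in a unit square."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = [
        [j for j in range(n)
         if j != i and math.dist(positions[i], positions[j]) < comm_range]
        for i in range(n)
    ]
    estimates = [rng.random() for _ in range(n)]
    trace = []
    for _ in range(iterations):
        mean = sum(estimates) / n
        trace.append(sum((e - mean) ** 2 for e in estimates) / n)
        # DeGroot averaging without input: own value plus neighbors' values.
        estimates = [
            (estimates[i] + sum(estimates[j] for j in neighbors[i]))
            / (1 + len(neighbors[i]))
            for i in range(n)
        ]
    return trace


def settling_time(trace, threshold):
    """First iteration at which the trace is within threshold of its final value."""
    for t, value in enumerate(trace):
        if abs(value - trace[-1]) < threshold:
            return t
    return None
```

With this criterion a network without links settles at once (nothing changes, but the precision error stays high), while a network in which every agent hears every other agent settles after a single averaging step with a precision error close to zero. Intermediate communication ranges can be swept in the same way to look for the slowest case.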
IV-C Contour Capturing
Next, we present our results for the scenario of contour capturing with a swarm of Kilobots. The objective is to gather the robots at the contours where the light intensity equals the mean of the light distribution. First, we give results of our fully distributed collective method. Second, we define a control experiment without robot-robot communication as a baseline for comparison.
IV-C1 Collective Scenario–radial distribution
Here we present our main result in real-world experiments with Kilobots for the whole scenario by assembling the above components: dispersion while keeping the network connected, local averaging to achieve consensus, and homophily by CBPT to approach the consensus value. For a radial light distribution, Fig. 5-a shows the radial distribution of robot positions during the experiment. Initially, the robots are distributed rather densely close to the center. During the dispersion, the distribution becomes more uniform by spreading to larger radii. Then the local consensus finding starts with minimal movement, while the spatial distribution of robots remains largely unchanged. In a third phase, robots approach the mean contour line by CBPT and the distribution contracts around 160 pixels.
The temporal evolution of the trueness, precision and accuracy errors is illustrated in Fig. 5-b. The trueness error quickly drops to a small value by the end of the dispersion phase. However, the variation is still large although the mean value of the radial distribution is close to the ground truth. Thus, in contrast to the accurate mean value of the collective, each robot’s estimation is not yet accurate. This is because robots did not aggregate any information during dispersion. Now that the collective is less biased and the network is connected, robots exploit the information available within the entire collective. This is implemented via the local averaging of the consensus method (see Eq. 1). The swarm then arrives at a consensus in the information domain, but robot positions are still off the mean contour line. During the CBPT phase, robots approach the mean value in space and the precision error is reduced. We observe both a low precision error and a low accuracy error. These results confirm our previous work in simulations [1].
The mean degree and area coverage of the swarm evolve in an anti-correlated manner. During dispersion, the swarm spreads out to cover more area and the spatial distribution gets sparser, hence, reducing the mean node degree. But the process inverts during exploitation as robots get closer to each other and increase network connectivity. Covered area decreases because robots form a denser distribution around the contour line and the overlap area increases.
IV-C2 Control experiment–no communication
As a control experiment, the robots perform contour capturing without collaborating or exchanging any information. During exploration, each robot walks randomly while updating and aggregating its own estimation of the mean value. Robots iteratively average over measured samples. The random walk is a random diffusion that is not affected by other robots (in contrast to Sec. IV-A). It stops after a predetermined number of samples. Then robots switch to exploitation and follow the CBPT algorithm to approach the spot with the estimated mean light intensity. We used three different switching times.
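The per-robot bookkeeping of this control experiment amounts to an incremental mean with a fixed sample budget. The sketch below illustrates that idea; the function name, the phase labels, and the parameter `n_samples` are hypothetical, and the actual switching times used in the experiments are not reproduced here.

```python
def control_robot_update(mean, count, sample, n_samples):
    """Fold one light sample into a robot's running mean.

    mean, count: running mean and number of samples so far
    sample:      new light measurement taken during the random walk
    n_samples:   predetermined number of samples after which exploration ends
    Returns (mean, count, phase) with phase "explore" or "exploit".
    """
    count += 1
    # Incremental mean: no need to store past samples on the robot.
    mean += (sample - mean) / count
    phase = "exploit" if count >= n_samples else "explore"
    return mean, count, phase
```

Once the phase switches to "exploit", the running mean plays the role that the consensus value plays in the collective scenario.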
As seen in Fig. 6-a, an exploration phase that is too short reduces the trueness error only insufficiently (red line), whereas the precision error (blue line) remains as high as its initial value due to insufficient spatial dispersal of robots. In Fig. 6-b, a sufficiently long exploration reduces the trueness error and manages the temporarily high precision error. Fig. 6-c indicates that an exploration phase that is too long results in a larger precision error. In our previous work [1], we already showed that (in a bounded environment) switching too late can cause the precision error to remain high (for a limited time budget).
The unbounded environment is challenging as the swarm tends to lose more and more robots (lost connectivity) with increased exploration time (Fig. 6-d). In addition to the known speed-vs-accuracy trade-off, we find this new trade-off in unbounded environments. With uncontrolled diffusion, one pays for accuracy not only in speed but also in the number of robots that get lost.
IV-C3 Collective Scenario–V-shape ramp distribution
In the model simulations presented in [1], we showed that the algorithm is able to capture the mean contour line for different environmental distributions, including uni- and multi-modal ones. In this part, we test another distribution with an inverted V-shape and a peak on its diagonal, as in Fig. 7-a. The evolution of the distribution of robots over time (Fig. 7-b) demonstrates how the swarm expands uniformly up until the exploitation phase. Then, the robots branch into two different clusters, one on the top left and the other on the bottom right of the diagonal. The accuracy errors in Fig. 7-c show the same qualitative trends as in Fig. 5 for the radial distribution. However, the remaining precision error at the end of the experiments indicates that the problem here is more difficult to solve. We note that here the precision error represents the dominant contribution to the total error.
V Conclusion
Starting from our previous work on the speed-accuracy trade-off in collective estimation [1], we have successfully implemented a real robot swarm (Kilobots) that captures a contour in a continuous environmental field in an unbounded arena. Our dispersion method largely preserves connectivity of the swarm and minimizes the loss of robots during exploration. As another component, we introduced a sample-based optimization method inspired by phototaxis that makes the Kilobots approach the desired contour. We added a light conductor to the robot (minimizing shadows on the sensor) to improve light measurements. This seems to be a novel implementation of a gradient ascent for Kilobots with various potential applications. The code we used in this paper is available on GitHub [34].
Previously we showed that besides the speed-vs-accuracy there are also exploration-vs-exploitation trade-offs [1] that are generally non-trivial to resolve. With our new dispersion method, an optimal switching time to finish exploration is not required anymore. The swarm automatically ends dispersion at supposed best achievement constrained by connectivity. Here we discussed another trade-off induced by dynamic network topologies. During exploration, the temporarily low mean degree slows down collective decision-making. But the swarm expansion improves the accuracy of the estimation.
In future work, we plan to study contour-capturing scenarios in dynamic environments. We also plan to analyze scalability and test different light distributions.
Acknowledgment
We thank Marshall Lutz Mykietyshyn and Noran Abdelsalam for their contribution to real robot experiments.
- [1] M. Raoufi, H. Hamann, and P. Romanczuk, “Speed-vs-accuracy tradeoff in collective estimation: An adaptive exploration-exploitation case,” in 2021 International Symposium on Multi-Robot and Multi-Agent Systems (MRS). IEEE, 2021, pp. 47–55.
- [2] J. T. Ebert, M. Gauci, F. Mallmann-Trenn, and R. Nagpal, “Bayes bots: collective bayesian decision-making in decentralized robot swarms,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 7186–7192.
- [3] M. Brambilla, E. Ferrante, M. Birattari, and M. Dorigo, “Swarm robotics: a review from the swarm engineering perspective,” Swarm Intelligence, vol. 7, no. 1, pp. 1–41, 2013.
- [4] G. Valentini, E. Ferrante, H. Hamann, and M. Dorigo, “Collective decision with 100 Kilobots: Speed versus accuracy in binary discrimination problems,” Autonomous agents and multi-agent systems, vol. 30, no. 3, pp. 553–580, 2016.
- [5] M. Raoufi, A. E. Turgut, and F. Arvin, “Self-organized collective motion with a simulated real robot swarm,” in Annual Conference Towards Autonomous Robotic Systems. Springer, 2019, pp. 263–274.
- [6] G. Valentini, E. Ferrante, and M. Dorigo, “The best-of-n problem in robot swarms: Formalization, state of the art, and novel perspectives,” Frontiers in Robotics and AI, vol. 4, p. 9, 2017.
- [7] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in networked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, 2007.
- [8] Z. Ding, X. Chen, Y. Dong, S. Yu, and F. Herrera, “Consensus convergence speed in social network degroot model: The effects of the agents with high self-confidence levels,” IEEE Transactions on Computational Social Systems, 2022.
