Tree-based Inference of Species Interaction Network from Abundance Data
Raphaëlle Momal, Stéphane Robin, Christophe Ambroise

TL;DR
This paper introduces a tree-based statistical model for inferring ecological species interaction networks from abundance data, accounting for environmental effects and distinguishing direct from indirect interactions efficiently.
Contribution
It presents a novel, computationally efficient method that incorporates environmental covariates and models species interactions through averaging over tree-shaped networks.
Findings
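The method compares well with state-of-the-art approaches in simulations, even when the underlying graph strongly differs from a tree.
Accounting for covariates is critical to avoid spurious edges.
Application to two datasets highlights the influence of covariates on the inferred network.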
Abstract
The behavior of ecological systems mainly relies on the interactions between the species they involve. We consider the problem of inferring the species interaction network from abundance data. To be relevant, any network inference methodology needs to handle count data and to account for possible environmental effects. It also needs to distinguish between direct interactions and indirect associations, and graphical models provide a convenient framework for this purpose. We introduce a generic statistical model for network inference based on abundance data. The model includes fixed effects to account for environmental covariates and sampling efforts, and correlated random effects to encode species interactions. The inferred network is obtained by averaging over all possible tree-shaped (and therefore sparse) networks, in a computationally efficient manner. An output of the procedure is the…
| | SpiecEasi | gCoda | ecoCopula | MRFcov | MInt | EMtree |
|---|---|---|---|---|---|---|
| Easy | 25.45 (1.87) | 0.11 (0.06) | 5.55 (0.64) | 34.51 (3.68) | 43.04 (19.76) | 11.72 (1.89) |
| Hard | 28.43 (1.30) | 0.53 (0.25) | 9.6 (0.65) | 8.29 (0.36) | 33.77 (18.20) | 8.17 (0.50) |

| | | | | |
|---|---|---|---|---|
| EMtree | 0.44 (0.14) | 0.60 (0.17) | 0.41 (0.13) | 0.76 (0.21) |
| gCoda | 0.11 (26.8) | 0.05 (0.05) | 0.05 (0.04) | 0.09 (0.54) |
| SpiecEasi | 2.09 (0.26) | 2.37 (0.28) | 2.42 (0.27) | 2.42 (0.26) |

| | | SpiecEasi | gCoda | ecoCopula | MRFcov | MInt | EMtree |
|---|---|---|---|---|---|---|---|
| Easy | Cluster | 0.86 (0.20) | 0 (0.08) | 0.33 (0.14) | 0.74 (0.06) | 0.38 (0.17) | 0.12 (0.09) |
| | Erdös | 0.86 (0.21) | 0 (0.15) | 0.29 (0.15) | 0.73 (0.05) | 0.38 (0.15) | 0.12 (0.08) |
| | Scale-free | 0.92 (0.04) | 0 (0.04) | 0.33 (0.11) | 0.88 (0.02) | 0.73 (0.09) | 0.34 (0.08) |
| Hard | Cluster | 0.87 (0.12) | 0 (0.20) | 0.15 (0.18) | 0.78 (0.05) | 0.77 (0.09) | 0.36 (0.09) |
| | Erdös | 0.88 (0.11) | 0 (0.24) | 0 (0.15) | 0.78 (0.05) | 0.77 (0.10) | 0.35 (0.09) |
| | Scale-free | 0.94 (0.05) | 0 (0.13) | 0 (0.16) | 0.89 (0.03) | 0.94 (0.03) | 0.56 (0.07) |

| | | SpiecEasi | gCoda | ecoCopula | MRFcov | MInt | EMtree |
|---|---|---|---|---|---|---|---|
| Easy | Cluster | 0.16 (0.11) | 0.05 (0.07) | 1.04 (0.48) | 2.26 (0.58) | 0.30 (0.13) | 0.62 (0.14) |
| | Erdös | 0.15 (0.09) | 0.06 (0.08) | 0.95 (0.50) | 2.23 (0.42) | 0.30 (0.14) | 0.58 (0.12) |
| | Scale-free | 0.63 (0.13) | 0.08 (0.07) | 0.92 (0.30) | 4.86 (0.44) | 0.81 (0.25) | 0.96 (0.08) |
| Hard | Cluster | 0.21 (0.08) | 0.02 (0.03) | 0.02 (0.17) | 1.65 (0.33) | 0.68 (0.30) | 0.43 (0.10) |
| | Erdös | 0.21 (0.08) | 0.02 (0.02) | 0.00 (0.18) | 1.56 (0.32) | 0.66 (0.25) | 0.42 (0.10) |
| | Scale-free | 0.61 (0.12) | 0.04 (0.03) | 0.08 (0.24) | 3.29 (0.40) | 3.63 (1.08) | 0.94 (0.09) |

| | | SpiecEasi | gCoda | ecoCopula | MRFcov | MInt | EMtree |
|---|---|---|---|---|---|---|---|
| Easy | Cluster | 1.77 | 13.89 | 1.74 | 0.00 | 0.00 | 0.00 |
| | Erdös | 0.68 | 11.95 | 0.99 | 0.00 | 0.83 | 0.00 |
| | Scale-free | 0.00 | 1.88 | 0.00 | 0.00 | 0.00 | 0.00 |
| Hard | Cluster | 0.00 | 14.05 | 23.40 | 0.00 | 0.00 | 0.00 |
| | Erdös | 0.00 | 20.85 | 27.28 | 0.00 | 0.00 | 0.00 |
| | Scale-free | 0.00 | 5.97 | 15.46 | 0.00 | 0.00 | 0.00 |

| | SpiecEasi | gCoda | ecoCopula | MRFcov | MInt | EMtree |
|---|---|---|---|---|---|---|
| Easy | 29.73 (2.00) | 1.29 (0.30) | 28.14 (1.46) | 48.7 (2.32) | 138.13 (39.60) | 50.19 (7.81) |
| Hard | 29.38 (1.31) | 40.73 (20.94) | 27.16 (1.30) | 14.1 (0.36) | 95.27 (46.34) | 23.59 (2.09) |

| | 1 | 2 | 10 | 20 | 50 | 150 |
|---|---|---|---|---|---|---|
| Easy | 0.66 (0.15) | 1.86 (0.23) | 7.00 (0.81) | 12.29 (1.27) | 29.50 (3.39) | 87.30 (10.36) |
| Hard | 0.45 (0.12) | 1.44 (0.14) | 5.06 (0.78) | 8.97 (0.87) | 23.35 (2.40) | 69.29 (10.83) |

| | | | | |
|---|---|---|---|---|
| EMtree | 0.41 (0.11) | 0.60 (0.15) | 0.38 (0.12) | 0.71 (0.21) |
| gCoda | 0.12 (0.47) | 0.07 (0.03) | 0.05 (0.03) | 0.09 (0.06) |
| SpiecEasi | 2.41 (0.25) | 2.41 (0.25) | 2.39 (0.25) | 2.42 (0.25) |
Taxonomy
Topics: Plant and animal studies · Ecology and Vegetation Dynamics Studies · Species Distribution and Climate Change
Tree-based Inference of Species Interaction Network from Abundance Data
Raphaëlle Momal (1; corresponding author), Stéphane Robin (1), Christophe Ambroise (2)
1: UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, 75005 Paris, France
2: Laboratoire de Mathématiques et Modélisation d’Évry, 23 bvd de France, Évry, France
Summary
- The behavior of ecological systems mainly relies on the interactions between the species they involve. We consider the problem of inferring the species interaction network from abundance data. To be relevant, any network inference methodology needs to handle count data and to account for possible environmental effects. It also needs to distinguish between direct interactions and indirect associations, and graphical models provide a convenient framework for this purpose.
- We introduce a generic statistical model for network inference based on abundance data. The model includes fixed effects to account for environmental covariates and sampling efforts, and correlated random effects to encode species interactions. The inferred network is obtained by averaging over all possible tree-shaped (and therefore sparse) networks, in a computationally efficient manner. An output of the procedure is the probability for each edge to be part of the underlying network.
- A simulation study shows that the proposed methodology compares well with state-of-the-art approaches, even when the underlying graph strongly differs from a tree. The analysis of two datasets highlights the influence of covariates on the inferred network.
- Accounting for covariates is critical to avoid spurious edges. The proposed approach could be extended to perform network comparison or to look for missing species.
Key-words:
abundance data, covariates adjustment, EM algorithm, graphical models, matrix tree theorem, Poisson log-Normal model, species interaction network
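The tree-averaging step mentioned in the summary rests on the matrix tree theorem, which turns a sum over all spanning trees into a determinant. The sketch below illustrates that idea on a toy weight matrix; it is not the authors' implementation, the function names are hypothetical, and it assumes a symmetric matrix of non-negative edge weights with a tree distribution proportional to the product of the weights along the tree.

```python
import numpy as np

def laplacian(W):
    """Weighted graph Laplacian L = D - W for a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def spanning_tree_sum(W):
    """Sum over all spanning trees of the product of their edge weights.

    By the weighted matrix tree theorem this equals any cofactor of the
    Laplacian; here the first row and column are deleted.
    """
    return np.linalg.det(laplacian(W)[1:, 1:])

def edge_probabilities(W):
    """P(edge jk in T) when P(T) is proportional to the product of W over T.

    Uses the identity P(jk in T) = w_jk * R_jk, where R_jk is the effective
    resistance obtained from the pseudo-inverse of the Laplacian.
    """
    Lp = np.linalg.pinv(laplacian(W))
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2.0 * Lp
    return W * R

# Toy example (assumed data): 4 species, all edges equally weighted.
W = np.ones((4, 4)) - np.eye(4)
n_trees = spanning_tree_sum(W)   # Cayley's formula: 4**2 = 16 spanning trees
P = edge_probabilities(W)        # by symmetry each of the 6 edges has probability 3/6
```

Both quantities cost one determinant or one pseudo-inverse of a p × p matrix, whereas direct enumeration would have to visit p^(p-2) trees. Since every spanning tree has p - 1 edges, the edge probabilities always sum to p - 1, which gives a quick sanity check.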
1 Introduction
There is a growing awareness of biotic interactions being crucial components of biodiversity and relevant descriptors of ecosystems (Valiente-Banuet et al., 2015; Jordano, 2016). Such interactions can be conveniently represented by networks, which have been increasingly studied and used in recent years for describing and understanding living systems in ecology (Poisot et al., 2016), microbiology (Faust and Raes, 2012) or genomics (Evans et al., 2016). Observing species interactions is a laborious task, which restricts observations to certain categories of interaction (e.g. trophic, pollination), while many other mutualistic and/or antagonistic interactions (e.g. communication, shelter sharing) may be hard to observe yet key to the organization of the system. Much effort has been devoted in the last decade to getting a more complete picture of the biotic interactions existing between species living in the same niche.
Network reconstruction.
A first approach consists of using observed interactions to predict other possible links based on species trait matching (see e.g. Olito and Fox, 2015; Bartomeus et al., 2016; Weinstein and Graham, 2017; Graham and Weinstein, 2018). The interaction strength can also be predicted (Wells and O’Hara, 2013). This can be viewed as a prediction task, and modern approaches arising from signal processing and machine learning have also been proposed (Desjardins-Proulx et al., 2017; Stock et al., 2017; Dallas et al., 2017). We name these approaches network reconstruction to distinguish them from network inference, which is the problem we consider in this article.
Network inference.
Network inference approaches also aim at retrieving the interactions among species, but do not rely on observed interactions and therefore remain agnostic as to their type. Such approaches have been developed in many domains ranging from cell biology (Friedman, 2004, to infer gene regulatory networks) to neurosciences (Zhu and Cribben, 2018, to decipher brain connectivity structures). In ecology, network inference typically aims at inferring the set of biotic interactions linking species from the same guild. As summarized in Fig. 1, network inference takes as input measurements on species at similar sites, and returns a network of direct interactions between species. The importance of distinguishing between direct interaction and indirect association between species is explained in Popovic et al. (2019). To be accurate, network inference must account for environmental covariates to prevent the inference of spurious interactions resulting from abiotic effects. Fig. 1 illustrates this phenomenon: one panel corresponds to the case where two species (1 and 4) are not in direct interaction, but are both affected by the variations of the same environmental covariate; the other displays the network obtained when this covariate is not accounted for, in which a spurious edge appears between these two species.
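The spurious-edge phenomenon described above can be reproduced in a few lines. The following is a minimal sketch on simulated data, assuming Gaussian responses rather than counts for simplicity; the effect sizes, sample size and variable names are illustrative choices, not values from the paper. Two species respond to the same covariate but not to each other, so their raw correlation is strong while the correlation of their residuals after regressing on the covariate is close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Shared environmental covariate (hypothetical, standardized).
x = rng.normal(size=n)

# Species 1 and 4 both depend on x but have independent noise:
# there is no direct interaction between them.
y1 = 1.5 * x + rng.normal(size=n)
y4 = -1.0 * x + rng.normal(size=n)

def residuals(y, x):
    """Residuals of an ordinary least-squares fit of y on (1, x)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Ignoring the covariate: the two species look strongly associated.
raw_corr = np.corrcoef(y1, y4)[0, 1]

# Adjusting for the covariate: the association essentially vanishes.
adj_corr = np.corrcoef(residuals(y1, x), residuals(y4, x))[0, 1]

print(f"raw correlation:      {raw_corr:+.2f}")
print(f"adjusted correlation: {adj_corr:+.2f}")
```

A method that thresholds the raw association would draw an edge between the two species, whereas the adjusted association gives no support for one.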
Joint species distribution models.
The rationale behind network inference is that interactions between species must affect their joint distribution in a series of similar sites. Such approaches necessarily rely on a joint species distribution model (JSDM), as opposed to species distribution models (Elith and Leathwick, 2009) where species are traditionally considered as disconnected entities. A JSDM is a probabilistic model describing the simultaneous presence/absence of species (Harris, 2015; Ovaskainen et al., 2017) or their joint abundances (Popovic et al., 2018, 2019). An important feature of JSDMs is to include environmental covariates to account for abiotic interactions.
Recently, latent variable models have received attention in community ecology as they provide a convenient way to model the dependence structure between species (Warton et al., 2015). The JSDM proposed by Popovic et al. (2018, 2019) involves a latent layer. So does the Poisson log-Normal model (PLN, Aitchison and Ho, 1989), which combines generalized linear models to account for covariates and offsets, and a Gaussian latent structure to describe the species interactions. It can be seen as a multivariate mixed model, in which correlated random effects encode the dependency between the species abundances.
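To make the structure of the PLN model concrete, the sketch below simulates counts from it. It assumes the usual parametrization in which, for site i and species j, the count is Poisson with log-mean equal to an offset plus a linear covariate effect plus a Gaussian latent term whose covariance across species encodes the dependencies. The function name `simulate_pln` and all numerical values are illustrative assumptions.

```python
import numpy as np

def simulate_pln(X, B, Sigma, offsets=None, rng=None):
    """Draw an (n sites) x (p species) count matrix from a PLN model.

    X       : (n, d) covariate matrix (include a column of ones for intercepts)
    B       : (d, p) regression coefficients, one column per species
    Sigma   : (p, p) covariance of the Gaussian latent layer
    offsets : (n, p) log-scale offsets, e.g. log sampling effort (default 0)
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape[0], B.shape[1]
    if offsets is None:
        offsets = np.zeros((n, p))
    # Correlated random effects: one latent Gaussian vector per site.
    Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    # Fixed effects (covariates and offsets) plus random effects, on the log scale.
    log_mean = offsets + X @ B + Z
    return rng.poisson(np.exp(log_mean))

# Toy usage (assumed values): 200 sites, 1 covariate plus intercept, 3 species.
rng = np.random.default_rng(42)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
B = np.array([[1.0, 0.5, 2.0],
              [0.3, -0.4, 0.0]])
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.5],
                  [0.0, 0.5, 1.0]])
Y = simulate_pln(X, B, Sigma, rng=rng)
```

Conditionally on the latent layer the species counts are independent, so all dependence between species in this sketch comes from `Sigma`, while `X @ B` and the offsets carry the abiotic and sampling effects.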
References
- [1] Valiente-Banuet et al. (2015): A. Valiente-Banuet, M. A. Aizen, J. M. Alcántara, J. Arroyo, A. Cocucci, M. Galetti, M. B. García, D. García, J. M. Gómez, P. Jordano, et al., Functional Ecology 29, 299 (2015).
- [2] Jordano (2016): P. Jordano, Functional Ecology 30, 1883 (2016).
- [3] Poisot et al. (2016): T. Poisot, D. B. Stouffer, and S. Kéfi, Functional Ecology 30, 1878 (2016).
- [4] Faust and Raes (2012): K. Faust and J. Raes, Nature Reviews Microbiology 10, 538 (2012).
- [5] Evans et al. (2016): D. M. Evans, J. J. Kitson, D. H. Lunt, N. A. Straw, and M. J. Pocock, Functional Ecology 30, 1904 (2016).
- [6] Olito and Fox (2015): C. Olito and J. W. Fox, Oikos 124, 428 (2015).
- [7] Bartomeus et al. (2016): I. Bartomeus, D. Gravel, J. M. Tylianakis, M. A. Aizen, I. A. Dickie, and M. Bernard-Verdier, Functional Ecology 30, 1894 (2016).
- [8] Weinstein and Graham (2017): B. G. Weinstein and C. H. Graham, Ecology Letters 20, 326 (2017).
