Pointwise prediction of protein diffusive properties using machine learning
Rasched Haidari, Achillefs N Kapanidis

TL;DR
This paper introduces M3, a machine learning method that accurately predicts protein diffusion properties from noisy data, outperforming traditional methods.
Contribution
M3 is a novel machine learning approach for pointwise inference of protein diffusive properties with high accuracy and minimal expert tuning.
Findings
M3 achieves high state accuracies (>90%) for inferring protein diffusion properties.
The method uses LSTM cells to reduce mean absolute errors in diffusion coefficient and anomalous exponent predictions.
M3 successfully detects changepoints in protein behavior and finished in the Top 5 of the Anomalous Diffusion Challenge 2024.
Abstract
The understanding of cellular mechanisms benefits substantially from accurate determination of protein diffusive properties. Prior work in this field primarily focuses on traditional methods, such as mean square displacements, for calculation of protein diffusion coefficients and biological states. This proves difficult and error-prone for proteins undergoing heterogeneous behaviour, particularly in complex environments, limiting the exploration of new biological behaviours. The importance of determining protein diffusion coefficients, anomalous exponents, and biological behaviours led to the Anomalous Diffusion Challenge 2024, exploring machine learning methods to infer these variables in heterogeneous trajectories with time-dependent changepoints. In response to the challenge, we present M3, a machine learning method for pointwise inference of diffusive coefficients, anomalous…
Table 1. General parameters used for model training.

| Model | Parameters | Sampling range |
|---|---|---|
| Single state (SSM) | $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
| Multi state (MSM) | Number of states | {2, 3} |
| | | [0, 1] |
| Quenched trap (QTM) | Unbinding probability | [0, 0.1] |
| | Binding probability | 1 |
| | Trap radius | [0.5, 2] |
| | Number of traps | [100, 300] |
| | Free $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
| | Trapped $\alpha$, $K$ | 0, 0 |
| Dimerization (DIM) | Unbinding probability | [0, 0.1] |
| | Binding probability | 1 |
| | Interaction radius | [0.5, 5] |
| | Free $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
| | Dimerised $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
| Transient confinement (TCM) | Transition probability | [0, 0.3] |
| | Number of compartments | [30, 50] |
| | Compartment radius | [5, 10] |
| | Free $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
| | Confined $\alpha$, $K$ | [0, 2], [$10^{-12}$, $10^{6}$] |
Funding
- Medical Sciences Doctoral Training Centre, University of Oxford
- Wellcome Trust (funder ID 10.13039/100010269)
- UK Biotechnology and Biological Sciences Research Council
1 Introduction
Protein diffusion is vital to understanding cellular processes and the mechanisms that govern cellular function [1–4]. Advances in fields such as single-molecule imaging and single-particle tracking have allowed direct experimental observation of protein movement [4–15]. These studies have made extensive use of spatiotemporal properties of protein trajectories, such as mean square displacements (MSD) and diffusion coefficients. Subsequent hard thresholding of diffusion coefficients allows proteins to be categorised into different states, with each state representing an underlying biological behaviour (e.g. an RNA polymerase molecule with a low diffusion coefficient may be interacting with DNA) [14, 16–18].
Although successful for a small number of states with relatively simple behaviours, MSD approaches are difficult to extend to complex behaviours such as changes in environment (e.g. viscosity), confinement or directed motion [19–21]. The problem is compounded when proteins transition between states, which leads to inaccurate diffusion coefficients and to overlapping states that cause misclassifications [22].
To address these issues, Muñoz-Gil et al. introduced the Anomalous Diffusion (AnDi) Challenge in 2020 [21]. This challenge focused on advanced statistical and machine learning approaches to predict the anomalous exponent and diffusion type of protein trajectories undergoing at most a single changepoint between states. The anomalous exponent is essential to understanding protein movement and is related to the MSD and the diffusion coefficient by the equation [23, 24]:
$$\text{MSD}(t) = 4Kt^{\alpha}$$

where $K$ is the diffusion coefficient and $\alpha$ is the anomalous exponent. There are three distinct cases of $\alpha$ for which changes in diffusive behaviour are observed:
- $\alpha < 1$: sub-diffusion, protein trajectories display ‘confined’ steps; $\text{MSD} \propto t^{\alpha}$ ($\alpha = 0$ representing immobile motion)
- $\alpha = 1$: Brownian motion; $\text{MSD} \propto t$
- $1 < \alpha < 2$: super-diffusion, protein trajectories display ‘directed’ steps; $\text{MSD} \propto t^{\alpha}$ ($\alpha = 2$ representing ballistic motion)
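As an illustration of the traditional MSD route that this relation supports, the following is a minimal sketch that computes a time-averaged MSD for a single 2D trajectory and fits $\log \text{MSD} = \log(4K) + \alpha \log t$ by least squares. The function names, the use of a time average in place of an ensemble average, and the tolerance used to label the regime are assumptions made for this example; they are not taken from the paper.

```python
import math


def time_averaged_msd(xs, ys, max_lag):
    """Time-averaged MSD of one 2D trajectory for lags 1..max_lag (in frames)."""
    n = len(xs)
    msd = []
    for lag in range(1, max_lag + 1):
        sq = [(xs[i + lag] - xs[i]) ** 2 + (ys[i + lag] - ys[i]) ** 2
              for i in range(n - lag)]
        msd.append(sum(sq) / len(sq))
    return msd


def fit_anomalous_diffusion(lags, msd):
    """Least-squares line through (log t, log MSD); returns (K, alpha).

    Since MSD(t) = 4 K t^alpha, the slope is alpha and the intercept is
    log(4 K), so K = exp(intercept) / 4.
    """
    lx = [math.log(t) for t in lags]
    ly = [math.log(m) for m in msd]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    alpha = sxy / sxx
    K = math.exp(my - alpha * mx) / 4.0
    return K, alpha


def classify_regime(alpha, tol=0.05):
    """Label the diffusion regime; the tolerance around alpha = 1 is an assumption."""
    if alpha < 1.0 - tol:
        return "sub-diffusion"
    if alpha <= 1.0 + tol:
        return "Brownian"
    return "super-diffusion"
```

A single fit of this kind returns one $K$ and one $\alpha$ per trajectory, which is why it struggles once the trajectory contains changepoints.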
The above MSD formula can be used to describe fractional Brownian motion (fBM), a generalisation of Brownian motion in which the increments in position are no longer independent [23–28]. The covariance function for fBM is given by [24]:
$$\text{Cov}\left[B_H(t)\,B_H(s)\right] = K\left(|t|^{2H} + |s|^{2H} - |t - s|^{2H}\right)$$

where $B_H$ is the stochastic process, $t$ and $s$ are time points, and $H$ is the Hurst exponent given by $\alpha = 2H$ [24]. When $\alpha \ne 1$, different timesteps of the protein trajectories are negatively or positively correlated, leading to ‘confined’ or ‘directed’ behaviour respectively.
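The sign of this correlation follows directly from the covariance function. The sketch below expands the covariance of two successive unit-time steps, $B_H(1) - B_H(0)$ and $B_H(2) - B_H(1)$, by bilinearity and normalises by the step variance; the result simplifies to $2^{\alpha - 1} - 1$. The function names are hypothetical, and the example assumes $0 < \alpha \le 2$.

```python
def fbm_cov(t, s, K, H):
    """fBM covariance as written above: K (|t|^2H + |s|^2H - |t - s|^2H)."""
    return K * (abs(t) ** (2 * H) + abs(s) ** (2 * H) - abs(t - s) ** (2 * H))


def successive_increment_correlation(alpha, K=1.0):
    """Correlation between the steps B(1)-B(0) and B(2)-B(1), using alpha = 2H.

    Assumes 0 < alpha <= 2 so that the covariance vanishes at t = 0.
    """
    H = alpha / 2.0
    # Cov[B(1)-B(0), B(2)-B(1)] expanded term by term.
    cov_steps = (fbm_cov(1, 2, K, H) - fbm_cov(1, 1, K, H)
                 - fbm_cov(0, 2, K, H) + fbm_cov(0, 1, K, H))
    # Var[B(1)-B(0)] expanded the same way.
    var_step = (fbm_cov(1, 1, K, H) - 2 * fbm_cov(0, 1, K, H)
                + fbm_cov(0, 0, K, H))
    return cov_steps / var_step
```

For $\alpha = 1$ the correlation is zero (independent steps), for $\alpha < 1$ it is negative, and for $\alpha > 1$ it is positive, independent of $K$.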
AnDi 2020 produced machine learning techniques that are more accurate than traditional MSD methods [21]. Machine learning models, developed both during and outside of the challenge, have implemented various architectures such as recurrent neural networks, convolutional neural networks, graph neural networks, feed-forward networks with feature engineering, and more recently transformers [29–45].
The success of the first challenge led to AnDi 2024 [46], in which the problem focused more specifically on fBM of proteins undergoing different biological behaviours and an arbitrary number of changepoints between these states. Given noisy protein coordinates, $x(t)$ and $y(t)$, the task is to infer $K$, $\alpha$, and the protein state ($s$) over the timeseries, where each variable can have an unknown number of arbitrary changepoints.
1.1 AnDi challenge 2024
AnDi 2024 considers protein trajectories simulated from five different biological behaviours, as shown in figure 1(a). The single-state model (SSM) does not have any changepoints, providing a control for the sensitivity of the model in the detection of changes. The remaining models can exhibit an arbitrary number of changepoints between two or more states, as given by a transition matrix. At these changepoints, at least one of $K$, $\alpha$ or $s$ will change. The multi-state model (MSM) considers time-dependent changes; the quenched-trap model (QTM) considers proteins in trapped (immobile) regions; the transient-confinement model (TCM) considers proteins entering and exiting compartments; and lastly the dimerization model (DIM) considers co-diffusion between interacting proteins. The model can be extended to higher-order oligomeric states, which would involve scanning localisation coordinates for particles within Gaussian noise of each other. This would not require any further changes to the model (only data filtering), and such particles can be assigned into groups.
Figure 1. (a) Biological behaviours simulated using the AnDi package [49]. Red paths denote a change in behaviour. Immobile regions are assigned a state, $s$, equal to zero, while confined regions are assigned a state value of one. Unless specified, the state is either two or three. Reproduced from [46]. CC BY 4.0. (b) M3 model architecture. The model takes in a protein trajectory and extracts features $f_t$. These are passed through a bidirectional LSTM, whose output is concatenated with the input using skip connections and subsequently layer normalised. This is performed three times, followed by a feed-forward layer providing pointwise predictions. The model is retrained for every variable.
For QTM, $K = \alpha = s = 0$ when the protein is trapped. For the rest of the diffusive states, we have $\alpha \in [0, 2]$, $K \in [10^{-12}, 10^{6}]$, and $s \in \{0, 1, 2, 3\}$. Note there are five underlying biological behaviours but four states. These represent immobile (imm.) motion ($s = 0$ with $\alpha = 0$), confinement (conf.) ($s = 1$ with $0 < \alpha < 1.9$), free diffusion ($s = 2$ with $0 < \alpha < 1.9$), and directed (dir.) motion ($s = 3$ with $\alpha > 1.9$) [46]. The value of $\alpha$ or $K$ does not necessarily have to change with $s$.
In response to the AnDi 2024 challenge, we present M3, a long short-term memory (LSTM) based method for the pointwise inference of $K$, $\alpha$, and $s$, with subsequent changepoint detection using the Python package ruptures [47, 48]. M3, as depicted in figure 1(b), extracts multivariate time series features from the original coordinates of the protein trajectory and passes these into bidirectional LSTM (biLSTM) layers (with skip connections and dropout), followed by a feedforward layer. The use of LSTMs with skip connections allows the model to pick up on long-range dependencies while keeping the architecture simple.
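To make the step from pointwise predictions to changepoints concrete, the sketch below reads changepoints off a sequence of predicted state labels. It is a deliberately simple stand-in and not the ruptures-based procedure used by M3: it only handles the discrete state $s$, and the function name and the minimum segment length `min_len` are assumptions introduced for this example.

```python
def changepoints_from_states(states, min_len=3):
    """Indices t at which the (cleaned) state label differs from that at t - 1.

    Runs shorter than min_len are treated as prediction noise and absorbed
    into the preceding segment before changepoints are read off. A short run
    at the very start is kept, since there is no preceding segment.
    """
    cleaned = []
    n = len(states)
    i = 0
    while i < n:
        j = i
        while j < n and states[j] == states[i]:
            j += 1  # extend the current run of identical labels
        label = states[i]
        if j - i < min_len and cleaned:
            label = cleaned[-1]  # absorb the short run into the previous segment
        cleaned.extend([label] * (j - i))
        i = j
    return [t for t in range(1, n) if cleaned[t] != cleaned[t - 1]]
```

For continuous outputs such as $K(t)$ and $\alpha(t)$, label comparison does not apply, and a dedicated changepoint method is needed.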
2 Methods
2.1 Model architecture
As we are dealing with timeseries data, we implement a custom sequence-to-sequence model based on LSTM cells. Specifically, we have three stacked biLSTMs with rectified linear unit (ReLU) activation functions, skip connections, and layer normalisations (see figure 1(b)). The first two biLSTM cells have two hidden layers and the last biLSTM has a single layer, with all three cells returning a vector of length 128. The output of the last biLSTM is passed to a fully connected layer, mapping the result into the desired shape. The input to the model is ten features (see 2.3 Feature selection), in the form of timeseries, extracted from the initial coordinates $x(t)$ and $y(t)$. The model returns a sequence of $K(t)$, $\alpha(t)$ or $s(t)$ for every timestep. For each of the three output variables, we duplicate the model and retrain, with the exception of adding a LogSoftmax layer before the output of the state model (classification task). M3 has a total of roughly 513 000 trainable parameters per model.
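As a rough consistency check on the quoted model size, the following sketch counts trainable parameters for one reading of the architecture. Every layout choice here is an assumption, not a detail confirmed by the text: 64 hidden units per direction (so each biLSTM returns 128 values), skip connections that concatenate each block's input with its output, a layer normalisation with scale and shift after every block, a single regression output, and a PyTorch-style LSTM parameterisation with two bias vectors per direction.

```python
def lstm_layer_params(n_in, hidden, bidirectional=True):
    """Parameters of one LSTM layer with four gates.

    Assumed layout per direction: input weights (4h x n_in), recurrent
    weights (4h x h) and two bias vectors of length 4h.
    """
    per_direction = 4 * hidden * (n_in + hidden) + 2 * 4 * hidden
    return per_direction * (2 if bidirectional else 1)


def m3_like_param_count(n_features=10, hidden=64, layers=(2, 2, 1), n_out=1):
    """Parameter count of three stacked biLSTM blocks under the assumptions above."""
    total = 0
    width = n_features
    for n_layers in layers:
        n_in = width
        for _ in range(n_layers):
            total += lstm_layer_params(n_in, hidden)
            n_in = 2 * hidden        # deeper layers see the biLSTM output
        width = width + 2 * hidden   # skip connection: concatenate input and output
        total += 2 * width           # layer normalisation: scale and shift
    total += width * n_out + n_out   # final fully connected layer
    return total
```

With the defaults this evaluates to 513 991, which is in line with the figure of roughly 513 000 given above; other layouts (for example, additive skip connections) would give different counts.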
The model architecture resulted from starting with a single biLSTM cell and progressively incrementing the number of cells and layers until no improvement in validation loss was seen. The same procedure was carried out for other parameters in the model, such as the dropout rate, number of fully connected layers, learning rate and regularisation parameter. To streamline this process, we used Optuna [50], an open-source Python library implementing Bayesian hyperparameter search, together with some trial and error (see 2.5 Training for hyperparameter values and Data Availability for the tuning process).
2.2 Training data
Training data was simulated using the AnDi Python package [21, 46, 49]. Table 1 shows the general parameters used for model training. Briefly, we simulated 400 000 trajectories each for the single-state, multi-state, immobile, confined, and dimerised models, giving a total of two million trajectories. We repeated this process with $\alpha$ constrained to [1.9, 2]; the constrained range yields more trajectories displaying directed motion ($s = 3$), a minority class. We further simulated trajectories (using the same sampling ranges as in table 1) in which changepoints occurred in only one of $\alpha$ or $K$ while the other remained fixed across the changepoint. This is only possible for the multi-state, dimerised, and confined models, and allows isolation of $\alpha$ or $K$ without interference from the other. In total we obtained 5.2 million trajectories. The trajectories were split randomly into train/test/validation datasets in a ratio of 60:20:20, such that there are 3.12 million trajectories in the training data. A few figures were generated using different datasets; this is noted in the text where it applies. Units for $K$ are $\text{pixels}^2/\text{frame}^{\alpha}$.
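The dataset sizes quoted above can be checked with a few lines of arithmetic. In this sketch the size of the isolated-changepoint set is obtained by subtraction from the stated total; it works out to 1.2 million, which would correspond to 400 000 trajectories for each of the three eligible models, although that per-model figure is an inference and is not stated in the text. The function name is hypothetical.

```python
def dataset_sizes(per_model=400_000, n_models=5, total=5_200_000,
                  split=(60, 20, 20)):
    """Bookkeeping for the simulated trajectories and the 60:20:20 split."""
    base = per_model * n_models         # alpha sampled over the full range
    directed = per_model * n_models     # repeated with alpha in [1.9, 2]
    isolated = total - base - directed  # changepoints in only alpha or only K
    train, test, val = (total * p // 100 for p in split)
    return {
        "base": base,
        "directed": directed,
        "isolated": isolated,
        "train": train,
        "test": test,
        "validation": val,
    }
```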
2.3 Feature selection
We carried out an extensive literature search, including past AnDi challenge literature, to obtain features specific to the problem [21, 29–46, 51–59]. We added generic timeseries features (e.g. running means and standard deviations), resulting in a list of roughly 80 features. Using Pearson’s correlation coefficient (PCC), we removed highly collinear features (|PCC| > 0.95). We then performed greedy forward feature selection, starting with the z-normalised (zero mean and unit variance) $x(t)$ and $y(t)$. By symmetry, a feature that did not improve the model in $x$ was removed alongside its $y$ counterpart (e.g. the $x$ and $y$ running standard deviations).
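The two-stage procedure described above can be sketched as follows. This is an illustrative outline only: the helper names, the order in which collinear features are dropped, and the `score` callback (assumed to train a model on the named features and return a validation loss) are all assumptions, and the paired removal of $x$ and $y$ counterparts is omitted for brevity.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def drop_collinear(features, threshold=0.95):
    """Keep a feature only if |PCC| <= threshold against every feature kept so far.

    features maps a feature name to its list of values; earlier entries win.
    """
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


def greedy_forward_selection(candidates, start, score):
    """Add one feature at a time while the validation loss keeps improving.

    score(names) is a hypothetical callback returning a loss (lower is better).
    """
    selected = list(start)
    best = score(selected)
    remaining = [c for c in candidates if c not in selected]
    while remaining:
        loss, name = min((score(selected + [c]), c) for c in remaining)
        if loss >= best:
            break  # no remaining candidate improves the model
        best = loss
        selected.append(name)
        remaining.remove(name)
    return selected
```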
After this process, we were left with ten features, each being a timeseries itself, derived from the original coordinates. These are:
- • \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} z\end{document} -normalised \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} x\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} y\end{document} coordinates \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \hat x\left( t \right) = ,\frac{{x\left( t \right) - ,\bar x}}{{{\sigma _x}}}\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \hat y\left( t \right) = ,\frac{{y\left( t \right) - ,\bar y}}{{{\sigma _y}}}\end{document}
- •displacement from origin \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} ,d\left( t \right) = \sqrt {{{\left( {x\left( t \right) - x\left( 0 \right)} \right)}^2} + {{\left( {y\left( t \right) - y\left( 0 \right)} \right)}^2}} \end{document}
- • \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} z\end{document} -normalised step size \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \widehat {{\text{step}}}\left( t \right) = { }\frac{{{\text{step}}\left( t \right) - { }\overline {{\text{step}}} }}{{{\sigma _{{\text{step}}}}}}\end{document} with \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} {\text{step}}\left( t \right) = { }\sqrt {\Delta x{{\left( t \right)}^2} + \Delta y{{\left( t \right)}^2}} \end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{upgreek} \usepackage{mathrsfs} \setlength{\oddsidemargin}{-69pt} \begin{document} \Delta \end{document} is the difference between consecutive elements in the timeseries.
- Angle between segments of the trajectory, $\vartheta(t)$
- $\text{str}(t) = \frac{\sqrt{\left(\hat x(t) - \hat x(0)\right)^2 + \left(\hat y(t) - \hat y(0)\right)^2}}{\sum_{j=1}^{t} \sqrt{\Delta \hat x(j)^2 + \Delta \hat y(j)^2}}$
- $\text{str}_2(t) = \frac{\left(\hat x(t) - \hat x(0)\right)^2 + \left(\hat y(t) - \hat y(0)\right)^2}{\sum_{j=1}^{t} \left(\Delta \hat x(j)^2 + \Delta \hat y(j)^2\right)}$
- $F(\hat x)$, $F(\hat y)$, and $F(\widehat{\text{step}})$, where $F(k) = \log\left|\Delta k(t)\right|$.
The features $\text{str}(t)$ and $\text{str}_2(t)$ are measures of trajectory straightness, where a value of 1 represents perfectly directed motion (straight path) and a value of 0 represents a path that has returned to its starting position. The final input to the model is [$\hat x, \hat y, d, \widehat{\text{step}}, \vartheta, \text{str}, \text{str}_2, F_{\hat x}, F_{\hat y}, F_{\widehat{\text{step}}}$]. Any feature shorter than the total number of timesteps is post-padded with zeros. The input to the model is of shape (B, 10, T), where T is the maximum trajectory length in the batch and B is the batch size (see 2.5 Training).
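The feature list above can be sketched for a single trajectory as follows. This is a minimal, dependency-free illustration rather than the authors' implementation. The function names, the turning-angle definition of $\vartheta(t)$, the small `eps` added inside the logarithm, and the handling of a zero standard deviation are all assumptions made for the example.

```python
import math
from itertools import accumulate

def zscore(v):
    """z-normalise a sequence (assumption: a zero std is replaced by 1)."""
    mean = sum(v) / len(v)
    std = math.sqrt(sum((a - mean) ** 2 for a in v) / len(v)) or 1.0
    return [(a - mean) / std for a in v]

def diff(v):
    """Difference between consecutive elements."""
    return [b - a for a, b in zip(v, v[1:])]

def log_abs_diff(v, eps=1e-12):
    """F(k) = log|delta k(t)|; eps is an added guard against log(0)."""
    return [math.log(abs(a) + eps) for a in diff(v)]

def extract_features(x, y):
    """Return 10 feature series, each post-padded with zeros to len(x).

    Needs at least two points per trajectory.
    """
    T = len(x)
    xh, yh = zscore(x), zscore(y)
    d = [math.hypot(a - x[0], b - y[0]) for a, b in zip(x, y)]
    dx, dy = diff(x), diff(y)
    step_h = zscore([math.hypot(a, b) for a, b in zip(dx, dy)])
    # Assumed definition: signed turning angle between consecutive steps.
    heading = [math.atan2(b, a) for a, b in zip(dx, dy)]
    theta = [math.atan2(math.sin(h2 - h1), math.cos(h2 - h1))
             for h1, h2 in zip(heading, heading[1:])]
    # Straightness: net displacement over cumulative path length.
    dxh, dyh = diff(xh), diff(yh)
    path = list(accumulate(math.hypot(a, b) for a, b in zip(dxh, dyh)))
    path2 = list(accumulate(a * a + b * b for a, b in zip(dxh, dyh)))
    net = [math.hypot(a - xh[0], b - yh[0]) for a, b in zip(xh[1:], yh[1:])]
    str1 = [n / p if p > 0 else 0.0 for n, p in zip(net, path)]
    str2 = [n * n / p if p > 0 else 0.0 for n, p in zip(net, path2)]
    feats = [xh, yh, d, step_h, theta, str1, str2,
             log_abs_diff(xh), log_abs_diff(yh), log_abs_diff(step_h)]
    return [f + [0.0] * (T - len(f)) for f in feats]
```

Stacking the output of `extract_features` for every trajectory in a batch, after padding each one to the longest trajectory, would give the (B, 10, T) input described above.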
2.4 Augmentations
We add the following augmentations to the coordinates $x(t)$ and $y(t)$:
- Gaussian noise with $\mu = 0$ and $\sigma = 0.1$ pixels, $y'(t) = y(t) + \varepsilon(t)$, $x'(t) = x(t) + \varepsilon(t)$, $\varepsilon(t) \sim \mathcal{N}(0, 0.1)$.
- Random rotations, $\left[x'(t), y'(t)\right] = M_\vartheta \left[x(t), y(t)\right]$, where $M_\vartheta$ is the standard rotation matrix with angle $\vartheta \in [0, 2\pi]$.
- Vertical and horizontal flips in the $x$- and $y$-axes, $x'(t) = -x(t)$, $y'(t) = -y(t)$.
- Truncation of coordinates; we select two random times $t_1$ and $t_2$ such that $0 \leq t_1 \leq T - t_{\text{min}}$ and $t_1 + t_{\text{min}} \leq t_2 \leq T$, resulting in truncated coordinates $x(t') = x(t_1 : t_2)$ and $y(t') = y(t_1 : t_2)$ ($t_{\text{min}} = 20$, $T$ is the trajectory length).
The augmentations are applied independently with a 30% probability for each training instance. The remaining features are calculated based on the augmented coordinates.
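The four augmentations can be sketched as below, using only the Python standard library. The function name `augment` and its parameters are illustrative. Two choices are assumptions not stated in the text: the noise is drawn independently for $x$ and $y$, and the two flips are applied separately.

```python
import math
import random

def augment(x, y, rng, p=0.3, sigma=0.1, t_min=20):
    """Apply each augmentation independently with probability p."""
    x, y = list(x), list(y)
    if rng.random() < p:  # Gaussian noise, mean 0 and std sigma (pixels)
        x = [v + rng.gauss(0.0, sigma) for v in x]
        y = [v + rng.gauss(0.0, sigma) for v in y]
    if rng.random() < p:  # rotation by a random angle in [0, 2*pi]
        a = rng.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(a), math.sin(a)
        x, y = ([c * u - s * v for u, v in zip(x, y)],
                [s * u + c * v for u, v in zip(x, y)])
    if rng.random() < p:  # flip: x' = -x
        x = [-v for v in x]
    if rng.random() < p:  # flip: y' = -y
        y = [-v for v in y]
    T = len(x)
    if T >= t_min and rng.random() < p:  # truncation to [t1, t2)
        t1 = rng.randint(0, T - t_min)
        t2 = rng.randint(t1 + t_min, T)
        x, y = x[t1:t2], y[t1:t2]
    return x, y
```

In a full pipeline the ground-truth label series would need the same truncation window, and the remaining features would be recomputed from the returned coordinates.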
2.5 Training
We train the models using a batch size of 32 for a maximum of 30 epochs. During training, 10% dropout is applied to the first two biLSTM cells. Training typically converges before the epoch limit, and the model with the smallest validation loss is saved. A scheduler decreases the learning rate by a factor of ten every five epochs if there is no improvement in validation loss. We use the Adam optimiser with an initial learning rate of $1 \times 10^{-3}$ and a weight decay ($\lambda$) of $2 \times 10^{-6}$ to prevent overfitting [60].
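One possible reading of the schedule and checkpointing rule is sketched below. It replays a list of per-epoch validation losses and assumes plateau-style behaviour, where the learning rate drops after five consecutive epochs without a new best validation loss. The function name and the `patience` parameter are hypothetical, and the text does not specify the exact scheduler.

```python
def train_schedule(val_losses, lr=1e-3, factor=0.1, patience=5):
    """Return (best_epoch, learning rate used at each epoch)."""
    best_loss, best_epoch, stale = float("inf"), None, 0
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)
        if loss < best_loss:
            # A real training loop would save the model checkpoint here.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale == patience:  # no improvement for `patience` epochs
                lr *= factor
                stale = 0
    return best_epoch, lrs
```

For example, two improving epochs followed by five stale ones would leave the learning rate at $10^{-3}$ for the first seven epochs and reduce it to $10^{-4}$ from the eighth.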
The loss functions for a single output timeseries are given by:
$$\mathcal{L}_\alpha = \frac{1}{T} \sum_{i=1}^{T} \left| \alpha_{\text{p},i} - \alpha_{\text{g},i} \right|$$
$$\mathcal{L}_K = \frac{1}{T} \sum_{i=1}^{T} \left| \log\left(K_{\text{p},i} + 1\right) - \log\left(K_{\text{g},i} + 1\right) \right|$$
$$\mathcal{L}_s = \frac{1}{T} \sum_{i=1}^{T} - w_{y_i} \log p\left(y_i\right)$$
where $\alpha_{\text{p}}$ and $K_{\text{p}}$ are the predicted timeseries, $\alpha_{\text{g}}$ and $K_{\text{g}}$ are the ground truth timeseries, and $T$ is the total length of the protein trajectory ($T_{\text{max}} = 200$). The state labels are imbalanced, so we include class weights, $w_{y_i}$ (inverse frequency from the training dataset), in the state loss function ($\mathcal{L}_s$). The losses are normalised by trajectory length to account for varying lengths.
The backpropagated loss is given by the average over the batch size (B) plus a regularisation term, $\overline{\mathcal{L}} = \frac{1}{B} \sum_{i=1}^{B} \mathcal{L}_i + \lambda \sum_{j=1}^{n} \theta_j^2$. As the range of $K$ is between $10^{-12}$ and $10^{6}$, we work in the $\log_{10}(K + 1)$ space, in which the value of the variable ranges between 0 and 6. This provides a normalised range for the model. All postprocessing steps on the predicted timeseries for $K$ are performed in its $\log_{10}(K + 1)$ space.
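The three per-trajectory losses and the batch objective can be written out directly, as in the sketch below. The function names are illustrative, the logarithm in the $K$ loss is taken as base 10 to match the $\log_{10}(K + 1)$ space described in the text, and `log_probs` is assumed to hold per-timestep log-probabilities such as the output of a LogSoftmax layer.

```python
import math

def loss_alpha(pred, truth):
    """Mean absolute error over the T timesteps of one trajectory."""
    return sum(abs(p - g) for p, g in zip(pred, truth)) / len(truth)

def loss_k(pred, truth):
    """Mean absolute error in log10(K + 1) space (base 10 is an assumption)."""
    return sum(abs(math.log10(p + 1) - math.log10(g + 1))
               for p, g in zip(pred, truth)) / len(truth)

def loss_state(log_probs, labels, class_weights):
    """Class-weighted negative log-likelihood, averaged over timesteps.

    log_probs[i][c] is the log-probability of class c at timestep i.
    """
    return sum(-class_weights[y] * log_probs[i][y]
               for i, y in enumerate(labels)) / len(labels)

def batch_loss(per_trajectory_losses, params, lam=2e-6):
    """Batch mean plus an L2 penalty on the model parameters."""
    mean_loss = sum(per_trajectory_losses) / len(per_trajectory_losses)
    return mean_loss + lam * sum(t * t for t in params)
```

In a deep learning framework the L2 term is usually supplied through the optimiser's weight-decay setting rather than added by hand.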
Training was performed on a single NVIDIA RTX A5000 GPU taking a maximum of 10 h for a single output variable. This can be done in parallel with multiple GPU instances. We had access to two GPUs, such that training took a total of 20 h.
2.6 Post-processing output
The output of the model is a timeseries of the same length as the input sequence. We smooth $K(t)$ and $\alpha(t)$ by first replacing sections in which the output series varies by less than $0.01$ with their mean. This removes noisy variations in the output sequence while retaining jumps in value that exceed this threshold. After this initial smoothing, we apply a median filter with a window size of 3. Median filters preserve edges well, providing better changepoint detection. As part of the challenge constraints, the minimum time spent in a state must be at least three frames; hence, the state series is smoothed such that any state segment lasting fewer than three consecutive frames is deemed invalid and replaced with the value of its preceding state.
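The three smoothing steps can be sketched as follows. This is an interpretation, not the authors' code. In particular, `flatten_plateaus` assumes that a "section" is a run of consecutive values that each differ from their predecessor by less than the threshold. The median filter leaves the two end points unchanged, and a short first segment is left alone because it has no preceding state.

```python
from statistics import median

def flatten_plateaus(series, tol=0.01):
    """Replace runs of near-constant values with the run mean."""
    out, start = [], 0
    for i in range(1, len(series) + 1):
        # Close the current run at the end or at a jump of at least tol.
        if i == len(series) or abs(series[i] - series[i - 1]) >= tol:
            run = series[start:i]
            out.extend([sum(run) / len(run)] * len(run))
            start = i
    return out

def median_filter3(series):
    """Median filter with window size 3 (end points kept as they are)."""
    if len(series) < 3:
        return list(series)
    middle = [median(series[i - 1:i + 2]) for i in range(1, len(series) - 1)]
    return [series[0]] + middle + [series[-1]]

def enforce_min_duration(states, min_len=3):
    """Replace state segments shorter than min_len with the preceding state."""
    out = list(states)
    i = 0
    while i < len(out):
        j = i
        while j < len(out) and out[j] == out[i]:
            j += 1
        if j - i < min_len and i > 0:
            out[i:j] = [out[i - 1]] * (j - i)
        i = j
    return out
```

A single-frame spike such as `[0.0, 0.0, 5.0, 0.0, 1.0, 1.0, 1.0]` is removed by `median_filter3`, while the step from 0 to 1 survives.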
2.7 Changepoint detection
Changepoint detection was done using the ruptures library in Python [47]. Ruptures implements offline detection methods using a variety of cost functions. As we do not know the number of changes beforehand, an additional penalty parameter is applied. To obtain an ideal penalty value, we performed a quick grid search over a sensible range of starting values to maximise the Jaccard index (described in 2.8 Evaluation metrics) across the training set. Specifically, we use the Pruned Exact Linear Time (PELT) algorithm with a Least Squares Deviation (L2) cost function and a penalty of 0.3 [61]. This identifies changepoints in a timeseries by segmenting the data to minimise the sum of squared deviations within segments, plus a penalty term dependent on the number of segments. We set the minimum distance between changepoints to three timesteps and we min-max normalise the predicted timeseries for $\alpha$ and $K$ before changepoint detection. For the state series, changepoint detection is easier, as the label value changes.
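To make the penalised objective concrete, the sketch below solves it with a plain dynamic program (optimal partitioning). This is deliberately not PELT and not the ruptures API. PELT adds a pruning step to this kind of recursion to reduce the running time, whereas the version here checks every admissible segment start and is quadratic in the series length. The function names are illustrative.

```python
def min_max(series):
    """Min-max normalise to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]

def l2_changepoints(signal, penalty=0.3, min_size=3):
    """Return the start indices of every segment after the first."""
    n = len(signal)
    s1, s2 = [0.0], [0.0]  # prefix sums of values and squared values
    for v in signal:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Sum of squared deviations from the mean of signal[a:b]."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    inf = float("inf")
    best = [inf] * (n + 1)
    best[0] = -penalty  # so that K segments pay K - 1 penalties
    prev = [0] * (n + 1)
    for end in range(min_size, n + 1):
        for start in range(0, end - min_size + 1):
            if best[start] == inf:
                continue  # no valid segmentation ends at `start`
            c = best[start] + cost(start, end) + penalty
            if c < best[end]:
                best[end], prev[end] = c, start
    cps, end = [], n
    while end > 0:  # walk back through the stored segment starts
        end = prev[end]
        if end > 0:
            cps.append(end)
    return cps[::-1]
```

For a series of ten values at one level followed by ten at another, `l2_changepoints(min_max(series))` returns `[10]`, since one changepoint costs 0.3 while a single segment would cost 5 after normalisation.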
Given changepoints from all three outputs, we combined them by adding changepoints from $\alpha$ and $s$ to the $K$ changepoints and removing duplicates within a window size of 5. This centres changepoints around the $K$ changepoints, on which the model generally performed better.
We cannot take only the $K$ changepoints, as $K$ could be constant across a changepoint while $\alpha$ may not be, leading to missed changepoints.
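The combination rule can be sketched as below. The function name is hypothetical, and the treatment of the window is an assumption: a candidate from $\alpha$ or $s$ is counted as a duplicate when it lies within 5 timesteps (inclusive) of a changepoint already kept.

```python
def merge_changepoints(cp_k, cp_alpha, cp_state, window=5):
    """Keep every K changepoint, then add alpha/state ones not near a kept one."""
    merged = sorted(set(cp_k))
    for cp in sorted(set(cp_alpha) | set(cp_state)):
        # Assumption: |cp - kept| <= window counts as a duplicate.
        if all(abs(cp - kept) > window for kept in merged):
            merged.append(cp)
    return sorted(merged)
```

With `cp_k=[50]`, `cp_alpha=[48, 120]` and `cp_state=[52, 123]`, the result is `[50, 120]`. The candidates at 48 and 52 collapse onto the $K$ changepoint, and the change near 120, which is invisible in $K$, is still recovered.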
2.8 Evaluation metrics
To evaluate the model on unseen testing data we calculate the mean absolute error per timestep for continuous variables, given by:
$$\frac{1}{N} \sum_{j=1}^{N} \frac{1}{T_j} \sum_{i=1}^{T_j} \left| p_{i,j} - g_{i,j} \right|$$
where $N$ is the number of trajectories, $T_j$ is the number of timesteps for the $j$th trajectory, $p_{i,j}$ is the $i$th predicted value of $\alpha$ or $\log(K + 1)$ for the $j$th trajectory, and $g_{i,j}$ is the ground truth for the corresponding predicted point. For $\alpha$, this is simply the mean absolute error (MAE), whereas for $K$, this is the mean absolute log error (MALE), as we work in the $\log_{10}(K + 1)$ space.
For the state variable, we evaluate the confusion matrix (row-wise normalised), as the class distributions are imbalanced, and the state loss ($\mathcal{L}_s$) averaged over all trajectories. Evaluation of changepoint detection is done using two indicators discussed below: the Jaccard index and the root mean square error (RMSE).
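Row-wise normalisation divides each row of the confusion matrix by the number of ground-truth timesteps in that class, so a rare state is not swamped by a common one. A minimal sketch, with an illustrative function name and integer-coded state labels assumed:

```python
def row_normalised_confusion(true_states, pred_states, n_classes):
    """Entry [t][p] is the fraction of true class t predicted as class p."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_states, pred_states):
        counts[t][p] += 1
    # Rows with no ground-truth samples are returned as zeros.
    return [[c / sum(row) if sum(row) else 0.0 for c in row]
            for row in counts]
```

Each diagonal entry is then the per-class recall, which is the quantity that stays comparable across imbalanced classes.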
The Jaccard index measures the level of overlap between the predicted changepoints and the ground truth changepoints with an allowable error of 5 timesteps, in line with the challenge constraints. For two sets A and B, the Jaccard index is defined by:
$$J = \frac{\left| A \cap B \right|}{\left| A \cup B \right|} = \frac{\text{TP}}{\text{TP} + \text{FN} + \text{FP}}$$
where $\text{TP}$, $\text{FN}$, and $\text{FP}$ are the numbers of true positives, false negatives, and false positives, respectively.
The RMSE for a single trajectory is given by [46]:
$$\text{RMSE} = \sqrt{\frac{1}{N_{\text{CP}}} \sum_{\text{paired CPs}} \left( t_{\text{g},i} - t_{\text{p},j} \right)^2}$$
where $N_{\text{CP}}$ is the number of changepoints in the trajectory, $t_{\text{g},i}$ is the ground truth changepoint, and $t_{\text{p},j}$ is the predicted changepoint for some pair of changepoints $(i, j)$. The RMSE is only calculated for paired changepoints ($< 5$ timesteps away).
Predicted and ground truth changepoints are matched using the Hungarian algorithm for cost minimisation, which is handled by the AnDi package [21, 46, 49].
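The pairing and both changepoint metrics can be illustrated with the sketch below. It is not the AnDi package code. For brevity the optimal one-to-one assignment is found by exhaustive search over permutations instead of the Hungarian algorithm, which is only practical for a handful of changepoints. Clipping the pairing cost at `max_dist`, taking $N_{\text{CP}}$ as the number of paired changepoints, and returning a Jaccard index of 1 when both sets are empty are assumptions made for the example.

```python
import math
from itertools import permutations

def changepoint_metrics(truth, pred, max_dist=5):
    """Return (jaccard, rmse) for one trajectory."""
    short, long_ = (truth, pred) if len(truth) <= len(pred) else (pred, truth)
    best_cost, best_pairs = float("inf"), []
    # Exhaustive minimum-cost assignment (stand-in for the Hungarian algorithm).
    for perm in permutations(range(len(long_)), len(short)):
        pairs = [(s, long_[j]) for s, j in zip(short, perm)]
        cost = sum(min(abs(a - b), max_dist) for a, b in pairs)
        if cost < best_cost:
            best_cost, best_pairs = cost, pairs
    # Only pairs closer than max_dist count as true positives.
    errors = [abs(a - b) for a, b in best_pairs if abs(a - b) < max_dist]
    tp = len(errors)
    fn, fp = len(truth) - tp, len(pred) - tp
    denom = tp + fn + fp
    jaccard = tp / denom if denom else 1.0  # assumption for two empty sets
    rmse = math.sqrt(sum(e * e for e in errors) / tp) if tp else float("nan")
    return jaccard, rmse
```

With ground truth `[10, 50]` and predictions `[12, 49, 80]`, two changepoints are paired and one prediction is a false positive, giving $J = 2/3$ and $\text{RMSE} = \sqrt{(2^2 + 1^2)/2} \approx 1.58$.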
3 Results
Work in single-molecule imaging has primarily relied on classical statistics to calculate a single diffusion coefficient for an entire trajectory [13, 24–26]. These methods are difficult to implement, rely on expert input, and struggle to detect complex biological phenomena. With the rise of machine learning in the single-molecule field, new models allow for the inference of diffusion properties at per-timestep resolution [21, 46]. Machine learning, particularly deep learning, removes the need for expert input and fine-tuned manual thresholding, while being able to detect complex states.
M3, an LSTM approach to the AnDi 2024 challenge [46], returns the diffusion coefficient, anomalous exponent, and protein state at every timestep of a protein trajectory, with an unknown number of changepoints. M3 is trained on simulations from biologically relevant scenarios including free and directed diffusion, confinement, and trapped motion.
Given a trajectory, the model extracts ten features from the coordinates, resulting in a multivariate timeseries. These features were selected through an extensive literature search [21, 29–46, 51–59], followed by Pearson’s rank to remove correlated features and greedy forward feature selection on the remaining features. The model is duplicated three times, and each copy is trained independently for one output variable. The only architectural difference is a LogSoftmax layer for the state variable, as it is a multi-class variable.
During training, we apply the following augmentations: Gaussian noise, coordinate rotations, flips in the $x/y$ axes, and truncations of the time series (see Methods).
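A minimal sketch of these four augmentations on a two-dimensional trajectory is given below. The function name, the noise standard deviation, the minimum truncated length, and the order in which the augmentations are applied are all assumptions for illustration; the settings actually used are those described in Methods.

```python
import numpy as np


def augment_trajectory(xy, rng, noise_std=0.05, min_length=20):
    """Return an augmented copy of a (T, 2) trajectory and the retained window.

    noise_std and min_length are hypothetical values chosen for illustration.
    """
    xy = np.asarray(xy, dtype=float)
    # Random rotation of the coordinates about the origin.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    out = xy @ rot.T
    # Independent random flips of the x and y axes.
    out = out * rng.choice([-1.0, 1.0], size=2)
    # Additive Gaussian noise on every coordinate.
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    # Random truncation to a contiguous window of at least min_length points.
    n_steps = out.shape[0]
    start, length = 0, n_steps
    if n_steps > min_length:
        length = int(rng.integers(min_length, n_steps + 1))
        start = int(rng.integers(0, n_steps - length + 1))
    window = slice(start, start + length)
    return out[window], window


rng = np.random.default_rng(0)
trajectory = np.cumsum(rng.normal(size=(200, 2)), axis=0)
augmented, window = augment_trajectory(trajectory, rng)
```

The returned `window` is there so that per-timestep labels can be cut to the same range as the coordinates, for example `alpha_labels[window]`.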
Once each model is trained, we perform inference on unseen trajectories from the test data set. Inference is performed once for each model, returning $\alpha(t)$, $K(t)$, and $s(t)$ for every timestep of the trajectory. Details of the dataset and simulation are described in table 1. The output timeseries have sudden changes reflecting changepoints in the protein trajectory.
M3 finished in the Top 5 of the AnDi 2024 challenge at the time of challenge closure [62]. Since then, we have fixed a bug in the training phase and slightly altered the model architecture to include another normalisation layer.
Pointwise estimation of $\alpha$, $K$, and $s$
3.1
We evaluate M3 on simulated trajectories from five underlying biological behaviours (see figure 1(a)) representing fBM with piecewise constant values of $\alpha \in [0, 2]$, $K \in [10^{-12}, 10^{6}]$ ($\text{pixels}^2/\text{frame}^{\alpha}$), and state $s \in \{0, 1, 2, 3\}$. The protein states represent immobile (imm.) motion ($s = 0$), confinement (conf., $s = 1$), free diffusion ($s = 2$), and directed (dir.) motion ($s = 3$, $\alpha \geq 1.9$). Details of the simulation can be found in Methods 2.2.
Once training is completed, we first visualise model performance by comparing predicted outputs with ground truth variables for different trajectories from the test set. Hereafter, we show all plots for $K$ in $\log(K + 1)$ space ($K \to \log(K + 1)$) such that all results lie between $0$ and $6$. This is also evident from the scale on the plots.
Figure 2(a) shows example trajectories from different biological behaviours and the resulting model output. In general, predicted $K$ values are closer to the ground truth than predicted $\alpha$ values, and immobile states ($s = 0$) are easily detected by the model, likely because they differ markedly from the other states. The difference in prediction quality between $\alpha$ and $K$ can also arise from the inherent difficulty of the problem.
Varying $K$ directly affects the step size of the protein, whereas $\alpha$ can be seen as a measure of ‘directionality’, making it a more difficult quantity for the model to learn.
Figure 2. (a) Example model predictions on trajectories from the test set. From top to bottom row: a single-state trajectory, a multi-state trajectory, two immobile trajectories with differing numbers of changepoints, and a confined trajectory. For clarity, not every timepoint prediction is shown. (b) $K$ predictions (in log space) compared to the ground truth. Trajectories for this plot were generated using the single state model and $\alpha$ was fixed at one. For every $K$ ground truth bin, 1000 protein predictions were averaged. (c) The same plot as (b) except varying $\alpha$, while $K$ was fixed at one. (d) Confusion matrix, row-wise normalised, for the state variable over the entire test set.
To verify our observations, we first calculate the mean absolute errors as measures of accuracy for $\alpha$ and $K$ averaged over the test set. This gives a MAE of 0.16 for $\alpha$ and a MALE of 0.14 for $K$. We also evaluate model predictions on trajectories simulated solely from the SSM for a wide range of $\alpha$ and $K$ values. Heatmaps of predictions against ground truth are shown in figures 2(b) and (c), allowing each variable to be evaluated while the other is held fixed.
At an ensemble level, the predictions for both continuous variables lie close to the ground truth, with the exception that $K$ only ranges from $0$ to $3$ in log space. The restricted range of $K$ reflects the fact that the protein leaves the field of view during the simulation for diffusion coefficients larger than this. However, this range still covers a wide span of diffusion coefficients ($0$–$100\ \mu\text{m}^2\,\text{s}^{-1}$), extending beyond most experimentally observed values [63].
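For reference, the two error measures can be written as short functions. This is a sketch under stated assumptions: the function names are illustrative, MALE is taken to be the mean absolute error computed in $\log(K + 1)$ space, and the logarithm is assumed to be base 10 because the text maps $K \in [10^{-12}, 10^{6}]$ onto the range $0$ to $6$.

```python
import numpy as np


def mean_absolute_error(pred, true):
    """MAE between pointwise predictions and ground truth (used here for alpha)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.abs(pred - true)))


def mean_absolute_log_error(k_pred, k_true):
    """MALE for K: absolute error after the K -> log10(K + 1) transform.

    Assumption: base-10 logarithm, inferred from the stated 0 to 6 range.
    """
    k_pred = np.asarray(k_pred, dtype=float)
    k_true = np.asarray(k_true, dtype=float)
    return float(np.mean(np.abs(np.log10(k_pred + 1.0) - np.log10(k_true + 1.0))))


alpha_error = mean_absolute_error([1.0, 0.5], [1.2, 0.4])
k_error = mean_absolute_log_error([9.0, 99.0], [9.0, 999.0])
```

Working in log space means that an error of one unit corresponds to a tenfold mismatch in $K + 1$, so errors on small and large diffusion coefficients are weighted comparably.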
Due to the nature of the challenge and simulations, the test set is imbalanced, with the free state ($s = 2$) observed more frequently. We implement class weights (inverse frequency) during training and evaluate the confusion matrix. This gives an average class accuracy of $92\%$, as shown in figure 2(d). The confusion matrix confirms that the model detects immobile states ($s = 0$) very well compared to the other states. It is interesting to note the interplay between the states $s = 1$, $2$, and $3$.
For certain trajectories, these three states can be confused with one another, not necessarily due to an inherent fault in the model setup, but due to the difficulty of the classification. As an example, a slightly larger compartment ($s = 1$) can be detected as free diffusion ($s = 2$) and vice versa. As shown later, these effects can be amplified for shorter trajectories, where the model error increases because less information is available to the model. Another contributing factor may be the use of class weights, which favour minority classes and decrease the accuracy of the model on the majority class.
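The class weighting and the accuracy measure used here can be sketched as follows. The function names are illustrative, and the normalisation of the weights ($w_c = N / (C\, n_c)$ for $N$ samples, $C$ classes, and $n_c$ samples in class $c$) is one common inverse-frequency convention assumed for this example rather than the exact form used in training.

```python
import numpy as np


def inverse_frequency_weights(labels, n_classes):
    """Per-class weights proportional to 1 / class frequency (assumed convention)."""
    counts = np.bincount(np.asarray(labels), minlength=n_classes).astype(float)
    weights = np.zeros(n_classes)
    seen = counts > 0
    weights[seen] = counts.sum() / (seen.sum() * counts[seen])
    return weights


def row_normalised_confusion(y_true, y_pred, n_classes):
    """Confusion matrix in which each ground-truth row sums to one."""
    cm = np.zeros((n_classes, n_classes))
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1.0)
    row_sums = cm.sum(axis=1, keepdims=True)
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)


def average_class_accuracy(cm_norm):
    """Mean of the diagonal over classes that occur in the ground truth."""
    present = cm_norm.sum(axis=1) > 0
    return float(np.mean(np.diag(cm_norm)[present]))


# Toy labels with an over-represented class 2.
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2, 2, 1]
weights = inverse_frequency_weights(y_true, n_classes=3)
cm = row_normalised_confusion(y_true, y_pred, n_classes=3)
accuracy = average_class_accuracy(cm)
```

Averaging the diagonal of the row-normalised matrix gives every class equal influence on the reported accuracy, regardless of how often it occurs in the test set.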
Dependence on trajectory length
3.2
We further evaluate model predictions over trajectory lengths in the range of $20$ to $200$. For this, we plot the model loss and average accuracy per timepoint for varying lengths across the test set. Consistent with previous work in the field, we see a decrease in error as the trajectory length increases (see figures 3(a) and (b)). As more information is available to the model, better predictions are obtained. Shorter trajectories can originate from a wider range of possible parameters, leading to higher errors in $\alpha$, $K$, and $s$.
Figures 3(a) and (b) show plateaus in both loss and class accuracy at lengths of roughly $150$ time steps and greater.
Figure 3. (a) Losses per timestep for each target variable with increasing trajectory length, averaged over the test set. (b) Average class accuracy with trajectory length over the test set.
Changepoint detection
3.3
M3 does not directly return changepoints. Changepoints in the output timeseries are detected using ruptures, an open-source changepoint detection library in Python, through a cost function and penalty parameter [47] (selected from a grid search across the training set, see Methods 2.7). For the state timeseries, finding changepoints is easier as the value of the label changes. Figure 4(a) shows an example model prediction for $K$ and the detected changepoints. The model is able to pick up all changes in the protein trajectory for sufficiently long timeseries. The penalty parameter in ruptures is tuneable. To avoid repeated tuning, all timeseries are min-max normalised before changepoint detection, so that the same penalty can be used for both $K$ and $\alpha$.
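To make the role of the penalty and the normalisation concrete, the sketch below segments a min-max normalised timeseries by minimising a squared-error cost plus a fixed penalty per changepoint. It does not call ruptures: it solves a penalised objective of the kind PELT optimises using plain dynamic programming without pruning, which is slower but easy to follow. The function names, the L2 cost, the penalty value, and the minimum segment size are assumptions for illustration.

```python
import numpy as np


def min_max_normalise(x):
    """Rescale a timeseries to [0, 1]; a constant series maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def penalised_changepoints(signal, penalty, min_size=2):
    """Changepoints minimising sum of segment L2 costs + penalty per changepoint."""
    y = np.asarray(signal, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def cost(a, b):
        # Sum of squared deviations from the mean over y[a:b].
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    best = np.full(n + 1, np.inf)
    best[0] = -penalty  # so that the first segment carries no penalty
    last = np.zeros(n + 1, dtype=int)
    for t in range(min_size, n + 1):
        for s in range(0, t - min_size + 1):
            if not np.isfinite(best[s]):
                continue
            candidate = best[s] + cost(s, t) + penalty
            if candidate < best[t]:
                best[t], last[t] = candidate, s
    # Backtrack through the stored segment starts.
    changepoints, t = [], n
    while last[t] > 0:
        t = int(last[t])
        changepoints.append(t)
    return sorted(changepoints)


# Piecewise-constant toy prediction with one change at timestep 100.
k_prediction = np.concatenate([np.full(100, 0.2), np.full(100, 0.8)])
detected = penalised_changepoints(min_max_normalise(k_prediction), penalty=1.0)
```

After normalisation the largest possible jump has height one for every variable, which is why a single penalty can be reused across $K$ and $\alpha$ instead of being retuned for each scale.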
Figure 4. (a) Example application of changepoint detection, using the PELT algorithm [47], on the model prediction for $K$. (b) The MAE and MALE over the test set for varying number of changepoints ($N_{\text{CP}}$). (c) RMSE for changepoint detection performed using different combinations of $\alpha$, $K$, and $s$ over the test set. Every $N_{\text{CP}}$ represents RMSE values for 100 randomly sampled protein trajectories of length 200 from the test set.
As shown in figure 4(b), with an increasing number of changepoints ($N_{\text{CP}}$) in the protein trajectory, the MAE for $\alpha$ increases while the MALE for $K$ remains roughly constant. This aligns with the asymptotic behaviour of $\alpha$, which leads to greater prediction estimates for a higher number of changepoints, in which the average segment length is shorter (inversely related) [21]. $K$ can be estimated with lower error from shorter segments (with less information per segment) and thus we see a constant loss as the number of changepoints increases.
Changepoint detection can be performed on the $\alpha$, $K$, and $s$ predictions. We first perform this individually, then combine sets of changepoints into a single set with repeats removed. This merges the changepoints from all timeseries and ensures the model is able to pick up on changepoints from any of the output variables. In figure 4(c), we evaluate the RMSE for changepoint detection using combinations of $\alpha$, $K$, and $s$.
As expected, the $\alpha$ timeseries alone returns the highest RMSE across the range of changepoints, with changepoints from $K$ predictions providing the lowest RMSE values. The state timeseries alone is not enough to detect changepoints, since in multi-state trajectories the state label can remain the same across a changepoint. However, using only the $K$ timeseries for changepoints may mean we lose out on changepoints in which $K$ remains constant across the changepoint whereas $\alpha$ does not.
Thus, we settle for using $\alpha + K$ changepoints (see figure 4(c)), while removing duplicates, for producing a set of changepoints for any given trajectory (providing slightly lower RMSE values than $\alpha + K + s$). This may mean we miss changepoints at which neither of these variables changes while $s$ does, but these trajectories are less likely ($\sim 0.04\%$ of the test dataset). Typically, a change in biological behaviour is accompanied by a change in diffusion coefficient or anomalous exponent.
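Combining the per-variable changepoints amounts to a set union, as in the short sketch below. The function name is illustrative, and only exact repeats are removed here; whether near-coincident changepoints from different variables (for example timesteps 50 and 51) are also merged is not specified in this section, so that is left out.

```python
def merge_changepoints(*changepoint_sets):
    """Union of several changepoint lists, sorted, with exact repeats removed."""
    merged = set()
    for changepoints in changepoint_sets:
        merged.update(int(t) for t in changepoints)
    return sorted(merged)


# Changepoints detected separately on the alpha and K timeseries (toy values).
alpha_changepoints = [50, 120]
k_changepoints = [50, 160]
combined = merge_changepoints(alpha_changepoints, k_changepoints)
```

Accepting a variable number of lists makes it straightforward to compare the $\alpha + K$ combination against $\alpha + K + s$ by passing a third list.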
We also evaluate the average Jaccard index in figure 5(a) for varying changepoints using the same sets of variables. Again, we see slightly better performance across the test set using $\alpha + K$ compared to $\alpha + K + s$. Figure 5(b) shows changepoint detection for $\alpha$, $K$, and $s$ for trajectories with a single changepoint across the test set. The model’s bidirectional nature allows for consistent changepoint detection across various changepoint positions in trajectories.
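The Jaccard index for a single trajectory can be sketched as below, assuming the usual definition $J = \text{TP} / (\text{TP} + \text{FP} + \text{FN})$, where a true positive is a predicted changepoint paired with a ground truth changepoint fewer than 5 timesteps away. The function name, the cost capping, and the convention of returning 1 when neither list contains a changepoint are assumptions for illustration, and this is not the AnDi package implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def changepoint_jaccard(t_true, t_pred, max_distance=5):
    """Jaccard index TP / (TP + FP + FN) for one trajectory's changepoints."""
    t_true = np.asarray(t_true, dtype=float)
    t_pred = np.asarray(t_pred, dtype=float)
    if t_true.size == 0 and t_pred.size == 0:
        return 1.0  # assumed convention: nothing to detect, nothing predicted
    if t_true.size == 0 or t_pred.size == 0:
        return 0.0
    distance = np.abs(t_true[:, None] - t_pred[None, :])
    # Minimum-cost one-to-one pairing on capped distances.
    rows, cols = linear_sum_assignment(np.minimum(distance, max_distance))
    true_positives = int(np.sum(distance[rows, cols] < max_distance))
    false_positives = t_pred.size - true_positives
    false_negatives = t_true.size - true_positives
    return true_positives / (true_positives + false_positives + false_negatives)


example_jaccard = changepoint_jaccard([50, 120], [52, 119, 180])
```

Unlike the RMSE, which only measures how close paired changepoints are, this score drops whenever a changepoint is missed or a spurious one is predicted.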
Figure 5. (a) Average Jaccard index, $\bar{J}$, using combinations of variables for changepoint detection. Each point represents an average over 100 randomly selected protein trajectories of length 200 from the test set. (b) Average Jaccard index for trajectories of length 200 from the test set with a single changepoint. Each point represents the average over 100 randomly selected trajectories and bins a range of 20 lengths. (c) Average Jaccard index for trajectories with a single changepoint (thus having two segments, multi-state model), with $\Delta\alpha$ being the difference in $\alpha$ between the two segments. $K$ is fixed at one for both segments. Each point is the average of roughly 50 trajectories of length 200. (d) The same plot as (c), except for $\Delta K$, with $\alpha$ fixed at one between segments.
Changepoint detection is clearly dependent on the difference of $\alpha$ and $K$ at a changepoint. Little to no difference in $\alpha$ or $K$ at a transition in a protein trajectory leads to no changepoint being detected using the PELT algorithm. For small differences in diffusive properties at a changepoint, the model may also return a single value at all timesteps for the entire trajectory, as it is unable to detect a change.
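To illustrate why a small step in a predicted timeseries produces no changepoint, the following is a minimal pure-Python sketch of PELT. It assumes a squared-error (mean-shift) segment cost and a hand-picked penalty; the cost function and penalty actually used with M3 are not stated in this section, and the function name `pelt_l2` is hypothetical.

```python
def pelt_l2(signal, penalty):
    """Changepoint indices of a 1D signal via PELT with a squared-error cost.

    Assumption: the cost of a segment is its sum of squared deviations
    from the segment mean. The penalty is a free parameter.
    """
    n = len(signal)
    # Prefix sums give the cost of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(a, b):
        # Sum of squared deviations from the mean of signal[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: optimal penalised cost of signal[:t]
    best[0] = -penalty
    prev = [0] * (n + 1)    # prev[t]: start index of the last segment
    candidates = [0]
    for t in range(1, n + 1):
        values = [best[s] + cost(s, t) + penalty for s in candidates]
        i = min(range(len(values)), key=values.__getitem__)
        best[t], prev[t] = values[i], candidates[i]
        # Pruning: discard segment starts that can never be optimal again.
        candidates = [s for s, v in zip(candidates, values)
                      if v - penalty <= best[t]]
        candidates.append(t)

    # Backtrack through the stored segment starts.
    changepoints = []
    t = n
    while prev[t] > 0:
        t = prev[t]
        changepoints.append(t)
    return sorted(changepoints)


large_step = [0.5] * 100 + [1.5] * 100   # clear change at index 100
small_step = [1.0] * 100 + [1.05] * 100  # change too small for this penalty
print(pelt_l2(large_step, penalty=1.0))  # -> [100]
print(pelt_l2(small_step, penalty=1.0))  # -> []
```

In the second example, splitting the signal reduces the squared-error cost by only about 0.125, which is less than the penalty of 1, so no changepoint is returned.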
To evaluate performance as a function of the difference at a changepoint, we simulated two new datasets in which there exists a single changepoint exactly halfway through the trajectory (MSm), with all trajectories of length 200. The first dataset fixes $\alpha$ at one for all trajectories, allowing $K$ to change between the two segments, while the second dataset fixes $K$ at one for all trajectories and allows $\alpha$ to vary between the two segments.
These datasets allow for robust evaluation of changepoint detection for varying $\Delta\alpha$ and $\Delta K$ between the segments. We show the Jaccard index for $\Delta\alpha$ between segments on figure 5(c). The model performs well for large differences in $\Delta\alpha$ between the segments, with the Jaccard index drastically reducing for smaller differences. As expected, we obtain a Jaccard value of zero when the two segments have identical $\alpha$ values, as the model did not pick up any changepoints.
Figure 5(d) repeats the same analysis for $\Delta K$. This shows high values of Jaccard index for a much larger range of values, again reducing for $\Delta K$ close to zero. As mentioned before, we believe the increased performance in $\Delta K$ is due to a change in $K$ directly affecting the trajectory step sizes. Trajectories with fixed $K$ and increasingly closer values of $\alpha$ are much harder to distinguish as the model needs to spot a measure of ‘directionality’ over stochastic consecutive time steps.
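The first of these two datasets ($\alpha$ fixed at one, $K$ changing halfway) corresponds to ordinary Brownian motion with a step-size change, and can be sketched as below. This is not the simulation code used for the paper. It assumes a 2D trajectory, a unit time step, and the convention that each coordinate's displacement has variance $2K\Delta t$; the function name and arguments are hypothetical. The second dataset (varying $\alpha$) would require an anomalous diffusion generator such as fractional Brownian motion and is not covered here.

```python
import random


def simulate_two_segment_brownian(k_first, k_second, length=200, dt=1.0, seed=None):
    """2D Brownian trajectory (alpha = 1) whose K changes halfway through.

    Assumption: per-axis displacement variance is 2 * K * dt.
    Returns the (x, y) positions and the ground-truth K for every frame.
    """
    rng = random.Random(seed)
    half = length // 2
    x, y = 0.0, 0.0
    trajectory = [(x, y)]
    k_labels = [k_first]
    for frame in range(1, length):
        k = k_first if frame < half else k_second
        sigma = (2.0 * k * dt) ** 0.5
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        trajectory.append((x, y))
        k_labels.append(k)
    return trajectory, k_labels


trajectory, k_labels = simulate_two_segment_brownian(0.1, 10.0, seed=0)
print(len(trajectory), k_labels[99], k_labels[100])
```

Because $K$ sets the step variance directly, the change is visible in the raw step sizes, consistent with the explanation given above for the stronger performance on $\Delta K$.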
3.4 Dependence on biological behaviours
Lastly, we evaluate the Jaccard index using the combined changepoints from $\alpha + K$ timeseries against simulated biological states. As figure 6(a) highlights, the Jaccard index decreases as the number of changepoints increases for most biological states. The immobile and free states provide high Jaccard values over a range of changepoints. We see decreased detection of changepoints using the $\alpha + K$ set for the confined and directed models (any trajectories containing $s = 1$ and $s = 3$ respectively), particularly for a greater number of changepoints. This aligns with figure 6(b), which shows significantly higher values of MAE and MALE for the confined and directed models.
The larger error in $\alpha$ and $K$ for these models directly affects subsequent changepoint detection, so we expect to see decreased Jaccard indices. The higher error in the confined model may arise from the large values of $K$ we simulated. As the protein reflects at a compartment boundary, consecutive timesteps can appear close for both high and low values of $K$. This may impact the model negatively, as it sees higher and lower values of $K$ during training for similar-looking coordinates.
A similar issue may be present for directed trajectories, where high values of $K$ can lead to large displacements, potentially creating trajectories similar to lower $K$ values and higher values of $\alpha$.
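For reference, a Jaccard index for changepoint detection can be computed by pairing predicted and true changepoints and counting matched and unmatched points. The sketch below is an illustration rather than the challenge's scoring code: it assumes a matching tolerance of 10 frames and uses a greedy closest-first pairing instead of an optimal assignment, and the function name is hypothetical.

```python
def changepoint_jaccard(true_cps, pred_cps, tolerance=10):
    """Jaccard index TP / (TP + FP + FN) for changepoint detection.

    Assumptions: a prediction matches a true changepoint if it lies
    within `tolerance` frames, and pairs are formed greedily, closest first.
    """
    # All admissible (distance, true index, predicted index) pairs.
    pairs = sorted(
        (abs(t - p), i, j)
        for i, t in enumerate(true_cps)
        for j, p in enumerate(pred_cps)
        if abs(t - p) <= tolerance
    )
    used_true, used_pred = set(), set()
    for _, i, j in pairs:
        if i not in used_true and j not in used_pred:
            used_true.add(i)
            used_pred.add(j)
    tp = len(used_true)            # matched changepoints
    fp = len(pred_cps) - tp        # predictions with no true partner
    fn = len(true_cps) - tp        # true changepoints that were missed
    total = tp + fp + fn
    # Assumption: no true and no predicted changepoints counts as agreement.
    return tp / total if total else 1.0


print(changepoint_jaccard([100], [103]))  # -> 1.0
print(changepoint_jaccard([100], []))     # -> 0.0
```

The second call mirrors the case described earlier, where a true changepoint exists but none is detected and the Jaccard index is zero. With more changepoints per trajectory, each missed or spurious detection lowers the index.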
Figure 6. (a) Average Jaccard index for changepoint detection for various biological states and number of changepoints $N_{\mathrm{CP}}$. Each point represents the average over 100 randomly selected protein trajectories of length 200. (b) MAE($\alpha$) and MALE($K$) given the ground truth biological state over the entire test set.
4 Discussion
In this work we have presented M3, a machine learning model with three stacked biLSTM layers for inferring diffusion coefficients, anomalous exponents, and states for every frame of a protein trajectory. Specifically, we show that M3 performs well on noisy stochastic trajectories simulated from varying biologically relevant scenarios.
M3 provides a new and simple method for detecting complex motion of proteins in cells. Additionally, the model does not require prior expert knowledge or manual finetuning of thresholds, as done with classical statistics or methods relying on MSD. In the AnDi Challenge, at the time of challenge closure, the model placed in the Top 5 on a set of combined metrics including absolute errors, Jaccard indices, and RMSE of predicted changepoints [62]. Since then, improvements have been made to the model, predominantly in the model architecture (addition of another layer normalisation) and the correction of a bug in the training data generation. The model consists of roughly half a million trainable parameters per variable, making it possible to train on a single NVIDIA GPU in 10 h for a single output variable.
It is possible to combine the models into a single model with three output heads, but this is difficult as it combines classification and regression. This can serve as future work, as a more complex loss function with additional parameters would be required.
Applying M3 to experimental single-molecule data requires consideration of several caveats commonly encountered in single-molecule imaging. Fluorophore blinking may lead to intermittent signal loss. This can be mitigated through longer exposure times, which increase photon collection and reduce blinking effects, although at the cost of temporal resolution. The use of more photostable fluorophores will reduce both blinking and photobleaching, enabling longer trajectories. Excitation laser intensities can also be tuned to access longer timescales, which is beneficial for observing various diffusive behaviours and allows for the best model performance. Lower laser intensities also reduce potential photo-induced damage to living cells.
In cases where missing frames arise due to blinking, we can interpolate coordinates from surrounding frames. As part of this process, accurate localisation is critical. Out-of-focus molecules, reduced signal-to-noise, and motion blur can introduce localisation errors, leading to errors in downstream diffusion parameters. A potential approach to tackling localisation errors would be to resample trajectory coordinates by adding Gaussian noise based on the localisation uncertainty. Subsequently, inference on each resampled trajectory can be used to build a probability distribution over possible output values. The output distribution can be used to infer both a final value and an error range (similar to MC dropout techniques). Such an approach provides pointwise errors and will be useful in cases where localisation cannot be performed accurately.
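The resampling idea can be sketched as follows. This is an untested proposal rather than part of M3: `predict_fn` is a hypothetical stand-in for any pointwise predictor that returns one value per frame, and a single isotropic localisation uncertainty `loc_sigma` is assumed for all frames.

```python
import math
import random
import statistics


def resample_inference(trajectory, loc_sigma, predict_fn, n_samples=100, seed=None):
    """Pointwise mean and spread of predictions under localisation noise.

    Assumptions: `trajectory` is a list of (x, y) positions, `loc_sigma`
    is one Gaussian localisation uncertainty shared by all frames, and
    `predict_fn` (hypothetical) maps a trajectory to one value per frame.
    """
    rng = random.Random(seed)
    per_frame = [[] for _ in trajectory]
    for _ in range(n_samples):
        # Perturb every coordinate with Gaussian noise of width loc_sigma.
        noisy = [(x + rng.gauss(0.0, loc_sigma), y + rng.gauss(0.0, loc_sigma))
                 for x, y in trajectory]
        for frame, value in enumerate(predict_fn(noisy)):
            per_frame[frame].append(value)
    means = [statistics.fmean(values) for values in per_frame]
    spreads = [statistics.stdev(values) for values in per_frame]
    return means, spreads


def toy_predictor(traj):
    # Placeholder for a trained model: distance of each frame from the origin.
    return [math.hypot(x, y) for x, y in traj]


track = [(float(i), 0.5 * i) for i in range(20)]
means, spreads = resample_inference(track, loc_sigma=0.05,
                                    predict_fn=toy_predictor, seed=1)
print(len(means), len(spreads))
```

The per-frame standard deviation plays the role of the pointwise error range, and the mean (or median) the final value. Frame-specific uncertainties could be supported by passing a list in place of the scalar `loc_sigma`.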
Lastly, specific cellular boundaries will constrain diffusion. As such, it may be better to finetune the model using simulated tracks that account for cell geometry. Finetuning ML models on experimental data will be harder and more subjective than on simulated data, due to small dataset sizes and the lack of a ground truth.
As a starting point for experimental validation, future work can involve the use of M3, without any further finetuning, to analyse binding kinetics. M3 performs very well at detecting immobile states (see figure 2(d)), which can be used to calculate dwell times, crucial for understanding cellular mechanisms and gene expression in vivo [16–18].
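As an illustration of this use case, dwell times can be read off a pointwise state sequence by measuring the length of each consecutive run of the immobile label. The sketch below assumes the immobile state is encoded as $s = 0$ (the text above only fixes $s = 1$ for confined and $s = 3$ for directed) and that frames are evenly spaced; the function name is hypothetical.

```python
from itertools import groupby


def dwell_times(states, target_state=0, frame_interval=1.0):
    """Durations of consecutive runs of `target_state` in a state sequence.

    Assumptions: `states` holds one predicted label per frame, the immobile
    state is labelled 0, and frames are `frame_interval` time units apart.
    """
    durations = []
    for state, run in groupby(states):
        run_length = sum(1 for _ in run)
        if state == target_state:
            durations.append(run_length * frame_interval)
    return durations


predicted_states = [2, 2, 0, 0, 0, 2, 0, 0, 1]
print(dwell_times(predicted_states, frame_interval=0.5))  # -> [1.5, 1.0]
```

Runs that touch the start or end of a trajectory are truncated by the observation window, so they only give a lower bound on the true dwell time and may need to be treated separately.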
Improvements can also be made to the model architecture, such as the addition of convolutional layers, which would require further training and hyperparameter optimisation. Convolutional layers may be less flexible with varying-length timeseries but can be incorporated in a unified (multi-headed) architecture. The model can also be entirely replaced by transformers, which have been shown to outperform LSTMs. Transformers and hybrid LSTM-CNN models were tested. All architectures were kept small (minimising free parameters) for compatibility with single-GPU training within a day, but none outperformed the current M3 model on the evaluation metrics. However, experimentation was brief and these architectures should be explored further.
Future changes to the model architecture should potentially focus on short trajectories ($<150$ timesteps) as this is where the model performs worse, with a plateau in accuracy for longer trajectories ($>150$ timesteps). For current applications to experimental single molecule data, longer trajectories are preferable, which may require running experiments using low laser intensities and photostable fluorophores.
The ability of machine learning methods to achieve better performance, as shown by the AnDi challenges, while also providing pointwise diffusive properties, highlights the need for a shift in the analysis of experimental data [21, 46]. New techniques, such as M3, provide fine-grained insights into diffusive behaviours, allowing for better understanding of cellular mechanisms. Combined with advancements in experimental data and techniques, this can lead to new discoveries in protein behaviour and mechanisms at cellular levels.
References
- 1 Elf J, Li G-W and Xie X S 2007 Probing transcription factor dynamics at the single-molecule level in a living cell Science 316 1191–4. doi: 10.1126/science.1141967, PMID: 17525339, PMC2853898
- 2 Gebhardt J C M, Suter D M, Roy R, Zhao Z W, Chapman A R, Basu S, Maniatis T and Xie X S 2013 Single-molecule imaging of transcription factor binding to DNA in live mammalian cells Nat. Methods 10 421–6. doi: 10.1038/nmeth.2411, PMID: 23524394, PMC3664538
- 3 Liu Z, Lavis L D and Betzig E 2015 Imaging live-cell dynamics and structure at the single-molecule level Mol. Cell 58 644–59. doi: 10.1016/j.molcel.2015.02.033, PMID: 26000849
- 4 Betzig E, Patterson G H, Sougrat R, Lindwasser O W, Olenych S, Bonifacino J S, Davidson M W, Lippincott-Schwartz J and Hess H F 2006 Imaging intracellular fluorescent proteins at nanometer resolution Science 313 1642–5. doi: 10.1126/science.1127344, PMID: 16902090
- 5 Balzarotti F, Eilers Y, Gwosch K C, Gynnå H A, Westphal V, Stefani F D, Elf J and Hell S W 2017 Nanometer resolution imaging and tracking of fluorescent molecules with minimal photon fluxes Science 355 606–12. doi: 10.1126/science.aak9913, PMID: 28008086
- 6 Lelek M, Gyparaki M T, Beliu G, Schueder F, Griffié J, Manley S, Jungmann R, Sauer M, Lakadamyali M and Zimmer C 2021 Single-molecule localization microscopy Nat. Rev. Meth. Primers 1 1–27. doi: 10.1038/s43586-021-00038-x, PMID: 35663461, PMC9160414
- 7 Heilemann M, van de Linde S, Schüttpelz M, Kasper R, Seefeldt B, Mukherjee A, Tinnefeld P and Sauer M 2008 Subdiffraction-resolution fluorescence imaging with conventional fluorescent probes Angew. Chem., Int. Ed. 47 6172–6. doi: 10.1002/anie.200802376, PMID: 18646237
- 8 Sharonov A and Hochstrasser R M 2006 Wide-field subdiffraction imaging by accumulated binding of diffusing probes Proc. Natl Acad. Sci. 103 18911–6. doi: 10.1073/pnas.0609643104, PMID: 17142314, PMC1748151
