Prediction of Retention Time in Larger Antisense Oligonucleotide Datasets using Machine Learning
Manal Rahal, Bestoun S. Ahmed, Christoph A. Bauer, Johan Ulander, Jorgen Samuelsson

TL;DR
This paper applies machine learning models to predict the retention time of antisense oligonucleotides in chromatography, addressing the challenge of sequence-dependent variability and improving prediction accuracy and efficiency.
Contribution
It introduces a machine learning approach with novel features for accurate, scalable retention time prediction of ASOs, outperforming traditional methods in speed and interpretability.
Findings
Gradient Boosting performs comparably to Support Vector Regression (SVR) but is faster to tune.
New features like sulfur count and terminal nucleotides improve model accuracy.
ML models effectively predict retention times across large datasets.
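The sequence-derived features named in the findings (sulfur count and terminal nucleotides) can be illustrated with a minimal sketch. The summary does not say how the paper encodes sequences, so the notation here is an assumption: bases are written as A/C/G/T, and a `*` after a base marks a sulfur-containing (phosphorothioate) linkage. The function name `aso_features` and the feature names are hypothetical and are not taken from the paper.

```python
from collections import Counter

BASES = "ACGT"


def aso_features(seq: str) -> dict:
    """Turn an ASO string into a flat dictionary of numeric features.

    Assumed notation (hypothetical, not from the paper): bases are
    A/C/G/T, and '*' after a base marks a sulfur-containing linkage.
    """
    # Keep only the base letters; linkage markers are counted separately.
    bases = [ch for ch in seq.upper() if ch in BASES]
    if not bases:
        raise ValueError("sequence contains no recognised bases")

    counts = Counter(bases)
    features = {
        "length": len(bases),
        "sulfur_count": seq.count("*"),
    }
    for b in BASES:
        features[f"count_{b}"] = counts.get(b, 0)
        # One-hot encode the first and last nucleotide.
        features[f"first_is_{b}"] = int(bases[0] == b)
        features[f"last_is_{b}"] = int(bases[-1] == b)
    return features


example = aso_features("G*C*A*TTG*C")
print(example["length"], example["sulfur_count"])
```

A dictionary like this, computed for each sequence, could serve as one row of the input table for a regression model; which features the authors actually used beyond those named above is not stated in this summary.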
Abstract
Antisense oligonucleotides (ASOs) are nucleic acid molecules with transformative therapeutic potential, especially for diseases that are untreatable by traditional drugs. However, the production and purification of ASOs remain challenging due to the presence of unwanted impurities. One tool successfully used to separate an ASO compound from the impurities is ion pair liquid chromatography (IPC). It is a critical step in separation, where each compound is identified by its retention time (tR) in the IPC. Due to the complex sequence-dependent behavior of ASOs and variability in chromatographic conditions, the accurate prediction of tR is a difficult task. This study addresses this challenge by applying machine learning (ML) to predict tR based on the sequence characteristics of ASOs. Four ML models (Gradient Boosting, Random Forest, Decision Tree, and Support Vector Regression) were…
Taxonomy
Topics: Computational Drug Discovery Methods · DNA and Nucleic Acid Chemistry · Analytical Chemistry and Chromatography
