A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis
Nimisha Ghosh, Daniele Santoni, Indrajit Saha, Giovanni Felici

TL;DR
This review paper discusses how Transformer-based language models, originally developed for NLP, are increasingly applied to analyze nucleotide sequences in bioinformatics, highlighting recent developments and potential future applications.
Contribution
It provides a comprehensive review and structured explanation of Transformer models' applications in nucleotide sequence analysis, facilitating understanding for new users and guiding future research.
Findings
Transformer models are effectively adapted for nucleotide sequences.
Numerous application-based studies demonstrate their utility in bioinformatics.
The review encourages further development of Transformer methodologies in bioinformatics.
Abstract
Transformer-based language models have recently had a considerable impact on the field of natural language processing. Because relevant parallels can be drawn between biological sequences and natural languages, the models used in NLP can be readily extended and adapted for various applications in bioinformatics. In this regard, this paper introduces the major recent developments in Transformer-based models in the context of nucleotide sequences. We have reviewed and analysed a large number of application-based papers on this subject, highlighting their main characterizing features and the different approaches that may be adopted to customize such powerful computational machines. We have also provided a structured description of how Transformers function, which may enable even first-time users to grasp the essence of such complex architectures. We believe this review…
Taxonomy
Topics: Machine Learning in Bioinformatics
