Identification of Protein Coding Regions in Genomic DNA Using Unsupervised FMACA Based Pattern Classifier

Pokkuluri Kiran Sree; Inampudi Ramesh Babu

arXiv:1401.6484·cs.CE·January 28, 2014·21 cites

Open Access

TL;DR

This paper introduces an unsupervised FMACA-based pattern classifier that accurately identifies protein-coding regions in DNA sequences, demonstrating scalability and improved accuracy over previous methods.

Contribution

It proposes an unsupervised FMACA classifier designed with a distinct K-Means algorithm, which the authors report yields a more accurate classifier for identifying coding regions in DNA.

Findings

01. High classification accuracy achieved

02. Scalable to large datasets

03. Outperforms previous classifiers

Abstract

Genes carry the instructions for making the proteins found in a cell, encoded as a specific sequence of nucleotides in DNA molecules. However, the regions of these genes that code for proteins may occupy only a small part of the sequence. Identifying the coding regions plays a vital role in understanding these genes. In this paper we propose an unsupervised Fuzzy Multiple Attractor Cellular Automata (FMACA) based pattern classifier to identify the coding region of a DNA sequence. We propose a distinct K-Means algorithm for designing the FMACA classifier, which is simple and efficient and produces a more accurate classifier than has previously been obtained for a range of different sequence lengths. Experimental results confirm the scalability of the proposed unsupervised FMACA based classifier to handle a large volume of datasets irrespective of the number of classes, tuples and…
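
The abstract does not spell out the authors' distinct K-Means algorithm or the FMACA rule structure, so neither is reproduced here. As a rough illustration of the unsupervised idea only, the sketch below runs standard Lloyd's K-Means over fixed-length DNA windows. The dinucleotide-frequency features, the function names `window_features` and `kmeans`, and the toy windows are all assumptions made for this example, not details from the paper.

```python
import random
from itertools import product

# All 16 dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def window_features(window):
    """Map a DNA window to 16 dinucleotide frequencies (assumed feature choice)."""
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(window) - 1):
        pair = window[i:i + 2].upper()
        if pair in counts:  # skip pairs containing N or other symbols
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Standard Lloyd's K-Means (not the paper's variant); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy usage: two AT-rich and two GC-rich windows, clustered into two groups.
windows = ["ATATATATAT", "TATATATATA", "GCGCGCGCGC", "CGCGCGCGCG"]
labels, _ = kmeans([window_features(w) for w in windows], k=2)
print(labels)
```

Cluster indices from K-Means are arbitrary, so deciding which cluster corresponds to coding versus non-coding windows would need a separate step that this sketch does not attempt.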

Taxonomy

Topics: Cellular Automata and Applications · Fractal and DNA sequence analysis · DNA and Biological Computing