On the Informativeness of the DNA Promoter Sequences Domain Theory

J. Ortega

arXiv:cs/9503101·cs.AI·November 17, 2014

Open Access

TL;DR

This paper reinterprets the DNA promoter sequences domain theory in terms of M-of-N concepts. With no learning involved, the reinterpreted theory reaches 93.4% accuracy on the 106 items of the database, which shows that the theory itself is informative.

Contribution

It introduces a simple change and reinterpretation of the domain theory, shows that the theory alone supports accurate classification, and uses an exhaustive search over M-of-N interpretations to help characterize how difficult learning with this theory is.

Findings

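1. The reinterpreted domain theory, with no learning, achieves 93.4% accuracy on the 106 items of the database.

2. A randomly chosen M-of-N interpretation has an expected accuracy of 76.5%.

3. The maximum accuracy of 97.2% is achieved in 12 cases.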

Abstract

The DNA promoter sequences domain theory and database have become popular for testing systems that integrate empirical and analytical learning. This note reports a simple change and reinterpretation of the domain theory in terms of M-of-N concepts, involving no learning, that results in an accuracy of 93.4% on the 106 items of the database. Moreover, an exhaustive search of the space of M-of-N domain theory interpretations indicates that the expected accuracy of a randomly chosen interpretation is 76.5%, and that a maximum accuracy of 97.2% is achieved in 12 cases. This demonstrates the informativeness of the domain theory, without the complications of understanding the interactions between various learning algorithms and the theory. In addition, our results help characterize the difficulty of learning using the DNA promoters theory.
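To make the idea of an M-of-N interpretation and the exhaustive search concrete, here is a minimal Python sketch. It does not reproduce the paper's domain theory or the promoter database: the two rule antecedent lists, the eight labeled strings, and the names `m_of_n`, `classify`, `accuracy`, and `search` are all hypothetical and serve only as an illustration. An interpretation is a choice of threshold M for each rule, and the search enumerates every such choice.

```python
from itertools import product

def m_of_n(m, conditions, example):
    """Return True when at least m of the n conditions hold for example."""
    return sum(1 for cond in conditions if cond(example)) >= m

# Hypothetical antecedents (NOT the promoter theory): each tests one position.
CONDITIONS_A = [
    lambda s: s[0] == "t",
    lambda s: s[1] == "a",
    lambda s: s[2] == "t",
]
CONDITIONS_B = [
    lambda s: s[3] == "g",
    lambda s: s[4] == "c",
]

# Made-up labeled examples (NOT the 106-item database).
DATA = [
    ("tatgc", True),
    ("tatga", True),
    ("tacgc", True),
    ("gatgc", True),
    ("ccccc", False),
    ("tcagg", False),
    ("aatcc", False),
    ("gcgta", False),
]

def classify(thresholds, example):
    """Positive when both rules fire under their M-of-N thresholds."""
    m_a, m_b = thresholds
    return (m_of_n(m_a, CONDITIONS_A, example)
            and m_of_n(m_b, CONDITIONS_B, example))

def accuracy(thresholds):
    """Fraction of DATA classified correctly under one interpretation."""
    correct = sum(1 for x, y in DATA if classify(thresholds, x) == y)
    return correct / len(DATA)

def search():
    """Enumerate every threshold choice; report mean, best, and ties at best."""
    space = product(range(1, len(CONDITIONS_A) + 1),
                    range(1, len(CONDITIONS_B) + 1))
    scores = [accuracy(t) for t in space]
    expected = sum(scores) / len(scores)  # accuracy of a uniformly random choice
    best = max(scores)
    return expected, best, scores.count(best)

if __name__ == "__main__":
    expected, best, n_best = search()
    print(f"expected={expected:.3f} best={best:.3f} achieved_by={n_best}")
```

The three values returned by `search` play the same roles as the abstract's expected accuracy, maximum accuracy, and number of cases reaching the maximum. As a side note on the reported figures, assuming the percentages are rounded to one decimal place, 93.4% of 106 items corresponds to 99 correct classifications and 97.2% to 103.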

Taxonomy

Topics: Machine Learning and Algorithms · Advanced biosensing and bioanalysis techniques · Machine Learning and Data Classification