MotifMark: Finding Regulatory Motifs in DNA Sequences
Hamid Reza Hassanzadeh, Pushkar Kolhe, Charles L. Isbell, May D. Wang

TL;DR
MotifMark is an algorithm based on graph theory and machine learning that identifies DNA-binding motifs from high-throughput microarray data and outperforms existing methods at predicting binding specificity.
Contribution
It introduces a computational pipeline that uses graph theory and machine learning to improve the accuracy of motif detection from noisy protein-binding microarray data.
Findings
MotifMark outperforms two leading motif search methods in benchmark tests.
The algorithm effectively ranks binding site specificity for transcription factors.
It provides a viable alternative for analyzing high-throughput DNA-protein interaction data.
Abstract
The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find…
