Motif Diversity in Human Liver ChIP-seq Data Using MAP-Elites
Alejandro Medina, Mary Lauren Benton

TL;DR
This paper frames motif discovery in regulatory DNA sequence data as a quality-diversity problem and applies the MAP-Elites algorithm to recover multiple diverse, high-quality motifs rather than a single dominant one.
Contribution
The work applies MAP-Elites to motif discovery, recovering multiple high-quality motifs with structured diversity, whereas traditional methods return a single solution.
Findings
MAP-Elites recovers multiple high-quality motifs.
MAP-Elites reveals structured diversity in motifs.
Motif quality is comparable to MEME.
Abstract
Motif discovery is a core problem in computational biology, traditionally formulated as a likelihood optimization task that returns a single dominant motif from a DNA sequence dataset. However, regulatory sequence data admit multiple plausible motif explanations, reflecting underlying biological heterogeneity. In this work, we frame motif discovery as a quality-diversity problem and apply the MAP-Elites algorithm to evolve position weight matrix motifs under a likelihood-based fitness objective while explicitly preserving diversity across biologically meaningful dimensions. We evaluate MAP-Elites using three complementary behavioral characterizations that capture trade-offs between motif specificity, compositional structure, coverage, and robustness. Experiments on human CTCF liver ChIP-seq data aligned to the human reference genome compare MAP-Elites against a standard motif discovery…
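To make the quality-diversity framing concrete, here is a minimal sketch of a MAP-Elites loop over position weight matrices. It is an illustration under stated assumptions, not the authors' implementation: the two descriptors (mean information content and GC content), the single-column mutation, the uniform background, the toy planted-motif sequences, and every function name are hypothetical choices made for this example. The paper's own three behavioral characterizations are not specified in the excerpt above.

```python
import math
import random

ALPHABET = "ACGT"


def random_pwm(width, rng):
    """Draw a width x 4 matrix whose columns are probability distributions."""
    pwm = []
    for _ in range(width):
        weights = [rng.random() + 0.05 for _ in ALPHABET]
        total = sum(weights)
        pwm.append([w / total for w in weights])
    return pwm


def mutate(pwm, rng, scale=0.3):
    """Perturb one randomly chosen column and renormalize it (assumed operator)."""
    child = [col[:] for col in pwm]
    col = child[rng.randrange(len(child))]
    noisy = [max(1e-3, p + rng.uniform(-scale, scale)) for p in col]
    total = sum(noisy)
    col[:] = [p / total for p in noisy]
    return child


def fitness(pwm, seqs):
    """Mean best-window log2 likelihood ratio against a uniform background."""
    width = len(pwm)
    total = 0.0
    for seq in seqs:
        best = -math.inf
        for start in range(len(seq) - width + 1):
            score = sum(
                math.log2(pwm[k][ALPHABET.index(seq[start + k])] / 0.25)
                for k in range(width)
            )
            best = max(best, score)
        total += best
    return total / len(seqs)


def descriptor(pwm, bins):
    """Map a PWM to a grid cell: (information-content bin, GC-content bin)."""
    # Per-column information content is 2 - entropy, so it lies in [0, 2] bits.
    info = sum(2.0 + sum(p * math.log2(p) for p in col if p > 0) for col in pwm)
    info /= len(pwm)
    gc = sum(col[1] + col[2] for col in pwm) / len(pwm)  # C and G probability mass
    i = min(bins - 1, int(info / 2.0 * bins))
    j = min(bins - 1, int(gc * bins))
    return i, j


def map_elites(seqs, width=6, bins=5, n_init=50, n_iter=500, seed=0):
    """Keep the fittest PWM found so far in each descriptor cell."""
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, pwm)
    for step in range(n_init + n_iter):
        if step < n_init or not archive:
            candidate = random_pwm(width, rng)
        else:
            _, parent = archive[rng.choice(sorted(archive))]
            candidate = mutate(parent, rng)
        cell = descriptor(candidate, bins)
        score = fitness(candidate, seqs)
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, candidate)
    return archive


def toy_sequences(n=20, length=30, motif="CCCTC", seed=1):
    """Random DNA with one planted copy of a toy motif per sequence."""
    rng = random.Random(seed)
    seqs = []
    for _ in range(n):
        bases = [rng.choice(ALPHABET) for _ in range(length)]
        start = rng.randrange(length - len(motif) + 1)
        bases[start:start + len(motif)] = motif
        seqs.append("".join(bases))
    return seqs


if __name__ == "__main__":
    archive = map_elites(toy_sequences(), width=5)
    for cell, (score, _) in sorted(archive.items()):
        print(cell, round(score, 3))
```

The difference from a single-solution optimizer is the archive: a candidate replaces an elite only within its own descriptor cell, so a weaker but behaviorally distinct motif survives alongside the overall best one.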
