Prediction and Comparative Analysis of CTCF Binding Sites based on a First Principle Approach
Nestor Norio Oiwa, Kunhe Li, Claudette E. Cordeiro, Dieter W. Heermann

TL;DR
This study employs a first principle approach to predict CTCF binding sites across multiple genomes, validating the method with experimental data and analyzing the organizational patterns of these sites in various species.
Contribution
The paper introduces a novel first principle computational method for predicting CTCF binding sites and provides a comparative analysis across diverse genomes, revealing organizational patterns and deviations.
Findings
Predicted human CTCF sites align with experimental data.
Binding sites form cluster-like groups, and the distances between consecutive sites follow a power law.
The Aedes aegypti genome does not show this power law, so its binding sites are organized differently.
Abstract
We calculated the patterns for the CCCTC transcription factor (CTCF) binding sites across many genomes on a first principle approach. The validation of the first principle method was done on the human as well as on the mouse genome. The predicted human CTCF binding sites are consistent with the consensus sequence, ChIP-seq data for the K562 cell, nucleosome positions for IMR90 cell as well as the CTCF binding sites in the mouse HOXA gene. The analysis of Homo sapiens, Mus musculus, Sus scrofa, Capra hircus and Drosophila melanogaster whole genomes shows: binding sites are organized in cluster-like groups, where two consecutive sites obey a power-law with coefficient ranging from to to ; the distance between these groups varies from kbp to kbp. The genome of Aedes aegypti does not show a power law, but of binding sites…
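The spacing analysis described in the abstract can be illustrated with a minimal sketch. It assumes the "coefficient" is the exponent alpha in P(d) ~ d^(-alpha) for the distance d between two consecutive sites, and it estimates alpha by least squares on a log-binned histogram. The excerpt does not state the authors' fitting procedure, so the function names, the binning and the estimator below are all hypothetical choices, and the data are synthetic rather than taken from any genome.

```python
import numpy as np


def consecutive_distances(positions):
    """Gaps (in bp) between neighbouring sites on a single chromosome."""
    pos = np.sort(np.asarray(positions, dtype=float))
    return np.diff(pos)


def fit_power_law_exponent(distances, n_bins=20, min_count=5):
    """Estimate alpha in P(d) ~ d**(-alpha) from a log-binned histogram.

    Assumption: ordinary least squares in log-log space; the paper may
    use a different estimator.
    """
    d = np.asarray(distances, dtype=float)
    d = d[d > 0]  # drop zero gaps (duplicate positions); log is undefined there
    edges = np.logspace(np.log10(d.min()), np.log10(d.max()), n_bins + 1)
    counts, _ = np.histogram(d, bins=edges)
    density = counts / (counts.sum() * np.diff(edges))
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centres
    keep = counts >= min_count  # skip sparsely populated tail bins
    slope, _ = np.polyfit(np.log10(centers[keep]), np.log10(density[keep]), 1)
    return -slope


# Synthetic example: gaps drawn from a power law with alpha = 2 (not real data).
rng = np.random.default_rng(0)
synthetic_gaps = 10.0 * (1.0 - rng.random(100_000)) ** -1.0
synthetic_sites = np.cumsum(synthetic_gaps)
alpha_hat = fit_power_law_exponent(consecutive_distances(synthetic_sites))
print(f"estimated alpha: {alpha_hat:.2f}")
```

Logarithmic bins keep the sparse tail from dominating the fit. A maximum-likelihood estimator would usually be preferable for real data, and a genome like Aedes aegypti, where the abstract reports no power law, would show up as a poor straight-line fit in log-log space.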
Taxonomy
Topics: Genomics and Chromatin Dynamics · RNA Research and Splicing · Animal Genetics and Reproduction
