Abstractions, Algorithms and Data Structures for Structural Bioinformatics in PyCogent
Marcin Cieslik, Zygmunt Derewenda, Cameron Mura

TL;DR
This paper introduces new object-oriented modules and algorithms for structural bioinformatics within PyCogent, enhancing its capabilities for 3D structure analysis and integration with sequence-based bioinformatics tools.
Contribution
It adds an extensible Python framework for structural bioinformatics to PyCogent, combining object-oriented abstractions, efficient data structures (e.g. kD-trees), fast algorithms, and read/write support for Protein Data Bank-related file formats.
Findings
Enhanced 3D structure processing in PyCogent
Fast implementations of common algorithms, such as surface-area calculations
Integration enables combined structural and sequence analyses
Abstract
To facilitate flexible and efficient structural bioinformatics analyses, new functionality for three-dimensional structure processing and analysis has been introduced into PyCogent -- a popular feature-rich framework for sequence-based bioinformatics, but one which has lacked equally powerful tools for handling structural/coordinate-based data. Extensible Python modules have been developed, which provide object-oriented abstractions (based on a hierarchical representation of macromolecules), efficient data structures (e.g. kD-trees), fast implementations of common algorithms (e.g. surface-area calculations), read/write support for Protein Data Bank-related file formats and wrappers for external command-line applications (e.g. Stride). Integration of this code into PyCogent is symbiotic, allowing sequence-based work to benefit from structure-derived data and, reciprocally, enabling…
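The abstract names kD-trees as one of the efficient data structures behind the new modules. As a rough illustration only, and not PyCogent's implementation or API, the sketch below builds a small 3-D kD-tree in plain Python and uses it to collect the atoms that lie within a cutoff distance of a query point. The names KDNode, build_kdtree and within_radius, as well as the coordinates, are hypothetical and chosen for this example.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


class KDNode:
    """One node of a 3-D kD-tree: a point index, its split axis, two subtrees."""

    def __init__(self, index: int, axis: int,
                 left: "Optional[KDNode]", right: "Optional[KDNode]") -> None:
        self.index = index
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: Sequence[Point],
                 indices: Optional[List[int]] = None,
                 depth: int = 0) -> Optional[KDNode]:
    """Split the points at the median, cycling through the x, y and z axes."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 3
    ordered = sorted(indices, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return KDNode(
        ordered[mid], axis,
        build_kdtree(points, ordered[:mid], depth + 1),
        build_kdtree(points, ordered[mid + 1:], depth + 1),
    )


def within_radius(node: Optional[KDNode], points: Sequence[Point],
                  query: Point, radius: float,
                  found: Optional[List[int]] = None) -> List[int]:
    """Return indices of all points no farther than `radius` from `query`."""
    if found is None:
        found = []
    if node is None:
        return found
    point = points[node.index]
    if math.dist(point, query) <= radius:
        found.append(node.index)
    # Signed distance from the query to this node's splitting plane.
    delta = query[node.axis] - point[node.axis]
    # Descend into a subtree only if the search sphere can reach its side.
    if delta <= radius:
        within_radius(node.left, points, query, radius, found)
    if delta >= -radius:
        within_radius(node.right, points, query, radius, found)
    return found


# Hypothetical atom coordinates (in angstroms), for illustration only.
atoms: List[Point] = [
    (0.0, 0.0, 0.0),
    (1.5, 0.0, 0.0),
    (0.0, 3.0, 0.0),
    (10.0, 10.0, 10.0),
    (1.0, 1.0, 1.0),
]
tree = build_kdtree(atoms)
neighbours = sorted(within_radius(tree, atoms, (0.0, 0.0, 0.0), 2.0))
print(neighbours)
```

Pruning on the splitting plane lets a query skip whole subtrees, which is why structures of this kind suit repeated neighbour searches over many atoms. For real work an established implementation would normally be preferable to this sketch.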
