# NOCTIS: open-source toolkit that turns reaction data into actionable graph networks

**Authors:** Nataliya Lopanitsyna, Marta Pasquini, Marco Stenta

PMC · DOI: 10.1186/s13321-025-01118-w · 2025-12-04

## TL;DR

NOCTIS is an open-source toolkit that converts chemical reaction data into graph networks to help design efficient synthetic routes.

## Contribution

NOCTIS introduces a modular, open-source framework for constructing and analyzing reaction graphs with route enumeration capabilities.

## Key findings

- NOCTIS supports large-scale reaction data analysis using graph-based methods and parallel processing.
- A companion plugin enumerates synthetic routes exhaustively, which helps avoid redundant exploration.
- The toolkit is demonstrated using the MIT USPTO-480k dataset, showcasing route mining and network analysis.

## Abstract

Chemical reactions form densely connected networks, and exploring these networks is essential for designing efficient and sustainable synthetic routes. As reaction data from literature, patents, and high-throughput experimentation continue to grow, so does the need for tools that can navigate and mine these large-scale datasets. Graph-based representations capture the topology of reaction space, yet few open-source tools exist for building and querying such networks. To address this, we developed NOCTIS, an open-source toolkit for constructing and analyzing reaction data as graphs.

NOCTIS is an open-source Python package for building Networks of Organic Chemistry (NOCs) from reaction strings. It supports graph-based analysis, parallel processing of large datasets, and export to common Python formats (e.g., NetworkX, pandas). Built on Neo4j technology, it features a modular, extensible architecture with open-source dependencies. We also provide a companion plugin for exhaustive route enumeration. It traverses graph-encoded reactions to assemble all valid synthetic routes, helping prevent redundant exploration and supporting knowledge reuse in synthesis planning. The underlying algorithm is documented in detail along with its current limitations. Using the MIT USPTO-480k dataset (Adv Neural Inf Process Syst 30, 2017), we demonstrate the plugin’s route mining capabilities, analyze network connectivity, and assess synthetic trees.
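To make the idea of building a network from reaction strings concrete, here is a minimal standard-library sketch. It is not the NOCTIS API: the function name `build_reaction_graph`, the `REACTANT`/`PRODUCT` edge labels, and the two example reaction SMILES are assumptions chosen for illustration. Molecules and reactions become two kinds of nodes, and a molecule that is the product of one reaction and a reactant of another links the two reactions.

```python
def build_reaction_graph(reaction_strings):
    """Build a bipartite molecule/reaction graph.

    Each input is expected to look like 'reactants>agents>products',
    with '.' separating the molecules inside each part.
    """
    molecules = set()
    reactions = set()
    edges = set()  # (source, target, label) triples
    for rxn in reaction_strings:
        rxn = rxn.strip()
        parts = rxn.split(">")
        if len(parts) != 3:
            raise ValueError(f"not a reaction string: {rxn!r}")
        # Agents are parsed but ignored in this sketch.
        reactants, _agents, products = (
            [s for s in part.split(".") if s] for part in parts
        )
        reactions.add(rxn)
        for mol in reactants:
            molecules.add(mol)
            edges.add((mol, rxn, "REACTANT"))
        for mol in products:
            molecules.add(mol)
            edges.add((rxn, mol, "PRODUCT"))
    return molecules, reactions, edges


# Illustrative input: reduction of 4-nitrophenol, then acetylation to paracetamol.
reaction_strings = [
    "O=[N+]([O-])c1ccc(O)cc1>>Nc1ccc(O)cc1",
    "Nc1ccc(O)cc1.CC(=O)OC(C)=O>>CC(=O)Nc1ccc(O)cc1",
]
molecules, reactions, edges = build_reaction_graph(reaction_strings)
```

The resulting node and edge sets could be loaded into a NetworkX directed graph or a pandas table; the sketch stops short of that to stay dependency-free.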

Built on LinChemIn (J Chem Inf Model 64(6):1765–1771, 2024), NOCTIS serves as an open and extensible toolkit for network-based reaction analysis and route mining, laying the groundwork for data-driven route design at scale. Future work will extend query capabilities and improve the efficiency of route extraction.

NOCTIS provides a programmatic, query-driven framework for exploring large reaction networks beyond the limitations of traditional tabular data analysis. Synthetic routes are extracted via a built-in query invoking the plugin and can be used in downstream workflows. Its flexible architecture allows users to customize or extend workflows without modifying the underlying code, making it suitable for large-scale, automated analysis.
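The route extraction described above can be illustrated with a small recursive enumeration. This is a sketch under stated assumptions, not the plugin's documented algorithm: the `producers` mapping, the reaction identifiers `R1` to `R4`, and the rule that any molecule without a recorded producing reaction counts as a starting material are all hypothetical. A route is represented as the set of reactions it uses, and molecules already on the current path are skipped to avoid cycles.

```python
from itertools import product


def enumerate_routes(target, producers, _path=frozenset()):
    """Return the set of routes to `target`, each a frozenset of reaction ids.

    `producers` maps a molecule to a list of (reaction_id, reactants) pairs.
    """
    if target not in producers:
        # No recorded synthesis: treat as a starting material (empty route).
        return {frozenset()}
    if target in _path:
        # The molecule is already being expanded upstream: cycle, no route.
        return set()
    routes = set()
    for rxn_id, reactants in producers[target]:
        # Routes for each reactant, explored independently.
        sub_routes = [
            enumerate_routes(r, producers, _path | {target}) for r in reactants
        ]
        # One choice per reactant, combined with the current reaction.
        for combo in product(*sub_routes):
            routes.add(frozenset({rxn_id}).union(*combo))
    return routes


# Toy network (hypothetical reaction ids) around paracetamol.
producers = {
    "paracetamol": [("R1", ("4-aminophenol", "acetic anhydride"))],
    "4-aminophenol": [("R2", ("4-nitrophenol",)), ("R3", ("nitrobenzene",))],
    "4-nitrophenol": [("R4", ("phenol",))],
}
routes = enumerate_routes("paracetamol", producers)

# A two-molecule cycle yields no valid route.
cyclic = {"A": [("Ra", ("B",))], "B": [("Rb", ("A",))]}
cyclic_routes = enumerate_routes("A", cyclic)
```

Exhaustive enumeration of this kind grows combinatorially with the number of alternative reactions per molecule, which is one reason the efficiency of route extraction matters at scale.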

## Full-text entities

- **Chemicals:** Paracetamol (MESH:D000082)

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12798089/full.md

---
Source: https://tomesphere.com/paper/PMC12798089