PlasmoData.jl -- A Julia Framework for Modeling and Analyzing Complex Data as Graphs
David L Cole, Victor M Zavala

TL;DR
PlasmoData.jl is a Julia framework that models complex datasets as graphs, enabling advanced analysis using topology, graph theory, and machine learning, demonstrated through diverse real-world applications.
Contribution
It introduces a flexible DataGraph abstraction in Julia for representing various data types as graphs, facilitating analysis with graph-based tools and machine learning methods.
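As a rough illustration of the idea of representing data as a graph, the sketch below turns a small matrix (standing in for a grayscale image) into a grid graph with pixel values stored on the nodes, then thresholds it and computes two simple graph descriptors. It is written in plain Python and does not use or reproduce the PlasmoData.jl API: the class `GridDataGraph` and the functions `matrix_to_graph`, `filter_nodes`, `euler_characteristic`, and `count_components` are hypothetical names chosen for this example, and the 4-neighbor connectivity is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class GridDataGraph:
    """Hypothetical stand-in for a data graph: nodes, edges, and per-node data."""
    nodes: list = field(default_factory=list)      # (row, col) tuples
    edges: list = field(default_factory=list)      # pairs of nodes
    node_data: dict = field(default_factory=dict)  # node -> stored value


def matrix_to_graph(matrix):
    """Map each matrix entry to a node and link 4-neighbors (assumed connectivity)."""
    g = GridDataGraph()
    rows, cols = len(matrix), len(matrix[0])
    for r in range(rows):
        for c in range(cols):
            g.nodes.append((r, c))
            g.node_data[(r, c)] = matrix[r][c]
            if r + 1 < rows:
                g.edges.append(((r, c), (r + 1, c)))
            if c + 1 < cols:
                g.edges.append(((r, c), (r, c + 1)))
    return g


def filter_nodes(g, threshold):
    """Keep nodes whose value is >= threshold, plus edges between kept nodes."""
    keep = {n for n in g.nodes if g.node_data[n] >= threshold}
    return GridDataGraph(
        nodes=[n for n in g.nodes if n in keep],
        edges=[(u, v) for u, v in g.edges if u in keep and v in keep],
        node_data={n: g.node_data[n] for n in keep},
    )


def euler_characteristic(g):
    """Euler characteristic of a graph: number of nodes minus number of edges."""
    return len(g.nodes) - len(g.edges)


def count_components(g):
    """Count connected components with an iterative depth-first search."""
    adjacency = {n: [] for n in g.nodes}
    for u, v in g.edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, components = set(), 0
    for start in g.nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return components


image = [
    [9, 9, 0, 0],
    [9, 9, 0, 7],
    [0, 0, 0, 7],
]
graph = matrix_to_graph(image)
bright = filter_nodes(graph, threshold=5)
print(count_components(bright), euler_characteristic(bright))  # prints "2 1"
```

Sweeping the threshold over a range of values and recording such descriptors at each level would give a feature vector that a downstream classifier could consume; whether this matches the feature pipeline used in the paper is not stated in this summary.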
Findings
Successfully applied to image classification with topological features
Detected abnormal events in multivariate time series data
Enabled navigation of connectivity in technology pathways
Abstract
Datasets encountered in scientific and engineering applications appear in complex formats (e.g., images, multivariate time series, molecules, video, text strings, networks). Graph theory provides a unifying framework to model such datasets and enables the use of powerful tools that can help analyze, visualize, and extract value from data. In this work, we present PlasmoData.jl, an open-source Julia framework that uses concepts of graph theory to facilitate the modeling and analysis of complex datasets. The core of our framework is a general data modeling abstraction, which we call a DataGraph. We show how the abstraction and software implementation can be used to represent diverse data objects as graphs and to enable the use of tools from topology, graph theory, and machine learning (e.g., graph neural networks) to conduct a variety of tasks. We illustrate the versatility of the…
Taxonomy
Topics: Data Visualization and Analytics · Topological and Geometric Data Analysis · Scientific Computing and Data Management
