COKE: Causal Discovery with Chronological Order and Expert Knowledge in High Proportion of Missing Manufacturing Data
Ting-Yun Ou, Ching Chang, Wen-Chih Peng

TL;DR
COKE is a causal discovery method that uses expert knowledge and chronological order to build causal graphs from manufacturing data with a high proportion of missing values, improving causal graph accuracy without imputing the missing values.
Contribution
COKE leverages expert knowledge and the chronological order among sensors to construct causal graphs directly from incomplete manufacturing data, and it outperforms the benchmark methods in F1-score.
Findings
Achieves an average 39.9% improvement in F1-score over benchmark methods.
Reaches up to 62.6% F1-score improvement in real-world datasets.
Attains 85.0% F1-score improvement in semiconductor datasets.
Abstract
Understanding causal relationships between machines is crucial for fault diagnosis and optimization in manufacturing processes. Real-world datasets frequently exhibit up to 90% missing data and high dimensionality from hundreds of sensors. These datasets also include domain-specific expert knowledge and chronological order information, reflecting the recording order across different machines, which is pivotal for discerning causal relationships within the manufacturing data. However, previous methods for handling missing data in scenarios akin to real-world conditions have not been able to effectively utilize expert knowledge. Conversely, prior methods that can incorporate expert knowledge struggle with datasets that exhibit missing values. Therefore, we propose COKE to construct causal graphs in manufacturing datasets by leveraging expert knowledge and chronological order among sensors…
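To illustrate how recording order and expert knowledge can narrow the search for causal edges, here is a minimal sketch. It is not the COKE algorithm itself. It only assumes that a sensor recorded at a later step cannot cause one recorded earlier, and that expert-asserted edges are always kept. The function name `candidate_edges`, the sensor names, and the `order` layout are all hypothetical.

```python
from itertools import permutations


def candidate_edges(order, expert_edges=()):
    """Return the directed edges (cause, effect) allowed by recording order.

    order: dict mapping sensor name -> recording step (hypothetical layout).
    expert_edges: edges asserted by domain experts; these are always kept.
    """
    allowed = {
        (a, b)
        for a, b in permutations(order, 2)
        if order[a] < order[b]  # a was recorded strictly before b
    }
    return allowed | set(expert_edges)


# Hypothetical sensors: two on the first machine, one each on the next two.
order = {"etch_temp": 0, "etch_pressure": 0, "deposit_rate": 1, "polish_force": 2}
edges = candidate_edges(order, expert_edges=[("etch_temp", "etch_pressure")])
```

In this sketch the order constraint removes every edge that points backwards in time, while the expert edge adds a relation between two sensors of the same step that the order alone cannot orient.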
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Data Quality and Management · Manufacturing Process and Optimization
