Managing large-scale scientific hypotheses as uncertain and probabilistic data
Bernardo Gonçalves

TL;DR
This paper presents a novel method for encoding and managing large-scale scientific hypotheses as probabilistic data, enabling effective causal reasoning and uncertainty propagation for data-driven scientific prediction.
Contribution
It introduces algorithms to encode hypotheses into functional dependencies and perform causal reasoning, facilitating probabilistic hypothesis management in scientific data systems.
Findings
Efficient encoding of hypotheses into functional dependencies.
Effective causal reasoning over hypothesis structures.
Demonstrated applicability in computational science use cases.
Abstract
In view of the paradigm shift that makes science ever more data-driven, in this thesis we propose a synthesis method for encoding and managing large-scale deterministic scientific hypotheses as uncertain and probabilistic data. In the form of mathematical equations, hypotheses symmetrically relate aspects of the studied phenomena. For computing predictions, however, deterministic hypotheses can be abstracted as functions. We build upon Simon's notion of structural equations in order to efficiently extract the (so-called) causal ordering between variables, implicit in a hypothesis structure (set of mathematical equations). We show how to process the hypothesis predictive structure effectively through original algorithms for encoding it into a set of functional dependencies (fd's) and then performing causal reasoning in terms of acyclic pseudo-transitive reasoning over fd's. Such…
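The encoding step described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes a complete structure (as many equations as variables), uses a plain augmenting-path bipartite matching to pick one determined variable per equation, and then reads each equation as a functional dependency from its remaining variables to that variable. All names here (`total_causal_mapping`, `encode_fds`, `causal_ancestors`, and the toy structure `f1`..`f3`) are hypothetical and chosen for illustration only.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# An fd is (determinant variables, determined variable). Hypothetical representation.
FD = Tuple[FrozenSet[str], str]


def total_causal_mapping(structure: Dict[str, Set[str]]) -> Dict[str, str]:
    """Match every equation to one distinct variable that appears in it.

    Assumption: the structure is complete (n equations, n variables).
    Uses simple augmenting-path bipartite matching.
    """
    owner: Dict[str, str] = {}  # variable -> equation currently determining it

    def assign(eq: str, seen: Set[str]) -> bool:
        for var in sorted(structure[eq]):
            if var in seen:
                continue
            seen.add(var)
            # Take a free variable, or try to re-route the equation that owns it.
            if var not in owner or assign(owner[var], seen):
                owner[var] = eq
                return True
        return False

    for eq in structure:
        if not assign(eq, set()):
            raise ValueError(f"no variable left for equation {eq!r}")
    return {eq: var for var, eq in owner.items()}


def encode_fds(structure: Dict[str, Set[str]]) -> List[FD]:
    """Read each equation as: its other variables -> its matched variable."""
    mapping = total_causal_mapping(structure)
    return [(frozenset(structure[eq] - {var}), var) for eq, var in mapping.items()]


def causal_ancestors(target: str, fds: List[FD]) -> Set[str]:
    """Follow fd's backwards (transitively) to collect every variable
    that the target depends on. Assumes the fd set is acyclic."""
    determinant = {rhs: lhs for lhs, rhs in fds}
    found: Set[str] = set()
    stack = [target]
    while stack:
        current = stack.pop()
        for var in determinant.get(current, frozenset()):
            if var not in found:
                found.add(var)
                stack.append(var)
    return found


# Toy hypothesis structure: f1(x) = 0, f2(x, y) = 0, f3(y, z) = 0.
structure = {"f1": {"x"}, "f2": {"x", "y"}, "f3": {"y", "z"}}
mapping = total_causal_mapping(structure)
fds = encode_fds(structure)
ancestors_of_z = causal_ancestors("z", fds)
```

In this toy structure the matching is unique, so `x` is determined on its own, `y` from `x`, and `z` from `y`. For structures with strongly coupled variables, several matchings can exist, and a sketch like this one would return only one of them.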
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Data Management and Algorithms
