From Data to the p-Adic or Ultrametric Model
Fionn Murtagh

TL;DR
This paper models anomaly and change in data by using Correspondence Analysis to give the data a Euclidean metric and then inducing an ultrametric that respects a sequential (e.g. temporal) ordering. It applies the approach to a film script and to social conflict data.
Contribution
It models anomaly and change through an ultrametric induced on the Euclidean space that Correspondence Analysis produces, with the ultrametric taking a sequential ordering of the data into account. The approach is applied to narrative and social conflict data.
Findings
Models the flow of narrative in the Casablanca film script
Models the evolution of the Colombian social conflict and violence between 1988 and 2004
The induced ultrametric takes a sequential (e.g. temporal) ordering of the data into account
Abstract
We model anomaly and change in data by embedding the data in an ultrametric space. Taking our initial data as cross-tabulation counts (or other input data formats), Correspondence Analysis allows us to endow the information space with a Euclidean metric. We then model anomaly or change by an induced ultrametric. The induced ultrametric that we are particularly interested in takes a sequential (e.g. temporal) ordering of the data into account. We apply this work to the flow of narrative expressed in the script of the film Casablanca, and to the evolution between 1988 and 2004 of the Colombian social conflict and violence.
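The pipeline described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions not stated above: the `counts` table is invented toy data, the function names are hypothetical, and the ultrametric is induced by sequence-constrained complete-link agglomeration (only neighbouring clusters in the sequence may merge), which is one possible choice and not necessarily the paper's exact procedure.

```python
import numpy as np


def ca_row_coordinates(counts):
    """Row principal coordinates from Correspondence Analysis.

    Euclidean distances between the returned rows equal the chi-squared
    distances between the row profiles of the table.
    """
    P = counts / counts.sum()            # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    return (U * sv) / np.sqrt(r)[:, None]


def sequence_constrained_ultrametric(X):
    """Ultrametric induced by merging only adjacent clusters (complete link).

    Rows of X are assumed to be in sequential (e.g. temporal) order.
    The distance between i and j is the height at which they first
    share a cluster.
    """
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = [[i] for i in range(n)]
    ultra = np.zeros((n, n))
    while len(clusters) > 1:
        # complete-link distance between each pair of neighbouring clusters
        gaps = [D[np.ix_(clusters[k], clusters[k + 1])].max()
                for k in range(len(clusters) - 1)]
        k = int(np.argmin(gaps))
        a, b = clusters[k], clusters[k + 1]
        ultra[np.ix_(a, b)] = gaps[k]
        ultra[np.ix_(b, a)] = gaps[k]
        clusters[k:k + 2] = [a + b]      # merge the two neighbours
    return ultra


# Toy cross-tabulation (hypothetical): 6 successive segments x 4 attributes.
counts = np.array([
    [12, 3, 1, 4],
    [10, 4, 2, 3],
    [2, 9, 8, 1],
    [1, 10, 9, 2],
    [5, 2, 1, 11],
    [4, 1, 2, 12],
], dtype=float)

X = ca_row_coordinates(counts)
ultra = sequence_constrained_ultrametric(X)
print(np.round(ultra, 3))
```

With complete link and the adjacency constraint, merge heights never decrease, so the resulting matrix satisfies the strong triangle inequality d(i, j) <= max(d(i, k), d(k, j)). A large jump in height between successive segments is then a candidate marker of change in the sequence.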
