Text Classification of the Precursory Accelerating Seismicity Corpus: Inference on some Theoretical Trends in Earthquake Predictability Research from 1988 to 2018

Arnaud Mignan

arXiv:1810.03480·cs.CL·April 19, 2019

TL;DR

This study applies machine learning classifiers to seismology literature to analyze trends in earthquake predictability, finding Naive Bayes most effective for small datasets but with limited generalization to recent articles.

Contribution

First application of text classification to seismology articles, demonstrating potential and limitations of machine learning in analyzing earthquake predictability research trends.

Findings

01. Naive Bayes achieved 86% accuracy in binary classification.

02. Multiclass classification reached up to 78% accuracy.

03. Weak generalization to recent articles with 60% F1-score.

Abstract

Text analytics based on supervised machine learning classifiers has shown great promise in a multitude of domains, but has yet to be applied to seismology. We test various standard models (Naive Bayes, k-Nearest Neighbors, Support Vector Machines, and Random Forests) on a seismological corpus of 100 articles related to the topic of precursory accelerating seismicity, spanning from 1988 to 2010. This corpus was labelled in Mignan (2011) according to whether the precursor is explained by critical processes (i.e., cascade triggering) or by other processes (such as the signature of main fault loading). We investigate whether the classification process can be automated to help analyze larger corpora in order to better understand trends in earthquake predictability research. We find that the Naive Bayes model performs best, in agreement with the machine learning literature for the case of small datasets,…
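To make the binary classification task in the abstract concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the four toy documents, the label names `critical` and `other`, the whitespace tokenizer, and the helper names `tokenize`, `train_naive_bayes`, and `predict` are all invented here and are not drawn from the Mignan (2011) corpus.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would
    # also strip punctuation and stop words.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / n_docs)  # log prior
        total = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            likelihood = (word_counts[label][token] + 1) / (total + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, invented for illustration only.
toy_docs = [
    "critical point power law accelerating seismicity cascade triggering",
    "self organized criticality cascade triggering of aftershocks",
    "stress loading on the main fault explains accelerating moment release",
    "stress accumulation and fault loading before the mainshock",
]
toy_labels = ["critical", "critical", "other", "other"]

model = train_naive_bayes(toy_docs, toy_labels)
print(predict(model, "cascade triggering near a critical point"))
print(predict(model, "fault loading and stress accumulation"))
```

With so few training documents, the smoothing term dominates the likelihoods. That is one reason a simple model with few parameters can be competitive on a corpus of only 100 articles.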
