Comparative analysis of criteria for filtering time series of word usage frequencies
Inna A. Belashova, Vladimir V. Bochkarev

TL;DR
This paper introduces a novel nonlinear wavelet thresholding method using genetic algorithms and the runs test to improve filtering quality of time series, demonstrated on Google Books Ngram data.
Contribution
It proposes a new filtering approach combining wavelet thresholding, the runs test, and genetic algorithms for enhanced time series filtering quality.
Findings
The method outperforms standard wavelet thresholding in filtering quality.
It is effective on both model series and real word frequency data.
The approach is suitable when filtering quality outweighs computational speed.
Abstract
This paper describes a method of nonlinear wavelet thresholding of time series. The Ramachandran-Ranganathan runs test is used to assess the quality of the approximation. To minimize the objective function, the authors propose using genetic algorithms, a class of stochastic optimization methods. The method is tested on both model series and word frequency series from the Google Books Ngram data. Filtering based on the runs criterion gives significantly better results than standard wavelet thresholding. The method is suited to cases where filtering quality matters more than computation speed.
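The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a Haar wavelet, soft thresholding with one threshold per decomposition level, the classical Wald-Wolfowitz runs test on residual signs as a stand-in for the Ramachandran-Ranganathan statistic, and an illustrative objective (keep as few detail coefficients as possible while the residuals still pass the runs test). All function names and genetic algorithm settings below are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """Orthonormal Haar decomposition; len(x) must be a power of two."""
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])  # finest level first
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details

def haar_inverse(approx, details):
    out = list(approx)
    for level in reversed(details):  # coarsest level first
        nxt = []
        for a, d in zip(out, level):
            nxt.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        out = nxt
    return out

def denoise(x, thresholds):
    """Soft-threshold each detail level; return (filtered series, kept count)."""
    approx, details = haar_forward(x)
    shrunk = [[math.copysign(max(abs(c) - t, 0.0), c) for c in level]
              for level, t in zip(details, thresholds)]
    kept = sum(1 for level in shrunk for c in level if c != 0.0)
    return haar_inverse(approx, shrunk), kept

def runs_z(residuals):
    """Wald-Wolfowitz z-score for the number of sign runs (assumed stand-in)."""
    signs = [r > 0 for r in residuals if r != 0]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        return 0.0 if n == 0 else -math.inf
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    return (runs - mean) / math.sqrt(var) if var > 0 else 0.0

def objective(x, thresholds):
    """Assumed cost: kept coefficients plus a penalty when |z| exceeds 1.96."""
    y, kept = denoise(x, thresholds)
    z = runs_z([a - b for a, b in zip(x, y)])
    return kept + len(x) * max(0.0, abs(z) - 1.96)

def genetic_search(x, pop_size=30, generations=40, seed=0):
    """Tiny elitist genetic algorithm over per-level thresholds."""
    rng = random.Random(seed)
    _, details = haar_forward(x)
    t_max = max(abs(c) for level in details for c in level) or 1.0
    pop = [[rng.uniform(0.0, t_max) for _ in details] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: objective(x, t))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p, q = rng.sample(elite, 2)
            child = [rng.choice(genes) for genes in zip(p, q)]  # uniform crossover
            child = [min(t_max, max(0.0, g + rng.gauss(0.0, 0.05 * t_max)))
                     for g in child]  # Gaussian mutation, clipped to [0, t_max]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda t: objective(x, t))

if __name__ == "__main__":
    noise = random.Random(1)
    series = [math.sin(2 * math.pi * i / 64) + noise.gauss(0.0, 0.2)
              for i in range(128)]
    best = genetic_search(series)
    filtered, kept = denoise(series, best)
    print("thresholds per level:", [round(t, 3) for t in best])
    print("detail coefficients kept:", kept)
```

The search evaluates the objective many times per generation, which is consistent with the abstract's remark that the approach trades computation speed for filtering quality. Swapping in the actual Ramachandran-Ranganathan statistic would only require replacing `runs_z`.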
Taxonomy
Topics: Complex Systems and Time Series Analysis · Evolutionary Algorithms and Applications · Stock Market Forecasting Methods
