The R package sentometrics to compute, aggregate and predict with textual sentiment

David Ardia; Keven Bluteau; Samuel Borms; Kris Boudt

arXiv:2110.10817·stat.ML·October 22, 2021

TL;DR

The paper introduces the R package sentometrics, which efficiently computes sentiment scores for large numbers of texts, aggregates the scores into time series, and uses those series to predict other variables. The workflow is demonstrated by forecasting the CBOE Volatility Index.

Contribution

The paper presents a framework, implemented in R, for computing textual sentiment, aggregating it into time series, and using those series in predictive models.

Findings

01 Efficient computation of sentiment scores for large numbers of texts

02 Forecasting of the CBOE Volatility Index from aggregated sentiment time series, illustrated with a built-in corpus of U.S. news articles

03 An intuitive framework for textual sentiment analysis in R

Abstract

We provide a hands-on introduction to optimized textual sentiment indexation using the R package sentometrics. Textual sentiment analysis is increasingly used to unlock the potential information value of textual data. The sentometrics package implements an intuitive framework to efficiently compute sentiment scores of numerous texts, to aggregate the scores into multiple time series, and to use these time series to predict other variables. The workflow of the package is illustrated with a built-in corpus of news articles from two major U.S. journals to forecast the CBOE Volatility Index.
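
The three steps named in the abstract (score texts, aggregate scores into a time series, predict another variable) can be illustrated with a minimal sketch. It is written in plain Python and does not use or reproduce the sentometrics API. The toy lexicon, the documents, the target values, and the function names `score_text`, `aggregate_by_date`, and `fit_ols` are all hypothetical, and plain averaging with one-variable least squares stands in for whatever aggregation schemes and models the package actually offers.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicon (word -> polarity); not a lexicon shipped with sentometrics.
LEXICON = {"gain": 1.0, "strong": 1.0, "loss": -1.0, "weak": -1.0, "crisis": -1.0}


def score_text(text, lexicon=LEXICON):
    """Step 1: net lexicon score divided by the token count (0.0 for empty text)."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(lexicon.get(t, 0.0) for t in tokens) / len(tokens)


def aggregate_by_date(docs):
    """Step 2: average document scores per date, returned in date order."""
    buckets = defaultdict(list)
    for day, text in docs:
        buckets[day].append(score_text(text))
    return [(day, sum(v) / len(v)) for day, v in sorted(buckets.items())]


def fit_ols(x, y):
    """Step 3: one-regressor least squares, returning (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up dated documents standing in for a news corpus.
docs = [
    (date(2021, 1, 4), "Strong gain today"),
    (date(2021, 1, 4), "Weak loss reported"),
    (date(2021, 1, 5), "Crisis looms"),
    (date(2021, 1, 6), "Strong quarter"),
]
index = aggregate_by_date(docs)

# Made-up target values, one per date; not real CBOE Volatility Index data.
target = [20.0, 25.0, 15.0]
intercept, slope = fit_ols([score for _, score in index], target)
print(index, intercept, slope)
```

On this toy input the fitted slope is negative, which only reflects how the made-up target was chosen and says nothing about the paper's results.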
