Data Sharing and Resampled LASSO: A word based sentiment Analysis for IMDb data
Ashutosh K. Maurya

TL;DR
This paper introduces a novel resampled LASSO approach that combines resampling techniques with LASSO for improved variable selection and prediction accuracy in sentiment analysis of IMDb data.
Contribution
It develops a new methodology blending resampling and LASSO, incorporating weighting schemes and data sharing techniques for better variable reduction and prediction.
Findings
Enhanced variable selection with significant reduction in variables.
Improved prediction accuracy measured by mean squared error.
Effective application to IMDb sentiment analysis dataset.
Abstract
In this article we study the variable selection problem using LASSO with new modifications. LASSO uses a penalty that shrinks most of the coefficients to zero when the number of explanatory variables is much larger than the number of observations. The novelty of the approach developed in this article is that it blends the basic ideas behind resampling and LASSO, which provides significant variable reduction and improved prediction accuracy in terms of mean squared error on the test sample. Different weighting schemes are explored using Bootstrapped LASSO, the basic methodology developed here. The weighting schemes determine the extent of data blending in the case of grouped data. The data sharing (DSL) technique developed by [11] lies at the root of the present methodology. We apply the technique to the IMDb dataset discussed in [11] and compare our results with those of [11].
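The abstract does not spell out the resampling scheme, so what follows is only a minimal sketch of one common way to combine the bootstrap with LASSO: refit LASSO on bootstrap resamples of the rows and keep the variables that are selected in most resamples. It is not the paper's method. The weighting schemes and the data sharing step of [11] are not reproduced. The function names (lasso_cd, bootstrap_lasso_frequencies), the penalty value lam, the 0.8 selection threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # mean squared value of each column
    resid = np.asarray(y, dtype=float).copy()  # equals y - X @ beta since beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:           # skip columns that are entirely zero
                continue
            # covariance of column j with the residual that excludes variable j
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep resid = y - X @ beta
            beta[j] = new_bj
    return beta


def bootstrap_lasso_frequencies(X, y, lam, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected.

    Hypothetical helper: illustrates resampling + LASSO, not the paper's scheme.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # draw n rows with replacement
        beta = lasso_cd(X[idx], y[idx], lam)
        counts += (beta != 0.0)            # count variables with nonzero coefficient
    return counts / n_boot


# Synthetic data (assumption): 20 variables, only the first 3 affect y.
rng = np.random.default_rng(42)
n, p = 60, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.5 * rng.standard_normal(n)

freq = bootstrap_lasso_frequencies(X, y, lam=0.5)
selected = np.flatnonzero(freq >= 0.8)     # keep variables chosen in >= 80% of fits
print("selection frequencies:", np.round(freq, 2))
print("selected variables:", selected)
```

Selecting by frequency across resamples is one way resampling can reduce the variable set beyond a single LASSO fit. A weighting scheme, as mentioned in the abstract, could replace the hard 0.8 cutoff, but its exact form would have to come from the paper itself.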
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Stock Market Forecasting Methods · Rough Sets and Fuzzy Logic
