# Development of an Entropy-Based Feature Selection Method and Analysis of Online Reviews on Real Estate

**Authors:** Hiroki Horino, Hirofumi Nonaka, Elisa Claire Alemán Carreón, and Toru Hiraoka

arXiv: 1904.11797 · 2019-04-29

## TL;DR

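This paper introduces an entropy-based feature selection method for online real estate reviews. Keywords selected by their entropy values serve as features for a machine learning classifier, which is applied to 6 million posts on the Japanese BBS "Mansion Community" and reaches a 0.69 F-measure.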

## Contribution

The study presents a novel entropy-based keyword extraction technique and applies it to large-scale real estate review data for customer insight analysis.
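
As a rough illustration of what entropy-based keyword selection can look like, the sketch below computes the Shannon entropy of each word's occurrence distribution across documents and keeps words at or above a threshold. This summary does not give the paper's exact entropy definition or selection rule, so the formula, the threshold, the function names (`word_entropy`, `select_keywords`), and the toy English tokens are all assumptions made for illustration. Real posts would be Japanese text that needs tokenizing first.

```python
import math
from collections import Counter

def word_entropy(docs):
    """Shannon entropy (in bits) of each word's distribution across documents.

    Assumption: this is one common way to score words by entropy; it is not
    confirmed to be the definition used in the paper.
    """
    # counts[word][i] = number of times `word` occurs in document i
    counts = {}
    for i, tokens in enumerate(docs):
        for word, n in Counter(tokens).items():
            counts.setdefault(word, {})[i] = n

    entropy = {}
    for word, per_doc in counts.items():
        total = sum(per_doc.values())
        # sum of p * log2(1/p), with p = n / total
        entropy[word] = sum((n / total) * math.log2(total / n)
                            for n in per_doc.values())
    return entropy

def select_keywords(docs, threshold):
    """Keep words whose entropy is at least `threshold` (hypothetical rule)."""
    scores = word_entropy(docs)
    return sorted(w for w, h in scores.items() if h >= threshold)

# Toy, pre-tokenized posts (made up for illustration, not real data).
docs = [
    ["station", "near", "price", "high"],
    ["price", "fair", "gym", "nice"],
    ["price", "station", "far"],
    ["gym", "gym", "gym"],
]

print(select_keywords(docs, threshold=1.0))  # ['price', 'station']
```

Words that occur in only one document score zero, while words spread evenly over many documents score highest. The selected keywords could then be turned into count or presence features for a classifier.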

## Key findings

- Achieved 0.69 F-measure in classification
- Identified key customer concerns: apartment facilities, access, and price
- Applied the method to 6 million posts from the "Mansion Community" BBS
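
For readers unfamiliar with the metric, the F-measure is commonly computed as the harmonic mean of precision and recall. The sketch below assumes the balanced (F1) variant, which this summary does not explicitly confirm. The counts in the usage example are invented for illustration and are not results from the paper.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1) from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: precision = 0.8, recall = 8/12
print(round(f_measure(tp=8, fp=2, fn=4), 3))  # 0.727
```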

## Abstract

In recent years, the amount of data posted about real estate on the Internet has been increasing. In this study, to analyze user needs for real estate, we focus on "Mansion Community", a Japanese bulletin board system (hereinafter referred to as BBS) about Japanese real estate. We extract keywords based on the entropy value calculated for each word and use them as features in a machine learning classifier to analyze 6 million posts on "Mansion Community". As a result, we achieved an F-measure of 0.69 and found that customers are particularly concerned about apartment facilities, access, and price.

---
Source: https://tomesphere.com/paper/1904.11797