House price estimation from visual and textual features
Eman Ahmed, Mohamed Moustafa

TL;DR
This paper introduces a neural network-based approach that combines visual and textual features from house data to improve price estimation accuracy, outperforming models that rely solely on textual information.
Contribution
The paper presents what the authors believe to be the first dataset combining house images and textual attributes (535 houses from California, USA) and demonstrates that adding visual features improves price prediction accuracy over textual features alone.
Findings
Adding visual features increased the R-value by a factor of 3.
Adding visual features decreased the Mean Square Error (MSE) by one order of magnitude.
Outperformed existing textual-only models on benchmark datasets.
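As a rough illustration of the two metrics named in these findings, the sketch below computes MSE and an R-value with NumPy. It assumes the R-value is the Pearson correlation coefficient between predicted and actual prices, which this summary does not define. The function names and the sample prices are hypothetical and are not taken from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between actual and predicted prices."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_value(y_true, y_pred):
    """Pearson correlation between actual and predicted prices.

    Assumption: the paper's "R-value" is treated here as Pearson's r.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Hypothetical prices (in thousands of dollars), for illustration only.
y_true = [300.0, 450.0, 500.0, 720.0]
y_pred = [310.0, 440.0, 520.0, 700.0]

print("MSE:", mse(y_true, y_pred))
print("R-value:", r_value(y_true, y_pred))
```

A lower MSE and an R-value closer to 1 both indicate predictions that track the actual prices more closely.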
Abstract
Most existing automatic house price estimation systems rely only on textual data such as the house's neighborhood area and the number of rooms. The final price is estimated by a human agent who visits the house and assesses it visually. In this paper, we propose extracting visual features from house photographs and combining them with the house's textual information. The combined features are fed to a fully connected multilayer Neural Network (NN) that estimates the house price as its single output. To train and evaluate our network, we have collected the first houses dataset (to our knowledge) that combines both images and textual attributes. The dataset is composed of 535 sample houses from the state of California, USA. Our experiments showed that adding the visual features increased the R-value by a factor of 3 and decreased the Mean Square Error (MSE) by one order of magnitude compared…
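The pipeline described in the abstract (concatenate visual and textual features, then pass them through a fully connected network with a single price output) can be sketched roughly as follows. This is a minimal, untrained NumPy sketch under stated assumptions: the feature dimensions (64 visual, 4 textual), the hidden layer size, the tanh activation, and all function names are hypothetical and are not specified in the text above. Random arrays stand in for real features.

```python
import numpy as np

def combine_features(visual, textual):
    """Concatenate per-house visual and textual features along the feature axis."""
    return np.concatenate([visual, textual], axis=1)

def init_mlp(layer_sizes, seed=0):
    """Create (weights, bias) pairs for a fully connected network."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def predict_price(params, x):
    """Forward pass: tanh hidden layers, then one linear output unit (the price)."""
    h = x
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b).ravel()

# Placeholder data: 535 houses matches the dataset size in the abstract;
# the feature dimensions below are hypothetical.
rng = np.random.default_rng(0)
visual = rng.normal(size=(535, 64))
textual = rng.normal(size=(535, 4))

features = combine_features(visual, textual)
params = init_mlp([features.shape[1], 16, 1])
prices = predict_price(params, features)  # untrained, so values are arbitrary
print(features.shape, prices.shape)
```

The sketch covers only the forward pass. Training the weights against known prices, and extracting visual features from photographs, are left out because the text above does not describe how either is done.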
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Music and Audio Processing
