An Ensemble method for Content Selection for Data-to-text Systems

Dimitra Gkatzia; Helen Hastie

arXiv:1506.02922·cs.CL·June 10, 2015

TL;DR

This paper frames content selection for automatic report generation from time-series data as a multi-label classification problem solved with ensembles of classifiers. Applied to student feedback generation, the approach reports higher accuracy and F-score than its baselines.

Contribution

The paper presents an ensemble method for content selection in data-to-text systems that treats the task as a multi-label classification problem over time-series input, an approach the authors describe as novel.

Findings

01 Higher accuracy and F-score than the baselines

02 All input data considered simultaneously through ensembles of classifiers

03 Improved quality of generated student feedback
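Accuracy and F-score each have several multi-label variants, and this page does not say which ones the paper reports. As a rough illustration only, the sketch below assumes example-based (Jaccard) accuracy and micro-averaged F-score computed over sets of selected content labels. The label names and the function names `jaccard_accuracy` and `micro_f1` are hypothetical.

```python
def jaccard_accuracy(y_true, y_pred):
    """Mean overlap |T & P| / |T | P| per example (assumed accuracy variant)."""
    scores = []
    for truth, pred in zip(y_true, y_pred):
        union = truth | pred
        # Two empty label sets count as a perfect match.
        scores.append(1.0 if not union else len(truth & pred) / len(union))
    return sum(scores) / len(scores)


def micro_f1(y_true, y_pred):
    """F-score from true/false positives pooled over all examples and labels."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        tp += len(truth & pred)   # labels correctly selected
        fp += len(pred - truth)   # labels selected but not in the reference
        fn += len(truth - pred)   # reference labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and predicted content selections for three students.
reference = [{"marks", "hours_studied"}, {"marks"}, set()]
predicted = [{"marks"}, {"marks", "deadlines"}, set()]

print(jaccard_accuracy(reference, predicted))
print(micro_f1(reference, predicted))
```

Both scores reward partial overlap, which matters in content selection because a summary that mentions most of the right factors is better than one that mentions none.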

Abstract

We present a novel approach for automatic report generation from time-series data, in the context of student feedback generation. Our proposed methodology treats content selection as a multi-label classification (MLC) problem, which takes as input time-series data (students' learning data) and outputs a summary of these data (feedback). Unlike previous work, this method considers all data simultaneously using ensembles of classifiers, and therefore, it achieves higher accuracy and F-score compared to meaningful baselines.
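To make the framing concrete, here is a minimal sketch of content selection as multi-label classification with an ensemble. It is not the authors' method: the abstract does not name the ensemble algorithm, so the sketch swaps in a simple stand-in, namely one bagged ensemble of 1-nearest-neighbour classifiers per label, combined by majority vote. The feature layout (marks and study hours over two weeks), the label names, and all function names are assumptions made for illustration.

```python
import random

# Hypothetical content labels: which factors the feedback should mention.
LABELS = ["mention_marks", "mention_hours"]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_member(X, y):
    """One ensemble member: a 1-nearest-neighbour classifier over its sample."""
    sample = list(zip(X, y))

    def predict(x):
        _, label = min(sample, key=lambda pair: sq_dist(pair[0], x))
        return label

    return predict


def fit_ensemble(X, Y, n_members=15, seed=0):
    """Train n_members bootstrap-sampled classifiers for every label."""
    rng = random.Random(seed)
    ensemble = []
    for j in range(len(Y[0])):
        members = []
        for _ in range(n_members):
            # Bootstrap: draw len(X) training rows with replacement.
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            members.append(fit_member([X[i] for i in idx],
                                      [Y[i][j] for i in idx]))
        ensemble.append(members)
    return ensemble


def select_content(ensemble, x):
    """Majority vote per label; returns a 0/1 decision for each label."""
    return [1 if 2 * sum(m(x) for m in members) > len(members) else 0
            for members in ensemble]


# Toy time series flattened to [marks_w1, marks_w2, hours_w1, hours_w2].
patterns = [
    ([0.8, 0.3, 0.2, 0.2], [1, 1]),  # marks dropped, few study hours
    ([0.8, 0.3, 0.9, 0.9], [1, 0]),  # marks dropped, many study hours
    ([0.8, 0.8, 0.2, 0.2], [0, 1]),  # marks steady, few study hours
    ([0.8, 0.8, 0.9, 0.9], [0, 0]),  # marks steady, many study hours
]
X = [features for features, labels in patterns for _ in range(3)]
Y = [labels for features, labels in patterns for _ in range(3)]

ensemble = fit_ensemble(X, Y)
new_student = [0.75, 0.35, 0.25, 0.2]
print(dict(zip(LABELS, select_content(ensemble, new_student))))
```

Every member sees the full feature vector, so each label decision draws on all of the input series at once, and a single input can receive several labels. Training the labels independently is a simplification; methods that model dependencies between labels would replace the per-label loop.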

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Advanced Text Analysis Techniques