TL;DR
This paper develops an Indonesian automatic question generator using sequence-to-sequence models with linguistic features, achieving promising results despite limited datasets, and adapts existing architectures for Indonesian language processing.
Contribution
It introduces the first Indonesian AQG system based on sequence-to-sequence models with linguistic features, copy, and coverage mechanisms, and evaluates it on translated datasets.
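The copy and coverage mechanisms named above can be illustrated with a minimal sketch. It assumes the widely used pointer-generator formulation (a generation probability `p_gen` mixing the vocabulary distribution with attention over source tokens, plus a coverage penalty on repeated attention); the paper's exact variant may differ. The function names, the toy inputs, and the simplification that every source token already has a vocabulary id are all assumptions made for illustration.

```python
def copy_distribution(p_vocab, attention, source_ids, p_gen):
    """Mix generator and copy distributions for one decoding step.

    p_vocab:    probabilities over the vocabulary from the decoder
    attention:  attention weights over source positions (sums to 1)
    source_ids: vocabulary id of the token at each source position
    p_gen:      probability of generating rather than copying
    """
    # Generation part: scale the vocabulary distribution by p_gen.
    final = [p_gen * p for p in p_vocab]
    # Copy part: route attention mass to the ids of the source tokens.
    for attn, token_id in zip(attention, source_ids):
        final[token_id] += (1.0 - p_gen) * attn
    return final


def coverage_loss(attention_steps):
    """Penalise attending again to source positions already covered.

    The coverage vector is the running sum of past attention weights;
    the loss at each step is sum_i min(attention_i, coverage_i).
    """
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for attn in attention_steps:
        loss += sum(min(a, c) for a, c in zip(attn, coverage))
        coverage = [c + a for c, a in zip(coverage, attn)]
    return loss


# Toy example (hypothetical numbers): both source positions hold token id 2.
mixed = copy_distribution([0.5, 0.3, 0.2], [0.6, 0.4], [2, 2], p_gen=0.5)
repeat_penalty = coverage_loss([[1.0, 0.0], [1.0, 0.0]])
```

In this toy case the copied token ends up with most of the probability mass, and attending to the same position twice incurs a nonzero coverage penalty, whereas attending to a different position at each step would not.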
Findings
Achieved BLEU scores up to 39.9 and ROUGE-L of 44.13 on Indonesian datasets.
System performs well with named entities and syntactically close answers.
Generated questions are acceptable and useful from a native Indonesian speaker's perspective.
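For readers unfamiliar with the ROUGE-L metric reported above, the following is a rough sketch of how a sentence-level ROUGE-L score can be computed from the longest common subsequence (LCS) of tokens. It assumes whitespace tokenization, lowercasing, and a balanced F1 combination of precision and recall; the paper's evaluation script may use different tokenization or weighting. The function names and the example strings are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Sentence-level ROUGE-L as the F1 of LCS precision and recall."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated question vs. reference question.
score = rouge_l_f1(
    "siapa presiden pertama indonesia",
    "siapa presiden pertama republik indonesia",
)
```

Scores such as 44.13 are conventionally this value multiplied by 100 and averaged over a test set.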
Abstract
Automatic question generation is defined as the task of automating the creation of questions from various kinds of textual data. Research on automatic question generators (AQG) has been conducted for more than 10 years, mainly focused on factoid questions. In all these studies, the state of the art is attained using a sequence-to-sequence approach. However, AQG systems for Indonesian have not been researched intensively. In this work we construct an Indonesian automatic question generator, adapting the architecture from some previous works. In summary, we used a sequence-to-sequence approach using BiGRU, BiLSTM, and Transformer with additional linguistic features, copy mechanism, and coverage mechanism. Since there is no large and popular public Indonesian dataset for question generation, we translated the SQuAD v2.0 factoid question answering dataset, with the additional Indonesian TyDiQA dev set for…
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional GRU · Bidirectional LSTM · Dense Connections · Dropout
