Automatic Data Visualization Generation from Chinese Natural Language Questions
Yan Ge, Victor Junqiu Wei, Yuanfeng Song, Jason Chen Zhang, and Raymond Chi-Wing Wong

TL;DR
This paper introduces a new Chinese Text-to-Vis dataset and a model that uses multilingual BERT and n-gram information to generate data visualizations from Chinese natural language questions, addressing a gap in multilingual visualization research.
Contribution
The paper presents the first Chinese Text-to-Vis dataset and a novel model leveraging multilingual BERT and n-gram features for cross-lingual visualization generation.
Findings
Experimental results show that the dataset is challenging and warrants further research.
The model integrates multilingual BERT as the encoder and infuses n-gram information into its word representations.
Abstract
Data visualization has emerged as an effective tool for gaining insights from massive datasets. Because data visualization programming languages are difficult to use, automatic data visualization generation from natural language (Text-to-Vis) is becoming increasingly popular. Despite the plethora of research on English Text-to-Vis, no studies have yet addressed data visualization generation from questions in Chinese. Motivated by this, we propose a Chinese Text-to-Vis dataset in this paper and present our first attempt to tackle this problem. Our model integrates multilingual BERT as the encoder, boosts cross-lingual ability, and infuses n-gram information into its word representation learning. Our experimental results show that our dataset is challenging and deserves further research.
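The abstract does not say how the n-gram information is extracted or fused with the encoder. As an illustration only, the sketch below assumes character-level n-grams, a common choice for Chinese because the text has no whitespace word boundaries. The function names and the example question are hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all contiguous character n-grams of `text`, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_counts(text, max_n=3):
    """Count every character n-gram for n = 1..max_n (assumed feature set)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(char_ngrams(text, n))
    return counts


# Hypothetical question: "What is the average salary of each department?"
question = "各部门的平均工资是多少"
bigrams = char_ngrams(question, 2)
features = ngram_counts(question, max_n=3)
```

In a real model, each n-gram would typically be mapped to an embedding and combined with the encoder's token representations; that fusion step is not described in the abstract and is left out here.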
Taxonomy
Topics: Data Visualization and Analytics · Advanced Text Analysis Techniques · Computational and Text Analysis Methods
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Adam · Weight Decay · Softmax · Linear Warmup With Linear Decay
