KoBigBird-large: Transformation of Transformer for Korean Language Understanding
Kisu Yang, Yoonna Jang, Taewoo Lee, Jinwoo Seong, Hyungjin Lee, Hwanseok Jang, Heuiseok Lim

TL;DR
KoBigBird-large is a large Korean BigBird model for language understanding. It reaches state-of-the-art overall results on Korean language understanding benchmarks and handles long sequences by transforming the architecture and extending the positional encoding with TAPER, without further pretraining.
Contribution
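The paper introduces Tapered Absolute Positional Encoding Representations (TAPER), a method for extending positional encodings, and builds a large Korean BigBird by transforming the architecture alone, which enables long-sequence processing without further pretraining.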
Findings
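Achieves state-of-the-art overall performance on Korean language understanding benchmarks.
Achieves the best performance among competitive baseline models on document classification and question answering with longer sequences.
Processes longer sequences by extending the positional encoding with TAPER.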
Abstract
This work presents KoBigBird-large, a large-sized Korean BigBird that achieves state-of-the-art performance and allows long sequence processing for Korean language understanding. Without further pretraining, we only transform the architecture and extend the positional encoding with our proposed Tapered Absolute Positional Encoding Representations (TAPER). In experiments, KoBigBird-large shows state-of-the-art overall performance on Korean language understanding benchmarks and, against competitive baseline models, the best performance on document classification and question answering tasks for longer sequences. We publicly release our model.
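To make "extending the positional encoding" concrete, here is a minimal sketch of the general idea of stretching a learned absolute position table to a longer maximum length. It is not TAPER: this summary does not describe how TAPER tapers or combines embeddings, so the sketch swaps in plain tiling, the approach used to initialize Longformer's position embeddings from RoBERTa. The function name, the 512 and 4096 lengths, and the toy embedding width are all hypothetical.

```python
def tile_position_embeddings(pos_emb, new_len):
    """Extend a learned position table by repeating it cyclically.

    pos_emb: list of rows, one embedding vector per original position.
    new_len: desired number of positions (must be >= len(pos_emb)).
    NOTE: plain tiling, used here only as an illustration; it is NOT TAPER.
    """
    old_len = len(pos_emb)
    if old_len == 0:
        raise ValueError("pos_emb must contain at least one position")
    if new_len < old_len:
        raise ValueError("new_len must be at least the original length")
    # Position i in the long table reuses the embedding of position i mod old_len.
    return [list(pos_emb[i % old_len]) for i in range(new_len)]


# Hypothetical sizes: a 512-position table with 4-dimensional toy embeddings,
# extended to 4096 positions.
short_table = [[float(p), float(p) * 0.5, 0.0, 1.0] for p in range(512)]
long_table = tile_position_embeddings(short_table, 4096)
```

With tiling, every block of 512 positions starts from the same values, so a scheme such as TAPER presumably has to do something more to distinguish distant positions; that part is left out here because the source does not specify it.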
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: BigBird
