A text autoencoder from transformer for fast encoding language representation
Tan Huang

TL;DR
This paper introduces a fast, resource-efficient transformer-based autoencoder for language representation that reduces computational complexity and improves performance on classification and semantic similarity tasks.
Contribution
It proposes a novel deep bidirectional language model with window masking, achieving O(n) complexity and superior performance compared to traditional BERT-like models.
Findings
Higher accuracy in SMS classification using CPU-based embeddings
Significantly better performance in semantic similarity tasks
Reduced computational complexity from O(n^2) to O(n)
Abstract
In recent years, BERT has shown clear advantages and great potential in natural language processing tasks. However, both training and applying BERT require intensive time and resources to compute contextual language representations, which hinders its universality and applicability. To overcome this bottleneck, we propose a deep bidirectional language model that uses a window masking mechanism at the attention layer. This work computes contextual language representations without the random masking used in BERT, while maintaining a BERT-like deep bidirectional architecture. To compute the same sentence representation, our method has O(n) complexity, compared with O(n^2) for other transformer-based models. To further demonstrate its superiority, we compute contextual language representations in CPU environments; using the embeddings from the proposed method, logistic regression…
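The abstract does not define the window precisely, so the following is only a minimal sketch of one possible reading: each token attends to neighbours within a fixed distance, which keeps the number of attended pairs linear in sequence length. The names `window_mask`, `windowed_attention`, `attended_pairs`, and the `window` and `mask_self` parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def window_mask(n, window, mask_self=False):
    """Boolean (n, n) mask: True where query i may attend to key j.

    Assumption: "window masking" is read here as a band of half-width
    `window` around the diagonal. `mask_self` optionally hides each token
    from itself; the abstract does not say which variant the paper uses.
    """
    if mask_self and window < 1:
        raise ValueError("window must be >= 1 when mask_self is True")
    idx = np.arange(n)
    allowed = np.abs(idx[:, None] - idx[None, :]) <= window
    if mask_self:
        allowed &= ~np.eye(n, dtype=bool)
    return allowed

def windowed_attention(q, k, v, window, mask_self=False):
    """Single-head scaled dot-product attention restricted to the window.

    Note: this dense version still builds the full n x n score matrix for
    clarity. Only a banded implementation that skips masked pairs would
    actually realise the linear cost.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    scores = np.where(window_mask(n, window, mask_self), scores, -np.inf)
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v

def attended_pairs(n, window):
    """Number of (query, key) pairs left unmasked by the window."""
    return int(window_mask(n, window).sum())

if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, attended_pairs(n, window=1), n * n)
```

Under this assumption, doubling the sequence length roughly doubles the number of unmasked pairs for a fixed `window`, whereas full self-attention quadruples it.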
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Layer Normalization · Residual Connection · Dense Connections · Attention Dropout · Softmax
