Boosting the Performance of Transformer Architectures for Semantic Textual Similarity

Ivan Rep; Vladimir Čeperić

arXiv:2306.00708·cs.CL·June 2, 2023·1 cites

Open Access

TL;DR

This paper explores fine-tuning transformer models like BERT, RoBERTa, and DeBERTaV3 for semantic textual similarity, combining their outputs with handcrafted features and analyzing model performance and errors.

Contribution

It introduces a hybrid approach of transformer fine-tuning and feature boosting, along with detailed error analysis on semantic similarity tasks.

Findings

01. Transformer models improved validation scores.

02. Combining transformer outputs with handcrafted features enhanced performance.

03. Error analysis revealed challenges at the edges of the prediction range.

Abstract

Semantic textual similarity is the task of estimating the similarity between the meaning of two texts. In this paper, we fine-tune transformer architectures for semantic textual similarity on the Semantic Textual Similarity Benchmark by tuning the model partially and then end-to-end. We experiment with BERT, RoBERTa, and DeBERTaV3 cross-encoders by approaching the problem as a binary classification task or a regression task. We combine the outputs of the transformer models and use handmade features as inputs for boosting algorithms. Due to worse test set results coupled with improvements on the validation set, we experiment with different dataset splits to further investigate this occurrence. We also provide an error analysis, focused on the edges of the prediction range.
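As a rough illustration of the stacking step described in the abstract, the sketch below assembles one input row for a boosting model by concatenating transformer scores with handcrafted features. The abstract does not list the paper's actual features, so the two used here (token Jaccard overlap and token length ratio), the helper names, and the example scores are all hypothetical assumptions, not the authors' method.

```python
import re


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_overlap(sentence_a, sentence_b):
    """Share of unique tokens the two sentences have in common (assumed feature)."""
    tokens_a, tokens_b = set(tokenize(sentence_a)), set(tokenize(sentence_b))
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def length_ratio(sentence_a, sentence_b):
    """Shorter token count divided by longer token count (assumed feature)."""
    len_a, len_b = len(tokenize(sentence_a)), len(tokenize(sentence_b))
    if max(len_a, len_b) == 0:
        return 1.0
    return min(len_a, len_b) / max(len_a, len_b)


def build_booster_row(sentence_a, sentence_b, model_scores):
    """Concatenate transformer scores with the handcrafted features."""
    handcrafted = [
        jaccard_overlap(sentence_a, sentence_b),
        length_ratio(sentence_a, sentence_b),
    ]
    return list(model_scores) + handcrafted


# Placeholder scores standing in for the predictions of three fine-tuned
# cross-encoders on the 0-5 similarity scale; these are not real results.
row = build_booster_row(
    "A man is playing a guitar.",
    "A man plays the guitar.",
    [4.1, 4.3, 4.4],
)
print(row)
```

Rows built this way, one per sentence pair, could then be passed with the gold similarity scores to any gradient boosting regressor; which boosting algorithms and features the authors used is described in the paper itself.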

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Test · Attention Dropout · Linear Warmup With Linear Decay · Residual Connection · Linear Layer · Layer Normalization · RoBERTa · Softmax · Multi-Head Attention