Comateformer: Combined Attention Transformer for Semantic Sentence Matching
Bo Li, Di Liang, Zixin Zhang

TL;DR
Comateformer introduces a novel transformer-based quasi-attention mechanism that enhances semantic sentence matching by capturing subtle differences and dissimilarities, leading to improved performance on multiple datasets.
Contribution
The paper proposes a new quasi-attention mechanism within transformers that models both similarity and dissimilarity for better semantic matching.
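As a rough illustration of what modeling both similarity and dissimilarity could mean, the sketch below contrasts standard softmax attention with a signed quasi-attention weight. The tanh-times-sigmoid form, the L1 distance gate, and every function name are assumptions made for this example; this summary does not give the paper's actual formulation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, values):
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def softmax_attention(query, keys, values):
    """Standard attention: weights are non-negative and sum to 1,
    so value vectors can only be added together, never subtracted."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return weights, weighted_sum(weights, values)


def quasi_attention(query, keys, values):
    """Hypothetical signed attention (an assumption, not the paper's formula).

    tanh of the scaled dot product gives a sign and magnitude in (-1, 1);
    a sigmoid gate on the negative L1 distance shrinks the weight of keys
    that lie far from the query. The product can add, subtract, or
    down-scale a value vector.
    """
    scale = math.sqrt(len(query))
    weights = []
    for k in keys:
        similarity = math.tanh(dot(query, k) / scale)
        gate = 2.0 / (1.0 + math.exp(l1_distance(query, k)))  # 2 * sigmoid(-d), in (0, 1]
        weights.append(similarity * gate)
    return weights, weighted_sum(weights, values)


# Toy input: one query against an aligned, an opposed, and an orthogonal key.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

soft_w, soft_out = softmax_attention(query, keys, values)
quasi_w, quasi_out = quasi_attention(query, keys, values)
print("softmax weights:", soft_w)
print("quasi weights:  ", quasi_w)
```

In this toy setup the opposed key receives a negative quasi-attention weight, so its value vector is subtracted from the output, whereas softmax attention can only assign it a small positive weight.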
Findings
Achieves consistent improvements across ten datasets.
Effectively captures subtle differences between sentences.
Demonstrates robustness in various testing scenarios.
Abstract
Transformer-based models have made significant strides in semantic matching tasks by capturing connections between phrase pairs. However, to assess the relevance of sentence pairs, it is insufficient to examine only the general similarity between the sentences. It is crucial to also consider the subtle differences that distinguish them from each other. Regrettably, attention softmax operations in transformers tend to miss these subtle differences. To this end, in this work, we propose a novel semantic sentence matching model named Combined Attention Network based on Transformer model (Comateformer). In the Comateformer model, we design a novel transformer-based quasi-attention mechanism with compositional properties. Unlike traditional attention mechanisms that merely adjust the weights of input tokens, our proposed method learns how to combine, subtract, or resize specific vectors when…
Taxonomy
Methods: Attention Is All You Need · Adam · Dropout · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Label Smoothing · Layer Normalization
