# HULAT at SemEval-2023 Task 9: Data augmentation for pre-trained transformers applied to Multilingual Tweet Intimacy Analysis

**Authors:** Isabel Segura-Bedmar

arXiv: 2302.12794 · 2023-02-27

## TL;DR

This paper explores data augmentation for fine-tuning multilingual transformer models on Tweet intimacy analysis. The gains from augmentation are very slight, but the system shows promising results in Portuguese, English, and Dutch.

## Contribution

The paper fine-tunes popular multilingual transformers for Tweet intimacy analysis on the task's training data plus synthetic data generated by several data augmentation techniques, and reports both the impact and the limitations of that augmentation.
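To make the idea of training on original plus synthetic examples concrete, here is a minimal sketch. This summary does not name the augmentation techniques the paper used, so random word swapping is an assumption chosen purely for illustration. The function names `random_swap` and `augment`, the example tweets, and their scores are all hypothetical.

```python
import random


def random_swap(text, n_swaps=1, rng=None):
    """Return a copy of `text` with `n_swaps` random pairs of words swapped.

    Illustrative technique only; not necessarily one used in the paper.
    """
    if rng is None:
        rng = random.Random()
    tokens = text.split()
    if len(tokens) < 2:
        return text  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return " ".join(tokens)


def augment(dataset, copies=1, seed=0):
    """Append `copies` synthetic variants of each (text, score) pair.

    Each synthetic example keeps the score of the tweet it was derived from.
    """
    rng = random.Random(seed)
    augmented = list(dataset)
    for text, score in dataset:
        for _ in range(copies):
            augmented.append((random_swap(text, rng=rng), score))
    return augmented


# Hypothetical training examples: (tweet text, intimacy score).
train = [
    ("i miss you so much", 3.8),
    ("the bus is late again", 1.2),
]
augmented_train = augment(train, copies=2)
```

The augmented list would then replace the original training set when fine-tuning a model. Because a synthetic example reuses its source label, any technique that changes the meaning of a tweet can add label noise, which is one plausible reason augmentation gains may stay small.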

## Key findings

- XLM-T achieved the best results among the tested models during the development phase.
- Data augmentation provided only a very slight performance improvement.
- The system ranked 27th out of the 45 participating systems.

## Abstract

This paper describes our participation in SemEval-2023 Task 9, Intimacy Analysis of Multilingual Tweets. We fine-tune some of the most popular transformer models with the training dataset and synthetic data generated by different data augmentation techniques. During the development phase, our best results were obtained by using XLM-T. Data augmentation techniques provide a very slight improvement in the results. Our system ranked in the 27th position out of the 45 participating systems. Despite its modest results, our system shows promising results in languages such as Portuguese, English, and Dutch. All our code is available in the repository https://github.com/isegura/hulat_intimacy.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12794/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12794/full.md

## References

9 references — full list in the complete paper: https://tomesphere.com/paper/2302.12794/full.md

---
Source: https://tomesphere.com/paper/2302.12794