CareLab at #SMM4H-HeaRD 2025: Insomnia Detection and Food Safety Event Extraction with Domain-Aware Transformers

Zihan Liang; Ziwen Pan; Sumon Kanti Dey; Azra Ismail

arXiv:2506.18185·cs.CL·June 24, 2025

TL;DR

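This paper describes CareLab's system for the SMM4H-HeaRD 2025 shared tasks: detecting insomnia mentions in clinical notes (Task 4) and extracting food safety events from news articles (Task 5). The system combines encoder-based transformers such as RoBERTa with GPT-4 data augmentation and placed first on Task 5 Subtask 1 with a test-set F1 score of 0.958.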

Contribution

We developed a domain-aware, transformer-based approach with GPT-4 data augmentation for insomnia detection in clinical notes and food safety event extraction from news articles.

Findings

01

Achieved first place in Subtask 1 of the Food Safety Event Extraction task (Task 5) with a test-set F1 score of 0.958

02

Used GPT-4 for data augmentation alongside encoder-based models (e.g., RoBERTa)

03

Participated in every subtask of Task 4 (Subtasks 1, 2a, and 2b) and Task 5 (Subtasks 1 and 2)

Abstract

This paper presents our system for the SMM4H-HeaRD 2025 shared tasks, specifically Task 4 (Subtasks 1, 2a, and 2b) and Task 5 (Subtasks 1 and 2). Task 4 focused on detecting mentions of insomnia in clinical notes, while Task 5 addressed the extraction of food safety events from news articles. We participated in all subtasks and report key findings across them, with particular emphasis on Task 5 Subtask 1, where our system achieved strong performance, securing first place with an F1 score of 0.958 on the test set. To attain this result, we employed encoder-based models (e.g., RoBERTa) alongside GPT-4 for data augmentation. This paper outlines our approach, including preprocessing, model architecture, and subtask-specific adaptations.
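For readers unfamiliar with the headline metric, the following is a minimal sketch of how an F1 score is computed for a single positive class. The page does not state which averaging scheme the shared task used, so the binary setting, the function name `binary_f1`, and the toy labels below are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def binary_f1(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> float:
    """F1 for one positive class: the harmonic mean of precision and recall.

    Hypothetical helper for illustration; not the shared task's official scorer.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)

    # With no true positives, F1 is conventionally reported as 0.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for this example (not data from the paper).
gold = [1, 0, 1, 1, 0, 1]
pred = [1, 0, 0, 1, 1, 1]
score = binary_f1(gold, pred)
print(f"F1 = {score:.3f}")
```

Here precision and recall are both 3/4, so the sketch reports an F1 of 0.750. A multi-class or multi-label task would additionally need a micro or macro averaging choice, which this example does not cover.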
