HuAMR: A Hungarian AMR Parser and Dataset
Botond Barta, Endre Hamerlik, Milán Konor Nyist, Judit Ács

TL;DR
HuAMR introduces the first Hungarian AMR dataset and LLM-based parsers, demonstrating how different models and training strategies affect semantic parsing accuracy in the Hungarian news domain.
Contribution
This work provides the first Hungarian AMR dataset and explores the effectiveness of LLM-based parsers and training strategies for semantic parsing in Hungarian.
Findings
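Adding silver-standard AMRs generated by Llama-3.1-70B to the training data improves parsing accuracy on Hungarian news data, although it does not consistently raise overall scores.
Both model architecture (mT5 Large vs. Llama-3.2-1B) and fine-tuning strategy affect AMR parsing performance.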
HuAMR advances semantic parsing research for Hungarian.
Abstract
We present HuAMR, the first Abstract Meaning Representation (AMR) dataset and a suite of large language model-based AMR parsers for Hungarian, targeting the scarcity of semantic resources for non-English languages. To create HuAMR, we employed Llama-3.1-70B to automatically generate silver-standard AMR annotations, which we then refined manually to ensure quality. Building on this dataset, we investigate how different model architectures - mT5 Large and Llama-3.2-1B - and fine-tuning strategies affect AMR parsing performance. While incorporating silver-standard AMRs from Llama-3.1-70B into the training data of smaller models does not consistently boost overall scores, our results show that these techniques effectively enhance parsing accuracy on Hungarian news data (the domain of HuAMR). We evaluate our parsers using Smatch scores and confirm the potential of HuAMR and our parsers for…
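The abstract reports results as Smatch scores, which compare a predicted AMR graph with a reference graph by their overlapping triples. The following is a minimal sketch of that precision/recall/F1 computation, not the paper's evaluation code. It assumes the variables of the two graphs are already aligned, whereas real Smatch also searches for the best variable mapping. The function name `triple_f1` and the English example graph are illustrative assumptions.

```python
def triple_f1(predicted, gold):
    """Return (precision, recall, F1) over two collections of AMR triples.

    Simplifying assumption: variable names in both graphs are already
    aligned, so triples can be compared directly as tuples.
    """
    pred_set, gold_set = set(predicted), set(gold)
    matched = len(pred_set & gold_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Illustrative reference graph for "The boy wants to go":
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))
gold = [
    ("instance", "w", "want-01"),
    ("instance", "b", "boy"),
    ("instance", "g", "go-02"),
    ("ARG0", "w", "b"),
    ("ARG1", "w", "g"),
    ("ARG0", "g", "b"),
]

# Hypothetical parser output that misses the re-entrant :ARG0 edge of "go".
predicted = gold[:-1]

precision, recall, f1 = triple_f1(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Here the parser output contains five of the six reference triples and nothing spurious, so precision is perfect while recall and F1 drop because of the missing edge.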
