Harmonic Reasoning in Large Language Models
Anna Kruspe

TL;DR
This paper evaluates large language models' abilities in musical reasoning tasks, revealing strengths in interval recognition but limitations in chord and scale identification, and introduces a new benchmark dataset.
Contribution
It provides an analysis of LLMs' musical reasoning capabilities and introduces an automatically generated benchmark dataset for these tasks.
Findings
LLMs excel at note interval recognition
LLMs struggle with chord and scale identification
The paper highlights current limitations in LLM reasoning abilities
Abstract
Large Language Models (LLMs) are becoming very popular and are used for many different purposes, including creative tasks in the arts. However, these models sometimes have trouble with specific reasoning tasks, especially those that involve logical thinking and counting. This paper looks at how well LLMs understand and reason when dealing with musical tasks like figuring out notes from intervals and identifying chords and scales. We tested GPT-3.5 and GPT-4o to see how they handle these tasks. Our results show that while LLMs do well with note intervals, they struggle with more complicated tasks like recognizing chords and scales. This points out clear limits in current LLM abilities and shows where we need to make them better, which could help improve how they think and work in both artistic and other complex areas. We also provide an automatically generated benchmark data set for the…
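The interval task described in the abstract comes down to counting semitones modulo 12, which is also what makes automatic generation of question/answer pairs feasible. The following is a minimal sketch of that idea, not the paper's actual generator: the names `NOTES`, `INTERVALS`, `note_above`, `major_triad`, and `make_item` are hypothetical, the question wording is invented, and notes are spelled with sharps only for simplicity.

```python
import random

# Pitch classes spelled with sharps only (an illustrative simplification).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Interval sizes in semitones (a small subset chosen for this sketch).
INTERVALS = {
    "minor third": 3,
    "major third": 4,
    "perfect fourth": 5,
    "perfect fifth": 7,
}


def note_above(root, interval):
    """Return the note lying `interval` above `root`, counting semitones mod 12."""
    return NOTES[(NOTES.index(root) + INTERVALS[interval]) % 12]


def major_triad(root):
    """Return the root, major third, and perfect fifth of a major chord."""
    return [root, note_above(root, "major third"), note_above(root, "perfect fifth")]


def make_item(rng):
    """Build one question/answer pair; the wording is invented for illustration."""
    root = rng.choice(NOTES)
    interval = rng.choice(sorted(INTERVALS))
    question = f"Which note is a {interval} above {root}?"
    return question, note_above(root, interval)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so generated items are reproducible
    print(make_item(rng))
    print(major_triad("C"))
```

Because the ground truth is computed rather than hand-labelled, a model's answer can be checked by simple string comparison. Chord and scale items could be built the same way by stacking several intervals on one root, as `major_triad` does.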
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Sparse Evolutionary Training · Residual Connection · Attention Dropout · Linear Layer · Multi-Head Attention · Dense Connections · Cosine Annealing
