Rossetto, F., Dalton, J. and Murray-Smith, R. (2023) Generating Multimodal Augmentations with LLMs from Song Metadata for Music Information Retrieval. In: 1st Workshop on Large Generative Models Meet Multimodal Application (LGM3A), Ottawa, Canada, 2 November 2023, pp. 51-59. ISBN 9798400702839 (doi: 10.1145/3607827.3616842)
Full text not currently available from Enlighten.
Abstract
In this work we propose a set of new automatic text augmentations that leverage Large Language Models applied to song metadata to improve music information retrieval tasks. Compared to recent works, our proposed methods use large language models and copyright-free corpora from web sources, enabling us to release the knowledge sources collected. We show that combining these representations with the audio signal yields a 21% relative improvement on five of six datasets across genre classification, emotion recognition and music tagging, achieving state-of-the-art results on three (GTZAN, FMA-Small and Deezer). We demonstrate the benefit of injecting external knowledge sources by comparing them with intrinsic text representation methods that rely only on the sample's information.
| Item Type: | Conference Proceedings |
|---|---|
| Status: | Published |
| Refereed: | Yes |
| Glasgow Author(s) Enlighten ID: | Murray-Smith, Professor Roderick and Dalton, Dr Jeff and Rossetto, Federico |
| Authors: | Rossetto, F., Dalton, J., and Murray-Smith, R. |
| College/School: | College of Science and Engineering > School of Computing Science |
| ISBN: | 9798400702839 |
| Related URLs: | |