Using a Language Model to Generate Music in Its Symbolic Domain While Controlling Its Perceived Emotion

Abstract:

This work proposes a transformer-based model that generates music in the symbolic domain in a controllable fashion. The ultimate goal is to build a system with which people can compose music collaboratively with a computer. Using an NLP model (GPT-2) as a base, we exploit the similarities between symbolic music representations and written language to build a model capable of conditionally predicting musical sequences. Controllability is achieved without explicitly programming for it and without extensive retraining of the model. A study with 939 participants was conducted to evaluate this controllability, and its results suggest that the proposed method is effective for controlling the generation of music in the symbolic domain. The method is flexible with respect to the desired “control”; this work focuses specifically on the emotion conveyed when one listens to a piece of music.
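The abstract does not specify an implementation, but one common way to condition a language model without architectural changes is to prepend a control token to the input sequence. Below is a minimal sketch of this idea using Hugging Face's transformers library; the checkpoint path, vocabulary file, and event-token names (e.g., <EMOTION_HAPPY>) are hypothetical placeholders, not artifacts of this work.

```python
# A minimal sketch of control-token conditioning, assuming a GPT-2 model
# fine-tuned on symbolic music events. All paths and token names below
# are hypothetical.
import torch
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

model = GPT2LMHeadModel.from_pretrained("path/to/music-gpt2")           # hypothetical checkpoint
tokenizer = PreTrainedTokenizerFast(tokenizer_file="music_vocab.json")  # hypothetical vocabulary

# Condition generation by prepending an emotion control token to the prompt.
prompt = "<EMOTION_HAPPY> <BAR> <POSITION_1>"  # hypothetical event tokens
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=512,
        do_sample=True,   # sampling keeps continuations varied
        top_p=0.9,
    )

events = tokenizer.decode(output[0])  # decode back to symbolic music events
print(events)
```

Because the control token is just another vocabulary entry, this approach requires no changes to the model architecture, which is consistent with the abstract's claim that controllability is achieved without explicitly programming for it.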