Published September 24, 2024 | Version v1

Supplementary Material for "On Different Symbolic Music Representations for Algorithmic Composition Approaches based on Neural Sequence Models"

Creators

  • TU Wien

Description

Overview

This repository contains additional resources for our paper

Felix Schön, Hans Tompits: On Different Symbolic Music Representations for Algorithmic Composition Approaches based on Neural Sequence Models. AIxIA 2024.

Among the different approaches for automated music composition, those based on neural sequence models like the transformer show particular promise. A critical aspect for such approaches is how given music data sets are represented, or tokenised, for serving as suitable inputs for such models, as the choice of representation influences the quality of the produced output. In this paper, we introduce seven novel tokenisation techniques for converting MIDI data into numeric sequences. We compare characteristics of our tokenisers based on sets of musical data translated using our approaches. Our results show that some of our techniques greatly outperform the approaches found in the literature with respect to different metrics such as sequence length, information density, or memory requirements. Moreover, to evaluate the influence of our tokenisation approaches on the quality of the output of a model, we trained an ensemble of transformer models on the sets of tokenised musical data and performed a user study to assess the quality of the generated music pieces. The result of the study shows that the quality of pieces produced using our most promising techniques is equal to or outperforms state-of-the-art approaches.

In this paper, we introduce and compare seven tokenisation techniques for symbolic music generation, that is, the composition of music via a formal process in which music is represented using "tokens" rather than audio waveforms. The exact way music is represented using these tokens (referred to as the "tokenisation" process) can have a significant impact on a model's ability to learn and on the quality of its output. To evaluate our tokenisation approaches, we trained an ensemble of seven Transformer models on musical data represented using our techniques, and then conducted a user study on the output of the models to compare the different approaches.
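
To illustrate the general idea (this is not one of the schemes from the paper or S-Coda's actual API, only a minimal, hypothetical event-based tokeniser), MIDI-like note events can be mapped to a flat numeric sequence as follows:

    # Hypothetical event-based tokenisation for illustration only; the
    # tokenisers introduced in the paper are implemented in ./scoda.zip.
    # Vocabulary: NOTE_ON 0..127, NOTE_OFF 128..255, TIME_SHIFT 256..355.
    NOTE_ON_BASE = 0
    NOTE_OFF_BASE = 128
    TIME_SHIFT_BASE = 256

    def tokenise(events):
        """Map (kind, value) note events to a numeric token sequence."""
        tokens = []
        for kind, value in events:
            if kind == "note_on":
                tokens.append(NOTE_ON_BASE + value)        # value: MIDI pitch
            elif kind == "note_off":
                tokens.append(NOTE_OFF_BASE + value)       # value: MIDI pitch
            elif kind == "time_shift":
                tokens.append(TIME_SHIFT_BASE + value - 1) # value: 1..100 ticks
        return tokens

    # Example: middle C (pitch 60) sounds for 50 ticks.
    print(tokenise([("note_on", 60), ("time_shift", 50), ("note_off", 60)]))
    # -> [60, 305, 188]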

This repository contains both the code for the tokenisation approaches and the code for the models and their training loop. It furthermore contains the weights of the seven models trained on our tokenisation techniques, as well as of two comparison models trained on approaches commonly used in the field. Lastly, we include the raw responses from the user study we conducted and the samples used for it, in both MIDI and MP3 format.

Structure

This repository is organised as follows:

  • ./code.zip contains the code used to train the Transformer models. Note that this directory is a snapshot of the code at the moment of submission of the paper and thus also contains some work-in-progress components aimed at tasks not related to this publication. It is provided for reproducibility and educational purposes.
  • ./samples.zip contains 64 pieces generated by each of the models, in MP3 and MIDI format, as well as the token representations of the pieces.
  • ./survey.zip contains the raw answers of all survey participants.
  • ./weights.zip contains the compressed checkpoints of the trained models, as well as logs of the training process.
  • ./scoda.zip contains the code for the different tokenisation approaches in the form of our MIDI processing library S-Coda v2.1. Note that S-Coda is still in active development and may change in the future. Up-to-date versions can be found at its git repository.
  • ./requirements.txt contains the dependencies and their versions compatible with this project.

Dependencies

We used Python v3.11, PyTorch v2.5, and S-Coda v2.1 for this project. For more information, see the pyproject.toml files of both S-Coda and the training code, which specify additional (minor) dependencies.
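
As a quick sanity check of an installed environment, the following minimal sketch (an assumption for convenience, not part of the repository) verifies the pinned interpreter and PyTorch versions:

    import sys
    import torch

    # Verify the versions pinned for this project (Python v3.11, PyTorch v2.5).
    assert sys.version_info[:2] == (3, 11), "expected Python v3.11"
    assert torch.__version__.startswith("2.5"), "expected PyTorch v2.5"
    print("Environment matches the pinned versions.")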

License

The code of both S-Coda and the training scripts is licensed under the MIT license, while the data is licensed under the Creative Commons Attribution 4.0 International license.

Notes

In order to run the project, a [code_root]/cfg/environment_config.json file (where [code_root] denotes the root folder of the code contained in ./code.zip) is needed. This file points to the root folders used to load and store the datasets and models. An example file could look like this:

{
  "paths": {
    "datasets": "/path/to/datasets/",
    "models": "/path/to/models/",
    "wd_datasets": "/path/to/working/directory/of/datasets/"
  }
}
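
The configuration can then be read with standard JSON tooling. The following loader is a hypothetical sketch (the actual code in ./code.zip may structure this differently):

    import json
    from pathlib import Path

    # Hypothetical sketch: read the path configuration from
    # [code_root]/cfg/environment_config.json.
    def load_environment_config(code_root):
        config_path = Path(code_root) / "cfg" / "environment_config.json"
        with config_path.open() as f:
            return json.load(f)["paths"]

    paths = load_environment_config("/path/to/code_root")
    print(paths["datasets"], paths["models"], paths["wd_datasets"])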

Files (650.0 MiB)

Name              Size       MD5
code.zip          70.4 KiB   7aec8c9a4ff3eef31e516645e6d9e346
requirements.txt  1.0 KiB    2db54be1c3b908b05e7a1d230efa9acf
samples.zip       99.2 MiB   e2132f7a27ba6ae66315954ecaea8c5b
scoda.zip         117.8 KiB  fc017633761076743682277952f3157d
survey.zip        7.3 KiB    102863f0f645303dc217ad07d5362198
weights.zip       550.6 MiB  a85778217b980140e824c64fb17d4d42

Additional details

Related works

Is supplement to
Conference Paper: 10.1007/978-3-031-80607-0_21 (DOI)