Supplementary Material for "On Different Symbolic Music Representations for Algorithmic Composition Approaches based on Neural Sequence Models"
Description
Overview
This repository contains additional resources for our paper
Felix Schön, Hans Tompits: On Different Symbolic Music Representations for Algorithmic Composition Approaches based on Neural Sequence Models. AIxIA 2024.
Among the different approaches for automated music composition, those based on neural sequence models like the transformer show particular promise. A critical aspect of such approaches is how given music data sets are represented, or tokenised, to serve as suitable inputs for such models, as the choice of representation influences the quality of the produced output. In this paper, we introduce seven novel tokenisation techniques for converting MIDI data into numeric sequences. We compare characteristics of our tokenisers based on sets of musical data translated using our approaches. Our results show that some of our techniques greatly outperform the approaches found in the literature with respect to different metrics such as sequence length, information density, or memory requirements. Moreover, to evaluate the influence of our tokenisation approaches on the quality of the output of a model, we trained an ensemble of transformer models on the sets of tokenised musical data and performed a user study to assess the quality of the generated music pieces. The results of the study show that the quality of the pieces produced using our most promising techniques equals or exceeds that of state-of-the-art approaches.
In this paper, we introduce and compare seven tokenisation techniques for symbolic music generation, that is, the algorithmic composition of music in which pieces are represented as sequences of "tokens" rather than as audio waveforms. The exact way music is represented using these tokens (the "tokenisation" process) can have a significant impact on a model's ability to learn and on the quality of its output. To evaluate our tokenisation approaches, we trained an ensemble of seven Transformer models on musical data represented using our techniques and then conducted a user study on the output of the models to compare the different approaches.
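As a purely illustrative example of what tokenisation means here, consider the following sketch, which maps notes to integer tokens using a hypothetical vocabulary. This toy scheme is much simpler than any of the tokenisers from the paper (those are implemented in S-Coda, see below) and is only meant to convey the general idea:

```python
# Purely illustrative tokenisation sketch; the actual tokenisers are
# implemented in our library S-Coda (see ./scoda.zip) and differ in detail.
# Each note is reduced to a (pitch, duration) pair, which is then mapped
# to integer tokens from a small, hypothetical vocabulary.

# Hypothetical vocabulary: MIDI pitches 0-127 map to themselves;
# durations (in fractions of a whole note) get their own token ids.
DURATION_TOKENS = {0.25: 128, 0.5: 129, 1.0: 130}

def tokenise(notes):
    """Convert a list of (pitch, duration) pairs into a flat token sequence."""
    tokens = []
    for pitch, duration in notes:
        tokens.append(pitch)                      # pitch token (0-127)
        tokens.append(DURATION_TOKENS[duration])  # duration token
    return tokens

# A C major arpeggio (MIDI pitches 60, 64, 67) of quarter notes:
print(tokenise([(60, 0.25), (64, 0.25), (67, 0.25)]))
# -> [60, 128, 64, 128, 67, 128]
```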
This repository contains both the code for the tokenisation approaches and the code for the models and their training loop. It furthermore contains the weights of the seven models trained on our tokenisation techniques, as well as those of two comparison models trained on approaches commonly used in the field. Lastly, we include the raw output of the user study we conducted and the samples used for it, in both MIDI and .mp3 formats.
Structure
This repository is organised as follows:
./code.zip
contains the code used to train the Transformer models. Note that this directory is a snapshot of the code at the moment of submission of the paper and thus contains some aspects aimed at tasks not related to this publication that were still work in progress at the time of publishing. It is included for reproducibility and educational purposes.
./samples.zip
contains 64 pieces generated by each of the different models in the .mp3 and .midi formats, as well as the token representations of the pieces.
./survey.zip
contains the raw answers of all survey participants.
./weights.zip
contains the compressed checkpoints of the trained models, along with logs of the training process.
./scoda.zip
contains the code for the different tokenisation approaches in the form of our MIDI processing library S-Coda v2.1. Note that S-Coda is still in active development and may change in the future; up-to-date versions can be found at its git repository.
./requirements.txt
contains the dependencies and their versions compatible with this project.
Dependencies
We used Python v3.11, PyTorch v2.5, and S-Coda v2.1 for this project. For more information, see both S-Coda's and the training code's pyproject.toml files, which specify additional (minor) dependencies.
License
The code of both S-Coda and the training scripts is licensed under the MIT license, while the data is licensed under the Creative Commons Attribution 4.0 International license.
Notes
In order to run the project, a [code_root]/cfg/environment_config.json file (where [code_root] marks the root folder of the code contained in ./code.zip) is needed. This file points to the root folders used to load and store the datasets and models. An example file could look like this:
{
  "paths": {
    "datasets": "/path/to/datasets/",
    "models": "/path/to/models/",
    "wd_datasets": "/path/to/working/directory/of/datasets/"
  }
}
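For illustration, such a file could be read as sketched below. The helper function is hypothetical; the actual loading code in code.zip may differ:

```python
import json
from pathlib import Path

# Hypothetical helper for reading environment_config.json; the actual
# loading code in code.zip may handle this differently.
def load_environment_config(code_root):
    config_path = Path(code_root) / "cfg" / "environment_config.json"
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)["paths"]

# Example usage: resolve the configured dataset root.
paths = load_environment_config("/path/to/code_root")
print(paths["datasets"])  # e.g. "/path/to/datasets/"
```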
Files
(650.0 MiB total)

MD5 checksum | Size
---|---
md5:7aec8c9a4ff3eef31e516645e6d9e346 | 70.4 KiB
md5:2db54be1c3b908b05e7a1d230efa9acf | 1.0 KiB
md5:e2132f7a27ba6ae66315954ecaea8c5b | 99.2 MiB
md5:fc017633761076743682277952f3157d | 117.8 KiB
md5:102863f0f645303dc217ad07d5362198 | 7.3 KiB
md5:a85778217b980140e824c64fb17d4d42 | 550.6 MiB
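To verify the integrity of a downloaded archive against the checksums listed above, its MD5 digest can be computed locally, for instance with Python's standard library (the file name in the example is just a placeholder):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare the result against the md5 values in the table above, e.g.:
print(md5_of("weights.zip"))
```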
Additional details
Related works
- Is supplement to: Conference Paper, DOI 10.1007/978-3-031-80607-0_21