Musical Distribution Shift

This repository contains the code and data used for our evaluation of Automatic (Piano) Music Transcription (AMT) systems under Musical Distribution Shift (MDS), as published in "Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis".

The full MDS Dataset including audio files is available at https://zenodo.org/records/17467279.

Contents

data: ground-truth MIDI files and the transcriptions produced by the evaluated AMT systems, plus metadata
comp_metrics.py: computes all relevant metrics from the pairs of MIDI files found in data
results: computed performance metrics, stored as CSV files
metrics: code for computing the metrics
figures: code that reproduces the figures from our paper from the CSV files in results
results_ref: reference results (the exact numbers reported in the paper's figures)
notebooks: Jupyter notebooks documenting the curation of the MDS dataset and a reproducibility check
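
A freshly computed CSV in results can be compared against its counterpart in results_ref to confirm that a run reproduces the reported numbers. The following is a minimal sketch of such a comparison, not the code used in the repository's notebooks. The function name `csv_mismatches`, the tolerance `rel_tol`, and the sample columns `model` and `f1` are assumptions made for illustration; the actual CSV layout may differ.

```python
import csv
import io
import math
from typing import List, Optional, TextIO


def _cells_equal(a: Optional[str], b: Optional[str], rel_tol: float) -> bool:
    """Compare two CSV cells: identical text, or numerically close values."""
    if a == b:
        return True
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=1e-12)
    except (TypeError, ValueError):
        # At least one cell is missing or not numeric, and the text differs.
        return False


def csv_mismatches(computed: TextIO, reference: TextIO,
                   rel_tol: float = 1e-6) -> List[str]:
    """Return a description of every cell that differs between two CSV streams.

    Rows are compared in order, column by column, using the reference header.
    """
    comp_rows = list(csv.DictReader(computed))
    ref_rows = list(csv.DictReader(reference))
    problems = []
    if len(comp_rows) != len(ref_rows):
        problems.append(
            f"row count differs: {len(comp_rows)} vs {len(ref_rows)}")
    for i, (comp, ref) in enumerate(zip(comp_rows, ref_rows)):
        for column, ref_value in ref.items():
            if column not in comp:
                problems.append(f"row {i}: column {column!r} is missing")
            elif not _cells_equal(comp[column], ref_value, rel_tol):
                problems.append(
                    f"row {i}, {column!r}: {comp[column]!r} != {ref_value!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical in-memory data standing in for two CSV files.
    computed = io.StringIO("model,f1\nA,0.80\nB,0.50\n")
    reference = io.StringIO("model,f1\nA,0.80\nB,0.60\n")
    for line in csv_mismatches(computed, reference):
        print(line)
```

To apply it to real files, open one CSV from results and the file of the same name from results_ref, then pass both file objects to `csv_mismatches`. An empty list means the two files agree within the tolerance.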

Setup

conda create -n mds python=3.9
conda activate mds
pip install -r requirements.txt

Dependencies

python 3.9
mpteval 0.1.4
partitura 1.7.0
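
After installation, the pinned versions can be checked from within the environment. The snippet below is a small sketch that relies only on the standard library. It assumes the installed distribution names match the package names listed above; `EXPECTED` and `check_versions` are hypothetical names introduced for this example.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

# Versions taken from the dependency list above.
EXPECTED = {"mpteval": "0.1.4", "partitura": "1.7.0"}


def check_versions(
    expected: Dict[str, str],
) -> Dict[str, Tuple[Optional[str], bool]]:
    """Map each package name to (installed version or None, matches expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    print("python", ".".join(map(str, sys.version_info[:3])))
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={EXPECTED[name]} {status}")
```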

Citing

If you build on this work in your research, please cite the relevant journal article:

@article{martak2025biases,
	author = {Luk{\'a}{\v{s}} Samuel Mart{\'a}k and Patricia Hu and Gerhard Widmer},
	title = {{Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis}},
	journal = {EURASIP Journal on Audio, Speech, and Music Processing},
	year = {2025},
	month = {Dec},
	day = {11},
	volume = {2026},
	number = {1},
	pages = {5},
	issn = {1687-4722},
	doi = {10.1186/s13636-025-00428-z},
	url = {https://doi.org/10.1186/s13636-025-00428-z},
	abstract = {Automatic Music Transcription (AMT) — the task of converting music audio into note representations — has seen rapid progress, driven largely by deep learning systems. Due to the limited availability of richly annotated music datasets, much of the progress in AMT has been concentrated on classical piano music, and even a few very specific datasets. Whether these systems can generalize effectively to other musical contexts remains an open question. Complementing recent studies on distribution shifts in sound (e.g., recording conditions), in this work we investigate the musical dimension—specifically, variations in genre, dynamics, and polyphony levels. To this end, we introduce the MDS corpus, comprising three distinct subsets — (1) genre, (2) random, and (3) MAEtest — to emulate different axes of distribution shift. We evaluate the performance of several state-of-the-art AMT systems on the MDS corpus using both traditional information-retrieval and musically informed performance metrics. Our extensive evaluation isolates and exposes varying degrees of performance degradation under specific distribution shifts. In particular, we measure a note-level F1 performance drop of 20 percentage points due to sound, and 14 due to genre. Generally, we find that dynamics estimation proves more vulnerable to musical variation than onset prediction. Musically informed evaluation metrics, particularly those capturing harmonic structure, help identify potential contributing factors. Furthermore, experiments with randomly generated, non-musical sequences reveal clear limitations in system performance under extreme musical distribution shifts. Altogether, these findings offer new evidence of the persistent impact of the corpus bias problem in deep AMT systems.},
	keywords = {Automatic Music Transcription, AMT, Musical Distribution Shift, MDS corpus, Corpus Bias, Deep Learning, Robustness Evaluation Benchmark, Out-of-Distribution Inference, Generalization, Polyphonic Piano Transcription}
}

Related publications:

2025 EURASIP JASM paper "Sound and Music Biases in Deep Music Transcription Models: A Systematic Analysis" (arXiv preprint)
2024 IWSSPA workshop paper "Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems" (arXiv preprint)

Acknowledgments

This work is supported by the European Research Council (ERC) under the EU’s Horizon 2020 research and innovation programme, grant agreement No. 101019375 ("Whither Music?"), by the LIT AI Lab, by the Johannes Kepler University Open Access Publishing Fund, and by the Federal State of Upper Austria.

License

All software provided in this repository is subject to the CRAPL license.
