High Resolution Guitar Transcription via Domain Adaptation (2402.15258v1)

Published 23 Feb 2024 in eess.AS, cs.LG, and cs.SD

Abstract: Automatic music transcription (AMT) has achieved high accuracy for piano due to the availability of large, high-quality datasets such as MAESTRO and MAPS, but comparable datasets are not yet available for other instruments. In recent work, however, it has been demonstrated that aligning scores to transcription model activations can produce high quality AMT training data for instruments other than piano. Focusing on the guitar, we refine this approach to training on score data using a dataset of commercially available score-audio pairs. We propose the use of a high-resolution piano transcription model to train a new guitar transcription model. The resulting model obtains state-of-the-art transcription results on GuitarSet in a zero-shot context, improving on previously published methods.
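The core idea in the abstract is that activations from a strong pre-trained transcription model can be used to refine the coarse note timings found in commercial scores, yielding high-quality training targets. As a toy illustration only (not the paper's actual alignment procedure), the sketch below snaps each nominal score onset to the nearest strong peak in a 1-D onset-activation curve; the function name `refine_onsets` and the `window`/`threshold` parameters are hypothetical.

```python
import numpy as np

def refine_onsets(score_onsets, activation, frame_rate=100.0,
                  window=0.2, threshold=0.5):
    """Snap nominal score onset times (seconds) to peaks in a per-frame
    onset-activation curve produced by a transcription model.

    A note keeps its nominal time when no activation within +/- `window`
    seconds exceeds `threshold` (i.e. the model found no clear onset).
    """
    activation = np.asarray(activation, dtype=float)
    refined = []
    for t in score_onsets:
        lo = max(0, int((t - window) * frame_rate))
        hi = min(len(activation), int((t + window) * frame_rate) + 1)
        seg = activation[lo:hi]
        if seg.size and seg.max() >= threshold:
            # Move the onset to the strongest activation frame in the window.
            refined.append((lo + int(np.argmax(seg))) / frame_rate)
        else:
            # No confident peak: fall back to the score's nominal timing.
            refined.append(t)
    return refined
```

In the paper's setting the alignment is done jointly over all notes rather than one onset at a time, but even this simplified view shows how model activations can sharpen score timings into frame-accurate labels.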

