Abdulazeez, Rana L. and Alizadeh, Fattah (2024) Deep Learning-Based Optical Music Recognition for Semantic Representation of Non-overlap and Overlap Music Notes. ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 12 (1). pp. 79-87. ISSN 2410-9355
Text (Research Article)
ARO.11402.VOL12.NO1.2024.ISSUE22-PP79-87.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Optical music recognition (OMR) is the process of teaching a computer to interpret musical notation. It aims to convert sheet music presented as an image into a computer-readable format. Recently, the sequence-to-sequence model with an attention mechanism, already used in text and handwriting recognition, has been applied to music notation recognition. However, because information gradually vanishes over excessively long sequences in music sheets, these OMR models, which are built on long short-term memory (LSTM) networks, have difficulty learning the relationships among musical symbols. Consequently, a new framework is proposed that uses image segmentation to break the procedure into several steps. This study also addresses the overlap problem in OMR: overlapping can cause music notation to be misinterpreted and produce inaccurate results. A novel algorithm is therefore proposed to detect and segment notations that are extremely close to each other. Our experiments use a convolutional neural network (CNN) block as a feature extractor for the image of the musical sheet and a sequence-to-sequence model to retrieve the corresponding semantic representation. The proposed approach is evaluated on the Printed Images of Music Staves (PrIMuS) dataset. The results confirm that the suggested framework successfully solves the problem of long-sequence music sheets, obtaining a symbol error rate (SER) of 0% for non-overlapping symbols in the best scenario. Furthermore, the approach shows promising results on the overlapping problem: 23.12% SER for overlapping symbols.
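The SER figures quoted above can be read with the help of a small worked example. The sketch below assumes that SER is the symbol error rate as it is commonly defined in OMR work: the number of symbol-level edits (insertions, deletions, substitutions) needed to turn the predicted sequence into the reference, divided by the number of reference symbols. The abstract does not spell out this definition, so it is an assumption here, and the function names and sample tokens are purely illustrative, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference symbols reaches an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def symbol_error_rate(references, hypotheses):
    """Assumed SER: total edits over total reference symbols, in percent."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_symbols = sum(len(r) for r in references)
    if total_symbols == 0:
        raise ValueError("reference sequences contain no symbols")
    return 100.0 * total_edits / total_symbols


# Illustrative semantic tokens (hypothetical sample, not data from the paper).
reference = ["clef-G2", "timeSignature-2/4", "note-C4_quarter",
             "note-D4_quarter", "barline"]
predicted = ["clef-G2", "timeSignature-2/4", "note-C4_quarter",
             "note-E4_quarter", "barline"]

print(symbol_error_rate([reference], [predicted]))  # one substitution in five symbols
```

Under this definition, a SER of 0% means every predicted sequence matches its reference exactly, while a SER of 23.12% corresponds to roughly 23 edits per 100 reference symbols.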
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Sequence-to-Sequence, Long Short-term Memory Network, Convolutional Neural Network, Segmentation, Semantic representation, Overlapping |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | ARO-The Scientific Journal of Koya University > VOL 12, NO 1 (2024) |
Depositing User: | Dr Salah Ismaeel Yahya |
Date Deposited: | 02 Sep 2024 06:58 |
Last Modified: | 02 Sep 2024 06:58 |
URI: | http://eprints.koyauniversity.org/id/eprint/473 |