Singer and Genre Recognition using Deep Learning

Zainab Iqra Yasrab
Akhtar Hussain Jalbani
Saima Siraj Soomro

Abstract

This paper addresses two common problems in music information retrieval: genre recognition and singer identification from ordinary pieces of music. We use a music information retrieval system that processes audio files and produces the desired results. The system is divided into phases. First, the singer's vocals are separated from the music clip, because background music makes the vocals harder to analyze. Next, music features are extracted; here we work with tempo, RMS, song duration, frequency, dynamic range, and tonality. A recurrent neural network (RNN) is used to train the models and to test them on our data set. After training, the classifiers were tested on unknown audio files, achieving 91% accuracy for singer identification and 90% for music genre recognition. The system identifies both the singer and the genre of a song from the same segment of the file. The training data consists of audio files from 10 different Indian and Pakistani male and female singers in Sindhi, Hindi, and Urdu, covering six genres: rock, pop, jazz, disco, hiphop, and classic. Each song is sung by a single singer; group singing and live concert recordings are limitations of our system.
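To make the feature-extraction phase more concrete, the following is a minimal sketch of how a few of the listed features (song duration, RMS, dynamic range, and a rough frequency estimate) could be computed from raw samples. It is not the authors' implementation, which the abstract does not describe. The function names, the frame length, the dB-ratio definition of dynamic range, and the zero-crossing frequency estimate are all assumptions chosen for illustration. Tempo and tonality are left out because they need onset and key analysis beyond a short example, and vocal separation is assumed to have already happened.

```python
import math

def frame_rms(samples, frame_len):
    """RMS of each non-overlapping frame of `frame_len` samples."""
    values = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return values

def zero_crossing_frequency(samples, sample_rate):
    """Rough frequency estimate: a pure tone crosses zero twice per cycle."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * sample_rate / (2 * len(samples))

def extract_features(samples, sample_rate, frame_len=1024):
    """Return a small dict of clip-level features (hypothetical helper)."""
    rms_frames = frame_rms(samples, frame_len)
    loudest = max(rms_frames)
    quietest = min(r for r in rms_frames if r > 0)
    return {
        "duration_s": len(samples) / sample_rate,
        "rms": math.sqrt(sum(x * x for x in samples) / len(samples)),
        # Assumed definition: loudest-to-quietest frame RMS ratio in dB.
        "dynamic_range_db": 20 * math.log10(loudest / quietest),
        "frequency_hz": zero_crossing_frequency(samples, sample_rate),
    }

# Synthetic 2-second clip: a 440 Hz tone, quiet first half, loud second half.
SR = 8000
clip = [
    (0.1 if n < SR else 1.0) * math.sin(2 * math.pi * 440 * n / SR)
    for n in range(2 * SR)
]
features = extract_features(clip, SR)
print(features)
```

On this synthetic clip the duration is 2 seconds, the dynamic range comes out close to 20 dB (a tenfold amplitude ratio), and the frequency estimate lands near 440 Hz. In a real pipeline, a vector of such features per clip or per frame would be the input to the RNN classifier.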

How to Cite
Yasrab, Z. I., Jalbani, A. H., & Soomro, S. S. (2023). Singer and Genre Recognition using Deep Learning. Pakistan Journal of Emerging Science and Technologies (PJEST), 4(2), 1–12. https://doi.org/10.58619/pjest.v4i2.102
