The Research Fellows (RFs) will engage in research and development of an AI-Lyricist system and a Singing and Listening to Improve Our Natural Speaking (SLIONS) system for language learning and/or speech rehabilitation. The RFs will work in a multi-disciplinary team to help develop 1) machine learning algorithms, 2) an automatic lyrics generation system, and 3) an automatic singing voice/speech evaluation system. The RFs will help coordinate research activities, supervise research staff and students in the team, and perform data analyses. The RFs must be capable of working both independently and in a team, of developing innovative solutions, and of publishing research findings in high-impact conferences and journals. These positions are part of two 3-year projects funded by the Ministry of Education (MOE) of Singapore, with a total value of SGD 2.4 million (ca. USD 1.75 million).
Here are the relevant references for the projects:
Wang, Y. (2019). Singing Voice Modelling for Language Learning. Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
Murad, D., Wang, R., Turnbull, D., & Wang, Y. (2018). SLIONS: A Karaoke Application to Enhance Foreign Language Learning, in 2018 ACM International Conference on Multimedia (MM'18), pp. 1679-1687. [NUS News]
Ma, X., Wang, Y., Kan, M. Y., and Lee, W. S., (2021). AI-Lyricist: Generating Music and Vocabulary Constrained Lyrics, in 2021 ACM International Conference on Multimedia (MM'21), pp. 1002-1011.
Sharma, B., & Wang, Y. (2020). Automatic Evaluation of Song Intelligibility using Singing Adapted STOI and Vocal-specific Features, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 319-331. [code] [data]
Gu, X.*, Ou, L.*, Ong, D., and Wang, Y., (2022). MM-ALT: A Multimodal Automatic Lyric Transcription System, in 2022 ACM International Conference on Multimedia (MM'22), pp. 3328-3337. [demo]
Ou, L.*, Gu, X.*, and Wang, Y., (2022). Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription, in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022).
Wei, W., Huang, H., Gu, X., Wang, H., and Wang, Y. (2022). Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization. Transactions on Machine Learning Research, December 2022.
Candidates will need:
Ph.D. in Computer Science & Engineering, Electrical Engineering, or related disciplines
Experience in signal processing and in machine learning or natural language processing
Strong analytical and programming skills
Strong publication track record
Strong ability to work independently and in teams
Experience with automatic speech recognition (ASR), automatic lyrics or singing voice generation/analysis is a plus
Knowledge of music is a plus
Key details:
Positions available immediately
1-year contract, extendable for 2 years
Competitive salary and benefits
The candidates must be based in, or be able to relocate to, Singapore
For enquiries and further details about the responsibilities of the positions, please contact Associate Prof. Ye Wang at wangye@comp.nus.edu.sg with the subject line "SMC4HHP positions".