SMC Lab | Publication

Full Publication List

2025

Q. Liang, X. Ma, T. Hopkins, and Y. Wang, “LivePoem: Improving the Learning Experience of Classical Chinese Poetry with AI-Generated Musical Storyboards,” in Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2025). ijcai.org, 2025.
J. Zhao, X. Wang, and Y. Wang, “Prosody-Adaptable Audio Codecs for Zero-Shot Voice Conversion via In-Context Learning,” in Proceedings of the 26th Annual Conference of the International Speech Communication Association (Interspeech 2025). ISCA, 2025.
H. Liu, H. Huang, H. Wang, X. Gu, and Y. Wang, “On Calibration of LLM-based Guard Models for Reliable Content Moderation,” in Proceedings of the 13th International Conference on Learning Representations (ICLR 2025). OpenReview.net, 2025.
X. Gu, T. Pang, C. Du, Q. Liu, F. Zhang, C. Du, Y. Wang, and M. Lin, “When Attention Sink Emerges in Language Models: An Empirical View,” in Proceedings of the 13th International Conference on Learning Representations (ICLR 2025). OpenReview.net, 2025.
X. Gu, C. Du, T. Pang, C. Li, M. Lin, and Y. Wang, “On Memorization in Diffusion Models,” Trans. Mach. Learn. Res. (TMLR), vol. 2025, 2025.
L. Ou, Y. Takahashi, and Y. Wang, “Lead Instrument Detection from Multitrack Music,” in Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025). IEEE, 2025.
J. Zhao, C. Low, and Y. Wang, “SPSinger: Multi-Singer Singing Voice Synthesis with Short Reference Prompt,” in Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025). IEEE, 2025.

2024

J. Zhao, G. Xia, Z. Wang, and Y. Wang, “Structured Multi-Track Accompaniment Arrangement via Style Prior Modelling,” in Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024). 2024. [demo] [code]
H. Liu, H. Huang, and Y. Wang, “Advancing Test-Time Adaptation in Wild Acoustic Test Settings,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024). Association for Computational Linguistics, 2024, pp. 7138-7155.
H. Liu*, Y. Xie*, Y. Wang, and M. Shieh, “Advancing Adversarial Suffix Transfer Learning on Aligned Large Language Models,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024). Association for Computational Linguistics, 2024, pp. 7213-7224.
X. Ma, V. Sharma, M. Y. Kan, W. S. Lee, and Y. Wang, “KeYric: Unsupervised Keywords Extraction and Expansion from Music for Coherent Lyric Generation,” ACM Trans. Multim. Comput. Commun. Appl. (TOMM), vol. 21, no. 1, pp. 1-28, 2024.
X. Wang, M. Shi, and Y. Wang, “Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis,” in Proceedings of the 25th Annual Conference of the International Speech Communication Association (Interspeech 2024). ISCA, 2024, pp. 292-296.
J. Zhao, L. Q. Chetwin, and Y. Wang, “SinTechSVS: A Singing Technique Controllable Singing Voice Synthesis System,” IEEE ACM Trans. Audio Speech Lang. Process. (TASLP), vol. 32, pp. 2641–2653, 2024.
H. Huang, S. Wang, H. Liu, H. Wang, and Y. Wang, “Benchmarking Large Language Models on Communicative Medical Coaching: A Dataset and a Novel System,” in Findings of the Association for Computational Linguistics: ACL 2024 (Findings of ACL 2024). Association for Computational Linguistics, 2024, pp. 1624-1637.
Q. Liang, X. Ma, F. Doshi-Velez, B. Lim, and Y. Wang, “XAI-Lyricist: Improving the Singability of AI-Generated Lyrics with Prosody Explanations,” in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024). ijcai.org, 2024, pp. 7877-7885.
W. Zeng, X. He, and Y. Wang, “End-to-End Real-World Polyphonic Piano Audio-to-Score Transcription with Hierarchical Decoding,” in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024). ijcai.org, 2024, pp. 7788-7795.
X. Ma, Y. Wang, and Y. Wang, “Symbolic Music Generation from Graph-Learning-based Preference Modeling and Textual Queries,” IEEE Trans. Multim. (TMM), vol. 26, pp. 10545-10558, 2024.
X. Gu, X. Zheng, T. Pang, C. Du, Q. Liu, Y. Wang, J. Jiang, and M. Lin, “Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast,” in Proceedings of the 41st International Conference on Machine Learning (ICML 2024). PMLR, 2024.
X. Gu, L. Ou, W. Zeng, J. Zhang, N. Wong, and Y. Wang, “Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing,” ACM Trans. Multim. Comput. Commun. Appl. (TOMM), vol. 20, no. 7, pp. 1551-6857, 2024.
Q. Liang and Y. Wang, “Drawlody: Sketch-Based Melody Creation with Enhanced Usability and Interpretability,” IEEE Trans. Multim. (TMM), vol. 26, pp. 7074-7088, 2024.

2023

H. Liu and Y. Wang, “Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning,” in Findings of the Association for Computational Linguistics: EMNLP 2023 (Findings of EMNLP 2023). Association for Computational Linguistics, 2023, pp. 15825-15838.
Y. Wang, W. Wei, X. Gu, X. Guan, and Y. Wang, “Disentangled Adversarial Domain Adaptation for Phonation Mode Detection in Singing and Speech,” IEEE ACM Trans. Audio Speech Lang. Process. (TASLP), vol. 31, pp. 3746-3759, 2023.
X. Gu, W. Zeng, and Y. Wang, “Elucidate Gender Fairness in Singing Voice Transcription,” in Proceedings of the 31st ACM International Conference on Multimedia (MM 2023). ACM, 2023, pp. 8760-8769.
H. Liu, M. Shi, and Y. Wang, “Zero-Shot Automatic Pronunciation Assessment,” in Proceedings of the 24th Annual Conference of the International Speech Communication Association (Interspeech 2023). ISCA, 2023, pp. 1009-1013.
X. Wu, H. Huang, Y. Ding, H. Wang, Y. Wang, and Q. Xu, “FedNP: Towards Non-IID Federated Learning via Federated Neural Propagation,” in Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023). AAAI Press, 2023, pp. 10399-10407.
J. Zhao, G. Xia, and Y. Wang, “Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music re-Arrangement,” in Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023). ijcai.org, 2023, pp. 5878-5886. [code] [demo] [tutorial]
L. Ou, X. Ma, M. Kan, and Y. Wang, “Songs Across Borders: Singable and Controllable Neural Lyric Translation,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL 2023). Association for Computational Linguistics, 2023, pp. 447-467. [code] [demo]
Y. Wang, W. Wei, and Y. Wang, “Phonation Mode Detection in Singing: A Singer Adapted Model,” in Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023). IEEE, 2023, pp. 1-5.
S. Dai, X. Ma, Y. Wang, and R. B. Dannenberg, “Personalized Popular Music Generation Using Imitation and Structure,” Journal of New Music Research, vol. 51, no. 1, pp. 69-85, 2023.

2022

W. Wei, H. Huang, X. Gu, H. Wang, and Y. Wang, “Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization,” Trans. Mach. Learn. Res. (TMLR), vol. 2022, 2022.
H. Huang, X. Gu, H. Wang, C. Xiao, H. Liu, and Y. Wang, “Extrapolative Continuous-time Bayesian Neural Network for Fast Training-free Test-time Adaptation,” in Proceedings of the 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022). 2022.
L. Ou, X. Gu, and Y. Wang, “Transfer Learning of Wav2Vec 2.0 for Automatic Lyric Transcription,” in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). 2022, pp. 891-899.
X. Ma, X. Liu, B. Zhang, and Y. Wang, “Robust Melody Track Identification in Symbolic Music,” in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). 2022, pp. 842-849.
J. Zhao, G. Xia, and Y. Wang, “Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention,” in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). 2022, pp. 169-177. [code] [tutorial] [video]
J. Zhao, G. Xia, and Y. Wang, “Domain Adversarial Training on Conditional Variational Auto-Encoder for Controllable Music Generation,” in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022). 2022, pp. 925-932. [code] [demo] [video]
X. Gu, L. Ou, D. Ong, and Y. Wang, “MM-ALT: A Multimodal Automatic Lyric Transcription System,” in Proceedings of the 30th ACM International Conference on Multimedia (MM 2022). ACM, 2022, pp. 3328-3337. (Top Paper Award) [demo]
X. Ma, Y. Wang, and Y. Wang, “Content Based User Preference Modeling in Music Generation,” in Proceedings of the 30th ACM International Conference on Multimedia (MM 2022). ACM, 2022, pp. 2473-2482. [demo 1] [demo 2]
L. Ou, Z. Guo, E. Benetos, J. Han, and Y. Wang, “Exploring Transformer's Potential on Automatic Piano Transcription,” in Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022). IEEE, 2022, pp. 776-780.

2021

X. Ma, Y. Wang, M. Kan, and W. S. Lee, “AI-Lyricist: Generating Music and Vocabulary Constrained Lyrics,” in Proceedings of the 29th ACM International Conference on Multimedia (MM 2021). ACM, 2021, pp. 1002-1011. [lyrics demo] [synthesis demo]
H. Huang, H. Liu, H. Wang, C. Xiao, and Y. Wang, “STRODE: Stochastic Boundary Ordinary Differential Equation,” in Proceedings of the 38th International Conference on Machine Learning (ICML 2021). PMLR, 2021, pp. 4435-4445. [code] [slides]

2020

H. Huang, F. Xue, H. Wang, and Y. Wang, “Deep Graph Random Process for Relational-Thinking-Based Speech Recognition,” in Proceedings of the 37th International Conference on Machine Learning (ICML 2020). PMLR, 2020, pp. 4531-4541. [supplementary] [slides]
W. Wei, H. Zhu, E. Benetos, and Y. Wang, “A-CRNN: A Domain Adaptation Model for Sound Event Detection,” in Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020). IEEE, 2020, pp. 276-280.

2019

B. Sharma and Y. Wang, “Automatic Evaluation of Song Intelligibility using Singing Adapted STOI and Vocal-specific Features,” IEEE ACM Trans. Audio Speech Lang. Process. (TASLP), vol. 28, pp. 319-331, 2020. [code] [data]
C. Gupta, H. Li, and Y. Wang, “Automatic Leaderboard: Evaluation of Singing Quality without a Standard Reference,” IEEE ACM Trans. Audio Speech Lang. Process. (TASLP), vol. 28, pp. 13-26, 2020.
B. Anderson, M. Shi, V. Y. F. Tan, and Y. Wang, “Mobile Gait Analysis Using Foot-Mounted UWB Sensors,” Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., vol. 3, no. 3, pp. 73:1-73:22, 2019.
S. S. R. Phaye, E. Benetos, and Y. Wang, “SubSpectralNet - Using Sub-Spectrogram Based Convolutional Neural Networks for Acoustic Scene Classification,” in Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019). IEEE, 2019, pp. 825-829.
B. Sharma, C. Gupta, H. Li, and Y. Wang, “Automatic Lyrics-to-Audio Alignment on Polyphonic Music Using Singing-Adapted Acoustic Models,” in Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019). IEEE, 2019, pp. 396-400.
Y. Wang, “Singing Voice Modelling for Language Learning,” Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.

2018

C. Gupta, H. Li, and Y. Wang, “Automatic Evaluation of Singing Quality without a Reference,” in Proceedings of the 2018 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2018). IEEE, 2018, pp. 990-997.
B. Anderson, S. Zhu, K. Yang, J. Wang, H. Anderson, C. X. Tay, V. Y. F. Tan, and Y. Wang, “MANA: Designing and Validating a User-Centered Mobility Analysis System,” in Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2018). ACM, 2018, pp. 321-332.
M. D. Barone, K. M. Ibrahim, C. Gupta, and Y. Wang, “Empirically Weighting the Importance of Decision Factors for Singing Preference,” in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018). 2018, pp. 529-536.
C. Gupta, R. Tong, H. Li, and Y. Wang, “Semi-Supervised Lyrics and Solo-Singing Alignment,” in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018). 2018, pp. 600-607.
C. Gupta, H. Li, and Y. Wang, “Automatic Pronunciation Evaluation of Singing,” in Proceedings of the 19th Annual Conference of the International Speech Communication Association (Interspeech 2018). ISCA, 2018, pp. 1507-1511.
D. Murad, R. Wang, D. Turnbull, and Y. Wang, “SLIONS: A Karaoke Application to Enhance Foreign Language Learning,” in Proceedings of the 26th ACM International Conference on Multimedia (MM 2018). ACM, 2018, pp. 1679-1687. [NUS News]
C. Gupta, H. Li, and Y. Wang, “A Technical Framework for Automatic Perceptual Evaluation of Singing Quality,” APSIPA Transactions on Signal and Information Processing, vol. 7, 2018.

2017

C. Gupta, H. Li, and Y. Wang, “Perceptual Evaluation of Singing Quality,” in Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2017). IEEE, 2017, pp. 577-586. (Best Student Paper Award)
D. Turnbull, C. Gupta, D. Murad, M. Barone, and Y. Wang, “Using Music Technology to Motivate Foreign Language Learning,” in Proceedings of the 2017 International Conference on Orange Technologies (ICOT 2017). IEEE, 2017, pp. 218-221.
D. Murad, F. Ye, M. Barone, and Y. Wang, “Motion Initiated Music Ensemble with Sensors for Motor Rehabilitation,” in Proceedings of the 2017 International Conference on Orange Technologies (ICOT 2017). IEEE, 2017, pp. 87-90.
J. Fang, D. Grunberg, S. Lui, and Y. Wang, “Development of a Music Recommendation System for Motivating Exercise,” in Proceedings of the 2017 International Conference on Orange Technologies (ICOT 2017). IEEE, 2017, pp. 83-86.
J. Fang, D. Grunberg, D. J. Litman, and Y. Wang, “Discourse Analysis of Lyric and Lyric-Based Classification of Music,” in Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). 2017, pp. 464-471.
C. Gupta, D. Grunberg, P. Rao, and Y. Wang, “Towards Automatic Mispronunciation Detection in Singing,” in Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). 2017, pp. 390-396.
K. M. Ibrahim, D. Grunberg, K. Agres, C. Gupta, and Y. Wang, “Intelligibility of Sung Lyrics: A Pilot Study,” in Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). 2017, pp. 686-693. [data]
Z. Duan, C. Gupta, G. Percival, D. Grunberg, and Y. Wang, “SECCIMA: Singing and Ear Training for Children with Cochlear Implants via a Mobile Application,” in Proceedings of the 14th Sound and Music Computing Conference (SMC 2017). 2017, pp. 200-207.

2016

W. Zhu, B. Anderson, S. Zhu, and Y. Wang, “A Computer Vision-Based System for Stride Length Estimation using a Mobile Phone Camera,” in Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2016). ACM, 2016, pp. 121-130.
L. Bu, K. Yang, W. Xiong, F. Liu, B. Anderson, Y. Wang, and J. Wang, “Toward Precision Medicine in Parkinson's Disease,” Annals of Translational Medicine, vol. 4, no. 2, 2016.

2015

R. J. Ellis, Y. S. Ng, S. Zhu, D. M. Tan, B. Anderson, G. Schlaug, and Y. Wang, “A Validated Smartphone-Based Assessment of Gait and Gait Variability in Parkinson's Disease,” PLOS ONE, vol. 10, no. 10, pp. 1-22, 2015.
R. J. Ellis, Z. Xing, J. Fang, and Y. Wang, “Quantifying Lexical Novelty in Song Lyrics,” in Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015). 2015, pp. 694-700. [data]
R. J. Ellis, B. Zhu, J. Koenig, J. F. Thayer, and Y. Wang, “A Careful Look at ECG Sampling Frequency and R-Peak Interpolation on Short-Term Measures of Heart Rate Variability,” Physiological Measurement, vol. 36, pp. 1827-1852, 2015.
S. Zhu, B. Anderson, R. Ellis, and Y. Wang, “Using Smartphones in Gait Analysis: An iOS-Based Rhythmic Auditory Cueing Evaluation (iRACE) Mobile Application,” the first prize (top out of 38 shortlisted teams) at the Asia Pacific Assistive, Rehabilitative and Therapeutic Technologies Challenge, 2015.
R. J. Ellis, Z. Duan, and Y. Wang, “Quantifying Auditory Temporal Stability in a Large Database of Recorded Music,” PLOS ONE, vol. 9, no. 12, pp. 1-22, 2015.

2014

Z. Xing, X. Wang, and Y. Wang, “Enhancing Collaborative Filtering Music Recommendation by Balancing Exploration and Exploitation,” in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). 2014, pp. 445-450. (Best Paper Award)
S. Zhu, J. Cai, J. Zhang, Z. Li, J. Wang, and Y. Wang, “Bridging the User Intention Gap: an Intelligent and Interactive Multidimensional Music Search Engine,” in Proceedings of the First International Workshop on Internet-Scale Multimedia Management (WISMM 14). ACM, 2014, pp. 59-64.
X. Wang, Y. Wang, D. Hsu, and Y. Wang, “Exploration in Interactive Personalized Music Recommendation: A Reinforcement Learning Approach,” ACM Trans. Multim. Comput. Commun. Appl. (TOMM), vol. 11, no. 1, pp. 7:1-7:22, 2014.
S. Zhu, R. J. Ellis, G. Schlaug, Y. S. Ng, and Y. Wang, “Validating an iOS-Based Rhythmic Auditory Cueing Evaluation (iRACE) for Parkinson's Disease,” in Proceedings of the 22nd ACM International Conference on Multimedia (MM 2014). ACM, 2014, pp. 487-496. [erratum] [original version]
X. Wang and Y. Wang, “Improving Content-Based and Hybrid Music Recommendation using Deep Learning,” in Proceedings of the 22nd ACM International Conference on Multimedia (MM 2014). ACM, 2014, pp. 627-636.

2013

Y. Yu, R. Zimmermann, Y. Wang, and V. Oria, “Scalable Content-Based Music Retrieval Using Chord Progression Histogram and Tree-Structure LSH,” IEEE Trans. Multim. (TMM), vol. 15, no. 8, pp. 1969-1981, 2013.
Z. Duan, H. Fang, B. Li, K. C. Sim, and Y. Wang, “The NUS Sung and Spoken Lyrics Corpus: A Quantitative Comparison of Singing and Speech,” in Proceedings of the 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2013). IEEE, 2013, pp. 1-9. [data]
Z. Li, B. Zhang, Y. Yu, J. Shen, and Y. Wang, “Query-Document-Dependent Fusion: A Case Study of Multimodal Music Retrieval,” IEEE Trans. Multim. (TMM), vol. 15, no. 8, pp. 1830-1842, 2013.
Z. Cai, R. J. Ellis, Z. Duan, H. Lu, and Y. Wang, “Basic Evaluation of Auditory Temporal Stability (BEATS): A Novel Rationale and Implementation,” in Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013). 2013, pp. 541-546.
Z. Li, J. Wang, J. Cai, Z. Duan, H. Wang, and Y. Wang, “Non-Reference Audio Quality Assessment for Online Live Music Recordings,” in Proceedings of the 21st ACM International Conference on Multimedia (MM 2013). ACM, 2013, pp. 63-72.

2012

Z. Li and Y. Wang, “A Domain-Specific Music Search Engine for Gait Training,” in Proceedings of the 20th ACM International Conference on Multimedia (MM 2012). ACM, 2012, pp. 1311-1312.
Y. Yu, R. Zimmermann, Y. Wang, and V. Oria, “Recognition and Summarization of Chord Progressions and Their Application to Music Information Retrieval,” in Proceedings of the 2012 IEEE International Symposium on Multimedia (ISM 2012). IEEE Computer Society, 2012, pp. 9-16. (Best Paper Award)
X. Wang, Y. Wang, and D. S. Rosenblum, “A Daily, Activity-Aware, Mobile Music Recommender System,” in Proceedings of the 20th ACM International Conference on Multimedia (MM 2012). ACM, 2012, pp. 1313-1314.
X. Wang, D. S. Rosenblum, and Y. Wang, “Context-Aware Mobile Music Recommendation for Daily Activities,” in Proceedings of the 20th ACM International Conference on Multimedia (MM 2012). ACM, 2012, pp. 99-108.
W. F. Ng, Y. Zhou, P. Tan, and Y. Wang, “Using the MOGCLASS in Group Music Therapy With Individuals With Muscular Dystrophy: A Pilot Study,” International Association for Music and Medicine, vol. 4, no. 4, pp. 199-204, 2012.
Y. Zhou, K. C. Sim, P. Tan, and Y. Wang, “MOGAT: Mobile Games with Auditory Training for Children with Cochlear Implants,” in Proceedings of the 20th ACM International Conference on Multimedia (MM 2012). ACM, 2012, pp. 429-438.
Y. Zhou, T. K. P. Monserrat, and Y. Wang, “MOGAT: A Cloud-Based Mobile Game System with Auditory Training for Children with Cochlear Implants,” in Proceedings of the 20th ACM International Conference on Multimedia (MM 2012). ACM, 2012, pp. 1309-1310.
S. Zhu, H. Anderson, and Y. Wang, “A Real-Time On-Chip Algorithm for IMU-Based Gait Measurement,” in Proceedings of the 13th Pacific-Rim Conference on Multimedia (PCM 2012). Springer, 2012, pp. 93-104.
S. Zhu, H. Anderson, and Y. Wang, “Reducing the Power Consumption of an IMU-Based Gait Measurement System,” in Proceedings of the 13th Pacific-Rim Conference on Multimedia (PCM 2012). Springer, 2012, pp. 105-116.

2011

Z. Li, B. Zhang, and Y. Wang, “Document Dependent Fusion in Multimodal Music Retrieval,” in Proceedings of the 19th International Conference on Multimedia (MM 2011). ACM, 2011, pp. 1105-1108.
Y. Yi, Y. Zhou, and Y. Wang, “A Tempo-Sensitive Music Search Engine with Multimodal Inputs,” in Proceedings of the 1st International ACM Workshop on Music Information Retrieval with User-Centered and Multimodal Strategies (MIRUM 2011). ACM, 2011, pp. 13-18.
Y. Zhou, G. Percival, X. Wang, Y. Wang, and S. Zhao, “MOGCLASS: Evaluation of a Collaborative System of Mobile Devices for Classroom Music Education of Young Children,” in Proceedings of the International Conference on Human Factors in Computing Systems (CHI 2011). ACM, 2011, pp. 523-532. (Honorable Mention Paper Award)
X. Chen, Z. Zhao, A. Rahmati, Y. Wang, and L. Zhong, “Sensor-Assisted Video Encoding for Mobile Devices in Real-World Environments,” IEEE Trans. Circuits Syst. Video Technol. (TCSVT), vol. 21, no. 3, pp. 335-349, 2011.

2010

Z. Zhao, X. Wang, Q. Xiang, A. M. Sarroff, Z. Li, and Y. Wang, “Large-scale Music Tag Recommendation with Explicit Multiple Attributes,” in Proceedings of the 18th ACM International Conference on Multimedia (MM 2010). ACM, 2010, pp. 401–410.
Z. Li, Q. Xiang, J. Hockman, J. Yang, Y. Yi, I. Fujinaga, and Y. Wang, “A Music Search Engine for Therapeutic Gait Training,” in Proceedings of the 18th ACM International Conference on Multimedia (MM 2010). ACM, 2010, pp. 627–630.
W. Zhao, X. Wang, and Y. Wang, “Automated Sleep Quality Measurement using EEG Signal - First Step Towards a Domain Specific Music Recommendation System,” in Proceedings of the 18th ACM International Conference on Multimedia (MM 2010). ACM, 2010, pp. 1079–1082.
Y. Zhou, G. Percival, X. Wang, Y. Wang, and S. Zhao, “MOGCLASS: A Collaborative System of Mobile Devices for Classroom Music Education,” in Proceedings of the 18th ACM International Conference on Multimedia (MM 2010). ACM, 2010, pp. 671–674.
Y. Wang, “Perception-Aware Low-Power Media Processing for Portable Devices,” IEEE COMSOC MMTC E-Letter, vol. 5, no. 4, pp. 14-17, 2010. (Invited Paper)

2009

Y. Liu, Q. Xiang, Y. Wang, and L. Cai, “Cultural Style Based Music Classification of Audio Signals,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009). IEEE, 2009, pp. 57–60.
B. Zhang, Q. Xiang, H. Lu, J. Shen, and Y. Wang, “Comprehensive Query-Dependent Fusion using Regression-on-Folksonomies: A Case Study of Multimodal Music Search,” in Proceedings of the 17th ACM International Conference on Multimedia (MM 2009). ACM, 2009, pp. 213–222.
B. Zhang, J. Shen, Q. Xiang, and Y. Wang, “CompositeMap: a Novel Framework for Music Similarity Measure,” in Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009). ACM, 2009, pp. 403–410.
Y. Zhou, Z. Li, D. Tan, G. Percival, and Y. Wang, “MOGFUN: Musical mObile Group for FUN,” in Proceedings of the 17th ACM International Conference on Multimedia (MM 2009). ACM, 2009, pp. 1005–1006.
B. Zhang and Y. Wang, “Automatic Music Transcription using Audio-Visual Fusion for Violin Practice in Home Environment,” Technical Report, School of Computing, National University of Singapore, 2009.
W. Huang and Y. Wang, “An Optimal Speed Control Scheme Supported by Media Servers for Low-Power Multimedia Applications,” Multim. Syst., vol. 15, no. 2, pp. 113–124, 2009.
W. Huang and Y. Wang, “A Joint Encoder-Decoder Framework for Supporting Energy Efficient Audio Decoding,” Multim. Syst., vol. 15, no. 2, pp. 101–112, 2009.
X. Chen, Z. Zhao, A. Rahmati, Y. Wang, and L. Zhong, “SaVE: Sensor-assisted Motion Estimation for Efficient H.264/AVC Video Encoding,” in Proceedings of the 17th ACM International Conference on Multimedia (MM 2009). ACM, 2009, pp. 381–390.

2008

M. Kan, Y. Wang, D. Iskandar, T. L. Nwe, and A. Shenoy, “LyricAlly: Automatic Synchronization of Textual Lyrics to Acoustic Music Signals,” IEEE Trans. Speech Audio Process., vol. 16, no. 2, pp. 338–349, 2008.
J. Zhu and Y. Wang, “Complexity-Scalable Beat Detection with MP3 Audio Bitstreams,” Comput. Music. J., vol. 32, no. 1, pp. 71–87, 2008.
O. Schleusing, B. Zhang, and Y. Wang, “Onset Detection in Pitched Non-Percussive Music Using Warping-Compensated Correlation,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008). IEEE, 2008, pp. 117–120.
C. Toh, B. Zhang, and Y. Wang, “Multiple-Feature Fusion Based Onset Detection for Solo Singing Voice,” in Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR 2008). 2008, pp. 515–520.
Y. Liu, Y. Wang, A. Shenoy, W. Tsai, and L. Cai, “Clustering Music Recordings by Their Keys,” in Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR 2008). 2008, pp. 319–324.
Y. Wang and B. Zhang, “Application-Specific Music Transcription for Tutoring,” IEEE Multim., vol. 15, no. 3, pp. 70–74, 2008.
H. Lu, B. Zhang, Y. Wang, and W. K. Leow, “iDVT: An Interactive Digital Violin Tutoring System Based on Audio-Visual Fusion,” in Proceedings of the 16th ACM International Conference on Multimedia 2008 (MM 2008). ACM, 2008, pp. 1005–1006.
Y. Huang, S. Chakraborty, and Y. Wang, “Watermarking Video Clips with Workload Information for DVS,” in Proceedings of the 21st International Conference on VLSI Design (VLSI Design 2008). IEEE Computer Society, 2008, pp. 712–717.
G. Hong, A. Rahmati, Y. Wang, and L. Zhong, “SenseCoding: Accelerometer-Assisted Motion Estimation for Efficient Video Encoding,” in Proceedings of the 16th ACM International Conference on Multimedia 2008 (MM 2008). ACM, 2008, pp. 749–752.
S. Chakraborty and Y. Wang, “Multimedia Power Management on a Platter: From Audio to Video & Games,” in Proceedings of the 16th ACM International Conference on Multimedia 2008 (MM 2008). ACM, 2008, pp. 1165–1166.
Y. Huang, G. Hong, A. V. Tran, and Y. Wang, “Decoding-Workload-Aware Video Encoding,” in Proceedings of the Network and Operating System Support for Digital Audio and Video, 18th International Workshop (NOSSDAV 2008). ACM, 2008, pp. 45–50.

2007

T. Kinnunen, B. Zhang, J. Zhu, and Y. Wang, “Speaker Verification with Adaptive Spectral Subband Centroids,” in Proceedings of the International Conference on Advances in Biometrics (ICB 2007). Springer, 2007, pp. 58–66.
J. Zhu and Y. Wang, “Pop Music Beat Detection in the Huffman Coded Domain,” in Proceedings of the 2007 IEEE International Conference on Multimedia and Expo (ICME 2007). IEEE Computer Society, 2007, pp. 60–63.
B. Zhang, J. Zhu, Y. Wang, and W. K. Leow, “Visual Analysis of Fingering for Pedagogical Violin Transcription,” in Proceedings of the 15th ACM International Conference on Multimedia (MM 2007). ACM, 2007, pp. 521–524.
G. Percival, Y. Wang, and G. Tzanetakis, “Effective Use of Multimedia for Computer-Assisted Musical Instrument Tutoring,” in Proceedings of the International Workshop on Educational Multimedia and Multimedia Education (EMME 2007). ACM, 2007, pp. 67–76.
Y. Wang, B. Zhang, and O. Schleusing, “Educational Violin Transcription by Fusing Multimedia Streams,” in Proceedings of the International Workshop on Educational Multimedia and Multimedia Education (EMME 2007). ACM, 2007, pp. 57–66.
Y. Huang, A. V. Tran, and Y. Wang, “A Workload Prediction Model for Decoding MPEG Video and its Application to Workload-scalable Transcoding,” in Proceedings of the 15th ACM International Conference on Multimedia (MM 2007). ACM, 2007, pp. 952–961.
Y. Huang, A. V. Tran, and Y. Wang, “A Compressed Domain Distortion Measure for Fast Video Transcoding,” in Proceedings of the 15th ACM International Conference on Multimedia (MM 2007). ACM, 2007, pp. 787–790.

2006

D. Iskandar, Y. Wang, M. Kan, and H. Li, “Syllabic Level Automatic Synchronization of Music Signals and Text Lyrics,” in Proceedings of the 14th ACM International Conference on Multimedia (MM 2006). ACM, 2006, pp. 659–662.
A. Loscos, Y. Wang, and W. J. J. Boo, “Low Level Descriptors for Automatic Violin Transcription,” in Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006). 2006, pp. 164–167.
W. J. J. Boo, Y. Wang, and A. Loscos, “A Violin Music Transcriber for Personalized Learning,” in Proceedings of the 2006 IEEE International Conference on Multimedia and Expo (ICME 2006). IEEE Computer Society, 2006, pp. 2081–2084.
W. Huang and Y. Wang, “Efficient Partial Spectrum Reconstruction Using an Asymmetric PQMF Algorithm for MPEG-Coded Stereo Audio,” in Proceedings of the 2006 IEEE International Conference on Multimedia and Expo (ICME 2006). IEEE Computer Society, 2006, pp. 901–904.
J. Korhonen, Y. Huang, and Y. Wang, “Generic Forward Error Correction of Short Frames for IP Streaming Applications,” Multim. Tools Appl., vol. 29, no. 3, pp. 305–323, 2006.

2005

A. Shenoy and Y. Wang, “Key, Chord, and Rhythm Tracking of Popular Music Recordings,” Comput. Music. J., vol. 29, no. 3, pp. 75–86, 2005.
A. Shenoy, Y. Wu, and Y. Wang, “Singing Voice Detection for Karaoke Application,” in Proceedings of Visual Communications and Image Processing 2005 (VCIP 2005). SPIE, 2005, pp. 752–762.
J. Yin, T. Sim, Y. Wang, and A. Shenoy, “Music Transcription Using an Instrument Model,” in Proceedings of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005). IEEE, 2005, pp. 217–220.
W. Huang and Y. Wang, “A Method for Separating Drum Objects from Polyphonic Musical Signals,” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2005 (WASPAA 2005). IEEE, 2005, pp. 307–310.
J. Yin, Y. Wang, and D. Hsu, “Digital Violin Tutor: An Integrated System for Beginning Violin Learners,” in Proceedings of the 13th ACM International Conference on Multimedia (MM 2005). ACM, 2005, pp. 976–985.
Y. Huang, S. Chakraborty, and Y. Wang, “Using Offline Bitstream Analysis for Power-Aware Video Decoding in Portable Devices,” in Proceedings of the 13th ACM International Conference on Multimedia (MM 2005). ACM, 2005, pp. 299–302.
J. Korhonen and Y. Wang, “Power-Efficient Streaming for Mobile Terminals,” in Proceedings of the Network and Operating System Support for Digital Audio and Video, 15th International Workshop (NOSSDAV 2005). ACM, 2005, pp. 39–44.
W. Huang, Y. Wang, and S. Chakraborty, “Power-aware Bandwidth and Stereo-image Scalable Audio Decoding,” in Proceedings of the 13th ACM International Conference on Multimedia (MM 2005). ACM, 2005, pp. 291–294.
S. Chakraborty, Y. Wang, and W. Huang, “A Perception-Aware Low-Power Software Audio Decoder for Portable Devices,” in Proceedings of the 2005 3rd Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia 2005). IEEE Computer Society, 2005, pp. 13–18.
J. Korhonen, Y. Wang, and D. Isherwood, “Toward Bandwidth-Efficient and Error-Robust Audio Streaming over Lossy Packet Networks,” Multim. Syst., vol. 10, no. 5, pp. 402–412, 2005.
Y. Huang, J. Korhonen, and Y. Wang, “Optimization of Source and Channel Coding for Voice Over IP,” in Proceedings of the 2005 IEEE International Conference on Multimedia and Expo (ICME 2005). IEEE Computer Society, 2005, pp. 173–176.
J. Korhonen and Y. Wang, “Effect of Packet Size on Loss Rate and Delay in Wireless Links,” in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC 2005). IEEE, 2005, pp. 1608–1613.

2004

T. L. Nwe, A. Shenoy, and Y. Wang, “Singing Voice Detection in Popular Music,” in Proceedings of the 12th ACM International Conference on Multimedia (MM 2004). ACM, 2004, pp. 324–327.
N. C. Maddage, C. Xu, and Y. Wang, “Singer Identification Based on Vocal and Instrumental Models,” in Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004). IEEE Computer Society, 2004, pp. 375–378.
Y. Wang, M. Kan, T. L. Nwe, A. Shenoy, and J. Yin, “LyricAlly: Automatic Synchronization of Acoustic Musical Signals and Textual Lyrics,” in Proceedings of the 12th ACM International Conference on Multimedia (MM 2004). ACM, 2004, pp. 212–219. (Best Student Award)
A. Shenoy, R. Mohapatra, and Y. Wang, “Key Determination of Acoustic Musical Signals,” in Proceedings of the 2004 IEEE International Conference on Multimedia and Expo (ICME 2004). IEEE Computer Society, 2004, pp. 1771–1774.
X. Shao, C. Xu, Y. Wang, and M. S. Kankanhalli, “Automatic Music Summarization in Compressed Domain,” in Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004). IEEE, 2004, pp. 261–264.
T. L. Nwe and Y. Wang, “Automatic Detection of Vocal Segments in Popular Songs,” in Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004). 2004.
J. Yin, A. Dhanik, D. Hsu, and Y. Wang, “The Creation of a Music-Driven Digital Violinist,” in Proceedings of the 12th ACM International Conference on Multimedia (MM 2004). ACM, 2004, pp. 476–479.
Y. Wang, W. Huang, and J. Korhonen, “A Framework for Robust and Scalable Audio Streaming,” in Proceedings of the 12th ACM International Conference on Multimedia (MM 2004). ACM, 2004, pp. 144–151.

2003

N. C. Maddage, C. Xu, and Y. Wang, “An SVM-Based Classification Approach to Musical Audio,” in Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR 2003). 2003.
Y. Wang and M. Vilermo, “Modified Discrete Cosine Transform: Its Implications for Audio Coding and Error Concealment,” Journal of The Audio Engineering Society, vol. 51, pp. 52–61, 2003.
J. Korhonen and Y. Wang, “Schemes for Error Resilient Streaming of Perceptually Coded Audio,” in Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003). IEEE, 2003, pp. 740–743.
M. Vilermo, S. Streich, M. Vaananen, K. Linzmeier, B. Grill, and Y. Wang, “Perceptual Optimization of the Frequency Selective Switch in Scalable Audio Coding,” in Audio Engineering Society Convention 114 (AESC 2003). 2003.
Y. Wang, J. Tang, A. Ahmaniemi, and M. Vaalgamaa, “Parametric Vector Quantization for Coding Percussive Sounds in Music,” in Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003). IEEE, 2003, pp. 652–655.
Y. Wang, A. Ahmaniemi, D. Isherwood, and W. Huang, “Content-Based UEP: A New Scheme for Packet Loss Recovery in Music Streaming,” in Proceedings of the 11th ACM International Conference on Multimedia (MM 2003). ACM, 2003, pp. 412–421.
L. Wyse, Y. Wang, and X. Zhu, “Application of a Content-Based Percussive Sound Synthesizer to Packet Loss Recovery in Music Streaming,” in Proceedings of the 11th ACM International Conference on Multimedia (MM 2003). ACM, 2003, pp. 335–338.

2002

Y. Wang and S. Streich, “A Drumbeat-Pattern Based Error Concealment Method for Music Streaming Applications,” in Proceedings of the 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2002). IEEE, 2002, pp. 2817–2820.

2001

Y. Wang and M. Vilermo, “A Compressed Domain Beat Detector Using MP3 Audio Bitstreams,” in Proceedings of the 9th ACM International Conference on Multimedia (MM 2001). ACM, 2001, pp. 194–202.
Y. Wang, “A Beat-Pattern Based Error Concealment Scheme for Music Delivery with Burst Packet Loss,” in Proceedings of the 2001 IEEE International Conference on Multimedia and Expo (ICME 2001). IEEE Computer Society, 2001, pp. 72–75.

2000

Y. Wang, M. Vilermo, and L. Yaroslavsky, “Energy Compaction Property of the MDCT in Comparison with Other Transforms,” in Proceedings of the Audio Engineering Society Convention 109 (AESC 2000). 2000.
Y. Wang, L. Yaroslavsky, M. Vilermo, and M. Vaananen, “Some Peculiar Properties of the MDCT,” in Proceedings of the 5th International Conference on Signal Processing Proceedings, 16th World Computer Congress (WCC 2000), vol. 1. IEEE, 2000, pp. 61–64.
Y. Wang, L. Yaroslavsky, and M. Vilermo, “On the Relationship between MDCT, SDFT and DFT,” in Proceedings of the 5th International Conference on Signal Processing Proceedings, 16th World Computer Congress (WCC 2000), vol. 1. IEEE, 2000, pp. 44–47.
L. Yaroslavsky and Y. Wang, “DFT, DCT, MDCT, DST and Signal Fourier Spectrum Analysis,” in Proceedings of the 10th European Signal Processing Conference (EUSIPCO 2000). IEEE, 2000, pp. 1–4.
Y. Wang, M. Vilermo, M. Väänänen, and L. Yaroslavsky, “Restructured Audio Encoder for Improved Computational Efficiency,” in Proceedings of the Audio Engineering Society Convention 108 (AESC 2000). 2000.

1999

Y. Wang and M. Vilermo, “Exploiting Excess Masking for Audio Compression,” in Proceedings of the Audio Engineering Society Convention 107 (AESC 1999). 1999.