Helen M. Meng
Person information
- unicode name: 蒙美玲
- affiliation: The Chinese University of Hong Kong
- affiliation (former): Massachusetts Institute of Technology, Cambridge, MA, USA
2020 – today
- 2024
- [j65]Xiaohan Feng, Xixin Wu, Helen Meng:
Injecting Linguistic Knowledge Into BERT for Dialogue State Tracking. IEEE Access 12: 93761-93770 (2024) - [j64]Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Automatic selection of spoken language biomarkers for dementia detection. Neural Networks 169: 191-204 (2024) - [j63]Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing. IEEE ACM Trans. Audio Speech Lang. Process. 32: 517-528 (2024) - [j62]Dongchao Yang, Songxiang Liu, Rongjie Huang, Chao Weng, Helen Meng:
InstructTTS: Modelling Expressive TTS in Discrete Latent Space With Natural Language Style Prompt. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2913-2925 (2024) - [j61]Shujie Hu, Xurong Xie, Mengzhe Geng, Zengrui Jin, Jiajun Deng, Guinan Li, Yi Wang, Mingyu Cui, Tianzi Wang, Helen Meng, Xunying Liu:
Self-Supervised ASR Models and Features for Dysarthric and Elderly Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3561-3575 (2024) - [c425]Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng:
SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes. AAAI 2024: 15267-15275 - [c424]Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng:
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation. ACL (1) 2024: 1946-1965 - [c423]Jincenzi Wu, Zhuang Chen, Jiawen Deng, Sahand Sabour, Helen Meng, Minlie Huang:
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind. ACL (1) 2024: 15984-16007 - [c422]Jiaxiong Hu, Junze Li, Yuhang Zeng, Dongjie Yang, Danxuan Liang, Helen Meng, Xiaojuan Ma:
Designing Scaffolding Strategies for Conversational Agents in Dialog Task of Neurocognitive Disorders Screening. CHI 2024: 70:1-70:21 - [c421]Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng:
Multi-View Midivae: Fusing Track- and Bar-View Representations for Long Multi-Track Symbolic Music Generation. ICASSP 2024: 941-945 - [c420]Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng:
Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. ICASSP 2024: 961-965 - [c419]Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng:
SCNet: Sparse Compression Network for Music Source Separation. ICASSP 2024: 1276-1280 - [c418]Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng:
Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style Information. ICASSP 2024: 8185-8189 - [c417]Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen Meng:
Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models. ICASSP 2024: 8296-8300 - [c416]Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng:
Dual Parameter-Efficient Fine-Tuning for Speaker Representation Via Speaker Prompt Tuning and Adapters. ICASSP 2024: 10751-10755 - [c415]Hui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu, Helen Meng:
Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations. ICASSP 2024: 11141-11145 - [c414]Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng:
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition. ICASSP 2024: 11986-11990 - [c413]Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng:
UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization. ICASSP 2024: 12306-12310 - [c412]Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng:
Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis. ICASSP 2024: 12316-12320 - [c411]Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng:
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. ICASSP 2024: 12341-12345 - [c410]Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. ICASSP 2024: 12577-12581 - [c409]Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng:
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. ICASSP 2024: 12662-12666 - [c408]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Haohan Guo, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Zhou Zhao, Xixin Wu, Helen M. Meng:
UniAudio: Towards Universal Audio Generation with Large Language Models. ICML 2024 - [c407]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. IJCNN 2024: 1-8 - [c406]Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng:
Rethinking Machine Ethics - Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? NAACL-HLT (Findings) 2024: 2227-2242 - [c405]Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, Jim Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. NAACL-HLT (Findings) 2024: 4131-4155 - [i171]Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng:
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition. CoRR abs/2401.04152 (2024) - [i170]Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng:
Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation. CoRR abs/2401.07532 (2024) - [i169]Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng:
UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization. CoRR abs/2401.14664 (2024) - [i168]Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng:
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2401.17796 (2024) - [i167]Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng:
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation. CoRR abs/2402.09267 (2024) - [i166]Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng:
Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information. CoRR abs/2403.05834 (2024) - [i165]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. CoRR abs/2403.16078 (2024) - [i164]Dongchao Yang, Dingdong Wang, Haohan Guo, Xueyuan Chen, Xixin Wu, Helen Meng:
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models. CoRR abs/2406.02328 (2024) - [i163]Haohan Guo, Fenglong Xie, Dongchao Yang, Hui Lu, Xixin Wu, Helen Meng:
Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder. CoRR abs/2406.02940 (2024) - [i162]Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen Meng:
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching. CoRR abs/2406.06326 (2024) - [i161]Xueyuan Chen, Dongchao Yang, Dingdong Wang, Xixin Wu, Zhiyong Wu, Helen Meng:
CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction. CoRR abs/2406.08336 (2024) - [i160]Tianzi Wang, Xurong Xie, Zhaoqing Li, Shoukang Hu, Zengrui Jin, Jiajun Deng, Mingyu Cui, Shujie Hu, Mengzhe Geng, Guinan Li, Helen Meng, Xunying Liu:
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask. CoRR abs/2406.10034 (2024) - [i159]Dongchao Yang, Haohan Guo, Yuanyuan Wang, Rongjie Huang, Xiang Li, Xu Tan, Xixin Wu, Helen Meng:
UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner. CoRR abs/2406.10056 (2024) - [i158]Guinan Li, Jiajun Deng, Youjun Chen, Mengzhe Geng, Shujie Hu, Zhe Li, Zengrui Jin, Tianzi Wang, Xurong Xie, Helen Meng, Xunying Liu:
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition. CoRR abs/2406.10152 (2024) - [i157]Tianhua Zhang, Kun Li, Hongyin Luo, Xixin Wu, James R. Glass, Helen Meng:
Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers. CoRR abs/2406.10991 (2024) - [i156]Jing Xu, Minglin Wu, Xixin Wu, Helen Meng:
Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models. CoRR abs/2406.14092 (2024) - [i155]Jingyan Zhou, Kun Li, Junan Li, Jiawen Kang, Minda Hu, Xixin Wu, Helen Meng:
Purple-teaming LLMs with Adversarial Defender Training. CoRR abs/2407.01850 (2024) - [i154]Mengzhe Geng, Xurong Xie, Jiajun Deng, Zengrui Jin, Guinan Li, Tianzi Wang, Shujie Hu, Zhaoqing Li, Helen Meng, Xunying Liu:
Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation. CoRR abs/2407.06310 (2024) - [i153]Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei:
Autoregressive Speech Synthesis without Vector Quantization. CoRR abs/2407.08551 (2024) - [i152]Lingwei Meng, Jiawen Kang, Yuejiao Wang, Zengrui Jin, Xixin Wu, Xunying Liu, Helen Meng:
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System. CoRR abs/2407.09817 (2024) - [i151]Yuejiao Wang, Xianmin Gong, Lingwei Meng, Xixin Wu, Helen Meng:
Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder. CoRR abs/2407.10376 (2024) - [i150]Weiqin Li, Peiji Yang, Yicheng Zhong, Yixuan Zhou, Zhisheng Wang, Zhiyong Wu, Xixin Wu, Helen Meng:
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models. CoRR abs/2407.13509 (2024) - [i149]Shujie Hu, Xurong Xie, Mengzhe Geng, Zengrui Jin, Jiajun Deng, Guinan Li, Yi Wang, Mingyu Cui, Tianzi Wang, Helen Meng, Xunying Liu:
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition. CoRR abs/2407.13782 (2024) - [i148]Dongchao Yang, Rongjie Huang, Yuanyuan Wang, Haohan Guo, Dading Chong, Songxiang Liu, Xixin Wu, Helen Meng:
SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models. CoRR abs/2408.13893 (2024) - [i147]Haohan Guo, Fenglong Xie, Kun Xie, Dongchao Yang, Dake Guo, Xixin Wu, Helen Meng:
SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis. CoRR abs/2409.00933 (2024) - [i146]Shun Lei, Yixuan Zhou, Boshi Tang, Max W. Y. Lam, Feng Liu, Hangyu Liu, Jingcheng Wu, Shiyin Kang, Zhiyong Wu, Helen Meng:
SongCreator: Lyrics-based Universal Song Generation. CoRR abs/2409.06029 (2024) - [i145]Lingwei Meng, Shujie Hu, Jiawen Kang, Zhaoqing Li, Yuejiao Wang, Wenxuan Wu, Xixin Wu, Xunying Liu, Helen Meng:
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions. CoRR abs/2409.08596 (2024) - [i144]Haohan Guo, Fenglong Xie, Dongchao Yang, Xixin Wu, Helen Meng:
Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation. CoRR abs/2409.11630 (2024) - [i143]Jiawen Kang, Lingwei Meng, Mingyu Cui, Yuejiao Wang, Xixin Wu, Xunying Liu, Helen Meng:
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC. CoRR abs/2409.12388 (2024) - [i142]Yuanyuan Wang, Hangting Chen, Dongchao Yang, Zhiyong Wu, Helen Meng, Xixin Wu:
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions. CoRR abs/2409.12560 (2024) - [i141]Jiawen Kang, Dongrui Han, Lingwei Meng, Jingyan Zhou, Jinchao Li, Xixin Wu, Helen Meng:
Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech. CoRR abs/2409.16322 (2024) - 2023
- [j60]King Woon Yau, Ching Sing Chai, Thomas K. F. Chiu, Helen Meng, Irwin King, Yeung Yam:
A phenomenographic approach on teacher conceptions of teaching Artificial Intelligence (AI) in K-12 schools. Educ. Inf. Technol. 28(1): 1041-1064 (2023) - [j59]Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang, Helen Meng:
Meta-Generalization for Domain-Invariant Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1024-1036 (2023) - [j58]Haohan Guo, Fenglong Xie, Xixin Wu, Frank K. Soong, Helen Meng:
MSMC-TTS: Multi-Stage Multi-Codebook VQ-VAE Based Neural TTS. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1811-1824 (2023) - [j57]Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu:
Audio-Visual End-to-End Multi-Channel Speech Separation, Dereverberation and Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2707-2723 (2023) - [j56]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng:
MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3290-3303 (2023) - [j55]Xixin Wu, Hui Lu, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng:
Hiformer: Sequence Modeling Networks With Hierarchical Attention Mechanisms. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3993-4003 (2023) - [c404]Yunrui Cai, Changhe Song, Boshi Tang, Dongyang Dai, Zhiyong Wu, Helen Meng:
Robust Representation Learning for Speech Emotion Recognition with Moment Exchange. APSIPA ASC 2023: 1002-1007 - [c403]Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Jointly Modelling Transcriptions and Phonemes with Optimal Features to Detect Dementia from Spontaneous Cantonese. APSIPA ASC 2023: 2267-2273 - [c402]Haibin Wu, Jiawen Kang, Lingwei Meng, Helen Meng, Hung-yi Lee:
The Defender's Perspective on Automatic Speaker Verification: An Overview. DADA@IJCAI 2023: 6-11 - [c401]Hongyin Luo, Tianhua Zhang, Yung-Sung Chuang, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, James R. Glass:
Search Augmented Instruction Learning. EMNLP (Findings) 2023: 3717-3729 - [c400]Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng:
SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting. EMNLP (Findings) 2023: 13348-13369 - [c399]Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng:
Inter-Subnet: Speech Enhancement with Subband Interaction. ICASSP 2023: 1-5 - [c398]Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng:
Exploring Self-Supervised Pre-Trained ASR Models for Dysarthric and Elderly Speech Recognition. ICASSP 2023: 1-5 - [c397]Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Feature Selection and Text Embedding for Detecting Dementia from Spontaneous Cantonese. ICASSP 2023: 1-5 - [c396]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Context-Aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis. ICASSP 2023: 1-5 - [c395]Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng:
Discriminative Speaker Representation Via Contrastive Learning with Class-Aware Attention in Angular Space. ICASSP 2023: 1-5 - [c394]Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng:
Leveraging Pretrained Representations With Task-Related Keywords for Alzheimer's Disease Detection. ICASSP 2023: 1-5 - [c393]Jinchao Li, Xixin Wu, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng:
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition. ICASSP 2023: 1-5 - [c392]Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu, Yujun Wang, Helen Meng:
Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction. ICASSP 2023: 1-5 - [c391]Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng:
A Sidecar Separator Can Convert A Single-Talker Speech Recognition System to A Multi-Talker One. ICASSP 2023: 1-5 - [c390]Jie Tan, Hengyi Cai, Hongshen Chen, Hong Cheng, Helen Meng, Zhuoye Ding:
Contrastive Learning with Dialogue Attributes for Neural Dialogue Generation. ICASSP 2023: 1-5 - [c389]Weinan Tong, Jiaxu Zhu, Jun Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
TFCnet: Time-Frequency Domain Corrector for Speech Separation. ICASSP 2023: 1-5 - [c388]Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng:
Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer's Disease Detection. ICASSP 2023: 1-5 - [c387]Zilin Wang, Peng Liu, Jun Chen, Sipan Li, Jinfeng Bai, Gang He, Zhiyong Wu, Helen Meng:
A Synthetic Corpus Generation Method for Neural Vocoder Training. ICASSP 2023: 1-5 - [c386]Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng:
DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification. ICASSP 2023: 1-5 - [c385]Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng:
CB-Conformer: Contextual Biasing Conformer for Biased Word Recognition. ICASSP 2023: 1-5 - [c384]Yujie Yang, Kun Zhang, Zhiyong Wu, Helen Meng:
Keyword-Specific Acoustic Model Pruning for Open-Vocabulary Keyword Spotting. ICASSP 2023: 1-5 - [c383]Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training. ICASSP 2023: 1-5 - [c382]Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng:
GTN-Bailando: Genre Consistent long-Term 3D Dance Generation Based on Pre-Trained Genre Token Network. ICASSP 2023: 1-5 - [c381]Tian Bian, Yuli Jiang, Jia Li, Tingyang Xu, Yu Rong, Yi Su, Timothy C. Y. Kwok, Helen Meng, Hong Cheng:
Decision Support System for Chronic Diseases Based on Drug-Drug Interactions. ICDE 2023: 3467-3480 - [c380]Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng:
Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation-based Voice Conversion. ICME 2023: 1691-1696 - [c379]Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng:
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. ICME 2023: 1703-1708 - [c378]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. INTERSPEECH 2023: 1334-1338 - [c377]Helen Meng, Brian Mak, Man-Wai Mak, Helene H. Fung, Xianmin Gong, Timothy C. Y. Kwok, Xunying Liu, Vincent C. T. Mok, Patrick C. M. Wong, Jean Woo, Xixin Wu, Ka Ho Wong, Sean Shensheng Xu, Naijun Zheng, Ranzo Huang, Jiawen Kang, Xiaoquan Ke, Junan Li, Jinchao Li, Yi Wang:
Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders. INTERSPEECH 2023: 1713-1717 - [c376]Tianzi Wang, Shoukang Hu, Jiajun Deng, Zengrui Jin, Mengzhe Geng, Yi Wang, Helen Meng, Xunying Liu:
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition. INTERSPEECH 2023: 1733-1737 - [c375]Mengzhe Geng, Xurong Xie, Rongfeng Su, Jianwei Yu, Zengrui Jin, Tianzi Wang, Shujie Hu, Zi Ye, Helen Meng, Xunying Liu:
On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition. INTERSPEECH 2023: 1753-1757 - [c374]Yunxiang Li, Pengfei Liu, Xixin Wu, Helen Meng:
PunCantonese: A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts. INTERSPEECH 2023: 2183-2187 - [c373]Shujie Hu, Xurong Xie, Mengzhe Geng, Mingyu Cui, Jiajun Deng, Guinan Li, Tianzi Wang, Helen Meng, Xunying Liu:
Exploiting Cross-Domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition. INTERSPEECH 2023: 2313-2317 - [c372]Jiaxu Zhu, Changhe Song, Zhiyong Wu, Helen Meng:
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge. INTERSPEECH 2023: 3272-3276 - [c371]Weiqin Li, Shun Lei, Qiaochu Huang, Yixuan Zhou, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis. INTERSPEECH 2023: 3377-3381 - [c370]Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng:
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator. INTERSPEECH 2023: 3467-3471 - [c369]Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng:
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. INTERSPEECH 2023: 4858-4862 - [c368]Jianan Li, Yueming Jin, Yueyao Chen, Hon-Chi Yip, Markus Scheppach, Philip Wai Yan Chiu, Yeung Yam, Helen Mei-Ling Meng, Qi Dou:
Imitation Learning from Expert Video Data for Dissection Trajectory Prediction in Endoscopic Surgical Procedure. MICCAI (9) 2023: 494-504 - [c367]Hui Lu, Xixin Wu, Zhiyong Wu, Helen Meng:
SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody. ACM Multimedia 2023: 2829-2837 - [c366]Yuan Xu, Ching Sing Chai, Helen Meng, Savio Wai-Ho Wong, King Woon Yau, Thomas K. F. Chiu, Irwin King, Yeung Yam:
An experiential learning approach to learn AI in an online workshop. TALE 2023: 1-6 - [i140]Hang Su, Borislav Dzodzo, Changlun Li, Danyang Zhao, Hao Geng, Yunxiang Li, Sidharth Jaggi, Helen Meng:
Learning Analytics from Spoken Discussion Dialogs in Flipped Classroom. CoRR abs/2301.12399 (2023) - [i139]Dongchao Yang, Songxiang Liu, Rongjie Huang, Guangzhi Lei, Chao Weng, Helen Meng, Dong Yu:
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt. CoRR abs/2301.13662 (2023) - [i138]HoLam Chung, Junan Li, Pengfei Liu, Wai-Kim Leung, Xixin Wu, Helen Meng:
Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition. CoRR abs/2302.00836 (2023) - [i137]Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng:
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One. CoRR abs/2302.09908 (2023) - [i136]Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng:
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition. CoRR abs/2302.14564 (2023) - [i135]Tian Bian, Yuli Jiang, Jia Li, Tingyang Xu, Yu Rong, Yi Su, Timothy C. Y. Kwok, Helen Meng, Hong Cheng:
Decision Support System for Chronic Diseases Based on Drug-Drug Interactions. CoRR abs/2303.02405 (2023) - [i134]Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng:
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection. CoRR abs/2303.08019 (2023) - [i133]Jinchao Li, Xixin Wu, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng:
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition. CoRR abs/2303.08027 (2023) - [i132]Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James R. Glass:
Interpretable Unified Language Checking. CoRR abs/2304.03728 (2023) - [i131]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis. CoRR abs/2304.06359 (2023) - [i130]Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng:
CB-Conformer: Contextual biasing Conformer for biased word recognition. CoRR abs/2304.09607 (2023) - [i129]Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng:
GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network. CoRR abs/2304.12704 (2023) - [i128]Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing. CoRR abs/2305.05203 (2023) - [i127]Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng:
Inter-SubNet: Speech Enhancement with Subband Interaction. CoRR abs/2305.05599 (2023) - [i126]Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng:
SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting. CoRR abs/2305.09067 (2023) - [i125]Xintao Zhao, Shuai Wang, Yang Chao, Zhiyong Wu, Helen Meng:
Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion. CoRR abs/2305.09167 (2023) - [i124]Haibin Wu, Jiawen Kang, Lingwei Meng, Helen Meng, Hung-yi Lee:
The defender's perspective on automatic speaker verification: An overview. CoRR abs/2305.12804 (2023) - [i123]Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox, Helen Meng, James R. Glass:
SAIL: Search-Augmented Instruction Learning. CoRR abs/2305.15225 (2023) - [i122]Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng:
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator. CoRR abs/2305.16263 (2023) - [i121]Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng:
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model. CoRR abs/2305.16749 (2023) - [i120]Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu, Yujun Wang, Helen Meng:
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction. CoRR abs/2306.14170 (2023) - [i119]Tianzi Wang, Shoukang Hu, Jiajun Deng, Zengrui Jin, Mengzhe Geng, Yi Wang, Helen Meng, Xunying Liu:
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition. CoRR abs/2306.15265 (2023) - [i118]Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu:
Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition. CoRR abs/2307.02909 (2023) - [i117]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Xixin Wu, Shiyin Kang, Helen Meng:
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis. CoRR abs/2307.16012 (2023) - [i116]Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng:
Rethinking Machine Ethics - Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? CoRR abs/2308.15399 (2023) - [i115]Yi Meng, Xiang Li, Zhiyong Wu, Tingtian Li, Zixun Sun, Xinyu Xiao, Chi Sun, Hui Zhan, Helen Meng:
CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis. CoRR abs/2308.16021 (2023) - [i114]Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng:
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. CoRR abs/2308.16577 (2023) - [i113]Weiqin Li, Shun Lei, Qiaochu Huang, Yixuan Zhou, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis. CoRR abs/2308.16593 (2023) - [i112]Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information. CoRR abs/2308.16836 (2023) - [i111]Haohan Guo, Fenglong Xie, Jiawen Kang, Yujia Xiao, Xixin Wu, Helen Meng:
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning. CoRR abs/2309.00126 (2023) - [i110]Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training. CoRR abs/2309.00284 (2023) - [i109]Jiaxu Zhu, Changhe Song, Zhiyong Wu, Helen M. Meng:
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge. CoRR abs/2309.01437 (2023) - [i108]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen M. Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. CoRR abs/2309.02459 (2023) - [i107]Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng:
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. CoRR abs/2309.07803 (2023) - [i106]Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James R. Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. CoRR abs/2309.10814 (2023) - [i105]Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng:
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. CoRR abs/2309.11977 (2023) - [i104]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023) - [i103]Qichao Wang, Tian Bian, Yian Yin, Tingyang Xu, Hong Cheng, Helen M. Meng, Zibin Zheng, Liang Chen, Bingzhe Wu:
Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale. CoRR abs/2310.11778 (2023) - [i102]Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng:
DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification. CoRR abs/2310.12111 (2023) - [i101]Xiaohan Feng, Xixin Wu, Helen Meng:
Injecting linguistic knowledge into BERT for Dialogue State Tracking. CoRR abs/2311.15623 (2023) - [i100]Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. CoRR abs/2312.04919 (2023) - [i99]Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng:
SimCalib: Graph Neural Network Calibration based on Similarity between Nodes. CoRR abs/2312.11858 (2023) - [i98]Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng:
StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis. CoRR abs/2312.12181 (2023) - [i97]Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng:
Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. CoRR abs/2312.15463 (2023) - [i96]Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen M. Meng:
Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models. CoRR abs/2312.15567 (2023)
- 2022
- [j54]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. IEEE ACM Trans. Audio Speech Lang. Process. 30: 202-217 (2022) - [j53]Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1093-1107 (2022) - [j52]Mengzhe Geng, Xurong Xie, Zi Ye, Tianzi Wang, Guinan Li, Shujie Hu, Xunying Liu, Helen Meng:
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2597-2611 (2022) - [j51]Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Neural Network Language Modeling for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2900-2917 (2022) - [j50]Thomas K. F. Chiu, Helen Meng, Ching-Sing Chai, Irwin King, Savio Wong, Yeung Yam:
Creation and Evaluation of a Pretertiary Artificial Intelligence (AI) Curriculum. IEEE Trans. Educ. 65(1): 30-39 (2022) - [c365]Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng:
On Controlling Fallback Responses for Grounded Dialogue Generation. ACL (Findings) 2022: 2591-2601 - [c364]Kun Li, Tianhua Zhang, Liping Tang, Junan Li, Hongyuan Lu, Xixin Wu, Helen Meng:
Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction, and Passage Dropout. DialDoc@ACL 2022: 123-129 - [c363]Zijian Ding, Jiawen Kang, Tinky Oi Ting Ho, Ka Ho Wong, Helene H. Fung, Helen Meng, Xiaojuan Ma:
TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening. CHI 2022: 304:1-304:19 - [c362]Xueyuan Chen, Shun Lei, Zhiyong Wu, Dong Xu, Weifeng Zhao, Helen Meng:
Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis. COLING 2022: 7193-7202 - [c361]Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng:
Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark. EMNLP (Findings) 2022: 3576-3591 - [c360]Jiawen Deng, Jingyan Zhou, Hao Sun, Chujie Zheng, Fei Mi, Helen Meng, Minlie Huang:
COLD: A Benchmark for Chinese Offensive Language Detection. EMNLP 2022: 11580-11599 - [c359]Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Adversarial Sample Detection for Speaker Verification by Neural Vocoders. ICASSP 2022: 236-240 - [c358]Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng:
Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning. ICASSP 2022: 3164-3168 - [c357]Guinan Li, Jianwei Yu, Jiajun Deng, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Speech Separation, Dereverberation and Recognition. ICASSP 2022: 6042-6046 - [c356]Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng:
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation. ICASSP 2022: 6677-6681 - [c355]Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng:
Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition. ICASSP 2022: 6747-6751 - [c354]Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Neural Architecture Search for Speech Emotion Recognition. ICASSP 2022: 6902-6906 - [c353]Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng:
Disentangling Content and Fine-Grained Prosody Information Via Hybrid ASR Bottleneck Features for Voice Conversion. ICASSP 2022: 7022-7026 - [c352]Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng:
An End-to-End Chinese Text Normalization Model Based on Rule-Guided Flat-Lattice Transformer. ICASSP 2022: 7122-7126 - [c351]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. ICASSP 2022: 7252-7256 - [c350]Junhao Xu, Jianwei Yu, Xunying Liu, Helen Meng:
Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition. ICASSP 2022: 7297-7301 - [c349]Naijun Zheng, Na Li, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng:
Multi-Channel Speaker Diarization Using Spatial Features for Meetings. ICASSP 2022: 7337-7341 - [c348]Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng:
A Character-Level Span-Based Model for Mandarin Prosodic Structure Prediction. ICASSP 2022: 7602-7606 - [c347]Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng:
FullSubNet+: Channel Attention Fullsubnet with Complex Spectrograms for Speech Enhancement. ICASSP 2022: 7857-7861 - [c346]Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-Based Multi-Modal Context Modeling. ICASSP 2022: 7917-7921 - [c345]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. ICASSP 2022: 7922-7926 - [c344]Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Neufa: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. ICASSP 2022: 8007-8011 - [c343]Hang Su, Danyang Zhao, Long Dang, Minglei Li, Xixin Wu, Xunying Liu, Helen Meng:
A Multitask Learning Framework for Speaker Change Detection with Content Information from Unsupervised Speech Decomposition. ICASSP 2022: 8087-8091 - [c342]Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng:
The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. ICASSP 2022: 9161-9165 - [c341]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery. ICASSP 2022: 9236-9240 - [c340]Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. INTERSPEECH 2022: 306-310 - [c339]Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng:
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information. INTERSPEECH 2022: 426-430 - [c338]Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng:
Speech Enhancement with Fullband-Subband Cross-Attention Network. INTERSPEECH 2022: 976-980 - [c337]Haohan Guo, Hui Lu, Xixin Wu, Helen Meng:
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS. INTERSPEECH 2022: 1566-1570 - [c336]Haohan Guo, Feng-Long Xie, Frank K. Soong, Xixin Wu, Helen Meng:
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS. INTERSPEECH 2022: 1611-1615 - [c335]Jinchao Li, Shuai Wang, Yang Chao, Xunying Liu, Helen Meng:
Context-aware Multimodal Fusion for Emotion Recognition. INTERSPEECH 2022: 2013-2017 - [c334]Junhao Xu, Shoukang Hu, Xunying Liu, Helen Meng:
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus. INTERSPEECH 2022: 2128-2132 - [c333]Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Automatic Selection of Discriminative Features for Dementia Detection in Cantonese-Speaking People. INTERSPEECH 2022: 2153-2157 - [c332]Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng:
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion. INTERSPEECH 2022: 2553-2557 - [c331]Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. INTERSPEECH 2022: 2573-2577 - [c330]Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Mengzhe Geng, Guinan Li, Xunying Liu, Helen Meng:
Confidence Score Based Conformer Speaker Adaptation for Speech Recognition. INTERSPEECH 2022: 2623-2627 - [c329]Mingyu Cui, Jiajun Deng, Shoukang Hu, Xurong Xie, Tianzi Wang, Shujie Hu, Mengzhe Geng, Boyang Xue, Xunying Liu, Helen Meng:
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems. INTERSPEECH 2022: 3158-3162 - [c328]Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:
Exploring linguistic feature and model combination for speech recognition based automatic AD detection. INTERSPEECH 2022: 3328-3332 - [c327]Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information. INTERSPEECH 2022: 4292-4296 - [c326]Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Spoofing-Aware Speaker Verification by Multi-Level Fusion. INTERSPEECH 2022: 4357-4361 - [c325]Tianzi Wang, Jiajun Deng, Mengzhe Geng, Zi Ye, Shoukang Hu, Yi Wang, Mingyu Cui, Zengrui Jin, Xunying Liu, Helen Meng:
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection. INTERSPEECH 2022: 4825-4829 - [c324]Yixuan Zhou, Changhe Song, Jingbei Li, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis. INTERSPEECH 2022: 5518-5522 - [c323]Shun Lei, Yixuan Zhou, Liyang Chen, Jiankun Hu, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. INTERSPEECH 2022: 5523-5527 - [c322]Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng:
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset. INTERSPEECH 2022: 5528-5532 - [c321]Yi Meng, Xiang Li, Zhiyong Wu, Tingtian Li, Zixun Sun, Xinyu Xiao, Chi Sun, Hui Zhan, Helen Meng:
CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis. INTERSPEECH 2022: 5533-5537 - [c320]HoLam Chung, Junan Li, Pengfei Liu, Wai-Kim Leung, Xixin Wu, Helen Meng:
Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition. ISCSLP 2022: 26-30 - [c319]Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu, Helen Meng:
HILvoice: Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis. ISCSLP 2022: 86-90 - [c318]Chenyi Li, Zhiyong Wu, Wei Rao, Yannan Wang, Helen Meng:
Boosting the Performance of SpEx+ by Attention and Contextual Mechanism. ISCSLP 2022: 135-139 - [c317]Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. ACM Multimedia 2022: 5811-5820 - [c316]Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406 - [c315]Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng:
Partner Personas Generation for Dialogue Response Generation. NAACL-HLT 2022: 5200-5212 - [c314]Jingyan Zhou, Fei Mi, Helen Meng, Jiawen Deng:
Overview of NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement. NLPCC (2) 2022: 342-350 - [c313]Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. Odyssey 2022: 92-99 - [c312]Jixiu Li, Yisen Huang, Wing Yin Ng, Truman Cheng, Xixin Wu, Qi Dou, Helen Meng, Pheng-Ann Heng, Yunhui Liu, Shannon Melissa Chan, David Navarro-Alarcon, Calvin Sze Hang Ng, Philip Wai Yan Chiu, Zheng Li:
Speech-Vision Based Multi-Modal AI Control of a Magnetic Anchored and Actuated Endoscope. ROBIO 2022: 403-408 - [c311]Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng:
Toward Self-Learning End-to-End Task-oriented Dialog Systems. SIGDIAL 2022: 516-530 - [c310]Xuanjun Chen, Haibin Wu, Helen Meng, Hung-yi Lee, Jyh-Shing Roger Jang:
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection. SLT 2022: 692-699 - [c309]Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng:
Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE. SLT 2022: 814-821 - [c308]King Woon Yau, Ching Sing Chai, Thomas K. F. Chiu, Helen Meng, Irwin King, Savio Wai-Ho Wong, Chandni Saxena, Yeung Yam:
Developing an AI literacy test for junior secondary students: The first stage. TALE 2022: 59-64 - [c307]Yang Deng, Wenxuan Zhang, Wai Lam, Hong Cheng, Helen Meng:
User Satisfaction Estimation with Sequential Dialogue Act Modeling in Goal-oriented Conversational Systems. WWW 2022: 2998-3008 - [e8]Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang:
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3 [contents] - [i95]Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks. CoRR abs/2201.03943 (2022) - [i94]Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng:
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition. CoRR abs/2201.05554 (2022) - [i93]Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng:
Investigation of Data Augmentation Techniques for Disordered Speech Recognition. CoRR abs/2201.05562 (2022) - [i92]Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng:
Recent Progress in the CUHK Dysarthric Speech Recognition System. CoRR abs/2201.05845 (2022) - [i91]Pengfei Liu, Kun Li, Helen Meng:
Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition. CoRR abs/2201.06309 (2022) - [i90]Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng:
Toward Self-Learning End-to-End Dialog Systems. CoRR abs/2201.06849 (2022) - [i89]Jingyan Zhou, Xiaohan Feng, King Keung Wu, Helen Meng:
Convex Polytope Modelling for Unsupervised Derivation of Semantic Structure for Data-efficient Natural Language Understanding. CoRR abs/2201.10588 (2022) - [i88]Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng:
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge. CoRR abs/2202.01986 (2022) - [i87]Yang Deng, Wenxuan Zhang, Wai Lam, Hong Cheng, Helen Meng:
User Satisfaction Estimation with Sequential Dialogue Act Modeling in Goal-oriented Conversational Systems. CoRR abs/2202.02912 (2022) - [i86]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery. CoRR abs/2202.06684 (2022) - [i85]Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng:
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks. CoRR abs/2202.08011 (2022) - [i84]Zijian Ding, Jiawen Kang, Tinky Oi Ting Ho, Ka Ho Wong, Helene H. Fung, Helen Meng, Xiaojuan Ma:
TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening. CoRR abs/2202.08216 (2022) - [i83]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion. CoRR abs/2202.09081 (2022) - [i82]Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng:
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation. CoRR abs/2202.09082 (2022) - [i81]Mengzhe Geng, Xurong Xie, Zi Ye, Tianzi Wang, Guinan Li, Shujie Hu, Xunying Liu, Helen Meng:
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition. CoRR abs/2202.10290 (2022) - [i80]Haohan Guo, Hui Lu, Xixin Wu, Helen Meng:
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS. CoRR abs/2203.01080 (2022) - [i79]Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng:
Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition. CoRR abs/2203.10274 (2022) - [i78]Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng:
FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement. CoRR abs/2203.12188 (2022) - [i77]Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. CoRR abs/2203.12201 (2022) - [i76]Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng:
Disentangling Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion. CoRR abs/2203.12813 (2022) - [i75]Mengzhe Geng, Xurong Xie, Rongfeng Su, Jianwei Yu, Zi Ye, Xunying Liu, Helen Meng:
On-the-fly Feature Based Speaker Adaptation for Dysarthric and Elderly Speech Recognition. CoRR abs/2203.14593 (2022) - [i74]Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-Yi Lee, Helen Meng:
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. CoRR abs/2203.15249 (2022) - [i73]Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Spoofing-Aware Speaker Verification by Multi-Level Fusion. CoRR abs/2203.15377 (2022) - [i72]Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. CoRR abs/2203.16838 (2022) - [i71]Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng:
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction. CoRR abs/2203.16922 (2022) - [i70]Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Neural Architecture Search for Speech Emotion Recognition. CoRR abs/2203.16928 (2022) - [i69]Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng:
An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer. CoRR abs/2203.16954 (2022) - [i68]Yixuan Zhou, Changhe Song, Xiang Li, Luwen Zhang, Zhiyong Wu, Yanyao Bian, Dan Su, Helen Meng:
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis. CoRR abs/2204.00990 (2022) - [i67]Guinan Li, Jianwei Yu, Jiajun Deng, Xunying Liu, Helen Meng:
Audio-visual multi-channel speech separation, dereverberation and recognition. CoRR abs/2204.01977 (2022) - [i66]Shun Lei, Yixuan Zhou, Liyang Chen, Jiankun Hu, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. CoRR abs/2204.02743 (2022) - [i65]Chandni Saxena, Mudit Chaudhary, Helen Meng:
Cross-lingual Word Embeddings in Hyperbolic Space. CoRR abs/2205.01907 (2022) - [i64]Shujie Hu, Xurong Xie, Mengzhe Geng, Mingyu Cui, Jiajun Deng, Tianzi Wang, Xunying Liu, Helen Meng:
Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition. CoRR abs/2206.07327 (2022) - [i63]Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. CoRR abs/2206.09131 (2022) - [i62]Mingyu Cui, Jiajun Deng, Shoukang Hu, Xurong Xie, Tianzi Wang, Shujie Hu, Mengzhe Geng, Boyang Xue, Xunying Liu, Helen Meng:
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems. CoRR abs/2206.11596 (2022) - [i61]Junhao Xu, Shoukang Hu, Xunying Liu, Helen Meng:
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus. CoRR abs/2206.11643 (2022) - [i60]Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Mengzhe Geng, Guinan Li, Xunying Liu, Helen Meng:
Confidence Score Based Conformer Speaker Adaptation for Speech Recognition. CoRR abs/2206.12045 (2022) - [i59]Tianzi Wang, Jiajun Deng, Mengzhe Geng, Zi Ye, Shoukang Hu, Yi Wang, Mingyu Cui, Zengrui Jin, Xunying Liu, Helen Meng:
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection. CoRR abs/2206.13232 (2022) - [i58]Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:
Exploring linguistic feature and model combination for speech recognition based automatic AD detection. CoRR abs/2206.13758 (2022) - [i57]Xiang Li, Changhe Song, Xianhao Wei, Zhiyong Wu, Jia Jia, Helen Meng:
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset. CoRR abs/2208.05359 (2022) - [i56]Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng:
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion. CoRR abs/2208.08757 (2022) - [i55]Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Neural Network Language Modeling for Speech Recognition. CoRR abs/2208.13259 (2022) - [i54]Haohan Guo, Feng-Long Xie, Frank K. Soong, Xixin Wu, Helen Meng:
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS. CoRR abs/2209.10887 (2022) - [i53]Xuanjun Chen, Haibin Wu, Helen Meng, Hung-yi Lee, Jyh-Shing Roger Jang:
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection. CoRR abs/2210.00753 (2022) - [i52]Liping Tang, Zhen Li, Zhiquan Luo, Helen Meng:
Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow Interpolation. CoRR abs/2210.03319 (2022) - [i51]Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng:
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE. CoRR abs/2210.13771 (2022) - [i50]Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng:
Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations. CoRR abs/2210.15131 (2022) - [i49]Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng:
Exploiting prompt learning with pre-trained language models for Alzheimer's Disease detection. CoRR abs/2210.16539 (2022) - [i48]Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng:
Discriminative Speaker Representation via Contrastive Learning with Class-Aware Attention in Angular Space. CoRR abs/2210.16622 (2022) - [i47]Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng:
Speech Enhancement with Fullband-Subband Cross-Attention Network. CoRR abs/2211.05432 (2022)
- 2021
- [j49]Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Helen Meng:
Exemplar-Based Emotive Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 29: 874-886 (2021) - [j48]Shoukang Hu, Xurong Xie, Shansong Liu, Jianwei Yu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1514-1529 (2021) - [j47]Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng:
Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1717-1728 (2021) - [j46]Jianwei Yu, Shi-Xiong Zhang, Bo Wu, Shansong Liu, Shoukang Hu, Mengzhe Geng, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2067-2082 (2021) - [j45]Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng:
Recent Progress in the CUHK Dysarthric Speech Recognition System. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2267-2281 (2021) - [j44]Xixin Wu, Yuewen Cao, Hui Lu, Songxiang Liu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng:
Speech Emotion Recognition Using Sequential Capsule Networks. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3280-3291 (2021) - [j43]Junhao Xu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng:
Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3679-3693 (2021) - [c306]Xiaoquan Ke, Man-Wai Mak, Jinchao Li, Helen M. Meng:
Dual Dropout Ranking of Linguistic Features for Alzheimer's Disease Recognition. APSIPA ASC 2021: 743-749 - [c305]Sean Shensheng Xu, Man-Wai Mak, Ka Ho Wong, Helen Meng, Timothy C. Y. Kwok:
Speaker Turn Aware Similarity Scoring for Diarization of Speech-Based Cognitive Assessments. APSIPA ASC 2021: 1299-1304 - [c304]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. APSIPA ASC 2021: 1433-1437 - [c303]Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng:
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion. ASRU 2021: 741-748 - [c302]Aolan Sun, Jianzong Wang, Ning Cheng, Methawee Tantrawenith, Zhiyong Wu, Helen Meng, Edward Xiao, Jing Xiao:
Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples. ASRU 2021: 946-953 - [c301]Disong Wang, Liqun Deng, Yang Zhang, Nianzu Zheng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng:
Fcl-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech Synthesis. ICASSP 2021: 5714-5718 - [c300]Xiong Cai, Dongyang Dai, Zhiyong Wu, Xiang Li, Jingbei Li, Helen Meng:
Emotion Controllable Speech Synthesis Using Emotion-Unlabeled Dataset with the Assistance of Cross-Domain Speech Emotion Recognition. ICASSP 2021: 5734-5738 - [c299]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen M. Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. ICASSP 2021: 5894-5898 - [c298]Changhe Song, Jingbei Li, Yixuan Zhou, Zhiyong Wu, Helen M. Meng:
Syntactic Representation Learning For Neural Network Based TTS with Syntactic Parse Tree Traversal. ICASSP 2021: 6064-6068 - [c297]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2Net Architecture. ICASSP 2021: 6354-6358 - [c296]Jinchao Li, Jianwei Yu, Zi Ye, Simon Wong, Man-Wai Mak, Brian Mak, Xunying Liu, Helen Meng:
A Comparative Study of Acoustic and Linguistic Features Classification for Alzheimer's Disease Detection. ICASSP 2021: 6423-6427 - [c295]Zi Ye, Shoukang Hu, Jinchao Li, Xurong Xie, Mengzhe Geng, Jianwei Yu, Junhao Xu, Boyang Xue, Shansong Liu, Xunying Liu, Helen Meng:
Development of the Cuhk Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the Dementiabank Corpus. ICASSP 2021: 6433-6437 - [c294]Naijun Zheng, Na Li, Bo Wu, Meng Yu, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng:
A Joint Training Framework of Multi-Look Separator and Speaker Embedding Extractor for Overlapped Speech. ICASSP 2021: 6698-6702 - [c293]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models. ICASSP 2021: 6718-6722 - [c292]Shoukang Hu, Xurong Xie, Shansong Liu, Mingyu Cui, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. ICASSP 2021: 6758-6762 - [c291]Boyang Xue, Jianwei Yu, Junhao Xu, Shansong Liu, Shoukang Hu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Transformer Language Models for Speech Recognition. ICASSP 2021: 7378-7382 - [c290]Junhao Xu, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng:
Mixed Precision Quantization of Transformer Language Models for Speech Recognition. ICASSP 2021: 7383-7387 - [c289]Jie Wang, Yuren You, Feng Liu, Deyi Tuo, Shiyin Kang, Zhiyong Wu, Helen Meng:
The Huya Multi-Speaker and Multi-Style Speech Synthesis System for M2voc Challenge 2020. ICASSP 2021: 8608-8612 - [c288]Songxiang Liu, Yuewen Cao, Na Hu, Dan Su, Helen Meng:
Fastsvc: Fast Cross-Domain Singing Voice Conversion With Feature-Wise Linear Modulation. ICME 2021: 1-6 - [c287]Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Shiyin Kang, Helen Meng:
Adversarially Learning Disentangled Speech Representations for Robust Multi-Factor Voice Conversion. Interspeech 2021: 846-850 - [c286]Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng:
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-Shot Voice Conversion. Interspeech 2021: 1344-1348 - [c285]Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng:
Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization. Interspeech 2021: 2956-2960 - [c284]Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng:
VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis. Interspeech 2021: 3775-3779 - [c283]Minglin Wu, Kun Li, Wai-Kim Leung, Helen Meng:
Transformer Based End-to-End Mispronunciation Detection and Diagnosis. Interspeech 2021: 3954-3958 - [c282]Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng:
Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks. Interspeech 2021: 4314-4318 - [c281]Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng:
Towards Multi-Scale Style Control for Expressive Speech Synthesis. Interspeech 2021: 4673-4677 - [c280]Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng:
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition. Interspeech 2021: 4793-4797 - [c279]Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng:
Adversarial Data Augmentation for Disordered Speech Recognition. Interspeech 2021: 4803-4807 - [c278]Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng:
Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion. Interspeech 2021: 4813-4817 - [c277]Jiajun Deng, Fabian Ritter Gutierrez, Shoukang Hu, Mengzhe Geng, Xurong Xie, Zi Ye, Shansong Liu, Jianwei Yu, Xunying Liu, Helen Meng:
Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition. Interspeech 2021: 4818-4822 - [c276]Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng:
Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network. ISCSLP 2021: 1-5 - [c275]Yuewen Cao, Songxiang Liu, Shiyin Kang, Na Hu, Peng Liu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Exploring Cross-lingual Singing Voice Synthesis Using Speech Data. ISCSLP 2021: 1-5 - [c274]Guolei Jiang, Chunhong Liao, Kun Li, Pengfei Liu, Linying Jiang, Helen Meng:
Automatic Speaker-level Pronunciation Assessment of L2 Speech Using Posterior Probabilities from Multiple Utterances. ISCSLP 2021: 1-5 - [c273]Disong Wang, Jianwei Yu, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng:
Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization. ISCSLP 2021: 1-5 - [c272]Sean Shensheng Xu, Man-Wai Mak, Ka Ho Wong, Helen Meng, Timothy C. Y. Kwok:
Age-Invariant Speaker Embedding for Diarization of Cognitive Assessments. ISCSLP 2021: 1-5 - [c271]Jingyan Zhou, Xiaoying Zhang, Xiaohan Feng, King Keung Wu, Helen Meng:
Automatic Extraction of Semantic Patterns in Dialogs using Convex Polytopic Model. ISCSLP 2021: 1-5 - [c270]Pengfei Liu, Kun Li, Helen Meng:
Out-of-Scope Domain and Intent Classification through Hierarchical Joint Modeling. IWSDS 2021: 3-16 - [c269]Liangqi Liu, Jiankun Hu, Zhiyong Wu, Song Yang, Songfan Yang, Jia Jia, Helen Meng:
Controllable Emphatic Speech Synthesis based on Forward Attention for Expressive Speech Synthesis. SLT 2021: 410-414 - [i46]Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng:
Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation. CoRR abs/2101.06066 (2021) - [i45]Thomas K. F. Chiu, Helen Meng, Ching-Sing Chai, Irwin King, Savio Wong, Yeung Yam:
Creation and Evaluation of a Pre-tertiary Artificial Intelligence (AI) Curriculum. CoRR abs/2101.07570 (2021) - [i44]Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Helen Meng:
Adversarially learning disentangled speech representations for robust multi-factor voice conversion. CoRR abs/2102.00184 (2021) - [i43]Boyang Xue, Jianwei Yu, Junhao Xu, Shansong Liu, Shoukang Hu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng:
Bayesian Transformer Language Models for Speech Recognition. CoRR abs/2102.04754 (2021) - [i42]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial defense for automatic speaker verification by cascaded self-supervised learning models. CoRR abs/2102.07047 (2021) - [i41]Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen M. Meng:
Towards Multi-Scale Style Control for Expressive Speech Synthesis. CoRR abs/2104.03521 (2021) - [i40]Yixuan Zhou, Changhe Song, Jingbei Li, Zhiyong Wu, Helen Meng:
Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech. CoRR abs/2104.06835 (2021) - [i39]Pengfei Liu, Youzhang Ning, King Keung Wu, Kun Li, Helen Meng:
Open Intent Discovery through Unsupervised Semantic Clustering and Dependency Parsing. CoRR abs/2104.12114 (2021) - [i38]Pengfei Liu, Kun Li, Helen Meng:
Hierarchical Modeling for Out-of-Scope Domain and Intent Classification. CoRR abs/2104.14781 (2021) - [i37]Songxiang Liu, Yuewen Cao, Dan Su, Helen Meng:
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion. CoRR abs/2105.13871 (2021) - [i36]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. CoRR abs/2106.00273 (2021) - [i35]Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis. CoRR abs/2106.06233 (2021) - [i34]Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng:
Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization. CoRR abs/2106.10127 (2021) - [i33]Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng:
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion. CoRR abs/2106.10132 (2021) - [i32]Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Spotting adversarial samples for speaker verification by neural vocoders. CoRR abs/2107.00309 (2021) - [i31]Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng:
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis. CoRR abs/2107.03298 (2021) - [i30]Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng:
Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks. CoRR abs/2107.08803 (2021) - [i29]Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng:
Adversarial Data Augmentation for Disordered Speech Recognition. CoRR abs/2108.00899 (2021) - [i28]Mudit Chaudhary, Chandni Saxena, Helen Meng:
Countering Online Hate Speech: An NLP Perspective. CoRR abs/2109.02941 (2021) - [i27]Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Characterizing the adversarial vulnerability of speech self-supervised learning. CoRR abs/2111.04330 (2021) - [i26]Hongyuan Lu, Wai Lam, Hong Cheng, Helen M. Meng:
Partner Personas Generation for Diverse Dialogue Generation. CoRR abs/2111.13833 (2021) - [i25]Junhao Xu, Jianwei Yu, Xunying Liu, Helen Meng:
Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition. CoRR abs/2111.14479 (2021) - [i24]Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng:
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers. CoRR abs/2111.14836 (2021) - [i23]Junhao Xu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng:
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition. CoRR abs/2112.11438 (2021) - [i22]Junhao Xu, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng:
Mixed Precision Quantization of Transformer Language Models for Speech Recognition. CoRR abs/2112.11540 (2021) - 2020
- [j42]Yong-Hong Kuo, Nicholas B. Chan, Janny M. Y. Leung, Helen Meng, Anthony Man-Cho So, Kelvin Kam-fai Tsoi, Colin A. Graham:
An Integrated Approach of Machine Learning and Systems Thinking for Waiting Time Prediction in an Emergency Department. Int. J. Medical Informatics 139: 104143 (2020) - [c268]Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
End-To-End Accent Conversion Without Using Native Utterances. ICASSP 2020: 6289-6293 - [c267]Haibin Wu, Songxiang Liu, Helen Meng, Hung-yi Lee:
Defense Against Adversarial Attacks on Spoofing Countermeasures of ASV. ICASSP 2020: 6564-6568 - [c266]Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu, Helen Meng:
Adversarial Attacks on GMM I-Vector Based Speaker Verification Systems. ICASSP 2020: 6579-6583 - [c265]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset. ICASSP 2020: 6984-6988 - [c264]Yuewen Cao, Songxiang Liu, Xixin Wu, Shiyin Kang, Peng Liu, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora. ICASSP 2020: 7619-7623 - [c263]Disong Wang, Jianwei Yu, Xixin Wu, Songxiang Liu, Lifa Sun, Xunying Liu, Helen Meng:
End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation for Dysarthric Speech Reconstruction. ICASSP 2020: 7744-7748 - [c262]Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng:
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers. ICASSP 2020: 7939-7943 - [c261]Pengfei Liu, Kun Li, Helen Meng:
Group Gated Fusion on Attention-Based Bidirectional Alignment for Multimodal Emotion Recognition. INTERSPEECH 2020: 379-383 - [c260]Xingcheng Song, Zhiyong Wu, Yiheng Huang, Dan Su, Helen Meng:
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition. INTERSPEECH 2020: 581-585 - [c259]Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng:
Investigation of Data Augmentation Techniques for Disordered Speech Recognition. INTERSPEECH 2020: 696-700 - [c258]Shansong Liu, Xurong Xie, Jianwei Yu, Shoukang Hu, Mengzhe Geng, Rongfeng Su, Shi-Xiong Zhang, Xunying Liu, Helen Meng:
Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition. INTERSPEECH 2020: 711-715 - [c257]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. INTERSPEECH 2020: 1540-1544 - [c256]Kun Zhang, Zhiyong Wu, Daode Yuan, Jian Luan, Jia Jia, Helen Meng, Binheng Song:
Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting. INTERSPEECH 2020: 2567-2571 - [c255]Naijun Zheng, Xixin Wu, Jinghua Zhong, Xunying Liu, Helen Meng:
Speaker-Aware Linear Discriminant Analysis in Speaker Verification. INTERSPEECH 2020: 3012-3016 - [c254]Xiangyu Liang, Zhiyong Wu, Runnan Li, Yanqing Liu, Sheng Zhao, Helen Meng:
Enhancing Monotonicity for Robust Autoregressive Transformer TTS. INTERSPEECH 2020: 3181-3185 - [c253]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Recognition of Overlapped Speech. INTERSPEECH 2020: 3496-3500 - [c252]Xingchen Song, Guangsen Wang, Yiheng Huang, Zhiyong Wu, Dan Su, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks. INTERSPEECH 2020: 3765-3769 - [c251]Songxiang Liu, Yuewen Cao, Shiyin Kang, Na Hu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Transferring Source Style in Non-Parallel Voice Conversion. INTERSPEECH 2020: 4721-4725 - [c250]Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. Odyssey 2020: 365-371 - [e7]Helen Meng, Bo Xu, Thomas Fang Zheng:
21st Annual Conference of the International Speech Communication Association, Interspeech 2020, Virtual Event, Shanghai, China, October 25-29, 2020. ISCA 2020 - [i21]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-visual Recognition of Overlapped speech for the LRS2 dataset. CoRR abs/2001.01656 (2020) - [i20]Xu Li, Xixin Wu, Xunying Liu, Helen Meng:
Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech. CoRR abs/2002.00205 (2020) - [i19]Haibin Wu, Songxiang Liu, Helen Meng, Hung-yi Lee:
Defense against adversarial attacks on spoofing countermeasures of ASV. CoRR abs/2003.03065 (2020) - [i18]Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. CoRR abs/2004.04014 (2020) - [i17]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-visual Multi-channel Recognition of Overlapped Speech. CoRR abs/2005.08571 (2020) - [i16]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. CoRR abs/2006.06186 (2020) - [i15]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. CoRR abs/2006.11610 (2020) - [i14]Shoukang Hu, Xurong Xie, Shansong Liu, Mengzhe Geng, Xunying Liu, Helen Meng:
Neural Architecture Search for Speech Recognition. CoRR abs/2007.08818 (2020) - [i13]Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng:
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling. CoRR abs/2009.02725 (2020) - [i12]Xiong Cai, Dongyang Dai, Zhiyong Wu, Xiang Li, Jingbei Li, Helen Meng:
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition. CoRR abs/2010.13350 (2020) - [i11]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2net Architecture. CoRR abs/2010.15006 (2020) - [i10]Xingchen Song, Zhiyong Wu, Yiheng Huang, Chao Weng, Dan Su, Helen Meng:
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input. CoRR abs/2010.15025 (2020) - [i9]Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng:
Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion. CoRR abs/2011.01678 (2020) - [i8]Changhe Song, Jingbei Li, Yixuan Zhou, Zhiyong Wu, Helen M. Meng:
Syntactic representation learning for neural network based TTS with syntactic parse tree traversal. CoRR abs/2012.06971 (2020) - [i7]Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng:
Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network. CoRR abs/2012.11174 (2020)
2010 – 2019
- 2019
- [c249]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Prosodic Structure Prediction using Deep Self-attention Neural Network. APSIPA 2019: 320-324 - [c248]Liangqi Liu, Zhiyong Wu, Runnan Li, Jia Jia, Helen Meng:
Learning Contextual Representation with Convolution Bank and Multi-head Self-attention for Speech Emphasis Detection. APSIPA 2019: 922-926 - [c247]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Automatic Prosodic Structure Labeling using DNN-BGRU-CRF Hybrid Neural Network. APSIPA 2019: 1234-1238 - [c246]Kun Zhang, Zhiyong Wu, Jia Jia, Helen M. Meng, Binheng Song:
Query-by-Example Spoken Term Detection using Attentive Pooling Networks. APSIPA 2019: 1267-1272 - [c245]Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng:
Adversarial Attacks on Spoofing Countermeasures of Automatic Speaker Verification. ASRU 2019: 312-319 - [c244]Shoukang Hu, Max W. Y. Lam, Xurong Xie, Shansong Liu, Jianwei Yu, Xixin Wu, Xunying Liu, Helen Meng:
Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition. ICASSP 2019: 6555-6559 - [c243]Runnan Li, Zhiyong Wu, Jia Jia, Sheng Zhao, Helen Meng:
Dilated Residual Network with Multi-head Self-attention for Speech Emotion Recognition. ICASSP 2019: 6675-6679 - [c242]Xixin Wu, Songxiang Liu, Yuewen Cao, Xu Li, Jianwei Yu, Dongyang Dai, Xi Ma, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Speech Emotion Recognition Using Capsule Networks. ICASSP 2019: 6695-6699 - [c241]Hui Lu, Zhiyong Wu, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng:
A Compact Framework for Voice Conversion Using Wavenet Conditioned on Phonetic Posteriorgrams. ICASSP 2019: 6810-6814 - [c240]Yuewen Cao, Xixin Wu, Songxiang Liu, Jianwei Yu, Xu Li, Zhiyong Wu, Xunying Liu, Helen Meng:
End-to-end Code-switched TTS with Mix of Monolingual Recordings. ICASSP 2019: 6935-6939 - [c239]Mu Wang, Xixin Wu, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Guangzhi Li, Dan Su, Dong Yu, Helen Meng:
Quasi-fully Convolutional Neural Network with Variational Inference for Speech Synthesis. ICASSP 2019: 7060-7064 - [c238]Max W. Y. Lam, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng:
Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition. ICASSP 2019: 7235-7239 - [c237]Jianwei Yu, Max W. Y. Lam, Xie Chen, Shoukang Hu, Songxiang Liu, Xixin Wu, Xunying Liu, Helen Meng:
Recurrent Neural Network Language Model Training Using Natural Gradient. ICASSP 2019: 7260-7264 - [c236]Dongyang Dai, Zhiyong Wu, Runnan Li, Xixin Wu, Jia Jia, Helen Meng:
Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition. ICASSP 2019: 7405-7409 - [c235]Wai-Kim Leung, Xunying Liu, Helen Meng:
CNN-RNN-CTC Based End-to-end Mispronunciation Detection and Diagnosis. ICASSP 2019: 8132-8136 - [c234]Runnan Li, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, Helen Meng:
Towards Discriminative Representation Learning for Speech Emotion Recognition. IJCAI 2019: 5060-5066 - [c233]Hui Lu, Zhiyong Wu, Dongyang Dai, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng:
One-Shot Voice Conversion with Global Speaker Embeddings. INTERSPEECH 2019: 669-673 - [c232]Songxiang Liu, Yuewen Cao, Xixin Wu, Lifa Sun, Xunying Liu, Helen Meng:
Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams. INTERSPEECH 2019: 714-718 - [c231]Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. INTERSPEECH 2019: 2090-2094 - [c230]Max W. Y. Lam, Jun Wang, Xunying Liu, Helen Meng, Dan Su, Dong Yu:
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition. INTERSPEECH 2019: 2778-2782 - [c229]Shoukang Hu, Xurong Xie, Shansong Liu, Max W. Y. Lam, Jianwei Yu, Xixin Wu, Xunying Liu, Helen Meng:
LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition. INTERSPEECH 2019: 2793-2797 - [c228]Hang Su, Borislav Dzodzo, Xixin Wu, Xunying Liu, Helen Meng:
Unsupervised Methods for Audio Classification from Lecture Discussion Recordings. INTERSPEECH 2019: 3347-3351 - [c227]Jianwei Yu, Max W. Y. Lam, Shoukang Hu, Xixin Wu, Xu Li, Yuewen Cao, Xunying Liu, Helen Meng:
Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models. INTERSPEECH 2019: 3510-3514 - [c226]Shoukang Hu, Shansong Liu, Heng Fai Chang, Mengzhe Geng, Jiani Chen, Lau Wing Chung, To Ka Hei, Jianwei Yu, Ka Ho Wong, Xunying Liu, Helen Meng:
The CUHK Dysarthric Speech Recognition Systems for English and Cantonese. INTERSPEECH 2019: 3669-3670 - [c225]Shansong Liu, Shoukang Hu, Yi Wang, Jianwei Yu, Rongfeng Su, Xunying Liu, Helen Meng:
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition. INTERSPEECH 2019: 4120-4124 - [c224]Shansong Liu, Shoukang Hu, Xunying Liu, Helen Meng:
On the Use of Pitch Features for Disordered Speech Recognition. INTERSPEECH 2019: 4130-4134 - [c223]Jingbei Li, Zhiyong Wu, Runnan Li, Pengpeng Zhi, Song Yang, Helen Meng:
Knowledge-Based Linguistic Encoding for End-to-End Mandarin Text-to-Speech Synthesis. INTERSPEECH 2019: 4494-4498 - [c222]Jia Li, Yu Rong, Hong Cheng, Helen Meng, Wen-bing Huang, Junzhou Huang:
Semi-Supervised Graph Classification: A Hierarchical Graph Perspective. WWW 2019: 972-982 - [e6]Wen Gao, Helen Mei-Ling Meng, Matthew A. Turk, Susan R. Fussell, Björn W. Schuller, Yale Song, Kai Yu:
International Conference on Multimodal Interaction, ICMI 2019, Suzhou, China, October 14-18, 2019. ACM 2019, ISBN 978-1-4503-6860-5 - [i6]Jia Li, Yu Rong, Hong Cheng, Helen Meng, Wen-bing Huang, Junzhou Huang:
Semi-Supervised Graph Classification: A Hierarchical Graph Perspective. CoRR abs/1904.05003 (2019) - [i5]Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng:
Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification. CoRR abs/1910.08716 (2019) - [i4]Xingcheng Song, Guangsen Wang, Zhiyong Wu, Yiheng Huang, Dan Su, Dong Yu, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks. CoRR abs/1910.10387 (2019) - [i3]Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu, Helen Meng:
Adversarial Attacks on GMM i-vector based Speaker Verification Systems. CoRR abs/1911.03078 (2019) - 2018
- [j41]Kelvin Kam-fai Tsoi, Felix C. H. Chan, Hoyee W. Hirai, Gary K. S. Leung, Yong-Hong Kuo, Samson Tai, Helen Mei-Ling Meng:
Data Visualization with IBM Watson Analytics for Global Cancer Trends Comparison from World Health Organization. Int. J. Heal. Inf. Syst. Informatics 13(1): 45-54 (2018) - [j40]Kelvin Kam-fai Tsoi, Nicholas B. Chan, Felix C. H. Chan, Lingling Zhang, Annisa C. H. Lee, Helen Mei-Ling Meng:
How can we better use Twitter to find a person who got lost due to dementia? npj Digit. Medicine 1 (2018) - [j39]Kun Li, Shaoguang Mao, Xu Li, Zhiyong Wu, Helen Meng:
Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks. Speech Commun. 96: 28-36 (2018) - [c221]Ziwei Zhu, Zhiyong Wu, Runnan Li, Yishuang Ning, Helen Meng:
Learning Frame-Level Recurrent Neural Networks Representations for Query-by-Example Spoken Term Detection on Mobile Devices. AIMS 2018: 55-66 - [c220]King Keung Wu, Helen Meng, Yeung Yam:
Topic Discovery via Convex Polytopic Model: A Case Study with Small Corpora. CogInfoCom 2018: 367-372 - [c219]Kelvin Kam-fai Tsoi, Max W. Y. Lam, Christopher T. K. Chu, Michael P. F. Wong, Helen Mei-Ling Meng:
Machine Learning on Drawing Behavior for Dementia Screening. DH 2018: 131-132 - [c218]Max W. Y. Lam, Xunying Liu, Helen Mei-Ling Meng, Kelvin Kam-fai Tsoi:
Drawing-Based Automatic Dementia Screening Using Gaussian Process Markov Chains. HICSS 2018: 1-10 - [c217]Kelvin Kam-fai Tsoi, Lingling Zhang, Nicholas B. Chan, Felix C. H. Chan, Hoyee W. Hirai, Helen Mei-Ling Meng:
Social Media as a Tool to Look for People with Dementia Who Become Lost: Factors That Matter. HICSS 2018: 1-10 - [c216]Runnan Li, Zhiyong Wu, Yuchen Huang, Jia Jia, Helen Meng, Lianhong Cai:
Emphatic Speech Generation with Conditioned Input Layer and Bidirectional LSTMs for Expressive Speech Synthesis. ICASSP 2018: 5129-5133 - [c215]Xixin Wu, Lifa Sun, Shiyin Kang, Songxiang Liu, Zhiyong Wu, Xunying Liu, Helen Meng:
Feature Based Adaptation for Speaking Style Synthesis. ICASSP 2018: 5304-5308 - [c214]Xunying Liu, Shansong Liu, Jinze Sha, Jianwei Yu, Zhiyuan Xu, Xie Chen, Helen Meng:
Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition. ICASSP 2018: 6114-6118 - [c213]Shaoguang Mao, Xu Li, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng:
Unsupervised Discovery of an Extended Phoneme Set in L2 English Speech for Mispronunciation Detection and Diagnosis. ICASSP 2018: 6244-6248 - [c212]Shaoguang Mao, Zhiyong Wu, Runnan Li, Xu Li, Helen Meng, Lianhong Cai:
Applying Multitask Learning to Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech. ICASSP 2018: 6254-6258 - [c211]Shaoguang Mao, Zhiyong Wu, Xu Li, Runnan Li, Xixin Wu, Helen Meng:
Integrating Articulatory Features into Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech. ICME 2018: 1-6 - [c210]Ziwei Zhu, Zhiyong Wu, Runnan Li, Helen Meng, Lianhong Cai:
Siamese Recurrent Auto-Encoder Representation for Query-by-Example Spoken Term Detection. INTERSPEECH 2018: 102-106 - [c209]Shuai Yang, Zhiyong Wu, Binbin Shen, Helen Meng:
Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method. INTERSPEECH 2018: 317-321 - [c208]Songxiang Liu, Jinghua Zhong, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng:
Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance. INTERSPEECH 2018: 496-500 - [c207]Max W. Y. Lam, Shoukang Hu, Xurong Xie, Shansong Liu, Jianwei Yu, Rongfeng Su, Xunying Liu, Helen Meng:
Gaussian Process Neural Networks for Speech Recognition. INTERSPEECH 2018: 1778-1782 - [c206]Xu Li, Shaoguang Mao, Xixin Wu, Kun Li, Xunying Liu, Helen Meng:
Unsupervised Discovery of Non-native Phonetic Patterns in L2 English Speech for Mispronunciation Detection and Diagnosis. INTERSPEECH 2018: 2554-2558 - [c205]Jianwei Yu, Xurong Xie, Shansong Liu, Shoukang Hu, Max W. Y. Lam, Xixin Wu, Ka Ho Wong, Xunying Liu, Helen Meng:
Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus. INTERSPEECH 2018: 2938-2942 - [c204]Helen Meng:
Speech and Language Processing for Learning and Wellbeing. INTERSPEECH 2018: 3022 - [c203]Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis. INTERSPEECH 2018: 3072-3076 - [c202]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen Meng, Lianhong Cai:
Emotion Recognition from Variable-Length Speech Segments Using Deep Learning on Spectrograms. INTERSPEECH 2018: 3683-3687 - [c201]Jinghua Zhong, Helen Meng:
DNN i-vector based Fishervoice and PLDA SVM scoring for NIST SRE 2016. ISCSLP 2018: 180-184 - [c200]Mu Wang, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Speech Super-Resolution Using Parallel WaveNet. ISCSLP 2018: 260-264 - [c199]Jia Li, Yu Rong, Helen Meng, Zhihui Lu, Timothy C. Y. Kwok, Hong Cheng:
TATC: Predicting Alzheimer's Disease with Actigraphy Data. KDD 2018: 509-518 - [c198]Runnan Li, Zhiyong Wu, Jia Jia, Jingbei Li, Wei Chen, Helen Meng:
Inferring User Emotive State Changes in Realistic Human-Computer Conversational Dialogs. ACM Multimedia 2018: 136-144 - [c197]Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng:
The HCCL-CUHK System for the Voice Conversion Challenge 2018. Odyssey 2018: 248-254 - 2017
- [j38]Kun Li, Xixin Wu, Helen M. Meng:
Intonation classification for L2 English speech using multi-distribution deep neural networks. Comput. Speech Lang. 43: 18-33 (2017) - [j37]Kun Li, Xiaojun Qian, Helen M. Meng:
Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 25(1): 193-207 (2017) - [c196]Yishuang Ning, Jia Jia, Zhiyong Wu, Runnan Li, Yongsheng An, Yanfeng Wang, Helen M. Meng:
Multi-Task Deep Learning for User Intention Understanding in Speech Interaction Systems. AAAI 2017: 161-167 - [c195]King Keung Wu, Yeung Yam, Helen M. Meng, Mehran Mesbahi:
Parallel probabilistic swarm guidance by exploiting Kronecker product structures in discrete-time Markov chains. ACC 2017: 346-351 - [c194]Kelvin Kam-fai Tsoi, Max W. Y. Lam, Felix C. H. Chan, Hoyee W. Hirai, Baker K. K. Bat, Samuel Y. S. Wong, Helen Mei-Ling Meng:
Classification of Visit-to-Visit Blood Pressure Variability: A Machine Learning Approach for Data Clustering on Systolic Blood Pressure Intervention Trial (SPRINT). DH 2017: 58-59 - [c193]Kelvin Kam-fai Tsoi, Janet Y. H. Wong, Michael P. F. Wong, Gary K. S. Leung, Baker K. K. Bat, Felix C. H. Chan, Yong-Hong Kuo, Herman H. M. Lo, Helen Mei-Ling Meng:
Personal Wearable Devices to Measure Heart Rate Variability: A Framework of Cloud Platform for Public Health Research. DH 2017: 207-208 - [c192]Kelvin Kam-fai Tsoi, Felix C. H. Chan, Hoyee W. Hirai, Gary K. S. Leung, Yong-Hong Kuo, Samson Tai, Helen M. Meng:
Data Visualization on Global Trends on Cancer Incidence: An Application of IBM Watson Analytics. HICSS 2017: 1-6 - [c191]Runnan Li, Zhiyong Wu, Xunying Liu, Helen M. Meng, Lianhong Cai:
Multi-task learning of structured output layer bidirectional LSTMs for speech synthesis. ICASSP 2017: 5510-5514 - [c190]Yishuang Ning, Zhiyong Wu, Runnan Li, Jia Jia, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Learning cross-lingual knowledge with multilingual BLSTM for emphasis detection with limited training data. ICASSP 2017: 5615-5619 - [c189]Pengfei Liu, King Keung Wu, Helen M. Meng:
A model of extended paragraph vector for document categorization and trend analysis. IJCNN 2017: 2400-2406 - [c188]Yuchen Huang, Zhiyong Wu, Runnan Li, Helen Meng, Lianhong Cai:
Multi-Task Learning for Prosodic Structure Generation Using BLSTM RNN with Structured Output Layer. INTERSPEECH 2017: 779-783 - [c187]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen Meng, Lianhong Cai:
Speech Emotion Recognition with Emotion-Pair Based Framework Considering Emotion Distribution Information in Dimensional Emotion Space. INTERSPEECH 2017: 1238-1242 - [c186]Jinghua Zhong, Wenping Hu, Frank K. Soong, Helen Meng:
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances. INTERSPEECH 2017: 1507-1511 - [c185]Runnan Li, Zhiyong Wu, Yishuang Ning, Lifa Sun, Helen Meng, Lianhong Cai:
Spectro-Temporal Modelling with Time-Frequency LSTM and Structured Output Layer for Voice Conversion. INTERSPEECH 2017: 3409-3413 - 2016
- [j36]Hao Wang, Peggy Mok, Helen Meng:
Capitalizing on musical rhythm for prosodic training in computer-aided language learning. Comput. Speech Lang. 37: 67-81 (2016) - [j35]Xiaojun Qian, Helen M. Meng, Frank K. Soong:
A Two-Pass Framework of Mispronunciation Detection and Diagnosis for Computer-Aided Pronunciation Training. IEEE ACM Trans. Audio Speech Lang. Process. 24(6): 1020-1028 (2016) - [c184]Vincent Tsz Fai Chow, Ka Wing Sung, Helen M. Meng, Ka Ho Wong, Gary K. S. Leung, Yong-Hong Kuo, Kelvin Kam-fai Tsoi:
Utilizing Real-Time Travel Information, Mobile Applications and Wearable Devices for Smart Public Transportation. CCBD 2016: 138-144 - [c183]Kelvin Kam-fai Tsoi, Benjamin Yip, Doreen W. H. Au, Yong-Hong Kuo, Samuel Y. S. Wong, Jean Woo, Helen Mei-Ling Meng:
Blood Pressure Monitoring on the Cloud System in Elderly Community Centres: A Data Capturing Platform for Application Research in Public Health. CCBD 2016: 312-315 - [c182]Pengfei Liu, Shoaib Jameel, King Keung Wu, Helen M. Meng:
Learning Track Representation and Trends for Conference Analytics. HICSS 2016: 1671-1680 - [c181]Quanjie Yu, Peng Liu, Zhiyong Wu, Shiyin Kang, Helen Meng, Lianhong Cai:
Learning cross-lingual information with multilingual BLSTM for speech synthesis of low-resource languages. ICASSP 2016: 5545-5549 - [c180]Xinyu Lan, Xu Li, Yishuang Ning, Zhiyong Wu, Helen Meng, Jia Jia, Lianhong Cai:
Low level descriptors based DBLSTM bottleneck feature for speech driven talking avatar. ICASSP 2016: 5550-5554 - [c179]Yaodong Tang, Yuchen Huang, Zhiyong Wu, Helen Meng, Mingxing Xu, Lianhong Cai:
Question detection from acoustic features using recurrent neural network with gated recurrent unit. ICASSP 2016: 6125-6129 - [c178]Ka-Ho Wong, Wing Sum Yeung, Yu Ting Yeung, Helen M. Meng:
Exploring articulatory characteristics of Cantonese dysarthric speech using distinctive features. ICASSP 2016: 6495-6499 - [c177]Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Recognizing stances in Mandarin social ideological debates with text and acoustic features. ICME Workshops 2016: 1-6 - [c176]Lifa Sun, Kun Li, Hao Wang, Shiyin Kang, Helen M. Meng:
Phonetic posteriorgrams for many-to-one voice conversion without parallel data training. ICME 2016: 1-6 - [c175]Wei-Ying Yi, Kwong-Sak Leung, Yee Leung, Helen Mei-Ling Meng, Terrence S. T. Mak:
Modular sensor system (MSS) for urban air pollution monitoring. IEEE SENSORS 2016: 1-3 - [c174]Lifa Sun, Hao Wang, Shiyin Kang, Kun Li, Helen M. Meng:
Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams. INTERSPEECH 2016: 322-326 - [c173]Yaodong Tang, Zhiyong Wu, Helen M. Meng, Mingxing Xu, Lianhong Cai:
Analysis on Gated Recurrent Unit Based Question Detection Approach. INTERSPEECH 2016: 735-739 - [c172]Linchuan Li, Zhiyong Wu, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition. INTERSPEECH 2016: 1392-1396 - [c171]Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai:
Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis. INTERSPEECH 2016: 1472-1476 - [c170]Xu Li, Zhiyong Wu, Helen M. Meng, Jia Jia, Xiaoyan Lou, Lianhong Cai:
Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data. INTERSPEECH 2016: 1477-1481 - [c169]Runnan Li, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
DBLSTM-based multi-task learning for pitch transformation in voice conversion. ISCSLP 2016: 1-5 - [c168]Ka-Ho Wong, Hoi Kiu Kristy Mok, Helen Meng:
Exploratory data analysis on nuclei in Cantonese dysarthric speech. ISCSLP 2016: 1-5 - [c167]King Keung Wu, Pengfei Liu, Helen M. Meng, Yeung Yam:
An embedding approach for context-aware collaborative recommendation and visualization. SMC 2016: 3457-3462 - [c166]King Keung Wu, Yeung Yam, Helen M. Meng, Mehran Mesbahi:
Kronecker product approximation with multiple factor matrices via the tensor product algorithm. SMC 2016: 4277-4282 - [i2]Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen M. Meng, Lianhong Cai:
Study on Feature Subspace of Archetypal Emotions for Speech Emotion Recognition. CoRR abs/1611.05675 (2016) - 2015
- [j34]Péter Baranyi, Hassan Charaf, Anna Esposito, Péter Földesi, Helen Meng:
Preface. J. Multimodal User Interfaces 9(4): 261-262 (2015) - [j33]Lei Xie, Jia Jia, Helen M. Meng, Zhigang Deng, Lijuan Wang:
Expressive talking avatar synthesis and animation. Multim. Tools Appl. 74(22): 9845-9848 (2015) - [j32]Zhiyong Wu, Kai Zhao, Xixin Wu, Xinyu Lan, Helen Meng:
Acoustic to articulatory mapping with deep neural network. Multim. Tools Appl. 74(22): 9889-9907 (2015) - [j31]Zhiyong Wu, Yishuang Ning, Xiao Zang, Jia Jia, Fanbo Meng, Helen Meng, Lianhong Cai:
Generating emphatic speech with hidden Markov model for expressive speech synthesis. Multim. Tools Appl. 74(22): 9909-9925 (2015) - [j30]Wei Ying Yi, Kin Ming Lo, Terrence S. T. Mak, Kwong-Sak Leung, Yee Leung, Helen Mei-Ling Meng:
A Survey of Wireless Sensor Network Based Air Pollution Monitoring Systems. Sensors 15(12): 31392-31427 (2015) - [j29]Zhen-Hua Ling, Shiyin Kang, Heiga Zen, Andrew W. Senior, Mike Schuster, Xiaojun Qian, Helen M. Meng, Li Deng:
Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends. IEEE Signal Process. Mag. 32(3): 35-52 (2015) - [j28]Haizhou Li, Marcello Federico, Xiaodong He, Helen M. Meng, Isabel Trancoso:
Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing. IEEE ACM Trans. Audio Speech Lang. Process. 23(3): 427-430 (2015) - [c165]Xixin Wu, Zhiyong Wu, Yishuang Ning, Jia Jia, Lianhong Cai, Helen M. Meng:
Understanding speaking styles of internet speech data with LSTM and low-resource training. ACII 2015: 815-820 - [c164]Xiaojun Qian, Helen M. Meng, Frank K. Soong:
A two-pass framework of mispronunciation detection & diagnosis for computer-aided pronunciation training. APSIPA 2015: 384-387 - [c163]Benjamin Yip, Hoyee W. Hirai, Yong-Hong Kuo, Helen M. Meng, Samuel Y. S. Wong, Kelvin Kam-fai Tsoi:
Blood Pressure Management with Data Capturing in the Cloud among Hypertensive Patients: A Monitoring Platform for Hypertensive Patients. BigData Congress 2015: 305-308 - [c162]Kin Fai Ho, Hoyee W. Hirai, Yong-Hong Kuo, Helen M. Meng, Kelvin Kam-fai Tsoi:
Indoor Air Monitoring Platform and Personal Health Reporting System: Big Data Analytics for Public Health Research. BigData Congress 2015: 309-312 - [c161]Yong-Hong Kuo, Janny M. Y. Leung, Kelvin Kam-fai Tsoi, Helen M. Meng, Colin A. Graham:
Embracing Big Data for Simulation Modelling of Emergency Department Processes and Activities. BigData Congress 2015: 313-316 - [c160]Yong-Hong Kuo, Janny M. Y. Leung, Helen M. Meng, Kelvin Kam-fai Tsoi:
A Real-Time Decision Support Tool for Disaster Response: A Mathematical Programming Approach. BigData Congress 2015: 639-642 - [c159]Kelvin Kam-fai Tsoi, Yong-Hong Kuo, Helen M. Meng:
A Data Capturing Platform in the Cloud for Behavioral Analysis among Smokers: An Application Platform for Public Health Research. BigData Congress 2015: 737-740 - [c158]Pengfei Liu, Shafiq R. Joty, Helen M. Meng:
Fine-grained Opinion Mining with Recurrent Neural Networks and Word Embeddings. EMNLP 2015: 1433-1443 - [c157]Peng Liu, Quanjie Yu, Zhiyong Wu, Shiyin Kang, Helen M. Meng, Lianhong Cai:
A deep recurrent approach for acoustic-to-articulatory inversion. ICASSP 2015: 4450-4454 - [c156]Lifa Sun, Shiyin Kang, Kun Li, Helen M. Meng:
Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks. ICASSP 2015: 4869-4873 - [c155]Hao Wang, Frank K. Soong, Helen Meng:
A spectral space warping approach to cross-lingual voice transformation in HMM-based TTS. ICASSP 2015: 4874-4878 - [c154]Yishuang Ning, Zhiyong Wu, Jia Jia, Fanbo Meng, Helen M. Meng, Lianhong Cai:
HMM-based emphatic speech synthesis for corrective feedback in computer-aided pronunciation training. ICASSP 2015: 4934-4938 - [c153]Qi Lyu, Zhiyong Wu, Jun Zhu, Helen Meng:
Modelling High-Dimensional Sequences with LSTM-RTRBM: Application to Polyphonic Music Generation. IJCAI 2015: 4138-4139 - [c152]Ka-Ho Wong, Yu Ting Yeung, Edwin H. Y. Chan, Patrick C. M. Wong, Gina-Anne Levow, Helen M. Meng:
Development of a Cantonese dysarthric speech corpus. INTERSPEECH 2015: 329-333 - [c151]Yishuang Ning, Zhiyong Wu, Xiaoyan Lou, Helen M. Meng, Jia Jia, Lianhong Cai:
Using tilt for automatic emphasis detection with Bayesian networks. INTERSPEECH 2015: 578-582 - [c150]Pengfei Liu, Shoaib Jameel, Wai Lam, Bin Ma, Helen M. Meng:
Topic modeling for conference analytics. INTERSPEECH 2015: 707-711 - [c149]Ka-Ho Wong, Wai-Kim Leung, Helen M. Meng:
E-commu-book: an assistive technology for users with speech impairments. INTERSPEECH 2015: 1876-1877 - [c148]Yu Ting Yeung, Ka-Ho Wong, Helen M. Meng:
Improving automatic forced alignment for dysarthric speech transcription. INTERSPEECH 2015: 2991-2995 - [c147]Ka-Ho Wong, Yu Ting Yeung, Patrick C. M. Wong, Gina-Anne Levow, Helen Meng:
Analysis of Dysarthric Speech using Distinctive Feature Recognition. SLPAT@Interspeech 2015: 86-90 - [c146]Kun Li, Xiaojun Qian, Shiyin Kang, Pengfei Liu, Helen Meng:
Integrating acoustic and state-transition models for free phone recognition in L2 English speech using multi-distribution deep neural networks. SLaTE 2015: 119-124 - [e5]Zhengyou Zhang, Phil Cohen, Dan Bohus, Radu Horaud, Helen Meng:
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09 - 13, 2015. ACM 2015, ISBN 978-1-4503-3912-4 - 2014
- [j27]Jia Jia, Wai-Kim Leung, Yu-Hao Wu, Xiu-Long Zhang, Hao Wang, Lianhong Cai, Helen M. Meng:
Grading the Severity of Mispronunciations in CAPT Based on Statistical Analysis and Computational Speech Perception. J. Comput. Sci. Technol. 29(5): 751-761 (2014) - [j26]Jia Jia, Zhiyong Wu, Shen Zhang, Helen M. Meng, Lianhong Cai:
Head and facial gestures synthesis using PAD model for an expressive talking avatar. Multim. Tools Appl. 73(1): 439-461 (2014) - [j25]Fanbo Meng, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai:
Synthesizing English emphatic speech for multimodal corrective feedback in computer-aided pronunciation training. Multim. Tools Appl. 73(1): 463-489 (2014) - [j24]Pui-Yu Hui, Helen Meng:
Latent Semantic Analysis for Multimodal User Input With Speech and Gestures. IEEE ACM Trans. Audio Speech Lang. Process. 22(2): 417-429 (2014) - [c145]Xin Zheng, Zhiyong Wu, Helen Meng, Lianhong Cai:
Learning dynamic features with neural networks for phoneme recognition. ICASSP 2014: 2524-2528 - [c144]Xin Zheng, Zhiyong Wu, Helen Meng, Lianhong Cai:
Contrastive auto-encoder for phoneme recognition. ICASSP 2014: 2529-2533 - [c143]Hao Wang, Xiaojun Qian, Helen Meng:
Phonological modeling of mispronunciation gradations in L2 English speech of L1 Chinese learners. ICASSP 2014: 7714-7718 - [c142]Xiao Zang, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai:
Using conditional random fields to predict focus word pair in spontaneous spoken English. INTERSPEECH 2014: 756-760 - [c141]Jinghua Zhong, Weiwu Jiang, Wei Rao, Man-Wai Mak, Helen M. Meng:
PLDA modeling in the fishervoice subspace for speaker verification. INTERSPEECH 2014: 1130-1134 - [c140]Shiyin Kang, Helen M. Meng:
Statistical parametric speech synthesis using weighted multi-distribution deep belief network. INTERSPEECH 2014: 1959-1963 - [c139]Xixin Wu, Zhiyong Wu, Jia Jia, Helen M. Meng, Lianhong Cai, Weifeng Li:
Automatic speech data clustering with human perception based weighted distance. ISCSLP 2014: 216-220 - [c138]Kun Li, Helen M. Meng:
Mispronunciation detection and diagnosis in L2 English speech using multi-distribution Deep Neural Networks. ISCSLP 2014: 255-259 - [c137]Jinghua Zhong, Weiwu Jiang, Helen Meng, Na Li, Zhifeng Li:
An Integration of Random Subspace Sampling and Fishervoice for Speaker Verification. Odyssey 2014: 88-93 - [c136]Pengfei Liu, Helen M. Meng:
SeemGo: Conditional Random Fields Labeling and Maximum Entropy Classification for Aspect Based Sentiment Analysis. SemEval@COLING 2014: 527-531 - [e4]Haizhou Li, Helen M. Meng, Bin Ma, Engsiong Chng, Lei Xie:
15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014, Singapore, September 14-18, 2014. ISCA 2014 - 2013
- [c135]Hao Wang, Helen M. Meng, Xiaojun Qian:
Predicting gradation of L2 English mispronunciations using ASR with extended recognition network. APSIPA 2013: 1-4 - [c134]Wai-Kim Leung, Ka-Wa Yuen, Ka-Ho Wong, Helen Meng:
Development of text-to-audiovisual speech synthesis to support interactive language learning on a mobile device. CogInfoCom 2013: 583-588 - [c133]Xin Zheng, Zhiyong Wu, Binbin Shen, Helen M. Meng, Lianhong Cai:
Investigation of tandem deep belief network approach for phoneme recognition. ICASSP 2013: 7586-7590 - [c132]Na Li, Weiwu Jiang, Helen M. Meng, Zhifeng Li:
Clustering similar acoustic classes in the Fishervoice framework. ICASSP 2013: 7726-7730 - [c131]Shiyin Kang, Xiaojun Qian, Helen Meng:
Multi-distribution deep belief network for speech synthesis. ICASSP 2013: 8012-8016 - [c130]Junhong Zhao, Hua Yuan, Wai-Kim Leung, Helen M. Meng, Jia Liu, Shanhong Xia:
Audiovisual synthesis of exaggerated speech for corrective feedback in computer-assisted pronunciation training. ICASSP 2013: 8218-8222 - [c129]Kun Li, Xiaojun Qian, Shiyin Kang, Helen Meng:
Lexical stress detection for L2 English speech using deep belief networks. INTERSPEECH 2013: 1811-1815 - [c128]Hao Wang, Xiaojun Qian, Helen Meng:
Predicting gradation of L2 English mispronunciations using crowdsourced ratings and phonological rules. SLaTE 2013: 127-131 - [i1]Xin Zheng, Zhiyong Wu, Helen M. Meng, Weifeng Li, Lianhong Cai:
Feature Learning with Gaussian Restricted Boltzmann Machine for Robust Speech Recognition. CoRR abs/1309.6176 (2013) - 2012
- [j23]Zhaojun Yang, Gina-Anne Levow, Helen M. Meng:
Predicting User Satisfaction in Spoken Dialog System Evaluation With Collaborative Filtering. IEEE J. Sel. Top. Signal Process. 6(8): 971-981 (2012) - [j22]Lan Wang, Hui Chen, Sheng Li, Helen M. Meng:
Phoneme-level articulatory animation in pronunciation training. Speech Commun. 54(7): 845-856 (2012) - [j21]Helen Meng:
Farewell Editorial. IEEE Trans. Speech Audio Process. 20(1): 1 (2012) - [c127]Jia Jia, Xiaohui Wang, Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Modeling the correlation between modality semantics and facial expressions. APSIPA 2012: 1-10 - [c126]Fanbo Meng, Zhiyong Wu, Helen M. Meng, Jia Jia, Lianhong Cai:
Hierarchical English Emphatic Speech Synthesis Based on HMM with Limited Training Data. INTERSPEECH 2012: 466-469 - [c125]Xiaojun Qian, Helen M. Meng, Frank K. Soong:
The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training. INTERSPEECH 2012: 775-778 - [c124]Chunrong Li, Zhiyong Wu, Fanbo Meng, Helen M. Meng, Lianhong Cai:
Detection and emphatic realization of contrastive word pairs for expressive text-to-speech synthesis. ISCSLP 2012: 93-97 - [c123]Pengfei Liu, Ka-Wa Yuen, Wai-Kim Leung, Helen M. Meng:
mENUNCIATE: Development of a computer-aided pronunciation training system on a cross-platform framework for mobile, speech-enabled application development. ISCSLP 2012: 170-173 - [c122]Jia Jia, Wai-Kim Leung, Ye Tian, Lianhong Cai, Helen M. Meng:
Analysis on mispronunciations in CAPT based on computational speech perception. ISCSLP 2012: 174-178 - [c121]Kun Li, Helen M. Meng:
Perceptually-motivated assessment of automatically detected lexical stress in L2 learners' speech. ISCSLP 2012: 179-183 - [c120]Helen M. Meng:
Welcome message from the conference chair. ISCSLP 2012 - 2011
- [c119]Weiwu Jiang, Man-Wai Mak, Wei Rao, Helen M. Meng:
The HKCUPU system for the NIST 2010 speaker recognition evaluation. ICASSP 2011: 5288-5291 - [c118]Ka-Ho Wong, Wai Kit Lo, Helen M. Meng:
Allophonic variations in visual speech synthesis for corrective feedback in CAPT. ICASSP 2011: 5708-5711 - [c117]Mingxing Li, Shuang Zhang, Kun Li, Alissa M. Harrison, Wai-Kit Lo, Helen Meng:
Design and Collection of an L2 English Corpus with a Suprasegmental Focus for Chinese Learners of English. ICPhS 2011: 1210-1213 - [c116]Weiwu Jiang, Zhifeng Li, Helen M. Meng:
An Analysis Framework Based on Random Subspace Sampling for Speaker Verification. INTERSPEECH 2011: 253-256 - [c115]Xiaojun Qian, Helen M. Meng, Frank K. Soong:
On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT). INTERSPEECH 2011: 865-868 - [c114]Kun Li, Shuang Zhang, Mingxing Li, Wai Kit Lo, Helen M. Meng:
Prominence Model for Prosodic Features in Automatic Lexical Stress and Pitch Accent Detection. INTERSPEECH 2011: 2009-2012 - 2010
- [j20]Xiaodong He, Li Deng, Roland Kuhn, Helen M. Meng, Samy Bengio:
Introduction to the Issue on Statistical Learning Methods for Speech and Language Processing. IEEE J. Sel. Top. Signal Process. 4(6): 913-916 (2010) - [j19]Zhengyu Zhou, Helen M. Meng:
Pseudo-Conventional N-Gram Representation of the Discriminative N-Gram Model for LVCSR. IEEE J. Sel. Top. Signal Process. 4(6): 943-952 (2010) - [c113]Zhifeng Li, Weiwu Jiang, Helen M. Meng:
Fishervoice: A discriminant subspace framework for speaker recognition. ICASSP 2010: 4522-4525 - [c112]Wai Kit Lo, Alissa M. Harrison, Helen M. Meng:
Statistical phone duration modeling to filter for intact utterances in a computer-assisted pronunciation training system. ICASSP 2010: 5238-5241 - [c111]Xiaojun Qian, Frank K. Soong, Helen M. Meng:
Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT). INTERSPEECH 2010: 757-760 - [c110]Wai Kit Lo, Shuang Zhang, Helen M. Meng:
Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system. INTERSPEECH 2010: 765-768 - [c109]Kun Li, Shuang Zhang, Mingxing Li, Wai-Kit Lo, Helen M. Meng:
Detection of intonation in L2 English speech of native Mandarin learners. ISCSLP 2010: 69-74 - [c108]Xiaojun Qian, Helen M. Meng, Frank K. Soong:
Capturing L2 segmental mispronunciations with joint-sequence models in Computer-Aided Pronunciation Training (CAPT). ISCSLP 2010: 84-88 - [c107]Ka-Ho Wong, Wai-Kim Leung, Wai-Kit Lo, Helen M. Meng:
Development of an articulatory visual-speech synthesizer to support language learning. ISCSLP 2010: 139-143 - [c106]Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Modeling prosody patterns for Chinese expressive text-to-speech synthesis. ISCSLP 2010: 148-152 - [c105]Weiwu Jiang, Helen M. Meng, Zhifeng Li:
An enhanced Fishervoice subspace framework for text-independent speaker verification. ISCSLP 2010: 300-304 - [c104]Pui-Yu Hui, Wai Kit Lo, Helen M. Meng:
Usage patterns and latent semantic analyses for task goal inference of multimodal user interactions. IUI 2010: 129-138 - [c103]Zhaojun Yang, Baichuan Li, Yi Zhu, Irwin King, Gina-Anne Levow, Helen M. Meng:
Collection of user judgments on spoken dialog system with crowdsourcing. SLT 2010: 277-282 - [c102]Baichuan Li, Zhaojun Yang, Yi Zhu, Helen M. Meng, Gina-Anne Levow, Irwin King:
Predicting user evaluations of spoken dialog systems using semi-supervised learning. SLT 2010: 283-288 - [c101]Zhaojun Yang, Baichuan Li, Yi Zhu, Irwin King, Gina-Anne Levow, Helen M. Meng:
Collaborative filtering model for user satisfaction prediction in Spoken Dialog System evaluation. SLT 2010: 472-477 - [c100]Yi Zhu, Zhaojun Yang, Helen M. Meng, Baichuan Li, Gina-Anne Levow, Irwin King:
Using finite state machines for evaluating spoken dialog systems. SLT 2010: 478-483 - [p1]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar. Modeling Machine Emotions for Realizing Intelligence 2010: 109-132
2000 – 2009
- 2009
- [j18]Pui-Yu Hui, Helen M. Meng:
Cross-Modality Semantic Integration With Hypothesis Rescoring for Robust Interpretation of Multimodal User Interactions. IEEE Trans. Speech Audio Process. 17(3): 486-500 (2009) - [j17]Zhiyong Wu, Helen M. Meng, Hongwu Yang, Lianhong Cai:
Modeling the Expressivity of Input Text Semantics for Chinese Text-to-Speech Synthesis in a Spoken Dialog System. IEEE Trans. Speech Audio Process. 17(8): 1567-1576 (2009) - [c99]Wai Kit Lo, Wenying Xiong, Helen M. Meng:
Automatic Story Segmentation using a Bayesian Decision Framework for Statistical Models of Lexical Chain Features. ACL/IJCNLP (2) 2009: 265-268 - [c98]Bernd J. Kröger, Peter Birkholz, Rüdiger Hoffmann, Helen Meng:
Audiovisual Tools for Phonetic and Articulatory Visualization in Computer-Aided Pronunciation Training. COST 2102 Training School 2009: 337-345 - [c97]Helen Meng, Chiu-yu Tseng, Mariko Kondo, Alissa M. Harrison, Tanya Visceglia:
Studying L2 suprasegmental features in Asian Englishes: a position paper. INTERSPEECH 2009: 1715-1718 - [c96]Helen Meng:
Developing Speech Recognition and Synthesis Technologies to Support Computer-Aided Pronunciation Training for Chinese Learners of English. PACLIC 2009: 40-42 - [c95]Alissa M. Harrison, Wai-Kit Lo, Xiaojun Qian, Helen Meng:
Implementation of an extended recognition network for mispronunciation detection and diagnosis in computer-assisted pronunciation training. SLaTE 2009: 45-48 - 2008
- [c94]Zhengyu Zhou, Helen M. Meng:
Recasting the discriminative n-gram model as a pseudo-conventional n-gram model for LVCSR. ICASSP 2008: 4933-4936 - [c93]Lan Wang, Xin Feng, Helen M. Meng:
Automatic generation and pruning of phonetic mispronunciations to support computer-aided pronunciation training. INTERSPEECH 2008: 1729-1732 - [c92]Alissa M. Harrison, Wing Yiu Lau, Helen M. Meng, Lan Wang:
Improving mispronunciation detection and diagnosis of learners' speech with context-sensitive phonological rules based on language transfer. INTERSPEECH 2008: 2787-2790 - [c91]Wai Kit Lo, Alissa M. Harrison, Helen Meng, Lan Wang:
Decision Fusion for Improving Mispronunciation Detection Using Language Transfer Knowledge and Phoneme-Dependent Pronunciation Scoring. ISCSLP 2008: 25-28 - [c90]Honglei Cong, Zhiyong Wu, Lianhong Cai, Helen M. Meng:
A New Prosodic Strength Calculation Method for Prosody Reduction Modeling. ISCSLP 2008: 53-56 - [c89]Zhiyong Wu, Jiying Wu, Helen M. Meng:
The Use of Dynamic Deformable Templates for Lip Tracking in an Audio-Visual Corpus with Large Variations in Head Pose, Face Illumination and Lip Shapes. ISCSLP 2008: 370-373 - [e3]Helen M. Meng, Hui Jiang, Jianhua Tao, Ren-Hua Wang:
6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008, 16-19 December, 2008, Kunming, China. IEEE 2008, ISBN 978-1-4244-2942-4 - 2007
- [j16]Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:
Speaker Verification via High-Level Feature Based Phonetic-Class Pronunciation Modeling. IEEE Trans. Computers 56(9): 1189-1198 (2007) - [c88]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Facial Expression Synthesis Using PAD Emotional Parameters for a Chinese Expressive Avatar. ACII 2007: 24-35 - [c87]Helen Mei-Ling Meng, Yuen Yee Lo, Lan Wang, Wing Yiu Lau:
Deriving salient learners' mispronunciations from cross-language phonological comparisons. ASRU 2007: 437-442 - [c86]Zhifeng Li, Dahua Lin, Helen M. Meng, Xiaoou Tang:
Discriminant Mutual Subspace Learning for Indoor and Outdoor Face Recognition. CVPR 2007 - [c85]Bin Ma, Helen M. Meng, Man-Wai Mak:
Effects of Device Mismatch, Language Mismatch and Environmental Mismatch on Speaker Verification. ICASSP (4) 2007: 301-304 - [c84]Henry Pak-Sum Hui, Helen M. Meng, Man-Wai Mak:
Adaptive Weight Estimation in Multi-Biometric Verification using Fuzzy Logic Decision Fusion. ICASSP (1) 2007: 501-504 - [c83]Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai:
Head Movement Synthesis Based on Semantic and Prosodic Features for a Chinese Expressive Avatar. ICASSP (4) 2007: 837-840 - [c82]Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:
High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling. INTERSPEECH 2007: 762-765 - [c81]Pui-Yu Hui, Zhengyu Zhou, Helen M. Meng:
Complementarity and redundancy in multimodal user inputs with speech and pen gestures. INTERSPEECH 2007: 2205-2208 - [c80]Shing-kai Chan, Lei Xie, Helen M. Meng:
Modeling the statistical behavior of lexical chains to capture word cohesiveness for automatic story segmentation. INTERSPEECH 2007: 2581-2584 - [c79]Lei Xie, Chuan Liu, Helen Meng:
Combined Use of Speaker- and Tone-Normalized Pitch Reset with Pause Duration for Automatic Story Segmentation in Mandarin Broadcast News. HLT-NAACL (Short Papers) 2007: 193-196 - 2006
- [c78]Zhengyu Zhou, Jianfeng Gao, Frank K. Soong, Helen Meng:
A Comparative Study of Discriminative Methods for Reranking LVCSR N-Best Hypotheses in Domain Adaptation and Generalization. ICASSP (1) 2006: 141-144 - [c77]Zhiyong Wu, Lianhong Cai, Helen M. Meng:
Multi-level Fusion of Audio and Visual Features for Speaker Identification. ICB 2006: 493-499 - [c76]Pui-Yu Hui, Helen M. Meng:
Joint interpretation of input speech and pen gestures for multimodal human-computer interaction. INTERSPEECH 2006 - [c75]Zhiyong Wu, Shen Zhang, Lianhong Cai, Helen M. Meng:
Real-time synthesis of Chinese visual speech and facial expressions using MPEG-4 FAP features in a three-dimensional avatar. INTERSPEECH 2006 - [c74]Hongwu Yang, Helen M. Meng, Lianhong Cai:
Modeling the acoustic correlates of expressive elements in text genres for expressive text-to-speech synthesis. INTERSPEECH 2006 - [c73]Zhengyu Zhou, Helen M. Meng, Wai Kit Lo:
A multi-pass error detection and correction framework for Mandarin LVCSR. INTERSPEECH 2006 - [c72]Zhiyong Wu, Helen M. Meng, Hui Ning, Sam C. Tse:
A Corpus-Based Approach for Cooperative Response Generation in a Dialog System. ISCSLP (Selected Papers) 2006: 614-626 - [c71]Lei Xie, Helen Meng, Zhi-Qiang Liu:
A Cantonese Speech-Driven Talking Face Using Translingual Audio-to-Visual Conversion. ISCSLP (Selected Papers) 2006: 627-639 - [c70]Devon Li, Wai Kit Lo, Helen M. Meng:
Initial Experiments on Automatic Story Segmentation in Chinese Spoken Documents Using Lexical Cohesion of Extracted Named Entities. ISCSLP (Selected Papers) 2006: 693-703 - [c69]Kui Xu, Helen M. Meng, Fuliang Weng:
A Maximum Entropy Framework that Integrates Word Dependencies and Grammatical Relations for Reading Comprehension. HLT-NAACL 2006 - [c68]Hongwu Yang, Helen M. Meng, Zhiyong Wu, Lianhong Cai:
Modelling the Global acoustic Correlates of Expressivity for Chinese Text-to-speech Synthesis. SLT 2006: 138-141 - 2005
- [c67]Tien Ying Fung, Yuk-Chi Li, Eddie Sio, Icarus Lee, Helen M. Meng, P. C. Ching:
Embedded Cantonese TTS for multi-device access to web content. INTERSPEECH 2005: 2601-2604 - [c66]Yongping Du, Helen Meng, Xuanjing Huang, Lide Wu:
The Use of Metadata, Web-derived Answer Patterns and Passage Context to Improve Reading Comprehension Performance. HLT/EMNLP 2005: 604-611 - [e2]Gary Geunbae Lee, Akio Yamada, Helen Meng, Sung-Hyon Myaeng:
Information Retrieval Technology, Second Asia Information Retrieval Symposium, AIRS 2005, Jeju Island, Korea, October 13-15, 2005, Proceedings. Lecture Notes in Computer Science 3689, Springer 2005, ISBN 3-540-29186-5 - 2004
- [j15]Helen M. Meng, Berlin Chen, Sanjeev Khudanpur, Gina-Anne Levow, Wai Kit Lo, Douglas W. Oard, Patrick Schone, Karen Tang, Hsin-Min Wang, Jianqiang Wang:
Mandarin-English Information (MEI): investigating translingual speech retrieval. Comput. Speech Lang. 18(2): 163-179 (2004) - [j14]Wai-Kit Lo, Helen M. Meng, P. C. Ching:
Multi-Scale Spoken Document Retrieval for Cantonese Broadcast News. Int. J. Speech Technol. 7(2-3): 203-219 (2004) - [j13]Helen M. Meng, P. C. Ching, Shuk Fong Chan, Yee Fong Wong, Cheong Chat Chan:
ISIS: an adaptive, trilingual conversational system with interleaving interaction and delegation dialogs. ACM Trans. Comput. Hum. Interact. 11(3): 268-299 (2004) - [c65]Kui Xu, Helen M. Meng:
Using Verb Dependency Matching in a Reading Comprehension System. AIRS 2004: 190-201 - [c64]Bin Ma, Helen Meng:
English-Chinese bilingual text-independent speaker verification. ICASSP (5) 2004: 293-296 - [c63]Jianqing Wang, Ka-Ho Wong, Pheng-Ann Heng, Helen Mei-Ling Meng, Tien-Tsin Wong:
A real-time Cantonese text-to-audiovisual speech synthesizer. ICASSP (1) 2004: 653-656 - [c62]Helen Meng, Yuk-Chi Li, Tien Ying Fung, Kon Fan Low, Ka-Fai Chow, Tin Hang Lo, Man Cheuk Ho, P. C. Ching:
Bilingual Chinese/English voice browsing based on a VoiceXML platform. ICASSP (3) 2004: 769-772 - [c61]Cheung-Chi Leung, Yiu Sang Moon, Helen Meng:
A Pruning Approach for GMM-Based Speaker Verification in Mobile Embedded Systems. ICBA 2004: 607-613 - [c60]Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu Sang Moon, Yeung Yam:
Fuzzy logic decision fusion in a multimodal biometric system. INTERSPEECH 2004: 261-264 - [c59]Zhengyu Zhou, Helen M. Meng:
A two-level schema for detecting recognition errors. INTERSPEECH 2004: 449-452 - [c58]Zhengyu Zhou, Helen Meng:
Error identification for large vocabulary speech recognition. ISCSLP 2004: 21-24 - [c57]Wing Lin Yip, Helen M. Meng:
Bilingual response generation using semi-automatically-induced templates for a mixed-initiative dialog system. ISCSLP 2004: 193-196 - [c56]Tien Ying Fung, Yuk-Chi Li, Helen M. Meng, P. C. Ching:
Prosody and style controls in CU VOCAL using SSML and SAPI XML tags. ISCSLP 2004: 209-212 - [c55]Joyce Y. C. Chan, P. C. Ching, Tan Lee, Helen M. Meng:
Detection of language boundary in code-switching utterances by bi-phone probabilities. ISCSLP 2004: 293-296 - 2003
- [j12]Wai Kit Lo, Helen M. Meng, P. C. Ching:
Cross-language spoken document retrieval using HMM-based retrieval model with multi-scale fusion. ACM Trans. Asian Lang. Inf. Process. 2(1): 1-26 (2003) - [j11]Helen M. Meng, Carmen Wai, Roberto Pieraccini:
The use of belief networks for mixed-initiative dialog modeling. IEEE Trans. Speech Audio Process. 11(6): 757-773 (2003) - [c54]Pui-Yu Hui, Wai Kit Lo, Helen M. Meng:
Multimedia fusion in automatic extraction of studio speech segments for spoken document retrieval. ICASSP (5) 2003: 724-727 - [c53]Helen M. Meng, Yuk-Chi Li, Tien Ying Fung, Man Cheuk Ho, Chi-Kin Keung, Tin Hang Lo, Wai Kit Lo, P. C. Ching:
Recent enhancements in CU VOCAL for Chinese TTS-enabled applications. INTERSPEECH 2003: 1253-1256 - [c52]Helen M. Meng, Wing Lin Yip, Oi Yan Mok, Shuk Fong Chan:
Natural language response generation in mixed-initiative dialogs using task goals and dialog acts. INTERSPEECH 2003: 1689-1692 - [c51]Wai Kit Lo, Yuk-Chi Li, Gina-Anne Levow, Hsin-Min Wang, Helen M. Meng:
Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval. INTERSPEECH 2003: 2337-2340 - [c50]Kai-Chung Siu, Helen M. Meng, Chin-Chung Wong:
Example-based bi-directional Chinese-English machine translation with semi-automatically induced grammars. INTERSPEECH 2003: 2801-2804 - [c49]Helen M. Meng, Tin Hang Lo, Chi-Kin Keung, Man Cheuk Ho, Wai Kit Lo, P. C. Ching:
CU VOCAL Web Service: A Text-to-speech Synthesis Web Service for Voice-enabled Web-mediated Applications. WWW (Posters) 2003 - 2002
- [j10]Helen M. Meng, Steven Lee, Carmen Wai:
Intelligent speech for information systems: towards biliteracy and trilingualism. Interact. Comput. 14(4): 327-339 (2002) - [j9]Tan Lee, Wai Kit Lo, P. C. Ching, Helen M. Meng:
Spoken language resources for Cantonese speech processing. Speech Commun. 36(3-4): 327-342 (2002) - [j8]Helen M. Meng, Po-Chui Luk, Kui Xu, Fuliang Weng:
GLR parsing with multiple grammars for natural language queries. ACM Trans. Asian Lang. Inf. Process. 1(2): 123-144 (2002) - [j7]Eric Chang, Frank Seide, Helen M. Meng, Zhuoran Chen, Yu Shi, Yuk-Chi Li:
A system for spoken query information retrieval on mobile devices. IEEE Trans. Speech Audio Process. 10(8): 531-541 (2002) - [j6]Helen M. Meng, Kai-Chung Siu:
Semiautomatic Acquisition of Semantic Structures for Understanding Domain-Specific Natural Language Queries. IEEE Trans. Knowl. Data Eng. 14(1): 172-181 (2002) - [c48]Wai Kit Lo, Helen M. Meng, P. C. Ching:
Multi-scale and multi-model integration for improved performance in Chinese spoken document retrieval. INTERSPEECH 2002: 1513-1516 - [c47]Helen M. Meng, Chi-Kin Keung, Kai-Chung Siu, Tien Ying Fung, P. C. Ching:
CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects. INTERSPEECH 2002: 2373-2376 - [c46]Helen M. Meng, P. C. Ching, Yee Fong Wong, Cheong Chat Chan:
ISIS: a multi-modal, trilingual, distributed spoken dialog system developed with CORBA, Java, XML and KQML. INTERSPEECH 2002: 2561-2564 - [c45]Tien Ying Fung, Helen Meng:
The effect of tonal context on Cantonese concatenative speech synthesis. ISCSLP 2002 - [c44]Helen Meng:
Intelligent speech for information systems (ISIS): a multi-modal, trilingual, distributed conversational system with combined interaction and delegation dialogs. ISCSLP 2002 - [c43]Bonnie Mok, Helen M. Meng:
Improvements on a belief network framework for natural language understanding of domain-specific Chinese queries. ISCSLP 2002 - 2001
- [j5]W. Lam, Helen M. Meng, K. L. Wong, Jerome Chih Hung Yen:
Using contextual analysis for news event detection. Int. J. Intell. Syst. 16(4): 525-546 (2001) - [j4]Helen Meng:
A hierarchical lexical representation for bi-directional spelling-to-pronunciation/pronunciation-to-spelling generation. Speech Commun. 33(3): 213-239 (2001) - [c42]Pui-Yu Hui, Xiaoou Tang, Helen M. Meng, Wai Lam, Xinbo Gao:
Automatic Story Segmentation for Spoken Document Retrieval. FUZZ-IEEE 2001: 1319-1322 - [c41]Carmen Wai, Roberto Pieraccini, Helen M. Meng:
A dynamic semantic model for re-scoring recognition hypotheses. ICASSP 2001: 589-592 - [c40]Hsin-Min Wang, Helen M. Meng, Patrick Schone, Berlin Chen, Wai-Kit Lo:
Multi-scale-audio indexing for translingual spoken document retrieval. ICASSP 2001: 605-608 - [c39]Helen M. Meng, Xiaoou Tang, Pui-Yu Hui, Xinbo Gao, Yuk-Chi Li:
Speech retrieval with video parsing for television news programs. ICASSP 2001: 1401-1404 - [c38]Kui Xu, Fuliang Weng, Helen M. Meng, Po-Chui Luk:
Multi-parser architecture for query processing. INTERSPEECH 2001: 1077-1080 - [c37]Wai Kit Lo, Patrick Schone, Helen M. Meng:
Multi-scale retrieval in MEI: an English-Chinese translingual speech retrieval system. INTERSPEECH 2001: 1303-1306 - [c36]Helen M. Meng, Shuk Fong Chan, Yee Fong Wong, Cheong Chat Chan, Yiu Wing Wong, Tien Ying Fung, Wai Ching Tsui, Ke Chen, Lan Wang, Ting-Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, P. C. Ching, Huisheng Chi:
ISIS: a learning system with combined interaction and delegation dialogs. INTERSPEECH 2001: 1551-1554 - [c35]Kai-Chung Siu, Helen M. Meng:
Semi-automatic grammar induction for bi-directional English-Chinese machine translation. INTERSPEECH 2001: 2749-2752 - [c34]Po-Chui Luk, Fuliang Weng, Helen M. Meng:
Automatic Grammar Partitioning for Syntactic Parsing. IWPT 2001 - [c33]Kin Hui, Wai Lam, Helen M. Meng:
Automatic event generation from multi-lingual news stories. JCDL 2001: 23-24 - [c32]Helen M. Meng, Berlin Chen, Sanjeev Khudanpur, Gina-Anne Levow, Wai-Kit Lo, Douglas W. Oard, Patrick Schone, Karen Tang, Hsin-Min Wang, Jianqiang Wang:
Mandarin-English Information: Investigating Translingual Speech Retrieval. HLT 2001 - [c31]Carmen Wai, Helen M. Meng, Roberto Pieraccini:
Scalability and Portability of a Belief Network-based Dialog Model for Different Application Domains. HLT 2001 - [c30]Chin-Chung Wong, Helen M. Meng, Kai-Chung Siu:
Learning Strategies In A Grammar Induction Framework. NLPRS 2001: 153-157 - [c29]Wai Kit Lo, P. C. Ching, Tan Lee, Helen Meng:
Design, Compilation and Processing of CUCall: A Set of Cantonese Spoken Language Corpora Collected Over Telephone Networks. ROCLING 2001 - 2000
- [j3]Helen M. Meng:
Initial Development Towards a Trilingual Speech Interface for Financial Information Inquiries. Int. J. Speech Technol. 3(2): 83-91 (2000) - [j2]Helen M. Meng:
HCI and the 3C convergence. ACM SIGCHI Bull. 32(1): 79-81 (2000) - [c28]Helen M. Meng:
Intelligent speech for information systems: towards biliteracy and trilingualism. CUU 2000: 91-95 - [c27]Tien Ying Fung, Helen M. Meng:
Concatenating syllables for response generation in spoken language applications. ICASSP 2000: 933-936 - [c26]Helen M. Meng, Steven Lee, Carmen Wai:
CU FOREX: a bilingual spoken dialog system for foreign exchange enquiries. ICASSP 2000: 1229-1232 - [c25]Helen M. Meng, Wai Kit Lo, Yuk-Chi Li, P. C. Ching:
Multi-scale audio indexing for Chinese spoken document retrieval. INTERSPEECH 2000: 101-104 - [c24]Helen M. Meng, Carmen Wai, Roberto Pieraccini:
The use of belief networks for mixed-initiative dialog modeling. INTERSPEECH 2000: 106-109 - [c23]Helen M. Meng, Shuk Fong Chan, Yee Fong Wong, Tien Ying Fung, Wai Ching Tsui, Tin Hang Lo, Cheong Chat Chan, Ke Chen, Lan Wang, Ting-Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, Yiu Wing Wong, P. C. Ching, Huisheng Chi:
ISIS: A multilingual spoken dialog system developed with CORBA and KQML agents. INTERSPEECH 2000: 150-153 - [c22]Po-Chui Luk, Helen M. Meng, Filung Wang:
Grammar partitioning and parser composition for natural language understanding. INTERSPEECH 2000: 486-489 - [c21]Yuk-Chi Li, Wai Kit Lo, Helen M. Meng, P. C. Ching:
Query expansion using phonetic confusions for Chinese spoken document retrieval. IRAL 2000: 89-93 - [c20]Wai-Kit Lo, Helen M. Meng, P. C. Ching:
Sub-Syllabic Acoustic Modeling Across Chinese Dialects. ISCSLP 2000 - [c19]Helen M. Meng, Wai Ching Tsui:
Comprehension Across Application Domains and Languages. ISCSLP 2000 - [c18]Fuliang Weng, Helen Meng, Po-Chui Luk:
Parsing a Lattice with Multiple Grammars. IWPT 2000: 266-277 - [e1]Kwong-Sak Leung, Lai-Wan Chan, Helen Meng:
Intelligent Data Engineering and Automated Learning - IDEAL 2000, Data Mining, Financial Engineering, and Intelligent Agents, Second International Conference, Shatin, N.T. Hong Kong, China, December 13-15, 2000, Proceedings. Lecture Notes in Computer Science 1983, Springer 2000, ISBN 3-540-41450-9 [contents]
1990 – 1999
- 1999
- [c17]Tan Lee, Helen M. Meng, Wai H. Lau, Wai Kit Lo, P. C. Ching:
Micro-prosodic control in Cantonese text-to-speech synthesis. EUROSPEECH 1999: 1855-1858 - [c16]Helen M. Meng, Wai Lam, Carmen Wai:
To believe is to understand. EUROSPEECH 1999: 2015-2018 - [c15]Kai-Chung Siu, Helen M. Meng:
Semi-automatic acquisition of domain-specific semantic structures. EUROSPEECH 1999: 2039-2042 - [c14]Helen Meng, Wah-Hing Ip:
An Analytical Study of Transformational Tagging for Chinese Text. ROCLING (2) 1999: 101-122 - 1997
- [c13]Chao Wang, James R. Glass, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Victor W. Zue:
YINHE: a Mandarin Chinese version of the GALAXY system. EUROSPEECH 1997: 351-354 - [c12]Victor W. Zue, Stephanie Seneff, James R. Glass, I. Lee Hetherington, Edward Hurley, Helen M. Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid:
From interface to content: translingual access and delivery of on-line information. EUROSPEECH 1997: 2227-2230 - 1996
- [j1]Helen M. Meng, Sheri Hunnicutt, Stephanie Seneff, Victor Zue:
Reversible letter-to-sound/sound-to-letter generation based on parsing word morphology. Speech Commun. 18(1): 47-63 (1996) - [c11]Stephanie Seneff, Raymond Lau, Helen M. Meng:
ANGIE: a new framework for speech analysis based on morpho-phonological modelling. ICSLP 1996: 110-113 - [c10]Helen M. Meng, Senis Busayapongchai, James R. Glass, David Goddeau, I. Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue:
WHEELS: a conversational system in the automobile classifieds domain. ICSLP 1996: 542-545 - [c9]David Goddeau, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Senis Busayapongchai:
A form-based dialogue manager for spoken language applications. ICSLP 1996: 701-704 - [c8]Victor Zue, Stephanie Seneff, Joseph Polifroni, Helen M. Meng, James R. Glass:
Multilingual human-computer interactions: from information access to language learning. ICSLP 1996: 2207-2210 - 1995
- [b1]Helen M. Meng:
Phonological parsing for bi-directional letter-to-sound/sound-to-letter generation. Massachusetts Institute of Technology, Cambridge, MA, USA, 1995 - 1994
- [c7]Helen M. Meng, Stephanie Seneff, Victor W. Zue:
Phonological parsing for reversible letter-to-sound/sound-to-letter generation. ICASSP (2) 1994: 1-4 - [c6]Helen M. Meng, Stephanie Seneff, Victor Zue:
Phonological Parsing for Bi-directional Letter-to-Sound/Sound-to-Letter Generation. HLT 1994 - 1993
- [c5]Sheri Hunnicutt, Helen M. Meng, Stephanie Seneff, Victor W. Zue:
Reversible letter-to-sound/sound-to-letter generation based on parsing word morphology. EUROSPEECH 1993: 763-766 - 1992
- [c4]Stephanie Seneff, Helen M. Meng, Victor Zue:
Language modelling for recognition and understanding using layered bigrams. ICSLP 1992: 317-320 - 1991
- [c3]Helen M. Meng, Victor W. Zue:
Signal representation comparison for phonetic classification. ICASSP 1991: 285-288 - [c2]Helen M. Meng, Victor Zue, Hong C. Leung:
Signal Representation, Attribute Extraction and the Use of Distinctive Features for Phonetic Classification. HLT 1991 - 1990
- [c1]Helen M. Meng, Victor W. Zue:
A comparative study of acoustic representations of speech for vowel classification using multi-layer perceptrons. ICSLP 1990: 1053-1056
last updated on 2024-10-22 20:12 CEST by the dblp team
all metadata released as open data under CC0 1.0 license