Xugang Lu
2020 – today
- 2024
- [j30]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu:
Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3603-3617 (2024) - [j29]Hongcheng Zhang, Wenhuan Lu, Jianguo Wei, Xiangdong Huang, Xiaokang Yang, Xugang Lu:
Efficient Singular Spectrum Mode Ensemble for Extracting Wide-Band Components in Overlapping Spectral Environments. IEEE Trans. Signal Process. 72: 4666-4681 (2024) - [c107]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Wenhuan Lu, Di Jin, Junhai Xu:
Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion Recognition. ICASSP 2024: 10466-10470 - [c106]Yuxuan Li, Jianguo Wei, Qiang Fang, Xugang Lu:
Evaluation of an Improved Ultrasonic Imaging Helmet for Observing Articulatory Data. ICASSP 2024: 12762-12766 - [c105]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-Based ASR. ICASSP 2024: 13116-13120 - [i36]Cho-Yuan Lee, Kuan-Chen Wang, Kai-Chun Liu, Xugang Lu, Ping-Cheng Yeh, Yu Tsao:
A Non-Intrusive Neural Quality Assessment Model for Surface Electromyography Signals. CoRR abs/2402.05482 (2024) - [i35]Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu:
Robust Channel Learning for Large-Scale Radio Speaker Verification. CoRR abs/2406.10956 (2024) - [i34]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR. CoRR abs/2409.02239 (2024) - [i33]Wenhao Yang, Jianguo Wei, Wenhuan Lu, Xugang Lu, Lei Li:
Integrated Multi-Level Knowledge Distillation for Enhanced Speaker Verification. CoRR abs/2409.09389 (2024) - [i32]Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu:
Channel Adaptation for Speaker Verification Using Optimal Transport with Pseudo Label. CoRR abs/2409.09396 (2024) - [i31]Kuo-Hsuan Hung, Kuan-Chen Wang, Kai-Chun Liu, Wei-Lun Chen, Xugang Lu, Yu Tsao, Chii-Wann Lin:
MECG-E: Mamba-based ECG Enhancer for Baseline Wander Removal. CoRR abs/2409.18828 (2024) - 2023
- [j28]Kai Li, Xugang Lu, Masato Akagi, Masashi Unoki:
Contributions of Jitter and Shimmer in the Voice for Fake Audio Detection. IEEE Access 11: 84689-84698 (2023) - [j27]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu, Jianwu Dang:
TMS: Temporal multi-scale in time-delay neural network for speaker verification. Appl. Intell. 53(22): 26497-26517 (2023) - [j26]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Yantao Ji, Junhai Xu:
Self-supervised learning based domain regularization for mask-wearing speaker verification. Speech Commun. 152: 102953 (2023) - [c104]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Cross-Modal Alignment With Optimal Transport For CTC-Based ASR. ASRU 2023: 1-7 - [c103]Kai Li, Dung Kim Tran, Xugang Lu, Masato Akagi, Masashi Unoki:
Data-driven Non-uniform Filterbanks Based on F-ratio for Machine Anomalous Sound Detection. EUSIPCO 2023: 201-205 - [c102]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu:
Optimal Transport with a Diversified Memory Bank for Cross-Domain Speaker Verification. ICASSP 2023: 1-5 - [c101]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Junhai Xu, Di Jin, Jianhua Tao:
SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition. INTERSPEECH 2023: 1858-1862 - [c100]Yang Liu, Haoqin Sun, Geng Chen, Qingyue Wang, Zhen Zhao, Xugang Lu, Longbiao Wang:
Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions. INTERSPEECH 2023: 1893-1897 - [i30]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Cross-modal Alignment with Optimal Transport for CTC-based ASR. CoRR abs/2309.13650 (2023) - [i29]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR. CoRR abs/2309.16093 (2023) - [i28]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Neural domain alignment for spoken language recognition based on optimal transport. CoRR abs/2310.13471 (2023) - [i27]Peng Shen, Xugang Lu, Hisashi Kawai:
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition. CoRR abs/2312.10959 (2023) - [i26]Yang Liu, Haoqin Sun, Geng Chen, Qingyue Wang, Zhen Zhao, Xugang Lu, Longbiao Wang:
Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions. CoRR abs/2312.13556 (2023) - 2022
- [j25]Tassadaq Hussain, Wei-Chien Wang, Mandar Gogate, Kia Dashtipour, Yu Tsao, Xugang Lu, Ahsan Adeel, Amir Hussain:
A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement. IEEE Trans. Artif. Intell. 3(5): 833-842 (2022) - [c99]Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki:
Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network. EUSIPCO 2022: 379-383 - [c98]Ruiteng Zhang, Jianguo Wei, Wenhuan Lu, Lin Zhang, Yantao Ji, Junhai Xu, Xugang Lu:
CS-REP: Making Speaker Verification Networks Embracing Re-Parameterization. ICASSP 2022: 7082-7086 - [c97]Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki:
Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection. INTERSPEECH 2022: 664-668 - [c96]Peng Shen, Xugang Lu, Hisashi Kawai:
Transducer-based language embedding for spoken language identification. INTERSPEECH 2022: 3724-3728 - [c95]Rong Chao, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao:
Perceptual Contrast Stretching on Target Feature for Speech Enhancement. INTERSPEECH 2022: 5448-5452 - [c94]Peng Shen, Xugang Lu, Hisashi Kawai:
Pronunciation-Aware Unique Character Encoding for RNN Transducer-Based Mandarin Speech Recognition. SLT 2022: 123-129 - [c93]Ginji Hayashi, Shigeru Katagiri, Xugang Lu, Miho Ohsaki:
An Investigation of Feature Difference Between Child and Adult Voices Using Line Spectral Pairs. SPML 2022: 94-100 - [i25]Tassadaq Hussain, Wei-Chien Wang, Mandar Gogate, Kia Dashtipour, Yu Tsao, Xugang Lu, Ahsan Adeel, Amir Hussain:
A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement. CoRR abs/2201.09913 (2022) - [i24]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Junhai Xu, Lin Zhang, Yantao Ji, Jianwu Dang:
TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding. CoRR abs/2203.09098 (2022) - [i23]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Partial Coupling of Optimal Transport for Spoken Language Identification. CoRR abs/2203.17036 (2022) - [i22]Rong Chao, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao:
Perceptual Contrast Stretching on Target Feature for Speech Enhancement. CoRR abs/2203.17152 (2022) - [i21]Peng Shen, Xugang Lu, Hisashi Kawai:
Transducer-based language embedding for spoken language identification. CoRR abs/2204.03888 (2022) - [i20]Peng Shen, Xugang Lu, Hisashi Kawai:
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition. CoRR abs/2207.14578 (2022) - 2021
- [j24]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Coupling a Generative Model With a Discriminative Learning Framework for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3631-3641 (2021) - [c92]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification. APSIPA ASC 2021: 769-774 - [c91]Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Xugang Lu, Yu Tsao:
A Study of Incorporating Articulatory Movement Information in Speech Enhancement. EUSIPCO 2021: 496-500 - [c90]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Unsupervised Neural Adaptation Model Based on Optimal Transport for Spoken Language Identification. ICASSP 2021: 7213-7217 - [c89]Tsun-An Hsieh, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao:
Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement. Interspeech 2021: 196-200 - [c88]Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao:
MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement. Interspeech 2021: 201-205 - [c87]Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Wen-Chin Huang, Xugang Lu, Yu Tsao:
EMA2S: An End-to-End Multimodal Articulatory-to-Speech System. ISCAS 2021: 1-5 - [c86]Hsin-Yi Lin, Huan-Hsin Tseng, Xugang Lu, Yu Tsao:
Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport. NeurIPS 2021: 19935-19946 - [i19]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Integrating a joint Bayesian generative model in a discriminative learning framework for speaker verification. CoRR abs/2101.03329 (2021) - [i18]Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Wen-Chin Huang, Xugang Lu, Yu Tsao:
EMA2S: An End-to-End Multimodal Articulatory-to-Speech System. CoRR abs/2102.03786 (2021) - [i17]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification. CoRR abs/2104.03004 (2021) - [i16]Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao:
MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement. CoRR abs/2104.03538 (2021) - [i15]Ruiteng Zhang, Jianguo Wei, Wenhuan Lu, Lin Zhang, Yantao Ji, Junhai Xu, Xugang Lu:
CS-Rep: Making Speaker Verification Networks Embracing Re-parameterization. CoRR abs/2110.13465 (2021) - [i14]Hsin-Yi Lin, Huan-Hsin Tseng, Xugang Lu, Yu Tsao:
Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport. CoRR abs/2111.06316 (2021) - 2020
- [j23]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement. IEEE Signal Process. Lett. 27: 2149-2153 (2020) - [j22]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2674-2683 (2020) - [j21]Cheng Yu, Ryandhimas E. Zezario, Syu-Siang Wang, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2756-2769 (2020) - [c85]Haipeng Sun, Rui Wang, Kehai Chen, Xugang Lu, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao:
Robust Unsupervised Neural Machine Translation with Adversarial Denoising Training. COLING 2020: 4239-4250 - [c84]Ryandhimas E. Zezario, Tassadaq Hussain, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement. ICASSP 2020: 6669-6673 - [c83]Peng Shen, Xugang Lu, Hisashi Kawai:
Investigation of NICT Submission for Short-Duration Speaker Verification Challenge 2020. INTERSPEECH 2020: 751-755 - [c82]Yen-Ju Lu, Chien-Feng Liao, Xugang Lu, Jeih-weih Hung, Yu Tsao:
Incorporating Broad Phonetic Information for Speech Enhancement. INTERSPEECH 2020: 2417-2421 - [c81]Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai:
Compensation on x-vector for Short Utterance Spoken Language Identification. Odyssey 2020: 47-52 - [c80]Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai:
Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. Odyssey 2020: 385-390 - [p1]Xugang Lu, Sheng Li, Masakiyo Fujimoto:
Automatic Speech Recognition. Speech-to-Speech Translation 2020: 21-38 - [i13]Cheng Yu, Ryandhimas E. Zezario, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders. CoRR abs/2001.01538 (2020) - [i12]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement. CoRR abs/2004.04098 (2020) - [i11]Yen-Ju Lu, Chien-Feng Liao, Xugang Lu, Jeih-weih Hung, Yu Tsao:
Incorporating Broad Phonetic Information for Speech Enhancement. CoRR abs/2008.07618 (2020) - [i10]Tsun-An Hsieh, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao:
Improving Perceptual Quality by Phone-Fortified Perceptual Loss for Speech Enhancement. CoRR abs/2010.15174 (2020) - [i9]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Unsupervised neural adaptation model based on optimal transport for spoken language identification. CoRR abs/2012.13152 (2020)
2010 – 2019
- 2019
- [c79]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Interactive Learning of Teacher-student Model for Short Utterance Spoken Language Identification. ICASSP 2019: 5981-5985 - [c78]Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition. INTERSPEECH 2019: 2145-2149 - [c77]Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese. INTERSPEECH 2019: 2200-2204 - [c76]Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai:
Incorporating Symbolic Sequential Modeling for Speech Enhancement. INTERSPEECH 2019: 2733-2737 - [c75]Ryandhimas E. Zezario, Szu-Wei Fu, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric. INTERSPEECH 2019: 3168-3172 - [c74]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection. INTERSPEECH 2019: 3614-3618 - [c73]Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation. INTERSPEECH 2019: 4400-4404 - [c72]Yuya Tomotoshi, David Ha, Emilie Delattre, Hideyuki Watanabe, Xugang Lu, Shigeru Katagiri, Miho Ohsaki:
Optimal Classifier Parameter Status Selection Based on Bayes Boundary-ness for Multi-ProtoType and Multi-Layer Perceptron Classifiers. IUKM 2019: 295-307 - [i8]Chien-Feng Liao, Yu Tsao, Xugang Lu, Hisashi Kawai:
Incorporating Symbolic Sequential Modeling for Speech Enhancement. CoRR abs/1904.13142 (2019) - [i7]Natalie Yu-Hsien Wang, Hsiao-Lan Sharon Wang, Taowei Wang, Szu-Wei Fu, Xugang Lu, Yu Tsao, Hsin-Min Wang:
Improving the Intelligibility of Electric and Acoustic Stimulation Speech Using Fully Convolutional Networks Based Speech Enhancement. CoRR abs/1909.11912 (2019) - [i6]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Deep progressive multi-scale attention for acoustic event classification. CoRR abs/1912.12011 (2019) - 2018
- [j20]Jianguo Wei, Yan Ji, Jingshu Zhang, Qiang Fang, Wenhuan Lu, Kiyoshi Honda, Xugang Lu:
Study of articulators' contribution and compensation during speech by articulatory speech recognition. Multim. Tools Appl. 77(14): 18849-18864 (2018) - [j19]Szu-Wei Fu, Taowei Wang, Yu Tsao, Xugang Lu, Hisashi Kawai:
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 26(9): 1570-1584 (2018) - [c71]Ryandhimas E. Zezario, Jen-Wei Huang, Xugang Lu, Yu Tsao, Hsin-Te Hwang, Hsin-Min Wang:
Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. APSIPA 2018: 373-377 - [c70]Wei-Jen Lee, Syu-Siang Wang, Fei Chen, Xugang Lu, Shao-Yi Chien, Yu Tsao:
Speech Dereverberation Based on Integrated Deep and Ensemble Learning Algorithm. ICASSP 2018: 5454-5458 - [c69]Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai:
Temporal Attentive Pooling for Acoustic Event Detection. INTERSPEECH 2018: 1354-1357 - [c68]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification. INTERSPEECH 2018: 1813-1817 - [c67]Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks. INTERSPEECH 2018: 3708-3712 - [c66]Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai:
Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems. SLT 2018: 77-83 - [i5]Wei-Jen Lee, Syu-Siang Wang, Fei Chen, Xugang Lu, Shao-Yi Chien, Yu Tsao:
Speech Dereverberation Based on Integrated Deep and Ensemble Learning. CoRR abs/1801.04052 (2018) - 2017
- [j18]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Regularization of neural network model with distance metric learning for i-vector based spoken language identification. Comput. Speech Lang. 44: 48-60 (2017) - [j17]Shota Morita, Xugang Lu, Masashi Unoki, Masato Akagi:
Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection. J. Inf. Hiding Multim. Signal Process. 8(6): 1446-1459 (2017) - [j16]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models. IEEE ACM Trans. Audio Speech Lang. Process. 25(5): 1023-1034 (2017) - [j15]Ying-Hui Lai, Fei Chen, Syu-Siang Wang, Xugang Lu, Yu Tsao, Chin-Hui Lee:
A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation. IEEE Trans. Biomed. Eng. 64(7): 1568-1578 (2017) - [c65]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
Raw waveform-based speech enhancement by fully convolutional networks. APSIPA 2017: 6-12 - [c64]Sheng Li, Xugang Lu, Peng Shen, Ryoichi Takashima, Tatsuya Kawahara, Hisashi Kawai:
Incremental training and constructing the very deep convolutional residual network acoustic models. ASRU 2017: 222-227 - [c63]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Minimum Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework. ICASSP 2017: 4855-4859 - [c62]Sheng Li, Xugang Lu, Shinsuke Sakai, Masato Mimura, Tatsuya Kawahara:
Semi-supervised ensemble DNN acoustic model training. ICASSP 2017: 5270-5274 - [c61]Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai:
Conditional Generative Adversarial Nets Classifier for Spoken Language Identification. INTERSPEECH 2017: 2814-2818 - [c60]Szu-Wei Fu, Ting-Yao Hu, Yu Tsao, Xugang Lu:
Complex spectrogram enhancement by convolutional neural network with multi-metrics learning. MLSP 2017: 1-6 - [i4]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
Raw Waveform-based Speech Enhancement by Fully Convolutional Networks. CoRR abs/1703.02205 (2017) - [i3]Szu-Wei Fu, Ting-Yao Hu, Yu Tsao, Xugang Lu:
Multi-Metrics Learning for Speech Enhancement. CoRR abs/1704.08504 (2017) - [i2]Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai:
End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks. CoRR abs/1709.03658 (2017) - 2016
- [j14]Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, Chiori Hori, Hisashi Kawai, Shigeru Katagiri:
Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers. IEICE Trans. Inf. Syst. 99-D(10): 2431-2443 (2016) - [j13]Peng Shen, Xugang Lu, Xinhui Hu, Naoyuki Kanda, Masahiro Saiko, Chiori Hori, Hisashi Kawai:
Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription. Speech Commun. 82: 1-13 (2016) - [j12]Syu-Siang Wang, Alan Chern, Yu Tsao, Jeih-weih Hung, Xugang Lu, Ying-Hui Lai, Borching Su:
Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization. IEEE Signal Process. Lett. 23(8): 1101-1105 (2016) - [j11]Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi:
Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments. J. Signal Process. Syst. 82(2): 163-173 (2016) - [c59]Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, Hisashi Kawai, Shigeru Katagiri:
Bottleneck linear transformation network adaptation for speaker adaptive training-based hybrid DNN-HMM speech recognizer. ICASSP 2016: 5015-5019 - [c58]Peng Shen, Xugang Lu, Lemao Liu, Hisashi Kawai:
Local fisher discriminant analysis for spoken language identification. ICASSP 2016: 5825-5829 - [c57]Xiaoyun Wang, Xugang Lu, Hisashi Kawai, Seiichi Yamamoto:
F0 Contour Analysis Based on Empirical Mode Decomposition for DNN Acoustic Modeling in Mandarin Speech Recognition. INTERSPEECH 2016: 973-977 - [c56]Naoyuki Kanda, Shoji Harada, Xugang Lu, Hisashi Kawai:
Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks. INTERSPEECH 2016: 1325-1329 - [c55]Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Maximum a posteriori Based Decoding for CTC Acoustic Models. INTERSPEECH 2016: 1868-1872 - [c54]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Pair-Wise Distance Metric Learning of Neural Network Model for Spoken Language Identification. INTERSPEECH 2016: 3216-3220 - [c53]Szu-Wei Fu, Yu Tsao, Xugang Lu:
SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement. INTERSPEECH 2016: 3768-3772 - [c52]Chia-Yung Hsu, Ryandhimas E. Zezario, Jia-Ching Wang, Chin-Wen Ho, Xugang Lu, Yu Tsao:
Incorporating local environment information with ensemble neural networks to robust automatic speech recognition. ISCSLP 2016: 1-5 - [c51]Sheng Li, Xugang Lu, Shinsuke Mori, Yuya Akita, Tatsuya Kawahara:
Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data. ISCSLP 2016: 1-5 - [c50]Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
A pseudo-task design in multi-task learning deep neural network for speaker recognition. ISCSLP 2016: 1-5 - [c49]Peng Shen, Xugang Lu, Hisashi Kawai:
Comparison of regularization constraints in deep neural network based speaker adaptation. ISCSLP 2016: 1-5 - [c48]Peng Shen, Xugang Lu, Hisashi Kawai:
Automatic acoustic segmentation in N-best list rescoring for lecture speech recognition. ISCSLP 2016: 1-5 - [i1]Syu-Siang Wang, Alan Chern, Yu Tsao, Jeih-Weih Hung, Xugang Lu, Ying-Hui Lai, Borching Su:
Wavelet speech enhancement based on nonnegative matrix factorization. CoRR abs/1601.02309 (2016) - 2015
- [j10]Yu Tsao, Payton Lin, Ting-Yao Hu, Xugang Lu:
Ensemble environment modeling using affine transform group. Speech Commun. 68: 55-68 (2015) - [c47]Syu-Siang Wang, Hsin-Te Hwang, Ying-Hui Lai, Yu Tsao, Xugang Lu, Hsin-Min Wang, Borching Su:
Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm. APSIPA 2015: 365-369 - [c46]Naoyuki Kanda, Mitsuyoshi Tachimori, Xugang Lu, Hisashi Kawai:
Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling. ASRU 2015: 15-21 - [c45]Tsubasa Ochiai, Shigeki Matsuda, Hideyuki Watanabe, Xugang Lu, Chiori Hori, Shigeru Katagiri:
Speaker adaptive training for deep neural networks embedding linear transformation networks. ICASSP 2015: 4605-4609 - [c44]Xugang Lu, Peng Shen, Yu Tsao, Chiori Hori, Hisashi Kawai:
Sparse representation with temporal max-smoothing for acoustic event detection. INTERSPEECH 2015: 1176-1180 - [c43]Sheng Li, Xugang Lu, Yuya Akita, Tatsuya Kawahara:
Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation. INTERSPEECH 2015: 2892-2896 - 2014
- [j9]Yu Tsao, Xugang Lu, Paul R. Dixon, Ting-Yao Hu, Shigeki Matsuda, Chiori Hori:
Incorporating local information of the acoustic environments to MAP-based feature compensation and acoustic model adaptation. Comput. Speech Lang. 28(3): 709-726 (2014) - [c42]Hao-Teng Fan, Jeih-weih Hung, Xugang Lu, Syu-Siang Wang, Yu Tsao:
Speech enhancement using segmental nonnegative matrix factorization. ICASSP 2014: 4483-4487 - [c41]Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori:
Sparse representation based on a bag of spectral exemplars for acoustic event detection. ICASSP 2014: 6255-6259 - [c40]Tsubasa Ochiai, Shigeki Matsuda, Xugang Lu, Chiori Hori, Shigeru Katagiri:
Speaker Adaptive Training using Deep Neural Networks. ICASSP 2014: 6349-6353 - [c39]Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori:
Ensemble modeling of denoising autoencoder for speech spectrum restoration. INTERSPEECH 2014: 885-889 - [c38]Xinhui Hu, Xugang Lu, Chiori Hori:
Mandarin speech recognition using convolution neural network with augmented tone features. ISCSLP 2014: 15-18 - [c37]Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi:
Robust voice activity detection based on concept of modulation transfer function in noisy reverberant environments. ISCSLP 2014: 108-112 - [c36]Xugang Lu, Yu Tsao, Peng Shen, Chiori Hori:
Spectral patch based sparse coding for acoustic event detection. ISCSLP 2014: 317-320 - [c35]Shota Morita, Xugang Lu, Masashi Unoki:
Signal to noise ratio estimation based on an optimal design of subband voice activity detection. ISCSLP 2014: 560-564 - 2013
- [j8]Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Controlling Tradeoff Between Approximation Accuracy and Complexity of a Smooth Function in a Reproducing Kernel Hilbert Space for Noise Reduction. IEEE Trans. Signal Process. 61(3): 601-610 (2013) - [c34]Shigeki Matsuda, Xugang Lu, Hideki Kashioka:
Automatic localization of a language-independent sub-network on deep neural networks trained by multi-lingual speech. ICASSP 2013: 7359-7362 - [c33]Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori:
Speech enhancement based on deep denoising autoencoder. INTERSPEECH 2013: 436-440 - [c32]Xugang Lu, Shigeki Matsuda, Chiori Hori:
Speech spectrum restoration based on conditional restricted Boltzmann machine. INTERSPEECH 2013: 3259-3263 - [c31]Chien-Lin Huang, Paul R. Dixon, Shigeki Matsuda, Youzheng Wu, Xugang Lu, Masahiro Saiko, Chiori Hori:
The NICT ASR system for IWSLT 2013. IWSLT (Evaluation Campaign) 2013 - 2012
- [c30]Youzheng Wu, Xugang Lu, Hitoshi Yamamoto, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Factored Language Model based on Recurrent Neural Network. COLING 2012: 2835-2850 - [c29]Dongwen Ying, Xugang Lu, Junfeng Li, Yonghong Yan, Jianwu Dang, Frank K. Soong:
Noise estimation using a constrained sequential HMM in log-spectral domain. ICASSP 2012: 4553-4556 - [c28]Xugang Lu, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Speech restoration based on deep learning autoencoder with layer-wised pretraining. INTERSPEECH 2012: 1504-1507 - [c27]Xugang Lu, Masashi Unoki, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Controlling the tradeoff property in a regularization framework for noise reduction. ISCSLP 2012: 201-205 - [c26]Masashi Unoki, Xugang Lu:
Unified denoising and dereverberation method used in restoration of MTF-based power envelope. ISCSLP 2012: 215-219 - [c25]Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Acoustic space partition based on broad phonetic class for ensemble acoustic modeling. ISCSLP 2012: 311-314 - [c24]Hitoshi Yamamoto, Youzheng Wu, Chien-Lin Huang, Xugang Lu, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
The NICT ASR system for IWSLT2012. IWSLT 2012: 34-37 - [c23]Youzheng Wu, Hitoshi Yamamoto, Xugang Lu, Shigeki Matsuda, Chiori Hori, Hideki Kashioka:
Factored recurrent neural network language model in TED lecture transcription. IWSLT 2012: 222-228 - 2011
- [j7]Xugang Lu, Masashi Unoki, Satoshi Nakamura:
Sub-band temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments. Comput. Speech Lang. 25(3): 571-584 (2011) - [j6]Xugang Lu, Shigeki Matsuda, Masashi Unoki, Satoshi Nakamura:
Temporal modulation normalization for robust speech feature extraction and recognition. Multim. Tools Appl. 52(1): 187-199 (2011) - [c22]Masashi Unoki, Xugang Lu, Rico Petrick, Shota Morita, Masato Akagi, Rüdiger Hoffmann:
Voice Activity Detection in MTF-Based Power Envelope Restoration. INTERSPEECH 2011: 2609-2612 - [c21]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Adaptive Regularization Framework for Robust Voice Activity Detection. INTERSPEECH 2011: 2653-2656 - 2010
- [j5]Xugang Lu, Shigeki Matsuda, Masashi Unoki, Satoshi Nakamura:
Temporal contrast normalization and edge-preserved smoothing of temporal modulation structures of speech for robust speech recognition. Speech Commun. 52(1): 1-11 (2010) - [j4]Xugang Lu, Jianwu Dang:
Vowel Production Manifold: Intrinsic Factor Analysis of Vowel Articulation. IEEE Trans. Speech Audio Process. 18(5): 1053-1062 (2010) - [c20]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Voice activity detection in a regularized reproducing kernel Hilbert space. INTERSPEECH 2010: 3086-3089 - [c19]Xugang Lu, Masashi Unoki, Ryosuke Isotani, Hisashi Kawai, Satoshi Nakamura:
Speech enhancement as a functional approximation and generalization. ISCSLP 2010: 18-22
2000 – 2009
- 2009
- [j3]Dongwen Ying, Masashi Unoki, Xugang Lu, Jianwu Dang:
Speech Enhancement Based on Noise Eigenspace Projection. IEICE Trans. Inf. Syst. 92-D(5): 1137-1145 (2009) - [c18]Xugang Lu, Shigeki Matsuda, Masashi Unoki, Tohru Shimizu, Satoshi Nakamura:
Temporal contrast normalization and edge-preserved smoothing on temporal modulation structure for robust speech recognition. ICASSP 2009: 4573-4576 - [c17]Xugang Lu, Masashi Unoki, Satoshi Nakamura:
Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments. INTERSPEECH 2009: 2503-2506 - [c16]Xugang Lu, Masashi Unoki, Satoshi Nakamura:
Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments. IUCS 2009: 247-254 - 2008
- [j2]Xugang Lu, Jianwu Dang:
An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification. Speech Commun. 50(4): 312-322 (2008) - [c15]Rico Petrick, Xugang Lu, Masashi Unoki, Masato Akagi, Rüdiger Hoffmann:
Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics. INTERSPEECH 2008: 658-661 - [c14]Qiang Fang, Satoru Fujita, Xugang Lu, Jianwu Dang:
A model based investigation of activation patterns of the tongue muscles for vowel production. INTERSPEECH 2008: 2298-2301 - [c13]Xugang Lu, Shigeki Matsuda, Tohru Shimizu, Satoshi Nakamura:
Noise Reduction Based Random Matrix Theory. ISCSLP 2008: 285-288 - [c12]Xugang Lu, Shigeki Matsuda, Tohru Shimizu, Satoshi Nakamura:
Normalization on Temporal Modulation Transfer Function for Robust Speech Recognition. ISUC 2008: 16-23 - 2007
- [j1]Jianguo Wei, Xugang Lu, Jianwu Dang:
A Model-Based Learning Process for Modeling Coarticulation of Human Speech. IEICE Trans. Inf. Syst. 90-D(10): 1582-1591 (2007) - [c11]Xugang Lu, Jianwu Dang:
Physiological Feature Extraction for Text Independent Speaker Identification using Non-Uniform Subband Processing. ICASSP (4) 2007: 461-464 - [c10]Xugang Lu, Jianwu Dang:
Dimension reduction for speaker identification based on mutual information. INTERSPEECH 2007: 2021-2024 - 2006
- [c9]Xugang Lu, Masashi Unoki, Masato Akagi:
A robust feature extraction based on the MTF concept for speech recognition in reverberant environment. INTERSPEECH 2006 - [c8]Jianguo Wei, Xugang Lu, Jianwu Dang:
A simulation based parameter optimization for a coarticulation model. INTERSPEECH 2006 - [c7]Dongwen Ying, Yu Shi, Frank K. Soong, Jianwu Dang, Xugang Lu:
A Robust Voice Activity Detection Based on Noise Eigenspace Projection. ISCSLP (Selected Papers) 2006: 76-86 - [c6]Xugang Lu, Jianwu Dang:
Auditory Contrast Spectrum for Robust Speech Recognition. ISCSLP (Selected Papers) 2006: 325-334 - 2005
- [c5]Junfeng Li, Xugang Lu, Masato Akagi:
A noise reduction system in arbitrary noise environments and its applications to speech enhancement and speech recognition. ICASSP (3) 2005: 277-280 - 2000
- [c4]Xugang Lu, Gang Li, Lipo Wang:
Dominant subspace analysis for auditory spectrum. INTERSPEECH 2000: 59-62
1990 – 1999
- 1999
- [c3]Xugang Lu, Daowen Chen:
Integrating spatial and temporal mechanisms in auditory neural fiber's computational model. IJCNN 1999: 280-283 - [c2]Xugang Lu, Daowen Chen:
A New Cochlear Model Based on Adaptive Gain Mechanism. IWANN (1) 1999: 180-188 - [c1]Xugang Lu, Daowen Chen:
Nonlinear processing in auditory system. NSIP 1999: 301-305
last updated on 2024-12-02 22:32 CET by the dblp team
all metadata released as open data under CC0 1.0 license