Alexander H. Liu
Person information
- unicode name: 劉浩然
- affiliation: Massachusetts Institute of Technology (MIT), Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- affiliation (former): National Taiwan University, Taipei, Taiwan
2020 – today
- 2024
- [c23] Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, Hung-yi Lee:
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models. ACL (Findings) 2024: 10330-10348
- [c22] Alexander H. Liu, Sung-Lin Yeh, James R. Glass:
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective. ICASSP 2024: 12051-12055
- [c21] Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. ICLR 2024
- [c20] Alexander H. Liu, Matthew Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. ICLR 2024
- [i28] Alexander H. Liu, Sung-Lin Yeh, James R. Glass:
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective. CoRR abs/2401.08833 (2024)
- [i27] Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, Hung-yi Lee:
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models. CoRR abs/2402.13071 (2024)
- [i26] Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Ho-Lam Chung, Alexander H. Liu, Hung-yi Lee:
Towards audio language modeling - an overview. CoRR abs/2402.13236 (2024)
- [i25] Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-yi Lee:
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models. CoRR abs/2409.14085 (2024)
- [i24] Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024)
- [i23] Pin-Jui Ku, Alexander H. Liu, Roman Korostik, Sung-Feng Huang, Szu-Wei Fu, Ante Jukic:
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration. CoRR abs/2409.16117 (2024)
- 2023
- [c19] Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. ASRU 2023: 1-8
- [c18] Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. ICLR 2023
- [c17] Heng-Jui Chang, Alexander H. Liu, James R. Glass:
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering. INTERSPEECH 2023: 2983-2987
- [c16] Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. NeurIPS 2023
- [i22] Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. CoRR abs/2305.10005 (2023)
- [i21] Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. CoRR abs/2305.10790 (2023)
- [i20] Heng-Jui Chang, Alexander H. Liu, James R. Glass:
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering. CoRR abs/2305.11072 (2023)
- [i19] Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. CoRR abs/2309.14405 (2023)
- [i18] Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu:
Generative Pre-training for Speech with Flow Matching. CoRR abs/2310.16338 (2023)
- 2022
- [j3] Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: Towards Unifying Audio and Visual Models. IEEE Signal Process. Lett. 29: 2437-2441 (2022)
- [j2] Yi-Long Liou, Jui-Yang Hsu, Chen-Sheng Chen, Alexander H. Liu, Hung-Yi Lee, Tsung-Te Liu:
A Fully Integrated 1.7mW Attention-Based Automatic Speech Recognition Processor. IEEE Trans. Circuits Syst. II Express Briefs 69(10): 4178-4182 (2022)
- [c15] Alexander H. Liu, SouYoung Jin, Cheng-I Lai, Andrew Rouditchenko, Aude Oliva, James R. Glass:
Cross-Modal Discrete Representation Learning. ACL (1) 2022: 3013-3035
- [c14] Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. ICASSP 2022: 8447-8451
- [c13] Alexander H. Liu, Cheng-I Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. INTERSPEECH 2022: 843-847
- [c12] Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
Towards End-to-End Unsupervised Speech Recognition. SLT 2022: 221-228
- [i17] Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski:
Towards End-to-end Unsupervised Speech Recognition. CoRR abs/2204.02492 (2022)
- [i16] Alexander H. Liu, Cheng-I Jeff Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. CoRR abs/2204.02524 (2022)
- [i15] Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: A Unified Model for Audio-Visual Learning. CoRR abs/2208.00061 (2022)
- [i14] Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. CoRR abs/2210.07839 (2022)
- 2021
- [j1] Shun-Po Chuang, Alexander H. Liu, Tzu-Wei Sung, Hung-yi Lee:
Improving Automatic Speech Recognition and Speech Translation via Word Embedding Prediction. IEEE ACM Trans. Audio Speech Lang. Process. 29: 93-105 (2021)
- [c11] Mathew Monfort, SouYoung Jin, Alexander H. Liu, David Harwath, Rogério Feris, James R. Glass, Aude Oliva:
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions. CVPR 2021: 14871-14881
- [c10] Alexander H. Liu, Yu-An Chung, James R. Glass:
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies. Interspeech 2021: 3730-3734
- [c9] Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. NeurIPS 2021: 21256-21272
- [c8] Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training. SLT 2021: 186-193
- [i13] Mathew Monfort, SouYoung Jin, Alexander H. Liu, David Harwath, Rogério Feris, James R. Glass, Aude Oliva:
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions. CoRR abs/2105.04489 (2021)
- [i12] Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James R. Glass:
Cross-Modal Discrete Representation Learning. CoRR abs/2106.05438 (2021)
- [i11] Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David D. Cox, James R. Glass:
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition. CoRR abs/2106.05933 (2021)
- [i10] Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. CoRR abs/2110.01147 (2021)
- [i9] Kevin Duarte, Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Samuel Thomas, Alexander H. Liu, David Harwath, James R. Glass, Hilde Kuehne, Mubarak Shah:
Routing with Self-Attention for Multimodal Capsule Networks. CoRR abs/2112.00775 (2021)
- 2020
- [c7] Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee:
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation. ACL 2020: 5998-6003
- [c6] Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-Shan Lee:
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning. ICASSP 2020: 7259-7263
- [c5] Alexander H. Liu, Tzu-Wei Sung, Shun-Po Chuang, Hung-yi Lee, Lin-Shan Lee:
Sequence-to-Sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding. ICASSP 2020: 7879-7883
- [c4] Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee:
Semi-Supervised Learning for Multi-Speaker Text-to-Speech Synthesis Using Discrete Speech Representation. INTERSPEECH 2020: 3191-3195
- [i8] Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning. CoRR abs/2005.01972 (2020)
- [i7] Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee:
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation. CoRR abs/2005.08024 (2020)
- [i6] Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee:
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation. CoRR abs/2005.10678 (2020)
- [i5] Alexander H. Liu, Yu-An Chung, James R. Glass:
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies. CoRR abs/2011.00406 (2020)
2010 – 2019
- 2019
- [c3] Po-Yi Chen, Alexander H. Liu, Yen-Cheng Liu, Yu-Chiang Frank Wang:
Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation. CVPR 2019: 2624-2632
- [c2] Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model. ICASSP 2019: 6176-6180
- [i4] Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-Shan Lee:
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning. CoRR abs/1910.12729 (2019)
- [i3] Alexander H. Liu, Tzu-Wei Sung, Shun-Po Chuang, Hung-yi Lee, Lin-Shan Lee:
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding. CoRR abs/1910.12740 (2019)
- 2018
- [c1] Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang:
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation. NeurIPS 2018: 2595-2604
- [i2] Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang:
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation. CoRR abs/1809.01361 (2018)
- [i1] Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model. CoRR abs/1811.00787 (2018)
last updated on 2024-11-07 20:33 CET by the dblp team
all metadata released as open data under CC0 1.0 license