Erik McDermott
2020 – today
- 2024
  - [i8] Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly: Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition. CoRR abs/2405.15216 (2024)
  - [i7] Roger Hsiao, Liuhui Deng, Erik McDermott, Ruchir Travadi, Xiaodan Zhuang: Optimizing Byte-level Representation for End-to-end ASR. CoRR abs/2406.09676 (2024)
  - [i6] Adnan Haider, Xingyu Na, Erik McDermott, Tim Ng, Zhen Huang, Xiaodan Zhuang: Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models. CoRR abs/2408.13008 (2024)
- 2023
  - [c48] Stefan Braun, Erik McDermott, Roger Hsiao: Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation. ICASSP 2023: 1-5
  - [c47] Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang: Variable Attention Masking for Configurable Transformer Transducer Speech Recognition. ICASSP 2023: 1-5
- 2022
  - [i5] Thien Nguyen, Nathalie Tran, Liuhui Deng, Thiago Fraga da Silva, Matthew Radzihovsky, Roger Hsiao, Henry Mason, Stefan Braun, Erik McDermott, Dogan Can, Pawel Swietojanski, Lyan Verwimp, Sibel Oyman, Tresi Arvizo, Honza Silovsky, Arnab Ghoshal, Mathieu Martel, Bharat Ram Ambati, Mohamed Ali: Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation. CoRR abs/2210.12214 (2022)
  - [i4] Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang: Variable Attention Masking for Configurable Transformer Transducer Speech Recognition. CoRR abs/2211.01438 (2022)
  - [i3] Stefan Braun, Erik McDermott, Roger Hsiao: Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation. CoRR abs/2211.16270 (2022)
- 2020
  - [c46] Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar: Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss. ICASSP 2020: 7829-7833
  - [i2] Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar: Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss. CoRR abs/2002.02562 (2020)
  - [i1] Erik McDermott, Hasim Sak, Ehsan Variani: A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition. CoRR abs/2002.11268 (2020)
2010 – 2019
- 2019
  - [c45] Erik McDermott, Hasim Sak, Ehsan Variani: A Density Ratio Approach to Language Model Fusion in End-to-End Automatic Speech Recognition. ASRU 2019: 434-441
- 2018
  - [c44] Ehsan Variani, Tom Bagby, Kamel Lahouel, Erik McDermott, Michiel Bacchiani: Sampled Connectionist Temporal Classification. ICASSP 2018: 4959-4963
- 2017
  - [c43] Bo Li, Tara N. Sainath, Arun Narayanan, Joe Caroselli, Michiel Bacchiani, Ananya Misra, Izhak Shafran, Hasim Sak, Golan Pundak, Kean K. Chin, Khe Chai Sim, Ron J. Weiss, Kevin W. Wilson, Ehsan Variani, Chanwoo Kim, Olivier Siohan, Mitchel Weintraub, Erik McDermott, Richard Rose, Matt Shannon: Acoustic Modeling for Google Home. INTERSPEECH 2017: 399-403
  - [c42] Ehsan Variani, Tom Bagby, Erik McDermott, Michiel Bacchiani: End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow. INTERSPEECH 2017: 1641-1645
- 2015
  - [c41] Ehsan Variani, Erik McDermott, Georg Heigold: A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture. ICASSP 2015: 4270-4274
- 2014
  - [c40] Ehsan Variani, Xin Lei, Erik McDermott, Ignacio López-Moreno, Javier Gonzalez-Dominguez: Deep neural networks for small footprint text-dependent speaker verification. ICASSP 2014: 4052-4056
  - [c39] Georg Heigold, Erik McDermott, Vincent Vanhoucke, Andrew W. Senior, Michiel Bacchiani: Asynchronous stochastic optimization for sequence training of deep neural networks. ICASSP 2014: 5587-5591
  - [c38] Hasim Sak, Oriol Vinyals, Georg Heigold, Andrew W. Senior, Erik McDermott, Rajat Monga, Mark Z. Mao: Sequence discriminative distributed training of long short-term memory recurrent neural networks. INTERSPEECH 2014: 1209-1213
  - [c37] Erik McDermott, Georg Heigold, Pedro J. Moreno, Andrew W. Senior, Michiel Bacchiani: Asynchronous stochastic optimization for sequence training of deep neural networks: towards big data. INTERSPEECH 2014: 1224-1228
- 2013
  - [c36] Hank Liao, Erik McDermott, Andrew W. Senior: Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription. ASRU 2013: 368-373
- 2012
  - [c35] Erik McDermott: An integrated framework for "margin" based sequential discriminative training over lattices based on differenced maximum mutual information (dMMI). MLSLP 2012
- 2010
  - [j11] Yotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Erik McDermott, Tetsunori Kobayashi: A Sequential Pattern Classifier Based on Hidden Markov Kernel Machine and Its Application to Phoneme Classification. IEEE J. Sel. Top. Signal Process. 4(6): 974-984 (2010)
  - [c34] Hideyuki Watanabe, Shigeru Katagiri, Kouta Yamada, Erik McDermott, Atsushi Nakamura, Shinji Watanabe, Miho Ohsaki: Minimum Error Classification with geometric margin control. ICASSP 2010: 2170-2173
  - [c33] Erik McDermott, Shinji Watanabe, Atsushi Nakamura: Discriminative training based on an integrated view of MPE and MMI in margin and error space. ICASSP 2010: 4894-4897
  - [c32] Shinji Watanabe, Takaaki Hori, Erik McDermott, Atsushi Nakamura: A discriminative model for continuous speech recognition based on Weighted Finite State Transducers. ICASSP 2010: 4922-4925
2000 – 2009
- 2009
  - [c31] Atsushi Nakamura, Erik McDermott, Shinji Watanabe, Shigeru Katagiri: A unified view for discriminative objective functions based on negative exponential of difference measure between strings. ICASSP 2009: 1633-1636
  - [c30] Erik McDermott, Shinji Watanabe, Atsushi Nakamura: Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training. INTERSPEECH 2009: 224-227
- 2008
  - [c29] Erik McDermott, Atsushi Nakamura: Flexible discriminative training based on equal error group scores obtained from an error-indexed forward-backward algorithm. INTERSPEECH 2008: 2398-2401
- 2007
  - [j10] John Hogden, Philip Rubin, Erik McDermott, Shigeru Katagiri, Louis Goldstein: Inverting mappings from smooth paths through Rn to paths through Rm: A technique applied to recovering articulation from acoustics. Speech Commun. 49(5): 361-383 (2007)
  - [j9] Erik McDermott, Timothy J. Hazen, Jonathan Le Roux, Atsushi Nakamura, Shigeru Katagiri: Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error. IEEE Trans. Speech Audio Process. 15(1): 203-223 (2007)
  - [c28] Timothy J. Hazen, Erik McDermott: Discriminative MCE-based speaker adaptation of acoustic models for a spoken lecture processing task. INTERSPEECH 2007: 1577-1580
  - [c27] Erik McDermott, Atsushi Nakamura: String and lattice based discriminative training for the corpus of spontaneous Japanese lecture transcription task. INTERSPEECH 2007: 2081-2084
- 2006
  - [j8] Erik McDermott, Shigeru Katagiri: Discriminative training via minimization of risk estimates based on Parzen smoothing. Appl. Intell. 25(1): 37-57 (2006)
  - [j7] Atsushi Nakamura, Shinji Watanabe, Takaaki Hori, Erik McDermott, Shigeru Katagiri: Advanced computational models and learning theories for spoken language processing. IEEE Comput. Intell. Mag. 1(2): 5-9 (2006)
  - [j6] Erik McDermott, Atsushi Nakamura: Production-Oriented Models for Speech Recognition. IEICE Trans. Inf. Syst. 89-D(3): 1006-1014 (2006)
  - [c26] Jun Suzuki, Erik McDermott, Hideki Isozaki: Training Conditional Random Fields with Multivariate Evaluation Measures. ACL 2006
- 2005
  - [c25] Erik McDermott, Shigeru Katagiri: Minimum Classification Error for Large Scale Speech Recognition Tasks using Weighted Finite State Transducers. ICASSP (1) 2005: 113-116
  - [c24] Jonathan Le Roux, Erik McDermott: Optimization methods for discriminative training. INTERSPEECH 2005: 3341-3344
- 2004
  - [j5] Erik McDermott, Shigeru Katagiri: A derivation of minimum classification error from the theoretical classification risk using Parzen estimation. Comput. Speech Lang. 18(2): 107-122 (2004)
  - [c23] Erik McDermott, Timothy J. Hazen: Minimum classification error training of landmark models for real-time continuous speech recognition. ICASSP (1) 2004: 937-940
  - [c22] Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri: A theoretical analysis of speech recognition based on feature trajectory models. INTERSPEECH 2004: 549-552
- 2003
  - [c21] Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri: Recognition method with parametric trajectory generated from mixture distribution HMMs. ICASSP (1) 2003: 124-127
  - [c20] Daniel Willett, Thomas Niesler, Erik McDermott, Yasuhiro Minami, Shigeru Katagiri: Pervasive unsupervised adaptation for lecture speech transcription. ICASSP (1) 2003: 292-295
  - [c19] Erik McDermott, Shigeru Katagiri: A new formalization of minimum classification error using a Parzen estimate of classification chance. ICASSP (2) 2003: 713-716
  - [c18] John Hogden, Patrick Valdez, Shigeru Katagiri, Erik McDermott: Blind inversion of multidimensional functions for speech enhancement. INTERSPEECH 2003: 1409-1412
- 2002
  - [c17] Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri: A recognition method with parametric trajectory synthesized using direct relations between static and dynamic feature vector time series. ICASSP 2002: 957-960
  - [c16] Erik McDermott, Shigeru Katagiri: Classification error from the theoretical Bayes classification risk. INTERSPEECH 2002: 2465-2468
  - [c15] Erik McDermott, Shigeru Katagiri: Minimum classification error via a Parzen window based estimate of the theoretical Bayes classification risk. NNSP 2002: 415-424
- 2001
  - [j4] Alain Biem, Shigeru Katagiri, Erik McDermott, Biing-Hwang Juang: An application of discriminative feature extraction to filter-bank-based speech recognition. IEEE Trans. Speech Audio Process. 9(2): 96-110 (2001)
  - [c14] Daniel Willett, Erik McDermott, Yasuhiro Minami, Shigeru Katagiri: Time and memory efficient viterbi decoding for LVCSR using a precompiled search network. INTERSPEECH 2001: 847-850
- 2000
  - [c13] Erik McDermott, Alain Biem, Seiichi Tenpaku, Shigeru Katagiri: Discriminative training for large vocabulary telephone-based name recognition. ICASSP 2000: 3739-3742
1990 – 1999
- 1998
  - [c12] Reiko Akahane-Yamada, Erik McDermott, Takahiro Adachi, Hideki Kawahara, John S. Pruitt: Computer-based second language production training by using spectrographic representation and HMM-based speech recognition scores. ICSLP 1998
- 1997
  - [c11] Eric A. Woudenberg, Alain Biem, Erik McDermott, Shigeru Katagiri: Efficient normalization based upon GPD [generalized probabilistic descent]. ICASSP 1997: 3245-3248
  - [c10] Erik McDermott, Shigeru Katagiri: String-level MCE for continuous phoneme recognition. EUROSPEECH 1997: 123-126
- 1996
  - [c9] Erik McDermott, Eric A. Woudenberg, Shigeru Katagiri: A telephone-based directory assistance system adaptively trained using minimum classification error/generalized probabilistic descent. ICASSP 1996: 3346-3349
- 1995
  - [c8] Alain Biem, Erik McDermott, Shigeru Katagiri: A discriminative filter bank model for speech recognition. EUROSPEECH 1995: 545-548
- 1994
  - [j3] Erik McDermott, Shigeru Katagiri: Prototype-based minimum error training for speech recognition. Appl. Intell. 4(3): 245-256 (1994)
  - [j2] Erik McDermott, Shigeru Katagiri: Prototype-based minimum classification error/generalized probabilistic descent training for various speech units. Comput. Speech Lang. 8(4): 351-368 (1994)
- 1993
  - [c7] Erik McDermott, Shigeru Katagiri: Prototype-based MCE/GPD training for word spotting and connected word recognition. ICASSP (2) 1993: 291-294
- 1992
  - [c6] Erik McDermott, Shigeru Katagiri: Prototype-based discriminative training for various speech units. ICASSP 1992: 417-420
- 1991
  - [j1] Erik McDermott, Shigeru Katagiri: LVQ-based shift-tolerant phoneme recognition. IEEE Trans. Signal Process. 39(6): 1398-1411 (1991)
  - [c5] Hitoshi Iwamida, Shigeru Katagiri, Erik McDermott: Speaker-independent large vocabulary word recognition using an LVQ/HMM hybrid algorithm. ICASSP 1991: 553-556
- 1990
  - [c4] Hitoshi Iwamida, Shigeru Katagiri, Erik McDermott, Yoh'ichi Tohkura: A hybrid speech recognition system using HMMs with an LVQ-trained codebook. ICASSP 1990: 489-492
  - [c3] Yasuhiro Minami, Toshiyuki Hanazawa, Hitoshi Iwamida, Erik McDermott, Kiyohiro Shikano, Shigeru Katagiri, Masaona Kagawa: On the robustness of HMM and ANN speech recognition algorithms. ICSLP 1990: 1345-1348
1980 – 1989
- 1989
  - [c2] Erik McDermott, Shigeru Katagiri: Shift-invariant, multi-category phoneme recognition using Kohonen's LVQ2. ICASSP 1989: 81-84
  - [c1] Shigeru Katagiri, Erik McDermott, Manami Yokota: A new algorithm for representing acoustic feature dynamics. ICASSP 1989: 322-325
last updated on 2024-10-07 21:24 CEST by the dblp team
all metadata released as open data under CC0 1.0 license