Florian Metze
Person information
- affiliation: Carnegie Mellon University, Pittsburgh, USA
2020 – today
- 2024
- [c215]Jackson Michaels, Juncheng B. Li, Laura Yao, Lijun Yu, Zach Wood-Doughty, Florian Metze:
Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation. ICASSP 2024: 6960-6964
- 2023
- [j16]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3112-3126 (2023) - [c214]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. EACL 2023: 1615-1631
- 2022
- [c213]Xinjian Li, Florian Metze, David R. Mortensen, Shinji Watanabe, Alan W. Black:
Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble. ACL (Findings) 2022: 2106-2115 - [c212]Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze:
Self-supervised object detection from audio-visual correspondence. CVPR 2022: 10565-10576 - [c211]Yookoon Park, Mahmoud Azab, Seungwhan Moon, Bo Xiong, Florian Metze, Gourab Kundu, Kirmani Ahmed:
Normalized Contrastive Learning for Text-Video Retrieval. EMNLP 2022: 248-260 - [c210]Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W. Black, Ana Marasovic:
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization. EMNLP (Findings) 2022: 2644-2657 - [c209]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. EMNLP (Findings) 2022: 5419-5429 - [c208]Juncheng B. Li, Shuhui Qu, Xinjian Li, Bernie Po-Yao Huang, Florian Metze:
On Adversarial Robustness Of Large-Scale Audio Visual Learning. ICASSP 2022: 231-235 - [c207]Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
End-to-End Speech Summarization Using Restricted Self-Attention. ICASSP 2022: 8072-8076 - [c206]Juncheng Li, Shuhui Qu, Po-Yao Huang, Florian Metze:
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification. INTERSPEECH 2022: 1521-1525 - [c205]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
ASR2K: Speech Recognition for Around 2000 Languages without Audio. INTERSPEECH 2022: 4885-4889 - [c204]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
Phone Inventories and Recognition for Every Language. LREC 2022: 1061-1067 - [c203]Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, Wojciech Galuba, Florian Metze, Christoph Feichtenhofer:
Masked Autoencoders that Listen. NeurIPS 2022 - [i78]Juncheng B. Li, Shuhui Qu, Xinjian Li, Po-Yao Huang, Florian Metze:
On Adversarial Robustness of Large-scale Audio Visual Learning. CoRR abs/2203.12122 (2022) - [i77]Juncheng B. Li, Shuhui Qu, Po-Yao Huang, Florian Metze:
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification. CoRR abs/2203.13448 (2022) - [i76]Juncheng B. Li, Shuhui Qu, Florian Metze:
Robustness of Neural Architectures for Audio Event Detection. CoRR abs/2205.03268 (2022) - [i75]Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W. Black, Ana Marasovic:
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization. CoRR abs/2205.11686 (2022) - [i74]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. CoRR abs/2206.03318 (2022) - [i73]Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, Wojciech Galuba, Florian Metze, Christoph Feichtenhofer:
Masked Autoencoders that Listen. CoRR abs/2207.06405 (2022) - [i72]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
ASR2K: Speech Recognition for Around 2000 Languages without Audio. CoRR abs/2209.02842 (2022) - [i71]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. CoRR abs/2210.05200 (2022) - [i70]Zheng Wang, Juncheng B. Li, Shuhui Qu, Florian Metze, Emma Strubell:
SQuAT: Sharpness- and Quantization-Aware Training for BERT. CoRR abs/2210.07171 (2022) - [i69]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. CoRR abs/2210.15734 (2022) - [i68]Zheng Wang, Juncheng B. Li, Shuhui Qu, Florian Metze, Emma Strubell:
Error-aware Quantization through Noise Tempering. CoRR abs/2212.05603 (2022) - [i67]Yookoon Park, Mahmoud Azab, Bo Xiong, Seungwhan Moon, Florian Metze, Gourab Kundu, Kirmani Ahmed:
Normalized Contrastive Learning for Text-Video Retrieval. CoRR abs/2212.11790 (2022)
- 2021
- [c202]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. ACL/IJCNLP (Findings) 2021: 4227-4239 - [c201]Amanda Cardoso Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language. CVPR 2021: 2735-2744 - [c200]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. EACL 2021: 2976-2992 - [c199]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. EMNLP (1) 2021: 6787-6800 - [c198]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition Through the Lens of Adversary. ICASSP 2021: 616-620 - [c197]Xinjian Li, David R. Mortensen, Florian Metze, Alan W. Black:
Multilingual Phonetic Dataset for Low Resource Speech Recognition. ICASSP 2021: 6958-6962 - [c196]Xinjian Li, Juncheng Li, Jiali Yao, Alan W. Black, Florian Metze:
Phone Distribution Estimation for Low Resource Languages. ICASSP 2021: 7233-7237 - [c195]Mandela Patrick, Po-Yao Huang, Ishan Misra, Florian Metze, Andrea Vedaldi, Yuki M. Asano, João F. Henriques:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. ICCV 2021: 10540-10552 - [c194]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. ICLR 2021 - [c193]Shruti Palaskar, Ruslan Salakhutdinov, Alan W. Black, Florian Metze:
Multimodal Speech Summarization Through Semantic Concept Learning. Interspeech 2021: 791-795 - [c192]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. Interspeech 2021: 1264-1268 - [c191]Xinjian Li, Juncheng Li, Florian Metze, Alan W. Black:
Hierarchical Phone Recognition with Compositional Phonetics. Interspeech 2021: 2461-2465 - [c190]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. Interspeech 2021: 2471-2475 - [c189]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. NAACL-HLT 2021: 1882-1896 - [c188]Poyao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alex Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. NAACL-HLT 2021: 2443-2459 - [c187]Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. NeurIPS 2021: 12493-12506 - [e5]Heng Tao Shen, Yueting Zhuang, John R. Smith, Yang Yang, Pablo César, Florian Metze, Balakrishnan Prabhakaran:
MM '21: ACM Multimedia Conference, Virtual Event, China, October 20 - 24, 2021. ACM 2021, ISBN 978-1-4503-8651-7 - [i66]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. CoRR abs/2102.08345 (2021) - [i65]Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alexander G. Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. CoRR abs/2103.08849 (2021) - [i64]Mandela Patrick, Yuki Markus Asano, Bernie Huang, Ishan Misra, Florian Metze, João F. Henriques, Andrea Vedaldi:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. CoRR abs/2103.10211 (2021) - [i63]Triantafyllos Afouras, Yuki Markus Asano, Francois Fagan, Andrea Vedaldi, Florian Metze:
Self-supervised object detection from audio-visual correspondence. CoRR abs/2104.06401 (2021) - [i62]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. CoRR abs/2105.00573 (2021) - [i61]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. CoRR abs/2105.09996 (2021) - [i60]Mandela Patrick, Dylan Campbell, Yuki Markus Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. CoRR abs/2106.05392 (2021) - [i59]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. CoRR abs/2106.15065 (2021) - [i58]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. CoRR abs/2107.11628 (2021) - [i57]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. CoRR abs/2109.14084 (2021) - [i56]Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
Speech Summarization using Restricted Self-Attention. CoRR abs/2110.06263 (2021)
- 2020
- [j15]Shruti Palaskar, Ramon Sanabria, Florian Metze:
Transfer learning for multimodal dialog. Comput. Speech Lang. 64: 101093 (2020) - [j14]Lucia Specia, Loïc Barrault, Ozan Caglayan, Amanda Cardoso Duarte, Desmond Elliott, Spandana Gella, Nils Holzenberger, Chiraag Lala, Sun Jae Lee, Jindrich Libovický, Pranava Madhyastha, Florian Metze, Karl Mulligan, Alissa Ostapenko, Shruti Palaskar, Ramon Sanabria, Josiah Wang, Raman Arora:
Grounded Sequence to Sequence Transduction. IEEE J. Sel. Top. Signal Process. 14(3): 577-591 (2020) - [j13]Odette Scharenborg, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller:
Speech Technology for Unwritten Languages. IEEE ACM Trans. Audio Speech Lang. Process. 28: 964-975 (2020) - [j12]Fengquan Dong, Kun Qian, Zhao Ren, Alice Baird, Xinjian Li, Zhenyu Dai, Bo Dong, Florian Metze, Yoshiharu Yamamoto, Björn W. Schuller:
Machine Listening for Heart Status Monitoring: Introducing and Benchmarking HSS - The Heart Sounds Shenzhen Corpus. IEEE J. Biomed. Health Informatics 24(7): 2082-2092 (2020) - [c186]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-Shot Learning for Automatic Phonemic Transcription. AAAI 2020: 8261-8268 - [c185]Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander Hauptmann, Alexander Waibel:
Gun Source and Muzzle Head Detection. IMAWM 2020: 1-11 - [c184]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. EMNLP (Findings) 2020: 2667-2677 - [c183]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. EMNLP (Findings) 2020: 3088-3095 - [c182]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. ICASSP 2020: 6304-6308 - [c181]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. ICASSP 2020: 6344-6348 - [c180]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. ICASSP 2020: 8249-8253 - [c179]Mahaveer Jain, Gil Keren, Jay Mahadeokar, Geoffrey Zweig, Florian Metze, Yatharth Saraf:
Contextual RNN-T for Open Domain ASR. INTERSPEECH 2020: 11-15 - [c178]Zimeng Qiu, Yiyuan Li, Xinjian Li, Florian Metze, William M. Campbell:
Towards Context-Aware End-to-End Code-Switching Speech Recognition. INTERSPEECH 2020: 4776-4780 - [c177]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. LREC 2020: 5329-5336 - [c176]Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze:
On Dimensional Linguistic Properties of the Word Embedding Space. RepL4NLP@ACL 2020: 156-165 - [i55]Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander G. Hauptmann, Alexander Waibel:
Gun Source and Muzzle Head Detection. CoRR abs/2001.11120 (2020) - [i54]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. CoRR abs/2002.05639 (2020) - [i53]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-shot Learning for Automatic Phonemic Transcription. CoRR abs/2002.11781 (2020) - [i52]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. CoRR abs/2002.11800 (2020) - [i51]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. CoRR abs/2003.07692 (2020) - [i50]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. CoRR abs/2004.08031 (2020) - [i49]Amanda Cardoso Duarte, Shruti Palaskar, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language. CoRR abs/2008.08143 (2020) - [i48]Ze Cheng, Juncheng Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze:
Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations. CoRR abs/2009.05739 (2020) - [i47]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. CoRR abs/2010.02384 (2020) - [i46]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. CoRR abs/2010.02824 (2020) - [i45]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. CoRR abs/2010.04924 (2020) - [i44]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Multimodal Speech Recognition with Unstructured Audio Masking. CoRR abs/2010.08642 (2020) - [i43]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition through the lens of Adversary. CoRR abs/2011.07430 (2020)
2010 – 2019
- 2019
- [j11]Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Joint embeddings with multimodal cues for video-text retrieval. Int. J. Multim. Inf. Retr. 8(1): 3-18 (2019) - [j10]Okko Räsänen, Shreyas Seshadri, Julien Karadayi, Eric Riebling, John P. Bunce, Alejandrina Cristià, Florian Metze, Marisa Casillas, Celia Rosemberg, Elika Bergelson, Melanie Soderstrom:
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech. Speech Commun. 113: 63-80 (2019) - [c175]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. ACL (1) 2019: 1131-1141 - [c174]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. ACL (1) 2019: 6587-6596 - [c173]Yun Wang, Juncheng Li, Florian Metze:
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling. ICASSP 2019: 31-35 - [c172]Yun Wang, Florian Metze:
Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling. ICASSP 2019: 745-749 - [c171]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. ICASSP 2019: 6091-6095 - [c170]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned in Speech Recognition: Contextual Acoustic Word Embeddings. ICASSP 2019: 6530-6534 - [c169]Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora:
Learning from Multiview Correlations in Open-domain Videos. ICASSP 2019: 8628-8632 - [c168]Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze:
Multimodal Grounding for Sequence-to-sequence Speech Recognition. ICASSP 2019: 8648-8652 - [c167]Vikas Raunak, Sang Keun Choe, Quanyang Lu, Yi Xu, Florian Metze:
On Leveraging the Visual Modality for Neural Machine Translation. INLG 2019: 147-151 - [c166]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. INTERSPEECH 2019: 2120-2124 - [c165]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. INTERSPEECH 2019: 3681-3682 - [c164]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. INTERSPEECH 2019: 4380-4384 - [c163]Florian Metze:
Survey Talk: Multimodal Processing of Speech and Language. INTERSPEECH 2019 - [c162]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
CMU's Machine Translation System for IWSLT 2019. IWSLT 2019 - [c161]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Multitask Learning For Different Subword Segmentations In Neural Machine Translation. IWSLT 2019 - [c160]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
MediaEval 2019: Eyes and Ears Together. MediaEval 2019 - [c159]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. NAACL-HLT (1) 2019: 2766-2771 - [c158]Juncheng Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze:
Adversarial Music: Real world Audio Adversary against Wake-word Detection System. NeurIPS 2019: 11908-11918 - [c157]Vikas Raunak, Vivek Gupta, Florian Metze:
Effective Dimensionality Reduction for Word Embeddings. RepL4NLP@ACL 2019: 235-243 - [i42]Eduard H. Hovy, Jaime G. Carbonell, Hans Chalupsky, Anatole Gershman, Alex Hauptmann, Florian Metze, Teruko Mitamura, Zaid Sheikh, Ankit Dangi, Aditi Chaudhary, Xianyang Chen, Xiang Kong, Bernie Huang, Salvador Medina, Hector Liu, Xuezhe Ma, Maria Ryskina, Ramon Sanabria, Varun Gangal:
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis. TAC 2019 - [i41]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned In Speech Recognition: Contextual Acoustic Word Embeddings. CoRR abs/1902.06833 (2019) - [i40]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. CoRR abs/1902.07613 (2019) - [i39]Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori S. Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard H. Hovy, Alan W. Black, Jaime G. Carbonell, Graham Horwood, Shabnam Tafreshi, Mona T. Diab, Efsun Sarioglu Kayi, Noura Farra, Kathleen R. McKeown:
The ARIEL-CMU Systems for LoReHLT18. CoRR abs/1902.08899 (2019) - [i38]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. CoRR abs/1905.08796 (2019) - [i37]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
Grounding Object Detections With Transcriptions. CoRR abs/1906.06147 (2019) - [i36]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. CoRR abs/1906.07901 (2019) - [i35]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. CoRR abs/1906.11604 (2019) - [i34]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions. CoRR abs/1907.00477 (2019) - [i33]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. CoRR abs/1907.10726 (2019) - [i32]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. CoRR abs/1908.01060 (2019) - [i31]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. CoRR abs/1908.01067 (2019) - [i30]Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze:
On Dimensional Linguistic Properties of the Word Embedding Space. CoRR abs/1910.02211 (2019) - [i29]Vikas Raunak, Sang Keun Choe, Quanyang Lu, Yi Xu, Florian Metze:
On Leveraging the Visual Modality for Neural Machine Translation. CoRR abs/1910.02754 (2019) - [i28]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Multitask Learning For Different Subword Segmentations In Neural Machine Translation. CoRR abs/1910.12368 (2019) - [i27]Juncheng B. Li, Shuhui Qu, Xinjian Li, J. Zico Kolter, Florian Metze:
Adversarial Music: Real World Audio Adversary Against Wake-word Detection System. CoRR abs/1911.00126 (2019) - [i26]Vikas Raunak, Vaibhav Kumar, Florian Metze:
On Compositionality in Neural Machine Translation. CoRR abs/1911.01497 (2019) - [i25]Siddharth Dalmia, Abdelrahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer:
Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models. CoRR abs/1911.03782 (2019)
- 2018
- [c156]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-Based Multi-Lingual Low Resource Speech Recognition. ICASSP 2018: 4909-4913 - [c155]Odette Scharenborg, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux:
Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop. ICASSP 2018: 4979-4983 - [c154]Neville Ryant, Elika Bergelson, Kenneth Church, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Sanjeev Khudanpur, Diana Kowalski, Mahesh Krishnamoorthy, Rajat Kulshreshta, Mark Y. Liberman, Yu-Ding Lu, Matthew Maciejewski, Florian Metze, Ján Profant, Lei Sun, Yu Tsao, Zhou Yu:
Enhancement and Analysis of Conversational Speech: JSALT 2017. ICASSP 2018: 5154-5158 - [c153]Shruti Palaskar, Ramon Sanabria, Florian Metze:
End-to-end Multimodal Speech Recognition. ICASSP 2018: 5774-5778 - [c152]Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das:
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging. ICASSP 2018: 6832-6836 - [c151]Thomas Zenkel, Ramon Sanabria, Florian Metze, Alex Waibel:
Subword and Crossword Units for CTC Acoustic Models. INTERSPEECH 2018: 396-400 - [c150]Yun Wang, Juncheng Li, Florian Metze:
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks. INTERSPEECH 2018: 1339-1343 - [c149]Adrien Le Franc, Eric Riebling, Julien Karadayi, Yun Wang, Camila Scaff, Florian Metze, Alejandrina Cristià:
The ACLEW DiViMe: An Easy-to-use Diarization Tool. INTERSPEECH 2018: 1383-1387 - [c148]Shao-Yen Tseng, Juncheng Li, Yun Wang, Florian Metze, Joseph Szurley, Samarjit Das:
Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection. INTERSPEECH 2018: 3279-3283 - [c147]Boyang Li, Beth Cardier, Tong Wang, Florian Metze:
Annotating High-Level Structures of Short Stories and Personal Anecdotes. LREC 2018 - [c146]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
Eyes and Ears Together: New Task for Multimodal Spoken Content Analysis. MediaEval 2018 - [c145]Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval. ICMR 2018: 19-27 - [c144]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. SLT 2018: 258-265 - [c143]Shruti Palaskar, Florian Metze:
Acoustic-to-Word Recognition with Sequence-to-Sequence Models. SLT 2018: 397-404 - [c142]Suyoun Kim, Florian Metze:
Dialog-Context Aware end-to-end Speech Recognition. SLT 2018: 434-440 - [c141]Ramon Sanabria, Florian Metze:
Hierarchical Multitask Learning With CTC. SLT 2018: 485-490 - [i24]Eduard H. Hovy, Taylor Berg-Kirkpatrick, Jaime G. Carbonell, Hans Chalupsky, Anatole Gershman, Alexander G. Hauptmann, Florian Metze, Teruko Mitamura, Aditi Chaudhary, Xianyang Chen, Bernie Po-Yao Huang, Hector Zhengzhong Liu, Xuezhe Ma, Shruti Palaskar, Dheeraj Rajagopal, Maria Ryskina, Ramon Sanabria:
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis. TAC 2018 - [i23]Odette Scharenborg, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux:
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop. CoRR abs/1802.05092 (2018) - [i22]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-based Multi-lingual Low Resource Speech Recognition. CoRR abs/1802.07420 (2018) - [i21]Yun Wang, Juncheng Li, Florian Metze:
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks. CoRR abs/1804.01146 (2018) - [i20]Shruti Palaskar, Ramon Sanabria, Florian Metze:
End-to-End Multimodal Speech Recognition. CoRR abs/1804.09713 (2018) - [i19]Ramon Sanabria, Florian Metze:
Hierarchical Multi Task Learning With CTC. CoRR abs/1807.07104 (2018) - [i18]Shruti Palaskar, Florian Metze:
Acoustic-to-Word Recognition with Sequence-to-Sequence Models. CoRR abs/1807.09597 (2018) - [i17]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. CoRR abs/1807.10984 (2018) - [i16]Suyoun Kim, Florian Metze:
Dialog-context aware end-to-end speech recognition. CoRR abs/1808.02171 (2018) - [i15]Ankit Shah, Harini Kesavamoorthy, Poorva Rane, Pramati Kalwad, Alexander G. Hauptmann, Florian Metze:
Activity Recognition on a Large Scale in Short Videos - Moments in Time Dataset. CoRR abs/1809.00241 (2018) - [i14]Yun Wang, Juncheng Li, Florian Metze:
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling. CoRR abs/1810.09050 (2018) - [i13]Yun Wang, Florian Metze:
Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling. CoRR abs/1810.09052 (2018) - [i12]Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze:
How2: A Large-scale Dataset for Multimodal Language Understanding. CoRR abs/1811.00347 (2018) - [i11]Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze:
Multimodal Grounding for Sequence-to-Sequence Speech Recognition. CoRR abs/1811.03865 (2018) - [i10]Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora:
Learning from Multiview Correlations in Open-Domain Videos. CoRR abs/1811.08890 (2018)
- 2017
- [c140]Juncheng Li, Wei Dai, Florian Metze, Shuhui Qu, Samarjit Das:
A comparison of Deep Learning methods for environmental sound detection. ICASSP 2017: 126-130 - [c139]Yun Wang, Florian Metze:
A first attempt at polyphonic sound event detection using connectionist temporal classification. ICASSP 2017: 2986-2990 - [c138]Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze:
Visual features for context-aware speech recognition. ICASSP 2017: 5020-5024 - [c137]Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel:
Comparison of Decoding Strategies for CTC Acoustic Models. INTERSPEECH 2017: 513-517 - [c136]Yun Wang, Florian Metze:
A Transfer Learning Based Feature Extractor for Polyphonic Sound Event Detection Using Connectionist Temporal Classification. INTERSPEECH 2017: 3097-3101 - [c135]Niluthpol Chowdhury Mithun, Juncheng B. Li, Florian Metze, Amit K. Roy-Chowdhury, Samarjit Das:
CMU-UCR-BOSCH @ TRECVID 2017: VIDEO TO TEXT RETRIEVAL. TRECVID 2017 - [p3]Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
Preliminaries. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 3-17 - [p2]Yajie Miao, Florian Metze:
End-to-End Architectures for Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 299-323 - [p1]Shinji Watanabe, Takaaki Hori, Yajie Miao, Marc Delcroix, Florian Metze, John R. Hershey:
Toolkits for Robust Speech Processing. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 369-382 - [e4]Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
New Era for Robust Speech Recognition, Exploiting Deep Learning. Springer 2017, ISBN 978-3-319-64679-4 - [i9]Juncheng Li, Wei Dai, Florian Metze, Shuhui Qu, Samarjit Das:
A Comparison of deep learning methods for environmental sound. CoRR abs/1703.06902 (2017) - [i8]Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel:
Comparison of Decoding Strategies for CTC Acoustic Models. CoRR abs/1708.04469 (2017) - [i7]Boyang Li, Beth Cardier, Tong Wang, Florian Metze:
Annotating High-Level Structures of Short Stories and Personal Anecdotes. CoRR abs/1710.06917 (2017) - [i6]Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze:
Visual Features for Context-Aware Speech Recognition. CoRR abs/1712.00489 (2017) - [i5]Thomas Zenkel, Ramon Sanabria, Florian Metze, Alex Waibel:
Subword and Crossword Units for CTC Acoustic Models. CoRR abs/1712.06855 (2017) - [i4]Shao-Yen Tseng, Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das:
Multiple Instance Deep Learning for Weakly Supervised Audio Event Detection. CoRR abs/1712.09673 (2017) - [i3]Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das:
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging. CoRR abs/1712.09680 (2017) - 2016
- [c134]Marvin Ritter, Markus Müller, Sebastian Stüker, Florian Metze, Alex Waibel:
Training Deep Neural Networks for Reverberation Robust Speech Recognition. ITG Symposium on Speech Communication 2016: 1-5 - [c133]Yajie Miao, Mohammad Gowayyed, Xingyu Na, Tom Ko, Florian Metze, Alexander Waibel:
An empirical exploration of CTC acoustic models. ICASSP 2016: 2623-2627 - [c132]Yun Wang, Leonardo Neves, Florian Metze:
Audio-based multimedia event detection using deep recurrent neural networks. ICASSP 2016: 2742-2746 - [c131]Florian Metze, Eric Riebling, Anne S. Warlaumont, Elika Bergelson:
Virtual Machines and Containers as a Platform for Experimentation. INTERSPEECH 2016: 1603-1607 - [c130]Rebecca Bates, Eric Fosler-Lussier, Florian Metze, Martha A. Larson, Gina-Anne Levow, Emily Mower Provost:
Experiences with Shared Resources for Research and Education in Speech and Language Processing. INTERSPEECH 2016: 1627-1631 - [c129]Yashesh Gaur, Florian Metze, Jeffrey P. Bigham:
Manipulating Word Lattices to Incorporate Human Corrections. INTERSPEECH 2016: 3062-3065 - [c128]Yajie Miao, Florian Metze:
Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach. INTERSPEECH 2016: 3414-3418 - [c127]Yun Wang, Florian Metze:
Recurrent Support Vector Machines for Audio-Based Multimedia Event Detection. ICMR 2016: 265-269 - [c126]Yashesh Gaur, Walter S. Lasecki, Florian Metze, Jeffrey P. Bigham:
The effects of automatic speech recognition quality on human transcription latency. W4A 2016: 23:1-23:8 - [i2]Ramon Sanabria, Florian Metze, Fernando De la Torre:
Robust end-to-end deep audiovisual speech recognition. CoRR abs/1611.06986 (2016) - 2015
- [j9]Yajie Miao, Hao Zhang, Florian Metze:
Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors. IEEE ACM Trans. Audio Speech Lang. Process. 23(11): 1938-1949 (2015) - [c125]Yajie Miao, Mohammad Gowayyed, Florian Metze:
EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding. ASRU 2015: 167-174 - [c124]Florian Metze, Ankur Gandhe, Yajie Miao, Zaid Sheikh, Yun Wang, Di Xu, Hao Zhang, Jungsuk Kim, Ian R. Lane, Wonkyum Lee, Sebastian Stüker, Markus Müller:
Semi-supervised training in low-resource ASR and KWS. ICASSP 2015: 4699-4703 - [c123]Hao Zhang, Yajie Miao, Florian Metze:
Regularizing DNN acoustic models with Gaussian stochastic neurons. ICASSP 2015: 4964-4968 - [c122]Xavier Anguera, Luis Javier Rodríguez-Fuentes, Andi Buzo, Florian Metze, Igor Szöke, Mikel Peñagarikano:
QUESST2014: Evaluating Query-by-Example Speech Search in a zero-resource setting with real-life queries. ICASSP 2015: 5833-5837 - [c121]Yajie Miao, Florian Metze:
Distance-aware DNNs for robust speech recognition. INTERSPEECH 2015: 761-765 - [c120]Yajie Miao, Florian Metze:
On speaker adaptation of long short-term memory recurrent neural networks. INTERSPEECH 2015: 1101-1105 - [c119]Florian Metze, Eric Riebling, Eric Fosler-Lussier, Andrew R. Plummer, Rebecca Bates:
The speech recognition virtual kitchen turns one. INTERSPEECH 2015: 2617-2618 - [c118]Yashesh Gaur, Florian Metze, Yajie Miao, Jeffrey P. Bigham:
Using keyword spotting to help humans correct captioning faster. INTERSPEECH 2015: 2829-2833 - [c117]Igor Szöke, Luis Javier Rodríguez-Fuentes, Andi Buzo, Xavier Anguera, Florian Metze, Jorge Proença, Martin Lojka, Xiao Xiong:
Query by Example Search on Speech at Mediaeval 2015. MediaEval 2015 - [c116]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann:
CMU Informedia@TRECVID 2015: MED/SIN/LNK/SED. TRECVID 2015 - [i1]Yajie Miao, Mohammad Gowayyed, Florian Metze:
EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding. CoRR abs/1507.08240 (2015) - 2014
- [j8]Anuj Kumar, Florian Metze, Matthew Kam:
Enabling the Rapid Development and Adoption of Speech-User Interfaces. Computer 47(1): 40-47 (2014) - [j7]Florian Metze, Xavier Anguera, Etienne Barnard, Marelie H. Davel, Guillaume Gravier:
Language independent search in MediaEval's Spoken Web Search task. Comput. Speech Lang. 28(5): 1066-1082 (2014) - [c115]Florian Metze, Koichi Shinoda:
Semantics for Large-Scale Multimedia: New Challenges for NLP. ACL (Tutorial Abstracts) 2014: 6 - [c114]Yulia Tsvetkov, Florian Metze, Chris Dyer:
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation. EACL 2014: 616-625 - [c113]Yipei Wang, Shourabh Rawat, Florian Metze:
Exploring audio semantic concepts for event-based video retrieval. ICASSP 2014: 1360-1364 - [c112]Yipei Wang, Shourabh Rawat, Florian Metze:
Semi-automatic audio semantic concept discovery for multimedia retrieval. ICASSP 2014: 1375-1379 - [c111]Ankur Gandhe, Florian Metze, Alex Waibel, Ian R. Lane:
Optimization of Neural Network Language Models for keyword search. ICASSP 2014: 4888-4892 - [c110]Florian Metze, Shourabh Rawat, Yipei Wang:
Improved audio features for large-scale multimedia event detection. ICME 2014: 1-6 - [c109]Yajie Miao, Florian Metze:
Improving language-universal feature extraction with deep maxout and convolutional neural networks. INTERSPEECH 2014: 800-804 - [c108]Yajie Miao, Hao Zhang, Florian Metze:
Distributed learning of multilingual DNN feature extractors using GPUs. INTERSPEECH 2014: 830-834 - [c107]Andrew R. Plummer, Eric Riebling, Anuj Kumar, Florian Metze, Eric Fosler-Lussier, Rebecca Bates:
The speech recognition virtual kitchen: launch party. INTERSPEECH 2014: 2140-2141 - [c106]Yajie Miao, Hao Zhang, Florian Metze:
Towards speaker adaptive training of deep neural network acoustic models. INTERSPEECH 2014: 2189-2193 - [c105]Xavier Anguera, Luis Javier Rodríguez-Fuentes, Igor Szöke, Andi Buzo, Florian Metze, Mikel Peñagarikano:
Query-by-example spoken term detection on multilingual unconstrained speech. INTERSPEECH 2014: 2459-2463 - [c104]Yun Wang, Florian Metze:
An in-depth comparison of keyword specific thresholding and sum-to-one score normalization. INTERSPEECH 2014: 2474-2478 - [c103]Ankur Gandhe, Florian Metze, Ian R. Lane:
Neural network language models for low resource languages. INTERSPEECH 2014: 2615-2619 - [c102]Di Xu, Florian Metze:
Word-based probabilistic phonetic retrieval for low-resource spoken term detection. INTERSPEECH 2014: 2774-2778 - [c101]Markus Müller, Sebastian Stüker, Zaid Sheikh, Florian Metze, Alex Waibel:
Multilingual deep bottle neck features: a study on language selection and training techniques. IWSLT 2014 - [c100]Xavier Anguera, Luis Javier Rodríguez-Fuentes, Igor Szöke, Andi Buzo, Florian Metze:
Query by Example Search on Speech at Mediaeval 2014. MediaEval 2014 - [c99]Lara J. Martin, Matthew Stone, Florian Metze, Jack Mostow:
A methodology for using crowdsourced data to measure uncertainty in natural speech. SLT 2014: 95-99 - [c98]Yajie Miao, Lu Jiang, Hao Zhang, Florian Metze:
Improvements to speaker adaptive training of deep neural networks. SLT 2014: 165-170 - [c97]Di Xu, Yun Wang, Florian Metze:
EM-based phoneme confusion matrix generation for low-resource spoken term detection. SLT 2014: 424-429 - [c96]Jan Trmal, Guoguo Chen, Daniel Povey, Sanjeev Khudanpur, Pegah Ghahremani, Xiaohui Zhang, Vimal Manohar, Chunxi Liu, Aren Jansen, Dietrich Klakow, David Yarowsky, Florian Metze:
A keyword search system using open source software. SLT 2014: 530-535 - [c95]Xavier Anguera, Luis Javier Rodríguez-Fuentes, Igor Szöke, Andi Buzo, Florian Metze, Mikel Peñagarikano:
Query-by-example spoken term detection evaluation on low-resource languages. SLTU 2014: 24-31 - [c94]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann, Anil Armagan, Yicheng Zhao:
Informedia @ TRECVID 2014. TRECVID 2014 - 2013
- [j6]Florian Metze, Duo Ding, Ehsan Younessian, Alexander G. Hauptmann:
Beyond audio and video retrieval: topic-oriented multimedia summarization. Int. J. Multim. Inf. Retr. 2(2): 131-144 (2013) - [c93]Udhyakumar Nallasamy, Mark C. Fuhs, Monika Woszczyna, Florian Metze, Tanja Schultz:
Neighbour selection and adaptation for rapid speaker-dependent ASR. ASRU 2013: 60-65 - [c92]Florian Metze, Zaid Sheikh, Alex Waibel, Jonas Gehring, Kevin Kilgour, Quoc Bao Nguyen, Van Huy Nguyen:
Models of tone for tonal and non-tonal languages. ASRU 2013: 261-266 - [c91]Jonas Gehring, Quoc Bao Nguyen, Florian Metze, Alex Waibel:
DNN acoustic modeling with modular multi-lingual feature extraction networks. ASRU 2013: 344-349 - [c90]Yajie Miao, Florian Metze, Shourabh Rawat:
Deep maxout networks for low-resource speech recognition. ASRU 2013: 398-403 - [c89]Ankur Gandhe, Long Qin, Florian Metze, Alexander I. Rudnicky, Ian R. Lane, Matthias Eck:
Using web text to improve keyword spotting in speech. ASRU 2013: 428-433 - [c88]Jonas Gehring, Yajie Miao, Florian Metze, Alex Waibel:
Extracting deep bottleneck features using stacked auto-encoders. ICASSP 2013: 3377-3381 - [c87]Yajie Miao, Florian Metze, Alex Waibel:
Subspace mixture model for low-resource speech recognition in cross-lingual settings. ICASSP 2013: 7339-7343 - [c86]Yulia Tsvetkov, Zaid Sheikh, Florian Metze:
Identification and modeling of word fragments in spontaneous speech. ICASSP 2013: 7624-7628 - [c85]Yajie Miao, Florian Metze, Alex Waibel:
Learning discriminative basis coefficients for eigenspace MLLR unsupervised adaptation. ICASSP 2013: 7927-7931 - [c84]Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard C. Rose, Mike Seltzer, Pascal Clark, Ian McGraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Börschinger, Justin T. Chiu, Ewan Dunbar, Abdellah Fourtassi, David Harwath, Chia-ying Lee, Keith D. Levin, Atta Norouzian, Vijayaditya Peddinti, Rachael Richardson, Thomas Schatz, Samuel Thomas:
A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition. ICASSP 2013: 8111-8115 - [c83]Florian Metze, Xavier Anguera, Etienne Barnard, Marelie H. Davel, Guillaume Gravier:
The spoken web search task at MediaEval 2012. ICASSP 2013: 8121-8125 - [c82]Sujay Kumar Jauhar, Yun-Nung Chen, Florian Metze:
Prosody-Based Unsupervised Speech Summarization with Two-Layer Mutually Reinforced Random Walk. IJCNLP 2013: 648-654 - [c81]Yun-Nung Chen, Florian Metze:
Multi-layer mutually reinforced random walk with hidden parameters for improved multi-party meeting summarization. INTERSPEECH 2013: 485-489 - [c80]Anuj Kumar, Florian Metze, Wenyi Wang, Matthew Kam:
Formalizing expert knowledge for developing accurate speech recognizers. INTERSPEECH 2013: 1121-1125 - [c79]Florian Metze, Eric Fosler-Lussier, Rebecca Bates:
The speech recognition virtual kitchen. INTERSPEECH 2013: 1858-1860 - [c78]Yajie Miao, Florian Metze:
Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training. INTERSPEECH 2013: 2237-2241 - [c77]Shourabh Rawat, Peter F. Schulam, Susanne Burger, Duo Ding, Yipei Wang, Florian Metze:
Robust audio-codebooks for large-scale event detection in consumer videos. INTERSPEECH 2013: 2929-2933 - [c76]Xavier Anguera, Florian Metze, Andi Buzo, Igor Szöke, Luis Javier Rodríguez-Fuentes:
The Spoken Web Search Task. MediaEval 2013 - [c75]Zhenzhong Lan, Lu Jiang, Shoou-I Yu, Chenqiang Gao, Shourabh Rawat, Yang Cai, Shicheng Xu, Haoquan Shen, Xuanchong Li, Yipei Wang, Waito Sze, Yan Yan, Zhigang Ma, Nicolas Ballas, Deyu Meng, Wei Tong, Yi Yang, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Richard M. Stern, Teruko Mitamura, Eric Nyberg, Alexander G. Hauptmann:
Informedia@TRECVID 2013. TRECVID 2013 - 2012
- [j5]Karen Livescu, Eric Fosler-Lussier, Florian Metze:
Subword Modeling for Automatic Speech Recognition: Past, Present, and Emerging Approaches. IEEE Signal Process. Mag. 29(6): 44-57 (2012) - [c74]Alan W. Black, H. Timothy Bunnell, Ying Dou, Prasanna Kumar Muthukumar, Florian Metze, Daniel Perry, Tim Polzehl, Kishore Prahallad, Stefan Steidl, Callie Vaughn:
Articulatory features for expressive speech synthesis. ICASSP 2012: 4005-4008 - [c73]Florian Metze, Nitendra Rajput, Xavier Anguera, Marelie H. Davel, Guillaume Gravier, Charl Johannes van Heerden, Gautam Varma Mantena, Armando Muscariello, Kishore Prahallad, Igor Szöke, Javier Tejedor:
The Spoken Web Search Task at MediaEval 2011. ICASSP 2012: 5165-5168 - [c72]Duo Ding, Florian Metze, Shourabh Rawat, Peter Franz Schulam, Susanne Burger:
Generating Natural Language Summaries for Multimedia. INLG 2012: 128-130 - [c71]Tim Polzehl, Katrin Schoenenberg, Sebastian Möller, Florian Metze, Gelareh Mohammadi, Alessandro Vinciarelli:
On Speaker-Independent Personality Perception and Prediction from Speech. INTERSPEECH 2012: 258-261 - [c70]Florian Metze, Eric Fosler-Lussier:
The Speech Recognition Virtual Kitchen: An Initial Prototype. INTERSPEECH 2012: 1872-1873 - [c69]Udhyakumar Nallasamy, Florian Metze, Tanja Schultz:
Enhanced Polyphone Decision Tree Adaptation for Accented Speech Recognition. INTERSPEECH 2012: 1902-1905 - [c68]Qin Jin, Peter Franz Schulam, Shourabh Rawat, Susanne Burger, Duo Ding, Florian Metze:
Event-based Video Retrieval Using Audio. INTERSPEECH 2012: 2085-2088 - [c67]Yun-Nung Chen, Florian Metze:
Integrating Intra-Speaker Topic Modeling and Temporal-Based Inter-Speaker Topic Modeling in Random Walk for Improved Multi-Party Meeting Summarization. INTERSPEECH 2012: 2346-2349 - [c66]Ngoc Thang Vu, Wojtek Breiter, Florian Metze, Tanja Schultz:
Initialization Schemes for Multilayer Perceptron Training and their Impact on ASR Performance using Multilingual Data. INTERSPEECH 2012: 2586-2589 - [c65]Florian Metze, Etienne Barnard, Marelie H. Davel, Charl Johannes van Heerden, Xavier Anguera, Guillaume Gravier, Nitendra Rajput:
The Spoken Web Search Task. MediaEval 2012 - [c64]Duo Ding, Florian Metze, Shourabh Rawat, Peter Franz Schulam, Susanne Burger, Ehsan Younessian, Lei Bao, Michael G. Christel, Alexander G. Hauptmann:
Beyond audio and video retrieval: towards multimedia summarization. ICMR 2012: 2 - [c63]Udhyakumar Nallasamy, Florian Metze, Tanja Schultz:
Semi-supervised learning for speech recognition in the context of accent adaptation. MLSLP 2012: 13-17 - [c62]Gerald Friedland, Daniel P. W. Ellis, Florian Metze:
AMVA'12: ACM international workshop on audio and multimedia methods for large-scale video analysis. ACM Multimedia 2012: 1513-1514 - [c61]Yun-Nung Chen, Florian Metze:
Intra-Speaker Topic Modeling for Improved Multi-Party Meeting Summarization with Integrated Random Walk. HLT-NAACL 2012: 377-381 - [c60]Udhyakumar Nallasamy, Florian Metze, Tanja Schultz:
Active learning for accent adaptation in Automatic Speech Recognition. SLT 2012: 360-365 - [c59]Yun-Nung Chen, Florian Metze:
Two-layer mutually reinforced random walk for improved multi-party meeting summarization. SLT 2012: 461-466 - [c58]Jochen Weiner, Ngoc Thang Vu, Dominic Telaar, Florian Metze, Tanja Schultz, Dau-Cheng Lyu, Engsiong Chng, Haizhou Li:
Integration of language identification into a recognition system for spoken conversations containing code-Switches. SLTU 2012: 76-79 - [c57]Ngoc Thang Vu, Florian Metze, Tanja Schultz:
Multilingual bottle-neck features and its application for under-resourced languages. SLTU 2012: 90-93 - [c56]Shoou-I Yu, Zhongwen Xu, Duo Ding, Waito Sze, Francisco Vicente, Zhenzhong Lan, Yang Cai, Shourabh Rawat, Peter F. Schulam, Nisarga Markandaiah, Sohail Bahmani, Antonio Juárez, Wei Tong, Yi Yang, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Richard M. Stern, Teruko Mitamura, Eric Nyberg, Lu Jiang, Qiang Chen, Lisa M. Brown, Ankur Datta, Quanfu Fan, Rogério Schmidt Feris, Shuicheng Yan, Alexander G. Hauptmann, Sharath Pankanti:
Informedia @TRECVID 2012. TRECVID 2012 - [e3]Martha A. Larson, Sebastian Schmiedeke, Pascal Kelm, Adam Rae, Vasileios Mezaris, Tomas Piatrik, Mohammad Soleymani, Florian Metze, Gareth J. F. Jones:
Working Notes Proceedings of the MediaEval 2012 Workshop, Santa Croce in Fossabanda, Pisa, Italy, October 4-5, 2012. CEUR Workshop Proceedings 927, CEUR-WS.org 2012 [contents] - 2011
- [j4]Tim Polzehl, Alexander Schmitt, Florian Metze, Michael Wagner:
Anger recognition in speech using acoustic and linguistic cues. Speech Commun. 53(9-10): 1198-1209 (2011) - [c55]Tim Polzehl, Alexander Schmitt, Florian Metze:
Salient Features for Anger Recognition in German and English IVR Portals. IWSDS 2011: 83-105 - [c54]Florian Metze, Alan W. Black, Tim Polzehl:
A Review of Personality in Voice-Based Man Machine Interaction. HCI (2) 2011: 358-367 - [c53]Udhyakumar Nallasamy, Michael Garbus, Florian Metze, Qin Jin, Thomas Schaaf, Tanja Schultz:
Analysis of Dialectal Influence in Pan-Arabic ASR. INTERSPEECH 2011: 1721-1724 - [c52]Tim Polzehl, Sebastian Möller, Florian Metze:
Modeling Speaker Personality Using Voice. INTERSPEECH 2011: 2369-2372 - [c51]Nitendra Rajput, Florian Metze:
Spoken Web Search. MediaEval 2011 - [c50]Lei Bao, Longfei Zhang, Shoou-I Yu, Zhen-zhong Lan, Lu Jiang, Arnold Overwijk, Qin Jin, Shohei Takahashi, Brian Langner, Yuanpeng Li, Michael Garbus, Susanne Burger, Florian Metze, Alexander G. Hauptmann:
Informedia@TRECVID 2011: Surveillance Event Detection. TRECVID 2011 - [e2]Martha A. Larson, Adam Rae, Claire-Hélène Demarty, Christoph Kofler, Florian Metze, Raphaël Troncy, Vasileios Mezaris, Gareth J. F. Jones:
Working Notes Proceedings of the MediaEval 2011 Workshop, Santa Croce in Fossabanda, Pisa, Italy, September 1-2, 2011. CEUR Workshop Proceedings 807, CEUR-WS.org 2011 [contents] - 2010
- [c49]Björn W. Schuller, Florian Metze, Stefan Steidl, Anton Batliner, Florian Eyben, Tim Polzehl:
Late fusion of individual engines for improved recognition of negative emotion in speech - learning vs. democratic vote. ICASSP 2010: 5230-5233 - [c48]Thomas Schaaf, Florian Metze:
Analysis of gender normalization using MLP and VTLN features. INTERSPEECH 2010: 306-309 - [c47]Florian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Björn W. Schuller, Stefan Steidl:
Emotion recognition using imperfect speech recognition. INTERSPEECH 2010: 478-481 - [c46]Roger Hsiao, Florian Metze, Tanja Schultz:
Improvements to generalized discriminative feature transformation for speech recognition. INTERSPEECH 2010: 1361-1364 - [c45]Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz:
The 2010 CMU GALE speech-to-text system. INTERSPEECH 2010: 1501-1504 - [c44]Martha A. Larson, Roeland Ordelman, Florian Metze, Wessel Kraaij, Franciska de Jong:
Multimedia content with a speech track: ACM multimedia 2010 workshop on searching spontaneous conversational speech. ACM Multimedia 2010: 1747-1748 - [c43]Tim Polzehl, Sebastian Möller, Florian Metze:
Automatically Assessing Personality from Speech. ICSC 2010: 134-140 - [c42]Tim Polzehl, Sebastian Möller, Florian Metze:
Automatically assessing acoustic manifestations of personality in speech. SLT 2010: 7-12 - [c41]Huan Li, Lei Bao, Zan Gao, Arnold Overwijk, Wei Liu, Longfei Zhang, Shoou-I Yu, Ming-yu Chen, Florian Metze, Alexander G. Hauptmann:
Informedia @ TRECVID2010. TRECVID 2010 - [e1]Martha A. Larson, Roeland Ordelman, Florian Metze, Franciska de Jong, Wessel Kraaij:
Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech, SSCS '10, Firenze, Italy, October 29, 2010. ACM 2010, ISBN 978-1-4503-0162-6 [contents]
2000 – 2009
- 2009
- [j3]Florian Metze, Roman Englert, Udo Bub, Felix Burkhardt, Joachim Stegmann:
Getting closer: tailored human-computer speech dialog. Univers. Access Inf. Soc. 8(2): 97-108 (2009) - [c40]Florian Metze, Ina Wechsung, Stefan Schaffer, Julia Seebode, Sebastian Möller:
Reliable Evaluation of Multimodal Dialogue Systems. HCI (2) 2009: 75-83 - [c39]Ina Wechsung, Klaus-Peter Engelbrecht, Stefan Schaffer, Julia Seebode, Florian Metze, Sebastian Möller:
Usability Evaluation of Multimodal Interfaces: Is the Whole the Sum of Its Parts? HCI (2) 2009: 113-119 - [c38]Felix Burkhardt, Tim Polzehl, Joachim Stegmann, Florian Metze, Richard Huber:
Detecting real life anger. ICASSP 2009: 4761-4764 - [c37]Julia Seebode, Stefan Schaffer, Ina Wechsung, Florian Metze:
Influence of training on direct and indirect measures for the evaluation of multimodal systems. INTERSPEECH 2009: 300-303 - [c36]Tim Polzehl, Shiva Sundaram, Hamed Ketabdar, Michael Wagner, Florian Metze:
Emotion classification in children's speech using fusion of acoustic and linguistic features. INTERSPEECH 2009: 340-343 - [c35]Ina Wechsung, Klaus-Peter Engelbrecht, Anja B. Naumann, Stefan Schaffer, Julia Seebode, Florian Metze, Sebastian Möller:
Predicting the quality of multimodal systems based on judgments of single modalities. INTERSPEECH 2009: 1827-1830 - [c34]Roman Englert, Florian Metze:
Digital Signage mit Interaktiven Displays. MuC (Workshopband) 2009: 3-5 - [c33]Stefan Schaffer, Julia Seebode, Ina Wechsung, Florian Metze, Sebastian Möller:
Benutzerstudien zur Bewertung multimodaler, interaktiver Anzeigetafeln in unterschiedlichen Entwicklungsstufen. MuC (Workshopband) 2009: 22-27 - [c32]Ina Wechsung, Klaus-Peter Engelbrecht, Julia Seebode, Stefan Schaffer, Florian Metze, Sebastian Möller:
Usability-Evaluation multimodaler Schnittstellen: Ist das Ganze die Summe seiner Teile? MuC 2009: 495-498 - [c31]Florian Metze, Tim Polzehl, Michael Wagner:
Fusion of Acoustic and Linguistic Features for Emotion Detection. ICSC 2009: 153-160 - 2008
- [c30]Robert Wetzker, Winfried Umbrath, Leonhard Hennig, Christian Bauckhage, Tansu Alpcan, Florian Metze:
Tailoring Taxonomies for Efficient Text Categorization and Expert Finding. Web Intelligence/IAT Workshops 2008: 459-462 - [c29]Robert Wetzker, Till Plumbaum, Alexander Korth, Christian Bauckhage, Tansu Alpcan, Florian Metze:
Detecting trends in social bookmarking systems using a probabilistic generative model and smoothing. ICPR 2008: 1-4 - [c28]Florian Metze, Roman Englert, Udo Bub, Ingmar Kliche, Thomas Scheerbarth:
User perception of multi-modal interfaces for mobile applications. INTERSPEECH 2008: 2470-2473 - 2007
- [j2]Florian Metze:
Discriminative speaker adaptation using articulatory features. Speech Commun. 49(5): 348-360 (2007) - [c27]Jitendra Ajmera, Florian Metze:
Spotting using Durational Entropy. ICASSP (4) 2007: 973-976 - [c26]Florian Metze, Jitendra Ajmera, Roman Englert, Udo Bub, Felix Burkhardt, Joachim Stegmann, Christian A. Müller, Richard Huber, Bernt Andrassy, Josef G. Bauer, Bernhard Littel:
Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications. ICASSP (4) 2007: 1089-1092 - [c25]Florian Metze:
On using Articulatory Features for Discriminative Speaker Adaptation. HLT-NAACL (Short Papers) 2007: 117-120 - [c24]Florian Metze, Christian Bauckhage, Tansu Alpcan:
The "Spree" Expert Finding System. ICSC 2007: 551-558 - [c23]Christian Bauckhage, Tansu Alpcan, Sachin Agarwal, Florian Metze, Robert Wetzker, Milena Ilic, Sahin Albayrak:
An intelligent knowledge sharing system for web communities. SMC 2007: 3069-3074 - 2006
- [c22]Florian Metze:
Articulatory features for "meeting" speech recognition. INTERSPEECH 2006 - 2005
- [b1]Florian Metze:
Articulatory features for conversational speech recognition. Karlsruhe Institute of Technology, Germany, 2005, pp. 1-169 - [c21]Florian Metze, Christian Fügen, Yue Pan, Alex Waibel:
Automatically Transcribing Meetings using Distant Microphones. ICASSP (1) 2005: 989-992 - [c20]Florian Metze, Petra Gieselmann, Hartwig Holzapfel, Tobias Kluge, Ivica Rogina, Alex Waibel, Matthias Wölfel, James L. Crowley, Patrick Reignier, Dominique Vaufreydaz, François Bérard, Bérangère Cohen, Joëlle Coutaz, Sylvie Rouillard, Victoria Arranz, Manuel Bertrán, Horacio Rodríguez:
The "FAME" Interactive Space. MLMI 2005: 126-137 - 2004
- [c19]Jan Kratt, Florian Metze, Rainer Stiefelhagen, Alex Waibel:
Large Vocabulary Audio-Visual Speech Recognition Using the Janus Speech Recognition Toolkit. DAGM-Symposium 2004: 488-495 - [c18]Hagen Soltau, Hua Yu, Florian Metze, Christian Fügen, Qin Jin, Szu-Chen Stan Jou:
The 2003 ISL rich transcription system for conversational telephony speech. ICASSP (1) 2004: 773-776 - [c17]Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen:
Issues in meeting transcription - the ISL meeting transcription system. INTERSPEECH 2004: 1709-1712 - 2003
- [c16]Sebastian Stüker, Tanja Schultz, Florian Metze, Alex Waibel:
Multilingual articulatory features. ICASSP (1) 2003: 144-147 - [c15]Sebastian Stüker, Florian Metze, Tanja Schultz, Alex Waibel:
Integrating multilingual articulatory features into speech recognition. INTERSPEECH 2003: 1033-1036 - [c14]Nadia Mana, Susanne Burger, Roldano Cattoni, Laurent Besacier, Victoria MacLaren, John W. McDonough, Florian Metze:
The NESPOLE! voIP multilingual corpora in tourism and medical domains. INTERSPEECH 2003: 1589-1592 - 2002
- [c13]Alon Lavie, Florian Metze, Roldano Cattoni, Erica Costantini:
A Multi-Perspective Evaluation of the NESPOLE! Speech-to-Speech Translation System. Speech-to-Speech Translation@ACL 2002: 121-128 - [c12]Hagen Soltau, Florian Metze, Christian Fügen, Alex Waibel:
Efficient language model lookahead through polymorphic linguistic context assignment. ICASSP 2002: 709-712 - [c11]Hagen Soltau, Florian Metze, Alex Waibel:
Compensating for hyperarticulation by modeling articulatory properties. INTERSPEECH 2002: 841-844 - [c10]Florian Metze, Alex Waibel:
A flexible stream architecture for ASR using articulatory features. INTERSPEECH 2002: 2133-2136 - 2001
- [c9]Hagen Soltau, Thomas Schaaf, Florian Metze, Alex Waibel:
The ISL evaluation system for Verbmobil-II. ICASSP 2001: 65-68 - [c8]John W. McDonough, Florian Metze, Hagen Soltau, Alex Waibel:
Speaker compensation with sine-log all-pass transforms. ICASSP 2001: 369-372 - [c7]Alex Waibel, Michael Bett, Florian Metze, Klaus Ries, Thomas Schaaf, Tanja Schultz, Hagen Soltau, Hua Yu, Klaus Zechner:
Advances in automatic meeting record creation and access. ICASSP 2001: 597-600 - [c6]Susanne Burger, Laurent Besacier, Paolo Coletti, Florian Metze, Céline Morel:
The nespole! voIP dialogue database. INTERSPEECH 2001: 2043-2046 - [c5]Florian Metze, John W. McDonough, Hagen Soltau:
Speech recognition over netmeeting connections. INTERSPEECH 2001: 2389-2392 - [c4]Alex Waibel, Hua Yu, Tanja Schultz, Yue Pan, Michael Bett, Martin Westphal, Hagen Soltau, Thomas Schaaf, Florian Metze:
Advances in meeting recognition. HLT 2001 - 2000
- [j1]Sebastian Albrecht, Jan Busch, Martin Kloppenburg, Florian Metze, Paul Tavan:
Generalized radial basis function networks for classification and novelty detection: self-organization of optimal Bayesian decision. Neural Networks 13(10): 1075-1093 (2000) - [c3]Florian Metze, Thomas Kemp, Thomas Schaaf, Tanja Schultz, Hagen Soltau:
Confidence measure based language identification. ICASSP 2000: 1827-1830 - [c2]Florian Metze, Thomas Kemp:
Das View4You- System: End-to-End Evaluation. KONVENS 2000: 273-278
1990 – 1999
- 1996
- [c1]Daniel Zboril, Florian Metze:
Indeterminateness in Qualitative and Quantitative Reasoning. DEXA Workshop 1996: 262-267
last updated on 2025-01-20 23:05 CET by the dblp team
all metadata released as open data under CC0 1.0 license