Research article
DOI: 10.1145/3359789.3359816

Robust keystroke transcription from the acoustic side-channel

Published: 09 December 2019

Abstract

The acoustic emanations from keyboards form a side channel through which an attacker can recover sensitive user information, such as passwords and personally identifiable information. Previous work has shown that these attacks are feasible given isolated keystrokes, but it has not demonstrated robust keystroke detection and segmentation in the presence of realistic noise and fast typing speeds. Common problems include overlapping keystroke waveforms and extraneous noise such as doors closing or speech. Prior work has assumed that the waveforms of individual keystrokes can be isolated with near 100% accuracy, but we show that these techniques generate a large number of misses and false positives, drastically degrading the downstream keystroke classification task.
To solve this problem, we present a deep learning system that leverages related state-of-the-art techniques from speech transcription to perform end-to-end, audio-to-keystroke transcription. Its recurrent architecture enables it to robustly handle overlapping waveforms and adapt to local noise profiles. Furthermore, the joint approach to keystroke detection and classification enables us both to train without ground-truth keystroke timings and to outperform standard classification approaches even when they have ground-truth timings. Due to the paucity of existing datasets, we collected a novel acoustic and keylogger dataset comprising 17 users and 86k keystrokes across various real-world typing tasks. On this dataset, we reduce the end-to-end character error rate on English text from 36.0% to 7.41% for known typists and from 41.3% to 15.41% for unknown typists. The keystroke acoustic side-channel attack remains dangerously feasible.
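The headline results are reported as character error rates (CER). The abstract does not spell out how CER is computed, so the following is only a minimal sketch assuming the usual definition: the Levenshtein edit distance between the typed text and the transcription, divided by the length of the typed text. The function names `edit_distance`, `character_error_rate`, and `relative_reduction` are hypothetical helpers introduced for illustration, not part of the authors' system.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Assumed CER definition: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error that was removed."""
    return (before - after) / before


if __name__ == "__main__":
    # Made-up strings: one missed keystroke and one misclassified keystroke.
    print(character_error_rate("hello world", "helo wurld"))
    # Error rates quoted in the abstract (known and unknown typists).
    print(relative_reduction(36.0, 7.41))
    print(relative_reduction(41.3, 15.41))
```

Under this definition, a missed keystroke counts as a deletion and a false positive as an insertion, so detection errors and classification errors are scored by the same metric. Applied to the figures in the abstract, the reported improvements correspond to relative error reductions of roughly 79% for known typists and 63% for unknown typists.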

Published In

ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference
December 2019
821 pages
ISBN: 9781450376280
DOI: 10.1145/3359789

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. acoustics
  2. deep learning
  3. keystroke transcription
  4. neural networks
  5. side-channel attacks
  6. user privacy

Qualifiers

  • Research-article

Funding Sources

  • Air Force Research Laboratory (AFRL)
  • Defense Advanced Research Projects Agency (DARPA)

Conference

ACSAC '19: 2019 Annual Computer Security Applications Conference
December 9 - 13, 2019
San Juan, Puerto Rico, USA

Acceptance Rates

ACSAC '19 Paper Acceptance Rate: 60 of 266 submissions, 23%
Overall Acceptance Rate: 104 of 497 submissions, 21%
