Hiding in Plain Site: Detecting JavaScript Obfuscation through Concealed Browser API Usage

Research article
DOI: 10.1145/3419394.3423616
Published: 27 October 2020

Abstract

In this paper, we perform a large-scale measurement study of JavaScript obfuscation of browser APIs in the wild. We rely on a simple but powerful observation: if dynamic analysis of a script's behavior (specifically, how it interacts with browser APIs) reveals browser API usage that cannot be reconciled with static analysis of the script's source code, then that behavior is obfuscated. To quantify and test this observation, we build a hybrid analysis platform that uses an instrumented Chromium to log every browser API access made by the scripts that execute when a user visits a page. We then filter the API access traces from this dynamic analysis through a static analysis tool that we developed, in order to quantify how much functionality, and what kind, is hidden on the web. Applying this methodology across the Alexa top 100k domains, we find that 95.90% of the domains we successfully visited contain at least one script that invokes APIs that cannot be resolved through static analysis. We observe that eval is no longer the prominent obfuscation method on the web, and we uncover families of novel obfuscation techniques that do not rely on eval.
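The core observation can be illustrated with a toy reconciliation check. The sketch below is not the authors' tool: the `ApiAccess` record, the `Document.cookie` feature naming, and the rule "the member name must appear literally at the logged source offset" are all assumptions made for illustration. It only shows the general idea of comparing a dynamic trace against the script text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiAccess:
    """One dynamically logged browser API access (hypothetical trace record)."""
    feature: str  # assumed naming scheme, e.g. "Document.cookie"
    offset: int   # character offset in the script source where the access happened


def is_statically_resolvable(source: str, access: ApiAccess) -> bool:
    """True if the accessed member name appears literally at the logged offset."""
    member = access.feature.rsplit(".", 1)[-1]
    return source.startswith(member, access.offset)


def concealed_accesses(source: str, trace: list[ApiAccess]) -> list[ApiAccess]:
    """Accesses seen at run time that the source text alone does not explain."""
    return [a for a in trace if not is_statically_resolvable(source, a)]


# A direct access: the property name is visible in the source.
plain = "var c = document.cookie;"
# A concealed access: the property name is assembled at run time, without eval.
hidden = 'var k = "coo" + "kie"; var c = document[k];'

# Hand-written traces standing in for what an instrumented browser might log.
plain_trace = [ApiAccess("Document.cookie", plain.index("cookie"))]
hidden_trace = [ApiAccess("Document.cookie", hidden.index("[k]") + 1)]

print(concealed_accesses(plain, plain_trace))    # empty list: nothing concealed
print(concealed_accesses(hidden, hidden_trace))  # one entry: the cookie access
```

In this toy setting, the second script reads the same API as the first, but the name `cookie` never appears at the access site, so the access is flagged as concealed even though no `eval` is involved.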

Supplementary Material

MP4 File (imc2020-3419394.3423616-long.mp4)
Presentation video.
MP4 File (imc2020-3419394.3423616-short.mp4)
Presentation video.

Published In

IMC '20: Proceedings of the ACM Internet Measurement Conference
October 2020
751 pages
ISBN: 9781450381383
DOI: 10.1145/3419394
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

IMC '20
IMC '20: ACM Internet Measurement Conference
October 27 - 29, 2020
Virtual Event, USA

Acceptance Rates

IMC '20 Paper Acceptance Rate 53 of 216 submissions, 25%;
Overall Acceptance Rate 277 of 1,083 submissions, 26%
