DOI: 10.1145/2517312.2517320
research-article

Approaches to adversarial drift

Published: 04 November 2013

Abstract

In this position paper, we argue that to be of practical interest, a machine-learning-based security system must engage with its human operators beyond feature engineering and instance labeling to address the challenge of drift in adversarial environments. We propose that designers of such systems broaden the classification goal into an explanatory goal, which would deepen the interaction with the system's operators.
To provide guidance, we advocate an approach based on maintaining one classifier for each class of unwanted activity to be filtered. We also emphasize the necessity for the system to be responsive to the operators' constant curation of the training set. We show how this paradigm provides a property we call isolation and how it relates to classical causative attacks.
To demonstrate the effects of drift on a binary classification task, we also report on two experiments using a previously unpublished malware data set in which each instance is timestamped according to when it was seen.
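The one-classifier-per-class idea can be illustrated with a small sketch. This is not the authors' implementation: the class names `FamilyDetector` and `IsolatedFilter`, the token-count scoring rule, and the sample data are all hypothetical, and "isolation" is read here (as an assumption) to mean that changing the training data of one class leaves the detectors of the other classes untouched.

```python
from collections import Counter


class FamilyDetector:
    """Toy one-vs-benign detector for a single class of unwanted activity."""

    def __init__(self, name):
        self.name = name
        self.weights = Counter()

    def fit(self, malicious, benign):
        # Token weight = occurrences in this class's samples minus
        # occurrences in benign samples (a deliberately simple rule).
        self.weights = Counter()
        for sample in malicious:
            self.weights.update(set(sample))
        for sample in benign:
            self.weights.subtract(set(sample))
        return self

    def flags(self, sample):
        # A sample is flagged when its summed token weights are positive.
        return sum(self.weights[tok] for tok in set(sample)) > 0


class IsolatedFilter:
    """Keeps one independent detector per class of unwanted activity."""

    def __init__(self):
        self.detectors = {}

    def retrain(self, family, malicious, benign):
        # Only the named class's detector is rebuilt; the others are untouched.
        self.detectors[family] = FamilyDetector(family).fit(malicious, benign)

    def explain(self, sample):
        # Names of the classes whose detectors fire: a crude "explanation".
        return sorted(n for n, d in self.detectors.items() if d.flags(sample))

    def is_unwanted(self, sample):
        return bool(self.explain(sample))


# Hypothetical behavior traces, represented as lists of tokens.
benign = [["open", "read", "close"], ["open", "write", "close"]]
filt = IsolatedFilter()
filt.retrain("dropper", [["open", "download", "exec"]], benign)
filt.retrain("keylogger", [["hook", "read", "send"]], benign)

# An operator curates only the keylogger training set and retrains it.
before = dict(filt.detectors["dropper"].weights)
filt.retrain("keylogger", [["hook", "send"], ["hook", "read", "send"]], benign)
after = dict(filt.detectors["dropper"].weights)

verdict = filt.explain(["open", "download", "exec"])
```

In this sketch the filter reports which class-specific detector fired rather than a bare malicious/benign bit, and retraining one class cannot alter another class's detector because no state is shared between them.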
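The abstract does not describe the experimental protocol, so the following is only a sketch of one common way timestamps are used to expose drift: training on instances seen before a cutoff and testing on those seen after it, in contrast to a random split that mixes time periods. The function names, record fields, and the six synthetic records are hypothetical and are not taken from the paper's data set.

```python
import random
from datetime import date


def temporal_split(instances, cutoff):
    """Train on everything seen before the cutoff, test on the rest."""
    train = [x for x in instances if x["seen"] < cutoff]
    test = [x for x in instances if x["seen"] >= cutoff]
    return train, test


def random_split(instances, train_fraction, seed=0):
    """Ignore timestamps: shuffle, then cut at the requested fraction."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    k = int(len(shuffled) * train_fraction)
    return shuffled[:k], shuffled[k:]


# Synthetic records for illustration only.
samples = [
    {"id": 1, "seen": date(2012, 1, 10), "family": "A"},
    {"id": 2, "seen": date(2012, 2, 3), "family": "A"},
    {"id": 3, "seen": date(2012, 3, 21), "family": "B"},
    {"id": 4, "seen": date(2012, 5, 2), "family": "B"},
    {"id": 5, "seen": date(2012, 6, 15), "family": "C"},
    {"id": 6, "seen": date(2012, 7, 30), "family": "C"},
]

train, test = temporal_split(samples, date(2012, 5, 1))

# Families that appear only after the cutoff were never seen in training.
unseen = {x["family"] for x in test} - {x["family"] for x in train}

r_train, r_test = random_split(samples, 0.5)
```

With the temporal split, family "C" appears only in the test set, which is the kind of situation a random split can hide by leaking later instances into training.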

Published In

AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
November 2013
116 pages
ISBN:9781450324885
DOI:10.1145/2517312
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adversarial machine learning
  2. concept drift
  3. malware classification

Qualifiers

  • Research-article

Conference

CCS'13
Acceptance Rates

AISec '13 Paper Acceptance Rate 10 of 17 submissions, 59%;
Overall Acceptance Rate 94 of 231 submissions, 41%
