Living-Off-The-Land Command Detection Using Active Learning

Ongun, Talha; Stokes, Jack W.; Or, Jonathan Bar; Tian, Ke; Tajaddodianfar, Farid; Neil, Joshua; Seifert, Christian; Oprea, Alina; Platt, John C.

doi:10.1145/3471621.3471858

arXiv:2111.15039 (cs)

[Submitted on 30 Nov 2021]

Abstract: In recent years, enterprises have been targeted by advanced adversaries who leverage creative ways to infiltrate their systems and move laterally to gain access to critical data. One increasingly common evasive method is to hide the malicious activity behind a benign program by using tools that are already installed on user computers. These programs are usually part of the operating system distribution or another user-installed binary; therefore, this type of attack is called "Living-Off-The-Land". Detecting these attacks is challenging, as adversaries may not create malicious files on the victim computers and anti-virus scans fail to detect them. We propose the design of an Active Learning framework called LOLAL for detecting Living-Off-The-Land attacks that iteratively selects a set of uncertain and anomalous samples for labeling by a human analyst. LOLAL is specifically designed to work well when a limited number of labeled samples are available for training machine learning models to detect attacks. We investigate methods to represent command-line text using word-embedding techniques, and design ensemble boosting classifiers to distinguish malicious and benign samples based on the embedding representation. We leverage a large, anonymized dataset collected by an endpoint security product and demonstrate that our ensemble classifiers achieve an average F1 score of 0.96 at classifying different attack classes. We show that our active learning method consistently improves the classifier performance as more training data is labeled, and converges in fewer than 30 iterations when starting with a small number of labeled instances.
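
The selection step mentioned in the abstract (choosing uncertain and anomalous samples for an analyst to label) could be sketched roughly as follows. This is an illustrative sketch only, not the LOLAL implementation: the function name `select_for_labeling`, its parameters, the use of distance from 0.5 as the uncertainty measure, and the convention that a higher anomaly score means "more anomalous" are all assumptions made here for the example.

```python
from typing import List, Sequence


def select_for_labeling(
    malicious_prob: Sequence[float],
    anomaly_score: Sequence[float],
    n_uncertain: int,
    n_anomalous: int,
) -> List[int]:
    """Return indices of unlabeled samples to send to a human analyst.

    Assumptions (not taken from the paper): uncertainty is the distance of
    the predicted probability from 0.5, and a higher anomaly_score means
    the sample is more anomalous.
    """
    if len(malicious_prob) != len(anomaly_score):
        raise ValueError("inputs must have the same length")

    indices = range(len(malicious_prob))

    # Most uncertain first: probability closest to the 0.5 decision boundary.
    by_uncertainty = sorted(indices, key=lambda i: abs(malicious_prob[i] - 0.5))
    chosen = by_uncertainty[:n_uncertain]
    taken = set(chosen)

    # Then add the most anomalous samples that were not already chosen.
    by_anomaly = sorted(indices, key=lambda i: anomaly_score[i], reverse=True)
    added = 0
    for i in by_anomaly:
        if added >= n_anomalous:
            break
        if i not in taken:
            chosen.append(i)
            taken.add(i)
            added += 1
    return chosen
```

In an active learning loop, the returned indices would be labeled by the analyst, moved into the training set, and the classifier retrained before the next round of selection.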

Comments:	14 pages, published in RAID 2021
Subjects:	Cryptography and Security (cs.CR)
Cite as:	arXiv:2111.15039 [cs.CR]
	(or arXiv:2111.15039v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2111.15039
Related DOI:	https://doi.org/10.1145/3471621.3471858

Submission history

From: Talha Ongun
[v1] Tue, 30 Nov 2021 00:31:49 UTC (3,003 KB)
