1 Introduction
Migrating business capabilities or entire enterprises to the cloud is increasingly common. However, moving to the cloud is not a security guarantee. For instance, Gartner predicted that, by 2025, 99% of all cloud breaches would be customer-side lapses, 1 but such issues are already commonplace. A crux of customer-side cloud security is that traditional security vulnerabilities combine with cloud-native attacker tactics. Consequently, the search space grows large and complicated, spanning multiple attack steps, assets, and services. In response, this work proposes a domain-specific modeling language for Amazon Web Services (AWS) environments. First, the language is used to model AWS environments and configurations. Then, the models are used to simulate attacks by automatically constructing and traversing the corresponding attack graph. The result is an AWS security assessment akin to virtual penetration testing that requires minimal security expertise from users.
AWS is currently one of the most adopted public cloud platforms, offering services numbering in the hundreds. However, security incidents involving AWS services are also routinely reported. One such service is the Simple Storage Service (S3), a data storage solution for managing objects in so-called buckets. Another frequently used service is the Elastic Compute Cloud (EC2), which hosts virtual machine instances. Essential to many other services is the Virtual Private Cloud (VPC). This virtual network infrastructure allows cloud customers to deploy and configure, for example, web servers in logically separate networks on the AWS infrastructure. Finally, the policy-oriented Identity and Access Management (IAM) for user and permissions administration is another essential service. Even when limited to these four services, a realistic breach can start by exploiting a vulnerable web application on an EC2 instance. Then, the attacker can pivot by exploiting IAM privileges assigned to the instance. Such causal vulnerability chains make security assessments of enterprise AWS environments overwhelming due to sheer scale and complexity.
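The causal chain in the example above can be pictured as a small attack graph. The following is a minimal sketch and not the DSL itself: the step names (such as `webapp.exploit`) and the graph layout are hypothetical, chosen only to mirror a vulnerable web application on an EC2 instance followed by a pivot through the IAM privileges of that instance.

```python
from collections import deque

# Hypothetical attack steps; the names are illustrative and not taken from the DSL.
ATTACK_GRAPH = {
    "internet.connect": ["webapp.exploit"],
    "webapp.exploit": ["ec2_instance.access"],
    "ec2_instance.access": ["iam_role.assume"],
    "iam_role.assume": ["s3_bucket.read_objects"],
    "s3_bucket.read_objects": [],
}


def reachable_steps(graph, entry_point):
    """Return every attack step reachable from entry_point (breadth-first)."""
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        step = queue.popleft()
        for nxt in graph.get(step, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Everything an attacker starting from the Internet could eventually reach.
compromised = reachable_steps(ATTACK_GRAPH, "internet.connect")
```

Even in this toy model, the S3 bucket is reachable only through the chain of EC2 and IAM steps, which is the kind of structural relationship that is hard to spot when each service is assessed in isolation.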
Regarding attacks, two highlights in a recent threat landscape report 2 were abusing valid credentials and pivoting within the environment. Also relevant is that the Cloud Security Alliance (CSA) egregious 11 threats report [13] lists customer-side issues such as data breaches, misconfigurations, change control, IAM problems, and account hijacking in the top five. Examples actualizing these threats are abundant. One 2019 incident involved request forgeries from a misconfigured firewall. 3 Leaked AWS credentials caused the Tesla 4 and Uber breaches. 5 Exposed S3 buckets are another common root cause behind, for example, the Accenture 6 and Time Warner 7 incidents. Finally, escalating privileges via AWS permission combinations has gained attention in industry. 8, 9, 10 All in all, there appears to be a practical need for assessing cloud service usage and configurations.
Automated assessment artifacts often focus on general network security. Seminal work includes NetSPA [5], MulVal [46], and the Topological Vulnerability Analysis (TVA) tool [44]. Some recent contributions are Vulnus [3] and MAD [4]. AWS-specific tools include commercial solutions, such as hava.io 11 and Hyperglance, 12 and many different open source tools. 13 AWS itself also provides numerous security services, such as Inspector for EC2 security and vulnerability assessments. AWS also provides certain assessments based on formal methods, such as TIROS for network reachability [6] and Zelkova for analyzing policy configurations [7]. However, security incidents still occur despite the large selection of services and tools, which suggests that some AWS problems remain.
The problem is not the lack of particular tools but the lack of specialized yet holistic security assessment methods. No single AWS or third-party solution seems to capture the structural relationships and chains of vulnerabilities present in customer environments. A fuller picture requires combining multiple services, but this by itself can add undesirable complexity. Moreover, repositories such as the MITRE cloud matrix 14 do not account for AWS-specific techniques, such as particular privilege combinations. Looking wider, the cloud security requirements in Kumar and Goyal [35] and many cloud modeling languages in Bergmayr et al. [10] may account for application security, but not tactics and techniques. Close to this work is the recent modeling language proposed in Mouratidis et al. [42], except that it also has a strategic and requirements focus. The problem with the existing cloud modeling work, despite its richness, is that it serves a different purpose than modeling threats. Therefore, more effort should be directed to modeling detailed events that realize cloud threats such as those reported in Reference [13].
This work proposes a domain-specific threat modeling and cyber attack simulation language developed with the Meta Attack Language (MAL) [26]. The proposed language was tailored to AWS, since it is market-leading with regularly reported security incidents. The domain-specific language, hereafter the DSL for AWS or just the DSL, can model AWS environments and reason about their security by constructing and traversing attack graphs. However, the DSL does not include supporting features, such as automated information collection and visualizations. Such functions are currently found in securiCAD Vanguard. 15 Vanguard is a commercial tool powered by the DSL for AWS described in this work. The contributions of this work are, therefore, fourfold:
• It presents the DSL for AWS, a language where users solely describe, manually or automatically, AWS configurations to obtain security assessments.
• The DSL for AWS reasons with both traditional software vulnerabilities and cloud-specific attacker tactics and techniques.
• A first validation of the DSL is performed based on scenarios, comparison, and performance testing.
• A demonstration of the first fully realized application of MAL.
Before introducing the DSL, Section 2 discusses related works, and Section 3 describes the methodology. Following is the presentation of MAL in Section 4 and the DSL in Section 5. Section 6 demonstrates relevant aspects of the DSL, followed by validation results in Section 7. Finally, Section 8 discusses the validation outcomes and general observations about the DSL, and Section 9 concludes the article and suggests topics for future work.
8 Discussion
CloudGoat showed that the DSL could describe and identify possible data breaches, permission issues, and network misconfigurations. In particular, the results often captured the consequences of compromising valid accounts and potential pivot points. These flaws share at least two characteristics. They are architectural and largely static. Such analyses are a strength of the DSL and an indicator of its practical usability. Moreover, it is these types of architectural analyses that become intractable at larger scales without support from tools.
However, the analogy driving this work, and the analogy behind MAL-based attack simulations [26, p. 1], was parallel virtual penetration tests. When validated accordingly, it became evident that some simpler penetration testing tasks could be difficult. Conversely, some complex tasks became simple, such as architectural analysis. One option is changing the analogy, which could change the choice of validation tests and re-orient the expectations on the DSL. But rather than retroactively changing the analogy, this is perhaps better pursued in future work.
Still, if viewed through penetration testing, the missing links from CloudGoat may further indicate a limitation with the underlying graph construction method. Namely, some realistic attacker tactics change the environment. This is especially relevant when assessing dynamic environments, such as cloud environments, where some assets can be created, removed, and modified with relative ease. Such tactics can sometimes be computed with hypothetical reasoning. For example, being allowed to create EC2 instances with certain privileges implies being able to access those instances and obtain the privileges. However, the challenge remains because the underlying attack graphs cannot change once computed.
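The hypothetical reasoning in the EC2 example can be sketched as a rule applied while the model is built, before any graph is computed. The sketch below is an assumption-laden simplification and not the logic of the DSL: the function name, the role name, and the flat permission sets are hypothetical, and real IAM evaluation also involves resource constraints, conditions, and explicit denies. Only the action names `ec2:RunInstances` and `iam:PassRole` refer to actual IAM actions.

```python
# Simplified rule: a principal that may launch instances and pass a role to them
# is treated as if it already held the permissions of that role.
LAUNCH_WITH_ROLE = {"ec2:RunInstances", "iam:PassRole"}


def effective_permissions(direct, passable_roles, role_permissions):
    """Expand direct permissions with those obtainable via a newly launched instance.

    direct:           set of IAM actions granted directly to the principal
    passable_roles:   roles the principal may pass to a new instance (hypothetical input)
    role_permissions: mapping from role name to the set of actions the role grants
    """
    effective = set(direct)
    if LAUNCH_WITH_ROLE <= effective:
        for role in passable_roles:
            effective |= role_permissions.get(role, set())
    return effective


# Hypothetical example: a low-privileged user that can launch instances with "data-role".
roles = {"data-role": {"s3:GetObject", "s3:ListBucket"}}
user_permissions = effective_permissions(
    {"ec2:RunInstances", "iam:PassRole"}, {"data-role"}, roles
)
```

Because the implication is resolved up front, the resulting attack graph can stay static, at the cost of only covering the environment changes that were anticipated as rules.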
The planted credentials posed a different challenge, which was interpreted as a data collection limitation. For instance, linking credential assets to S3 objects is a simple modeling task. However, the difficulty lies in inferring when and how to do so. Such tasks impose different challenges, including automatically identifying the AWS API key signatures or high entropy strings as possible passwords. A simpler approach would be to assume such possibilities and build models accordingly, but this is also a non-trivial task.
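To make the detection idea concrete, the following minimal sketch flags strings that look like AWS access key IDs and tokens with high character entropy. It is not part of the DSL or Vanguard. The `AKIA` prefix followed by 16 uppercase alphanumeric characters is the commonly documented shape of long-term access key IDs, while the token pattern, minimum length, and entropy threshold are assumptions chosen for illustration. The sample strings are the placeholder credentials used in AWS documentation.

```python
import math
import re
from collections import Counter

# Commonly documented shape of long-term AWS access key IDs.
ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Assumed token alphabet for candidate secrets (base64-like characters).
TOKEN = re.compile(r"[A-Za-z0-9+/]+")


def shannon_entropy(text):
    """Bits of entropy per character, based on character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def find_candidate_secrets(content, min_length=20, entropy_threshold=4.0):
    """Return (access key IDs, other high-entropy tokens) found in content."""
    key_ids = ACCESS_KEY_ID.findall(content)
    high_entropy = [
        token
        for token in TOKEN.findall(content)
        if len(token) >= min_length
        and token not in key_ids
        and shannon_entropy(token) >= entropy_threshold
    ]
    return key_ids, high_entropy


sample = (
    "aws_access_key_id: AKIAIOSFODNN7EXAMPLE\n"
    "aws_secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\n"
    "padding: aaaaaaaaaaaaaaaaaaaaaaaa\n"
)
key_ids, secrets = find_candidate_secrets(sample)
```

A match from such a scanner would only indicate where a credential asset might be linked to an S3 object in the model; deciding what the credential grants access to remains a separate problem.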
Vanguard, and by extension the DSL, had to rely on certain assumptions for automation, often the catch-all unknown vulnerability. Strictly how attack simulations should deal with ambiguity and guessing remains an open issue. There were also assumptions regarding preferred attack targets. For example, being unable to select Lambda as a target might have omitted the final step in the reported results (Figure 13). Finally, cursorily removing the Internet exposure shifted assessment results toward user credential theft in scenario five. This opens up questions about realistic assumptions, attacker profiling, decision-making, and sensitivity for future work. MAL simulations are, at the time of writing, simple shortest path calculations.
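As an illustration of what a shortest path calculation over an attack graph involves, the following sketch runs Dijkstra's algorithm on a small weighted graph. It is not the MAL simulation engine: the step names and the edge weights, which stand in for some notion of attacker effort, are hypothetical.

```python
import heapq


def shortest_attack_path(graph, source, target):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    while heap:
        cost, step = heapq.heappop(heap)
        if step == target:
            path = [step]
            while step in previous:
                step = previous[step]
                path.append(step)
            return cost, path[::-1]
        if cost > best.get(step, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(step, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                previous[nxt] = step
                heapq.heappush(heap, (new_cost, nxt))
    return float("inf"), []


# Hypothetical steps and effort weights.
GRAPH = {
    "attacker.entry": [("webapp.exploit", 2.0), ("user.phish", 2.0)],
    "webapp.exploit": [("instance.access", 1.0)],
    "instance.access": [("role.assume", 1.0)],
    "role.assume": [("bucket.read", 1.0)],
    "user.phish": [("credentials.use", 1.0)],
    "credentials.use": [("bucket.read", 4.0)],
}

exposed_cost, exposed_path = shortest_attack_path(GRAPH, "attacker.entry", "bucket.read")

# Removing the Internet-facing web application leaves only the credential route.
unexposed = dict(GRAPH)
unexposed["attacker.entry"] = [("user.phish", 2.0)]
hidden_cost, hidden_path = shortest_attack_path(unexposed, "attacker.entry", "bucket.read")
```

In this toy graph, the reported path switches from the web application route to the credential theft route once the exposure is removed. This shows how strongly a shortest path result can depend on a single modeling assumption.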
The calculation time was, for the majority of the sample, below 1 min. Additionally, all measurements overstate the actual calculation time, since the queue times could not be separated from the calculations. This is one plausible reason why some models below 10,000 assets exceeded 1 min while others did not. The measurements for models beyond 20,000 DSL assets were sparse. However, the results indicated favorable scaling up until 40,000 assets and beyond. Visually, there is an indication of quadratic scaling, but that conclusion requires further measurements at the appropriate scale.
Compared to earlier work, the DSL can reason about VPC configurations and IAM policies in the context of cyber attacks, bearing some resemblance to TIROS [6] and Zelkova [7], as well as about any number of known and unknown vulnerabilities. The automated workflow additionally enables continuous assessments of configurations and changes. Once the DSL is fully probabilistic, it will also rank attack paths more accurately. At this stage, simulation sensitivity might be a central validation metric. There are, however, no suggestions for patching strategies, 33 like Vulnus [3], or countermeasures like References [16, 23, 44].
8.1 Improving the DSL
One language issue was that the DSL was too straightforward. It could not always simulate tactics such as deploying and then accessing database snapshots. Certain methods are, however, difficult to represent in MAL due to its static models. Nevertheless, planted inline credentials and similar collection tactics were the major obstacle encountered. To reiterate, this is not inherently a modeling limitation. The logic could easily be represented, for example, via EC2ReadMetadata.credentials.access. The problem is knowing when to invoke such logic and how to enhance the data collection pipeline.
Some auditing tools could alert to potential secrets stored in the environment and thereby enrich the data collection pipeline. Credential scanners, such as cred_scanner (Table 2), could potentially fill a similar function. Such tools could indicate when to invoke credential access logic, which could be a sufficient alert to end users. Moreover, a sufficiently privileged data collection agent might also determine the nature of at least the AWS credentials it finds. This would be the ideal solution for AWS-specific keys, but automatically identifying other credentials remains an open issue.
Another topic is knowledge acquisition for the DSL and MAL-based artifacts overall, a missing component from the analogy with expert and knowledge-based systems. Acquisition mechanisms enabling end users to update the attack and defense logic are preferable to manually editing MAL code. Additionally, they would improve the usability and long-term maintainability of the DSL and MAL-based artifacts overall. In a similar vein, the recent work by Gylling et al. [19] leveraged threat intelligence, especially the ATT&CK matrix, in the context of MAL and probabilistic attack graphs. Further work in this direction, also for the DSL and cloud security, could be fruitful. Integration with existing threat intelligence standards, such as STIX [8] or those discussed in Reference [56, pp. 23–24], could be of particular interest.
8.2 Limitations
The quality of test-based validations depends on the relevance and coverage of the scenarios [45]. The CloudGoat scenarios are inherently artificial, making generalizations about performance in real-world environments uncertain. CloudGoat did cover most of the CSA top five threats [13] and, to some extent, some Cloud Matrix tactics. Furthermore, some incidents such as the Capital One request forgery were explicitly represented. However, specific attacks, such as container escapes, were neglected. Conversely, CloudGoat contained services not supported by the DSL. The missing services and operations highlight lapses in the DSL, many of which could be modeled with relative ease. What can be concluded overall is that correct logic sequences produced in CloudGoat are likely replicable in practice. However, test cases give limited insight into validity outside their coverage.
One drawback with testing through Vanguard was that it potentially obscured some results. The disparity between the DSL and actual coverage is evidence of this. A manual check could determine whether missed user accounts were visited at all. However, since these users had no practical impact on the simulations, the reports omitted them. The result is a lower actual coverage than DSL coverage. The problem is arguably a form of sensitivity in the reporting and user settings, specifically when user settings or assumptions in software, such as Vanguard, do not match expectations. For example, if Lambda is not checked or considered as an end goal, then it might not be reported as a result despite being covered. The path diagrams in Section 7 improve transparency but were created manually. Similarly, the normalized CloudGoat steps were also produced by manually interpreting the sometimes incomplete CloudGoat solutions. In the end, obtaining objective coverage and quality measurements was challenging.
9 Conclusions
Migrating to cloud platforms like AWS has become increasingly common, but customer-side mistakes routinely cause security incidents. Testing the DSL for AWS against CloudGoat via Vanguard demonstrated the ability of the language to identify basic AWS security issues automatically. The tests suggested no fundamental structural or logical issues in the DSL, but CloudGoat highlighted certain shortcomings. For one, there is room for more advanced attacker tactics and additional cloud services in the language. The simulated behaviors and attacker profiles are also rudimentary and can be examined further. Finally, issues with opaque data and collection techniques remained unresolved. Nevertheless, the real-world precedents showed that even major breaches could have “egregious,” yet simple, causes of the kind the DSL could demonstrably identify. Further work on cloud tactics and data collection steps should yield considerable improvements.
9.1 Future Work
Future work can explore how to handle inline credentials, for example, by finding them or adding some probabilistic attack steps. Another line of inquiry is identifying further attacker tactics and profiles. One method is studying existing literature and incidents, possibly with input from security firms and industrial actors. A similar approach could also provide data for assigning probability distributions in the DSL. A third angle could be devising new graph-traversal algorithms. MAL simulations currently rely on finding the shortest path, but more realistic methods might incorporate behavioral or decision-making theory. Specifically, attackers and the paths they produce may need to be constrained by their resources, preferences, and character traits. Finally, further work regarding countermeasure suggestions might require a separate algorithm together with a MAL extension for defining security controls.
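One simple way to picture attacker constraints is to let each attack step require a capability and to traverse only the steps an attacker profile can perform. The sketch below is purely illustrative and not a proposed MAL extension: the profiles, capability labels, and step names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackerProfile:
    name: str
    capabilities: frozenset


# Hypothetical edges: (source step, target step, capability required to traverse).
EDGES = [
    ("entry", "webapp.exploit", "web_exploitation"),
    ("entry", "user.phish", "social_engineering"),
    ("webapp.exploit", "role.assume", "cloud_api"),
    ("user.phish", "credentials.use", "cloud_api"),
    ("role.assume", "bucket.read", "cloud_api"),
    ("credentials.use", "bucket.read", "cloud_api"),
]


def reachable(edges, profile, start):
    """Return the steps reachable from start using only the profile's capabilities."""
    seen = {start}
    frontier = [start]
    while frontier:
        step = frontier.pop()
        for src, dst, needed in edges:
            if src == step and needed in profile.capabilities and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


opportunist = AttackerProfile("opportunist", frozenset({"social_engineering"}))
specialist = AttackerProfile("specialist", frozenset({"web_exploitation", "cloud_api"}))

opportunist_reach = reachable(EDGES, opportunist, "entry")
specialist_reach = reachable(EDGES, specialist, "entry")
```

Here the two profiles produce different sets of reachable steps over the same model. A shortest path calculation without attacker constraints cannot express this kind of distinction.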