Compiler correctness, in its simplest form, is defined as the inclusion of the set of traces of the compiled program in the set of traces of the original program. This is equivalent to the preservation of all trace properties. Here, traces collect, for instance, the externally observable events of each execution. However, this definition requires the set of traces of the source and target languages to be the same, which is not the case when the languages are far apart or when observations are fine-grained. To overcome this issue, we study a generalized compiler correctness definition, which uses source and target traces drawn from potentially different sets and connected by an arbitrary relation. We set out to understand what guarantees this generalized compiler correctness definition gives us when instantiated with a non-trivial relation on traces. When this trace relation is not equality, it is no longer possible to preserve the trace properties of the source program unchanged. Instead, we provide a generic characterization of the target trace property ensured by correctly compiling a program that satisfies a given source property, and dually, of the source trace property one is required to show to obtain a certain target property for the compiled code. We show that this view on compiler correctness can naturally account for undefined behavior, resource exhaustion, different source and target values, side channels, and various abstraction mismatches. Finally, we show that the same generalization also applies to many definitions of secure compilation, which characterize the protection of a compiled program linked against adversarial code.
1 Introduction
Compiler correctness is an old idea [46, 49, 50] that has seen a significant revival in recent times. This new wave was started by the creation of the CompCert verified C compiler [41] and continued by the proposal of many significant extensions and variants of CompCert [10, 11, 15, 29, 36, 37, 51, 67, 73, 76, 80] and the success of many other milestone compiler verification projects, including Vellvm [83], Pilsner [56], CakeML [77], and CertiCoq [5]. Verification through proof assistants allows the user of a compiler to trust the proofs without diving into all of the details. Still, to clearly understand the benefits and limitations of using a verified compiler, she has to deeply understand the statement of correctness. This is true not just for correct compilation, but also for secure compilation, which is the more recent idea that a compilation chain should not just provide correctness but also security against co-linked adversarial components [4, 32].
Basic Compiler Correctness. The gold standard for compiler correctness is semantic preservation, which intuitively says that the semantics of a compiled program (in the target language) is compatible with the semantics of the original program (in the source language). For practical verified compilers, such as CompCert [41] and CakeML [77], semantic preservation is stated extrinsically, by referring to traces. In these two settings, a trace is an ordered sequence of events—such as inputs from and outputs to an external environment—that are produced by the execution of a program.
A basic definition of compiler correctness can be given by the inclusion of the set of traces of the compiled program in the set of traces of the original program. Formally [41]:
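Writing W for a (whole) source program, W↓ for the result of compiling it, and W ⇝ t for "W can produce trace t", Definition 1.1 can be stated, in the notation used throughout this article, as:

    CC= :  ∀W. ∀t.  W↓ ⇝ t  ⇒  W ⇝ t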
This definition says that for any whole¹ source program W, if we compile it (denoted W↓), execute it in the semantics of the target language, and observe a trace t, then the original W can produce the same trace t in the semantics of the source language.² This definition is simple and easy to understand, since it only references a few familiar concepts: a compiler between a source and a target language, each equipped with a trace-producing semantics (usually nondeterministic).
Beyond Basic Compiler Correctness. Definition 1.1 implicitly assumes that the source and target traces are drawn from the very same set, and requires that any target trace produced by a compiled program can be faithfully reproduced by the source program. In practice, existing verified compilers adopt a less restrictive formulation of compiler correctness:
CompCert [41] The original compiler correctness theorem of CompCert [41] can be seen as an instance of basic compiler correctness, but it does not provide any guarantees for programs that can exhibit undefined behavior [68]. As allowed by the C standard, such unsafe programs are not even considered to be in the source language, so they are not quantified over. This has important practical implications, since undefined behavior often leads to exploitable security vulnerabilities [16, 30, 31] and serious confusion even among experienced C and C++ developers [40, 68, 78, 79]. As such, since 2010, CompCert provides an additional top-level correctness theorem³ that better accounts for the presence of unsafe programs by providing guarantees for them up to the point when they encounter undefined behavior [68]. This new theorem goes beyond the basic correctness definition above, as a target trace need only correspond to a source trace up to the occurrence of undefined behavior in the source trace.
CakeML [77] Compiler correctness for CakeML accounts for memory exhaustion in target executions. Crucially, memory exhaustion events cannot occur in source traces, only in target traces. Hence, dually to CompCert, compiler correctness only requires source and target traces to coincide up to the occurrence of a memory exhaustion event in the target trace.
Trace-relating Compiler Correctness. Generalized formalizations of compiler correctness like the ones above can be naturally expressed as instances of a uniform definition, which we call trace-relating compiler correctness. This generalizes basic compiler correctness by (a) considering that source and target traces belong to possibly distinct sets TraceS and TraceT, and (b) being parameterized by an arbitrary trace relation ~ between source and target traces.
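In the same notation as above, with s ranging over source traces and t over target traces related by ~, the generalized definition reads:

    CC~ :  ∀W. ∀t.  W↓ ⇝ t  ⇒  ∃s. s ~ t ∧ W ⇝ s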
This definition requires that, for any target trace t produced by the compiled program W↓, there exists a source trace s that can be produced by the original program W and is related to t according to ~ (i.e., s ~ t). By choosing the trace relation appropriately, one can recover the different notions of compiler correctness presented above:
Basic CC Take ~ to be equality on traces. Trivially, CC~ instantiated with equality is exactly the basic CC of Definition 1.1, which we denote CC=.
CompCert Undefined behavior is modeled in CompCert as a trace-terminating event Goes_wrong that can occur in any of its languages (source, target, and all intermediate languages), so for a given phase (or composition thereof), we have TraceS = TraceT. Nevertheless, the relation between source and target traces with which to instantiate CC~ to obtain CompCert's current theorem is the following (note that we denote finite traces, or prefixes, as m):
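Spelled out, and consistently with the explanation below, the relation is:

    s ~ t  ≜  s = t  ∨  (∃m.  s = m·Goes_wrong  ∧  m ≤ t)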
A compiler satisfying CC~ for this trace relation can turn a source prefix ending in undefined behavior, m·Goes_wrong (where "·" is concatenation), either into the same prefix in the target (first disjunct) or into a target trace that starts with the prefix m but then continues arbitrarily (second disjunct, where "≤" is the prefix relation).
CakeML Here, target traces are sequences of symbols from an alphabet that has a specific trace-terminating event, Resource_limit_hit, which is not available in the source alphabet (i.e., Resource_limit_hit ∉ ΣS). Then, the compiler correctness theorem of CakeML can be obtained by instantiating CC~ with the following relation:
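Spelled out, and consistently with the explanation below, the relation is:

    s ~ t  ≜  s = t  ∨  (∃m.  t = m·Resource_limit_hit  ∧  m ≤ s)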
The resulting CC~ instance relates a target trace ending in Resource_limit_hit after executing prefix m to a source trace s that first produces m and then continues in a way given by the semantics of the source program.
Beyond undefined behavior and resource exhaustion, there are many other practical uses for CC~: In this article, we show that it also accounts for differences between source and target values, for a single source output being turned into a series of target outputs, and for side channels.
On the flip side, the compiler correctness statement and its implications can be more difficult to understand for CC~ than for CC=. The full implications of choosing a particular relation can be subtle. In fact, using a bad relation can make the compiler correctness statement trivial or unexpected. For instance, it should be easy to see that if one uses the total relation, which relates all source traces to all target ones, the CC~ property holds for every compiler, yet it might take a bit more effort to understand that the same is true even for the following relation:
Reasoning about Trace Properties. To understand more about a particular CC~ instance, we propose to also look at how it preserves trace properties—defined as sets of allowed traces [39]—from the source to the target. For instance, it is well known that CC= is equivalent to the preservation of all trace properties (where W ⊨ π reads "W satisfies property π" and stands for ∀t. W ⇝ t ⇒ t ∈ π):
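That is, in the notation above:

    TP :  ∀π. ∀W.  W ⊨ π  ⇒  W↓ ⊨ π        and        CC=  ⟺  TP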
However, to the best of our knowledge, similar results have not been formulated for trace relations beyond equality, when it is no longer possible to preserve the trace properties of the source program unchanged. For trace-relating compiler correctness, where source and target traces can be drawn from different sets and related by an arbitrary trace relation, there are two crucial questions to ask:
(1)
For a source trace property πS of a program—established for instance by formal verification—what is the strongest target property that any CC~ compiler is guaranteed to ensure for the produced target program?
(2)
For a target trace property πT, what is the weakest source property we need to show of the original source program to obtain πT for the result of any CC~ compiler?
Far from being mere hypothetical questions, they can help the developer of a verified compiler better understand the compiler correctness theorem they are proving, and we expect that any user of such a compiler will need to ask either one or the other if they are to make use of that theorem. In this work, we provide a simple and natural answer to these questions, for any instance of CC. Building upon a bijection between relations and Galois connections [6, 26, 54], we observe that any trace relation corresponds to two property mappings and , which are functions mapping source properties to target ones ( standing for “to target”) and target properties to source ones ( standing for “to source”):
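These two mappings are the existential and universal images of the trace relation (with πS and πT ranging over source and target trace properties):

    τ̃(πS) ≜ { t | ∃s. s ~ t ∧ s ∈ πS }        σ̃(πT) ≜ { s | ∀t. s ~ t ⇒ t ∈ πT }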
The existential image of ~, τ̃, answers the first question above by mapping a given source property πS to the target property that contains all target traces for which there exists a related source trace that satisfies πS. Dually, the universal image of ~, σ̃, answers the second question by mapping a given target property πT to the source property that contains all source traces for which all related target traces satisfy πT. We introduce two new correct compilation definitions in terms of trace property preservation (spelled out after the following two points):
•
TPτ̃ quantifies over all source trace properties and uses τ̃ to obtain the corresponding target properties;
•
TPσ̃ quantifies over all target trace properties and uses σ̃ to obtain the corresponding source properties.
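Concretely, in the notation above:

    TPτ̃ :  ∀πS. ∀W.  W ⊨ πS  ⇒  W↓ ⊨ τ̃(πS)
    TPσ̃ :  ∀πT. ∀W.  W ⊨ σ̃(πT)  ⇒  W↓ ⊨ πT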
We prove that these two definitions are equivalent to CC~, yielding a novel trinitarian view of compiler correctness (Figure 1).
Contributions.
•
We propose a new trinitarian view of compiler correctness that accounts for non-trivial relations between source and target traces. While, as discussed above, specific instances of the CC~ definition have already been used in practice, we seem to be the first to propose assessing the meaningfulness of CC~ instances in terms of how properties are preserved between the source and the target, and in particular by looking at the property mappings τ̃ and σ̃ induced by the trace relation ~. We prove that CC~, TPτ̃, and TPσ̃ are equivalent for any trace relation (Section 2.2), as illustrated in Figure 1. In the opposite direction, we show that for every trace relation corresponding to a given Galois connection [26], an analogous equivalence holds.
•
We extend these results from the preservation of trace properties to the larger class of subset-closed hyperproperties, e.g., noninterference (Section 3.1),⁴ and to the classes of safety properties (Section 3.2) and all hyperproperties (Section 3.3).
•
We use CC~ compilers of various complexities to illustrate that our view on compiler correctness naturally accounts for undefined behavior (Section 4.1), resource exhaustion (Section 4.2), different source and target values (Section 4.3), and differences in the granularity of data and observable events (Section 4.4). We expect these ideas to extend to other discrepancies between source and target traces. For each compiler, we show how to choose the relation between source and target traces and how the induced property mappings preserve interesting trace properties and subset-closed hyperproperties. We look at the way particular τ̃ and σ̃ mappings act on different kinds of properties and how the resulting properties can be expressed for different kinds of traces.
•
We analyze the impact of correct compilation on noninterference [28], showing what can still be preserved (and thus also what is lost) when target observations are finer than source ones, e.g., side-channel observations (Section 5). We formalize the guarantee obtained by correct compilation of a noninterfering program as abstract noninterference [27], a weakening of target noninterference. Dually, we identify a family of declassifications of target noninterference for which source reasoning is possible.
•
We show that the trinitarian view also extends to a large class of secure compilation definitions [3], formally characterizing the protection of the compiled program against linked adversarial code (Section 6). For each secure compilation definition, we again propose both a property-free characterization in the style of CC and two characterizations in terms of preserving a class of source or target properties satisfied against arbitrary adversarial contexts. The additional quantification over contexts allows for finer distinctions when considering different property classes, so we study mapping classes not only of trace properties and hyperproperties, but also of relational hyperproperties [3].
•
We provide instances of secure compilers that preserve three different classes of properties (trace properties, safety properties, and hypersafety) when targeting a language with additional trace events that are not possible in the source (Section 7).
The results and insights that we provide often follow one’s expected intuition and may be considered unsurprising. However, our framework is the first to capture such expectations formally and precisely, and as such it provides a uniform way to discuss these and to formalize future (possibly surprising) ones. The article closes with discussions of related (Section 8) and future work (Section 9). Some technical proofs can be found in the Appendix (Section B).
The traces considered in our examples are structured, usually as sequences of events. We note, however, that unless explicitly mentioned, all our definitions and results are more general and make no assumption whatsoever about the structure of traces. Most of the theorems formally or informally mentioned in the article were mechanized in the Coq proof assistant and are marked as such. This development has around 10K lines of code and is available at the following address: https://github.com/secure-compilation/different_traces.
2 Trace-relating Compiler Correctness
In this section, we start by generalizing the trace property preservation definitions at the end of the introduction to TPτ and TPσ, which depend on two arbitrary mappings τ and σ (Section 2.1). We prove that, whenever τ and σ form a Galois connection, TPτ and TPσ are equivalent (Theorem 2.4). We then exploit a bijective correspondence between trace relations and Galois connections to close the trinitarian view (Section 2.2), with two main benefits: First, it helps us assess the meaningfulness of a given trace relation by looking at the property mappings it induces; second, it allows us to construct new compiler correctness definitions starting from a desired mapping of properties. Finally, we generalize the classic result that compiler correctness (i.e., CC=) is enough to preserve not just trace properties but also all subset-closed hyperproperties [18]. For this, we show that CC~ is also equivalent to subset-closed hyperproperty preservation, for which we also define both a version in terms of τ̃ and a version in terms of σ̃ (Section 3.1).
2.1 Property Mappings
As explained in Section 1, trace-relating compiler correctness CC~, by itself, lacks a crisp description of which trace properties are preserved by compilation. Since even the syntax of traces can differ between source and target, one can either focus on trace properties of the source (and then interpret them in the target) or on trace properties of the target (and then interpret them in the source). Formally, we need two property mappings, τ : 2^TraceS → 2^TraceT and σ : 2^TraceT → 2^TraceS, which lead us to the following generalization of trace property preservation (TP):
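In this notation, the two criteria of Definition 2.1 read:

    TPτ :  ∀πS. ∀W.  W ⊨ πS  ⇒  W↓ ⊨ τ(πS)
    TPσ :  ∀πT. ∀W.  W ⊨ σ(πT)  ⇒  W↓ ⊨ πT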
For an arbitrary source program W, τ interprets a source property πS as the target guarantee τ(πS) for W↓. Dually, σ defines a source obligation σ(πT) sufficient for the satisfaction of a target property πT after compilation. Ideally:
(i)
Given πT, the target interpretation of the source obligation should actually guarantee that πT holds, i.e., τ(σ(πT)) ⊆ πT;
(ii)
Dually for πS, we would not want the source obligation for the target interpretation τ(πS) to be harder to establish than πS itself, i.e., πS ⊆ σ(τ(πS)).
These requirements are satisfied when the two maps form a Galois connection between the posets of source and target properties ordered by inclusion. We briefly recall the definition and the characteristic property of Galois connections [20, 47].
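Recall the standard definition: given two posets (A, ≤) and (B, ⊑), a pair of monotone maps α : A → B and γ : B → A forms a Galois connection if and only if, for all a ∈ A and b ∈ B, α(a) ⊑ b ⟺ a ≤ γ(b); α is called the lower adjoint and γ the upper adjoint.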
We will often write α : (A, ≤) ⇆ (B, ⊑) : γ to denote a Galois connection, or simply α : A ⇆ B : γ, or even ⟨α, γ⟩ when the involved posets are clear from context.
If two property mappings, τ and σ, form a Galois connection on trace properties ordered by set inclusion, then Lemma 2.3 (with α = τ and γ = σ) tells us that they satisfy conditions (i) and (ii) above, i.e., τ(σ(πT)) ⊆ πT and πS ⊆ σ(τ(πS)).⁵ These conditions on τ and σ are sufficient to show the equivalence of the criteria they define, respectively, TPτ and TPσ.
Proof. Notice that if a program satisfies a property π, then it satisfies every less restrictive, i.e., bigger, property π' ⊇ π. Building on this:
(TPτ ⇒ TPσ) Assume TPτ and that W satisfies σ(πT). Apply TPτ to W and σ(πT) and deduce that W↓ satisfies τ(σ(πT)); since τ(σ(πT)) ⊆ πT, W↓ satisfies πT.
(TPσ ⇒ TPτ) Assume TPσ and that W satisfies πS; since πS ⊆ σ(τ(πS)), W also satisfies σ(τ(πS)). Apply TPσ to τ(πS), deducing that W↓ satisfies τ(πS).
2.2 Trace Relations and Property Mappings
We now investigate the relation between CC~, TPτ̃, and TPσ̃. We show that for a trace relation ~ and its corresponding Galois connection ⟨τ̃, σ̃⟩ (Lemma 2.7), the three criteria are equivalent (Theorem 2.8). This equivalence offers interesting insights for both verification and the design of a correct compiler. For a CC~ compiler, the equivalence makes explicit both the guarantees one has after compilation (via τ̃) and the source proof obligations that ensure the satisfaction of a given target property (via σ̃). However, a compiler designer might first determine the target guarantees the compiler itself must provide, i.e., τ̃, and then prove an equivalent statement, CC~, for which more convenient proof techniques exist in the literature [9, 77].
When trace relations are considered, the corresponding existential and universal images can be used to instantiate Definition 2.1 leading to the trinitarian view already mentioned in Section 1.
This result relies both on Theorem 2.4 and on the fact that the existential and universal images of a trace relation form a Galois connection. The theorem can be stated in a slightly more general form (Theorem 2.8), exploiting an isomorphism between the category of sets and relations and a subcategory of monotonic predicate transformers [26]. We specialize this isomorphism to what is of interest for our purposes and deduce a bijective correspondence between trace relations and Galois connections on properties.
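The following Coq fragment (a self-contained sketch in the spirit of, but not taken from, the accompanying development; all names are ours) makes this fact concrete: trace properties are modeled as predicates, τ̃ and σ̃ are the existential and universal images of a relation, and the adjunction lemma is exactly the Galois-connection condition of Lemma 2.3.

    Definition prop (X : Type) := X -> Prop.

    Section Images.
      Variables (TraceS TraceT : Type).
      Variable rel : TraceS -> TraceT -> Prop.

      (* Existential image: target traces related to some source trace in pS. *)
      Definition tau (pS : prop TraceS) : prop TraceT :=
        fun t => exists s, rel s t /\ pS s.

      (* Universal image: source traces all of whose related target traces are in pT. *)
      Definition sigma (pT : prop TraceT) : prop TraceS :=
        fun s => forall t, rel s t -> pT t.

      (* Adjunction: tau pS ⊆ pT iff pS ⊆ sigma pT. *)
      Lemma adjunction : forall (pS : prop TraceS) (pT : prop TraceT),
          (forall t, tau pS t -> pT t) <-> (forall s, pS s -> sigma pT s).
      Proof.
        intros pS pT; unfold tau, sigma; split.
        - intros H s Hs t Hrel. apply H. exists s. split; assumption.
        - intros H t [s [Hrel Hs]]. exact (H s Hs t Hrel).
      Qed.
    End Images.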
The bijection just introduced allows us to generalize Theorem 2.6 and switch anytime between the three views of compiler correctness described earlier.
Note that sometimes the lifted properties may be trivial: The target guarantee can be the true property (the set of all traces) or the source obligation the false property (the empty set of traces). This might be the case when source observations abstract away too much information (Section 4.2 presents an example).
3 Preserving Other (Hyper)Property Classes
In this section, we investigate how to preserve other classes of (hyper)properties beyond trace properties: subset-closed hyperproperties (Section 3.1), safety properties (Section 3.2), and arbitrary hyperproperties that are not just subset-closed (Section 3.3). For each of these classes, we start by giving an intuition of what it means to preserve such a class in the equal-trace setting; then we study preservation of that class in the trace-relating setting. For subset-closed hyperproperties, we have to refine the Galois connection to ensure the information "the hyperproperty is subset-closed" is not lost with the application of τ̃. Similarly, when looking at safety properties, we have to preserve the information that a property is a safety property. For arbitrary hyperproperties one might instead require that no information at all is lost in the (pre or post) composition of τ̃ and σ̃. The section concludes with a comparison of the criteria in terms of relative strength (Section 3.4).
3.1 Preservation of Subset-closed Hyperproperties
Hyperproperty preservation is a strong requirement in general. Fortunately, many interesting hyperproperties are subset-closed (SCH for short), e.g., noninterference, and these are known to be preserved by refinement [18]. When the trace semantics is common to source and target languages, a subset-closed hyperproperty is preserved if the behaviors of the compiled program refine the behaviors of the source program, which coincides with the statement of CC=. We generalize this result to the trace-relating setting, introducing two other equivalent characterizations of CC~ in terms of preservation of subset-closed hyperproperties (Theorem 3.3). To do so, we close under subsets the images of both τ̃ and σ̃, so that source subset-closed hyperproperties are mapped to target subset-closed ones and vice versa.
First, a hyperproperty is defined as a set of sets of traces, i.e., an element of 2^(2^Trace) (recall that Trace is the set of all traces) [18]. A program satisfies a hyperproperty when its complete set of traces, which from now on we will call its behavior, is a member of the hyperproperty.
To talk about hyperproperty preservation in the trace-relating setting, we need an interpretation of source hyperproperties into the target and vice versa. The one we consider builds on top of the two trace property mappings τ̃ and σ̃, which are naturally lifted to hyperproperty mappings. This way, we are able to extract two hyperproperty mappings from a trace relation similarly to Section 2.2:
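One natural such lifting applies the trace-level mappings elementwise to the members of a hyperproperty:

    τ̃(HS) ≜ { τ̃(b) | b ∈ HS }        σ̃(HT) ≜ { σ̃(b) | b ∈ HT }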
Formally, we are defining two new mappings, this time on hyperproperties; with a small abuse of notation, we still denote them τ̃ and σ̃.
Interestingly, it is not possible to apply the argument used for trace properties to show that a CC~ compiler guarantees W↓ ⊨ τ̃(HS) whenever W ⊨ HS. This is because direct images do not necessarily preserve subset-closure [44, 55]. We therefore close the images of τ̃ and σ̃ under subsets (denoted Cl⊆) and obtain the following result:
The use of Cl⊆ in Theorem 3.3 implies a loss of precision in preserving subset-closed hyperproperties through compilation. In Section 5, we focus on a specific security-relevant subset-closed hyperproperty, noninterference, and show that such a loss of precision can be seen as a declassification. Next, we define the trinity and the related formal machinery for the preservation of safety properties.
3.2 Preserving Safety Properties
The class of safety properties collects all trace properties prescribing that "something bad never happens" or, equivalently, all trace properties whose violation can be monitored and, once observed, cannot be undone [18]. More abstractly, safety properties can be defined as the closed sets of a topology [18, 58], with no need to consider any particular structure on the traces. To ease the presentation, we consider the trace model adopted by Abate et al. [3], where traces resemble lists and streams of events. This model naturally comes with a notion of prefixes and a relation between a prefix m and a trace t, written m ≤ t. Intuitively, π is a safety property if any trace violating π extends a "bad prefix" that witnesses such a violation. Every safety property is therefore uniquely defined by the set of its "bad prefixes." We recall below the definition and the characterization of safety properties in terms of sets of finite prefixes.
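In this trace model, the class of safety properties can be characterized as follows (the standard prefix-based characterization, stated in the notation above):

    π ∈ Safety  ⟺  ∀t ∉ π. ∃m ≤ t. ∀t'. m ≤ t' ⇒ t' ∉ π

That is, every trace violating π extends a finite "bad prefix" m all of whose extensions also violate π; equivalently, each safety property is determined by a set M of bad prefixes.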
Due to this characterization of safety properties through finite prefixes (Definition 3.4), the preservation of all and only the safety properties is equivalent to CC= restricted to finite prefixes.
Unfolding this criterion, we can interpret it as follows: Whenever W↓ produces a trace that violates a specific safety property, namely, the one defined by the singleton prefix set {m}, then W violates the same safety property, producing a trace that also extends m but is possibly distinct from the target one.
The generalization of this criterion that we propose for the trace-relating setting states that whenever W↓ produces a trace that violates a target safety property, then W violates the source interpretation of that property, i.e., its image through σ̃.⁷ The following theorem defines this criterion and its two equivalent formulations:
Consistent with the informal meaning we aimed to give to this criterion, the first formulation quantifies over target safety properties, while the second quantifies over arbitrary source properties but composes τ̃ with the operator that maps an arbitrary target property to the target safety property that best over-approximates it.⁸ More precisely, this operator is a closure operator on target properties, whose fixpoints are exactly the target safety properties.
In Figure 2, the blue and red ellipses represent source and target properties, respectively, and are connected by the Galois connection ⟨τ̃, σ̃⟩. The red ellipse is the class of all target safety properties. Since the over-approximation operator above is a closure operator, it induces a Galois connection between target properties and the target safety properties [21]. Finally, the composition of Galois connections is still a Galois connection [21]. Hence, the composition of these two connections is a Galois connection between source properties and target safety properties, which we used to prove the equivalence. We notice that this argument generalizes to arbitrary closure operators on target properties. We come back to this in Section 6, where more such results will be needed when considering other classes of properties being preserved by secure compilers. Next, we define the trinity for arbitrary hyperproperties, not just the subset-closed ones.
3.3 Preserving Non-subset Closed Hyperproperties
Subset-closed hyperproperties are not expressive enough to capture all interesting properties, e.g., possibilistic notions of information flow [18], so we briefly discuss the preservation of arbitrary hyperproperties. In general, one cannot lift a Galois connection over trace properties to a Galois connection over arbitrary hyperproperties.
While two of the three criteria we introduce in this section are equivalent under no assumptions, for a comparison with the third one we require that no information be lost in the pre or post composition of τ̃ and σ̃. For this reason, we label the trinity in Theorem 3.8 as weak.
To start, we note that the following strengthening of CC=, which requires beh(W↓) to equal beh(W), is equivalent to the preservation of arbitrary hyperproperties. Here, beh(W) ≜ {t | W ⇝ t} is the set of all traces of W.
This strengthening requires that the behavior of W↓ be exactly the same as the behavior of W. We generalize it to the trace-relating setting by requiring that the behavior of W↓ coincide with the target interpretation of the source properties describing the behavior of W.⁹
In other words, it is still possible (and sound) to deduce a source obligation for a given target hyperproperty when no information is lost in one of the two compositions of τ̃ and σ̃; dually, the target-guarantee criterion (and hence the property-free one) is a consequence of the source-obligation criterion when no information is lost in the composition in the other direction.
3.4 Comparing the Presented Criteria
At this point, we have presented four trinities of criteria, preserving trace properties, subset-closed hyperproperties, safety properties, and arbitrary hyperproperties, respectively. Figure 3 sums up our trinities and orders them according to their relative strength.
In Section 6, we will also consider, in the setting of secure compilation, the classes of safety hyperproperties (or hypersafety) and relational hyperproperties. In the setting of correct compilation—which focuses only on whole programs—it is straightforward to show that the trinity for hypersafety coincides with the one for safety properties, in the same way the trinities for trace properties and for subset-closed hyperproperties coincide. Similarly, the trinity for relational hyperproperties coincides with the one for hyperproperties.
4 Instances of Trace-relating Compiler Correctness
The trace-relating view of compiler correctness above can serve as a unifying framework for studying a range of interesting compilers. This section provides several representative instantiations of the framework: source languages with undefined behavior that compilation can turn into arbitrary target behavior (Section 4.1), target languages with resource exhaustion that cannot happen in the source (Section 4.2), changes in the representation of values (Section 4.3), and differences in the granularity of data and observable events (Section 4.4).
4.1 Undefined Behavior
We start by expanding upon the discussion of undefined behavior in Section 1. We first study the model of CompCert, where source and target alphabets are the same, including the Goes_wrong event for undefined behavior. The trace relation weakens equality by allowing undefined behavior to be replaced with an arbitrary sequence of events.
This relation can be easily generalized to other settings. For instance, consider the setting in which we compile down to a low-level language like machine code. Target traces can now contain new events that cannot occur in the source: Indeed, in modern architectures like x86, a compiler typically uses only a fraction of the available instruction set. Some instructions might even perform dangerous operations, such as writing to the hard drive or controlling a device that is hidden from the source language. Formally, the source and target do not have the same events anymore. Thus, we consider a source alphabet ΣS and a larger target alphabet ΣT ⊇ ΣS. The trace relation is defined in the same way and we obtain the same property mappings as above, except that, since target traces now have more events (some of which may be dangerous), the arbitrary continuations of target traces become more interesting. For instance, consider a new event that represents writing data to the hard drive, and suppose we want to prove that this event cannot happen for a compiled program. Then, proving this property requires exactly proving that the source program exhibits no undefined behavior [14]. More generally, what one can prove about target-only events is either that they cannot appear (because there is no undefined behavior) or that any of them can appear (in the case of undefined behavior).
In Section 7.1, we study a similar example, showing that even in a safe language, linked adversarial contexts can cause dangerous target events that have no source correspondent.
4.2 Resource Exhaustion
Let us return to the discussion about resource exhaustion in Section 1.
We conclude this subsection by noting that the resource exhaustion relation and the undefined behavior relation from the previous subsection can easily be combined. Indeed, given an undefined-behavior relation ~UB and a resource-exhaustion relation ~RE defined as above on the same sets of traces, we can build a new relation that allows both refinement of undefined behavior and resource exhaustion by taking their union: ~ ≜ ~UB ∪ ~RE. A compiler that is CC~UB or CC~RE is trivially CC~ for the union, though the converse is not true.
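The following self-contained Coq sketch (ours, not part of the accompanying development) makes the last remark precise: modeling program semantics abstractly as trace predicates, CC~ is monotone in the trace relation, so correctness for either component relation implies correctness for their union.

    Section CCUnion.
      Variables (ProgS ProgT TraceS TraceT : Type).
      Variable compile : ProgS -> ProgT.
      Variable semS : ProgS -> TraceS -> Prop.
      Variable semT : ProgT -> TraceT -> Prop.

      (* Trace-relating compiler correctness for a relation rel. *)
      Definition CC (rel : TraceS -> TraceT -> Prop) : Prop :=
        forall W t, semT (compile W) t -> exists s, rel s t /\ semS W s.

      (* CC is monotone in the relation. *)
      Lemma CC_mono : forall r1 r2,
          (forall s t, r1 s t -> r2 s t) -> CC r1 -> CC r2.
      Proof.
        intros r1 r2 Hsub Hcc W t Ht.
        destruct (Hcc W t Ht) as [s [Hr Hs]].
        exists s. split; [apply Hsub; exact Hr | exact Hs].
      Qed.

      Definition rel_union (rUB rRE : TraceS -> TraceT -> Prop)
                 (s : TraceS) (t : TraceT) : Prop :=
        rUB s t \/ rRE s t.

      (* Hence CC for either component relation yields CC for the union. *)
      Corollary CC_union_left : forall rUB rRE, CC rUB -> CC (rel_union rUB rRE).
      Proof.
        intros rUB rRE H.
        apply (CC_mono rUB).
        - intros s t Hr. unfold rel_union. left. exact Hr.
        - exact H.
      Qed.
    End CCUnion.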
4.3 Different Source and Target Values
This section first presents the common language formalization (Section 4.3.1) that the following instance (Section 4.3.2) and later instances (Section 4.4 and Section 7.1) build upon. This shared language formalization does not contain a key language feature, namely, the expressions that generate actions and thus labels. This is because each instance deals with specific ways to generate actions, so each instance will define its own extension to each of the languages defined below. Additionally, each instance will define its own compiler and the trace relation used to attain CC~.
4.3.1 Shared Source and Target Language Formalization.
The source language is a pure, statically typed expression language whose expressions include naturals, Booleans, two forms of conditional (one on Booleans, one on the value an expression reduces to), arithmetic and relational operations, and sequencing.
Types are either Nat (naturals) or Bool (Booleans), and typing is standard.
The language semantics deals with actions, lists of actions, and expression results. A list of actions is a list of individual actions, which are instance-dependent and thus presented later; the same holds for source traces.
The source language has a standard big-step operational semantics that tells how an expression generates a list of actions and a result.
The target language is analogous to the source one, except that it is untyped, only has naturals, and has a single conditional.
The semantics of the target language is also given in big-step style; since its rules are a subset of the source rules, they are omitted. Since we only have naturals and all expressions operate on them, no error result is possible in the target.
4.3.2 Different Source and Target Values.
In this instance, we extend the source language with expressions that input Booleans and naturals, while the target only has expressions that input naturals. To compile conditionals on Booleans, the target is also extended with a conditional that checks whether one expression is less than another.
Source actions are Boolean and natural inputs, and source traces are lists of actions together with a final result. Target actions are just natural inputs.
The source extensions respect typing, and thus well-typed programs never produce an error. The semantics of the extensions adds elements to the traces.
The compiler is homomorphic, translating a source expression to the same target expression; the only differences concern Booleans, which are compiled to natural numbers (and conditionals).
When compiling an if-then-else, the then and else branches of the source are swapped in the target because of the compilation of Booleans.
Relating Traces. We relate basic values (naturals and Booleans) in a non-injective fashion, as noted below. Then, we extend the relation to lists of inputs pointwise (Rules Empty and Cons) and lift that relation to traces (Rules Nat and Bool).
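Concretely, a value relation consistent with the discussion below (and with the flipping argument at the end of this subsection) relates each natural to itself, false to 0, and true to every strictly positive natural:

    n ~ n            false ~ 0            true ~ n   (for any n > 0)

The input-list and trace relations are then obtained by the pointwise extension and lifting mentioned above.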
Property mappings. The property mappings τ̃ and σ̃ induced by the trace relation defined above capture the intuition behind encoding Booleans as naturals:
•
the source-to-target mapping τ̃ allows true to be encoded by any non-zero number;
•
the target-to-source mapping σ̃ requires that each target number be accounted for both as the natural it denotes and as the Boolean it may encode (e.g., 0 as both 0 and false).
Compiler correctness. With the relation above, the compiler is proven to satisfy CC~.
Simulations with different traces. In settings where source and target traces coincide, it is customary to prove compiler correctness by showing a forward simulation (i.e., a simulation between the source and target transition systems); then, using determinacy [24, 48] of the target language and input totality [25, 82] (receptiveness) of the source, this forward simulation is flipped into a backward simulation (a simulation between the target and source transition systems), as described by Beringer et al. [9] and Leroy [42]. This "flipping" is useful because forward simulations are often much easier to prove (by induction on the transitions of the source) than backward ones. For the proof of Theorem 4.3, we had to show a backward simulation directly, as it was not possible to define a forward one and then flip it. Hereafter, we show that the reason lies in the shape of the trace relation itself and discuss when it is possible to generalize the flipping to the trace-relating setting.
We first give the main idea of the flipping proof, when the inputs are the same in the source and the target [9, 42]. We only consider inputs, as that is the most interesting case: With determinacy, nondeterminism can only occur on inputs. Given a forward simulation and a target program that simulates a source program, the target program is able to perform an input iff so is the source program: Otherwise, say, for instance, that the source program performs an output; by forward simulation the target program would also perform an output, which is impossible because of determinacy. By input totality of the source, the source program must be able to perform the exact same input as the target one; using forward simulation and determinacy, the resulting programs must be related.
The trace relation from Section 4.3.2 is not injective (for instance, both the source values 1 and true are related to the target value 1); therefore, these arguments do not apply: Not all possible inputs of target programs are accounted for in the forward simulation. To flip a forward simulation into a backward one, it is necessary that, for any source program and target program related by the forward simulation, the following diagram is satisfied:
We say that a forward simulation for which this property holds is flippable. For our example compiler, a flippable forward simulation works as follows: Whenever a Boolean input occurs in the source, the target program must perform every strictly positive input (and not just 1, as suggested by the compiler). Using this property, determinacy of the target, input totality of the source, as well as the fact that any target input has an inverse image through the relation, we can indeed show that the forward simulation can be turned into a backward one: Starting from a target program and an input, we show that there is a related source program able to perform a related input, as in the diagram above, using the same arguments as when the inputs are the same; because the simulation is flippable, we can close the diagram and obtain the existence of an adequate source step. From this, we obtain CC~.
In fact, we showed that the flippable hypothesis is sufficient to flip a forward simulation into a backward one, even in the trace-relating setting, and proved this in a general (i.e., language-independent) "flipping theorem." We have also shown that if the relation defines a bijection between the inputs of the source and the target, then any forward simulation is flippable, hence recovering the usual proof technique [9, 42] as a special case.
4.4 Abstraction Mismatches
We now consider how to relate traces where a single source action is compiled to multiple target ones. To illustrate this, we extend our source language to output (nested) pairs of arbitrary size and our target language to send values of a fixed size. Concretely, the source is analogous to the language of Section 4.3, except that it does not have inputs (nor Booleans, for simplicity) but it has pairs. Additionally, it has an expression that can emit a (nested) pair of values in a single action: Given a subexpression that reduces to a pair, the send expression emits that pair as a single action. Such an expression is eventually compiled into a sequence of individual sends in the target language, since in the target a send emits the single natural number its argument reduces to; the target language cannot send pairs (although it has pair constructs).
The source and target languages are formally extended (respectively, in the first and second lines below) with pairs and sending constructs as follows: For reasons that we explain when the compiler is presented, we extend the target language with a let-in construct and variables. Finally, source traces are sequences of sent values (which include nested pairs) and target traces are only sequences of natural numbers.
The source additions are well-typed and their semantics is unsurprising; the semantics relies on the usual capture-avoiding substitution of a result for a variable.
The compiler is defined inductively on the type derivation of a source expression. The only interesting case is the compilation of a send, where we use the source type information concerning the message (i.e., a pair) being sent to deconstruct that pair into a sequence of natural numbers, which is what is sent in the target. This is the reason we need the let-in construct in the target: We evaluate the pair once (as the argument of the let-in) and then send all of its projections, to avoid duplicating side effects. Technically, since it is defined on the type derivations of terms, the compiler is defined inductively on type derivations (and not simply on terms). Thus, compiling a send would look like the following (using a metavariable to range over derivations):
However, note that each judgment uniquely identifies which typing rule is being applied and the underlying derivation. Thus, for compactness, we only write the judgment in the compilation and implicitly apply the related typing rule to obtain the underlying judgments for recursive calls. To differentiate it from the compiler of Section 4.3.2, this compiler has parentheses around its input.
Relating Traces. We start with the trivial relation between numbers: n ~ n, i.e., numbers are related when they are the same. We cannot build a relation between single actions, since a single source action is related to multiple target ones. Therefore, we define a relation between a source action and a target trace (a list of numbers), inductively on the structure of the source action.
A pair of naturals is related to the two actions that send each element of the pair (Rule Trace-Rel-N-N). If a pair is made of sub-pairs, then we require all such sub-pairs to be related (Rules Trace-Rel-N-M to Trace-Rel-M-M).
We build on these rules to define the relation between source and target traces for which the compiler is correct (Theorem 4.5). Trivially, traces are related when they are both empty. Alternatively, given related traces, we can concatenate a source action and a second target trace provided that they are related (Rule Trace-Rel-Single). Before proving that the compiler is correct, we need Lemma 4.4. Intuitively, that lemma tells us that the way we break down a source sent value into multiple target sends is correct.
With our trace relation, the trace property mappings capture the following intuitions:
•
The target-to-source mapping σ̃ states that a source property can regroup target actions as it sees fit. For example, a target trace consisting of three sent numbers is related to source traces that group those numbers into nested pairs in different ways (and to many more variations). This gives freedom to the source implementation of a target behavior, which follows from the non-injectivity of the trace relation.¹⁰
•
The source-to-target mapping τ̃ "forgets" about the way pairs are nested, but is faithful w.r.t. the values contained in a message. Notice that source safety properties are always mapped to target safety properties. For instance, if a source property πS prescribes that some bad number is never sent, then τ̃(πS) prescribes that the same number is never sent in the target, and it is again a safety property. Of course, if πS prescribes that a particular nested pairing never happens, then τ̃(πS) is still a target safety property, but the trivial one (the set of all target traces), since every target trace is related to some source trace that nests the same values differently and thus satisfies πS.
5 Trace-relating Compilation and Noninterference Preservation
We now study the relation between trace-relating compilation and noninterference preservation. As mentioned earlier (Section 3.1), in the particular case where source and target observations are drawn from the same set, a correct compiler (CC=) is enough to ensure the preservation of all subset-closed hyperproperties, in particular of noninterference (NI) [28]. But in the scenario where target observations are strictly more informative than source observations, this is not the case. In fact, as we will show, the best guarantee one may expect from a correct trace-relating compiler (CC~) in such a setting is a weakening (or declassification) of target noninterference that matches the noninterference property satisfied in the source. In certain scenarios, it turns out that the noninterference property of interest in the target comes "for free," while in others it does not, and therefore establishing noninterference requires additional proof effort beyond CC~. To formalize this reasoning, this section applies the trinitarian view of trace-relating compilation to the general framework of abstract noninterference (ANI) [27], clarifying the kind of noninterference preservation that follows from a given trace relation and correct compilation.
We first define NI and explain the issue of preserving source NI via a CC~ compiler (Section 5.1). We then introduce ANI, which allows characterizing various forms of noninterference (Section 5.2), and formulate a theory of ANI preservation via CC~, both with respect to a timing-insensitive declassification (Section 5.3) and in general (Section 5.4). We also study how to deal with cases such as undefined behavior in the target (Section 5.5). We then answer the dual question, i.e., which source NI should be satisfied to guarantee that compiled programs are noninterfering with respect to target observers (Section 5.6). Finally, we use this formal development to analyze recent work on correct compilers with interesting noninterference guarantees [7, 74], clarifying whether these guarantees follow from correctness alone or not (Section 5.7).
5.1 Noninterference and Trace-relating Compilation
Intuitively, noninterference (NI) requires that publicly observable outputs do not reveal information about private inputs. To define this formally, we need a few additions to our setup. We indicate the (disjoint) input and output projections of a trace t as t↾in and t↾out, respectively.¹¹ We denote by [t] the equivalence class of a trace t, obtained using a standard low-equivalence relation that relates low (public) events only if they are equal and ignores any difference between private events. Then, NI for source traces can be defined as:
That is, source NI comprises the sets of traces that have equivalent low output projections as long as their low input projections are equivalent.
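Spelled out in the notation just introduced, source NI can be written as:

    NIS  ≜  { b ∈ 2^TraceS | ∀s1, s2 ∈ b.  [s1↾in] = [s2↾in]  ⇒  [s1↾out] = [s2↾out] }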
When additional observations are possible in the target, it is unclear whether a noninterfering source program is compiled to a noninterfering target program, and if so, whether the notion of NI in the target is the expected (or desired) one. We illustrate this issue by considering a scenario where target traces extend source traces by exposing the execution time. While source noninterference requires that private inputs do not affect public outputs, target noninterference additionally requires that the execution time not be affected by varying private inputs.
To model the scenario described, we represent target traces as pairs of a source trace and a natural number that denotes the time spent to produce the trace (using ∞ for infinite executions). Formally, if TraceS denotes the set of source traces, then TraceT = TraceS × ℕ∞ is the set of target traces, where ℕ∞ = ℕ ∪ {∞}.
Notice that if two source traces s1 and s2 are low-equivalent, then so are the target traces (s1, n) and (s2, n) for any given time n, but (s1, n1) and (s2, n2) are not when n1 ≠ n2.
Consider the following straightforward trace relation, which relates a source trace to any target trace whose first component is equal to it, irrespective of execution time:
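In symbols, with n ranging over ℕ∞:

    s ~ t  ≜  ∃n ∈ ℕ∞.  t = (s, n)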
A compiler is CC~ for this trace relation if any trace that can be exhibited in the target can be simulated in the source in some amount of time. For such a compiler, Theorem 3.3 says that if W satisfies NIS, then W↓ satisfies Cl⊆(τ̃(NIS)). This hyperproperty is, however, strictly weaker than target NI, as it contains, for example, behaviors whose execution time depends on private inputs, and one cannot conclude that W↓ is noninterfering in the target. It is easy to compute Cl⊆(τ̃(NIS)) explicitly, using the definition of τ̃ for the first step and the fact that NIS is subset-closed for the second. As we will see, the resulting hyperproperty can be characterized as a form of NI, which one might call timing-insensitive noninterference, i.e., NI ensured only against attackers that cannot measure execution time. For this characterization, and to describe different forms of noninterference as well as formally analyze their preservation by a CC~ compiler, we rely on the general framework of abstract noninterference [27].
5.2 Abstract Noninterference
ANI [27] is a generalization of NI whose formulation relies on abstractions (in the sense of abstract interpretation [20]) to encompass arbitrary variants of NI. ANI is parameterized by an observer abstraction ρ, which denotes the distinguishing power of the attacker, and a selection abstraction φ, which specifies when to check NI and therefore captures a form of declassification [69].¹² Formally:
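With φ applied to input projections and ρ to output projections (both lifted to sets as explained below), ANI can be written as:

    ANI^φ_ρ  ≜  { b | ∀t1, t2 ∈ b.  φ(t1↾in) = φ(t2↾in)  ⇒  ρ(t1↾out) = ρ(t2↾out) }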
By picking both φ and ρ to be the low-equivalence abstraction [·] used above, we recover the standard noninterference defined earlier, where NI must hold for all low inputs (i.e., no declassification of private inputs), and the observational power of the attacker is limited to distinguishing low outputs. The observational power of the attacker can be weakened by choosing a more liberal relation for ρ. For instance, one may limit the attacker to observing the parity of output integer values. Another way to weaken ANI is to use φ to specify that noninterference is only required to hold for a subset of low inputs.
The operators φ and ρ are defined over sets of (input and output projections of) traces. When we write φ(t↾in) as above, this should be understood as a convenience notation for φ({t↾in}); likewise, ρ(t↾out) should be understood as ρ({t↾out}), i.e., the application of ρ to the corresponding singleton. Additionally, φ and ρ are required to be upper-closed operators (uco)—i.e., monotonic, idempotent, and extensive (i.e., X ⊆ φ(X))—on the poset that is the powerset of (input and output projections of) traces ordered by inclusion [27].
5.3 Trace-relating Compilation and ANI for Timing
We can now reformulate our example with observable execution times in target traces in terms of ANI. We instantiate the trace relation as above, pairing each source trace with an arbitrary execution time. In this case, the hyperproperty that a compiled program W↓ satisfies whenever W satisfies NIS can be described as an instance of ANI.
The definition of the resulting selection abstraction tells us that the trace relation does not affect declassification. The definition of the resulting observer abstraction characterizes an observer that cannot distinguish execution times for noninterfering traces (the time component is simply discarded): For instance, target output projections that differ only in their execution times are identified, for any such times. Therefore, in this setting, we know explicitly, through the derived observer abstraction, that a CC~ compiler degrades source noninterference to target timing-insensitive noninterference.
5.4 Trace-relating Compilation and ANI in General
While the particular selection and observer abstractions above can be discovered by intuition, we want to know whether there is a systematic way of obtaining them in general. In other words, for any trace relation and any notion of source NI, what property is guaranteed for noninterfering source programs by any CC~ compiler?
We can now answer this question in general (Theorem 5.1): Any source notion of noninterference expressible as an instance of ANI is mapped to a corresponding instance of ANI in the target, whenever source traces are an abstraction of target ones (i.e., when the relation is a total and surjective map from target traces to source traces). For this result, we consider trace relations that can be split into input and output trace relations, so that two traces are related iff both their input projections and their output projections are. The trace relation corresponds to a Galois connection between the sets of trace properties, as described in Section 2.2. Similarly, the input and output relations correspond to a pair of Galois connections between the sets of input and, respectively, output properties. In the timing example, time is an output, so the input relation is equality and the output relation relates a source output projection to any target output projection extending it with an arbitrary execution time.
The target abstract noninterference is to be understood as the best correct approximation of the source one. The mappings involved are the existential and universal images of the output relation; therefore, they are lower and upper adjoints, respectively (Section 2). The resulting observer abstraction is the best correct approximation of the source one w.r.t. this Galois connection [20] (hence the choice of the notation). A similar result holds for the selection abstraction.
Coming back to our example above, we can formally recover the intuitively justified definitions of the selection and observer abstractions given in Section 5.3.
5.5 Noninterference and Undefined Behavior
As stated above, Theorem 5.1 does not apply to several scenarios from Section 4, such as undefined behavior (Section 4.1). Indeed, in these cases, the relation is not a total map. Nevertheless, we can still exploit our framework to reason about the impact of compilation on noninterference.
Let us consider a trace relation split into input and output parts, where the input relation is any total and surjective map from target to source inputs (e.g., equality) and the output relation is the undefined-behavior relation from Section 4.1. Intuitively, a CC~ compiler guarantees noninterference for the compiled program, provided that the target attacker cannot exploit undefined behavior to learn private information. This intuition can be made formal by the following theorem:
Technically, instead of giving us a definition of the target observer abstraction, the theorem gives a property of it. The property states that, given a target output trace, the attacker cannot distinguish it from any other target output trace produced by other possible compilations of the source trace it relates to, up to the observational power of the source-level attacker. Therefore, given a source attacker, the theorem characterizes a family of target attackers that cannot observe any interference for a correctly compiled noninterfering program. Notice that the target attacker that identifies all output traces satisfies the premise of the theorem but defines a trivial hyperproperty, so we cannot prove in general that target noninterference itself is preserved. Still, this degenerate attacker shows that the family of attackers described in Theorem 5.2 is nonempty, which ensures the existence of a most powerful attacker among them [27].
5.6 From Target NI to Source NI
We now explore the dual question: Under what hypothesis does trace-relating compiler correctness alone allow target noninterference to be reduced to source noninterference? This is of practical interest, as one would be able to protect from target attackers by ensuring noninterference in the source. This task can be made easier if the source language has some static enforcement mechanism [1, 44].
Let us consider the languages from Section 4.4, extended with the ability to accept inputs as (pairs of) values. It is easy to show that the compiler described in Section 4.4 (extended to treat the new input expressions homomorphically) is still CC~: Given a target trace with the same inputs as a source one, the compiler of Section 4.4 ensures that the source program can simulate the same outputs. Assume that we want compiled programs to satisfy a given notion of target noninterference. Recall that the observational power of the target attacker is expressed as a property of sequences of values. To express the same property (or attacker) in the source, we have to abstract the way pairs of values are nested. For instance, the source attacker should not distinguish two messages that contain the same values but nest them differently. In general (i.e., when the trace relation is not the identity), this argument is valid only when the target attacker can be represented in the source. More precisely, the target abstraction must consider as equivalent all target inputs that are related to the same source input, because in the source it is not possible to make a finer distinction between inputs. This intuitive correspondence can be formalized as follows:
The results presented in this section formalize and generalize some intuitive facts about compiler correctness and noninterference, clarifying which noninterference property follows "for free" from trace-relating compiler correctness. Of course, in the general case, compiler correctness alone is not a strong enough criterion for dealing with many security properties [8, 23].

The remainder of this section exploits our ANI-based framework and results to analyze two compilers from the recent literature [7, 74] that are both proven to be correct and to preserve two interesting notions of noninterference: cryptographic constant time (Section 5.7.1) and value-dependent noninterference (Section 5.7.2). For each, we explain how to express compiler correctness as an instance of CC~, describe the noninterference property that is implied by the trace relation and the correctness result, and compare it with the noninterference properties of interest as established by their authors.
5.7.1 A Correct Compiler Preserving Cryptographic Constant Time.
Barthe et al. [7] provide a correct compiler (an extension of CompCert) that also preserves cryptographic constant time (CT). CT is a security property stating that the runtime of a program does not depend on its secrets, so that an attacker cannot extract secrets from a program by observing its execution time. A CT-preserving compiler takes code that is CT and generates code that is also CT. Thus, a CT-preserving compiler must translate runtime-equivalent source programs into runtime-equivalent target ones. Notice that it is not necessary for the leakage of target programs to be the same as that of their source counterparts; rather, source programs with the same leakage must be compiled to target programs with the same leakage.
Barthe et al. [7] prove CT preservation for 17 passes of CompCert. The authors partition these 17 passes into four categories, depending on the proof technique they use to show CT preservation. Every category proves an instance of CC~ by improving on the existing CompCert simulation. In three out of the four cases this is sufficient to also prove CT preservation, while for the last category a further proof is necessary. In what follows, we first encode CT as an instance of abstract noninterference, i.e., we show for which operators φ and ρ it is an instance of ANI, and then use our framework to understand why modifying the CompCert simulation is sufficient in the first three categories but not in the last one. For each category, Theorem 5.2 applies, so no target observer abstraction that respects its premise (Equation (1)) can notice any interference on compiled programs that were constant-time in the source. In the first three categories the attacker that defines CT respects that premise¹³ (we refer to this instantiated condition as Equation (2)), and CT preservation is therefore a consequence of CC~. In the last category, the CT attacker does not respect Equation (2) and the authors have to prove an additional theorem, the CT-diagram.
Trace Model and CT as an instance of ANI. The formal definition of CT is given by extending the semantics of the languages in CompCert and enriching the traces of input and output events with leakages. Leakages are results of execution steps that involve conditional branching or memory access. A program is CT w.r.t. a certain relation over program states [7, Definition 3.2] iff for every two initial states such that , the leakages that can be observed are the same. Notice that in Reference [7, Definition 3.2] the secret is stored in the program states and defined by ; therefore, to regard CT as an instance of abstract noninterference, program states will be regarded as inputs and events together with their leakages as outputs. More precisely, a trace is a sequence of triples where and are program states and an event in the instrumented semantics, i.e., input/output event and associated leakage.
We consider:
•
to be (the uco corresponding to) the relation defined by iff have the same length with , and .
•
to be (the uco corresponding to) the relation defined by iff have the same length with , and , where denotes the leakage in the event (projection of on the leak-only semantics [7]).
It is easy to check that for the and given above.
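To make the shape of these definitions concrete, here is a small executable sketch of the constant-time check over a finite trace semantics. The state layout (a public and a secret component), the leakage events, and the example semantics are hypothetical stand-ins for the elided formal definitions, not the CompCert development: a deterministic program is modeled by a map from initial states to the sequence of leakages it produces, and the check verifies that states related by (here: states agreeing on the public component) yield the same leakage.

```python
# A finite-model sketch: a deterministic program is modeled by a map from
# initial states to the tuple of leakages observed along its execution, and
# the constant-time condition requires phi-related initial states (same
# public component) to yield equal leakage.

def related_phi(s1, s2):
    # phi: initial states are related iff they agree on the public component
    return dict(s1)["public"] == dict(s2)["public"]

def is_constant_time(leakage_of):
    states = list(leakage_of)
    return all(leakage_of[s1] == leakage_of[s2]
               for s1 in states for s2 in states
               if related_phi(s1, s2))

# Branching on the secret leaks through the branch event: not constant time.
branch_on_secret = {
    (("public", 0), ("secret", 0)): ("branch:true",),
    (("public", 0), ("secret", 1)): ("branch:false",),
}
# Branching on the public input only: constant time.
branch_on_public = {
    (("public", 0), ("secret", 0)): ("branch:true",),
    (("public", 0), ("secret", 1)): ("branch:true",),
    (("public", 1), ("secret", 0)): ("branch:false",),
    (("public", 1), ("secret", 1)): ("branch:false",),
}

assert not is_constant_time(branch_on_secret)
assert is_constant_time(branch_on_public)
```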
We now present more details for each of the four proof techniques adopted by Barthe et al. [7]. Since CT is defined only for safe programs [7, Definition 3.1], we can assume that no undefined behavior is ever encountered, which simplifies the presentation. We also omit coming from the application of Theorem 5.2, as it always coincides with .
Constant-time security preservation by leakage preservation ([7, Section 5.2]). For compilation passes that belong to this category, the authors prove that the source leakage is preserved exactly in the target. Thus, in this simple case, the theorem proved is CC where is point-wise equality of events together with leakages, the identity, and satisfies Equation (2) by idempotency of .
CT preservation from leakage-erasing simulation ([7, Section 5.3]). In this case, CC is proved for a relation that erases source leakage-only events, i.e., those events that do not contain inputs or outputs, but only the amount of leakage revealed. More precisely (see also Reference [7, Fig. 8]) for and of the same length, iff
The property mapping associated with the above relation, , erases all leak-only events from the traces of a source property. If an attacker cannot notice at any point any difference in the leakages of two traces and we erase the leak-only events from them, then the attacker will still not notice any difference in leakages; therefore, it is easy to check that Equation (2) also holds in this case.
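As an illustration of the erasure just described, here is a minimal sketch, assuming a hypothetical event representation (tuples tagged "leak" for leak-only events): the mapping erases leak-only events from every trace of a property, so any distinction an attacker could make on leakages after erasure was already available before it.

```python
# A minimal sketch of the leakage-erasing mapping, over finite traces
# represented as tuples of tagged events; the tags are hypothetical.

def erase_leak_only(trace):
    return tuple(ev for ev in trace if ev[0] != "leak")

def map_property(prop):
    # erase leak-only events from every trace of a (source) property
    return {erase_leak_only(t) for t in prop}

src = (("in", 3), ("leak", 2), ("out", 4), ("leak", 1))
assert erase_leak_only(src) == (("in", 3), ("out", 4))
assert map_property({src}) == {(("in", 3), ("out", 4))}
```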
CT preservation via memory injection ([7, Section 5.4]). This case is analogous to the one above, save that it rests on a more complex relation involving a memory injection relation (see [7, Definition 5.8]). Intuitively, relates source and target traces that differ at most in leakages due to memory accesses. While in the previous case leakages were simply erased, here they are modified, crucially with some uniformity. Reasoning as in the previous case, if an attacker cannot notice a difference in the leakages of two traces and we modify equal leakages by the same factor, then the attacker will still not notice any difference in leakages; thus, Equation (2) holds.
CT preservation from CT-diagram ([7, Section 5.5]). In this case, does not satisfy Equation (2) because the counting simulation ([7, Definition 5.10]) does not necessarily relate source and target leakages but only the inputs and outputs.14 CC alone does not ensure that an attacker cannot observe any interference in the target leakages; to show preservation of CT, the authors need to prove an extra condition, the so-called CT diagram [8].
5.7.2 Value-Dependent Noninterference.
Sison and Murray [74] introduce a compiler that provably preserves value-dependent noninterference (VDNI) for a concurrent language with shared variables. Value-dependent means that the secrecy level of a variable—low or high—may depend on the value of some other variable, called the control variable of the first, and therefore could change throughout its lifetime.
Preservation of VDNI for concurrent programs enjoys compositionality, meaning that it follows from the preservation of VDNI for each single thread [52] under certain conditions. As the compositionality result is orthogonal to our framework, we can study either (1) the preservation of VDNI for one local thread or (2) its preservation for the whole program.
In the remainder of this section, we focus on the preservation of VDNI for a single thread, which is proven by showing a secure refinement relation between source and compiled threads. Similarly to the previous section, the secure refinement is expressed via a cube diagram (Reference [74], Figure 1) and can be proven directly [52] or split into more obligations [74].
As Sison and Murray [74] use a state transition-based semantics, we first show how to encode this semantics into a trace model by defining the relation based on the secure refinement relation. We then show how to encode VDNI as an instance of abstract noninterference (i.e., both and ). Finally, we apply Theorem 5.2 and conclude that if satisfies , then satisfies given that the trace relation has properties defined in Reference [52, Theorem 5.1].
Source (WHILE) and target (RISC-like assembly) languages are equipped with a determined evaluation step semantics (i.e., a semantics where the only source of nondeterminism is external inputs; Reference [74], Section 2) between thread-local configurations, which are triples of the form . In such a configuration, mds is the access mode state for program variables and mem is a map relating global program variables to their values. Both of these components are common to the source and target language. The tps component denotes the thread-private state. In the source language, it is the program to be executed. In the target language, tps consists of the target program (labelled assembly-language instructions), of a program counter, and of the set of thread-local registers. We denote WHILE configurations by tuples of the form: and RISC configurations by tuples of the form: .
Trace Model and Trace relation. We consider traces that are (possibly infinite) sequences of configurations. The traces produced by a program are the sequences of local configurations that the program may encounter during execution, according to the evaluation semantics. Let be a source trace. The input projection is defined by (the tuple consisting of the access modes and the memory in the first state) and the output projection is defined by (the trace itself). Input/output projections are defined similarly for target traces.
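The following sketch renders this trace model with hypothetical record layouts (the concrete tps payloads of the WHILE and RISC languages are stand-ins here): a trace is a sequence of thread-local configurations, its input projection keeps the access modes and memory of the first configuration, and its output projection is the trace itself.

```python
# A sketch of thread-local configurations and the input/output projections,
# with hypothetical payloads standing in for the WHILE/RISC-specific parts.
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class Config:
    tps: Any                           # thread-private state (program, or pc + registers)
    mds: Tuple[str, ...]               # access mode state for program variables
    mem: Tuple[Tuple[str, int], ...]   # global program variables -> values

def input_proj(trace):
    # access modes and memory of the first configuration
    return (trace[0].mds, trace[0].mem)

def output_proj(trace):
    # the whole trace itself
    return trace

trace = (Config("x := 1", ("x:NoRW",), (("x", 0),)),
         Config("skip",   ("x:NoRW",), (("x", 1),)))
assert input_proj(trace) == (("x:NoRW",), (("x", 0),))
assert output_proj(trace) == trace
```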
We take the trace relation to be the point-wise lifting of a secure refinement relation (Reference [74], Definition 6). Source and target configurations that are related coincide on the access mode and memory part (i.e., mds = mds’ and mem = mem’; Reference [74], Definition 4), so is simply the identity and coincides with .
VDNI as abstract noninterference. A program satisfies VDNI (Reference [74], Definition 2) if any two of its executions starting in low equivalent memories are related via a strong low bisimulation modulo modes (strong low bisimulation mm). Intuitively, a strong low bisimulation mm is a bisimulation that preserves low-equivalence. Preservation of VDNI is proved by Murray et al. [52] by showing that for every strong low-bisimulation mm for source threads, there exists a target strong low bisimulation mm such that if two source threads are related by , then the compiled threads are related by (Reference [52], Theorem 5.1).
The intuition for the encoding of VDNI as an instance of abstract noninterference is to model low equivalence through the operator , and bisimilarity through . More rigorously, = , where and are defined as follows:
For ,
where is the low-equivalence modulo mds (Reference [74], Definition 1).
For
where denotes a strong low bisimulation modulo modes. Similarly where
The relation is a simulation, and therefore CC holds. To apply Theorem 5.2 and conclude that whenever a source program satisfies , then satisfies = , it is sufficient for to satisfy Equation (1), that is,
for . If one is willing to unfold all definitions, then this amounts to showing that the set of traces "bisimilar" to coincides with the set of traces that are bisimilar to some and for some bisimilar to . Splitting the "coincides" (set equality) into the two directions of inclusion, the "" direction is immediate, while for the "" direction one has to prove some properties of , namely the ones in the definition of ([52, inlined above Theorem 5.1]), which entail preservation of low-equivalence as shown in [52, Theorem 5.1].
In summary, our framework makes it possible to precisely characterize the target noninterference properties that are implied by (trace-relating) correct compilation of source noninterfering programs. As we have shown, such properties are not necessarily as strong as desired. Crucially, the target noninterference property one gets for free for a given trace-relating correct compiler is a function of the trace relation under consideration. By considering more sophisticated trace relations, one might obtain more interesting noninterference properties in the target for free—but this would likely come at the expense of a more challenging trace-relating compiler correctness proof.
6 Trace-relating Secure Compilation
So far, we have studied compiler correctness criteria for whole, standalone programs. However, in practice, programs do not exist in isolation, but in a context where they interact with other programs, libraries, etc. In many cases, this context cannot be assumed to be benign and could instead behave maliciously to try to disrupt a compiled program.
Hence, in this section, we consider the following secure compilation scenario: A source program is compiled and linked with an arbitrary target-level context, i.e., one that may not be expressible as the compilation of a source context. Compiler correctness does not address this case, as it does not consider arbitrary target contexts, looking instead at whole programs (empty context [41]) or well-behaved target contexts that behave like source ones (as in compositional compiler correctness [33, 37, 56, 76]).
Summary of the work of Abate et al. [3]. To account for this scenario, Abate et al. [3] describe several secure compilation criteria based on the preservation of classes of (hyper)properties (e.g., trace properties, safety, hypersafety, hyperproperties) against arbitrary target contexts. For each of these criteria, they give an equivalent "property-free" criterion, analogous to the equivalence between and . For instance, their robust trace property preservation criterion () states that, for any trace property , if a source partial program plugged into any context satisfies , then the compiled program plugged into any target context satisfies . Their equivalent criterion to is , which states that for any trace produced by the compiled program, when linked with any target context, there is a source context that produces the same trace. Formally (writing to mean the whole program that results from linking partial program with context ) they define:
In the following, we adopt the notation to mean “robustly satisfies ,” i.e., satisfies irrespective of the contexts () it is linked with. Formally, , where is the same as before. Thus, we write more compactly:
All the criteria of Abate et al. [3] share this flavor of stating the existence of some source context that simulates the behavior of any given target context, with some variations depending on the class of (hyper)properties under consideration. For trace properties, they also have criteria that preserve safety properties plus their version of liveness properties. For hyperproperties, they have criteria that preserve hypersafety properties, subset-closed hyperproperties, and arbitrary hyperproperties. Finally, they define relational hyperproperties, which are relations between the behaviors of multiple programs for expressing, e.g., that a program always runs faster than another. For relational hyperproperties, they have criteria that preserve arbitrary relational properties, relational safety properties, relational hyperproperties, and relational subset-closed hyperproperties.
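The following sketch, over finite enumerations, is one way to make the shape of these criteria concrete; the programs, contexts, and behaviors functions are hypothetical stand-ins. Robust satisfaction universally quantifies over contexts, and the property-free criteria ask, for every target context and target trace of the compiled program, for some source context producing a related source trace (the existential that back-translation proofs have to witness).

```python
# A finite-model sketch of robust satisfaction and of the property-free
# flavor of the robust-preservation criteria.  "behaviors(ctx, prog)" is a
# hypothetical stand-in returning the finite set of traces of ctx linked
# with prog.

def robustly_satisfies(prog, contexts, behaviors, prop):
    # prog robustly satisfies prop: in every context, every trace is in prop
    return all(t in prop
               for ctx in contexts
               for t in behaviors(ctx, prog))

def property_free_criterion(src_prog, tgt_prog,
                            src_contexts, tgt_contexts,
                            src_behaviors, tgt_behaviors, related):
    # every target trace of the compiled program, in every target context,
    # is related to some source trace produced in some source context
    return all(any(related(s, t)
                   for cs in src_contexts
                   for s in src_behaviors(cs, src_prog))
               for ct in tgt_contexts
               for t in tgt_behaviors(ct, tgt_prog))
```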
Each category of criteria provides different kinds of security guarantees (confidentiality or integrity) for the code and data segments of programs. Roughly speaking, the security guarantees due to robust preservation of trace properties concern only the integrity of the program against the context, the guarantees of hyperproperties also concern data confidentiality, and the guarantees of relational hyperproperties may even concern code confidentiality. Naturally, these stronger guarantees are increasingly harder to enforce and prove.
All the criteria of Abate et al. [3] are stated in a setting where source and target traces are the same. In this section, we extend their results to the trace-relating setting, obtaining trinitarian views for secure compilation. There are many similarities with Section 2 that show up in the secure compilation setting, too, but also some crucial differences. As in Section 2, the application of or may lose the information that a property belongs to the class , or that a hyperproperty is subset-closed, which are both crucial for the equivalence with the property-free criterion of Abate et al. [3]. As in Section 2, we solve this problem by interpreting classes of properties as an abstraction of another class of properties induced by a closure operator. Differently from Section 2, the presence of adversarial contexts makes the criteria for subset-closed hyperproperties and trace properties distinct. Abate et al. [3] show that the criterion for robust preservation of hypersafety is distinct from robust safety preservation, and all criteria about classes of trace properties are distinct from their relational counterparts, e.g., robust preservation of relational safety and robust preservation of safety properties are different. We therefore further generalize the argument from Section 3.2 to safety hyperproperties as well as to relational hyperproperties.
Specifically, we provide a trinity for the preservation of trace properties and subset-closed hyperproperties (Section 6.1), of safety properties and hypersafety hyperproperties (Section 6.2), of hyperproperties (Section 6.3), and for 2-relational (hyper)properties (Section 6.4). We conclude the section by studying the relative expressiveness of these criteria (Section 6.5).
Robustness and Compositional Compilation. Before diving into the criteria for robust compilation, it is worth noting the relationship between these and compositional compiler correctness. Compositional compiler correctness () is a statement of compiler correctness for programs that are linked against some contexts. Unlike robustness, which imposes no constraints on the contexts, imposes conditions on the target contexts that compiled programs can be linked against: They need to be related (in ways that vary from work to work [38, 56]) to the source contexts [65]. As Patrignani and Garg [64] also point out, the notions of and of robust compilation are incomparable: Neither can be proven stronger than the other. This is not surprising, since robust compilation criteria are used to prove compiler security while is used to prove correctness. 15
The criteria we adopt could be generalized further by adding an extra parameter that qualifies the relation between source and target contexts. Such a general statement would let us express both and robust compilation by picking the correct extra parameter. However, we refrain from presenting such general statements, as the implications in terms of preservation of classes of (hyper)properties have not been studied for them.
6.1 Trace-relating Secure Compilation: Trace Properties and Subset-closed Hyperproperties
This section shows the simple generalization of to the trace-relating setting () and its corresponding trinitarian view (6.1). Then, it presents the trinitarian view for criteria that preserve subset-closed hyperproperties (6.2).
The trinity for robust trace property preservation is the straightforward adaptation of the concepts of Section 2 to the definitions of Abate et al. [3]. Intuitively, these criteria simply deal with partial programs instead of whole programs . Necessarily, these criteria then consider arbitrary program contexts linked with ; the universal quantification over and is tacit in the expression .
We can also generalize Section 2 to robust subset-closed hyperproperties (6.2). However, unlike the correct compilation case of Section 2, the equivalent property-free criterion () does not coincide with , but states the existence of a single source context for all the target traces produced by a program in a given context.
6.2 Trace-relating Secure Compilation: Safety and Hypersafety
In this section, we elaborate on the robust preservation of safety (6.3) and hypersafety (6.4) properties. Similarly to Section 3.2, we consider the trace model adopted by Abate et al. [3] to ease the presentation. Our starting point is the two equivalent criteria for preservation of robust satisfaction of all and only the safety properties [3],
where is a shorthand for .
differs from as it only quantifies over safety properties, and differs from as it quantifies over finite prefixes , rather than complete traces . This comes from the fact that safety properties can be characterized in terms of sets of bad prefixes (as in Definition 3.4). Unfolding , we can interpret as follows: If produces a trace that violates a specific safety property, namely, the one defined by , then there exists in which violates the same safety property, producing a trace but possibly distinct from .
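The bad-prefix reading of safety used in this interpretation can be sketched as follows; traces are finite tuples of events here only to keep the example executable, and the bad prefixes are hypothetical.

```python
# A sketch of the bad-prefix view of safety: a safety property is determined
# by a set of finite "bad" prefixes, and a trace violates it exactly when
# one of its finite prefixes is bad.

def prefixes(trace):
    return [trace[:i] for i in range(len(trace) + 1)]

def violates_safety(trace, bad_prefixes):
    return any(p in bad_prefixes for p in prefixes(trace))

bad = {("open", "open")}   # e.g., "never open a resource twice in a row"
assert violates_safety(("open", "open", "close"), bad)
assert not violates_safety(("open", "close", "open"), bad)
```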
Our generalization of to the trace-relating setting states that whenever produces a trace that violates a target safety property, there exists a source context in which violates the source interpretation of the property, i.e., its image through . The following theorem defines and its two equivalent formulations:
Exactly as in Section 3.2, Theorem 6.3 exploits the fact that
is a Galois connection between source properties and target safety properties and the argument generalizes to arbitrary closure operators on target properties (). More interestingly, we can further generalize this idea to hypersafety. Hypersafety lifts the idea of safety with another level of sets (just like hyperproperties do w.r.t. trace properties) to talk about multiple runs of the same program. Just like for safety, hypersafety is concerned with a set of bad prefixes (called ) that no program upholding the hypersafety property should extend. Formally, a hyperproperty is hypersafety if: In Theorem 6.4, we indeed exploit the following Galois connection between source subset-closed hyperproperties and target:
where and is the closure operator that maps an arbitrary target hyperproperty to the target hypersafety that best over-approximates . 16
We conclude this section with the following remark: The reader might wonder about extracting a "new" trace relation from the Galois connection and obtaining another formulation of . We note that this is not possible in general, as the class of safety properties, i.e., closed sets, is not necessarily a powerset and hence Lemma 2.7 cannot be applied.
6.3 Trace-relating Secure Compilation: Hyperproperties
We already mentioned that some properties of interest for security, e.g., possibilistic information flow, are not subset-closed [18]. In this section, we lift the results from Section 3.3 to the secure compilation setting. Once again, the trinity is weak, as the equivalence to requires an extra assumption.
It is therefore possible and correct to deduce a source obligation for a given target hyperproperty () when no information is lost in the composition . However, is a consequence of when no information is lost in composing in the other direction, .
6.4 Trace-relating Secure Compilation: 2-Relational (Hyper)properties
Finally, we turn to relational properties and hyperproperties. Relational hyperproperties, as defined by Abate et al. [3], are predicates on a sequence of behaviors; a sequence of programs has the relational hyperproperty if their behaviors collectively satisfy the predicate. Depending on the arity of the sequence, there exist different subclasses of relational hyperproperties; for simplicity, here we only study relational hyperproperties of arity 2. A key example of a relational hyperproperty is trace equivalence, which holds if two programs have identical behaviors.
All the trinities in this section follow the pattern of their non-relational counterparts. We first explain how one can get a Galois connection between source and target relational properties from a trace relation.
Given a trace relation , we can relate pairs of source traces with pairs of target traces point-wise,
Formally, this is , the product of the relation with itself. Therefore, by Lemma 2.7, it corresponds to a Galois connection between source and target relational properties (), which, with a slight abuse of notation,17 we still denote by
Explicitly, for and ,
and are then lifted to relational hyperproperties similarly to Definition 3.2. Explicitly, for and ,
Given a relational property and two programs , we write for
Given a relational hyperproperty , by , we mean
Next, we propose the trinity for 2-relational subset-closed hyperproperties, i.e., elements of that are closed under subsets. Exactly as in the case of subset-closed hyperproperties, the application of and may lose the information that a hyperproperty is subset-closed. To guarantee the equivalence of the three criteria, we compose the two mappings with a closure operator that we still denote by .
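As a small illustration of the point-wise product lifting defined above, the sketch below lifts a trace relation to pairs of traces and computes the existential image of a source relational property within a given finite universe of target trace pairs; the relation and the universe are hypothetical stand-ins.

```python
# A sketch of the product lifting of a trace relation and of the induced
# existential image on relational properties (sets of pairs of traces).

def rel_pair(rel, s_pair, t_pair):
    # pairs are related component-wise
    (s1, s2), (t1, t2) = s_pair, t_pair
    return rel(s1, t1) and rel(s2, t2)

def tau_relational(rel, src_rel_prop, tgt_pair_universe):
    # target pairs related to some source pair in the source property
    return {tp for tp in tgt_pair_universe
            if any(rel_pair(rel, sp, tp) for sp in src_rel_prop)}

# Toy example: the trace relation prefixes source traces with "t:".
rel = lambda s, t: t == "t:" + s
src_rel_prop = {("a", "b")}
universe = {("t:a", "t:b"), ("t:a", "t:c")}
assert tau_relational(rel, src_rel_prop, universe) == {("t:a", "t:b")}
```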
We now move to the class of relational safety properties, a notion that generalizes safety properties to relations on programs. Similarly to Theorem 6.3, quantifies over target relational safety properties, while quantifies over all source relational properties and composes with a closure operator that best approximates a relational property with a relational safety property.
Finally, we present the most general criterion: preservation of arbitrary 2-relational hyperproperties. As for the preservation of arbitrary hyperproperties, this (weak) trinity requires additional assumptions to hold, namely, that the Galois connection is an insertion or a reflection.
6.5 Relating the Secure Compilation Trinities
Figure 4 orders criteria referring to the same trace relation according to their relative strength. If a trinity entails another (denoted by ), then the former provides stronger security for a compilation chain than the latter.
Fig. 4.
The hypotheses of insertion and reflection mentioned in Theorem 6.9 and Theorem 6.5 are highlighted with the labels “Ins” and “Refl.” Recall that when composing with , we quantify over the whole class of source trace properties rather than only safety properties. This is represented by the blue background in . The trinity for the robust preservation of arbitrary trace properties is on the same blue background. Red and green backgrounds are reserved for subset-closed hyperproperties and arbitrary relational properties and serve the same purpose.
We now describe how to interpret the acronyms in Figure 4. All criteria start with , meaning that they refer to robust preservation (they are secure compilation criteria). Criteria for relational hyperproperties—here only arity 2 is shown for simplicity—contain . Next, criteria names spell the class of hyperproperties they preserve: for hyperproperties, for subset-closed hyperproperties, for hypersafety, for trace properties, and for safety properties. Finally, property-free criteria end with a , while property-full ones involving and end with . Thus, robust () subset-closed hyperproperty-preserving () compilation () is , robust () two-relational () safety-preserving () compilation () is , and so on.
7 Instances of Trace-relating Secure Compilation
This section presents instances of compilers that adopt our framework for secure compilation purposes. We provide three illustrative cases for compilers that, respectively, robustly preserve trace properties (Section 7.1), safety properties (Section 7.2), and hypersafety properties (Section 7.3). The last two examples are not novel instances we devise but rather existing work whose results we recount as instantiations of our framework.
7.1 An Instance of Trace-relating Robust Preservation of Trace Properties
This subsection illustrates trace-relating secure compilation when the set of target events is a strict superset of the set of source events.
The source and target languages used here extend the syntax of the source language of Section 4.3.1. Both languages have outputs of naturals, and the expressions that generate them: and . Additionally, the target has a different output action and its related expression ; this is the only difference between the languages. The extra events in the target model the ability of the target language to perform potentially dangerous operations (e.g., writing to the hard drive), which cannot be performed by the source language, and against which source-level reasoning can therefore offer no protection.
Both languages and compilation chains now deal with partial programs , contexts , and linking of those two to produce whole programs . In this setting, a whole program is the combination of a main expression to be evaluated and a set of function definitions (with distinct names) that can refer to their argument () symbolically and can be called by the main expression and by other functions. The set of functions of a whole program is the union of the functions of a partial program and a context; the latter also contains the main expression.
The extensions of the typing rules and the operational semantics for whole programs are unsurprising and therefore elided. The trace model also follows closely that of Section 4.3: It consists of a list of regular events (including the new outputs) terminated by a result event.18 A partial program and a context can be linked into a whole program when their functions satisfy the requirements mentioned above.
We define the homomorphic compiler () that translates each source construct into its target correspondent. Thus, the compiler never introduces the additional target instruction . Since it is straightforward, the formalization of the compiler is elided.
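A hypothetical rendering of such a homomorphic compiler is sketched below; the expression constructors are stand-ins for the elided syntax, and the point is only that the translation is structural and never emits the target-only output construct.

```python
# A hypothetical, structural rendering of the homomorphic compiler: each
# source construct maps to its target counterpart, and the target-only
# output construct ("out_extra" here) is never produced.

def compile_expr(e):
    tag = e[0]
    if tag == "nat":
        return e                                   # literals are unchanged
    if tag == "out":
        return ("out", compile_expr(e[1]))         # source output -> target output
    if tag == "plus":
        return ("plus", compile_expr(e[1]), compile_expr(e[2]))
    if tag == "call":
        return ("call", e[1], compile_expr(e[2]))  # function name kept as-is
    raise ValueError(f"unknown source construct: {tag}")

def compile_partial_program(funcs):
    # a partial program is a set of named function definitions
    return {name: compile_expr(body) for name, body in funcs.items()}

src = ("out", ("plus", ("nat", 1), ("call", "f", ("nat", 2))))
assert compile_expr(src) == src   # homomorphic: same shape, no extra outputs
```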
Relating Traces. In the present model, source and target traces differ only in the fact that the target draws (regular) events from a strictly larger set than the source, i.e., . A natural relation between source and target traces essentially maps a given target trace to the source trace obtained by erasing from it those events that exist only at the target level. This is reasonable, because only target contexts (not compiled programs ) can perform the extra target actions, as the compiler does not introduce them. Let indicate trace filtered to retain only those elements included in alphabet . We define the trace relation as:
In the opposite direction, a source trace is related to many target ones, as any target-only events can be inserted at any point in . The induced mappings for this relation are:
That is, the target guarantee of a source property is that the target has the same source-level behavior, sprinkled with arbitrary target-level behavior. Conversely, the source-level obligation of a target property is the aggregate of those source traces, all of whose target-level enrichments are in the target property.
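Because each target trace is related to exactly one source trace (its restriction to the source alphabet), membership in the target guarantee of a source property reduces to a filtering check, as in the following sketch; the event tags are hypothetical.

```python
# A sketch, over finite traces, of the filtering relation and the induced
# target guarantee of a source property.

SOURCE_ALPHABET = {"out_nat"}          # hypothetical source-level event tags

def restrict_to_source(t_trace):
    return tuple(ev for ev in t_trace if ev[0] in SOURCE_ALPHABET)

def related(s_trace, t_trace):
    # a target trace is related to the source trace obtained by erasing
    # target-only events
    return restrict_to_source(t_trace) == s_trace

def in_target_guarantee(t_trace, src_prop):
    # t is in the image of src_prop iff its source-level restriction is in it
    return restrict_to_source(t_trace) in src_prop

src_prop = {(("out_nat", 1), ("out_nat", 2))}
tgt = (("out_nat", 1), ("out_extra", 7), ("out_nat", 2))
assert in_target_guarantee(tgt, src_prop)
assert related(restrict_to_source(tgt), tgt)
```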
Since the languages are very similar, it is simple to prove that our compiler is secure according to the trace relation defined above.
7.2 An Instance of Trace-relating Robust Preservation of Safety Properties
I/O events are not the only kind of events that compilers consider. Especially in the setting of secure compilation, where a compartmentalized partial program interacts with a context, interaction traces are often used [3, 35, 59, 64]. Consider a language analogous to that of the previous section, where the context defines a set of functions and the program defines a different set . Interaction traces (generally) record the control flow of calls between these two sets via actions that are and [34]. These actions indicate a call to function with parameter and a return with return value . In case the context calls a function in (or returns to a function in ), the action is decorated with a (i.e., those actions are and ). Dually, the program calling a function in (or returning to it) generates an action decorated with a (i.e., those actions are and ).
Patrignani and Garg [64] consider precisely such a setting. Their languages are simple like those presented here but impure; their source has an ML-like heap and the target has a memory that is indexed by natural numbers and capabilities to protect addresses. Moreover, they define a compiler that preserves safety properties of source programs (i.e., it is in the sense of 6.3) by relying on the target capabilities. The interesting point, however, is that they also consider source and target traces to be distinct, since the two languages have different values. Concretely, the source has and and the target only has , plus in the source, heap addresses are abstract locations while in the target they are . Thus, to prove , they rely on a cross-language relation on values, which is lifted to trace actions, and then lifted point-wise to traces (analogously to what we have done in Sections 4.3, 4.4, and 7.1). To relate addresses, their cross-language relation is equipped with a partial bijection between source and target addresses; this bijection grows monotonically with every reduction step.
Besides defining a relation on traces (which is an instance of ), they also define a relation between source and target safety properties that supports concurrent programs. 19 Thus, they really provide an instantiation of that maps all safe source traces to the related target ones. This ensures that no additional target trace is introduced in the target property, and source safety properties are mapped to target safety ones by . Hence, their compiler is proven to generate code that respects , so they achieve a variation of from 6.3. Their proofs are based on standard techniques both for secure compilation (i.e., trace-based backtranslation [61]) and for correct compilation (i.e., forward/backward simulation [42]).
7.3 An Instance of Trace-relating Robust Preservation of Hypersafety Properties
Patrignani and Garg [63] study the preservation of hypersafety from the perspective of secure compilation. Again, their result can be interpreted in our setting. They consider reactive systems, where trace alphabets are partitioned into input actions and output actions , whose concatenation generates traces . We use the same notation as before and indicate such sequences as and , respectively. The set of target output actions includes an action that has no source counterpart (i.e., ) and whose output does not depend on internal state (thus, it cannot leak secrets). 20 By emitting whenever undesired inputs are fed to a compiled program (e.g., passing a when a is expected), hypersafety is preserved (as does not leak secrets) [63].
More formally, they assume a relation on actions that is total on the source actions and injective. From there, they define —which here corresponds to an instance of — that maps the set of valid source traces to the set of valid target traces (that now mention ) as follows:
where indicates that is an undesired input (intuitively, this is an information that can be derived from the set of source traces [63]).
Informally, given a set of source traces , generates all target traces that are related (point-wise) to a source trace (case ). Then (case ), it adds all traces () with interleavings of undesired input (third conjunct) followed by (first conjunct) as long as the interleavings split a trace that has already been mapped (second conjunct).
is an instance of that maps source hypersafety to target hypersafety (and therefore, safety to safety), thus our theory can be instantiated for the preservation of these classes of hyperproperties as well.
8 Related Work
We already discussed how our results relate to some existing work in correct compilation [41, 77] and secure compilation [3, 63, 64]. We also already mentioned that most of our definitions and results make no assumptions about the structure of traces. One result that partially relies on the structure of traces is 6.3, which refers to finite prefixes, suggesting that traces should be some sort of sequences of events (or states), as is customary when one wants to refer to safety properties [18]. Without a notion of finite prefix, only may look different, but both and are trace-agnostic, as in general safety properties can be defined as the closed sets of any topology on traces [58].
Even for reasoning about safety, hypersafety, or arbitrary hyperproperties, traces can therefore be values, sequences of program states or of input/output events, or even the recently proposed interaction trees [81]. In the latter case, we believe that the compilation from IMP to ASM proposed by Xia et al. [81] can be seen as an instance of , for the relation they call "trace equivalence."
Compilers Where Our Work Could Be Useful. Our work should be broadly applicable to understanding the guarantees provided by many verified compilers. For instance, Wang et al. [80] recently proposed a CompCert variant that compiles all the way down to machine code, and it would be interesting to see if the model at the end of Section 4.1 applies there, too. This and many other verified compilers [15, 36, 51, 73] beyond CakeML [77] deal with resource exhaustion, and it would be interesting to also apply the ideas of Section 4.2 to them.
Hur and Dreyer [33] devised a correct compiler from an ML language to assembly using a cross-language logical relation to state their CC theorem. They do not have traces, though were one to add them, the logical relation on values would serve as the basis for the trace relation and therefore their result would attain CC.
Switching to more informative traces capturing the interaction between the program and the context is often used as a proof technique for secure compilation [3, 34, 62]. Most of these results consider a cross-language relation, so they probably could be proved to attain one of the criteria from Figure 4.
Generalizations of Compiler Correctness. The compiler correctness definition of Morris [50] was already general enough to account for trace relations, since it considered a translation between the semantics of the source program and that of the compiled program, which he called "decode" in his diagram, reproduced in Figure 5 (left). And even some of the more recent compiler correctness definitions preserve this kind of flexibility [65]. While CC can be seen as an instance of a definition by Morris [50], we are not aware of any prior work that investigated the preservation of properties when the "decode translation" is neither the identity nor a bijection, and source properties need to be re-interpreted as target ones and vice versa.

Correct Compilation and Galois Connections. Melton et al. [47] and Sabry and Wadler [70] expressed a strong variant of compiler correctness using the diagram of Figure 5 (right). They require that compiled programs parallel the computation steps () of the original source programs, which can be proven by showing the existence of a decompilation map that makes the diagram commute, or equivalently, the existence of an adjoint for ( for both source and target). The "parallel" intuition can be formalized as an instance of CC. Take source and target traces to be finite or infinite sequences of program states (maximal trace semantics [19]), and relate them exactly as Melton et al. [47] and Sabry and Wadler [70] do.
Fig. 5.
Translation Validation. Translation validation is an important alternative to proving that all runs of a compiler are correct, as it can be more easily applied to realistic compilers. An interesting work on translation validation of security properties has recently been proposed by Namjoshi and Tabajara [53]. They can handle many security properties expressible in terms of automata as long as source and target attackers and the observable traces are the same.
Instantiating the definition of any of the presented criteria with a particular program, one obtains translation validation criteria, with the map describing the target property that is (robustly) satisfied once the translation is validated. For example, one can consider
While the proof technique proposed by Namjoshi and Tabajara [53] might be generalized for —as long as and can be expressed as one of the automata they can handle—it does not work for because of the existential quantifier in the conclusion.
Busi et al. [13] are instead considering translation validation criteria in the spirit of . Their preliminary work only allows equality as the trace relation, but it should be amenable to a generalization to the trace-relating setting similar to the one we presented in this work.
Proof Techniques. We believe existing proof techniques (beyond the simulations discussed in Section 4.3.2) that have been devised to prove compiler correctness can also be employed to prove that a compiler attains any of the presented criteria. For example, cross-language binary logical relations can be used to relate two terms of two different languages when they "behave the same" [12, 33, 71]. Additionally, they can also be used when multiple programs "behave the same" [66] in a multilanguage semantics setting [45]. Secure compilation results (which rely on the criteria of Section 6) can be proven using variations of the backtranslation proof technique [22, 57, 64]. Presenting these proof techniques is beyond the scope of this article, so we refer the interested reader to the work of Patrignani et al. [61].
9 Conclusion and Future Work
We have extended the property preservation view on compiler correctness to arbitrary trace relations, and we believe that this will be useful for understanding the guarantees various compilers provide. An open question is whether, given a compiler, there exists a most precise relation for which this compiler is correct. As mentioned in Section 1, every compiler is CC for some , but under which conditions is there a most precise relation? In practice, more precision may not always be better, though, as it may be at odds with compiler efficiency and may not align with more subjective notions of usefulness, leading to tradeoffs in the selection of suitable relations. Finally, another interesting direction for future work is studying whether the correspondence with Galois connections makes it easier to compose trace relations for different purposes, say, for a compiler whose target language has undefined behavior, resource exhaustion, and side-channels. In particular, are there ways to obtain complex relations by combining simpler ones in a way that eases the compiler verification burden?
Composition for Multipass Compilers. For now, we can already informally argue about the correctness of a multipass compiler, where each step is proved correct for a possibly different trace relation. Concretely, assume is a compilation chain from a source language to an intermediate language and from the intermediate language to a target language . 21 Assume given two relations between traces of these languages: and , such that each compiler is proven to be w.r.t. the expected trace relation: and .
Let us consider the source-to-target compiler that is derived from the composition of the two aforementioned compilers, so . In this case, we obtain the expected result: The correctness of the whole compiler is derived from the individual compiler correctness proofs for each step.
where .
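The composition argument can be sketched concretely over finite relations; relations are represented as sets of pairs, the traces are placeholders, and the check confirms that the existential property mapping of the composed relation agrees with the composition of the two mappings.

```python
# A finite-model sketch of composing trace relations: a source-to-intermediate
# and an intermediate-to-target relation compose into a source-to-target
# relation, and the induced existential property mappings compose accordingly.

def compose(rel_si, rel_it):
    # s is related to t iff some intermediate trace relates them
    return {(s, t) for (s, i1) in rel_si for (i2, t) in rel_it if i1 == i2}

def tau(rel, src_prop):
    # existential image of a property along a relation
    return {t for (s, t) in rel if s in src_prop}

rel_si = {("s1", "i1"), ("s2", "i2")}
rel_it = {("i1", "t1"), ("i2", "t2")}
rel_st = compose(rel_si, rel_it)
src_prop = {"s1"}
assert tau(rel_st, src_prop) == tau(rel_it, tau(rel_si, src_prop)) == {"t1"}
```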
It is unclear how to generalize this kind of composition to compilers that attain different criteria. For example, if preserves arbitrary hyperproperties, but preserves 2-relational safety properties, then what can we conclude for ? We leave investigating these interesting matters for future work.
Acknowledgments
We thank Akram El-Korashy and Amin Timany for participating in an early discussion about this work and the anonymous reviewers for their valuable feedback.
Footnotes
1
For simplicity, for now, we ignore separate compilation and linking, returning to it in Section 6.
2
Typesetting convention [60]: We use a font for elements, an , font for ones, and a , font for elements common to both languages.
3
Stated at the top of the CompCert file driver/Complements.v and discussed by Regehr [68].
4
Given the deterministic nature of our programs, we consider notions of noninterference that are often used in deterministic languages. We leave notions of noninterference in nondeterministic languages for future work.
5
While target traces are often “more concrete” than source ones, trace properties (which in Coq we represent as the function type ) are contravariant in and thus target properties correspond to the abstract domain.
6
In case of ambiguity with property satisfaction the class of will be made explicit.
7
At least one other symmetric generalization is possible: For defined by , if produces a trace that violates the target interpretation of , i.e., , then produces thus violating .
8
is the topological closure in the topology where safety properties coincide with the closed sets (see, e.g., Clarkson and Schneider [18] and Pasqua and Mastroeni [58]).
9
At least one generalization is possible: . In this case, holds unconditionally while the other two implications hold under the same, but swapped, hypotheses from Theorem 3.8.
10
Making injective is a matter of adding open and close parentheses actions in target traces.
11
The exact shape of inputs and outputs depends on the scenario. For instance, inputs can be initial memories and outputs trace semantics of programs as in Reference [27, Section 7], while for interactive programs one may want to consider streams like Clark and Hunt [17]. We only require the sets of input and output projections to be disjoint. Further information, such as the ordering of events, is part of the attacker/observer model or the declassification of noninterference itself.
12
To be precise, the original formulation of ANI by Giacobazzi and Mastroeni [27] includes a third parameter , which describes the maximal input variation that the attacker may control. Here, we omit (i.e., take it to be the identity) to simplify the presentation.
13
In each compilation step, source and target traces are drawn from the same set so can be applied to both source and target traces.
14
The interested reader will notice the difference from the previous category by comparing condition (1) of Definition 5.10 and condition (1) of Definition 5.8 by Barthe et al. [7].
15
We remark has been used to conclude security of compilation in the previously discussed work of Sison and Murray [74] (and in its predecessor [52]). However, there is a key difference in the “role” of contexts: In robust compilation criteria, contexts model attackers, while in Sison and Murray [74] contexts are other bits of compiled code. This treatment lets Sison and Murray [74] reason compositionally about the concurrently executing compiled code.
16
. See, e.g., Clarkson and Schneider [18] and Pasqua and Mastroeni [58].
17
Technically, we should write: .
18
Notice that the languages are strictly terminating.
19
They call those safety properties monitors, since they focus on safety [72] and indicate with and with .
20
Technically, they assume a set of actions, but for this analogy a single action suffices.
21
For the intermediate language, we use a , font.
A Proofs
Proof of Theorem 5.1. First, we show that is an ; the proof for is the same.
Monotonicity. is a composition of monotonic functions; hence, it is itself monotonic.
Idempotence. We have to show that for , , which, unfolding the definition, means
For the inclusion “,”
the inclusion holds, because and the equality comes from idempotency of .
For the inclusion “,”
the inclusion comes from by extensiveness of , and the equality from .
Extensiveness. We have to show that .
The first inclusion is due to extensiveness of , the second to being the upper adjoint of .
To prove the statement of the theorem, assume and with ; we have to show that .
By CC there exists and such that . As a preliminary, apply Lemma B.1 to the relations and deduce is injective. Notice also that by functionality and totality, of and of , and and a similar fact holds for and .
so .
We now show that if is surjective, i.e., injective, then .
Let ; we show that for some .
The source property is such that . We only need to show . Let ,
which shows and concludes the proof.
Proof of Theorem 5.2. Assume and with . We have to show that , for an arbitrary that satisfies the condition
By CC there exists and such that . As a preliminary, recall that Lemma B.1 ensures is injective. Moreover, notice that by functionality and totality, of , and .
so .
Proof of Theorem 5.3. Assume and with and satisfying the condition . We have to show that . By CC there exists and such that . As a preliminary, recall that Lemma B.1 ensures is injective. Moreover, notice that by functionality and totality, of , and .
so .
References
[1] Martín Abadi, Anindya Banerjee, Nevin Heintze, and Jon G. Riecke. 1999. A core calculus of dependency. In Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’99). New York, NY, 147–160. DOI:https://doi.org/10.1145/292540.292555
Carmine Abate, Roberto Blanco, Ştefan Ciobâcă, Adrien Durier, Deepak Garg, Catalin Hritcu, Marco Patrignani, Éric Tanter, and Jérémy Thibault. 2020. Trace-relating compiler correctness and secure compilation. In Proceedings of the 29th European Symposium on Programming: Programming Languages and Systems, Held as Part of the European Joint Conferences on Theory and Practice of Software. 1–28. DOI:
Carmine Abate, Roberto Blanco, Deepak Garg, Cătălin Hriţcu, Marco Patrignani, and Jérémy Thibault. 2019. Journey beyond full abstraction: Exploring robust property preservation for secure compilation. In Proceedings of the 32nd IEEE Computer Security Foundations Symposium (CSF’19). Retrieved from https://arxiv.org/abs/1807.04603.
Gilles Barthe, Sandrine Blazy, Benjamin Grégoire, Rémi Hutin, Vincent Laporte, David Pichardie, and Alix Trieu. 2020. Formal verification of a constant-time preserving C compiler. Proc. ACM Program. Lang. 4, POPL (2020), 7:1–7:30. DOI:https://doi.org/10.1145/3371075
Gilles Barthe, Benjamin Grégoire, and Vincent Laporte. 2018. Secure compilation of side-channel countermeasures: The case of cryptographic “constant-time.” In Proceedings of the 31st IEEE Computer Security Foundations Symposium (CSF’18). 328–343. DOI:
Lennart Beringer, Gordon Stewart, Robert Dockins, and Andrew W. Appel. 2014. Verified compilation for shared-memory C. In Proceedings of the 23rd European Symposium on Programming: Programming Languages and Systems, Held as Part of the European Joint Conferences on Theory and Practice of Software. 107–127. DOI:https://doi.org/10.1007/978-3-642-54833-8_7
Frédéric Besson, Sandrine Blazy, and Pierre Wilke. 2019. A verified Comp-Cert front-end for a memory model supporting pointer arithmetic and uninitialised data. J. Autom. Reason. 62, 4 (2019), 433–480. DOI:https://doi.org/10.1007/s10817-017-9439-z
William J. Bowman and Amal Ahmed. 2015. Noninterference for free. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming.
Qinxiang Cao, Lennart Beringer, Samuel Gruetter, Josiah Dodds, and Andrew W. Appel. 2018. VST-Floyd: A separation logic tool to verify correctness of C programs. J. Autom. Reason. 61, 1–4 (2018), 367–422. DOI:https://doi.org/10.1007/s10817-018-9457-5
Quentin Carbonneaux, Jan Hoffmann, Tahina Ramananandro, and Zhong Shao. 2014. End-to-end verification of stack-space bounds for C programs. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Michael F. P. O’Boyle and Keshav Pingali (Eds.). 270–281. DOI:https://doi.org/10.1145/2594291.2594301
David Clark and Sebastian Hunt. 2008. Non-interference for deterministic interactive programs. In Proceedings of the International Workshop on Formal Aspects in Security and Trust. Springer, 50–66.
P. Cousot and R. Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the 4th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 238–252.
Patrick Cousot and Radhia Cousot. 1979. Systematic design of program analysis frameworks. In Proceedings of the 6th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. 269–282.
Vijay D’Silva, Mathias Payer, and Dawn Xiaodong Song. 2015. The correctness-security gap in compiler optimization. In Proceedings of the IEEE Symposium on Security and Privacy Workshops. 73–87. DOI:https://doi.org/10.1109/SPW.2015.33
Ronghui Gu, Zhong Shao, Jieung Kim, Xiongnan (Newman) Wu, Jérémie Koenig, Vilhelm Sjöberg, Hao Chen, David Costanzo, and Tahina Ramananandro. 2018. Certified concurrent abstraction layers. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, Jeffrey S. Foster and Dan Grossman (Eds.). 646–661. DOI:https://doi.org/10.1145/3192366.3192381
István Haller, Yuseok Jeon, Hui Peng, Mathias Payer, Cristiano Giuffrida, Herbert Bos, and Erik van der Kouwe. 2016. TypeSan: Practical type confusion detection. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. 517–528. DOI:https://doi.org/10.1145/2976749.2978405
Chung-Kil Hur and Derek Dreyer. 2011. A Kripke logical relation between ML and assembly. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Thomas Ball and Mooly Sagiv (Eds.). 133–146. DOI:https://doi.org/10.1145/1926385.1926402
Alan Jeffrey and Julian Rathke. 2005. Java Jr: Fully abstract trace semantics for a core java language. In Proceedings of the 14th European Symposium on Programming (Lecture Notes in Computer Science), Vol. 3444. 423–438. DOI:https://doi.org/10.1007/978-3-540-31987-0_29
Yannis Juglaret, Cătălin Hriţcu, Arthur Azevedo de Amorim, Boris Eng, and Benjamin C. Pierce. 2016. Beyond good and evil: Formalizing the security guarantees of compartmentalizing compilation. In Proceedings of the IEEE 29th Computer Security Foundations Symposium. 45–60. DOI:
Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov, Steve Zdancewic, and Viktor Vafeiadis. 2015. A formal C memory model supporting integer-pointer casts. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. 326–335. DOI:https://doi.org/10.1145/2737924.2738005
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis. 2016. Lightweight verification of separate compilation. In Proceedings of the ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. Retrieved from http://sf.snu.ac.kr/sepcompcert/.
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis. 2016. Lightweight verification of separate compilation. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’16). Association for Computing Machinery, New York, NY, 178–190. DOI:https://doi.org/10.1145/2837614.2837642
Leslie Lamport and Fred B. Schneider. 1985. Formal foundation for specification and verification. In Distributed Systems: Methods and Tools for Specification, an Advanced Course. Springer-Verlag, 203–285. DOI:https://doi.org/10.1007/3-540-15216-4_15
A. Melton, D. A. Schmidt, and G. E. Strecker. 1986. Galois connections and computer science applications. In Proceedings of a Tutorial and Workshop on Category Theory and Computer Programming. 299–312. Retrieved from http://dl.acm.org/citation.cfm?id=20081.20099.
F. Lockwood Morris. 1973. Advice on structuring compilers and proving them correct. In Proceedings of the ACM Symposium on Principles of Programming Languages, Patrick C. Fischer and Jeffrey D. Ullman (Eds.). 144–152. DOI:https://doi.org/10.1145/512927.512941
Eric Mullen, Daryl Zuniga, Zachary Tatlock, and Dan Grossman. 2016. Verified peephole optimizations for CompCert. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation. 448–461. DOI:https://doi.org/10.1145/2908080.2908109
Toby C. Murray, Robert Sison, Edward Pierzchalski, and Christine Rizkallah. 2016. Compositional verification and refinement of concurrent value-dependent noninterference. In Proceedings of the IEEE 29th Computer Security Foundations Symposium. IEEE Computer Society, 417–431. DOI:
Kedar S. Namjoshi and Lucas M. Tabajara. 2020. Witnessing secure compilation. In Proceedings of the International Conference on Verification, Model Checking, and Abstract Interpretation. Springer, 1–22.
David A. Naumann and Minh Ngo. 2019. Whither specifications as programs. In Proceedings of the International Symposium on Unifying Theories of Programming. Springer, 39–61. Retrieved from https://arxiv.org/abs/1906.03557.
Georg Neis, Chung-Kil Hur, Jan-Oliver Kaiser, Craig McLaughlin, Derek Dreyer, and Viktor Vafeiadis. 2015. Pilsner: A compositionally verified compiler for a higher-order imperative language. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming. 166–178. DOI:https://doi.org/10.1145/2784731.2784764
Max New, William J. Bowman, and Amal Ahmed. 2016. Fully abstract compilation via universal embedding. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming.
Michele Pasqua and Isabella Mastroeni. 2017. On topologies for (hyper)properties. In Joint Proceedings of the 18th Italian Conference on Theoretical Computer Science and the 32nd Italian Conference on Computational Logic co-located with the IEEE International Workshop on Measurements and Networking (IEEE M&N’17) (CEUR Workshop Proceedings), Dario Della Monica, Aniello Murano, Sasha Rubin, and Luigi Sauro (Eds.), Vol. 1949. 150–161. Retrieved from http://ceur-ws.org/Vol-1949/ICTCSpaper13.pdf.
Marco Patrignani and Deepak Garg. 2017. Secure compilation and hyperproperty preservation. In Proceedings of the 30th IEEE Computer Security Foundations Symposium. 392–404. DOI:
Marco Patrignani and Deepak Garg. 2019. Robustly safe compilation. In Proceedings of the 28th European Symposium on Programming: Programming Languages and Systems (ESOP’19). Retrieved from https://arxiv.org/abs/1804.00489.
Tahina Ramananandro, Zhong Shao, Shu-Chun Weng, Jérémie Koenig, and Yuchen Fu. 2015. A compositional semantics for verified separate compilation and linking. In Proceedings of the Conference on Certified Programs and Proofs. 3–14. DOI:https://doi.org/10.1145/2676724.2693167
Gabriel Scherer, Max S. New, Nick Rioux, and Amal Ahmed. 2018. FabULous interoperability for ML and a linear language. In Proceedings of the 21st International Conference on Foundations of Software Science and Computation Structures, Held as Part of the European Joint Conferences on Theory and Practice of Software (Lecture Notes in Computer Science), Christel Baier and Ugo Dal Lago (Eds.), Vol. 10803. Springer, 146–162. DOI:
Jaroslav Sevcík, Viktor Vafeiadis, Francesco Zappa Nardelli, Suresh Jagannathan, and Peter Sewell. 2013. CompCertTSO: A verified compiler for relaxed-memory concurrency. J. ACM 60, 3 (2013), 22:1–22:50. DOI:https://doi.org/10.1145/2487241.2487248
Robert Sison and Toby Murray. 2019. Verifying that a compiler preserves concurrent value-dependent information-flow security. In Proceedings of the 10th International Conference on Interactive Theorem Proving (LIPIcs), John Harrison, John O’Leary, and Andrew Tolmach (Eds.), Vol. 141. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 27:1–27:19. DOI:
Lau Skorstengaard, Dominique Devriese, and Lars Birkedal. 2019. StkTokens: Enforcing well-bracketed control flow and stack encapsulation using linear capabilities. Proc. ACM Program. Lang. 3, POPL (Jan. 2019), 19:1–19:28.
Gordon Stewart, Lennart Beringer, Santiago Cuellar, and Andrew W. Appel. 2015. Compositional CompCert. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 275–287. DOI:https://doi.org/10.1145/2676726.2676985
Yong Kiam Tan, Magnus O. Myreen, Ramana Kumar, Anthony Fox, Scott Owens, and Michael Norrish. 2019. The verified CakeML compiler backend. J. Funct. Program. 29 (2019).
Xi Wang, Haogang Chen, Alvin Cheung, Zhihao Jia, Nickolai Zeldovich, and M. Frans Kaashoek. 2012. Undefined behavior: What happened to my code? In Proceedings of the Asia-Pacific Workshop on Systems. 9. DOI:https://doi.org/10.1145/2349896.2349905
Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama. 2013. Towards optimization-safe systems: Analyzing the impact of undefined behavior. In Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles. 260–275. DOI:https://doi.org/10.1145/2517349.2522728
Yuting Wang, Pierre Wilke, and Zhong Shao. 2019. An abstract stack based approach to verified compositional compilation to machine code. Proc. ACM Program. Lang. 3, POPL (2019), 62:1–62:30. DOI:https://doi.org/10.1145/3290375
Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha, Benjamin C. Pierce, and Steve Zdancewic. 2020. Interaction trees: Representing recursive and impure programs in Coq. Proc. ACM Program. Lang. 4, POPL (2020), 51:1–51:32. DOI:https://doi.org/10.1145/3371119
Jianzhou Zhao, Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic. 2012. Formalizing the LLVM intermediate representation for verified program transformations. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 427–440. Retrieved from http://www.cis.upenn.edu/stevez/papers/ZNMZ12.pdf.