1 Introduction

Sequential decision-making tasks often require satisfaction of multiple, partially-contradictory objectives. For example, the control policy of a traffic light may need to choose signals in a way that the traffic throughput is maximized while the maximum waiting time is minimized [34]; the control policy operating an unmanned aerial vehicle may need to navigate in a way that the destination is reached while no-fly zones are avoided [33]; and the policy of an operating-system resource manager may need to allocate resources to tasks in a way that deadlocks are avoided while fairness is maintained [3].

We propose a decentralized synthesis framework for policies when tasks are given as a conjunction of two objectives \(\varPhi _1\) and \(\varPhi _2\), and the policies need to choose actions from a common action space. The key idea is that \(\varPhi _1\) and \(\varPhi _2\) will be accomplished, respectively, using two action policies \(\alpha _1\) and \(\alpha _2\)—designed independently, and the composition of \(\alpha _1\) and \(\alpha _2\) at runtime will generate a policy for \(\varPhi _1 \wedge \varPhi _2\). The challenge is that at each time point, one action needs to be chosen, whereas \(\alpha _1\) and \(\alpha _2\) might select conflicting actions. For example, when developing a plan for a robot, \(\varPhi _1\) and \(\varPhi _2\) might specify two target locations, and \(\alpha _1\) and \(\alpha _2\) may select opposite directions in a location.

We propose a novel composition mechanism called auction-based scheduling: both policies are allocated bounded monetary budgets, and at each point in time, an auction (aka bidding) is held, where the policies bid from their budgets for the privilege to get scheduled for choosing the action. More formally, we equip each action policy \(\alpha _i\), for \(i \in \left\{ 1,2 \right\} \), with a bidding policy \(\beta _i\), which is a function that proposes a bid from the available budget based on the history of the interaction. A tender for objective \(\varPsi \) is a triple \(\tau = \langle \alpha , \beta , \mathbb {B} \rangle \), where \(\alpha \) is an action policy, \(\beta \) is a bidding policy, and \(\mathbb {B}\in (0,1)\) is a minimal budget required for the tender to guarantee \(\varPsi \). Two tenders \(\tau _1\) and \(\tau _2\) are compatible if \(\mathbb {B}_1 + \mathbb {B}_2 < 1\), which is when they can be composed at runtime as follows. Each Tender i, for \(i \in \left\{ 1,2 \right\} \), is allocated an initial budget that exceeds \(\mathbb {B}_i\), where the sum of budgets equals 1. At each point in time, the tenders simultaneously choose bids using their bidding policies, the higher bidder chooses an action using its action policy and pays the bid to the other tender. Thus, the sum of budgets stays constant at 1. Note that the composition gives rise to a path in the graph. The decentralized synthesis problem asks: Given a graph \(\mathcal {G}\) and objectives \(\varPhi _1,\varPhi _2\) such that \(\varPhi _1\wedge \varPhi _2\ne \texttt{false}\), for each \(\varPhi _i\) compute \(\tau _i\) such that no matter which tender it is composed with, the composition generates a path that fulfills \(\varPhi _i\). The framework is sound-by-construction, namely the composition of compatible tenders satisfies \(\varPhi _1\wedge \varPhi _2\).

The advantage of auction-based scheduling is modularity at two levels. First, since the designs of policies do not depend on each other, they can be created independently and in parallel, e.g., by different vendors or in a parallel computation. Second, since the policies operate independently, they can be modified and replaced separately. For example, when only the objective \(\varPhi _1\) changes, there is no need to alter the policy \(\alpha _2\), and vice versa.

Bidding for the next action encourages the policy with higher scheduling urgency to bid higher, and at the same time, the bounds on budgets ensure fairness, namely that no policy is starved. Auction-based scheduling adds new, complementary features to the arsenal of modular approaches in multi-objective decision-making. With the conventional decentralized synthesis approaches, the policies are composed either concurrently [39] or in a turn-based manner [23]. Concurrent actions are meaningful if each policy needs to act on its own local control variables, e.g., when the local control policies of two robots concurrently move the robots towards their destinations in a shared workspace. In our case, the set of actions is common between policies, and the concurrent interaction is unsuitable. Likewise, turn-based actions are also unsuitable in our setting because it is unclear how to assign turns to policies a priori. We will demonstrate (Ex. 2) that an inappropriate turn-assignment to policies may violate some of the objectives, while auction-based scheduling will succeed in fulfilling all of them.

We study auction-based scheduling in the context of path planning on finite directed graphs with pairs of \(\omega \)-regular objectives on its paths, and present algorithms for the decentralized synthesis problem with increasing levels of assumptions made by the tenders on each other: (a) Strong synthesis, with no assumptions and the most robust solution, (b) assume-admissible synthesis, with the assumption that the other tender is not purely cynical and behaves rationally with respect to its own objective, and (c) assume-guarantee synthesis, with explicit contract-based pre-coordination. We show for graphs whose every vertex has at most two outgoing edges, for every pair of \(\omega \)-regular objectives \(\varPhi _1,\varPhi _2\), and for all three classes of problems (a), (b), and (c), there exist PTIME decentralized synthesis algorithms that either compute compatible tenders or output that no compatible tenders with the respective assumptions exist; surprisingly, we show that compatible tenders always exist for (b). For general graphs, we show that the problems are in \(\text {NP}\cap \text {coNP}\). All our algorithms internally solve bidding games using known algorithms from the literature [37, 38]. Due to the lack of space, some proofs are omitted, but can be found in the extended version [17].

2 Preliminaries

Let \(\varSigma \) be a finite alphabet. We use \(\varSigma ^*\) and \(\varSigma ^\omega \) to denote the sets of finite and infinite words over \(\varSigma \), respectively, and \(\varSigma ^\infty \) to denote \(\varSigma ^*\cup \varSigma ^\omega \). For two words \(u\in \varSigma ^*\) and \(v\in \varSigma ^\infty \), we write \(u\le v\) if u is a prefix of v, i.e., there exists a word \(w\in \varSigma ^\infty \) such that \(v = uw\). Given a language \(L\subseteq \varSigma ^\infty \), define \(\textsf{pref}(L)\) to be the set of all finite prefixes of words in L, i.e., \(\textsf{pref}(L){:}{=}\left\{ u\in \varSigma ^*\mid \exists v\in L\;.\; u\le v \right\} \).

Graphs. We formalize path planning problems on graphs. A graph \(\mathcal {G}\) is a tuple \(\left\langle V,v^{0},E \right\rangle \) where \(V\) is a finite set of vertices, \(v^{0}\) is a designated initial vertex, and \(E\subseteq V\times V\) is a set of directed edges. If \((u,v)\in E\), then v is a successor of u. A binary graph is a graph whose every vertex has at most two successors. A path over \(\mathcal {G}\) is a sequence of vertices \(v^0v^1\ldots \) so that every \((v^i,v^{i+1})\in E\). Unless explicitly mentioned, paths always start at \(v^0\). We use \( Paths ^{\textsf{fin}}(\mathcal {G})\) and \( Paths ^{\textsf{inf}}(\mathcal {G})\) to denote the sets of finite and infinite paths, respectively.

A strongly connected component (SCC) of the graph \(\mathcal {G}\) is a set S of vertices, such that there is a path between every pair of vertices of S. An SCC S is called a bottom strongly connected component (BSCC) if there does not exist any edge from a vertex in S to a vertex outside of S. The graph \(\mathcal {G}\) is itself called strongly connected if \(V\) is an SCC.
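As a minimal sketch of these definitions, the following Python computes the maximal SCCs of a small graph by mutual reachability and keeps the bottom ones. The adjacency-dictionary encoding, the function names, and the example graph are assumptions made for illustration; the code enumerates maximal SCCs only, whereas the definition above also admits non-maximal ones.

```python
from typing import Dict, FrozenSet, List, Set

Graph = Dict[str, List[str]]  # vertex -> list of successors (illustrative encoding)

def reachable(graph: Graph, source: str) -> Set[str]:
    """Vertices reachable from `source` by a path of length >= 0."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for w in graph[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def maximal_sccs(graph: Graph) -> Set[FrozenSet[str]]:
    """Group vertices that can reach each other (quadratic, fine for small graphs)."""
    reach = {v: reachable(graph, v) for v in graph}
    return {frozenset(u for u in reach[v] if v in reach[u]) for v in graph}

def bottom_sccs(graph: Graph) -> Set[FrozenSet[str]]:
    """Keep the SCCs that have no edge leaving them."""
    return {s for s in maximal_sccs(graph)
            if all(w in s for u in s for w in graph[u])}

# Hypothetical example: {b, c} is an SCC but not bottom (edge b -> d leaves it).
example = {"a": ["b"], "b": ["c", "d"], "c": ["b"], "d": ["d"]}
print(bottom_sccs(example))
```

In this example the only BSCC is the self-looping vertex d, since every other component has an outgoing edge.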

Objectives. Fix a graph \(\mathcal {G}\). An objective \(\varPhi \) in \(\mathcal {G}\) is a set of infinite paths, i.e., \(\varPhi \subseteq Paths ^{\textsf{inf}}(\mathcal {G})\). For an infinite path \(\rho \), we use \( Inf (\rho )\) to denote the set of vertices that \(\rho \) visits infinitely often. We focus on the following objectives:

  • Reachability: for \(S\subseteq V\), \( Reach _\mathcal {G}(S){:}{=}\left\{ v^{0}v^1\ldots \in Paths ^{\textsf{inf}}(\mathcal {G}) \mid \exists i\ge 0\; . \; v^i\in S \right\} \),

  • Safety: for \(S\subseteq V\), \( Safe _\mathcal {G}(S){:}{=}\left\{ v^{0}v^1\ldots \in Paths ^{\textsf{inf}}(\mathcal {G}) \mid \forall i\ge 0\; . \; v^i\in S \right\} \),

  • Büchi: for \(S\subseteq V\), \( B\ddot{u}chi _\mathcal {G}(S){:}{=}\left\{ \rho \in Paths ^{\textsf{inf}}(\mathcal {G})\mid Inf (\rho )\cap S\ne \emptyset \right\} \),

  • Parity (max, even): for \( Col :V\rightarrow [0;k]\) for some \(k>0\), \( Parity _\mathcal {G}( Col ){:}{=}\left\{ \rho \in Paths ^{\textsf{inf}}(\mathcal {G})\mid \max \left\{ i\mid \exists v\in Inf (\rho )\;.\; Col (v)=i \right\} \text { is even} \right\} \).

Given an objective \(\varPhi \), we will use \(\varPhi ^c\) to denote its complement, i.e., \(\varPhi ^c = Paths ^{\textsf{inf}}(\mathcal {G})\setminus \varPhi \). Observe that \(( Reach _\mathcal {G}(S))^c = Safe _\mathcal {G}(V\setminus S)\).
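To make the four objectives concrete, here is a small sketch that evaluates them on ultimately periodic paths of the form prefix·cycle^ω. The lasso representation and all function names are assumptions introduced for illustration; for such a path, the set Inf(ρ) is exactly the set of vertices on the cycle.

```python
from typing import Dict, Sequence, Set

def visited(prefix: Sequence[str], cycle: Sequence[str]) -> Set[str]:
    """All vertices occurring on the path prefix . cycle^omega."""
    return set(prefix) | set(cycle)

def in_reach(prefix, cycle, targets: Set[str]) -> bool:
    """Reachability: some visited vertex lies in the target set."""
    return bool(visited(prefix, cycle) & targets)

def in_safe(prefix, cycle, safe: Set[str]) -> bool:
    """Safety: every visited vertex lies in the safe set."""
    return visited(prefix, cycle) <= safe

def in_buchi(cycle, accepting: Set[str]) -> bool:
    """Buchi: Inf(rho), i.e. the cycle, intersects the accepting set."""
    return bool(set(cycle) & accepting)

def in_parity(cycle, col: Dict[str, int]) -> bool:
    """Parity (max, even): the largest colour seen infinitely often is even."""
    return max(col[v] for v in cycle) % 2 == 0

# Hypothetical path a b (c d)^omega over V = {a, b, c, d}.
V = {"a", "b", "c", "d"}
prefix, cycle = ["a", "b"], ["c", "d"]
S = {"d"}
print(in_reach(prefix, cycle, S), in_safe(prefix, cycle, V - S))
```

On any such path, `in_reach(prefix, cycle, S)` and `in_safe(prefix, cycle, V - S)` give opposite answers, matching the complement identity stated above.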

Action policies. Fix a graph \(\mathcal {G}\). An action policy is a function \(\alpha : Paths ^{\textsf{fin}}(\mathcal {G})\rightarrow V\), choosing the next vertex to extend any given finite path \(\rho v\), where \(\left\langle v, \alpha (\rho v) \right\rangle \in E\). The action policy \(\alpha \) is memoryless if for every pair of distinct finite paths \(\rho v,\rho 'v\) that end in the same vertex v, it holds that \(\alpha (\rho v) = \alpha (\rho ' v)\); in this case we simply write \(\alpha (v)\). An action policy \(\alpha \) generates a unique infinite path in \(\mathcal {G}\), denoted \( out (\alpha )\), and defined inductively as follows. The initial vertex is \(v^{0}\). For every prefix \(v^{0}, \ldots , v^i\) of \( out (\alpha )\), for \(i \ge 0\), \(v^{i+1} = \alpha (v^{0}, \ldots , v^i)\). We say that the policy \(\alpha \) satisfies a given objective \(\varPhi \), written \(\alpha \models \varPhi \) iff \( out (\alpha )\in \varPhi \).
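The inductive definition of out(α) can be sketched as a generator. The example below assumes a memoryless action policy stored as a dictionary and a hypothetical four-vertex graph; neither the encoding nor the names come from the paper.

```python
from itertools import islice
from typing import Dict, Iterator, List

Graph = Dict[str, List[str]]

def respects_edges(graph: Graph, policy: Dict[str, str]) -> bool:
    """A policy is well-formed if it always moves along an edge of the graph."""
    return all(policy[v] in graph[v] for v in policy)

def out(policy: Dict[str, str], initial: str) -> Iterator[str]:
    """Lazily enumerate the unique infinite path generated by a memoryless policy."""
    v = initial
    while True:
        yield v
        v = policy[v]

# Hypothetical graph and memoryless policy.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["c"], "d": ["d"]}
policy = {"a": "b", "b": "d", "c": "c", "d": "d"}

first_five = list(islice(out(policy, "a"), 5))
print(first_five)  # ['a', 'b', 'd', 'd', 'd']
```

Because the path is infinite, only finite prefixes can be materialized; here the path stabilizes in the self-loop at d, so the policy satisfies a reachability objective with target {d}.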

3 The Auction-Based Scheduling Framework

Consider a graph \(\mathcal {G}=\left\langle V,v^{0},E \right\rangle \). A pair of objectives \(\varPhi _1,\varPhi _2 \subseteq V^\omega \) in \(\mathcal {G}\) are called overlapping if they have nonempty intersection (i.e., \(\varPhi _1 \cap \varPhi _2 \ne \emptyset \)). The multi-objective planning problem asks to synthesize an action policy that satisfies the global objective \(\varPhi _1 \cap \varPhi _2\) for overlapping \(\varPhi _1,\varPhi _2\).

We propose a decentralized approach to the problem. Our goal is to design two action policies \(\alpha _1\) and \(\alpha _2\) for \(\varPhi _1\) and \(\varPhi _2\), respectively. We will equip each action policy with a bidding policy, which it will use at runtime to bid for choosing the action at each time point. We formalize this below.

Definition 1 (Bidding policies)

A bidding policy is a function \(\beta :V\times [0,1]\rightarrow [0,1]\) with the constraint that \(\beta (v,B)\le B\) for every vertex v and every amount of available budget \(B\in [0,1]\).

We equip a pair of action and bidding policies with a threshold budget, which represents the greatest lower bound on the initial budget needed for the policies to guarantee their objective, and we call the resulting triple a tender.

Definition 2 (Tenders)

A tender for a given graph \(\mathcal {G}\) is a triple \(\left\langle \alpha ,\beta ,\mathbb {B} \right\rangle \) of an action policy \(\alpha \), a bidding policy \(\beta \), and a threshold budget \(\mathbb {B}\in [0,1]\). The set of all tenders for \(\mathcal {G}\) is denoted \(\mathcal {T}^\mathcal {G}\). A tender \(\tau \) satisfies an objective \(\varPhi \), denoted \(\tau \models \varPhi \), iff \(\alpha \models \varPhi \) (i.e., when the tender is operating alone on the graph).

Next, we formalize the composition of two tenders at runtime, which produces an action policy that uses a register of memory to keep track of the available budgets. We introduce some notation. A configuration is a pair \(\left\langle v,B_1 \right\rangle \), where v is a vertex and \(B_1\) is the budget available to the first tender. We normalize the sum of budgets to 1, hence implicitly, the budget available to the second tender is \(B_2=1-B_1\). Let \(\mathcal {C}= V\times [0,1]\) be the set of all configurations. For a given sequence of configurations \(s = (v^0,B_1^0)(v^1,B_1^1)\ldots \in \mathcal {C}^\infty \), let \(\textsf{proj}_{V}(s)\) denote the path \(v^0v^1\ldots \). A history is a finite sequence of configurations \(h = \left\langle v^{0},B_1^0 \right\rangle \ldots \left\langle v^k,B_1^k \right\rangle \in \mathcal {C}^*\) with the constraint that \(\textsf{proj}_{V}(h)\in Paths ^{\textsf{fin}}(\mathcal {G})\). Let \(\mathcal {H}\) be the set of all histories.

Definition 3 (Composition of tenders)

Let \(\mathcal {G}\) be a graph, and \(\tau _1 = \left\langle \alpha _1,\beta _1,\mathbb {B}_1 \right\rangle \) and \(\tau _2=\left\langle \alpha _2,\beta _2,\mathbb {B}_2 \right\rangle \) be two tenders. The tenders \(\tau _1\) and \(\tau _2\) are compatible iff \(\mathbb {B}_1+\mathbb {B}_2 < 1\). If compatible, then their composition, denoted \(\tau _1\!\bowtie \! \tau _2\), is a function \(\tau _1\!\bowtie \! \tau _2:\mathcal {H}\rightarrow \mathcal {C}\) defined as follows. Given a history \(h=\left\langle v^{0},B_1^0 \right\rangle \ldots \left\langle v^k,B_1^k \right\rangle \in \mathcal {H}\), let \(b_1{:}{=}\beta _1(v^k,B_1^k)\) and \(b_2{:}{=}\beta _2(v^k,1-B_1^k)\). Then,

  • if \(b_1\ge b_2\), then \(\tau _1\!\bowtie \! \tau _2(h) = \left( \alpha _1(v^{0}\ldots v^k), B_1^k-b_1\right) \), and

  • if \(b_1 < b_2\), then \(\tau _1\!\bowtie \! \tau _2(h)= \left( \alpha _2(v^{0}\ldots v^k), B_1^k+b_2\right) \).

Given an initial configuration \(\left\langle v^{0},B_1^0 \right\rangle \) with \(B_1^0>\mathbb {B}_1\) and \(B_2^0=1-B_1^0>\mathbb {B}_2\), the composition outputs an infinite sequence of configurations, denoted \(\textrm{out}(\tau _1\!\bowtie \! \tau _2)\), where \(\textrm{out}(\tau _1\!\bowtie \! \tau _2){:}{=}\left\langle v^{0},B_1^0 \right\rangle \left\langle v^1,B_1^1 \right\rangle \ldots \in \mathcal {C}^\omega \) such that for every k, \(\left\langle v^k,B_1^k \right\rangle = \tau _1\!\bowtie \! \tau _2\left( \left\langle v^{0},B_1^0 \right\rangle \ldots \left\langle v^{k-1},B_1^{k-1} \right\rangle \right) \). We will say \(\tau _1\!\bowtie \! \tau _2\) satisfies a given objective \(\varPhi \), written \(\tau _1\!\bowtie \! \tau _2\models \varPhi \), iff \(\textsf{proj}_{V}(\textrm{out}(\tau _1\!\bowtie \! \tau _2))\in \varPhi .\)
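The composition can be simulated step by step. The sketch below is illustrative only: it assumes memoryless action policies stored as dictionaries, exact rational budgets, and a hypothetical binary graph with edges a→b, a→e, b→c, b→d, e→f, e→g and self-looping sinks. The concrete bids are likewise assumptions chosen for the demonstration. Ties go to the first tender, as in Definition 3.

```python
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

Bidding = Callable[[str, Fraction], Fraction]   # (vertex, own budget) -> bid
Config = Tuple[str, Fraction]                   # (vertex, budget of tender 1)

def compose_step(v: str, budget1: Fraction,
                 alpha1: Dict[str, str], beta1: Bidding,
                 alpha2: Dict[str, str], beta2: Bidding) -> Config:
    """One auction: the higher bidder moves and pays its bid to the other tender."""
    bid1, bid2 = beta1(v, budget1), beta2(v, 1 - budget1)
    assert 0 <= bid1 <= budget1 and 0 <= bid2 <= 1 - budget1
    if bid1 >= bid2:
        return alpha1[v], budget1 - bid1
    return alpha2[v], budget1 + bid2

def run(v0: str, budget1: Fraction, alpha1, beta1, alpha2, beta2,
        steps: int) -> List[Config]:
    """Finite prefix of out(tau_1 composed with tau_2) from <v0, budget1>."""
    history = [(v0, budget1)]
    for _ in range(steps):
        history.append(compose_step(*history[-1], alpha1, beta1, alpha2, beta2))
    return history

sinks = {s: s for s in "cdfg"}
alpha1 = {"a": "b", "b": "d", "e": "g", **sinks}   # tender 1 aims for {c, d, g}
alpha2 = {"a": "e", "b": "c", "e": "f", **sinks}   # an arbitrary second tender

def beta1(v: str, budget: Fraction) -> Fraction:
    planned = {"a": Fraction(1, 4), "e": Fraction(1, 2)}.get(v, Fraction(0))
    return min(planned, budget)

def all_in_at_a(v: str, budget: Fraction) -> Fraction:
    return budget if v == "a" else Fraction(0)

def passive(v: str, budget: Fraction) -> Fraction:
    return Fraction(0)

start = Fraction(13, 50)  # 1/4 + 1/100
print(run("a", start, alpha1, beta1, alpha2, all_in_at_a, 3))
print(run("a", start, alpha1, beta1, alpha2, passive, 3))
```

Against the opponent that spends everything at a, tender 1 loses the first auction, collects the payment, and then wins at e; against the passive opponent it wins immediately. In both runs the path ends in {c, d, g}.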

We will often use the index \(i \in \left\{ 1,2 \right\} \) to refer to either of the two tenders or their attributes, and will use \(-i = 3-i\) for the “other” one, e.g., \(\tau _i\) and \(\tau _{-i}\). Notice the difference between \(\mathbb {B}_i\) and \(B_i^0\): \(\mathbb {B}_i\) is the threshold budget at \(v^{0}\) which is a constant attribute of \(\tau _i\), whereas \(B_i^0\) is the actual budget initially allocated to \(\tau _i\) whose value can be anything above \(\mathbb {B}_i\).

3.1 Classes of decentralized synthesis problem

In this section, we describe three classes of decentralized synthesis problems that we study. Throughout this section, fix a graph \(\mathcal {G}\) and a given pair of overlapping objectives \(\varPhi _1\) and \(\varPhi _2\).

Strong decentralized synthesis. Here, tenders make no assumptions on each other, thus the solutions provide the strongest (the most robust) guarantees. Formally, for each \(i\in \left\{ 1,2 \right\} \), the goal is to construct \(\tau _i\) such that for every compatible \(\tau _{-i}\), we have \(\tau _i\!\bowtie \! \tau _{-i}\models \varPhi _i\).

Fig. 1. Graphs with two reachability objectives given by targets: \(T_\textsf{blue}\), depicted in blue, \(T_\textsf{red}\) depicted in red, and \(T_\textsf{blue}\cap T_\textsf{red}\) depicted in purple. The action policies of the red and blue tenders choose edges with, respectively, red and blue shadows (shared edges are in purple). If no edges from a vertex have red or blue shadow, then the respective tender is indifferent about the choice made. Thick edges depict the paths taken by the compositions of tenders.

Example 1

Consider the graph depicted in Fig. 1a with a pair of reachability objectives having the targets \(T_{\textsf{blue}} = \left\{ c,d,g \right\} \) and \(T_{\textsf{red}}=\left\{ d,f \right\} \), respectively. Their intersection \(\left\{ d \right\} \) is depicted in purple. We present a pair of robust tenders \(\tau _{\textsf{blue}}\) and \(\tau _{\textsf{red}}\) with \(\mathbb {B}_{\textsf{blue}} = 0.25\) and \(\mathbb {B}_{\textsf{red}} = 0.5\), so that \(\tau _{\textsf{blue}}\) and \({\tau _{\textsf{red}}}\) are compatible. We will show that \(\tau _{\textsf{blue}}\) guarantees that no matter which compatible tender it is composed with, eventually \(T_{\textsf{blue}}\) is reached, and similarly \(\tau _{\textsf{red}}\) ensures that \(T_{\textsf{red}}\) is reached. Therefore, \(\tau _{\textsf{blue}}\!\bowtie \! \tau _{\textsf{red}}\) ensures that d is reached.

We first describe \(\tau _{\textsf{blue}}\). Consider an initial configuration \(\langle a, 0.25+\epsilon \rangle \), for any \(\epsilon > 0\). Note that the other tender’s budget is \(0.75-\epsilon \). The first action of \(\tau _\textsf{blue}\) is to bid 0.25 and, upon winning, to move to b. There are two possibilities. First, \(\tau _{\textsf{blue}}\) wins the bidding, then we reach the configuration \(\langle b, \epsilon \rangle \), and since both successors of b are in \(T_{\textsf{blue}}\), the objective is satisfied in the next step. Second, \(\tau _{\textsf{blue}}\) loses the bidding, meaning that the other tender bids at least 0.25, and in the worst case, we proceed to the configuration \(\langle e, 0.5+\epsilon \rangle \). Next, \(\tau _\textsf{blue}\) bids 0.5 and chooses g; it necessarily wins as \(\tau _\textsf{red}\)’s budget is only \(0.5-\epsilon \), and we reach \(g \in T_{\textsf{blue}}\). We stress that \(\tau _\textsf{blue}\) can be entirely oblivious about \(\tau _\textsf{red}\), except for the implicit knowledge of \(\tau _\textsf{red}\)’s budget.

We now describe \(\tau _{\textsf{red}}\). Consider an initial configuration \(\langle a, 0.5+\epsilon \rangle \), for any \(\epsilon > 0\). Initially, \(\tau _{\textsf{red}}\) bids 0, because it does not have a preference between going left or right. In the worst case, the budget stays \(0.5+\epsilon \) in the next turn. Since both b and e have a single successor in \(T_{\textsf{red}}\), \(\tau _\textsf{red}\) must win the bidding. It does so by bidding 0.5, which exceeds the available budget \(0.5-\epsilon \) of the other tender.   \(\triangle \)

We now use the same problem as in Ex. 1, and show that the conventional turn-based interaction may fail to fulfill both objectives.

Example 2

Consider again the graph depicted in Fig. 1a with the targets \(T_{\textsf{blue}} = \left\{ c,d,g \right\} \) and \(T_{\textsf{red}}=\left\{ d,f \right\} \). Suppose \(\alpha _\textsf{blue}\) and \(\alpha _\textsf{red}\) are the two respective action policies, and we arbitrarily decide to make their interaction turn-based, where \(\alpha _\textsf{red}\) chooses actions in a and \(\alpha _\textsf{blue}\) chooses actions in b and e. It is clear that no matter which edge \(\alpha _\textsf{red}\) chooses from a, it cannot guarantee satisfaction of \(T_\textsf{red}\), because \(\alpha _\textsf{blue}\) can take the game to c or g depending on \(\alpha _\textsf{red}\)’s choice.    \(\triangle \)

Assume-admissible decentralized synthesis. While the guarantees of strong decentralized synthesis are appealing, it often fails as each tender makes the pessimistic assumption that the other tender can behave arbitrarily—even adversarially. We consider admissibility [23] as a stronger assumption based on rationality, ensuring compatible tenders to exist even when strong synthesis may fail. We illustrate the idea in the following example.

Example 3

Consider the graph in Fig. 1b, with reachability objectives given by targets \(T_{\textsf{blue}}=\left\{ d,g \right\} \) and \(T_{\textsf{red}}=\left\{ d,f \right\} \). We argue that strong decentralized synthesis is not possible. Indeed, using the same reasoning as for \(\tau _\textsf{red}\) in Ex. 1, we have \( Th _\textsf{blue}(a) = Th _\textsf{red}(a) = 0.5\). On the other hand, observe that when synthesizing \(\tau _\textsf{red}\), since \(c \notin T_\textsf{blue}\), we know that a “rational” \(\tau _\textsf{blue}\) (formally, an admissible \(\tau _\textsf{blue}\); see Sec. 6) will not proceed from b to c, and we can omit the edge. In turn, the threshold in a decreases to 0.25 for both objectives. Since the sum of thresholds is now less than 1, two compatible tenders can be obtained.    \(\triangle \)

In general, we seek an admissible-winning tender, which ensures that its objective is satisfied when composed with any admissible tender. Admissible-winning tenders are modular because they can be reused provided that the set of admissible actions of the other tender remains unchanged. For example, even when vertex g is added to the red target set, the blue tender can be used with no change. Somewhat surprisingly, we show that in graphs in which all vertices have out-degree at most 2, assume-admissible decentralized synthesis is always possible, and a pair of admissible-winning tenders can be found in PTIME.

Assume-guarantee decentralized synthesis. Sometimes, even the admissibility assumption is too weak, and we need more direct synchronization of the tenders. We consider assume-guarantee decentralized synthesis, where each tender needs to respect a pre-specified contract, and as a result, their composition satisfies both objectives. We illustrate the idea below.

Example 4

Consider the graph depicted in Fig. 1c, with reachability objectives given by targets \(T_{\textsf{blue}}=\left\{ c,d,g \right\} \) and \(T_{\textsf{red}}=\left\{ d,f \right\} \). Here, the strong decentralized synthesis fails due to reasons similar to Ex. 3. The assume-admissible decentralized synthesis fails because from e, both objectives cannot be fulfilled, and from b, whichever tender wins the bidding can use an admissible edge that violates the other objective (e.g., (b, c) is admissible for \(\tau _\textsf{blue}\) but violates \(T_\textsf{red}\)). We consider the contract \(\left\langle G_{\textsf{blue}},G_{\textsf{red}} \right\rangle =\left\langle \textbf{G}\,\lnot c,\textbf{G}\,\lnot f \right\rangle \), which is satisfied when (a) if \(\alpha _{\textsf{blue}}\) fulfills \(G_{\textsf{blue}}\), then \(\alpha _{\textsf{red}}\) fulfills \(G_{\textsf{red}}\), and (b) if \(\alpha _{\textsf{red}}\) fulfills \(G_{\textsf{red}}\), then \(\alpha _{\textsf{blue}}\) fulfills \(G_{\textsf{blue}}\). Now whichever tender wins the bidding at b needs to fulfill its guarantee, because it cannot judge from the past interaction if the other tender violates its guarantee. Therefore, from b, the next vertex will be d under the contract, and using the same tenders from Ex. 3, both objectives will be fulfilled.    \(\triangle \)

4 An Aside on Bidding Games on Graphs

All our synthesis algorithms internally solve bidding games, which we briefly review here; see the survey [8] for more details. A (two-player) bidding game is played between \( Player ~X \) and \( Player ~Y \), and is a tuple \(\langle \mathcal {G}, \varPhi \rangle \), where \(\mathcal {G}= \langle V, E \rangle \) is the (finite, directed) graph and \(\varPhi \subseteq V^\omega \) is the objective for \( Player ~X \). The game is zero-sum, meaning that the objective of \( Player ~Y \) is \(V^\omega \setminus \varPhi \), i.e., the violation of \(\varPhi \). This differs from auction-based scheduling where objectives overlap; otherwise, the interaction between \( Player ~X \) and \( Player ~Y \) is the same as the one between tenders. A strategy for a player is a pair \(\langle \alpha , \beta \rangle \) where \(\alpha \) is an action policy and \(\beta \) is a bidding policy. As in the composition of tenders, two strategies and an initial configuration \(\langle v, B_1 \rangle \) give rise to an infinite sequence of configurations called a play. A strategy is winning if no matter which strategy the opponent follows, the play satisfies the player’s objective. A central quantity in bidding games is the threshold budget in a vertex v, which is, intuitively, a necessary and sufficient initial budget for \( Player ~X \) to guarantee winning.

Definition 4 (Threshold budgets)

Consider a bidding game \(\langle \mathcal {G}, \varPhi \rangle \) with \(\mathcal {G}=\left\langle V,E \right\rangle \). The threshold of \( Player ~X \) is given by \( Th _\varPhi ^\mathcal {G}: V\rightarrow [0,1]\), where for every \(v \in V\), we have \( Th _\varPhi ^\mathcal {G}(v) = \inf _B \left\{ Player ~X \text { has a winning strategy from } \langle v, B \rangle \right\} \).

The threshold of \( Player ~Y \) is denoted as \( Th _{\varPhi ^c}^\mathcal {G}(v)\). The following theorem characterizes the structure of thresholds and states that the two players’ thresholds are complementary. The intuition can be found in the full version [17], where we also show how winning strategies can be constructed from thresholds.

Theorem 1

([38]). Consider a reachability bidding game \(\left\langle \mathcal {G},\varPhi \right\rangle \) where \(\varPhi \) is \( Reach _\mathcal {G}(T)\) where, without loss of generality, T is a given set of sink vertices. For every vertex v, we have \( Th _\varPhi ^\mathcal {G}(v) = 1- Th _{\varPhi ^c}^\mathcal {G}(v)\). Moreover, for every sink vertex t, we have \( Th _\varPhi ^\mathcal {G}(t) = 0\), if \(t \in T\), and \( Th _\varPhi ^\mathcal {G}(t) = 1\) otherwise. For every non-sink vertex v, we have \( Th _\varPhi ^\mathcal {G}(v) = 0.5 \cdot ( Th _\varPhi ^\mathcal {G}(v^+) + Th _\varPhi ^\mathcal {G}(v^-))\), where \(v^-\) and \(v^+\) are successors of v, such that for every other successor u, we have \( Th _\varPhi ^\mathcal {G}(v^-) \le Th _\varPhi ^\mathcal {G}(u) \le Th _\varPhi ^\mathcal {G}(v^+)\). Verifying if \( Th _\varPhi ^\mathcal {G}(v) > 0.5\) for a given vertex v is in NP\(\,\cap \,\)coNP in general and is in PTIME for binary graphs.
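The averaging characterization can be evaluated directly when the graph is acyclic. The following sketch assumes exactly that: sinks are encoded as vertices with no successors, and the recursion is not meant for graphs with cycles. The seven-vertex binary graph is a hypothetical example, not a reproduction of any figure, and the function name is illustrative.

```python
from fractions import Fraction
from functools import lru_cache
from typing import Dict, List, Set

def reach_thresholds(graph: Dict[str, List[str]],
                     targets: Set[str]) -> Dict[str, Fraction]:
    """Thresholds for Reach(targets) on an acyclic graph whose sinks have no successors."""
    @lru_cache(maxsize=None)
    def th(v: str) -> Fraction:
        successors = graph[v]
        if not successors:                      # sink: 0 if target, 1 otherwise
            return Fraction(0) if v in targets else Fraction(1)
        values = [th(u) for u in successors]
        return (min(values) + max(values)) / 2  # average of v^- and v^+
    return {v: th(v) for v in graph}

# Hypothetical binary graph (an assumption for illustration).
graph = {"a": ["b", "e"], "b": ["c", "d"], "e": ["f", "g"],
         "c": [], "d": [], "f": [], "g": []}

th_blue = reach_thresholds(graph, {"c", "d", "g"})
th_red = reach_thresholds(graph, {"d", "f"})
th_not_blue = reach_thresholds(graph, {"f"})    # opponent reaches the non-blue sink

print(th_blue["a"], th_red["a"], th_not_blue["a"])  # 1/4 1/2 3/4
```

Since all sinks here are either targets or non-targets, the opponent's safety objective amounts to reaching the non-target sinks, and the two thresholds at a sum to 1 as the theorem states.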

For infinite-horizon objectives, like parity, it is known that eventually one of the BSCCs will be reached, and inside every BSCC every vertex can be reached by both players infinitely often with any positive initial budget. This implies that for every parity objective, the threshold of every vertex inside every BSCC in a game graph is either 0 or 1, and fulfilling a given parity objective is equivalent to reaching a BSCC whose every vertex has threshold 0. We state this formally.

Theorem 2

([10]). Consider a bidding game \(\left\langle \mathcal {G},\varPhi \right\rangle \) with a parity objective \(\varPhi \). Let S be a BSCC of \(\mathcal {G}\). Every vertex in S has threshold either 0 or 1, and it is 1 iff the highest parity index in S is odd. Moreover, for a vertex v not in a BSCC, we have \( Th _\varPhi ^\mathcal {G}(v)= Th _{ Reach _\mathcal {G}(T)}^\mathcal {G}(v)\), where T is the union of BSCCs whose vertices have threshold 0.
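The case distinction of the theorem is easy to sketch once the BSCCs are known. The snippet below assumes the BSCCs and the colouring are given as plain Python collections; the function names and the example data are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set

def bscc_threshold(bscc: FrozenSet[str], col: Dict[str, int]) -> int:
    """Threshold shared by all vertices of a BSCC: 1 iff its highest colour is odd."""
    return 1 if max(col[v] for v in bscc) % 2 == 1 else 0

def reach_target(bsccs: Iterable[FrozenSet[str]], col: Dict[str, int]) -> Set[str]:
    """Union T of the BSCCs with threshold 0; outside BSCCs, parity reduces to Reach(T)."""
    return set().union(*(s for s in bsccs if bscc_threshold(s, col) == 0))

# Hypothetical data: two BSCCs with highest colours 2 (even) and 3 (odd).
bsccs = [frozenset({"p", "q"}), frozenset({"r"})]
col = {"p": 1, "q": 2, "r": 3}

print([bscc_threshold(s, col) for s in bsccs])  # [0, 1]
print(reach_target(bsccs, col))
```

With T in hand, thresholds of the remaining vertices can be obtained from any procedure for reachability bidding games.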

5 Strong Decentralized Synthesis

We study the strong decentralized synthesis problem, where the goal is to synthesize two compatible robust tenders, i.e., tenders that guarantee the fulfillment of their objectives when composed with any compatible tender.

Definition 5 (Robust tenders)

Let \(\mathcal {G}\) be a graph and \(\varPhi _i\) be an objective in \(\mathcal {G}\). A tender \(\tau _i\) is robust for \(\varPhi _i\) if for every other compatible tender \(\tau _{-i}\in \mathcal {T}^\mathcal {G}\), we have \(\tau _i\!\bowtie \! \tau _{-i}\models \varPhi _i\).

Problem 1

(\(\mathsf {STRONG\hbox {-}SYNT}\)). Define \(\mathsf {STRONG\hbox {-}SYNT}\) as the problem whose input is a tuple \(\left\langle \mathcal {G},\varPhi _1,\varPhi _2 \right\rangle \), where \(\mathcal {G}\) is a graph and \(\varPhi _1\) and \(\varPhi _2\) are overlapping \(\omega \)-regular objectives in \(\mathcal {G}\), and the goal is to decide whether there exists a pair of tenders \(\tau _1,\tau _2 \in \mathcal {T}^\mathcal {G}\) such that:

  (I) \(\tau _1\) and \(\tau _2\) are compatible,

  (II) \(\tau _1\) is robust for \(\varPhi _1\), and

  (III) \(\tau _2\) is robust for \(\varPhi _2\).

Since each robust tender \(\tau _i\) guarantees that \(\varPhi _i\) is satisfied when composed with any compatible tender, the composition of two compatible robust tenders satisfies both objectives:

Proposition 1 (Sound composition of robust tenders)

Let \(\tau _1\) and \(\tau _2\) be two compatible robust tenders for \(\left\langle \mathcal {G},\varPhi _1,\varPhi _2 \right\rangle \). Then \(\tau _1\!\bowtie \! \tau _2\models \varPhi _1\cap \varPhi _2\).

We reduce the strong decentralized synthesis problem to the solution of two independent bidding games, both played on the graph \(\mathcal {G}\), one with \( Player ~X \) ’s objective \(\varPhi _1\) and the other one with \( Player ~X \) ’s objective \(\varPhi _2\). When the sum of thresholds in \(v^{0}\) is less than 1, we set the two tenders to be winning \( Player ~X \) strategies in the two games with the threshold budgets of the tenders being set as the respective thresholds in \(v^0\). It follows from the construction that both tenders are robust, and hence their composition will fulfill both objectives (Prop. 1).

Theorem 3 (Strong decentralized synthesis)

Let \(\mathcal {G}=\left\langle V,v^{0},E \right\rangle \) be a graph and \(\varPhi _1\) and \(\varPhi _2\) be a pair of overlapping \(\omega \)-regular objectives. A pair of robust tenders exists iff \( Th _{\varPhi _1}^{\mathcal {G}}(v^{0}) + Th _{\varPhi _2}^{\mathcal {G}}(v^{0}) < 1\). Moreover, \(\mathsf {STRONG\hbox {-}SYNT}\) is in NP \(\cap \) coNP in general and is in PTIME for binary graphs.

Proof

First, assume that \( Th _{\varPhi _1}^{\mathcal {G}}(v^{0}) + Th _{\varPhi _2}^{\mathcal {G}}(v^{0}) < 1\). For \(i \in \left\{ 1,2 \right\} \), let \(\langle \alpha _i, \beta _i \rangle \) denote a winning \( Player ~X \) strategy in the bidding game \(\langle \mathcal {G}, \varPhi _i \rangle \) from every configuration \(\langle v^{0}, B \rangle \) with \(B> Th _{\varPhi _i}^{\mathcal {G}}(v^{0})\). We argue that the tender \(\tau _1 = \langle \alpha _1, \beta _1, Th _{\varPhi _1}^{\mathcal {G}}(v^{0}) \rangle \) is robust for \(\varPhi _1\), and the proof for \(\tau _2\) is dual. Indeed, for any compatible tender \(\tau '_{2} = \langle \alpha '_{2}, \beta '_{2}, \mathbb {B}'_{2} \rangle \), the pair \(\langle \alpha '_{2}, \beta '_{2} \rangle \) corresponds to a \( Player ~Y \) strategy in the bidding game \(\langle \mathcal {G}, \varPhi _1 \rangle \). The resulting play coincides with \(out(\tau _1\!\bowtie \! \tau '_2)(\langle v^{0}, B \rangle )\) and satisfies \(\varPhi _1\) since the strategy \(\langle \alpha _1, \beta _1 \rangle \) is winning.

Second, suppose that \( Th _{\varPhi _1}^{\mathcal {G}}(v^{0}) + Th _{\varPhi _2}^{\mathcal {G}}(v^{0}) \ge 1\). For any allocation \(\mathbb {B}_1 + \mathbb {B}_2 < 1\), there is an \(i \in \left\{ 1,2 \right\} \) such that \(\mathbb {B}_i \le Th _{\varPhi _i}^{\mathcal {G}}(v^{0})\). Assume without loss of generality that \(\mathbb {B}_1 \le Th _{\varPhi _1}^{\mathcal {G}}(v^{0})\). Consider a winning \( Player ~Y \) strategy \(\langle \alpha _2, \beta _2 \rangle \) in the bidding game \(\langle \mathcal {G}, \varPhi _1 \rangle \) from \(\langle v^{0}, \mathbb {B}_1 \rangle \). The tender \(\tau '_{2} = \langle \alpha _2, \beta _2, 1-\mathbb {B}_1 \rangle \) is compatible and \(out(\tau _1\!\bowtie \! \tau '_2(\langle v^{0}, \mathbb {B}_1 \rangle ))\) violates \(\varPhi _1\).

Finally, in order to obtain the complexity bounds, we guess memoryless action policies in both games, which are known to exist [37], and verify that they are optimal. Based on the guess, we devise a linear program to compute the thresholds. We then verify that the sum of thresholds in \(v^{0}\) is less than 1. For binary graphs, there is no need to guess the action policy in order to find thresholds (Thm. 1).    \(\square \)

We identify a setting where strong decentralized synthesis is always possible. The following theorem follows from the result that threshold budgets in strongly-connected Büchi games containing at least one accepting vertex are 0.

Theorem 4 (Strong decentralized synthesis on SCCs)

Consider a strongly-connected graph \(\mathcal {G}\) and a pair of non-empty Büchi objectives in \(\mathcal {G}\). Then, a pair of robust tenders exists in \(\mathcal {G}\).

We demonstrate the effectiveness of strong synthesis using path planning problems with two reachability objectives. Consider a fixed grid but four different instances of the problem, as shown in Fig. 2. For the first three cases, we successfully obtain pairs of robust tenders whose compositions fulfill both objectives. Moreover, since the blue target remained the same in all cases, we needed to redesign only the red tender, saving us a significant amount of computation.
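The runtime composition of two tenders can be sketched in a few lines of Python. The sketch follows the mechanism described in the introduction: both tenders bid simultaneously, the higher bidder chooses the next vertex and pays its bid to the other tender, and ties are broken randomly. The function `compose`, the encoding of a tender as a pair of callables, and the toy graph are hypothetical and serve only as an illustration.

```python
import random
from fractions import Fraction


def compose(succ, v0, tenders, budget1, steps, rng=None):
    """Simulate `steps` rounds of auction-based scheduling.

    `tenders[i]` is a pair (action, bid) of hypothetical callables:
    action(history) -> successor vertex, bid(history, budget) -> bid.
    Budgets are normalized so that they always sum to 1.
    """
    rng = rng or random.Random(0)
    budgets = [Fraction(budget1), 1 - Fraction(budget1)]
    history, winners = [v0], []
    for _ in range(steps):
        bids = [Fraction(tenders[i][1](history, budgets[i])) for i in range(2)]
        assert all(0 <= bids[i] <= budgets[i] for i in range(2)), "bid exceeds budget"
        if bids[0] == bids[1]:
            w = rng.randrange(2)  # tie: resolved randomly
        else:
            w = 0 if bids[0] > bids[1] else 1
        budgets[w] -= bids[w]  # the winner pays its bid ...
        budgets[1 - w] += bids[w]  # ... to the other tender
        nxt = tenders[w][0](history)
        assert nxt in succ[history[-1]], "action must follow an edge"
        history.append(nxt)
        winners.append(w)
    return history, winners, budgets


# Toy graph: from "s" one can move to the sink "l" or to the sink "r".
succ = {"s": ["l", "r"], "l": ["l"], "r": ["r"]}


def prefer(goal):
    # Move to `goal` if it is a successor, otherwise take the first edge.
    return lambda history: goal if goal in succ[history[-1]] else succ[history[-1]][0]


left = (prefer("l"), lambda history, budget: budget)  # bids its whole budget
right = (prefer("r"), lambda history, budget: budget / 2)  # bids half its budget

history, winners, budgets = compose(succ, "s", (left, right), Fraction(3, 4), steps=2)
```

Exact rational arithmetic is used so that the invariant "the budgets sum to 1" can be checked without rounding noise.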

Fig. 2.

Robust tenders for path planning with two reachability objectives on a one-way grid, where the black cells are obstacles and the only permissible moves are from lighter to darker green cells—and not the other way round. The cell B8 is the initial location. The cells with double circles of colors red (respectively, G7, E5, E3, C3) and blue (G1) are the targets to reach. The path shows the output of the composition of the two tenders, where the red and blue segments are actions which were chosen by the red and blue tenders, respectively. The cells with red and blue squares are locations where the respective tenders win the bidding; in the rest of the cells on the paths, the bidding ended in ties which were resolved randomly. Strong synthesis was successful in the first three instances and failed in the last one. The pairs of thresholds of red and blue targets are, respectively (left to right): (0.75, 0.125), (0.625, 0.125), (0.75, 0.125), (0.875, 0.125).
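The outcome reported in the caption can be checked against the threshold condition used in the proof above: a pair of robust tenders exists exactly when the two thresholds at the initial vertex sum to less than 1. The following Python snippet is a small arithmetic check; the variable and function names are hypothetical, and the numbers are those listed in the caption.

```python
from fractions import Fraction

# (red threshold, blue threshold) for the four instances, left to right.
thresholds = [("0.75", "0.125"), ("0.625", "0.125"), ("0.75", "0.125"), ("0.875", "0.125")]


def robust_pair_exists(th_red, th_blue):
    # Strong synthesis succeeds iff the thresholds sum to strictly less than 1.
    return Fraction(th_red) + Fraction(th_blue) < 1


sums = [Fraction(r) + Fraction(b) for r, b in thresholds]
verdicts = [robust_pair_exists(r, b) for r, b in thresholds]
```

The sums are 7/8, 3/4, 7/8, and 1, so only the last instance violates the strict inequality, which matches the reported failure.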

6 Assume-Admissible Decentralized Synthesis

In assume-admissible decentralized synthesis, each tender assumes that the other tender is rational and pursues its own objective. We formalize rationality by adapting the well-known concepts of dominance and admissibility from game theory [1, 22]. Intuitively, \(\tau _i\) dominates \(\tau '_i\) if \(\tau _i\) is always at least as good as \(\tau '_i\) and sometimes strictly better than \(\tau '_i\); therefore, there is no reason to use \(\tau _i'\). An admissible tender is one that is not dominated by any other tender.

Definition 6 (Dominance, admissibility)

Let \(\mathcal {G}\) be a graph and \(\varPhi \) be an objective. We provide definitions for the first tender and the definitions for the second tender are dual. Let \(\mathbb {B}_1 < 1\). For two tenders \(\tau _1\) and \(\tau _1'\) that have the same budget allocation, \(\tau _1\) dominates \(\tau '_1\) when

  1. (a)

    \(\tau _1\) performs as well as \(\tau '_1\) when composed with any compatible \(\tau _2\); formally, for every compatible tender \(\tau _2\), \(\tau _1'\!\bowtie \! \tau _2 \models \varPhi \) implies \(\tau _1\!\bowtie \! \tau _2 \models \varPhi \), and

  2. (b)

    there is a compatible tender \(\tau _2\) for which \(\tau _1\) performs better than \(\tau '_1\); formally, there exists a compatible \(\tau _2\) with \(\tau _1\!\bowtie \! \tau _2 \models \varPhi \), and \(\tau _1'\!\bowtie \! \tau _2 \not \models \varPhi \).

A tender \(\tau _1\) is called \(\varPhi \)-admissible in \(\mathcal {G}\) iff it is not dominated by any other tender. We denote the set of \(\varPhi \)-admissible tenders in \(\mathcal {G}\) by \(\textit{Adm}{^{\mathcal {G}}(\varPhi )}\).

Next, we define admissible-winning tenders, which are tenders that fulfill their objectives when composed with any admissible tender.

Definition 7 (Admissible-winning tenders)

Let \(\mathcal {G}\) be a graph and \(\varPhi _1,\varPhi _2\) be a pair of overlapping objectives in \(\mathcal {G}\). A tender \(\tau _i\) is called \(\varPhi _{-i}\)-admissible-winning for \(\varPhi _i\) if and only if \(\tau _i\in \textit{Adm}{^{\mathcal {G}}(\varPhi _i)}\), and for every other tender \(\tau _{-i}\in \textit{Adm}{^{\mathcal {G}}(\varPhi _{-i})}\) compatible with \(\tau _i\), we have \(\tau _i\!\bowtie \! \tau _{-i}\models \varPhi _i\).

When the objectives are clear from the context, we omit them and simply say that a tender is “admissible,” “admissible-winning,” etc.

Problem 2

(\(\mathsf {AA\hbox {-}SYNT}\)). Define \(\mathsf {AA\hbox {-}SYNT}\) as the problem whose input is a tuple \(\left\langle \mathcal {G},\varPhi _1,\varPhi _2 \right\rangle \), where \(\mathcal {G}\) is a graph and \(\varPhi _1\) and \(\varPhi _2\) are overlapping \(\omega \)-regular objectives in \(\mathcal {G}\), and the goal is to decide whether there exists a pair of tenders \(\tau _1 \in \textit{Adm}^\mathcal {G}(\varPhi _1)\) and \(\tau _2 \in \textit{Adm}^\mathcal {G}(\varPhi _2)\) such that:

  1. (I)

    \(\tau _1\) and \(\tau _2\) are compatible,

  2. (II)

    \(\tau _1\) is \(\varPhi _2\)-admissible-winning for \(\varPhi _1\), and

  3. (III)

    \(\tau _2\) is \(\varPhi _1\)-admissible-winning for \(\varPhi _2\).

The following proposition follows from the requirement that \(\tau _1\) and \(\tau _2\) are admissible.

Proposition 2 (Sound composition of admissible-winning tenders)

Let \(\tau _1\) and \(\tau _2\) be tenders that fulfill the requirements stated in Prob. 2. Then, \(\tau _1\!\bowtie \! \tau _2\models \varPhi _1\cap \varPhi _2\).

Remark 1

Note that the synthesis procedure for each \(\varPhi _{-i}\)-admissible-winning tender \(\tau _i\) for \(\varPhi _i\) requires the knowledge of \(\textit{Adm}{^{\mathcal {G}}(\varPhi _{-i})}\). Assume-admissible decentralized synthesis is modular in the following sense. First, the specific implementation of the tender \(\tau _{-i}\) with which each \(\tau _i\) is composed is not known during synthesis. All that is known is the objective \(\varPhi _{-i}\) for which \(\tau _{-i}\) is synthesized. Second, each \(\tau _i\) can remain unchanged even when \(\varPhi _{-i}\) changes to \(\varPhi _{-i}'\), as long as \(\textit{Adm}{^{\mathcal {G}}(\varPhi _{-i}')} \subseteq \textit{Adm}{^{\mathcal {G}}(\varPhi _{-i})}\).

6.1 Reachability objectives

Throughout this section we focus on overlapping reachability objectives \(\varPhi _1 = Reach _\mathcal {G}(T_1)\) and \(\varPhi _2 = Reach _\mathcal {G}(T_2)\) with \(T_1,T_2\subseteq V\) being sets of sink target vertices. This is without loss of generality, as every graph with non-sink target vertices can be converted into a graph with sink target vertices by adding memory (see the full version [17]).

We reduce the decentralized assume-admissible synthesis problem to solving a pair of zero-sum bidding games on a sub-graph of \(\mathcal {G}\). Intuitively, an edge \(e = \langle u,v \rangle \) is dominated for the i-th tender, for \(i \in \left\{ 1,2 \right\} \), if it is possible to achieve the objective \(\varPhi _i\) from u but not from v. Clearly, a tender that chooses e is dominated and is thus not admissible (see the proof of the lemma in the full version [17]). Recall that \( Th _{\varPhi _i}^{\mathcal {G}}(v)\) denotes the threshold in the zero-sum bidding game played on \(\mathcal {G}\) with the \( Player ~X \) objective \(\varPhi _i\), and that \( Th _{\varPhi _i}^\mathcal {G}(v)=1\) means there is no path from v to \(T_i\).

Lemma 1 (A necessary condition for admissibility)

For every vertex u having at least two successors v, w with \( Th _{\varPhi _i}^\mathcal {G}(v)< 1\) and \( Th _{\varPhi _i}^\mathcal {G}(w)=1\), if a \( Player ~i \) tender \(\left\langle \alpha _i,\beta _i, \mathbb {B}_i \right\rangle \) is in \( \textit{Adm}{^{\mathcal {G}}(\varPhi _i)}\), then \(\alpha _i(u)\ne w\), for both \(i\in \left\{ 1,2 \right\} \).

Proof

We argue that choosing w from u is dominated by choosing v from u, no matter what the budget at u is. Firstly, Cond. (a) of Def. 6 trivially holds. Secondly, consider the other tender \(\tau _{-i}\) which bids zero at u, and later cooperates with \(\tau _i\) to satisfy \(\varPhi _i\). Clearly, an action policy of \(\tau _i\) that selects v at u will be able to satisfy \(\varPhi _i\), but one that selects w will not.    \(\square \)

We obtain the reduced graph by omitting edges that are dominated for both players. For example, in Fig. 1b, the edge \(\langle b,c \rangle \) is dominated for both players (see Ex. 3) and in Fig. 1c, no edge is dominated for both players (see Ex. 4).
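The notion of a dominated edge can be made executable for reachability objectives. The following Python sketch relies on the observation recalled above that a threshold of 1 means that no path to the target set exists, and additionally assumes the converse (a vertex with a path to the target has threshold below 1). The function names and the toy graph are hypothetical.

```python
def can_reach(succ, targets):
    """Set of vertices from which some path leads into `targets`."""
    ok = set(targets)
    changed = True
    while changed:
        changed = False
        for v, ws in succ.items():
            if v not in ok and any(w in ok for w in ws):
                ok.add(v)
                changed = True
    return ok


def dominated_edges(succ, targets):
    """Edges (u, w) where the objective is achievable from u but not from w."""
    ok = can_reach(succ, targets)
    return {
        (u, w)
        for u, ws in succ.items()
        for w in ws
        if w not in ok and any(v in ok for v in ws)
    }


# Toy graph with sink targets "t1", "t2" and a dead-end sink "d".
succ = {
    "s": ["a", "x", "d"],
    "a": ["t1"],
    "x": ["t2"],
    "t1": ["t1"],
    "t2": ["t2"],
    "d": ["d"],
}
dom1 = dominated_edges(succ, {"t1"})
dom2 = dominated_edges(succ, {"t2"})
dominated_for_both = dom1 & dom2
```

Only the edge into the dead end is dominated for both tenders in this instance, so it is the only edge that the reduced graph omits.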

Definition 8 (Largest admissible sub-graphs for reachability)

The largest admissible sub-graph of \(\mathcal {G}\) with respect to two reachability objectives \(\varPhi _1\) and \(\varPhi _2\) is \(\widehat{\mathcal {G}}_{{\varPhi _1},{\varPhi _2}}=\left\langle V',E' \right\rangle \) with \(V' = V\setminus \left\{ v\in V\mid Th _{\varPhi _1}^\mathcal {G}(v)=1 \wedge Th _{\varPhi _2}^\mathcal {G}(v)=1 \right\} \) and \(E' = (V'\times V') \cap E\). We omit \(\varPhi _1, \varPhi _2\) from \(\widehat{\mathcal {G}}_{{\varPhi _1},{\varPhi _2}}\) when it is clear from the context.
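Def. 8 translates directly into a graph computation. The sketch below, in Python, again assumes that a threshold equals 1 exactly when no path to the respective target set exists, so that the removed vertices are those that can reach neither \(T_1\) nor \(T_2\). The function names and the example graph are hypothetical.

```python
def can_reach(succ, targets):
    """Set of vertices from which some path leads into `targets`."""
    ok = set(targets)
    changed = True
    while changed:
        changed = False
        for v, ws in succ.items():
            if v not in ok and any(w in ok for w in ws):
                ok.add(v)
                changed = True
    return ok


def largest_admissible_subgraph(succ, t1, t2):
    """Drop every vertex that can reach neither t1 nor t2, and all incident edges."""
    keep = can_reach(succ, t1) | can_reach(succ, t2)
    return {v: [w for w in ws if w in keep] for v, ws in succ.items() if v in keep}


# Toy graph: "x" is a dead end from which neither target is reachable.
succ = {"s": ["a", "x"], "a": ["t1"], "x": ["x"], "t1": ["t1"], "t2": ["t2"]}
sub = largest_admissible_subgraph(succ, {"t1"}, {"t2"})
```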

For a vertex v in \(\mathcal {G}\) and \(i \in \left\{ 1,2 \right\} \), recall that \( Th _{\varPhi _i}^{\mathcal {G}}(v)\) denotes the threshold in \(\mathcal {G}\) for objective \(\varPhi _i\), and let \( Th _{\varPhi _i}^{\widehat{\mathcal {G}}}(v)\) denote the threshold in the reduced graph. Observe that a winning strategy in \(\mathcal {G}\) will never cross a dominated edge. Removing dominated edges restricts the opponent, thus \( Th _{\varPhi _i}^{\mathcal {G}}(v) \ge Th _{\varPhi _i}^{\widehat{\mathcal {G}}}(v)\). The next lemma shows that, surprisingly, the decrease in the sum of thresholds is guaranteed to be significant. The proof (see the full version [17]), which holds for non-binary graphs, intuitively follows from observing that in \(\widehat{\mathcal {G}}\), a sink that is a target for one of the players is necessarily reached, and since there is an overlap in at least one sink, the sum of thresholds is at most 1.

Lemma 2

(On the sum of thresholds in \(\widehat{\mathcal {G}}\)). For every vertex v, we have \( Th _{\varPhi _1}^{\widehat{\mathcal {G}}}(v) + Th _{\varPhi _2}^{\widehat{\mathcal {G}}}(v) \le 1\). Moreover, if \(\mathcal {G}\) is binary then \( Th _{\varPhi _1}^{\widehat{\mathcal {G}}}(v) + Th _{\varPhi _2}^{\widehat{\mathcal {G}}}(v) < 1\).

Our synthesis procedure proceeds as in strong decentralized synthesis: find and output a pair of robust tenders in \(\widehat{\mathcal {G}}\), which are guaranteed to exist when \(\mathcal {G}\) is binary. In order to maintain soundness (see Prop. 2), it is key to show that a robust tender \(\tau _i\) in \(\widehat{\mathcal {G}}\) is admissible in \(\mathcal {G}\). The proof of the following lemma is intricate (see the full version [17]). We show that even when one can find \(\tau '_i\) and \(\tau _{-i}\) such that \(\tau _i\!\bowtie \! \tau _{-i} \not \models \varPhi _i\) but \(\tau _i'\!\bowtie \! \tau _{-i} \models \varPhi _i\), it is possible to construct \(\tau _{-i}'\) for which \(\tau _i\!\bowtie \! \tau _{-i}' \models \varPhi _i\) but \(\tau _i'\!\bowtie \! \tau _{-i}' \not \models \varPhi _i\), thus \(\tau '_i\) does not dominate \(\tau _i\). Furthermore, such a tender wins against a set of tenders that over-approximates the set of admissible tenders for \(\varPhi _{-i}\).

Lemma 3 (Algorithm for computing admissible-winning tenders)

For \(i \in \left\{ 1,2 \right\} \), a robust tender for \(\varPhi _i\) in \(\widehat{\mathcal {G}}\) is \(\varPhi _{-i}\)-admissible-winning for \(\varPhi _i\) in \(\mathcal {G}\).

The following theorem is obtained by combining Lemmas 2 and 3.

Theorem 5 (Assume-admissible decentralized synthesis for reachability)

The problem \(\mathsf {AA\hbox {-}SYNT}\) is a tautology for binary graphs: for every binary graph and two overlapping reachability objectives, there exists a pair of compatible admissible-winning tenders. Moreover, the tenders can be computed in PTIME.

Remark 2

For general (i.e., non-binary) graphs, \(\mathsf {AA\hbox {-}SYNT}\) is not a tautology anymore; a counter-example is given in Ex. 4. However, the same PTIME algorithm for computing tenders can still be used to obtain a sound solution; the completeness question is left open for future work.

6.2 Büchi objectives

In this section, we consider binary graphs with a pair of overlapping Büchi objectives. We first demonstrate that, unlike reachability, it is not guaranteed that an assume-admissible decentralized solution exists.

Fig. 3.

A graph with no assume-admissible decentralized solution.

Example 5

Consider the graph depicted in Fig. 3 with the Büchi objectives given by the accepting vertices \(S_\textsf{red}= \left\{ b,d \right\} \) and \(S_\textsf{blue}= \left\{ a,c \right\} \). Note that the objectives are overlapping since the path \((bc)^\omega \) satisfies both. We argue that no pair of compatible admissible-winning tenders exists. Note that a robust (hence dominant) red tender forces reaching d, thus forcing \(\varPhi _\textsf{red}\) to be satisfied. Dually, a robust blue tender forces \(\varPhi _\textsf{blue}\) in a. It can be shown that the corresponding thresholds at the initial vertex sum to 1. Thus, for any \(\mathbb {B}_\textsf{red}\) and \(\mathbb {B}_\textsf{blue}\) with \(\mathbb {B}_\textsf{red}+\mathbb {B}_\textsf{blue}<1\), there is a robust tender that violates the other tender’s objective.    \(\triangle \)

We generalize the concept of largest admissible subgraphs to Büchi objectives. It is not hard to show that proceeding into a BSCC with an accepting state is admissible. Indeed, Thm. 2 shows that there is a robust (hence admissible) tender in such BSCCs. On the other hand, proceeding to a BSCC with no accepting vertex is clearly not admissible. The largest admissible subgraph is obtained by repeatedly removing BSCCs that are not admissible for both tenders. Formally, for a given action policy \(\alpha \) and a given vertex v of \(\mathcal {G}\), we will write \(\alpha \not \models _v \varPhi _1\cup \varPhi _2\) to indicate that the action policy cannot fulfill \(\varPhi _1\cup \varPhi _2\) from the initial vertex v.

Definition 9 (Largest admissible sub-graphs for Büchi)

The largest admissible sub-graph \(\widehat{\mathcal {G}}_\mathcal {B}\) of \(\mathcal {G}\) for the Büchi objectives \(\varPhi _1,\varPhi _2\) is the graph \(\left\langle V',E' \right\rangle \) with \(V' = V\setminus \left\{ v\in V\mid \forall \text { action policy } \alpha \;.\;\alpha \not \models _{v}\varPhi _1\cup \varPhi _2 \right\} \), and \(E' = (V'\times V') \cap E\).
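Def. 9 can be read as a purely graph-theoretic condition: in a finite graph, some action policy fulfills \(\varPhi _1\cup \varPhi _2\) from v exactly when v can reach an accepting vertex (of either objective) that lies on a cycle. The Python sketch below implements this reading directly rather than the iterative BSCC removal described above; the reading itself, the function names, and the toy graph are our assumptions for illustration.

```python
def reachable_from(succ, v):
    """Vertices reachable from v, including v itself."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def largest_admissible_subgraph_buchi(succ, acc1, acc2):
    acc = set(acc1) | set(acc2)
    # Accepting vertices that can be revisited, i.e., that lie on a cycle.
    recurrent = {a for a in acc if any(a in reachable_from(succ, w) for w in succ[a])}
    # Keep exactly the vertices from which such a vertex is reachable.
    keep = {v for v in succ if reachable_from(succ, v) & recurrent}
    return {v: [w for w in ws if w in keep] for v, ws in succ.items() if v in keep}


# Toy graph: "a" is accepting and on the cycle a-b; "x" is accepting but
# can be visited at most once, because it leads into the non-accepting sink "y".
succ = {"s": ["a", "x"], "a": ["b"], "b": ["a"], "x": ["y"], "y": ["y"]}
sub = largest_admissible_subgraph_buchi(succ, {"a"}, {"x"})
```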

We describe a reduction to reachability games. For \(i \in \left\{ 1,2 \right\} \), let \(T_i\) denote the union of BSCCs of \(\widehat{\mathcal {G}}_\mathcal {B}\) in which there is at least one Büchi accepting vertex for \(\varPhi _i\). We call \(\left\langle T_1,T_2 \right\rangle \) the reachability core of \(\left\langle \varPhi _1,\varPhi _2 \right\rangle \) in \(\widehat{\mathcal {G}}_\mathcal {B}\). Let \(\varPhi _1' = Reach _{\widehat{\mathcal {G}}_\mathcal {B}}(T_1)\) and \(\varPhi _2' = Reach _{\widehat{\mathcal {G}}_\mathcal {B}}(T_2)\). We proceed as in strong decentralized synthesis: we find \( Th _{\varPhi _1'}^{\widehat{\mathcal {G}}_\mathcal {B}}(v^{0})\) and \( Th _{\varPhi _2'}^{\widehat{\mathcal {G}}_\mathcal {B}}(v^{0})\) and return a pair of robust tenders if their sum is strictly less than 1. Note that unlike reachability objectives, in Büchi objectives the sum might be 1 as in Ex. 5. Moreover, as Ex. 5 illustrates, when the sum is 1, no pair of admissible-winning tenders exists. By adapting results from the previous section, we obtain the following.
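The reachability core can be computed with elementary graph search. The following Python sketch assumes that the input is already the largest admissible sub-graph and that every vertex has at least one successor, so that every bottom SCC contains a cycle. It uses a simple quadratic reachability test instead of a linear-time SCC algorithm; the function names and the toy graph are hypothetical.

```python
def reachable_from(succ, v):
    """Vertices reachable from v, including v itself."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def bottom_sccs(succ):
    reach = {v: reachable_from(succ, v) for v in succ}
    bsccs = []
    for v in succ:
        # v lies in a bottom SCC iff everything reachable from v can reach v back.
        if all(v in reach[u] for u in reach[v]):
            comp = frozenset(reach[v])
            if comp not in bsccs:
                bsccs.append(comp)
    return bsccs


def reachability_core(succ, acc1, acc2):
    """Return (T1, T2): unions of the BSCCs containing an accepting vertex."""
    comps = bottom_sccs(succ)
    t1 = set().union(*[c for c in comps if c & set(acc1)])
    t2 = set().union(*[c for c in comps if c & set(acc2)])
    return t1, t2


# Toy graph with two bottom SCCs: the cycle p-p2 and the self-loop q.
succ = {"s": ["p", "q"], "p": ["p2"], "p2": ["p"], "q": ["q"]}
t1, t2 = reachability_core(succ, acc1={"p", "q"}, acc2={"p2"})
```

In this instance the two target sets share the BSCC formed by p and p2, so the reachability objectives \(\varPhi _1'\) and \(\varPhi _2'\) overlap.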

Theorem 6 (Assume-admissible decentralized synthesis for Büchi)

Let \(\mathcal {G}\) be a binary graph and \(\varPhi _1,\varPhi _2\) be a pair of overlapping Büchi objectives. Let \(\left\langle T_1,T_2 \right\rangle \) be the reachability core of \(\left\langle \varPhi _1,\varPhi _2 \right\rangle \) in the largest admissible sub-graph of \(\mathcal {G}\) (for \(\varPhi _1,\varPhi _2\)). A pair of admissible-winning tenders exists iff \(T_1\cap T_2\ne \emptyset \). Moreover, \(\mathsf {AA\hbox {-}SYNT}\) for Büchi objectives is in PTIME.

Like reachability (see Rem. 2), for Büchi objectives the same algorithm for \(\mathsf {AA\hbox {-}SYNT}\) for binary graphs can be used to obtain a sound solution for general graphs, and the completeness question is left open for future work.

7 Assume-Guarantee Decentralized Synthesis

We present the assume-guarantee decentralized synthesis problem, the one with the highest degree of synchronization among the tenders, with the benefit of the widest applicability. In this synthesis procedure, we assume that we are given a pair of languages \(A_1,A_2\subseteq V^\omega \), called the assumptions. Intuitively, each tender \(\tau _i\) can assume \(A_i\) is fulfilled by the other tender, and, in return, needs to guarantee that \(A_{-i}\) is fulfilled, in addition to fulfilling its own objective.

Definition 10 (Contract-abiding tenders)

Let \(\mathcal {G}\) be a graph, \(\varPhi _i\) be an \(\omega \)-regular objective, and \(A_1,A_2\) be a pair of \(\omega \)-regular assumptions in \(\mathcal {G}\). We say a tender \(\tau _i=\left\langle \alpha _i,\cdot ,\cdot \right\rangle \in \mathcal {T}^\mathcal {G}\) fulfills \(\varPhi _i\) under the contract \(\left\langle A_1,A_2 \right\rangle \), written \(\tau _i\models \langle A_i \triangleright \varPhi _i \triangleright A_{-i}\rangle \), iff

  1. (a)

    for every finite path \(\rho \), if \(\rho \) is in \(\textsf{pref}(A_i)\), then \(\rho \cdot \alpha _i(\rho )\in \textsf{pref}(A_1\cap A_2)\), and

  2. (b)

    for every other compatible tender \(\tau _{-i}\in \mathcal {T}^\mathcal {G}\), we have \(\tau _i\!\bowtie \! \tau _{-i}\models Safe _\mathcal {G}(\textsf{pref}(A_1\cap A_2))\implies \left( A_{-i}\wedge \left( A_i\implies \varPhi _i \right) \right) \).

Here, each tender \(\tau _i\) only makes a safety assumption on the other tender (Cond. (a)), namely that the path does not leave the safe set \(\textsf{pref}(A_i)\), and in return, provides a full guarantee on \(A_{-i}\) (Cond. (b)). Normally, safety assumptions are not enough for fulfilling liveness guarantees and objectives [5]. But in bidding games, within the safe set, the players can use a known bidding tactic [9] to accumulate enough budget from time to time to reach the liveness goals always eventually. We use \(A_1,A_2\) as \(\omega \)-regular sets, though we conjecture that safety assumptions suffice. The assume-guarantee decentralized synthesis problem asks to compute a pair of tenders that fulfill their objectives under the given contract, as stated below.

Problem 3

(\(\mathsf {AG\hbox {-}SYNT}\)). Define \(\mathsf {AG\hbox {-}SYNT}\) as the problem that takes as input a tuple \(\left\langle \mathcal {G},\varPhi _1,\varPhi _2,A_1,A_2 \right\rangle \), where \(\mathcal {G}\) is a graph, \(\varPhi _1\) and \(\varPhi _2\) are overlapping objectives in \(\mathcal {G}\), and \(A_1\) and \(A_2\) are two \(\omega \)-regular languages over \(V\) with \(v^{0}\in \textsf{pref}(A_1\cap A_2)\), and the goal is to decide whether there exists a pair of tenders \(\tau _1 ,\tau _2 \in \mathcal {T}^\mathcal {G}\) such that:

  1. (I)

    \(\tau _1\) and \(\tau _2\) are compatible,

  2. (II)

    \(\tau _1\models \langle A_1 \triangleright \varPhi _1 \triangleright A_2\rangle \), and

  3. (III)

    \(\tau _2\models \langle A_2 \triangleright \varPhi _2 \triangleright A_1\rangle \).

When the assumptions allow all behaviors, i.e., \(A_1 =A_2 = V^\omega \), then \(\mathsf {AG\hbox {-}SYNT}\) is equivalent to \(\mathsf {STRONG\hbox {-}SYNT}\). On the other hand, when the assumptions allow only each other’s objectives, i.e., \(A_1=\varPhi _1\) and \(A_2=\varPhi _2\), then we obtain a purely cooperative synthesis algorithm. We prove that satisfaction of the contracts by a pair of tenders will imply satisfaction of \(\varPhi _1\cap \varPhi _2\).

Proposition 3 (Sound composition of contract-abiding tenders)

Let \(\tau _1\) and \(\tau _2\) be tenders that fulfill the requirements stated in Prob. 3. Then, \(\tau _1\!\bowtie \! \tau _2\models \varPhi _1\cap \varPhi _2\).

Proof

In the following, for a given language \(L\subseteq V^\omega \), we write \( Safe _\mathcal {G}(\textsf{pref}(L))\) to denote the set of infinite paths which can always be extended to L, i.e., \(\left\{ v^{0}v^1\ldots \in Paths ^{\textsf{inf}}(\mathcal {G}) \mid \forall i\ge 0\;.\; v^{0}\ldots v^{i}\in \textsf{pref}(L) \right\} \).

We claim that both assumptions \(A_1,A_2\) will be fulfilled, from which Cond. (b) of Def. 10 will imply satisfaction of both \(\varPhi _1\) and \(\varPhi _2\) by \(\tau _1\) and \(\tau _2\), respectively. Let \(A=A_1\cap A_2\); then \(A\) can be decomposed into safety and liveness components as \(A= Safe _\mathcal {G}(\textsf{pref}(A))\cap \left( Safe _\mathcal {G}(\textsf{pref}(A))\implies A\right) \). We prove the claim on the two components separately. Firstly, the fact that \(\tau _1\!\bowtie \! \tau _2\) implements \(\textsf{pref}(A)\) on \(\mathcal {G}\) can be proven by induction over the length of the generated path: the base case is given by the assumption \(v^{0}\in \textsf{pref}(A_1\cap A_2)\), and for every finite path \(\rho \), if \(\tau _i\) wins the bidding and if \(\rho \in \textsf{pref}(A_1\cap A_2)\subseteq \textsf{pref}(A_{i})\), then \(\tau _i\) needs to ensure that the next vertex \(v'\) satisfies \(\rho v'\in \textsf{pref}(A_i\cap A_{-i})\) (a consequence of Cond. II-III of Prob. 3 and Cond. (a) of Def. 10), thereby implying that the path will always remain inside \(\textsf{pref}(A_1\cap A_2)\), proving the safety part.

For the liveness part, we use known results from Richman bidding games, which guarantee that in an infinite-horizon game, with any arbitrary positive initial budget, players can always eventually visit any vertex that can be reached [37]. This implies that if the invariant \( Safe _\mathcal {G}(\textsf{pref}(A_1\cap A_2))\) holds, then each tender \(\tau _i\) can actually fulfill \(A_{-i}\) (they are required to do so by Cond. (b) of Def. 10) when composed with any compatible tender in the long run. Therefore, \(A_1\cap A_2\) will be fulfilled.    \(\square \)

In the bidding-games literature, it is unknown how to compute strategies for objectives which can be violated if a given assumption is violated by the opponent, like in Cond. (a) in Def. 10. The challenge stems from the lack of separation of the set of available actions to the players, preventing us from imposing assumptions only on the opponent’s behavior. We present a practically motivated sound, but possibly incomplete, solution for the decentralized synthesis problem, by using a stronger way of satisfying the contract, namely by requiring each tender \(\tau _i\) to use actions so that the generated path remains in \(\textsf{pref}(A_1\cap A_2)\) all the time. Formally, we say that the tender \(\tau _i\) strongly fulfills \(\varPhi _i\) under the contract \(\left\langle A_1,A_2 \right\rangle \), written \(\tau _i\models _{s} \langle A_i \triangleright \varPhi _i \triangleright A_{-i}\rangle \), if, instead of Cond. (a) of Def. 10, for every finite path \(\rho \), we have \(\rho \cdot \alpha _i(\rho )\in \textsf{pref}(A_1\cap A_2)\), regardless of whether \(\rho \in \textsf{pref}(A_i)\) or not, and moreover Cond. (b) of Def. 10 is fulfilled. It is easy to show that \(\tau _i\models _s \langle A_i \triangleright \varPhi _i \triangleright A_{-i}\rangle \) implies \(\tau _i\models \langle A_i \triangleright \varPhi _i \triangleright A_{-i}\rangle \), so that \(\varPhi _1\cap \varPhi _2\) will be fulfilled.

Similar to \(\mathsf {AA\hbox {-}SYNT}\), we extract a sub-graph \(\mathcal {G}'\) of \(\mathcal {G}\), called the largest contract-satisfying sub-graph, whose every path belongs to \(\textsf{pref}(A_1\cap A_2)\), and vice versa; we omit the construction, which follows the usual automata-theoretic procedure from the literature [4]. For example, in Ex. 4, the largest contract-satisfying sub-graph of the graph in Fig. 1c is the one that only excludes the vertices c and f. It follows that when the tenders strongly fulfill their objectives under the contracts, it is guaranteed that every path always remains in \(\mathcal {G}'\).

Theorem 7 (Assume-guarantee decentralized synthesis)

Let \(\mathcal {G}=\left\langle V,v^{0},E \right\rangle \) be a graph, \(\varPhi _1\) and \(\varPhi _2\) be a pair of overlapping \(\omega \)-regular objectives, and \(A_1\) and \(A_2\) be \(\omega \)-regular assumptions. Let \(\mathcal {G}'\) be the largest contract-satisfying sub-graph of \(\mathcal {G}\). A pair of robust tenders exists if \( Th _{A_2\cap \varPhi _1}^{\mathcal {G}'}(v^{0}) + Th _{A_1\cap \varPhi _2}^{\mathcal {G}'}(v^{0}) < 1\). Moreover, \(\mathsf {AG\hbox {-}SYNT}\) is in PTIME.
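As a sanity check of the condition in Thm. 7, consider the trivial assumptions \(A_1 = A_2 = V^\omega \). The following short derivation is our own illustration, under the assumption that every finite path of \(\mathcal {G}\) can be extended to an infinite one, so that every finite path lies in \(\textsf{pref}(A_1\cap A_2)\) and hence \(\mathcal {G}' = \mathcal {G}\). Since \(A_2\cap \varPhi _1 = \varPhi _1\) and \(A_1\cap \varPhi _2 = \varPhi _2\), the condition becomes

\[ Th _{A_2\cap \varPhi _1}^{\mathcal {G}'}(v^{0}) + Th _{A_1\cap \varPhi _2}^{\mathcal {G}'}(v^{0}) \;=\; Th _{\varPhi _1}^{\mathcal {G}}(v^{0}) + Th _{\varPhi _2}^{\mathcal {G}}(v^{0}) \;<\; 1, \]

which is the threshold condition under which robust tenders were constructed for strong decentralized synthesis. This is consistent with the observation that \(\mathsf {AG\hbox {-}SYNT}\) coincides with \(\mathsf {STRONG\hbox {-}SYNT}\) when the assumptions allow all behaviors.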

8 Related Work

Shielding [35] is a framework in which a runtime monitor called a shield enforces an unverified policy \(\pi \) (e.g., generated using reinforcement learning [7]) to satisfy a given specification. A shield operates by observing, at each point in time, the action proposed by \(\pi \) and can alter it, e.g., if safety is violated. The choice of who acts at each point in time, \(\pi \) or the shield, can be seen as a scheduling choice similar to our setting. However, the goals of the two approaches are different: our goal is to design tenders for modular policy synthesis, whereas a shield is meant as a verified “wrapper” for a complex policy. Technically, in auction-based scheduling, the scheduling depends on the auction which is external to the policies, whereas in shielding, it is the shield who chooses whether to override \(\pi \).

In distributed reactive synthesis [32, 36, 42], the goal is to design a collection of Mealy machines whose communication is dictated by a given communication architecture. Distributed synthesis is well studied and we point to a number of works that considered objectives that are a conjunction \(\varPhi _1\wedge \varPhi _2\wedge \ldots \) of sub-objectives \(\varPhi _1,\varPhi _2,\ldots \) [6, 18, 21, 30, 31, 39]. While there is a conceptual similarity between our synthesis of tenders and the synthesis of Mealy machines, there is a fundamental difference between the approaches. Namely, our composition is based on scheduling, i.e., exactly one policy is scheduled at each point in time, whereas in distributed synthesis, the composition of the Mealy machines is performed in parallel, i.e., they all read and write at each point in time.

Our algorithms build upon the rich literature on bidding games on graphs. The bidding mechanism that we focus on is called Richman bidding [10, 37, 38]. Other bidding mechanisms have been studied: poorman [11], taxman [13], and all-pay [14, 15]. Auction-based scheduling can be instantiated with any of these mechanisms and the properties from bidding games transfer immediately (which differ significantly for quantitative objectives). Of particular interest in practice is discrete bidding, in which the granularity of the bids is restricted [2, 16, 27]. To the best of our knowledge, beyond our work, non-zero-sum bidding games have only been considered in [40]. The solution concept that they consider is subgame perfect equilibrium (SPE). While it is suitable to model the interaction between selfish agents, we argue that it is less suitable in decentralized synthesis.

There are many works on designing optimal policies for multi-objective sequential decision making problems for various different system models; see the survey by Roijers et al. [43] and works on multi-objective stochastic games [20, 21, 24]. To the best of our knowledge, no prior work considers the decomposition of the problem into individual task-dependent policies as we do. Auctions to distribute tasks to agents have been considered extensively [19, 25, 26, 28, 29, 41, 44]. Their goal is very different: their agents bid for tasks, that is, a bid represents an agent’s cost (e.g., in terms of resources) for performing a task. The auction then allocates the tasks to agents so as to minimize the individual costs, giving rise to an efficient global policy.

9 Conclusion and Future Work

We present the auction-based scheduling framework. Rather than synthesizing a monolithic policy for a conjunction of objectives \(\varPhi _1\wedge \varPhi _2\), we synthesize two independent tenders, one for each objective, and compose the tenders at runtime using auction-based scheduling. A key advantage of the framework is modularity; each tender can be synthesized and modified independently. We study three instantiations of decentralized synthesis in planning problems with varying degrees of flexibility and practical usability, and develop algorithms based on bidding games. Interestingly, we show that a pair of admissible-winning tenders always exists in binary graphs for reachability objectives and that they can be found in PTIME. This positive result illustrates the strength and potential of the auction-based scheduling framework.

There are plenty of directions for future research and we list a handful. First, we consider only qualitative objectives and it is interesting to lift the results to quantitative objectives, where one can quantify the fairness achieved by the scheduling mechanism in a fine-grained manner. Moreover, it is appealing to employ the rich literature on mean-payoff bidding games. Second, we consider a conjunction of two objectives, and it is interesting to extend the approach to a conjunction of multiple objectives. This will require extending the theory of bidding games to the multi-player setting, which has not yet been studied. Finally, it is particularly interesting to extend the technique of auction-based scheduling beyond path-planning problems; for example, one could consider decentralized synthesis of controllers that operate in an adversarial or probabilistic environment. Again, the corresponding bidding games need to be studied (so far only sure winning has been considered for bidding games played on MDPs [12]).