research-article

Public Access

The Price of Anarchy of Strategic Queuing Systems

Authors:

Jason Gaitonde,

Éva TardosAuthors Info & Claims

Journal of the ACM, Volume 70, Issue 3

Article No.: 20, Pages 1 - 63

https://doi.org/10.1145/3587250

Published: 23 May 2023 Publication History

PDF eReader

Abstract

Bounding the price of anarchy, which quantifies the damage to social welfare due to selfish behavior of the participants, has been an important area of research in algorithmic game theory. Classical work on such bounds in repeated games makes the strong assumption that the subsequent rounds of the repeated games are independent beyond any influence on play from past history. This work studies such bounds in environments that themselves change due to the actions of the agents. Concretely, we consider this problem in discrete-time queuing systems, where competitive queues try to get their packets served. In this model, a queue gets to send a packet at each step to one of the servers, which will attempt to serve the oldest arriving packet, and unprocessed packets are returned to each queue. We model this as a repeated game where queues compete for the capacity of the servers, but where the state of the game evolves as the length of each queue varies.

We analyze this queuing system from multiple perspectives. As a baseline measure, we first establish precise conditions on the queuing arrival rates and service capacities that ensure all packets clear efficiently under centralized coordination. We then show that if queues strategically choose servers according to independent and stationary distributions, the system remains stable provided it would be stable under coordination with arrival rates scaled up by a factor of just \(\frac{e}{e-1}\). Finally, we extend these results to no-regret learning dynamics: if queues use learning algorithms satisfying the no-regret property to choose servers, then the requisite factor increases to 2, and both of these bounds are tight. Both of these results require new probabilistic techniques compared to the classical price of anarchy literature and show that in such settings, no-regret learning can exhibit efficiency loss due to myopia.

1 Introduction

A fundamental aim at the intersection of economics and computer science is to understand the efficiency of systems when the dynamics are governed by the actions of strategic and competitive agents. In general, the outcomes reached by selfish agents at an equilibrium, or under some other dynamics, need not necessarily align well with the design of the system. Real-world systems whose performance is provably near-optimal even in the presence of selfish agents can be viewed as robust: such systems can be safely decentralized without sacrificing much efficiency. However, systems whose behavior degrades dramatically in strategic settings require extra safeguards to prevent inferior outcomes.

To make this more concrete, consider the setting of routing in networks as considered in the work of Roughgarden and Tardos [34]. Formulating this as a game, each agent chooses a path in the network from their source vertex to their sink vertex with the objective of minimizing their experienced delay along this path. The delay of an edge is some monotonic function of the number of agents traversing this edge in their chosen path. At a Nash equilibrium, selfish agents each select the path with minimum delay given the equilibrium behavior of the others; in other words, each agent selfishly optimizes her own delay function given the actions of the others. These equilibrium outcomes can be quite different from the globally optimal choice of routes that minimizes the total delay of all agents. However, Roughgarden and Tardos show that in the case of atomless selfish routing with affine delay functions, the ratio between the total delay at any Nash equilibrium and that of the global optimum is bounded by \(4/3\). More generally, if the delay functions are polynomials of degree at most d, then the ratio can behave as \(\Theta (d/\ln d)\) [31]. Such a result thus strongly characterizes the gap between selfish and optimal behavior, and identifies the concrete obstruction (here, nonlinearity) to efficiency.

In more general settings and games, given some quantitative notion of social welfare, the worst-case ratio between the social welfare at the global optimum with respect to this metric and that of any Nash equilibrium (defined in an analogous way as in the routing setting) is known as the price of anarchy [24]. More generally, we will refer to any sort of quantitative comparison between socially optimal outcomes and equilibrium outcomes as a price of anarchy analysis. A large body of work has established price of anarchy bounds for various well-studied games like routing, scheduling, and auctions, among others. Establishing price of anarchy results is important for several reasons. First, price of anarchy analyses help us understand the performance of real-world systems that already exist “in the wild” [33]. Just as importantly, the quantitative understanding given by price of anarchy–style analyses yield crucial insights toward the design of more robust systems with respect to selfish behaviors. For instance, Roughgarden and Tardos [34] further showed that in routing games, the cost of any Nash equilibrium is no more than that of the centralized optimum for twice as much flow. Such an analysis gives clearly actionable prescriptions: to attain good performance, one can simply augment the amount of resources in the system relative to the competition.

In many settings, agents repeatedly play against each other in the same game, not just once. In the most commonly studied models of repeated games, a critical assumption is that in each round, agents play an “independent” copy of the same game. In other words, the past sequence of play does not fundamentally change the nature of the game that is repeated in each round. In some settings, this approximation might be valid. For instance, in the preceding routing setting, consider routing on the scale of the morning rush-hour traffic. In this case, it usually holds that any traffic on Monday morning will have cleared by Tuesday morning, at which point agents again choose paths in a fresh version of the game. However, consider modeling packet routing in computer networks. If a packet gets dropped, then this packet must be re-sent in future rounds and thereby increases the total congestion going forward, thus violating the assumption that the game itself has not changed.

Therefore, developing a deeper theory of the efficiency of strategic agents in repeated games that retain state is of large importance. This motivates the first main question that guides our work in this article.

Question 1.

Do natural repeated games with state admit strong price of anarchy–style guarantees?

In this article, we extensively consider this problem of price of anarchy bounds in systems with state that impacts future rounds. Concretely, we study Question 1 in a queuing system with queues sending packets to servers as a simplified model of a network of queues as previously considered by Krishnasamy et al. [25]. In their work, the authors study the performance of a centralized learner in the same queuing system that finds the best server with respect to a more refined notion of “queue-regret,” which measures the expected difference in queue sizes to that of a genie strategy that knows the optimal server.

In contrast, our work studies a decentralized and strategic multi-queue version of the same system, where each queue selfishly competes with each other for service. Our primary focus is on precisely characterizing what conditions ensure that this queuing system remains stable (in a quantitative sense that will be formalized later) even under strategic assumptions, a concern that does not arise in the learning problem with centralized scheduling. In particular, we consider conditions that guarantee the efficiency of stochastic queuing systems when queues choose servers with the aim of getting their packets served in minimal time, and where queues must repeatedly resend their packets until the packet gets served.

To do so, we study the amount of extra resources required for stability in queuing systems under a variety of different behavioral structures. We first consider this question in the completely centralized setting, where a central coordinator can specify which queues send to which servers at each time. We first show, using the theory of majorization, that the obvious necessary condition is in fact sufficient: namely, the sum of arrival rates of the top k queues must be at most the sum of the capacities of the top k servers, for each k. This result establishes a baseline measure on the feasibility of queuing systems to compare against the rest of our results in strategic setting.

Once we establish this baseline, we may then proceed to ask: how many extra resources (server capacity) are needed relative to the coordinated setting to ensure that the system remains stable under some notion of game-theoretic equilibrium? We first study this question under the natural assumption that the behavior is stationary over time for each queue; this situation models the outcomes of systems that have reached a long-term equilibrium. We do so by defining a one-shot version of the queuing dynamics, which we call the patient queuing game, where each queue must choose a fixed distribution over servers. The term patience refers to the fact that the incentives for the queues in this game are their long-run growth rates, measured in terms of the limiting number of uncleared packets normalized by time. Thus, the values obtained in this game correspond to the long-run growth rates obtained in systems when the distribution of actions (choice of server) has stabilized for each queue, and we then ask for the values attained at any Nash equilibrium.

To answer our main question, we establish several delicate probabilistic properties of such systems and show that the long-run behavior of such systems, as a function of the choices of the queues, admits a number of favorable analytic properties. These techniques and results allow us to directly connect these probabilistic properties of the queuing system with the game-theoretic incentives of the agents to show that a factor of just \(\frac{e}{e-1}\) extra server capacity is needed to ensure stability of every Nash equilibrium of this game. Along the way, we also consider, but do not resolve, the separate question of the optimal rates attained by any independent strategies. We show that this question is equivalent to the classical notion of price of stability¹ [1, 2] of these games.

Although our results in this setting are tight, the assumption that agents have reached a stationary equilibrium is quite strong. In fact, this problem applies equally to the “independent game” setting. To address this deficiency, recent work in the independent game setting has shown that price of anarchy bounds often seamlessly generalize from the single-round equilibrium setting to the repeated games setting where agents employ no-regret learning algorithms [7, 26, 32]. In this model, agents repeatedly play the same game against each other and use learning algorithms to adapt to each other’s behavior. Such extensions are crucial in capturing the outcomes that might arise in practice, as there are known obstructions to the predictive power of Nash equilibria; to name just a few, Nash equilibria need not be unique, may be computationally infeasible to compute [13, 35], and may require stringent assumptions on the knowledge of the agents. Taken together, these deficiencies may collectively prevent real-world agents from reaching equilibrium outcomes. By contrast, no-regret algorithms are simple, computationally efficient, and encode natural and minimal behavioral assumptions of strategic agents, all while enjoying sufficient provable guarantees to enable a price of anarchy analysis. This guarantee can be ensured by running any of a large set of learning algorithms [38]. The study of learning in games has a long history, dating back to the early work of Brown [10] and Robinson [30] (also see the work of Fudenberg and Levine [19]). In traditional repeated games, if all players employ a no-regret learning strategy, then the play converges to a form of correlated equilibrium of the game [20] (players correlating their play by each of them using the history of play to decide their next action), and price of anarchy analyses often extend also to the correlated play.

Therefore, to complement our equilibrium results, we turn to analyzing the outcomes that arise under learning dynamics. A fundamental question in our setting with strong inter-round dependencies is whether price of anarchy bounds similarly extend to natural learning dynamics.

Question 2.

To what extent do price of anarchy–style bounds extend in repeated games with state to learning dynamics?

To formally address this question, we consider the performance of queuing systems where each queue uses no-regret learning to determine which servers to send packets to over time. Because we only assume that the queues satisfy the no-regret property in their choices, much of the analytic structure developed for stationary strategies in the patient queuing game is no longer applicable. In contrast to more classical settings with repeated games, where such price of anarchy bounds often naturally extend from equilibrium notions to dynamic learning outcomes, we show that no-regret learning exhibits myopia, as it generally cannot consider the long-run dependencies in the system. We develop new probabilistic machinery from the equilibrium setting to prove that the corresponding queuing systems require twice the amount of server capacity needed for centralized stability, and this bound is again tight. When taken together with our equilibrium bounds, a key conceptual contribution of our work is thus that standard no-regret algorithms can indeed attain nontrivial performance guarantees in these repeated games with state, but that these guarantees are not entirely lossless. Moreover, the analyses themselves appear to require a fundamentally different set of tools rather than being corollaries of the same generic framework [32].

In the next section, we elaborate on the model, our results, and techniques.

1.1 Overview of Results and Techniques

1.1.1 Strategic Queuing Model.

Before discussing our results in more detail, we first briefly describe the queuing model that will be the focus of this work.² As mentioned earlier, we consider the discrete-time queuing system studied by Krishnasamy et al. [25], where n queues receive packets at heterogeneous rates \(\mathbf {\lambda }=(\lambda _1,\ldots ,\lambda _n)\in (0,1)^n\) so that queue i receives a new packet at each time with probability \(\lambda _i\). In each round, any queue that has any remaining packets must select exactly one of m servers with heterogeneous success probabilities \(\mathbf {\mu }=(\mu _1,\ldots ,\mu _m)\in [0,1]^{m}\), to attempt to clear a single packet. Each server can only succeed in clearing at most one packet in each round and, most importantly, returns each unprocessed packet to the original queue, assuming for simplicity that servers have no buffer.³ We will also assume that each queue receives only bandit feedback in each round, meaning that it only observes whether it succeeded in clearing the packet it sent in the current round at the server it attempted.

Queue lengths can grow arbitrarily, so the efficiency we consider is under what conditions on the service and arrival rates can it be ensured that the system remains stable? We provide formal definitions in Section 2 of the various stability notions we will consider, but informally, stability corresponds to the number of uncleared packets growing sublinearly over time. To establish a baseline measure on what is possible under a centralized coordination algorithm, we will prove the following theorem.

Theorem 1.1 (Theorem 2.2, Informal).

Let \(\mathbf {\lambda }\in (0,1)^n\) and \(\mathbf {\mu }\in [0,1]^m\) be the arrival and service rates, respectively. Then the preceding queuing system is stable for some centralized (coordinated) scheduling policy if and only if for all \(1\le k\le n\),

\begin{equation} \sum _{j=1}^{\min \lbrace k,m\rbrace } \mu _{j} \ge \sum _{i=1}^{k} \lambda _i. \end{equation}

(1)

Because the focus of our work is on outcomes with strategic queues under different behavioral assumptions, an important feature of our model is how conflicts are resolved when multiple queues send to the same server in a time period. In decentralized settings, queues sending a packet at each round can and will often collide at a server, necessitating a choice of which (if any) packet a server attempts to serve. There are at least two natural choices. As a first natural choice, a server may choose a packet to attempt to clear uniformly at random among those that it receives in a round. Although this is a plausible modeling choice, we will actually show (Theorem 2.4) that in this case, the number of uncleared packets in the system can increase linearly over time when queues are strategic unless the success rates of the servers are prohibitively larger than the arrival rates of the queues. Roughly speaking, uniform randomization by the servers is not sufficiently adapted to our system objective of stability; even if there exists a simple coordinated strategy that would ensure queues remain bounded, strategic behavior by the queues can prevent queues with higher arrival rates from exploiting the necessary servers. In particular, such a model precludes the possibility of a constant factor of resource augmentation ensuring stability of selfish queuing systems.

To address this immediate difficulty, we turn to a second natural choice: instead of choosing a packet to serve uniformly at random, we will assume that packets are labeled with timestamps and that servers attempt to serve the received packet with oldest timestamp (breaking ties arbitrarily). This choice immediately induces significant dependencies between rounds, for queues that for some reason have not successfully cleared many packets will have priority over queues that have been successful in previous rounds. Although this is a natural choice to facilitate stability in such systems, we will require delicate probabilistic reasoning to study such processes. A simple but key idea that will enable our analysis will be to study a Geometric version of this model that is more tailored to this priority scheme via the principle of deferred decisions, where “Geometric” refers to the type of random variables governing the evolution of the system. Namely, we reduce the analysis of this complicated system to studying a single parameter for each queue, the age of the oldest packet in the queue, that meshes well with the priority structure. After formulating and proving this equivalence in Section 2.4, all of our subsequent results will be in this Geometric system.

1.1.2 Patient Selfishness.

To understand these systems game theoretically, our first main contribution is to understand the performance of systems that are at long-run equilibrium even though agents nonetheless compete to optimize their own long-run growth rate. We do this by formulating a one-shot version of the queuing dynamics. We define the patient queuing game where queues choose fixed randomized strategies over servers to be played at each round. Each queue i’s choice over servers can be described by a fixed vector \(p_i\in \Delta ^{m-1}\), where \(\Delta ^{m-1}\) is the probability simplex over the m servers. We study this as a traditional game and consider the resulting Nash equilibria when each queue aims to choose their fixed randomization to minimize their long-run aging rate (equivalently, their long-run growth rate, see Section 2.2) conditioned on the others. Our main interest is understanding under what conditions on the service rates and arrival rates the system will remain stable in every Nash equilibrium. To study this, we face significant probabilistic and game-theoretic challenges: probabilistic challenges to determine and prove a closed form of asymptotic growth rates for given strategies exists, and game-theoretic challenges in showing that Nash equilibria exist and bounding their quality. The techniques we use will prove useful in addressing these conceptually distinct difficulties, thereby unifying the game-theoretic and probabilistic properties of our systems.

Asymptotic Growth Rates. In the preceding discussion, we stated that each queue aims to select a fixed randomization over servers to minimize their long-run aging rate in this system given the randomizations of the others. Our first task, to do any game-theoretic analysis of this system, is to analyze the long-run properties of this random process of queue ages (which typically will not even be recurrent). A major technical component of our work is showing that for any fixed, independent randomizations \(\mathbf {p}\) by the queues over servers, not only do these long-run growth rates exist almost surely, they are deterministic and can be explicitly computed as a function of the strategies.

Theorem 1.2 (Theorem 3.3, Informal).

There exists an explicit, continuous function \(r:(\Delta ^{m-1})^n\rightarrow \mathbb {R}^n_{\ge 0}\) such that if queues independently randomize over servers according to \(\mathbf {p}\in (\Delta ^{m-1})^n\), then the (random) long-run growth rate of each queue i is \(r_{i}(\mathbf {p})\) almost surely.

To prove this result, we provide an alternate, algorithmic description of the long-run rates in Section 3.1, which we use for all of our subsequent game-theoretic results. Working just with this alternative definition, we show that the queues partition into groups such that all queues in a group age asymptotically at the same rate. We will return to the task of establishing that the true, long-run asymptotic aging rates of the queues for any choice of strategies coincide with the output of the algorithm in Section 6.

The key technical difficulty in proving this alternate characterization is that the priority structure via timestamps changes rapidly round to round and depends crucially on past successes by the queues. To overcome this, we use a rather delicate inductive argument that accounts for these changes in a controlled fashion that enables us to keep track of the evolution of the queuing system at a less granular level while still being sharp enough to prove the precise quantitative rates. Once we have done so, only then can we repeatedly appeal to concentration bounds to argue that each subset in the partition grows at the desired rate. To conclude, we carefully apply the Borel-Cantelli lemma to establish the result. For ease of exposition, some of the highly nontrivial and technical details are deferred to Appendix F.

Game-Theoretic Properties: Equilibria and Price of Anarchy. Once we show that these limits almost surely are equal to an explicit, deterministic function of \(\mathbf {p}\), it might still not be the case that a Nash equilibrium exists in the induced game. However, we show that the cost function exhibits significant analytic properties, which lets us reason about the structure of the sets that arise in the partition for any fixed strategy profile. More precisely, we show that each level set of the cost function corresponds to the minimizing subset of the ratio of a submodular and modular set function; this significant structure allows us to show that the subsets that minimize this ratio are closed under union and nonempty intersection (and thus essentially form a Boolean lattice). These considerations will be enough to show continuity as a function of the strategies (Proposition 3.7), which along with other properties will enable us to show that an equilibrium exists using Kakutani’s theorem (Theorem 3.8). Although we show that the cost function of our game has significant structure, the correspondence between actions (randomizations) and costs is quite nonlinear, imposing new technical challenges.

Recall that our goal is to ensure stability in any Nash equilibrium, assuming some relationship on the service rates to the arrival rates. Our main result, proven in Section 4, shows that the correct constant of system slack is \(\frac{e}{e-1}\approx 1.58\).

Theorem 1.3 (Main, Corollary 4.2, Informal).

If the service capacity is large enough so that the system would remain feasible when centrally managed even if server rates are scaled down by \(\frac{e}{e-1}\), then in every equilibrium of this game, all queues are stable.

This result is tight—in a symmetric system where \(n=m\), each queue has the same arrival rate, each server has the same success rate, and each queue chooses to uniformly randomize over servers, a simple balls-in-bins analysis yields this constant as \(m,n\rightarrow \infty\).

To prove this theorem, we provide a novel argument that establishes the result by continuously deforming any Nash profile toward a carefully constructed strategy profile while only monotonically decreasing the rate at which the top group clears. We then analyze the resulting profile to give a lower bound on the value of the Nash profile. The key difficulty is that the relevant incentives for each queue correspond to possibly many different subsets of queues that have a maximal aging rate. These constraints are difficult to directly compare; different choices of deviations in the strategy by a queue at any Nash equilibrium may violate distinct constraints, making it unclear how to argue about the quality of these equilibria. In particular, there does not seem to be a direct analogue of the Nash indifference principle in finite-action games where utilities are affine in the randomizations of each agent (recall Example 1.4, where the queue moving to the lesser server will still appear to prefer the better server).

To overcome these difficulties, we show that one can significantly reduce the number of incentive constraints one must consider for each queue (Proposition 4.3). This part of the argument crucially uses the lattice structure of the subsets of queues that ages fastest. We can then carefully perform our deformation of the collective strategy vector of the queues according to the structure of these sparsified constraints, and show that our deformation only hurts the quality of the Nash solution to provide a valid lower bound.

In contrast, almost every known price of anarchy–style result can be viewed via the very general smoothness framework of Roughgarden [32], which connects an equilibrium with the social optimum via discrete changes in the strategy profile. Our argument instead relies on a careful equilibrium analysis that smoothly interpolates between the equilibrium and a “good” profile that is easy to explicitly bound; however, during these deformations, these intermediate strategy profiles will not be equilibria. To prove the monotonicity of this deformation, we connect the incentives at Nash to the structure of the subset of maximizers of the long-run rate function and show that the Nash constraints still hold along the directions we deform.

1.1.3 No-Regret Learning in Queuing Systems.

As mentioned earlier, no-regret learning is a classical modeling assumption in the context of repeated, independent games that often attains equivalent price of anarchy–style guarantees to that of Nash equilibria. Moreover, the convergence of no-regret learning to a correlated equilibrium gives an intrinsic game-theoretic justification for using it as a behavioral model of the agents. In our setting, because queues only receive bandit feedback on their actions, but otherwise may not know the service rates \(\mathbf {\mu }\) nor any other information on the number or identities of the other queues in the systems or their choices of servers, it is natural to assume that they may use an adaptive learning algorithm to select servers in each round. Indeed, in large systems, reasoning about explicit game-theoretic actions on a round-by-round basis may prove difficult or impossible due to these information constraints, but no-regret learning is nonetheless achievable and gives useful performance guarantees.

However, we now provide a simple example showing that no-regret behavior can exhibit surprisingly myopic behavior in our queuing model. In this example, there is a unique no-regret policy for each agent, given the behavior of the other agent, but both agents would have been better off long-term had one even slightly deviated to an inferior server and the other stayed the same. In the classical setting with “independent” repeated games, this cannot occur.

Example 1.4.

Suppose that there are two queues with arrival rates \(\lambda _1=\lambda _2=.51\) and two servers with \(\mu _1=1\) and \(\mu _2=.49\). In this case, each queue receives a new packet roughly once every two periods on average. One can show that if both servers send to the top rate server every period, the sequence of play will satisfy the no-regret property, as they roughly split the top server equally. Each server then roughly clears at a rate of \(1/2\), which is strictly better than deviating to the lower server, but this system will not be stable; packets arrive at a total rate of 1.02 but are cleared at a rate of 1 in expectation, leading to linear growth. However, if one queue commits to slightly deviate toward the inferior server, one can show that the resulting system will be stable.

Note that this behavior cannot arise in the patient version of the queuing game considered earlier, because the incentives explicitly favor smaller long-run growth rates. This example shows that no-regret learning, which considers less sophisticated measures of the efficiency of the servers without considering the long-run effects, exhibits myopia. The queue sending to the second server in the preceding example does have regret in the classical sense, despite doing better in the long term.

We thus see that no-regret outcomes could be susceptible to performance losses compared to the patient setting. Our second main result precisely captures the performance of generic no-regret dynamics. We show that under no-regret dynamics, the system still remains stable when there is a factor 2 extra slack in server capacity: if the system has enough capacity to serve all packets when they are centrally coordinated even with half the service rates, then no-regret learning of the queues guarantees that queue lengths stay bounded in expectation across time. We will prove the following theorem.

Theorem 1.5 (Theorem 5.2, Informal).

If each queue in the queuing system satisfies the no-regret property in choosing servers at each round, and if for all \(k\le n\),

\begin{equation*} \sum _{j=1}^{\min \lbrace k,m\rbrace } \mu _j \gt 2\sum _{i=1}^k \lambda _i, \end{equation*}

then the queuing system under these dynamics is stable.

We complement these results by providing a partial converse that this factor of 2 on the required service rate is tight in Theorem 5.4, in that with less than a factor of 2 higher service rate, no-regret outcomes do not necessarily guarantee stability. Taken in tandem with our results on the patient queuing game, we thus observe rather subtle behavior that can arise in games with state. No-regret learning can indeed attain nontrivial performance guarantees, but the myopia induced by the local dynamics may lead to performance losses compared to the setting where agents compete with stationary, patient strategies.

Our key technique in proving Theorem 1.5 is to use a delicate potential function argument. The main idea of our proof is to argue that when some potential function, which we must construct, has a high enough value, then it must have negative drift. To conclude that the queue sizes remain bounded in expectation, we can then employ a powerful theorem of Pemantle and Rosenthal [29] showing that a sufficiently regular stochastic process with negative drift must have moments uniformly bounded over time.

To carry out this approach, the key difficulty now becomes: what potential function (related to queue lengths/ages) should we choose that provably has negative drift when it is large? The simplest possible choice is the maximum queue age, which is an \(\ell _{\infty }\) potential. Indeed, we will argue that the oldest queues, by virtue of the slack, no-regret condition, and the priority, will tend to decrease in age in aggregate. However, the dependencies from the priority scheme and the learning dynamics make arguing about how this decrease is spread among old queues rather tricky. Moreover, this potential does not alone sufficiently account for the full state of the system; to try to benefit from the performance of all queues, one could instead try an \(\ell _1\)-style potential function. However, this potential has a different problem; the gains by older queues could be washed out by the aging of young queues that do not have priority for this choice of potential.

To that end, it will become most convenient to instead study an \(\ell _2^2\) potential function of queue ages. This potential function construction can be motivated in multiple ways. First, squaring queue ages naturally biases the potential to the older queues as for an \(\ell _{\infty }\) potential, which we will argue are decreasing in aggregate. Moreover, we will want to benefit at all scales like in an \(\ell _1\) potential: to do so, one is naturally led to summing an \(\ell _1\)-potential over just those queues above each age threshold. Upon doing some algebra, one arrives at our \(\ell _2^2\) potential function of queue ages. For technical reasons, we eventually translate back to a suitable \(\ell _2\)-style norm of queue ages. It is our hope that the kinds of qualitative features we establish and the methods of proof for these results will be of interest in the future study of repeated games that similarly relax the independence assumptions of the games played at each round.

1.2 Organization

We first formalize the strategic queuing model in Section 2. In doing so, we will formally establish our notions of stability and feasibility that will give the benchmark for our results with strategic agents. We then turn to our first main result on patient queuing systems at equilibrium. In Section 3, we introduce and prove several properties of the relevant queuing game. We provide an alternate, algorithmic description of the cost functions for the queues, and use the resulting description to prove various game-theoretic and structural properties of these systems. Using these results, we prove our tight bound on the price of anarchy in this patient queuing game in Section 4 assuming that this alternative characterization of long-run rates is valid. In Section 5, we prove our first main result showing that with a factor 2 extra slack, no-regret queues will remain stable. In Section 6, we return to formally showing that the algorithmic description of the cost functions indeed coincides with their original definition as long-run growth rates of the queues.

1.3 Related Work

Our work falls in a long tradition of establishing price of anarchy bounds for various games [24], but is one of the first examples of studying the effect of learning in games with carryover effects between rounds. Compared to classical price of anarchy bounds in repeated games [4, 32, 40], we no longer make the assumption that games at different rounds are independent. Studying this model requires us to combine ideas from the price of anarchy analysis of games with the theory of stochastic processes. Another important repeated game setting with such carryover effect is the repeated ad-auction game with limited budgets. Some works [8, 11, 12] consider such games and offers results on convergence to equilibrium as well as understanding equilibria in the first-price auction settings under a particular behavioral model of the agents. Analyzing such systems for the more commonly used second-price auction system is an important open problem.

Although our stability objective differs from usual objectives in this literature, our results qualitatively also resemble the bicriteria result of Roughgarden and Tardos [34], which shows that in nonatomic routing, the cost incurred at any Nash flow is at most the optimal cost when twice the flow is routed. Unlike most such bounds that follow the smoothness framework of Roughgarden [32], our second main result is an equilibrium analysis that is more similar to that of Johari and Tsitsiklis [23], who establish equilibrium conditions and modify their problem while maintaining the equilibrium condition to arrive at a version that is easy to analyze. In our argument, we also modify the equilibrium itself toward a more tractable solution, but the intermediate points in this deformation will not be Nash, requiring additional arguments.

Our queuing setting also bears resemblance to stochastic games, a generalization of Markov decision processes where multiple players competitively and jointly control the actions and transitions (e.g., see [16, 28]). However, our work differs from this line in multiple ways: in our second model, queues are unaware of the system state and parameters, and most importantly, we are interested in explicit bounds to derive price of anarchy-style results for stability.

Although the goal of our work is in establishing price of anarchy-style bounds in dependent systems, this necessitates a careful understanding of the analytic properties of our random queuing dynamics. Among the large body of literature studying highly dependent random processes, closest to our work is the adversarial queuing systems of Borodin et al. [9], who also use the Pemantle and Rosenthal [29] theorem to establish bounded queue sizes in expectation.

The classical focus of work on scheduling in queuing systems is to identify policies that achieve optimal throughput (e.g., see the textbook of Shortle et al. [39]). There has also been work both on evaluating efficiency loss due to selfishness in different classical queuing systems, as well as the role of learning in such systems. For work on price of anarchy in queuing systems, see the book of Hassin [21] and the survey of Hassin and Haviv [22], and for a very recent tutorial on the role of learning and information in queuing systems, see the work of Walton and Xu [41]. Closest to our model from this literature is the work of Krishnasamy et al. [25], which characterizes the queue-regret of learning algorithms that only seek to identify the best servers but does not consider competition between selfish learners. Their primary goal is to study this more refined notion in the queuing setting for this classical stochastic bandit problem, which exhibits more complicated behavior than standard no-regret bounds that grow at least logarithmically with time. They characterize queue-regret for the case of a single queue aiming to find the best server, and extend the result to the case of multiple queues scheduled by a single coordinated scheduling algorithm, assuming that there is a perfect matching between queues and optimal servers that can serve them. In contrast, we assume that each queue separately learns to selfishly make sure its own packets are served at the highest possible rate, offering a strategic model of scheduling packets in a queuing system. Furthermore, we do not make the matching assumption on queues and servers.

Subsequent Work. After a subset of these results originally appeared in conference form, several works studied several game- and learning-theoretic questions arising from our model and technical results. On the strategic side, Fu et al. [18] show that the probabilistic techniques we introduce for analyzing these queuing systems extend to more general queuing networks once the feasibility conditions under coordinated scheduling are suitably adjusted, which naturally arise via duality arguments. Baudin et al. [5] propose an alternative, episodic queuing system where agents have incentives to hold jobs in an episode before sending to a central server, but suffer penalties should their jobs not be completed before the end of the episode. Their main conclusion, similar to ours, is that both equilibrium and no-regret outcomes ensure stability as long as these costs are sufficiently large. On the learning-theoretic side, Sentenac et al. [37], as well as Freund et al. [17], consider the problem of decentralized learning dynamics in bipartite queuing systems that attain near-optimal performance, extending the original work of centralized learning by Krishnasamy et al. [25]. These algorithmic advances are incomparable with our results, which are inherently noncooperative and strategic in nature. The former work also shows that the more refined guarantee of policy regret [14] is not sufficient to bridge the quantitative gap between our positive results for no-regret learning and patient equilibria.

2 Preliminaries

2.1 Notation

In general, random variables will be denoted by capital letters (i.e., \(X,Y,Z,\ldots)\), whereas vectors will be bolded (i.e., \(\mathbf {\mu },\mathbf {\lambda }\), etc). If a random variable X has some distribution \(\mathcal {D}\), we write \(X\sim \mathcal {D}\). We use the notation \(\text{Geom}(p)\) to denote a geometric distribution with parameter p, \(\text{Bern}(p)\) for a Bernoulli distribution that is 1 with probability p and 0 otherwise, and \(\text{Bin}(n,p)\) for a binomial distribution with parameters n and p.

We say that an event occurs almost surely if it has probability 1. We use standard \(O(\cdot), o(\cdot)\), and \(\Theta (\cdot)\) notation. We will sometimes write \(f(n)\asymp g(n)\) if \(f(n)=\Theta (g(n))\). We will also consider the following norms: for a positive vector \(\mathbf {\lambda }=(\lambda _1,\ldots ,\lambda _n)\), with \(\lambda _1\ge \ldots \ge \lambda _n\gt 0\), we define the following two weighted \(\ell _p\) norms on \(\mathbb {R}^n\):\(\Vert \mathbf {x}\Vert _{\mathbf {\lambda },1}\triangleq \sum _{i=1}^n \lambda _i \vert x_i\vert\) and \(\Vert \mathbf {x}\Vert _{\mathbf {\lambda },2}\triangleq \sqrt {\sum _{i=1}^n \lambda _i x_i^2}.\) It is easily seen that for any \(\mathbf {x}\), \(\Vert \mathbf {x}\Vert _{\mathbf {\lambda },1}\asymp \Vert \mathbf {x}\Vert _{\mathbf {\lambda },2}\) (where the constants depend on \(\mathbf {\lambda }\)) via Cauchy-Schwarz (see Lemma A.1). We use the following fractional sum operation \(\oplus : \mathbb {R}^2_{\ge 0}\times \mathbb {R}^2_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\):

We will later repeatedly use the following simple fact.

Fact 2.1.

For all \(a_1,\ldots ,a_n\ge 0\) and \(b_1,\ldots ,b_n\gt 0\),

\begin{equation*} \min _{i\in [n]}\frac{a_i}{b_i}\le \frac{a_1}{b_1}\oplus \ldots \oplus \frac{a_n}{b_n}=\frac{\sum _{i=1}^n a_i}{\sum _{i=1}^n b_i}\le \max _{i\in [n]}\frac{a_i}{b_i}. \end{equation*}

Moreover, equality holds in either of the inequalities if and only if both inequalities are tight.

Given a \(n\cdot m\)-dimensional vector \(\mathbf {p}=(p_1,\ldots ,p_n)\), where \(p_i\in \mathbb {R}^m\), we will write \(p_{ij}\) for the jth element of \(p_i\). Given a vector \(\mathbf {x}\in \mathbb {R}^n\) and a subset \(I\subseteq [n]\), we write \(\mathbf {x}_I\) to denote the vector restricted to the components in I. Given a set S, we will write \(\mathcal {P}(S)\) to denote the power set.

2.2 Bernoulli Queuing Model

We consider the following discrete-time queuing system illustrated by Figure 1, which is a decentralized, competitive version of the model considered by Krishnasamy et al. [25]: there is a system of n queues and m servers. During each discrete timestep \(t=0,1,\ldots\), the following occurs:

Fig. 1.

(1)

Each queue i receives a new packet with a fixed, time-independent probability \(\lambda _i\). We model this via an independent random variable \(B^i_t\sim \text{Bern}(\lambda _i)\). This packet has a timestamp that indicates that it was generated at time t. We label queues such that \(1\gt \lambda _1\ge \ldots \ge \lambda _n\gt 0\), writing \(\mathbf {\lambda }\) for the vector of arrival rates.

(2)

Each queue that currently has an uncompleted packet chooses one server to send their oldest unprocessed packet (in terms of timestamp) to.

(3)

Each server j that receives a packet does the following. First, it only considers the packet it receives with the oldest timestamp (breaking ties arbitrarily). It then processes this packet with a fixed, time-independent probability \(\mu _j\). We again label servers so that \(\mu _1\ge \ldots \ge \mu _{m}\ge 0\), writing \(\mathbf {\mu }\) for the vector of service rates.

(4)

All unprocessed packets, possibly including the packets that were selected if the corresponding server failed to process it, are then sent back to their respective queues still uncompleted. Queues receive bandit feedback on whether their packet cleared at their chosen server.

At each round of this process, the queues with packets aim to maximize the probability that their packet gets served, describing the incentives of the stage game.

We write \(Q^i_t\) for the number of unprocessed packets of queue i at the beginning of time t (before sampling new packets) and \(\mathbf {Q}_t=(Q^1_t,\ldots ,Q^n_t)\) for the vector of queue sizes at time t. Define \(Q_t=\sum _{i=1}^n Q^i_t\) as the total number of unprocessed packets in the system at time t. Formally, if \(X^i_t\) is the indicator event that queue i clears a packet at time t and \(B^i_t\) is again the indicator queue i received a new packet at time t, then we have the recurrence as random variables with \(Q^i_0=0\) and

\begin{equation} Q^i_{t+1}=Q^i_{t}+B^i_t-X^i_t, \end{equation}

(2)

where we note that \(X^i_t\) is necessarily 0 if \(Q^i_{t}+B^i_t=0\) (i.e., queue i had no packets and did not receive a new one in the round, so it does not send nor clear a packet this time period). This ensures that each \(Q^i_t\) is integral and nonnegative. We call the preceding random process the Bernoulli model. We will be interested in the stability of this system in the following sense.

Definition 2.1.

The preceding system is strongly stable under some given dynamics if, for any fixed \(r\ge 0\), the random process \(Q_t\) satisfies \(\mathbb {E}[(Q_t)^r]\le C_r\) for some absolute constant \(C_r\) depending only on r and the parameters \(\mathbf {\mu }\) and \(\mathbf {\lambda }\), but not on t. If instead we only have \(\mathbb {E}[Q_t]=o(t)\) (i.e., sublinear expected growth), we say that the system is weakly stable.⁴

To get a baseline measure for the outcomes that could arise under strategic behavior, we must first understand when a queuing system is stable under centralized coordination: it turns out that an obvious necessary condition on \(\mathbf {\mu }\) and \(\mathbf {\lambda }\) is also sufficient. An instructive example to keep in mind is a single-queue, single-server system. Of course, there is no learning or competition in such a process. If \(0\lt \lambda \lt \mu \le 1\), it is well known that \(Q^1_t\) follows a random walk on the nonnegative integers biased toward 0, and moreover is geometrically ergodic.⁵ This in particular implies strong stability. However, if \(0\lt \lambda =\mu \lt 1\), then the corresponding unbiased random walk \(Q^1_t\) satisfies \(\mathbb {E}[Q^1_t]=\Theta (\sqrt {t})\). Therefore, there is a sharp threshold for strong stability at \(\lambda = \mu\). Our first result, which provides a natural baseline to compare with all of the subsequent results in this work, offers an extension of this for the feasibility of any coordinated policy for multiple queues and servers.

Theorem 2.2.

Let \(\mathbf {\lambda }\in (0,1)^n\) and \(\mathbf {\mu }\in [0,1]^m\) be the arrival and service rates, respectively. Then the preceding queuing system is strongly stable for some centralized (coordinated) scheduling policy if and only if for all \(1\le k\le n\),

\begin{equation} \sum _{j=1}^{{\min \lbrace k,m\rbrace }} \mu _{j} \gt \sum _{i=1}^{k} \lambda _i. \end{equation}

(3)

When Equation (3) holds, we say that the queuing system is (centrally) feasible.⁶

We give the proof of sufficiency in this section while deferring the (straightforward) proof of necessity to Appendix C, which requires a submartingale argument in the edge cases where some inequality is tight. To prove sufficiency, we require the well-known notions of majorization of vectors.

Definition 2.3.

Let \(\mathbf {x},\mathbf {y}\in \mathbb {R}^n_+\), and assume that \(x_1\ge x_2\ge \ldots \ge x_n\ge 0\) and analogously for y. Then \(\mathbf {x}\) majorizes \(\mathbf {y}\) (written \(\mathbf {x}\succeq \mathbf {y}\)) if for each \(1\le k\le n\),

\begin{equation*} \sum _{i=1}^k x_i\ge \sum _{i=1}^k y_i. \end{equation*}

If the preceding inequalities are strict for each \(1\le k\le n\), then \(\mathbf {x}\) strictly majorizes y (written \(\mathbf {x}\succ \mathbf {y}\)).⁷

With this definition in hand, we can prove sufficiency using standard results on the relation between majorization and doubly stochastic matrices (see Appendix C for these standard facts).

Proof of Sufficiency in Theorem 2.2

First, suppose that \(\mathbf {\mu }\) strictly majorizes \(\mathbf {\lambda }\), when appropriately appending zeros if needed to make the vectors of same length. By Corollary C.3, there exists some doubly stochastic P such that \(P\mathbf {\mu }\gt \mathbf {\lambda }\) elementwise. By the Birkhof–von Neumann theorem, the set of doubly stochastic matrices is the convex hull of the set of permutation matrices \(\mathcal {P}\). This implies that there exists a distribution \(\pi\) over \(\mathcal {P}\) such that \(\mathbf {\lambda }\lt P\mathbf {\mu }\), interpreted componentwise, where \(P=\mathbb {E}_{\Pi \sim \pi }[\Pi ]\).

Consider the following oblivious scheduling algorithm: at each time t, independently sample a permutation matrix \(\Pi\) from \(\pi\), and schedule queues via the associated matching on the bipartite graph of queues and servers (where we pad with zeros if necessary, and even if some queues have no available packets to send). For each queue i, the associated marginal distribution on servers it sends to in each round is given by the ith row of P. Given that queue i has a packet to send at time t, the probability of successfully clearing a packet is exactly \((P\mathbf {\mu })_i\gt \lambda _i\), as this scheduling scheme ensures that each queue is alone at each server it sends to. As a result, the packet clears as long as the server is successful.

Therefore, the random process \(Q^i_t\) of number of packets by queue i at time t follows a homogeneous random walk on the half-line biased toward 0, which is ergodic with a stationary distribution with geometric tails. It is not difficult to show that any distribution on the natural numbers with geometric tails has bounded rth moments for any \(r\ge 0\).⁸ This then extends to the rth moment of the sum by Minkowski’s inequality. This proves strong stability when strict majorization holds.□

2.3 The Need for Packet Priorities

Recall that we will be interested in proving statements of the following form:

Given a queuing system that is centrally feasible (in the sense of Theorem 2.2) even when \(\mathbf {\mu }\) is scaled down by some constant \(c\ge 1\) independent of the parameters of the system, then a random process where queues are decentralized and strategic under certain conditions remains stable.

To see the necessity of timestamps, consider instead a simpler model where there are no timestamps and priorities, and instead each server uniformly randomly picks which packet to process among those that are sent to it in each step. It is easy to see that if a queuing system is feasible even if \(\mathbf {\mu }\) is scaled down by a factor of n, then it will remain a stable queuing system with reasonably strategic queues. Indeed, by this feasibility assumption, \(\mu _1\gt n\cdot \lambda _1\) so that \(\mu _1\gt \sum _{i=1}^n \lambda _i\). Therefore, if every queue just always sends to the largest server whenever they have a packet, they will succeed in clearing a packet with probability at least \(\mu _1/n\gt \lambda _i\), and it is not too difficult to prove that this results in a strongly stable process by comparing to a random walk biased toward the origin. It is natural to ask if a better factor is attainable in this alternate model, perhaps even a constant. It turns out that in general, a polynomial in n is required, as we formally show in Appendix C.

Theorem 2.4.

In this alternate model, for large enough n, there exists a centrally feasible queuing system with n queues and servers with the following property: the system remains feasible even if \(\mathbf {\mu }\) is scaled down by \(\Omega (n^{1/3})\) and it is possible for all queues to be in a Nash equilibrium of the stage game at each step⁹ at each timestep (and in particular, satisfy no-regret properties as in Assumption 5.1), yet the system is not weakly stable.

The basic reason this can occur is that low arrival rate queues can saturate the high success rate servers, making it impossible for high arrival rate queues to clear fast enough to offset their higher arrival. Our priority model, although more difficult to analyze, results in older queues gaining an advantage on young queues, causing the young queues to prefer lower-quality servers. In other words, our model implicitly helps fast-growing queues get better service, as long as queues are sufficiently adaptive to take advantage of it.

2.4 Geometric Queuing Model via Deferred Decisions

Because of the significant dependencies induced by the timestamp priority scheme in our model, a direct analysis of these systems proves quite unwieldy. In this section, we use the principle of deferred decision to give an alternate description of the Bernoulli system given in Section 2.2 that will prove significantly more amenable to analysis in the rest of this article.

To describe this Geometric model, suppose that each queue chooses which server to send to at time t only depending on past feedback and their history of oldest timestamps, but not on the \(Q^i_t\). In this case, we can equivalently characterize the evolution of this system keeping only the oldest timestamp of a packet at each queue. To do this, instead of randomly generating new packets at each timestep according to a Bernoulli process, each queue only maintains the timestamp of their current oldest unprocessed packet. Once this packet is successfully cleared, the new current oldest unprocessed packet has a timestamp generated by sampling a geometric random variable with parameter \(\lambda _i\) and adding this to the timestamp of the just-completed packet. If this number exceeds the current timestep t, this corresponds to having processed all packets that arrived before the current timestep, and receiving the next packet in the future. We will call this random process the Geometric process. Because the gap between successes in repeated independent \(\text{Bern}(\lambda _i)\) trials is given by a \(\text{Geom}(\lambda _i)\) random variable, the Bernoulli and Geometric processes can be completely coupled, as described in the following.

Definition 2.5.

When the queues use strategies with the preceding property, the Geometric process can be described using the following notation:

(1)

Time progresses in discrete steps \(t=0,1,\ldots\) . At each time t, \(\widetilde{T}^i_t\) is the timestamp of the oldest unprocessed packet of queue i at time t. \(T^i_t=\max \lbrace 0,t-\widetilde{T}^i_t\rbrace\) is the age of the current oldest packet of queue i at timestep t. In other words, \(T^i_t\) measures how old the current oldest unprocessed packet for queue i is.¹⁰

(2)

Queue i can send a packet to any server j in this timestep if \(t-\widetilde{T}^i_t\ge 0\). Each server j attempts to serve only the oldest packet it receives and succeeds with probability \(\mu _j\). If queue i’s packet is successfully served, set \(\widetilde{T}^{i}_{t+1}= \widetilde{T}^i_t+G^i\), where \(G^i\sim \text{Geom}(\lambda _i)\) is independent of all past events, and otherwise \(\widetilde{T}^{i}_{t+1}=\widetilde{T}^{i}_{t}\).

We write \(\mathbf {T}_t=(T^1_t,\ldots ,T^n_t)\in \mathbb {N}^n\) for the vector of current ages of oldest packets. To see the equivalence, consider any Bernoulli queuing system with Bernoulli random variables \(\lbrace B^i_t\rbrace _{i\in [n],t\ge 0}\) for packet generation. To get a coupled Geometric system for the same system, use an independent sequence \(\lbrace G^i_s\rbrace _{i\in [n],s\ge 0}\) with the interpretation that \(G^i_{s}\sim \text{Geom}(\lambda _i)\) is the size of the sth gap between successes in the \(B^i_t\). When queue i clears her sth packet, her new oldest timestamp increases by \(G^i=G^i_{s}\) as described earlier. As such gaps between timestamps in the Bernoulli model have \(\text{Geom}(\lambda _i)\) distributions, the Geometric system gives the ages of each queue in the Bernoulli system at all times and gives an explicit coupling.

The key feature is that, under the assumption that each queue i chooses servers at time t only based on at most the \(T^i_t\), not on \(Q^i_t\), all choices by queues are the same conditioned on just the current timestamp and past feedback as it is conditioned on all the past information in the Bernoulli model (which includes arrivals received after the current oldest packet). In other words, if \(\mathcal {G}_t\) denotes the information available to the queues in the Bernoulli model at time t, and \(\mathcal {F}_t\) for the Geometric model, then all choices by the queues at time \(t+1\) are the same conditioned on either history. The point of doing so is that \(G^i_{s}\) will be independent of \(\mathcal {G}_t\) until the queue clears her sth packet (namely, the timestamp of queue i’s \(s+1\)-th packet is not known until the time queue i clears her sth packet).

In the Geometric system, we define stability in the same way as before.

Definition 2.6.

The Geometric system is strongly stable if, for any \(r\ge 0\), \(\mathbb {E}[(\sum _{i=1}^n T^i_t)^r]\le C_r\), where \(C_r\) is a constant depending only on r and the parameters \(\mathbf {\mu }\) and \(\mathbf {\lambda }\), but not on t. The system is weakly stable if just \(\mathbb {E}[\sum _{i=1}^n T^i_t]=o(t)\).

Because heuristically \(Q_t^i\approx \lambda _i T_t^i\), it is intuitive that our notions of strong stability are equivalent whenever both systems correspond to the same random process. This is indeed the case, and furthermore, strong stability implies almost sure subpolynomial asymptotic growth. The basic idea is to use Markov’s inequality and the Borel-Cantelli lemma along an appropriately chosen subsequence of times and then interpolate to the rest. We defer this equivalence to Appendix C.3.

Lemma 2.7.

If the Bernoulli and Geometric models characterize the same queuing dynamics, then strong stability in the Bernoulli system is equivalent to strong stability in the Geometric system (and the same holds for weak stability). Moreover, if this holds, then strong stability in either system implies almost sure subpolynomial asymptotic growth.

3 Patience in Queuing Systems

In this section, we introduce and establish structural results about a patient version of our queuing game. In this game, each queue chooses a fixed distribution over servers that will be used in all rounds to optimize the long-run age of the packets in the queue. To begin, we formulate the game in a manner that is well defined a priori.

Definition 3.1.

The patient queuing game \(\mathcal {G}=([n],(c_i)_{i=1}^n,\mathbf {\mu },\mathbf {\lambda })\) is defined as follows: the strategy space for each queue is \(\Delta ^{m-1}\). Let \(\mathbf {p}=(p_1,\ldots ,p_n)\in (\Delta ^{m-1})^n\) denote the vector of fixed distributions for all queues over servers. The cost function \(c_i\) for queue i is \(c_i(p_i,p_{-i})\triangleq \limsup \limits _{t\rightarrow \infty } \mathbb {E}[\tfrac{T^i_t}{t}],\) where \(T^i_t\) is again the age of queue i at time t in the random queuing process induced by running the queuing system with \(\mathbf {\mu }\) and \(\mathbf {\lambda }\) the system parameters when each queue chooses a server by independently randomizing according to \(\mathbf {p}\) in each timestep she has a packet.

We say that \(\mathbf {p}\) is a Nash equilibrium of \(\mathcal {G}\) if for all \(i\in [n]\), \(p_i\in \arg \min _{p^{\prime }\in \Delta ^{m-1}} c_i(p^{\prime },p_{-i})\)—that is, each queue chooses \(p_i\) to minimize their cost function conditioned on the strategies \(p_{-i}\) of the other queues.

Note that the cost function defined previously is clearly well defined as the lim sup of expected values. However, we will actually show that the limit of the random quantity \(T_i^t/t\) (without expectations) is almost surely equal to a deterministic constant depending on \(\mathbf {p},\mathbf {\lambda },\mathbf {\mu }\) (see Theorem 3.3). By deriving an alternate, explicit characterization of these values, we show that Nash equilibria exist in Theorem 3.8. Because the cost functions are explicit functions of the randomizations and the parameters \(\mathbf {\mu }\) and \(\mathbf {\lambda }\), we omit the notation \((c_i)_{i=1}^n\) when instantiating a game \(\mathcal {G}\).

Our main focus in Section 4 will be to give guarantees on the quality of all Nash equilibria in this game. In a slight abuse of the price of anarchy terminology, we make the following definition.

Definition 3.2.

Let \(\mathbf {\lambda }\) and \(\mathbf {\mu }\) satisfy the conditions of Theorem 2.2 with weak inequalities so that \(\mathbf {\mu }\) majorizes \(\mathbf {\lambda }\). For \(\alpha \ge 1\), let \(\mathcal {G}(\alpha)=([n],\mathbf {\mu },\alpha ^{-1}\mathbf {\lambda })\), and let \(\mathcal {N}(\alpha)\) denote the set of Nash equilibria of \(\mathcal {G}(\alpha)\). The price of anarchy of \(\mathcal {G}=\mathcal {G}(1)\) is defined as the supremum of \(\alpha\) values such that there exists a Nash equilibrium \(\mathbf {p}^*\in \mathcal {N}(\alpha)\) and some \(i\in [n]\) such that \(c_i(\mathbf {p}^*)\gt 0\) in \(\mathcal {G}(\alpha)\).¹¹ The price of anarchy of the patient queuing game is the supremum over all instances of \(\mathcal {G}\).

In this section, we extensively study the properties of the cost function \(c(\mathbf {p})\), which is currently written as the lim sup of the expected value of the random linear aging rate of each queue. By taking the lim sup and expected values, the cost function is well defined, albeit quite unwieldy at present. Our first task is thus to provide an alternative, algorithmic description of \(c(\mathbf {p})\), which we initially denote \(r(\mathbf {p})\) (for “rates”) in Section 3.1. We show that r has significant analytic structure that will help establish various game-theoretic properties of this system. In particular, we show that the level subsets (in \([n]\)) of \(r(\mathbf {p})\) enjoy convenient closure properties, which will be enough to establish continuity and other properties, which we use to prove the existence of equilibria. We will return to proving that this function is equal to c in Section 6.

The Need for Packet Priorities with Patience. We note that even with this restriction to stationary strategies in a patient queuing game, the priority scheme by servers to attempt to serve the oldest packet is necessary to obtain constant price of anarchy bounds. It is not too difficult to see that if servers choose packets uniformly at random among those it receives in each round, the price of anarchy can be polynomially large in n in the sense of Theorem 2.4 when the costs are defined to be the asymptotic aging rates as in Definition 3.1. More specifically, consider a queuing system with one queue with arrival rate \(C/n^{1/3}\) for some large constant \(C\gt 0\) to be determined, and \(n-1\) queues with arrival rate \(1/n^{2/3}\). Suppose that there is one server with success rate 1 and then n servers with success rate \(1/n^{1/3}\). There exist bad Nash equilibria in stationary strategies of the following form: each small queue evenly mixes between a personal server with success rate \(1/n^{2/3}\) and the top server with success rate 1. It is clear that each small queue will be stable for large enough n (provided no other queue shares her personal server) by this choice of constants, so she has no incentive to deviate because her cost in this game is her long-run aging rate of zero.

By standard Chernoff bounds, in any given round, there are at least \(cn^{1/3}\) small queues that will send to the top server under these stationary strategies with overwhelming probability for some absolute constant \(c\gt 0\); therefore, with very high probability in each round, the large queue can succeed in clearing a packet by sending to the large server with probability at most \(1/cn^{1/3}\). This is strictly smaller than her rate if \(C\gt 0\) is taken sufficiently large. At any other server, the queue can get rate at best \(1/n^{1/3}\), which clearly does not ensure stability for large enough \(C\gt 0\). Therefore, the system must remain unstable if she best responds to this behavior by the other queues (note further that this best response will never involve sending to a personal server of any of the other queues, as there are n such servers). No queue has any incentive to deviate according to the cost functions as defined in Definition 3.1 under this alternate server selection choice, so these strategies constitute a Nash equilibrium. This system would remain centrally feasible even if server rates were scaled down by \(\Omega (n^{1/3})\), and hence the price of anarchy of such a patient queuing game can be polynomially large in n.

In this example, small queues mix between the top server and their own server equally because their long-run growth rate is zero in any case. It is possible to modify this example so that all queues mix among servers that offer equal long-run probability of success. This can be done by having each small queue send to the top server with probability \(p(n)\) and to a personal server with rate \(1/n^{1/3}\) with probability \(1-p(n)\), while the top queue sends deterministically to the last server with rate \(1/n^{1/3}\) (note that the effectiveness of the top server is at best \(1/n^{1/3}\) if the top queue were to deviate to there, by construction, so this is a best response). The parameter \(p(n)\) can be chosen so that the long-run average number of small queues sending to the top server is precisely \(n^{1/3}\) almost surely using the strong law of large numbers for Markov chains, so that each queue indeed mixes among servers that offer long-run success probability \(1/n^{1/3}\).

3.1 Algorithmic Description of Costs

As stated, we now construct a function \(r:(\Delta ^{m-1})^n \rightarrow [0,1]\) that we will show is equivalent to c. We will show that for any fixed \(\mathbf {p}\), the set \([n]\) of queues partitions into subsets \(S_1,S_2,\ldots\), where each queue in \(S_i\) group has the same aging rate and \(S_1\) ages the fastest, then \(S_2\), and so on, according to r (and so for c as well). To get a sense of the quantities that will arise before considering the general case, consider the simplest setting of a single queue and a single server (where there are no nontrivial strategies nor competition), with rates \(\lambda \gt \mu\). In any round where the queue has an uncleared packet, the age will first increase by 1 deterministically. With probability \(\mu\), the queue will succeed in clearing this packet, and the age will go down in expectation by \(\mathbb {E}[G]=1/\lambda\), where \(G\sim \text{Geom}(\lambda)\) is independent of whether or not the server succeeds. Therefore, the expected change in this queue’s age will be \(1-\mu /\lambda \gt 0\), and we expect that the queue will asymptotically age at this rate.

In general, with multiple queues and servers, the actual values of \(c_i\) are best described via a recursive algorithm that computes the rates, which we give in the following. The intuition is that \(S_1(\mathbf {p})\) will be the subset that minimizes the ratio of expected packets they clear collectively given \(\mathbf {p}\), assuming that they have priority over all other queues, divided by their sum of arrival rates. This quantity arises by viewing this subset as a single large queue as in the preceding single queue example. Conditioned on this set \(S_1\) of queues growing fastest, they will typically have priority, and then we recurse to find the lower groups. The algorithm begins by initializing \(k=1\) and \(I=[n]\):

(1)

Compute the minimum value over all nonempty subsets \(S\subseteq I\) of

\begin{equation*} \frac{\sum _{j=1}^m \mu _j(1-\prod _{i\in S}(1-p_{i,j}))}{\sum _{i\in S} \lambda _i}. \end{equation*}

This gives the expected number of packets cleared by S if all queues in S send in a timestep and they have priority over all other queues, divided by their sum of arrival rates.

(2)

If this value is at least 1, then no subset of queues will have linear aging, so set \(S_k=I\), \(r_i(\mathbf {p})=0\) for all \(i\in S_k\), and terminate. Otherwise, set \(S_k\) to be the minimizer of the previous quantity over all nontrivial subsets of I, chosen to be of largest cardinality in the case of degeneracies.¹² In this case, for each \(i\in S_k\), \(r_i(\mathbf {p})\) gets set to

\begin{equation*} 1-\frac{\sum _{j=1}^m \mu _j(1-\prod _{i\in S_k}(1-p_{i,j}))}{\sum _{i\in S_k} \lambda _i}. \end{equation*}

For \(k=1\), we refer to any subset with the minimum ratio as a tight, or minimizing, subset.

(3)

Update the server rates \(\mu _j\) as \(\mu _j\leftarrow \mu _j\prod _{i\in S_k}(1-p_{i,j}).\) In other words, \(\mu _j\) gets discounted by the probability a queue from \(S_k\) sends to server j (assuming that all of these queues are sending). Update \(I\leftarrow I\setminus S_k\), \(k\leftarrow k+1\), and recurse on I with \(\mathbf {\mu }\) and \(\mathbf {p}_I\) if nonempty.

As many of these quantities will appear often, we make the following conventions: for any subsets \(S,S^{\prime }\) such that \(S\subseteq [n]\setminus S^{\prime }\), define \(\lambda (S)\) as the sum of arrival rates of packets to a set of queues S, and \(\alpha (S\vert \mathbf {p},\mathbf {\mu },S^{\prime })\) as the expected number of packets cleared from queues in S with service rates \(\mathbf {\mu }\), if the queues in \(S^{\prime }\) have priority, S has priority over all other queues, and all queues in \(S\cup S^{\prime }\) send packets in the round:

and then let

denote the ratio of expected number of packets cleared by S when having priority over all members but \(S^{\prime }\), normalized by the expected number of new packets received in each round by S.

Let \(S_{k}(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\) be the \({k}\)th set output by the preceding algorithm. When \(\mathbf {p},\mathbf {\mu },\mathbf {\lambda }\) are clear from context, we will suppress them. We write \(U_k=\cup _{\ell =1}^kS_{\ell }\) as the set of queues in the top k groups outputted by the algorithm, with \(U_0=\emptyset\). We will write \(f_k = f(S_k\vert U_{k-1})\), and we use \(g_k=\max \lbrace 0,1-f_k\rbrace\) for the rate of the kth outputted set, which is equal to \(r_i(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\) for any \(i\in S_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\). From the recursive construction,

\begin{equation} S_{k+1}(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })=S_1(\mathbf {p}_{[n]\setminus U_k},\mathbf {\mu }^{\prime },\mathbf {\lambda }_{[n]\setminus U_k}), \end{equation}

(4)

where \(\mu ^{\prime }_j = \mu _j\prod _{i\in U_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })}(1-p_{ij})\) for all \(j\in [m]\). In other words, having found \(U_k\), \(S_{k+1}\) is the largest minimal set among the remaining elements, but where the \(\mathbf {\mu }\) rates have been reweighed by the probability no element of \(U_k\) sends to each server. These quantities are compiled in a table in Appendix D for easy reference.

Our main probabilistic result about the function r is that this is indeed equivalent to the cost function c of the patient queuing games. In fact, we prove the following, stronger result.

Theorem 3.3 (Almost Sure Asymptotic Convergence).

Let \(\mathcal {G}=([n],\mathbf {\mu },\mathbf {\lambda })\) be a one-shot patient queuing game. For each fixed \(\mathbf {p}\) and all \(i\in [n]\), almost surely the long-run aging rate of queue i satisfies

\begin{equation*} c_i(\mathbf {p})=\lim _{t\rightarrow \infty } \frac{T^i_t}{t}=r_i(\mathbf {p}). \end{equation*}

We will prove our game-theoretic results assuming this theorem; however, as the proof is quite nontrivial and rather involved, we defer the proof to Section 6.

3.2 Properties of Rate Function

We first establish basic properties of the output of the algorithm that will be useful in studying the analytic properties, as well as in proving that this algorithm gives the correct asymptotic rates. Throughout, we will view f as the quotient \(\alpha /\lambda\) when invoking Fact 3.1.

Clearly, for fixed S, the function \(f(S\vert T)\) is nonincreasing in T as a set function. We repeatedly use the following fact, which can be seen simply by expanding the definition of f.

Fact 3.1.

Suppose \(S,S^{\prime },T\) are such that \(S,S^{\prime }\subseteq [n]\setminus T\) and are disjoint. Writing f in the form of the quotient \(\alpha /\lambda\), then

\begin{equation*} f(S\cup S^{\prime }\vert T)=f(S\vert T)\oplus f(S^{\prime }\vert S\cup T). \end{equation*}

Next, we characterize some structure in the minimizing subsets at each step of the algorithm, which will allow us to choose the \(S_k\) canonically as the largest cardinality minimizer. To do this, we first show that the function \(\alpha (\cdot)\) is submodular.

Lemma 3.4 (Submodularity).

For fixed \(S^{\prime }\), \(\mathbf {p},\mathbf {\mu },\mathbf {\lambda }\) the function \(\alpha (S\vert \mathbf {p},\mathbf {\mu },\mathbf {\lambda },S^{\prime })\) is submodular is S—that is, for any \(S,T\subseteq [n]\setminus S^{\prime }\), \(\alpha (S\cap T\vert S^{\prime })+\alpha (S\cup T\vert S^{\prime })\le \alpha (S\vert S^{\prime })+\alpha (T\vert S^{\prime })\).

Proof.

Fix \(S^{\prime },\mathbf {p},\mathbf {\mu },\mathbf {\lambda }\), and we suppress the dependence in \(\alpha\). To prove submodularity, recall that an equivalent definition is that for any \(S,T\) satisfying \(S\subseteq T\), and \(i\not\in T\), then \(\alpha (S\cup \lbrace i\rbrace)-\alpha (S)\ge \alpha (T\cup \lbrace i\rbrace)-\alpha (T)\) [36]. Let \(S\subseteq T\) and \(i\not\in T\). A simple computation shows that for any \(V\subseteq [n]\setminus S^{\prime }\),

\begin{equation*} \alpha (V\cup \lbrace i\rbrace)-\alpha (V)=\sum _{j=1}^m \mu ^{\prime }_j p_{ij}\prod _{k\in V}(1-p_{kj}), \end{equation*}

where \(\mu ^{\prime }_j\) is \(\mu _j \cdot \prod _{i\in S^{\prime }}(1-p_{ij})\) from priority. As each factor in the product is at most 1, this is clearly decreasing in V as a set function, establishing submodularity.□

Now, recall that the relevant functions in the construction of the preceding algorithm is the set function \(f=\alpha /\lambda\). As a consequence of the fact that this function is the ratio of a submodular function with a modular function, we will be able to gain significant closure properties of the tight subsets (as defined earlier), which will end up being critical in establishing both game-theoretic and probabilistic properties of our systems.

Lemma 3.5 (Closure).

For each fixed \(\mathbf {p}\) and \(k\ge 1\), the set of minimizers of \(f(\cdot \vert U_{k-1})\) in \(\mathcal {P}([n]\setminus U_{k-1})\) is closed under union and nondisjoint intersection—that is, if \(S,S^{\prime }\subseteq [n]\setminus U_{k-1}\) are minimizers, then so is \(S\cup S^{\prime }\), as well as \(S\cap S^{\prime }\) if nonempty. Moreover, if \(S\cap S^{\prime }\) is empty, then the queues in S and \(S^{\prime }\) must send to disjoint subsets of servers.

In particular, the minimizing set with largest cardinality is unique, and is the union of all minimizing sets at step k. If S is considered at step k of the algorithm but S is not a subset of \(S_k\), then \(f(S\vert U_{k-1})\gt f(S_k\vert U_{k-1})\).

Proof.

The last statement is an immediate consequence of the first, so we focus on the first part. We show this just for \(k=1\); the general case for \(k\gt 1\) follows from the recurrence in Equation (4).

By Lemma 3.4, \(\alpha\) is submodular, and it is immediate that \(\lambda (\cdot)\) is a modular function—that is, \(\lambda (S\cup S^{\prime })+\lambda (S\cap S^{\prime })=\lambda (S)+\lambda (S^{\prime })\). We claim that if a function f is a ratio of a nonnegative submodular function and a nonnegative supermodular function, then the set of minimizers of f is closed under union and nondisjoint intersection. To see this, suppose that \(S,S^{\prime }\) are minimizers. Then we see the following inequalities, writing f always in the form of the quotient \(\alpha /\lambda\), and using Fact 3.1 and Fact 2.1,

\begin{equation*} \max \lbrace f(S),f(S^{\prime })\rbrace \ge f(S)\oplus f(S^{\prime }) \ge f(S\cap S^{\prime })\oplus f(S\cup S^{\prime }) \ge \min \lbrace f(S\cap S^{\prime }),f(S\cup S^{\prime })\rbrace \end{equation*}

(where we omit \(S\cap S^{\prime }\) if it is empty), as the inequalities in the numerator and denominator go the correct way by Lemma 3.4, and then using Fact 2.1. But as \(S,S^{\prime }\) were minimizers, these must be equalities, which occurs if and only if \(S\cap S^{\prime }\) (if nonempty) and \(S\cup S^{\prime }\) are both minimizers. As \(f=\alpha /\lambda\) here, this applies for our functions.

If \(S,S^{\prime }\) are both minimizers and are disjoint, then from Fact 3.1, it follows that \(f(S\cup S^{\prime }) = f(S) \oplus f(S^{\prime }\vert S)\). As \(S,S^{\prime },S\cup S^{\prime }\) are evidently minimal, this equation can only hold if \(f(S^{\prime }\vert S)=f(S)\), which occurs if and only if S and \(S^{\prime }\) disjointly mix among servers.□

From Lemma 3.5, it will nearly immediately follow that the outputted rates are strictly monotonic decreasing in the groups: as mentioned, \([n]=S_1\sqcup S_2\sqcup \ldots\) is meant to give a partition into groups that age together, where \(S_1\) is the fastest aging group, \(S_2\) the next fastest, and so on. As such, the disjoint subsets iteratively output by the algorithm satisfy the intuition that motivates the construction.

Lemma 3.6 (Monotonicity).

Let \(S_1,S_2,\ldots\) be the outputs of the algorithm in order. Then \(g_k\gt g_{k+1}\) for each \(k\ge 1\).

Proof.

Consider \(S_{k}\cup S_{k+1}\): this set is considered at stage k of the algorithm and evidently was not selected. From the previous lemma, we must have

\begin{equation*} f(S_{k}\vert U_{k-1})\lt f(S_{k}\cup S_{k+1}\vert U_{k-1})\lt f(S_{k+1}\vert U_{k-1}\cup S_k)=f(S_{k+1}\vert U_{k}). \end{equation*}

The first inequality follows from the selection criteria of the algorithm (and maximality of \(S_{k}\)), whereas the second follows from writing (via Fact 3.1)

\begin{equation*} f(S_{k}\cup S_{k+1}\vert U_{k-1})=f(S_{k}\vert U_{k-1})\oplus f(S_{k+1}\vert U_{k-1}\cup S_{k}). \end{equation*}

As we have already proven the first inequality, the second must follow from Fact 2.1. This yields the claim.□

With these basic properties, we can obtain an important structural result that will prove fruitful in establishing the existence of equilibria. We defer the somewhat technical proof to Appendix D.1.

Proposition 3.7 (Continuity).

The function \(r:(\Delta ^{m-1})^n\rightarrow [0,1]^n\) given by \(r(\mathbf {p})=(r_1(\mathbf {p}),\ldots ,\) \(r_n(\mathbf {p}))\) is continuous.

With these structural results, we can turn to showing our first game-theoretic property of this game, for now assuming that the costs are given by r, the output of the algorithm of Section 3.1: namely, that equilibria exist. Although the cost functions are not quite convex, by restricting each component to a line that varies only a single queue’s strategy, one can deduce enough structure that allows for an application of Kakutani’s theorem. We record this result here while deferring the proof to Appendix D.1.

Theorem 3.8 (Existence of Equilibria).

There exists an equilibrium of the game with costs given by \(r:(\Delta ^{m-1})^n\rightarrow [0,1]^n\).

3.3 Price of Stability and Independence

In Section 4, we will establish one of our main results, a tight bound of \(\frac{e}{e-1}\) on the price of anarchy in the patient queuing game. Recall that this bound asserts that in any patient queuing system that is centrally feasible, if the arrival rates are decreased by a factor of \(\frac{e}{e-1}\), then every Nash equilibria of the resulting game will be stable. In this section, we consider the intermediate and complementary notion of the price of stability, which controls the efficiency loss of the best Nash equilibrium. Formally, we have the following definition.

Definition 3.9.

Let \(\mathbf {\lambda }\) and \(\mathbf {\mu }\) satisfy the conditions of Theorem 2.2 with weak inequalities so that \(\mathbf {\mu }\) majorizes \(\mathbf {\lambda }\). For \(\alpha \ge 1\), let \(\mathcal {G}(\alpha)=([n],\mathbf {\mu },\alpha ^{-1}\mathbf {\lambda })\), and let \(\mathcal {N}(\alpha)\) denote the set of Nash equilibria of \(\mathcal {G}(\alpha)\). The price of stability of \(\mathcal {G}=\mathcal {G}(1)\) is defined as the supremum of \(\alpha\) values such that in every Nash equilibrium \(\mathbf {p}^*\in \mathcal {N}(\alpha)\), there exists \(i\in [n]\) such that \(c_i(\mathbf {p}^*)\gt 0\) in \(\mathcal {G}(\alpha)\). The price of stability of the patient queuing game is the supremum over all instances of \(\mathcal {G}\).

By definition, the price of stability lies between 1 and the price of anarchy.

Another measure is the price of independence. Recall that the coordinated queuing strategy in the proof of Theorem 2.2 is highly centralized, in the sense that queues never collide due to the central scheduling.¹³ How well can all agents do when they are not necessarily selfish but are restricted to stationary product strategies? In other words, how well can all agents do when they attempt to make the system as stable as possible for all agents but are restricted to product strategies? Such a measure decouples the contribution in the price of anarchy of selfishness and independence (i.e., the inability to coordinate due to product distributions). Formally, we have the following definition.

Definition 3.10.

Let \(\mathbf {\lambda }\) and \(\mathbf {\mu }\) satisfy the conditions of Theorem 2.2 with weak inequalities so that \(\mathbf {\mu }\) strictly majorizes \(\mathbf {\lambda }\). For \(\alpha \ge 1\), let \(\mathcal {G}(\alpha)=([n],\mathbf {\mu },\alpha ^{-1}\mathbf {\lambda })\). The price of independence of \(\mathcal {G}=\mathcal {G}(1)\) is defined as the supremum of \(\alpha\) values such that for all \(\mathbf {p}^*\in (\Delta ^{m-1})^n\), there exists \(i\in [n]\) such that \(c_i(\mathbf {p}^*)\gt 0\) in \(\mathcal {G}(\alpha)\). The price of independence of the patient queuing game is the supremum over all instances of \(\mathcal {G}\).

In our setting, it is actually not very difficult to see that these two quantities are precisely the same. One direction is trivial, whereas if \(\alpha \ge 1\) is such that there exists a \(\mathbf {p}^*\) with \(c_i(\mathbf {p}^*)=0\) in \(\mathcal {G}(\alpha)\) for all \(i\in [n]\), \(\mathbf {p}^*\) is clearly a Nash equilibrium in \(\mathcal {G}(\alpha)\) as well as it trivially holds that no agent can possibly improve their rate. In fact, we observe that there is a more general correspondence, as shown in the following preposition.

Proposition 3.11.

For any patient queuing game \(\mathcal {G}\), let \(\mathsf {OPTIND}(\mathcal {G}) = \min _{\mathbf {p}}\max _{i\in [n]} c_i(\mathbf {p}).\) Then there exists a (possibly different) patient queuing game \(\mathcal {G}^{\prime }\) and a Nash equilibrium \(\mathbf {p}^{\prime }\) of \(\mathcal {G}^{\prime }\) such that \(c_{i,\mathcal {G}^{\prime }}(\mathbf {p}^*)= \mathsf {OPTIND}(\mathcal {G})\) for all i. Moreover, \(\mathcal {G}^{\prime }\) is a restriction of \(\mathcal {G}\) to a subset of queues and servers.

Proof.

Fix \(\mathcal {G}=([n],\mathbf {\mu },\mathbf {\lambda })\), and let \(\mathbf {p}\) denote an optimizer in the definition of \(\mathsf {OPTIND}(\mathcal {G})\) with a minimal number of tight subsets achieving the maximum rate. If \(\mathsf {OPTIND}(\mathcal {G})=0\), the claim is trivial with \(\mathcal {G}^{\prime }=\mathcal {G}\) and \(\mathbf {p}^{\prime }=\mathbf {p}\) by the reasoning prior to the proposition statement, so assume that it is strictly positive. Let \(S_1\) denote the maximal tight subset, as seen in the previous section, with size \(n^{\prime }\). Let \(\mathbf {\lambda }^{\prime }\) be the restriction of \(\mathbf {\lambda }\) to \(S_1\), \(\mathbf {\mu }^{\prime }=\mathbf {\mu }\), and \(\mathbf {p}^{\prime }\) the restriction of \(\mathbf {p}\) to \(S_1\). Note that \(\mathbf {\mu }^{\prime }\) clearly still strictly majorizes \(\mathbf {\lambda }^{\prime }\).

We claim that \(\mathcal {G}^{\prime }=(S_1,\mathbf {\mu }^{\prime },\mathbf {\lambda }^{\prime })\) is such that \(\mathbf {p}^{\prime }\) is a Nash equilibrium of \(\mathcal {G}^{\prime }\). Note that this implies the conclusion of the proposition. Fix an \(i\in S_1\) and suppose by way of contradiction that there exists a deviation to \(p^{\prime \prime }_i\) such that the rate of queue i strictly improves. If i is part of every tight subset of the original game, this implies that the strategy vector \(\mathbf {p}^*=((1-t)p_i+tp^{\prime \prime }_i,\mathbf {p}_{-i})\) for \(t\gt 0\) sufficiently close to zero in the original game attains a strictly smaller value than \(\mathbf {p}\). This holds because for each subset \(S\subseteq [n]\) containing i, the function \(f(S\vert \cdot)\) is a linear function of i’s randomization, fixing \(\mathbf {p}_{-i}\). Because all tight subsets contain i by assumption, and all of the other finitely many subsets attain strictly smaller rates, we may take \(t\gt 0\) sufficiently small to obtain a smaller value for \(\mathsf {OPTIND}(\mathcal {G})\), violating the definition of \(\mathbf {p}\).

Therefore, suppose instead that there exists a tight subset S of \(S_1\) in the original game such that \(i\not\in S\). Again, suppose by way of contradiction that there exists a deviation to \(p^{\prime \prime }_i\) such that the rate of queue i strictly improves in \(\mathcal {G}^{\prime }\). Considering the same deviation as in the previous case, note that there is no contradiction to the optimality of the value attained by \(\mathbf {p}\) because the rate of the subset S is unchanged. But for sufficiently small \(t\gt 0\), the number of tight subsets attaining the maximum rate strictly decreases as the rate of every subset containing i strictly decreases, contradicting the minimality of the number of tight subsets attaining the maximum rate in \(\mathbf {p}^*\). As \(i\in S_1\) was arbitrary, we conclude that \(\mathbf {p}^{\prime }\) is a Nash equilibrium of \(\mathcal {G}^{\prime }\) attaining the same value of the optimizer in \(\mathcal {G}\).□

We now give a simple argument showing that the price of independence, and therefore also the price of stability, is at most \(\frac{e}{e-1}\). This will later also be a consequence of our bound on the price of anarchy, but the argument is substantially simpler.

Proposition 3.12.

The price of independence/stability of the patient queuing game is at most \(\frac{e}{e-1}\).

Proof.

Let \(\mathcal {G}=([n],\mathbf {\mu },\mathbf {\lambda })\) such that \(\mathbf {\mu }\succeq \mathbf {\lambda }\). We first assume that \(m=n\). Recall that this implies there exists a doubly stochastic matrix \(X\in \mathbb {R}^{n\times n}\) such that \(X\mathbf {\mu }\ge \mathbf {\lambda }\), where the inequality is taken elementwise. We set \(x_i\) to be the ith column of X and consider the strategy \(\mathbf {x}=(x_1,\ldots ,x_n)\). We write \(x_{ij}\) for the jth element of \(x_i\), which is the \((j,i)\) entry of the matrix X.

Consider any subset \(S\subseteq [n]\). Our goal is to show that

\begin{equation*} f(S\vert \mathbf {x}) = \frac{\sum _{j=1}^n \mu _j\left(1-\prod _{i\in S}(1-x_{ij})\right)}{\sum _{i\in S}\lambda _i}\ge \frac{e-1}{e}. \end{equation*}

If this holds, then scaling down \(\mathbf {\lambda }\) by \(\frac{e}{e-1}\) and using the same strategy tuple yields the desired result.

We have

\begin{equation*} \prod _{i\in S} (1-x_{ij})\le \exp \left(-\sum _{i\in S} x_{ij}\right)\le 1-\frac{e}{e-1}\sum _{i\in S} x_{ij}. \end{equation*}

Here, we use the inequalities \(1-x\le \exp (-x)\) and \(\exp (-x)\le 1-\frac{e-1}{e}x\) on \([0,1]\), which can be easily obtained by convexity and observing that the left and right sides are equal at the endpoints. We also use the doubly stochasticity of X so that the sum is indeed at most 1. Plugging this in, we find that

\begin{equation*} f(S\vert \mathbf {x}) \ge \left(\frac{e-1}{e}\right)\frac{\sum _{j=1}^n \mu _j\sum _{i\in S}x_{ij}}{\sum _{i\in S}\lambda _i}=\left(\frac{e-1}{e}\right)\frac{\sum _{i\in S} \sum _{j=1}^m\mu _jx_{ij}}{\sum _{i\in S}\lambda _i}\ge \left(\frac{e-1}{e}\right), \end{equation*}

where we use the fact that \(\sum _{j=1}^n \mu _j x_{ij}\ge \lambda _i\) (this is the ith inequality in \(X\mathbf {\mu }\ge \mathbf {\lambda }\)).

To extend this argument to arbitrary \(m,n\), simply observe that if \(m\gt n\), one may simply truncate to the n largest servers. If instead \(m\lt n\), one can instead argue as follows: consider \(\mathbf {\mu }^{\prime }\) that extends \(\mathbf {\mu }\) by padding with zeros. The preceding argument shows that there exists the desired strategy vector \(\mathbf {x}^*\) attaining the bound where queues may send to dummy servers. In the original game, any strategy profile \(\mathbf {x}\) such that \(\mathbf {x}_{ij}\ge \mathbf {x}_{ij}^*\) for all \(j\le m\) and \(\mathbf {x}_{ij}=0\) for \(j\gt m\) attains at least as good a value as the dummy servers confer no value to the rate functions. This concludes the proof.□

It is not difficult to see that the price of independence is strictly greater than 1; almost any nontrivial example will certify this. For instance, the following example shows that it is at least \(9/8\).

Example 3.13.

Consider \(\mathbf {\mu }=(1,1/2)\) and \(\mathbf {\lambda }=(3/4,3/4)\). Let \(x,y\in [0,1]\) denote the probabilities that the first and second queues send to the top server. Then it is easy to see that the optimal rate that can be attained is

\begin{equation*} \max _{x,y\in [0,1]} 1-\left(\min \left\lbrace \frac{1/2+x/2}{3/4},\frac{1/2+y/2}{3/4},\frac{1/2+x+y-3xy/2}{3/2}\right\rbrace \right). \end{equation*}

One can show that the minimum is attained at \((x,y)=(1/3,1)\) so that the rate of both queues is \(1-(8/9)=1/9\). This implies that the system remains unstable unless the arrival rates are scaled down by \(9/8\), so the price of independence is at least \(9/8\).

We suspect that the price of independence is significantly less than \(\frac{e}{e-1}\), but we leave the determination of its value as an interesting analytical question for future work.

Question 3.

Determine the solution of

\begin{equation*} \max _{\mathbf {\mu }\succeq \mathbf {\lambda }}\min _{\mathbf {x}\in (\Delta ^{m-1})^n} \max _{S\subseteq [n]} \left[ 1 - \frac{\sum _{j=1}^m \mu _j\left(1-\prod _{i\in S}(1-x_{ij})\right)}{\sum _{i\in S}\lambda _i}\right]. \end{equation*}

4 Price of Anarchy of Patient Queuing

In this section, we turn to the game-theoretic problem of understanding what condition ensures the stability at any equilibrium profile assuming Theorem 3.3—we return to proving that this is valid in the next section. By considering the quality of deviations by a queue at a Nash equilibrium to a single other server, it is possible to show that the price of anarchy is always at most 2. With more careful, continuous deviations, we in fact show that this factor is loose with patience, and the correct bound is \(\frac{e}{e-1}\approx 1.58\).

The following simple example shows that this is the best possible constant factor: fix \(\epsilon \gt 0\) small and suppose that there are n queues and n servers, with \(\mathbf {\lambda }=(1-1/e+\epsilon ,\ldots , 1-1/e+\epsilon)\) and \(\mathbf {\mu }=(1,\ldots ,1)\), and \(\mathbf {p}\) has every queue uniformly mixing among the servers. It is easy to see by symmetry that this system is Nash with \(S_1=[n]\), for if a queue deviates from this uniform distribution, this does not change the worst ratio in the algorithm. Moreover, for any fixed \(\epsilon \gt 0\), as \(n\rightarrow \infty\), this system becomes unstable. One can check that

\begin{equation*} f(S_1\vert \mathbf {p})= f([n]\vert \mathbf {p})=\frac{\sum _{j=1}^n (1-\prod _{i=1}^n (1-1/n))}{n(1-1/e+\epsilon)}\rightarrow \frac{1-1/e}{1-1/e+\epsilon }\lt 1 \end{equation*}

so that \(r(S_1)=\max _i c_i(\mathbf {p})\gt 0\). Our main result asserts that this is the worst case, where every queue is maximally colliding subject to being Nash. Concretely, we prove the following instance-dependent bound from which the claimed factor immediately follows.

Corollary 4.2.

Let \(\mathbf {p}\) be a Nash equilibrium of \(\mathcal {G}\), and suppose that for each \(1\le k\le n\),

\begin{equation*} \sum _{j=1}^{\min \lbrace k,m\rbrace } \mu _j \ge \left(\frac{e}{e-1}\right)\sum _{i=1}^k \lambda _i. \end{equation*}

Then every queue is weakly stable at \(\mathbf {p}\). In particular, the price of anarchy of the patient queuing game is exactly \(\frac{e}{e-1}\).

Proof.

In Theorem 4.1, for any \(k\le n\), we may set \(x_j=1\) for \(1\le j\le k\). Note that \((1-x_j/k)^k\lt e^{-1}\), so if \(\mathbf {\mu }\) majorizes \(\mathbf {\lambda }\) by a factor of at least \(\frac{e}{e-1}\), Theorem 4.1 implies that \(f(S_1\vert \mathbf {p})\ge 1\). From Lemma 3.6 and Theorem 3.3, we conclude that all queues are weakly stable.□

We now prepare for the proof of Theorem 4.1. The idea will be to continuously deform the Nash profile toward a highly symmetrized strategy vector while only weakly decreasing \(f(S_1)\). At the end of this process, we obtain a lower bound on this value at Nash. To do this deformation and ensure monotonicity of the growth rate, we must at some point use the Nash property. The difficulty lies in the form of the f functions; recall that as \(S_1\) is the set of all queues growing at the fastest rate as the union of all tight subsets, it can have many proper tight subsets, and each queue \(i\in S_1\) thus has to locally optimize all of the functions \(f(S\vert \mathbf {p})\) with \(S\ni i\) simultaneously at Nash (see Figure 2 for an interesting example). In particular, if queue \(i\in S_1\) at Nash, one possible deviation may weakly decrease \(f(S)\) for some tight subset \(S\ni i\), whereas another deviation may weakly decrease \(f(S^{\prime })\) for some different tight subset \(S^{\prime }\ni i\). In other words, each queue may be constrained by multiple different objective functions at Nash, making it difficult to generically argue about why any given deviation decreases performance. We overcome this barrier via Proposition 4.3 by connecting the incentives for each queue in \(S_1\) with the structure guaranteed by Lemma 3.5.

Fig. 2.

Proposition 4.3.

Let \(\mathbf {p}\) be any arbitrary strategy vector by the queues, and without loss of generality, let \([k]\) be the maximal tight subset after relabeling. Then, for some \(s\gt 0\), there exists a level partition of \([k]\) into s levels with the following property: if a queue \(i\in [k]\) belongs to a level-\(\ell\) subset, then for any deviation by i that shifts probability mass from one server to another and does not increase \(f(S)\) for some tight subset \(S\ni i\), there exists a tight subset \(S^{\prime }\) containing all queues at all levels \(j\le \ell\) such that \(f(S^{\prime })\) must not increase.

Proof.

We construct the desired decomposition of \([k]\) by instead recursively considering minimal tight subsets. In particular, let the level-1 subsets be the set of all minimal tight subsets (notice that by Lemma 3.5 and minimality, each subset of queues must be disjoint and disjointly mixing). Note that to clear packets at the rate of these subsets, they all must have priority over any other queues. Then given the level-j subsets for all \(j\le \ell\) for some \(\ell \ge 1\), we recursively define the level-\((\ell +1)\) subsets as the set of minimal tight subsets of the remaining queues conditioned on all subsets at lower levels having priority. The same argument as used by Lemma 3.5 implies that these subsets at each level must all be disjoint and disjointly mixing.¹⁵ It is easy to verify by iteratively using Fact 3.1 that for any level \(\ell\), the union of all subsets at levels \(j\lt \ell\) with any subset of the level-\(\ell\) subsets must be tight. As \([k]\) is tight and any tight subset must be contained in \([k]\) by the maximality guaranteed in Lemma 3.5, it is immediate that this decomposition exhausts \([k]\).

Let \(i\in [k]\) belong to some level-\(\ell\) subset, for some \(\ell \ge 1\). Suppose that i has some deviation from server j to another server \(j^{\prime }\) that causes \(f(S)\) to not increase for some tight subset \(S\subseteq [k]\). As \(f(S)\) is a linear function in the randomizations by queue i holding each other queue fixed, this must hold for any sufficiently small deviation as well. By expanding the partial derivatives \(\frac{\partial f(S)}{\partial p_{ij}}\) and \(\frac{\partial f(S)}{\partial p_{ij^{\prime }}}\) and clearing denominators, this holds if and only if

\begin{equation} \mu _j\prod _{r\in S\setminus \lbrace i\rbrace }(1-p_{rj})\ge \mu _{j^{\prime }}\prod _{r\in S\setminus \lbrace i\rbrace }(1-p_{rj^{\prime }}). \end{equation}

(5)

Define \(V^i_{\le \ell }\) to be the set of all queues at level at most \(\ell\) that do not belong to the level-\(\ell\) subset i belongs to. Then \(V^i_{\le \ell }\) is tight, and by construction \(S\setminus V^i_{\le \ell }\) is nonempty as i belongs to this set. As tight subsets are preserved under unions and intersections, we have that \(S\cup V^i_{\le \ell }\) and \(S\cap V^i_{\le \ell }\) are tight. On the one hand, we have by Fact 3.1 that

\begin{equation*} f(S) = f(S\cap V^i_{\le \ell })\oplus f(S\setminus V^i_{\le \ell }\vert S\cap V^i_{\le \ell }). \end{equation*}

On the other hand, we have

\begin{equation*} f(S\cup V^i_{\le \ell }) = f(V^i_{\le \ell }) \oplus f(S\setminus V^i_{\le \ell }\vert V^i_{\le \ell }). \end{equation*}

Comparing these two expressions, the left-hand sides are minimal, as well as the first terms on the right, which by Fact 2.1 implies that the latter terms are equal. But we know from definition that

\begin{equation*} f(S\setminus V^i_{\le \ell }\vert S\cap V^i_{\le \ell })\ge f(S\setminus V^i_{\le \ell }\vert V_{\le \ell }), \end{equation*}

with equality if and only if \(V^i_{\le \ell }\setminus S\) disjointly mixes from \(S\setminus V^i_{\le \ell }\) in \(\mathbf {p}\). As i belongs to this set, we can combine this with Equation (5) to see

\begin{equation*} \mu _j\prod _{r\in (V^i_{\le \ell }\cup S)\setminus \lbrace i\rbrace }(1-p_{rj})= \mu _j\prod _{r\in S\setminus \lbrace i\rbrace }(1-p_{rj})\ge \mu _{j^{\prime }}\prod _{r\in S\setminus \lbrace i\rbrace }(1-p_{rj^{\prime }})\ge \mu _{j^{\prime }}\prod _{r\in (V^i_{\le \ell }\cup S)\setminus \lbrace i\rbrace }(1-p_{rj^{\prime }}). \end{equation*}

This implies that \(f(S\cup V^i_{\le \ell })\) must also not increase from this deviation. The last thing to check is that the level-\(\ell\) subset i belongs to is contained in S so that \(S\cup V^i_{\le \ell }\) contains all queues up to level \(\ell\). But this follows because \(S\setminus V^i_{\le \ell }\) consists of queues at a level at least \(\ell\) and intersects this level-\(\ell\) subset nontrivially at i, and therefore must contain it by minimality in our construction. Therefore, for this choice of deviation by queue i, the subset \(S\cup V^i_{\le \ell }\) certifies the desired claim.□

With this result, we may finally return to the proof of Theorem 4.1.

Proof of Theorem 4.1

Let \(\mathbf {p}\) be any Nash equilibrium, and suppose that \(S_1\) is the maximal tight subset. If \(f(S_1)\ge 1\), then we are done, so suppose that \(f(S_1)\lt 1\). The Nash assumption then implies that any deviation by a queue in \(S_1\) cannot decrease the value of each tight subset it is part of; note that we need the \(f(S_1)\lt 1\) assumption, as incentives are about the rates, and when \(f(S_1)\ge 1\), the rate may remain 0 even if f decreases.

For convenience, reindex and relabel so that \(\vert S_1\vert = k\) and \(S_1=[k]\). Fix any \(\mathbf {x}\in \mathbb {R}^{m}_{\ge 0}\) such that \(\sum _{i=1}^m x_i=k\). It suffices to show that

\begin{equation*} f([k]\vert \mathbf {p}) \ge \frac{\sum _{j=1}^m\mu _j (1-(1-x_j/k)^k)}{\sum _{i=1}^k \lambda _i}. \end{equation*}

From now on, we omit the dependence on \(\mathbf {p}\) in f unless explicitly needed.

Consider the level partition of \([k]\) guaranteed by Proposition 4.3, and suppose that there are s levels. We continuously deform the Nash solution while monotonically decreasing \(f([k])\) so that at the end of this process, we have a lower bound on the value of Nash. Given any strategy profile \(\mathbf {p}\), we say that a server j is oversaturated if \(\sum _{i\in [k]} p_{ij}\gt x_j\) and undersaturated if \(\sum _{i\in [k]} p_{ij}\lt x_j\). We will continuously move probability mass from the queues from oversaturated to undersaturated servers. If no server is oversaturated, we will be done; notice that if a server is oversaturated, an easy averaging argument implies that some server must be undersaturated.

Suppose that there exists some oversaturated server. Let \(i\in [k]\) be a queue at level s, the top level. If i nontrivially sends to an oversaturated server j so that \(p_{ij}\gt 0\), we continuously decrease \(p_{ij}\) and increase \(p_{ij^{\prime }}\) for some undersaturated server \(j^{\prime }\), until either j stops being oversaturated, \(j^{\prime }\) stops being undersaturated, or \(p_{ij}\) hits zero. We claim that this deformation cannot increase \(f([k])\). To see this, observe that because \(\mathbf {p}\) is Nash, we know that any deviation by i from one server it is nontrivially mixing at to another cannot increase \(f(S)\) for all tight subsets S containing i, hence there must be some \(S\ni i\) such that \(f(S)\) does not increase. But then Proposition 4.3 implies that some subset \(S^{\prime }\) containing all queues up to level s must have \(f(S^{\prime })\) not increase either. As \([k]\) is the only such subset, this deformation could not have actually increased \(f([k])\).

Moreover, we claim that we can do this for all level-s queues one by one without increasing \(f([k])\). Although the intermediate profiles are not Nash, because we only move probability mass from oversaturated to undersaturated servers, each oversaturated queue only has at most the same probability mass as it did at Nash while each undersaturated queue only has additional probability mass compared to what it had at Nash. As we have shown any such deviation by a level-s server from an oversaturated queue to an undersaturated queue at Nash cannot increase \(f([k])\), and now deviations are only worse at this intermediate stage while deforming the level-k queue strategies, each such deformation still cannot increase \(f([k])\). Therefore, we can continuously shift all probability mass from level-s queues at oversaturated servers to undersaturated servers while never increasing \(f([k])\).

Suppose that we have now done this for all levels at least \(\ell +1\) for some \(\ell \lt s\) while not increasing \(f([k])\), and we want to continue this process at level \(\ell\). Let \(\mathbf {p^{\prime }}\) be this intermediate strategy vector, where we note that for any queue i below level \(\ell +1\), \(\mathbf {p^{\prime }}_i=\mathbf {p}_i\). Again, if no server is oversaturated, we are done. Otherwise, suppose that some queue i at level \(\ell\) still sends to an oversaturated server j, and we again try to decrease \(p^{\prime }_{ij}\) and increase \(p^{\prime }_{ij^{\prime }}\) for some undersaturated server \(j^{\prime }\) as before until the same stopping criterion. We must show that this too cannot increase \(f([k])\).

Suppose otherwise that it did indeed increase \(f([k])\) with respect to \(\mathbf {p^{\prime }}\). For a contradiction, it suffices to show that this implies that this same deviation, with respect to the original Nash solution \(\mathbf {p}\), must have increased \(f(S)\) for every subset S containing all queues up to level \(\ell\). This is sufficient to obtain a contradiction, as then Proposition 4.3 implies that every subset containing i improves at Nash with respect to this deviation, which violates the Nash property.

To prove this claim, let \(S\subseteq [k]\) be any arbitrary tight subset at Nash containing all queues up to level \(\ell\). Because we assume that this deviation improves \(f([k])\) with respect to \(\mathbf {p^{\prime }}\), it follows from taking partial derivatives that

\begin{equation*} \mu _j \prod _{r\in [k]\setminus \lbrace i\rbrace } (1-p^{\prime }_{rj})\lt \mu _{j^{\prime }} \prod _{r\in [k]\setminus \lbrace i\rbrace } (1-p^{\prime }_{rj^{\prime }}). \end{equation*}

However, note that at \(\mathbf {p^{\prime }}\), as j is still oversaturated, \(p^{\prime }_{rj}=0\) for all queues r that are at strictly higher levels. As all queues at level \(\ell\) and below have \(\mathbf {p^{\prime }}_i=\mathbf {p}_i\) and S contains all such queues, this inequality implies that

\begin{equation*} \mu _j \prod _{r\in S\setminus \lbrace i\rbrace } (1-p_{rj})\lt \mu _{j^{\prime }} \prod _{r\in [k]\setminus \lbrace i\rbrace } (1-p^{\prime }_{rj^{\prime }}). \end{equation*}

Moreover, as \(j^{\prime }\) is undersaturated, we must have \(p^{\prime }_{rj^{\prime }} \ge p_{rj^{\prime }}\) for all \(r\in [k]\) from the construction of this process, and removing terms in the product only increases the right side. Therefore, we deduce that

\begin{equation*} \mu _j \prod _{r\in S\setminus \lbrace i\rbrace } (1-p_{rj})\lt \mu _{j^{\prime }} \prod _{r\in S\setminus \lbrace i\rbrace } (1-p_{rj^{\prime }}). \end{equation*}

This implies that this deviation also increases \(f(S)\) at Nash. As S was an arbitrary tight subset containing all queues up to level \(\ell\), the claim is proved. By the reduction described earlier, this is a contradiction, and therefore \(f([k])\) must further decrease with respect to \(\mathbf {p}^{\prime }\). The argument extends analogously at all intermediate points of this process at level \(\ell\) by the same reasoning as before.

Therefore, by induction, it follows that we may continuously deform probability mass from oversaturated servers to undersaturated servers while only decreasing \(f([k])\). At the end of this process, there cannot be any oversaturated servers, as otherwise the process could have continued. In particular, if \(\mathbf {p^{\prime \prime }}\) is the final probability vector at the end of this process, we have shown that \(\sum _{i\in [k]} p^{\prime \prime }_{ij}=x_{j}\) for all servers j and that \(f([k]\vert \mathbf {p}^{\prime \prime })\le f([k]\vert \mathbf {p})\). We have

\begin{align*} f([k]\vert \mathbf {p})&=\frac{\sum _{j=1}^m \mu _j(1-\prod _{i=1}^k(1-p_{ij}))}{\sum _{i=1}^k \lambda _i} &&\text{(by definition)}\\ &\ge \frac{\sum _{j=1}^m \mu _j(1-\prod _{i=1}^k(1-p^{\prime \prime }_{ij}))}{\sum _{i=1}^k \lambda _i} &&\text{(by construction)}\\ &\ge \frac{\sum _{j=1}^m \mu _j(1-(1-x_j/k)^k)}{\sum _{i=1}^k \lambda _i} &&\text{(as symmetric profile maximizes product).} \end{align*}

The second inequality holds because given \(\sum _{i\in [k]} p^{\prime \prime }_{ij}\), the maximizer of \(\prod _{i\in [k]}(1-p^{\prime \prime }_{ij})\) is attained when each term is equal. As \(\mathbf {x}\) was arbitrary, we may take the maximum of the right side over all \(\mathbf {x}\) satisfying the constraints and then the minimum over k. As \(\mathbf {p}\) was an arbitrary Nash profile, this concludes the proof.□

5 No-Regret Learning in Strategic Queuing Systems

Our work in the previous sections provides tight bounds for strategic but stationary behavior in this queuing model. However, it is unclear how agents might reach such a state, let alone as the result of natural dynamics. In this section, we therefore turn to a dynamical setting where agents adaptively update their behavior using no-regret learning algorithms. In the no-regret queuing system we analyze, each queue aims to get their packets served as efficiently as possible. At each timestep, they aim to maximize the probability that their packet gets served, measuring value for a time period with the number of packets served. The effective rate of service at a server j depends on its rate \(\mu _{j}\), as well as on the competition for service by the other queues.

In this section, we will model queues as learners, and will make the assumption that for a parameter w, each queue will satisfy a no-regret learning guarantee on the number of packets served during each window of w time length. To formalize this assumption, we need a few definitions.

Note that all of these random variables are with respect to the same sample path; the \(X^{i,j}_t\) will depend on all previous randomizations and choices by the queues, as these implicitly yield the priorities of the queues. In other words, \(\text{Reg}_i(w)\) of queue i on some fixed window of length w is defined to be the (random) difference between the number of packets queue i cleared on these w periods compared to the backward-looking number of packets she would have cleared had she simply always sent to the best single server, where the comparison is in hindsight to the best single server on the realized sample path—that is, for each time t and server j, we have \(X^{i,j}_t=1\) if at time t server j was successful (regardless of if a packet was sent there then), and the packet that queue i sent had priority over any packet sent there at that time.¹⁶

We now make the following assumption on the regret of queuing strategies.

Assumption 5.1 (Queues Satisfy High-probability No-regret).

All queues select servers using a strategy or algorithm satisfying the following no-regret guarantee: given fixed \(\delta \in (0,1)\) and a fixed window with length w, the regret \(\text{Reg}_i(w)\) of queue i on this given window of w consecutive timesteps satisfies \(\text{Reg}_i(w)\le \varphi _{\delta }(w)\) with probability at least \(1-\delta\) only over their own randomizations during this window, where \(\varphi _{\delta }(w)=o(w)\) is some explicit function. Here, \(o(\cdot)\) hides constant factors depending on \(\delta\) and m, but not w.

Moreover, we require that the choices of the queue depend only on their past bandit feedback and their past history of ages, but not on their history of queue sizes.

For instance, this assumption holds with EXP3.P.1 with the form of the regret scaling like \(\sqrt {wm\ln (m w/\delta)}=o(w)\) [3]. Note that this high-probability guarantee is possible in our setting even in the priority model where the random variables of success at each server from the perspective of each queue at each timestep depend on all previous actions (via timestamps and priorities), as well as the actions of the other queues in the current time period (e.g., see the discussion in Section 9 of Auer et al. [3]). This property is standard and necessary in applying learning algorithms to multi-player games. Using EXP3.P.1 ensures that such a guarantee holds simultaneously for each window of this length, and not only a fixed window, so the players would not have to be aware which window of size w is relevant for our analysis. This is true as EXP3.P.1 mixes in uniform exploration to guarantee that the probabilities remain high enough throughout the algorithm, allowing us to adapt the classical no-regret analysis starting at any timestep for the window of the next w timesteps.

5.1 Stability of No-Regret Queuing Systems

Our second main result shows that if the queuing system has enough slack and all queues satisfy an appropriate high-probability no-regret guarantee, then the queuing system is strongly stable. To this end, we make the following feasibility assumption asserting that a queuing system with servers scaled down by \(1/2\) would remain feasible.

Assumption 5.2 (Feasibility).

There exists \(\eta \gt 0\) such that for all \(k\in [n]\),

\begin{equation*} \frac{1}{2}(1-\eta)\sum _{j=1}^{\min \lbrace k,m\rbrace } \mu _j\ge \sum _{i=1}^k \lambda _i. \end{equation*}

We will use \(\eta\) to denote the maximum such value that this inequality holds. The parameter \(\eta\) controls the quality of learning required for our results. With these definitions in order, we may formally state our main result on stability of no-regret learners.

Theorem 5.2.

Suppose that Assumption 5.2 holds with parameter \(\eta\). Set the following parameters: \(\delta = \frac{\eta }{4}, \epsilon =\frac{\delta \mu _1}{4n}\), and \(\epsilon _i = \frac{\epsilon }{\lambda _i}\) for \(1\le i\le n\). Let w be large enough so that the following holds:

\begin{equation} n\varphi _{\frac{\eta }{256n}}(w)+n\le \frac{w\delta \mu _1}{4}, \end{equation}

(7)

and over w steps of our process the sum of the geometric variables of subsequent packet arrivals, and the sum of the Bernoulli server successes concentrate around their expectation with an error probability of at most \(\eta /256\) with the preceding values of \(\delta ,\epsilon _i,\mathbf {\lambda }\), and \(\mathbf {\mu }\) (see the required inequalities at Equations (10) and (11)).¹⁷

Then, if each queue satisfies Assumption 5.1 on each consecutive time interval of length w with probability at least \(1-\frac{\eta }{256n}\), then the random process \(\mathbf {T}_t\) under these dynamics is strongly stable.

The technical tool we use to prove Theorem 5.2 is the following result of Pemantle and Rosenthal [29]:

Theorem 5.3 (Theorem 1 in the Work of Pemantle and Rosenthal [29]).

To apply this theorem, we must define an appropriate potential function of queue ages that satisfies the negative drift and bounded moments condition. We define for \(\tau \in \mathbb {N}\) the following potential functions that will feature prominently in the proof:

(8)

(9)

In other words, \(\Phi _{\tau }(\cdot)\) denotes the expected number of total packets in the system aged above \(\tau\), conditioned just on the ages \(\mathbf {T}_t\).

Remark 5.1.

This analysis crucially relies on using the Geometric system as opposed to the Bernoulli system. The reason is that the preconditions in Theorem 5.3 must hold conditioned on any history, even low probability events. In the Bernoulli system, this would require us to condition on too much. For instance, it is possible for there to be a queue with a very old packet and yet have received no other packets until the current timestep. Although unlikely to actually ever happen, this is a perfectly valid potential history. In this case, clearing this packet would lead to an arbitrarily large pth moment change, as her age would drastically decrease, and therefore the moment condition of Theorem 5.3 would be violated. Although intuitively this should only help the stability of the random process, the conditions in Theorem 5.3 are subtle (see the discussion in the work of Pemantle and Rosenthal [29].

Even if that obstruction can be managed suitably, the extra conditional information in the Bernoulli system highly complicates the analysis, as then one must reason about the priorities of the packets that have already been received before the present timestep. These could in principle be quite arbitrary. We avoid these complications in the Geometric system, as it allows us to only condition on current ages.

We now provide a simple construction showing that a partial converse holds: \(\frac{1}{2}\) is the best constant that can appear in Assumption 5.2 for a similar no-regret condition to be sufficient for stability as in Theorem 5.2. To set it up, let \(w_k=k^2\) for each \(k\ge 1\). Then we have the following theorem.

Theorem 5.4.

Partition time \(t=0,1,\ldots\) into consecutive windows, where the kth window has length \(w_k=k^2\). Then there exists a family of queuing systems with n queues and servers for each \(n\ge 1\) satisfying Assumption 5.2 with \(\frac{1}{2}+o_n(1)\) in place of \(\frac{1}{2}\) with the following properties: almost surely, each queue has zero regret on all but at most finitely many of the windows, but the system is not weakly stable.

The formal details are slightly technical, and therefore the proof is deferred to Appendix E, but the high-level idea is quite natural: for each \(n\ge 1\), consider the following system on n queues and n servers where we set \(\mathbf {\lambda }=(\frac{n+1}{n^2},\ldots ,\frac{n+1}{n^2})\) and \(\mathbf {\mu }=(1,\frac{n-1}{n^2},\ldots ,\frac{n-1}{n^2})\). Consider the strategy where every queue always sends to the rate 1 server. It is easy to see purely from expectations that the queue lengths are unbounded in expectation, as the sum of arrival rates strictly exceeds 1. However, it is intuitive that this strategy will “usually” be zero regret; if all the queues are similarly aged at the start of some window, then they should expect to clear roughly \(1/n\) fraction of the time on this window using this strategy, which strictly exceeds what they would get at any other server. We use standard concentration arguments and the Borel-Cantelli lemma to argue that this situation will happen all but finitely many times almost surely, thereby obtaining the claim.

6 Asymptotic Convergence

In this section, we finally return to proving Theorem 3.3, which asserted the equivalence between the long-run rates of queue ages that form the cost functions for the patient queuing game and the output of the algorithm given in Section 3.1. The high-level idea is to show that this identity holds for all \(i\in S_1\), then \(S_2\), and so on. The first step is showing that the maximum queue age grows by at most the desired rate on each long-enough window with high probability.

Proposition 6.1.

Fix \(\epsilon \gt 0\). For any integer \(a\in \mathbb {N}\), let \(w=a\cdot \lceil \frac{6}{\epsilon }\rceil ^{n-1}\). Suppose that it holds at time t that \(\max _{i\in [n]} T^i_t\ge w\cdot f_1\). Then

\begin{equation*} \max _{i\in [n]} T^i_{t+w} -\max _{i\in [n]} T^i_t \le (1-(1-\epsilon)\cdot f_1)\cdot w \end{equation*}

with probability at least \(1-C_1\exp (-C_2a)\), where \(C_1,C_2\gt 0\) are absolute constants depending only on \(n,\epsilon ,\mathbf {\lambda }, \mathbf {\mu },\mathbf {p}\), but not on a. More generally, for each \(s\ge 1\), if \(\max _{i\not\in U_{s-1}} T^i_t\ge w\cdot f_s\), then

\begin{equation*} \max _{i\not\in U_{s-1}} T^i_{t+w} -\max _{i\not\in U_{s-1}} T^i_t \le (1-(1-\epsilon)\cdot f_s)\cdot w \end{equation*}

with probability at least \(1-C_1\exp (-C_2a)\), where \(C_1,C_2\gt 0\) are absolute constants depending only on \(n,\epsilon ,\mathbf {\lambda }, \mathbf {\mu },\mathbf {p}\), but not on a.

We prove Proposition 6.1 in Appendix F using a delicate argument relying on a variety of concentration bounds. The key insight is that if a subset of much older queues S is likely to have priority on a long window of length w, the quantity \(w\cdot f(S_1)\cdot \lambda (S)\) is a lower bound on the expected number of packets cleared collectively by S on this window by definition of \(S_1\). The analysis gets complicated when there are multiple old queues, as although we know these queues collectively have priority over all young queues, we must argue about priorities within this subset to bound the growth of the maximum queue age. We deal with this by induction by carefully chaining together large windows to obtain a win-win analysis.

For Proposition 6.1 to yield anything useful, we will need a corresponding lower bound asserting roughly that if groups have separated according to what the algorithm asserts, then the aging rate of the average queue in a group grows at the conjectured rate. To that end, we prove the following result in Appendix F, which shows that if we have the conjectured aging separation between groups \(U_{k-1}\) and \(S_k\), then some weighted combination of the queue ages in \(S_k\) (whose significance will prove apparent momentarily) must rise quickly.

Proposition 6.2.

For any \(s\ge 1\) and any fixed \(\epsilon \gt 0\), the following holds: suppose that at time t, it holds that

\begin{equation*} \min _{i\in U_s} T^i_t -\max _{i\in S_{s+1}} T^i_t\ge 2\cdot \frac{w}{\lambda _n}. \end{equation*}

Then with probability \(1-A\exp (-Bw)\) where \(A,B\gt 0\) are absolute constants not depending on w, we have

\begin{equation*} \sum _{i\in S_{s+1}} \lambda _i T^i_{t+w}-\sum _{i\in S_{s+1}} \lambda _iT^i_t\ge (1-(1+\epsilon)f_{s+1})\cdot w \cdot \left(\sum _{i\in S_{s+1}} \lambda _i\right). \end{equation*}

Moreover, for any fixed \(\epsilon \gt 0\), with probability at least \(1-A\exp (-Bw)\) it holds that

\begin{equation*} \sum _{i\in S_1} \lambda _iT^i_w \ge (1-(1+\epsilon)f_1)\cdot w\cdot \left(\sum _{i\in S_1} \lambda _i\right). \end{equation*}

Combined with Proposition 6.1, this will allow us to conclude that because the average queue and oldest queue in \(S_1\) ages at the desired rate almost surely, all queues in \(S_1\) must age at this rate almost surely. To extend this analysis to lower groups \(S_2\) and so on, we will use a similar analysis to show that the maximum age over every queue not in \(S_1\) grows at most like \(r(S_2)\). Then, because we know that every queue in \(S_1\) grows by a \(r(S_1)\gt r(S_2)\) rate, almost surely, eventually every queue in \(S_1\) will be much older than every queue not in \(S_1\), giving priority. We leverage this fact to show again that the average queue in \(S_2\) must grow by at least \(r(S_2)\), and therefore every queue in \(S_2\) grows at this rate almost surely. The proof for the lower groups \(S_3,\ldots\) is completely analogous. We now proceed to make this argument formal to prove Theorem 3.3.

Proof of Theorem 3.3

By the dominated convergence theorem, it suffices to show the second equality. We will show that the desired statement holds for each \(i\in S_1\), then \(S_2\), and so on. We first treat the case that the last outputted group \(S_k\) satisfies \(g_k=0\), or equivalently that \(f_k\ge 1\). Fix \(\epsilon \gt 0\) and partition time into consecutive windows of size \(w_{\ell }=\ell \cdot \lceil \frac{6}{\epsilon }\rceil ^{n-1}\). Let \(W_{\ell }=\sum _{q=1}^{\ell -1} w_{q}\) be the time period at the beginning of the \(\ell\)th window, and note that \(w_{\ell }=\Theta (W_{\ell }^{1/2})\).

Consider the following events for \(\ell =1,2,\ldots\)

\[\begin{gather*} A_{\ell }=\left\lbrace \max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell +1}} -\max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell }} \gt (1-(1-\epsilon)\cdot f_{k})\cdot w_{\ell }\right\rbrace \quad \quad B_{\ell }=\left\lbrace \max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell }}\ge w_{\ell }\cdot f_{k}\right\rbrace \\ C_{\ell } = A_{\ell }\cap B_{\ell }. \end{gather*}\]

Clearly, \(\Pr (C_{\ell })\le \Pr (A_{\ell }\vert B_{\ell })\). But by Proposition 6.1, we know that for some constants \(C_1,C_2\gt 0\) independent of \(\ell\),

\begin{equation*} \Pr (A_{\ell }\vert B_{\ell })\le C_1\exp (-C_2\cdot \ell). \end{equation*}

The sum over \(\ell\) is thus finite, and so the first Borel-Cantelli lemma (Lemma B.1) implies that almost surely at most finitely many of the \(C_{\ell }\) occur. Equivalently, almost surely, for all but finitely many of the \(\ell\), either

\begin{equation*} \max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell +1}} -\max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell }} \le (1-(1-\epsilon)\cdot f_{k})\cdot w_{\ell } \quad \text{ or }\quad \max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell }}\lt w_{\ell }\cdot f_{k}. \end{equation*}

Observe that for each of the intervals where the latter holds, the value during the interval is at most \(w_{\ell }\cdot f_{k} + w_{\ell +1}=O(W_{\ell }^{1/2})\). In particular, it is not difficult to see that almost surely \(\max _{i\in [n]\setminus U_{k-1}} T^i_{W_{\ell }}\) is either \(o(W_{\ell })\), in which case we are done, or grows by at most a rate of \((1-(1-\epsilon)\cdot f_{k})\cdot w_{\ell }\). Either way, as \(\epsilon \gt 0\) was arbitrary, we may take \(\epsilon \rightarrow 0\) to deduce the desired result that almost surely¹⁸

\begin{equation*} \limsup _{t\rightarrow \infty }\frac{\max _{i\in [n]\setminus U_{k-1}} T^i_{t}}{t}=0=g_{k}, \end{equation*}

using \(f_{k}\ge 1\). As ages of queues are nonnegative, the lower bound of 0 is trivial.

Now we turn to the rest of the groups, and we now assume that \(g_{k}\gt 0\) . We do this inductively. For \(S_1\), fix \(\epsilon \gt 0\) and define

\begin{equation*} A_t=\left\lbrace \sum _{i\in S_1} \lambda _iT^i_t \lt (1-(1+\epsilon)f_1)\cdot t\cdot \left(\sum _{i\in S_1} \lambda _i\right)\right\rbrace . \end{equation*}

By Proposition 6.2, we know that \(\Pr (A_t)\le A\exp (-Bt)\) for some constants \(A,B\gt 0\) independent of t. Therefore, \(\sum _{t=1}^{\infty } \Pr (A_t)\lt \infty\), from which the Borel-Cantelli lemma implies that almost surely, for all but finitely many t,

\begin{equation*} \sum _{i\in S_1} \lambda _iT^i_t\ge (1-(1+\epsilon)f_1)\cdot t\cdot \left(\sum _{i\in S_1} \lambda _i\right). \end{equation*}

Taking \(\epsilon \rightarrow 0\), we obtain almost surely that

\begin{equation} \liminf _{t\rightarrow \infty } \frac{\sum _{i\in S_1} \lambda _iT^i_t}{t}\ge g_1\cdot \left(\sum _{i\in S_1} \lambda _i\right). \end{equation}

(25)

Next, note that deterministically, we have from Fact 2.1 that

\begin{equation} \min _{i\in S_1} T^i_t\le \frac{\sum _{i\in S_1} \lambda _iT^i_t}{\sum _{i\in S_1} \lambda _i}\le \max _{i\in S_1} T^i_t. \end{equation}

(26)

In particular, we deduce that almost surely,

\begin{equation} \liminf _{t\rightarrow \infty } \frac{\max _{i\in S_1} T^i_t}{t}\ge g_1\gt 0. \end{equation}

(27)

For the upper bound, let the \(w_{\ell }\) and \(W_{\ell }\) be as before, and now define

\[\begin{gather*} A_{\ell }=\left\lbrace \max _{i\in [n]} T^i_{W_{\ell +1}} -\max _{i\in [n]} T^i_{W_{\ell }} \gt (1-(1-\epsilon)\cdot f_1)\cdot w_{\ell }\right\rbrace \quad \quad B_{\ell }=\left\lbrace \max _{i\in [n]} T^i_{W_{\ell }}\ge w_{\ell }\cdot f_1\right\rbrace \\ C_{\ell } = A_{\ell }\cap B_{\ell }. \end{gather*}\]

Again, \(\Pr (C_{\ell })\le \Pr (A_{\ell }\vert B_{\ell })\). By Proposition 6.1, a now routine application of the Borel-Cantelli lemma implies that almost surely, for all but finitely many \(\ell\), either

\begin{equation*} \max _{i\in [n]} T^i_{W_{\ell +1}} -\max _{i\in [n]} T^i_{W_{\ell }} \lt (1-(1-\epsilon)\cdot f_1)\cdot w_{\ell } \quad \text{ or }\quad \max _{i\in [n]} T^i_{W_{\ell }}\lt w_{\ell }\cdot f_1. \end{equation*}

But the latter event cannot happen infinitely often with positive probability, as this would imply \(\max _{i\in [n]} T^i_{W_{\ell }}=o(W_{\ell })\) infinitely often with nonzero probability, which violates (27). Therefore, it must be the case that almost surely, for all but finitely many \(\ell\), the former event holds. This implies that almost surely,

\begin{equation*} \limsup _{t\rightarrow \infty } \frac{\max _{i\in [n]} T^i_{t}}{t}\le (1-(1-\epsilon)\cdot f_1); \end{equation*}

taking \(\epsilon \rightarrow 0\) implies that

\begin{equation*} \limsup _{t\rightarrow \infty } \frac{\max _{i\in [n]} T^i_{t}}{t}\le g_1. \end{equation*}

As clearly the left side is an upper bound for the \(\limsup\) of only those queues in \(S_1\), almost surely

\begin{equation*} \limsup _{t\rightarrow \infty } \frac{\max _{i\in S_1} T^i_{t}}{t}\le g_1. \end{equation*}

Combining this with (27), we finally deduce that almost surely,

\begin{equation*} \lim _{t\rightarrow \infty } \frac{\max _{i\in S_1} T^i_{t}}{t}= g_1. \end{equation*}

By Equations (25) and (26), we can also conclude that almost surely the same holds for the minimum \(i\in S_1\). Thus, \(r_i(\mathbf {p})=g_1\) for all \(i\in S_1(\mathbf {p})\) by definition of \(g_1\), proving the theorem for all queues in \(S_1\).

We now show how to extend this inductively to higher values of k with \(g_{k}\gt 0\). Suppose that we have shown for all \(i\in U_{k-1}\) that the desired almost sure limit holds, and now consider \(S_{k}\). A completely analogous argument using the windows \(w_{\ell }\) as earlier with Proposition 6.1 via the Borel-Cantelli lemma implies that almost surely,

\begin{equation} \limsup _{t\rightarrow \infty } \frac{\max _{i\in [n]\setminus U_{k-1}} T^i_t}{t}\le g_{k}. \end{equation}

(28)

Now, with these same windows, fix \(\epsilon \gt 0\) and let

\[\begin{gather*} A_{\ell } = \left\lbrace \sum _{i\in S_{k}} \lambda _i T^i_{W_{\ell +1}}-\sum _{i\in S_{k}} \lambda _iT^i_{W_{\ell }}\lt (1-(1+\epsilon)f_{k})\cdot w_{\ell } \cdot \left(\sum _{i\in S_{k}} \lambda _i\right)\right\rbrace \\ B_{\ell } = \left\lbrace \min _{i\in U_{k-1}} T^i_{W_{\ell }} -\max _{i\in S_{k}} T^i_t\ge 2\cdot \frac{w_{\ell }}{\lambda _n}\right\rbrace \\ C_{\ell } = A_{\ell }\cap B_{\ell }. \end{gather*}\]

Another completely analogous application of Proposition 6.2 and the Borel-Cantelli lemma implies that almost surely, at most finitely many of the \(C_{\ell }\) occur. In other words, almost surely, for all but finitely many \(\ell\), either

\begin{equation*} \sum _{i\in S_{k}} \lambda _i T^i_{W_{\ell +1}}-\sum _{i\in S_{k}} \lambda _iT^i_{W_{\ell }}\ge (1-(1+\epsilon)f_{k})\cdot w_{\ell } \cdot \left(\sum _{i\in S_{k}} \lambda _i\right) \quad \text{ or }\quad \min _{i\in U_{k-1}} T^i_{W_{\ell }} -\max _{i\in S_{k}} T^i_t\lt 2\cdot \frac{w_{\ell }}{\lambda _n}. \end{equation*}

But the latter event cannot happen infinitely often with any nonzero probability by virtue of the inductive hypothesis and (28), as \(g_{k-1}\gt g_{k}\) by Lemma 3.6, which implies that these timestamps cannot be so close infinitely often. Therefore, it must be the case that for all but finitely many of the \(\ell\), the former event holds. As usual, this immediately implies that

\begin{equation*} \liminf _{t\rightarrow \infty }\frac{\sum _{i\in S_{k}} \lambda _i T^i_{t}}{t}\ge (1-(1+\epsilon)f_{k})\left(\sum _{i\in S_{k}} \lambda _i\right). \end{equation*}

Again taking \(\epsilon \rightarrow 0\) thus implies that almost surely,

\begin{equation*} \liminf _{t\rightarrow \infty }\frac{\sum _{i\in S_{k}} \lambda _i T^i_{t} }{t}\ge g_{k}\left(\sum _{i\in S_{k}} \lambda _i\right), \end{equation*}

which again coupled with (28) and Fact 2.1 yields that almost surely,

\begin{equation*} \lim _{t\rightarrow \infty } \frac{\min _{i\in S_{k}} T^i_{t}}{t}=\lim _{t\rightarrow \infty } \frac{\max _{i\in S_{k}} T^i_{t}}{t}= g_{k}. \end{equation*}

The extension to all \(i\in S_{k}\) follows in the same manner as before by comparing with the average, completing the proof.□

Observe that Theorem 3.3 rather strongly and explicitly characterizes the linear almost sure asymptotic growth rates of each queue for any choices of randomizations. Our main result in Theorem 4.1 showed that with a small slack in the system capacity, each queue will be guaranteed sublinear asymptotic growth almost surely in any equilibrium. Although our objective function emphasizes the physical interpretation for each queue as an asymptotic linear growth rate, these incentives impose that queues are indifferent between sublinear growth rates. One could instead define the game just using the \(f_k\) quantities directly rather than taking the max with 0 as is needed to argue about the asymptotic growth rates via r. If queues started out equally backed up, the \(f_k\) quantities measure the linear speed at which their ages descend to zero. In this setting, we provide the following stronger conclusion, whose proof is deferred to Appendix F.

Corollary 6.3.

Fix \(\mathbf {p}\) and suppose that for some group \(S_k\) output by the algorithm, \(f_k\gt 1\), so that \(1-f_k\lt 0\). Then, for each \(i\in S_k\), \(T^i_t\) is strongly stable.

7 Conclusion and Open Questions

In this article, we have studied the outcomes of strategic queuing under multiple behavioral assumptions. When considering a patient queuing game restricted to stationary equilibria, we were able to use careful probabilistic arguments to establish the incentive structure of the game. We show that the correct bicriteria factor in this setting is \(\frac{e}{e-1}\) via a novel deformation argument. Turning then to no-regret dynamics, we show that the factor remains constant but degrades to a factor of 2. In total, our work shows that price of anarchy–style bounds are attainable in such repeated games with state in both equilibrium outcomes and learning outcomes, but that these results require substantially different techniques from the existing literature.

Our work leaves open several enticing questions. The most immediate technical question left open is Question 3 to determine the price of stability of the patient queuing game. Doing so would give a more fine-grained analysis of the cost of selfishness rather than independence here. Moreover, although our restriction to time-independent policies in the patient queuing game exhibits quite rich behavior while enabling us to completely characterize the game-theoretic properties, perhaps there is a larger space of strategies where similar results hold.

But a more pressing direction, in our view, is that the gap we show in equilibrium and no-regret outcomes suggests that no-regret behavior can (surprisingly) yield reasonable outcomes here, but may not necessarily be the correct notion of agent behavior in repeated games that carry strong interdependencies between rounds. It is an interesting question as to whether a natural form of non-cooperative learning can arrive at a Nash equilibrium of the patient version or at least result in stable outcomes without reaching an equilibrium. To that end, it may be necessary to explore the theoretical properties of more powerful learning algorithms in such settings that get the best of both worlds: namely, balancing current rewards while maintaining long-run perspective. Whether such results are possible is an exciting open direction in resolving some of the deficiencies of traditional price of anarchy results.

Equally as important, we wonder if it is possible to obtain a more general theory of price of anarchy in stochastic games. If so, it would be quite interesting to see whether such an analysis extends naturally to a reasonable learning dynamic as in the classical setting. The techniques in this work appear quite specialized to the queuing setting and differ significantly in the equilibrium and learning settings. We leave a systematic exploration of this question to future work.

Acknowledgments

We thank Christos Papadimitriou for valuable discussions and insights in the early stages of this work.

Footnotes

Note that “stability” here refers to game-theoretic stability of behavior, not the stability notion in our queuing system.

See Section 2.2 for precise definitions and the full specification.

We remark that in other queuing systems, the servers may also have a bounded size queue and would only send back (or drop) packets when they no longer fit on the queue; our simpler model without server queues makes the tradeoffs we want to study cleaner. A packet sent to a server is either served or returned and offers instantaneous feedback to the learning algorithms of the queues, in contrast to the bit more informative but delayed feedback available in real systems.

⁴

While not immediately obvious, we will show in Appendix C.3 that strong stability implies almost sure subpolynomial growth.

⁵

Namely, this random process mixes to a stationary distribution on \(\mathbb {N}\) with geometrically decreasing tail probabilities.

⁶

We remark that the analogous statement for weak stability holds if the inequalities in Equation (3) only weakly hold by the same proof.

⁷

Majorization is often used for probabilities, and hence defined so that the total sums are equal; we omit this condition in our work.

⁸

This can also easily be seen directly using Theorem 5.3. Negative drift when exceeding \(Q_t=0\) is obvious, and as queue sizes can change by at most n in total between steps, increments are clearly bounded in \(L^p\) for any \(p\ge 0\).

⁹

By this, we mean that conditioned on the (randomized) strategies of all other queues in a given timestep, each queue sends to a server with the highest probability of success.

¹⁰

Note that although \(T^i_t\ge 0\) by definition, it is possible that \(\widetilde{T}^i_t\gt t\). The interpretation of this is that the queue has cleared all of her packets at time t and will receive her next one at time \(t=\widetilde{T}^i_t\), or equivalently, in \(\widetilde{T}^i_t-t\) steps in the future from the perspective at time t.

¹¹

In other words, the price of anarchy of a centrally feasible system is the supremum of values of \(\alpha\) such that when all queue arrival rates are scaled down by \(\alpha\), there nonetheless exists a Nash equilibrium and some queue that suffers nonzero linear aging.

¹²

We show in Lemma 3.5 that this choice is unique and canonical.

¹³

We remark that because all queues are stable in the coordinated solution and thus no deviation is profitable, this strategy can be interpreted as a correlated equilibrium of the patient queuing game where at each time, each agent is told to play a (random, coordinated) Dirac strategy (and possibly abstaining from sending in that round). Therefore, if we allow for public randomness to induce coordination, the price of stability with respect to the larger class of correlated equilibria is simply 1.

¹⁴

For any fixed k, it is not difficult to determine the optimal value of \(\mathbf {x}\) to give the tightest lower bound. It suffices to maximize the numerator, which is concave. By standard Karush-Kuhn-Tucker conditions at optimality, for all \(j,j^{\prime }\in [m]\) such that \(x_j\gt 0\), we must have \(\mu _j (1-x_j/k)^{k-1} = \mu _{j^{\prime }} (1-x_{j^{\prime }}/k)^{k-1}\), and \(x_{\ell }=0\) for all lower indices.

¹⁵

In the example given before of a system with multiple tight subsets, the level-1 subset is the subset of two queues that split between the inner and outer servers. The level-2 subsets are the two singleton sets of the outer queues sending each to their own outer server. Notice that these two subsets indeed disjointly mix.

¹⁶

Note that this notion of regret does not take into account that had a queue i cleared a packet at at server j instead of another queue \(i^{\prime }\), at a later time \(i^{\prime }\) would have had an older packet and therefore higher priority.

¹⁷

Note that this is possible as \(\varphi _{\gamma }(w)=o(w)\) for any fixed \(\gamma\), as well as the exponential decay in w of the concentration bounds in Equations (29) and (30) in the appendix.

¹⁸

For any \(\epsilon \gt 0\), we have directly shown that the statement holds for t of the form \(t=W_{\ell }\) for \(\ell \ge 1\). For any t such that \(W_{\ell }\le t\lt W_{\ell +1}\), \(T^i_t\) cannot be more than \(w_{\ell }\) from the value at \(W_{\ell }\), as ages can increase by at most 1 in each period. This implies that on any such intermediate time, the difference in the numerator from the value at \(t=W_{\ell }\) is \(O(w_{\ell })=O(t^{1/2})=o(t)\) and thus vanishes in the limit when divide by t, so the \(\limsup\) may be taken over all t, not just the sparsified sequence.

Appendices

A Basic Inequalities

Fact A.1.

Suppose that \(a,b,c\ge 0\) and that \(a-b\le c.\) Then

\begin{equation*} \sqrt {a}-\sqrt {b}\le \min \bigg \lbrace \frac{c}{2\sqrt {b}},\sqrt {c}\bigg \rbrace . \end{equation*}

Proof.

The first inequality arises from rearranging and concavity of the square root function:

\begin{equation*} \sqrt {a}\le \sqrt {b}\sqrt {1+c/b}\le \sqrt {b}(1+c/2b). \end{equation*}

The second follows from assuming without loss of generality that \(a\ge b\) and observing that the claim is implied by \(\sqrt {a}-\sqrt {b}\le \sqrt {a-b},\) which holds by squaring and simple algebra.□

Fact A.2.

Suppose that \(a,b,c\ge 0\). Then \(a-b\ge c\) implies that

\begin{equation*} \sqrt {a}-\sqrt {b}\ge \frac{c}{2\sqrt {a}}. \end{equation*}

Proof.

\begin{equation*} a-b=(\sqrt {a}-\sqrt {b})(\sqrt {a}+\sqrt {b})\ge c\Rightarrow \sqrt {a}-\sqrt {b}\ge \frac{c}{\sqrt {a}+\sqrt {b}}\ge \frac{c}{2\sqrt {a}}. \end{equation*}

□

Recall that we defined the following two weighted \(\ell _p\) norms on \(\mathbb {R}^n\): \(\Vert \mathbf {x}\Vert _{\mathbf {\lambda },1}\triangleq \sum _{i=1}^n \lambda _i \vert x_i\vert\) and \(\Vert \mathbf {x}\Vert _{\mathbf {\lambda },2}\triangleq \sqrt {\sum _{i=1}^n \lambda _i x_i^2}.\)

Lemma A.1.

For all \(x\in \mathbb {R}^n\),

\begin{equation*} \sqrt {\lambda _n}\Vert x\Vert _{\mathbf {\lambda },2}\le \Vert x\Vert _{\mathbf {\lambda },1}\le \sqrt {\sum _{i=1}^n \lambda _i}\Vert x\Vert _{\mathbf {\lambda },2}. \end{equation*}

Proof.

For the first inequality,

\begin{equation*} \Vert x\Vert _{\mathbf {\lambda },1}^2=\sum _{i,j=1}^n \lambda _i\lambda _j\vert x_i\vert \vert x_j\vert \ge \sum _{i=1}^n \lambda _i^2x_i^2 \ge \lambda _n \sum _{i=1}^n \lambda _ix_i^2 =\lambda _n\Vert x\Vert _{\mathbf {\lambda },2}^2. \end{equation*}

The second is a routine application of Cauchy-Schwarz:

\begin{equation*} \sum _{i=1}^n \lambda _i \vert x_i\vert =\sum _{i=1}^n \sqrt {\lambda _i}(\sqrt {\lambda _i}\vert x_i\vert)\le \sqrt {\sum _{i=1}^n\lambda _i}\sqrt {\sum _{i=1}^n \lambda _i x_i^2}=\sqrt {\sum _{i=1}^n\lambda _i}\Vert x\Vert _{\mathbf {\lambda },2}. \end{equation*}

□

B Probability Tools

We will use the following concentration results throughout the article.

Lemma B.1 (First Borel-Cantelli Lemma, Theorem 2.3.1 of Durrett [15]).

Let \(A_1,A_2,\ldots\) be a sequence of events with \(\sum _{i=1}^{\infty }\Pr (A_i)\lt \infty\). Then with probability 1 at most finitely many of the \(A_i\) occur.

Lemma B.2 (Azuma-Hoeffding).

Let \(\lbrace \mathcal {F}_k\rbrace _{k\le n}\) be any filtration and let \(A_k,B_k,\Delta _k\) satisfy the following conditions:

(1)

\(\Delta _k\) is \(\mathcal {F}_k\)-measurable and \(\mathbb {E}[\Delta _k\vert \mathcal {F}_{k-1}]=0\). In other words, the \(\Delta _k\) form a martingale difference sequence.

(2)

\(A_k,B_k\) are \(\mathcal {F}_{k-1}\)-measurable and satisfy \(A_k\le \Delta _k\le B_k\) almost surely.

Then

\begin{equation*} \Pr \left(\sum _{k=1}^n \Delta _k \ge t\right)\le \exp \left(\frac{-2t^2}{\sum _{k=1}^n \Vert B_k-A_k\Vert _{\infty }}\right). \end{equation*}

Lemma B.3 (Etemadi, Theorem 22.5 in the Work of Billingsley [6]).

Suppose that \(X_1,\ldots ,X_n\) are independent random variables. Then for any \(x\ge 0\),

\begin{equation*} \Pr \left(\max _{1\le k\le n} \vert Z_k\vert \ge 3x\right)\le 3\max _{1\le i\le n}\Pr \big (\vert Z_k\vert \ge x\big), \end{equation*}

where \(Z_k\) is the kth partial sum of the \(X_i\) (i.e., \(Z_k=\sum _{i=1}^k X_i\)).

Lemma B.4 (Theorem 1 in the Work of Witt [42]).

Let \(X_1,\ldots ,X_n\) be i.i.d. \(\text{Geom}(\lambda)\) random variables so that \(\mathbb {E}[X_i]=\frac{1}{\lambda }\). Let \(s=\frac{n}{\lambda ^2}\) and \(Z_n=\sum _{i=1}^n X_i\). Then for all \(\delta \gt 0\),

\begin{equation*} \Pr \left(Z_n-\frac{n}{\lambda }\lt -\delta \right)\le \exp \left(\frac{-\delta ^2}{2s}\right) \quad \text{and} \quad \Pr \left(Z_n-\frac{n}{\lambda }\gt \delta \right)\le \exp \left(\frac{-\delta }{4}\min \lbrace \delta /s,\lambda \rbrace \right). \end{equation*}

Corollary B.5.

Under the assumptions and notation of Lemma B.4, for any \(\epsilon \in [0,1]\),

\begin{equation*} \Pr \left(\max _{1\le j\le n} \left|Z_j-\frac{j}{\lambda }\right|\gt \frac{\epsilon n}{\lambda }\right)\le 6 \exp \left(\frac{-\epsilon ^2n}{36}\right). \end{equation*}

Proof.

First apply Lemma B.4 for each partial sum \(Z_j\) and \(\delta =\epsilon n/\lambda\). By considering the cases \(j\le \epsilon n\) and \(j\gt \epsilon n\), respectively, it follows for all \(j\le n\) that \(\min \lbrace \delta /s,\lambda \rbrace \ge \epsilon \lambda .\) Lemma B.4 implies that for all \(j\le n\),

\begin{equation*} \Pr \left(\left|Z_j-\frac{j}{\lambda }\right|\gt \frac{\epsilon n}{\lambda }\right)\le 2\exp \left(\frac{-\epsilon ^2 n}{4}\right). \end{equation*}

Now apply Lemma B.3 using the centered random variables \(Y_i=X_i-1/\lambda\). This yields

\begin{equation*} \Pr \left(\max _{1\le j\le n} \left|Z_j-\frac{j}{\lambda }\right|\gt \frac{\epsilon n}{\lambda }\right)\le 3\max _{1\le j\le n}\Pr \left(\left|Z_j-\frac{j}{\lambda }\right|\gt \frac{\epsilon n}{3\lambda }\right) \le 6 \exp \left(\frac{-\epsilon ^2n}{36}\right). \end{equation*}

□

Corollary B.6.

Let \(\lbrace G^i_k\rbrace _{i\in [n],k\in [w]}\) be a family of independent geometric random variables such that for all \(i,k\), \(G^i_k\sim \text{Geom}(\lambda _i)\). Let \(Z_{q}^i=\sum _{k=1}^{q} G^i_k\). Then for any \(\epsilon \in [0,1]\),

\begin{equation} \Pr \left(\exists i\in [n],q\in [w]: \left|Z_{q}^i-\frac{q}{\lambda _i}\right|\ge \frac{\epsilon w}{\lambda _i}\right)\le 6n\exp \left(\frac{-\epsilon ^2 w}{36}\right). \end{equation}

(29)

Proof.

This follows immediately from Corollary B.5 and a union bound.□

Lemma B.7.

Let \(\lbrace I^j_k\rbrace _{j\in [m], k\in [w]}\) be an independent Bernoulli ensemble such that for all \(j,k\)\(I^j_k\sim \text{Bern}(\mu _{j})\) with \(\mu _1\ge \mu _2\ge \ldots \ge \mu _m\). Then for all \(\delta \in [0,1]\),

\begin{equation} \Pr \left(\exists q\in [m]:\sum _{j=1}^{q}\sum _{k=1}^w I^j_k\le (1-\delta)w\left(\sum _{j=1}^{q} \mu _{j}\right)\right)\le m\exp \left(\frac{-\delta ^2 w\mu _1}{2}\right). \end{equation}

(30)

Proof.

The multiplicative form of the Chernoff bound immediately implies that for each \(q\in [m]\),

\begin{equation*} \Pr \left(\sum _{j=1}^{q}\sum _{k=1}^w I^j_k\le (1-\delta)w\left(\sum _{j=1}^{q} \mu _{j}\right)\right)\le \exp \left(\frac{-\delta ^2 w\sum _{j=1}^{q} \mu _{j}}{2}\right)\le \exp \left(\frac{-\delta ^2 w\mu _1}{2}\right). \end{equation*}

The result then follows from a union bound over all \(q\in [m]\).□

The following characterizes the moments of geometric distributions.

Lemma B.8.

Let \(X\sim \text{Geom}(\lambda)\). Then for all \(k\ge 1\), \(\mathbb {E}[X^k]\le \frac{c_k}{\lambda ^k}\), where \(c_k\) is a constant depending on k but not on \(\lambda\).

Lemma B.9.

Let \(X\sim \text{Bin}(n,p)\), where \(p\in (0,1]\) is considered fixed. Then, for any fixed integer \(k\ge 0\), \(\mathbb {E}[X^k]\asymp n^k\), where the implicit constants depend on p and k, but not n.

Proof.

By definition, \(X=\sum _{i=1}^n X_i\), where \(X_i\sim \text{Bern}(p)\) are i.i.d. We clearly have

\begin{equation*} X^k=\sum _{1\le i_1,\ldots ,i_k\le n} \prod _{j=1}^k X_{i_j}. \end{equation*}

Note that products of these indicator variables remain indicator random variables, and it is easy to see that for any indices \(1\le i_1,\ldots ,i_k\le n\), \(p^k\le \mathbb {E}\left[\prod _{j=1}^k X_{i_j}\right]\le p.\) Taking expectations and summing, we obtain \(p^k n^k\le \mathbb {E}[X^k]\le pn^k\).□

C Proofs Deferred from Section 2

C.1 Central Feasibility

Lemma C.1.

Suppose that \(\mathbf {x}\) majorizes \(\mathbf {y}\). Then for any nonnegative, monotone decreasing sequence \(z_1\ge \ldots \ge z_n\ge 0\),

\begin{equation*} \sum _{i=1}^n z_i x_i\ge \sum _{i=1}^n z_i y_i. \end{equation*}

Proof.

Multiply the equations by appropriate scalars in the definition and sum to obtain the inequality.□

Lemma C.2 (Theorem B.2. in the Work of Marshall et al. [27]).

Suppose that \(\mathbf {x},\mathbf {y}\in \mathbb {R}^n_+\) are in sorted order, have equal total sums, and \(\mathbf {x}\) majorizes \(\mathbf {y}\). Then \(\mathbf {y}=P\mathbf {x}\) for some doubly stochastic matrix P.

Corollary C.3.

Suppose that \(\mathbf {x},\mathbf {y}\in \mathbb {R}^n_+\) are in sorted order and \(\mathbf {x}\) strictly majorizes \(\mathbf {y}\). Then there exists a doubly stochastic matrix P such that \(P\mathbf {x}\) is strictly greater than \(\mathbf {y}\) componentwise.

Proof.

By continuity and strict majorization, it is possible to scale all entries of \(\mathbf {x}\) by nonnegative factors strictly less than 1 to obtain a vector \(\mathbf {x^{\prime }}\) that majorizes \(\mathbf {y}\) and so that the sums are equal. Applying the previous result, we have \(P\mathbf {x^{\prime }}=\mathbf {y}\) for some doubly stochastic P. But \(P\mathbf {x}\) strictly exceeds \(P\mathbf {x^{\prime }}\) componentwise, giving the result.□

Theorem C.4 (Theorem 2.2, Restated).

\begin{equation} \sum _{j=1}^{\min \lbrace k,m\rbrace } \mu _{j} \gt \sum _{i=1}^{k} \lambda _i. \end{equation}

(31)

C.2 Impossibility for No-Priority Model

Next, we give the promised example that the simpler queuing model is too weak to give any sub-polynomial bicriterion result.

Theorem C.5 (Theorem 2.4, Restated).

In the alternate model, for large enough n, there exists a centrally feasible queuing system with n queues and servers with the following property: the system remains feasible even if \(\mathbf {\mu }\) is scaled down by \(\Omega (n^{1/3})\) and it is possible for all queues to be in a Nash equilibrium of the stage game at each timestep (and in particular, satisfy no-regret properties as in Assumption 5.1), yet the system is not strongly stable.

Proof.

Let \(\lambda _1=2/n^{1/3}\), whereas \(\lambda _2=\ldots =\lambda _n=1/n^{2/3}\); let \(\mu _1=1/2\) and \(\mu _2=\ldots =\mu _n=c/n^{1/3}\), where \(c=c(n)=\Theta (1)\) is such that

\begin{equation*} \frac{1}{n^{1/3}+2}\lt \frac{c}{n^{1/3}}\lt \frac{1}{n^{1/3}}. \end{equation*}

We proceed by considering an adversarial, centralized scheduler that suggests actions for each queue in each round while ensuring that each agent achieves no regret. The schedule is as follows: in each round, the scheduler chooses \(n^{1/3}/2-1\) of the low rate agents arbitrarily to send to the unique high rate server, if that many low rate agents have packets, as well as the high rate queue. All other low rate agents send to distinct low rate servers. If fewer that \(n^{1/3}/2-1\) low rate servers are active, then the scheduler schedules all active queues to the high rate server.

By standard Chernoff bounds, the number of low rate queues that receive a packet in a given round is at least \(n^{1/3}/2-1\) with probability at least \(1-\exp (-\Omega (n^{1/3}))\), so with at least this probability there are enough low rate queues for the first case to hold. The preceding inequalities show that in such a round where there are at least \(n^{1/3}/2-1\) active low agents, the suggested schedule is a Nash equilibrium, and the probability of success for each queue sending to the high server is exactly \(1/n^{1/3}\) in such rounds. When this does not occur, the suggested schedule is still Nash, and the probability of success for any queue sending to the high rate server is at most \(1/2\). Therefore, in any timestep where the high rate queue has a packet, by the Law of Total Probability, her probability of clearing is upper bounded by

\begin{equation*} \frac{1}{n^{1/3}}+\exp (-\Omega (n^{1/3}))\cdot (1/2)\lt \frac{1.5}{n^{1/3}}, \end{equation*}

where the inequality is for sufficiently large n. As a result, in expectation \(Q^1_{t+1}-Q^1_t\) is lower bounded by a nonzero constant (depending on n but not on t), and therefore \(Q^1_t\) diverges with t in expectation by telescoping. This shows that this system is not strongly stable, even though every queue plays a Nash strategy at each time. Note that this system would still be centrally feasible if all queues were scaled up by a factor of \(\Theta (n^{1/3})\), giving the result.

To see that this is no regret with high probability on each fixed window, define \(X^{i,j}_t\) to be the indicator variable that queue i would succeed in clearing a packet at server j at time t, and let \(\sigma _i(t)\) be the identity of the server that queue i chooses at time t. Note that if queue i is empty at time t, then \(X^{i,j}_t=0\) for all j and \(\sigma _i(t)\) can be arbitrary. Then define \(\Delta ^{i,j}_t=X^{i,\sigma _i(t)}_t-X^{i,j}_t\). By the preceding Nash discussion, \(\mathbb {E}[\Delta ^{i,j}_t\vert \mathcal {F}_{t-1}]\ge 0\) for all t in both cases as described earlier, where \(\mathcal {F}_{t}\) denotes the past history of this process up to time t. This holds regardless of if queue i is really sending in that round (in which case the quantity is just 0).

Therefore, as \(\vert \Delta ^{i,j}_t\vert \le 2\) surely, we may apply the Azuma-Hoeffding inequality (Lemma B.2) to see that on any fixed window of length w (and reindexing time so that time progresses \(t=1,\ldots ,w\) on this window for notational ease),

\begin{equation} \Pr \left(\sum _{t=1}^w \Delta ^{i,j}_t\le -\alpha \right)\le \exp \left(\frac{-\alpha ^2}{w}\right). \end{equation}

(32)

By a union bound, for each queue i, this holds for all servers \(j\in [m]\) with probability at most \(m\cdot \exp \left(-\alpha ^2/w\right)\). Note that if \(\alpha =\sqrt {w\ln (m/\delta)}\), this quantity is at most \(\delta\). As such, by definition of regret, on any fixed period of length w, with probability at least \(1-\delta\), this strategy satisfies \(\text{Reg}_i(w)\le \sqrt {w\ln (m/\delta)}=o(w)\), as needed.□

C.3 Proofs Deferred from Section 2.4

We now show the desired relations between our notions of stability and almost sure subpolynomial growth. We need the following technical lemma.

Lemma C.6.

Suppose that a nonnegative sequence of random variables \(X_1,X_2,\ldots\) satisfies \(X_t\le X_{t-1}+L\) surely for some fixed \(L\ge 0\) and any t, as well as the moment condition \(\sup _{t}\mathbb {E}[X_t^p]\le C_p\) for some constant \(C_p\ge 0\) for each \(p\ge 1\). Then, for any \(c\gt 0\), almost surely, \(X_t=o(t^c)\).

Proof.

Fix \(\epsilon \gt 0\). It suffices to prove the lemma for \(0\lt c\lt 1\), so take \(0\lt d\lt c\) and set \(p=d^{-1}\). We do this by proving the desired asymptotics on a conveniently chosen subsequence, then interpolate to intermediate values. Indeed, by Markov’s inequality, for each \(k\ge 1\),

\begin{equation*} \Pr (X_{k^{1+\epsilon }}\gt k^{(1+\epsilon)d})= \Pr (X_{k^{1+\epsilon }}^p\gt k^{1+\epsilon })\le \frac{C_p}{k^{1+\epsilon }}. \end{equation*}

Summing over k and observing that the right side is summable, we deduce from the first Borel-Cantelli lemma that almost surely, for all sufficiently large k, \(X_{k^{1+\epsilon }}\le k^{(1+\epsilon)d}.\) To extend this to all large enough t, suppose that t is such that \(k^{1+\epsilon }\le t\lt (k+1)^{1+\epsilon }\). By the one-sided boundedness, we know that almost surely, for such t and all large enough k,

\begin{equation*} X_t\le L\cdot (t-k^{1+\epsilon })+X_{k^{1+\epsilon }} \le L\cdot (1+\epsilon)(k+1)^{\epsilon } + k^{(1+\epsilon)d} \le L\cdot (1+\epsilon)(t^{1/(1+\epsilon)}+1)^{\epsilon }+t^d, \end{equation*}

where the bound on \(t-k^{1+\epsilon }\) arises from the mean value theorem. Clearly, this last expression is \(O(t^{\epsilon /(1+\epsilon)}+t^d)\). As this holds for arbitrary \(\epsilon \gt 0\), we may take \(\epsilon\) small enough so that this expression is \(o(t^c)\), as claimed.□

Lemma C.7 (Lemma 2.7, Restated).

If the Bernoulli and Geometric models characterize the same queuing dynamics, then strong (weak) stability in the Bernoulli system is equivalent to strong (weak) stability in the Geometric system. Moreover, if this holds, then strong stability in either system implies almost sure subpolynomial growth.

Proof.

Suppose that the dynamics are as stated so that the Bernoulli and Geometric dynamics yield completely equivalent processes. Then the distribution of \(Q^i_t\) conditioned on the value of \(T^i_t\) at time t is \(\text{Bin}(T^i_t,\lambda _i)\). Note that by the Law of Iterated Expectations, \(\mathbb {E}[(Q^i_t)^p]=\mathbb {E}[\mathbb {E}[(Q^i_t)^p\vert T^i_t]]\). But by Lemma B.9, \(\mathbb {E}[(Q^i_t)^p\vert T^i_t]\asymp (T_i^t)^p\) up to absolute constants depending only on p and \(\lambda _i\). Therefore, by taking expectations, the Bernoulli system and the Geometric system have equivalent stability properties. The second claim now follows from either form of strong stability from Lemma C.6, noting that either \(Q_t\) or \(T_t\) can increase by at most \(L=n\) in each timestep.□

D Proofs and Notation for Section 3

For convenience, we repeat the definitions of various expressions considered in Section 3.

Symbol	Formula	Definition
\(\mathbf {\lambda }\)		Vector of length n of queue arrival rates in descending order
\(\Delta ^{m-1}\)		Probability simplex over the m element set
\(\mathbf {\mu }\)		Vector of length m of server success rates in descending order
\(\mathbf {p}\)		Vector of queue randomizations over servers in \((\Delta ^{m-1})^n\)
\(\widetilde{T_t^i}\)		Timestamp of the oldest packet at queue i at timestep t
\(T_t^i\)	\(\max \lbrace 0,t-\widetilde{T_t^i}\rbrace\)	Age of queue i at time t
\(\alpha (S\vert \mathbf {p},\mathbf {\mu },S^{\prime })\)	\(\sum _{j=1}^m \mu _j\prod _{i\in S^{\prime }}(1-p_{i,j})(1-\prod _{i\in S}(1-p_{i,j}))\)	Expected number of packets cleared by queues in S if all have packets in a round and have priority over all queues except for those in \(S^{\prime }\) and each such queue also has packets in the round
\(\lambda (S)\)	\(\sum _{i\in S}\lambda _i\)	Sum of arrival rates of queues in S
\(f(S\vert \mathbf {p},\mathbf {\mu },\mathbf {\lambda },S^{\prime })\)	\(\alpha (S\vert \mathbf {p},\mathbf {\mu },S^{\prime })/\lambda (S)\)	Ratio of expected packets cleared by S with priority over all queues except \(S^{\prime }\) to the total arrival rate of S
\(S_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)		The kth subset output in the algorithm of Section 3.2
\(U_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)	\(\cup _{\ell =1}^k S_{\ell }(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)	Union of the first k outputted subsets in the algorithm of Section 3.2
\(r_i(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)		Outputted aging rate of queue i in the algorithm of Section 3.2
\(f_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)	\(f(S_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\vert \mathbf {p},\mathbf {\mu },\mathbf {\lambda },U_{k-1})\)	Value of f for \(S_k\) when \(U_{k-1}\) has priority
\(g_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\)	\(\max \lbrace 0,1-f_k(\mathbf {p},\mathbf {\mu },\mathbf {\lambda })\rbrace\)	Outputted rate for \(S_k\); equivalently, value of \(r_i\) for any \(i\in S_k\)

When the values of, or dependencies on, \(\mathbf {p},\mathbf {\mu },\mathbf {\lambda }\) are clear from context, we omit them for notational ease.

D.1 Proofs Deferred from Section 3

In this section, we fill in the deferred proofs showing analytic properties of r, the output of the algorithm described in Section 3.1.

Proposition D.1 (Proposition 3.7, Restated).

The function \(r:(\Delta ^{m-1})^n\rightarrow [0,1]^n\) given by \(r(\mathbf {p})=(r_1(\mathbf {p}),\ldots ,r_n(\mathbf {p}))\) is continuous.

With these results, we complete the proof of Theorem 3.8. We proceed as follows: fix any queue i, as well as any fixed probability choices \(p_{-i}\in (\Delta ^{m-1})^{n-1}\) by the other players, and any two \(p,p^{\prime }\in \Delta ^{m-1}\). Define for \(t\in [0,1]\),

\begin{equation*} h(t)=r_i(tp+(1-t)p^{\prime },p_{-i}). \end{equation*}

Lemma D.2.

For any fixed i, \(p_{-i}\in (\Delta ^{m-1})^{n-1}\), and \(p_i,p^{\prime }_i\in \Delta ^{m-1}\), the function \(h(t)\) is piecewise linear and has no local maxima on the interior.

Proof.

Let \(\mathbf {p}(t)=(tp_i+(1-t)p^{\prime }_i,p_{-i})\). By Proposition 3.7, h is continuous as the restriction of a continuous function, and it is easy to see that it must be piecewise linear in t by inspection. Indeed, as the algorithmic description of c takes minimums and maximums of finitely many linear functions, this yields a piecewise linear function with no jump discontinuities.

We now prove the last claim. It is sufficient to show that if h is increasing at \(t^{\prime }\), then it is increasing for all \(t^{\prime \prime }\gt t^{\prime }\). Suppose that this is violated for some \(t^{\prime }\lt t^{\prime \prime }\); by piecewise linearity, there must exist some \(t^*\) such that as \(t^{\prime }\lt t^*\lt t^{\prime \prime }\) where two lines intersect in the graph, and so that as \(t\rightarrow t^{*-}\), the slope is increasing while it is nonincreasing for \(t\rightarrow t^{*+}\).

Suppose that for all t that are sufficiently close to \(t^*\) from the left, i is outputted at step k of the algorithm. The only reason the slope can go from positive to nonpositive at \(t^*\) is that there is a change in which sets are outputted in the algorithm at some step \(\ell \le k\), which can happen only if some new set S including i gets selected for \(t\ge t^*\). However, as the rates of all sets not including i fixing any other disjoint set having priority are all constants with respect to t, this can only occur because at \(t^*\), some linear function \(f(S\vert \mathbf {p}(t),S^{\prime })\) went below the \(f(S^{\prime \prime }\vert \mathbf {p}(t),S^{\prime })\) that was previously selected at step \(\ell\), where \(S^{\prime }\) is the union of all sets outputted prior at t and \(S^{\prime \prime }\) is the set that was outputted next for all t close enough to the left of \(t^*\). If \(S^{\prime \prime }\) included i, this could only occur if \(r(S\vert \mathbf {p}(t),S^{\prime }):=\max \lbrace 0,1-f(S\vert \mathbf {p}(t),S^{\prime })\rbrace =1-f(S\vert \mathbf {p}(t),S^{\prime })\rbrace\) has larger positive slope than \(r(S^{\prime \prime }\vert \mathbf {p}(t),S^{\prime })\), so the slope of h would be strictly larger (and in particular, increasing) for all t sufficiently close to \(t^*\) on the right, contradicting our assumption that it is nonincreasing. If \(S^{\prime \prime }\) does not include i, then \(r(S^{\prime \prime }\vert \mathbf {p}(t),S^{\prime })\) is a constant with respect to t, so for \(r(S\vert \mathbf {p}(t),S^{\prime })\) to exceed it for t larger than \(t^*\) but be lower for t less than \(t^*\), the slope of \(r(S\vert \mathbf {p}(t),S^{\prime })\) must also be positive, another contradiction. Both cases lead to a contradiction, proving the claim.□

Corollary D.3.

Fix \(p_{-i}\in (\Delta ^{m-1})^{n-1}\). Then the set of global minimizers of \(r_i(\cdot ,p_{-i})\) form a nonempty, closed convex set.

Proof.

Note that global minima exist by continuity from Proposition 3.7 and the extreme value theorem. Let \(p_i,p^{\prime }_i\) be global minimizers; if we form the line between them in \(\Delta ^{n-1}\) (which is of course convex) and consider the function h defined on this line, then as there are no local maxima in the interior of h by the previous lemma, the maximum must lie at an endpoint. This immediately implies that every point on this line is also a global minimizer. Closedness of the set of global minimizers follows immediately from the continuity guaranteed in Proposition 3.7.□

Theorem D.4 (Theorem 3.8, Restated).

There exists a pure equilibrium of the game with costs given by \(r:(\Delta ^{m-1})^n\rightarrow [0,1]^n\).

Proof.

We will prove this by appealing to Kakutani’s theorem. Let \(B:(\Delta ^{m-1})^n\rightrightarrows (\Delta ^{m-1})^n\) be the correspondence that maps \(\mathbf {p} \in (\Delta ^{m-1})^n\) to the set \(B(\mathbf {p})\subseteq (\Delta ^{m-1})^n\), where \(B(\mathbf {p})=\lbrace \mathbf {p}^{\prime }\in (\Delta ^{m-1})^n: p^{\prime }_i\in \arg \min _{x\in \Delta ^{m-1}} r_i(x,p_{-i})\rbrace\) is the best-response correspondence.

We must verify the preconditions of Kakutani’s theorem. \((\Delta ^{m-1})^n\) is clearly compact and convex, and we have shown that \(B(\mathbf {p})\) is nonempty and is a convex set by Corollary D.3. The final condition to show is that r has closed graph, which can be done by a completely standard argument; we must show that if \((\mathbf {p}^k,\mathbf {s}^k)\rightarrow (\mathbf {p},\mathbf {s})\), where \(\mathbf {s}^k\in B(\mathbf {p}^k)\), then \(\mathbf {s}\in B(\mathbf {p})\). Suppose for a contradiction that this does not hold for some such convergent sequence. This implies that for some \(i\in [n]\), there exists some \(s^{\prime }_i\) and \(\epsilon \gt 0\) such that

\begin{equation*} r_i(s^{\prime }_i,p_{-i})+3\epsilon \lt r_i(s_i,p_{-i}). \end{equation*}

As \(p_{-i}^k\rightarrow p_{-i}\), the continuity of r from Proposition 3.7 gives for large enough k that \(r_i(s^{\prime }_i,p^k_{-i})\le r_i(s^{\prime }_i,p_{-i})+\epsilon\). Thus,

\begin{equation*} r_i(s^{\prime }_i,p^k_{-i})\lt r_i(s_i,p_{-i})-2\epsilon \lt r_i(s^k_i,p^k_{-i})-\epsilon , \end{equation*}

where the last inequality holds for all large enough k by continuity of r. This contradicts the optimality of \(\mathbf {s}^k\in B(\mathbf {p}^k)\), proving that r has a closed graph. Kakutani’s theorem then immediately implies the existence of a pure equilibrium—that is, \(\mathbf {p}\in (\Delta ^{m-1})^n\) such that \(\mathbf {p}\in B(\mathbf {p})\).□

E Proofs Deferred from Section 5

E.1 Deferred Claims

Claim E.1 (Claim 5.1, Restated).

Under Equation (13), both of the following statements hold:

(1)

There exists some \(i\in [n]\) such that \(\lambda _i T^i_{\ell \cdot w}\gt 16nw\).

(2)

\(\sum _{i=1}^n \lambda _i T^i_{\ell \cdot w}\ge \frac{16}{\eta }w\sum _{i=1}^n \lambda _i\).

Proof.

Equation (13) immediately implies by definition of \(Z_{\ell }\) and \(\Phi\) that

\begin{equation*} \sqrt {\sum _{i=1}^n \lambda _i (T^i_{\ell \cdot w}})^2\ge \frac{w}{\sqrt {\lambda _n}}\max \left(\frac{16}{\eta }\left(\sum _{i=1}^n \lambda _i\right),16n^2\right). \end{equation*}

From Lemma A.1, this implies that

\begin{equation*} \sum _{i=1}^n \lambda _i T^i_{\ell \cdot w}\ge w\max \left(\frac{16}{\eta }\left(\sum _{i=1}^n \lambda _i\right),16n^2\right), \end{equation*}

from which both parts follow, the first from averaging.□

Claim E.2 (Claim 5.2, Restated).

If there exists an \(i\in [n]\) such that \(\lambda _i T^i_{\ell \cdot w}\gt 16nw\) under the good event assumptions, then

\begin{equation*} \sum _{i=1}^n \lambda _i\tau _i\ge \frac{1}{2}\sum _{i=1}^n \lambda _i T^i_{\ell \cdot w}. \end{equation*}

Proof.

By Equation (10), we must have

\begin{equation*} \lambda _i\tau _i\ge \lambda _i T^i_{\ell \cdot w}-w-\lambda _i\epsilon _i w, \end{equation*}

even if queue i clears a packet every step in the window. Observe that if \(\lambda _i T^i_{\ell \cdot w}\ge 8w\), then

\begin{equation*} w+\lambda _i\epsilon _i w\le 2w\le \frac{1}{4}\lambda _i T^i_{\ell \cdot w}, \end{equation*}

and so \(\lambda _i\tau _i\ge \frac{3}{4}\lambda _i T^i_{\ell \cdot w}.\) We also have

\begin{equation*} \frac{1}{2}\sum _{i:\lambda _iT^i_{\ell \cdot w}\lt 8w} \lambda _iT^i_{\ell \cdot w}\lt 4nw. \end{equation*}

In particular, if there exists some i such that \(\lambda _iT^i_{\ell \cdot w}\gt 16nw\), then

\begin{equation*} \lambda _i\tau _i\ge \frac{3}{4}\lambda _iT^i_{\ell \cdot w}\gt \frac{1}{2}\lambda _iT^i_{\ell \cdot w}+4nw\ge \frac{1}{2}\lambda _iT^i_{\ell \cdot w}+\frac{1}{2}\sum _{i:\lambda _iT^i_{\ell \cdot w}\lt 8w} \lambda _iT^i_{\ell \cdot w}. \end{equation*}

It follows that if this holds, then indeed,

\begin{equation*} \sum _{i=1}^n \lambda _i\tau _i=\sum _{i:\lambda _iT^i_{\ell \cdot w}\ge 8w} \lambda _i\tau _i+\sum _{i:\lambda _iT^i_{\ell \cdot w}\lt 8w} \lambda _i\tau _i\ge \frac{1}{2}\sum _{i=1}^n \lambda _iT^i_{\ell \cdot w}. \end{equation*}

□

Proof.

By the triangle inequality, it is easy to see that as random variables, the change in \(T^i_{\ell \cdot w}\) is at most \(G^i \triangleq \sum _{k=1}^w G^i_k.\) Then the change in \(\Phi\) is again at most

\begin{equation*} \frac{1}{2}\sum _{i=1}^n \lambda _i(T^i_{\ell \cdot w}+G^i)(T^i_{\ell \cdot w}+G^i-1)-\frac{1}{2}\sum _{i=1}^n \lambda _i(T^i_{\ell \cdot w})(T^i_{\ell \cdot w}-1)\le \sum _{i=1}^n \lambda _i G^iT^i_{\ell \cdot w}+\frac{1}{2}\sum _{i=1}^n \lambda _i(G^i)^2, \end{equation*}

as random variables. We treat two different cases separately:

(1)

Suppose that there does not exist \(i\in [n]\) such that \(\lambda _iT^i_{\ell \cdot w}\gt 1\). Then the change in \(\Phi\) is at most

\begin{equation*} \sum _{i=1}^n G^i+\frac{1}{2}\sum _{i=1}^n \lambda _i (G^i)^2. \end{equation*}

From Fact A.1, this means that the change in \(\sqrt {\Phi }\) is upper bounded as random variables by

\begin{equation*} \sqrt {\sum _{i=1}^n G^i+\frac{1}{2}\sum _{i=1}^n \lambda _i (G^i)^2}. \end{equation*}

Raising this to the pth power, expanding, and taking expectations, this term is at most \(C_{p,n,w}/\lambda _n^{2p}\) for some constant \(C_{p,n,w}\) depending only on \(n,w\), and p by Lemma B.8.

(2)

Suppose that there does exist \(i\in n\) such that \(\lambda _i T^i_{\ell \cdot w}\gt 1\). We claim that this implies that for all \(j\in [n]\),

\begin{equation*} \frac{\lambda _j T^j_{\ell \cdot w}}{2\sqrt {\frac{1}{2}\sum _{i=1}^n\lambda _i T^i_{\ell \cdot w}(T^i_{\ell \cdot w}-1)}}\le \sqrt {\lambda _j}. \end{equation*}

First, note that for any \(i\in [n]\), \(T^i_{\ell \cdot w}\ge 2\) implies that

\begin{equation} \frac{1}{2}\lambda _i T^i_{\ell \cdot w}(T^i_{\ell \cdot w}-1)\ge \frac{1}{4} \lambda _i (T^i_{\ell \cdot w})^2, \end{equation}

(35)

as can be confirmed from basic algebra. As \(\lambda _i\le 1/2\) by feasibility (as \(\mu _1\le 1\)), our assumption implies that \(T^i_{\ell \cdot w}\gt 2\), and so

\begin{equation*} 2\sqrt {\frac{1}{2}\sum _{i=1}^n \lambda _i T^i_{\ell \cdot w}(T^i_{\ell \cdot w}-1)}\gt 1. \end{equation*}

To prove the claim, we split into more cases: if \(T^j_{\ell \cdot w}\le 1/\sqrt {\lambda _j}\), the claim holds using the last inequality in the denominator. Otherwise, we must have \(T^j_{\ell \cdot w}\ge 2\) by integrality, in which case by Equation (35),

\begin{equation*} \frac{\lambda _j T^j_{\ell \cdot w}}{2\cdot \sqrt {\frac{1}{2}\sum _{i=1}^n\lambda _i T^i_{\ell \cdot w}(T^i_{\ell \cdot w}-1)}}\le \frac{\lambda _j T^j_{\ell \cdot w}}{\sqrt {\lambda _j (T^j_{\ell \cdot w})^2}}= \sqrt {\lambda _j}. \end{equation*}

Thus, in this case, we have

\begin{equation*} \frac{\sum _{i=1}^n \lambda _i G^iT^i_{\ell \cdot w}+\frac{1}{2}\sum _{i=1}^n \lambda _i(G^i)^2}{2\cdot \sqrt {\frac{1}{2}\sum _{i=1}^n\lambda _i T^i_{\ell \cdot w}(T^i_{\ell \cdot w}-1)}}\le \sum _{i=1}^n \sqrt {\lambda _i}G^i+\frac{1}{2}\sum _{i=1}^n \lambda _i (G^i)^2. \end{equation*}

By Fact A.1, this is an upper bound as random variables of the change in \(\sqrt {\Phi }\), so taking pth powers, expanding, and taking expectations, we get an upper bound of \(C_{p,n,w}/\lambda _n^{2p}\) by Lemma B.8 for some constant \(C_{p,n,w}\) depending only on \(n,w,p, \mathbf {\lambda }\).

□

E.2 Tightness of Factor 2

Theorem E.1 (Theorem 5.4, Restated).

Proof.

Define \(W_k=\sum _{i=1}^{k-1} w_i\). Note that \(W_k=\Theta (k^3)=\Theta (w_k^{3/2})\). \(W_k\) is the actual timestep at the end of \(k-1\) of the consecutive windows of length \(w_i\) for \(i=1,\ldots ,k-1\). Note also that \(W_{k+1}-W_k=w_k\).

For each \(n\ge 1\), consider the following system on n queues and n servers: set \(\mathbf {\lambda }=(\frac{n+1}{n^2},\ldots ,\frac{n+1}{n^2})\) and \(\mathbf {\mu }=(1,\frac{n-1}{n^2},\ldots ,\frac{n-1}{n^2})\). This system satisfies Assumption 5.2 with factor \(\frac{1}{2}-o_n(1)\). We consider the simple strategy where every queue always sends to the rate 1 server. Under these oblivious dynamics, in expectation the total number of packets grows by \(\frac{1}{n}\) with every step, and therefore this system is not even weakly stable. What we must show is that almost surely, this fixed strategy is zero regret for every queue for all but finitely many of the windows.

We first show almost sure concentration of the arrivals of new packets. Let \(\lbrace B^i_t\rbrace _{i\in [n],t\ge 1}\) be the independent random variables for arrivals as usual. Now, for each queue \(i\in [n]\) and \(\ell \ge 0\), we have

\begin{equation} \Pr \left(\left|\sum _{t=1}^{\ell } B^i_t-\lambda _i\ell \right|\ge \sqrt {\ell \ln (\ell)}\right)\le \frac{2}{\ell ^2}, \end{equation}

(36)

where we use the additive form of the Chernoff bound. As the same holds for all queues, the probability this event happens for any of the n queues is at most \(2n/\ell ^2\). As this is summable in \(\ell\), we may sum over all \(\ell \ge 1\) to deduce from the Borel-Cantelli lemma that almost surely, for all sufficiently large \(\ell\), all \(i\in [n]\) satisfy

\begin{equation} \sum _{t=1}^{\ell } B^i_t=\lambda _i\ell \pm O(\sqrt {\ell \ln (\ell)}). \end{equation}

(37)

Note that this also implies that almost surely, for all large \(\ell\), \(\sum _{t=1}^\ell \sum _{i=1}^n B^i_t\ge (1+\frac{1}{2n})\cdot \ell\) by the choice of \(\lambda _i\). Moreover, under this fixed strategy where everyone always sends to the rate 1 server, at most \(\ell\) packets can be cleared by time \(\ell\).

Next, we show that almost surely, there is a large backup proportional to the current time period. Let \(t_k\) be the last timestamp the rate 1 server clears up to time \(W_k\). As all queues send there under this fixed strategy, at this point, all queues only have packets that were received after \(t_k\) by priority. On the one hand, it is not difficult to see that deterministically, \(t_k\ge W_k/n\) (equality happens in the worst case where every queue received a packet in every step up to \(W_k\)). On the other hand, in light of our preceding results, almost surely, for all but finitely many of the k,

\begin{equation} t_k\lt \frac{1}{1+\frac{1}{2n}} W_k=(1-\Omega (1))W_k. \end{equation}

(38)

This is because at least \(W_k\) packets have been received up to time \(\frac{1}{1+\frac{1}{2n}} W_k\), and because the server can only have cleared at most \(W_k\) packets up to time \(W_k\), the oldest timestamp the server could have cleared by time \(W_k\) can be at most this quantity.

Next, we show almost sure concentration of the nontrivial server success rates. Let \(I^j_t\) be the indicator that server j would succeed at clearing a packet at time t (regardless of if one is sent there; under this strategy, no queue ever sends to \(j\ne 1\)). A similar application of the Chernoff bound and union bound with the Borel-Cantelli lemma implies that almost surely, for all but finitely many of the k, and for each server \(j\in [n]\), we have

\begin{equation} \sum _{t=W_k+1}^{W_{k+1}} I^j_t=w_{k}\mu _i \pm O(\sqrt {w_k\ln w_k}). \end{equation}

(39)

Note that the increasing nature of the \(w_k\) is needed here for this to be valid (and in fact, this statement will be false with probability 1 if interval sizes are kept fixed by independence and the second Borel-Cantelli lemma). Thus, almost surely, for all large enough \(\ell\) and k, all of these events happen simultaneously.

As \(t_k\ge W_k/n\), almost surely for large enough k, \(t_k\) eventually exceeds the random time \(\ell\) at which Equation (37) holds. Consider any subsequent window of length \(w_k\). Our goal is to use these facts to show that on these windows, all queues have zero regret. First, we show that each queue clears \((\frac{1}{n}-o(1))w_k\) packets on each such window. Let \(c=\frac{1}{n\lambda _i}\lt 1\) (note that this is independent of i). We know from Equation (38) that \(t_k+w_k\lt (1-\Omega (1))W_k+w_k\lt W_{k}\); moreover, using Equation (37) and the fact that \(t_k\ge \ell\),

\begin{align*} \sum _{t=t_k+1}^{t_k+c\cdot w_k} B^i_t&=\sum _{t=1}^{t_k+c\cdot w_k} B^i_t-\sum _{t=1}^{t_k} B^i_t \\ &= \frac{1}{n} \cdot w_k \pm O(\sqrt {(t_k+c\cdot w_k)\ln {(t_k+c\cdot w_k)}})\\ &=\frac{1}{n} \cdot w_k \pm O(\sqrt {W_k\ln {W_k}})\\ &=\frac{1}{n} \cdot w_k \pm \tilde{O}(w_k^{3/4}), \end{align*}

where the last line uses the relationship between \(W_k\) and \(w_k\). As \(t_k+w_k\lt W_k\), all of these packets were evidently received before the start of the given window, and therefore every queue is backed up throughout the period, and by virtue of the previous equation, each queue has \(\frac{1}{n}-o(1)\) fraction of the next \(w_k\) packets that will be cleared by this top server on this window. Thus, each queue clears at least \((\frac{1}{n}-o(1))\cdot w_k\) packets on such windows under this fixed strategy.

Finally, had any queue deviated on such a window to a single fixed low rate server, in light of Equation (39), she would have cleared \((\frac{n-1}{n^2}+o(1))\cdot w_k\) packets, which is linearly smaller than the amount she actually cleared. Therefore, almost surely, on all but finitely many of the windows, every queue actually has zero regret.□

F Proofs Deferred from Section 6

This section is devoted to proving the intermediate claims in the proof of Theorem 3.3. Our main first technical result of this section asserts that with high probability, the maximum queue age increases at a rate of at most \((1-(1-\epsilon)\cdot f_1)\) on the next w steps for a large enough w. In fact, more generally, the following holds.

Proposition F.1 (Proposition 6.1, Restated).

\begin{equation*} \max _{i\in [n]} T^i_{t+w} -\max _{i\in [n]} T^i_t \le (1-(1-\epsilon)\cdot f_1)\cdot w \end{equation*}

\begin{equation*} \max _{i\not\in U_{s-1}} T^i_{t+w} -\max _{i\not\in U_{s-1}} T^i_t \le (1-(1-\epsilon)\cdot f_s)\cdot w \end{equation*}

with probability at least \(1-C_1\exp (-C_2a)\), where \(C_1,C_2\gt 0\) are absolute constants depending only on \(n,\epsilon ,\mathbf {\lambda }, \mathbf {\mu },\mathbf {p}\), but not on a.

Note that the first part is simply the \(j=1\) case of the more general statement. This proposition will follow from the following lemma that will prove inductively.

Lemma F.2.

Fix \(s\ge 1\) and \(\epsilon \gt 0\). Then, for each \(1\le \tau \le n\) and all \(a\in \mathbb {N}\), the following holds: let \(w=a\cdot \lceil \frac{6}{\epsilon }\rceil ^{\tau -1}\), and suppose that at time t, \(M^*:=\max _{i\not\in U_{s-1}} T^i_t\ge w\cdot f_s\), and that the set

\begin{equation*} J=\lbrace i\not\in U_{s-1}: T^i_t\ge M^*-w\cdot f_s\rbrace \end{equation*}

has \(\vert J\vert \le \tau\). Then, with probability at least \(1-C_1\exp (-C_2a)\), we have

\begin{equation*} \max _{i\not\in U_{s-1}} T^i_{t+w}- \max _{i\not\in U_{s-1}} T^i_t \le (1-(1-\epsilon)\cdot f_s)\cdot w, \end{equation*}

where \(C_1,C_2\gt 0\) are absolute constants depending only on \(n,\epsilon ,\mathbf {\lambda }, \mathbf {\mu },\mathbf {p}\), but not a.

Proposition 6.1 follows as for any \(s\ge 1\) and \(a\in \mathbb {N}\), \(\vert J\vert \le n\). We now turn to proving Lemma F.2 inductively. The case for \(\tau =1\) will turn out to be relatively easy; this case just says that there is a single very old queue among those not in \(U_{s-1}\), so we will be able to lower bound the number of packets she clears by simply assuming that every queue in \(U_{s-1}\) is older than her. To extend this to higher \(\tau\) will be more difficult. To do this, we will chunk together many windows that we know have this property for smaller values of \(\tau\) and then leverage two facts to get a win-win situation. We will be able to easily show that at least one queue in J is always decreasing at the correct rate. If all queues in J are “close,” they also are at the correct rate. If not, then on the next chunk, they will clear at the correct rate on the next chunk inductively with high probability.

We now carry out this high-level plan. For reference, we will use the following similar notation to that used in the main text, but extended to more general windows:

(1)

\(w:=B\cdot L\) will denote a given window length composed of B consecutive blocks of L steps. As we will be considering the behavior of the process on some fixed window, we may as well reindex \(t=1\) for convenience so that each window we consider will go from \(t=1\) to w. We will reserve the superscript \(t=0\) to denote the value of the ages at the very beginning of the window we consider.

(2)

Recall the shorthand \(f_s:=f(S_s\vert U_{s-1})\) and \(g_s := \max \lbrace 0,1-f_s\rbrace\).

(3)

With this convention, we will often define (and will make clear from context) at the beginning of some considered window of fixed length w, fixed \(s\ge 1\), and fixed \(\epsilon \gt 0\):

\begin{equation*} M^* :=\max _{i\in [n]\setminus U_{s-1}} T^i_0\quad \quad T^*:=M^* -(1-\epsilon)\cdot w\cdot f_s. \end{equation*}

We will often refer to \(T^*\) as the target value for this window, which does not change over the course of the window (notice that it is measured at the beginning of the window). Then, define

\begin{equation*} J=\lbrace i\notin U_{s-1}: T^i_0\ge M^*-w\cdot f_s\rbrace . \end{equation*}

In other words, J is the set of queues whose age is within \(w\cdot f_s\) of the oldest age, measured at the beginning of the window. Our goal will be to eventually show that if w is sufficiently large, then with high probability, every queue in J has age below \(T^*\) at the end of the next w steps before accounting for w steps of aging, and of course all queues not in J are already strictly below \(T^*\) by definition. This will imply that the maximum age grows by at most \((1-(1-\epsilon)\cdot f_s)\cdot w\) once we account for the w steps of aging over this window.

(4)

Given a window of length \(w=B\cdot L\), \(\mathcal {F}^{(b)}_{\ell }\) is the filtration of \(\sigma\)-algebras generated up to step \(\ell\) in the bth block, for \(b=1,\ldots ,B\). In particular, \(\mathcal {F}_0^{(1)}\subseteq \mathcal {F}_1^{(1)}\subseteq \ldots \subseteq \mathcal {F}_{L}^{(1)}\subseteq \mathcal {F}_{0}^{(2)}\subseteq \ldots \subseteq \mathcal {F}_{\ell }^{(b)}\subseteq \ldots \subseteq \mathcal {F}_{L}^{(B)}.\)

(5)

\(X_{b,\ell }^{i}\) will be the indicator that queue i cleared a packet in timestep \(\ell\) in the bth block. \(X_{b,\ell }^{i}\) is \(\mathcal {F}^{(b)}_{\ell }\)-measurable.

(6)

\(Y_{b,\ell }^{i}\) will be a sequence of random variables, with same interpretation of the indices, defined as follows: for \(b,\ell\) such that every queue in J is still above \(M^*-w\cdot f_s\) at the start of the \(\ell\)th step of the bth block, set \(Y_{b,\ell }^{i}=X_{b,\ell }^{i}\). If this does not hold for some \(b,\ell\), then let the \(Y_{b,\ell }^{i}\) be arbitrary indicator random variables satisfying

Note that \(Y_{b,\ell }^{i}\) is \(\mathcal {F}^{(b)}_{\ell }\)-measurable. These random variables are purely for technical convenience because they have an a priori lower bound on the conditional expectation, which is not always true of X (i.e., if queues have already cleared a lot and thus some queues in J have lost priority over those not in J).

(7)

\(G_{b,\ell }^{i}\) will be i.i.d. Geom(\(\lambda _i\)) random variables in the same way for \(\ell \in [L],b\in [B]\). We define the partial sums on each window \(Z_{b,k}^{i}:= \sum _{\ell =1}^k G_{b,\ell }^{i}\). For each \(b=1,\ldots ,B\), we make the convention that the \(G_{b,\ell }^{i}\) are sampled for all \(i\in [n]\) and \(\ell \in [L]\) at the beginning of the bth block so that they are all \(\mathcal {F}_0^{(b)}\)-measurable; there is no corresponding queuing step. When queue i clears her kth packet in the bth block, her timestamp decreases by \(G_{b,k}^{i}\); equivalently, when this happens, her timestamp will have decreased on the bth block by \(Z_{b,k}^{i}\) so far.

Proposition F.3 (Proposition 6.2, Restated).

For any \(s\ge 1\) and any fixed \(\epsilon \gt 0\), the following holds: suppose that at time t, it holds that

\begin{equation*} \min _{i\in U_s} T^i_t -\max _{i\in S_{s+1}} T^i_t\ge 2\cdot \frac{w}{\lambda _n}. \end{equation*}

Then with probability \(1-A\exp (-Bw)\), where \(A,B\gt 0\) are absolute constants not depending on w, we have

\begin{equation*} \sum _{i\in S_{s+1}} \lambda _i T^i_{t+w}-\sum _{i\in S_{s+1}} \lambda _iT^i_t\ge (1-(1+\epsilon)f_{s+1})\cdot w \cdot \left(\sum _{i\in S_{s+1}} \lambda _i\right). \end{equation*}

Moreover, for any fixed \(\epsilon \gt 0\), with probability at least \(1-A\exp (-Bw)\) it holds that

\begin{equation*} \sum _{i\in S_1} \lambda _iT^i_w \ge (1-(1+\epsilon)f_1)\cdot w\cdot \left(\sum _{i\in S_1} \lambda _i\right). \end{equation*}

Corollary F.4 (Corollary 6.3, Restated).

Fix \(\mathbf {p}\) and suppose that for some group \(S_k\) output by the algorithm, \(f_k\gt 1\), so that \(1-f_k\lt 0\). Then, for each \(i\in S_k\), \(T^i_t\) is strongly stable.

Proof Sketch

It suffices to show this for the random variable \(\max _{i\notin U_{k-1}}T^i_t\). Let \(\epsilon \gt 0\) be small enough such that \((1-(1-\epsilon)\cdot f_{k})\lt \eta \lt 0\). Then let \(w=a\cdot \lceil \frac{6}{\epsilon }\rceil ^{n-1}\) be large enough so, on the event that \(\max _{i\notin U_{k-1}}T^i_{\ell \cdot w}\ge f_{k}\cdot w\), then \(\mathbb {E}[\max _{i\notin U_{k-1}}T^i_{(\ell +1)\cdot w}-\max _{i\notin U_{k-1}}T^i_{\ell \cdot w} \bigg \vert \mathcal {F}_{\ell \cdot w}]\lt \beta \lt 0\) for some \(\beta \lt 0\), where \(\mathcal {F}_{\ell \cdot w}\) is the filtration of the Geometric system; this can be done by Proposition 6.1, noting that on the event where the proposition fails, the queue age can increase at most by w, and this can be drowned out in the expectation by the exponential decay of the probability bound by taking w large enough. This yields negative drift for the random process \(Y_{\ell }:=\max _{i\notin U_{k-1}}T^i_{\ell \cdot w}\) with threshold value \(f_{k}\cdot w\).

Then, for any even \(p\ge 0\), \(\mathbb {E}[\vert \max _{i\notin U_{k-1}}T^i_{(\ell +1)\cdot w}-\max _{i\notin T_{j-1}}T^i_{\ell \cdot w}\vert ^p \vert \mathcal {F}_{\ell \cdot w}]\) is bounded by some constant \(C_p\gt 0\) for each p depending only on \(n,w, \mathbf {\lambda }\). This is because the difference between these is crudely upper bounded as random variables by a sum of at most \(n\cdot w\) geometric random variables in the case that queues somehow clear a packet every round, which are easily seen to have bounded moments. By Theorem 5.3, this implies stochastic stability as the pth moment condition holds for arbitrarily large p.□

References

[1]

Elliot Anshelevich, Anirban Dasgupta, Jon M. Kleinberg, Éva Tardos, Tom Wexler, and Tim Roughgarden. 2008. The price of stability for network design with fair cost allocation. SIAM J. Comput. 38, 4 (2008), 1602–1623.

Digital Library

Google Scholar

[2]

Elliot Anshelevich, Anirban Dasgupta, Éva Tardos, and Tom Wexler. 2008. Near-optimal network design with selfish agents. Theory Comput. 4, 1 (2008), 77–109.

Crossref

Google Scholar

[3]

Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. 2002. The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32, 1 (2002), 48–77.

Digital Library

Google Scholar

[4]

Maria-Florina Balcan, Avrim Blum, and Yishay Mansour. 2013. Circumventing the price of anarchy: Leading dynamics to good behavior. SIAM J. Comput. 42, 1 (2013), 230–264.

Digital Library

Google Scholar

[5]

Lucas Baudin, Marco Scarsini, and Xavier Venel. 2023. Strategic behavior and no-regret learning in queueing systems. CoRR abs/2302.03614 (2023).

Crossref

Google Scholar

[6]

P. Billingsley. 2008. Probability and Measure (3rd ed.). Wiley.

Google Scholar

[7]

Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett, and Aaron Roth. 2008. Regret minimization and the price of total anarchy. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 373–382.

Digital Library

Google Scholar

[8]

Christian Borgs, Jennifer T. Chayes, Nicole Immorlica, Kamal Jain, Omid Etesami, and Mohammad Mahdian. 2007. Dynamics of bid optimization in online advertisement auctions. In Proceedings of the 16th International Conference on World Wide Web (WWW’07). ACM, New York, NY, 531–540.

Digital Library

Google Scholar

[9]

Allan Borodin, Jon M. Kleinberg, Prabhakar Raghavan, Madhu Sudan, and David P. Williamson. 2001. Adversarial queuing theory. J. ACM 48, 1 (2001), 13–38.

Digital Library

Google Scholar

[10]

G. W. Brown. 1951. Iterative solutions of games by fictitious play. In Activity Analysis of Production and Allocation, Tjalling C. Koopmans (Ed.). Wiley, New York, NY, 374–376.

Google Scholar

[11]

Vincent Conitzer, Christian Kroer, Debmalya Panigrahi, Okke Schrijvers, Eric Sodomka, Nicolás E. Stier Moses, and Chris Wilkens. 2019. Pacing equilibrium in first-price auction markets. In Proceedings of the 2019 ACM Conference on Economics and Computation (EC’19). ACM, New York, NY, 587.

Digital Library

Google Scholar

[12]

Vincent Conitzer, Christian Kroer, Eric Sodomka, and Nicolás E. Stier Moses. 2018. Multiplicative pacing equilibria in auction markets. In Web and Internet Economics. Lecture Notes in Computer Science, Vol. 11316. Springer, 443. https://link.springer.com/content/pdf/bbm%3A978-3-030-04612-5%2F1.pdf.

Google Scholar

[13]

Constantinos Daskalakis, Paul W. Goldberg, and Christos H. Papadimitriou. 2009. The complexity of computing a nash equilibrium. SIAM J. Comput. 39, 1 (2009), 195–259.

Digital Library

Google Scholar

[14]

Ofer Dekel, Ambuj Tewari, and Raman Arora. 2012. Online bandit learning against an adaptive adversary: From regret to policy regret. In Proceedings of the 29th International Conference on Machine Learning (ICML’12). 1–8. http://icml.cc/2012/papers/749.pdf.

Google Scholar

[15]

Rick Durrett. 2019. Probability: Theory and Examples. Vol. 49. Cambridge University Press.

Crossref

Google Scholar

[16]

Jerzy Filar and Koos Vrieze. 2012. Competitive Markov Decision Processes. Springer Science & Business Media.

Digital Library

Google Scholar

[17]

Daniel Freund, Thodoris Lykouris, and Wentao Weng. 2022. Efficient decentralized multi-agent learning in asymmetric queuing systems. In Proceedings of the 35th Conference on Learning Theory, Vol. 178. 4080–4084. https://proceedings.mlr.press/v178/freund22a.html.

Google Scholar

[18]

Hu Fu, Qun Hu, and Jia’nan Lin. 2022. Stability of decentralized queueing networks beyond complete bipartite cases. In Web and Internet Economics. Lecture Notes in Computer Science, Vol. 13778. Springer, 96–114.

Digital Library

Google Scholar

[19]

Drew Fudenberg and David Levine. 1998. The Theory of Learning in Games. MIT Press, Cambridge, MA.

Google Scholar

[20]

S. Hart and A. M. Colell. 2000. A simple adaptive procedure leading to correlated equilibrium. Econometrica 68, 5 (2000), 1127–1150.

Crossref

Google Scholar

[21]

Refael Hassin. 2020. Rational Queueing. CRC Press, Boca Raton, FL. https://books.google.com/books?id=M3G1zQEACAAJ.

Google Scholar

[22]

Refael Hassin and Moshe Haviv. 2003. To Queue or Not to Queue: Equilibrium Behavior in Queueing Systems. Springer US. 02034096 https://books.google.com/books?id=K5KhJSZkhHQC.

Crossref

Google Scholar

[23]

Ramesh Johari and John N. Tsitsiklis. 2004. Efficiency loss in a network resource allocation game. Math. Oper. Res. 29, 3 (2004), 407–435.

Digital Library

Google Scholar

[24]

Elias Koutsoupias and Christos H. Papadimitriou. 1999. Worst-case equilibria. In Proceedings of the 16th Annual Symposium on Theoretical Aspects of Computer Science (STACS’99). 404–413.

Crossref

Google Scholar

[25]

Subhashini Krishnasamy, Rajat Sen, Ramesh Johari, and Sanjay Shakkottai. 2016. Regret of queueing bandits. In Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS’16). 1669–1677. http://papers.nips.cc/paper/6370-regret-of-queueing-bandits.

Google Scholar

[26]

Thodoris Lykouris, Vasilis Syrgkanis, and Éva Tardos. 2016. Learning and efficiency in games with dynamic population. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’16). 120–129.

Crossref

Google Scholar

[27]

Albert W. Marshall, Ingram Olkin, and Barry C. Arnold. 1979. Inequalities: Theory of Majorization and Its Applications. Vol. 143. Springer.

Google Scholar

[28]

Abraham Neyman and Sylvain Sorin. 2003. Stochastic Games and Applications. Vol. 570. Springer Science & Business Media.

Crossref

Google Scholar

[29]

Robin Pemantle and Jeffrey S. Rosenthal. 1999. Moment conditions for a sequence with negative drift to be uniformly bounded in \({L}^r\). Stoch. Proc. Appl. 82, 1 (1999), 143–155.

Crossref

Google Scholar

[30]

Julia Robinson. 1851. An iterative method of solving a game. Ann. Math. Stat. 54 (1851), 296–301.

Crossref

Google Scholar

[31]

Tim Roughgarden. 2003. The price of anarchy is independent of the network topology. J. Comput. Syst. Sci. 67, 2 (2003), 341–364.

Digital Library

Google Scholar

[32]

Tim Roughgarden. 2015. Intrinsic robustness of the price of anarchy. J. ACM 62, 5 (2015), Article 32, 42 pages.

Digital Library

Google Scholar

[33]

Tim Roughgarden. 2016. Twenty Lectures on Algorithmic Game Theory. Cambridge University Press, Cambridge, NY.

Digital Library

Google Scholar

[34]

Tim Roughgarden and Éva Tardos. 2002. How bad is selfish routing? J. ACM 49, 2 (2002), 236–259.

Digital Library

Google Scholar

[35]

Aviad Rubinstein. 2018. Inapproximability of Nash equilibrium. SIAM J. Comput. 47, 3 (2018), 917–959.

Digital Library

Google Scholar

[36]

Alexander Schrijver. 2003. Combinatorial Optimization: Polyhedra and Efficiency. Vol. 24. Springer Science & Business Media.

Google Scholar

[37]

Flore Sentenac, Etienne Boursier, and Vianney Perchet. 2021. Decentralized learning in online queuing systems. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS’21). 18501–18512. https://proceedings.neurips.cc/paper/2021/hash/99ef04eb612baf0e86671a5109e22154-Abstract.html.

Google Scholar

[38]

Shai Shalev-Shwartz and Shai Ben-David. 2014. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.

Crossref

Google Scholar

[39]

John F. Shortle, James M. Thompson, Donald Gross, and Carl M. Harris. 2018. Fundamentals of Queueing Theory. Wiley.

Crossref

Google Scholar

[40]

Vasilis Syrgkanis and Éva Tardos. 2013. Composable and efficient mechanisms. In Proceedings of the 45th Annual Symposium on Theory of Computing (STOC’13). ACM, New York, NY, 211–220.

Digital Library

Google Scholar

[41]

Neil Walton and Kuang Xu. Learning and information in stochastic networks and queues. In Tutorials in Operations Research: Emerging Optimization Methods and Modeling Techniques with Applications, John Gunnar Carlsson (Ed.). INFORMS, 161–198. https://pubsonline.informs.org/doi/pdf/10.1287/educ.2021.0235

Crossref

Google Scholar

[42]

Carsten Witt. 2014. Fitness levels with tail bounds for the analysis of randomized search heuristics. Inf. Process. Lett. 114, 1-2 (2014), 38–41.

Crossref

Google Scholar

Cited By

View all

Biró PBenedek MÁgoston KLosonci D(2024)Beszámoló a 2024- es Mechanizmus- és Intézménytervezési KonferenciárólPénzügyi Szemle = Public Finance Quarterly10.35551/PFQ_2024_3_1070:3(133-137)Online publication date: 30-Sep-2024
https://doi.org/10.35551/PFQ_2024_3_10
Freund DLykouris TWeng W(2024)Efficient Decentralized Multi-agent Learning in Asymmetric Bipartite Queueing SystemsOperations Research10.1287/opre.2022.029172:3(1049-1070)Online publication date: May-2024
https://doi.org/10.1287/opre.2022.0291

Index Terms

The Price of Anarchy of Strategic Queuing Systems
1. Theory of computation
  1. Theory and algorithms for application domains
    1. Algorithmic game theory and mechanism design
      1. Convergence and learning in games
      2. Quality of equilibria

Recommendations

Tight Bounds for the Price of Anarchy of Simultaneous First-Price Auctions

We study the price of anarchy (PoA) of simultaneous first-price auctions (FPAs) for buyers with submodular and subadditive valuations. The current best upper bounds for the Bayesian price of anarchy (BPoA) of these auctions are e/(e − 1) [Syrgkanis and ...
Intrinsic robustness of the price of anarchy
STOC '09: Proceedings of the forty-first annual ACM symposium on Theory of computing

The price of anarchy (POA) is a worst-case measure of the inefficiency of selfish behavior, defined as the ratio of the objective function value of a worst Nash equilibrium of a game and that of an optimal outcome. This measure implicitly assumes that ...
Pure and Bayes-Nash Price of Anarchy for Generalized Second Price Auction
FOCS '10: Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science

The Generalized Second Price Auction has been the main mechanism used by search companies to auction positions for advertisements on search pages. In this paper we study the social welfare of the Nash equilibria of this game in various models. In the ...

Comments

Information & Contributors

Information

Published In

Journal of the ACM Volume 70, Issue 3

June 2023

284 pages

ISSN:0004-5411

EISSN:1557-735X

DOI:10.1145/3599472

Editor:
Venkatesan Guruswami
University of California, Berkeley, United States

Issue’s Table of Contents

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 May 2023

Online AM: 17 March 2023

Accepted: 12 January 2023

Revised: 11 October 2022

Received: 15 December 2021

Published in JACM Volume 70, Issue 3

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

National Science Foundation
Air Force Office of Scientific Research

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
1,402
Total Downloads

Downloads (Last 12 months)1,024
Downloads (Last 6 weeks)132

Reflects downloads up to 04 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

View all

Biró PBenedek MÁgoston KLosonci D(2024)Beszámoló a 2024- es Mechanizmus- és Intézménytervezési KonferenciárólPénzügyi Szemle = Public Finance Quarterly10.35551/PFQ_2024_3_1070:3(133-137)Online publication date: 30-Sep-2024
https://doi.org/10.35551/PFQ_2024_3_10
Freund DLykouris TWeng W(2024)Efficient Decentralized Multi-agent Learning in Asymmetric Bipartite Queueing SystemsOperations Research10.1287/opre.2022.029172:3(1049-1070)Online publication date: May-2024
https://doi.org/10.1287/opre.2022.0291

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Abstract

1 Introduction

1.1 Overview of Results and Techniques

1.1.1 Strategic Queuing Model.

1.1.2 Patient Selfishness.

1.1.3 No-Regret Learning in Queuing Systems.

1.2 Organization

1.3 Related Work

2 Preliminaries

2.1 Notation

2.2 Bernoulli Queuing Model

2.3 The Need for Packet Priorities

2.4 Geometric Queuing Model via Deferred Decisions

3 Patience in Queuing Systems

3.1 Algorithmic Description of Costs

3.2 Properties of Rate Function

3.3 Price of Stability and Independence

4 Price of Anarchy of Patient Queuing

5 No-Regret Learning in Strategic Queuing Systems

5.1 Stability of No-Regret Queuing Systems

6 Asymptotic Convergence

7 Conclusion and Open Questions

Acknowledgments

Footnotes

Appendices

A Basic Inequalities

B Probability Tools

C Proofs Deferred from Section 2

C.1 Central Feasibility

C.2 Impossibility for No-Priority Model

C.3 Proofs Deferred from Section 2.4

D Proofs and Notation for Section 3

D.1 Proofs Deferred from Section 3

E Proofs Deferred from Section 5

E.1 Deferred Claims

E.2 Tightness of Factor 2

F Proofs Deferred from Section 6

References

Cited By

Index Terms

Recommendations

Tight Bounds for the Price of Anarchy of Simultaneous First-Price Auctions

Intrinsic robustness of the price of anarchy

Pure and Bayes-Nash Price of Anarchy for Generalized Second Price Auction

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations