ON THE LONG-RUN EQUILIBRIA OF A CLASS OF LARGE SUPERGAMES

In this paper, a broad class of large supergames, i.e., infinitely repeated games played by many players located on the lattice Z^d, is studied. Given the pre-specified updating rules and the transition probabilities, i.e., the relevant stochastic processes of strategy configurations, the formulas of the invariant measures, which represent the long-run equilibrium plays with symmetric payoffs, are obtained.
ANJIAO WANG et al.


Introduction
This paper studies a broad class of large supergames, i.e., infinitely repeated games played by (infinitely) many players. By relating each supergame to a relevant stochastic process of strategy configurations, we first investigate the existence, uniqueness, and stability of invariant measures, which represent the long-run equilibrium plays. Then, we study the relationship between those invariant measures and the solution concepts developed in the game theory literature.
In our stylized class of supergames, game players are located on the vertex set of a graph, typically the d-dimensional integer lattice Z^d (see Figure 1(a)-(c)). The spatial arrangement and the location of a particular player impose no tangible restriction other than offering a convenient way to establish a neighbourhood structure, since, in the class of games we study, individual players are assumed to be identical.
Each player plays a constituent stage game, only with her neighbours, in each period of discrete time. Players may or may not have the chance to change their strategies simultaneously at every period.
We endow the graph with a pre-specified ordering according to which the players update; we call this the global updating rule. Supergames differ from one another possibly because of differences in the stage games or differences in the orderings.
(1) Although all players may employ mixed strategies, each player can only observe a track of the history of the pure strategies of her relevant neighbours. In a series of studies, Gilboa et al. [6] and Kalai and Lehrer [7,8] investigated, in finite-player supergames, the possibility for the players to learn to play Nash equilibria by keeping track of the history of plays, by engaging in Bayesian learning, and by adopting best-response policies. In this paper, we do not intend to extend their results to large supergames, although it may be an interesting topic. Relevant problems have been investigated in the literature on cellular automata (for example, see [4,10,11]).
(2) To make a strategy choice, the player only takes into account her neighbours' plays in the previous period. She is not sensitive enough to respond instantaneously to changes in her neighbours' plays, or to play based on her inference of her neighbours' current and future plays.
(3) The player may or may not have full control over her choices, in the sense that she may or may not be able to use the best-response strategy. The probability of choosing strategy a over strategy b depends on the difference between the average payoff accrued by playing a against her neighbours and that accrued by playing b.
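Point (3) can be sketched as a switching rule that is increasing in the average-payoff difference. The logistic form and the sensitivity parameter beta below are illustrative assumptions, not the paper's own local transition rule (2.1):

```python
import math

def switch_probability(avg_payoff_a, avg_payoff_b, beta=1.0):
    """Probability of choosing strategy a over strategy b.

    Increasing in the difference of the average payoffs accrued against
    the neighbours; the logistic shape and beta are assumptions made for
    illustration only.
    """
    return 1.0 / (1.0 + math.exp(-beta * (avg_payoff_a - avg_payoff_b)))
```

With equal average payoffs the two strategies are chosen with probability 1/2, and as beta grows the rule approaches a best response, which matches the idea that the player may or may not have full control over her choices.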
In summary, the class of games we study in this paper consists of infinitely repeated games with infinitely many players. Each player is directly connected only with finitely many neighbours. The information flow is very close to the traditional open-loop setting.
In Section 2, we describe the general formulation of the class of supergames by offering the ingredients needed, and relate each game setting to a stochastic process which represents the evolution of the strategy configuration. In Section 3, we derive invariant measures for some special supergames. We assume that each player located on the vertices of the rectangular lattice plays two-person, two-strategy games with her four neighbours simultaneously, and that the payoff function is symmetric. We prove some results on the existence of ergodic measures.
In Section 4, we investigate another class of supergames. We assume that each player located on the vertices of the rectangular lattice plays four 4-person team games, or 4-person 2-pair team games, simultaneously with her neighbours. The formulas of the invariant measures, which represent the long-run equilibrium plays with symmetric payoffs, are obtained.
Section 5 is the conclusion, with some speculation on possible future research on other types of supergames on different lattices.

General Formulation of a Class of Large Supergames
This section is devoted to describing the general formulation of the class of supergames. Subsection 2.1 presents the ingredients. Subsection 2.2 introduces the strategy evolution process (SEP). Subsection 2.3 depicts the subclasses of games. Subsection 2.4 contains general results on the existence of invariant distributions, and on the reversibility and ergodicity of the SEP.

Ingredients
The class of large supergames we investigate in this paper has the following ingredients:

(a) The players
We assume that players are located on the sites of a graph G = (V, E), where V is the vertex set and E is the edge set of the graph.
In this work, we assume that V is a lattice, usually Z^d or a finite sublattice of it. We also assume that all the players are identical.

(b) Neighbourhood
A neighbourhood structure N is assigned with each specific model; N is a collection of nonempty subsets of the vertices of V. Typical examples are the von Neumann neighbourhood and the Moore neighbourhood.

(c) Stage games
In our class of supergames, the stage games are played over discrete time.
At each discrete time, every player plays a finite-strategy n-person game simultaneously with her neighbours. Mixed strategies are used in general. At the end of each stage game, every player obtains information about the pure strategies her neighbours took in the finished game, and may then revise her strategy, under some global ordering of updating, for the next game according to this information and the payoff she received. Then the game is repeated.

(d) Strategy and payoff
Let A_i be the finite set of all possible pure strategies that player i can take; since the players are identical, we assume A_i = A for all i. In an n-person game, let

u(a_1, a_2, ..., a_n)

be the payoff to the player who plays a_1 when her opponents play a_2, ..., a_n. At each period of time, all players may or may not update their strategies simultaneously. Associated with each game is an ordering according to which the players change their strategies. Such an ordering over V, pre-specified as in extensive form games, is represented by the global updating rule.
The global updating rule will be called synchronous if all the players change their strategies simultaneously at the same time; sequential if they change their strategies one by one under a fixed ordering; group-sequential if the players within a group change their strategies simultaneously, while different groups change their strategies one group at a time under a fixed ordering; and asynchronous if at a given time only one player, selected at random with uniform probability, updates her strategy. The sequential and asynchronous updating rules are applicable only to the case with finitely many players (i.e., V is finite).
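The three updating rules used most below can be sketched as follows; the representation of sites as coordinate tuples, the generic `update_site` rule, and the even/odd parity convention are illustrative assumptions:

```python
import random

def synchronous_step(state, update_site):
    # All players revise simultaneously; each sees the old configuration.
    return {i: update_site(i, state) for i in state}

def even_odd_step(state, update_site, t):
    # Group-sequential (even-odd): sites with even coordinate sum move at
    # even t, sites with odd coordinate sum at odd t.
    parity = t % 2
    new = dict(state)
    for i in state:
        if sum(i) % 2 == parity:
            new[i] = update_site(i, state)
    return new

def asynchronous_step(state, update_site, rng=random):
    # A single player, chosen uniformly at random, revises (finite V only).
    i = rng.choice(sorted(state))
    new = dict(state)
    new[i] = update_site(i, state)
    return new
```

Each step returns a fresh configuration, so the old one can still be consulted, which is exactly what the synchronous rule requires.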

Strategy evolution process (SEP)
The dynamics of a supergame is characterized by a stochastic process, called the strategy evolution process (SEP) in this paper. Technically, the SEP for a large supergame is a Markov chain, whose state at time t is denoted by X_t; it takes values in the configuration space A^V, and x_t is the realization of X_t.
Let P(x, y) be the global one-step transition probability from x to y. It is defined for the different global updating modes as follows:
(i) Synchronous: the global transition probabilities of the SEP are given by (2.3).
(ii) Group-sequential: in this work, we discuss two specific modes of group-sequential rules. One is the even-odd sequential rule for the Z^d model, in which the players on even sites update at even times; if t is odd, the updating rule is obtained by reversing the roles of the even and odd sites. This gives (2.4). We discuss a three-step group updating rule in Subsection 3.2.
(iii) Asynchronous: in this case, we assume that V is finite. At each time, one player i is selected uniformly at random, and the chain moves from x to a configuration that is identical to x except possibly at site i; this gives (2.5).

Subclasses
In the remainder of this section, we prove some results which apply to the whole class. The class of supergames we study is rather broad; all the subclasses can be represented in the following chart. Each cell in the chart can be further divided into four subcells according to the homogeneity and symmetry of the payoff. For details, see the next section.

Invariant measures, ergodicity and reversibility
We are interested in conditions on the local transition probability for the existence and uniqueness of the invariant measures, and for the ergodicity and reversibility of the SEP. For the same local transition rule given by (2.1), what are the differences between the invariant measures of the SEPs with different types of global updating rules? In certain cases, there may exist multiple invariant measures; this phenomenon is called a phase transition.
We are also interested in the inverse problem: for a given distribution π on A^V, find all the SEPs with π as their invariant measure, specifically when π is Gibbsian.
The answers to these problems vary with the game type and the updating rule; finite and infinite V also lead to different results. We discuss them separately in the next section.
The global transition probabilities (2.3) and ((2.4) or (2.5)) define a discrete-time Markov process on the configuration space A^V.
The following result is well known.

Lemma 2.1. The invariant measures for the time evolution form a nonempty convex set.
Proof. See [9].
For the SEP with a given type of updating rule, we define the following. A SEP is ergodic if the chain is regular, i.e., it has a unique invariant measure, which almost surely describes the limit behaviour of the SEP. A SEP will be called Gibbsian if its invariant measure corresponds to the probability distribution of a Markov random field (MRF) on A^V, in which the summation is taken over the cliques of V and the function appearing in the exponent is called the potential function. We call a SEP reversible if the corresponding chain is reversible. It is well known that reversibility is equivalent to the detailed balance condition

μ(x) P(x, y) = μ(y) P(y, x) for all x, y. (2.7)

Any reversible probability distribution is invariant, since summing (2.7) over x gives μ(y) = Σ_x μ(x) P(x, y).
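Both the detailed balance condition (2.7) and the stationarity it implies can be checked numerically on a toy finite chain. The two-state chain below, with hypothetical flip rates p and q, is only an illustration of the logic, not one of the paper's SEPs:

```python
import itertools

def is_reversible(mu, P, tol=1e-12):
    """Detailed balance (2.7): mu(x) P(x, y) == mu(y) P(y, x) for all x, y."""
    return all(abs(mu[x] * P[x, y] - mu[y] * P[y, x]) < tol
               for x, y in itertools.product(mu, mu))

def is_invariant(mu, P, tol=1e-12):
    """Stationarity: mu(y) == sum_x mu(x) P(x, y) for every y."""
    return all(abs(mu[y] - sum(mu[x] * P[x, y] for x in mu)) < tol
               for y in mu)

# A hypothetical two-state chain: flip 0 -> 1 w.p. p, flip 1 -> 0 w.p. q.
p, q = 0.3, 0.1
P = {(0, 0): 1 - p, (0, 1): p, (1, 0): q, (1, 1): 1 - q}
mu = {0: q / (p + q), 1: p / (p + q)}
```

Here mu satisfies detailed balance (mu(0) p = mu(1) q = pq/(p + q)), and summing (2.7) over the first argument confirms that mu is invariant.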

Theorem 2.1. A SEP is Gibbsian, if and only if it is reversible.
Proof. See [9].

Invariant Measures for Some Special Models
In this section, we study the above-mentioned problems in detail for some special game settings. The discussion is organized according to the number of players in a stage game.

Two-person game
In this subsection, the players are located on V, which may be Z^d or a finite sublattice of it, with the neighbourhood structure of von Neumann type. Every player plays a q-strategy two-person game simultaneously with each of her nearest neighbours (for the Z^2 case, see Figure 2). We denote by Q(x, y), x, y ∈ A, the payoff to player i when she plays x against a neighbour playing y.

Figure 2. Supergame based on basic two-person games on Z^2.
Here |N| is the cardinality of the set N. Note that both λ and λ′ depend on the current configuration. Roughly speaking, the probability that player i switches her strategy from y to z is proportional to the difference between the payoffs of these two strategies. We will discuss three cases.

Homogeneous game with symmetric payoff
In this case, all the payoff functions equal Q, which is symmetric.
The local transition probability then reads as (3.2). To discuss the different global updating rules, we assume that V is a finite box with |V| = M.

(i) Asynchronous and even-odd sequential cases.

Theorem 3.1. Consider a large homogeneous supergame with finitely many players located on a lattice V, where the payoff matrix of the two-person game is symmetric. Then the SEP whose asynchronous global transition probability is given by (2.5) with the local transition rule (3.2), and the SEP whose even-odd sequential global transition probability is given by (2.4) with the same local rule (3.2), have the following distribution on A^V as their reversible invariant measure: (3.3), where the summation is taken over all nearest-neighbouring pairs of players.
Proof. We only need to check the detailed balance condition (2.7) for y differing from x at a single site i in the asynchronous case, and for y whose even (or odd) coordinates differ from those of x in the even-odd sequential case.
(a) In the asynchronous case, we verify (2.7) directly for configurations differing at a single site; the computation is straightforward.

For even-odd sequential and asynchronous models, it has been realized that there is an intimate relation between d-dimensional time evolution and equilibrium statistical models (ESM) in d + 1 dimensions, the extra dimension being the discrete time ([3, 9]). In fact, the trajectory of the SEP may be regarded as a configuration on the space-time lattice.
It is easy to see that if the transition probabilities are all strictly positive, then ν_μ is a Gibbsian measure on the space-time lattice.
When V is infinite, there are various ways to define finite-volume Gibbs states which, in the thermodynamic limit, yield the space-time measure ν_μ of the time evolution as an infinite-volume Gibbs measure. From the theory of ESM, it is important to note that there may exist more than one Gibbs measure on the space-time lattice, which indicates the existence of more than one stationary or periodic measure ν for the time evolution, that is, a phase transition.

Example. Binary strategy game
We assume that each player has only two choices of strategies, which may be identified as {−1, +1} for simplicity and convenience. The payoff matrix is given by the symmetric matrix with entries a, b, and d. Alternatively, we may write Q(x, y) in the form

Q(x, y) = J xy + K(x + y) + L,

where J, K, and L are uniquely determined by a, b, and d, and vice versa (J = (a + d − 2b)/4, K = (a − d)/4, L = (a + d + 2b)/4). The invariant probability measure can then be written in a form involving only nearest-neighbour pair interactions. This is not surprising, because the game is homogeneous, i.e., the same payoff matrix is assigned to all two-person games.
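The reversible asynchronous dynamics for this {−1, +1} example can be sketched as single-site heat-bath updating on a small torus. The coupling J, the inverse temperature beta, and the torus geometry are illustrative assumptions, chosen only to be consistent with the nearest-neighbour Ising form of the invariant measure:

```python
import math
import random

def glauber_step(config, L, J=1.0, beta=1.0, rng=random):
    """One asynchronous update on an L x L torus.

    A uniformly chosen player switches to +1 with the heat-bath probability
    determined by her four von Neumann neighbours' plays; this chain
    satisfies detailed balance (2.7) with respect to the Gibbs measure
    proportional to exp(beta * J * sum of s_i s_j over neighbour pairs).
    """
    i, j = rng.randrange(L), rng.randrange(L)
    field = sum(config[(i + di) % L, (j + dj) % L]
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * J * field))
    config[i, j] = 1 if rng.random() < p_plus else -1
    return config
```

For large beta the all-(+1) configuration is effectively absorbing, illustrating how coordination on one strategy persists under the reversible dynamics.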

Four-person team game
We return to the model on Z^2. Four mutually neighbouring players form a team to play a four-person team game. Define the basic square ⟨i, j, k, l⟩ to be the square with vertices i, j, k, and l in clockwise order. Denote by S_i the set of basic squares which contain vertex i, and by S the set of all basic squares.
Each player is a member of four four-person teams consisting of herself and her neighbours. At each time, every player plays finite-strategy four-person team games with the four neighbouring teams simultaneously (see Figure 3). Suppose the payoff function is symmetric. The local transition probability for the SEP is given by (3.8). We discuss the homogeneous game with symmetric payoff only, and claim that, for the SEP with the asynchronous global updating rule associated with the local transition probability (3.8), the invariant measure is a Gibbs measure analogous to (3.3), with the summation taken over the basic squares instead of the nearest-neighbouring pairs.
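The combinatorics of basic squares can be sketched on a finite L × L torus; the wrap-around boundary is an illustrative assumption that keeps every vertex in exactly four basic squares, mirroring the four teams each player belongs to:

```python
def basic_squares(L):
    """All basic squares of an L x L torus, each listed by its four
    vertices (i, j), (i, j+1), (i+1, j+1), (i+1, j) in clockwise order
    (orientation convention assumed for illustration)."""
    return [((i, j), (i, (j + 1) % L), ((i + 1) % L, (j + 1) % L),
             ((i + 1) % L, j))
            for i in range(L) for j in range(L)]

def squares_containing(v, squares):
    """S_v: the basic squares that contain vertex v."""
    return [sq for sq in squares if v in sq]
```

On the torus there are L * L basic squares, and each vertex lies in exactly four of them, so every player indeed participates in four four-person teams.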

Example. Binary strategy game
We assume that each player has only two choices of strategies, which may be identified as {−1, +1}.

Conclusion
We conclude with several remarks. (i) With a general payoff function, which is not necessarily symmetric, the process may not be reversible. It is interesting to find other, or possibly all, asymmetric payoff functions with which the SEP could be reversible.
(ii) For the synchronous global updating rule, it seems more difficult to find the invariant measure; we have only treated some special cases.
(iii) We may consider various other types of team games. For example, we may consider three-person team games for players located on the triangular lattice.