Open Access

Joint iterative beamforming and power adaptation for MIMO ad hoc networks

  • Engin Zeydan1Email author,
  • Didem Kivanc2,
  • Ufuk Tureli2 and
  • Cristina Comaniciu1
EURASIP Journal on Wireless Communications and Networking20112011:79

DOI: 10.1186/1687-1499-2011-79

Received: 30 November 2010

Accepted: 26 August 2011

Published: 26 August 2011


In this paper, we present distributed cooperative and regret-matching-based learning schemes for joint transmit power and beamforming selection for multiple antenna wireless ad hoc networks operating in a multi-user interference environment. Under the total network power minimization criterion, a joint iterative approach is proposed to reduce the mutual interference at each node while ensuring a constant received signal-to-interference and noise ratio at each receiver. In cooperative and regret-matching-based power minimization algorithms, transmit beamformers are selected from a predefined codebook to minimize the total power. By selecting transmit beamformers judiciously and performing power adaptation, the cooperative algorithm is shown to converge to a pure strategy Nash equilibrium with high probability in the interference impaired network. The proposed cooperative and regret-matching-based distributed algorithms are also compared with centralized solutions through simulation results.


MIMO ad hoc networks game theory beamforming

1 Introduction

Multiple-input multiple-output (MIMO) communication techniques have been shown to boost the capacity and spectral efficiency of wireless communication systems [1, 2]. MIMO wireless systems can sustain more simultaneous transmissions in a reduced area through interference management [3]. When transmission parameters such as transmit power, beamformer selection, frequency, modulation, transmission rate are modified to adapt to the interference environment, MIMO systems gain an additional advantage. Adaptive wireless systems can achieve system efficiency, lower computational complexity, and overhead compared to a centralized system.

Transmit beamforming has been the focus of extensive research in the literature [411] and designing optimum signaling at the transmitter can lead to significant improvements for systems operating in varying interference [4, 6, 1216]. In spatial transmit beamforming, each communicating node's symbol stream is multiplied by a preselected transmit beamforming weight vector for transmission through multiple antennas such that the overall interference due to other multiple nodes is minimized. Adaptive optimizing of transmitter beamforming improves efficiency by steering the beam toward the intended receiver, while placing nulls toward the unintended receivers in order to avoid causing excessive interference to them. Transmitters may adapt their signals using a low-rate feedback from the receiver [17]. A power control mechanism can also be combined with limited rate feedback from the receiver in order to satisfy certain Quality-of-Service (QoS) requirements at the receiver [1820].

In general, MIMO beamforming techniques in communication systems are addressed in three different systems: point-to-point, cellular, and ad hoc networks. The great potential of MIMO in point-to-point communication is shown in [1, 4, 6, 21] and linear precoders (eigen-coders) and beamformers have been designed for point-to-point MIMO links in [5, 7]. In cellular networks, beamforming algorithms minimize the total power and enhance capacity for array-equipped base stations and single antenna mobile transmitters [811]. In ad hoc networks, without a central controller, distributed beamforming techniques increase system throughput and lower energy consumption [12, 2224]. However, optimization solutions designed for ad hoc networks need careful study, because the environment is interference limited and the performance of MIMO techniques depends significantly on the overhead introduced by the proposed algorithms.

Distributed spatial beamforming algorithms are proposed for multi-user ad hoc MIMO networks in [23, 24] under channel reciprocity conditions. Channel reciprocity holds when the channel matrix at the receiver is the transpose of the channel matrix at the transmitter, this is usually assumed in time-division duplex (TDD) systems [24]. Bromberg [24] consider the capacity maximization problem and propose a locally enabled global optimization (LEGO) algorithm for distributed beamforming update under Gaussian other-user interference. Iltis et al. [23] formulate the problem as a non-cooperative game for overall power minimization of the network under a constant QoS constraint (i.e., target signal-to-interference plus noise ratio (SINR)). The proposed iterative minimum mean-square error (IMMSE) algorithm solves an optimization problem by computing transmit/receive beamformer pairs and transmit powers in a distributed manner [23]. In the IMMSE algorithm, the receive beamformer is the conjugate of the transmit beamformer and the algorithm relies on the channel reciprocity condition. Hence, the IMMSE algorithm does not demand explicit feedback schemes for channel state information (CSI) at the receiver. However, during the updating procedure of the IMMSE algorithm, transmission overhead of training sequences and power control commands are incurred. The amount of overhead increases with iterations, since the algorithm performs transmit/receive beamformer and power updates iteratively. Moreover, if the transmitter and receiver use different channels or frequencies for transmission and reception, i.e., when channel reciprocity is not valid, CSI must be fed back to the transmitter, which necessitates overhead.

In order to lower the communication overhead between transmitter and receiver when channel reciprocity does not hold, a scheme to limit feedback by quantizing the transmit beamformer in single user MIMO systems is proposed in [21]. The concept is based on selecting a codeword in a predetermined codebook that is known to both transmitter and receiver. Selecting the transmit beamformer from a predefined codebook reduces overhead in nonreciprocal channels. Moreover, latency is reduced in highly mobile and unstable communication networks and user participation is minimized. In this scenario, the receiver only feeds back the index of the selected transmit beamformer to the transmitter. When there is no channel reciprocity between transmitters and receivers, an iterative limited feedback beamforming algorithm using a predetermined codebook is proposed in [25]. The algorithm maximizes the transmission rate in MIMO multi-user ad hoc networks using sequential discrete transmit beamformer selection updates. In each iteration, each node formulates its best response strategy, which maximizes the received SINR. However, the convergence of the algorithm has not been investigated.

Game theory has enabled efficiency and convergence proofs of some of the important problems in wireless communications such as distributed power control algorithm design [26], joint code-division multiple access (CDMA) waveform, and power control design [19, 20, 27] and optimum transmission signaling strategies [28, 29]. The application of game theory to distributed beamforming is problematic [23]. Lacatus and Popescu [20] and Popescu et al. [19] study joint CDMA codeword (or sequence) and power adaptation as a noncooperative game. The problem is formulated as a separable game using noncooperative convex games, with corresponding sub-games: power control and codeword control game. However, in contrast to our joint optimization problem, the joint optimization of powers and CDMA codewords is investigated only over convex games (i.e., the set of action space is nonempty, compact, and convex [26, 30]), and therefore the decision variables (i.e., the powers and codeword sequences) are continuous, not discrete in these games.

Optimum transmit signaling for rate maximization in MIMO interference systems has been studied using game theory [1216]. In these papers, the system is modeled as a noncooperative game where every MIMO link is a player and computes against the others by choosing the transmit covariance matrix to maximize its own rate. Liang and Dandekar [13] investigate rate maximization for MIMO ad hoc networks by performing power control. The existence of a Nash equilibrium (NE) solution is shown using concave game analysis. The convergence of the proposed algorithms is not studied. Arslan et al. [14] show that individual mutual information maximization is a concave game [31] in MIMO interference channels, which implies the existence of a NE for arbitrary channel matrices. The equilibrium is provably unique when multi-user interference (MUI) is sufficiently small. Decentralized algorithms using local information provide update strategies to determine the link parameters. As an extension of their work and for more general conditions, the uniqueness of the NE solution is provided in [15]. Scutari et al. [15] provide a unified framework for the noncooperative mutual information maximization problem for MIMO interference systems. A unified set of sufficient conditions guaranteeing the uniqueness of the NE and the convergence of asynchronous water-filling algorithm is provided for square nonsingular channel matrices. The analysis is based on interpreting the MIMO water-filling operator as a matrix projection onto the convex and closed set of covariance matrices. In [16], same authors extend their results for arbitrary channel matrices. However, these papers do not address the selection of discrete optimized signaling. The existence (or uniqueness) of the NE solution that is proven in [13, 14] is valid either for convex or concave games or for positive definite covariance matrices that are well defined as a convex and closed set [15, 16]. Cooperative and noncooperative algorithms for joint channel and power allocation chosen from the "discrete" strategy space are studied in [32] in the context of wireless mesh networks. However, the proposed noncooperative algorithm is suboptimal and one of the adaptation parameters (i.e., channel adaptation) is not followed after the first iteration.

Power minimization using distributed algorithms with transmit beamformer selection is challenging especially in ad hoc networks. Unlike power control games, in beamforming games there is no natural ordering of the actions [23]. In MIMO ad hoc networks operating in MUI environments, the interference at each user depends on the transmission parameters of the other users. The beamforming decision of each user reshapes the interference emitted to other links, in ways that may be difficult to predict. Changing the beamforming vector may reduce interference on some links, while other links may suffer from higher interference. The affected nodes will then change their own beamforming vectors, setting off an cascade of changes in the network. Moreover, if the node pairs belong to different regulation entities, the noncooperative node pairs may only want to minimize their own transmit power rather than the overall power.

The analysis for the selection of actions from the "discrete" codebook set and convergence analysis is still missing for joint transmit beamforming and power adaptation in the literature. To the best of authors' knowledge, the problem of joint discrete transmit beamforming and power adaptation has not been formalized in multi-user MIMO ad hoc networks. In this paper, we study a decentralized approach for optimizing the transmit beamformer and power levels using local information and reasonable computational burden. We consider total power minimization under a constant received target SINR constraint. Our contributions in this paper are twofold: First, we study an efficient cooperative beamforming algorithm for global power minimization problem with convergence analysis. For the cooperative algorithm, the amount of information to be exchanged between nodes will grow with the number of iterations. Second, we study a noncooperative regret-matching learning algorithm which jointly selects transmit beamformer and power to minimize the total power consumed by the network. The noncooperative solution reduces the amount of overhead by using only local information. We compare the performances of our proposed algorithms with the optimal global solution which is found by exhaustively searching the entire feasible strategy space.

The rest of this paper is organized as follows. Section 2 outlines the system model used in the paper. The optimization problem and its game theoretical interpretation are presented in Section 3. The cooperative wireless ad hoc network and noncooperative counterpart are investigated Sections 4 and 5, respectively. The performance evaluation of the proposed algorithms is provided in Section 6. Finally, Section 7 concludes the paper.

2 System model and concepts

In this paper, we consider a wireless ad hoc network consisting of multiple transmit and receive antenna node pairs as shown in Figure 1. All nodes are assumed to be using the same channel. The interference comes from the other node pairs which operate simultaneously on the same channels. In this ad hoc network model, there are N node pairs and each node pair m {1, 2, ..., N} consists of one transmitter node and one receiver node. Each transmitter and receiver node is equipped with T antennas. Each node has a unit-norm receive/transmit beamformer pair (w m , t m ) with w m , t m T . The complex symbol stream transmitted is b m with E{|b m |2} = 1. The received symbol stream is b ^ m .
Figure 1

Multi-user power control and limited feedback transmit beamforming scheme for MIMO ad hoc networks. (t k ) m represents the m th row of the k th user's transmitter vector t k .

The received signal vector r m T at the m th receiving node is given by
r m = P m H m , m t m b m + i m P i H m , i t i b i + n m ,

where Hm,idenotes the T × T MIMO channel between the i th transmitting node and the m th receiving node and is assumed to be quasi-static and P m is the power of the m th transmitting node. The additive white Gaussian noise terms n m T have identical covariance matrices σ2I T where σ2 is the noise power and I T is the T × T identity matrix. We note that different covariance matrices for noise will not affect the performance of the proposed algorithms. Note that the first term of the right-hand side of (1) is the desired signal, whereas the second term is the interference from the other transmitting nodes.

As we are interested in the minimum achievable power, we consider the worst case where all node pairs always have some packets to transfer and all nodes in the network can transmit simultaneously. The network is assumed to be synchronous. The set of available code-book beamformers for the m th transmitting and receiving node pair is denoted by Δ m = { t m 1 , t m 2 , , t m ϒ } with cardinality ϒ. In a limited feedback beamforming system, the receiving node selects a transmit beamformer from the codebook and feeds back the index of the selected beamformer. Each node can select between ϒ transmit beamformer vectors. Let t m Δ m be the selected transmit beamformer for the m th transmitting and receiving node pair. Denote Θ = [t1, t2, ..., t N ]T and P = [P1, P2, ..., P N ]T as the transmit beamformer selection and transmission power vectors for N nodes, respectively. The T × T the interference plus noise covariance matrix at the m th receiving node is
R m ( Θ - m , P - m ) = i m P i H m , i t i t i H H m , i H + σ 2 I ,

where Θ -m and P -m are the transmit beamformers and powers of nodes other than m.

An antenna beam pattern that adjusts the antenna gains to form nulls toward the directions of the interferers while keeping a constant gain toward the directions of the multi-path of the intended receiver can be designed using receive antenna arrays. The minimum variance distortionless response beamformer [23, 33] can adjust the array weights properly such that the sum of interference and noise is minimized. The normalized receive beamformer at m th receiving node is
w m = w ^ m | | w ^ m | | ,

where w ^ m = R m - 1 H m , m t m . The resulting received SINR at the m th receiving node due to desired transmitter of m th node pair is

Γ m = P m | w m H H m , m t m | 2 i m P i | w m H H m , i t i | 2 + σ 2 , (4)

where ||w m ||2 = ||t m ||2 = 1 for all m.

The proposed distributed algorithms attempt to achieve a target SINR by adjusting transmit powers. To construct a distributed iterative limited feedback beamforming scheme, let us first consider the case when there is only one node pair in the wireless network. The receiver selects the transmit beamformer from the code-book Δ1 as

t 1 = arg max t 1 Δ 1 Γ 1 , (5)

where t 1 is the optimal transmit beamformer selection for one node pair. Then, the receiver returns the index of the beamformer for transmit beamformer selection t 1 and the received "normalized" SINR, ( t 1 ) H H 1 , 1 H R 1 - 1 H 1 , 1 t 1 , through the low-rate feedback channel where R 1 - 1 = I since there is no interference in a single user system. The transmitter selects the transmitter beamformer in order to minimize its own transmission power P1, where P1 is updated as
P 1 = γ 0 ( t 1 ) H H 1 , 1 H R 1 - 1 H 1 , 1 t 1 ,

where γ0 is the target SINR value.

Consider now the case where N node pairs coexist in the wireless network. Note that for each node pair m, the value of received SINR, i.e., Γ m , is a function of (Θ, P). Therefore, the transmit power of one node pair depends not only on the transmit beamformer it selects, but also on the transmit power and beamformer selection of other nodes in the network. Furthermore, in beamforming, if user im changes its transmit beamformer t i to increase its own SINR Γ i , it can either increase or decrease Γ m , the SINR of link m, depending on the relative positions of the nodes. Therefore, designing an optimal distributed algorithm which converges to a set of beamformers to minimize the overall transmit power while meeting target SINRs for all node pairs is not a straightforward task.

3 Optimization problem and game theoretical interpretation

The goal is to minimize the transmit power of all nodes m {1, 2, ..., N} under constant target SINR γ0. The optimization problem can be defined as,
Minimize Θ , P m = 1 N P m ,

subject to Γ m γ0, ||w m || = ||t m || = 1,

Pmin < P m Pmax, m {1, 2, ..., N}, where Pmin and Pmax are the minimum and maximum transmit powers, respectively. We consider the above problem as a normal form game ∏ which can be mathematically defined by the triplet = N , C , { U m } m = 1 N where N = { 1 , 2 , , N } is the finite set of players of the game, C = C 1 × C 2 × × C N represents the set of all available actions for all the players and { U m } m = 1 N : C is the set of utility functions that the players associate with their strategies. The actions c m C m for a player m are the selection of transmit powers P m [Pmin, Pmax] and the transmit beamformer t m Δ m .

Players select actions to maximize their utility functions. One of the questions that arise is if there exists a convergence point, a set of strategies, in our case a set of beamforming selections Θ = [t1, ..., t m , ..., t N ]T and power allocations P = [P1, P2, ..., P N ]T from which no player would deviate. In game theory such a set of strategies is called a Nash equilibrium (NE). A NE for a game is a set of strategy profiles c = [c1, c2, ..., c N ] from which no player can increase his utility by unilateral deviations. A strategy profile (c m , c -m ) is a NE if and only if
U m ( c m , c - m ) U m ( c m , c - m ) m N , c m , c m C m ,

where ( c m , c - m ) refers to the strategy profile in which the action of user m is changed from c m to c m while the actions of all the other players in the game remain the same. In the following sections, we will discuss the scenarios where the node pairs are cooperative and non-cooperative respectively in order to search for the best results and provide convergence guarantees.

4 Cooperative and noncooperative beamforming for MIMO ad hoc networks

4.1 Optimal (centralized) solution

In a wireless ad hoc network with a centralized agent, the transmit beamformers and the corresponding transmit powers can be jointly selected to minimize the total transmit power of all transmitting antennas as,
( Θ , P ) = a r g min Θ , P m = 1 N P m ( Θ , P - m ) ,
where Θ = [ t 1 , t 2 , , t N ] T and P = [ P 1 , P 2 , , P N ] T are the optimal transmit beamformer and power solutions, respectively. The transmit power P m of m th node pair is defined as
P m ( Θ , P - m ) = γ 0 t m H H m , m H R m - 1 H m , m t m .

where R m is a function of (Θ -m , P -m ) as shown in (2). A naive approach for solving the problem is to investigate all strategy profiles Θ = [t1, ..., t m , ...,t N ] T exhaustively (note that for a given fixed strategy profile Θ, the corresponding power profile P can be computed using (10) for each individual node pair m). In order to compute (9), the centralized agent evaluates the total network power for ϒ N possible beamforming vector combinations. For example, for a network size with 10 node pairs where each user has to select from a code-book of size ϒ = 1 6 beamformers, the search space is 1610 strategy profiles. Consequently, finding the centralized transmit beamformer is cumbersome in large-scale wireless ad hoc network. To alleviate the complexity problem, while maintaining good performance results, we propose two decentralized power minimization algorithms using cooperative and noncooperative techniques.

4.2 Cooperative power minimization using beamforming

In this section, we consider scenarios where all node pairs in the wireless network are cooperating. In a cooperative game, nodes in the network are able to coordinate and select the transmit beamformers accordingly. We want to find the optimal transmit beamformer and power assignments such that the total power by all the nodes in the network is minimized. The objective function can be written as
U n e t w o r k ( Θ , P ) = - m = 1 N P m ( Θ , P - m )
We assume that each user's utility function is (11). That is,
U i ( Θ , P ) = U n e t w o r k ( Θ , P ) (1) = - m = 1 N P m ( Θ , P - m ) , i N . (2) (3)

In other words, we model the game as an identical interest game which is a special case of potential games [34]. It is easy to verify that all identical interest games have at least one pure NE, which will represent any action profile that maximizes Unetwork(Θ, P) [14, 32]. We analyze a cooperative power minimization algorithm (COPMA) which can converge to the optimal NE with arbitrarily high probability. This method is analogous to the decentralized negotiation method called adaptive play[14]. The key characteristic of COPMA is the randomness deliberately introduced into the decision-making process to avoid reaching a local solution. In COPMA, the choices of players (in our case transmit beamformer selections) lead the system to the optimal NE solution with arbitrarily high probability.

Motivated by Song et al. [32], COPMA can be implemented distributively as follows: Assume that each node pair m in the network has an unique ID m and maintains two variables P m c u r r e n t and P m u p d a t e d which are the transmit power of the m th node pair prior to and after the change of transmit beamformer, respectively. The node pairs can be chosen randomly or in a round-robin order for updating of the transmit beamformers. Whenever a node pair changes its strategy, it broadcasts a vector [ I D m , P m c u r r e n t , P m u p d a t e d ] via a backbone network. After that, all the other node pairs i N \ m will set P i c u r r e n t = P i u p d a t e d , recalculate P i u p d a t e d as the new transmit power and send the vector { I D i , P i c u r r e n t , P i u p d a t e d } to the updating node pair m. Finally, the m th node pair will decide whether the new transmit beamformer should be kept or changed with some probability which depends on pcurrent and pupdated which are the total transmit power in the network prior to and after the random change of the transmit beamformer, respectively. Note that since pcurrent and pupdated are calculated by each node pair independently, the unique ID of each node provides a checklist to accurately add up transmit powers. For this paper, we assume that unique node IDs are built into each node and in network timing synchronization is perfect so that power updates are always received in the correct round. The detailed description of COPMA is provided as follows:

Initialization: For each transmitting and receiving pair m, the initial index of transmit beamformers for all node pairs is selected as one and the initial transmit powers are set as P m = P max , m N .

Repeat: Randomly choose a node pair m as the updating pair with probability 1/N. Denote t m (n) Δ m as the current transmit beamformer of the m th node pair at iteration n.
  1. 1.

    Set t m (n) = t m (n - 1), m N . Calculate P m c u r r e n t as in (10) m N .

  2. 2.

    To update node pair m, randomly choose a transmit beamformer, t m u p d a t e d Δ m and calculate the transmit power required when the updated transmit beamformer is used, P m c u r r e n t as in (10). Then, broadcast a data vector [ I D m , P m c u r r e n t , P m u p d a t e d ] to all other node pairs i N \ m .

  3. 3.
    After receiving the data vector, for each i,
    • If P i changes (due to change in interference perceived at the i th receiver), every other node pair i N \ m sets P i c u r r e n t = P i u p d a t e d and calculates its new transmit power from (10) and sets it to P i u p d a t e d .

    • If P i does not change, P i c u r r e n t and P i u p d a t e d remain unchanged.

After P i c u r r e n t and P i u p d a t e d are updated for every other node pair i N \ m in the network, send back the vector { I D i , P i c u r r e n t , P i u p d a t e d } to node pair m.
  1. 4.

    Node pair m computes the current total network power as P c u r r e n t = m = 1 N P m c u r r e n t and updated total network power as P u p d a t e d = m = 1 N P m u p d a t e d with t m u p d a t e d based on the received power values from all other node pairs i N \ m .

  2. 5.
    For a smoothing factor τ > 0, set t m ( n ) = t m u p d a t e d for the m th node pair with probability
    1 1 + e x p ( ( P u p d a t e d - P c u r r e n t ) τ ) .
i.e., the updating node pair m selects t m u p d a t e d with probability (13).
  1. 6.

    The m th node pair broadcasts a notifying signal that contains the decision about whether the new transmit beamformer is kept. If not kept, every other node pair i N \ m keeps P i u p d a t e d = P i c u r r e n t


Until : Predefined number of iteration steps n = κ.

Note that step-5 of the updating rule implies that if t m u p d a t e d yields a better performance, i.e., (Pupdated- Pcurrent) < 0, the m th node pair will change to up-dated beamformer t m u p d a t e d with high probability. Otherwise, it will keep the current transmit beamformer with high probability. Note also that the tradeoff between COPMA's performance and convergence speed is controlled by the parameter. τ. Large τ represents extensive space search with slow convergence, whereas small τ represents restrained space search with fast convergence. The smoothing factor τ is selected to be a function of the number of iterations n such that as n increases, τ ↓ 0. For example, we chose τ inversely proportional to n2 in our simulations. The long-term behavior of COPMA is characterized in the following theorem.

Theorem 1 : Assume that the objective of each node pair is defined as the sum power minimization in the network as defined in (9). Let Θ (k) = [t1(k), t2(k), ..., t N (k)] denote the profile of choices at step (or iteration) k in COPMA and Θ = [ t 1 , t 2 , , t N ] the optimal profile. COPMA converges to the optimal NE with arbitrarily high probability. In other words,
lim τ 0 lim k + τ ( Θ ( k ) = Θ ) = 1 .

Proof The proof of Theorem 1 follows similar arguments as presented in [14, 32, 35].

Notice that the transmit beamformer selection with N players, each with ϒ codebook size, generates an N-dimensional Markovian chain on a finite state space with ϒ N states or different profiles. We study the analysis for two-player games, i.e., N = 2 dimensional case as shown in Figure 2. The analysis can be easily extended for multi-player games, i.e., for an N-dimensional Markovian chain.
Figure 2

Two players markov chain for COPMA.

Let t m Δ m and t k Δ k be the choices of two players say m and k, and without loss of generality assume that Δ = Δ m = Δ k = [ t m 1 , t m 2 , , t m ϒ ] . The players m and k can choose a transmit beamformer from Δ. Let Θ i j denote the state [ t m i , t n j ] ϒ 2 where the m th user selects the i th transmit beamformer t m i and the n th user selects the j th transmit beamformer t n j . At an arbitrary time instant, for any state of the Markovian chain, only one of the players can update their transmit beamformer. Therefore, for example in Figure 2, state Θ i j = [ t m i , t n j ] can only transit into a state either in the same row or the same column. For any fixed τ > 0, the transition probability from state Θ i j = [ t m i , t n j ] ϒ 2 to state Θ l p = [ t m l , t n p ] ϒ 2 is given by
τ ( Θ l p | Θ i j ) = 1 2 ϒ ( 1 + e ( P ( Θ l p ) - P ( Θ i j ) ) τ ) ,

where Θ ij and Θ lp differ in exactly one transmit beamformer selection, i.e., Θ ij Θ lp for i = l or j = p, τ is the smoothing factor of COPMA and P(Θ ij ) is the minimum total network power required to reach target SINR γ0 for both users at state Θ ij calculated using (10) for each user. If Θ ij and Θ lp are different in more than one position, then τ (Θ lp | Θ ij ) = 0. In addition, τ (Θ ij | Θ ij ) > 0 is always true. Therefore, for any fixed τ > 0, the induced Markov chain is irreducible and aperiodic.

The stationary distribution τ for each state can be obtained from the following balance equations (using the arrows in Figure 2):
p = 1 , p j ϒ τ ( Θ i j ) × τ ( Θ i p | Θ i j ) = (1) p = 1 , p j ϒ τ ( Θ i p ) × τ ( Θ i j | Θ i p ) , (2) (3)
for all Θ i j ϒ 2 and Θ i p ϒ 2 . Substituting (15) into (16) gives
p = 1 , p i ϒ τ ( Θ i j ) × 1 2 ϒ ( 1 + e ( P ( Θ i p ) - P ( Θ i j ) ) τ ) = p = 1 , p i ϒ τ ( Θ i p ) × 1 2 ϒ ( 1 + e ( P ( Θ i j ) - P ( Θ i p ) ) τ ) .
Then, the stationary distribution of the induced Markov chain at step k is obtained as
τ ( Θ ( k ) ) = e - P ( Θ ( k ) ) τ Θ ̄ ( k ) ϒ 2 e - P ( Θ ̄ ( k ) ) τ ,
for arbitrary state Θ ( k ) ϒ 2 . Hence, from irreducibility and aperiodicity of the Markovian chain, we have
lim τ 0 lim k + τ ( Θ ( k ) = Θ ) = 1 ,

where Θ ϒ 2 . The result validates that COPMA converges to the optimal state with arbitrarily high probability for two-player (N = 2) case and the analysis can easily be extended for general multi-player (N > 2) cases as well. ■

With the above theorem, the transmit beamformer and power level selections are shown to reach the optimal NE solution with arbitrarily high probability.

One disadvantage of cooperative-based algorithms is that the communication overhead incurred to calculate the total network power increases with the number of iterations. In the next section, we study a noncooperative learning algorithm using local information with less computations.

5 Regret-matching-based joint transmit beamformer and power selection game (RMSG)

In this section, our goal is to obtain a distributed learning algorithm for joint transmit beamformer and power selection scheme in MIMO ad hoc networks that requires only local information for updates. We will use a utility function for noncooperative users. Note that the interaction among N "selfish" node pairs can be defined as noncooperative power minimization game where each node pair is attempting to find their own transmit beamformers to minimize their corresponding transmit powers. In the noncooperative joint iterative beamforming and power adaptation, the N node pairs care only about their own power minimizations exclusively, rather than accounting for the overall network power. Each player's utility function depends on the choice of the transmit beamformer and its own power, as well as on the other users' selections for transmit powers and beamformers via the perceived interference. Note that the noncooperative distributed beamforming algorithms for multi-user MIMO ad hoc networks lack the quality of "strategic complementarities" [36] that is found in power control-only games [26]. It is thus not clear how to design an ordered set of actions for noncooperative beamforming games. Instead, we study a noncooperative learning algorithm called the regret-matching adaptive algorithm from [37], in which the players choose their actions based on their regret for not choosing particular actions in the past. The steady-state solution of the regret-matching-based learning algorithm exhibits "no regret" and the probability of choosing a strategy is proportional to the player's "regret" for not having chosen other strategies.

Let t ̄ m denote the vector of all strategies or actions for user m, i.e., t ̄ m = [ t m 1 , t m 2 , , t m ϒ ] and t m (i) denote the transmit beamformer vector selected by the m th user in iteration i. Define the average regret vector R m t ̄ m ( k ) of user m for an action vector t ̄ m at iteration (or time) k as
R m t ̄ m ( k ) = 1 k - 1 i = 1 k - 1 ( U m ( t ̄ m , t - m ( i ) ) - U m ( t m ( i ) ) ) .
In the regret-matching-based joint transmit beamformer and power selection game (RMSG), each user m computes R m t ̄ m for every action t m Δ m in all past steps when all other player's actions remain unchanged. Each player m updates its regret R m t ̄ m ( k ) for every set of actions t ̄ m based on the following recursion formula:
R m t ̄ m ( k + 1 ) = k - 1 k R m t ̄ m ( k ) (1) + 1 k ( U m ( t ̄ m , t - m ( k ) ) - U m ( t m ( k ) ) ) . (2) (3)

At every step k > 1, each user m updates its own average regret vector R m t ̄ m ( k ) for every strategy in t ̄ m .

In regret matching, after computing the average regret vector, R m t ̄ m ( k ) , each user m chooses an action or strategy t m (k), k > 1, according to probability distribution φ m t ̄ m ( k ) defined as
φ m t ̄ m ( k ) = P r o b ( t m ( k ) = t ̄ m ) = [ R m t ̄ m ( k ) ] + t ̄ m Δ m [ R m t ̄ m ( k ) ] + ,
where [x]+ equals x when x is positive and zero otherwise. Notice that in the regret-matching game, each user m chooses a strategy t m Δ m at any step with a probability proportional to the average regret for not choosing that strategy t m Δ m in the past steps. The detailed summary of RMSG using a Gauss-Seidel updating scheme [15] is given in Table 1 where κ is the predefined number of iterations.
Table 1

Regret-Matching-based joint transmit beamformer and power selection game (RMSG) algorithm

Initialization: For each transmitting and receiving pair m, the initial transmit beamformers are selected with equal probability, the initial transmit powers are p m = pmax and the initial average regret vector is R m t ̄ m ( 1 ) = 0 , m N .


For = 1, 2,..., κ

   For m = 1, 2,..., N

- Update the average regret vector R m t ̄ m ( k ) using the recursion in (21)

- Update the probability distribution φ m t ̄ m ( k ) in (22) and select the transmit beamformer t m (k) based on updated φ m t ̄ m ( k ) .

- Calculate the new transmit power p m based on selected t m (k) using (10).

   Next m

Next k

Every finite strategy game has a mixed strategy Nash equilibrium [30]. Using a good learning algorithm, any finite game can be shown to converge to a mixed strategy Nash equilibrium. Regret-matching-based selection is distributed and requires limited information exchange between the users if the utility function is properly selected. The time-averaged behavior of regret-matching game converges almost surely (with probability one) to the set of coarse-correlated equilibrium [34, 38]. Therefore, the joint transmit beamformer and power selections converges to a mixed strategy equilibrium solution. In fact, in our joint transmit beamformer and power selection game, the average regret of a user using regret matching becomes asymptotically zero, which is confirmed by our simulations.

The utility function of noncooperative or "selfish" users for the transmit beamformer and power selection game at iteration k is
U m t m , t - m ( k ) = l o g ( t m H H m , m H R m - 1 H m , m t m ) .

Note that by using the above utility function, each user selects the transmit beamformer t m Δ m to maximize its own "normalized" SINR, t m H H m , m H R m - 1 H m , m t m . The average regret in the recursion formula (21) can be updated locally as the best transmit beamformer is being selected.

6 Simulation results

In this section, we investigate the performance results of centralized optimization, COPMA, and noncooperative regret-matching (RMSG). We assume that the wireless ad hoc network has N homogeneous pairs where each pair has one transmitter node and one receiving node. Each entry in the channel matrix H m,k m , k N is assumed to be independent identically distributed complex Gaussian distribution with zero mean and unit variance. We consider a radio propagation channel with path-loss exponent ν = 4. This implies that the fading power is attenuated by d m - 4 where d m is the distance between transmitter and receiver for m th node pair. The target SINR γ0 is selected to be 10 dB. We assume that channels do not vary during the iterations. If channel conditions vary during an iteration, this will change the optimization problem and the proposed algorithms' performance degrades. However, depending on the network configuration and the parameters of the algorithm, like the smoothing factor τ for COPMA, the network optimizer can set the convergence steps to be as small as possible while trading against performance degradation in time-varying channels. The Grassmannian codebook of [21] is used for the simulation results. The codebook size is selected to be ϒ = 1 6 with T = 3 antennas for all users. Pmax = 100 mW (20 dBm) and Pmin = 1 mW (0 dBm) in our simulations. We assume six different transmit power levels: 1, 5, 20, 30, 50, and 100 mW motivated by the IEEE 802.11b standard in [39]. Note that the transmit powers are selected from this discrete power level set which corresponds to ceiling function of (10). The selected network topologies are assumed to be feasible for the given power levels [40]. The noise power is σ2 = 3.16 × 10-13 W (-95 dBm) which corresponds to approximate thermal noise power for a bandwidth of 20 MHz.

6.1 Comparison of centralized optimization, COPMA, and RMSG for N= 4 node pairs

We first consider a small wireless ad hoc network with 4 users, i.e., N = 4. All transmitting and receiving nodes are randomly located in a square of 30 m × 30 m area. We choose τ = 0.1/n2 in our simulations, where n denotes the iteration step. The global optimum solution obtained by enumerating all feasible strategies, i.e., 164 profiles, is the performance benchmark. The maximum number of iterations is κ = 120 for COPMA and RMSG. The total power consumed by the network is shown in Figure 3. The global minimum power solution obtained by centralized optimization is the lower bound on the overall power consumed by the network. We observe that COPMA's performance improves over time and settles at the global optimum combination after 92 iterations. Note that 68 and 76% of the gain from using COPMA algorithm is realized within the first 59 and 83 iterations, respectively.
Figure 3

Total transmit power versus iteration with N = 4, T = 3, and ϒ = 1 6 .

RMSG algorithm discussed in Section 5 minimizes the total transmit power in the network defined by (9) using the utility function (23) in a noncooperative manner. Figure 3 also shows how the total power in the network varies over 120 iterations using RMSG. Note that RMSG yields inferior performance compared to COPMA in terms of the achieved overall power. However, the updating procedure is noncooperative and requires less overhead as the iterations continue. The total network power converges to a value of 135 mW on the 68th iteration whereas the centralized algorithm's solution requires 65 mW total network power. Steady state is reached when all the users select a transmit beamformer index with probability one.

Figures 4 and 5 depict the trajectories of transmit beamformer selection indices and power trajectories in COPMA for each user in the network topology. At the initialization step, each user starts with maximum power levels and first index of transmit beamformer selections. Then, each user updates iteratively following COPMA algorithm, until the optimum Nash equilibrium is achieved. Note that when the transmit beamformers Θ and power level vectors P converge in Figures 4 and 5, the corresponding overall transmit power obtained by COPMA is shown in Figure 3. The existence of NE and the convergence toward NE in COPMA are illustrated by the curves in Figures 4 and 5.
Figure 4

Transmit beamformer indexes versus iteration in COPMA with N = 4, T = 3, and ϒ = 1 6 .
Figure 5

Transmit powers versus iteration in COPMA with N = 4, T = 3, and ϒ = 1 6 .

Probability mass function (p.m.f): In this subsection, we take a look at the probability mass function (p.m.f) φ m t ̄ m of the RMSG algorithm calculated in (22).

Figure 6 represents the change in the p.m.f after 1, 12, 50, and 100 iterations for one user. Initially, users choose the strategies, i.e., transmit beamformers, with equal probability. The strategies are represented by the indices 1 to ϒ = 1 6 in the x-axis and the probabilities of selecting these indices are given on the y-axis. After 12 iterations, the probability of choosing transmit beamformer index 9 is higher than that for any other transmit beamformer index, although the other probabilities for indices 3, 4, and 12 are not totally eliminated. After 25 iterations, all other probabilities except those of 4 and 9 are eliminated. A stationary point is reached when user 1 chooses transmit beamformer index 9 in the 100th iteration.
Figure 6

The probability distribution of RMSG for one of the users when N = 4.

6.2 Comparison of COPMA and RMSG for N= 10 node pairs

We now consider a larger wireless ad hoc network with N = 10 node pairs randomly located on a 100 m × 100 m area. The smoothing factor for COPMA is selected as τ = 200/n2 in order to search more efficiently in this large strategy space. All other simulation parameters remain the same. An optimization problem with feasible points exceeding 2 N when N > 30 is very difficult to find [41]. The centralized approach is no longer feasible in this scenario due to the enormous strategy space of 1610 profiles. Figure 7 shows the network topology and transmit beampatterns of COPMA with N = 10 users.
Figure 7

Node configuration and transmit beampatterns of COPMA with N = 10 users.

We again investigate both cooperative and regret-matching learning algorithms represented by COPMA and RMSG curves, where the maximum number of iterations is set to κ = 400 for COPMA and κ = 1500 for RMSG. COPMA's performance was found to be optimal solution for the 4 link network, so it provides a good benchmark to test the performance of RMSG. Figure 8 shows the total network power versus the number of iterations for COPMA and RMSG. This figure shows that RMSG's performance is within 75.56% of the COPMA value at the end of iterations. Furthermore, RMSG needs a larger amount of iterations compared to COPMA for the convergence. However, note that RMSG performs noncooperative updates for transmit beamformer and powers at each iteration and thus the amount of overhead is minimal.
Figure 8

Total transmit power versus iteration with N = 10, T = 3, and ϒ = 1 6 .

For the RMSG algorithm, the total power converges to a total network power of 0.2250 W from the 1,296th iteration. The joint selection of transmit beamformer indices and transmit powers reaches steady state when no user in the network deviates from its chosen strategy. The majority of users reach a steady state within 115 iterations. However, one user takes longer than 1,000 iterations to reach steady state.

Probability mass function (p.m.f): A similar figure for the p.m.f of RMSG for the user that takes longer convergence time than others is shown in Figure 9 for the large network size with N = 10. As can be seen in Figure 9, the probability of choosing index 16 is higher than other indices at iteration 500, but the probability of choosing index 5 and 12 is not totally eliminated even after 1,000 iterations. Since the network size is large, the learning process is slower to converge (around 1,332 iterations) to steady-state transmit beamformer indices, compared to the smaller network with N = 4 node pairs.
Figure 9

The probability distribution for one user when N = 10 in RMSG.

7 Conclusion

In this paper, we have considered both cooperative and noncooperative joint power control and beamforming in MIMO ad hoc networks using a game theoretic approach. Under constant SINR requirements, the joint transmit beamformer and power selection algorithms were studied in the context of total network power minimization. We first considered a cooperative case where all users collaborate with each other in order to minimize the overall power of the network. The game was formulated as an identical interest game, and a decentralized algorithm COPMA with high probability of convergence was proposed and analyzed. To reduce the required overhead incurred by the cooperative algorithm, we have also proposed a noncooperative solution which requires only local information. For our proposed noncooperative algorithm, users update their probabilities of choosing a transmit beamformer and power based on the "regret" of not choosing the other strategies. Numerical results illustrate the convergence properties of the proposed algorithms and their performance in terms of overall power minimization in the network.


This paper is presented in part at the IEEE Global Communications Conference 2010 (Globecom'10), Miami, FL, December 6-10, 2010.


Authors’ Affiliations

Department of Electrical and Computer Engineering, Stevens Institute of Technology
Department of Electrical and Computer Engineering, West Virginia University Institute of Technology


  1. TelatarL IE: Capacity of multi-antenna Gaussian channel. Eur Trans Telecommun 1999, 10(6):589-595.Google Scholar
  2. Paulraj AJ, Gore DA, Nabar RU, Bolcskei H: An overview of MIMO communications--a key to gigabit wireless. Proc IEEE 2004, 92(2):198-218. 10.1109/JPROC.2003.821915View ArticleGoogle Scholar
  3. Gesbert D, Hanly S, Huang H, Shitz SS, Simeone O, Yu W: Multi-cell MIMO cooperative networks: a new look at interference. IEEE J Sel Areas Commun 2010, 28(9):1380-1408.View ArticleGoogle Scholar
  4. Sampath H, Stoica P, Paulraj A: Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion. IEEE Trans Commun 2001, 49(12):2198-2206. 10.1109/26.974266View ArticleGoogle Scholar
  5. Palomar DP, Cioffi JM, Lagunas MA: Joint Tx-Rx beamforming design for multicarrier MIMO channels: a unified framework for convex optimization. IEEE Trans Signal Process 2003, 51(9):2381-2401. 10.1109/TSP.2003.815393View ArticleGoogle Scholar
  6. Palomar DP, Cioffi JM, Lagunas MA: Uniform power allocation in MIMO channels: a game theoretic approach. IEEE Trans Inf Theory 2003, 49(7):1707-1727. 10.1109/TIT.2003.813513MathSciNetView ArticleMATHGoogle Scholar
  7. Palomar DP, Lagunas MA, Cioffi JM: Optimum linear joint transmit-receive processing for MIMO channels with QoS constraints. IEEE Trans Signal Process 2004, 52(5):1179-1197. 10.1109/TSP.2004.826164MathSciNetView ArticleGoogle Scholar
  8. Rashid-Farrokhi F, Liu KJR, Tassiulas L: Transmit beamforming and power control for cellular wireless systems. IEEE J Sel Areas Commun 1998, 16(10):1437-1450.View ArticleGoogle Scholar
  9. Schubert M, Boche H: Solution of the multiuser downlink beamforming problem with individual SINR constraints. IEEE Trans Veh Technol 2004, 53(1):18-28. 10.1109/TVT.2003.819629View ArticleGoogle Scholar
  10. Wong K, Murch RD, Letaief KB: Performance enhancement of a multiuser MIMO wireless communication system. IEEE Trans Commun 2002, 50(12):1960-1970. 10.1109/TCOMM.2002.806503View ArticleGoogle Scholar
  11. Farrokhi F, Tassiulas L, Liu KR: Joint optimal power control and beamforming in wireless networks using antenna arrays. IEEE Trans Commun 1998, 46(10):1313-1324. 10.1109/26.725309View ArticleGoogle Scholar
  12. Ye S, Blum RS: Optimized signaling for MIMO interference systems with feedback. IEEE Trans Signal Process 2003, 51(11):2839-2848. 10.1109/TSP.2003.818339View ArticleGoogle Scholar
  13. Liang C, Dandekar KP: Power management in MIMO ad-hoc networks: A game-theoretic approach. IEEE Trans Wirel Commun 2007, 6(4):1164-1170.View ArticleGoogle Scholar
  14. Arslan G, Demirkol MF, Song Y: Equilibrium efficiency improvement in MIMO interference systems: a decentralized stream control approach. IEEE Trans Wirel Commun 2007, 6(8):2984-2993.View ArticleGoogle Scholar
  15. Scutari G, Palomar DP, Barbarossa S: Competitive design of multiuser MIMO systems based on game theory: a unified view. IEEE J Sel Areas Commun 2008, 26(7):1089-1103.MathSciNetView ArticleGoogle Scholar
  16. Scutari G, Palomar DP, Barbarossa S: The MIMO iterative waterfilling algorithm. IEEE Trans Signal Process 2009, 57(5):1917-1935.MathSciNetView ArticleGoogle Scholar
  17. Love DJ, Heath RW, Gesbert D, Rao BD, Andrews M: An overview of limited feedback in wireless communication systems. IEEE J Sel Areas Commun 2008, 26(8):1341-1365.View ArticleGoogle Scholar
  18. Zeydan E, Tureli DK, Tureli U: Iterative beamforming and power control for MIMO ad-hoc networks, in Proceedings of IEEE GLOBECOM'10. 2010.Google Scholar
  19. Popescu DC, Rawat DB, Popescu O, Saquib M: Game-theoretic approach to joint transmitter adaptation and power control in wireless systems. IEEE Trans Syst Man Cybern Part B: Cybern 2010, 40(3):675-682.View ArticleGoogle Scholar
  20. Lacatus C, Popescu DC: Adaptive interference avoidance for dynamic wireless systems: a game theoretic approach. IEEE J Sel Topics Signal Process 2007, 1(1):189-202.View ArticleGoogle Scholar
  21. Love DJ, Heath RW: Grassmannian beamforming for multiple-input multiple-output wireless systems. IEEE Trans Inf Theory 2003, 49(10):2735-2747. 10.1109/TIT.2003.817466MathSciNetView ArticleMATHGoogle Scholar
  22. Bromberg MC, Agee BG: Optimization of spatially adaptive reciprocal multipoint communication networks. IEEE Trans Commun 2003, 51(8):1254-1257. 10.1109/TCOMM.2003.815068View ArticleGoogle Scholar
  23. Iltis R, Kim S, Hoang D: Noncooperative iterative MMSE beamforming algorithms for ad-hoc networks. IEEE Trans Commun 2006, 54(4):748-759.View ArticleGoogle Scholar
  24. Bromberg MC: Optimizing MIMO multipoint wireless networks assuming Gaussian other-user interference. IEEE Trans Inf Theory 2003, 49(10):2352-2362. 10.1109/TIT.2003.817424MathSciNetView ArticleMATHGoogle Scholar
  25. Lee J, Li YG: Iterative limited feedback beamforming for MIMO ad-hoc networks, Proceedings of IEEE GLOBE-COM'09. 2009.Google Scholar
  26. Saraydar CU, Mandayam NB, Goodman DJ: Efficient power control via pricing in wireless data networks. IEEE Trans Commun 2002, 50(2):291-303. 10.1109/26.983324View ArticleGoogle Scholar
  27. Buzzi S, Poor HV: Joint receiver and transmitter optimization for energy-efficient CDMA communications. IEEE J Sel Areas Commun 2008, 26(3):459-472.View ArticleGoogle Scholar
  28. Scutari G, Palomar DP, Barbarossa S: Optimal linear precoding strategies for wideband non-cooperative systems based on game theory part I: Nash equilibria. IEEE Trans Signal Process 2008, 56(3):1230-1249.MathSciNetView ArticleGoogle Scholar
  29. Scutari G, Palomar DP, Barbarossa S: Optimal linear precoding strategies for wideband non-cooperative systems based on game theory part II: algorithms. IEEE Trans Signal Process 2008, 56(3):1250-1267.MathSciNetView ArticleGoogle Scholar
  30. Fudenberg D, Tirole J: Game Theory. MIT Press, Cambridge, MA; 1991.MATHGoogle Scholar
  31. Rosen JB: Existence and uniqueness of equilibrium points for concave n-person games. Econometrica 1965, 33: 529-534.View ArticleMathSciNetGoogle Scholar
  32. Song Y, Zheng C, Fang Y: Joint channel and power allocation in wireless mesh networks: a game theoretical perspective. IEEE J Sel Areas Commun 2008, 26(7):1149-1159.View ArticleGoogle Scholar
  33. Chang J, Tassiulas L, Rashid-Farrokhi F: Joint transmitter receiver diversity for efficient space division multiaccess. IEEE Trans Wirel Commun 2002, 1(1):16-27. 10.1109/7693.975441View ArticleGoogle Scholar
  34. Marden JR, Arslan G, Shamma JS: Regret based dynamics: convergence in weakly acyclic games. AAMAS '07: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems. ACM, New York, NY, USA; 2007:1-8.View ArticleGoogle Scholar
  35. Young HP: Individual Strategy and Social Structure. Prince-ton University Press, Princeton, NJ; 1998.Google Scholar
  36. Milgrom P, Roberts J: Rationalizability, learning and equilibrium in games with strategic complementarities. Econometrica 1990, 58(6):1255-1277. 10.2307/2938316MathSciNetView ArticleMATHGoogle Scholar
  37. Hart S, Mas-Colell A: A simple adaptive procedure leading to correlated equilibrium. Econometrica 2000, 68(5):1127-1150. 10.1111/1468-0262.00153MathSciNetView ArticleMATHGoogle Scholar
  38. Young HP: Strategic Learning and its Limits. Oxford University Press, Oxford, UK; 2004.View ArticleGoogle Scholar
  39. Gruteser M, Jain A, Deng J, Zhao F, Grunwald D: Exploiting physical layer power control mechanisms in IEEE 802.11b network interfaces. Technical report CU-CS-924-01, Department of Computer Science, University of CO Boulder 2001.Google Scholar
  40. Han Z, Farrokhi FR, Liu KJR: Joint power control and blind beamforming over wireless networks: a cross layer approach. EURASIP J Appl Signal Process 2004, 5: 751-761.View ArticleMATHGoogle Scholar
  41. Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, Cambridge; 2004.View ArticleMATHGoogle Scholar


© Zeydan et al; licensee Springer. 2011

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.