Outage minimization for parallel fading channels with limited feedback
 YuanYuan He^{1} and
 Subhrakanti Dey^{1}
DOI: 10.1186/1687-1499-2012-352
© He and Dey; licensee Springer. 2012
Received: 26 July 2011
Accepted: 4 September 2012
Published: 21 November 2012
Abstract
We address an optimal power allocation problem for minimizing the outage probability for M parallel block Nakagami-fading channels under a long-term average sum transmit power constraint with finite-rate feedback of channel state information (CSI). A simulation-based optimization technique called the simultaneous perturbation stochastic approximation (SPSA) algorithm is employed first to numerically derive a locally optimal power codebook. Due to the high computational complexity and long convergence time of SPSA, we make an ordering assumption on the power codebook entries, derive effective hyperplane-based approximations to the channel quantization regions and present a number of low-complexity suboptimal quantized power codebook design algorithms. Unlike existing work on outage minimization for multiple-input multiple-output (MIMO) channels with limited feedback, we do not assume that identical transmission power is used for all channels within each channel quantization region. We also do not resort, in general, to the Gaussian approximation for the instantaneous mutual information used in many existing works. Based on our power ordering assumption and hyperplane-based approximations, we show that allocating identical power to all channels within a given channel quantization region in the limited feedback scenario is asymptotically optimal only at high average power (or average signal-to-noise ratio (SNR)) for the Rayleigh fading case, whereas for the general Nakagami case, the transmit power allocated to an individual channel within each quantized region is asymptotically proportional to the corresponding Nakagami fading parameter (severity of fading). We also present a novel diversity order result for the outage probability for the Nakagami fading case.
Finally, we derive a suitable Gaussian-approximation-based low-complexity power allocation scheme for a large number of parallel channels, which has important applications in wideband slow-fading orthogonal frequency-division multiplexing (OFDM) systems. Extensive numerical results illustrate that only a few bits of feedback substantially close the gap in outage performance between the limited feedback case and the case of full instantaneous CSI at the transmitter.
Introduction
Determining the information-theoretic capacity of block-fading wireless channels has been an important area of research over the past decade. Various notions of capacity for single-user fading channels include ergodic capacity [1], delay-limited capacity [2] and capacity versus outage probability [3]. For delay-sensitive traffic such as voice and video, the latter two notions are particularly important. In particular, the outage probability is the probability that the capacity of a wireless channel falls below a required rate threshold. In [3], optimal power allocation for outage minimization in the case of parallel fading channels (single user) was obtained under the assumption of full channel state information (CSI) at the transmitter. However, full CSI at the transmitter is hard to obtain in practice due to limited bandwidth in the feedback channel from the receiver to the transmitter, and it is more common to have full CSI at the receiver. This has motivated researchers over the last decade to analyze the performance of wireless systems with various forms of partial CSI at the transmitter (CSIT), such as noisy CSIT, statistical CSIT and quantized CSIT. In particular, the idea of Grassmannian line packing was used to design optimal beamforming codebooks for MIMO systems in [4], whereas in a related study [5], the authors derived a lower bound on the outage performance of a multiple-antenna system using beamforming based on quantized CSIT. More recently, in [6], maximization of the expected rate over a single-input single-output slowly fading channel was investigated using optimized discrete rate and power control with quantized CSIT. A general framework for power allocation in Gaussian vector channels with l_{p} norm constraints on the eigenvalues of the MIMO channel matrix was investigated in [7]. The authors of [6] have also investigated the diversity-multiplexing tradeoff in MIMO channels with quantized CSIT in [8] (see also [9]).
A number of recent articles have investigated outage minimization for fading channels with limited feedback for MIMO or multi-antenna systems. Such studies include [9–13]. In particular, [12] looks at outage minimization with a finite-rate power codebook for MIMO systems. The key finding of that article (see also [8]) is that the optimal power codebook has a circular structure, in that the same transmit power is allocated to the outage region and the best channel region. In order to design the optimal power codebook, however, it assumes that the same transmit power (as a function of the entire channel matrix) is used on all transmit antennas. This allows the authors of [12] to reduce the finite-rate power codebook design problem to an equivalent scalar quantization problem. Even then, finding the cumulative distribution function of the equivalent scalar random variable requires computing multidimensional probability integrals, which is computationally complex. Furthermore, the optimal power codebook entries are found via generic gradient search techniques, which can take an unreasonably long time to converge. Using a similar setting, the same authors have investigated the outage diversity behavior of multiple-antenna systems with quantized CSIT in [13] (see also [10]). In [11], the problem of outage minimization using quantized CSIT is investigated for the fading relay channel, and [14] also studied the outage minimization problem for cooperative amplify-and-forward systems. In [9], a Gaussian approximation is used to capture the probability distribution of the mutual information for a MIMO system in order to study the outage behavior. Finally, many of the above results apply only to Rayleigh fading channels (where the MIMO channel matrix is assumed to have complex circularly symmetric Gaussian distributed entries).
Note, however, that the circular nature of the optimal power codebook and some of the useful approximations developed in [10] for an asymptotically large number of channel feedback bits are also relevant for our study, and we duly acknowledge this fact. Our focus, however, is on designing practical low-complexity but suboptimal algorithms for the quantized power codebook and on deriving theoretical properties of these power allocation schemes in order to justify the various approximations used in designing the suboptimal schemes.
In our article, we look at an M-parallel fading channel system as introduced in [15], where one codeword spans M subchannels in one fading block and the symbols within each block experience the same channel state, and we aim to minimize the outage probability under a sum (across all channels) long-term average power constraint with quantized CSIT. Technologically, parallel fading channels constitute a useful and fundamental communication framework for various applications, for example, multiple-antenna systems after singular value decomposition or an OFDM system with frequency-selective fading [15]. Due to the unavailability of full CSIT in our framework, our model is better suited to the case of multicarrier OFDM systems, with M parallel subchannels located at non-adjacent carrier frequencies. The concept of parallel channels also extends to multiple transmission time slots [16] and to diversity available through cooperative communications, such as multiple relays. Our results in this article are applicable to all these scenarios.
Our main contributions can be summarized as follows:

We first formulate the above-described optimization problem and provide a simulation-based iterative optimization algorithm, the simultaneous perturbation stochastic approximation (SPSA) algorithm, to numerically solve the joint optimization of locally optimal channel partitions and quantized power allocation.

Based on a power ordering assumption and a hyperplane-based approximation to the basic rate-achieving mutual information curve in the vector channel space, we derive a number of low-complexity suboptimal finite-rate power codebook design algorithms for outage minimization with quantized channel information, without assuming identical transmission power per channel or, in general, using a Gaussian approximation for the instantaneous mutual information.

We show that in the high average power (or average SNR) regime, it is asymptotically optimal to allocate transmit power proportional to the Nakagami m fading parameter in the individual channels within each quantized region. In the Rayleigh fading case, this corresponds to allocating the same power across all channels within each quantized region (but only in the high average power regime).

We also derive a novel diversity order result for the outage probability in the Nakagami fading case.

Finally, we investigate the suitability of a Gaussian approximation scheme for the instantaneous mutual information in the case of a large number of independent (but not identically distributed) parallel channels, which is applicable to a slow-fading broadband frequency-selective channel or to a flat fast-fading channel [17, 18]. Note that, as we will show later, although the Gaussian approximation performs poorly for a small number of parallel channels, it performs well for a large number of channels (e.g., M ≥ 16), and thus has important practical applications to such broadband multicarrier systems.
The organization of the article is as follows. Section ‘Channel model and outage minimization’ presents the fading channel model and the typical outage problem based on full CSIT. Section ‘Optimum quantized power control with finite-rate feedback’ presents the outage minimization problem with quantized CSIT, followed by the modified problem formulation using the power ordering and hyperplane-based approximation. Various suboptimal algorithms are then presented for finding the power codebook in the high average power regime, along with their associated theoretical properties. A new result on the diversity order for the outage probability is then presented for the Nakagami fading case using our power allocation algorithm based on the power ordering and hyperplane-based approximation. Section ‘Large number of channels analysis’ presents a Gaussian-approximation-based suboptimal algorithm applicable to the case of a large number of independent parallel channels. Section ‘Numerical results’ presents an extensive set of numerical results illustrating the efficiency of our algorithms, measured by how close their outage performance is to that of the optimal power allocation solution based on full CSIT. Finally, Section ‘Conclusions’ presents some concluding remarks and ideas for future extensions of this study.
Channel model and outage minimization
where h_{i} is the channel power gain and x_{i} is the channel input symbol. The noise sequences w_{1},…,w_{M} are independent and identically distributed (i.i.d.) Gaussian random variables with zero mean and unit variance. It is assumed that the components of the channel power gain vector h = (h_{1},…,h_{M}) are mutually independent, individually i.i.d. across fading blocks and ergodic, and that fading is sufficiently slow so that the input symbols transmitted over the same fading block experience the same channel state. It is also assumed that the fading block length N → ∞ so that information-theoretic results can be applied. The individual fading distributions may not be identical. However, they (and hence the joint channel fading distribution) are assumed to be continuous.
where the rate unit is nats per real dimension. Note that in (2), the capacity is averaged over the parallel channels, as in [15].
where Γ(·) is the gamma function ($\Gamma(s)=\int_{0}^{\infty}t^{s-1}e^{-t}\,dt$) and the constant m_{i} ≥ 0.5 is called the fading parameter. Larger values of the fading parameter m_{i} imply less severe fading environments. When m_{i} = 1, the above distribution reduces to an exponential distribution (corresponding to Rayleigh fading), and the non-fading case corresponds to m_{i} = ∞.
Optimum quantized power control with finiterate feedback
It is well known that perfect CSI at both the transmitter and the receiver is hard to obtain in a practical system, due to bandwidth constraints on the receiver-to-transmitter feedback link as well as considerable communication overhead. In this section, we consider designing a power allocation procedure for M-parallel flat-fading channels based on quantized vector CSI h = (h_{1},…,h_{M}) (in M dimensions) acquired via a no-delay and error-free feedback link with limited rate from the receiver to the transmitter.
Optimal power allocation with limited feedback strategy
We assume that the receiver can perfectly estimate the full CSI. Given B bits of feedback, a power codebook $\mathcal{P}=\{\mathbf{p}_{1},\dots,\mathbf{p}_{L}\}$, where P_{j} = {p_{1j},…,p_{Mj}}, j = 1,…,L, of cardinality L = 2^{B}, is designed offline purely on the basis of the statistics of h. Note that the power levels for different channels here are distinct, as opposed to [9, 12], where the same transmit power was allocated to all transmit antennas in the MIMO setting. This codebook is known a priori by both the transmitter and the receiver. Given a channel realization h,

First, the receiver applies a deterministic mapping, denoted by I, from the current instantaneous channel vector h to one of L integer indices [9]. The mapping I partitions the entire M-dimensional space of h into L regions $\mathcal{R}_{1},\mathcal{R}_{2},\dots,\mathcal{R}_{L}$ and is given by $I(\mathbf{h})=j$ if $\mathbf{h}\in\mathcal{R}_{j}$, $j=1,\dots,L$.

Second, the receiver sends the corresponding index j = I(h) to the transmitter via the feedback link.

Then, the j th entry of the power codebook $\mathcal{P}$, i.e., P_{j}, is employed by the transmitter for transmission.
Therefore the key steps involved in the limited feedback design problem constitute obtaining (offline) the jointly optimal CSI partitions and power codebook design. Our objective is to design efficient algorithms for solving this joint optimization problem of the channel partition regions and the power codebook, so as to minimize the outage probability while satisfying a long term average power constraint.
where $\mathbf{p}_{j}^{\Sigma}=\frac{1}{M}\sum_{i=1}^{M}p_{ij}$, i.e., the average of all the elements in the vector P_{j}. It can be easily verified that the optimal solution of the above problem satisfies the long-term average power constraint with equality.
With a fixed λ, we can employ an iterative simulation-based optimization algorithm called the simultaneous perturbation stochastic approximation (SPSA) algorithm to find the optimal power codebook of problem (9). A step-by-step guide to an implementation of SPSA can be found in [20], which, when applied to our problem, can be summarized in the following steps.
Step 1 Initialization and coefficient selection: Set the counter index k = 0. Pick an initial guess of the power codebook $\widehat{\mathcal{P}}_{0}$ and nonnegative coefficients a, c, A, α and γ in the SPSA gain sequences $a_{k}=\frac{a}{(A+k+1)^{\alpha}}$ and $c_{k}=\frac{c}{(k+1)^{\gamma}}$. For guidelines on choosing these coefficients, see [20].
Step 2 Generation of simultaneous perturbation vector: Generate a p-dimensional (p = ML) random perturbation vector Δ_{k}, whose components are i.i.d. Bernoulli ±1 distributed with probability $\frac{1}{2}$ for each outcome.
Step 3 Loss function evaluations: Obtain two measurements of the loss function $\mathcal{L}(\cdot)$ based on the simultaneous perturbation around the current power codebook $\widehat{\mathcal{P}}_{k}$: $\mathcal{L}(\widehat{\mathcal{P}}_{k}+c_{k}\Delta_{k})$ and $\mathcal{L}(\widehat{\mathcal{P}}_{k}-c_{k}\Delta_{k})$, with c_{k} and Δ_{k} from Steps 1 and 2.
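Step 4 Gradient approximation: Form the simultaneous perturbation approximation $\widehat{\mathbf{g}}_{k}(\widehat{\mathcal{P}}_{k})$ to the unknown gradient, whose i th component is $\frac{\mathcal{L}(\widehat{\mathcal{P}}_{k}+c_{k}\Delta_{k})-\mathcal{L}(\widehat{\mathcal{P}}_{k}-c_{k}\Delta_{k})}{2c_{k}\Delta_{ki}}$, where Δ_{ki} is the i th component of the Δ_{k} vector.
Step 5 Updating the power codebook estimate: Use the standard stochastic approximation form $\widehat{\mathcal{P}}_{k+1}=\widehat{\mathcal{P}}_{k}-a_{k}\widehat{\mathbf{g}}_{k}(\widehat{\mathcal{P}}_{k})$ to update $\widehat{\mathcal{P}}_{k}$ to a new value $\widehat{\mathcal{P}}_{k+1}$.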
Step 6 Iteration or termination: Return to Step 2 with k + 1 replacing k. Terminate the algorithm if there is little change in several successive iterations or the maximum allowable number of iterations has been reached.
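The SPSA steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spsa_minimize`, the default gain coefficients (the exponents 0.602 and 0.101 are values commonly suggested in the SPSA literature), the fixed iteration budget used in place of the Step 6 stopping test, and the toy quadratic loss are all choices made here. In the article's setting, the loss would be evaluated by simulation for a fixed λ and the vector `x` would hold the ML codebook entries.

```python
import random


def spsa_minimize(loss, x0, a=0.1, c=0.1, A=10.0, alpha=0.602, gamma=0.101,
                  max_iter=2000, seed=0):
    """Minimize loss(x) with SPSA, using two loss evaluations per iteration."""
    rng = random.Random(seed)  # seeded for reproducibility of the sketch
    x = list(x0)
    for k in range(max_iter):
        # Step 1: gain sequences a_k and c_k
        a_k = a / (A + k + 1) ** alpha
        c_k = c / (k + 1) ** gamma
        # Step 2: i.i.d. Bernoulli +/-1 perturbation vector
        delta = [rng.choice((-1.0, 1.0)) for _ in x]
        # Step 3: two loss measurements around the current estimate
        x_plus = [xi + c_k * di for xi, di in zip(x, delta)]
        x_minus = [xi - c_k * di for xi, di in zip(x, delta)]
        diff = loss(x_plus) - loss(x_minus)
        # Step 4: simultaneous perturbation gradient approximation
        grad = [diff / (2.0 * c_k * di) for di in delta]
        # Step 5: stochastic approximation update
        x = [xi - a_k * gi for xi, gi in zip(x, grad)]
    # Step 6 is replaced by the fixed iteration budget max_iter
    return x


# Toy usage: recover the minimizer of a quadratic loss (hypothetical target).
target = [1.0, -2.0]
estimate = spsa_minimize(
    lambda v: sum((vi - ti) ** 2 for vi, ti in zip(v, target)), [0.0, 0.0])
```

Only two loss evaluations are needed per iteration regardless of the dimension p = ML, which is the main attraction of SPSA when each evaluation is a Monte Carlo estimate.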
Therefore, with a given power codebook and the resulting quantization regions, we can numerically calculate the loss function. We repeatedly apply Step 2 to Step 5 of SPSA until the resulting outage probability converges within a prespecified accuracy (Step 6 of SPSA). After that, we solve the dual problem of finding the optimal λ by using a subgradient-based search method, i.e., updating λ until convergence using $\lambda^{l+1}=\left[\lambda^{l}-\alpha^{l}\left(P_{\text{av}}-\sum_{j=1}^{L}E\left[\mathbf{p}_{j}^{\Sigma}\mid\mathcal{R}_{j}\right]\text{Pr}(\mathcal{R}_{j})\right)\right]^{+}$, where l is the iteration number and α^{l} is a positive scalar step size for the l th iteration satisfying $\sum_{l}\alpha^{l}=\infty$ and $\sum_{l}(\alpha^{l})^{2}<\infty$. Since problem (7) is not convex in general, the solution we obtain here is only locally optimal.
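The multiplier update can be illustrated with a short sketch. The function name and arguments are hypothetical, and the step size 1/l is only one choice that satisfies the two summability conditions stated for α^{l}; `avg_power_used` stands for the long-term average power consumed by the current codebook, which would be estimated by simulation.

```python
def dual_subgradient_step(lam, l, p_av, avg_power_used):
    """One projected subgradient update of the multiplier (l = 1, 2, ...)."""
    step = 1.0 / l  # sum of steps diverges, sum of squared steps converges
    return max(0.0, lam - step * (p_av - avg_power_used))
```

When the codebook uses more power than P_av, the multiplier grows and penalizes power more heavily in the next SPSA run; when it uses less, the multiplier shrinks and is clipped at zero.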
Power ordering assumption and hyperplane approximation (POHPA)
Let P(h) represent the optimal power allocation strategy, which maps the channel realization h to a power level in $\mathcal{P}$. Without loss of generality, we assume that the power levels are such that $\mathbf{p}_{1}^{\Sigma}>\cdots>\mathbf{p}_{L}^{\Sigma}$, corresponding to the partition $\mathcal{R}_{1},\mathcal{R}_{2},\dots,\mathcal{R}_{L}$. We then have the following result, which generalizes the circular nature of the quantized channel regions presented in [9, 12] for a scalar power allocation scenario to the parallel channels case with a vector power allocation.
Lemma 1
Proof
The proof is similar to that in [9]. However, since it generalizes the result for a scalar power allocation in [9] to the vector power allocation case of this article, we provide a sketch of the proof (see Appendix 1). □
If the same transmit power is allocated to all transmit channels, i.e., $p_{1j}=\cdots=p_{Mj}=\mathbf{p}_{j}^{\Sigma}$, Lemma 1 reduces to the result of [9, 12]. From Lemma 1, there is no outage in the first L−1 regions, and outage occurs only in the last region $\mathcal{R}_{L}$. Under the optimal partition, a channel realization h = {h_{1},…,h_{M}} either belongs to the region $\mathcal{R}_{j}$, where j ∈ {1,…,L} is the maximum index that can guarantee zero outage for it, or belongs to $\mathcal{R}_{L}$. The region $\mathcal{R}_{L}$ consists of two parts, $\{\mathbf{h}:(\mathbf{p}^{\ast}(\mathbf{h}))^{\Sigma}>\mathbf{p}_{1}^{\Sigma}\}$ (outage) and $\{\mathbf{h}:(\mathbf{p}^{\ast}(\mathbf{h}))^{\Sigma}\le\mathbf{p}_{L}^{\Sigma}\}$, denoted by $\mathcal{R}_{L,1}$ and $\mathcal{R}_{L,2}$, respectively.
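The circular partition described above suggests a simple way for the receiver to compute the feedback index. The sketch below is only illustrative and rests on assumptions: the rate expression (the average over channels of ½ log(1 + h_{i} p_{ij}), in nats per real dimension) is assumed here because equation (2) is not reproduced in this section, and the names `rate_nats`, `feedback_index` and the target rate `r0` are hypothetical.

```python
import math


def rate_nats(h, p):
    """Assumed averaged mutual information, in nats per real dimension."""
    return sum(0.5 * math.log(1.0 + hi * pi) for hi, pi in zip(h, p)) / len(h)


def feedback_index(h, codebook, r0):
    """Return the 1-based index j = I(h) for a codebook ordered so that the
    average power decreases with j (entry 1 uses the most power)."""
    best = None
    for j, p in enumerate(codebook, start=1):
        if rate_nats(h, p) >= r0:
            best = j  # keep the largest index that still avoids outage
    # If no entry supports the rate, h falls in the outage part of region L.
    return best if best is not None else len(codebook)


# Toy codebook with L = 3 entries for M = 2 channels (hypothetical numbers).
codebook = [[4.0, 4.0], [2.0, 2.0], [1.0, 1.0]]
```

With this rule, good channels are served with the lowest power that still supports the rate, and the last index is shared by the best channels and the channels in outage, which is the circular structure stated above.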
Problem (14) is in general a nonlinear nonconvex optimization problem. Since g(h_{1},…,h_{M−1} | P_{j}), j = 1,…,L, is highly nonlinear, it is hard to obtain a closed-form expression for F(P_{j}). Although one can use numerical integration to calculate F(P_{j}) and randomized search techniques to find the optimum solution of problem (14), the associated computational complexity increases exponentially with the number of feedback bits and channels. Next, we employ another approach by deriving an approximation for g(h_{1},…,h_{M−1} | P_{j}) such that an analytical (approximate) closed-form expression for F(P_{j}) can be easily obtained (unlike [9], where a Gaussian distribution was used to approximate the distribution of the mutual information in order to obtain an analytical expression for F(P_{j})), thus significantly reducing the computational complexity of solving problem (14). Then, based on the power allocation obtained using this approximation, one can use Monte Carlo simulations to evaluate the “real outage” (the corresponding outage probability performance given by F(P_{1})). More details on this can be found in Section ‘Numerical results’.
 1) Multiple infinite series representation: This analytical expression was derived in [22], $F^{\prime}(\mathbf{p}_{j})=\frac{1}{\Gamma\left(1+\sum_{i=1}^{M}m_{i}\right)}\left[\prod_{i=1}^{M}\left(\frac{m_{i}\lambda_{i}K}{p_{ij}}\right)^{m_{i}}\right]\times\Phi_{2}^{(M)}\left(m_{1},m_{2},\dots,m_{M};1+\sum_{i=1}^{M}m_{i};-\frac{m_{1}\lambda_{1}K}{p_{1j}},-\frac{m_{2}\lambda_{2}K}{p_{2j}},\dots,-\frac{m_{M}\lambda_{M}K}{p_{Mj}}\right),$ (19)
 2) Single infinite series representation: The second result provides a simpler expression for (18) involving only a single infinite sum [23], which was proposed by Moschopoulos [24]. $F^{\prime}(\mathbf{p}_{j})=\prod_{i=1}^{M}\left(\frac{\beta_{1}}{\beta_{ij}}\right)^{m_{i}}\sum_{n=0}^{\infty}\frac{\delta_{n}\,\gamma\left(\rho+n,\frac{K}{\beta_{1}}\right)}{\Gamma(\rho+n)},$ (21)
Special cases

If $\rho=\sum_{i=1}^{M}m_{i}$ is an integer, (21) can be further simplified as [23] $F^{\prime}(\mathbf{p}_{j})=\prod_{i=1}^{M}\left(\frac{\beta_{1}}{\beta_{ij}}\right)^{m_{i}}\sum_{n=0}^{\infty}\delta_{n}\times\left\{1-e^{-\frac{K}{\beta_{1}}}\sum_{l=0}^{\rho+n-1}\frac{\left(\frac{K}{\beta_{1}}\right)^{l}}{l!}\right\}.$ (23)

If M = 2, let β_{2} = max(β_{ij}) and let m_{θ} be the fading parameter corresponding to β_{2}. We then have $F^{\prime}(\mathbf{p}_{j})=\left(\frac{\beta_{1}}{\beta_{2}}\right)^{m_{\theta}}\sum_{n=0}^{\infty}\frac{(m_{\theta})_{n}\left(1-\frac{\beta_{1}}{\beta_{2}}\right)^{n}}{n!}\times\frac{\gamma\left(\rho+n,\frac{K}{\beta_{1}}\right)}{\Gamma(\rho+n)},$ (24)
where $(m_{\theta})_{n}=m_{\theta}(m_{\theta}+1)\cdots(m_{\theta}+n-1)$ represents the Pochhammer symbol.
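The M = 2 series in (24) is easy to evaluate numerically, as the following sketch suggests. It makes several assumptions: β_{1} and β_{2} are treated as the scale parameters of two independent gamma variates, as in Moschopoulos' representation (the definition of β_{ij} is not shown in this section), the series is truncated after `n_terms` terms, and the function names are hypothetical. The regularized incomplete gamma ratio γ(a, x)/Γ(a) is computed with its standard power series so that only the Python standard library is needed.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15, max_terms=10000):
    """gamma(a, x) / Gamma(a) via the series e^{-x} x^a sum_k x^k / Gamma(a+k+1)."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    k = 0
    while term > tol * total and k < max_terms:
        k += 1
        term *= x / (a + k)
        total += term
    return total


def cdf_sum_two_gammas(K, beta_min, beta_max, m_min, m_max, n_terms=200):
    """Truncated version of (24); m_max is the shape attached to beta_max,
    i.e. it plays the role of m_theta."""
    rho = m_min + m_max
    q = 1.0 - beta_min / beta_max
    coeff = 1.0  # (m_theta)_n q^n / n!, starting at n = 0
    total = 0.0
    for n in range(n_terms):
        total += coeff * reg_lower_gamma(rho + n, K / beta_min)
        coeff *= (m_max + n) * q / (n + 1)
    return (beta_min / beta_max) ** m_max * total


# Sanity check against the closed form for the sum of two exponentials
# (both shapes equal to 1), with scales 1 and 2 evaluated at K = 1.
approx = cdf_sum_two_gammas(1.0, 1.0, 2.0, 1.0, 1.0)
exact = 1.0 - (2.0 * math.exp(-0.5) - math.exp(-1.0)) / (2.0 - 1.0)
```

The series converges geometrically at rate 1 − β_{1}/β_{2}, so more terms are needed when the two scale parameters differ widely.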
It is not hard to verify that Problem (25) is still nonconvex. However, we can employ the Karush-Kuhn-Tucker (KKT) necessary conditions to obtain locally optimal solutions.
Remark 1
Note that the KKT necessary conditions usually require regularity of a local optimum, which amounts (in the context of Problem (25)) to linear independence of the gradients of the active inequality constraints evaluated at the local optimum (see Proposition 3.3.1, p. 310 in [25]). In Problem (25), if a local optimum of the power vector satisfies P_{1} > ⋯ > P_{L−1} > P_{L} > 0, then the only active inequality constraint is the average power constraint, in which case the linear independence property is trivially satisfied. In the case where P_{L} = 0 at the local optimum, it can be easily shown by simple linear algebra that the gradients corresponding to the two active inequality constraints (P_{L} = 0 and the average power constraint) satisfy the linear independence condition.
Since regularity of a local optimum is thus established, one can now use KKT necessary conditions to obtain the following important result that can be used to design locally optimal quantized power codebooks:
Theorem 1
where $\frac{\partial F^{\prime}(P_{1})}{\partial p_{M1}^{\ast}}=-\frac{\mu\left(F^{\prime}(P_{2})-F^{\prime}(P_{1})\right)}{1-\mu\sum_{i=1}^{M}(p_{i1}-p_{iL})}$, $\frac{\partial F^{\prime}(P_{j})}{\partial p_{Mj}^{\ast}}=-\frac{F^{\prime}(P_{j+1})-F^{\prime}(P_{j})}{\sum_{i=1}^{M}(p_{i,j-1}-p_{ij})}$, j = 2,…,L−1, and $\frac{\partial F^{\prime}(P_{L})}{\partial p_{ML}^{\ast}}=-\frac{1-F^{\prime}(P_{L})+F^{\prime}(P_{1})}{\sum_{i=1}^{M}(p_{i,L-1}-p_{iL})}$.
Proof
See Appendix 2. □
A solution to (27) provides a locally optimal power allocation policy $\{\mathbf{p}_{j}^{\ast}\}_{j=1}^{L}$. For small values of L and M, the above system of nonlinear equations can be solved by various optimization software packages. However, the complexity of solving this set of nonlinear equations is still too high for moderately large numbers of feedback bits and channels. Therefore, we consider several low-complexity suboptimal schemes suited to the special cases of high or low P_{av}, as described below.
High average power approximation (HP_{av}A)
In the high average power or average SNR regime, the following result allows us to simplify the computation of the quantized power codebook. It also illustrates that using our hyperplane based approximations, it is not optimal to allocate identical power to individual channels within each quantized region in general.
Theorem 2
Proof
See Appendix 3. □
Equation (29) implies that, in the high P_{av} regime, for each quantization region, the power allocated to each channel asymptotically depends only on the severity of fading (represented by the parameter m).
Special cases:

Identical fading parameters: If m_{1} = ⋯ = m_{M}, from (29), we have $p_{1j}^{\ast}\approx\cdots\approx p_{Mj}^{\ast},\quad j=1,\dots,L$ (30)
which means that, in the high P_{av} regime, with identical fading parameters for all channels, the power assigned to each channel within each quantization region is asymptotically equal. We call this solution equal power per channel (EPPC).

Rayleigh fading (m_{1} = ⋯ = m_{M} = 1): From Theorem 2, (28) reduces to $F^{\prime}(\mathbf{p}_{j})\approx\frac{1}{M!}\prod_{i=1}^{M}\left(\frac{\lambda_{i}K}{p_{ij}}\right)$ (31)
and (29) reduces to EPPC.
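A small numerical sketch can make the proportional allocation concrete. It assumes that the high average power form of the outage expression is the leading term of (19), namely the product of (m_{i} λ_{i} K / p_{ij})^{m_{i}} divided by Γ(1 + Σ m_{i}), which coincides with (31) when all m_{i} = 1; the exact statement of (28) is not reproduced in this section, so this is an assumption. The function names and the numbers used are hypothetical.

```python
import math


def high_power_outage(p, m, lam, K):
    """Assumed high-power approximation: leading term of the series in (19)."""
    prod = 1.0
    for p_i, m_i, lam_i in zip(p, m, lam):
        prod *= (m_i * lam_i * K / p_i) ** m_i
    return prod / math.gamma(1.0 + sum(m))


def proportional_split(total_power, m):
    """Split a region's total power across channels in proportion to m_i."""
    rho = sum(m)
    return [total_power * m_i / rho for m_i in m]


# Two channels with fading parameters 0.5 and 2, unit-mean gains, K = 1.
m, lam = [0.5, 2.0], [1.0, 1.0]
prop = proportional_split(100.0, m)  # 20 and 80 units of power
outage_prop = high_power_outage(prop, m, lam, 1.0)
outage_equal = high_power_outage([50.0, 50.0], m, lam, 1.0)
```

For a fixed total power in a region, the product form is minimized by powers proportional to m_{i}, so in this toy case the proportional split gives a smaller approximate outage than the equal split; with identical m_{i} the two splits coincide, which is the EPPC solution.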
where $P_{\text{av}}^{\prime}=P_{\text{av}}\frac{Mm_{M}}{\sum_{i=1}^{M}m_{i}}$.
which can be carried out by an iterative algorithm employing the standard bisection search method. We call this algorithm ‘PFPPC+EPPR’.
Thus, in the high P_{av} regime (P_{av} → ∞), $r_{1}=\frac{K}{p_{1}}\to 0$, and we have the following result, which indicates that the total power allocated to the outage region is asymptotically (as P_{av} → ∞) negligible, thus allowing us to further simplify the quantized power codebook design method:
Lemma 2
In the high P_{av} regime, $\lim_{r_{1}\to 0}P_{\text{tot}}^{L,1}=0$ if $\sum_{j=1}^{L}\rho^{j}>1$, where $\rho=\sum_{i=1}^{M}m_{i}$.
Proof
See Appendix 4. □
Remark 2
Note that if ρ ≥ 1, the condition $\sum_{j=1}^{L}\rho^{j}>1$ is clearly satisfied for any L ≥ 2. For 0.5 < ρ < 1 (which is the case of no diversity with M = 1, i.e., the single channel case), one can show that there exists a finite L for which the condition $\sum_{j=1}^{L}\rho^{j}>1$ is satisfied. This is easily seen by noting that, for ρ < 1, the condition $\sum_{j=1}^{L}\rho^{j}>1$ is equivalent to ρ^{L+1} < 2ρ−1. It is interesting to note, however, that when ρ = 0.5 (which is the case of a single Nakagami channel with m = 0.5, the worst possible fading parameter), there is no finite value of L that can achieve $\sum_{j=1}^{L}\rho^{j}>1$. Thus, in the high P_{av} regime, it is near optimal to allocate zero power to the outage region as long as ρ ≥ 1 with any L ≥ 2, or for a single channel with 0.5 < m < 1 and a sufficiently large L. For a single channel with m = 0.5, it seems that even in the high P_{av} regime, one needs to allocate nonzero power to the outage region.
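The condition in Lemma 2 can be checked directly for a given ρ. The helper below is a hypothetical illustration that searches for the smallest number of quantization levels L whose partial sum exceeds 1 and returns `None` if none is found within the search limit.

```python
def min_levels_for_lemma2(rho, max_levels=10000):
    """Smallest L with sum_{j=1}^{L} rho**j > 1, or None if not found."""
    total = 0.0
    for L in range(1, max_levels + 1):
        total += rho ** L
        if total > 1.0:
            return L
    return None
```

For ρ = 0.6 the search stops at L = 3, which agrees with the equivalent test ρ^{L+1} < 2ρ−1 (0.6^4 ≈ 0.13 < 0.2, whereas 0.6^3 ≈ 0.22 is not). For ρ = 0.5 the partial sums approach 1 from below and never exceed it, so the function returns `None`.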
Therefore, the performance of the ZPiOR approximation (PFPPC+ZPiOR) becomes asymptotically (as L → ∞) close to that of the PFPPC scheme, except for the single channel case with m = 0.5, where one can use the EPPR approximation instead to reduce complexity.
which can be easily solved by using a standard bisection method. In fact, numerical studies illustrate (as we will see later) that the ZPiOR approximation has near-optimum performance (for Problem (25)) for a large number of quantization regions. Thus, the ZPiOR approximation achieves a better complexity-performance tradeoff than PFPPC+EPPR.
Remark 3
then when the average power is small (P_{av} → 0), p_{iL} → 0, i = 1,…,M, as well, and the corresponding quantization threshold r_{iL} → ∞. In this case, the region $\mathcal{R}_{L}$ only includes $\mathcal{R}_{L,1}$ (the outage region) and the corresponding power level is P_{L} = 0, thus making the ZPiOR approximation applicable. A similar observation was also made in [9].
Asymptotic behavior of outage probability
Then we have the following important approximation for d, which generalizes the existing diversity order results for outage probability (e.g., [13]), which are valid for Rayleigh fading channels only.
Theorem 3
Proof
See Appendix 5. □
Special case: Note that for the Rayleigh fading case, where m_{i} = 1, ∀i = 1,2,…,M, (40) becomes $d\approx\sum_{j=1}^{L}M^{j}$, which is consistent with similar results in [9, 10].
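To get a feel for how quickly this Rayleigh-case diversity order grows with feedback, the expression can be evaluated for a few settings. The function name below is hypothetical; it simply uses L = 2^{B} for B feedback bits, as defined for the power codebook.

```python
def rayleigh_diversity_order(num_channels, feedback_bits):
    """Approximate diversity order sum_{j=1}^{L} M**j with L = 2**B levels."""
    levels = 2 ** feedback_bits
    return sum(num_channels ** j for j in range(1, levels + 1))
```

For example, M = 2 channels with B = 2 bits (L = 4) gives 2 + 4 + 8 + 16 = 30, while B = 0 (a single power level, L = 1) gives d ≈ M.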
Remark 4
It is possible that the result in Theorem 3 may hold with equality, rather than being an approximation for the diversity order. However, due to the various levels of approximations involved in deriving this, we are unable to prove an exact equality at this stage. This will involve computing orders of approximation errors and showing that the error goes to zero as P_{av} goes to infinity. We leave this for future study.
Large number of channels analysis
The previous algorithms can be effectively applied to find locally optimal solutions, or suitable approximations to them, for a moderate number of parallel channels, such as M < 10. Once M ≥ 10, these algorithms become computationally demanding. Given that practical wideband slow-fading multicarrier systems such as OFDM (with a large number of subcarriers) can be modeled as an asymptotically large number of i.i.d. parallel channels [18], one needs to find outage-minimizing power allocation algorithms with limited feedback for large M. Below we provide such an algorithm using a Gaussian approximation for large M in the high P_{av} regime.
Remark 5
According to [18], the mutual independence of a large number of parallel channels can be justified by the assumptions that the number of independent propagation paths in wideband slow-fading channel models increases linearly with the bandwidth and that the carrier frequencies of the parallel channels are sufficiently separated, so that the effect of the multipath spread is essentially eliminated. Even if adjacent subcarriers are correlated, with the subcarrier grouping technique [26], we can still have a large number of parallel independent subchannels (each comprising a number of subcarriers), such as the 32 subchannels stated in [26].
where f_{ i } = h_{ i }λ_{ i }, $\frac{1}{\lambda_{i}}$ is the mean of the channel gain h_{ i }, and under the Nakagami fading model, the pdf of f_{ i } is $\frac{(m_{i})^{m_{i}}}{\Gamma(m_{i})} f_{i}^{m_{i}-1} e^{-m_{i} f_{i}}, \forall i$.
where $s_{j} = 2r_{0} - \frac{1}{M}\sum_{i=1}^{M}\log\left(\frac{p_{ij}}{\lambda_{i}}\right) = c' - \frac{1}{M}\sum_{i=1}^{M}\log\left(p_{ij}\right)$, $c' = 2r_{0} + \frac{1}{M}\sum_{i=1}^{M}\log\left(\lambda_{i}\right)$, and the function V(.) denotes the cdf of $\frac{1}{M}\sum_{i=1}^{M}\log\left(f_{i}\right)$. It is easy to show that the pdf of z_{ i } = log(f_{ i }) is $f_{z_{i}} = \frac{(m_{i})^{m_{i}}}{\Gamma(m_{i})} e^{-m_{i} e^{z_{i}}} e^{m_{i} z_{i}}$. Denote its mean and variance by E[z_{ i }] and Var[z_{ i }], respectively. For the Rayleigh fading case, the pdf of $z_{i} = \log\left(f_{i}\right)$ is $e^{-e^{z_{i}}} e^{z_{i}}$, which is the well-known Gumbel distribution with mean E[z_{ i }] = −r, where r is the Euler-Mascheroni constant (r = 0.5772156649…), and variance $\text{Var}\left[z_{i}\right] = \frac{\pi^{2}}{6}$.
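The Rayleigh-case moments quoted above can be checked numerically. The sketch below is an illustration only and is not part of the original derivation: it draws samples of f_{ i } from a gamma distribution with shape m and scale 1/m (unit mean), takes logarithms, and compares the sample mean and variance with −r and π²/6 for m = 1. The function name, sample size and seed are arbitrary choices. For general m, the standard log-gamma moments give E[z_{ i }] = ψ(m_{ i }) − log(m_{ i }) and Var[z_{ i }] = ψ′(m_{ i }), where ψ is the digamma function; these are not evaluated here because the Python standard library has no digamma function.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (r in the text)


def log_gain_moments(m: float, n_samples: int = 200_000, seed: int = 0):
    """Monte Carlo mean and variance of z = log(f), f ~ Gamma(shape=m, scale=1/m)."""
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) takes shape alpha and scale beta.
    z = [math.log(rng.gammavariate(m, 1.0 / m)) for _ in range(n_samples)]
    mean = sum(z) / n_samples
    var = sum((x - mean) ** 2 for x in z) / (n_samples - 1)
    return mean, var


# Rayleigh case (m = 1): compare the estimates with -r and pi^2 / 6.
mean, var = log_gain_moments(m=1.0)
print("sample mean:", mean, "reference:", -EULER_GAMMA)
print("sample variance:", var, "reference:", math.pi ** 2 / 6)
```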
Note that for large M, if m_{1} = ⋯ = m_{ M }, or in the special case of Rayleigh fading (m_{ i } = 1, ∀i), the z_{ i } are i.i.d. with finite mean and variance, and the Central Limit Theorem (CLT) applies directly, so one can use a Gaussian approximation for the pdf of $\frac{1}{M}\sum_{i=1}^{M} z_{i}$. However, in the general case where the fading parameters m_{ i } differ across channels, the z_{ i }, i = 1,2,…,M, are independent but not necessarily identically distributed. In this case, it is important to prove that {z_{ i } − E[z_{ i }]} satisfies the Lindeberg condition (for a statement of this condition, see p. 262 of [27]), so that a generalized CLT can be applied and a Gaussian approximation can be used for the instantaneous mutual information over parallel fading channels. Indeed, we can analytically prove the following lemma:
Lemma 3
The sequence {z_{ i }− E[z_{ i }]} satisfies the Lindeberg condition.
Proof
The proof of this can be found in Appendix 6. □
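To make the Gaussian approximation concrete, the following sketch treats the i.i.d. Rayleigh case only. It approximates V(s), the cdf of the average of the z_{ i }, by a normal cdf with mean −r and variance π²/(6M), and compares it with a Monte Carlo estimate. All function names, the trial count and the seed are illustrative assumptions; this is not the authors' GA codebook design algorithm, only the distributional approximation that underlies it.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (r in the text)


def gaussian_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def v_gaussian(s: float, M: int) -> float:
    """CLT approximation of V(s) for i.i.d. Rayleigh fading (m_i = 1)."""
    mu = -EULER_GAMMA                           # E[z_i]
    sigma = math.sqrt(math.pi ** 2 / (6.0 * M))  # std of the average of M terms
    return gaussian_cdf((s - mu) / sigma)


def v_monte_carlo(s: float, M: int, n_trials: int = 20_000, seed: int = 1) -> float:
    """Empirical cdf of (1/M) * sum(log f_i) with f_i ~ Exp(1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        avg = sum(math.log(rng.expovariate(1.0)) for _ in range(M)) / M
        if avg <= s:
            hits += 1
    return hits / n_trials


M = 16
for s in (-1.0, -EULER_GAMMA, 0.0):
    print(s, v_gaussian(s, M), v_monte_carlo(s, M))
```

Because the distribution of log(f_{ i }) is skewed, the two values need not coincide for finite M, and the agreement may be weaker in the tails than near the mean.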
Proposed power allocation strategies
| Number of channels | M < 10 | M ≥ 10 |
|---|---|---|
| Optimal approach | SPSA | |
| Approximation | POHPA | |
| High P_{av} | PFPPC (L ≥ 16, PFPPC+ZPiOR or PFPPC+EPPR) | GA (L ≥ 16, GA+ZPiOR or GA+EPPR) |
Numerical results
To numerically illustrate the performance of the designed power allocation strategies, we consider M parallel (independent) Nakagami block-fading channels, which characterize a multicarrier OFDM system with M parallel subchannels located at nonadjacent carrier frequencies. The mean value of the gamma-distributed fading gain for each channel is assumed to be inversely proportional to the square of the wireless propagation distance d, and the required transmission rate is taken to be r_{0} = 0.25 nats per channel use. Outage performance with full CSI at the transmitter is obtained with the optimal power allocation results presented in [3]. It should be noted that the results illustrate the “real outage” performance of the proposed algorithms, where the power codebook designed via the algorithms is used to obtain the average outage probability over a large number of Monte Carlo simulated channel realizations. As a result, the average power required for a given real outage may not be strictly the same as the original average power for which the power codebook was designed. However, for a given algorithm, the graphs can and should be used to determine the minimum outage probability obtainable for a given average power, and vice versa.
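The Monte Carlo evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code behind the reported experiments: it uses a fixed power vector with no feedback instead of a designed codebook, assumes unit noise variance, and assumes that an outage occurs when (1/M) Σ ½ log(1 + h_{ i }p_{ i }) falls below r_{0}, with h_{ i } gamma-distributed with shape m_{ i } and mean 1/λ_{ i }. The function name, trial count and seed are hypothetical.

```python
import math
import random


def outage_probability(powers, m, lam, r0=0.25, n_trials=50_000, seed=2):
    """Estimate the outage probability for a fixed power vector.

    powers: transmit power per channel (assumed constant, i.e. no feedback).
    m:      Nakagami fading parameters m_i.
    lam:    lambda_i, so that the mean channel gain is 1 / lambda_i.
    r0:     required rate in nats per channel use.
    """
    rng = random.Random(seed)
    M = len(powers)
    outages = 0
    for _ in range(n_trials):
        info = 0.0
        for p, m_i, lam_i in zip(powers, m, lam):
            # Gamma-distributed power gain with shape m_i and mean 1 / lam_i.
            h = rng.gammavariate(m_i, 1.0 / (m_i * lam_i))
            info += 0.5 * math.log(1.0 + h * p)
        if info / M < r0:
            outages += 1
    return outages / n_trials


M = 4
low = outage_probability([1.0] * M, m=[1.0] * M, lam=[1.0] * M)
high = outage_probability([10.0] * M, m=[1.0] * M, lam=[1.0] * M)
print("outage at p = 1:", low)
print("outage at p = 10:", high)
```

Using the same seed for both runs means both power levels are evaluated on identical channel realizations, so the estimate at the higher power cannot exceed the one at the lower power.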
Experiment 1
Experiment 2
Experiment 3
Experiment 4
Experiment 5
Conclusions
In this article, we have derived a simulation-based optimization algorithm using SPSA and presented various low-complexity suboptimal outage minimization algorithms via optimal power allocation with finite-rate or quantized channel feedback for M parallel block-fading channels under a long-term average power constraint. Numerical results illustrate the effectiveness of these algorithms via their outage performance in comparison with the performance of the optimal power allocation with full CSI, and show that only 4 bits of feedback substantially close the gap to the outage performance of the full CSI algorithm for M = 4 or M = 6. For a large number of channels (M = 16), our Gaussian approximation based algorithm performs approximately within 2.8 dB (SNR gap) of the full CSI based algorithm at an outage probability of 10^{−2} with less than 1 bit of (broadcast) feedback per channel when all channels undergo severe Nakagami fading with identical fading parameter m = 0.5. Future work includes the extension of these results to correlated fading channels, the consideration of noisy or erroneous feedback as investigated in [29], and the application of quantized CSIT based power allocation to more general optimization problems such as the service-outage based power and rate allocation in [15].
Endnote
^{a}A flat fading channel can be expressed as a complex-baseband model. However, according to Appendix B.4.2 of [19] (pp. 527–528), one way to derive the capacity of a complex-baseband channel is to think of each use of the complex channel as two uses of a real channel. Thus, we only need to consider a real-baseband model, i.e., (1), and then multiply the maximum mutual information by 1/2, in units of nats per real dimension, as in (2).
Appendix 1
Proof of Lemma 1
Proof
The proof is similar to those in [9, 12]. For all j, 1 ≤ j ≤ L − 1, with P(h) = P_{ j } if $\mathbf{h}\in \mathcal{R}_{j}$, let $\mathcal{R}_{j}^{\ast}$ be the set of all h such that $\mathbf{p}_{j+1}^{\Sigma} < \left(\mathbf{p}^{\ast}(\mathbf{h})\right)^{\Sigma} \le \mathbf{p}_{j}^{\Sigma}$. We need to prove that $\mathcal{R}_{j}^{\ast} = \mathcal{R}_{j}$. Assume the contrary, that $\mathcal{R}_{j}^{\ast}\setminus \mathcal{R}_{j}$ is a nonempty set (∖ denotes the set subtraction operation), i.e., there exists h such that $\mathbf{h}\in \mathcal{R}_{j}^{\ast}$ and $\mathbf{h}\notin \mathcal{R}_{j}$. We can partition the set $\mathcal{R}_{j}^{\ast}\setminus \mathcal{R}_{j}$ into two subsets, $\mathcal{R}_{j}^{-} = \left(\mathcal{R}_{j}^{\ast}\setminus \mathcal{R}_{j}\right)\cap \bigcup_{k=1}^{j-1}\mathcal{R}_{k}$ and $\mathcal{R}_{j}^{+} = \left(\mathcal{R}_{j}^{\ast}\setminus \mathcal{R}_{j}\right)\cap \bigcup_{k=j+1}^{L}\mathcal{R}_{k}$. If the set $\mathcal{R}_{j}^{-}$ has nonzero probability, then we can construct a new scheme by assigning all elements of this set to $\mathcal{R}_{j}$ instead. Since $\forall \mathbf{h}\in \mathcal{R}_{j}^{-}, \left(\mathbf{p}^{\ast}(\mathbf{h})\right)^{\Sigma} \le \mathbf{p}_{j}^{\Sigma}$, such a rearrangement achieves the same outage probability but with less average power due to $\mathbf{p}_{j}^{\Sigma} < \mathbf{p}_{k}^{\Sigma}, 1\le k\le j-1$, which contradicts the optimality of the solution $\mathcal{P}$ and $\mathcal{R}$.
On the other hand, the set $\mathcal{R}_{j}^{+}$ must also be empty. Otherwise, this set is in outage, since $\forall \mathbf{h}\in \mathcal{R}_{j}^{+}, \left(\mathbf{p}^{\ast}(\mathbf{h})\right)^{\Sigma} > \mathbf{p}_{j+1}^{\Sigma}$ while the sum power assigned in any region $\mathcal{R}_{k}$, k ≥ j + 1, is at most $\mathbf{p}_{j+1}^{\Sigma}$. This yields a larger overall outage probability, which is also a contradiction. Therefore, we have $\mathcal{R}_{j}^{\ast}\subseteq \mathcal{R}_{j}$. One can similarly prove $\mathcal{R}_{L}^{\ast}\subseteq \mathcal{R}_{L}$ as in [9]; the details are omitted due to space limitations. Since $\bigcup_{j=1}^{L}\mathcal{R}_{j}^{\ast} = \bigcup_{j=1}^{L}\mathcal{R}_{j}$, we can conclude that $\mathcal{R}_{j}^{\ast} = \mathcal{R}_{j}, \forall j$. □
Appendix 2
Proof of Theorem 1
Proof
This completes the proof. □
Appendix 3
Proof of Theorem 2
Proof
In the multiple infinite series representation (19), for a sufficiently high P_{av}, we have $\left|\frac{m_{i}\lambda_{i}K}{p_{ij}}\right| < 1, \forall i,j$. Thus, from [30], the conditions for the convergence of the power series (20) are satisfied.
where $T(n_{1},\dots ,n_{M}) = \frac{\prod_{i=1}^{M}(m_{i})_{n_{i}}(m_{i}\lambda_{i}K)^{n_{i}}\frac{1}{n_{i}!}}{\left(1+\sum_{i=1}^{M}m_{i}\right)_{n_{\tau}}}$ and $\mathbb{Z}$ is the set of nonnegative integers.
which completes the proof of Theorem 2. □
Appendix 4
Proof of Lemma 2
Proof
where $C = K m_{M}\max\left(\lambda_{i}\right)\prod_{i=1}^{M}\left(\frac{\lambda_{i}}{\max\left(\lambda_{i}\right)}\right)^{m_{i}}$ and $C' = C\frac{\left(m_{M}\max(\lambda_{i})\right)^{\rho -1}}{\Gamma\left(\rho \right)}$, and the last equality follows from the fact that when n ≥ 1, the individual terms go to zero for any ρ, as r_{1} → 0.