PSO was inspired by the movement of a swarm, such as a shoal of fish or a flock of birds, which splits up into groups to find food or to escape from enemies. The swarm has no apparent leader; its behavior emerges from the social interactions between the bird-like objects (or *boids*). The coherent movement of these boids is modeled on their social interactions with their neighbors. The algorithm simulating these social aspects was simplified in [22], and the simplified version was found to perform optimization. In this article, a basic PSO algorithm [23] with inertia weight and velocity restriction is implemented; it is capable of finding a stable solution for a given objective function.

Classical optimization methods are especially preferred when the optimization problem is known to be convex, but this is not the case here. Numerical methods such as Newton's method are not feasible because the objective function is non-differentiable. Other classical techniques could fail, whereas PSO would always find an equilibrium/stable solution. PSO was chosen over other evolutionary algorithms because it requires very few parameters to configure, it is easier to understand, it needs less computational bookkeeping, and it is well suited to reducing the backhaul load. In [23], PSO is viewed as a paradigm within the field of swarm intelligence, and the performance measures of basic PSO are highlighted. This reference also details the differences between PSO and other evolutionary algorithms.

In this article, each bird in the swarm carries the real and imaginary parts of the non-zero elements of the BF matrix, i.e., the *i* th member of the swarm is the *i* th *particle*, which carries all the *n* = 2 · *K* · *N*_{T} · *M* BF coefficients. The factor 2 arises because PSO treats the real and the imaginary parts of each complex BF coefficient as separate dimensions of the search space. Hence, the particle having the best *n* values needs to be found for a given objective function. For example, an infinite threshold would yield *n* = 2 · *K* · *N*_{T} · *M* non-zero CSI coefficients in the aggregated channel matrix of size [*M* × *K* · *N*_{T}]. With an active set threshold of 0 dB, only the best link (or reference link) would be fed back by each UT, yielding *n* = 2 · 1 · *N*_{T} · *M*.

The real and the imaginary parts of the non-zero BF matrix, $\tilde{\mathbf{W}}$, are mapped to a particle. This mapping, during initialization, only illustrates how the BF is translated to a particle; these steps can be omitted in the actual implementation. The position, **X**(*i*, *j*), and the velocity, **V**(*i*, *j*), of the *i* th particle with the *j* th BF coefficient are stochastically initialized as **X**(*i*, *j*) = *x*_{min} + *r* · (*x*_{max} - *x*_{min}) and $\mathbf{V}\left(i,j\right)=\frac{1}{\mathrm{\Delta}t}\left(-\frac{{x}_{\text{max}}-{x}_{\text{min}}}{2}+s\cdot \left({x}_{\text{max}}-{x}_{\text{min}}\right)\right)$, respectively. Here *r* and *s* are random numbers picked from a uniform distribution in the interval [0, 1], and *x*_{max} is the maximum value that a BF coefficient is initialized with.

This does not mean that the position of a particle cannot exceed this value, i.e., the particles in the PSO can move beyond these limits. The same holds for the velocity of a particle, except that it is restricted by a maximum velocity, *v*_{max}, so that the particle does not diverge. Δ*t* is the time step length. The total number of particles is *Q*. Recall that each particle is indexed using the variable *i* and carries *n* BF coefficients, which are indexed using the variable *j*.
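The mapping and the stochastic initialization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: the function names `map_to_particle` and `init_swarm` are hypothetical, the BF matrix is treated as dense (the infinite-threshold case, so every element is mapped), and the symmetric range *x*_{min} = -*x*_{max} is taken from Algorithm 2.

```python
import random

def map_to_particle(W):
    """Flatten a complex matrix (list of rows) into [Re, Im, Re, Im, ...].

    Assumption: every element of W is treated as non-zero (dense case).
    """
    particle = []
    for row in W:
        for w in row:
            particle.extend([w.real, w.imag])
    return particle

def init_swarm(Q, n, x_max, dt=1.0, rng=random):
    """Stochastically initialize Q particles with n real coordinates each."""
    x_min = -x_max                      # symmetric range, as in Algorithm 2
    span = x_max - x_min
    # Position: x_min + r * (x_max - x_min), with r uniform in [0, 1)
    X = [[x_min + rng.random() * span for _ in range(n)] for _ in range(Q)]
    # Velocity: (-(span / 2) + s * span) / dt, with s uniform in [0, 1)
    V = [[(-span / 2 + rng.random() * span) / dt for _ in range(n)]
         for _ in range(Q)]
    return X, V

# Hypothetical sizes: K = 2, N_T = 2, M = 3 gives n = 2 * 2 * 2 * 3 = 24.
K, N_T, M = 2, 2, 3
n = 2 * K * N_T * M
X, V = init_swarm(Q=10, n=n, x_max=0.5)
```

With Δ*t* = 1, the initial velocities fall in the same symmetric interval as the positions' half-range, which keeps the first position update on the scale of the initialization range.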

A given objective function is evaluated for every particle *i*; the BF coefficients it carries are demapped to form the BF matrix as $\tilde{\mathbf{W}}\left(l,m\right)\leftarrow \mathbf{X}\left(i,j\right)+\text{i}\cdot \mathbf{X}\left(i,j+1\right),\ l\in \left\{1,...,K{N}_{\text{T}}\right\},\ m\in \left\{1,...,M\right\}$. The *i* th particle keeps a record of its best BF as **X**^{pb}(*i*, :), and the best BF achieved by any of the particles in the swarm is stored as **x**^{sb}. The equations governing the update of the velocity and the position of a particle are:

$\begin{array}{ll}\mathbf{V}(i,j)\leftarrow w\cdot \mathbf{V}(i,j)+{c}_{1}\cdot p\cdot \frac{{\mathbf{X}}^{pb}(i,j)-\mathbf{X}(i,j)}{\mathrm{\Delta}t}+{c}_{2}\cdot q\cdot \frac{{\mathbf{x}}^{sb}(j)-\mathbf{X}(i,j)}{\mathrm{\Delta}t},\hfill & \quad (7)\hfill \\ \mathbf{X}(i,j)\leftarrow \mathbf{X}(i,j)+\mathbf{V}(i,j)\cdot \mathrm{\Delta}t.\hfill & \quad (8)\hfill \end{array}$

The variables *p* and *q* are random numbers drawn from a uniform distribution in the interval [0, 1]. The terms involving *c*_{1} and *c*_{2} are called the *cognitive* component and the *social* component, respectively. The cognitive component determines how much a given particle relies on itself, i.e., on its own memory of its best position, while the social component determines how much it relies on its neighbors. The cognitive and social constant factors, *c*_{1} and *c*_{2}, are both set to 2, as highlighted in [22]. An *inertia weight*, *w*, is used to bias the current velocity based on its previous value: while the inertia weight is greater than 1, as it is initially, the particles are biased to explore the search space. Once the inertia weight decays to a value less than 1, the cognitive and social components are given more attention [24]. The decay of the inertia weight is governed by a constant decay factor *β*, such that *w* ← *w* · *β*.
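As an illustration of Eqs. (7) and (8), the following Python sketch updates a single particle. It rests on assumptions the text does not fix: the function name `update_particle` is hypothetical, *p* and *q* are drawn independently for every coefficient *j*, and the velocity restriction is implemented by clipping to [-*v*_{max}, *v*_{max}].

```python
import random

def update_particle(x, v, x_pb, x_sb, w, c1=2.0, c2=2.0,
                    dt=1.0, v_max=1.0, rng=random):
    """Apply Eqs. (7)-(8) to one particle and return the new (x, v) lists."""
    new_x, new_v = [], []
    for xj, vj, pbj, sbj in zip(x, v, x_pb, x_sb):
        p, q = rng.random(), rng.random()     # assumed: fresh draws per coefficient
        vj = (w * vj                          # inertia term
              + c1 * p * (pbj - xj) / dt      # cognitive component
              + c2 * q * (sbj - xj) / dt)     # social component
        vj = max(-v_max, min(v_max, vj))      # assumed: restriction by clipping
        new_v.append(vj)
        new_x.append(xj + vj * dt)            # Eq. (8)
    return new_x, new_v

# One update of a two-coefficient particle, followed by the inertia decay.
w, beta = 1.4, 0.99                           # illustrative values only
x, v = update_particle([0.2, -0.1], [0.0, 0.0],
                       x_pb=[0.1, 0.0], x_sb=[0.0, 0.3], w=w)
w *= beta                                     # w <- w * beta
```

A particle that already sits at both its own best and the swarm's best, with zero velocity, does not move, since all three terms of Eq. (7) vanish.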

The pseudocode of PSO described above is summarized in Algorithm 2.

**Algorithm 2: Pseudocode for obtaining the BF via PSO. Steps 3 to 5 are included only for illustration and can be omitted in an actual implementation**

1: *Initialization:*

2: Determine the number of non-zero coefficients *n* needed in the BF matrix, $\tilde{\mathbf{W}}$

3: Map the BF to the particle:

4: $\mathbf{X}\left(i,j\right)\leftarrow \Re \left\{\tilde{\mathbf{W}}\left(l,m\right)\right\},l\in \left\{1,...,K{N}_{\text{T}}\right\},m\in \left\{1,...,M\right\}$

5: $\mathbf{X}\left(i,j+1\right)\leftarrow \Im \left\{\tilde{\mathbf{W}}\left(l,m\right)\right\}$

6: Stochastically initialize particles with BF coefficients:

7: ${x}_{\text{max}}=1/\text{max}\left|\tilde{\mathbf{H}}\left(i,j\right)\right|$

8: *x*_{min} = -*x*_{max}

9: Position: **X**(*i*, *j*) = *x*_{min} + *r* · (*x*_{max} - *x*_{min})

10: Velocity: $\mathbf{V}\left(i,j\right)=\frac{1}{\mathrm{\Delta}t}\left(-\frac{\left({x}_{\text{max}}-{x}_{\text{min}}\right)}{2}+s\cdot \left({x}_{\text{max}}-{x}_{\text{min}}\right)\right)$

11: **while** termination criterion is not met **do**

12: **for** the *i* th particle in the swarm **do**

13: Demap the variables in a particle to form the BF matrix

14: $\tilde{\mathbf{W}}\left(l,m\right)\leftarrow \left\{\mathbf{X}\left(i,j\right)\right\}+\text{i}\cdot \left\{\mathbf{X}\left(i,j+1\right)\right\}$

15: Evaluate the objective function *f*(**X**(*i*, :))

16: *Store*:

17: **if** *f*(**X**(*i*,:)) < *f*^{pb}(**X**(*i*,:)) **then**

18: Particle's Best: **X**^{pb}(*i*,:) ← **X**(*i*,:)

19: **end if**

20: **if** *f*(**X**(*i*,:)) < *f*^{sb}(**X**(*i*,:)) **then**

21: Swarm's Best: **x**^{sb} ← **X**(*i*,:)

22: ${\tilde{\mathbf{W}}}^{\mathbf{s}\mathbf{b}}\left(l,m\right)\leftarrow \left\{{\mathbf{x}}^{sb}\left(j\right)\right\}+\text{i}\cdot \left\{{\mathbf{x}}^{sb}\left(j+1\right)\right\}$

23: **end if**

24: **end for**

25: **for** each particle *i* in the swarm with BF coefficients **do**

26: *Update:*

27: Velocity: $\mathbf{V}\left(i,j\right)\leftarrow w\cdot \mathbf{V}\left(i,j\right)+{c}_{1}\cdot p\cdot \left(\frac{{\mathbf{X}}^{pb}\left(i,j\right)-\mathbf{X}\left(i,j\right)}{\mathrm{\Delta}t}\right)+{c}_{2}\cdot q\cdot \frac{{\mathbf{x}}^{sb}\left(j\right)-\mathbf{X}\left(i,j\right)}{\mathrm{\Delta}t}$

28: Restrict velocity: |**V**(*i*, *j*)| < *v*_{max}

29: Position: **X**(*i*, *j*) ← **X**(*i*, *j*) + **V**(*i*, *j*) · Δ*t*

30: **end for**

31: *w* ← *w* · *β*

32: **end while**

33: **return** BF Weight Matrix, ${\tilde{\mathbf{W}}}^{\mathbf{s}\mathbf{b}}$
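The loop of Algorithm 2 can be sketched end to end in Python as follows. This is a minimal sketch under stated assumptions rather than the article's implementation: the names `demap`, `pso`, `toy_objective` and `TARGET` are hypothetical, the objective is a toy squared-error function standing in for the article's objective, termination is a fixed iteration budget, the velocity restriction is implemented by clipping, and the values of *Q*, *w*, *β*, *x*_{max} and *v*_{max} are illustrative. Only *c*_{1} = *c*_{2} = 2 is taken from the text.

```python
import random

def demap(particle, rows, cols):
    """Rebuild a rows x cols complex matrix from [Re, Im, Re, Im, ...]."""
    vals = [complex(particle[j], particle[j + 1])
            for j in range(0, len(particle), 2)]
    return [vals[r * cols:(r + 1) * cols] for r in range(rows)]

def pso(f, n, Q=20, x_max=1.0, v_max=1.0, w=1.4, beta=0.99,
        c1=2.0, c2=2.0, dt=1.0, iters=200, seed=0):
    """Minimize f over n real coordinates; returns (x_sb, f_sb)."""
    rng = random.Random(seed)
    x_min = -x_max
    span = x_max - x_min
    # Steps 6-10: stochastic initialization of positions and velocities
    X = [[x_min + rng.random() * span for _ in range(n)] for _ in range(Q)]
    V = [[(-span / 2 + rng.random() * span) / dt for _ in range(n)]
         for _ in range(Q)]
    X_pb = [x[:] for x in X]
    f_pb = [float("inf")] * Q
    x_sb, f_sb = X[0][:], float("inf")
    for _ in range(iters):                 # assumed criterion: iteration budget
        for i in range(Q):                 # steps 12-24: evaluate and store
            val = f(X[i])
            if val < f_pb[i]:
                f_pb[i], X_pb[i] = val, X[i][:]
            if val < f_sb:
                f_sb, x_sb = val, X[i][:]
        for i in range(Q):                 # steps 25-30: update
            for j in range(n):
                p, q = rng.random(), rng.random()
                v = (w * V[i][j]
                     + c1 * p * (X_pb[i][j] - X[i][j]) / dt
                     + c2 * q * (x_sb[j] - X[i][j]) / dt)
                V[i][j] = max(-v_max, min(v_max, v))   # clip to v_max
                X[i][j] += V[i][j] * dt
        w *= beta                          # step 31: inertia decay
    return x_sb, f_sb

# Toy stand-in objective: squared distance to a fixed 2 x 2 complex matrix.
TARGET = [[0.3 - 0.2j, -0.5 + 0.1j], [0.7 + 0.4j, 0.0 - 0.6j]]

def toy_objective(particle):
    W = demap(particle, 2, 2)              # steps 13-14: demap before evaluating
    return sum(abs(W[r][c] - TARGET[r][c]) ** 2
               for r in range(2) for c in range(2))

x_sb, f_sb = pso(toy_objective, n=8)
W_sb = demap(x_sb, 2, 2)                   # step 33: BF weight matrix
```

Because the swarm's best is only overwritten when a strictly smaller objective value is found, the returned value can never be worse than the best particle of the initial swarm, regardless of how the illustrative parameters are chosen.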