Resource efficient connectivity restoration algorithm for mobile sensor/actor networks
© Imran et al.; licensee Springer. 2012
Received: 29 May 2012
Accepted: 28 October 2012
Published: 21 November 2012
Maintaining inter-node connectivity is of paramount concern in most applications of mobile sensor/actor networks because nodes have to report their data and coordinate their operations. Failure of a node may partition the inter-node network into disjoint segments, and may thus hinder data delivery and inter-node coordination. This article presents a novel resource efficient connectivity restoration algorithm (RECRA) that repairs severed connectivity while imposing minimal overhead on the nodes. To avoid overreacting to non-critical failures, RECRA identifies critical/non-critical nodes and only triggers the recovery when a critical node fails. The failure of a node is detected by its neighbors, and a recovery procedure is executed based on their proximity, status (critical/non-critical), and transmission range. RECRA prefers to employ a non-critical node and moves it to the place of the failed node in order to limit the impact on coverage, contain the scope of the recovery, and curb successive cascaded relocations. If no non-critical node is available in the neighborhood, the neighbors restore connectivity by exploiting their partially utilized transmission power and repositioning closer to the failed node. RECRA is validated analytically and through extensive simulations. The simulation results confirm the effectiveness of RECRA for both dense and sparse network segments and its performance advantage over contemporary schemes found in the literature.
Keywords: Wireless sensor and actor networks; Fault tolerance; Connectivity restoration; Node relocation; Topology repair
In both mobile sensor networks (MSNs) and wireless sensor and actor networks (WSANs), maintaining network connectivity is extremely important throughout the network lifetime in order to satisfy application-level requirements. For example, effective and accurate monitoring of an area requires maximal presence of sensors to receive data from neighbors and perform aggregation or fusion. Similarly, most applications of WSANs require actors to establish and maintain a connected inter-actor topology in order to coordinate with each other on an optimal response and to synchronize their operations. For example, in disaster management applications, sensors detect and report the presence of survivors in the vicinity to the designated actors. The actors, equipped with the necessary equipment, receive the sensor data, process it, and share it with peer actors to determine the most appropriate set of actors to participate in the operation. This requires that actors be able to communicate with each other and perform the appropriate action.
Nonetheless, the harsh application environment that MSNs and WSANs operate in makes nodes (i.e., sensors and actors) susceptible to physical damage and component malfunction. Failure of a critical node partitions the inter-node network into disjoint segments and consequently hinders data delivery and inter-node interaction. In addition, a node failure may cause significant loss of coverage. Since MSNs and WSANs operate autonomously in unattended setups, replacing the failed node is often infeasible, and the recovery should be a self-healing and agile process that reconfigures the inter-node topology in a distributed manner. Moreover, the resource-constrained nature of MSNs and WSANs requires the connectivity restoration process to be lightweight in terms of the incurred overhead and to limit the impact on network coverage.
This article presents a novel resource efficient connectivity restoration algorithm (RECRA) for repairing a topology after the failure of a critical node has partitioned the network into disjoint segments. In order to avoid unnecessary recovery overhead, RECRA classifies nodes as critical or non-critical to the network connectivity based on localized information. The neighbors detect the failure of a node F and decide whether to participate in the recovery based on their status, their proximity, and how much of their transmission power is currently unused. To minimize the scope, impact, and overhead of recovery, RECRA prefers to employ non-critical nodes and moves them to the location of F. If no non-critical node is available, the neighbors of F collaboratively participate by gradually increasing their partially utilized transmission power and repositioning closer to F. This minimizes and balances the recovery overhead on individual nodes. The performance of RECRA is validated analytically and through extensive simulations. Simulation results confirm the effectiveness and efficiency of the proposed scheme compared to published schemes. To the best of the authors’ knowledge, RECRA is the first reactive scheme that combines the use of non-critical nodes, node mobility, and partially utilized transmission power of nodes in the recovery process.
The rest of the article is organized as follows. The next section describes the system model and discusses the network partitioning problem. Section 3 summarizes the related work. The detailed description of the proposed RECRA algorithm, its pseudocode, and its analysis are provided in Section 4. The validation results and performance analysis of RECRA are presented in Section 5. Finally, Section 6 concludes the article.
2. System model and problem statement
RECRA is applicable to an MSN or a WSAN. The mobile nodes can be part of a flat network topology, in the case of an MSN, or form a second tier in a hierarchical network architecture, e.g., actors or aggregation and forwarding units. The nodes are randomly placed in an area of interest where they discover each other and form a connected network. In the case of WSANs, this is an inter-actor network, i.e., no sensor nodes are involved. The nodes are assumed to be able to move on demand at a consistent speed to reach their destination. However, this movement is assumed to be more costly than other operations such as computation and message transmission. The communication range r_max of a node refers to the furthest Euclidean distance that its radio can reach. We assume that all nodes can dynamically adjust the output power of their radio when transmitting. This is a realistic assumption, as widely used radio hardware such as the CC1000 and CC2420 offers a register to specify the transmission power level at runtime. Each node is assumed to maintain a list of its direct neighbors.
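The node model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its field names, and the rule that a link exists when both endpoints are within each other's current range are hypothetical choices made here, not definitions taken from the article.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    r_c: float    # current communication range (adjustable at runtime)
    r_max: float  # maximum communication range of the radio
    neighbors: set = field(default_factory=set)  # ids of direct (1-hop) neighbors


def distance(a, b):
    """Euclidean distance between two nodes."""
    return math.hypot(a.x - b.x, a.y - b.y)


def discover_neighbors(nodes):
    """Fill each node's 1-hop neighbor list.

    ASSUMPTION: a link is bidirectional, so it exists only if the
    distance is within the current range of both endpoints.
    """
    for a in nodes:
        a.neighbors = {
            b.node_id
            for b in nodes
            if b is not a and distance(a, b) <= min(a.r_c, b.r_c)
        }
```

With three nodes placed at 0 m, 50 m, and 200 m on a line and a current range of 60 m, the first two become neighbors and the third stays isolated.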
3. Related work
Fault tolerance in MSNs and WSANs has also been studied in other contexts. For example, one fault-tolerant model designates multiple actors to each sensor and multiple sensors to each actor in order to guarantee event notification even in case of failure or inaccessibility. However, our fault-tolerant model concerns maintaining inter-node connectivity rather than reliable delivery of event notifications. Some researchers have proposed topology control algorithms that provide fault tolerance by adjusting the node transmission power to construct a fault-tolerant topology. For example, two distributed heuristics have been proposed for homogeneous mobile networks to maintain a connected topology by controlling the output power of the radios; the idea is to adjust the transmission power of nodes according to topology changes. Similarly, two localized algorithms have been proposed for heterogeneous wireless networks to preserve a bi-connected network topology. Most of these schemes neither consider the impact of node failure nor take into account the limited communication range as a bottleneck in maintaining connectivity.
Exploiting node mobility as a means of performance optimization has been pursued by multiple researchers in the context of both MSNs and WSANs. Energy conservation, increased connectivity and coverage, minimized latency, and asset protection are the metrics typically targeted by node repositioning, and surveys of this work are available in the literature. Meanwhile, employing node mobility to repair damaged network topologies has only recently started to attract attention. The work can be categorized into block and cascaded movement. Block movement often requires high pre-failure connectivity in order for the nodes to coordinate their response. An example of block movement-based approaches is the work of Basu and Redi, where the network is assumed to be 2-connected prior to the failure and the goal is to maintain such connectivity even under link or node failure. Their approach is centralized, which may not suit the distributed and dynamic nature of MSANs. A distributed approach to the same problem has also been presented. Unlike [15, 16], the objective of this study is to maintain 1-connectivity. In the absence of a higher level of connectivity, block movement is infeasible and nodes have to move independently, i.e., through cascaded movement.
Approaches pursuing cascaded motion can be further categorized based on the network state that the individual nodes are assumed to maintain. Some approaches, like DARA and LDMR, base node participation on a list of 2-hop neighbors. DARA replaces the failed node “F” with one of its neighbors, picked based on node degree, distance, and ID, in that order. Repositioning the neighbor causes more links to break, and the relocation process repeats in a cascaded manner. In LDMR, the neighbors move toward “F” and are replaced by the nearest non-cut vertices; each neighbor broadcasts a message looking for the best candidate to replace it. On the other hand, some approaches, such as RIM, VCR, and C3R, avoid the increased overhead of tracking 2-hop neighbors and require each node to be aware only of its 1-hop neighbors. The proposed RECRA algorithm fits in this category as well. RIM relocates the neighbors of “F” inward (irrespective of whether the failed node and its neighbors are critical or non-critical) until they become connected. Moving critical nodes inward requires their children to follow, and hence the scope of recovery may spread across the network. Although RIM balances the load and is suitable for sparse networks, its performance worsens significantly in dense networks. In addition, shrinking the topology inward causes significant loss of coverage.
Unlike RIM, the neighbors in VCR volunteer by moving towards F while increasing their partially utilized transmission range until they become connected. VCR applies a limited-scale diffusion in order to curb the repercussions of increased transmission power and improve coverage. However, VCR does not assess the impact of a node failure on connectivity and may engage critical nodes in recovery. Moreover, the diffusion in VCR is quite costly in terms of movement overhead, especially in dense networks. Unlike RIM and VCR, RECRA identifies critical/non-critical nodes in advance and only triggers recovery in case of a critical node failure. Moreover, RECRA prefers to employ non-critical nodes in recovery. Furthermore, it exploits the fact that some neighbors of “F” are not using their full communication range and would thus be able to reach more distant nodes than “F”. Consequently, RECRA employs fewer nodes and minimizes relocation overhead by exploiting their partially utilized transmission range.
A number of schemes address both connectivity and coverage. For example, C3R employs node relocation in order to cope with the loss of coverage and connectivity when a node fails. Instead of reconfiguring the network topology, neighbors move back and forth to replace the failed node, providing intermittent rather than permanent recovery. This solution leads to frequent topology changes and imposes considerable overhead, and is thus suitable only as a temporary measure until spare actors are deployed. On the other hand, Akkaya and Janapala address inter-actor connectivity and coverage at network setup time. Actors apply repelling forces to spread out and switch to an attraction force when they become disconnected. However, they do not deal with actor failure.
4. Resource efficient connectivity restoration
After deployment, RECRA identifies critical/non-critical nodes in the network based on localized information, and neighbors monitor each other through heartbeats. The neighbors of a node F detect its failure through missing heartbeats and initiate a recovery process based on their status, proximity, and partially utilized transmission power. If there is a non-critical neighbor, it moves to the position of F and announces the success of the recovery. Otherwise, some of the critical neighbors pursue recovery by moving inward while increasing their transmission power until they become connected. RECRA is described in detail in the balance of this section.
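Heartbeat-based failure detection can be illustrated with a minimal sketch. The function name, the `last_heard` bookkeeping, and in particular the `missed_limit` parameter (how many consecutive heartbeats must be missed before a neighbor is declared failed) are assumptions made for illustration; the text above only states that failure is detected through missing heartbeats.

```python
def detect_failed_neighbors(last_heard, now, heartbeat_period, missed_limit=3):
    """Return the ids of neighbors presumed failed.

    last_heard: dict mapping neighbor id -> time of its last heartbeat.
    A neighbor is presumed failed once more than `missed_limit`
    heartbeat periods have passed without hearing from it
    (ASSUMPTION: the threshold is not specified in the text).
    """
    timeout = heartbeat_period * missed_limit
    return sorted(nid for nid, t in last_heard.items() if now - t > timeout)
```

For instance, with a 1 s heartbeat period and a current time of 10 s, a neighbor last heard at 9 s is considered alive while one last heard at 2 s is reported as failed.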
4.1. Identifying critical nodes
As mentioned earlier, the failure of a critical node partitions the network into disjoint segments, while the network stays strongly connected despite the loss of a non-critical node. Thus, it is imperative to assess the impact of a node failure on network connectivity in order to avoid unnecessary recovery overhead. In addition, involving critical nodes in the recovery process triggers cascaded node relocation and widens the scope of the recovery, which ultimately affects coverage. RECRA identifies critical/non-critical nodes in advance in order to justify the need for, and minimize the overhead of, tolerating a node failure. A number of algorithms have been proposed in the literature to determine cut-vertices in a graph. These algorithms can be categorized into centralized and distributed. In centralized algorithms [23, 24], each node is required to maintain information about the entire topology, which involves very high communication overhead and thus may not be suitable for large-scale dynamic networks.
On the other hand, distributed and highly localized algorithms are more appropriate for large-scale MSNs and WSANs to support high mobility, scalability, robustness, and resource optimization. Localized algorithms can quickly determine critical/non-critical nodes with far less communication overhead, as they only process localized information. This is extremely desirable in dynamic networks. However, these algorithms may classify some nodes as critical although they are not critical in a network-wide view. With limited information, this is unavoidable because it is almost impossible for a node to know the alternate paths across the network. In contrast, a node classified as non-critical is always truly non-critical, i.e., the accuracy of determining non-critical nodes is 100%. Considering the potential advantages of localized algorithms and the fact that RECRA prefers to engage non-critical nodes in recovery, this category of approaches fits well and the reduced accuracy is not a major concern. Therefore, RECRA employs a localized cut-vertex detection procedure that only requires 1-hop position information and executes on each node in a distributed manner.
One may question the cost and implications of determining critical/non-critical nodes in RECRA. Each node in RECRA determines for itself whether it is critical or not based on localized information. This only involves computation overhead, which is not a major concern compared to communication and movement overhead. Moreover, the distributed nature of RECRA divides this overhead equally among all nodes. Nevertheless, RECRA strives to minimize this computation overhead by executing the procedure only when a node or one of its neighbors changes its position. In addition, the procedure is localized and therefore does not involve heavy computations. Distinguishing critical/non-critical nodes allows RECRA to avoid overreacting to non-critical node failures and to preferably engage non-critical nodes in recovery, which helps in minimizing recovery overhead.
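To make the idea of a localized test concrete, the following sketch shows one generic way a node could classify itself from 1-hop position information. It is not the specific procedure cited by the article; it is a stand-in that assumes a common range `r` for all neighbors and checks whether the node's neighbors can still reach each other without it. The function name and argument layout are hypothetical.

```python
import math
from collections import deque


def is_locally_critical(neighbor_positions, r):
    """Localized cut-vertex test using only 1-hop position information.

    neighbor_positions: dict mapping neighbor id -> (x, y).
    r: communication range assumed common to all neighbors (ASSUMPTION).

    Returns False when the neighbors stay connected among themselves
    without this node, so the node is certainly non-critical.
    Returns True otherwise, which may be a false positive because
    paths through nodes outside the 1-hop neighborhood are not visible.
    """
    ids = list(neighbor_positions)
    if len(ids) <= 1:
        return False  # a leaf or isolated node cannot partition the network
    seen = {ids[0]}
    queue = deque([ids[0]])
    while queue:  # breadth-first search over the neighbor-only subgraph
        u = queue.popleft()
        for v in ids:
            if v not in seen and math.dist(neighbor_positions[u], neighbor_positions[v]) <= r:
                seen.add(v)
                queue.append(v)
    return len(seen) < len(ids)
```

The asymmetry matches the discussion above: a "non-critical" verdict is always correct, whereas a "critical" verdict is conservative.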
4.2. Monitoring and failure detection
4.3. Connectivity restoration
Upon detecting the failure of a critical node “F”, the neighbors initiate a recovery process. The scope of recovery mainly depends on the position of F and the status of the neighbors. In high-density networks, the probability of having non-critical nodes (intermediate and leaf) is higher, and moving these nodes has minimal effect on inter-node connectivity and coverage. Therefore, RECRA prefers to exploit such an opportunity in the following manner:
If there is a non-critical node in the neighborhood, it moves to the location of F and announces its status to its new neighbors, i.e., the neighbors of F. Since the absence of a non-critical node (intermediate or leaf) does not affect inter-actor connectivity, moving such a node to the place of F restores connectivity to the pre-failure level and does not require any further movement. Moreover, moving an intermediate non-critical node largely preserves coverage, as these nodes have overlapping coverage with their neighbors. Figure 4b shows that the non-critical node N2 moves to the location of the failed node and determines that it becomes critical at the new location. It announces its new status to its neighbors so that they are ready for any future failure. This update message also serves as a confirmation that connectivity has been restored.
There may be more than one non-critical node in the neighborhood of F. This situation is somewhat complicated, especially when the nodes are unable to coordinate. If the non-critical nodes are neighbors of each other, they can coordinate, and the node that has the highest node degree and is the closest to F replaces F. Repositioning such a node not only restores connectivity but also improves coverage. On the other hand, if the non-critical nodes are unable to coordinate the recovery, they all start moving towards F. The first one to reach F announces its status so that the remaining non-critical nodes stop and move back to their original locations.
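When several non-critical neighbors can coordinate, the selection rule can be sketched as a simple ranking. The tuple layout and the tie-breaking order are assumptions: the text names the highest node degree and the closest distance to F as criteria, and this sketch arbitrarily ranks by degree first, then distance, then the lowest id.

```python
def pick_replacement(candidates):
    """Choose which non-critical neighbor replaces the failed node F.

    candidates: list of (node_id, degree, distance_to_f) tuples.
    ASSUMPTION: degree takes priority over distance, and remaining
    ties are broken by the smallest node id.
    Returns the chosen node id, or None if there is no candidate.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda c: (-c[1], c[2], c[0]))[0]
```

For example, among candidates with (degree, distance) of (2, 30 m), (3, 40 m), and (3, 35 m), the third is selected because it has the highest degree and, among those, is closest to F.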
- i. Proximity. A critical node A ∈ neighbors(F) calculates its distance d(A, F) to F. If d(A, F) is more than a percentage α of its range r, node “A” is not required to participate in the recovery at this time, i.e., it stays a passive participant (PP). In other words, close neighbors of F are favored as active participants (APs), because a slight inward movement of the neighbors around F collectively occupies the uncovered spot. Assuming a uniform distribution of N nodes in a square area (L × L), the distance between two nodes in the same row is L/√N and the distance between two diagonally neighboring nodes is √2·L/√N. Therefore, the initial value of α is set according to the average proximity to neighbors.
- ii. Legibility factor (LF). The transmission power of nodes significantly affects the network connectivity. While the power level at the transmitter determines the reachable range, i.e., how far the receiver can be, high power may increase interference and boost the number of exposed nodes in dense networks. Therefore, power control is usually pursued in order to balance the interest in high connectivity against efficient utilization of the wireless channel. In particular, nodes carefully set their transmission power to achieve a signal-to-noise ratio that suits the intended receiver and limits the potential for medium access collisions with other nodes in the vicinity. Moreover, power control is further employed in order to conserve energy. RECRA exploits the fact that many nodes are not utilizing their full range and would be able to boost their transmission power to reach receivers further away than their current neighbors. Thus, nodes whose range is only partially utilized should be favored in the recovery process. The LF of a node captures the effect of the ratio of the current range r_c to the maximum range r_max: the smaller this ratio, the larger the unused range and the higher the LF.
Nodes with high LF are favored. A node will decide to become an AP if its LF exceeds a preset threshold β. Initially, β can be approximated based on the node density. Assume the size of the deployment region is “Area”. For a uniform node deployment, the value of r_c for establishing a connected network should be set such that
where N is the number of nodes.
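The two criteria above can be sketched as follows. The closed forms used here are assumptions made for illustration rather than the article's own equations: `initial_alpha` takes the average of the row and diagonal spacings of a uniform √N × √N grid and normalizes it by the range r, and `legibility_factor` uses 1 − r_c/r_max as one simple measure that grows as less of the range is utilized. Combining both tests with a logical AND is likewise an assumption.

```python
import math


def initial_alpha(side_length, n_nodes, r):
    """Initial proximity threshold as a fraction of the range r.

    ASSUMPTION: average of the row spacing L/sqrt(N) and the diagonal
    spacing sqrt(2)*L/sqrt(N) of a uniform grid, divided by r.
    """
    spacing = side_length / math.sqrt(n_nodes)
    diagonal = math.sqrt(2) * spacing
    return (spacing + diagonal) / (2 * r)


def legibility_factor(r_c, r_max):
    """ASSUMPTION: LF = 1 - r_c / r_max (fraction of the range still unused)."""
    return 1.0 - r_c / r_max


def is_active_participant(d_to_f, r_c, r_max, alpha, beta):
    """A critical neighbor becomes an AP if it is close to F and its LF exceeds beta."""
    close_enough = d_to_f <= alpha * r_c
    return close_enough and legibility_factor(r_c, r_max) > beta
```

Under these assumptions, a neighbor 40 m from F that uses 50 m of a 100 m maximum range becomes an AP for α = 0.9 and β = 0.4, whereas the same neighbor at 48 m remains a PP.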
Node relocation. The fact that a node is using only a fraction of its maximum communication range r_max indicates that this node can move away from its current spot and make up for the increased distance to its neighbors by boosting the output power of its radio. An AP node “P” exploits this capability by moving to a distance γr_max from F. Prior to departing from its current position, node “P” notifies its children. While moving, node “P” increases its transmission power to stay connected to its children. If d(P, F) exceeds (r_max − r_c), the children that are neither AP nor PP follow to sustain their links to P, a step referred to in the literature as cascaded relocation. RECRA opts to avoid, or at least limit the scope of, cascaded relocation by favoring nodes that are close to F and can grow their transmission power. At the start of the recovery process, γ is set to 0.5. The rationale is that if all neighbors of F are at a distance ½r_max from F, the network becomes connected again.
Connecting PP nodes. The PP nodes wait for the APs to re-establish connectivity. The rationale is that the APs will end up in the vicinity of F, yet not at the position of F. Therefore, there is a high probability that PP nodes will be able to reach one of those APs without incurring any overhead. If a preset time passes without hearing from an AP, a PP increases the value of α and/or lowers β to become an AP. However, in this case the new AP first tries to increase its transmission range in order to find out whether other APs can be reached, before pursuing repositioning. When a PP becomes connected to an AP, it declares the success of the recovery (based on Theorem 1). For example, Figure 5a provides an illustration where the critical neighbors of N2 perform self-assessment based on the criteria prescribed earlier, since there is no non-critical neighbor to replace N2. Nodes N4 and N5 decide to become APs and move towards N2 while increasing their transmission power until they establish a link to the PP, namely N3. Figure 5b shows that the recovery process not only connects the neighbors of N2, but also establishes some new links to N4 and N7. In addition, N4 and N5 maintain connectivity with their children, i.e., N8, N9, N10 and N11, N12, respectively. The children only adjust their transmission power to stay connected to their parents.
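The relocation step can be illustrated geometrically. The sketch below assumes that an AP moves in a straight line towards F and stops at distance γ·r_max from it; the straight-line path, the function name, and the coordinate-tuple interface are assumptions, while the default γ = 0.5 is the initial value stated above.

```python
import math


def relocation_target(p, f, r_max, gamma=0.5):
    """Position an AP at point p should move to, given the failed node at f.

    ASSUMPTION: the AP travels along the straight line from p to f and
    stops at distance gamma * r_max from f. If it is already that
    close, it stays where it is.
    """
    dx, dy = p[0] - f[0], p[1] - f[1]
    d = math.hypot(dx, dy)
    goal = gamma * r_max
    if d <= goal:
        return p
    scale = goal / d  # shrink the offset from f down to the goal distance
    return (f[0] + dx * scale, f[1] + dy * scale)
```

An AP located 100 m from F with r_max = 100 m would thus travel 50 m and stop halfway, which is consistent with the ½r_max rationale given for γ.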
4.4. Pseudocode of RECRA
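As a rough sketch of the decision flow described in Section 4.3, the role that a neighbor of the failed node takes can be written as a small function. The function name, the string labels, and the boolean inputs are hypothetical; in particular, `qualifies_as_ap` stands for the outcome of the proximity and LF tests, whose exact form is not fixed here.

```python
def recovery_role(failed_is_critical, node_is_critical, qualifies_as_ap):
    """Role taken by a neighbor of the failed node F (illustrative only).

    failed_is_critical: F was classified as critical before it failed.
    node_is_critical:   this neighbor is itself critical.
    qualifies_as_ap:    this neighbor passes the proximity and LF tests.
    """
    if not failed_is_critical:
        return "ignore"   # no recovery is triggered for a non-critical failure
    if not node_is_critical:
        return "replace"  # a non-critical neighbor moves to the position of F
    if qualifies_as_ap:
        return "AP"       # move towards F while increasing transmission power
    return "PP"           # wait for an AP to re-establish connectivity
```

A PP that hears nothing from an AP within the preset time would relax its thresholds and re-evaluate, which in this sketch corresponds to calling the function again with an updated `qualifies_as_ap`.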
4.5. RECRA algorithm analysis
This section proves the correctness and analyzes the performance of the proposed RECRA algorithm. We introduce the following theorems.
Theorem 1. The network becomes strongly connected if it was strongly connected before a node F fails and if every PP node can reach an AP.
Theorem 2: In RECRA, the total movement distance is bounded as follows:
d_n ≤ r_max, for a non-critical neighbor;
d_mc ≤ (r_max − r) Σ_{i=0}^{n} C_i, when nodes can increase their transmission power to r_max; and
a separate bound applies to cascaded movement (assuming none of the neighbors can increase its communication range),
where r_max, n, and N are the communication range, the maximum number of neighbors, and the number of nodes in the network, respectively.
Proof: If any of the neighbors is non-critical, the failed node is simply replaced (Figure 8a). The worst-case scenario is when the non-critical node is located at distance r_max from the failed node F. Hence, the total movement overhead will not exceed r_max. The possibility of having non-critical nodes becomes higher with increased network density and communication range. Therefore, the performance of RECRA is expected to improve in both cases.
If there are no non-critical nodes in the neighborhood of F, nodes move towards F while increasing their communication range. The typical case, as illustrated in Figure 8c, is when these neighbors are located at distance r from F and are also required to increase their communication range to r_max to sustain connectivity with their own neighbors. Therefore, in this case the total movement distance required to restore connectivity will not exceed (r_max − r) Σ_{i=0}^{n} C_i.
The worst recovery scenario is when the neighbors of F are critical nodes and are already using their maximum communication range. In this case, the neighbors move towards F by distance r_max/2 to become connected. The algorithm is recursively executed on the children of the moved nodes so that they become connected with their parents. The maximum movement overhead is reached when all healthy nodes in the network are critical and have to move, as shown in Figure 8b.
Theorem 3: Participation of a non-critical node from any partition, or of an AP from each partition, is sufficient to restore connectivity irrespective of the number of partitions created.
Theorem 4: RECRA imposes a maximum movement overhead of r_max on each recovery participant, where r_max is the maximum communication range of nodes.
Proof: A failed critical node is either replaced by one of its non-critical neighbors, or the recovery participants among its neighbors move toward F while increasing their transmission power. In the former case, the non-critical node travels a maximum distance of r_max to substitute F. In the latter case, each recovery participant may have to travel a distance of r_max/2 in the worst case, as proven in Theorem 2. Therefore, the maximum movement overhead per participant in RECRA is r_max, because each node moves only once.
Theorem 5: The time for RECRA to successfully restore connectivity is O(f + T), where f is the fault detection latency and T is the maximum movement time of any node, which is proportional to r_max.
Proof: In RECRA, the worst-case time complexity is determined by failure detection and recovery. Let us assume that the time to detect a failure or receive a movement notification message is f. In the worst case, the time it takes to establish connectivity is T = d/s, where s is the movement speed of the nodes and d is the longest distance travelled by a node when cascaded relocation is pursued, as proved in Theorem 2. Since the time to increase the transmission power is nominal and the increase is pursued during relocation, it is included in T. Therefore, the convergence time of RECRA to restore connectivity in the worst case is k(f + T), which is O(f + T).
Theorem 6: The message complexity of RECRA is O(N), where N is the number of nodes in the network.
Proof: RECRA entails one message transmission for each node as it is a localized scheme that only maintains 1-hop neighbor information to restore connectivity. In addition, every AP (i.e., critical node participating in recovery) has to send one movement notification message to its children so that they can sustain connectivity with the parent and to avoid being wrongfully perceived as faulty. At the new location, the node will send one message to its neighbors. Therefore, in worst case, when all nodes move, the total number of messages will be (N + 2|AP|). Therefore, the message complexity of RECRA is O(N).
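The message count in the proof can be checked with a short computation. The function name is hypothetical; the formula N + 2|AP| is the one given in the proof, and the bound 3N follows because the number of APs cannot exceed N.

```python
def recovery_messages(n_nodes, n_active_participants):
    """Worst-case number of messages, N + 2|AP|, as stated in the proof.

    Each node sends one message, and each AP additionally sends one
    movement notification before moving and one update after arriving.
    """
    if not 0 <= n_active_participants <= n_nodes:
        raise ValueError("the number of APs must lie between 0 and N")
    return n_nodes + 2 * n_active_participants
```

For N = 100 and four APs this gives 108 messages; even if every node were an AP, the total would be 3N = 300, which is linear in N.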
5. Experimental evaluation
The effectiveness of RECRA is validated through extensive simulation experiments. This section describes the experiment setup, performance metrics, and experimental results.
5.1. Experiment setup and performance metrics
We have developed a customized simulation environment. In the experiments, we have created topologies that consist of a varying number of nodes (20–100). Nodes are randomly placed in an area of 1000 × 600 m². We have varied the transmission range of the nodes (50–125 m) during the experiments. A free-space signal propagation model and collision-free medium access arbitration are assumed. The performance is assessed using the following metrics.
The total distance moved by all nodes involved in the recovery: This gauges the efficiency of RECRA in terms of the motion-related overhead, e.g., dissipated energy during recovery.
The number of nodes moved during the recovery: This metric reflects the scope of the recovery process and the level of disturbance caused to the existing topology. This is crucial, particularly in mission-critical applications.
The number of messages exchanged among nodes: Again this metric indicates the recovery overhead.
The percentage of coverage change (increase or decrease) relative to the pre-failure level: Although connectivity is the prime objective of RECRA, node coverage is important for many setups. The loss of a node usually has a negative impact on coverage. This metric assesses whether RECRA alleviates or worsens the coverage loss.
The degree of connectivity in the network corresponding to pre-failure level: This metric reflects the level of connectivity that indicates the availability of node-independent paths. It is measured by averaging the number of neighbors that a node has.
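Two of these metrics are simple enough to sketch directly. The function names and the use of a single common range `r` for every node are assumptions made for illustration; the average node degree follows the definition given above (the number of neighbors averaged over all nodes).

```python
import math


def average_degree(positions, r):
    """Average number of neighbors per node for a common range r."""
    n = len(positions)
    if n == 0:
        return 0.0
    links = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(positions[i], positions[j]) <= r
    )
    return 2 * links / n  # each link contributes to the degree of two nodes


def total_distance_moved(before, after):
    """Sum of the distances travelled by all nodes during recovery."""
    return sum(math.dist(b, a) for b, a in zip(before, after))
```

Comparing `average_degree` before the failure and after the recovery gives the connectivity metric relative to the pre-failure level.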
The following parameters were used to vary the network configuration in the experiments:
The number of deployed nodes (N) in the network affects the node density and the inter-node connectivity.
The node communication range (r) is the initial communication range and is equal for all nodes, whereas r_max is the maximum communication range a node can have.
The transmission range utilization ratio (p), which is defined as r/r_max, affects the connectivity and recovery overhead.
5.2. Baseline approaches
We compare the performance of RECRA to that of DARA, RIM, and VCR based on the aforementioned metrics. Like RECRA, all these approaches are distributed, reactive, and exploit node relocation in order to restore connectivity. However, their procedures differ. When a node F fails, DARA selects the best candidate A among the 1-hop neighbors of F to replace it. The algorithm is recursively applied to tolerate the connectivity loss due to movement, i.e., A will be replaced by one of its neighbors, and so on. On the other hand, RIM moves all the 1-hop neighbors towards F until they become connected. RIM then applies cascaded relocation to re-establish the links that get severed by node movement. VCR exploits the partially utilized transmission power of the 1-hop neighbors in addition to relocation. A controlled diffusion is applied among recovery participants to increase the coverage and limit the repercussions of recovery.
5.3. Simulation results and analysis
The simulation experiments involve randomly generated topologies with varying numbers of nodes, communication ranges, and utilization ratios. The number of nodes is set to 20, 40, 60, 80, and 100. The communication range “r” of the nodes is varied among 50, 75, 100, and 125 m. The transmission range utilization ratio is varied among 10, 25, 50, and 75%. When changing the node count, “r” and “p” are fixed at 100 m and 50%, respectively; “N” is set to 100 while varying the communication range and to 60 when changing “p”, unless stated otherwise. The values of α and β are calculated based on N and r. Initially, the values of γ and τ are set to 0.5 and γr_max, respectively. We carefully picked the failed node in these experiments to ensure that it is really a cut vertex, since none of the baseline approaches determines critical/non-critical nodes in advance. In addition, the prime objective of this study is to minimize the post-failure recovery overhead. Moreover, we also ran experiments in which the failed node is randomly picked in order to assess the impact of the inaccuracy of determining critical nodes using 1-hop (RECRA) and 2-hop (DARA) information. In these experiments, “r” and “p” are fixed at 100 m and 25%, respectively. The results of individual experiments are averaged over 20 trials to ensure statistically stable results. All results are subject to 90% confidence interval analysis and stay within 10% of the sample mean.
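The confidence interval check mentioned above can be sketched as follows. This is an assumed reading of the statement, namely that the half-width of the 90% confidence interval over the 20 trials does not exceed 10% of the sample mean; the function names are hypothetical, and the constant 1.729 is the two-sided 90% Student-t critical value for 19 degrees of freedom.

```python
import math
import statistics

T_90_DF19 = 1.729  # two-sided 90% Student-t critical value, 19 degrees of freedom


def ci90_halfwidth(samples):
    """Half-width of the 90% confidence interval for the mean of 20 trials."""
    if len(samples) != 20:
        raise ValueError("the critical value used here is only valid for 20 samples")
    return T_90_DF19 * statistics.stdev(samples) / math.sqrt(len(samples))


def within_10_percent(samples):
    """ASSUMPTION: 'within 10%' means half-width <= 10% of the sample mean."""
    return ci90_halfwidth(samples) <= 0.10 * abs(statistics.mean(samples))
```

For other trial counts, the critical value would have to be replaced by the one matching the corresponding degrees of freedom.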
5.3.1. Total traveled distance
Figure10a indicates that the performance of RECRA scales very well and is not much affected by the node density given the optimized selection of nodes as explained in Section 4.3. In sparse networks, where a non-critical node is not available in the neighborhood, RECRA collectively engages all neighbors of F that can exploit their partially utilized transmission power. On the other hand, RECRA entirely relies on movement in case of a non-critical node in order to minimize scope of recovery as we shall discuss later in this section. Similar observation can be made for the communication range (Figure10b), where the connectivity-restoration overhead is minor compared to the baseline approaches.
Figure 10 also shows that the performance of RIM and VCR is significantly affected when increasing N and r. This is attributed to the fact that the number of neighbors increases in both cases. The self-spreading step in VCR is costly in terms of motion overhead, mainly because the scope of the motion is wider and involves nodes that do not have to relocate for restoring connectivity. It is worth noting that self-spreading boosts the coverage achieved by VCR, as shown later in this section. RECRA, however, prefers to minimize movement-related overhead while keeping the pre-failure coverage intact. Figure 10 also indicates that the motion overhead in DARA is slightly higher than in RECRA. This is because DARA moves least-degree nodes, which are often critical, and this results in successive relocations. However, a higher density helps DARA to find leaf nodes and limits the scope of recovery.
Figure 11a reports the impact of critical node detection accuracy, which can potentially worsen the performance by unnecessarily triggering recovery for a non-critical node. In this experiment, the failed node is randomly picked and the criticality assessment is conducted using 2-hop information for DARA and 1-hop information for RECRA. The results show that RECRA still imposes less movement overhead, as in Figure 10a, despite the fact that DARA can assess the criticality of the failed node more accurately. In sparse networks, the accuracy of determining critical nodes with 1-hop and 2-hop information is almost the same, since the node degree is low, and thus the performance stays consistent with Figure 10a. However, with high node density, the inaccuracy of critical node detection with 1-hop information becomes more significant compared to 2-hop information due to the increased level of connectivity in the network. That is why the performance of DARA gets closer to that of RECRA for networks with a high node count. RECRA may overreact in some cases, but it limits the repercussions through the high availability of non-critical nodes.
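The gap between 1-hop and 2-hop criticality assessment can be illustrated with a minimal sketch. It is not the detection procedure of RECRA or DARA; it assumes, for simplicity, that a node can see every link among the nodes within k hops of itself, and the function names and toy topologies are hypothetical. A node is flagged as critical when its neighbors do not form a single connected group within that local view.

```python
from collections import deque

def k_hop_ball(adj, v, k):
    """Set of nodes within k hops of v (including v), found by BFS."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def looks_critical(adj, v, k):
    """Local test: are v's neighbors disconnected once v is removed,
    judging only from the nodes within k hops of v?"""
    neighbors = set(adj[v])
    if len(neighbors) <= 1:
        return False  # a leaf or isolated node cannot partition the network
    visible = k_hop_ball(adj, v, k) - {v}
    start = next(iter(neighbors))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in visible and w not in seen:
                seen.add(w)
                queue.append(w)
    return not neighbors <= seen

# Toy topologies: a 4-node ring (no cut vertex) and a 3-node path (node 0 is a cut vertex).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {1: [0], 0: [1, 3], 3: [0]}
```

In the ring, node 0 is flagged as critical with k = 1 but not with k = 2, because the alternative route through node 2 only becomes visible with 2-hop information. Under this model the local test never misses a real cut vertex; it can only overreact, which matches the behavior described above.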
Figure 11b shows the impact of the transmission range utilization ratio “p” on the movement overhead. Again, the performance of RECRA is almost constant and far better than that of VCR. This is mainly because RECRA prefers to engage non-critical nodes and avoids the need to perform self-spreading. The performance of RECRA improves slightly with higher values of “p”. This is expected because nodes reduce their reliance on movement and exploit their partially utilized transmission range. On the other hand, the performance of VCR is negatively affected since nodes have to move further away during self-spreading.
5.3.2. Number of moved nodes
5.3.3. Number of messages exchanged
5.3.4. Percentage of coverage improvement
On the other hand, increasing the node density in Figure 14a helps DARA and RIM; yet they still do not make up for the coverage loss and do not match RECRA’s performance. DARA prefers to choose the least-degree nodes. A node with the least degree is often critical, and moving such a node only shifts the coverage hole from one place to another. In RIM, the topology shrinks inward, which leads to a significant loss of coverage at the network periphery. VCR, in contrast, improves coverage by about 2% and consistently outperforms both baseline schemes. This coverage advantage of VCR is mainly driven by the diffusion (spreading) of nodes that is performed after connectivity restoration in order to limit the effect of signal interference. Figure 14b indicates that for VCR the coverage grows with the increase in the communication range, while the performance of DARA is not affected much. On the other hand, the performance of RIM significantly worsens as the communication range grows. With an increased value of r, the network becomes more connected and the number of neighbors of F grows. RIM moves nodes inward, making the area around F more crowded than the network periphery, and thus causes a significant loss of coverage.
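The effect of inward movement on coverage can be made concrete with a minimal sketch. It assumes a disc sensing model evaluated on a grid and measures the change relative to the pre-failure coverage; the sensing radius, the area dimensions, the node positions, and the function names are all hypothetical and are not taken from the simulation setup of this article.

```python
import math

def covered_fraction(nodes, radius, width, height, step=1.0):
    """Fraction of grid-cell centres within `radius` of at least one node."""
    cols = int(width / step)
    rows = int(height / step)
    covered = 0
    for i in range(cols):
        for j in range(rows):
            x = (i + 0.5) * step
            y = (j + 0.5) * step
            if any(math.hypot(x - nx, y - ny) <= radius for nx, ny in nodes):
                covered += 1
    return covered / (cols * rows)

def coverage_change_percent(before, after):
    """Relative change in coverage, in percent of the pre-failure coverage."""
    return 100.0 * (after - before) / before

# Two nodes spread apart versus the same nodes pulled inward (toy example).
spread = [(25.0, 50.0), (75.0, 50.0)]
shrunk = [(40.0, 50.0), (60.0, 50.0)]
before = covered_fraction(spread, radius=25.0, width=100.0, height=100.0)
after = covered_fraction(shrunk, radius=25.0, width=100.0, height=100.0)
change = coverage_change_percent(before, after)
```

Pulling the two nodes toward each other makes their sensing discs overlap, so `change` is negative. This is the same mechanism by which an inward-shrinking topology loses coverage at the periphery.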
5.3.5. Degree of connectivity
5.4. Important observations
We would like to make a few important observations regarding the performance and utility of RECRA. The amount of network state information maintained at each node (i.e., 1-hop, 2-hop, etc.) determines the accuracy of identifying cut vertices. Since RECRA and DARA maintain 1-hop and 2-hop information, respectively, the accuracy of determining cut vertices is slightly higher in DARA, and RECRA may overreact in some cases. However, in dynamic networks, keeping more network state information up to date at each node requires extra overhead. Since the prime focus of this study is resource-efficient recovery, RECRA strives to avoid such overhead and prefers to engage non-critical nodes and/or nodes with partially utilized transmission range. The experimental results presented above have confirmed the superiority of RECRA over contemporary schemes.
Simultaneous node failures are rare in MSNs and WSANs; therefore, RECRA is designed to recover from a single node failure. Handling simultaneous failures may cause conflicting situations for RECRA when non-critical nodes are not available in the neighborhood.
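The preference order discussed in this section can be summarized with a minimal sketch. It is an illustration under stated assumptions rather than the RECRA procedure itself: the `Neighbor` record, the action labels, and the rule of picking the closest non-critical neighbor are hypothetical simplifications of the selection based on proximity, status, and transmission range.

```python
from dataclasses import dataclass

@dataclass
class Neighbor:
    node_id: int
    critical: bool
    distance_to_failed: float  # metres (hypothetical field)

def plan_recovery(neighbors):
    """Return (action, node_ids) for the neighbors of a failed critical node."""
    if not neighbors:
        return ("none", [])
    candidates = [n for n in neighbors if not n.critical]
    if candidates:
        # Assumption: the closest non-critical neighbor relocates; ties break on id.
        mover = min(candidates, key=lambda n: (n.distance_to_failed, n.node_id))
        return ("relocate", [mover.node_id])
    # All neighbors are critical: they raise transmission power and move toward F.
    return ("raise_power_and_move", sorted(n.node_id for n in neighbors))

mixed = [Neighbor(1, True, 40.0), Neighbor(2, False, 80.0), Neighbor(3, False, 60.0)]
all_critical = [Neighbor(4, True, 30.0), Neighbor(5, True, 90.0)]
```

With these hypothetical inputs, `plan_recovery(mixed)` selects node 3 for relocation, while `plan_recovery(all_critical)` engages nodes 4 and 5 collectively. The sketch covers a single failure only, in line with the design scope stated above.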
6. Conclusion and future work
This article has presented a novel distributed resource-efficient connectivity restoration algorithm for MSNs and WSANs. Unlike most published schemes, RECRA exploits availability of non-critical nodes and partially utilized transmission power of nodes besides their relocation. RECRA classifies nodes as critical/non-critical based on their importance to sustaining connectivity in the network in order to avoid overreacting to non-critical node failures and optimize the recovery from the loss of a critical node. The neighbors of the failed node detect the failure and initiate a recovery process based on their status, proximity, and partially utilized transmission power. To minimize the overhead, RECRA prefers to employ non-critical nodes by moving them to the place of the failed node. If all neighbors are critical, RECRA employs neighbors of the failed node by increasing their transmission power and moving them towards the failed node. The performance of RECRA is validated analytically and through extensive simulations. The simulation results have confirmed the effectiveness of RECRA in sparse and dense topologies and its performance advantage over contemporary schemes found in the literature.
In the future, we plan to validate the results of RECRA on a prototype network of mobile robots.
This work is partially supported by the Chair of Pervasive and Mobile Computing at King Saud University, Saudi Arabia.
- Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E: Wireless sensor networks: a survey. J. Comput. Netw. 2002, 38(4):393-422. 10.1016/S1389-1286(01)00302-4
- McMickell MB, Goodwine B, Montestruque LA: MICAbot: a robotic platform for large-scale distributed robotics. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '03). Taipei, Taiwan; 2003:1600-1605.
- Dantu K, Rahimi M, Shah H, Babel S, Dhariwal A, Sukhatme GS: Robomote: enabling mobility in sensor networks. In Proceedings of the 4th International Symposium on Information Processing in Sensor Networks (IPSN' 05). Los Angeles, CA, USA; 2005:404-409.
- Robot, http://www.irobot.com
- Yuta S, Asama H, Prassler E, Tsubouchi T, Thrun S, Diel M, Hahnel D: Scan alignment and 3-D surface modeling with a helicopter platform. In Field and Service Robotics. 24th edition. Springer Berlin, Heidelberg; 2006:287-297.
- Wang G, Cao G, Porta TL, Zhang W: Sensor relocation in mobile sensor networks. In Proceedings of the IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2005). 4th edition. Miami; 2005:2302-2312.
- Akyildiz IF, Kasimoglu IH: Wireless sensor and actor networks: research challenges. J. Ad Hoc Netw. 2004, 2(4):351-367. 10.1016/j.adhoc.2004.04.003
- CC1000 A unique UHF RF Transceiver. http://www.chipcon.com
- CC2420 2.4 GHz IEEE 802.15.4 ZigBee-ready RF Transceiver. http://www.chipcon.com
- Imran M, Younis M, Said AM, Hasbullah H: Localized motion-based connectivity restoration algorithms for wireless sensor and actor networks. J. Netw. Comput. Appl. 2012, 35(2):844-856. 10.1016/j.jnca.2011.12.002
- Ozaki K, Watanabe K, Itaya S, Hayashibara N, Enokido T, Takizawa M: A fault-tolerant model for wireless sensor-actor system. In Proceedings of the 20th International Conference on Advanced Information Networking and Applications (AINA 2006), vol. 2. Vienna; 2006:303-307.
- Ramanathan R, Rosales-Hain R: Topology control of multihop wireless networks using transmit power adjustment. In Proceedings of the IEEE 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000). 2nd edition. Tel Aviv; 2000:404-413.
- Li N, Hou JC: Topology control in heterogeneous wireless networks: problems and solutions. In Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM' 04). Hong Kong; 2004:232-243.
- Younis M, Akkaya K: Strategies and techniques for node placement in wireless sensor networks: a survey. J. Ad Hoc Netw. 2008, 6(4):621-655. 10.1016/j.adhoc.2007.05.003
- Basu P, Redi J: Movement control algorithms for realization of fault-tolerant ad hoc robot networks. J. IEEE Netw. 2004, 18(4):36-44. 10.1109/MNET.2004.1316760
- Shantanu D, Hai L, Ajith K, Amiya N, Stojmenovic I: Localized movement control for fault tolerance of mobile robot networks. In Proceedings of The First IFIP International Conference on Wireless Sensor and Actor Networks (WSAN 2007). Albacete; 2007:1-12.
- Abbasi AA, Akkaya K, Younis M: A distributed connectivity restoration algorithm in wireless sensor and actor networks. In Proceedings of the 32nd IEEE Conference on Local Computer Networks (LCN' 07). Dublin, Ireland; 2007:496-503.
- Alfadhly A, Baroudi UA, Younis M: Least distance movement recovery approach for large scale wireless sensor and actor networks. In Proceedings of the 7th International Wireless Communications and Mobile Computing Conference (IWCMC'11). Istanbul; 2011:2058-2063.
- Younis M, Lee S, Gupta S, Fisher K: A localized self-healing algorithm for networks of moveable sensor nodes. In Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM' 08). LA; 2008:1-5.
- Imran M, Younis M, Md Said A, Hasbullah H: Volunteer-instigated connectivity restoration algorithm for wireless sensor and actor networks. In Proceedings of the 2010 IEEE International Conference on Wireless Communications, Networking and Information Security (WCNIS). Beijing; 2010:679-683.
- Tamboli N, Younis M: Coverage-aware connectivity restoration in mobile sensor networks. J. Netw. Comput. Appl. 2010, 33(4):363-374. 10.1016/j.jnca.2010.03.008
- Akkaya K, Janapala S: Maximizing connected coverage via controlled actor relocation in wireless sensor and actor networks. J. Comput. Netw. 2008, 52(14):2779-2796. 10.1016/j.comnet.2008.06.009
- Duque-Anton M, Bruyaux F, Semal P: Measuring the survivability of a network: connectivity and rest-connectivity. Eur. Trans. Telecommun. 2000, 11(2):149-159. 10.1002/ett.4460110203
- Goyal D, Caffery JJ: Partitioning avoidance in mobile ad hoc networks using network survivability concepts. In Proceedings of the Seventh International Symposium on Computers and Communications (ISCC'02). Giardini Naxos; 2002:553-558.
- Jorgic M, Stojmenovic I, Hauspie M, Simplot-ryl D: Localized algorithms for detection of critical nodes and links for connectivity in ad hoc networks. In Proceedings of the 3rd Annual IFIP Mediterranean Ad Hoc Networking Workshop (Med-Hoc-Net). Bodrum; 2004:360-371.
- Correia LHA, Macedo DF, dos Santos AL, Loureiro AAF, Nogueira JMS: Transmission power control techniques for wireless sensor networks. Comput. Netw. 2007, 51(17):4765-4779. 10.1016/j.comnet.2007.07.008
- Liu X, Xiao L, Kreling A, Liu Y: Optimizing overlay topology by reducing cut vertices. In Proceedings of the 2006 International Workshop on Network and Operating Systems Support for Digital Audio and Video. Rhode Island, USA; 2006.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.