Comparing wireless flooding protocols using trace-based simulations
© Jacobsson and Rohner; licensee Springer. 2013
Received: 30 November 2012
Accepted: 4 June 2013
Published: 19 June 2013
Most wireless multi-hop networks, such as ad hoc networks and wireless sensor networks, need network-wide broadcasting, which is best done with a flooding protocol. In this article, we use packet trace information from a real test-bed network to define a simulator for flooding protocol performance studies. Five protocols are compared using the simulator. Trace-based simulations promise the benefits of a simulator, such as reduced work effort and repeatability, while still producing results close to those of the real test-bed or deployment. We propose and evaluate different approaches to using collected trace data and to tuning the parameters to achieve the best possible accuracy in comparison with actual test-bed measurements. We study the resulting accuracy of the model so that users of performance studies know with what confidence a given conclusion can be drawn. Using the new trace-based model and knowing its accuracy, we compare the five flooding protocols to gain additional insights into their performance. Finally, by modifying the trace data, we study how real-world effects, such as links with in-between qualities and asymmetric links, influence the different flooding protocols.
In some wireless networks, a node may not be able to directly transmit a packet to every node in the network due to transmission range limitations. Instead, it needs help from other nodes to relay the packet to the destination. Such wireless networks are called multi-hop networks and require special networking protocols. Examples are mobile ad hoc networks (MANETs), wireless sensor networks (WSNs), and mesh networks. Most wireless multi-hop networks also need network-wide broadcasting. Just like unicast routing, this is only possible if some of the nodes relay the broadcasting packet so that all nodes can be reached. This process is called flooding and is used by multiple other protocols and applications, including unicast routing protocols.
This article builds on our earlier work on flooding protocols [1, 2], where we proposed a flooding protocol called prioritized flooding with self-pruning (PFS) and compared it to other common flooding protocols. In , we proposed the first version of this protocol and simulated it in the standard configuration of ns2 , which is a very popular simulation package for wireless networking. In , we implemented the same protocol in a real wireless network test-bed and again measured its performance and compared it to blind flooding and counter-based broadcasting (CBB) . We found that PFS had problems in real networks, and we had to modify the protocol. We also noticed discrepancies between the performance results obtained in the simulator and what we observed in the real network. Hence, we wanted to understand these discrepancies better and see whether it is possible to find a model that more closely simulates a real network. With such a simulator, it becomes possible to study more parameters and more protocols than in a test-bed but without losing important real-world aspects.
That the results from simulator studies of wireless networks have problems is not new and has been known for many years. For instance, [5–8] compared common MANET or WSN simulators and showed radically different results due to different radio propagation models, frame reception models, and other assumptions. Kotz et al.  looked at the impact of common simplifications in ad hoc network simulations and showed drastic errors in comparison with more accurate assumptions. Hence, to use simulators, we must first validate the used models.
The aim of this work was to find simulator methods that can tell, with a higher degree of confidence, how flooding protocols perform in real networks. To know whether we are successful, we benchmark the simulator results against the real network observations that we have and try to find simulator models whose results are closer to the real network. A model that is closer to the real network is simply said to be more accurate. More accurate simulations allow us to use simulators to study the performance of various protocols and be more confident that the results also hold in the real network. On the negative side, there is the question of how general these results are, given that we only study a handful of network deployments. To properly answer this question, the work in this paper should be extended to more network deployments. It would then be possible to determine whether the same model performs just as well in other scenarios and hence can be considered generic.
For the sake of this study, we say that there are two types of simulation models for radio propagation and channel characteristics: synthetic models and trace-based models. Synthetic models are purely based on mathematical models derived from a few observations. With them, you can create any scenario you like. However, they need to make a lot of assumptions and simplifications. The most common model, the two-ray ground reflection model, popular in ns2 and many other simulators, is known to be overly simplistic. Nevertheless, even more advanced synthetic models can give erroneous performance results (e.g., see ). It is simply not possible to take into account all the different aspects of wireless communication. In particular, many wireless simulations based on synthetic models make unrealistic assumptions, such as
Links without in-between quality;
Too simplistic radio propagation (no or limited shadowing and multi-path);
Too simplistic frame-reception model (e.g., w.r.t. interference);
Frame length-independent reception.
The alternative to synthetic models is to collect data traces from real networks and use them in the simulator (e.g., [11, 12]), a technique known as trace-based simulation. Trace-based simulators limit you to the scenarios of the networks where the traces came from. However, they still have all the benefits of a simulator, such as ease of use, repeatability, and speed. They also promise to produce more accurate simulation results, at least as long as you simulate the network from which the traces were collected.
During our earlier test-bed experiments , we collected general data about the network and its topology. These data include the packet delivery ratio between all node pairs and should be enough for a trace-based simulation. In this article, we will use these data to see whether we can mimic the test-bed behavior in the simulator and, if so, continue to simulate additional protocols that we did not have the time and resources to run in the test-bed. Further, we can also modify the traces to simulate what happens in special cases, such as stress-testing the protocols or removing certain real-world effects to see what impact such effects have on the protocols. Finally, better simulation models may also be used in other areas besides performance simulations, such as deployment optimizations , where we collect traces from an actual deployment and run simulations based on them to determine the best protocols and parameters for that particular deployment.
Hence, the contributions of this article are twofold: First, we show how our collected data can be used in a trace-based simulation and how this leads to results closer to the real network than synthetic models. Second, we use this model to do a performance comparison of five different flooding protocols.
This article is organized as follows. Section 2 introduces a simulation study based on the commonly used two-ray ground reflection model of ns2, investigates the performance of two radically different flooding protocols using this model, and compares these results with test-bed measurements. In Section 3, we introduce a new simulation model based on collected packet traces from the test-bed, and in Section 4, we tune the model parameters and study its accuracy in comparison with the test-bed measurements. In Section 5, we use the new model to simulate and compare five different flooding protocols. We also test how PFS behaves under different situations by modifying the traces. Section 6 contains related work, and we conclude in Section 7.
In this section, we simulate the two flooding protocols CBB and PFS using a standard synthetic simulation model and compare the results with the experimental results obtained in our previous work . The code is based on the same code as we used in  but with the protocol enhancements that were found in  also implemented in the simulator. Hence, we use the ns2 simulator , which is a frequently used tool for simulations of wireless protocols. However, we changed the wireless network layer to IEEE 802.15.4, which was used in our test-bed experiments. As a result, the application, the flooding protocol, and the MAC protocols are all the same in both the simulator and the test-bed. The simulator is set up to closely mimic the test-bed.
In the following two subsections, we first highlight the most important aspects of the simulators and the test-bed experiments used. In the third subsection, we show the results of the comparison and analyze the discrepancies.
For our test-bed experiments, we used t-mote Sky (based on Telos Revision B) from Moteiv Corporation. The wireless technology used by t-mote Sky is 2.4 GHz IEEE 802.15.4, which uses direct sequence spread spectrum radio frequency modulation with a data rate of 250 kbps.
In our experiments, a packet consisted of a payload plus a total of 17 bytes of headers (the MPDU consisted of a 9-byte header and a 2-byte CRC field). When transmitting, a mote used the clear channel assessment (CCA) function defined in  to determine whether the channel is idle or not. If there was an ongoing transmission, the mote waited a random time (uniformly distributed between 61 µs and 2.0 ms in the first attempt and between 61 µs and 7.8 ms in the subsequent attempts) and tried again. After eight unsuccessful attempts, it gave up and dropped the packet. Since we used broadcasting, no acknowledgments were exchanged.
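The channel access procedure described above can be sketched as follows. This is an illustration only, not the mote firmware: `channel_is_idle` is a hypothetical callable standing in for the CCA, and the sketch assumes that the shorter backoff window applies after the first busy assessment and that no wait follows the eighth one.

```python
import random

FIRST_BACKOFF = (61e-6, 2.0e-3)   # seconds, window used after the first busy CCA
LATER_BACKOFF = (61e-6, 7.8e-3)   # seconds, window used after later busy CCAs
MAX_ATTEMPTS = 8                  # the packet is dropped after eight busy CCAs


def try_send(channel_is_idle, rng=random):
    """Return (sent, total_backoff_seconds) for one broadcast packet.

    channel_is_idle is a hypothetical stand-in for the CCA function.
    """
    waited = 0.0
    for attempt in range(MAX_ATTEMPTS):
        if channel_is_idle():
            return True, waited          # channel free: transmit now
        if attempt == MAX_ATTEMPTS - 1:
            break                        # eighth busy assessment: give up
        low, high = FIRST_BACKOFF if attempt == 0 else LATER_BACKOFF
        waited += rng.uniform(low, high)
    return False, waited
```

Because broadcasts are not acknowledged, a dropped packet is never retried by the MAC layer in this sketch.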
We placed 50 motes in a grid topology with ten motes lined up in five rows. We used six different deployment scenarios, where the distance between neighboring motes was the same but varied between 0.3 and 2.0 m for the different experiments. After 2.0 m, the network became disconnected with a very high probability. Each node was elevated about 0.2 m above the floor using blocks of polystyrene foam in order to avoid the worst kind of multi-path interference. To make the network multi-hop, we set the transmission power to almost the lowest level (−24 dBm).
We quantify the network density by the average node degree:

$$\frac{1}{N}\sum_{x=1}^{N}\sum_{y=1}^{N} R_{x,y} \qquad (1)$$

where $N$ is the number of nodes, $R_{x,y}$ is the packet delivery ratio from node $x$ to node $y$, and $R_{x,x}=0$. It is basically the average number of direct neighbors but also considers the packet delivery ratio of the links. If a particular link experienced a packet delivery ratio of 40%, we counted that as a 0.4 link. We did this because a flooding protocol should be able to exploit the fact that packets can be sent on this link with a probability of 40%.
A higher value of Equation 1 means a denser network with 49 as the maximum in our 50-mote network. For each deployment scenario, we measured the average node degree by letting each mote transmit 100 packets with 2-byte payload. At the same time, each mote listened for packets from other motes so that the packet delivery ratio for all node pairs could be estimated. This was repeated five times before, during, and after each experiment, and an average was calculated.
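As a small illustration of the density metric, the sketch below computes the average node degree from a matrix of delivery ratios. It assumes that the metric is the sum of all pairwise delivery ratios divided by the number of nodes, which is what the definitions above suggest; the function name and the matrix layout (row = sender, column = receiver, values between 0 and 1) are choices made for this example.

```python
def average_node_degree(ratios):
    """Average node degree of a delivery-ratio matrix.

    ratios[x][y] is the delivery ratio (0.0 to 1.0) from node x to node y.
    Diagonal entries are ignored, which matches R_{x,x} = 0.
    """
    n = len(ratios)
    total = sum(ratios[x][y] for x in range(n) for y in range(n) if x != y)
    return total / n
```

With 50 nodes and perfect links between all pairs, the function returns 49, the maximum mentioned above.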
Parameters used in the comparisons
Number of nodes
Total flooding packet size: 42 bytes (a)
Total hello packet size
Number of floodings per scenario
T_max
tx_delay(|m|) + D
Hello packet interval: 0.9 to 1.0 s
To compare the flooding protocols, we use the following three measurements in the remainder of this article:
Reachability This measurement evaluates a protocol’s reliability. It is represented by the delivery ratio of a flooding message at all receiving nodes. For example, if there are 50 nodes in a network and a node floods a message using a certain flooding protocol resulting in 42 nodes receiving this message, then we say that the reachability is 42/49=85.71%.
Retransmission This measures the number of retransmitting nodes for flooding a single message. Other messages, such as hello messages are ignored in this measurement. It measures the efficiency of the protocol.
Delay The delay is the length of the time interval from the moment that the source node sends a flooding message until the moment when the last node in the network receives this message. If a message does not reach all nodes, the last node means the last node that received the message.
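The three measurements can be computed from a per-flooding log along the following lines. The log format is an assumption made for this sketch: `rx_times` maps each node that received the message to its first reception time, and `retransmitters` lists the nodes that retransmitted it (the source's initial transmission is assumed not to be listed).

```python
def flooding_metrics(n_nodes, source, send_time, rx_times, retransmitters):
    """Return (reachability, retransmissions, delay) for one flooding.

    rx_times: dict node -> time of first reception (hypothetical log format).
    retransmitters: iterable of nodes that retransmitted the message.
    """
    receivers = {n: t for n, t in rx_times.items() if n != source}
    # Fraction of the other nodes that received the message.
    reachability = len(receivers) / (n_nodes - 1)
    # Number of distinct retransmitting nodes; hello messages are not logged here.
    retransmissions = len(set(retransmitters))
    # Time until the last node that actually received the message got it.
    delay = max(receivers.values()) - send_time if receivers else 0.0
    return reachability, retransmissions, delay
```

For the example in the text, 42 receivers in a 50-node network give a reachability of 42/49.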
For the simulations, we used ns2 version 2.34 and the standard two-ray ground reflection model, which is the default synthetic model used by ns2. The effect of this model is that the network topology becomes a unit disc graph and that links are either perfect or non-existing among other things. There are simulators with better synthetic models, such as Castalia  and MiXiM . However, starting with a very basic model, we will be able to investigate what real-world characteristics affect the performance of our flooding protocols.
To better model the MAC and PHY behavior in our test-bed, we configured the physical layer to mimic the 802.15.4 physical layer and implemented the relevant parts of the IEEE 802.15.4 MAC protocol. Since we only need to simulate broadcasting, we could ignore acknowledgements, retransmission, and most of the complexity of the full version of 802.15.4. Only the header formats and the CCA with backoff were implemented.
An important parameter in flooding is the delay between when a packet arrives at the interface and when it is processed by the flooding algorithm. If this delay is long, it may degrade the flooding performance as we have shown in our earlier works [1, 2]. This is due to timing issues between when a decision to retransmit is taken and when received packets are processed, since these packets may make the node refrain from retransmitting. If the processing is delayed, unnecessary retransmissions may happen. To model this behavior in the simulator, we used the delay parameter in the link layer module (LL) of ns2, which introduces a delay on both incoming and outgoing packets between the MAC level and the higher levels. We used a fixed constant and varied it to find the best result. The value we used in the simulations, unless mentioned otherwise, was 5 ms. See Section 4.1 for more details on how we arrived at that value. Note that for the delay measurement, we used the packet arrival time at the interface, i.e., before the 5-ms LL delay, since this better mimics the test-bed implementation.
In the simulator, we placed the nodes in the same grid topology as in the test-bed experiments. Also here, the distance between neighboring nodes was altered to achieve different network densities. We used the same metric (Equation 1) to measure the network density in the simulator so that we can compare with the results from the measurements.
Due to the perfect simulation environment created by the two-ray ground reflection model (it becomes a unit disc graph), we experienced some strange border effects. To avoid this and also introduce some randomness into the simulations, we also tested random topologies. In the random topology, the nodes were uniformly distributed in a rectangular area corresponding to the grid topology in the test-bed. We used a rectangular area with various sizes but always with the same ratio of 4:9 between the sides. We refer to the former simulator topology as Grid and the latter as Random.
Figure 1a shows the results for CBB. In terms of retransmissions, we can see that the grid topology simulation corresponds almost perfectly with the experimental results despite its very simple model. Further, switching from a grid topology to a random node topology does not significantly affect the results. The same holds for reachability (not shown), which is very high in all three cases. Concerning the delay, we can see in Figure 1c that a similar curve is obtained as in the experiments. Hence, our simple simulator model produces very accurate results for CBB.
In Figure 1b, the same comparison is shown for PFS. Unfortunately, we cannot see a good correlation between the simulation results and the experimental results as we did for CBB. It seems that the simulator is too optimistic concerning the retransmissions of PFS when the network becomes sparser. There can be a number of reasons for this, such as the unrealistic assumptions made by common synthetic models that we listed in Section 1. The two-ray ground reflection model of ns2 makes all of those assumptions. Exactly which assumptions cause these discrepancies is unclear from this experiment. However, in Section 5.1, we will return to this question.
The delay results for PFS are also shown in Figure 1c. In this case, the simulations are fairly close to the observations from the test-bed.
The conclusion of this comparison must be that we need a simulation model that better reflects the real situations that we experience, especially for the number of retransmissions. In this case, the optimistic simulation results for PFS even lead to the wrong conclusion that PFS is better if we only rely on the simulation results. Hence, we have once again demonstrated the shortcomings of relying on too simple simulation models without looking at the accuracy. On the positive side, we can also note that our simulations were much quicker than the test-bed measurements. We could achieve over 20 floodings per second per processor core (Intel Core i7-2677M, Intel, Santa Clara, CA, USA) in the simulator for PFS. In the test-bed, one single flooding took 20 s to complete. Therefore, it is still worthwhile to try to improve simulator accuracy. To do this, we will first identify a better simulation model and then again verify the simulations against test-bed measurements.
In this section, we will explore the possibility of using our network density measurements from the test-bed in the simulator. Hopefully, that will give more realistic results.
The traces that were collected are solely at the packet level, which means that our trace-based model will be somewhat different from traditional models, such as log-distance and log-normal radio propagation models based on SINR and radio sensitivities. Instead, our model is based solely on what the MAC and higher layers experience, i.e., the reception or non-reception of packets.
This information can be used in the radio propagation model. We load the topology information from a given test-bed experiment into the simulator. When node i sends a packet, we decide whether node j can receive the packet or not by drawing a random value uniformly between 0 and 99 and comparing it to the delivery ratio (in percent) in row i, column j of the matrix. If the random value is lower than the delivery ratio, the packet is correctly received; otherwise, it is treated as corrupt. In the simplest model, all the reception events are independent of each other, even the ones from the same transmission.
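The reception decision can be sketched in a few lines. This is a standalone illustration rather than the ns2 error model itself; the function names and the use of Python's `random` module are choices made for this example, and `delivery_pct[i][j]` is assumed to hold the delivery ratio from node i to node j as an integer percentage.

```python
import random


def receives(delivery_pct, i, j, rng=random):
    """Decide whether node j correctly receives a packet sent by node i."""
    draw = rng.randrange(100)          # uniform integer in 0..99
    return draw < delivery_pct[i][j]   # lower than the ratio: received


def broadcast(delivery_pct, sender, rng=random):
    """Return the nodes that receive one broadcast from sender.

    Every receiver is decided by an independent draw, as in the simplest model.
    """
    n = len(delivery_pct)
    return [j for j in range(n)
            if j != sender and receives(delivery_pct, sender, j, rng)]
```

With this rule, a link with a ratio of 0 never delivers and a link with a ratio of 100 always delivers.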
In ns2, we implemented this by creating a new ErrorModel. We kept the rest of the simulation model the same to keep as many factors the same as possible in the simulator. We even kept the two-ray ground reflection model, but disabled its main function by placing all nodes sufficiently close to each other so that they were within each other’s coverage area. This meant that packet errors were only created by the error model and that all nodes were within each other’s interference and carrier sensing range. Hence, no hidden terminal problems would be present in the simulator, for instance. However, it is not likely that there were any significant hidden terminal problems in the test-bed either, since all our measurements were done in a confined area where all nodes would be able to sense each others’ transmissions but obviously not always correctly receiving all transmissions.
There are a number of aspects that may influence the performance of the protocols and therefore need some attention. The following three aspects differ between the trace-based model and the two-ray ground reflection model and may have a significant impact on the results:
Links without in-between qualities In the two-ray ground reflection model, links are binary, i.e., either perfect or non-existing, but never in between. In the trace-based model, the links can have in-between qualities, which is reflected in that the link qualities are expressed as a percentage. To make a trace into binary, we can replace the delivery ratios with either 0% or 100% based on whether that link has a higher or lower packet delivery ratio than a certain threshold. To maintain the network sparseness, the threshold should be set in such a way that the average node degree (based on Equation 1) remains as unchanged as possible.
Symmetric links In the trace-based model, the link quality in the different directions may be different. To make the trace-based model symmetric, we can take the traces and set the delivery ratio in both directions to the same, namely to the average of the delivery ratios in the two directions.
Topology If the links of the trace-based model are made both binary and symmetric, then the main difference between the trace-based model and the two-ray ground reflection model would be the topology. The two-ray ground reflection model would have a unit disc graph topology, while the topology of the trace-based model is much more random, with the existence of some long links and some short links missing. Hence, the topology is different, and this is also expected to have an effect on the performance.
The different protocols may have difficulties in handling some of these aspects depending on whether the aspects are present or not. In Section 4.3 and also Section 5.1, we will study these aspects, by modifying the trace data so that the model becomes binary and/or symmetric.
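The two trace modifications described above can be sketched as follows. The threshold search is one possible way to keep the average node degree as unchanged as possible (it tries every observed delivery ratio as a threshold); the source does not specify how the threshold was found, so this part is an assumption. Ratios are in percent, and all function names are hypothetical.

```python
def avg_degree(trace):
    """Average node degree of a percentage trace (see the density metric)."""
    n = len(trace)
    total = sum(trace[x][y] for x in range(n) for y in range(n) if x != y)
    return total / (100.0 * n)


def symmetrize(trace):
    """Give both directions of a link the mean of their two delivery ratios."""
    n = len(trace)
    return [[0 if x == y else (trace[x][y] + trace[y][x]) / 2
             for y in range(n)] for x in range(n)]


def binarize(trace):
    """Map every link to 0% or 100%, keeping the average degree close."""
    n = len(trace)
    target = avg_degree(trace)

    def apply(threshold):
        return [[100 if x != y and trace[x][y] >= threshold else 0
                 for y in range(n)] for x in range(n)]

    # Candidate thresholds: every non-zero ratio, plus 101 (no links at all).
    candidates = sorted({trace[x][y] for x in range(n) for y in range(n)
                         if x != y and trace[x][y] > 0}) + [101]
    best = min(candidates, key=lambda t: abs(avg_degree(apply(t)) - target))
    return apply(best)
```

Applying `symmetrize` leaves the average node degree unchanged, because the sum over both directions of each link is preserved.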
In addition to these three aspects, there are two more aspects that need some attention and that are not included in the basic trace-based model, namely short-term effects and the effects of different packet sizes. Using a synthetic approach, they can be included in the trace-based model as well. The two are discussed in the following two subsections, and their impact on the simulation results is studied in Section 4.2.
By studying this, we will know what aspects are important to model in a simulator in order to get good simulation results. Furthermore, we can tell how the different protocols respond to the different aspects and predict how they will perform in networks where these aspects are present or not.
Our trace information tells us how each of the links behaves in the long term but does not tell us anything about the short-term behavior. We know that if one packet is lost, the probability that a packet transmitted shortly afterward will also be lost is higher, i.e., a wireless link does not have the memoryless property. A simple way to model this is to use a two-state continuous-time Markov model, i.e., to assume that a link has two states: a good state and a bad state. Packets that start their transmission during a good state are delivered, while packets that start their transmission during a bad state are lost. Each link has its own state machine and changes between the states over time. The time the link spends in one state before changing to the other is exponentially distributed, and the ratio between the two mean waiting times is determined by the long-term packet delivery ratio of the link.
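A minimal sketch of such a two-state link is given below. It assumes that the mean time in the good state and the mean time in the bad state are split in proportion to the long-term delivery ratio; `mean_cycle` (the sum of the two mean holding times) is a hypothetical timescale parameter introduced for this example and is not a parameter taken from the article.

```python
import random


class TwoStateLink:
    """Link that alternates between a good and a bad state over time."""

    def __init__(self, delivery_ratio, mean_cycle, rng=None):
        self.rng = rng if rng is not None else random.Random()
        # Mean holding times; their ratio is r / (1 - r).
        self.mean_good = delivery_ratio * mean_cycle
        self.mean_bad = (1.0 - delivery_ratio) * mean_cycle
        # Start in the good state with the long-term probability.
        self.good = self.rng.random() < delivery_ratio
        if 0.0 < delivery_ratio < 1.0:
            self.next_change = self._holding_time()
        else:
            self.next_change = float("inf")   # degenerate link never changes

    def _holding_time(self):
        mean = self.mean_good if self.good else self.mean_bad
        return self.rng.expovariate(1.0 / mean)

    def delivered(self, t):
        """True if a packet starting at time t is delivered.

        Calls must use non-decreasing values of t.
        """
        while t >= self.next_change:
            self.good = not self.good
            self.next_change += self._holding_time()
        return self.good
```

Over a long run, the fraction of time spent in the good state approaches the delivery ratio, while packets sent close together in time tend to share the same fate.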
To make the model even more accurate, we could also introduce packet loss correlation between nearby links and/or nodes. For instance, an interferer would affect any link going to the same node and also links going to nearby nodes. However, as we will show later in this article, it is not necessary to take this much detail into consideration when defining our simulation model. Therefore, we will not attempt to make our model take this into account, as it would also make the model unnecessarily complex.
Another aspect we could introduce is the fact that longer packets are more prone to packet errors than shorter packets. There are simply more bits in the packets that can be erroneously received, and this effect can sometimes be substantial. Unfortunately, we did not collect this information from the test-bed. The packet size used in the network density measurements was the same as the hello packets used by PFS (19 bytes including headers). To find the actual packet loss for the flooding packets (42 bytes including headers), we need to model this and extrapolate an estimate for the actual packet loss for flooding packets.
To do this, we assume that a packet is correctly received with probability $p_L = p_0 p^L$, where $L$ is the packet length in bits, $1-p$ can be considered the bit error rate (BER), and $p_0$ is a size-independent delivery probability component that covers effects such as the receiver failing to synchronize to the preamble and collisions. Note that $p$ and $p_0$ can be uncorrelated with each other. This model is in line with the common way of translating between BER and packet error rate (PER) in wireless communication research (e.g., [17, p. 215]).
Note that this correction cannot be combined with our link model when c>0.
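To illustrate how the length model $p_L = p_0 p^L$ can be used for extrapolation, the sketch below derives the delivery probability of a long packet from the measured probability of a short one. Solving the model for the short packet gives $p = (p_{L_1}/p_0)^{1/L_1}$, and hence $p_{L_2} = p_0 (p_{L_1}/p_0)^{L_2/L_1}$. The value of `p0` must be supplied by the user; the default of 1.0 (no size-independent loss) is purely an assumption for this example, and the function is not necessarily the exact correction used in the article.

```python
def extrapolate_delivery(p_short, bits_short, bits_long, p0=1.0):
    """Estimate the delivery probability of a longer packet.

    p_short: measured delivery probability for packets of bits_short bits.
    p0: assumed size-independent delivery component (hypothetical default).
    """
    if p_short <= 0.0:
        return 0.0
    if p_short > p0:
        raise ValueError("p_short cannot exceed p0 under this model")
    per_bit = (p_short / p0) ** (1.0 / bits_short)   # per-bit success probability p
    return p0 * per_bit ** bits_long
```

For example, `extrapolate_delivery(0.9, 19 * 8, 42 * 8)` estimates the delivery probability of a 42-byte flooding packet on a link where 19-byte hello packets are delivered 90% of the time.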
We again simulated the six deployment scenarios (with the varying network densities) but with our improved trace-based propagation model and compared with the test-bed measurements. Before comparing the results, we need to find the parameters of our model that give the best accuracy. This includes the LL delay parameter, the c parameter, and the a parameter. For clarity, we only show three representative deployment scenarios for this tuning, namely the fully connected scenario, the 0.6-m scenario, and the 2.0-m scenario.
We can see that for reachability, the LL delay value has very little influence and no clear trend. Hence, we can safely ignore the reachability when deciding the LL delay. For the other two measurements, we clearly see that 5 ms is the best for delay and 6 ms is best for retransmissions. It should therefore be reasonable to select an LL delay of 5 ms. This is also in line with what can be expected from the actual processing delay of our sensor motes.
In some deployment scenarios and with some measurements, a bigger c improves the accuracy, while in others, it decreases it. To better see if there are any real improvements, we again calculated the errors in the same way as we did for the LL delay parameter. The results are shown in Figure 6d for PFS. As before, it is not possible to compare the different curves with each other as they have different scales. From the results, we can see that the error goes down for c>0 only for reachability. Hence, it is not clear that this is an enhancement. However, this is actually good news since we can work with a simpler model.
The reason that the links appear as memoryless in our model may be due to the fairly large intervals between transmissions in our test-bed experiments. In other experiments, where packets are sent closer to each other or even back-to-back, the results could be different. However, for flooding protocols, the links can just be assumed to behave as memoryless.
The results for the packet size effect are similar to those for the short-term packet loss effects. We can see that retransmissions get an improved accuracy as we increase a, but other measurements are unaffected or degrading. Also for CBB, the errors point in different directions when we vary c. Hence, it is not clear that using the packet size effect and a>0 is an improvement, and if there is an improvement, it is going to be minor.
We can conclude that neither c>0 nor a>0 yields any clear improvement, allowing us to stick to the original and simpler model. For the packet size effect, it might still be possible to find an improvement if we had access to the actual packet loss of large flooding packets instead of deducing it from the small packet loss using a synthetic approach.
In Table 2, we quantify the errors of the two simulation models relative to the test-bed. For each protocol, deployment, and measurement type, we calculated the absolute error. In Table 2, we show the average of those values and compare the two-ray ground model with the trace-based model. To further validate the new model, we also included counter-based PFS (CB-PFS) in the comparison. CB-PFS is a combination of CBB and PFS that we introduced in . CB-PFS works exactly as PFS but with a counter like CBB. A node using CB-PFS can refrain from retransmitting either due to self-pruning according to PFS or due to the counter exceeding the threshold according to CBB. In Section 5.2, we show how CB-PFS performs in comparison with the other protocols.
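The averaging described above can be sketched as follows for one measurement type. The dictionary layout, keyed by (protocol, deployment), is an assumption made for this example; the article does not describe how the values were stored.

```python
def mean_absolute_error(simulated, measured):
    """Mean absolute error over all (protocol, deployment) pairs present in both.

    simulated, measured: dicts mapping (protocol, deployment) to a value of
    one measurement type, e.g., the number of retransmissions.
    """
    keys = simulated.keys() & measured.keys()
    if not keys:
        raise ValueError("no common (protocol, deployment) pairs")
    return sum(abs(simulated[k] - measured[k]) for k in keys) / len(keys)
```

Calling the function once per measurement type and per simulation model would give one row of averages for each model.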
Mean absolute errors for different simulation methods
Given the variation of considered flooding protocols and topologies, we expect the presented numbers to be generalizable for comparable scenarios. However, similar validation has to be done for scenarios not involving flooding protocols or different types of network deployments.
Now that we have improved the accuracy of our simulator, it makes sense to use the model and, for instance, simulate other flooding protocols for performance comparison. With the more accurate simulation model, we should get results close to a real network, and our conclusions should become more reliable compared to earlier simulation attempts. Since we have the error CDFs, we can use them to find the confidence intervals of our results and hence would know when statistically certain comparisons can be made.
In the remainder of this section, we will first look at how PFS performs under different assumptions. Then, we will extend the flooding performance study with two more flooding protocols.
In this first use of the trace-based model, we validate PFS itself and see how it behaves under different assumptions, that is, which real-world effects influence the performance of PFS. To do that, we modified the traces as described in Section 3.2. The results when all links are made binary are shown in Figure 8 (the curves marked Binary). We can see that the binary aspect does not affect the results very much. Also, when we make the links symmetric or both symmetric and binary (neither shown), we cannot see a big difference in performance.
The conclusion from this must be that PFS is able to handle in-between links very well and that the main influencing factor compared to the original ns2 simulations is actually the topology of the network.
In this subsection, we choose to simulate the other two flooding protocols that we tried in  but did not implement in the test-bed, namely, scalable broadcasting algorithm (SBA) and ad hoc broadcast protocol (AHBP). We will compare them with CBB, PFS, and CB-PFS.
SBA was proposed by Wei Peng and Xi-Cheng Lu in . This protocol is similar to PFS but requires two-hop hello messages and designs the RAD differently. The RAD is a uniformly distributed delay between 0 and a function of the highest node degree of its neighbors divided by its own node degree. This means that nodes with more neighbors are more likely to retransmit faster, and this should make the neighbor elimination more efficient.
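The delay selection can be sketched as below. The text only states that the upper bound is a function of the ratio between the highest neighbor degree and the node's own degree; using the plain ratio scaled by a constant `c` (in seconds) is an assumption made for this illustration, and both function names are hypothetical.

```python
import random


def sba_max_delay(own_degree, neighbor_degrees, c=0.150):
    """Upper bound of the delay: c times (highest neighbor degree / own degree).

    The exact function is an assumption; c is a scaling constant in seconds.
    """
    highest = max(neighbor_degrees, default=own_degree)
    return c * highest / max(own_degree, 1)


def sba_rad(own_degree, neighbor_degrees, c=0.150, rng=random):
    """Draw a delay uniformly between 0 and the upper bound."""
    return rng.uniform(0.0, sba_max_delay(own_degree, neighbor_degrees, c))
```

A node with many neighbors relative to its neighborhood gets a small upper bound and therefore tends to retransmit early, which gives its neighbors the chance to cancel their own retransmissions.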
AHBP  also requires two-hop hello messages. Unlike PFS and SBA, it is the sender that decides which of its neighbors should retransmit a flooding message. This decision is based on the two-hop neighbor information that the node has. The selected neighbors are called broadcast relay gateways (BRGs) and are listed in the header of the flooding message. The BRGs are selected in such a way that if they retransmit the flooding message, they will together cover all two-hop neighbors of the sender. This should guarantee that all connected nodes in the network will receive the flooding message as long as the two-hop neighbor information is accurate. To decide the BRGs, a greedy algorithm is used that tries to minimize the number of BRGs but still cover all two-hop neighbors.
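The BRG selection can be illustrated with a plain greedy set cover, shown below. This is a generic stand-in and not necessarily the exact selection rule of AHBP; the function name and the `neighbors` structure (a dict from node to the set of its one-hop neighbors, as learned from two-hop hello messages) are assumptions made for this example.

```python
def select_brgs(sender, neighbors):
    """Greedily pick one-hop neighbors that together cover all two-hop neighbors.

    neighbors: dict mapping each node to the set of its one-hop neighbors.
    Returns the selected relay nodes in the order they were chosen.
    """
    one_hop = set(neighbors[sender])
    # Two-hop neighbors: reachable through a neighbor but not directly.
    uncovered = set().union(*(neighbors[n] for n in one_hop)) - one_hop - {sender}
    brgs = []
    while uncovered:
        # Pick the neighbor that covers most uncovered nodes (lowest id on ties).
        best = max(sorted(one_hop), key=lambda n: len(neighbors[n] & uncovered))
        gain = neighbors[best] & uncovered
        if not gain:
            break
        brgs.append(best)
        uncovered -= gain
    return brgs
```

The selection is only as good as the neighbor information: if a listed link is asymmetric or has a low delivery ratio, a chosen BRG may never receive the message, and the nodes it was supposed to cover are missed.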
Previous research [19, 20] has shown that AHBP performs well in static networks. However, as mobility increases, the two-hop neighbor information becomes inaccurate, the wrong BRGs are selected, and reachability decreases. To better cope with outdated neighbor information, the authors propose an extension to AHBP: a node that receives a flooding message from an unknown neighbor assigns itself as a BRG and retransmits the message. The extension somewhat increases the number of retransmissions but, more importantly, increases reachability, especially under mobility. It is an important extension because AHBP otherwise tends to reduce the number of retransmitting nodes a bit too much. We therefore used this extension in our AHBP simulations.
Before comparing SBA and AHBP with the other protocols, we needed to tune their parameters. For SBA, we needed to set a constant C, a scalar that is multiplied with the slot length. We tried different values and looked for a trade-off point beyond which a longer RAD does not decrease retransmissions much further. We found C=150 ms to be a good value. Making it bigger would only increase the end-to-end delay without any significant improvement in reachability or retransmissions. For AHBP, only the maximum jitter needed to be set. However, it had no visible impact on the results. We used 10 ms.
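The receiver-side decision, including the extension, can be summarized in a few lines of Python. The function and parameter names are hypothetical, and dropping already-seen messages is an assumption of this sketch rather than something stated in the text.

```python
def ahbp_should_retransmit(me, sender, brg_list, known_neighbors, already_seen):
    """Decide whether node `me` retransmits a received flooding message."""
    if already_seen:
        return False  # assumed duplicate suppression
    if me in brg_list:
        return True   # the sender selected this node as a BRG
    # Extension: a message from an unknown neighbor means the neighbor
    # information is outdated, so the node assigns itself as a BRG.
    return sender not in known_neighbors


selected = ahbp_should_retransmit("n1", "s", ["n1", "n2"], {"s"}, already_seen=False)
not_selected = ahbp_should_retransmit("n3", "s", ["n1", "n2"], {"s"}, already_seen=False)
unknown_sender = ahbp_should_retransmit("n3", "s", ["n1", "n2"], set(), already_seen=False)
```

Only the last case differs from plain AHBP: without the extension, `n3` would stay silent even though the sender could not have known which nodes `n3` covers.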
From Figure 10a, we see that AHBP underperforms compared to the other protocols in terms of reachability, except in the densest networks. One reason is AHBP’s sensitivity to links with in-between quality. Another, equally important reason is AHBP’s problem in dealing with asymmetric links. We verified this by modifying the traces as we did in Section 5.1. When we made the links symmetric, reachability increased significantly (not shown); nearly half of the packet losses can be explained by asymmetric links. When we made the links both binary and symmetric, AHBP achieved near-perfect reachability (not shown). Only the rare occurrence of packet collisions kept its reachability slightly below perfect.
When we look at retransmissions, shown in Figure 10b, we find that AHBP performs best in most cases and similarly to CB-PFS in the sparsest networks. This explains why AHBP has problems with links of in-between quality and with asymmetric links: it has very little redundancy and suffers from any error in the neighborhood information, while PFS and SBA both have enough redundancy to make up for such errors. It must also be said that part of AHBP’s low number of retransmissions (and low delay) is due to its lower reachability, i.e., AHBP gets some of its good values in Figure 10b,c because fewer nodes are reached.
All neighbor knowledge-based protocols have problems with reachability in the densest deployment scenario, which may seem counterintuitive. The reason is that the scenario is nearly, but not quite, fully connected. The problem is most pronounced for PFS and CB-PFS, since they use a different hello protocol in which a link is said to exist if at least one of the last three hello messages was received. This fools the pruning algorithm into believing that the network is fully connected and that no retransmissions are needed. We kept the standard hello protocol (the reception of the last hello packet determines the existence of the link) for AHBP and SBA, which is why they have higher reachability than PFS: they are less likely to believe the network is fully connected.
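The difference between the two hello protocols can be shown with a minimal Python sketch. The class name and interface are hypothetical; only the two rules (at least one of the last three hellos versus the last hello only) come from the text.

```python
from collections import deque


class HelloLinkMonitor:
    """Tracks whether recent hello messages from one neighbor were received."""

    def __init__(self, window):
        # window=3: link exists if any of the last three hellos was received.
        # window=1: standard rule, only the last hello counts.
        self.history = deque(maxlen=window)

    def record(self, received):
        self.history.append(bool(received))

    def link_exists(self):
        return any(self.history)


pfs_rule = HelloLinkMonitor(window=3)
standard_rule = HelloLinkMonitor(window=1)

# A weak link: one hello gets through, the next two are lost.
for received in [True, False, False]:
    pfs_rule.record(received)
    standard_rule.record(received)

weak_link_seen_by_pfs = pfs_rule.link_exists()
weak_link_seen_by_standard = standard_rule.link_exists()
```

The three-hello rule still reports the weak link as present, which is how a nearly fully connected network can look fully connected to the pruning algorithm.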
In general, we can conclude that all protocols have very good reachability, except AHBP in the sparse deployment scenario and PFS/CB-PFS in the nearly fully connected scenario. On the other hand, AHBP has both the lowest retransmissions and end-to-end delay. CB-PFS has very few retransmissions but still very high reachability. The strength of SBA is its low end-to-end delay while still maintaining a good reachability. Finally, CBB has the best reachability and still reasonable retransmissions and end-to-end delay. Hence, different protocols have different strengths, and the best choice will depend on the application requirements.
Comparisons of flooding protocols have been done before. An early study was done by Williams et al. , which was purely based on ns2 simulations using the standard two-ray ground reflection model. We conducted a similar study in the paper where we introduced PFS . Others have used analytical modeling to study and understand the behavior of some flooding protocols. For example, Shah-Mansouri et al.  modeled CBB. A handful of test-bed measurement studies of flooding protocols have also been done. Most of them measured blind flooding or a variant thereof. Examples include [22–24].
Trace-based simulations have also been proposed and used by others. One of the first to use traces in wireless simulation was Nguyen et al. . They collected traces between two hosts using WaveLAN and used them in a trace-based simulation. However, the trace-based simulation was used as the reference to validate some synthetic radio propagation models. Only a single link was investigated.
A more recent study by Alan Marchiori et al.  was done on networking aspects of WSNs. They collected traces using USB-connected WSN motes and implemented their own WSN simulator on top of SimPy. They compared simulations based directly on the traces with synthetic radio propagation models based on the collected SINR measurements from the same traces. Finally, they studied the performance of the collection tree protocol, a simple and popular data collection protocol for WSNs, and compared the results with test-bed measurements. Their findings are in line with our findings. Only one network protocol was studied, and comparisons with the more common two-ray ground reflection model are missing.
Kotz et al.  evaluated common MANET simulation simplifications and their effect on routing protocol performance. Their work is similar to ours in the sense that they tried to quantify the loss in accuracy caused by non-realistic assumptions, such as symmetric links, links without in-between qualities, unit-disc transmission range, etc. However, they only used synthetic radio propagation models.
Another paper with similarities to ours is by Pham et al. . They also compared simulators and test-bed measurements for a WSN broadcasting protocol. However, they tried to find a synthetic model to use in the simulator, which was not entirely successful according to the authors.
Also, Halkes and Langendoen  did work similar to ours by comparing WSN protocols in a real test-bed with trace-based simulations. However, their work focuses on the MAC protocol and is therefore more about high-intensity traffic with many collisions. They test two ways of doing trace-based simulations. The first one uses packet traces but converts them into a binary reception model by removing all links without near-perfect reception. As can be understood from this paper and observed in their paper, this leads to accuracy problems. The second one is based on SNR traces and performs better, but this is mainly due to better modeling of collisions, interference, carrier sense, etc., effects that we have very little of in our setup.
The main drawback of any trace-based approach is that the study is limited to the networks at hand. It is hard to generalize the results to other deployments. However, a growing number of test-bed traces are being collected and published by others (e.g., [26, 27]). Given that the right information is collected, we could continue this work using traces from different deployments. It is also possible to collect traces from the actual deployment of interest and use them in off-line simulations similar to this study to determine the best protocols and parameters for that particular deployment.
In this article, we compared the performance of flooding protocols for wireless multi-hop networks, in particular, MANETs and WSNs. In earlier works, we first simulated in the standard simulation package ns2 and then tested three of the protocols in a real wireless test-bed. We noted discrepancies not only in the measurement results but also in the conclusions from the studies and wanted to investigate how these discrepancies could be avoided. To achieve this, we used a trace-based simulation model, which has the flexibility and ease of the simulator but, hopefully, also the accuracy needed for correct performance comparisons. In this article, we studied different approaches on how to use collected trace data and quantified the accuracy of the achieved model. We showed that accuracy for our study of flooding protocols could be improved by fairly simple simulation models. Finally, we compared the same protocols as we did in our first simulations and demonstrated new insights into the performance of the different protocols. By modifying the trace data, we could also study which real-world effects influence the protocols the most and found that the network topology was the biggest contributor to the errors of the original two-ray ground propagation model.
This work is based on results funded in part by the Commission of the European Union under the project IST MAGNET, by the Dutch Ministry of Economical Affairs under the Freeband PNP2008 project, and by the Swedish Foundation for Strategic Research (SSF) under the project ProFuN: A Programming Platform for Future Wireless Sensor Networks. We would also like to thank Laura Feeney for her valuable feedback on the earlier version of this article.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.