Performance evaluation of data aggregation for cluster-based wireless sensor network
© Sinha and Lobiyal; licensee Springer. 2013
Received: 1 February 2013
Accepted: 29 July 2013
Published: 9 August 2013
In wireless sensor networks, data fusion is considered an essential process for preserving sensor energy. Periodic data sampling produces an enormous volume of raw readings, the transmission of which would rapidly deplete sensor power. In this paper, we perform data aggregation on the basis of the entropy of the sensors. The entropy is computed from the proposed local and global probability models, which help in extracting high-precision data from the sensor nodes. We also propose an energy-efficient method for clustering the nodes in the network. Initially, sensors sensing the same category of data are placed within a distinct cluster. The remaining unclustered sensors estimate their divergence with respect to the clustered neighbors and ultimately join the least-divergent cluster. The overall performance of the proposed methods is evaluated using the NS-2 simulator in terms of convergence rate, aggregation cycles, average packet drops, transmission cost and network lifetime. The simulation results establish the validity and efficiency of our approach.
Keywords: Wireless sensor network; Divergence clustering; Entropy-based data aggregation; Local and global aggregation
The wireless sensor network (WSN) has attracted considerable research interest because of its presence in many applications, including environmental monitoring, wildlife exploration, medical supervision and battlefield surveillance. A sensor network is formed from small, self-configuring electronic devices that are either randomly deployed or manually positioned in large numbers. It performs activities in several dimensions, for instance identifying the neighborhood, detecting the presence of targets or monitoring environmental factors (motion, temperature, humidity, sound and other physical variables). However, owing to limited battery power, sensor networks demand energy-efficient solutions to sustain their performance.
Energy consumption, being the most visible challenge, is central to sensor network research. Data processing, memory accesses and input/output operations all consume sensor energy. However, the major power drain is caused by wireless communication. Therefore, as much in-network processing as possible should be performed within a sensor or a group of sensors (cluster). This is achieved by aggregating and filtering raw data before transmitting them to their destinations. As a result, redundancy in the recorded sensory samples is eliminated, which reduces the transmission cost and network overloading. Moreover, a decrease in the effective number of packet transmissions lowers the chance of network congestion and so avoids excess energy consumption in the network. For instance, if the radio electronics require 50 nJ/bit and the amplifier circuitry needs 10 pJ/bit/m² for communication, then transmitting 1 bit of information to a processing center situated 1 km away consumes 50 nJ + 10 pJ/m² × (1000 m)² = 1.005 × 10⁴ nJ per unit time. In contrast, the energy used in data processing for aggregation is 5 nJ/bit/signal, which implies that executing almost 2010 instructions consumes the same energy as one transmission in unit time. Therefore, applying aggregation techniques is strongly advisable. Previous research has already shown that the in-network processing cost is much lower than the communication cost [4–14].
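The arithmetic in the example above can be checked with a short script. This is a minimal sketch that assumes the common first-order radio model, in which sending one bit over a distance d costs the electronics energy plus the amplifier energy scaled by d². The constant and function names are illustrative and are not taken from the paper.

```python
# Sketch of the energy comparison, assuming the first-order radio model:
#   E_tx(1 bit, d) = E_elec + eps_amp * d**2
# All names below are illustrative.

E_ELEC_NJ = 50.0        # radio electronics, nJ per bit
EPS_AMP_NJ = 10.0e-3    # amplifier, 10 pJ/bit/m^2 expressed in nJ
E_PROC_NJ = 5.0         # processing cost quoted for aggregation, nJ


def tx_energy_nj(distance_m: float) -> float:
    """Energy in nJ needed to transmit one bit over distance_m metres."""
    return E_ELEC_NJ + EPS_AMP_NJ * distance_m ** 2


e_tx = tx_energy_nj(1000.0)        # one bit sent to a sink 1 km away
ops_per_tx = e_tx / E_PROC_NJ      # processing operations of equal cost
print(f"{e_tx:.0f} nJ per bit, about {ops_per_tx:.0f} processing operations")
```

Because the amplifier term grows with the square of the distance, the ratio between transmission and processing cost shrinks quickly for shorter links, which is one reason aggregation at a nearby cluster head is attractive.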
The proliferation of sensor networks has created a need for novel data aggregation ideas. However, aggregation schemes require efficient clustering protocols in order to function well. Hence, in this paper we contribute a divergence-measure based clustering protocol, together with entropy-based data aggregation, to ongoing sensor network research. The remainder of this paper is organized as follows: a brief review of previous research in the related field is included in section 2. Our proposed clustering technique based on a divergence measure is provided in section 3. In section 4, the proposed fuzzy-entropy based aggregation scheme is elaborated. The analysis of the network diagram is presented in section 5. Section 6 shows the performance evaluation of our proposed method. Finally, the paper is concluded in section 7 along with directions for future work.
Energy consumption in wireless sensor networks has drawn considerable attention from researchers seeking to increase network lifetime. The sensor network is considered promising in terms of dynamism and diversity in everyday applications. Several resource-efficient protocols have been introduced in order to limit sensor energy usage while maintaining a sufficient degree of reliability and throughput.
Several methods of data aggregation depend on the topology of the sensor network. For instance, a tree-based data aggregation protocol constructs a simple topology based on a parent and child association. However, large transmission delays and a poor rate of aggregation make it unsuitable for dynamic applications. Further, there are centralized aggregation protocols, in which aggregation is done only at the sink (data processing center). As a result, such protocols lead to heavy workload and unnecessary packet drops. There are other clustering schemes based on static [18–20] and dynamic cluster aggregation [21–23]. In a static environment, the clusters are formed in the initial stage and the aggregation is carried out by the cluster heads. The clusters, once formed, remain unchanged throughout the network lifespan. This procedure is suitable for area monitoring (recording earthquakes, temperature, humidity, etc.), but it does not support a wide range of applications such as forest fire supervision, wildlife monitoring and target tracking. Therefore, most research attention is directed at dynamic cluster aggregation schemes, where clusters are formed dynamically and updated on sensing environmental parameters, followed by aggregation at the cluster head. The clusters formed in this case are also known as adaptive clusters.
An energy-aware algorithm has been provided in [4] for constructing an aggregation tree prior to data transmission. The algorithm takes both the energy and distance parameters into account when constructing the tree. In another study [5], the authors have performed aggregation by considering the entropy of correlated data transmitted by the source nodes. This procedure reduces the amount of redundant data forwarded to the sink. Furthermore, the estimation of the joint entropy of the correlated data set helps in maximizing information integrity. Another interesting aggregation protocol is developed in [6] on the basis of wavelet-entropy. Initially, multi-scale wavelet transforms are used to spread signals over a multi-scale range, after which information is aggregated using the wavelet-entropy discriminance theorem. Simulation results indicate that the proposed method is capable of extending the lifetime of networks to a much greater degree than the Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol [7]. In [8], the authors have put forward a novel approach that focuses on data aggregation with significantly reduced aggregation latency. A collision-free schedule is generated by a distributed algorithm for performing data aggregation in a wireless sensor network. The time latency of the aggregation schedule is minimized using a greedy strategy.
In a recent study [9], an aggregation scheme called smart aggregation is developed for continuous monitoring in sensor networks. The proposed technique maintains a tolerable deviation (a bounded error) in the aggregated data while utilizing the spatio-temporal correlation of data. In a subsequent work [10], data aggregation techniques are designed on the basis of statistical information extraction. The applied methods exhibit bounded message overhead and robustness against link failures. The expectation–maximization (EM) algorithm is used in order to accomplish accurate estimation of the distribution parameters of sensory data. The experimental outcome confirms reduced network communication cost even in large-scale sensor networks. In a more recent publication [11], the authors have presented an α-local spatial clustering algorithm along with a data aggregation mechanism. The contribution was mainly made for environmental surveillance applications in high-density sensor networks. The aggregation algorithm constructs a dominating set by exploiting the spatial correlation between data measured by different sensors. The dominating set is further considered as a network backbone to execute data aggregation on the basis of information summarization of the dominator nodes. Another study [12] proposed cooperative information aggregation (CIA) mechanisms to handle observation noise and communication errors initially found in the sampled data. Moreover, the authors have designed an Aggregation Hard Decision Estimator (AHDE) and an Aggregation Maximum-Likelihood Estimator (AMLE). Simulation shows that the CIA schemes can be suitably applied to environments prone to observation noise.
In this paper, we propose a dynamic clustering and aggregation strategy that aggregates data at both the sensor node and the cluster head. With the use of entropy and information theory, we attempt to reduce the transmission and processing cost while maintaining the relevance of the aggregated data. To evaluate the performance of our proposed strategy, we make a comparative analysis with two well-known clustering protocols: Hybrid Energy-Efficient Distributed Clustering (HEED) [13] and an inference clustering protocol based on Belief Propagation (BP) [14]. HEED is a distributed clustering approach that operates in an energy-efficient manner and helps in prolonging network lifetime. It is scalable over large network sizes and performs load balancing within clusters. However, frequent computation of communication cost and broadcasting among neighbors degrades its performance. As a strong counterpart, the BP clustering method offers energy-effective solutions based on belief calculations with potential functions. Although BP performs better than HEED in terms of clustering the network and packet delivery, its long messages induce larger overheads in message passing. This makes the transmission cost higher in the case of BP. Previous simulations have shown a marginal difference in the network lifetimes achieved by these protocols.
Proposed divergence measure based clustering technique
Clustering is the process of assigning a set of sensor nodes, with similar attributes, to a specified group or cluster. In our research, we have proposed a new energy efficient clustering algorithm that operates in two phases: preliminary and final clustering phase. In preliminary phase, sensor nodes sensing the same category of data are placed in a distinct cluster. In final phase, the remaining unclustered sensors estimate their divergence with respect to the clustered neighbors and ultimately join the least-divergent cluster.
Preliminary clustering phase
The sensors use the window function to map the data into one of the formats. All the nodes within 1-hop distance that sense the same format group together to form a preliminary cluster. In the initial phase, the node with maximum energy within the preliminary cluster is appointed as the cluster head. It maintains a duration timer to keep track of the period for which it has remained cluster head. Once appointed, the node functions as cluster head until its duration timer expires. On the expiration of the timer, the role of cluster head rotates to other eligible nodes whose residual energy is above a minimum predefined energy threshold. The head rotation performs load balancing within the clusters. Moreover, the cluster head assigns a unique cluster id to all the cluster members.
Although the preliminary stage of cluster formation is simple to implement, in some situations (boundary value or out-of-bound data sensing) a few nodes in the network might still remain unclustered. This problem is solved by our final clustering phase.
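The appointment and rotation of the cluster head described above can be sketched as follows. This is only an illustration under stated assumptions: the `Node` type, its fields and the threshold value are hypothetical, and choosing the highest-energy eligible node on rotation is an assumption, since the text only requires the candidate's residual energy to exceed the threshold.

```python
# Sketch of cluster-head appointment and rotation inside one preliminary
# cluster. Node, its fields and the numeric values are hypothetical.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: int
    data_format: int    # index of the sensed data format
    energy: float       # residual energy, arbitrary units


def appoint_head(members: List[Node]) -> Node:
    """Initial appointment: the member with maximum residual energy."""
    return max(members, key=lambda n: n.energy)


def rotate_head(members: List[Node], current: Node,
                min_energy: float) -> Optional[Node]:
    """Called when the duration timer expires.

    Returns another member whose residual energy exceeds min_energy
    (the highest-energy one, by assumption), or None if nobody qualifies.
    """
    eligible = [n for n in members
                if n.node_id != current.node_id and n.energy > min_energy]
    return max(eligible, key=lambda n: n.energy) if eligible else None


# Three 1-hop neighbours that sensed the same data format.
cluster = [Node(1, 3, 0.9), Node(2, 3, 1.4), Node(3, 3, 1.1)]
head = appoint_head(cluster)
head.energy -= 0.6                  # energy spent while serving as head
next_head = rotate_head(cluster, head, min_energy=0.5)
```

Returning `None` when no member qualifies leaves the policy for that case (for example, keeping the current head) to the caller, because the text does not specify it.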
Final clustering phase
The final clustering phase ensures that all the nodes in the sensor network get clustered. The process begins with an unclustered node discovering one or more clustered neighbors within one hop. The node then obtains the array of probabilities of the sensed data from its neighbors that are distinctly clustered. This procedure is further elaborated in the following section.
where $p_i^s$ is the probability of the $i$th data format from the sensor $s$ and the probability sequence is denoted by $P_s$.
Selection of divergence method
Application of divergence measure
A divergence measure quantifies the degree of dissimilarity between two objects. In our clustering process, an unclustered node uses the divergence measure to analyze the extent to which it differs from each of its clustered neighbors and eventually decides to join the cluster that exhibits maximum similarity (minimum divergence). Consequently, clusters formed by the end of the final clustering phase are likely to be highly correlated. For simulation purposes, we have employed Jeffrey's divergence measure owing to its symmetric nature.
Here the J-divergence is measured between the clustered node and the $s$th sensor node to be clustered.
There can be two exceptional cases while executing the final clustering phase. The first case occurs at the beginning of the phase, when no clustered neighbors are found in the 1-hop vicinity. This requires the node to wait until it discovers one. The waiting period ends with the expiration of the wait timer (initialized at the beginning of the final clustering phase). The second case arises at the end of the final clustering phase, when a node finds itself isolated, i.e. none of its neighbors in the 1-hop vicinity are clustered yet. In that case, the node declares itself cluster head and forms a cluster with its 1-hop neighbors. This process continues until a clustered node is discovered, which initiates final clustering with the divergence measure. Since most of the nodes would be clustered (to the least divergent cluster) in the final phase, only a few nodes would face such isolation.
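The selection of the least-divergent cluster can be sketched in a few lines. The block below uses the standard Jeffreys (symmetrized Kullback–Leibler) divergence, J(P, Q) = Σ (p_i − q_i) ln(p_i / q_i); whether the paper's expression has exactly this form is not recoverable from the text here, so treat it as an assumption. The function names, the `eps` clamp and the one-neighbor-per-cluster simplification are also illustrative.

```python
# Sketch: an unclustered node joins the cluster of its least-divergent
# neighbour, using the standard Jeffreys divergence (an assumption).
import math
from typing import Dict, Sequence


def j_divergence(p: Sequence[float], q: Sequence[float],
                 eps: float = 1e-12) -> float:
    """Jeffreys divergence J(P, Q) = sum_i (p_i - q_i) * ln(p_i / q_i)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        # Clamp zeros so the logarithm stays finite (illustrative choice).
        pi, qi = max(pi, eps), max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


def least_divergent_cluster(own: Sequence[float],
                            neighbours: Dict[int, Sequence[float]]) -> int:
    """Return the cluster id whose neighbour diverges least from `own`.

    `neighbours` maps a cluster id to the probability sequence of one
    clustered 1-hop neighbour (one neighbour per cluster for simplicity).
    """
    return min(neighbours, key=lambda cid: j_divergence(own, neighbours[cid]))


own_probs = [0.1, 0.6, 0.3]
clustered = {7: [0.7, 0.2, 0.1], 9: [0.2, 0.5, 0.3]}
chosen = least_divergent_cluster(own_probs, clustered)
```

The symmetry J(P, Q) = J(Q, P) means the result does not depend on which node is treated as the reference, which matches the stated reason for choosing this measure.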
Proposed data fusion algorithm using fuzzy-entropy
In the proposed work, we apply the data fusion approach to monitoring variation in temperature. However, the approach can be generalized to other environmental parameters, for instance pressure and humidity.
Fuzzification of input data
We have selected the generalized bell membership function to model the moderate data formats: m2 (cold temperature), m3 (normal temperature), m4 (hot temperature); while the sigmoidal membership function has been chosen to model the extreme data formats: m1 (very cold temperature), m5 (very hot temperature). Temperature is a continuous parameter, which requires functions that can represent its characteristics well. Hence, the choice of both membership functions is suitable, as they are well known for representing maximum variation and smoothness.
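As an illustration of the two membership functions, the sketch below uses their standard definitions, gbell(x; a, b, c) = 1 / (1 + |(x − c)/a|^(2b)) and sig(x; a, c) = 1 / (1 + e^(−a(x − c))). The parameter values and the Celsius ranges assigned to m1 through m5 are hypothetical, chosen only to make the example runnable; the paper's own parameter tables are not reproduced here.

```python
# Sketch of fuzzifying a temperature reading into the formats m1..m5.
# All parameter values are hypothetical.
import math
from typing import Callable, Dict


def gbellmf(x: float, a: float, b: float, c: float) -> float:
    """Generalized bell: width a, slope b, centre c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def sigmf(x: float, a: float, c: float) -> float:
    """Sigmoid centred at c; a negative a gives a curve open to the left."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


FORMATS: Dict[str, Callable[[float], float]] = {
    "m1": lambda t: sigmf(t, a=-0.5, c=0.0),             # very cold
    "m2": lambda t: gbellmf(t, a=7.5, b=2.0, c=7.5),     # cold
    "m3": lambda t: gbellmf(t, a=7.5, b=2.0, c=22.5),    # normal
    "m4": lambda t: gbellmf(t, a=7.5, b=2.0, c=37.5),    # hot
    "m5": lambda t: sigmf(t, a=0.5, c=45.0),             # very hot
}


def fuzzify(temperature: float) -> Dict[str, float]:
    """Membership degree of a reading in every data format."""
    return {name: mf(temperature) for name, mf in FORMATS.items()}


def dominant_format(temperature: float) -> str:
    """Format with the highest membership degree for this reading."""
    degrees = fuzzify(temperature)
    return max(degrees, key=degrees.get)
```

For example, `dominant_format(22.0)` selects the format centred nearest to the reading, while the open-ended sigmoids let m1 and m5 absorb readings far outside the moderate range.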
Sampling process & local probability measure
Finally, the sensors send their entropy, followed by the expected data value, to the cluster head. Sending only these two quantities greatly reduces the bulk of packet transmissions within the cluster.
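To make the idea of sending a compact summary instead of raw samples concrete, the sketch below reduces a window of samples to an entropy and an expected value. It assumes that local probabilities are estimated as relative frequencies of the data formats and that Shannon entropy in bits is used; both are assumptions for illustration and may differ from the paper's local probability model. The function name and the sample layout are hypothetical.

```python
# Sketch: summarise a sampling window as (entropy, expected value).
# Relative-frequency probabilities and Shannon entropy are assumptions.
import math
from collections import Counter
from typing import List, Tuple


def local_summary(samples: List[Tuple[str, float]]) -> Tuple[float, float]:
    """Return (entropy in bits, expected value) for (format, value) samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    counts = Counter(fmt for fmt, _ in samples)
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    # With relative-frequency probabilities, the probability-weighted mean
    # of the per-format averages reduces to the plain sample mean.
    expected = sum(value for _, value in samples) / n
    return entropy, expected


window = [("m3", 20.0), ("m3", 22.0), ("m4", 30.0), ("m4", 32.0)]
entropy_bits, expected_value = local_summary(window)
```

However many samples the window holds, only two numbers leave the sensor, which is where the reduction in intra-cluster traffic comes from.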
Global probability measure
Subsequently, the cluster head sends (cluster_id, $d_{expc}$) to the data processing node (i.e. the sink). As a result of the global probability model, more accurate data is filtered and sent to the sink. Besides reducing the amount of data being sent, our method also minimizes the number of participating sensors. This implies that our proposed approach preserves the relevance of the information and also enhances the energy efficiency of the aggregation process.
Network diagram analysis
Simulation and performance evaluations
Simulation parameters used for performance evaluation
1000 × 1000 m²
Number of nodes
Number of samples
Number of data formats
GBELLMF parameters table
SIGMF parameters table
Conclusion & future research directions
In this research, we have demonstrated that our proposed clustering protocol in wireless sensor network provides significant energy savings. The clustering process is purely distributed and is based on the sensed data, regardless of geographic positioning and distance measures. We have calculated the precision of sensor data on the basis of local and global probability model. Furthermore, we have also analyzed the rate and impact of information gain, i.e. convergence rate of calculated sensor entropy towards the absolute value. We have also defined the working slots to aggregate data for the initial period with partially clustered network and for the intermediate cycles, once the whole network is clustered.
The simulations show that our proposed methods perform favorably. The entropy measurement facilitates the efficient selection of the nodes bearing maximum information, which in turn makes aggregation at the cluster head more accurate. The results also indicate that our proposed data aggregation technique operates in an energy-efficient manner. Moreover, the energy consumption in the network has been evaluated over several aggregation cycles. Therefore, it can be concluded that entropy-based fusion is relevant in terms of information integrity, network lifetime and energy utilization.
Thus far we have concentrated on homogeneous sensor networks with a single powerful processing center (sink). In our future work, we intend to focus on heterogeneous wireless sensor networks with multiple resource-rich actors for carrying out energy-consuming tasks. Apart from this, we will direct our effort towards developing novel entropy-based techniques that enrich the integrity of the aggregated content while maintaining a delay constraint on computational efficiency.
Adwitiya Sinha completed her Bachelor of Computer Applications and Master of Computer Applications in 2006 and 2008, respectively. She received her Master of Technology in Computer Science and Technology in 2010 from Jawaharlal Nehru University, New Delhi, India. Presently, she is working towards her PhD at the same university. She is the recipient of a Senior Research Fellowship from the Council of Scientific and Industrial Research, India. Her major interests lie in energy-efficient wireless networking, mobile and ad hoc communication, and data aggregation and filtration techniques.
D. K. Lobiyal received his Bachelor of Technology in Computer Science from Lucknow University, India. He received Master of Technology and PhD both in Computer Science from Jawaharlal Nehru University, New Delhi, India. Presently, he is working as an Associate Professor in the School of Computer and Systems Sciences at Jawaharlal Nehru University. His areas of research interest are Wireless ad Hoc Networks, Video on Demand, and Natural Language Processing (NLP).
1. Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E: Wireless sensor networks: a survey. J Comp Networks 2002, 38(4):393–422. Elsevier. doi:10.1016/S1389-1286(01)00302-4
2. Yong-Min L, Shu-Ci W, Xiao-Hong N: The architecture and characteristics of wireless sensor networks. IEEE Int Conf Comp Technol Dev 2009, 1: 561–565. 13–15 November 2009
3. Potdar V, Sharif A, Chang E: Wireless sensor networks: a survey. IEEE Int Conf Adv Inf Netw Appl 2009, 636–641. 26–29 May 2009
4. Eskandari Z, Yaghmaee MH, Mohajerzadeh AH: Energy efficient spanning tree for data aggregation in wireless sensor networks. IEEE Proceedings of 17th International Conference on Computer Communications and Networks. 2008, 1–5. 3–7 August 2008
5. Galluccio L, Palazzo S, Campbell AT: Efficient data aggregation in wireless sensor networks: an entropy-driven analysis. IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications. 2008, 1–6. 15–18 September 2008
6. Cai W, Zhang M: Data aggregation mechanism based on wavelet-entropy for wireless sensor networks. 4th IEEE International Conference on Wireless Communications, Networking and Mobile Computing; 2008:1–4. 12–14 October 2008
7. Heinzelman WR, Chandrakasan A, Balakrishnan H: Energy-efficient communication protocols for wireless microsensor networks. IEEE Proceedings of the 33rd Hawaii International Conference on System Sciences; 2000:1–10. 4–7 January 2000
8. Yu B, Li J, Li Y: Distributed data aggregation scheduling in wireless sensor networks. IEEE INFOCOM 2009, 2159–2167. 19–25 April 2009
9. Azim MA, Moad S, Bouadallah N: SAG: Smart Aggregation Technique for continuous-monitoring in wireless sensor networks. IEEE International Conference on Communications; 2010:1–6. 23–27 May 2010
10. Jiang H, Jin S, Wang C: Parameter-based data aggregation for statistical information extraction in wireless sensor networks. IEEE Trans Vehicular Technol 2010, 59(8):3992–4001.
11. Ma Y, Guo Y, Tian X, Ghanem M: Distributed clustering-based aggregation algorithm for spatial correlated sensor networks. IEEE Sensors Journal 2011, 11(3):641–648.
12. Tsai YR, Chang CJ: Cooperative information aggregation for distributed estimation in wireless sensor networks. IEEE Trans Signal Processing 2011, 8: 3876–3888.
13. Younis O, Fahmy S: Distributed clustering in Ad-hoc sensor networks: a hybrid, energy-efficient approach. IEEE INFOCOM 2004, 1–12. 7–11 March 2004
14. Anker T, Bickson D, Dolve D, Hod B: Efficient clustering for improving network performance in wireless sensor networks. LNCS 2008, 4913/2008: 221–236. Springer
15. Chitnis L, Dobra A, Ranka S: Aggregation methods for large-scale sensor networks. ACM Trans Sensor Netw 2008, 4(2):1–36.
16. Castelluccia C, Chan AC-F, Mykletun E, Tsudik G: Efficient and provably secure aggregation of encrypted data in wireless sensor networks. ACM Trans Sensor Netw 2009, 5(3):1–36.
17. Xiong N, Svensson P: Multi-sensor management for information fusion: issues and approaches. Inf Fusion 2002, 3(2):163–186. Elsevier. doi:10.1016/S1566-2535(02)00055-6
18. Heinzelman W, Chandrakasan A, Balakrishnan H: An application-specific protocol architectures for wireless microsensor networks. IEEE Trans Wireless Comm 2002, 1(4):660–670. doi:10.1109/TWC.2002.804190
19. Ghiasi S, Srivastava A, Yang XJ, Sarrafzadeh M: Optimal energy aware clustering in sensor networks. Sensors J 2004, 2(7):258–269.
20. Srinivasan SM, Azadmanesh A: Data aggregation in static Adhoc networks. 3rd IEEE International Conference on Industrial and Information Systems. 2008, 1–6. 8–10 December 2008
21. Wang X, Li J: Precision constraint data aggregation for dynamic cluster-based wireless sensor networks. 5th International Conference on Mobile Ad-hoc and Sensor Networks. 2009, 172–179. 14–16 December 2009
22. Zhao F, Shin J, Reich J: Information-driven dynamic sensor collaboration. IEEE Signal Process Mag 2002, 19: 61–72. doi:10.1109/79.985685
23. Commuri S, Tadigotla V: Dynamic data aggregation in wireless sensor networks. IEEE 22nd International Symposium on Intelligent Control. 2007, 1–6. 1–3 October 2007
24. Kong L, Chen Z, Yin F: Optimum design of a window function based on the small-world networks. IEEE International Conference on Granular Computing; 2007:97. 2–4 November 2007
25. Eguchi S, Copus J: Interpreting Kullback–Leibler Divergence with the Neyman–Pearson Lemma. J Multivar Anal 2006, 97: 2034–2040. Elsevier. doi:10.1016/j.jmva.2006.03.007
26. Chang H, Yao Y, Koschan A, Abidi B, Abidi M: Improving face recognition via narrowband spectral range selection using Jeffrey Divergence. IEEE Trans Inf Forensics Security 2009, 4(1):111–123.
27. Duch W: Uncertainty of data, Fuzzy membership functions, and multi-layer perceptrons. IEEE Trans Neural Netw 2004, 20: 1–12.
28. Gray RM: Entropy and information theory. New York, USA: Springer-Verlag; 1990.
29. Fall K, Varadhan K: The ns manual, the VINT project. 2009.
30. Altman E, Jemenez T: NS simulator for beginners. Florida, USA: Morgan & Claypool Publishers; 2003.
31. Attaway S: Part I: programming and problem solving using MATLAB, in: MATLAB-A Practical Approach. USA: Elsevier; 2009:1–196.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.