Cluster-Based Quality-Aware Adaptive Data Compression for Streaming Data

Published: 21 September 2017

Abstract

Wireless sensor networks (WSNs) are widely used in data collection applications, and energy efficiency is one of their most important design goals. In this article, we examine the tradeoffs between energy efficiency and data quality. First, four attributes used to evaluate data quality are formally defined. Then, we propose a novel data compression algorithm, Quality-Aware Adaptive data Compression (QAAC), which saves energy by reducing the amount of data communicated. QAAC uses an adaptive clustering algorithm to build clusters from the dataset; a code for each cluster is then generated and stored in a Huffman encoding tree. The encoding algorithm encodes the original dataset based on the Huffman encoding tree. An improvement algorithm is also designed to reduce the information loss incurred when data are compressed. Once the sink has received the encoded data, the Huffman encoding tree, and the parameters used in the improvement algorithm, a decompression algorithm retrieves an approximation of the original dataset. The performance evaluation shows that QAAC is efficient and achieves a much higher compression ratio than lossy and lossless compression algorithms, while incurring much smaller information loss than lossy compression algorithms.
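The pipeline described in the abstract (cluster the readings, assign each cluster a Huffman code, transmit the codes, and reconstruct an approximation at the sink) can be illustrated with a minimal Python sketch. This is not the authors' QAAC implementation: the abstract does not specify the adaptive clustering or the improvement algorithm, so the sketch substitutes fixed-width binning for the clustering step and omits the improvement step entirely. The function names and the `width` parameter are hypothetical and chosen only for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def cluster_ids(values, width):
    # Stand-in clustering (assumption): fixed-width bins,
    # NOT the adaptive clustering used by QAAC.
    return [int(v // width) for v in values]


def huffman_codes(symbols):
    # Standard Huffman construction over symbol frequencies.
    freq = Counter(symbols)
    if len(freq) == 1:
        # A lone symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def compress(values, width):
    # Map each reading to a cluster id, then encode the ids.
    ids = cluster_ids(values, width)
    members = {}
    for v, i in zip(values, ids):
        members.setdefault(i, []).append(v)
    # Each cluster is represented by the mean of its members.
    centers = {i: sum(m) / len(m) for i, m in members.items()}
    codes = huffman_codes(ids)
    bits = "".join(codes[i] for i in ids)
    return bits, codes, centers


def decompress(bits, codes, centers):
    # Huffman codes are prefix-free, so greedy matching is unambiguous.
    lookup = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in lookup:
            out.append(centers[lookup[buf]])
            buf = ""
    return out
```

Calling `compress(readings, width)` returns the bit string, the code table, and the cluster representatives, which together play the role of what the abstract says is sent to the sink; `decompress` then rebuilds one representative value per reading. In this simplified version the reconstruction error of each reading is bounded by the bin width, which is the knob that trades data quality against compression ratio.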

Published In

Journal of Data and Information Quality, Volume 9, Issue 1
Research Papers and Challenge Papers
March 2017
73 pages
ISSN: 1936-1955
EISSN: 1936-1963
DOI: 10.1145/3139489

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 September 2017
Accepted: 01 July 2017
Revised: 01 April 2017
Received: 01 November 2016
Published in JDIQ Volume 9, Issue 1

Author Tags

  1. Wireless sensor networks
  2. adaptive
  3. clustering algorithm
  4. data compression
  5. data quality
  6. energy efficiency

Qualifiers

  • Research-article
  • Research
  • Refereed
