DOI: 10.1145/1921081.1921093
research-article

A mining technique using n-grams and motion transcripts for body sensor network data repository

Published: 05 October 2010 Publication History

Abstract

Recent years have seen a large influx of applications in the field of Body Sensor Networks (BSNs). BSNs, and wearable computers with sensors in general, can give researchers, users, or clinicians access to tremendously valuable information extracted from data collected in users' natural environments. With this information, one can monitor the progression of a disease, identify its early onset, or simply assess a user's wellness. One major obstacle is managing repositories that store large amounts of BSN data. To address this issue, we propose a data mining approach for large BSN data repositories. We represent sensor readings with motion transcripts that preserve structural properties of the signal. To take further advantage of the signal's structure, we define a data mining technique based on n-grams. We reduce the overwhelmingly large number of n-grams via information gain (IG) feature selection. We report the effectiveness of our approach in terms of mining speed while maintaining acceptable accuracy in terms of precision and recall. We demonstrate that the system can achieve an average precision of 99% and an average recall of 100% on our pilot data using only one transition for each movement.
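The pipeline outlined in the abstract (transcripts, then n-grams, then IG-based selection) can be illustrated with a minimal sketch. It assumes that each motion transcript is already available as a string over a small symbol alphabet and that each transcript carries a movement label. The toy transcripts, the labels, and all function names below are hypothetical; the abstract does not specify how transcripts are built or how n-grams are indexed, so this is not the authors' implementation.

```python
import math
from collections import Counter


def ngrams(transcript, n):
    """Return the set of contiguous n-grams in a transcript string."""
    return {transcript[i:i + n] for i in range(len(transcript) - n + 1)}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(samples, feature):
    """IG of the binary feature 'this transcript contains the n-gram'."""
    labels = [label for _, label in samples]
    present = [label for grams, label in samples if feature in grams]
    absent = [label for grams, label in samples if feature not in grams]
    # Weighted entropy that remains after splitting on the feature.
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


def select_ngrams(transcripts, n, k):
    """Keep the k n-grams with the highest IG (ties broken alphabetically)."""
    samples = [(ngrams(t, n), label) for t, label in transcripts]
    vocabulary = set().union(*(grams for grams, _ in samples))
    ranked = sorted(vocabulary,
                    key=lambda g: (-information_gain(samples, g), g))
    return ranked[:k]


# Hypothetical toy data: (transcript, movement label) pairs.
data = [("aabbc", "sit"), ("aabbd", "sit"),
        ("ccdda", "stand"), ("ccddb", "stand")]
top = select_ngrams(data, n=2, k=3)
print(top)
```

In this toy data, bigrams such as "aa" occur in every "sit" transcript and in no "stand" transcript, so they separate the two classes perfectly and rank first, while bigrams that appear in a single transcript carry less information and are dropped.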

    Published In

    WH '10: Wireless Health 2010
    October 2010
    232 pages
    ISBN:9781605589893
    DOI:10.1145/1921081

    Sponsors

    • WLSA: Wireless-Life Sciences Alliance
    • University of California, Los Angeles

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Patricia tree
    2. body sensor networks
    3. data mining
    4. n-grams
    5. string templates

    Qualifiers

    • Research-article

    Conference

    WH '10: Wireless Health 2010
    Sponsor: WLSA
    October 5-7, 2010
    San Diego, California

    Acceptance Rates

    Overall Acceptance Rate 35 of 139 submissions, 25%
