Computational Health Informatics in the Big Data Age: A Survey

Published: 14 June 2016

Abstract

The explosive growth and widespread accessibility of digital health data have led to a surge of research activity in the healthcare and data science fields. Conventional approaches to health data management have achieved limited success because they cannot handle large amounts of complex data with high volume, high velocity, and high variety. This article presents a comprehensive overview of the existing challenges, techniques, and future directions for computational health informatics in the big data age, with a structured analysis of historical and state-of-the-art methods. We summarize the challenges as four Vs (i.e., volume, velocity, variety, and veracity) and propose a systematic data-processing pipeline for generic big data in health informatics, covering data capturing, storing, sharing, analyzing, searching, and decision support. In particular, numerous machine learning techniques and algorithms are categorized and compared. On the basis of this material, we identify and discuss the essential prospects lying ahead for computational health informatics in this big data age.

[159]
Jimeng Sun and Chandan K. Reddy. 2013. Big data analytics for healthcare. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, New York, NY, 1525--1525.
[160]
Mingkui Tan, Ivor W. Tsang, and Li Wang. 2014. Towards ultrahigh dimensional feature selection for big data. Journal of Machine Learning Research 15, 1 (2014), 1371--1429.
[161]
Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. 2006. Introduction to Data Mining. Vol. 1. Boston: Pearson Addison Wesley.
[162]
Techcrunch. 2014. Healthcare’s Big Data Opportunity. http://techcrunch.com/2014/11/20/healthcares-big-data-opportunity/. (2014). Retrieved 02-20-2015.
[163]
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. 2009. Hive: A warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment 2, 2 (2009), 1626--1629.
[164]
Andy Tippet. 2014. Data capture and analytics in healthcare. http://blogs.zebra.com/data-capture-and-analytics-in-healthcare. (2014). Retrieved 02-08-2015.
[165]
Michael E. Tipping. 2001. Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1 (2001), 211--244.
[166]
Divya Tomar and Sonali Agarwal. 2013. A survey on data mining approaches for healthcare. International Journal of Bio-Science and Bio-Technology 5, 5 (2013), 241--266.
[167]
Godfried T. Toussaint. 1980. The relative neighbourhood graph of a finite planar set. Pattern Recognition 12, 4 (1980), 261--268.
[168]
Alexey Tsymbal, Martin Huber, Sonja Zillner, Tamás Hauer, and Shaohua Kevin Zhou. 2007. Visualizing patient similarity in clinical decision support. In LWA. Martin-Luther-University Halle-Wittenberg, 304--311.
[169]
Jos W. R. Twisk. 2004. Longitudinal data analysis. A comparison between generalized estimating equations and random coefficient analysis. European Journal of Epidemiology 19, 8 (2004), 769--776.
[170]
Roel G. W. Verhaak, Mathijs A. Sanders, Maarten A. Bijl, Ruud Delwel, Sebastiaan Horsman, Michael J. Moorhouse, Peter J. van der Spek, Bob Löwenberg, and Peter J. M. Valk. 2006. HeatMapper: Powerful combined visualization of gene expression profile correlations, genotypes, phenotypes and sample characteristics. BMC Bioinformatics 7, 1 (2006), 337.
[171]
Rajakrishnan Vijayakrishnan, Steven R. Steinhubl, Kenney Ng, Jimeng Sun, Roy J. Byrd, Zahra Daar, Brent A. Williams, Christopher Defilippi, Shahram Ebadollahi, and Walter F. Stewart. 2014. Prevalence of heart failure signs and symptoms in a large primary care population identified through the use of text and data mining of the electronic health record. Journal of Cardiac Failure 20, 7 (2014), 459--464.
[172]
Thi Hong Nhan Vu, Namkyu Park, Yang Koo Lee, Yongmi Lee, Jong Yun Lee, and Keun Ho Ryu. 2010. Online discovery of heart rate variability patterns in mobile healthcare services. Journal of Systems and Software 83, 10 (2010), 1930--1940.
[173]
Bo Wang, Aziz M. Mezlini, Feyyaz Demir, Marc Fiume, Zhuowen Tu, Michael Brudno, Benjamin Haibe-Kains, and Anna Goldenberg. 2014. Similarity network fusion for aggregating data types on a genomic scale. Nature Methods 11, 3 (2014), 333--337.
[174]
Fei Wang, Jimeng Sun, and Shahram Ebadollahi. 2012. Composite distance metric integration by leveraging multiple experts’ inputs and its application in patient similarity assessment. Statistical Analysis and Data Mining: The ASA Data Science Journal 5, 1 (2012), 54--69.
[175]
Wei Wang, Honggang Wang, Michael Hempel, Dongming Peng, Hamid Sharif, and Hsiao-Hwa Chen. 2011. Secure stochastic ECG signals based on Gaussian mixture model for e-healthcare systems. IEEE Systems Journal 5, 4 (2011), 564--573.
[176]
Edgar Weippl, Andreas Holzinger, and A. Min Tjoa. 2006. Security aspects of ubiquitous computing in health care. e & i Elektrotechnik und Informationstechnik 123, 4 (2006), 156--161.
[177]
Yair Weiss, Antonio Torralba, and Rob Fergus. 2009. Spectral hashing. In Advances in Neural Information Processing Systems 21, D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou (Eds.). Curran Associates, 1753--1760. http://papers.nips.cc/paper/3383-spectral-hashing.pdf.
[178]
Susan E. White. 2014. A review of big data in health care: Challenges and opportunities. Clinical, Cosmetic and Investigational Dentistry 6 (2014), 45--56.
[179]
B. L. William Wong, Kai Xu, and Andreas Holzinger. 2011. Interactive visualization for information analysis in medical diagnosis. In Information Quality in e-Health. Lecture Notes in Computer Science. Vol. 7058. Springer, 109--120.
[180]
Leon Xiao, Judy Hanover, and Sash Mukherjee. 2014. Big data enables clinical decision support in hospital settings. http://www.idc.com/getdoc.jsp?containerId=CN245651. (2014). Retrieved 03-09-2015.
[181]
Qiong Xu, Hengyong Yu, Xuanqin Mou, Lei Zhang, Jiang Hsieh, and Ge Wang. 2012. Low-dose X-ray CT reconstruction via dictionary learning. IEEE Transactions on Medical Imaging 31, 9 (2012), 1682--1697.
[182]
Jinn-Yi Yeh, Tai-Hsi Wu, and Chuan-Wei Tsao. 2011. Using data mining techniques to predict hospitalization of hemodialysis patients. Decision Support Systems 50, 2 (2011), 439--448.
[183]
Illhoi Yoo, Patricia Alafaireet, Miroslav Marinov, Keila Pena-Hernandez, Rajitha Gopidi, Jia-Fu Chang, and Lei Hua. 2012. Data mining in healthcare and biomedicine: A survey of the literature. Journal of Medical Systems 36, 4 (2012), 2431--2448.
[184]
Hisako Yoshida, Atsushi Kawaguchi, and Kazuhiko Tsuruya. 2013. Radial basis function-sparse partial least squares for application to brain imaging data. Computational and Mathematical Methods in Medicine 2013 (2013), 7.
[185]
Kui Yu, Xindong Wu, Wei Ding, and Jian Pei. 2014. Towards scalable and accurate online feature selection for big data. In 2014 IEEE International Conference on Data Mining (ICDM’14). 660--669.
[186]
Xiaojing Yuan, Ning Situ, and George Zouridakis. 2009. A narrow band graph partitioning method for skin lesion segmentation. Pattern Recognition 42, 6 (2009), 1017--1028.
[187]
Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2010. Spark: Cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing. 10--11.
[188]
Xiaofan Zhang, Wei Liu, M. Dundar, S. Badve, and Shaoting Zhang. 2014. Towards large-scale histopathological image analysis: Hashing-based image retrieval. IEEE Transactions on Medical Imaging 34, 2 (2014), 496--506.
[189]
Xiaofan Zhang, Hai Su, Lin Yang, and Shaoting Zhang. 2015a. Fine-grained histopathological image analysis via robust segmentation and large-scale retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5361--5368.
[190]
Xiaofan Zhang, Fuyong Xing, Hai Su, Lin Yang, and Shaoting Zhang. 2015b. High-throughput histopathological image analysis via robust cell segmentation and hashing. Medical Image Analysis 26, 1 (2015), 306--315.
[191]
Xiaofan Zhang, Lin Yang, Wei Liu, Hai Su, and Shaoting Zhang. 2014. Mining histopathological images via composite hashing and online learning. In Medical Image Computing and Computer-Assisted Intervention (MICCAI’14). Vol. 8674. Springer, 479--486.
[192]
Yang Zhang, Simon Fong, Jinan Fiaidhi, and Sabah Mohammed. 2012. Real-time clinical decision support system with data stream mining. Biomedicine and Biotechnology 2012 (2012), 8.
[193]
Shang-Ming Zhou, Ronan A. Lyons, Owen Bodger, Joanne C. Demmler, and Mark D. Atkinson. 2010. SVM with entropy regularization and particle swarm optimization for identifying children’s health and socioeconomic determinants of education attainments using linked datasets. In The 2010 International Joint Conference on Neural Networks (IJCNN’10). IEEE, 1--8.
[194]
Xiang Sean Zhou, Sonja Zillner, Manuel Moeller, Michael Sintek, Yiqiang Zhan, Arun Krishnan, and Alok Gupta. 2008. Semantics and CBIR: A medical imaging perspective. In Proceedings of the 2008 International Conference on Content-based Image and Video Retrieval. ACM, 571--580.
[195]
Ying Zhu. 2011. Automatic detection of anomalies in blood glucose using a machine learning approach. Journal of Communications and Networks 13, 2 (2011), 125--131.
[196]
Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin, and Brian Muckian. 2013. Big data solutions for predicting risk-of-readmission for congestive heart failure patients. In 2013 IEEE International Conference on Big Data. 64--71.

    Reviews

    Kalman Balogh

    Global information and communications technology (ICT) resources have been changing lives; fast networks and mobile devices offer "always-on" services, extending the possibilities of personal computers (PCs). Besides social networks and the Internet of Things (IoT), healthcare services, both independently and in relation to the former two, are among the fastest emerging areas, exploiting and challenging the development of ICT. This survey gives an overview of the expected progress of healthcare during this decade; for each processing phase of health data, it presents an extensive list of methods and systems existing in 2014.

    According to the forecasts quoted by the authors, the amount of healthcare data is growing exponentially; according to IDC, in 2020 it will be more than an order of magnitude (ten times) the amount of 2014. Beyond size, new requirements must be fulfilled: handling of heterogeneous data types, executing distributed transactions, enabling global access while preserving security and confidentiality, improving and spreading global standards and interchanging data among proprietary systems, the online analysis of data, dedicated decision support systems both for medical practitioners and clinical remote expert teams, microbiology and genetic data in more personalized medicine, integrating personal diagnostic sensors, mobile patient-advisor services in e-health, and archiving requirements and reusability of longitudinal data.

    These requirements cannot be fulfilled by traditional relational databases alone; the handling of big data stored in hybrid clouds expands their capabilities with working but still immature technologies. At the start, these tools were completely independent of the traditional ones. They handle diverse kinds of new unstructured data types (including biometric data, texts about patient care, professional textbooks and articles, and so on) distributed to file servers of data centers worldwide, but do not support transactions and sophisticated queries. Most of the solutions introduced in the survey are based on these NoSQL systems. In some cases, the method and tool chosen for meeting the needs of a special medical area are rather ad hoc.

    The authors mention that the product repertoires of both the new Internet giants (Amazon, Google, Facebook) and the traditional vendors (IBM, Oracle/Teradata/Sun/Siebel, Microsoft, Dell/EMC, SAP) are converging: they support heterogeneous systems with interfaces to NoSQL, traditional SQL, and NewSQL tools and products in hybrid clouds. These technologies and major application areas, including healthcare, microbiology, and genetics, are introduced, for instance, in Chen et al. [1]. In my opinion, the main value of the survey is its introduction to a broad range of methods and developed tools applied in nontraditional healthcare and microbiology. I recommend this overview to those ICT professionals who are developing applications in these fields.

    Online Computing Reviews Service

    Published In

    ACM Computing Surveys  Volume 49, Issue 1
    March 2017
    705 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/2911992
    Editor: Sartaj Sahni
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 June 2016
    Accepted: 01 March 2016
    Revised: 01 December 2015
    Received: 01 May 2015
    Published in CSUR Volume 49, Issue 1

    Author Tags

    1. 4V challenges
    2. Big data analytics
    3. clinical decision support
    4. computational health informatics
    5. data mining
    6. machine learning
    7. survey

    Qualifiers

    • Survey
    • Research
    • Refereed
