Time-Warped Sparse Non-negative Factorization for Functional Data Analysis

Published: 28 September 2020

Abstract

This article proposes a time-warped sparse non-negative factorization method for functional data analysis. The method guarantees that the extracted basis functions and their coefficients are positive and interpretable, and it can handle weakly correlated functions with different features. It also incorporates time warping into the factorization, which allows the extracted basis functions of different samples to have temporal deformations. An efficient framework of estimation algorithms, based on a greedy variable selection approach, is proposed. Numerical studies and case studies on real-world data demonstrate the efficacy and applicability of the proposed methodology.
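
As a point of orientation only, the sketch below shows plain non-negative matrix factorization with the classic multiplicative updates for the squared-error loss. It is not the estimation algorithm proposed in the article: it has no sparsity constraint, no time warping, and no greedy variable selection. The function names (`matmul`, `transpose`, `frobenius_error`, `nmf`), the toy data, and all parameter values are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Frobenius norm of the residual X - W H."""
    wh = matmul(w, h)
    return sum((p - q) ** 2 for rx, ry in zip(x, wh) for p, q in zip(rx, ry)) ** 0.5


def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (n x m) as W (n x rank) times H (rank x m).

    Rows of H play the role of non-negative basis curves sampled on a grid;
    rows of W hold the non-negative coefficients of each observed curve.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive initial values keep every later iterate non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: four non-negative "curves" on a four-point grid (hypothetical values).
curves = [
    [1.0, 2.0, 3.0, 2.0],
    [2.0, 4.0, 6.0, 4.0],
    [3.0, 1.0, 0.0, 1.0],
    [4.0, 3.0, 3.0, 3.0],
]
coefficients, bases = nmf(curves, rank=2)
print("reconstruction error:", frobenius_error(curves, coefficients, bases))
```

Because the updates only multiply by non-negative ratios, both factors stay non-negative without any explicit projection step, which is what makes this baseline easy to interpret.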

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 6
December 2020
376 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3427188
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 September 2020
Accepted: 01 June 2020
Revised: 01 April 2020
Received: 01 August 2019
Published in TKDD Volume 14, Issue 6

Author Tags

  1. Non-negative functional factorization
  2. multivariate functional data
  3. sparse representation
  4. time warping

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Supply Chain Research Center Grant, Tsinghua University
  • NSFC
  • RGC GRF

Article Metrics

  • Downloads (Last 12 months): 23
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 06 Jan 2025
