research-article

Time-Warped Sparse Non-negative Factorization for Functional Data Analysis

Authors:

Steven C. H. Hoi,

Fugee Tsung

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 14, Issue 6

Article No.: 72, Pages 1 - 23

https://doi.org/10.1145/3408313

Published: 28 September 2020

Abstract

This article proposes a novel time-warped sparse non-negative factorization method for functional data analysis. On the one hand, the method guarantees that the extracted basis functions and their coefficients are positive and interpretable; on the other hand, it can handle weakly correlated functions with different features. Furthermore, it incorporates time warping into the factorization, which allows the basis functions extracted for different samples to have temporal deformations. An efficient framework of estimation algorithms is proposed, based on a greedy variable selection approach. Numerical studies and case studies on real-world data demonstrate the efficacy and applicability of the proposed methodology.
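
The abstract names three ingredients: non-negative coefficients, sparsity through greedy variable selection, and temporal deformation of the basis functions. The sketch below is only a toy illustration of how these can fit together and is not the estimation algorithm of the article. It assumes the basis functions are already known, replaces general time warping with integer shifts, and uses a plain projected-gradient solver for the non-negative least-squares step. All function names (`shifted_atoms`, `nonneg_least_squares`, `greedy_fit`) and the Gaussian test curves are hypothetical.

```python
import numpy as np

def shifted_atoms(basis, max_shift):
    """Build a dictionary whose columns are integer shifts of each basis function.

    Assumption: an integer shift is a crude stand-in for a time-warping function.
    """
    cols, labels = [], []
    n = basis.shape[1]
    for k, b in enumerate(basis):
        for s in range(-max_shift, max_shift + 1):
            a = np.zeros(n)
            if s >= 0:
                a[s:] = b[:n - s]   # shift right, zero-pad on the left
            else:
                a[:s] = b[-s:]      # shift left, zero-pad on the right
            cols.append(a)
            labels.append((k, s))   # (basis index, shift in samples)
    return np.column_stack(cols), labels

def nonneg_least_squares(A, y, n_iter=500):
    """Projected gradient for min ||y - A x||^2 subject to x >= 0."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / squared spectral norm of A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = np.maximum(0.0, x - step * (A.T @ (A @ x - y)))
    return x

def greedy_fit(y, atoms, n_select):
    """Greedily add the atom most correlated with the residual, then refit
    all selected coefficients under the non-negativity constraint."""
    norms = np.linalg.norm(atoms, axis=0)
    selected, coef, residual = [], np.zeros(0), y.copy()
    for _ in range(n_select):
        scores = atoms.T @ residual / norms
        if selected:
            scores[selected] = -np.inf       # never pick the same atom twice
        j = int(np.argmax(scores))
        if scores[j] <= 0:                   # no atom can reduce the residual
            break
        selected.append(j)
        coef = nonneg_least_squares(atoms[:, selected], y)
        residual = y - atoms[:, selected] @ coef
    return selected, coef

# Toy data: two Gaussian bumps as basis functions on a 100-point grid.
t = np.arange(100)
basis = np.array([np.exp(-0.5 * ((t - c) / 5.0) ** 2) for c in (30, 70)])
atoms, labels = shifted_atoms(basis, max_shift=5)

# One observed curve: bump 0 delayed by 3 samples, bump 1 advanced by 2 samples.
y = 2.0 * atoms[:, labels.index((0, 3))] + 0.5 * atoms[:, labels.index((1, -2))]

selected, coef = greedy_fit(y, atoms, n_select=2)
picked = [labels[j] for j in selected]
print(picked, np.round(coef, 3))
```

On this noise-free toy curve the greedy search should recover the two shifted bumps, with weights close to 2.0 and 0.5. With noisy data or strongly overlapping basis functions, greedy selection of this kind can pick a wrong atom, so the sketch should not be read as a statement about the performance of the published method.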

Published In

ACM Transactions on Knowledge Discovery from Data Volume 14, Issue 6

December 2020

376 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3427188

Editors: Charu Aggarwal (IBM T. J. Watson Research, USA), Xindong Wu (Minginglamp Academy of Sciences, China)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 September 2020

Accepted: 01 June 2020

Revised: 01 April 2020

Received: 01 August 2019

Published in TKDD Volume 14, Issue 6

Qualifiers

Research-article
Research
Refereed

Funding Sources

Supply Chain Research Center Grant, Tsinghua University
NSFC
RGC GRF

Bibliometrics

Total Citations: 6
Total Downloads: 537
Downloads (Last 12 months): 23
Downloads (Last 6 weeks): 4

Reflects downloads up to 06 Jan 2025

Cited By

Zhao S, Dai G, Li J, Zhu X, Huang X, Li Y, Tan M, Wang L, Fang P, Chen X, Yan N, Liu H. (2024). An interpretable model based on graph learning for diagnosis of Parkinson’s disease with voice-related EEG. npj Digital Medicine 7:1. DOI: 10.1038/s41746-023-00983-9. Online publication date: 5-Jan-2024.
https://doi.org/10.1038/s41746-023-00983-9
Zhang C, Zheng B, Tsung F. (2023). Multi-view metro station clustering based on passenger flows: a functional data-edged network community detection approach. Data Mining and Knowledge Discovery 37:3 (1154-1208). DOI: 10.1007/s10618-023-00916-w. Online publication date: 20-Feb-2023.
https://dl.acm.org/doi/10.1007/s10618-023-00916-w
Liao N, Mo D, Luo S, Li X, Yin P. (2022). SCARA. Proceedings of the VLDB Endowment 15:11 (3240-3248). DOI: 10.14778/3551793.3551866. Online publication date: 1-Jul-2022.
https://dl.acm.org/doi/10.14778/3551793.3551866
Ali D, Banerjee S, Prasad Y. (2022). Influential Billboard Slot Selection Using Pruned Submodularity Graph. Advanced Data Mining and Applications (216-230). DOI: 10.1007/978-3-031-22064-7_17. Online publication date: 30-Nov-2022.
https://dl.acm.org/doi/10.1007/978-3-031-22064-7_17
Na G, Chang H. (2021). Unsupervised Subspace Extraction via Deep Kernelized Clustering. ACM Transactions on Knowledge Discovery from Data 16:1 (1-15). DOI: 10.1145/3459082. Online publication date: 20-Jul-2021.
https://dl.acm.org/doi/10.1145/3459082
Xiong H, Yan J, Pan L, Zhu F, Chin Ooi B, Miao C, Wang H, Skrypnyk I, Hsu W, Chawla S. (2021). Contrastive Multi-View Multiplex Network Embedding with Applications to Robust Network Alignment. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (1913-1923). DOI: 10.1145/3447548.3467227. Online publication date: 14-Aug-2021.
https://dl.acm.org/doi/10.1145/3447548.3467227
