DOI: 10.1145/3627673.3680072

Multiscale Representation Enhanced Temporal Flow Fusion Model for Long-Term Workload Forecasting

Published: 21 October 2024

Abstract

Accurate workload forecasting is critical for efficient resource management in cloud computing systems, enabling effective scheduling and autoscaling. Despite recent advances with transformer-based forecasting models, challenges remain due to the non-stationary, nonlinear characteristics of workload time series and their long-term dependencies. In particular, inconsistent performance between long-term history and near-term forecasts hinders long-range prediction. This paper proposes a novel framework that leverages self-supervised multiscale representation learning to capture both long-term and near-term workload patterns: the long-term history is encoded through multiscale representations, while near-term observations are modeled via temporal flow fusion. Representations at different scales are fused with an attention mechanism and characterized with normalizing flows to handle the non-Gaussian, nonlinear distributions of workload time series. Extensive experiments on nine benchmarks demonstrate superiority over existing methods.
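The three ingredients named in the abstract — multiscale encoding of the long-term history, attention-based fusion of the per-scale representations, and a normalizing flow for non-Gaussian outputs — can be illustrated with a minimal sketch. This is not the paper's implementation: the pooling scales, the summary statistics in `multiscale_encode`, and the single RealNVP-style affine-coupling step are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def multiscale_encode(series, scales=(1, 4, 16)):
    """Hypothetical multiscale encoder: average-pool the series at several
    scales and summarize each pooled view with simple statistics."""
    reps = []
    for s in scales:
        n = len(series) // s * s
        pooled = series[:n].reshape(-1, s).mean(axis=1)
        reps.append(np.array([pooled.mean(), pooled.std(), pooled[-1]]))
    return np.stack(reps)            # shape: (num_scales, feature_dim)

def attention_fuse(reps, query):
    """Fuse per-scale representations with softmax attention weights."""
    scores = reps @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ reps            # shape: (feature_dim,)

def affine_coupling_forward(z, shift, log_scale):
    """One RealNVP-style affine step: x = z * exp(log_scale) + shift.
    Its log-determinant is sum(log_scale), keeping the density tractable."""
    x = z * np.exp(log_scale) + shift
    log_det = log_scale.sum()
    return x, log_det

# Toy workload: a seasonal signal plus noise.
series = np.sin(np.arange(256) / 8.0) + 0.1 * rng.standard_normal(256)
reps = multiscale_encode(series)
fused = attention_fuse(reps, query=np.ones(3))
x, log_det = affine_coupling_forward(rng.standard_normal(3),
                                     shift=fused, log_scale=np.zeros(3))
print(reps.shape, fused.shape, log_det)
```

In the actual model the encoder is learned with self-supervision and the flow parameters are conditioned on the fused representation; the sketch only shows how the pieces compose.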



Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN:9798400704369
DOI:10.1145/3627673
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. multiscale representation
  2. time series
  3. workload forecasting

Qualifiers

  • Research-article

Conference

CIKM '24

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)


