research-article

Towards Accurate Prediction for High-Dimensional and Highly-Variable Cloud Workloads with Deep Learning

Published: 01 April 2020

Abstract

Resource provisioning for cloud computing requires adaptive and accurate prediction of cloud workloads. However, existing methods cannot effectively predict high-dimensional and highly-variable cloud workloads, which leads to wasted resources and violated service level agreements (SLAs). Since recurrent neural networks (RNNs) are naturally suited to sequential data analysis, they have recently been used to tackle the workload prediction problem. However, RNNs often perform poorly at learning long-term dependencies and thus cannot predict workloads accurately. To address these challenges, we propose a deep-learning-based Prediction Algorithm for cloud Workloads (L-PAW). First, a top-sparse auto-encoder (TSA) is designed to effectively extract the essential representations of workloads from the original high-dimensional workload data. Next, we integrate the TSA and gated recurrent unit (GRU) blocks into an RNN to achieve adaptive and accurate prediction of highly-variable workloads. Using real-world workload traces from Google and Alibaba cloud data centers and a DUX-based cluster, extensive experiments demonstrate the effectiveness and adaptability of L-PAW for different types of workloads with various prediction lengths. Moreover, the performance results show that L-PAW achieves superior prediction accuracy compared to classic RNN-based and other workload prediction methods on high-dimensional and highly-variable real-world cloud workloads.
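The sketch below illustrates, in broad strokes, the kind of pipeline the abstract describes: an auto-encoder compresses each high-dimensional workload vector into a low-dimensional representation, and a GRU-based recurrent network predicts the next workload value from a window of those compressed features. It is not the authors' L-PAW implementation; all dimensions and layer sizes are hypothetical, an L1 sparsity penalty stands in for the paper's top-sparse activation selection, and TensorFlow/Keras is used only as a convenient framework for illustration.

```python
# Minimal sketch (not the authors' code): sparse auto-encoder + GRU predictor
# for workload traces, assuming data normalized to [0, 1].
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

TIME_STEPS, N_FEATURES, LATENT_DIM = 24, 64, 8  # hypothetical dimensions

# 1) Auto-encoder: compress each high-dimensional workload vector.
#    An L1 activity penalty is used here as a stand-in for the paper's
#    top-sparse (top-k activation) selection.
encoder = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(LATENT_DIM, activation="relu",
                 activity_regularizer=regularizers.l1(1e-4)),
])
decoder = models.Sequential([
    layers.Input(shape=(LATENT_DIM,)),
    layers.Dense(N_FEATURES, activation="sigmoid"),
])
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")

# 2) GRU predictor: map a window of latent vectors to the next workload value.
predictor = models.Sequential([
    layers.Input(shape=(TIME_STEPS, LATENT_DIM)),
    layers.GRU(32),
    layers.Dense(1),  # e.g., next-step CPU utilization
])
predictor.compile(optimizer="adam", loss="mse")

# Toy usage with random data standing in for normalized workload traces.
X = np.random.rand(1000, N_FEATURES).astype("float32")
autoencoder.fit(X, X, epochs=1, batch_size=32, verbose=0)
Z = encoder.predict(X, verbose=0)                       # latent workload features
windows = np.stack([Z[i:i + TIME_STEPS] for i in range(len(Z) - TIME_STEPS)])
targets = X[TIME_STEPS:, 0:1]                           # predict the first feature
predictor.fit(windows, targets, epochs=1, batch_size=32, verbose=0)
```

In the paper, the compression step discards all but the largest latent activations per sample; the regularizer above merely encourages sparsity and is a deliberate simplification.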

          Information & Contributors

          Information

          Published In

          cover image IEEE Transactions on Parallel and Distributed Systems
          IEEE Transactions on Parallel and Distributed Systems  Volume 31, Issue 4
          April 2020
          248 pages

          Publisher

          IEEE Press
