Research article
Open access

Green AI

Published: 17 November 2020

Abstract

Creating efficiency in AI research will decrease its carbon footprint and increase its inclusivity, as deep learning study should not require the deepest pockets.



Published In

Communications of the ACM, Volume 63, Issue 12 (December 2020), 92 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3437360
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States


