DOI: 10.1145/2764967.2771928
short-paper

VLIW Code Generation for a Convolutional Network Accelerator

Published: 01 June 2015

Abstract

This paper presents a compiler flow that maps Deep Convolutional Networks (ConvNets) to a highly specialized VLIW accelerator core targeting the low-power embedded market. Earlier works have focused on energy-efficient accelerators for this class of algorithms, but none of them provides a complete and practical programming model. Because a ConvNet has a large parameter set, it is essential that the user can abstract from the accelerator architecture and does not have to rely on an error-prone, ad-hoc assembly programming model. Using modulo scheduling for software pipelining, we demonstrate that our automatically generated code achieves hardware utilization equal to, or within 5-20% below, that of code written manually by experts. Our compiler removes the huge manual workload needed to efficiently map ConvNets to an energy-efficient core for next-generation mobile and wearable devices.
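To make the utilization comparison concrete, the following is a minimal sketch of how steady-state utilization of a modulo-scheduled loop depends on the initiation interval (II). It is not the paper's compiler: the operation mix, the functional-unit counts, and the helper names `res_mii` and `utilization` are hypothetical, and only the resource-constrained bound on II is modeled (recurrence constraints are ignored).

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource-constrained lower bound on the initiation interval (II).

    Each functional-unit class needs at least ceil(ops / units) cycles
    to issue all of its operations for one loop iteration.
    """
    return max(ceil(op_counts[k] / unit_counts[k]) for k in op_counts)


def utilization(op_counts, unit_counts, ii):
    """Fraction of issue slots filled in the steady-state kernel at a given II."""
    total_ops = sum(op_counts.values())
    total_slots = ii * sum(unit_counts.values())
    return total_ops / total_slots


# Hypothetical loop body and machine description (assumptions, not from the paper).
ops = {"mac": 8, "load": 4, "store": 2}
units = {"mac": 2, "load": 1, "store": 1}

best_ii = res_mii(ops, units)                   # lower bound a scheduler aims for
u_best = utilization(ops, units, best_ii)       # utilization at the bound
u_next = utilization(ops, units, best_ii + 1)   # utilization one cycle above it

print(f"ResMII = {best_ii}")
print(f"utilization at II={best_ii}: {u_best:.3f}")
print(f"utilization at II={best_ii + 1}: {u_next:.3f}")
```

In this invented example, a schedule that misses the bound by a single cycle fills 0.700 of the issue slots instead of 0.875, a relative loss of 20%. This illustrates why small differences in the achieved II show up directly as utilization gaps; the numbers themselves say nothing about the paper's measurements.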

Published In

SCOPES '15: Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems
June 2015
147 pages
ISBN:9781450335935
DOI:10.1145/2764967

Sponsors

In-Cooperation

  • EDAA: European Design Automation Association

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Code Generation
  2. Compilation
  3. Convolutional Networks
  4. Software Pipelining
  5. VLIW

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

SCOPES '15

Acceptance Rates

Overall Acceptance Rate 38 of 79 submissions, 48%
