DOI: 10.1145/3285017.3285019

ALOHA: an architectural-aware framework for deep learning at the edge

Published: 04 October 2018

Abstract

Novel Deep Learning (DL) algorithms show ever-increasing accuracy and precision in multiple application domains. However, further steps are needed towards the ubiquitous adoption of this kind of instrument. First, the effort and skills required to develop new DL models, or to adapt existing ones to new use cases, are rarely available to small- and medium-sized businesses. Second, DL inference must be brought to the edge, to overcome the limitations posed by the classically used cloud computing paradigm. This requires implementation on low-energy computing nodes, often heterogeneous and parallel, that are usually more complex to program and to manage. This work describes the ALOHA framework, which addresses these issues by means of an integrated tool flow that automates most phases of the development process. The framework introduces architecture-awareness: the target inference platform is considered very early, already during algorithm selection, and drives the optimal porting of the resulting embedded application. Moreover, the framework treats security, power efficiency, and adaptiveness as primary objectives throughout the development process.
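To make the notion of architecture-awareness concrete, the sketch below shows one way algorithm selection could take the target inference platform into account from the start: each candidate network is scored with a simple latency, energy, and memory cost model for the edge node, and the most accurate candidate that satisfies every platform constraint wins. This is an illustrative Python sketch only; the class names, the cost model, and all numbers are assumptions made for the example, not the ALOHA framework's actual API or figures.

    # Illustrative sketch of architecture-aware algorithm selection.
    # All names, numbers, and the cost model are assumptions for this
    # example, not the ALOHA framework's actual API.
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        name: str
        macs: float       # multiply-accumulate operations per inference
        params: float     # number of weights (memory footprint proxy)
        accuracy: float   # validation accuracy from a prior training run

    @dataclass
    class Platform:
        macs_per_s: float      # sustained compute throughput of the edge node
        joules_per_mac: float  # assumed constant energy cost per MAC
        mem_bytes: float       # on-device memory budget

    def fits(c: Candidate, p: Platform,
             max_latency_s: float, max_energy_j: float) -> bool:
        """Deliberately simple cost model: reject candidates the platform cannot serve."""
        latency = c.macs / p.macs_per_s
        energy = c.macs * p.joules_per_mac
        memory = c.params * 4  # float32 weights
        return latency <= max_latency_s and energy <= max_energy_j and memory <= p.mem_bytes

    def select(candidates, platform, max_latency_s, max_energy_j):
        """Most accurate candidate that satisfies every platform constraint."""
        feasible = [c for c in candidates
                    if fits(c, platform, max_latency_s, max_energy_j)]
        return max(feasible, key=lambda c: c.accuracy) if feasible else None

    if __name__ == "__main__":
        zoo = [
            Candidate("vgg16-like", macs=15e9,  params=138e6, accuracy=0.71),
            Candidate("small-cnn",  macs=0.6e9, params=5e6,   accuracy=0.66),
            Candidate("binary-cnn", macs=0.3e9, params=5e6,   accuracy=0.60),
        ]
        edge = Platform(macs_per_s=50e9, joules_per_mac=5e-12, mem_bytes=64e6)
        best = select(zoo, edge, max_latency_s=0.05, max_energy_j=0.05)
        print(best.name if best else "no candidate fits the platform")

Under these assumed numbers, the large VGG-style network is rejected on both latency and memory grounds and the small CNN is chosen: the kind of trade-off an architecture-aware selection step resolves automatically instead of leaving it to late-stage porting.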



    Published In

    INTESA '18: Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications
    October 2018
    62 pages
    ISBN:9781450365987
    DOI:10.1145/3285017
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. computer aided design
    2. convolutional neural networks
    3. deep learning

    Qualifiers

    • Poster

    Funding Sources

    • European Union

    Conference

    INTESA
