research-article
Open access

Dynamic Precision Autotuning with TAFFO

Published: 29 May 2020

Abstract

Many classes of applications, in both the embedded and high-performance domains, can trade off the accuracy of the computed results for computation performance. One way to achieve such a trade-off is precision tuning, that is, modifying the data types used for the computation by reducing the bit width or by changing the representation from floating point to fixed point. We present a methodology for high-accuracy dynamic precision tuning based on the identification of input classes (i.e., classes of input datasets that benefit from similar optimizations). When a new input region is detected, the application kernels are re-compiled on the fly with the appropriate selection of parameters. In this way, we obtain a continuous optimization approach that enables the exploitation of reduced precision computation while progressively exploring the solution space, thus reducing the time required by compilation overheads. We provide tools to support the automation of the runtime part of the solution, leaving to the user only the task of identifying the input classes. Our approach provides a significant performance boost (up to 320%) on typical approximate computing benchmarks without meaningfully affecting the accuracy of the result: the error always remains below 3%.
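The idea in the abstract (one specialized kernel version per input class, built only when that class is first seen) can be illustrated with a minimal Python sketch. It is not TAFFO's implementation or API: the names `to_fixed`, `make_dot_kernel`, `classify`, and `KernelCache`, the two input classes, and the fractional bit widths are all hypothetical, and "re-compilation" is simulated by building a closure instead of invoking a compiler.

```python
from typing import Callable, Dict, Sequence

Kernel = Callable[[Sequence[float], Sequence[float]], float]


def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to a fixed-point integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))


def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def make_dot_kernel(frac_bits: int) -> Kernel:
    """Stand-in for compiling a dot-product kernel for one fixed-point format."""
    def dot(a: Sequence[float], b: Sequence[float]) -> float:
        acc = 0
        for x, y in zip(a, b):
            # The product of two values with frac_bits fractional bits
            # carries 2 * frac_bits fractional bits.
            acc += to_fixed(x, frac_bits) * to_fixed(y, frac_bits)
        return from_fixed(acc, 2 * frac_bits)
    return dot


# Assumed mapping from input class to fractional bits. On real hardware the
# word width is bounded, so wider value ranges leave fewer fractional bits;
# Python integers are unbounded, so overflow is not modeled here.
FRAC_BITS_BY_CLASS: Dict[str, int] = {"small": 24, "large": 8}


def classify(a: Sequence[float], b: Sequence[float]) -> str:
    """Hypothetical user-provided input classifier based on the peak magnitude."""
    peak = max(max(abs(v) for v in a), max(abs(v) for v in b))
    return "small" if peak < 1.0 else "large"


class KernelCache:
    """Builds a kernel version the first time an input class is seen, then reuses it."""

    def __init__(self) -> None:
        self.versions: Dict[str, Kernel] = {}
        self.builds = 0  # counts simulated re-compilations

    def run(self, a: Sequence[float], b: Sequence[float]) -> float:
        cls = classify(a, b)
        if cls not in self.versions:
            self.versions[cls] = make_dot_kernel(FRAC_BITS_BY_CLASS[cls])
            self.builds += 1
        return self.versions[cls](a, b)
```

In this sketch the build cost is paid once per input class, which mirrors the abstract's point that exploring the solution space progressively keeps compilation overhead low. Calling `KernelCache().run(a, b)` repeatedly on inputs of the same class reuses the cached version.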



Published In

ACM Transactions on Architecture and Code Optimization  Volume 17, Issue 2
June 2020
169 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/3403597
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 May 2020
Online AM: 07 May 2020
Accepted: 01 March 2020
Revised: 01 November 2019
Received: 01 June 2019
Published in TACO Volume 17, Issue 2


Author Tags

  1. Precision tuning
  2. approximate computing
  3. compiler
  4. fixed point

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Union’s Horizon 2020 programme


Cited By

  • (2024) Hard SyDR: A Benchmarking Environment for Global Navigation Satellite System Algorithms. Sensors 24:2 (409). DOI: 10.3390/s24020409. Online publication date: 9-Jan-2024
  • (2024) SeTHet - Sending Tuned numbers over DMA onto Heterogeneous clusters: an automated precision tuning story. Proceedings of the 21st ACM International Conference on Computing Frontiers (258-266). DOI: 10.1145/3649153.3649203. Online publication date: 7-May-2024
  • (2024) The TEXTAROSSA Project: Cool all the Way Down to the Hardware. 2024 27th Euromicro Conference on Digital System Design (DSD) (526-533). DOI: 10.1109/DSD64264.2024.00076. Online publication date: 28-Aug-2024
  • (2024) Design-time methodology for optimizing mixed-precision CPU architectures on FPGA. Journal of Systems Architecture: the EUROMICRO Journal 155:C. DOI: 10.1016/j.sysarc.2024.103257. Online publication date: 1-Oct-2024
  • (2023) Hardware and Software Support for Mixed Precision Computing: a Roadmap for Embedded and HPC Systems. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1-6). DOI: 10.23919/DATE56975.2023.10137092. Online publication date: Apr-2023
  • (2023) Towards Benchmarking GNSS Algorithms on FPGA using SyDR. 2023 International Conference on Localization and GNSS (ICL-GNSS) (1-7). DOI: 10.1109/ICL-GNSS57829.2023.10148916. Online publication date: 6-Jun-2023
  • (2023) Mixed Precision in Heterogeneous Parallel Computing Platforms via Delayed Code Analysis. Embedded Computer Systems: Architectures, Modeling, and Simulation (469-477). DOI: 10.1007/978-3-031-46077-7_33. Online publication date: 2-Jul-2023
  • (2023) RISC-V Processor Technologies for Aerospace Applications in the ISOLDE Project. Embedded Computer Systems: Architectures, Modeling, and Simulation (363-378). DOI: 10.1007/978-3-031-46077-7_24. Online publication date: 2-Jul-2023
  • (2022) VIPP: Validation-Included Precision-Parametric N-Body Benchmark Suite. 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (156-158). DOI: 10.1109/ISPASS55109.2022.00021. Online publication date: May-2022
  • (2022) Cost-effective fixed-point hardware support for RISC-V embedded systems. Journal of Systems Architecture: the EUROMICRO Journal 126:C. DOI: 10.1016/j.sysarc.2022.102476. Online publication date: 1-May-2022
