DOI: 10.5555/3195638.3195701
research-article

A patch memory system for image processing and computer vision

Published: 15 October 2016

Abstract

From self-driving cars to high dynamic range (HDR) imaging, the demand for image-based applications is growing quickly. In mobile systems, these applications place particular strain on performance and energy efficiency. Because traditional memory systems are optimized for 1D memory access, they cannot efficiently exploit the multidimensional locality of image-based applications, which often operate on sub-regions of 2D and 3D image data. We have developed a new Patch Memory System (PMEM) tailored to application domains that process 2D and 3D data streams. PMEM supports efficient multidimensional addressing, automatic handling of image boundaries, and efficient caching and prefetching of image data. In addition to an optimized cache, PMEM includes hardware for offloading structured address calculations from processing units. PMEM improves average energy-delay by 26% compared to EVA, a memory system for computer vision applications. Compared to a traditional cache, our results show that PMEM can reduce processor energy by 34% for a selection of computer vision (CV) and image processing (IP) applications, leading to a system performance improvement of up to 32% and an energy-delay product improvement of 48% to 86% on the applications in this study.
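To make concrete the kind of structured address calculation that a processing unit otherwise performs in software for every patch, here is a minimal Python sketch. It is not PMEM's hardware design or interface: the function name `patch_addresses`, the row-major layout, and the clamp-to-edge boundary policy are all assumptions chosen for illustration, since the abstract does not say which layout or boundary policy PMEM uses.

```python
def patch_addresses(base, width, height, x0, y0, patch_w, patch_h, elem_size=1):
    """Return a patch_h x patch_w grid of byte addresses for a 2D patch.

    Assumptions (illustrative only, not taken from the paper):
      * the image is stored row-major starting at byte address `base`
      * coordinates outside the image are clamped to the nearest edge pixel
    """
    rows = []
    for dy in range(patch_h):
        # Clamp the row index so patches overlapping the border stay in bounds.
        y = min(max(y0 + dy, 0), height - 1)
        row = []
        for dx in range(patch_w):
            # Clamp the column index in the same way.
            x = min(max(x0 + dx, 0), width - 1)
            # Flatten the 2D coordinate into a 1D byte address.
            row.append(base + (y * width + x) * elem_size)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # A 3x3 patch whose top-left corner lies one pixel outside a 4x3 image.
    for r in patch_addresses(base=0, width=4, height=3, x0=-1, y0=-1,
                             patch_w=3, patch_h=3):
        print(r)
```

Every patch element costs two clamps, a multiply, and an add before the load can issue, and vertically adjacent pixels end up `width * elem_size` bytes apart in the 1D address space. That per-access work and that stride are the software-side overheads the abstract's multidimensional addressing and boundary handling refer to.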

Published In

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture
October 2016
816 pages

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%