DOI: 10.1145/2505515.2507866
poster

High throughput filtering using FPGA-acceleration

Published: 27 October 2013

Abstract

With the rise in the amount of information being streamed across networks, there is a growing demand to vet its quality, type and content for purposes such as spam detection, security and search. In this paper, we develop an energy-efficient, high-performance information filtering system that is capable of classifying a stream of incoming documents at high speed. The prototype parses a stream of documents using a multicore CPU and then performs classification using Field-Programmable Gate Arrays (FPGAs). We implemented a Naive Bayes classifier on our prototype and, on a large TREC data collection, compared it to an optimized CPU-based baseline. Our empirical findings show that we can classify documents at 10 Gb/s, which is up to 94 times faster than the CPU baseline (and up to 5 times faster than previous FPGA-based implementations). In future work, we aim to increase the throughput by another order of magnitude by implementing both the parser and the filter on the FPGA.
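
The two-stage design described in the abstract (parse first, then classify with Naive Bayes) can be illustrated with a small software sketch. This is not the authors' implementation: the abstract does not give the exact Naive Bayes formulation, so the code below assumes a multinomial model with Laplace smoothing scored in log space, and every function name (tokenize, train, classify) is hypothetical. In the prototype the parsing stage runs on a multicore CPU and the scoring stage on the FPGA; here both run in plain Python.

```python
import math
from collections import Counter


def tokenize(text):
    """Parsing stage: split a document into lowercase terms (assumed tokenizer)."""
    return text.lower().split()


def train(labelled_docs):
    """Build log-priors and Laplace-smoothed log-likelihoods from (text, label) pairs."""
    doc_counts = Counter()   # number of training documents per class
    term_counts = {}         # per-class term frequencies
    vocab = set()
    for text, label in labelled_docs:
        doc_counts[label] += 1
        counts = term_counts.setdefault(label, Counter())
        for term in tokenize(text):
            counts[term] += 1
            vocab.add(term)

    total_docs = sum(doc_counts.values())
    model = {}
    for label, counts in term_counts.items():
        # Add-one smoothing: every vocabulary term gets one extra count.
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "prior": math.log(doc_counts[label] / total_docs),
            "terms": {t: math.log((counts[t] + 1) / denom) for t in vocab},
            "unseen": math.log(1 / denom),  # fallback for out-of-vocabulary terms
        }
    return model


def classify(model, text):
    """Classification stage: return the class with the highest log-posterior score."""
    terms = tokenize(text)
    scores = {}
    for label, params in model.items():
        score = params["prior"]
        for term in terms:
            score += params["terms"].get(term, params["unseen"])
        scores[label] = score
    return max(scores, key=scores.get)
```

Working in log space turns the product of term probabilities into a running sum of table lookups per document, which is one reason this style of scoring is a natural candidate for a streaming hardware pipeline.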

    Published In

    CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
    October 2013
    2612 pages
    ISBN:9781450322638
    DOI:10.1145/2505515
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. efficiency
    3. filtering
    4. fpga
    5. parsing

    Qualifiers

    • Poster

    Conference

    CIKM'13
    CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
    October 27 - November 1, 2013
    San Francisco, California, USA

    Acceptance Rates

    CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
