skip to main content
research-article

An improved DFA for fast regular expression matching

Published: 30 September 2008 Publication History

Abstract

Modern network devices need to perform deep packet inspection at high speed for security and application-specific services. Finite Automata (FAs) are used to implement regular expressions matching, but they require a large amount of memory. Many recent works have proposed improvements to address this issue.
This paper presents a new representation for deterministic finite automata (orthogonal to previous solutions), called Delta Finite Automata (δFA), which considerably reduces states and transitions and requires a transition per character only, thus allowing fast matching. Moreover, a new state encoding scheme is proposed and the comprehensive algorithm is tested for use in the packet classification area.

References

[1]
A.V. Aho and M.J. Corasick. Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6):333--340, 1975.
[2]
A.V. Aho, R. Sethi, and J.D. Ullman. Compilers, principles, techniques, and tools. Addison Wesley, 1985.
[3]
Bro: A system for Detecting Network Intruders in Real Time, http://bro-ids.org/.
[4]
M. Becchi and S. Cadambi. Memory-efficient regular expression search using state merging. In Proc. of INFOCOM 2007, May 2007.
[5]
M. Becchi and P. Crowley. A hybrid finite automaton for practical deep packet inspection. In Proc. of CoNEXT'07, pages 1--12. ACM, 2007.
[6]
M. Becchi and P. Crowley. An improved algorithm to accelerate regular expression evaluation. In Proc. of ANCS'07, pages 145--154, 2007.
[7]
B.C. Brodie, D.E. Taylor, and R.K. Cytron. A scalable architecture for high-throughput regular-expression pattern matching. In Proc. of ISCA'06, June 2006.
[8]
Classbench, A Packet Classification Benchmark, http://www.arl.wustl.edu/~det3/ClassBench/.
[9]
B. Commentz-Walter. A string matching algorithm fast on the average. In Proc. of ICALP'79, pages 118--132. Springer-Verlag.
[10]
W. Eatherton, Z. Dittia, and G. Varghese. Tree bitmap: Hardware/software ip lookups with incremental updates. ACM SIGCOMM Computer Communications Review, 34, 2004.
[11]
P. Gupta and N. McKeown. Packet classification on multiple fields. In SIGCOMM, pages 147--160, 1999.
[12]
Intel Network Processors, www.intel.com/design/network/products/npfamily/.
[13]
S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese. Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia. In Proc. of ANCS'07, pages 155--164. ACM.
[14]
S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In Proc. of SIGCOMM'06, pages 339--350. ACM.
[15]
S. Kumar, J. Turner, and J. Williams. Advanced algorithms for fast and scalable deep packet inspection. In Proc. of ANCS '06, pages 81--92. ACM.
[16]
T.V. Lakshman and D. Stiliadis. High-speed policy-based packet forwarding using efficient multi-dimensional range matching. In SIGCOMM, pages 203--214, 1998.
[17]
Haoyu Song, Evaluation of Packet Classification Algorithms, www.arl.wustl.edu/~hs1/PClassEval.html.
[18]
Michela Becchi, regex tool, http://regex.wustl.edu/.
[19]
S. Singh, F. Baboescu, G. Varghese, and J. Wang. Packet classification using multidimensional cutting. Technical report, 2003.
[20]
R. Smith, C. Estan, and S. Jha. Xfa: Faster signature matching with extended automata. In IEEE Symposium on Security and Privacy, May 2008.
[21]
R. Smith, C. Estan, and S. Jha. Xfas: Fast and compact signature matching. Technical report, University of Wisconsin, Madison, August 2007.
[22]
R. Sommer and V. Paxson. Enhancing byte-level network intrusion detection signatures with context. In Proc. of CCS'03, pages 262--271. ACM.
[23]
Snort: Lightweight Intrusion Detection for Networks, http://www.snort.org/.
[24]
N. Tuck, T. Sherwood, B. Calder, and G. Varghese. Deterministic memory-efficient string matching algorithms for intrusion detection. In Proc. of INFOCOM 2004, pages 333--340.
[25]
G. Varghese. Network Algorithmics,: An Interdisciplinary Approach to Designing Fast Networked Devices. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2004.
[26]
J.W. Will Eatherton. An encoded version of reg-ex database from cisco systems provided for research purposes.
[27]
S. Wu and U. Manber. A fast algorithm for multi-pattern searching. Technical Report TR-94-17, Dept.of Computer Science, University of Arizona.
[28]
F. Yu, Z. Chen, Y. Diao, T.V. Lakshman, and R.H. Katz. Fast and memory-efficient regular expression matching for deep packet inspection. In Proc. of ANCS'06, pages 93--102.

Cited By

View all

Index Terms

  1. An improved DFA for fast regular expression matching

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM SIGCOMM Computer Communication Review
    ACM SIGCOMM Computer Communication Review  Volume 38, Issue 5
    October 2008
    68 pages
    ISSN:0146-4833
    DOI:10.1145/1452335
    Issue’s Table of Contents

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 September 2008
    Published in SIGCOMM-CCR Volume 38, Issue 5

    Check for updates

    Author Tags

    1. DFA
    2. deep packet inspection
    3. intrusion prevention
    4. packet classification
    5. regular expressions

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)55
    • Downloads (Last 6 weeks)4
    Reflects downloads up to 27 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media