DOI: 10.1007/978-3-642-10631-6_86
Article

A Simple, Fast, and Compact Static Dictionary

Published: 05 December 2009

Abstract

We present a new static dictionary that is very fast and compact, while also extremely easy to implement. A combination of properties makes this algorithm very attractive for applications requiring large static dictionaries:
  • High performance, with membership queries taking O(1) time with a near-optimal constant.
  • Continued high performance in external memory, with queries requiring only 1-2 disk seeks. If the dictionary has n items in $\{0, \ldots, m-1\}$ and d is the number of bytes retrieved from disk on each read, then the average number of seeks is $\min\left(1.63,\ 1 + O\left(\frac{\sqrt{n} \log m}{d}\right)\right)$.
  • Efficient use of space, storing n items from a universe of size m in $n \log m - \frac{1}{2} n \log n + O\left(n + \log \log m\right)$ bits. We prove this space bound with a novel application of the Kolmogorov-Smirnov distribution.
  • Simplicity, with a 20-line pseudo-code construction algorithm and a 4-line query algorithm.
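
To get a feel for the space bound stated above, the following Python sketch evaluates only its leading terms, $n \log m - \frac{1}{2} n \log n$, and compares them with two reference points: storing each item verbatim in $\lceil \log m \rceil$ bits, and the information-theoretic minimum $\log \binom{m}{n}$. It is an illustration, not the paper's algorithm. The function names are hypothetical, logarithms are assumed to be base 2 (the bound is in bits), and the $O(n + \log \log m)$ term is dropped because the abstract does not give its constant.

```python
import math


def leading_space_bits(n: int, m: int) -> float:
    """Leading terms n*log2(m) - (1/2)*n*log2(n) of the abstract's bound.

    Assumption: base-2 logarithms; the O(n + log log m) term is omitted
    because its constant is not stated in the abstract.
    """
    return n * math.log2(m) - 0.5 * n * math.log2(n)


def verbatim_bits(n: int, m: int) -> int:
    """Bits needed to store each of the n items in ceil(log2(m)) bits."""
    return n * math.ceil(math.log2(m))


def info_theoretic_bits(n: int, m: int) -> float:
    """log2 of C(m, n): the minimum bits needed to name an n-subset of {0..m-1}.

    Computed with lgamma to avoid building a huge integer.
    """
    ln_binom = math.lgamma(m + 1) - math.lgamma(n + 1) - math.lgamma(m - n + 1)
    return ln_binom / math.log(2)


if __name__ == "__main__":
    n, m = 2**20, 2**32  # about a million 32-bit keys (example values)
    for label, bits in [
        ("verbatim", verbatim_bits(n, m)),
        ("leading terms of bound", leading_space_bits(n, m)),
        ("information-theoretic minimum", info_theoretic_bits(n, m)),
    ]:
        print(f"{label:32s} {bits / n:6.2f} bits per item")
```

For these example values the leading terms fall between the verbatim encoding and the information-theoretic minimum, which is consistent with the abstract describing the structure as compact rather than space-optimal.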

Published In

ISAAC '09: Proceedings of the 20th International Symposium on Algorithms and Computation
December 2009
1224 pages
ISBN:9783642106309
  • Editors:
  • Yingfei Dong,
  • Ding-Zhu Du,
  • Oscar Ibarra

Publisher

Springer-Verlag

Berlin, Heidelberg
