skip to main content
10.5555/2969033.2969086guideproceedingsArticle/Chapter ViewAbstractPublication PagesnipsConference Proceedingsconference-collections
Article

Asymmetric LSH (ALSH) for sublinear time Maximum Inner Product Search (MIPS)

Published: 08 December 2014 Publication History

Abstract

We present the first provably sublinear time hashing algorithm for approximate Maximum Inner Product Search (MIPS). Searching with (un-normalized) inner product as the underlying similarity measure is a known difficult problem and finding hashing schemes for MIPS was considered hard. While the existing Locality Sensitive Hashing (LSH) framework is insufficient for solving MIPS, in this paper we extend the LSH framework to allow asymmetric hashing schemes. Our proposal is based on a key observation that the problem of finding maximum inner products, after independent asymmetric transformations, can be converted into the problem of approximate near neighbor search in classical settings. This key observation makes efficient sublinear hashing scheme for MIPS possible. Under the extended asymmetric LSH (ALSH) framework, this paper provides an example of explicit construction of provably fast hashing scheme for MIPS. Our proposed algorithm is simple and easy to implement. The proposed hashing scheme leads to significant computational savings over the two popular conventional LSH schemes: (i) Sign Random Projection (SRP) and (ii) hashing based on p-stable distributions for L2 norm (L2LSH), in the collaborative filtering task of item recommendations on Netflix and Movielens (10M) datasets.

References

[1]
A. Andoni and P. Indyk. E2lsh: Exact euclidean locality sensitive hashing. Technical report, 2004.
[2]
A. Z. Broder. On the resemblance and containment of documents. In the Compression and Complexity of Sequences, pages 21-29, Positano, Italy, 1997.
[3]
M. S. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, pages 380-388, Montreal, Quebec, Canada, 2002.
[4]
P. Cremonesi, Y. Koren, and R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the fourth ACM conference on Recommender systems, pages 39-46. ACM, 2010.
[5]
R. R. Curtin, A. G. Gray, and P. Ram. Fast exact max-kernel search. In SDM, pages 1-9, 2013.
[6]
M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokn. Locality-sensitive hashing scheme based on p-stable distributions. In SCG, pages 253-262, Brooklyn, NY, 2004.
[7]
T. Dean, M. A. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan, and J. Yagnik. Fast, accurate detection of 100,000 object classes on a single machine. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 1814-1821. IEEE, 2013.
[8]
P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 32(9):1627-1645, 2010.
[9]
J. H. Friedman and J. W. Tukey. A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers, 23(9):881-890, 1974.
[10]
M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of ACM, 42(6):1115-1145, 1995.
[11]
S. Har-Peled, P. Indyk, and R. Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of Computing, 8(14):321-350, 2012.
[12]
P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, pages 604-613, Dallas, TX, 1998.
[13]
N. Koenigstein, P. Ram, and Y. Shavitt. Efficient retrieval of recommendations in a matrix factorization framework. In CIKM, pages 535-544, 2012.
[14]
Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems.
[15]
P. Li and A. C. König. Theory and applications b-bit minwise hashing. Commun. ACM, 2011.
[16]
P. Li, M. Mitzenmacher, and A. Shrivastava. Coding for random projections. In ICML, 2014.
[17]
P. Li, M. Mitzenmacher, and A. Shrivastava. Coding for random projections and approximate near neighbor search. Technical report, arXiv:1403.8144, 2014.
[18]
B. Neyshabur and N. Srebro. A simpler and better lsh for maximum inner product search (mips). Technical report, arXiv:1410.5518, 2014.
[19]
P. Ram and A. G. Gray. Maximum inner-product search using cone trees. In KDD, pages 931-939, 2012.
[20]
A. Shrivastava and P. Li. Beyond pairwise: Provably fast algorithms for approximate k-way similarity search. In NIPS, Lake Tahoe, NV, 2013.
[21]
A. Shrivastava and P. Li. Asymmetric minwise hashing. Technical report, 2014.
[22]
A. Shrivastava and P. Li. An improved scheme for asymmetric lsh. Technical report, arXiv:1410.5410, 2014.
[23]
A. Shrivastava and P. Li. In defense of minhash over simhash. In AISTATS, 2014.
[24]
R. Weber, H.-J. Schek, and S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In VLDB, pages 194-205, 1998.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Guide Proceedings
NIPS'14: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2
December 2014
3697 pages

Publisher

MIT Press

Cambridge, MA, United States

Publication History

Published: 08 December 2014

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 31 Dec 2024

Other Metrics

Citations

Cited By

View all

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media