DOI: 10.1145/3447548.3467441
research-article

Online Additive Quantization

Published: 14 August 2021

Abstract

Approximate nearest neighbor search (ANNS) plays an important role in many applications, ranging from information retrieval and recommender systems to machine translation. Several ANN indexes, such as those based on hashing and quantization, have been designed to be updated as the database evolves, but a remarkable performance gap remains between them and indexes retrained on the entire database. To close the gap, we propose an online additive quantization algorithm (online AQ) that dynamically updates the quantization codebooks with incoming streaming data. We then derive a regret bound to theoretically guarantee the performance of the online AQ algorithm. Moreover, to improve learning efficiency, we develop a randomized block beam search algorithm for assigning each data point to the codewords of the codebook. Finally, we extensively evaluate the proposed online AQ algorithm on four real-world datasets, showing that it remarkably outperforms the state-of-the-art baselines.
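
To make the ideas in the abstract concrete, the following is a minimal pure-Python sketch of additive quantization under streaming updates. It is an illustration under stated assumptions, not the paper's algorithm. A vector is approximated by the sum of one codeword from each codebook, the assignment is found with a plain beam search that visits the codebooks in a fixed order (a simplification, and not the randomized block beam search proposed in the paper), and the selected codewords are then nudged toward the incoming vector with an ordinary gradient step. The function names, the learning rate `lr`, and the beam width are all hypothetical choices made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def decode(code, codebooks):
    """Reconstruct a vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for m, k in enumerate(code):
        out = [o + w for o, w in zip(out, codebooks[m][k])]
    return out


def beam_encode(x, codebooks, beam_width=4):
    """Pick one codeword per codebook so that their sum approximates x.

    Assumption: codebooks are visited in a fixed order; at each step only
    the `beam_width` best partial sums are kept.
    """
    zero = [0.0] * len(x)
    beams = [(sq_dist(x, zero), [], zero)]  # (error, indices, partial sum)
    for codebook in codebooks:
        candidates = []
        for _, idxs, partial in beams:
            for k, word in enumerate(codebook):
                new_sum = [p + w for p, w in zip(partial, word)]
                candidates.append((sq_dist(x, new_sum), idxs + [k], new_sum))
        candidates.sort(key=lambda c: c[0])
        beams = candidates[:beam_width]
    err, code, _ = beams[0]
    return code, err


def online_step(x, codebooks, lr=0.1, beam_width=4):
    """Encode x, then move each selected codeword along the residual.

    Assumption: a plain stochastic-gradient step on the squared
    reconstruction error, used here only as a stand-in update rule.
    """
    code, _ = beam_encode(x, codebooks, beam_width)
    recon = decode(code, codebooks)
    residual = [xi - ri for xi, ri in zip(x, recon)]
    for m, k in enumerate(code):
        codebooks[m][k] = [w + lr * r for w, r in zip(codebooks[m][k], residual)]
    return code


# Usage: two codebooks of eight 4-dimensional codewords, updated on a stream.
random.seed(0)
codebooks = [[[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
             for _ in range(2)]
stream = [[random.gauss(0, 1) for _ in range(4)] for _ in range(100)]
codes = [online_step(x, codebooks) for x in stream]
```

Because each step changes only the codewords selected for the current vector, the cost per incoming item is dominated by the beam search rather than by retraining on the whole database. Codes assigned to earlier vectors are not refreshed in this sketch.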

Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. additive quantization
  2. beam search
  3. nearest neighbor search
  4. online update

Qualifiers

  • Research-article

Funding Sources

  • JD AI Research and the Fundamental Research Funds for the Central Universities
  • National Natural Science Foundation of China

Conference

KDD '21
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
