- research-article, October 2023
SPFresh: Incremental In-Place Update for Billion-Scale Vector Search
Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, Peng Cheng, Mao Yang
SOSP '23: Proceedings of the 29th Symposium on Operating Systems Principles, Pages 545–561
https://doi.org/10.1145/3600006.3613166
Approximate Nearest Neighbor Search (ANNS) on high-dimensional vector data is now widely used in various applications, including information retrieval, question answering, and recommendation. As the amount of vector data grows continuously, it becomes ...
- research-article, August 2022
BLISS: A Billion-scale Index using Iterative Re-partitioning
KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages 486–495
https://doi.org/10.1145/3534678.3539414
Representation learning has transformed the problem of information retrieval into one of finding the approximate set of nearest neighbors in a high-dimensional vector space. With limited hardware resources and time-critical queries, the retrieval engines ...
- research-article, June 2019
RobustiQ: A Robust ANN Search Method for Billion-scale Similarity Search on GPUs
ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval, Pages 132–140
https://doi.org/10.1145/3323873.3325018
GPU-based methods represent the state of the art in approximate nearest neighbor (ANN) search, as they are scalable (billion-scale), accurate (high recall), and efficient (sub-millisecond query speed). Faiss, the representative GPU-based ANN system, ...
- research-article, October 2017
PQk-means: Billion-scale Clustering for Product-quantized Codes
MM '17: Proceedings of the 25th ACM international conference on Multimedia, Pages 1725–1733
https://doi.org/10.1145/3123266.3123430
Data clustering is a fundamental operation in data analysis. For handling large-scale data, the standard k-means clustering method is not only slow, but also memory-inefficient. We propose an efficient clustering method for billion-scale feature vectors, ...
- research-article, May 2017
Compact Indexing and Judicious Searching for Billion-Scale Microblog Retrieval
ACM Transactions on Information Systems (TOIS), Volume 35, Issue 3, Article No.: 27, Pages 1–24
https://doi.org/10.1145/3052771
In this article, we study the problem of efficient top-k disjunctive query processing in a huge microblog dataset. In terms of compact indexing, we categorize the keywords into rare terms and common terms based on inverse document frequency (idf) and ...